Next: 3.3 Miscellaneous node operations Up: 3.2 System booting Previous: 3.2.1 Starting system servers

3.2.2 OS modules startup

The object responsible for bringing OS modules into operation is named ModLoader. Node::modstart simply calls the machine dependent load_mods, which scans the list of modules found at boot time and uses ModLoader::load to load each of them in turn. Finally, ModLoader::start starts the modules just loaded. If no module could be loaded we panic, because there would be no system to run.

<off_Node::modstart implementation. >= (U->)
//Start OS modules.
void off_Node::modstart(void)
{
  static off_ModLoader loader;
  assert(valid());

  n_mdep.load_mods(loader); 
  //  loader.start(off_Processor::self()); //  Run modules at processor #0. XXX fix it
  panic("ModLoader::start returned.");
}

It is mostly a machine dependent operation because the module information is stored in a machine dependent way.
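
The control flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the Off implementation: BootModule, SketchLoader and load_all are hypothetical names, and the sketch returns the number of loaded modules instead of panicking.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine dependent module descriptor.
struct BootModule {
  std::string args;   // argument string handed to the module
  bool        ok;     // whether loading this module succeeds
};

// Hypothetical stand-in for off_ModLoader: it only counts loaded modules.
struct SketchLoader {
  std::size_t nloaded = 0;
  // Returns 0 on success, nonzero on failure (mirrors the err_t convention).
  int load(const BootModule &m) {
    if (!m.ok)
      return 1;
    ++nloaded;
    return 0;
  }
};

// Scans the boot-time module list and loads each module in turn.
// Returns the number of modules loaded; a caller would panic on zero.
std::size_t load_all(SketchLoader &loader, const std::vector<BootModule> &mods) {
  for (const BootModule &m : mods)
    (void)loader.load(m);   // a failed module is skipped, not fatal by itself
  return loader.nloaded;
}
```

A caller in the spirit of modstart would treat a result of zero as fatal, since there would be nothing to run.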

<Off node dependencies. >+= (<-U) [<-D]
#include <flux/debug.h>         // for panic()
#include <node/mod.h>           // for off_ModLoader

The modstart implementation is also kept in node/Node.C.

<Implementation of other methods of off_Node. >+= (<-U) [<-D->]
<off_Node::modstart implementation. >

The module loader

The ModLoader object provides some methods needed by modstart. Namely, it knows how to load an OS module and how to start an OS module.

<Off module loader. >= (U->)
// An OS module loader.
//
class off_ModLoader {
protected:
  <Other protected members of off_ModLoader. >
public:
  off_ModLoader();
  // Loads the module m.
  err_t load(off_mdepModule &m);
  // Arranges for the modules to start running at proc.
  // void start(off_Processor &proc);  XXX fix this
};

Defines off_ModLoader (links are to index).

<Off module loader dependencies. >= (U->)
#include <klib/ids.h>           // for off_id_t et al.
#include <klib/err.h>           // for err_t and error numbers.
#include <node/mdep/mmod.h>     // for off_mdepModule et al.
//#include <hw/Processor.h>      // for off_Processor     XXX fix this

The ModLoader starts with an empty list of loaded modules. Each call to load loads the given module into memory and creates a portal, a DTLB and a shuttle for that module to run. Finally, when start is called, a run queue is installed in the boot processor so that every shuttle created can start running.

The information kept for each module being loaded is defined by off_mod_t.

<Off module data type. >= (U->)
// Information kept for each OS module.
//
struct off_mod_t {
   off_shtl_id_t m_shtl;
   off_prtl_id_t m_prtl;
   off_dtlb_id_t m_dtlb;
};

Defines off_mod_t (links are to index).

The ModLoader keeps a list of modules to be filled up during the loading process.

<Other protected members of off_ModLoader. >= (<-U) [D->]
off_mod_t mods[OFF_NMODS_MAX];
int       nmods;

The maximum number of modules OFF_NMODS_MAX is a system limit.

<Off limits. >+= [<-D->]
const int OFF_NMODS_MAX = 8;    // Max. number of OS modules.
Defines OFF_NMODS_MAX (links are to index).
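
Because the module table has a fixed capacity, whoever fills it has to refuse modules once the limit is reached. A minimal sketch of such a bounded table follows; ModTable and SKETCH_NMODS_MAX are hypothetical names used only for illustration, and an int stands in for the per-module record.

```cpp
#include <cassert>

const int SKETCH_NMODS_MAX = 8;   // mirrors OFF_NMODS_MAX; the name is hypothetical

// Hypothetical fixed-capacity module table.
struct ModTable {
  int slots[SKETCH_NMODS_MAX];    // stand-in for per-module bookkeeping
  int nmods;                      // number of slots in use
  ModTable() : nmods(0) {}
  // Claims the next free slot; returns false when the table is full.
  bool add(int id) {
    if (nmods >= SKETCH_NMODS_MAX)
      return false;
    slots[nmods++] = id;
    return true;
  }
};
```

A loader built this way could map a false result to an error code rather than writing past the end of the array.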

<off_ModLoader::ModLoader implementation. >= (U->)
// Initializes the module loader.
off_ModLoader::off_ModLoader (void): 
  nmods(0) 
{
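  // XXX Debugging aid: prints a marker and then spins forever, so the
  // constructor never returns as written.
  putchar('m');
  for(;;);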
}

The task of load is to set up an image for each module and to create a portal, a DTLB and a shuttle for that module.

<off_ModLoader::load implementation. >= (U->)
// Loads module m.
// returns error code or zero if no error. 
err_t off_ModLoader::load(off_mdepModule &m)
{
  off_mod_t *mod = mods+nmods;   // module being loaded
  <Other local variables of off_ModLoader::local. >

  <Create shuttle for mod using m. >
  <Create portal for mod. >
  <Create dtlb for mod if dtlb support is activated. >
  <Load mod image into core. >
  <Allocate stack sp for mod. >
  <Setup main arguments in sp. >
  <Install mod shuttle in the nmodsth run queue slot. >
  nmods++;

  return EOK;
}

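To create a shuttle we need to specify both the initial program counter and the initial stack pointer. We also have to provide initial arguments for the shuttle entry point. Those arguments, as accounted for in the stack size computation below, are: the exception portal id (if any), the DTLB id (if any), a pointer to the argument string, the kernel rights, and the argument string itself.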

To obtain the initial stack pointer value we create an mdepStck object (which will be set to a predefined initial value) and then use the ``-='' operator to compute the new value after every argument has been pushed on it. The size of the argument list is rounded up to a multiple of the machine word size.

NOTE: When there is no DTLB support, a suitable (physical) stack value must be chosen.

<Other local variables of off_ModLoader::local. >= (<-U)
off_mdepStck sp(0);             // XXX Initialize w/ default stack value.
                                // MDEP_USTCK or new_user_stack() is !dmm

<Create shuttle for mod using m. >= (<-U)
sp -= (sizeof(off_prtl_id_t)+   // Exception portal id if any
       sizeof(off_dtlb_id_t)+   // DTLB id if any
       sizeof(char*)        +   // Pointer to argument string
       sizeof(off_Rights)   +   // Kernel Rights
       ((strlen(m.argl())+1+sizeof(int)-1)/sizeof(int))*sizeof(int)); // Argument string (incl. NUL), rounded up to a word.

<Off module loader implementation dependencies. >= (U->)
#include <string.h>             // for strlen
#include <klib/prot.h>          // for off_Rights
#include <klib/mdep/mStck.h>    // for off_mdepStck

The method argl of m has been used to obtain the module argument list.
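
Rounding the argument string size up to a whole number of machine words can be sketched as follows. The helper names round_up and arg_string_size are hypothetical, and counting the terminating NUL as part of the string is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Rounds n up to the next multiple of align (align must be a power of two).
inline std::size_t round_up(std::size_t n, std::size_t align) {
  return (n + align - 1) & ~(align - 1);
}

// Bytes needed on the stack for an argument string, including its
// terminating NUL (an assumption), padded to a whole number of words.
inline std::size_t arg_string_size(const char *argl, std::size_t word) {
  return round_up(std::strlen(argl) + 1, word);
}
```

With a 4-byte word, a 3-character string needs 4 bytes and a 4-character string needs 8, because the NUL no longer fits in the first word.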

NOTE: Shuttle creation goes here.

Every time a new shuttle is created, its identifier is installed in a run queue. The run queue is kept in ModLoader.

<Other protected members of off_ModLoader. >+= (<-U) [<-D]
off_shtl_id_t runq[OFF_NMODS_MAX]; // Initial run queue.

<Install mod shuttle in the nmodsth run queue slot. >= (<-U)
runq[nmods]=mod->m_shtl;

NOTE: Remaining off_ModLoader::load code goes here too.

Finally, start will give the shuttles of the modules a chance to run. To do so, it installs the run queue in the specified processor.

<off_ModLoader::start implementation. >= (U->)
// Arranges for the modules to start running at proc.
#if 0               // XXX fix this
void off_ModLoader::start(off_Processor &proc)
{
  proc.run(runq,nmods);
  panic("Run queue installation returned.");
}
#endif

The module loader C++ source files

Module loader code is kept in node/mod.h and node/mod.C.

<mod.h*>=
<Read the literate code instead warning. >
#ifndef __OFF_MOD_H
#define __OFF_MOD_H 1

<Off module loader dependencies. >

#ifdef __KERNEL__
<Off module data type. >
<Off module loader. >

#endif // __KERNEL__

#endif // __OFF_MOD_H

<mod.C*>=
<Read the literate code instead warning. >

#include <node/mod.h>           // Exported interface.
<Off module loader implementation dependencies. >

<off_ModLoader::ModLoader implementation. >
<off_ModLoader::load implementation. >
<off_ModLoader::start implementation. >

System halt, reboot and suspend

System halt and reboot are done by machine dependent routines. The machine independent part only has to print a final message and terminate the serial line console.

<off_Node::halt implementation. >= (<-U)
// Halts this node
void off_Node::halt(const char *msg )
{
  w_lock();
  kcout << msg << nl;
  if (using_serial_console())
    gdb_serial_exit(0);
  n_mdep.halt();
  panic("halt returned");
}

<off_Node::reboot implementation. >= (<-U)
void off_Node::reboot( void )
{
  assert(valid());
  w_lock();
  if (using_serial_console())
    gdb_serial_exit(0);
  n_mdep.reboot();
  halt("node: Reboot failed, halting.");
}

<Off node implementation dependencies. >+= (<-U) [<-D->]
#include <klib/str.h>       // for kcout
#include <flux/gdb_serial.h>    // for gdb_serial_exit

Suspend is not implemented.

<off_Node::suspend implementation. >= (<-U)
err_t off_Node::suspend( void )
{
  assert(valid());
  return ENOSYS;
}

Frozen nodes

To support freeze and melt, nodes must implement freeze_state and copy_state.

<Other protected methods of off_Node. >= (<-U) [D->]
// Freeze dynamic resource state at curbuf in buf with avail bytes. 
virtual err_t freeze_state(char *buf, char *&curbuf, size_t &avail,
                           char *&curto, size_t &toavail,
                           const off_prtl_id_t  &frozen_domain  );

// Copy static resource state in curbuf with toavail bytes. 
virtual err_t copy_state(char *&curto, size_t &toavail) const;

The first one is more delicate. To freeze the state of a whole node we freeze every system server and then the node itself. A node freeze operation resembles a suspend operation, because once the node is frozen many system services are no longer available. But even when frozen, basic system servers should leave a small amount of non-freezable resources available so that the whole node state can be transferred to a different hardware node and melted there. Read-only (const) operations are still feasible. Besides, as devices cannot be frozen, any OS device driver should be restarted.

<off_Node::freeze_state implementation. >= (U->)
// Freeze dynamic resource state at curbuf in buf with avail bytes. 
err_t off_Node::freeze_state(char *buf, char *&curbuf, size_t &avail,
                             char *&curto, size_t &toavail,
                             const off_prtl_id_t  &frozen_domain  ) 
{
  <off_Node::freeze_state local variables. >

  assert(valid());
  assert(buf && curbuf);
  <Freeze node members in buf at curbuf in avail bytes. >
  <Freeze system servers in buf at curbuf in avail bytes. >
  return EOK;
}

Node members n_name, n_url, and n_owner use dynamic memory. Their values are stored at curbuf and the members are adjusted to hold relative pointers into buf.

<Freeze node members in buf at curbuf in avail bytes. >= (<-U)
if (!::freeze(n_name,buf,curbuf,avail)  ||
    !::freeze(n_owner,buf,curbuf,avail) || 
    !::freeze(n_url,buf,curbuf,avail)     )
  return ENOSPC;

We have used freeze, a generic utility function that stores the given argument into the buffer, advances the buffer pointer, subtracts the newly stored size from the available space, and adjusts the pointer passed as argument so that it becomes a value relative to buf.
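
To make the behaviour of such a utility concrete, here is a minimal sketch for C strings. It is not the code in klib/freeze.h: freeze_str and frozen_offset are hypothetical names, and encoding the offset inside the pointer value itself is an assumption about how the relative pointer is represented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical freeze for C strings. Copies s (with its NUL) to cur,
// advances cur, shrinks avail, and rewrites s as an offset from buf.
// Returns false, changing nothing, when the string does not fit.
inline bool freeze_str(const char *&s, char *buf, char *&cur, std::size_t &avail) {
  const std::size_t len = std::strlen(s) + 1;
  if (len > avail)
    return false;
  std::memcpy(cur, s, len);
  // Assumption: the relative position is encoded in the pointer value.
  s = reinterpret_cast<const char *>(static_cast<std::uintptr_t>(cur - buf));
  cur += len;
  avail -= len;
  return true;
}

// Recovers the offset stored by freeze_str.
inline std::size_t frozen_offset(const char *s) {
  return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(s));
}
```

After freezing, the buffer can be moved anywhere: the member no longer refers to the original allocation, only to a position inside buf.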

<Off node implementation dependencies. >+= (<-U) [<-D->]
#include <assert.h>             // for assert
#include <string.h>             // for strcpy et al.
#include <klib/freeze.h>        // for freeze

System servers must be frozen on their own. We freeze them in the order shown below. Even though we are freezing the shuttle server, the current shuttle continues its execution unaffected by the freeze operation. That shuttle can be used later on to suspend the node, to halt it, or to transfer the frozen node contents to a different location. (Note that read-only operations may proceed while the system is frozen; only read-write operations are affected.)

<Freeze system servers in buf at curbuf in avail bytes. >= (<-U)
#if 0               // XXX fix this
if ((err=n_shtl->freeze(curbuf,avail,curto,toavail,frozen_domain)) || 
    // Only runs this shtl beyond this point
    (err=n_prtl->freeze(curbuf,avail,curto,toavail,frozen_domain)) ||
    (err=n_dmm->freeze(curbuf,avail,curto,toavail,frozen_domain))      )
  return err;
for (natural_t i=0; i<n_nprocs; i++)
  if ((err=n_proc[i].freeze(curbuf,avail,curto,toavail,frozen_domain)))
    return err;
for (natural_t i=0; i<n_ndmas; i++)
  if ((err=n_dma[i].freeze(curbuf,avail,curto,toavail,frozen_domain)))
    return err;
#endif
for (natural_t i=0; i<n_niobanks; i++) 
  if ((err=n_iobank[i].freeze(curbuf,avail,curto,toavail,frozen_domain)))
    return err;
for (natural_t i=0; i<n_nmbanks; i++)
  if ((err=n_mbank[i].freeze(curbuf,avail,curto,toavail,frozen_domain)))
    return err;

We have introduced a new local variable.

<off_Node::freeze_state local variables. >= (<-U)
err_t err;                      // Error code

Even though every system service is frozen now, hardware system services will still be available, except for resource allocation. With respect to system servers, portals will still deliver events (although no allocation or change is allowed), DTLBs will still work (although no translation is changed and no status bit is updated), and only the current shuttle will remain running.

To copy the node static state to a user buffer we copy the node itself. There is no need to copy the state of every system server contained in the node, because when freeze_state is called it entirely freezes every system server.

<off_Node::copy_state implementation. >= (U->)
// Copy static resource state in curbuf with toavail bytes. 
err_t off_Node::copy_state(char *&curto, size_t &toavail) const
{
  assert(valid());
  if (curto){
    if (toavail<sizeof(off_Node))
      return ENOSPC;
    *(off_Node*)curto = *this;
    curto += sizeof(off_Node);
    toavail-=sizeof(*this);
  }
  return EOK;
}
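
The bounds-checked cursor pattern used by copy_state (check the space left, store, advance the cursor, shrink the remaining size) can be isolated in a small sketch. StaticState, put_state and get_state are hypothetical names, and the sketch uses memcpy on a plain-data record rather than the node's own assignment operator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical plain-data record standing in for the static node state.
struct StaticState { unsigned nmbanks; unsigned niobanks; };

// Appends st at cur if it fits, advancing cur and shrinking avail.
inline bool put_state(const StaticState &st, char *&cur, std::size_t &avail) {
  if (avail < sizeof st)
    return false;
  std::memcpy(cur, &st, sizeof st);   // memcpy avoids alignment assumptions
  cur += sizeof st;
  avail -= sizeof st;
  return true;
}

// Reads a record back from cur under the same bounds check.
inline bool get_state(StaticState &st, char *&cur, std::size_t &avail) {
  if (avail < sizeof st)
    return false;
  std::memcpy(&st, cur, sizeof st);
  cur += sizeof st;
  avail -= sizeof st;
  return true;
}
```

Because both the cursor and the remaining size are passed by reference, several such calls can be chained over the same buffer.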

Once a node is frozen, it can be melted. To melt a node we must assist the generic melt implementation provided by Resource by implementing some methods.

<Other protected methods of off_Node. >+= (<-U) [<-D->]
// Does resource cleanup. No external resource should be used after 
// return. 
virtual err_t cleanup(void);

// Restores the static state from a user supplied buffer. 
virtual err_t restore_state(char *&buf, size_t &bsize);

// Restores dynamic resource state. 
virtual err_t melt_state(char *&buf, size_t &bsize, char *&from, size_t &size,
                         const off_prtl_id_t &melted_domain);

First, to clean up node state we simply clean up every contained resource and release the memory held by the node name, owner and URL (the only dynamic members of a node).

<off_Node::cleanup implementation. >= (U->)
// Does resource cleanup. No external resource should be used after 
// return. 
err_t off_Node::cleanup(void)
{
  assert(valid());
  assert(n_name && n_url && n_owner);
  free((char*)n_name);          // created w/ strdup. Must use free.
  free((char*)n_url);           // and thus discard `const'. 
  free((char*)n_owner);
  return EOK;
}

The static state restore function is also simple. The trickiest part is that we must check (using can_melt) that the node image can indeed be melted. For example, we cannot melt a node with lots of memory into a node with little memory, or a node with three different memory banks into a node with a single one. This node should be at least as powerful as the node image being melted.

<off_Node::restore_state implementation. >= (U->)
// Restores the static state from a user supplied buffer. 
err_t off_Node::restore_state(char *&buf, size_t &bsize)
{
  off_Node *other=(off_Node*)buf;
  assert(valid());
  if (buf){
    if (bsize < sizeof(*this) || !can_melt(other)){
      return EINVAL;
    }
    *this=*other;
    buf+=sizeof(*this);
    bsize-=sizeof(*this);
  }    
  return EOK;
}

We simply used the assignment operator.

<Other public methods of off_Node. >+= (<-U) [<-D->]
// Copy node contents. 
const off_Node &operator=(const off_Node &other);

This operator is carefully defined not to copy those members which must remain as they are (e.g. n_mdep) or should remain the same (e.g. n_nav).

<off_Node::operator= implementation. >= (U->)
// Copy node contents. 
const off_Node &off_Node::operator=(const off_Node &other)
{
   assert(valid() && other.valid());
   *(off_Resource*)this=*(off_Resource*)&other;
   _seq=other._seq;             // Private members for sequencing objects.
   n_xdt=other.n_xdt;
   n_auth=other.n_auth;
   n_comredirect=other.n_comredirect;
   n_coutredirect=other.n_coutredirect;
   n_cout=other.n_cout;
   return *this;
}

<off_Node::melt_state implementation. >= (U->)
// Restores dynamic resource state. 
err_t off_Node::melt_state(char *&buf,size_t &bsize,
                           char *&from, size_t &size,
                           const off_prtl_id_t &melted_domain)
{
  assert(valid());
  assert(buf);
  <Melt dynamic members from buf of length bsize. >
  <Melt system servers at buf (len bsize) and from (size). >
  return EOK;
}

To melt the dynamic node members from their frozen state we use the melt utility function. Note how n_opts is missing (as it was in the copy operator) because such options have already been processed by the frozen node.

<Melt dynamic members from buf of length bsize. >= (<-U)
n_name=::melt(n_name,buf,bsize);
n_owner=::melt(n_owner,buf,bsize);
n_url=::melt(n_url,buf,bsize);
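
As an illustration of what melting a frozen string involves, the sketch below turns an offset into buf back into a freshly allocated string. It does not reproduce the real melt utility: melt_str and rel_ptr are hypothetical names, the offset-in-pointer encoding is an assumption, and allocating with malloc is assumed only because cleanup releases these members with free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Builds a "relative pointer" that carries an offset (assumed encoding).
inline const char *rel_ptr(std::size_t off) {
  return reinterpret_cast<const char *>(static_cast<std::uintptr_t>(off));
}

// Hypothetical melt for C strings: rel carries an offset into buf.
// Returns a malloc'ed copy of the string found there, or nullptr if the
// offset is out of range, the string is unterminated, or malloc fails.
inline char *melt_str(const char *rel, const char *buf, std::size_t bsize) {
  const std::size_t off =
      static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(rel));
  if (off >= bsize)
    return nullptr;
  const void *nul = std::memchr(buf + off, '\0', bsize - off);
  if (!nul)
    return nullptr;
  const std::size_t len = static_cast<const char *>(nul) - (buf + off) + 1;
  char *copy = static_cast<char *>(std::malloc(len));
  if (copy)
    std::memcpy(copy, buf + off, len);
  return copy;
}
```

Checking both the offset and the terminator keeps a corrupted frozen image from causing reads past the end of the buffer.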

If everything looks fine, we proceed to melt the dynamic state of the system servers in the order in which they were frozen. They pick up their frozen dynamic state in turn.

<Melt system servers at buf (len bsize) and from (size). >= (<-U)
// Melt system servers. 
// NB: This code is too silly. When source and target nodes differ
// we should check which local hardware containers match best with
// remote ones. XXX

#if 0               // XXX fix this
n_shtl->melt(buf,bsize,from,size,melted_domain);
n_prtl->melt(buf,bsize,from,size,melted_domain);
n_dmm->melt(buf,bsize,from,size,melted_domain);
for (natural_t i=0; i<n_nprocs; i++)
  n_proc[i].melt(buf,bsize,from,size,melted_domain);
for (natural_t i=0; i<n_ndmas; i++)
  n_dma[i].melt(buf,bsize,from,size,melted_domain);
#endif
for (natural_t i=0; i<n_niobanks; i++) 
  n_iobank[i].melt(buf,bsize,from,size,melted_domain);
for (natural_t i=0; i<n_nmbanks; i++)
  n_mbank[i].melt(buf, bsize,from,size,melted_domain);

We are not checking error codes because the user is trying to melt a whole node anyway. If melting fails the node is likely to become unusable (the user was trying to replace the current one), so we simply hope for the best. Nevertheless, we should check error codes.
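
One possible shape for the missing error checking is to stop at the first failing container and report its code, as the freeze loops do. The sketch below illustrates that idea only; Bank, melt_all, sk_err_t and SK_EOK are hypothetical stand-ins and not part of the Off sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int sk_err_t;             // hypothetical stand-in for err_t
const sk_err_t SK_EOK = 0;        // hypothetical stand-in for EOK

// Hypothetical container whose melt may fail.
struct Bank {
  sk_err_t result;   // error code this bank reports when melted
  bool     melted;   // set once the bank has been melted successfully
  sk_err_t melt() { melted = (result == SK_EOK); return result; }
};

// Melts banks in order and stops at the first failure, returning its code.
inline sk_err_t melt_all(std::vector<Bank> &banks) {
  for (std::size_t i = 0; i < banks.size(); i++) {
    sk_err_t err = banks[i].melt();
    if (err != SK_EOK)
      return err;
  }
  return SK_EOK;
}
```

Stopping early does not undo the containers already melted, so a caller would still have to decide whether a partially melted node can be used at all.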

We still have to answer the question: which nodes can be melted at this one? The answer is given by can_melt.

<Other protected methods of off_Node. >+= (<-U) [<-D]
// Is the other node meltable at this one?
boolean_t can_melt(const off_Node *other) const;

We simply check that there are enough hardware containers to melt the remote node.

<off_Node::can_melt implementation. >= (U->)
// Is the other node meltable at this one?
boolean_t off_Node::can_melt(const off_Node *other) const
{
  assert(valid() && other && other->valid());
  return (n_nmbanks >= other->n_nmbanks &&
          n_niobanks>= other->n_niobanks//&&      XXX fix this
          //n_ndmas>= other->n_ndmas
);

}

Code for freeze and melt is included with the implementation of other Node methods.

<Implementation of other methods of off_Node. >+= (<-U) [<-D->]
<off_Node::freeze_state implementation. >
<off_Node::copy_state implementation. >
<off_Node::cleanup implementation. >
<off_Node::restore_state implementation. >
<off_Node::operator= implementation. >
<off_Node::melt_state implementation. >
<off_Node::can_melt implementation. >

Translating alien resources to native format

Whenever an alien resource is detected and conversion to native format is desired, assimilate is called.

<Other public methods of off_Node. >+= (<-U) [<-D->]
// Tries to assimilate the resource in buf to the specified kind 
// of resource for the given architecture. 
err_t assimilate(const char *kind, const char *arch, char *&buf, size_t &size);

The implementation issues an upcall to a trusted external data translator.

<off_Node::assimilate implementation. >= (U->)
// Tries to assimilate the resource in buf to the specified kind 
// of resource for the given architecture. 
err_t off_Node::assimilate(const char *kind, const char *arch, 
                           char *&buf, size_t &size)
{
  err_t r=EOK;
  (void)kind; (void)arch; (void)buf; (void)size;
#if 0               // XXX fix this
  off_XDTReq m(kindof, buf, size);
  off_XDTRep r(buf,size);
  assert(valid());
  assert(arch && buf && size>0);
  if (nd.get_xdt() == OFF_PRTL_NULL)
    return ENOPRTL;
  if ((r=prtl.kpct(nd.get_xdt(),off_Shtl::self(),
                   sizeof(m),sizeof(r),&m,&r,0)))
    return r;
  else {
    buf=r.m_kbuf;
    size=r.m_len;
    return r.m_err;
  }
#endif
  return r;
}

The data translator invoked must be trusted and it is a good idea to let it write buf so we avoid unnecessary copying.

Messages exchanged by the kernel and the translator are as follows. The kernel sends the expected object kind as well as the user buffer.

<Off user-kernel messages. >= [D->]
// XDT request
struct off_XDTReq :public off_MsgReq {
  off_Magic m_kind;             // kind of object needed.
  char     *m_kbuf;             // Address in kernel for the user buffer.
  size_t    m_size;             // Size of the buffer.

  //Creates an XDT request.
  off_XDTReq(const off_Magic &kind, char *buf, size_t len) :
    off_MsgReq(OFF_EX_XDT),
    m_kind(kind), m_kbuf(buf), m_size(len) {;}
};

Defines off_XDTReq (links are to index).

<Off user-kernel messages dependencies. >= [D->]
#include <klib/Magic.h>         // for off_Magic

The user is expected to return the (kernel) address of a buffer with the data already translated.

<Off user-kernel messages. >+= [<-D->]
struct off_XDTRep : public off_MsgRep {
  err_t   m_err;                // Error code.
  char   *m_kbuf;               // Buffer with translation. 
  size_t  m_len;                // Translated data length.
  off_XDTRep(void){;}
  off_XDTRep(const err_t err, char *buf, size_t len) :
    m_err(err), m_kbuf(buf), m_len(len) {;}
};

Defines off_XDTRep (links are to index).

These message types are defined in prtl/ex.h.
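
The request and reply exchange can be sketched with plain function calls. This is an illustration under stated assumptions: XdtReq, XdtRep, translate and assimilate_sketch are hypothetical, simplified names, the translator is an ordinary function instead of a portal upcall, and the byte swapping is an arbitrary example of a translation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified versions of the request and reply messages.
struct XdtReq { int kind; char *kbuf; std::size_t size; };
struct XdtRep { int err;  char *kbuf; std::size_t len;  };

// Hypothetical translator: swaps byte order in place, 4 bytes at a time,
// and reports how many bytes it translated. Writing into the request
// buffer avoids an extra copy.
inline XdtRep translate(const XdtReq &req) {
  XdtRep rep = { 0, req.kbuf, req.size - req.size % 4 };
  for (std::size_t i = 0; i + 4 <= req.size; i += 4) {
    char *w = req.kbuf + i;
    char t = w[0]; w[0] = w[3]; w[3] = t;
    t = w[1]; w[1] = w[2]; w[2] = t;
  }
  return rep;
}

// Kernel side: issue the request, then adopt the buffer named in the reply.
inline int assimilate_sketch(int kind, char *&buf, std::size_t &size) {
  XdtReq req = { kind, buf, size };
  XdtRep rep = translate(req);
  if (rep.err == 0) {
    buf = rep.kbuf;
    size = rep.len;
  }
  return rep.err;
}
```

Since the caller adopts whatever buffer address the reply names, the translator has to be trusted.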

<Off node implementation dependencies. >+= (<-U) [<-D]
#include <prtl/ex.h>            // for XDT{Req|Rep} 

<Implementation of other methods of off_Node. >+= (<-U) [<-D->]
<off_Node::assimilate implementation. >

Dumping node contents

To dump some summary information about node contents, this operator is provided.

<Other public methods of off_Node. >+= (<-U) [<-D->]
friend OStr &operator<<(OStr &s,  off_Node &n);

<off_Node::operator<< implementation. >= (U->)
// Dumps node summary information.
OStr &operator<<(OStr &s,  off_Node &n)
{
  assert(n.valid());
  s << (off_Resource)n << nl;
  s << " name=\"" <<n.n_name <<"\"" << nl;
  s << " url=\"" <<n.n_url <<"\"" << nl;
  s << "auth=" << n.n_auth<< " xdt=" <<n.n_xdt <<" cout=" <<n.n_cout << nl;
  s << n.get_arch() << " " << 
    n.n_niobanks << " iobanks," << // n.n_ndmas << " dmas," <<  XXX fix this
    n.n_nmbanks << " mbanks." << nl;
  return s;
}

Its implementation is kept along with remaining node code.
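
For readers without the kernel stream classes at hand, the same dumping idiom can be tried with the standard library. The sketch below is only an analogue: NodeInfo is a hypothetical reduced node, and std::ostream replaces OStr.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, reduced node summary.
struct NodeInfo {
  std::string name;
  std::string url;
  unsigned    niobanks;
  unsigned    nmbanks;
};

// Dumps a summary, quoting the string members.
inline std::ostream &operator<<(std::ostream &s, const NodeInfo &n) {
  s << " name=\"" << n.name << "\"\n";
  s << " url=\"" << n.url << "\"\n";
  s << n.niobanks << " iobanks, " << n.nmbanks << " mbanks.\n";
  return s;
}
```

Taking the node by const reference, as done here, lets the operator be used on constant objects as well.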

<Implementation of other methods of off_Node. >+= (<-U) [<-D->]
<off_Node::operator<< implementation. >


Francisco J. Ballesteros
1998-05-25