A Not-So-Correct Multithreaded System

On this page, we shall use what we have learned so far to implement a multithreaded system. It is not a correct one, because we did not discuss how to create an individual stack for each running thread. However, if we assume that all "threads" use global variables only, we shall have a sample program that is good enough for illustrating many important concepts in building a user-level kernel that supports multithreaded programming. Again, check our mini-project page for such an implementation. Click here to download a copy of this program.

Data Structures

Each thread of the system is described and controlled by a PCB (i.e., process control block). This data structure is shown below:

typedef struct PCB_NODE  *PCB_ptr; /* pointer to a PCB     */

typedef struct PCB_NODE {          /* a PCB:               */
     jmp_buf   Environment;        /*   jump buffer        */
     int       Name;               /*   thread name: unused*/
     PCB_ptr   Next;               /*   next PCB           */
}  PCB;

It consists of three members: Environment, Name and Next. Environment is a jump buffer for saving the execution environment of the thread before it is switched out; Name is the thread identifier; and Next is a linked-list pointer to the next PCB.

In addition, there are two more jump buffers MAIN and SCHEDULER. The former is used to save the main program's environment, and the latter is for the scheduler. Since the main program and the scheduler are not scheduled by the scheduler, they are not in the PCB list.

There are two pointers: Head pointing to the head of the PCB list, and Current to the PCB of the running thread. The following diagram gives a complete picture of this data structure:

The Scheduler

The scheduler is quite simple and is shown below:

void  Scheduler(void)
{
     if (setjmp(SCHEDULER) == 0)
          longjmp(MAIN, 1);
     Current = Current->Next;
     longjmp(Current->Environment, 1);         
}

Initially, the scheduler Scheduler() is called by the main program to set an entry point in jump buffer SCHEDULER and jump back to the main via jump buffer MAIN that was setup before the call to Scheduler(). After this, to enter the scheduler, we use a long jump to SCHEDULER rather than via function calls. If the scheduler is re-entered through a long jump to SCHEDULER, it picks up the next PCB and long-jumps to the environment stored there. This will resume the execution of that thread.

How do we get into the scheduler? THREAD_YIELD()

This is a critical question. We certainly can use an alarm clock or an interval timer to periodically generate timer signals; but, this will make the program very complex. Therefore, we choose to take the non-preemptive approach. More precisely, the running thread must relinquish the control of CPU voluntarily using function THREAD_YIELD(). In fact, except for the scheduler Scheduler(), all thread related functions are implemented as C macros. So, macro THREAD_YIELD() is implemented as follows:

#define   THREAD_YIELD(name)    {                           \
                    if (setjmp(Current->Environment) == 0)  \
                    longjmp(SCHEDULER, 1);                  \
               }

When THREAD_YIELD() is called to indicate that the caller is willing to release the CPU, the environment is saved to the Environment member of this thread's PCB and then longjmp() is used to jump to the scheduler. As mentioned in the previous section, the scheduler selects and runs the next thread.

Creating and Initializing a Thread: THREAD_CREATE() and THREAD_INIT()

Under this system, and like any other systems, threads are created by running functions. The following is a function to be run as a thread:

void  funct_1(int  name)
{
     int  i;

     THREAD_INIT(name);            /* initialize as thread */
     while (1) {                   /* running the thread   */
          for (i = 1; i <= MAX_COUNT; i++)
               printf("funct_1() executing\n");
          THREAD_YIELD(name);      /* yield control        */
     }
}

The first statement of a thread function must be the call to THREAD_INIT(). In an actual implementation, this call can be incorporated into the thread creation function, and is not necessary. In order to simplify our system, we make this call explicit. This function has an infinite loop and a call to THREAD_YIELD() at the end of each iteration to relinquish the CPU.

So, how to initialize a thread? The following is the macro THREAD_INIT():

#define   THREAD_INIT(name)     {                           \
                    work = (PCB_ptr) malloc(sizeof(PCB));   \
                    work->Name = name;                      \
                    if (Head == NULL)                       \
                         Head = work;                       \
                    else                                    \
                         Current->Next = work;              \
                    work->Next = Head;                      \
                    Current = work;                         \
                    if (setjmp(work->Environment) == 0)     \
                         longjmp(MAIN, 1);                  \
               }

It is a simple procedure. First, a PCB is allocated and inserted into the PCB list. Second, we make this newly created thread the current one (i.e., Current = work). Note that after all threads are created, Current points to the first of the PCB list. Third, the current execution environment is saved to the Environment member of this PCB. Now, the PCB is all set; but why returns to the main program? The reason is quite artificial and will be clear after the discussion of THREAD_CREATE(). Anyway, sometime later, when the scheduler selects this thread to run and long-jumps to the saved environment, this macro finishes its job and the execution enters the next statement following this macro call.

The following is THREAD_CREATE(). It takes two arguments, a function name to be run as a thread and a thread ID which is an integer. This function sets a return point in jump buffer MAIN and calls the thread function. This causes that threads can only be created by the main program. Once the thread function is called, it will execute THREAD_INIT() and return to the main program after initialization. Hence, the setjmp() in THREAD_CREATE() is simply for setting up a return point for THREAD_INIT() to return.

#define   THREAD_CREATE(function, name) {                   \
                    if (setjmp(MAIN) == 0)                  \
                         (function)(name);                  \
               }

Finally, the main program

The main program is shown below. It creates five threads, each of which runs a function. Because all thread creations return the control back to the main program, we are sure that after all threads are created, no thread is run. Therefore, the main program sets up MAIN and calls the scheduler. Note that the scheduler will long-jump back to the main program via jump buffer MAIN. Right after returning from the scheduler, the main program long-jumps to the scheduler to start running the threads.

void  main(void)
{
     Head = Current = NULL;        /* initialize pointers  */

     THREAD_CREATE(funct_1, 1);    /* initialize threads   */
     THREAD_CREATE(funct_2, 2);
     THREAD_CREATE(funct_3, 3);
     THREAD_CREATE(funct_4, 4);

     if (setjmp(MAIN) == 0)        /* initialize scheduler */   
          Scheduler();
     longjmp(SCHEDULER,1);         /* start scheduler      */
}

As you can see from the above code segments, we normally do the following:

Set up a jump buffer and call a function.
In the called function, it also sets up a jump buffer and uses a longjmp() to return to the caller.
Once the control is back to the caller, the caller uses another longjmp() to re-enter the called program.

Thus, the function call and the longjmp() return provide a chance for initialization. The actual job is carried out via the longjmp() from the caller to the called function. Compare this with the Ping-Pong program discussed on the previous page.

Discussion

Note that there is no intention to implement a perfect program on this page because is it too complex to be done here. The first, and the most important, problem of this program is the use of local variables in the five thread functions. As discussed earlier, this will not produce correct results. However, this program appears correct on many systems. Why? You are somewhat cheated, anyway. In each of the five thread functions, the local variable i is used until the call to THREAD_YIELD(). After returning from THREAD_YIELD(), this variable is reinitialized and reused, and, as a result, no conflict occurs.

Then, how about the content stored in the Environment member of this thread's PCB? Is its content correct. Generally speaking, its content is not correct. However, in terms of the next instruction to be executed, it is correct because this information (i.e., the address of the next instruction) is not stored in an activation record.

Hmmm, there is no THREAD_EXIT()! It is not a difficult for you to add THREAD_EXIT() and other thread related functions as long as they do not use local variables.