Implementation Models


The performance of a multithreaded program is strongly dependent on the underlying implementation of the threads package. Though for many applications the programmer need not be concerned with the implementation strategy of the threads package, for some applications, particularly compute-intensive applications on multiprocessors, the programmer must take into account various crucial aspects of the threads package. In this section we summarize the commonly used implementation strategies.

Many-to-one Model

For kernels that do not support multiple threads of control, multithreading can be implemented entirely as a user-level library. These libraries, without the kernel's knowledge, schedule multiple threads of control onto the process's single kernel thread. Thus, just as a uniprocessor provides the illusion of parallelism by multiplexing multiple processes on a single CPU, user-level threads packages provide the illusion of parallelism by multiplexing multiple user threads on a single kernel thread; this is referred to as the many-to-one model [Kleiman et al. 1996]. There are several advantages to this model:

However, the many-to-one model does not come without a price. Specifically:

Despite substantial disadvantages, the relative ease of implementation of many-to-one threads packages has made it the most popular model to date. For example, the current implementations of Netscape and Java achieve their multithreading strictly through user-level threads packages.

One-to-One Model

An obvious alternative to the many-to-one model is that every user thread have its own kernel thread (i.e., that there be a one-to-one correspondence between user threads and kernel threads). This provides several advantages:

While the one-to-one model can yield a major performance win, it too is not without its costs. Most of the benefits of the many-to-one model do not carry over to the one-to-one model:

Many-to-Many Model

In an attempt to combine these two models, some operating systems, notably Mach 3.0 [Golub et al. 1990], SVR4/MP, Solaris 2.x[Powell et al. 1991], and Digital UNIX 4.0, give the programmer both user-level threads and kernel threads to the programmer. User-level threads are multiplexed on top of kernel-level threads, which in turn are scheduled on top of processors. The kernel knows only about the kernel-level threads; it does not know of the multiplexing performed by the user-level scheduler. Due to the many-to-many relationship between user threads and kernel threads, this is called the many-to-many model [Kleiman et al. 1996] (it is also referred to as the two-level model [Catanzaro 1994], the split model [Vahalia 1996] and the LWP model). By taking a hybrid approach, this model aims to combine the advantages of both the many-to-one model and the one-to-one model, while minimizing these models' disadvantages.

The major advantage of the many-to-many model is that large numbers of threads can be supported relatively cheaply. As with the many-to-one model, the creation of a user thread does not necessarily require the (relatively expensive) creation of a kernel thread. Thus one can create a large number of user threads, but have the overhead of creating only a small number of kernel threads. Synchronization can also be inexpensive: the implementation of synchronization primitives involves primarily user-level code. A user thread that must block on a synchronization primitive (such as a mutex) is queued on a wait queue and the underlying kernel thread finds and runs another user thread on the user-level run queue. Only if no runnable user thread is available does the kernel thread make a system call and block (or park) in the kernel. Thus the cost of a context switch from one thread to another can be no worse than the cost of a few subroutine calls--a system call is often not necessary.

User threads in the many-to-many model normally ``float'' among kernel threads--they may run on whatever kernel thread is available when they become runnable. However, in some cases it may be necessary to associate a user thread permanently with a kernel thread, i.e., to bind the user thread to the kernel thread. Such bound threads behave as threads do in the one-to-one model--their creation requires the creation of a kernel thread, and synchronization operations requires system calls (to park the kernel thread in the kernel when the user thread is blocked and to unpark it when the user thread is released).

Scheduler Activations

The many-to-many model employs two schedulers: one in the kernel and one in the user threads library. It is not immediately obvious how the kernel scheduler can cooperate with the user scheduler. For example, say the user scheduler has a high-priority thread to schedule, so it preempts the execution of a lower-priority thread, reassigning its kernel thread to the high-priority user thread. But, at the same time, the kernel scheduler decides that the kernel thread's time slice has expired and reassigns the processor to another kernel thread, perhaps one that has been assigned by our user-level scheduler to a lower-priority thread. Thus the thread deemed the most important by the user-level scheduler is denied immediate use of the processor by the kernel scheduler in favor of a less important thread.

Another problem is the number of kernel threads. How many kernel threads should be created to support a particular process? If there are too few, then the available concurrency will not be realized--user threads that are ready to run will stand idle, even though there may also be idle processors. If there are too many, then the kernel may needlessly be multiplexing a number of kernel threads on a smaller number of processors, wasting time doing the context switching, even though the application has no need for such time slicing.

One might be tempted to give a process just as many kernel threads as there are processors. But if a user thread executes a blocking system call (such as reading from an I/O device) or suffers a page fault, then its underlying kernel thread also blocks--another user thread may be ready to execute, but no kernel thread is available to be assigned to it.

An elegant solution to both problems, not yet appearing in a commercial system, is scheduler activations, an approach devised at the University of Washington [Anderson et al. 1992]. This variant of the many-to-many model provides an explicit means for the user-level and kernel schedulers to cooperate. The kernel assigns processors to processes and the user-level scheduler assigns these processors to user threads. The user-level scheduler keeps the kernel apprised of how many processors it needs; the kernel scheduler notifies the user-level scheduler of all processor-related events that affect the user process, such as when processors become available to it or are taken away from it. This model appears to solve many of the problems of the many-to-many family of models; its greatest drawback is perhaps the frequent crossings of the user-kernel boundary.

[ Top | Introduction | Solaris | Threadmon | References ]


The text of this web document was taken from a paper by Bryan M. Cantrill and Thomas W. Doeppner Jr., Department of Computer Science, Brown University, Providence, RI 02912-1910
Greg Foxman (gmf@cs.brown.edu)