Blocking - Voluntary Suspension

We now turn our attention to blocking, which is the mechanism that threads use to give up processor time voluntarily to wait for an event to occur or a resource to become available.

The term voluntary is chosen from the perspective of the scheduler and not necessarily from the application's perspective. In this context voluntary suspension refers to an action taken by a thread to give up its time-slice. This includes direct actions, such as waiting on semaphores, as well as calls to APIs that, for internal reasons, need to wait for a resource or an event.

PROCBLOCK and its counterpart PROCRUN are the two kernel routines at the heart of the block/run mechanism. These are callable directly by kernel components and also by Device Drivers and File System Drivers through a small interface layer. Application code only gets to call PROCBLOCK and PROCRUN indirectly through system APIs and in particular through the semaphore APIs.

The block/run mechanism is designed with the following criteria:

A thread should be able to block without the waking thread having to know whether any thread, or which thread, has blocked on a resource.

Multiple threads should be able to wake when an event or resource becomes available.

This is achieved by having an abstract token, known as the BlockId, associated with the resource or event. The BlockId is passed to PROCBLOCK when a thread blocks. Similarly, when another thread wishes to wake threads waiting for a resource or event, the BlockId that represents the resource or event is passed to PROCRUN.

In addition to the BlockId, callers of PROCRUN pass a flag that indicates whether all threads waiting on the BlockId should wake, or just the highest priority one.
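The block/run contract described above can be illustrated with a small user-level model. This is only a sketch of the idea, not the kernel's implementation: the class and method names, the `Thread` record, and the convention that a larger number means a higher priority are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thread:
    name: str
    priority: int                      # assumption: larger value = higher priority
    blocked_on: Optional[int] = None   # BlockId the thread is waiting on, if any


class BlockRunModel:
    """Toy model of the block/run mechanism, keyed by BlockId."""

    def __init__(self):
        # BlockId -> list of threads currently blocked on it
        self.waiters = defaultdict(list)

    def proc_block(self, thread, block_id):
        # The blocking thread only names the BlockId; it records no owner.
        thread.blocked_on = block_id
        self.waiters[block_id].append(thread)

    def proc_run(self, block_id, wake_all):
        # The waker does not need to know whether anyone is waiting.
        blocked = self.waiters.get(block_id, [])
        if not blocked:
            return []
        if wake_all:
            woken = list(blocked)
        else:
            woken = [max(blocked, key=lambda t: t.priority)]
        for thread in woken:
            blocked.remove(thread)
            thread.blocked_on = None
        return woken


# Hypothetical BlockId standing in for the address of a control block.
RESOURCE = 0xFE001234

model = BlockRunModel()
low = Thread("low", priority=10)
high = Thread("high", priority=20)
model.proc_block(low, RESOURCE)
model.proc_block(high, RESOURCE)

first = model.proc_run(RESOURCE, wake_all=False)   # wakes only the highest priority waiter
rest = model.proc_run(RESOURCE, wake_all=True)     # wakes everyone still waiting
```

Note that the model keeps no record of who owns the resource behind a BlockId, which is exactly the accountability gap discussed in the constraints that follow.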

This mechanism has shortcomings unless certain constraints are applied:

BlockIds need to be subject to a convention that guarantees uniqueness; otherwise threads may block and run incorrectly. A solution is to use the address of a control block or memory object that relates uniquely to the resource or event.

If addresses are to be used for BlockIds then they must point to global data for reasons of uniqueness. Furthermore, if they are to be referenced by disabled code then the storage needs to be in resident memory. This more or less implies that addresses must be taken from within the System Arena.

If BlockIds are in use that do not represent addresses then they must not conflict with any potential addresses used as BlockIds.

Even if addresses are used there is no accounting information that says who owns the related resource.

A workable scheme is implemented by limiting the direct use of PROCBLOCK and PROCRUN to system code, device drivers and file system drivers, all of which have access to the System Arena.

Apart from three special conventions, the system and most device drivers use addresses as BlockIds. The three system-defined conventional BlockIds, together with the two address forms, are:

fffe....

Results from a RAMSEM wait.

fffd....

Results from a MUXWAIT.

ffca....

Results from a Child Wait.

x....... (x=a - f)

Linear address of the memory object or control block that relates to the resource.

........

Probably selector:offset address of the memory object or control block that relates to the resource.
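The list above can be read as a simple decision procedure. The following sketch assumes that a BlockId is a 32-bit value and that each dot in the patterns stands for one hexadecimal digit; the function name and the returned strings are hypothetical. The conventional prefixes are tested first because they also begin with the digit f and would otherwise be taken for linear addresses.

```python
def classify_block_id(block_id: int) -> str:
    """Classify a BlockId by its leading hex digits (32-bit value assumed)."""
    block_id &= 0xFFFFFFFF
    high_word = block_id >> 16      # top four hex digits
    top_digit = block_id >> 28      # top hex digit

    # Conventional BlockIds must be checked before the address ranges.
    if high_word == 0xFFFE:
        return "RAMSEM wait"
    if high_word == 0xFFFD:
        return "MUXWAIT"
    if high_word == 0xFFCA:
        return "Child Wait"
    if top_digit >= 0xA:
        return "linear address of a memory object or control block"
    return "probably a selector:offset address"


# Hypothetical sample values, chosen only to exercise each branch.
samples = [0xFFFE0042, 0xFFFD0001, 0xFFCA0007, 0xFE12A4C0, 0x04000220]
results = [classify_block_id(value) for value in samples]
```

The classification only narrows the search. For the address forms, the owner of the memory still has to be established separately.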

This scheme could be subverted by device drivers, but in general they will choose to block on addresses of resources they own, which are usually allocated out of the System Arena and addressed using a GDT selector:offset.

Accountability remains an exposure. For BlockIds that are addresses, the owner of the memory that the BlockId points to gives a big clue. For conventional BlockIds we have to do more work. These are discussed in detail later. We will first look at an example of a BlockId that is an address.

Basic Technique:

The technique for analysing blocked threads is two-pronged:

Examine the wait from the application perspective by inspecting the current user registers and trying to identify the API issued. This is usually relatively easy but often gives no clue as to the underlying wait, since any single API may block on many occasions for many reasons.
Examine the wait from the internal, or kernel, perspective to determine what an API might be waiting for. This process starts with finding the associated BlockId.

When a thread blocks, its BlockId is stored in the TCBSleepId field of the TCB. Conveniently, this field is formatted by the .PB command of the Kernel Debugger (KDB) and the Dump Formatter (DF).

Note:

.PB under DF lists non-blocked threads. BlockIds are irrelevant for such threads.

.PB also attempts to interpret the BlockId. Full details of this interpretation are given in the Kernel Debugger and Dump Formatter Command Reference. In addition to classifying the BlockId, .PB examines TCB_SemInfo and TCB_SemDebugAddr.

For many semaphore-originated BlockIds, TCB_SemInfo is used to store the address or handle of the user's semaphore that led to the thread blocking. .PB will attempt to locate a near symbol to the semaphore address and display it.

Under the Kernel Debugger, TCB_SemDebugAddr is used to store the address of the caller of the semaphore API when the thread blocked. If this field is not 0xffffffff, .PB attempts to locate a near symbol to the caller and display it.

Once we have the BlockId, TCB_SemInfo, and TCB_SemDebugAddr, we are able to begin searching for information associated with the reason for blocking.
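As a rough illustration of how these three fields combine, the sketch below gathers them into a short list of clues. The `TcbSnapshot` record and the `blocking_clues` function are hypothetical helpers for the example and are not part of any debugger. Only the field names and the 0xffffffff sentinel come from the text above.

```python
from dataclasses import dataclass

NO_DEBUG_ADDR = 0xFFFFFFFF   # sentinel: no semaphore API caller recorded


@dataclass
class TcbSnapshot:
    sleep_id: int         # value of TCBSleepId (the BlockId)
    sem_info: int         # value of TCB_SemInfo
    sem_debug_addr: int   # value of TCB_SemDebugAddr


def blocking_clues(tcb: TcbSnapshot) -> list:
    """Collect the starting points for investigating a blocked thread."""
    clues = [f"BlockId {tcb.sleep_id:08x}"]
    # Meaningful for many, though not all, semaphore-originated BlockIds.
    clues.append(f"TCB_SemInfo {tcb.sem_info:08x} (possible semaphore address or handle)")
    # Only worth a symbol lookup when a caller address was recorded.
    if tcb.sem_debug_addr != NO_DEBUG_ADDR:
        clues.append(f"semaphore API caller near {tcb.sem_debug_addr:08x}")
    return clues


# Hypothetical field values, for illustration only.
with_caller = blocking_clues(TcbSnapshot(0xFFFE0042, 0x00150010, 0x0001A2B4))
without_caller = blocking_clues(TcbSnapshot(0xFE12A4C0, 0x00000000, NO_DEBUG_ADDR))
```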

The next step is to decide whether the BlockId is one of the three special categories or to be treated as an address.
