A userspace code approach to mutual exclusion: Peterson's Algorithm

 

Peterson's algorithm is one method for achieving mutual exclusion of two (or more) pieces of software from a portion of code. The code typically accesses some data into which error would be introduced if both pieces of software were working on it simultaneously. The "pieces of software" could be separate processes (programs), or separate threads within a process. For illustrating mutual exclusion threads are usually used because they have access to a common data space or memory. Processes, on the other hand, get their own separate memory. One thread can access another one's memory space (because it is the same single memory space). By contrast, one process cannot access another one's memory space (because they are two different memory spaces).

Here we use a program that has two threads that access a common variable that starts with value zero. Both threads perform incrementation on the variable, many times. If the threads are simultaneously incrementing, the number of times they do so between them will not be reflected in the incremented variable, as it should be. The incrementation action will have been performed a certain number of times but the value of the variable will be less than that. If on the other hand the threads are each prevented from trying to increment while the other is doing so, the value of the variable in the end will be the number of times it was incremented.

The portion of the threads' code that does the incrementing, just a simple loop that conducts variable=variable+1, is its "critical portion." That's the portion that must be serialized between the participating threads. The reason is that "variable=variable+1" breaks down into a series of steps at the machine language level, the two threads' steps are interleaved, there are different orders in which the interleaving could take place, and the results of them are not all the same. The steps for "variable=variable+1" are three:

 - load the variable's value into a cpu register
 - increment that register
 - store the register's value into the variable.

The computer doesn't do "variable=variable+1" but rather it does "load/increment/store". It is a convenience to say that we are incrementing the variable. But all computation in a computer takes place inside its cpu. So to be precise variables are never incremented, registers are. The "pro" of thinking in terms of "variable=variable+1" is that it's convenient and comfortably familiar; the "con" is that it is a deception. The compiler let's us talk about it our way, and does all the dirty work for us to make it the computer's way.

The variable and the register are two physically/electronically different places. The one is in a memory module, the other is in a central processing unit. If a process wants to increment the variable these three steps will be performed. If two processes want to increment the variable six steps will be performed, the three on behalf of the first process and the three also on behalf of the second. Standard operating system process management interleaves instructions from different processes in a way that the processes don't control. So here are two different interleave orderings of the six steps, which do not both produce the same result. The expected result is, for having been incremented twice, the variable's ending value will be two more than its original value. Though there are two processes, there is only one variable and one register in the picture. That variable and register experience six instructions, unaware whose they may be.

 

I have a program that serializes the threads' access to their critical sections by using Peterson's algorithm. It "works." That is, each thread increments a zero-value variable a billion times and in the end the variable's value is two billion. It has this structure:

// declare variables (they are global/common to all threads) 

int main() 
{ 
    // initialize variables
    // create two threads, launching their functions; wait for them to end 
    // result: print recorded number of incrementations (actual is 2000000)
} 


// ---FUNCTIONS USED BY THREAD 0 ------------------------------------------- 

// What thread 0 does 
function_for_0() 
{ 
    lock_for_0(); 
    for (i=0; i<1000000; i++) count++; 
    unlock_for_0(); 
} 

lock_for_0() 
{ 
    // monitor whether thread 1 is active in critical section; if so wait here until it no longer is 
} 

unlock_for_0() 
{ 
    // express some signal that says "thread 0 is longer active"
} 


// ---FUNCTIONS USED BY THREAD 1 ------------------------------------------- 

// What thread 1 does 
function_for_1() 
{ 
    lock_for_1(); 
    for (i=0; i<1000000; i++) count++; 
    unlock_for_1(); 
} 

lock_for_1() 
{ 
    // monitor whether thread 0 is active in critical section; if so wait here until it no longer is 
} 

unlock_for_1() 
{ 
    // express some signal that says "thread 1 is longer active"
} 

When I run my program, each thread increments the "count" variable a billion times and at the end, when it is printed, is shows two billion. I am giving you a copy of my program gutted of the code in both threads' locking functions, and both of their unlocking functions. These do-nothing functions remain, and are called before and after the critical section, but do no actual locking nor unlocking. The critical sections are not being protected. When you run the program, though each thread will increment the "count" variable a billion times, that variable's value at the end will reflect  less.

Your job in this assignment is to restore functionality to the locking and unlocking functions so that when you run it your program will report 2000000000 at the end.


The exercise for you to perform

Operate as user student or root. Obtain petersons4.c, per your instructor. It is the program whose locking functions lack the needed locking code.

 

Performing this experiment on sputnik

If you are performing this experiment on the sputnik remote server, here are suggestions for obtaining and editing petersons4.c

Obtaining

A copy of petersons4.c exists in /home/public. Log in to your home directory and copy the file into your home directory:

cp  /home/public/petersons4.c  .     [ trailing dot is necessary ]

Editing

Use vi if you are comfortable. Otherwise, transfer petersons4.c to your computer and edit it there, with whatever editor you like. Perform the transfer using whatever file transfer command or program you like, as you have been previously instructed. After you have finished editing, transfer the edited file back to sputnik. 

After transfer it's best to "sanitize" the file on sputnik by running:

dos2unix petersons4.c   if you edited the file in a Microsoft environment

or

mac2unix petersons4.c   if you edited the file in an Apple environment

(The reason for this is to accommodate the line termination convention differences among the platforms.)

 

Compile petersons4.c:

gcc  -o  petersons4  petersons4.c  -lpthread

Run it:

./petersons4

Observe that the number of incrementations that are reported is less than the 2 billion that actually occurred.

Modify the program to restore it to desired functionality. Here are the blank lines from which I erased the code that made locking work:

thread 0 thread 1
locking function: line 57
line 60
line 63
line 96
line 99
line 102
unlocking function: line 70 line 109

Put code into these blank lines. Be guided by the description of the algorithm in appendix A.1 of the Stallings textbook, and the wikipedia article or other sources about it. The changes are very minimal and confined to those individual lines of code shown in the above table. When finished editing it, compile and run it again:

gcc  -o  petersons4  petersons4.c
./petersons4

You are successful when you can compile and run your program and it gives 2000000000 as "incrementations reported/recorded" at the end.

Upload your finished source code program "petersons4.c" to your assignments directory on the remote server.