Parallel Programming: Msynch 

pseudo code

class Msynch  {
     int replies;
     int currentState = 1;
        synchronized void acquire ( ) {
 // Called by thread wanting access to a critical section
              while (currentState != 1) {wait ( );}
              replies = 0; currentState = 2;
              //         
              // (Here, 5 messages are sent)         
              //         
        while (replies < 5) {wait ( );} // Await 5 replies         
              currentState = 3;    
        }   
        synchronized void replyReceived ( ) {
        // Called by communication thread when reply is received         
              replies++;         
              notifyAll ( );
        }    
        synchronized void release ( ) {
        // Called by a thread releasing the critical section    
              currentState = 1;        
              notifyAll ( );   
        }
}

 

class Msynch1  {    
        int replies;    
        int currentState = 1;      
        synchronized void acquire ( ) {
       // Called by thread wanting access to a critical section         
              while (currentState != 1) {yield ( );}         
              replies = 0; currentState = 2;         
              //         
              // (Here, 5 messages are sent)         
              //         
              if (replies < 5) {wait ( );} // Await 5 replies         
              currentState = 3;    
        }     
        synchronized void replyReceived ( ) {
        // Called by communication thread when reply is received
              replies++;
        }       
        synchronized void release ( ) {
        // Called by a thread releasing the critical section    
              currentState = 1;        
              notifyAll ( );   
        } 
}

From the two sets of code above, I have highlighted three differences (three synchronization related errors in Msynch1) in three different colors, as identified by a line-by-line code comparison.  The reason why

  1. notifyAll ( ); is missing, therefore this code won’t work because without this line of code we will not be able to unblock any other thread. Thus, this error will not allow for replyReceived ( ) to be synchronized. This missing part of the code should activate all threads in the wait set to allow them to compete with each other for the lock based on priorities.
  2. {yield ( );} won’t work is because it won’t block until queue not full like in {wait ( );}. Thus, the wait( ) function, which aids in releasing the threads lock is needed. When a thread calls wait( ) it will unlock the object.  After returning from wait( ) call, it will re-lock the object.
  3. if won’t work is because the wait( ) call should be in a “wait loop”: while (condition {wait( ); } as shown in Msynch. Without it, the thread cannot retest the condition after it returns from the wait ( ) call. With the if-statement, the condition is only tested once, unlike with the while-statement.

An additional fourth error was identified after reviewing the notes from class in Java threads shortened_1.

  1. Best practices are not followed in either Msynch and Msynch1 where the wait loop must actually reside in a try block, as follows: while condition) try {wait( );)}.

When the first thread (appThread) in Msynch calls acquire( ) the first time, it currentState = 1, so it enters into the wait loop.  Thus, its replies are initialized at zero and currentState =1.  This thread sends out 5 messages to other threads (calling on replyReceived( )).  As long as replies are less than five it stays locked and the currentState remains equal to two.  Once it has received 5 replies from any five threads, the code will unlock and it increments the currentState by one, so now it is equaled to three.

As the initial thread  (appThread) running in aquire( ) calls out other threads (commThreads) for at least 5 messages, these other threads do so by calling replyRecieved( ).  This code increments the number of replies by one each time a thread calls it and unlocks replies so it can increment it by one and then locks it so that another thread calling replyReceived( ) can increment it by one.  Thus, once five threads, any five threads can successfully run replyReceived( ), then we can increment currentState =3 as the lock on currentState is removed.

Words to define

Semaphores: In a coded programmed, where there exists a code between a lock and unlock instruction, it becomes known as a critical section.  That critical section can only be accessed by one thread at a time.  If a thread sees a semaphore that is open, the thread will close it in one uninterruptible and automatic operation, and if that was successful the thread can confidently proceed into the critical section.  When the thread completes its task in the critical section, it will reopen the semaphore. This, changes of state in the semaphore are important, because if a thread sees that the semaphore is closed that thread stalls.  Thus, this ensures that only one thread at a time can work on the critical section of the code (Oracle, 2015).

Test-and-set (TSET/TAND/TSL): It is a special set of hardware instructions, which semaphores operate in order to have both read access and conditional write access by testing a bit of the code (Sanden, 2011).  This will eventually allow a thread to eventually work on the critical section of the code.   Essentially, a semaphore is open if the bit is 1 and closed if the bit is 0 (Oracle, 2015).  If the bit is 1, the test set of instructions will attempt to close the semaphore by changing and setting the bit to 0.  This TSET is conducted automatically and cannot be interrupted.

Preemption: Multi-threading can be set with or without each thread having an assigned priority preemptively.  If a priority is set amongst the threads, a thread can be suspended (forestalled) by a higher-priority thread at any one time.  This becomes important when the ratio of thread to computer cores is high (like 10 threads on a uniprocessor).  Preemption becomes obsolete when the number of threads is less than the number of cores provided for the code to run (Sanden, 2011).

References

Parallel Programming: Practical examples of a thread

Here is a simple problem: A boy and a girl toss a ball back and forth to each other. Assume that the boy is one thread (node) and the girl is another thread, and b is data.

Boy = m

Girl = f

Ball = b

  • m has b
    1. m throws b –> f catches b
  • f has b
    1. f throws b –> m catches b

Assuming we could drop the ball, and holding everything else constant.

  • m has b
    1. m throws b –> f catches b
    2. m throws b –> f drops b
      1. f picks up the dropped b
  • f has b
    1. f throws b –> m catches b
    2. f throws b –> m drops b
      1. m picks up the dropped b

 

Suppose you add a third player.

Boy = m

Girl = f

Ball = b

3rd player = x

  • m has b
    1. m throws b –> f catches b
    2. m throws b –> x catches b
  • f has b
    1. f throws b –> m catches b
    2. f throws b –> x catches b
  • x has b
    1. x throws b –> m catches b
    2. x throws b –> f catches b

Assuming we could drop the ball, and holding everything else constant.

  • m has b
    1. m throws b –> f catches b
    2. m throws b –> f drops b
      1. f picks up the dropped b
    3. m throws b –> x catches b
    4. m throws b –> x drops b
      1. x picks up the drooped b
  • f has b
    1. f throws b –> m catches b
    2. f throws b –> m drops b
      1. m picks up the dropped b
    3. f throws b –> x catches b
    4. f throws b –> x drops b
      1. x picks up the dropped b
  • x has b
    1. x throws b –> m catches b
    2. x throws b –> m drops b
      1. m picks up the dropped b
    3. x throws b –> f catches b
    4. x throws b –> f drops b
      1. f picks up the dropped b

Will that change the thread models? What if the throwing pattern is not static; that is, the boy can throw to the girl or to the third player, and so forth? 

In this example: Yes, there is an additional thread that gets added, because each player is a tread that can catch or drop a ball.  Each player is a thread on its own, transferring data ‘b’ amongst them and throwing the ‘b’ is locking the data before transferring and catching ‘b’ is unlocking the data.  After the ball is dropped (maybe calculated randomly), the player with the ball now has to pick it up, which can be equivalent to analyze the data based on a certain condition that is met like account balance is < 500 or else.  The model changes with the additional player because each person has a choice to make now on which person should receive the ball next, which is not present in the first model when there were two threads.  If there exists a static toss like

  • f –> m –> x –> f

Then the model doesn’t change, because there is no choice now.

Parallel Programming: Logical Clocks

In a distributed system nodes can talk (cooperate) to each other and coordinate their systems.  However, the different nodes can execute concurrently, there is no global clock in which all nodes function on, and some of these nodes can fail independently (Sandén, 2011).  Since nodes talk to each other, we must study them as they interact with each other.  Thus, a need to use logical clocks (because we don’t have global clocks) which show that distances in time are lost (Sandén, 2011). In logical clocks: all nodes agree on an order of events, partially (where something can happen before another event).  They only describe the order of events, not with respect to time.  If nodes are completely disjoint in a logical clock, then a node can fail independently. This is one way to visualize the complex nature of nodes.

The following is an example of a logical clock:

Capture

Reference

Parallel Programming: Locks

A lock for a Node could be for the latest service request. Nodes in a group have to agree on which one of them holds the lock during any one moment of time, which can be seen on a vector graph if we note which one holds the lock. A node can release and request a lock.

Mutual exclusion algorithms can have a centralized coordinator node that handles the requests for the lock, which then means if that node fails so will the program (Sandén, 2011). Mutual exclusion algorithms can allow for a contention-based exclusion where nodes compete for the lock equally, and a queue is created for pending requests.  Finally, controlled exclusions have a logical piece of code to visit each node at a regulated interval of time to lend them the lock.

Lamport’s clock can help order the contention-based scenario where every node is trying to get the lock and it can only be had through a queue (Sandén, 2011). The queue tracks their request through a timestamp. Nodes can earn a lock if it has all the reply messages it needs to run its task and it’s on the top of the list in its queue.

Sandén (2011), states that multicast is done to all nodes that the lock has been released, and abolishing this message can optimize the process. Thus, the node should get the request from the next in the queue and postpone it until is done with the lock.

Reference

Parallel Programming: Logical Clocks

Per Sandén (2011), in a distributed system nodes can talk (cooperate) to each other and coordinate their systems.  However, the different nodes can execute concurrently, there is no global clock in which all nodes function on, and some of these nodes can fail independently.  Since nodes talk to each other, we must study them as they interact with each other.  Thus, there is a need to use logical clocks (because we don’t have global clocks) which show that distances in time that is lost.

In logical clocks: all nodes agree on an order of events, partially (where something can happen before another event).  They only describe the order of events, not with respect to time.  If nodes are completely disjoint in a logical clock, then a node can fail independently.

Reference

Parallel Programming: Deadlocks

Deadlock occurs while you are getting an additional resource while holding another or more resource, especially when it creates a circularity  (Sandén, 2011).

Sandén (2011), stated that to prevent deadlocks, resources need to be controlled.  One should do a wait chain diagram to make sure your design can help prevent a deadlock.  Especially when there is a mix of transactions occurring.  It is also best to know how many threads/entities are needed to be called on simultaneously before a deadlock can occur, especially true when you have multiple threads calling on shared resources.

Thus, we should manage the resources to ensure no circularity, limit the number of entities to just below the threshold to cause a deadlock, eliminate wait.

There are many in real life like the one shown in Sandén (2011) with each of 4 cars halfway into an intersection. The following is a real-life suggested a deadlock scenario:  

There is one set of measuring cup (1/2 a cup).  There are no other ways to measure this amount.  Jack and Jill are backing a cake at the same time.  They have all the objects need, eggs, cake mix, oil, and milk.  However, they need the only measuring cup to measure oil and milk and they reach for it at the same time.  This is a deadlock.

To un-deadlock this scenario, Jack can pour the eggs, and cake mix, while Jill measures and pours the oil and milk.  When Jill is done, Jack measures and pours the oil and milk and Jill pours the cake mix and eggs.  The same could be done with up to four people.  Where each person is a thread and the measuring cup is the resource.

Once we introduce a fifth or more person, the wait chain has unnecessarily long periods of wait for one thread to be able to begin to use a resource.

Reference

Parallel Programming: Resource Guard

A quick note:

“In the resource-guard-thread pattern, resource-guard threads represent resources.  Such threads are arranged in a virtual assembly line and connected by queues implemented as safe objects” (Saden, 2011)

By the definition above, the search and insertion threads have exclusive data to perform subdivision and legalization through an insertion point, not a queue, thus this is a resource-user thread pattern.

“As long as each resource user has exclusive access to no more than one resource at a time, the designer can usually choose between a solution with resource-guard threads and one with resource threads.  In this sense, the two patterns are dual.” (Saden, 2011)

A dual solution would look like: The search and insertion threads would return an index to a safe object, which would house all the data.  The data can then be a queue from in order to proceed with step two which is subdivision and legalization.

Reference

Parallel Programming: Safe objects and shared objects

Shared objects that force mutual exclusion on threads that try to call it are “safe objects”.  The mutual exclusion on threads/operations can be relaxed when threads don’t change any data, this may be a read of the data in the “safe object” (Sanden, 2011).  In the examples for this course, we have dealt with such Java “safe objects” which are called synchronized.

  1. A safe object in a jukebox represents the CD player. Customer threads call an operation to queue up to play a song.
    • Input into Song Queue: data can be added by multiple people on multiple devices that only have one set of CDs, and can only play one song from a CD.  Data is stored in an array.
    • Change Song order in the Queue: The Song Queue can be prioritized based on predefined parameters, like the DJ, can have ultimate priority to adjust the order and make their own request, but customers have a less priority.  If there is a tiered pay structure then we can see a higher priority placed on a Song on the Song Queue for those willing to pay more. This means that the data stored in the array can be rearranged depending on the thread’s priority.
    • Remove Song from Queue: after the song is done playing, the song’s name is removed from the Song Queue position number one. This will force the array values to shift up by one.
    • Read Song Queue: though not needed to be mutually exclusive, it is still an operation that is needed in order to find the next song to play.  This shouldn’t change any data in the array, it is only reading the song in position 0 of the array.
  1. In a different design, the safe object in a jukebox represents a queue of song requests. Customer threads call an operation to add a song request to the queue. A CD thread calls a different operation to retrieve the next request.
    • All of those that are required for a song queue in the previous example could be applied to this example or a subset.  An example of a sufficient subset would be {Input into Song Queue, Remove Song from Queue, Read Song Queue}
    • Locate the next CD request: Based on the data in Input into Song Queue, pull, locate the CD containing the next Song to be played.
    • Play Song on CD: One song from one CD can be played at any time.
    • Transition Song on CD: As one song ends, fade out the noise exponentially in the last 10 seconds and begin the next song on the Song Queue by increasing the song volume exponentially in the first 5 seconds to normal volume.
    • Put away the CD from the last song played: places the cd back into its predetermined location for future use. Once completed it will call on the Locate next CD Request Safe Operation.

References:

Parallel Programming: Synchronized Objects

Sanden (2011) shows how to use synchronized objects (concurrency in Java), which is a “safe” object, that are protected by locks in critical synchronized methods.  Through Java we can create threads by: (1) extend class Thread or (2) implement the interface Runnable.  The latter defines the code of a thread under a method: void run ( ), and the thread completes its execution when it reaches the end of the method (which is essentially a subroutine in FORTRAN).  Using the former you need the contractors public Thread ( ) and public Thread (Runnable runObject) along with methods like public start ( ).

Additional Examples:

MapReduce

According to Hortonworks (2013), MapReduce’s Process in a high level is: Input -> Map -> Shuffle and Sort -> Reduce -> Output.

Tasks:  Mappers, create and process transactions on a data set filed away in a distributed system and places the wanted data on a map/aggregate with a certain key.  Reducers will know what the key values are, and will take all the values stored in a similar map but in different nodes on a cluster (per the distributed system) from the mapper to reduce the amount of data that is relevant (Hortonworks, 2013). Reducers can work on different keys.

Example: A great example of this a MapReduce: Request, is to look at all CTU graduate students and sum up their current outstanding school loans per degree level.  Thus, the final output from our example would be:

  • Doctoral Students Current Outstanding School Loan Amount
  • Master Students Current Outstanding School Loan Amount.

Now let’s assume that this ran in Hadoop, which can do MapReduce.   Also, let’s assume that I could use 50 nodes (threads) to process this transaction request.  The bad data that gets thrown out in the mapper phase would be the Undergraduate Students, given that it does not match the initial search criteria.  The safe data will be those that are associated with Doctoral and Masters Students.  So, during the mapping phase, the threads will assign Doctoral Students to one key, and Master students would get another key.  Each node (thread) will use the same keys for their respective students, thus the keys are similar in all nodes (threads).  The reducer uses these keys and the safe objects in them, to sum up, all of the current outstanding school loan amounts get processed under the correct group.  Thus, once all nodes (threads) use the reducer part, we will have our two amounts:

  • Doctoral Students Current Outstanding School Loan
  • Masters Students Current Outstanding School Loan

Complexity could be added if we only wanted to look into graduate students that are currently active and non-active service members.  Or they could be complicated by gender, profession, diversity signifiers, we can even map to the current industry.

Resources