Parallel Programming: Safe objects and shared objects

Shared objects that force mutual exclusion on threads that try to call it are “safe objects”.  The mutual exclusion on threads/operations can be relaxed when threads don’t change any data, this may be a read of the data in the “safe object” (Sanden, 2011).  In the examples for this course, we have dealt with such Java “safe objects” which are called synchronized.

  1. A safe object in a jukebox represents the CD player. Customer threads call an operation to queue up to play a song.
    • Input into Song Queue: data can be added by multiple people on multiple devices that only have one set of CDs, and can only play one song from a CD.  Data is stored in an array.
    • Change Song order in the Queue: The Song Queue can be prioritized based on predefined parameters, like the DJ, can have ultimate priority to adjust the order and make their own request, but customers have a less priority.  If there is a tiered pay structure then we can see a higher priority placed on a Song on the Song Queue for those willing to pay more. This means that the data stored in the array can be rearranged depending on the thread’s priority.
    • Remove Song from Queue: after the song is done playing, the song’s name is removed from the Song Queue position number one. This will force the array values to shift up by one.
    • Read Song Queue: though not needed to be mutually exclusive, it is still an operation that is needed in order to find the next song to play.  This shouldn’t change any data in the array, it is only reading the song in position 0 of the array.
  1. In a different design, the safe object in a jukebox represents a queue of song requests. Customer threads call an operation to add a song request to the queue. A CD thread calls a different operation to retrieve the next request.
    • All of those that are required for a song queue in the previous example could be applied to this example or a subset.  An example of a sufficient subset would be {Input into Song Queue, Remove Song from Queue, Read Song Queue}
    • Locate the next CD request: Based on the data in Input into Song Queue, pull, locate the CD containing the next Song to be played.
    • Play Song on CD: One song from one CD can be played at any time.
    • Transition Song on CD: As one song ends, fade out the noise exponentially in the last 10 seconds and begin the next song on the Song Queue by increasing the song volume exponentially in the first 5 seconds to normal volume.
    • Put away the CD from the last song played: places the cd back into its predetermined location for future use. Once completed it will call on the Locate next CD Request Safe Operation.

References:

Parallel Programming: Synchronized Objects

Sanden (2011) shows how to use synchronized objects (concurrency in Java), which is a “safe” object, that are protected by locks in critical synchronized methods.  Through Java we can create threads by: (1) extend class Thread or (2) implement the interface Runnable.  The latter defines the code of a thread under a method: void run ( ), and the thread completes its execution when it reaches the end of the method (which is essentially a subroutine in FORTRAN).  Using the former you need the contractors public Thread ( ) and public Thread (Runnable runObject) along with methods like public start ( ).

Additional Examples:

MapReduce

According to Hortonworks (2013), MapReduce’s Process in a high level is: Input -> Map -> Shuffle and Sort -> Reduce -> Output.

Tasks:  Mappers, create and process transactions on a data set filed away in a distributed system and places the wanted data on a map/aggregate with a certain key.  Reducers will know what the key values are, and will take all the values stored in a similar map but in different nodes on a cluster (per the distributed system) from the mapper to reduce the amount of data that is relevant (Hortonworks, 2013). Reducers can work on different keys.

Example: A great example of this a MapReduce: Request, is to look at all CTU graduate students and sum up their current outstanding school loans per degree level.  Thus, the final output from our example would be:

  • Doctoral Students Current Outstanding School Loan Amount
  • Master Students Current Outstanding School Loan Amount.

Now let’s assume that this ran in Hadoop, which can do MapReduce.   Also, let’s assume that I could use 50 nodes (threads) to process this transaction request.  The bad data that gets thrown out in the mapper phase would be the Undergraduate Students, given that it does not match the initial search criteria.  The safe data will be those that are associated with Doctoral and Masters Students.  So, during the mapping phase, the threads will assign Doctoral Students to one key, and Master students would get another key.  Each node (thread) will use the same keys for their respective students, thus the keys are similar in all nodes (threads).  The reducer uses these keys and the safe objects in them, to sum up, all of the current outstanding school loan amounts get processed under the correct group.  Thus, once all nodes (threads) use the reducer part, we will have our two amounts:

  • Doctoral Students Current Outstanding School Loan
  • Masters Students Current Outstanding School Loan

Complexity could be added if we only wanted to look into graduate students that are currently active and non-active service members.  Or they could be complicated by gender, profession, diversity signifiers, we can even map to the current industry.

Resources

Parallel Programming: Threads

A thread is a unit (or sequence of code) that can be executed by a scheduler, essentially a task (Sanden, 2011). A single thread (task) will have one program counter and a sequence of code. Multi-threading occurs when one program counter shares a common code. Thus, the counter in multi-threading has many sequences of code that can be assigned to different processors to run in parallel (simultaneously) to speed up a task. Another way for multi-threading is to have the counter execute the same code on different processors with different inputs. If data is shared between the threads, there is a need for a “safe” object through synchronization, where one thread can access the data stored in a “safe” object at one time. It is through these “safe” objects that a thread can communicate with another thread.

An additional example that may help illustrate the material: 

Maybe we would like to know the average of the sum of all the credits and the average of the sum of all the debits made in personal checking accounts in December in Suntrust Bank. After Map-Reduce techniques using multiple threading, we can go through their entire database system to find accounts and timestamp transactions, map out all the data and reduce it to what we need to return the two numbers in our query. 

Resources: