Homework 5
		      DUE: Thursday, Mar. 2
		      OLD (OBSOLETE) DEADLINE: Tuesday, Feb. 28

There are two parts to this assignment.  You must submit a tar file
with a Makefile (needed for PART 2), and a text file with your answers
for PART 1.  As usual, your Makefile must include targets for:
  clean, build, and check.

In addition to the readings in ostep.org, please note the course directory:
  /course/cs7600/thread-synch  [or access it through the web]
for notes on mutex, semaphore, and condition variable.
In particular, please note:
  https://course.ccs.neu.edu/cs7600/parent/thread-synch/acquire-release.pdf
That pdf reviews a commonly used architecture for condition variables.
This goes beyond what you will not find in ostep.org.

================================
PART 1:

This assignment is to find bugs and write a brief English paragraph describing
the bugs in the program and sometimes how to fix them.  You only need
to submit a single text file with your answers for parts A, B, C, D and E.

As discussed in class, you can use McMini to help you find an
execution trace with a bug, and then to replay that trace to
diagnose that bug.  To build McMini, do:
  tar xvf mcmini-alpha-X.tar.gz  # for appropriate X
  cd mcmini-alpha-X
  ./configure
  make -j 10
  # -j says to compile the files in parallel.  I limit to at most 10 parallel
  #     compiles so as not to bring down the operating system (too many forks).

You can then run McMini on a target application.  For example (from the
'test' subdirctory):
  gcc -g3 -O0 -o two-threads-and-mutex two-threads-and-mutex.c -pthread
  # Don't forget the -pthread flag, to link with libpthread.so.
Now test it with McMini:
  ./mcmini --first -m 12 ./two-threads-and-mutex
  # --first says to stop at the first deadlock, if one is found.
  # -m (--max-depth-per-thread) says to prune a search branch if the next
  #   thread will have executed more than 12 thread operations (transitions).
  # If you set it too large, McMini might search forever.  If you set
  #   it too small, McMini might miss searching a branch that shows deadlock.

Many of the target programs take a flag, '--quiet'.  This is useful when
McMini searches through many traces:
  ./mcmini ./a.out --quiet
Also, be sure to explore './mcmini --help'.  In particular, the
--max-depth-per-thread says how deeply McMini should explore.
Make this large enough to exhibit a potential bug, but not so large
that it will take forever for McMini to explore all the possible
thread traces (thread schedules).

AA. [ Consider adding cond. var. problem w/ wrong 'while' logic. ]

A.  For the first program, it will be easiest for you to simply use
    the code found in the 'producer-consumer' subdirectory found here.
    Within that subdirectory, you can 'make'.
    You will find that you can then execute:
      mcmini -m 30 ./producer-consumer
    to discover an 'assertion' violation.
    What is the cause of this bug?

    NOTE: Please read the text at the end of partial-solution.c for
          hints about the cause of this bug.

    [ *** I wroter earlier:
    [ ***   > As of mcmini-alpha-b, McMini has not yet find the bug.
    [ ***   > This will probably be fixed the weekend of Feb. 25.
    [ *** We were unable to fix McMini in time.  But please
    [ *** do read the end of partial-solution.c for a description
    [ *** that will help you find the cause of the bug that.
    [ ***
    [ *** April 9, 2023:  mcmini-alpha-d.tar.gz should now properly
    [ ***                 display the trace exhibting the bug here.

B.  Compile parent-child-tasks.c and test it.
    To compile:
      gcc -g3 -O0 -o parent-child-tasks parent-child-tasks.c -pthread
    If you test it with McMini, remember to use '-m 7 -first' (or some other
    reasonable max-depth parameter).
    What is the bug, and why?
    [ NOTE:  If you use McMini, it will respond:
             "Trace 0 stopped early!"
             This means that it stopped due to SEGFAULT, exit with failure,
               possibly an assert failure, or else something similar.
               (And a bug in McMini may cause it to select "stopped early"
                even for a _successful_ exit.)
             Future McMini's will be more precise. ]

C.  Compile two-threads-and-mutex.c and test it.
    To compile:
      gcc -g3 -O0 -o two-threads-and-mutex \
      two-threads-and-mutex.c -pthread
    If you test it with McMini, remember to use '-m 7 -first' (or some other
    reasonable max-depth parameter).
    What is the bug, and why?
    What is a simple fix for this bug, by just re-ordering certain lines?

D.  Compile reader-writer.c and test it.
    To compile:
      gcc -g3 -O0 -o reader-writer reader-writer.c -pthread
    Unfortunately, a bug in McMini prevents it from showing the problem.
      So, you'll have to find the bug the old way, or with ordinary GDB.
    Here is a quick way to view the bug inside GDB:
      # Try ./reader-writer several times on Khoury login if needed,
      #   until you can see it "hang" and not exit.
      ./reader-writer &
      # Now that you can see it hanging (waiting), continue as follows:
      gdb attach `pgrep --newest reader-writer`
      (gdb) thread apply all where -4
    What is the bug, and why?
    How can you fix the bug by changing the text of a certain line of code?
    (Note that this line of code occurs in two places, and should be
     changed in both places.)

E.  Compile barrier.c and test it.
    What is the bug, and why?
    [ McMini does not catch this bug, because the bug is one of livelock --
      not deadlock. ]

================================
PART 2:
[ *** My original description for PART 2 is not clear enough.
  *** Based on comments in class, I will polish this description,
  *** and then remove this note. 
 *** ]

    Write a short program using condition variables.  It should create
two condition variables:  one for readers and one for writers.
When there are no more _active_ readers, then the the program should
signal to a single waiting writer.
When there are no more _active_ writers, then the the program should
broadcast to _all_ waiting readers, since multiple readers can be
active at the same time.
When there are no more _active_ readers, then the program should
signal to exactly one waiting writer.

NOTE:  By waking up either one writer or all readers, we have greater
efficiency.  If there were only one condition variable, then our
design would be forced to wake up all waiting threads (both readers
and writers).

    Beyond the above specs, you have a choice of how to handle the
remaining cases that can arise.  These cases are often referred
to as reader-preferred and writer-preferred
1. If there are no more active readers or writers, then you can choose
    whether you wish to signal/broadcast to readers or to writers.
2. If there exists one or more active readers and also a waiting writer,
    and then a new reader enters the acquire section, then you can
    choose whether to allow the new reader to acquire read permission
    or to have it wait until any waiting writers have executed.

NOTE:  Depending on what choices you make, above, you might see
mostly all reads in succession ("reader-preferred"), or several
writes in the middle ("writer-preferred").

    Your code should include an "acquire" and a "release" section.
And within the release section of your code, write two 'assert'
statements:  both when a thread enters the release section, and when a
thread exits the release section.  Each of the two assert's should test
that your invariant is still true.  Since you have two condition variables
(one for readers and one for writers), you have two possibilities for
asserting the invariant.  A reader thread should assert the reader
invariant and a writer thread should assert the writer invariant.
    [ ** I'm deleting my former comment on conjunction of invariants. ** ]
    [ ** It's too confusing, and not relevant.                        ** ]

In your main routine, create 5 readers and 2 writers.  All readers and
writers must be created before you "join" with any of them.
Each reader should "read" 10 times.  Each writer should "write" twice.
Before doing the "work" each reader and writer should call:
  sleep( rand() % 3 );
to simulate doing the work (after acquiring the read permission
or write permission, and before releasing releasing that permission).
Just after the 'sleep' statement, please add a print statement of
the form:
  Reader 5 did task 2.
  Writer 2 did task 4.
  etc.
You can find and copy code for this from the program reader-writer.c
in this directory.

  [ ** Sorry for my poor writing in the earlier write-up.  A     ** ]
  [ ** "read permission" is sometimes also called a "read lock". ** ]
  [ ** But that is ambiguous here, since we also have a          ** ]
  [ ** "mutex lock".  So, better to call it a "permission".      ** ]

You are not required to use McMini to verify your logic.  If you choose
to do that, then please be aware that you will want to limit your
test case to fewer readers/writers (fewer threads), and fewer iterations
for each thread.  But again, you do not need to use McMini.

And a reminder:  As stated at the beginning, for Part 2, you must include
a Makefile with targets for clean, build, and check.

================================
THE McMini MODEL CHECKER IS PROVIDED TO HELP YOU.

An alpha release of McMini (mcmini-alpha.tar.gz) is provided to help
you diagnose the bugs.  Like all alpha releases, beware of bugs.  To use it:
  tar zxvf mcmini-alpha.tar.gz
  cd mcmini-alpha
  ./configure && make

TERMINOLOGY:
  TRACE:
    A _trace_ is a sequential schedule for the threads.  McMini will
    explore all traces or schedules up to a certain depth.
  TRANSITION:
    A _transition_ is a thread operation.  McMini can continue to execute
    a single thread, _until_ there is a thread operation.  At the thread
    operation, McMini should consider continuing the current thread
    or scheduling a different thread.
    NOTE:  The non-thread statements before a thread operation cannot affect
      the behavior of other threads.  So, we don't need to consider scheduling
      a different thread at a non-thread statement.
      (Here, we assume that if there is a shared global variable, then
      it is protected by the thread operations pthread_mutex_lock and
      pthread_mutex_unlock.  The mutex lock/unlock will force McMini
      to consider other schedules.)

For a demo, try:
  make check-gdb
  (gdb) mcmini help

This demo will use a producer-consumer application based on the standard
semaphore library provided by glibc in Linux..
    This provides extended commands to goto a specific execution sequence
(trace), and to examine each step involving a thread operation (each transition).

To try it with your own application, do:
  ./mcmini-gdb APPLICATION APPLICATION_ARGS
  (gdb) mcmini help

When using 'mcmini-gdb', you should be aware of some important flags:
  --max-depth <NUM>  [limit the depth or number of transitions explored for
                       each trace;  Try setting --max-depth to a small
                       number initially, and then raising it.]

Two other option that you may find useful are --verbose and -first.

Note, if you use 'mcmini back', then the thread numbers will be renumbered,
but the threads will be in the same order.

If you apply 'mcmini-gdb' on your hw6 solution, you will find it necessary
to first change partial-solution.c and producer-consumer.c, both, to
change all occurrences of 'sem_' to 'mysem_'.  This is needed so that
the model checker will realize that it should model-check based on the
underlying pthread_mutex_lock/unlock functions, and not on sem_wait/post.

Also, if you apply 'mcmini-gdb' to your hw6 solution consider easing
the burden of work for McMini.  For example, consider creating just
one consumer thread, and either 2 or 3 producer threads.  McMini can
then reach any bug much sooner.

And McMini continues to support:
  info threads, where, finish, next, step, continue

This version of McMini specially calls 'alarm(7200)' to cause it
to exit after two hours.  This prevents runaway processes on Khoury login.

STRATEGY 1:
  It is important to use a '<NUM>' in the middle (not too small, not too large).
  If it is too small, then the trace will not execute enough transitions
    to reach the bug.
  If it is too large, then McMini will spend a long time exploring many
    traces that begin with the same sequence of operations, and may never
    reach a later sequence of operations that contains the actual bug.

STRATEGY 2:
  mcmini -m <NUM> ./a.out  # find bad traceId (See STRATEGY 1 for choosing NUM)
  mcmini-gdb -m <NUM>; (gdb) mcmini gotoTrace NUM+1; (gdb) mcmini forward <NUM2>
    [ and repeat 'mcmini forward <NUM2>' by binary search to stop just before
       any segfault. Then use that <NUM2> ]
    AND:  (gdb) mcmini gotoTrace NUM+1; (gdb) mcmini forward <NUM2>
          (gdb) finish, next, step, thread #, etc.
  'mcmini gotoTrace <NUM>