Homework 5 DUE: Thursday, Mar. 2 OLD (OBSOLETE) DEADLINE: Tuesday, Feb. 28 There are two parts to this assignment. You must submit a tar file with a Makefile (needed for PART 2), and a text file with your answers for PART 1. As usual, your Makefile must include targets for: clean, build, and check. In addition to the readings in ostep.org, please note the course directory: /course/cs7600/thread-synch [or access it through the web] for notes on mutex, semaphore, and condition variable. In particular, please note: https://course.ccs.neu.edu/cs7600/parent/thread-synch/acquire-release.pdf That pdf reviews a commonly used architecture for condition variables. This goes beyond what you will not find in ostep.org. ================================ PART 1: This assignment is to find bugs and write a brief English paragraph describing the bugs in the program and sometimes how to fix them. You only need to submit a single text file with your answers for parts A, B, C, D and E. As discussed in class, you can use McMini to help you find an execution trace with a bug, and then to replay that trace to diagnose that bug. To build McMini, do: tar xvf mcmini-alpha-X.tar.gz # for appropriate X cd mcmini-alpha-X ./configure make -j 10 # -j says to compile the files in parallel. I limit to at most 10 parallel # compiles so as not to bring down the operating system (too many forks). You can then run McMini on a target application. For example (from the 'test' subdirctory): gcc -g3 -O0 -o two-threads-and-mutex two-threads-and-mutex.c -pthread # Don't forget the -pthread flag, to link with libpthread.so. Now test it with McMini: ./mcmini --first -m 12 ./two-threads-and-mutex # --first says to stop at the first deadlock, if one is found. # -m (--max-depth-per-thread) says to prune a search branch if the next # thread will have executed more than 12 thread operations (transitions). # If you set it too large, McMini might search forever. If you set # it too small, McMini might miss searching a branch that shows deadlock. Many of the target programs take a flag, '--quiet'. This is useful when McMini searches through many traces: ./mcmini ./a.out --quiet Also, be sure to explore './mcmini --help'. In particular, the --max-depth-per-thread says how deeply McMini should explore. Make this large enough to exhibit a potential bug, but not so large that it will take forever for McMini to explore all the possible thread traces (thread schedules). AA. [ Consider adding cond. var. problem w/ wrong 'while' logic. ] A. For the first program, it will be easiest for you to simply use the code found in the 'producer-consumer' subdirectory found here. Within that subdirectory, you can 'make'. You will find that you can then execute: mcmini -m 30 ./producer-consumer to discover an 'assertion' violation. What is the cause of this bug? NOTE: Please read the text at the end of partial-solution.c for hints about the cause of this bug. [ *** I wroter earlier: [ *** > As of mcmini-alpha-b, McMini has not yet find the bug. [ *** > This will probably be fixed the weekend of Feb. 25. [ *** We were unable to fix McMini in time. But please [ *** do read the end of partial-solution.c for a description [ *** that will help you find the cause of the bug that. [ *** [ *** April 9, 2023: mcmini-alpha-d.tar.gz should now properly [ *** display the trace exhibting the bug here. B. Compile parent-child-tasks.c and test it. To compile: gcc -g3 -O0 -o parent-child-tasks parent-child-tasks.c -pthread If you test it with McMini, remember to use '-m 7 -first' (or some other reasonable max-depth parameter). What is the bug, and why? [ NOTE: If you use McMini, it will respond: "Trace 0 stopped early!" This means that it stopped due to SEGFAULT, exit with failure, possibly an assert failure, or else something similar. (And a bug in McMini may cause it to select "stopped early" even for a _successful_ exit.) Future McMini's will be more precise. ] C. Compile two-threads-and-mutex.c and test it. To compile: gcc -g3 -O0 -o two-threads-and-mutex \ two-threads-and-mutex.c -pthread If you test it with McMini, remember to use '-m 7 -first' (or some other reasonable max-depth parameter). What is the bug, and why? What is a simple fix for this bug, by just re-ordering certain lines? D. Compile reader-writer.c and test it. To compile: gcc -g3 -O0 -o reader-writer reader-writer.c -pthread Unfortunately, a bug in McMini prevents it from showing the problem. So, you'll have to find the bug the old way, or with ordinary GDB. Here is a quick way to view the bug inside GDB: # Try ./reader-writer several times on Khoury login if needed, # until you can see it "hang" and not exit. ./reader-writer & # Now that you can see it hanging (waiting), continue as follows: gdb attach `pgrep --newest reader-writer` (gdb) thread apply all where -4 What is the bug, and why? How can you fix the bug by changing the text of a certain line of code? (Note that this line of code occurs in two places, and should be changed in both places.) E. Compile barrier.c and test it. What is the bug, and why? [ McMini does not catch this bug, because the bug is one of livelock -- not deadlock. ] ================================ PART 2: [ *** My original description for PART 2 is not clear enough. *** Based on comments in class, I will polish this description, *** and then remove this note. *** ] Write a short program using condition variables. It should create two condition variables: one for readers and one for writers. When there are no more _active_ readers, then the the program should signal to a single waiting writer. When there are no more _active_ writers, then the the program should broadcast to _all_ waiting readers, since multiple readers can be active at the same time. When there are no more _active_ readers, then the program should signal to exactly one waiting writer. NOTE: By waking up either one writer or all readers, we have greater efficiency. If there were only one condition variable, then our design would be forced to wake up all waiting threads (both readers and writers). Beyond the above specs, you have a choice of how to handle the remaining cases that can arise. These cases are often referred to as reader-preferred and writer-preferred 1. If there are no more active readers or writers, then you can choose whether you wish to signal/broadcast to readers or to writers. 2. If there exists one or more active readers and also a waiting writer, and then a new reader enters the acquire section, then you can choose whether to allow the new reader to acquire read permission or to have it wait until any waiting writers have executed. NOTE: Depending on what choices you make, above, you might see mostly all reads in succession ("reader-preferred"), or several writes in the middle ("writer-preferred"). Your code should include an "acquire" and a "release" section. And within the release section of your code, write two 'assert' statements: both when a thread enters the release section, and when a thread exits the release section. Each of the two assert's should test that your invariant is still true. Since you have two condition variables (one for readers and one for writers), you have two possibilities for asserting the invariant. A reader thread should assert the reader invariant and a writer thread should assert the writer invariant. [ ** I'm deleting my former comment on conjunction of invariants. ** ] [ ** It's too confusing, and not relevant. ** ] In your main routine, create 5 readers and 2 writers. All readers and writers must be created before you "join" with any of them. Each reader should "read" 10 times. Each writer should "write" twice. Before doing the "work" each reader and writer should call: sleep( rand() % 3 ); to simulate doing the work (after acquiring the read permission or write permission, and before releasing releasing that permission). Just after the 'sleep' statement, please add a print statement of the form: Reader 5 did task 2. Writer 2 did task 4. etc. You can find and copy code for this from the program reader-writer.c in this directory. [ ** Sorry for my poor writing in the earlier write-up. A ** ] [ ** "read permission" is sometimes also called a "read lock". ** ] [ ** But that is ambiguous here, since we also have a ** ] [ ** "mutex lock". So, better to call it a "permission". ** ] You are not required to use McMini to verify your logic. If you choose to do that, then please be aware that you will want to limit your test case to fewer readers/writers (fewer threads), and fewer iterations for each thread. But again, you do not need to use McMini. And a reminder: As stated at the beginning, for Part 2, you must include a Makefile with targets for clean, build, and check. ================================ THE McMini MODEL CHECKER IS PROVIDED TO HELP YOU. An alpha release of McMini (mcmini-alpha.tar.gz) is provided to help you diagnose the bugs. Like all alpha releases, beware of bugs. To use it: tar zxvf mcmini-alpha.tar.gz cd mcmini-alpha ./configure && make TERMINOLOGY: TRACE: A _trace_ is a sequential schedule for the threads. McMini will explore all traces or schedules up to a certain depth. TRANSITION: A _transition_ is a thread operation. McMini can continue to execute a single thread, _until_ there is a thread operation. At the thread operation, McMini should consider continuing the current thread or scheduling a different thread. NOTE: The non-thread statements before a thread operation cannot affect the behavior of other threads. So, we don't need to consider scheduling a different thread at a non-thread statement. (Here, we assume that if there is a shared global variable, then it is protected by the thread operations pthread_mutex_lock and pthread_mutex_unlock. The mutex lock/unlock will force McMini to consider other schedules.) For a demo, try: make check-gdb (gdb) mcmini help This demo will use a producer-consumer application based on the standard semaphore library provided by glibc in Linux.. This provides extended commands to goto a specific execution sequence (trace), and to examine each step involving a thread operation (each transition). To try it with your own application, do: ./mcmini-gdb APPLICATION APPLICATION_ARGS (gdb) mcmini help When using 'mcmini-gdb', you should be aware of some important flags: --max-depth [limit the depth or number of transitions explored for each trace; Try setting --max-depth to a small number initially, and then raising it.] Two other option that you may find useful are --verbose and -first. Note, if you use 'mcmini back', then the thread numbers will be renumbered, but the threads will be in the same order. If you apply 'mcmini-gdb' on your hw6 solution, you will find it necessary to first change partial-solution.c and producer-consumer.c, both, to change all occurrences of 'sem_' to 'mysem_'. This is needed so that the model checker will realize that it should model-check based on the underlying pthread_mutex_lock/unlock functions, and not on sem_wait/post. Also, if you apply 'mcmini-gdb' to your hw6 solution consider easing the burden of work for McMini. For example, consider creating just one consumer thread, and either 2 or 3 producer threads. McMini can then reach any bug much sooner. And McMini continues to support: info threads, where, finish, next, step, continue This version of McMini specially calls 'alarm(7200)' to cause it to exit after two hours. This prevents runaway processes on Khoury login. STRATEGY 1: It is important to use a '' in the middle (not too small, not too large). If it is too small, then the trace will not execute enough transitions to reach the bug. If it is too large, then McMini will spend a long time exploring many traces that begin with the same sequence of operations, and may never reach a later sequence of operations that contains the actual bug. STRATEGY 2: mcmini -m ./a.out # find bad traceId (See STRATEGY 1 for choosing NUM) mcmini-gdb -m ; (gdb) mcmini gotoTrace NUM+1; (gdb) mcmini forward [ and repeat 'mcmini forward ' by binary search to stop just before any segfault. Then use that ] AND: (gdb) mcmini gotoTrace NUM+1; (gdb) mcmini forward (gdb) finish, next, step, thread #, etc. 'mcmini gotoTrace