Introduction to OpenMP

There are many excellent introductions to OpenMP, and more are being created all the time. Rather than reinvent the wheel, here is one that I recommend.

For GPUs, people want to develop the same type of abstract framework that MPI provides for distributed memory and OpenMP provides for POSIX threads. A recent proposal (2011) is OpenACC.

Simultaneously, a company, CAPS (a spinoff of the INRIA research organization in France), has developed HMPP:

A separate topic not covered so far is the issue of benchmarks. Here are some classical benchmark test suites for parallel programs (using MPI, OpenMP, and other languages):

Information about DMTCP

DMTCP

If you have questions about DMTCP, please send e-mail to Kapil Arya and me. Kapil Arya's address is his first name (all lower case) followed by @ccs.neu.edu

DMTCP is available through the SourceForge web page. The easiest way to start (on Linux) is to type:

  svn co https://dmtcp.svn.sourceforge.net/svnroot/dmtcp/trunk dmtcp
  cd dmtcp
  ./configure
  # OR:   ./configure --enable-debug
  make
  make check
  # NOTE:  If you use the svn version, the waitpid test will fail.  This is normal.
  #   (It is a new test planned for the next DMTCP release.)

The DMTCP commands are in the bin subdirectory of the DMTCP distribution. If you would like to test DMTCP manually, you can try it out on test/dmtcp1 (in the dmtcp test subdirectory). If you ran make check above, then you will have automatically compiled test/dmtcp1 from test/dmtcp1.c. Please set up two terminal windows. In the first terminal window, execute
  dmtcp_coordinator
In the second terminal window, remove any old checkpoint image files:
  rm -rf ckpt_*.dmtcp
and execute
  dmtcp_checkpoint PATH_TO_TEST_SUBDIR/dmtcp1
Next, in the coordinator window, try:
  ? [help command]
  s [status command]
  l [list of processes under checkpoint control]
  c [checkpoint command]
You should now see a file ckpt_dmtcp1_*.dmtcp in the directory where dmtcp1 was executing. Kill the dmtcp1 program (for example, by typing control-c). Then type
  dmtcp_restart PATH_TO_CKPT_FILE/ckpt_dmtcp1_*.dmtcp

If you have any trouble, please don't hesitate to ask Kapil Arya or me in person. A five-minute answer to your question could save you several hours of your own trial-and-error searching. The best times to find us are during afternoons and early evenings in my office (336 WVH) or the High Performance Computing Lab (370 WVH). Or else, send e-mail to Kapil Arya and me. (My e-mail is gene@ccs.neu.edu. For Kapil Arya, use his first name as the username, all in lower case.)

Further Optional Reading

Next, browse the QUICK-START file in the top-level dmtcp directory. From there, start browsing the source code. Some good techniques are:
  grep -r STRING ./
to do a recursive search for lines containing STRING. Read man grep for further information. For example, grep -C2 ... will show you two lines before and after each match. You can also pipe the output through grep -v BAD_SUBSTRING to filter out unwanted matches.

If you want to see the source code of DMTCP as it executes, try:
  gdb --args dmtcp_checkpoint PATH_TO_TEST_SUBDIR/dmtcp1

An older description of the design of DMTCP is in DMTCP publications under DMTCP: Transparent Checkpointing for Cluster Computations and the Desktop.

I will supplement this reading with further lectures about the design of DMTCP.

QUESTIONS ABOUT DMTCP

  1. dmtcp_checkpoint ./a.out works by setting the environment variable LD_PRELOAD to a special library, dmtcphijack.so. That library loads an additional library, libmtcp.so, and then starts a second thread, the checkpoint thread. The checkpoint thread then connects to the DMTCP coordinator. This completes the initialization by dmtcphijack.so, and the "main" routine of the a.out application is allowed to begin executing.
    The dmtcphijack.so includes a C++ class, DmtcpWorker. A global variable is set to create a new instance of a DmtcpWorker object. This must be initialized by C++ before 'main' executes. Find the file and line number where the instance of a DmtcpWorker object is initialized.
  2. To restart, one executes dmtcp_restart ckpt_*.dmtcp. Find the file and line number containing the 'main' routine for the program for dmtcp_restart.
  3. The mtcp subdirectory is responsible for checkpointing a single process. The libmtcp.so library is built in that directory. The dmtcp/src subdirectory of the top-level dmtcp directory contains the code for managing issues pertinent to multiple processes. For example, it handles checkpointing of socket connections, and other multi-process constructs of the Linux operating system.
    In the mtcp subdirectory, pthread_create is called to create the checkpoint thread. Find the file and line number where that happens.
  4. The Linux operating system shows you the memory layout of a running process. For example, try:
    cat /proc/self/maps
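    to see the memory segments of the cat process itself (/proc/self always refers to the process that reads it), along with their addresses and read/write/execute permissions. To see the segments of the shell that you are currently executing, use cat /proc/$$/maps instead. For how many of the lines in the maps file can you identify the purpose of that section of memory?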
  5. The checkpoint image ckpt_*.dmtcp contains the contents of the memory segments. Try executing
    mtcp/readmtcp ckpt_*.dmtcp
    for a single checkpoint image file. Describe the contents of the memory segments inside the checkpoint image.
  6. The file mtcp/mtcp_restart.c and the follow-on file mtcp/mtcp_restart_nolibc.c are responsible for extracting the memory segments from the ckpt_*.dmtcp file. Scan those two files to identify the places in the code where the memory segment of the existing mtcp_restart program is "unmapped", and the memory segments of the ckpt_*.dmtcp file are then mapped. Note that information about the pertinent system calls can be found in man mmap and man munmap.
  7. More questions will be added later. For the work to hand in after Jan. 20, it is enough to answer the above questions.
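
The LD_PRELOAD mechanism described in Question 1 can be illustrated with a short Python sketch. This is only an illustration of the general idea, not the way dmtcp_checkpoint is actually implemented: the helper names preload_env and run_with_preload are hypothetical, and the real dmtcp_checkpoint command may do additional setup that is not shown here.

```python
import os
import subprocess


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon- or space-separated list of libraries.
    if existing:
        env["LD_PRELOAD"] = library_path + ":" + existing
    else:
        env["LD_PRELOAD"] = library_path
    return env


def run_with_preload(command, library_path):
    """Run command (a list of strings) with library_path preloaded.

    Hypothetical helper: the dynamic linker loads the library into the new
    process before the program's main routine starts.
    """
    return subprocess.run(command, env=preload_env(library_path))
```

For example, run_with_preload(["./a.out"], "PATH_TO_LIB/dmtcphijack.so") would start ./a.out with that library loaded first (the path is a placeholder). To checkpoint real programs, use dmtcp_checkpoint itself rather than this sketch.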
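
For Question 4, it can help to break each line of the maps file into its fields before trying to identify what each segment is for. Below is a minimal Python sketch, assuming the usual Linux layout of a maps line (address range, permissions, offset, device, inode, optional path). The names Segment, parse_maps_line and read_maps are made up for this example and are not part of DMTCP.

```python
from collections import namedtuple

# One memory segment, as described by a single line of /proc/PID/maps.
Segment = namedtuple("Segment", "start end perms offset dev inode path")


def parse_maps_line(line):
    """Parse one line of /proc/PID/maps into a Segment."""
    # Split into at most six fields so that a path containing spaces stays whole.
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    # Anonymous segments (for example, from mmap or malloc) have no path field.
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Segment(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid="self"):
    """Return the list of segments of process pid (Linux only)."""
    with open("/proc/%s/maps" % pid) as f:
        return [parse_maps_line(line) for line in f if line.strip()]
```

Calling read_maps() from a Python interpreter on Linux returns the segments of the interpreter itself. The size of each one is seg.end - seg.start, and seg.perms holds the r/w/x/p flags.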