Introduction to OpenMP
There are many excellent introductions to OpenMP, and more are being
created all the time. Rather than reinvent the wheel, here
is one that I recommend.
For GPUs, people want to develop the same type of abstract framework
that MPI provides for distributed memory and OpenMP provides for
POSIX threads. A recent proposal, from 2011, is OpenACC.
Simultaneously, CAPS, a company spun off from the INRIA research
organization in France, has developed HMPP:
- HMPP
(with the goal of also supporting multiple GPUs along with the
many-core CPUs on the motherboard)
A separate topic not covered so far is the issue of benchmarks. Here
are some classical benchmark test suites for parallel programs
(using MPI, OpenMP, and other languages):
Information about DMTCP
DMTCP
If you have questions about DMTCP, please send e-mail to
Kapil Arya and me. Kapil Arya's username is his
first name (all lower case), followed by @ccs.neu.edu
DMTCP is available through
the sourceforge web page.
The easiest way to start is (in Linux) to type:
svn co https://dmtcp.svn.sourceforge.net/svnroot/dmtcp/trunk dmtcp
cd dmtcp
./configure
[ OR: ./configure --enable-debug ]
make
make check
# NOTE: If you use the svn version, the waitpid test will fail. This is normal.
# (It is a new test planned for the next DMTCP release.)
The DMTCP commands are in the bin subdirectory of the DMTCP
distribution.
If you would like to test DMTCP manually, you can try it out on
test/dmtcp1 (in the dmtcp test subdirectory).
If you ran make check above, then you will have automatically
compiled test/dmtcp1 from test/dmtcp1.c.
Please set up two terminal
windows. In the first terminal window, execute
dmtcp_coordinator
In the second terminal window, remove any old checkpoint image files:
rm -rf ckpt_*.dmtcp
and execute
dmtcp_checkpoint PATH_TO_TEST_SUBDIR/dmtcp1
Next, in the coordinator window, try:
? [help command]
s [status command]
l [list of processes under checkpoint control]
c [checkpoint command]
You should now see a file ckpt_dmtcp1_*.dmtcp in the directory where
dmtcp1 was executing. Kill the dmtcp1 program (for example, by typing
control-c). Then type
dmtcp_restart PATH_TO_CKPT_FILE/ckpt_dmtcp1_*.dmtcp
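Once several checkpoints have accumulated, it can help to see which
image files the wildcard above will match before restarting. The
following Python sketch is only an illustration and is not part of
DMTCP: the helper name find_checkpoint_images is hypothetical, and the
only thing it assumes is the file naming pattern ckpt_dmtcp1_*.dmtcp
described above.

```python
import glob
import os


def find_checkpoint_images(directory=".", program="dmtcp1"):
    """Return checkpoint image paths for `program`, oldest first.

    Assumes only the naming pattern ckpt_<program>_*.dmtcp from the text.
    This helper is hypothetical; it is not part of the DMTCP distribution.
    """
    pattern = os.path.join(directory, "ckpt_%s_*.dmtcp" % program)
    # Sort by modification time so that the most recent image comes last.
    return sorted(glob.glob(pattern), key=os.path.getmtime)


if __name__ == "__main__":
    # List the images in the current directory, one per line.
    for path in find_checkpoint_images():
        print(path)
```

The paths it prints are the same ones that the shell would expand
ckpt_dmtcp1_*.dmtcp to; any of them can be given to dmtcp_restart.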
If you have any trouble, please don't hesitate to ask
Kapil Arya or me in person. A five-minute answer to your question could
save you several hours of your own trial-and-error searching.
The best times to find us are during afternoons and early evenings
in my office (336 WVH)
or the High Performance Computing Lab (370 WVH).
Or else, send e-mail to Kapil Arya and me.
(My e-mail is gene@ccs.neu.edu . For Kapil Arya, use
his first name as the username, all in lower case.)
Further Optional Reading
First browse the QUICK-START file in the top-level dmtcp directory.
From there, start browsing the source code. Some good techniques are:
grep -r STRING ./
to do a recursive search for lines containing STRING. Read man grep
for further information. For example, grep -C2 ... will show
you two lines of context before and after each match. You can also
pipe the output through grep -v BAD_SUBSTRING to filter out unwanted
matches.
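For readers who prefer to script such searches, here is a minimal
Python sketch of the same idea. It is an approximation and not a
replacement for grep: the function name search_tree is hypothetical,
STRING is matched as a plain substring rather than as a regular
expression, and the exclusion is tested against the line text only
(the grep -v pipe also sees the file name that grep -r prints).

```python
import os


def search_tree(root, needle, exclude=None):
    """Yield (path, line_number, line) for each line containing `needle`.

    Lines that also contain `exclude` (if given) are skipped, roughly
    like piping the output of grep -r through grep -v.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary files from raising
                # decoding errors.
                with open(path, errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle not in line:
                            continue
                        if exclude and exclude in line:
                            continue
                        yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it
```

For example, list(search_tree("./", "DmtcpWorker")) collects every
matching line under the current directory together with its file name
and line number.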
If you want to see the source code of DMTCP as it executes, try:
gdb --args dmtcp_checkpoint PATH_TO_TEST_SUBDIR/dmtcp1
An older description of the design of DMTCP is in
DMTCP publications
under
DMTCP: Transparent Checkpointing for Cluster Computations and the Desktop.
I will supplement this reading with further lectures about the design
of DMTCP.
QUESTIONS ABOUT DMTCP
- dmtcp_checkpoint ./a.out works by setting the
environment variable LD_PRELOAD to a special library,
dmtcphijack.so. That library loads an additional library,
libmtcp.so, and then starts a second thread, the checkpoint
thread. The checkpoint thread then connects to the DMTCP coordinator.
This completes the initialization by dmtcphijack.so, and
the "main" routine of the a.out application is allowed to
begin executing.
The dmtcphijack.so library includes a C++ class, DmtcpWorker.
A global variable is declared as an instance of DmtcpWorker.
Because it is global, C++ must initialize it before 'main'
executes. Find the file and line number where this instance of
DmtcpWorker is initialized.
- To restart, one executes dmtcp_restart ckpt_*.dmtcp.
Find the file and line number containing the 'main' routine
for the program for dmtcp_restart.
- The mtcp subdirectory is responsible for checkpointing a single
process. The libmtcp.so library is built in that directory.
The dmtcp/src subdirectory of the top-level dmtcp directory
contains the code for managing issues pertinent to multiple processes.
For example, it handles checkpointing of socket connections,
and other multi-process constructs of the Linux operating system.
In the mtcp subdirectory, pthread_create is called
to create the checkpoint thread. Find the file and line number where
that happens.
- The Linux operating system shows you the memory layout of a running
process. For example, try:
cat /proc/self/maps
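to see the memory segments of the cat process itself (/proc/self
always refers to the process that reads it), along with their addresses
and read/write/execute permissions. To see the segments of the shell
that you are currently executing, use cat /proc/$$/maps instead.
For how many of the lines in the maps file can you identify the
purpose of that section of memory?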
- The checkpoint image ckpt_*.dmtcp contains the contents
of the memory segments. Try executing
mtcp/readmtcp ckpt_*.dmtcp
for a single checkpoint image file. Describe the contents of the
memory segments inside the checkpoint image.
- The file mtcp/mtcp_restart.c and the follow-on file
mtcp/mtcp_restart_nolibc.c are responsible for extracting
the memory segments from the ckpt_*.dmtcp file. Scan those
two files to identify the places in the code where the memory segment
of the existing mtcp_restart program is "unmapped", and the memory
segments of the ckpt_*.dmtcp file are then mapped.
Note that information about the pertinent system calls can be found in
man mmap and man munmap.
- More questions will be added later. For the work to hand in after
Jan. 20, it is enough to answer the above questions.
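As an aid for the /proc/self/maps question above, the following Python
sketch splits each line of a maps file into its fields. It relies only
on the line layout documented in the proc(5) man page (address range,
permissions, offset, device, inode, optional pathname). The names
Segment, parse_maps_line and read_maps are my own and do not come from
DMTCP. Note that when the script reads /proc/self/maps, it sees the
memory layout of the Python interpreter that runs it.

```python
import os
from collections import namedtuple

Segment = namedtuple("Segment", "start end perms offset dev inode path")


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a Segment."""
    # The first five fields are always present. The pathname is optional
    # and may contain spaces, so split at most five times.
    fields = line.split(None, 5)
    addresses, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addresses.split("-"))
    return Segment(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid="self"):
    """Return the list of Segments for process `pid` (Linux only)."""
    with open("/proc/%s/maps" % pid) as handle:
        return [parse_maps_line(line) for line in handle if line.strip()]


if __name__ == "__main__":
    if os.path.exists("/proc/self/maps"):  # present on Linux only
        for seg in read_maps():
            size_kb = (seg.end - seg.start) // 1024
            label = seg.path or "[anonymous]"
            print("%8d KB  %s  %s" % (size_kb, seg.perms, label))
```

Printing the size and permissions next to each pathname makes it easier
to tell apart code (r-x), read-only data (r--), and writable data (rw-)
belonging to the same file.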