Lecture on Parallel Computing (read the first half only, for
a nice overview and graphs of why multi-core is necessary, and
the benefits of many-core)
Texts for second half of course:
- Online text: ostep.org
(The final will cover all of the chapters on concurrency;
and also the chapters on files
from "Files and Directories" through "Fast File System (FFS)":
superblock, inode, etc. --- you do not
need to read about I/O Devices, disk drives and disk arrays.)
- xv6-rev8.pdf:
UNIX xv6 (hyperlinked)
xv6 source code for UNIX kernel: especially see:
proc.c,h (process table);
swtch.S (context switch to a new process);
spinlock.c,h (spinlock, like a mutex);
file.c,h (file descriptor pointing to an open file);
fs.c,h (filesystem using superblock, inode, etc.);
pipe.c,h (pipes: e.g., ls | wc
);
bio.c,h (buffered I/O: linked list of buffers acting as a cache
for disk blocks or disk sectors).
If you are curious, you can also read about the
history of xv6.
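The pipe example above ("ls | wc") can be sketched in user space with the same primitives that xv6's pipe.c supports: pipe(), fork(), dup, and exec. This is an illustrative Python sketch, not xv6 code; the choice of listing "/" with ls is arbitrary, and the line count stands in for wc -l.

```python
import os

# Mimic the shell pipeline "ls / | wc -l" by hand:
# create a pipe, fork, wire the child's stdout into the pipe,
# and have the parent count the lines it reads from the pipe.
r, w = os.pipe()
pid = os.fork()
if pid == 0:                       # child: the "ls" side
    os.dup2(w, 1)                  # stdout -> write end of the pipe
    os.close(r)
    os.close(w)
    os.execvp("ls", ["ls", "/"])   # never returns on success
    os._exit(1)                    # safety net if exec fails

os.close(w)                        # parent: the "wc" side
with os.fdopen(r) as pipe_out:
    count = sum(1 for _ in pipe_out)
os.waitpid(pid, 0)
print(count)                       # number of entries in /
```

The parent reads the child's stdout through the pipe's read end, exactly the plumbing the shell sets up between ls and wc.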
- Book with commentary on xv6-rev8 (UNIX kernel)
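Since the reading list highlights spinlock.c, here is a toy user-level analogue of the spin-until-test-and-set idea. This is only a sketch: the SpinLock class name is mine, Lock.acquire(blocking=False) merely stands in for the atomic xchg instruction a real spinlock uses, and a kernel spinlock (as in xv6) also disables interrupts, which has no user-space analogue.

```python
import threading

class SpinLock:
    """Toy spinlock: busy-wait on a test-and-set until the lock is free."""
    def __init__(self):
        self._held = threading.Lock()

    def acquire(self):
        while not self._held.acquire(blocking=False):
            pass  # spin until the test-and-set succeeds

    def release(self):
        self._held.release()

lock = SpinLock()
counter = 0

def worker():
    global counter
    for _ in range(1000):
        lock.acquire()
        counter += 1   # critical section: read-modify-write
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4 * 1000, with no lost updates
```

Without the lock, the four read-modify-write sequences could interleave and lose updates; the spinlock serializes them.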
- Python (based on materials in
/course/cs5600/python)
- Early history of Python (just for fun)
Course Resources
MIPS Simulator for Assembly Language homework (MARS):
- There is a
MIPS Assembly language simulator with
free downloads available and
online documentation of the MARS simulator.
There is also an
online video demo of the MARS simulator.
There is generous, high-quality documentation of the MIPS assembly
language with the
free, online documentation for the MIPS assembler
(including for SPIM, an early version of the MARS simulator),
and the
SPIM Quick Reference
(or an even shorter quick reference here)
- To begin running the simulator, go inside the folder, and
double-click on the Mars.jar icon. Alternatively, if running
from the command line in Linux, type:
java -jar Mars.jar
If you download Mars.jar for your computer, there are also
suggestions on that page for running Mars.
- The syntax of the assembly language is intended to be compatible
with Appendix B of our textbook (also
available online
(requires CCIS password)).
- The MARS software is distributed as a Java .jar file. It requires
Java J2SE 1.5 or later. Depending on your configuration,
you may be able to directly open it from the download menu.
If you have trouble, or if you prefer to run from the
command line on your own computer, the Java SDK is also available
for free download from the same download page. The instructions
for running it from Windows or DOS should work equally well
on Linux. The CCIS machines should already have
the necessary Java SDK installed.
- GOTCHAS: There are several important things to watch out for.
- When you hit the "Assemble" menu item, any error messages about
failure to assemble are in the
bottom window pane, tab: "Mars Messages".
Input/Output is in the bottom window pane, tab: "Run I/O"
- If you paste assembly code into the edit window pane, you must
save that code to a file before Mars will let you assemble it.
- If you have selected a box with your mouse (e.g. "Value" box in
the data window pane, or "Register" box), then Mars will not
update the value in that box. Mars assumes you prefer to write
your own value into that box, and not to allow the assembly
program to use that box.
- If your program stops at an error, read the "Mars Messages" for
the cause of the error, and then hit the "Backstep" menu item
to see what caused that error. Continue hitting "Backstep"
or "Singlestep" in order to identify the error.
- Your main routine must call the "exit" system call to terminate.
It must not simply return ("jr $ra"). Note that page B-44
of Appendix B of the text (fourth edition) has a table of
"system services" (system calls). These allow you to do "exit"
and also I/O.
- One of the nicer features of this software is a limited
backstep capability (opposite of single-step) for debugging.
In addition, the help menu includes a short summary
of the MIPS assembly instructions.
In general, I like this IDE for assembly even better than some
of the IDEs that I have seen for C/C++/Java.
(The one feature that I found
a little unintuitive is that if you want to look at the
stack (for example) instead of data, you must go to the
Data Segment window pane, and use the drop-down menu at the
bottom of that pane to choose "current $sp" instead of ".data".)
- Please note the
three sample assembly programs, along with an accompanying
tutorial on the same web page.
- I'd appreciate it if you could be on the lookout
for any unusual issues, and report them promptly (along with
possible workarounds), so the rest of the class can benefit.
Thanks very much for your help on this.

GDB (GNU DeBugger):
A Few Simple Debugging Commands Save You Hours of Work
INVOKING gdb:
gdb --args
Example: gdb --args ./a.out myargs
COMMON gdb COMMANDS:
BREAKPOINTS: break, continue
STARTING: break main, run
WHERE AM I: info threads, thread 1; where, frame 2; list
PRINT: ptype, print ( ptype argv[0]; print argv[0] )
EXECUTING: next, step, until, finish, continue
(next line, step inside fnc, until previously
unseen line (escape a loop), finish current fnc,
continue to next breakpoint)
EXIT: quit
< Cursor keys work, TAB provides auto-complete >
PARENT AND CHILD PROCESSES:
set follow-fork-mode child
(after call to fork(), switch to debugging child process)
DEBUGGING INFINITE LOOP:
run; TYPE ^C (ctrl-C) TO TALK TO gdb; where; print var1; etc.
HELP: help e.g.: (gdb) help continue
ADVANCED gdb: USING gdbinit:
In a more sophisticated debugging session, you may need to run
GDB many times. To make those sessions repeatable, use:
gdb -x gdbinit --args ./a.out myargs
First, create a gdbinit file. An easy way: in your last GDB session, run
(gdb) show commands
Then copy-paste those commands into gdbinit and edit as desired.
Then at the top of gdbinit, insert:
# gdb -x gdbinit --args ./a.out myargs [Customize as needed]
set breakpoint pending on
set pagination off
# Stop if about to exit:
break _exit
# Continue w/ output of 'show commands'; Extend and modify as needed
MORE ADVANCED gdb:
info functions
IF DEBUGGING C++ CODE, you will find this useful, in order
to discover the full GDB name of your target method.
macro expand MACRO_FNC(ARG) [requires compiling with 'gcc -g3']
debugging child after fork:
set follow-fork-mode child
break fork
run
EXTENDED "WHERE AM I":
set detach-on-fork off
info inferiors; inferior 2
info threads; thread 1
NOTE: The qualified thread number is: <inferior>.<thread>.
So, 'thread 2.3' switches to thread 3 in inferior 2.
where; frame 2
list; print myvariable
And for the adventurous, consider this full-featured
GDB Dashboard. It's a nicer version of things like "layout src".
For digging deeper into GDB, try:
"gdb - customize it the way you want".
NOTE: For those who like a
full-screen display of the current code, try the command
^Xa
(ctrl-X a) to turn full-screen mode on and off. Also, try:
(gdb) layout split
And finally, try "focus cmd" and "focus src", or ^Xo, to decide which pane
you want your cursor keys to operate on.
NOTE: For those who like to try a
reversible version of GDB, see
rr-project.org
(for GDB with reverse execution)
NOTE: For a _really cool_
GDB interface, look at: https://github.com/cyrus-and/gdb-dashboard. To try it out,
just go to the .gdbinit file from that web site, and then copy
the code into a local file,
gdbinit-dashboard,
and then execute
source gdbinit-dashboard
in GDB. Personally, I like to use Dashboard by
displaying the whole dashboard in another terminal.
Python
There are some good introductory materials for Python in
the instructor's directory.
After trying those out, here are some other Python background
materials:
Python
Virtual memory
The following note by Rob Landley
is a truly excellent summary of the most important points of
virtual memory as it's really used (not just the textbook theoretical
synopsis):
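A concrete taste of one point that note makes: a process's "memory" is just a set of kernel-managed mappings. This minimal Python sketch asks the kernel for an anonymous mapping (the same mechanism used behind the scenes for large heap allocations) and then uses it as ordinary memory.

```python
import mmap

# Request one page of anonymous (not file-backed) memory from the kernel.
PAGE = mmap.PAGESIZE          # typically 4096 bytes
m = mmap.mmap(-1, PAGE)       # fd = -1 means an anonymous mapping
m[:5] = b"hello"              # write through the mapping
data = bytes(m[:5])           # read it back like any buffer
m.close()
print(data)
```

Until the first write, the kernel need not even back the page with physical memory; the mapping is pure bookkeeping, which is the Landley note's central theme.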
Motivation for multi-core CPUs (and hence, multi-threading)
Current Events
NEWS (from 2022 and earlier):
SPECULATION ABOUT FUTURE CHIPS
AND THE END OF MOORE'S LAW:
(Many of these articles
are from digitimes.com.)
For context, note from the
Wikipedia Silicon article
that the covalent radius
of silicon is 0.111 nm. So, the distance between adjacent silicon atoms
is the diameter (0.222 nm).
7 nm is approximately 31.5 silicon atom diameters.
5 nm is approximately 22.5 silicon atom diameters.
3 nm is approximately 13.5 silicon atom diameters.
2 nm is approximately 9.0 silicon atom diameters.
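The figures above follow directly from the quoted radius; a quick check in Python:

```python
# Atom diameter = 2 * covalent radius of silicon (0.111 nm, per the
# Wikipedia figure quoted above); each node is node/diameter atoms wide.
diameter_nm = 2 * 0.111
for node_nm in (7, 5, 3, 2):
    print(f"{node_nm} nm ~ {node_nm / diameter_nm:.1f} silicon atom diameters")
```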
- NEWS (Dec. 30, 2022):
TSMC gearing up for 3nm capacity expansion, 2nm fab construction
- NEWS (Dec. 5, 2022):
Intel unveils 2D and 3D IC research breakthroughs to extend
Moore's Law
- Moore's Law and future chip technologies:
- NEWS (Oct. 14, 2022):
TSMC to see 3nm generate 4-6% of 2023 revenue
- NEWS (Oct. 7, 2022):
Apple preparing for 2nm chips
- NEWS (Oct. 4, 2022):
Samsung plans 1.4nm for 2027 while polishing up 3D packaging
technology
- NEWS (Sept. 21, 2022):
Nvidia launches next-gen GeForce RTX built on TSMC N4 node
- NEWS (July 19, 2022):
Competition in 3nm smartphone AP market to take place in 2H23
- NEWS (July 12, 2022):
Volkswagen constructing first in-house battery plant in Europe,
considering 'big moves' in China
- NEWS (Apr. 21, 2022):
US bid to boost chipmaking to be expensive and wasteful,
says TSMC founder Morris Chang
- NEWS (Feb. 21, 2022):
Global chipmakers find ways to improve competitiveness
($44.0B, $38.0B, and $28.0B capital expenditures in 2022
by TSMC, Samsung, and Intel)
- NEWS (Jan. 19, 2022):
Intel is on track to adopt 0.55 High-NA EUV lithography in 2025
("NA" is "numerical aperture"; or alt or alt2)
- NEWS (Dec. 7, 2021):
The Great Tech Rivalry: China vs the U.S.
(by the Avoiding Great Power War Project at Harvard University's
Belfer Center;
while an excellent review
of technical progress in the U.S. and China, it unfortunately
does not review the fast rising technical progress in India)
- NEWS (12/24/21):
TSMC to move 3nm process to commercial production in 4Q22
- NEWS (12/23/21):
Autonomous delivery picking up in US
- NEWS (12/1/21):
TSMC enters pilot production of 3nm chips
- NEWS (10/8/21):
TSMC on track to ramp 3nm chip production (in second half, 2022)
- NEWS (8/31/21):
WHY THE GLOBAL CHIP SHORTAGE IS MAKING IT SO HARD TO BUY A PS5
(about the semiconductor supply chain)
- NEWS (8/18/21):
Samsung unlikely to move 3nm GAA process to volume production until 2023
- NEWS (7/2/21):
Micron to adopt EUV in DRAM manufacturing by 2024
- NEWS (4/16/21):
TSMC to boost 5nm chip output in 2H21
- NEWS (4/8/21):
Microsoft unveils liquid cooling solution for datacenters
- NEWS (3/26/21):
Ball screws with smart maintenance
- NEWS (12/15/20):
TSMC to see 20% rise in 5nm shipments in first half of 2021
- NEWS (12/2/20):
TSMC to roll out 3nm Plus process in 2023
- NEWS (10/16/20):
TSMC expects 5nm chip sales to boost in 2021
Hiwin develops EV-use smart ball screws
(another source of high demand for CPU chips:
Ball screws with smart maintenance)
- NEWS (3/24/21):
Intel announces US$20 billion fab expansion plans in foundry revamp
- NEWS (3/17/21):
Micron to shift resources from 3D XPoint to CXL memory
(Definitions:
CXL: Compute-Express Link; 3D XPoint (aka Optane memory);
CDI: Composable disaggregated infrastructure)
- What is composable infrastructure?
("One workload could be a compute-heavy application requiring a lot
of CPU power, while another could be memory-heavy. The application
can grab whatever it needs at the time that it runs, and when it's
done, it returns it to the pool.")
- CXL initiative tackles memory challenges in heterogeneous computing
("In CXL, we start with CPUs, with cacheable memory both North and
South, both to its own memory and to the accelerator memory. Those
two pools would be part of the coherent memory pool addressable by
both machines." ... "In a data center, CXL operates primarily at the
node-level layer .... For the rack and row levels, the open systems
Gen-Z interconnect can provide memory-semantic access to data and
devices via direct-attached, switched or fabric topologies." ...
"CXL and Gen-Z are very complementary.")
- NEWS (2/26/21):
Highlights of the day: TSMC expanding 5nm capacity
- NEWS (10/8/20):
TSMC likely to make another upward adjustment to capex outlook
... due to strong demand for 7nm and 5nm
- NEWS (9/24/20):
TSMC mulling more 2nm capacity
- NEWS (9/21/20):
TSMC reportedly adopts GAA transistors for 2nm chips
- NEWS (7/27/20):
Intel may expand partnership with TSMC (7nm chips)
OLDER NEWS from Spring, 2015:
NEWS:
Talk by Yale Patt (famous researcher
in Computer Architecture)
NEWS:
2015 CCIS Colloquia (research talks by invited guests to CCIS: topics including security, big data, social networks, robotics, natural language, etc.)
NEWS:
Android Apps that Never Die (talk by me, Gene Cooperman, and Rohan Garg, at ACM undergrad chapter: 6 p.m., Wed., Feb. 25, 104 WVG) (pizza included)
NEWS:
One VLSI fabrication facility: $6.2 billion as of 2014
(from digitimes.com):
UMC to build 12-inch fab in Xiamen
(Many of these articles
are from digitimes.com.
10 nm is approximately 45 silicon atom diameters.
5 nm is approximately 22.5 silicon atom diameters.
3 nm is approximately 13.5 silicon atom diameters.)

- TOP500 Supercomputer Sites
- NOTE: This writing is as of Fall, 2021. (Things change quickly
in this area.) In terms of CPUs, we are in a three-way race
among chips from Intel, AMD, and ARM (but also the IBM Power9
CPU chip on supercomputers, such as the
Summit supercomputer).
In the past, it was
between Intel and AMD. The current largest supercomputer
is
Fugaku in Japan, based on 7-nanometer ARM chips (with no GPUs).
China's upcoming
Tianhe-3 supercomputer will also be based on ARM.
Similarly, the
Apple M1 chips are based on A64 ARM (with 4 big and 4 little
cores). Most of the remaining top supercomputers are based
on Intel or AMD, and often include NVIDIA GPUs on each node.
Intel will soon be offering CPUs with discrete
Intel Xe GPU chips, also to be included in the upcoming
Aurora supercomputer. Meanwhile, the upcoming
El Capitan supercomputer
will be based on AMD CPUs and AMD GPUs, and will be used
especially by the National Nuclear Security Administration
(NNSA) for nuclear weapon modeling.
- Lists of Top 500
supercomputers in the world, with Top 10 on first page.
- Some recent blogs from the TOP500 site:
-
El Capitan, Frontier, and Aurora: exascale supercomputers
(appearing from 2021--2023; see especially the table, lower down
in this article)
Deep Learning (a motivator for high-end HPC):
Deep Learning on NVIDIA GPUs
DeepMind Beats Human Champion at Game of Go
(in 2015)
"[The
deep learning algorithm of] AlphaGo doesn't actually use that much
hardware in play, but we needed a lot of hardware to train it and do all
the different versions and have them play each other in tournaments on
the cloud. That takes quite a lot of hardware to do efficiently, so we
couldn't have done it in this time frame without those resources."
Relative Popularity of Different Languages
Benchmark Games (Which is faster, C, Java, or Python?):
(Benchmarks are notoriously variable. Be careful about how you interpret this.)
Three Newer Languages (with lessons from Scheme/Java/C/C++)
- Go (widely used at Google;
also the source language for
Docker,
a new type of lightweight virtual machine built on
top of Linux containers)
- Rust (grew out
of Mozilla, the developer of Firefox; may be used for a future
version of Firefox)
- Scala (runs on JVM;
Spark, a proposed successor to Hadoop, is built using Scala,
and supports Scala, Java, and Python)
