Web Resources for CS7600
(Intensive Computer Systems)
CS 7600 (Spring, 2023): Intensive Computer Systems
(This page now includes the up-to-date syllabus for Spring, 2023.)
The core philosophy of the course is
to avoid reading or writing massive amounts of code. Instead,
the emphasis will be on
a high ratio of new concepts and learning per line of code written.
It's about concepts, not code!
*** Three course principles ***
(concerning relevance to all C.S. disciplines,
and concerning respect for students' time)
Fun stuff:
Before doing too deep a dive, you might want to take a look at
some fun stuff.
This includes
some of the news on the web and enrichment topics that we will
discuss from time to time.
Course basics:
Here are some immediate resources in the course for doing homework, following
the readings, etc.
- Course home:
- The course materials will always be available either from this web page or
from the course directory on the Khoury Linux computers: /course/cs7600/.
- Linux:
- The course is based on Linux (the third of the three popular
operating systems). If you do not yet have a Khoury Linux account, get one.
If you are not fluent in the basics of Linux/UNIX,
please go to any of the good tutorials on the Web.
Personally, I like:
*** this Linux tutorial *** (from M. Stonebank,
U. of Surrey, England), or this
*** updated one
for Linux/bash *** (by C. Tothill)
Other useful tips on Linux:
- Productivity tip: In most settings, the four cursor keys work
(for editing the current or previous commands);
and the <TAB> key works (for auto-completion of filenames and commands).
- If you're using Apple (the second most popular operating system), the
terminal window provides you with a UNIX-like environment.
- If you're using Windows (the most popular operating system), you can
install WSL2 (Windows Subsystem for Linux, version 2).
This gives you a Linux virtual machine side-by-side with Windows.
If you are unsure of which Linux distribution to use, Ubuntu is
a common first choice for those unfamiliar with Linux. Some
"gotchas" and workarounds follow.
After installing WSL, a known bug from May, 2021 causes
the vmmem process to consume a lot of
CPU time at wake-after-sleep, if WSL was running. There is
a workaround here.
(When I followed the instructions for WSL2, I got
Error 0x1bc and I fixed it by downloading
the MSI package listed here. I then needed to enter "Windows
Features" in the search bar, and then select "Windows Hypervisor
Platform", and then go back to following the Microsoft
Windows instructions. Some people may also need to
turn on virtualization in their BIOS. After entering BIOS,
examine all options in all tabs with the letter 'v'. For me,
the option was "AMD SVM".)
(Microsoft will be offering a Linux graphic Desktop in the near
future. Finally, for the adventurous, you can build a Linux
graphics Desktop now by
following this article (using the lighter
weight XDMCP instead of VNC).
I also have an xlaunch-genie.sh
script that slightly automates some of those instructions.
And if you want to go further, here is
how to relay Windows sound to WSL2 PulseAudio, and here is
how to use CUDA with WSL2 for NVIDIA GPUs.)
(Another option is Ubuntu
Multipass, which requires VirtualBox, not necessarily
compatible
with WSL's use of Hyper-V. And here is a good
article on installing the full Linux desktop for WSL2.
However, once WSL2 grabs RAM for the desktop, it seems to not want
to give it back to the rest of Windows. So, you'll want to
limit RAM for WSL2 as in the guide.)
-
In order to copy files between Windows and WSL, I create a link
inside WSL to the 'Downloads' folder:
cd
ls /mnt/c/Users
# For me, I am user 16176 in Windows. So, I type:
ln -s /mnt/c/Users/16176/Downloads ./
# You can now copy (cp) between ~/Downloads and ~ in WSL,
# where '~' is the Linux shortcut for your home directory.
- But if you insist on not using WSL2, then please try
putty
(substitute for 'ssh') and
WinSCP (substitute
for 'scp').
- Regardless of whether you are using Linux, Apple, or Windows/WSL2
on your laptop, practice using
ssh USERNAME@login.ccs.neu.edu
and
scp myfile.txt USERNAME@login.ccs.neu.edu:
(and notice the final colon with the 'scp' command line).
For help, try:
man ssh
and
man scp
- Syllabus:
- The syllabus
contains the required readings in the textbook. It's available
from the link or from /course/cs7600/ (the course directory)
on Khoury Linux computers. Note especially our online
textbook ostep.org.
- Homework:
- The homework subdirectory
of the course directory contains all course homework,
and the course directory contains all handouts.
- Help directory:
- The course directory includes a help directory.
There are two older reviews of UNIX there.
But please consider this
excellent modern introduction to UNIX by M. Stonebank.
(or alt,
with updates to Linux/bash by C. Tothill)
- Linux editors:
- Please also note the directory for
UNIX (Linux) editors.
On Khoury Linux, try:
cd /course/cs7600/unix-editors. The editor vi
is a popular choice. To learn vi (estimated time: one hour),
login to Khoury Linux and do:
vi /course/cs7600/editors-unix/vitutor.vi
and follow the on-screen instructions.
- MIPS assembly language:
- The assembly portion of the course will be based on MIPS.
Please download the MARS Simulator for use with
MIPS assembly (and see the detailed overview on this page).
You can find the textbook's
"Green Card" online. You can find a manual for MIPS as
Appendix A by James Larus.
Read this, and re-read it.
Some other possibly useful resources from the web are
this quick reference sheet and this old introduction to the MARS Simulator.
The course will also heavily use the MIPS simulator.
- Common Linux Commands:
- Here is one web site
(Common Linux Commands)
with about 30 common Linux commands. If you haven't used Linux
much, try a quick review of these commands (passive knowledge
only). If you need more, there are plenty of great text and
video tutorials on the Linux command line.
- C language:
- We will briefly review the C language (the core of Java, but including
primitive types, only). For a more extensive overview of C, consider
a good, free
on-line C book by Mike Banahan, Declan Brady and Mark Doran;
(See especially Chapter 5: "C Pointers and Arrays" from
Banahan or else Chapter 5 up to multi-dimensional arrays in
the classic Kernighan and Ritchie C book.)
- GDB debugger, gcc compiler, etc.:
- Some help files for Linux and its compilers,
editors, etc. are also available.
As you use Linux, please especially practice using GDB
(GNU debugger) (and see the
detailed overview on this page).
This will help when you test your homeworks under Linux.
It will also reward you with more productive debugging for the rest of
your computer science career.
- Python
- See materials on Python,
and the Python subdirectory
- UNIX xv6
- The original UNIX was small enough that one could read all
of the code, with the right guidance. The
UNIX xv6 subdirectory has an easy-to-read
hyperlinked version of UNIX xv6, with about 7,000 lines of code
(about 100 pages of code).
In order to help you in reading the xv6 code, some notes are provided at:
000-EXPLORING-xv6.txt
- Thread synchronization
- "thread-synch" subdirectory
(notes on basic use of mutex, semaphore and condition variables)
- Technical writing
-
Some systems papers for your writing project are now available:
systems papers for survey
Please pick three papers on which to write your own survey paper.
(With the instructor's permission, you may substitute a new research
paper in your area that includes a large systems component,
or write the systems section in a new research paper.)
We will begin the writing track by reviewing this
"Recipe for Writing"
Written technical
communication is a key skill. There are rules (even recipes)
for good technical
writing, and I intend to teach those rules and recipes.
For example, there are global recipes, such as reading just
the first sentence of each paragraph for content; and there
are local recipes such as the "Structure of Prose" in
"The Science of Scientific Writing"
(from American Scientist). Good technical
writing is something that you can take with you for a
lifetime.
We will also be using the course Wiki
as a central place to track progress on each of the student
survey papers.
- Debugging
-
Finally, as you progress in the course, please come back and frequently
review my list of
"Debugging and Other Systems
Tricks". As you move into working with systems in depth,
you will want to learn more of the countless systems tricks that
are typically learned only by the random interaction of "hackers"
working alongside other "hackers". I have tried to bring these pearls
together in a single web page, to make it easily accessible as part
of a gentle introduction to computer systems.
Going beyond (enrichment material):
- Some students have asked about a book for a more advanced introduction
to Linux and systems programming. This book goes well beyond what is
needed for the course. But for those who are interested, there is:
Advanced UNIX Programming, by Marc J. Rochkind
(free online access through
library.northeastern.edu)
- If you're interested in reading the original UNIX research paper, which
announced the implementation of UNIX to the world, see
The UNIX Time-Sharing System (1974).
-
Lecture on Parallel Computing (read the first half only, for
a nice overview and graphs of why multi-core is necessary, and
the benefits of many-core)
Course Resources
MIPS Simulator for Assembly Language homework (MARS):
- There is a
MIPS Assembly language simulator with
free downloads available and
online documentation of the MARS simulator.
There is also an
online video demo of the Mars simulator.
There is generous, high-quality documentation of the MIPS assembly
language with the
free, online documentation for the MIPS assembler
(including for SPIM, an early version of the MARS simulator),
and the
SPIM Quick Reference
(or an even shorter quick reference here)
- To begin running the simulator, go inside the folder, and
double-click on the Mars.jar icon. Alternatively, if running
from the command line in Linux, type:
java -jar Mars.jar
If you download Mars.jar for your computer, there are also
suggestions on that page for running Mars.
- The MARS software is distributed as a Java .jar file. It requires
Java J2SE 1.5 or later. Depending on your configuration,
you may be able to directly open it from the download menu.
If you have trouble, or if you prefer to run from the
command line on your own computer, the Java SDK is also available
for free download from the same download page. The instructions
for running it from Windows or DOS should work equally well
on Linux. The CCIS machines should already have
the necessary Java SDK installed.
- GOTCHAS: There are several important things to watch out for.
- When you hit the "Assemble" menu item, any error messages about
failure to assemble are in the
bottom window pane, tab: "Mars Messages".
Input/Output is in the bottom window pane, tab: "Run I/O"
- If you paste assembly code into the edit window pane, you must
save that code to a file before Mars will let you assemble it.
- If you have selected a box with your mouse (e.g. "Value" box in
the data window pane, or "Register" box), then Mars will not
update the value in that box. Mars assumes you prefer to write
your own value into that box, and not to allow the assembly
program to use that box.
- If your program stops at an error, read the "Mars Messages" for
the cause of the error, and then hit the "Backstep" menu item
to see what caused that error. Continue hitting "Backstep"
or "Singlestep" in order to identify the error.
- Your main routine must call the "exit" system call to terminate.
It may not simply call return ("jr $ra"). Note that page B-44
of Appendix B of the text (fourth edition) has a table of
"system services" (system calls). These allow you to do "exit"
and also I/O.
- One of the nicer features of this software is a limited
backstep capability (opposite of single-step) for debugging.
In addition, the help menu includes a short summary
of the MIPS assembly instructions.
In general, I like this IDE for assembly even better than some
of the IDEs that I have seen for C/C++/Java.
(The one feature that I found
a little unintuitive is that if you want to look at the
stack (for example) instead of data, you must go to the
Data Segment window pane, and use the drop-down menu at the
bottom of that pane to choose "current $sp" instead of ".data".)
- Please note the
three sample assembly programs, along with an accompanying
tutorial on the same web page.
I'd appreciate it if you could be on the lookout
for any unusual issues, and report them promptly (along with
possible workarounds), so the rest of the class can benefit.
Thanks very much for your help on this.
GDB (GNU DeBugger):
A Few Simple Debugging Commands Save You Hours of Work
INVOKING gdb:
gdb --args <COMMAND LINE>
Example: gdb --args ./a.out myargs
COMMON gdb COMMANDS:
BREAKPOINTS: break, continue
STARTING: break main, run
WHERE AM I: info threads, thread 1; where, frame 2; list
PRINT: ptype, print ( ptype argv[0]; print argv[0] )
EXECUTING: next, step, until, finish, continue
(next line, step inside fnc, until previously
unseen line (escape a loop), finish current fnc,
continue to next breakpoint)
EXIT: quit
< Cursor keys work, TAB provides auto-complete >
PARENT AND CHILD PROCESSES:
set follow-fork-mode child
(after call to fork(), switch to debugging child process)
DEBUGGING INFINITE LOOP:
run; TYPE ^C (ctrl-C) TO TALK TO gdb; where; print var1; etc.
HELP: help e.g.: (gdb) help continue
ADVANCED gdb: USING gdbinit:
In a more sophisticated debugging session, you may have to try
multiple GDB sessions. To do this, you will want to try doing:
gdb -x gdbinit --args ./a.out myargs
First, create a gdbinit file. An easy way is, in your last GDB session,
(gdb) show commands
Then copy-paste those commands into gdbinit and edit as desired.
Then at the top of gdbinit, insert:
# gdb -x gdbinit --args ./a.out myargs [Customize as needed]
set breakpoint pending on
set pagination off
# Stop if about to exit:
break _exit
# Continue w/ output of 'show commands'; Extend and modify as needed
MORE ADVANCED gdb:
info function
IF DEBUGGING C++ CODE, you will find this useful, in order
to discover the full GDB name of your target method.
macro expand MACRO_FNC(ARG) [requires compiling with 'gcc -g3']
debugging child after fork:
set follow-fork-mode child
break fork
run
EXTENDED "WHERE AM I":
set detach-on-fork off
info inferiors; inferior 2
info threads; thread 1
NOTE: The qualified thread number is: <INFERIOR>.<THREAD>
So, 'thread 2.3' switches to thread 3 in inferior 2.
where; frame 2
list; print myvariable
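The follow-fork-mode setting matters because fork() leaves two processes running the same code, and GDB has to pick one of them to follow. Here is a minimal sketch of that split, written in Python purely for illustration (the GDB commands above target compiled programs, typically C). The function name fork_and_report is hypothetical, and the sketch assumes a Linux or other UNIX-like system where os.fork() is available.

```python
import os

def fork_and_report():
    """Fork once; return (child PID seen by the parent, PID reported by the child)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # returns 0 in the child, and the child's PID in the parent
    if pid == 0:
        # Child process: report our own PID through the pipe, then exit at once.
        os.close(read_fd)
        os.write(write_fd, str(os.getpid()).encode())
        os._exit(0)  # skip the parent's cleanup handlers
    # Parent process: read what the child wrote, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        reported = int(pipe.read())
    os.waitpid(pid, 0)
    return pid, reported

if __name__ == "__main__":
    child_pid, reported = fork_and_report()
    print("parent", os.getpid(), "forked child", child_pid, "which reported", reported)
```

Both branches of the "if" run, each in a different process. With "set follow-fork-mode child", GDB would continue in the process that takes the pid == 0 branch; by default it stays with the parent.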
And for the adventurous, consider this full-featured
GDB Dashboard. It's a nicer version of things like "layout src".
For digging deeper into GDB, try:
"gdb - customize it the way you want".
NOTE: For those who like a
full-screen display of the current code, try the command
^Xa
(ctrl-X a) to turn full-screen mode on and off. Also, try:
(gdb) layout split
And finally, try "focus cmd" and "focus src", or ^Xo, to decide which pane
you want your cursor keys to operate on.
NOTE: For those who like to try a
reversible version of GDB, see
rr-project.org
(for GDB with reverse execution)
NOTE: For a _really cool_
GDB interface, look at: https://github.com/cyrus-and/gdb-dashboard. To try it out,
just go to the .gdbinit file from that web site, and then copy
the code into a local file,
gdbinit-dashboard,
and then execute
source gdbinit-dashboard
in GDB. Personally, I like to use Dashboard by
displaying the whole dashboard in another terminal.
Python
There are some good introductory materials for Python in
the instructor's directory.
After trying those out, here are some other Python background
materials:
Python
Virtual memory
The following note by Rob Landley
is a truly excellent summary of the most important points of
virtual memory as it's really used (not just the textbook theoretical
synopsis):
Motivation for multi-core CPUs (and hence, multi-threading)
Memory Consistency Models
When we use lock-free algorithms, we would often like to assume
a strong memory consistency model, such as sequential consistency.
Many CPUs offer only a relaxed consistency model. This can affect the
design of your lock-free algorithm.
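To make "sequential consistency" concrete, here is a minimal sketch that enumerates every outcome it allows for the classic store-buffering test: thread 1 runs "x = 1; r1 = y" while thread 2 runs "y = 1; r2 = x". Under sequential consistency, each execution is some interleaving of the two threads that preserves each thread's program order, so listing the interleavings lists the allowed results. The names T1, T2, interleavings, run and sc_outcomes are hypothetical, and this is a simulation, not a measurement of real hardware.

```python
from itertools import combinations

# Each operation is (kind, variable, argument).
# A store writes the argument to memory; a load copies memory into the named register.
T1 = [("store", "x", 1), ("load", "y", "r1")]
T2 = [("store", "y", 1), ("load", "x", "r2")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's internal order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for kind, var, arg in schedule:
        if kind == "store":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers["r1"], registers["r2"]

def sc_outcomes():
    """Set of (r1, r2) results reachable under sequential consistency."""
    return {run(schedule) for schedule in interleavings(T1, T2)}

if __name__ == "__main__":
    print(sorted(sc_outcomes()))
```

The result (0, 0) never appears in this set. Relaxed models that let a store wait in a store buffer while a later load executes (x86-TSO is the usual example) do permit r1 == r2 == 0, which is the kind of surprise a lock-free algorithm has to be designed around.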
- Note this truly excellent tutorial on memory consistency models
from 1995:
Shared Memory Consistency Models: A Tutorial by Sarita V. Adve.
- Here's a really accessible set of slides on lock-free algorithms:
Lock-Free Programming by Geoff Langdale.
- Double-checked locking is an important issue for lock-free
algorithms. See Double-Checked Locking is Fixed In C++11:
"The double-checked locking pattern (DCLP) is a bit of a notorious
case study in lock-free programming. Up until 2004, there was no
safe way to implement it in Java. Before C++11, there was no safe
way to implement it in portable C++."
- Some useful Wikipedia articles:
- Lock-free data structures generally suffer from the ABA problem.
Two solutions to the ABA problem are the double-word
atomic CAS (compare-and-swap) and hazard pointers. Here is an
article with both types of solution.
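The ABA problem and the double-word CAS remedy can be sketched in a few lines. The code below is a single-threaded simulation, assuming a hypothetical Cell class whose cas and cas_tagged methods stand in for atomic hardware instructions; the "threads" are just successive steps in one function. It does not cover hazard pointers.

```python
class Cell:
    """One shared memory word, plus a version tag bumped on every update."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def cas(self, expected, new):
        """Single-word CAS: succeeds if the value alone matches."""
        if self.value == expected:
            self.value = new
            self.version += 1
            return True
        return False

    def cas_tagged(self, expected, expected_version, new):
        """Double-word CAS: value and version tag must both match."""
        if self.value == expected and self.version == expected_version:
            self.value = new
            self.version += 1
            return True
        return False

def aba_demo():
    cell = Cell("A")
    # "Thread 1" reads the word and its tag, then is preempted.
    seen_value, seen_version = cell.value, cell.version
    # "Thread 2" runs in the meantime: A -> B -> A.
    cell.cas("A", "B")
    cell.cas("B", "A")
    # "Thread 1" resumes. The tagged CAS notices the intervening updates...
    tagged_ok = cell.cas_tagged(seen_value, seen_version, "C")
    # ...while the plain CAS sees "A" again and wrongly concludes nothing changed.
    plain_ok = cell.cas(seen_value, "C")
    return plain_ok, tagged_ok

if __name__ == "__main__":
    print(aba_demo())
```

Here aba_demo() returns (True, False): the value-only comparison is fooled, and the value-plus-tag comparison is not. On real hardware the value and the tag must be compared and swapped as one atomic unit, which is why the technique needs a double-word CAS.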
Current Events
NEWS (from 2023 and earlier):
SPECULATION ABOUT FUTURE CHIPS
AND THE END OF MOORE'S LAW:
(Many of these articles
are from digitimes.com.)
For context, note that the
Wikipedia Silicon article
gives the covalent radius
of silicon as 0.111 nm. So, the distance between adjacent silicon atoms
is the diameter (0.222 nm).
7 nm is approximately 31.5 silicon atom diameters.
5 nm is approximately 22.5 silicon atom diameters.
3 nm is approximately 13.5 silicon atom diameters.
2 nm is approximately 9.0 silicon atom diameters.
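The figures above follow from dividing each feature size by the atom diameter. A short sketch of that arithmetic, using the 0.111 nm covalent radius quoted above (the function name atom_diameters is just for illustration):

```python
COVALENT_RADIUS_NM = 0.111                  # covalent radius of silicon, as quoted above
ATOM_DIAMETER_NM = 2 * COVALENT_RADIUS_NM   # 0.222 nm between adjacent atoms

def atom_diameters(feature_nm):
    """How many silicon atom diameters fit in a length of feature_nm nanometers."""
    return feature_nm / ATOM_DIAMETER_NM

for node_nm in (7, 5, 3, 2):
    print(f"{node_nm} nm is approximately {atom_diameters(node_nm):.1f} silicon atom diameters")
```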
Motivation for multi-core CPUs:
Limits of CPU Power Trends
at the beginning of the millennium (from here)
- NEWS (Oct., 20, 2023):
Samsung reveals eMRAM and BCD roadmap while pushing automotive chip down to 2 nm
(and see Wikipedia article on how close MRAM is toward
competing with other RAM technologies)
- NEWS (Sept., 19, 2023):
TSMC to put off 2nm mass production until 2026
- NEWS (Sept., 17, 2023):
Development of SSMB EUV Light Source at THU
(Tsinghua University)
(and see more recent Chinese blog; click on "Google Translate")
- NEWS (Jul., 30, 2023):
Japan is eyeing heterogeneous integration on way to mass-produce 2nm chips
- NEWS (Jul., 3, 2023):
Samsung set to commercialize 2nm chips in 2025, 1.4nm by 2027
- NEWS (Jun., 28, 2023):
Surging iPhone 15 series orders may boost TSMC revenue in the third quarter by 11% (New chip moving to 3nm, and Apple will book 90% of TSMC's 3nm production.)
- NEWS (Apr., 23, 2023):
TSMC starts 2nm pre-production, targets mass production by 2025: report
- NEWS (Apr., 23, 2023):
Tech war: China's top memory chip maker YMTC making progress in producing advanced 3D NAND products with locally sourced equipment: sources
(or here)
- NEWS (Dec., 30, 2022):
TSMC gearing up for 3nm capacity expansion, 2nm fab construction
- NEWS (Dec., 5, 2022):
Intel unveils 2D and 3D IC research breakthroughs to extend
Moore's Law
- Moore's Law and future chip technologies:
- NEWS (Oct., 14, 2022):
TSMC to see 3nm generate 4-6% of 2023 revenue
- NEWS (Oct., 7, 2022):
Apple preparing for 2nm chips
- NEWS (Oct., 4, 2022):
Samsung plans 1.4nm for 2027 while polishing up 3D packaging
technology
- NEWS (Sept, 21, 2022):
Nvidia launches next-gen GeForce RTX built on TSMC N4 node
- NEWS (July 19, 2022):
Competition in 3nm smartphone AP market to take place in 2H23
- NEWS (July 12, 2022):
Volkswagen constructing first in-house battery plant in Europe,
considering 'big moves' in China
- NEWS (Apr. 21, 2022):
US bid to boost chipmaking to be expensive and wasteful,
says TSMC founder Morris Chang
- NEWS (Feb. 21, 2022):
Global chipmakers find ways to improve competitiveness
($44B, $38B, and $28B capital expenditures in 2022
by TSMC, Samsung and Intel)
- NEWS (Jan. 19, 2022):
Intel is on track to adopt 0.55 High-NA EUV lithography in 2025
("NA" is "numerical aperture"; or alt or alt2)
- NEWS (Dec. 7, 2021):
The Great Tech Rivalry: China vs the U.S.
(by Avoiding Great Power War Project at Harvard University's
Belfer Center;
while an excellent review
of technical progress in the U.S. and China, it unfortunately
does not review the fast rising technical progress in India)
- NEWS (12/24/21):
TSMC to move 3nm process to commercial production in 4Q22
- NEWS (12/23/21):
Autonomous delivery picking up in US
- NEWS (12/1/21):
TSMC enters pilot production of 3nm chips
- NEWS (10/8/21):
TSMC on track to ramp 3nm chip production (in second half, 2022)
- NEWS (8/31/21):
WHY THE GLOBAL CHIP SHORTAGE IS MAKING IT SO HARD TO BUY A PS5
(about the semiconductor supply chain)
- NEWS (8/18/21):
Samsung unlikely to move 3nm GAA process to volume production until 2023
- NEWS (7/2/21):
Micron to adopt EUV in DRAM manufacturing by 2024
- NEWS (4/16/21):
TSMC to boost 5nm chip output in 2H21
- NEWS (4/8/21):
Microsoft unveils liquid cooling solution for datacenters
- NEWS (3/26/21):
Ball screws with smart maintenance
- NEWS (12/15/20):
TSMC to see 20% rise in 5nm shipments in first half of 2021
- NEWS (12/2/20):
TSMC to roll out 3nm Plus process in 2023
- NEWS (10/16/20):
TSMC expects 5nm chip sales to boost in 2021
Hiwin develops EV-use smart ball screws
(another source of high demand for CPU chips:
Ball screws with smart maintenance)
- NEWS (3/24/21):
Intel announces US$20 billion fab expansion plans in foundry revamp
- NEWS (3/17/21):
Micron to shift resources from 3D XPoint to CXL memory
(Definitions:
CXL: Compute Express Link; 3D XPoint (aka Optane memory);
CDI: Composable disaggregated infrastructure)
- What is composable infrastructure?
("One workload could be a compute-heavy application requiring a lot
of CPU power, while another could be memory-heavy. The application
can grab whatever it needs at the time that it runs, and when it's
done, it returns it to the pool.")
- CXL initiative tackles memory challenges in heterogeneous computing
("In CXL, we start with CPUs, with cacheable memory both North and
South, both to its own memory and to the accelerator memory. Those
two pools would be part of the coherent memory pool addressable by
both machines." ... "In a data center, CXL operates primarily at the
node-level layer .... For the rack and row levels, the open systems
Gen-Z interconnect can provide memory-semantic access to data and
devices via direct-attached, switched or fabric topologies." ...
"CXL and Gen-Z are very complementary.")
- NEWS (2/26/21):
Highlights of the day: TSMC expanding 5nm capacity
- NEWS (10/8/20):
TSMC likely to make another upward adjustment to capex outlook
... due to strong demand for 7nm and 5nm
- NEWS (9/24/20):
TSMC mulling more 2nm capacity
- NEWS (9/21/20):
TSMC reportedly adopts GAA transistors for 2nm chips
- NEWS (7/27/20):
Intel may expand partnership with TSMC (7nm chips)
OLDER NEWS from Spring, 2015:
NEWS:
Talk by Yale Patt (famous researcher
in Computer Architecture)
NEWS:
2015 CCIS Colloquia (research talks by invited guests to CCIS: topics including security, big data, social networks, robotics, natural language, etc.)
NEWS:
Android Apps that Never Die (talk by me, Gene Cooperman, and Rohan Garg, at ACM undergrad chapter: 6 p.m., Wed., Feb. 25, 104 WVG) (pizza included)
NEWS:
One VLSI fabrication facility: $6.2 billion as of 2014
(from digitimes.com):
UMC to build 12-inch fab in Xiamen
- TOP500 Supercomputer Sites
- NOTE: This writing is as of Fall, 2021. (Things change quickly
in this area.) In terms of CPUs, we are in a three-way race
among chips from Intel, AMD, and ARM (but also the IBM Power9
CPU chip on supercomputers, such as the
Summit supercomputer).
In the past, it was
between Intel and AMD. The current largest supercomputer
is
Fugaku in Japan, based on 7 nanometer ARM chips (with no GPUs).
China's upcoming
Tianhe-3 supercomputer will also be based on ARM.
Similarly, the
Apple M1 chips are based on A64 ARM (with 4 big and 4 little
cores). Most of the remaining top supercomputers are based
on Intel or AMD, and often include NVIDIA GPUs on each node.
Intel will soon be offering CPUs with discrete
Intel Xe GPU chips, also to be included in the upcoming
Aurora supercomputer. Meanwhile, the upcoming
El Capitan supercomputer
will be based on AMD CPUs and AMD GPUs, and will be used
especially by the National Nuclear Security Administration
(NNSA) for nuclear weapon modeling.
- Lists of Top 500
supercomputers in the world, with Top 10 on first page.
- Some recent blogs from the TOP500 site:
-
El Capitan, Frontier, and Aurora: exascale supercomputers
(appearing from 2021-2023; see especially the table, lower down
in this article)
Deep Learning (a motivator for high-end HPC):
Deep Learning on NVIDIA GPUs
DeepMind Beats Human Champion at Game of Go
(in 2015)
"[The
deep learning algorithm of] AlphaGo doesn't actually use that much
hardware in play, but we needed a lot of hardware to train it and do all
the different versions and have them play each other in tournaments on
the cloud. That takes quite a lot of hardware to do efficiently, so we
couldn't have done it in this time frame without those resources."
Relative Popularity of Different Languages
Benchmark Games (Which is faster, C, Java, or Python?):
(Benchmarks are notoriously variable. Be careful about how you interpret this.)
Three Newer Languages (with lessons from Scheme/Java/C/C++)
- Go (widely used at Google;
also the source language for
Docker,
a new type of lightweight virtual machine built on
top of Linux containers)
- Rust (grew out
of Mozilla, the developer of Firefox; may be used for a future
version of Firefox)
- Scala (runs on JVM;
Spark, a proposed successor to Hadoop, is built using Scala,
and supports Scala, Java, and Python)