Web Resources for CS7600 (Intensive Computer Systems)

Instructor: Gene Cooperman
Spring, 2023

CS 7600 (Spring, 2023): Intensive Computer Systems

(This page now includes the up-to-date syllabus for Spring, 2023.)
The core philosophy of the course is to avoid reading or writing massive amounts of code. Instead, the emphasis will be on a high ratio of new concepts and learning per line of code written.
It's about concepts, not code!

*** Three course principles *** (concerning relevance to all C.S. disciplines, and concerning respect for students' time)

===============================
Fun stuff: Before doing too deep a dive, you might want to take a look at some fun stuff. This includes some of the news on the web and enrichment topics that we will discuss from time to time.

===============================

Course basics: Here are some immediate resources in the course for doing homework, following the readings, etc.

Course home:
The course materials will always be available either from this web page or from the course directory on the Khoury Linux computers: /course/cs7600/.
Linux:
The course is based on Linux (the third of the three popular operating systems). If you do not yet have a Khoury Linux account, get one. If you are not fluent in the basics of Linux/UNIX, please go to any of the good tutorials on the Web. Personally, I like: *** this Linux tutorial *** (from M. Stonebank, U. of Surrey, England), or this *** updated one for Linux/bash *** (by C. Tothill)
Other useful tips on Linux:
  1. Productivity tip: In most settings: the four cursor keys work (for editing the current or previous commands); and the <TAB> key works (for auto-completion of filenames and commands).
  2. If you're using Apple (the second most popular operating system), the terminal window provides you with a UNIX-like environment.
  3. If you're using Windows (the most popular operating system), you can install WSL2 (Windows Subsystem for Linux, version 2). This gives you a Linux virtual machine side-by-side with Windows. If you are unsure of which Linux distribution to use, Ubuntu is a common first choice for those unfamiliar with Linux. Some "gotchas" and workarounds follow.
    After installing WSL, a known bug from May, 2021 causes the vmmem process to consume a lot of CPU time at wake-after-sleep, if WSL was running. There is a workaround here.
    (When I followed the instructions for WSL2, I got Error 0x1bc and I fixed it by downloading the MSI package listed here. I then needed to enter "Windows Features" in the search bar, and then select "Windows Hypervisor Platform", and then go back to following the Microsoft Windows instructions. Some people may also need to turn on virtualization in their BIOS. After entering BIOS, examine all options in all tabs with the letter 'v'. For me, the option was "AMD SVM".)
    (Microsoft will be offering a Linux graphical desktop in the near future. Meanwhile, for the adventurous, you can build a Linux graphical desktop now by following this article (using the lighter weight XDMCP instead of VNC). I also have an xlaunch-genie.sh script that slightly automates some of those instructions. And if you want to go further, here is how to relay Windows sound to WSL2 PulseAudio, and here is how to use CUDA with WSL2 for NVIDIA GPUs.)
    (Another option is Ubuntu Multipass, which requires VirtualBox, which is not necessarily compatible with WSL's use of Hyper-V. And here is a good article on installing the full Linux desktop for WSL2. However, once WSL2 grabs RAM for the desktop, it seems not to want to give it back to the rest of Windows. So, you'll want to limit RAM for WSL2 as in the guide.)
  4. In order to copy files between Windows and WSL, I create a link inside WSL to the 'Downloads' folder:
    cd
    ls /mnt/c/Users
    # For me, I am user 16176 in Windows. So, I type:
    ln -s /mnt/c/Users/16176/Downloads ./
    # You can now copy (cp) between ~/Downloads and ~ in WSL,
    # where '~' is the Linux shortcut for your home directory.
  5. But if you insist on not using WSL2, then please try PuTTY (a substitute for 'ssh') and WinSCP (a substitute for 'scp').
  6. Regardless of whether you are using Linux, Apple, or Windows/WSL2 on your laptop, practice using ssh USERNAME@login.ccs.neu.edu and scp myfile.txt USERNAME@login.ccs.neu.edu: (and notice the final colon with the 'scp' command line). For help, try: man ssh and man scp.
Syllabus:
The syllabus contains the required readings in the textbook. It's available from the link or from /course/cs7600/ (the course directory) on Khoury Linux computers. Note especially our online textbook ostep.org.
Homework:
The homework subdirectory of the course directory contains all course homework, and the course directory contains all handouts.
Help directory:
The course directory includes a help directory. There are two older reviews of UNIX there. But please consider this excellent modern introduction to UNIX by M. Stonebank (or the alternative, with updates for Linux/bash, by C. Tothill).
Linux editors:
Please also note the directory for UNIX (Linux) editors. On Khoury Linux, try: cd /course/cs7600/unix-editors. The editor vi is a popular choice. To learn vi (estimated time: one hour), log in to Khoury Linux and do:
    vi /course/cs7600/editors-unix/vitutor.vi
and follow the on-screen instructions.
MIPS assembly language:
The assembly portion of the course will be based on MIPS. Please download the MARS Simulator for use with MIPS assembly (and see the detailed overview on this page). You can find the textbook's "Green Card" online. You can find a manual for MIPS as Appendix A by James Larus. Read this, and re-read it. Some other possibly useful resources from the web are this quick reference sheet and this old introduction to the MARS Simulator. The course will also heavily use the MIPS simulator.
Common Linux Commands:
Here is one web site (Common Linux Commands) with about 30 common Linux commands. If you haven't used Linux much, try a quick review of these commands (passive knowledge only). If you need more, there are plenty of great text and video tutorials on the Linux command line.
C language:
We will briefly review the C language (the core of Java, but including primitive types only). For a more extensive overview of C, consider a good, free on-line C book by Mike Banahan, Declan Brady and Mark Doran. (See especially Chapter 5: "C Pointers and Arrays" from Banahan, or else Chapter 5 up to multi-dimensional arrays in the classic Kernighan and Ritchie C book.)
GDB debugger, gcc compiler, etc.:
Some help files for Linux and its compilers, editors, etc. are also available. As you use Linux, please especially practice using GDB (GNU debugger) (and see the detailed overview on this page). This will help when you test your homework under Linux. It will also reward you with more productive debugging for the rest of your computer science career.
Python:
See materials on Python, and the Python subdirectory.
UNIX xv6:
The original UNIX was small enough that one could read all of the code, with the right guidance. The UNIX xv6 subdirectory has an easy-to-read hyperlinked version of UNIX xv6, with about 7,000 lines of code (about 100 pages of code). In order to help you in reading the xv6 code, some notes are provided at: 000-EXPLORING-xv6.txt
Thread synchronization:
"thread-synch" subdirectory (notes on basic use of mutex, semaphore and condition variables)
Technical writing:
Some systems papers for your writing project are now available:
     systems papers for survey
Please pick three papers on which to write your own survey paper. (With the instructor's permission, you may substitute a new research paper in your area that includes a large systems component, or write the systems section in a new research paper.)

We will begin the writing track by reviewing this
     "Recipe for Writing"

Written technical communication is a key skill. There are rules (even recipes) for good technical writing, and I intend to teach those rules and recipes. For example, there are global recipes, such as reading just the first sentence of each paragraph for content; and there are local recipes such as the "Structure of Prose" in
     "The Science of Scientific Writing"
(from American Scientist). Good technical writing is something that you can take with you for a lifetime.

We will also be using the course Wiki as a central place to track progress on each of the student survey papers.

Debugging:
Finally, as you progress in the course, please come back and frequently review my list of "Debugging and Other Systems Tricks". As you move into working with systems in depth, you will want to learn more of the countless systems tricks that are typically learned only by the random interaction of "hackers" working alongside other "hackers". I have tried to bring these pearls together in a single web page, to make it easily accessible as part of a gentle introduction to computer systems.

Going beyond (enrichment material):

===============================
Course Resources

MIPS Simulator for Assembly Language homework (MARS):

  1. There is a MIPS Assembly language simulator with free downloads available and online documentation of the MARS simulator. There is also an online video demo of the MARS simulator. There is generous, high-quality documentation of the MIPS assembly language with the free, online documentation for the MIPS assembler (including for SPIM, an early version of the MARS simulator), and the SPIM Quick Reference (or an even shorter quick reference here).
  2. To begin running the simulator, go inside the folder, and double-click on the Mars.jar icon. Alternatively, if running from the command line in Linux, type: java -jar Mars.jar. If you download Mars.jar for your computer, there are also suggestions on that page for running Mars.
  3. The MARS software is distributed as a Java .jar file. It requires Java J2SE 1.5 or later. Depending on your configuration, you may be able to directly open it from the download menu.

    If you have trouble, or if you prefer to run from the command line on your own computer, the Java SDK is also available for free download from the same download page. The instructions for running it from Windows or DOS should work equally well on Linux. The CCIS machines should already have the necessary Java SDK installed.

  4. GOTCHAS: There are several important things to watch out for.
    1. When you hit the "Assemble" menu item, any error messages about failure to assemble are in the bottom window pane, tab: "Mars Messages". Input/Output is in the bottom window pane, tab: "Run I/O".
    2. If you paste assembly code into the edit window pane, you must save that code to a file before Mars will let you assemble it.
    3. If you have selected a box with your mouse (e.g. "Value" box in the data window pane, or "Register" box), then Mars will not update the value in that box. Mars assumes you prefer to write your own value into that box, and not to allow the assembly program to use that box.
    4. If your program stops at an error, read the "Mars Messages" for the cause of the error, and then hit the "Backstep" menu item to see what caused that error. Continue hitting "Backstep" or "Singlestep" in order to identify the error.
    5. Your main routine must call the "exit" system call to terminate. It may not simply call return ("jr $ra"). Note that page B-44 of Appendix B of the text (fourth edition) has a table of "system services" (system calls). These allow you to do "exit" and also I/O.
  5. One of the nicer features of this software is a limited backstep capability (opposite of single-step) for debugging. In addition, the help menu includes a short summary of the MIPS assembly instructions. In general, I like this IDE for assembly even better than some of the IDEs that I have seen for C/C++/Java. (The one feature that I found a little unintuitive is that if you want to look at the stack (for example) instead of data, you must go to the Data Segment window pane, and use the drop-down menu at the bottom of that pane to choose "current $sp" instead of ".data".)
  6. Please note the three sample assembly programs, along with an accompanying tutorial on the same web page.
  7. I'd appreciate it if you could be on the lookout for any unusual issues, and report them promptly (along with possible workarounds), so the rest of the class can benefit. Thanks very much for your help on this.

===============================

The GNU debugger

GDB (GNU DeBugger):
  A Few Simple Debugging Commands Save You Hours of Work

        INVOKING gdb:
          gdb --args PROGRAM [ARGUMENTS...]
          Example:  gdb --args ./a.out myargs

        COMMON gdb COMMANDS:
          BREAKPOINTS:  break, continue
          STARTING:  break main, run
          WHERE AM I:  info threads, thread 1; where, frame 2; list
          PRINT:  ptype, print  (   ptype argv[0]; print argv[0]   )
          EXECUTING:  next, step, until, finish, continue
            (next line, step inside fnc, until previously
             unseen line (escape a loop), finish current fnc,
             continue to next breakpoint)
          EXIT:  quit
          < Cursor keys work, TAB provides auto-complete >
          PARENT AND CHILD PROCESSES:
            set follow-fork-mode child
              (after call to fork(), switch to debugging child process)
          DEBUGGING INFINITE LOOP:
            run; TYPE ^C (ctrl-C) TO TALK TO gdb; where; print var1; etc.
          HELP:  help COMMAND    e.g.: (gdb) help continue

        ADVANCED gdb: USING gdbinit:
          In a more sophisticated debugging session, you may have to try
          multiple GDB sessions.  To do this, you will want to try doing:
            gdb -x gdbinit --args ./a.out myargs
          First, create a gdbinit file. An easy way is, in your last GDB session,
            (gdb) show commands
          Then copy-paste those commands into gdbinit and edit as desired.
          Then at the top of gdbinit, insert:
            # gdb -x gdbinit --args ./a.out myargs  [Customize as needed]
            set breakpoint pending on
            set pagination off
            # Stop if about to exit:
            break _exit           
            # Continue w/ output of 'show commands'; Extend and modify as needed

        MORE ADVANCED gdb:
          info functions REGEXP
            IF DEBUGGING C++ CODE, you will find this useful, in order
            to discover the full GDB name of your target method.
          macro expand MACRO_FNC(ARG)
          debugging after fork:
            set detach-on-fork off
            break ...
            run
            info inferiors
          EXTENDED "WHERE AM I":
            info inferiors; inferior 2
            info threads; thread 1
              NOTE:  The qualified thread number is: INFERIOR.THREAD
                     So, 'thread 2.3' switches to thread 3 in inferior 2.
            where; frame 2
            list; print myvariable
For digging deeper into GDB, try: "gdb - customize it the way you want".

  NOTE: For those who like a full-screen display of the current code, try the command ^Xa (ctrl-X a) to turn full-screen mode on and off. Also, try: (gdb) layout split

  NOTE: For those who like to try a reversible version of GDB, see rr-project.org (for GDB with reverse execution)

  NOTE: For a _really cool_ GDB interface, look at: https://github.com/cyrus-and/gdb-dashboard. To try it out, just go to the .gdbinit file from that web site, and then copy the code into a local file, gdbinit-dashboard, and then execute source gdbinit-dashboard in GDB. Personally, I like to use Dashboard by displaying the whole dashboard in another terminal.

===============================
Python

There are some good introductory materials for Python in the instructor's directory. After trying those out, here are some other Python background materials:

Python

===============================
Virtual memory

The following note by Rob Landley is a truly excellent summary of the most important points of virtual memory as it's really used (not just the textbook theoretical synopsis):

Motivation for multi-core CPUs (and hence, multi-threading)

Memory Consistency Models

When we use lock-free algorithms, we would often like to assume a strong memory consistency model, such as sequential consistency. Many CPUs offer only a relaxed consistency model. This can affect the design of your lock-free algorithm.
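
To make the difference concrete, here is a toy sketch in Python. It is not a model of any real CPU: it simply enumerates every interleaving of the classic "store buffering" test, in which thread 0 runs x = 1; r0 = y and thread 1 runs y = 1; r1 = x, with x and y initially 0. The function and variable names (interleavings, run, outcomes, T0, T1) are made up for this illustration. The relaxed case assumes only one kind of reordering, namely that a store may be delayed past a later load to a different address, roughly as a store buffer would allow.

```python
# Each operation is (kind, variable, register); register is None for stores.
T0 = [("store", "x", None), ("load", "y", "r0")]   # x = 1; r0 = y
T1 = [("store", "y", None), ("load", "x", "r1")]   # y = 1; r1 = x


def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one global order of operations against a single shared memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for kind, var, reg in schedule:
        if kind == "store":
            memory[var] = 1
        else:
            regs[reg] = memory[var]
    return (regs["r0"], regs["r1"])


def outcomes(allow_store_load_reordering):
    """Return the set of (r0, r1) results reachable under the chosen model."""
    orders0, orders1 = [T0], [T1]
    if allow_store_load_reordering:
        # Assumption of this toy model: the load may overtake the earlier store.
        orders0.append(T0[::-1])
        orders1.append(T1[::-1])
    results = set()
    for o0 in orders0:
        for o1 in orders1:
            for schedule in interleavings(o0, o1):
                results.add(run(schedule))
    return results


if __name__ == "__main__":
    print("sequentially consistent:", sorted(outcomes(False)))
    print("store-load reordering:  ", sorted(outcomes(True)))
```

Under sequential consistency the result r0 == 0 and r1 == 0 never appears, because at least one store must precede both loads. Once the store-load reordering is allowed, that result becomes reachable, which is exactly the kind of surprise a lock-free algorithm has to be designed around.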

===============================
Current Events

NEWS (from 2023 and earlier):
SPECULATION ABOUT FUTURE CHIPS AND THE END OF MOORE'S LAW: (Many of these articles are from digitimes.com.)
      For context, note that the Wikipedia Silicon article gives the covalent radius of silicon as 0.111 nm. So, the distance between adjacent silicon atoms is the diameter (0.222 nm).
Motivation for multi-core CPUs: Limits of CPU Power Trends at the beginning of the millennium (from here)

       7 nm is approximately 31.5 silicon atom diameters. 5 nm is approximately 22.5 silicon atom diameters. 3 nm is approximately 13.5 silicon atom diameters. 2 nm is approximately 9.0 silicon atom diameters.
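
The figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it takes the 0.111 nm covalent radius quoted above, doubles it to get an atom diameter, and treats each node name (7 nm, 5 nm, and so on) as a literal length, as the text does. The names SILICON_COVALENT_RADIUS_NM and atom_diameters are made up for this illustration.

```python
# Covalent radius of silicon in nanometers, as quoted in the text above.
SILICON_COVALENT_RADIUS_NM = 0.111
# Distance between adjacent silicon atoms, taken to be one atom diameter.
SILICON_DIAMETER_NM = 2 * SILICON_COVALENT_RADIUS_NM


def atom_diameters(feature_nm):
    """Return how many silicon atom diameters span a length of feature_nm."""
    return feature_nm / SILICON_DIAMETER_NM


if __name__ == "__main__":
    for node_nm in (7, 5, 3, 2):
        count = atom_diameters(node_nm)
        print(f"{node_nm} nm is approximately {count:.1f} silicon atom diameters")
```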