Saturday, March 4, 10pm
See Assignment 5 on Canvas for the Github link.
This is a pair assignment, but you can work alone, if you so choose.
Submit the contents of your repository via Gradescope. See Deliverables below for what to submit. If you are working with a partner, do not forget to include their name with the submission.
There will be no autograder for this assignment. Read the requirements and run tests locally.
Do not use login
for this assignment. If you still have trouble setting up a VM, reach out to us or Khoury Systems.
A user-space memory allocator is a library (a set of functions and associated data structures) that handles programmer requests for memory. Most of the work of a memory allocator is to keep track of free memory, give out memory blocks that are available, and to request more memory from the OS when the allocator runs out of its managed memory.
For this assignment, you will be writing your own memory allocator. Writing a custom memory allocator is something you might do if you work on performance sensitive systems (games, graphics, quantitative finance, embedded devices or any application you want to run fast!). Malloc and free are general purpose functions written to manage memory in the average use case quite well, but they can always be optimized for a given workload. That said, a lot of smart people have worked on making malloc/free quite performant over a wide range of workloads. Optimization aside, you might write an allocator to add in cool debugging features, and swap it in as needed.
For this assignment, you will implement a portion of a custom memory allocator for the C language. You will write your own versions of:
This assignment will be the first of two memory allocators you will create this term.
Read Chapter 17 of OSTEP. This covers most of what you need to know about free space management and the workings of a memory allocator conceptually. With the correct understanding, this assignment is relatively quick and easy. Ask questions.
Your objective will be to create three functions in mymalloc.c
mymalloc
mycalloc
myfree
Please read through the following design decisions to help guide you. This is some of the thought process a designer of such a system may go through. There are more concrete specifications following.
Remember that malloc
, calloc
, and free
are all working with memory allocated on the heap. We can request memory from the operating system using a system call such as sbrk. There exist other ways to request memory, such as mmap, which you will use for your next memory allocator assignment.
For this assignment, all memory requests MUST use the sbrk
system call.
Once you have retrieved memory, we need to keep track of it. That means that every time a user uses your malloc
or calloc
functions, you will want to know where that memory exists and how big it is. Thus, we want to keep track of all of the memory we request in a convenient data structure that can dynamically expand.
So think about: What data structure could I use?
You may define any helping data structures and functions that assist you in this task. This means you might even have a global variable or two to assist with your implementation. Depending on what data structure you decide to store all of the requested memory in, it may be useful to have additional helper functions for traversing your data structure and finding free blocks/requesting blocks for example.
Programs may frequently allocate and then free memory throughout the program’s execution. It can thus become very inefficient to keep expanding the heap segment of our process. Instead, we try to reuse blocks as efficiently as possible. That is, if I have allocated a block of memory of 8 bytes, and that 8 bytes gets freed, the next time I call malloc
I can use the previous 8 bytes without having to make another call to sbrk
. There exist several strategies for searching free memory. The most straightforward ones are first-fit and best-fit. Both are described in detail, along with other very useful supporting information in OSTEP Chapter 17.
In this assignment, we will use the first-fit strategy, which is the simplest to implement.
Here are the default specifications to put everyone on equal footing. You are welcome to diverge if you think you can build something more optimal, but get this basic allocator with the specifications below to work first!
Do not modify malloc.h
.
Use an embedded linked list data structure. See OSTEP Chapter 17.
This data structure will keep track of the blocks that you have allocated within the mymalloc
function.
You should have a global variable that serves as the “head” or “first block of memory” in your linked list.
You should have some notion of a ‘block’ of memory in your program.
An example is provided here with some fields that may be useful:
typedef struct block {
size_t size; // How many bytes beyond this metadata have been
// allocated for this block
struct block *next; // Where is the next block in the free list
int debug; // (optional) Perhaps you can embed other
// information--remember, you are the boss!
} block_t;
Use the sbrk system call. Your version of malloc
(mymalloc
or its helper functions) shall use sbrk
. Understand what, e.g., sbrk(0)
and sbrk(10)
do before you start.
The myfree
function sets a block of memory to be free, by setting the free
flag in block_t
to “true”. Consider how memory is laid out in the heap and make sure you are only accessing your struct. Here is a simple diagram:
|block|---actual memory---|block|------actual memory-----|block|--actual memory--|
^ Here is where your struct lives, this is what you want to update.
The mymalloc
function is returning the actual memory
|block|---actual memory---|block|------actual memory-----|block|--actual memory--|
^ Here is what you'll return the the programmer as their memory.
The first-fit memory allocator looks for the first block available in the linked list of memory. Remind yourself what the trade-off is between the other allocators (e.g. compare to a ‘best-fit’ allocator).
When called, mymalloc
must print out "Malloc %zu bytes\n"
using the provided debug_printf
function (which is used just like printf
; see the file debug.h
)
When called, mycalloc
must print out "Calloc %zu bytes\n"
using the provided debug_printf
function (Yes, a proper implementation will print “Malloc …” followed by “Calloc …”)
When called, myfree
must print out "Freed %zu bytes\n"
(using debug_printf
)
The malloc.h
header defines aliases for these functions, so you will actually see the tests using malloc
, calloc
, and free
.
With these print-outs, you can see if they match the original programs.
We will examine your code to confirm you do not use the C library malloc
/calloc
/free
from stdlib.h
. You should only be using syscalls such as sbrk
to request memory from the operating system.
We have included a Makefile to make your life easier. Here’s a quick overview of the possible targets:
make
- compile mymalloc.c
to object file, mymalloc.o
make test
- compile and run tests in the tests directory with mymalloc
.make demo
- compile and run tests in the tests directory with standard malloc
.make help
- print available targetsWe have provided some “tests” for you, which exercise your allocator. Passing all the tests without crashing does not guarantee a perfect assignment, but it will give you some confidence your implementation is working. It would be wise to write additional tests to exercise your implementation.
Implement your memory allocator in mymalloc.c
and include any additional .c
and .h
files your implementation relies on. For example, you might want to compile your helper data structure separately.
Commit the code to your repository. Do not include any executables, .o
files, or other binary, temporary, or hidden files.
Once you are done, remember to submit your solution to Gradescope and do not forget to include your partner.
assert()
used to test invariantsmymalloc
/mycalloc
/myfree
)
sbrk
only called as neededQ: Do I have to reduce the heap ever?
A: No, you do not need to ever make calls to sbrk(-10)
for example.
Q: Can I use valgrind
to check for memory leaks in my program?
A: Likely not, because you are implementing your own allocator. By default, valgrind
is not able to reliably track calls to sbrk
(or mmap
). This is not to say that using valgrind gives no information, but it won’t give you the same level of detail as if the code had used the system allocator.
Q: So if I cannot use valgrind
, what can I do?
A1: One suggestion is to keep track of how many total bytes you allocate and how many you mark as free as global variables.
A2: A second suggestion is that you could use some of gcc’s compile-time interpositioning tricks to add a function that is automatically called at the end of the program. This function would traverse your linked list structure and check to see if any of the memory in your ‘block’ structure are marked as unfreed.
Q: How will I know my tool is working?
A1: This is a pretty general question, but a tool like strace
can be helpful. strace
can be run on the tests once you have compiled them. strace
reports how many calls you have made to sbrk
for example, and you can test your assumptions to see if that system call is being made too often (i.e. not reusing blocks that have been previously allocated and could be used). strace
will also confirm that you are NOT using the system malloc or free.
A2: You could also run your allocator with your linked list or other data structures and see if it passes the given unit tests.
A3: Write more unit tests to exercise your implementation, e.g., mallocs of random sizes, interleaved mallocs and frees, large numbers of mallocs and frees, walking the free list, edge cases…
man
is your friend. Check out sbrk
, malloc
, calloc
, free
, realloc
, …assert
to check that your assumptions about state are valid.assert
s to check expected results. Use our tests for queue/vector from Assignment 4 as an example.if
-else if
-else
or a multi-case switch
should be the only reason to go beyond 40-50 lines per function. Even so, the body of each branch/case should be at most 3-5 lines long.malloc
implementations (the GNU C library is just one), so you can take a look at it if you are curious.