For the CSAPP Cache Lab, the students are asked to write a small C program (200~300 lines) that simulates a cache memory.
The full code is here on GitHub.
A cache can be described with the following four parameters:
When the CPU wants to access a 64-bit address, the cache doesn’t look at the whole number at once. It slices the address into three distinct fields:
| Field | Purpose |
|---|---|
| Tag | Used to uniquely identify the memory block within a specific set. t = m - b - s |
| Set Index | Determines which set the address maps to. |
| Block Offset | Identifies the specific byte within the cache line. |
When our simulator receives an address (e.g., from an L or S operation in the trace file), it follows these steps:
cache structure.valid == true AND the tag matches the address tag.valid == false), fill it with the new tag and set valid = true.For this Lab Project, we will write a cache simulator that takes a valgrind memory trace as an input.
The input looks like:
1 | I 0400d7d4,8 |
Each line denotes one or two memory accesses. The format of each line is
1 | [space]operation address,size |
The operation field denotes the type of memory access:
Mind you: There is never a space before each “I”. There is always a space before each “M”, “L”, and “S”.
The address field specifies a 64-bit hexadecimal memory address. The size field specifies the number of bytes accessed by the operation.
Our program should take the following command line arguments:
Usage: ./csim-ref [-hv] -s <s> -E <E> -b <b> -t <tracefile>
-h: Optional help flag that prints usage info-v: Optional verbose flag that displays trace info-s <s>: Number of set index bits (S = 2s is the number of sets)-E <E>: Associativity (number of lines per set)-b <b>: Number of block bits (B = 2b is the block size)-t <tracefile>: Name of the valgrind trace to replayFor this lab, we ignore all Is (the instruction cache accesses).
We assume that memory accesses are aligned properly, such that a single memory access never crosses block boundaries.
We basically start from scratch, given an almost blank csim.c file to fill in. The file comes with only a main function and no header files.
1 |
|
First, we add the int argc, char** argv parameters to the main function. argc stands for argument count, while argv stands for argument values.
We use getopt to parse arguments.
1 | void handleArgs(int argc, char** argv){ |
getopt comes in unistd.h, but the compiler option is set to -std=c99, which hides all POSIX extensions. GNU systems provide a standalone <getopt.h> header. So we include getopt.h instead.
1 | opt = getopt(argc, argv, "hvs:E:b:t:") |
h and v: These are boolean flags.s:, E:, b:, and t:: These are required arguments. The colon tells getopt that these flags must be followed by a value (e.g., -s 4).After parsing the arguments, we set the initial value of our Cache Data Model.
1 | sets = 1LL << set_bit; |
1 |
|
Caution: malloc has to be initialized. Or the data might contain garbage values.
So we use calloc. The calloc (stands for contiguous allocation) function is similar to malloc but it initializes the allocated memory to zero.
And don’t forget to free the allocated memory!
1 | void freeCache() { |
1 |
|
Caution:
fscanf does not skip spaces before %c, so we add a space before %c in the format string.!feof(traceFile) does not work correctly here.It only returns true after a read operation has already attempted to go past the end of the file and failed. Using it as a loop condition (e.g., while (!feof(p))) causes an “off-by-one” error, where the loop executes one extra time with garbage data from the last successful read.1 |
|
We use bit masks to parse the addresses.
1 | void loadData(long long address, int size) { |
The code simulates the process of loading cache.
We first check if the data already exists in the cache.
If it doesn’t exist, we have to scan for blank lines to load the data.
If blank lines don’t exist, we need to evict a line using the LRU strategy. We replace the victim line with the new line.
1 | void storeData(long long address, int size) { |
For this simulator, storing data and modifying data are basically the same thing as loading data.
We are asked to output the answer using the printSummary function.
1 |
|
And Voila!
1 | Your simulator Reference simulator |
In this project, we moved from the theory of hierarchy to the practical reality of memory management. By building this simulator, we reinforced several core concepts of computer systems.
With our simulator passing all the trace tests, we’ve effectively mirrored how a CPU “thinks” about memory. The next step is applying these insights to optimize actual code, ensuring our algorithms play nicely with the hardware we’ve just simulated.
For the CSAPP Cache Lab, the students are asked to write a small C program (200~300 lines) that simulates a cache memory.
The full code is here on GitHub.
A cache can be described with the following four parameters:
When the CPU wants to access a 64-bit address, the cache doesn’t look at the whole number at once. It slices the address into three distinct fields:
| Field | Purpose |
|---|---|
| Tag | Used to uniquely identify the memory block within a specific set. t = m - b - s |
| Set Index | Determines which set the address maps to. |
| Block Offset | Identifies the specific byte within the cache line. |
When our simulator receives an address (e.g., from an L or S operation in the trace file), it follows these steps:
cache structure.valid == true AND the tag matches the address tag.valid == false), fill it with the new tag and set valid = true.For this Lab Project, we will write a cache simulator that takes a valgrind memory trace as an input.
The input looks like:
1 | I 0400d7d4,8 |
Each line denotes one or two memory accesses. The format of each line is
1 | [space]operation address,size |
The operation field denotes the type of memory access:
Mind you: There is never a space before each “I”. There is always a space before each “M”, “L”, and “S”.
The address field specifies a 64-bit hexadecimal memory address. The size field specifies the number of bytes accessed by the operation.
Our program should take the following command line arguments:
Usage: ./csim-ref [-hv] -s <s> -E <E> -b <b> -t <tracefile>
-h: Optional help flag that prints usage info-v: Optional verbose flag that displays trace info-s <s>: Number of set index bits (S = 2s is the number of sets)-E <E>: Associativity (number of lines per set)-b <b>: Number of block bits (B = 2b is the block size)-t <tracefile>: Name of the valgrind trace to replayFor this lab, we ignore all Is (the instruction cache accesses).
We assume that memory accesses are aligned properly, such that a single memory access never crosses block boundaries.
We basically start from scratch, given an almost blank csim.c file to fill in. The file comes with only a main function and no header files.
1 |
|
First, we add the int argc, char** argv parameters to the main function. argc stands for argument count, while argv stands for argument values.
We use getopt to parse arguments.
1 | void handleArgs(int argc, char** argv){ |
getopt comes in unistd.h, but the compiler option is set to -std=c99, which hides all POSIX extensions. GNU systems provide a standalone <getopt.h> header. So we include getopt.h instead.
1 | opt = getopt(argc, argv, "hvs:E:b:t:") |
h and v: These are boolean flags.s:, E:, b:, and t:: These are required arguments. The colon tells getopt that these flags must be followed by a value (e.g., -s 4).After parsing the arguments, we set the initial value of our Cache Data Model.
1 | sets = 1LL << set_bit; |
1 |
|
Caution: malloc has to be initialized. Or the data might contain garbage values.
So we use calloc. The calloc (stands for contiguous allocation) function is similar to malloc but it initializes the allocated memory to zero.
And don’t forget to free the allocated memory!
1 | void freeCache() { |
1 |
|
Caution:
fscanf does not skip spaces before %c, so we add a space before %c in the format string.!feof(traceFile) does not work correctly here.It only returns true after a read operation has already attempted to go past the end of the file and failed. Using it as a loop condition (e.g., while (!feof(p))) causes an “off-by-one” error, where the loop executes one extra time with garbage data from the last successful read.1 |
|
We use bit masks to parse the addresses.
1 | void loadData(long long address, int size) { |
The code simulates the process of loading cache.
We first check if the data already exists in the cache.
If it doesn’t exist, we have to scan for blank lines to load the data.
If blank lines don’t exist, we need to evict a line using the LRU strategy. We replace the victim line with the new line.
1 | void storeData(long long address, int size) { |
For this simulator, storing data and modifying data are basically the same thing as loading data.
We are asked to output the answer using the printSummary function.
1 |
|
And Voila!
1 | Your simulator Reference simulator |
In this project, we moved from the theory of hierarchy to the practical reality of memory management. By building this simulator, we reinforced several core concepts of computer systems.
With our simulator passing all the trace tests, we’ve effectively mirrored how a CPU “thinks” about memory. The next step is applying these insights to optimize actual code, ensuring our algorithms play nicely with the hardware we’ve just simulated.