CS61C Summer 2004

Project 3: MIPS processor in Verilog

Update: Jul 31 1:07 PM
There has been an update regarding the data memory. You can see it on the newsgroup (post #611).

Update: Jul 27, 9:57 PM
The monitor is now required for Stage 2 and Stage 3. See the section on the monitor at the bottom for details.

Update: Jul 27, 5:15 PM
The naming convention on wires is now optional, as it seemed to be confusing more people than it helped. Just give your wires reasonable names and be consistent.

Update: Jul 26, 6:28 PM
Use gmake instead of make. If you are working on the x86 machines you will be using an inferior, slightly buggy version of iverilog that may not give you error messages. If you are working on the SPARC machines, you will be using iverilog-0.6 which seems to be more robust, but still does not have a working time display. That shouldn't matter for the project. The makefile should also select the correct version of verilog for you.

Also, make a symbolic link to the new Makefile instead of just copying it over in your proj3 directory:

     ln -s ~cs61c/lib/proj3/Makefile Makefile
This way any other updates will happen automatically.

Important Note!

Read this entire document before beginning the project. This project may be long and/or difficult depending on your understanding of the material so get started early! You should try to get the single cycle CPU working as soon as possible (like a week before the deadline or sooner) so you can move on to the pipeline.

Purpose:

The purpose of this assignment is for you to get practice programming in Verilog and to help you understand the detailed operation of processors. Processor implementations are complex, even the simple MIPS; a good understanding of their operation comes only after the experience you will gain by implementing and simulating a processor for yourself.

Administrative Requirements:

To submit your project, create a directory named proj3 that contains your single.v, pipeline.v, pregs.txt, pipelinef.v, pregsf.txt and README files. From within that directory, type "submit proj3". As usual, the project will be due by 11:59pm on Tuesday, August 3. This is an individual project, not to be done in partnership. Hand in your own work, and do not collaborate with anyone else. Your submission should be structured as follows:

We will be building your project with the makefile in ~cs61c/lib/proj3/Makefile so make sure your project builds with that on the instructional machines. We will be running tests with the trycpu script, so make sure that works with your processor as well.

Background Reading:

Chapters 5 and 6 in Computer Organization and Design and the Verilog Tutorial.

Introduction:

In this project you will build a pipelined MIPS processor similar to that presented in Chapters 5 and 6 of Computer Organization and Design. This will be done in three phases: first you will build a simple single cycle processor using only structural Verilog; then you will pipeline this processor using behavioral pipeline registers; the final step is to add forwarding paths to handle data dependencies. You should read through all of the stages before beginning the project so that you can plan your time accordingly.

Stage 1: Single Cycle

You will design a module named CPU whose parameters are as follows:

	module CPU (CLK,RST,halt,dumpDataMem);
		input CLK,RST,dumpDataMem;
	   	output halt;
			
CLK and RST are the clock and reset signals; use of RST is described below. halt is asserted (made =1) when the halt instruction is encountered (see below). dumpDataMem should be passed on to the mem module (see below).

Your processor must match the one shown in figure 5.19 of P&H. We have provided you with a file ~cs61c/lib/proj3/blocks.v containing behavioral Verilog definitions for modules that you can use for the blocks shown in the figure, with the exception of the two blocks labeled "control". Those you will have to implement yourself using primitive gates. See section C.2 (in the appendix) for additional information on the control blocks.
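As a rough illustration of what "primitive gates" means for the control blocks, here is a minimal sketch that decodes two opcodes structurally. The module name opDecode and its port names are hypothetical and are not part of blocks.v; the opcode values (000000 for R-format, 100011 for lw) are the standard MIPS encodings. A real control block needs more outputs than this.

```verilog
// Hypothetical sketch: structural decoding of two opcodes.
// Module and port names are illustrative, not part of blocks.v.
module opDecode (op, rformat, lw);
	input [5:0] op;
	output rformat, lw;

	// R-format: op == 000000, so a 6-input NOR is 1 exactly when every bit is 0.
	nor (rformat, op[5], op[4], op[3], op[2], op[1], op[0]);

	// lw: op == 100011. "~" is the one operator the Stage 1 rules allow.
	and (lw, op[5], ~op[4], ~op[3], ~op[2], op[1], op[0]);
endmodule
```

Each control signal can then be built by OR-ing together the decoded instruction lines that should assert it.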

Your processor must correctly execute the following instructions:

	LW, SW, BEQ, AND, OR, ADD, SUB, and SLT.
			
This is the same set as the processor in the book. Additionally, your processor must execute the "halt" instruction. We have added this instruction to aid in testing the processor. Although not documented in the book, the halt instruction has the "op" field equal to all 1s. Halt instructions are sometimes included in the instruction set of a processor to provide the means for a program to stop the processor. In this case, upon execution of the halt instruction, your processor will simply assert a signal connected to a port. The testbench module will use this signal to dump memory and stop the simulation.
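Since the halt instruction is identified by an op field of all 1s, detecting it can be sketched with a single gate. The module name haltDetect and its ports are hypothetical; how you wire the result to the halt port of your CPU is up to your design.

```verilog
// Hypothetical helper: asserts halt when the 6-bit op field is all 1s.
// The module and port names are illustrative, not part of blocks.v.
module haltDetect (op, halt);
	input [5:0] op;
	output halt;

	// Structural only: a single 6-input AND primitive.
	and (halt, op[5], op[4], op[3], op[2], op[1], op[0]);
endmodule
```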

The basic strategy for running programs on your simulated processor is the following. At startup the contents of both the instruction and data memories are read from files by the Verilog runtime system. Simulated processor execution proceeds until a halt instruction is executed, at which point the data memory is dumped to a different file. You can then inspect the file to see if the program executed correctly. Of course, this scenario assumes that all is well with your processor. For debugging, you may need to add monitor or display commands to your simulation.

The file ~cs61c/lib/proj3/blocks.v contains behavioral definitions of the modules that you should instantiate to implement your processor. Although these could have been implemented at the gate level, we have used behavioral constructs to speed up the simulation. The comments preceding each module describe its operation. For this project you don’t need to understand the Verilog code for each block, but you should understand each module’s operation as described in the comments. In particular, notice the various uses of the DMP and RST parameters:

RULES: You may not use any behavioral or dataflow Verilog with the exception of "~" for "not". It must all be structural. You should put your CPU module in a file called "single.v". You may not make any changes to blocks.v or testbench.v (or at least don't turn in a processor that relies on any changes you made to these files; if you need to make changes for testing, go ahead). You should copy the Makefile in ~cs61c/lib/proj3/Makefile to your proj3 directory. Once you are ready to compile and run your processor you can type "gmake single". To run your CPU you can use the trycpu script (see the "Testing" section below).

Stage 2: Pipeline

After you have completed the single cycle processor, you should split it up into 5 pipeline stages like those in Chapter 6 of COD. You will do this by figuring out what signals you want to store in pipeline registers and then creating said registers to go between your modules. These modules will take CLK and RST as inputs, along with all the other inputs you want. You may (and should) use behavioral verilog for the registers. We have provided a script ~cs61c/lib/proj3/makepr.pl that you must use to generate pipeline registers. The makefile will run this script for you.

In doing this part of the project, it is suggested that you start with your single cycle processor and copy it to a new file to work in. You may then want to arrange pipeline stages together before inserting registers. The testbench will remain the same, so the inputs and outputs of the CPU module must be the same as that of the single cycle processor.

One big part of pipelining your processor is to make the branch delay slot work correctly. To do this, branches must be resolved in the second stage (Decode) of your pipeline. Thus the instruction immediately following the branch is always executed regardless of whether the branch is taken or not. This will require some modification to the way branches are done in the single cycle processor.
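Resolving a BEQ in Decode means comparing the two register values there rather than waiting for the ALU. A minimal sketch of such a comparison follows; the module name branchResolve and its ports are hypothetical, and it uses dataflow Verilog, which Stage 2 permits in helper modules other than CPU.

```verilog
// Hypothetical helper for resolving BEQ in the Decode stage.
// Names are illustrative; dataflow is used here, so keep it out of single.v.
module branchResolve (branch, regOut1, regOut2, taken);
	input branch;                  // control signal: instruction is a BEQ
	input [31:0] regOut1, regOut2; // the two register values being compared
	output taken;

	// The branch is taken only for a BEQ whose operands are equal.
	assign taken = branch & (regOut1 == regOut2);
endmodule
```

The taken signal would then select between PC+4 and the branch target when choosing the next PC.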

Another issue with the pipeline involves taking note of which state elements are synchronous and which are asynchronous. This will affect when you must set up the inputs to these state elements.

You must also connect the monitor that will disassemble the instructions running through your pipeline. You will find the supplied monitor at ~cs61c/lib/proj3/monitor.v. You should not need to modify it or even copy it to your home directory as it is included in the Makefile. See below for details on using the monitor.

You may want to use the following naming convention for the signals coming out of your pipeline register: signalName_fromStage_toStage. The registers inside the pipeline register (and inputs/outputs) already follow the naming convention signalName_fromStage and signalName_toStage. You should not change this (do not alter the pipeline registers at all after they have been generated by makepr.pl). For example, if you have a signal called "foo" generated in the ID stage, it will connect to an input to pipeline register pr_ID_EX called foo_ID. The corresponding output of this register will be called foo_EX. You should connect a wire named foo_ID_EX to this output. This wire will exist in your CPU module, so basically wires are named for the pipeline register they come from. If you need other signals elsewhere in a stage, or inputs to pipeline registers that have been passed along, you may want to name them simply for the stage they exist in (e.g. bar_EX if bar already exists in a previous stage).
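The naming scheme can be seen in a small sketch. The pr_ID_EX module below is only a hand-written stand-in so the example compiles on its own; your real register must come from makepr.pl, and its port list and reset behavior may differ from what is assumed here. The wrapper module namingExample is hypothetical as well.

```verilog
// Illustrative stand-in only: the real pr_ID_EX must be generated by makepr.pl.
// The synchronous, active-high reset is an assumption.
module pr_ID_EX (CLK, RST, foo_ID, foo_EX);
	input CLK, RST;
	input [31:0] foo_ID;
	output [31:0] foo_EX;
	reg [31:0] foo_EX;

	always @(posedge CLK)
		if (RST) foo_EX <= 0;
		else foo_EX <= foo_ID;
endmodule

// Hypothetical fragment of a CPU showing the optional wire naming.
module namingExample (CLK, RST, foo, foo_ID_EX);
	input CLK, RST;
	input [31:0] foo;        // produced in the ID stage
	output [31:0] foo_ID_EX; // named for the register it comes out of

	pr_ID_EX idex (.CLK(CLK), .RST(RST), .foo_ID(foo), .foo_EX(foo_ID_EX));
endmodule
```

Named port connections (.foo_ID(foo)) avoid depending on the order in which the generated register lists its ports.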

RULES: You should put your pipelined CPU module in a file called pipeline.v. Again, you should not have to make changes to testbench.v or blocks.v. You should keep your pipeline registers in a separate file called pregs.v. Any other modules that you have to add can be added to pipeline.v (e.g. a comparator). You may use any verilog constructs you wish in building these other modules, but it is suggested that you keep your CPU module strictly structural. To build your pipelined processor you can type "gmake pipeline" (see below for details on the Makefile).

Stage 3: Forwarding

You may have noticed that your pipelined processor is rather crippled as you must insert many dummy instructions to get around the data dependencies in your programs. To fix this you will add a forwarding unit to your datapath that eliminates some of these data dependencies. Your CPU must have the following forwarding paths: Memory to Decode, Execute to Decode, and Execute to Execute. These paths do not eliminate all data dependencies, but improve the situation somewhat and can be done with relative simplicity (hint: if done right, one of them is already built into the datapath and requires nothing extra).

This part of the project requires some thought and should not be taken lightly. If done incorrectly it can be the source of many painful bugs.

It is recommended that you do your forwarding unit using only dataflow (assign foo = baz | snoz, etc) Verilog as it will simplify your code and your life. It is also recommended that you run all of the data being forwarded right through the forwarding unit rather than using external multiplexors as is shown in COD. This will make your CPU module less complicated. When doing this, you will find the ternary ?: operator to be a useful replacement for "if" statements or multiplexors.
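To make the dataflow style concrete, here is a minimal sketch of a single forwarding decision for one operand. It is not the full unit the project requires, and every name in it (forwardOne, prodRegWrite, prodRd, and so on) is hypothetical. It assumes the usual condition for forwarding: the producing instruction writes a register, that register is not $0, and it matches the register the consumer reads.

```verilog
// Hypothetical single-path forwarding sketch in dataflow Verilog.
// All names are illustrative; this is not the complete forwarding unit.
module forwardOne (rs, regData, prodRegWrite, prodRd, prodData, out);
	input [4:0] rs;          // source register number of the consumer
	input [31:0] regData;    // value read from the register file
	input prodRegWrite;      // producer will write a register
	input [4:0] prodRd;      // producer's destination register number
	input [31:0] prodData;   // producer's result
	output [31:0] out;       // value the consumer should use

	wire match;
	// Forward only if the producer writes a register, that register is
	// not $0, and it is the one the consumer reads.
	assign match = prodRegWrite & (prodRd != 5'b0) & (prodRd == rs);

	// The ?: operator acts as a 2-to-1 multiplexor.
	assign out = match ? prodData : regData;
endmodule
```

When more than one producer can match, nesting ?: expressions lets the most recent result take priority.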

RULES: You should copy your pipeline.v over to pipelinef.v and pregs.txt to pregsf.txt. Put your forwarding unit in the new pipelinef.v file and instantiate it in your CPU module. You will find that your pipeline registers will need more signals, so you can add those to pregsf.txt. You only need to implement the three forwarding paths mentioned above. You can make and test your project the same way as in Stage 2.

Testing

There is a set of files in the ~cs61c/lib/proj3/tests directory that you may find useful for testing your simulation.

The tests in these files will probably not reveal all your bugs. You should create more exhaustive tests, based on these samples. The above files are designed to work with your single cycle processor and will probably not work with your pipelined processor because we have not resolved all the data dependencies and hazards (your pipelined processor cannot induce stalls). There are counterparts to these files that begin with "pipeline" that have explicit no-op instructions added to make them work with your pipelined processor (with forwarding).

For each of these test programs, you will find four files: a ".s", a ".text", a ".data", and a ".dump" file. The ".s" file is the assembly source. The ".text" file is a hex equivalent of the binary corresponding to the ".s" file. Each line in the file contains one 32-bit instruction, written as an 8-digit hex number. The ".data" file contains the data used to initialize the data memory of the processor. It has one 32-bit data word per line, again, each written as an 8-digit hex number. The ".dump" file is a copy of the data memory of the processor which gets dumped out at the end of simulation.

The two files, ".text" and ".data", will have to be renamed to use them with your Verilog simulation. When your simulation starts up, the Verilog runtime system expects to find a file called "text.dat" for initializing instruction memory, and a file called "data.dat" for initializing data memory. For each simulation run, rename the appropriate ".text" file to "text.dat" and the corresponding ".data" file to "data.dat". At the end of the simulation, Verilog will dump the contents of the data memory into a file called "dump.dat".

A C shell script named trycpu, in ~cs61c/lib/proj3/tests, simplifies the running of these tests by automatically renaming files; see the comment block in trycpu (and the file tests/trycpu.README) for more information. You may wish to copy trycpu and the tests to your proj3 directory.

When you generate your own test programs, for each one, you will need to generate a ".text" file with the instructions, and a corresponding ".data" file for initializing the data memory. Remember to use only the instructions that your processor can execute. To create your own test program, load your ".s" file containing the assembler source into SPIM, then use the "dump" command to get the binary. Once you have the binary version, you can convert it to hex by using the od program with the "-X" switch. The result will be a text file that you can edit with a text editor to finish the job. The od program inserts addresses, which you will need to remove, and puts multiple words per line, which you will need to split so that each line holds one word. Also, you will have to edit in the final "halt" instruction, as SPIM doesn’t support this instruction. You can generate the ".data" file by hand.

You do not necessarily need to use SPIM (or MIPSASM) to translate your assembler test programs to binary. You can do it by hand, but SPIM saves some time and helps prevent errors.

As there are many combinations of test cases and bugs, we encourage you to exchange test programs with your peers for this project. This is not usually allowed, but it should help you given the nature of this project. Feel free to post test programs to the newsgroup, but do NOT post verilog code from your project that could be used by others.

Using the Makefile

You should make a symbolic link in your proj3 directory to the Makefile in ~cs61c/lib/proj3/Makefile, then use gmake to build your project. You can make a symbolic link by doing:

     ln -s ~cs61c/lib/proj3/Makefile Makefile
You can use it by typing "gmake single" or "gmake [insert target here]" at the shell prompt from the directory that contains the Makefile. Valid targets are:

It is important to note that the Makefile will use iverilog-0.6 if you are working on a SPARC machine and just iverilog if you are working on anything else. iverilog-0.6 is the preferred executable as iverilog is a bit buggy and may not give you intuitive error messages. The only issue is that you will not get times ($time) to display correctly if you are working on a SPARC machine. iverilog on the x86 machines should work fine with correct code, so you should still be okay there.

Using makepr.pl

In general you should not have to use makepr.pl directly as the makefile takes care of it for you. It is recommended that you go ahead and rely on the makefile and makepr.pl to take care of your pregs.v file. You should format your pregs.txt file or pregsf.txt file as described in the comments of makepr.pl. Then just type "gmake pregs" or "gmake pregsf". If you are unclear about the format of the input file or how it works, try copying the example in the comments into a new file called test.txt and then type "./makepr.pl < test.txt". This will dump the generated verilog to the console for your inspection.

Using the monitor

You should not have to modify the monitor and should leave it in monitor.v. You should instantiate it inside your CPU module so that it has access to the data. The inputs are as follows:

	input clk,rst; //CLK and RST used in CPU
	input [31:0] address; //address that your instruction is being read from in IF stage
	input [31:0] instruction; //instruction being read from address in IF
	input [31:0] regOut1; //data read from register file (or forwarded) in EX stage (note stage!)
	input [31:0] regOut2; //data read from register file (or forwarded) in EX stage (note stage!)
	input [31:0] regWriteData; //data to be written to register file in MEM stage
   	
The monitor prints out each instruction when it hits the MEM stage, so you may see some unknown instructions when your cpu first starts up (because there's nothing in the pipeline!). Make sure you get the monitor working as it is crucial for testing and grading.
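Hooking up the monitor might look like the sketch below. The module name monitor is an assumption (check monitor.v for the actual name), and the empty stand-in is included only so the example compiles by itself; do not add it to your own files. The wrapper monitorHookup and the stage-suffixed wire names are hypothetical.

```verilog
// Stand-in with the documented inputs so this sketch compiles by itself.
// The real module is in ~cs61c/lib/proj3/monitor.v; its name is assumed here.
module monitor (clk, rst, address, instruction, regOut1, regOut2, regWriteData);
	input clk, rst;
	input [31:0] address, instruction, regOut1, regOut2, regWriteData;
endmodule

// Hypothetical fragment of a CPU: each signal is taken from the stage
// that the monitor's comments call for.
module monitorHookup (CLK, RST, address_IF, instruction_IF,
                      regOut1_EX, regOut2_EX, regWriteData_MEM);
	input CLK, RST;
	input [31:0] address_IF, instruction_IF;  // from the IF stage
	input [31:0] regOut1_EX, regOut2_EX;      // from the EX stage
	input [31:0] regWriteData_MEM;            // from the MEM stage

	monitor mon (.clk(CLK), .rst(RST),
	             .address(address_IF), .instruction(instruction_IF),
	             .regOut1(regOut1_EX), .regOut2(regOut2_EX),
	             .regWriteData(regWriteData_MEM));
endmodule
```

Named port connections keep the hookup correct regardless of the order in which the real monitor declares its ports.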


Adapted by Navtej Sadhal for CS61C Summer 2004 from an earlier (much easier) CS61C project. My apologies to you all. You'll thank me later. Corrections, clarifications, and flames should be directed to cs61c-td@imail.eecs.berkeley.edu.