Date: Sat, 31 Jul 2004 12:51:38 -0700
From: Navtej Sadhal
Newsgroups: ucb.class.cs61c
Subject: IMPORTANT PROJ UPDATE!! (READ!)

Please read this entire message. This is very important! Read every word! Don't skim! READ IT!

The data memory module supplied to you in blocks.v was originally designed for the single-cycle processor, so it has a synchronous write and an asynchronous read. This is not ideal for the pipelined processor, as you may already have found out. Two generally accepted solutions for data memory in pipelined processors are either to have writes occur on the negative edge of the clock or to have both reads AND writes be synchronous. Rather than set up the project for either of these solutions, we opted to use the existing data memory with its strange behavior in order to get you to understand asynchronous vs. synchronous state elements and how they behave relative to the clock.

As a result, the solution we had you pursue was to bypass the pipeline register in the case of a write but not in the case of a read. This solution works for the most part and was done to avoid having to forward memory data. That is, if you did not bypass the EX_MEM pipeline register on a write, then your write to memory would happen one cycle later. This is not an issue except when the very next instruction reads from the address you just wrote to. That is, you do the following:

    sw $t0, 0($t1)
    lw $t2, 0($t1)

The read and write will be happening at the same time, so the data written will not be accessible to the lw instruction. Thus we bypass the register so that the sw will complete in time for the lw to access the data.

An alternate solution is to do forwarding of the data memory. This is different from the forwarding of register data described in the book. What you would have to do is detect if you had a sw followed by a lw and check if their addresses were the same with a comparator.
Then you would forward the sw's write data to the lw's read data, bypassing the data memory. This solution should work in all cases and is valid.

The problem with the bypassing-via-mux solution is that any lw followed by a sw will not work, because the load and store will be trying to set up the address at the same time:

    lw $t0, 0($t1)
    sw $t1, 4($t2)

Since the store is trying to jump the gun and set up the address before the clock while the load is doing it after the clock, there will be a problem. Which one wins out depends on the select bit to your mux. If you used MemWrite_EX as the select bit to your mux, then the store will always work while the load will not. However, the behavior of the memory when both read and write bits are asserted may be undefined. This is not good. In general, it doesn't make sense to use a memory with this behavior for a pipelined processor. If reads were synchronous, you would bypass the pipeline register for both reads and writes, and everything should work fine.

Because of this issue, we have decided to allow you to do your writes on the negative edge of the CLK in hopes of making things simpler. I had initially said that you may not put any gates on the CLK because this does not work in hardware. Normally this would introduce awful delays and clock skew into your system. Since this is simulation, however, negating the CLK has no ill effects. It is important to understand, however, that negating the CLK in this case is meant to be symbolic of having a data memory that does writes on the negative edge. You should never do this in real life (remember that when taking cs150 or 152).

We decided to have you make the change in your code rather than changing the blocks.v file so as not to surprise people by breaking their processors because they have not yet read this message. The way you should do this is to change the input to CLK in your instantiation of the data memory to be ~CLK:

    mem memBlock (.CLK(~CLK), .RST(RST),....
    etc

Since reads are asynchronous, they will be unaffected. Now you must make sure you are not bypassing the EX_MEM pipeline register ANYWHERE. All of the inputs to the data memory should be coming out of the EX_MEM pipeline register (with the exception of CLK, RST, and DMP).

If you had already successfully implemented the bypassing solution with a multiplexor that I had described previously, then the easiest way to implement this fix is to do the following:

1. Hardwire the select bit on your address bypass mux (the one that chooses between the address coming out of the ALU and the one coming out of pr_EX_MEM) to always choose the value coming out of pr_EX_MEM. This means you just need to do .select(0) or .select(1), depending on which input is the one you want. This is a slightly hacky fix, but it's a good way to get it working first before you start hacking up your wires. You should eliminate the mux later once you get things cleaned up.

2. Change the .WR input to be .WR(MemWrite_EX_MEM) instead of .WR(MemWrite_ID_EX). Or whatever your wires are named...

3. Change .writeD to take the value from pr_EX_MEM instead of the value coming from the EX stage. You may need to add this to your EX_MEM pipeline register. Make sure that you are using a forwarded value (if you have already implemented forwarding).

We have provided a test case for you to check if this works in s61c/lib/proj3/tests called pipeline.lw.sw.2. This should work with both Stage 2 and Stage 3, as I have inserted sufficient noop instructions.

We apologize for making this change now, but we feel that it should be simple enough and not consume very much of your time or set you back very far. As today is Saturday, you still have plenty of time to talk to your TAs about any problems you are still having. This change should simplify your testing and your code. Feel free to email me and/or post to the newsgroup with any issues you are having with this change.

Good Luck.

Navtej
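
For anyone who wants to see the negative-edge write in isolation, here is a minimal sketch. It is NOT the mem module from blocks.v: toy_dmem, its addr and readD ports, the 16-word size, and the testbench signals are all made up for illustration. Only the .CLK(~CLK), .WR, and .writeD connections mirror the steps above. The testbench drives the memory inputs on the rising edge of CLK, the way values coming out of pr_EX_MEM would change, and puts a sw-like write one cycle ahead of a lw-like read to the same address.

```verilog
`timescale 1ns/1ps

// Hypothetical stand-in for a data memory with a synchronous write
// and an asynchronous read. Port names addr/readD are assumptions.
module toy_dmem (
    input  wire        CLK,
    input  wire        WR,
    input  wire [31:0] addr,
    input  wire [31:0] writeD,
    output wire [31:0] readD
);
    reg [31:0] cells [0:15];            // 16 words, indexed by addr[5:2]

    assign readD = cells[addr[5:2]];    // asynchronous read

    always @(posedge CLK)               // write on the rising edge of *its* CLK
        if (WR) cells[addr[5:2]] <= writeD;
endmodule

module tb;
    reg         CLK             = 1'b0;
    // These stand in for values coming out of the EX_MEM pipeline register.
    reg         MemWrite_EX_MEM = 1'b0;
    reg  [31:0] addr_EX_MEM     = 32'd0;
    reg  [31:0] writeD_EX_MEM   = 32'd0;
    wire [31:0] readD;

    // Feeding ~CLK makes the memory write on the falling edge of CLK.
    toy_dmem dmem (
        .CLK   (~CLK),
        .WR    (MemWrite_EX_MEM),
        .addr  (addr_EX_MEM),
        .writeD(writeD_EX_MEM),
        .readD (readD)
    );

    always #5 CLK = ~CLK;

    initial begin
        @(posedge CLK);                 // "sw" reaches the MEM stage
        MemWrite_EX_MEM <= 1'b1;
        addr_EX_MEM     <= 32'h8;
        writeD_EX_MEM   <= 32'hDEADBEEF;

        @(posedge CLK);                 // "lw" to the same address reaches MEM
        MemWrite_EX_MEM <= 1'b0;

        @(negedge CLK);                 // sample the read mid-cycle
        if (readD === 32'hDEADBEEF)
            $display("PASS: load sees the data stored one cycle earlier");
        else
            $display("FAIL: readD = %h", readD);
        $finish;
    end
endmodule
```

The point of the sketch is the timing: the write lands in the middle of the store's MEM cycle, so it is already in memory when the following load's address shows up at the next rising edge, without bypassing anything. It says nothing about the rest of your pipeline, and the usual warning still applies: inverting the clock like this is a simulation convenience only.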