CA I - Chapter 3 RISC V Processor

The document discusses the hardware/software interface and RISC-V processor design. It covers abstraction levels from high-level languages to machine code. It also describes the CPU components like the datapath and control, and how they work together to implement the RV32I instruction set in a one-instruction-per-cycle RISC-V machine.

Uploaded by

Đức Minh

Computer Architecture 1

Computer Organization and Design


THE HARDWARE/SOFTWARE INTERFACE

[Adapted from Computer Organization and Design, RISC-V Edition, Patterson & Hennessy, © 2018, MK]
[Adapted from Great Ideas in Computer Architecture (CS 61C) lecture slides, Garcia and Nikolić, © 2020, UC Berkeley]

1/29/2024 1
RISC-V Processor Design
Great Idea #1: Abstraction (Levels of Representation/Interpretation)

High Level Language Program (e.g., C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
Assembly Language Program (e.g., RISC-V):
    lw x3, 0(x10)
    lw x4, 4(x10)
    sw x4, 0(x10)
    sw x3, 4(x10)
  ↓ Assembler
Machine Language Program (RISC-V):
    1000 1101 1110 0010 0000 0000 0000 0000
    1000 1110 0001 0000 0000 0000 0000 0100
    1010 1110 0001 0010 0000 0000 0000 0000
    1010 1101 1110 0010 0000 0000 0000 0100
  ↓
Hardware Architecture Description (e.g., block diagrams)
    [Figure: block diagram of the RV32I datapath: PC, IMEM, Reg[ ], Imm. Gen, Branch Comp., ALU, DMEM.]
  ↓ Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
    [Figure: gate-level circuit with inputs A, B, C, D computing Out = AB + CD.]
Our Single-Core Processor So Far…
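[Figure: block diagram of Processor and Memory. The Processor contains Control and a Datapath (Program Counter (PC), Registers, Arithmetic-Logic Unit (ALU)). Memory holds Program and Data bytes. The two are connected by Enable?, Read/Write, Address, Write Data and Read Data; Input and Output are shown alongside.]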
The CPU

• Processor (CPU): the active part of the computer that does all
the work (data manipulation and decision-making)
• Datapath: portion of the processor that contains hardware
necessary to perform operations required by the processor
(the brawn)
• Control: portion of the processor (also in hardware) that tells
the datapath what needs to be done (the brain)

Need to Implement All RV32I Instructions
Open Reference Card
Base Integer Instructions: RV32I

Category      Name                      Fmt  RV32I Base
Shifts        Shift Left Logical        R    SLL   rd,rs1,rs2
              Shift Left Log. Imm.      I    SLLI  rd,rs1,shamt
              Shift Right Logical       R    SRL   rd,rs1,rs2
              Shift Right Log. Imm.     I    SRLI  rd,rs1,shamt
              Shift Right Arithmetic    R    SRA   rd,rs1,rs2
              Shift Right Arith. Imm.   I    SRAI  rd,rs1,shamt
Arithmetic    ADD                       R    ADD   rd,rs1,rs2
              ADD Immediate             I    ADDI  rd,rs1,imm
              SUBtract                  R    SUB   rd,rs1,rs2
              Load Upper Imm            U    LUI   rd,imm
              Add Upper Imm to PC       U    AUIPC rd,imm
Logical       XOR                       R    XOR   rd,rs1,rs2
              XOR Immediate             I    XORI  rd,rs1,imm
              OR                        R    OR    rd,rs1,rs2
              OR Immediate              I    ORI   rd,rs1,imm
              AND                       R    AND   rd,rs1,rs2
              AND Immediate             I    ANDI  rd,rs1,imm
Compare       Set <                     R    SLT   rd,rs1,rs2
              Set < Immediate           I    SLTI  rd,rs1,imm
              Set < Unsigned            R    SLTU  rd,rs1,rs2
              Set < Imm Unsigned        I    SLTIU rd,rs1,imm
Loads         Load Byte                 I    LB    rd,rs1,imm
              Load Halfword             I    LH    rd,rs1,imm
              Load Byte Unsigned        I    LBU   rd,rs1,imm
              Load Half Unsigned        I    LHU   rd,rs1,imm
              Load Word                 I    LW    rd,rs1,imm
Stores        Store Byte                S    SB    rs1,rs2,imm
              Store Halfword            S    SH    rs1,rs2,imm
              Store Word                S    SW    rs1,rs2,imm
Branches      Branch =                  B    BEQ   rs1,rs2,imm
              Branch ≠                  B    BNE   rs1,rs2,imm
              Branch <                  B    BLT   rs1,rs2,imm
              Branch ≥                  B    BGE   rs1,rs2,imm
              Branch < Unsigned         B    BLTU  rs1,rs2,imm
              Branch ≥ Unsigned         B    BGEU  rs1,rs2,imm
Jump & Link   J&L                       J    JAL   rd,imm
              Jump & Link Register      I    JALR  rd,rs1,imm
Synch         Synch thread              I    FENCE          (Not in 61C)
Environment   CALL                      I    ECALL          (Not in 61C)
              BREAK                     I    EBREAK         (Not in 61C)
One-Instruction-Per-Cycle RISC-V Machine

▪ On every tick of the clock, the computer executes one instruction
▪ Current state outputs drive the inputs to the combinational logic, whose outputs settle to the next-state values before the next clock edge
▪ At the rising clock edge, all the state elements are updated with the combinational logic outputs, and execution moves to the next clock cycle
[Figure: state elements connected to a Combinational Logic block.]
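The clocking rule on this slide can be sketched in a few lines of Python. This is only an illustrative model, not the course's simulator: `tick`, `swap_logic`, and the state fields `a` and `b` are made-up names. The point it shows is that the next state is computed purely from the current state, and that all state elements change together at the clock edge.

```python
def tick(state, next_state_logic):
    """Advance a synchronous machine by one clock cycle (illustrative model)."""
    # Combinational logic: computed only from the CURRENT state outputs.
    next_state = next_state_logic(state)
    # Rising clock edge: every state element is replaced at the same time.
    return next_state


def swap_logic(state):
    # Hypothetical next-state function: each output depends only on old values,
    # so the two values swap without needing a temporary.
    return {"pc": state["pc"] + 4, "a": state["b"], "b": state["a"]}


state = {"pc": 1000, "a": 1, "b": 2}
state = tick(state, swap_logic)
print(state)  # {'pc': 1004, 'a': 2, 'b': 1}
```

Because `a` and `b` are both computed from the old state, the swap works in a single cycle, which is the behaviour the rising-edge update gives real registers.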

Stages of the Datapath : Overview
• Problem: a single, “monolithic” block that “executes an
instruction” (performs all necessary operations beginning
with fetching the instruction) would be too bulky and
inefficient
• Solution: break up the process of “executing an instruction”
into stages, and then connect the stages to create the whole
datapath
– smaller stages are easier to design
– easy to optimize (change) one stage without touching the
others (modularity)

Five Stages of the Datapath
• Stage 1: Instruction Fetch (IF)
• Stage 2: Instruction Decode (ID)

• Stage 3: Execute (EX) - ALU (Arithmetic-Logic Unit)

• Stage 4: Memory Access (MEM)

• Stage 5: Write Back to Register (WB)

Basic Phases of Instruction Execution

[Figure: datapath sketch with PC, +4 adder, IMEM, Reg[ ] (rs1, rs2, rd), imm, mux, ALU and DMEM, divided along a clock/time axis into five phases: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory Access, 5. Register Write.]
Datapath Components: Combinational
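▪ Combinational elements
  – Adder: 32-bit inputs A and B plus CarryIn; outputs 32-bit Sum and CarryOut
  – Multiplexer (MUX): Select chooses which 32-bit input, A or B, drives the 32-bit output Y
  – ALU: OP selects the operation applied to 32-bit inputs A and B; output is the 32-bit Result
▪ Storage elements + clocking methodology
▪ Building blocks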
Datapath Elements: State and Sequencing (1/3)
▪ Register: N-bit Data In, N-bit Data Out, Write Enable, clk
▪ Write Enable:
  – Low (or deasserted) (0): Data Out will not change
  – Asserted (1): Data Out will become Data In on positive edge of clock
Datapath Elements: State and Sequencing (2/3)
▪ Register file (regfile, RF) consists of 32 registers (32 x 32-bit):
  – Two 32-bit output busses: busA and busB
  – One 32-bit input bus: busW
▪ Register is selected by a 5-bit register number:
  – RA (number) selects the register to put on busA (data)
  – RB (number) selects the register to put on busB (data)
  – RW (number) selects the register to be written via busW (data) when Write Enable is 1
▪ Clock input (Clk)
  – Clk input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block: RA or RB valid → busA or busB valid after “access time.”
Datapath Elements: State and Sequencing (3/3)
▪ “Magic” Memory
  – One input bus: Data In (32 bits)
  – One output bus: Data Out (32 bits)
▪ Memory word is found by:
  – For Read: Address selects the word to put on Data Out
  – For Write: Set Write Enable = 1: Address selects the memory word to be written via the Data In bus
▪ Clock input (CLK)
  – CLK input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block: Address valid → Data Out valid after “access time”
State Required by RV32I ISA (1/2)
Each instruction during execution reads and updates the state of:
(1) Registers, (2) Program counter, (3) Memory
▪ Registers (x0..x31)
  – Register file (regfile) Reg holds 32 registers x 32 bits/register: Reg[0]..Reg[31]
  – First register read specified by rs1 field in instruction
  – Second register read specified by rs2 field in instruction
  – Write register (destination) specified by rd field in instruction
  – x0 is always 0 (writes to Reg[0] are ignored)
▪ Program Counter (PC)
  – Holds address of current instruction
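As a rough illustration of the register-file behaviour described above (combinational reads, clocked writes gated by a write enable, x0 fixed at zero), here is a minimal Python sketch. The class and method names (`RegFile`, `read`, `clock_edge`) are invented for this example and do not come from the slides.

```python
class RegFile:
    """Hypothetical model of the 32 x 32-bit register file."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs1, rs2):
        # Reads are combinational: the clock plays no part.
        return self.regs[rs1], self.regs[rs2]

    def clock_edge(self, rd, data, reg_w_en):
        # Writes happen only at the rising clock edge, and only when enabled.
        # Writes to x0 are ignored, so Reg[0] always reads as 0.
        if reg_w_en and rd != 0:
            self.regs[rd] = data & 0xFFFFFFFF


rf = RegFile()
rf.clock_edge(5, 42, True)    # x5 <- 42
rf.clock_edge(0, 99, True)    # ignored: x0 stays 0
rf.clock_edge(6, 7, False)    # ignored: write enable is 0
print(rf.read(5, 0))  # (42, 0)
print(rf.read(6, 0))  # (0, 0)
```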
State Required by RV32I ISA (2/2)

▪ Memory (MEM)
  – Holds both instructions & data, in one 32-bit byte-addressed memory space
  – We’ll use separate memories for instructions (IMEM) and data (DMEM)
    – These are placeholders for instruction and data caches
  – Instructions are read (fetched) from instruction memory (assume IMEM read-only)
  – Load/store instructions access data memory
Review: R-Type Instructions
R-format: ALU
bits:   [31:25]  [24:20]  [19:15]  [14:12]     [11:7]  [6:0]
width:  7        5        5        3           5       7
field:  funct7   rs2      rs1      funct3      rd      opcode
        0000000  rs2      rs1      000 : ADD   rd      0110011 : OP-R
        0100000  rs2      rs1      000 : SUB   rd      0110011 : OP-R
        0000000  rs2      rs1      001 : SLL   rd      0110011 : OP-R
        0000000  rs2      rs1      010 : SLT   rd      0110011 : OP-R
        0000000  rs2      rs1      011 : SLTU  rd      0110011 : OP-R
        0000000  rs2      rs1      100 : XOR   rd      0110011 : OP-R
        0000000  rs2      rs1      101 : SRL   rd      0110011 : OP-R
        0100000  rs2      rs1      101 : SRA   rd      0110011 : OP-R
        0000000  rs2      rs1      110 : OR    rd      0110011 : OP-R
        0000000  rs2      rs1      111 : AND   rd      0110011 : OP-R
▪ E.g. addition/subtraction
  add rd, rs1, rs2    R[rd] = R[rs1] + R[rs2]
  sub rd, rs1, rs2    R[rd] = R[rs1] - R[rs2]
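The R-format field boundaries listed above translate directly into shifts and masks. The sketch below is an illustration only; the function name `decode_r` is made up, and the two test words are the standard encodings of `add x1, x2, x3` and `sub x1, x2, x3` worked out from the field table.

```python
def decode_r(inst):
    """Split a 32-bit R-format instruction word into its six fields."""
    return {
        "funct7": (inst >> 25) & 0x7F,  # inst[31:25]
        "rs2":    (inst >> 20) & 0x1F,  # inst[24:20]
        "rs1":    (inst >> 15) & 0x1F,  # inst[19:15]
        "funct3": (inst >> 12) & 0x7,   # inst[14:12]
        "rd":     (inst >> 7) & 0x1F,   # inst[11:7]
        "opcode": inst & 0x7F,          # inst[6:0]
    }


add_fields = decode_r(0x003100B3)  # add x1, x2, x3
sub_fields = decode_r(0x403100B3)  # sub x1, x2, x3
print(add_fields)
print(sub_fields["funct7"] == 0b0100000)  # True: only funct7 differs from add
```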
Implementing the add instruction
bits:   31-25    24-20  19-15  14-12   11-7  6-0
field:  funct7   rs2    rs1    funct3  rd    opcode
width:  7        5      5      3       5     7
add:    0000000  rs2    rs1    000     rd    0110011
        (add)                  (add)         (Reg-Reg OP)

add rd, rs1, rs2

▪ Instruction makes two changes to machine’s state:
  – Reg[rd] = Reg[rs1] + Reg[rs2]
  – PC = PC + 4
Datapath for add
Reg[rd] = Reg[rs1] + Reg[rs2]
[Figure: datapath for add. PC addresses IMEM, and a +4 adder produces pc+4 as the next PC. Inst[19:15] and Inst[24:20] drive AddrA and AddrB of Reg[ ]; DataA (Reg[rs1]) and DataB (Reg[rs2]) feed the ALU (+); the alu output returns to DataD and is written to the register selected by Inst[11:7] (AddrD). Control logic decodes Inst[31:0] (funct7, rs2, rs1, funct3, rd, opcode) and sets RegWriteEnable (RegWEn) = 1.]
Timing Diagram for add
[Figure: the datapath for add (PC, +4 adder, IMEM, Reg[ ], ALU, RegWEn), drawn above a timing diagram covering two clock cycles.]

Signal       Cycle 1         Cycle 2
PC           1000            1004
PC+4         1004            1008
inst[31:0]   add x1,x2,x3    add x6,x7,x9
Reg[rs1]     Reg[2]          Reg[7]
Reg[rs2]     Reg[3]          Reg[9]
alu          Reg[2]+Reg[3]   Reg[7]+Reg[9]
Reg[1]       ???             Reg[2]+Reg[3]
Implementing the sub instruction
0000000 rs2 rs1 000 rd 0110011 add
0100000 rs2 rs1 000 rd 0110011 sub

sub rd, rs1, rs2


▪ Almost the same as add, except now have to
subtract operands instead of adding them
▪ inst[30] selects between add and subtract

Datapath for add/sub
PC = PC + 4
Reg[rd] = Reg[rs1] +/- Reg[rs2]
[Figure: the add datapath with one new control signal, ALUSel (add=0 / sub=1), from the control logic to the ALU; RegWriteEnable (RegWEn) = 1. Encoding shown: 0100000 rs2 rs1 000 rd 0110011 (sub).]
Implementing Other R-Format Instructions
0000000 rs2 rs1 000 rd 0110011 add
0100000 rs2 rs1 000 rd 0110011 sub
0000000 rs2 rs1 001 rd 0110011 sll
0000000 rs2 rs1 010 rd 0110011 slt
0000000 rs2 rs1 011 rd 0110011 sltu
0000000 rs2 rs1 100 rd 0110011 xor
0000000 rs2 rs1 101 rd 0110011 srl
0100000 rs2 rs1 101 rd 0110011 sra
0000000 rs2 rs1 110 rd 0110011 or
0000000 rs2 rs1 111 rd 0110011 and

All implemented by decoding the funct3 and funct7 fields and selecting the appropriate ALU function
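One way to picture "decode funct3 and funct7, then select the ALU function" is the following Python sketch. It is an assumption-laden illustration, not the course's ALU: the names `alu_r` and `to_signed` are invented, and the only funct7 bit examined is inst[30], which is what separates add from sub and srl from sra in the table above.

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu_r(funct3, funct7, a, b):
    """Compute the result of an R-format instruction on 32-bit operands."""
    a &= MASK
    b &= MASK
    alt = (funct7 >> 5) & 1      # inst[30]: picks sub over add, sra over srl
    shamt = b & 0x1F             # shifts use only the low 5 bits of rs2
    if funct3 == 0b000:
        result = a - b if alt else a + b
    elif funct3 == 0b001:
        result = a << shamt                       # sll
    elif funct3 == 0b010:
        result = int(to_signed(a) < to_signed(b)) # slt
    elif funct3 == 0b011:
        result = int(a < b)                       # sltu
    elif funct3 == 0b100:
        result = a ^ b                            # xor
    elif funct3 == 0b101:
        result = (to_signed(a) >> shamt) if alt else (a >> shamt)  # sra / srl
    elif funct3 == 0b110:
        result = a | b                            # or
    else:
        result = a & b                            # and
    return result & MASK


print(hex(alu_r(0b000, 0b0100000, 2, 3)))           # 0xffffffff (2 - 3)
print(hex(alu_r(0b101, 0b0100000, 0x80000000, 4)))  # 0xf8000000 (sra)
```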
Implementing I-Format - addi instruction
▪ RISC-V Assembly Instruction:
addi x15,x1,-50
bits:   31-20         19-15  14-12   11-7   6-0
field:  imm[11:0]     rs1    funct3  rd     opcode
width:  12            5      3       5      7
value:  111111001110  00001  000     01111  0010011
        imm=-50       rs1=1  add     rd=15  OP-Imm
Datapath for add/sub
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
[Figure: the add/sub datapath, with RegWriteEnable (RegWEn) = 1 and ALUSel (add=0 / sub=1). The ALU's second input still comes from Reg[rs2] (DataB); an annotation marks that input: "Immediate should be here".]
Adding addi to Datapath
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
[Figure: the add/sub datapath with a new 2-way mux on the ALU's B input, choosing DataB (Reg[rs2], input 0) or Imm[31:0] (input 1). Control: RegWriteEnable (RegWEn) = 1, BSel (rs2=0 / Imm=1), ALUSel (add=0 / sub=1). Encoding shown: imm[11:0] rs1 000 rd 0010011.]
Adding addi to Datapath
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
[Figure: same datapath with an Imm. Gen block added: it takes Inst[31:20] and produces Imm[31:0] for the B-input mux. Control: ImmSel = I, RegWriteEnable (RegWEn) = 1, BSel (rs2=0 / Imm=1), ALUSel (add=0 / sub=1).]
Adding addi to Datapath
PC = PC + 4
Reg[rd] = Reg[rs1] + Imm
[Figure: same datapath with the addi path highlighted: ImmSel = I, RegWriteEnable (RegWEn) = 1, BSel = 1, so the ALU adds Reg[rs1] and Imm[31:0].]
I-Format Immediates
bits:   31-20      19-15  14-12   11-7  6-0
field:  imm[11:0]  rs1    funct3  rd    opcode

imm[31:0]:  imm[31:11] = inst[31] (sign-extension),  imm[10:0] = inst[30:20]

• High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
• Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])
[Figure: Imm. Gen block with input inst[31:20], control ImmSel=I, output imm[31:0].]
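The I-format rule (copy inst[31:20] into imm[11:0], then replicate inst[31] upward) can be checked with a short Python sketch. The helper names `sign_extend` and `imm_i` are illustrative, and the function returns a signed Python integer rather than a 32-bit pattern for readability. The test word 0xFCE08793 is the encoding of `addi x15,x1,-50` assembled from the bit fields on the addi slide.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def imm_i(inst):
    """I-format immediate: inst[31:20], sign-extended from inst[31]."""
    return sign_extend(inst >> 20, 12)


print(imm_i(0xFCE08793))  # -50  (addi x15, x1, -50)
print(imm_i(0x00812703))  # 8    (lw x14, 8(x2))
```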
Adding addi to Datapath
Works for all other I-format arithmetic instructions (slti, sltiu, andi, ori, xori, slli, srli, srai) just by changing ALUSel
[Figure: datapath with Imm. Gen and the B-input mux. Control: ImmSel = I, RegWriteEnable (RegWEn) = 1, BSel = 1 (rs2=0 / Imm=1), ALUSel chosen per instruction.]
R+I Arithmetic/Logic Datapath
[Figure: datapath supporting both R-format and I-format arithmetic/logic instructions: PC, +4 adder, IMEM, Reg[ ], Imm. Gen (Inst[31:20] to Imm[31:0]), B-input mux (DataB = 0, Imm = 1), ALU. Control signals: ImmSel, RegWriteEnable, BSel, ALUSel.]
Add lw
▪ RISC-V Assembly Instruction (I-type): lw x14, 8(x2)
bits:   31-20         19-15  14-12   11-7   6-0
field:  imm[11:0]     rs1    funct3  rd     opcode
width:  12            5      3       5      7
        offset[11:0]  base   width   dest   LOAD
value:  000000001000  00010  010     01110  0000011
        imm=+8        rs1=2  lw      rd=14  LOAD

▪ The 12-bit signed immediate is added to the base address in register rs1 to form the memory address
  – This is very similar to the add-immediate operation, but it is used to create an address, not the final result
▪ The value loaded from memory is stored in register rd
R+I Arithmetic/Logic Datapath
[Figure: the R+I datapath extended for lw. The alu output drives the DMEM addr input; DMEM's DataR output (mem) and the alu output feed a write-back mux (mem = 0, alu = 1) that drives DataD. Control signals: ImmSel, RegWriteEnable, BSel, ALUSel, MemRW, WBSel.]
R+I Arithmetic/Logic Datapath
[Figure: the same datapath with the control values for lw: ImmSel = I, RegWEn = 1, BSel = 1, ALUSel = Add, MemRW = Read, WBSel = 0 (mem).]
All RV32 Load Instructions
imm[11:0]  rs1  000  rd  0000011  lb
imm[11:0]  rs1  001  rd  0000011  lh
imm[11:0]  rs1  010  rd  0000011  lw
imm[11:0]  rs1  100  rd  0000011  lbu
imm[11:0]  rs1  101  rd  0000011  lhu

• funct3 field encodes size and ‘signedness’ of load data
▪ Supporting the narrower loads requires additional logic to extract the correct byte/halfword from the value loaded from memory, and sign- or zero-extend the result to 32 bits before writing back to register file.
  – It is just a mux + a few gates
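The extra logic for narrower loads can be sketched as below. This is only an illustration under stated assumptions: the memory returns the whole aligned 32-bit word, bytes are numbered little-endian within it, and accesses are naturally aligned. The function name `load_extend` and its parameters are invented for this example; the funct3 values are the ones in the table above.

```python
def load_extend(word, byte_offset, funct3):
    """Pick the byte/halfword/word out of a 32-bit memory word and extend it.

    Assumes little-endian byte numbering and a naturally aligned access.
    """
    size_code = funct3 & 0b011          # 0 = byte, 1 = halfword, 2 = word
    is_unsigned = bool(funct3 & 0b100)  # lbu / lhu
    bits = 8 << size_code               # 8, 16 or 32
    value = (word >> (8 * byte_offset)) & ((1 << bits) - 1)
    if not is_unsigned and value >> (bits - 1):
        # Sign-extend: fill every bit above the loaded field with ones.
        value |= 0xFFFFFFFF & ~((1 << bits) - 1)
    return value


word = 0x12348680
print(hex(load_extend(word, 0, 0b000)))  # 0xffffff80  (lb)
print(hex(load_extend(word, 0, 0b100)))  # 0x80        (lbu)
print(hex(load_extend(word, 0, 0b001)))  # 0xffff8680  (lh)
```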
Adding sw instruction
sw: Reads two registers, rs1 for the base memory address and rs2 for the data to be stored, as well as an immediate offset!
sw x14, 8(x2)
bits:   31-25         24-20   19-15  14-12   11-7         6-0
field:  imm[11:5]     rs2     rs1    funct3  imm[4:0]     opcode
width:  7             5       5      3       5            7
        offset[11:5]  src     base   width   offset[4:0]  STORE
value:  0000000       01110   00010  010     01000        0100011
        offset[11:5]=0  rs2=14  rs1=2  SW    offset[4:0]=8  STORE

0000000 01000 : combined 12-bit offset = 8
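Reassembling the split S-format offset is a two-field concatenation followed by sign extension. The sketch below is illustrative only; `sign_extend` and `imm_s` are made-up helper names. The first test word, 0x00E12423, is `sw x14, 8(x2)` assembled from the bit fields above; the second, 0xFEE12E23, is the same store with offset -4, included to exercise the sign bit.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def imm_s(inst):
    """S-format immediate: {inst[31:25], inst[11:7]}, sign-extended."""
    upper = (inst >> 25) & 0x7F   # imm[11:5]
    lower = (inst >> 7) & 0x1F    # imm[4:0]
    return sign_extend((upper << 5) | lower, 12)


print(imm_s(0x00E12423))  # 8   (sw x14, 8(x2))
print(imm_s(0xFEE12E23))  # -4  (sw x14, -4(x2))
```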
Datapath with lw
[Figure: datapath with PC, +4 adder, IMEM, Reg[ ], Imm. Gen, B-input mux, ALU, DMEM (addr from alu, DataR out as mem) and the write-back mux (mem = 0, alu = 1). Control signals: ImmSel, RegWriteEnable, BSel, ALUSel, MemRW, WBSel.]
Adding sw to Datapath
[Figure: the lw datapath with one new wire: Reg[rs2] (DataB) also goes to the DMEM DataW input. Control values for sw: ImmSel = S, RegWEn = 0, BSel = 1, ALUSel = Add, MemRW = Write, WBSel = * (don’t care).]
Adding sw to Datapath
[Figure: the same datapath with the sw path highlighted: the ALU adds Reg[rs1] and the S-type immediate to form the DMEM address, and Reg[rs2] is written through DataW. ImmSel = S, RegWEn = 0, BSel = 1, ALUSel = Add, MemRW = Write, WBSel = *.]
I+S Immediate Generation
bits:    31-25      24-20  19-15  14-12   11-7      6-0
I-type:  imm[11:0]         rs1    funct3  rd        I-opcode
S-type:  imm[11:5]  rs2    rs1    funct3  imm[4:0]  S-opcode

imm[31:0]:  bits 31-11                  bits 10-5    bits 4-0
I:          inst[31] (sign extension)   inst[30:25]  inst[24:20]
S:          inst[31] (sign extension)   inst[30:25]  inst[11:7]

• Just need a 5-bit mux to select between the two positions where the low five bits of the immediate can reside in the instruction
• Other bits in immediate are wired to fixed positions in instruction
All RV32 Store Instructions
• Store byte writes the low byte to memory
• Store halfword writes the lower two bytes to memory

imm[11:5]  rs2  rs1  000  imm[4:0]  0100011  sb
imm[11:5]  rs2  rs1  001  imm[4:0]  0100011  sh
imm[11:5]  rs2  rs1  010  imm[4:0]  0100011  sw
(the funct3 field encodes the width)
RISC-V B-Format for Branches
bits:   31       30-25      24-20  19-15  14-12   11-8      7        6-0
field:  imm[12]  imm[10:5]  rs2    rs1    funct3  imm[4:1]  imm[11]  opcode
width:  1        6          5      5      3       4         1        7
        offset[12|10:5]     rs2    rs1    funct3  offset[4:1|11]     BRANCH
▪ B-format is mostly the same as S-format, with two register sources (rs1/rs2) and a 12-bit immediate imm[12:1]
▪ But now the immediate represents values -4096 to +4094 in 2-byte increments
▪ The 12 immediate bits encode even 13-bit signed byte offsets (lowest bit of offset is always zero, so no need to store it)
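To make the scrambled B-format bit order concrete, here is a small encode/decode pair in Python. It is a sketch for illustration: `encode_b`, `imm_b`, and `sign_extend` are invented names, and 0x63 is the BRANCH opcode (1100011) from the RV32I specification rather than from this slide. The constant 0xFE000EE3 is the encoding of `beq x0, x0, -4` built from the field layout above.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def encode_b(offset, rs1, rs2, funct3):
    """Pack an even byte offset in -4096..+4094 into a B-format word."""
    assert offset % 2 == 0 and -4096 <= offset <= 4094
    off = offset & 0x1FFF                   # 13-bit two's complement
    return (((off >> 12) & 0x1) << 31       # imm[12]
            | ((off >> 5) & 0x3F) << 25     # imm[10:5]
            | rs2 << 20 | rs1 << 15 | funct3 << 12
            | ((off >> 1) & 0xF) << 8       # imm[4:1]
            | ((off >> 11) & 0x1) << 7      # imm[11]
            | 0x63)                         # BRANCH opcode


def imm_b(inst):
    """Recover the signed byte offset; bit 0 is implicitly zero."""
    imm = (((inst >> 31) & 0x1) << 12
           | ((inst >> 7) & 0x1) << 11
           | ((inst >> 25) & 0x3F) << 5
           | ((inst >> 8) & 0xF) << 1)
    return sign_extend(imm, 13)


print(hex(encode_b(-4, 0, 0, 0)))  # 0xfe000ee3  (beq x0, x0, -4)
print(imm_b(0xFE000EE3))           # -4
```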
Datapath So Far
[Figure: datapath with PC, +4 adder, IMEM, Reg[ ], Imm. Gen, B-input mux, ALU, DMEM (addr, DataW, DataR) and the write-back mux (mem = 0, alu = 1). Control signals: ImmSel, RegWriteEnable, BSel, ALUSel, MemRW, WBSel.]
To Add Branches
▪ Different change to the state:
  PC = PC + 4            (branch not taken)
  PC = PC + immediate    (branch taken)
▪ Six branch instructions: beq, bne, blt, bge, bltu, bgeu
▪ Need to compute PC + immediate and to compare values of rs1 and rs2
  – But have only one ALU – need more hardware
Adding Branches
[Figure: datapath with three additions: a mux in front of PC choosing pc+4 or the alu output (PCSel); a Branch Comp block comparing R[rs1] and R[rs2], with input BrUn and outputs BrEq and BrLT; and a mux on the ALU's A input choosing R[rs1] (0) or pc (1) (ASel). Control values for a branch: PCSel = taken/not taken, ImmSel = B, RegWEn = 0, BSel = 1, ASel = 1, ALUSel = add, MemRW = read, WBSel = *.]
Branch Comparator
[Figure: Branch Comp block with data inputs A and B, control input BrUn, and outputs BrEq and BrLT.]
• BrEq = 1, if A = B
• BrLT = 1, if A < B
• BrUn = 1 selects unsigned comparison for BrLT, 0 = signed
• BGE branch: A >= B, if !(A < B)
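The comparator and the way its two outputs cover all six branches can be sketched as follows. This is an illustrative model only: `branch_comp`, `branch_taken`, and `to_signed` are names invented here, and the taken conditions follow the rules above (bne is the negation of BrEq, bge/bgeu the negation of BrLT).

```python
MASK = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def branch_comp(a, b, br_un):
    """Return (BrEq, BrLT) for 32-bit inputs; br_un selects unsigned compare."""
    a &= MASK
    b &= MASK
    br_eq = a == b
    br_lt = (a < b) if br_un else (to_signed(a) < to_signed(b))
    return br_eq, br_lt


def branch_taken(name, a, b):
    """Decide whether the named branch is taken, using only BrEq and BrLT."""
    br_un = name in ("bltu", "bgeu")
    br_eq, br_lt = branch_comp(a, b, br_un)
    return {
        "beq": br_eq, "bne": not br_eq,
        "blt": br_lt, "bge": not br_lt,
        "bltu": br_lt, "bgeu": not br_lt,
    }[name]


print(branch_taken("blt", 0xFFFFFFFF, 1))   # True: -1 < 1 when signed
print(branch_taken("bltu", 0xFFFFFFFF, 1))  # False: 0xFFFFFFFF > 1 unsigned
```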
Branch Immediates (In Other ISAs)
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes
• Standard approach: Treat immediate as in range -2048..+2047, then shift left by 1 bit to multiply by 2 for branches

instruction:  s | imm[10:5] | rs2 | rs1 | funct3 | imm[4:0] | B-opcode
S-Immediate:  sign-extension | s | imm[10:5] | imm[4:0]
B-Immediate:  sign-extension | s | imm[10:5] | imm[4:0] | 0   (shift left by 1)

Each instruction immediate bit can appear in one of two places in the output immediate value, so need one 2-way mux per bit
Branch Immediates (In RISC-V ISAs)
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes
NOTE: see page 116 of the RISC-V textbook for extra information
• RISC-V approach: keep 11 immediate bits in fixed position in the output value, and move the instruction bit that holds the LSB of the S-format immediate (inst[7]) so that it becomes bit 11 of the B-format immediate

S-Immediate:  sign=imm[11] | imm[10:5] | imm[4:0]
B-Immediate:  sign=imm[12] | imm[11] | imm[10:5] | imm[4:1] | 0   (shift left by 1)

Only one bit changes position between S and B, so only need a single-bit 2-way mux
The RISC-V architects wanted to support the possibility of instructions that are only 2 bytes long, so the branch instructions represent the number of halfwords between the branch and the branch target
RISC-V Immediate Encoding
Instruction encodings, inst[31:0]
bits:    31-25         24-20  19-15  14-12   11-7         6-0
I-type:  imm[11:0]            rs1    funct3  rd           opcode
S-type:  imm[11:5]     rs2    rs1    funct3  imm[4:0]     opcode
B-type:  imm[12|10:5]  rs2    rs1    funct3  imm[4:1|11]  opcode

32-bit immediates produced, imm[31:0]
bits:    31-12       11          10-5         4-1          0
I-imm.:  -inst[31]-  -inst[31]-  inst[30:25]  inst[24:21]  inst[20]
S-imm.:  -inst[31]-  -inst[31]-  inst[30:25]  inst[11:8]   inst[7]
B-imm.:  -inst[31]-  inst[7]     inst[30:25]  inst[11:8]   0

Upper bits sign-extended from inst[31] always
Only bit 7 of instruction changes role in immediate between S and B
Lighting Up Branch Path
[Figure: the branch datapath with the branch path highlighted: the ALU adds pc and the B-type immediate while Branch Comp compares R[rs1] and R[rs2]. PCSel = taken/not taken, ImmSel = B, RegWEn = 0, BSel = 1, ASel = 1, ALUSel = add, MemRW = read, WBSel = *.]
Let’s Add JALR (I-Format)
bits:   31-20         19-15  14-12   11-7  6-0
field:  imm[11:0]     rs1    funct3  rd    opcode
width:  12            5      3       5     7
        offset[11:0]  base   0       dest  JALR

▪ JALR rd, rs1, immediate
▪ Two changes to the state
  – Writes PC+4 to rd (return address)
  – Sets PC = rs1 + immediate
  – Uses same immediates as arithmetic and loads
    ◾ no multiplication by 2 bytes
    ◾ LSB of the computed target is ignored
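The two state changes made by JALR can be written out as a tiny Python function. This is an illustrative sketch with invented names (`jalr`, `rs1_val`); it assumes the immediate has already been sign-extended and models "LSB is ignored" by clearing bit 0 of the target.

```python
MASK = 0xFFFFFFFF


def jalr(pc, rs1_val, imm):
    """Return (next_pc, value_written_to_rd) for jalr rd, rs1, imm."""
    link = (pc + 4) & MASK                   # return address goes to rd
    target = (rs1_val + imm) & MASK & ~1     # no x2 scaling; bit 0 cleared
    return target, link


next_pc, rd_value = jalr(0x1000, 0x2001, 4)
print(hex(next_pc), hex(rd_value))  # 0x2004 0x1004
```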
Datapath So Far, With Branches
[Figure: datapath with the PC mux (PCSel), IMEM, Reg[ ], Imm. Gen, Branch Comp (BrUn in; BrEq, BrLT out), A-input mux (ASel), B-input mux (BSel), ALU, DMEM and write-back mux. Control signals: PCSel, ImmSel, RegWEn, BrUn, BrEq, BrLT, BSel, ASel, ALUSel, MemRW, WBSel.]
Datapath With JALR
[Figure: the branch datapath with a third write-back mux input: pc+4 (input 2), alongside alu (1) and mem (0). Control values for jalr: PCSel = taken, ImmSel = I, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 1, ASel = 0, ALUSel = Add, MemRW = Read, WBSel = 2.]
J-Format for Jump Instructions
bits:   31       30-21      20       19-12       11-7  6-0
field:  imm[20]  imm[10:1]  imm[11]  imm[19:12]  rd    opcode
width:  1        10         1        8           5     7
        offset[20:1]                             dest  JAL
▪ Two changes to the state
  – jal saves PC+4 in register rd (the return address)
  – Set PC = PC + offset (PC-relative jump)
▪ Target somewhere within ±2^19 locations, 2 bytes apart
  – ±2^18 32-bit instructions
▪ Immediate encoding optimized similarly to branch instruction to reduce hardware cost
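The J-format bit shuffle can be undone with four shifts, as in this Python sketch. The helper names are invented for illustration. The test words are the encodings of `jal x1, 8` (0x008000EF) and `jal x0, -4` (0xFFDFF06F), built from the field layout above with the JAL opcode 1101111 from the RV32I specification.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def imm_j(inst):
    """J-format offset: imm[20|10:1|11|19:12], bit 0 implicitly zero."""
    imm = (((inst >> 31) & 0x1) << 20      # imm[20]
           | ((inst >> 12) & 0xFF) << 12   # imm[19:12]
           | ((inst >> 20) & 0x1) << 11    # imm[11]
           | ((inst >> 21) & 0x3FF) << 1)  # imm[10:1]
    return sign_extend(imm, 21)


print(imm_j(0x008000EF))  # 8   (jal x1, 8)
print(imm_j(0xFFDFF06F))  # -4  (jal x0, -4)
```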
Datapath with JAL
[Figure: the JALR datapath reused for jal; the target PC + offset is computed in the ALU from pc and the J-type immediate. Control values for jal: PCSel = taken, ImmSel = J, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 1, ASel = 1 (pc), ALUSel = Add, MemRW = Read, WBSel = 2.]
Light Up JAL Path
[Figure: the same datapath with the jal path highlighted: the ALU adds pc and the J-type immediate to form the next PC, and pc+4 is written back to rd. PCSel = taken, ImmSel = J, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 1, ASel = 1, ALUSel = Add, MemRW = Read, WBSel = 2.]
U-Format for “Upper Immediate” Instructions
bits:   31-12               11-7  6-0
field:  imm[31:12]          rd    opcode
width:  20                  5     7
        U-immediate[31:12]  dest  LUI
        U-immediate[31:12]  dest  AUIPC

▪ Has 20-bit immediate in upper 20 bits of 32-bit instruction word
▪ One destination register, rd
▪ Used for two instructions
  – lui – Load Upper Immediate
  – auipc – Add Upper Immediate to PC
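A short Python sketch of what the two U-format instructions write to rd. The function names are illustrative, not from the slides. The test word 0x123452B7 is `lui x5, 0x12345` assembled from the layout above with the LUI opcode 0110111 from the RV32I specification; the same word with opcode 0010111 (0x12345297) is the corresponding `auipc`.

```python
MASK = 0xFFFFFFFF


def imm_u(inst):
    """U-format immediate: inst[31:12] in the upper 20 bits, low 12 bits zero."""
    return inst & 0xFFFFF000


def lui_result(inst):
    # lui writes the immediate itself to rd.
    return imm_u(inst)


def auipc_result(inst, pc):
    # auipc writes pc + immediate to rd (wrapping to 32 bits).
    return (pc + imm_u(inst)) & MASK


print(hex(lui_result(0x123452B7)))            # 0x12345000
print(hex(auipc_result(0x12345297, 0x1000)))  # 0x12346000
```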
Datapath With LUI, AUIPC
[Figure: the same datapath, now with a U-type immediate from Imm. Gen. Control values shared by lui and auipc: PCSel = pc+4, ImmSel = U, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 1, MemRW = Read, WBSel = 1 (alu).]
Lighting Up LUI
[Figure: the datapath with the lui path highlighted: the ALU passes its B input (the U-type immediate) straight through to rd. PCSel = pc+4, ImmSel = U, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 1, ASel = *, ALUSel = B, MemRW = Read, WBSel = 1.]
Lighting Up AUIPC
[Figure: the datapath with the auipc path highlighted: the ALU adds pc and the U-type immediate and the sum is written to rd. PCSel = pc+4, ImmSel = U, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 1, ASel = 1 (pc), ALUSel = add, MemRW = Read, WBSel = 1.]
Complete RV32I Datapath!
[Figure: the full single-cycle datapath: PC mux (PCSel), PC, +4 adder, IMEM, Reg[ ], Imm. Gen, Branch Comp, A-input and B-input muxes, ALU, DMEM, and the 3-way write-back mux (mem = 0, alu = 1, pc+4 = 2). Control signals: PCSel, ImmSel, RegWEn, BrUn, BrEq, BrLT, BSel, ASel, ALUSel, MemRW, WBSel.]
Complete RV32I ISA!
[Table: the RV32I base integer instruction reference card from the slide "Need to Implement All RV32I Instructions" is shown again: Shifts, Arithmetic, Logical, Compare, Loads, Stores, Branches, Jump & Link, plus the Synch (FENCE) and Environment (ECALL, EBREAK) rows marked "Not in 61C".]
Review
▪ We have designed a complete datapath
  – Capable of executing all the RV32I instructions covered so far in one cycle each
  – Not all units (hardware) used by all instructions
▪ 5 phases of execution
  – IF, ID, EX, MEM, WB
  – Not all instructions are active in all phases
▪ Controller specifies how to execute instructions
  – We still need to design it
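Our Single-Core Processor
[Figure: block diagram of Processor and Memory. The Processor contains Control and a Datapath (Program Counter (PC), Registers, Arithmetic-Logic Unit (ALU)). Memory holds Program and Data bytes. The two are connected by Enable?, Read/Write, Address, Write Data and Read Data; Input and Output are shown alongside.]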
Single-Cycle RV32I Datapath and Control
[Figure: the complete single-cycle RV32I datapath with its control logic. Control inputs: Inst[31:0], BrEq, BrLT. Control outputs: PCSel, ImmSel, RegWEn, BrUn, BSel, ASel, ALUSel, MemRW, WBSel.]
Example: sw
[Figure: the complete datapath with the control values for sw: PCSel = pc+4, ImmSel = S, RegWEn = 0, BrUn = *, BrEq = *, BrLT = *, BSel = 1, ASel = 0, ALUSel = add, MemRW = Write, WBSel = *.]
Example: beq
[Figure: the complete datapath with the control values for beq: PCSel depends on BrEq, ImmSel = B, RegWEn = 0, BrUn = *, BrLT = *, BSel = 1, ASel = 1, ALUSel = add, MemRW = Read, WBSel = *.]
Example: add
[Figure: the complete datapath with the control values for add: PCSel = pc+4, ImmSel = *, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, BSel = 0, ASel = 0, ALUSel = add, MemRW = Read, WBSel = 1.]
Add Execution
[Figure: the complete datapath drawn above a timing diagram covering two clock cycles.]

Signal          Cycle 1         Cycle 2
PC              1000            1004
PC+4            1004            1008
inst[31:0]      add x1,x2,x3    add x6,x7,x9
Control logic   add control     add control
Reg[rs1]        Reg[2]          Reg[7]
Reg[rs2]        Reg[3]          Reg[9]
alu             Reg[2]+Reg[3]   Reg[7]+Reg[9]
wb              Reg[2]+Reg[3]   Reg[7]+Reg[9]
Reg[1]          ???             Reg[2]+Reg[3]
Example: add timing
[Figure: single-cycle RV32I datapath (PC, +4 adder, IMEM, Reg[ ], Imm. Gen, Branch Comp, ALU, DMEM, write-back mux) configured for add]
Control signals: PCSel = pc+4, ImmSel = *, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, Bsel = 0, Asel = 0, ALUSel = add, MemRW = Read, WBSel = 1

Critical path = t_clk-q + max{t_Add + t_mux, t_IMEM + t_Reg + t_mux + t_ALU + t_mux} + t_setup
              = t_clk-q + t_IMEM + t_Reg + t_mux + t_ALU + t_mux + t_setup
Example: lw
[Figure: single-cycle RV32I datapath (PC, +4 adder, IMEM, Reg[ ], Imm. Gen, Branch Comp, ALU, DMEM, write-back mux) configured for lw]
Control signals: PCSel = pc+4, ImmSel = I, RegWEn = 1, BrUn = *, BrEq = *, BrLT = *, Bsel = 1, Asel = 0, ALUSel = add, MemRW = Read, WBSel = 0

Critical path = t_clk-q + max{t_Add + t_mux,
                              t_IMEM + t_Imm + t_mux + t_ALU + t_DMEM + t_mux,
                              t_IMEM + t_Reg + t_mux + t_ALU + t_DMEM + t_mux} + t_setup
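
The critical-path expressions for add and lw can be evaluated mechanically once each block has a delay. The sketch below does that in Python. The IMEM, Reg, ALU, and DMEM delays reuse the 200/100/200/200 ps figures from the Instruction Timing slide, while the clk-to-q, setup, mux, adder, and immediate-generator delays are made-up values chosen only for illustration.

```python
# Block delays in picoseconds. IMEM/Reg/ALU/DMEM follow the Instruction
# Timing slide; clk_q, setup, mux, Add and Imm are hypothetical values.
DELAYS_PS = {
    "clk_q": 30, "setup": 20, "mux": 20, "Add": 120, "Imm": 60,
    "IMEM": 200, "Reg": 100, "ALU": 200, "DMEM": 200,
}

# Each candidate path is the ordered list of blocks a signal passes through.
ADD_PATHS = [
    ("Add", "mux"),                                  # next-PC path
    ("IMEM", "Reg", "mux", "ALU", "mux"),            # register operand path
]
LW_PATHS = [
    ("Add", "mux"),
    ("IMEM", "Imm", "mux", "ALU", "DMEM", "mux"),    # immediate operand path
    ("IMEM", "Reg", "mux", "ALU", "DMEM", "mux"),    # base register path
]


def path_delay_ps(blocks):
    """Sum the delays of the blocks along one combinational path."""
    return sum(DELAYS_PS[b] for b in blocks)


def critical_path_ps(paths):
    """t_clk-q + max over all paths + t_setup."""
    longest = max(path_delay_ps(p) for p in paths)
    return DELAYS_PS["clk_q"] + longest + DELAYS_PS["setup"]


add_ps = critical_path_ps(ADD_PATHS)
lw_ps = critical_path_ps(LW_PATHS)
```

With these assumed numbers the register path dominates in both cases, and lw is slower than add by the data-memory access time.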
Instruction Timing

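Stage   Unit       Delay
IF      I-MEM      200 ps
ID      Reg Read   100 ps
EX      ALU        200 ps
MEM     D-MEM      200 ps
WB      Reg W      100 ps
Total              800 ps

[Timing diagram: one clock period divided into tIF, tID, tEX, tMEM, tWB. The PC changes from the old pc to pc+4, and the instruction fetch, instruction decode, execute, and memory access outputs hold their old values (old instruction, old register out, old ALU result, old memory data) until the preceding stages settle.]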

Instruction Timing
Instr   IF = 200ps   ID = 100ps   ALU = 200ps   MEM = 200ps   WB = 100ps   Total
add     X            X            X                           X            600ps
beq     X            X            X                                        500ps
jal     X            X            X                           X            600ps
lw      X            X            X             X             X            800ps
sw      X            X            X             X                          700ps

▪ Maximum clock frequency
  - fmax = 1/800ps = 1.25 GHz
▪ Most blocks idle most of the time
  - E.g. fmax,ALU = 1/200ps = 5 GHz!
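
The per-instruction totals and the clock limit can be checked with a few lines of Python. This is a sketch: the stage delays come from the table, while the stage-usage map `USES` is inferred from the per-instruction totals rather than stated explicitly on the slide.

```python
# Stage delays in picoseconds, as listed in the table.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Stages each instruction needs (inferred from the listed totals).
USES = {
    "add": ("IF", "ID", "EX", "WB"),
    "beq": ("IF", "ID", "EX"),
    "jal": ("IF", "ID", "EX", "WB"),
    "lw":  ("IF", "ID", "EX", "MEM", "WB"),
    "sw":  ("IF", "ID", "EX", "MEM"),
}


def total_ps(instr):
    """Time one instruction needs if only its active stages count."""
    return sum(STAGE_PS[s] for s in USES[instr])


def fmax_ghz(period_ps):
    """Maximum clock frequency in GHz for a period given in ps."""
    return 1000.0 / period_ps


# A single-cycle machine must fit the slowest instruction in one period.
clock_period_ps = max(total_ps(i) for i in USES)
```

Here `clock_period_ps` is set by lw, so every other instruction leaves part of the cycle unused, which is the point of the "most blocks idle" bullet.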

Control Logic Truth Table
Inst[31:0]  BrEq  BrLT  PCSel  ImmSel  BrUn  ASel  BSel  ALUSel  MemRW  RegWEn  WBSel
add         *     *     +4     *       *     Reg   Reg   Add     Read   1       ALU
sub         *     *     +4     *       *     Reg   Reg   Sub     Read   1       ALU
(R-R Op)    *     *     +4     *       *     Reg   Reg   (Op)    Read   1       ALU
addi        *     *     +4     I       *     Reg   Imm   Add     Read   1       ALU
lw          *     *     +4     I       *     Reg   Imm   Add     Read   1       Mem
sw          *     *     +4     S       *     Reg   Imm   Add     Write  0       *
beq         0     *     +4     B       *     PC    Imm   Add     Read   0       *
beq         1     *     ALU    B       *     PC    Imm   Add     Read   0       *
bne         0     *     ALU    B       *     PC    Imm   Add     Read   0       *
bne         1     *     +4     B       *     PC    Imm   Add     Read   0       *
blt         *     1     ALU    B       0     PC    Imm   Add     Read   0       *
bltu        *     1     ALU    B       1     PC    Imm   Add     Read   0       *
jalr        *     *     ALU    I       *     Reg   Imm   Add     Read   1       PC+4
jal         *     *     ALU    J       *     PC    Imm   Add     Read   1       PC+4
auipc       *     *     +4     U       *     PC    Imm   Add     Read   1       ALU
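
One way to read the truth table is as a lookup from (instruction, BrEq, BrLT) to a control word. The Python sketch below encodes the rows that way. The names `Control`, `BASE`, `BRANCH_TAKEN`, and `control` are illustrative, and the not-taken case for blt/bltu (PCSel = +4 when BrLT = 0) is an assumption, since the table lists only the taken rows.

```python
from typing import NamedTuple


class Control(NamedTuple):
    PCSel: str
    ImmSel: str
    BrUn: str
    ASel: str
    BSel: str
    ALUSel: str
    MemRW: str
    RegWEn: int
    WBSel: str


# Rows whose outputs do not depend on BrEq/BrLT ("*" means don't care).
BASE = {
    "add":   ("+4",  "*", "*", "Reg", "Reg", "Add", "Read",  1, "ALU"),
    "sub":   ("+4",  "*", "*", "Reg", "Reg", "Sub", "Read",  1, "ALU"),
    "addi":  ("+4",  "I", "*", "Reg", "Imm", "Add", "Read",  1, "ALU"),
    "lw":    ("+4",  "I", "*", "Reg", "Imm", "Add", "Read",  1, "Mem"),
    "sw":    ("+4",  "S", "*", "Reg", "Imm", "Add", "Write", 0, "*"),
    "jalr":  ("ALU", "I", "*", "Reg", "Imm", "Add", "Read",  1, "PC+4"),
    "jal":   ("ALU", "J", "*", "PC",  "Imm", "Add", "Read",  1, "PC+4"),
    "auipc": ("+4",  "U", "*", "PC",  "Imm", "Add", "Read",  1, "ALU"),
}

# Branch condition as a function of the comparator outputs.
BRANCH_TAKEN = {
    "beq":  lambda eq, lt: eq,
    "bne":  lambda eq, lt: not eq,
    "blt":  lambda eq, lt: lt,
    "bltu": lambda eq, lt: lt,
}
BRANCH_BRUN = {"beq": "*", "bne": "*", "blt": "0", "bltu": "1"}


def control(inst, br_eq=False, br_lt=False):
    """Return the control word for one instruction mnemonic."""
    if inst in BRANCH_TAKEN:
        taken = BRANCH_TAKEN[inst](br_eq, br_lt)
        pcsel = "ALU" if taken else "+4"   # not-taken blt/bltu assumed +4
        return Control(pcsel, "B", BRANCH_BRUN[inst],
                       "PC", "Imm", "Add", "Read", 0, "*")
    return Control(*BASE[inst])
```

For example, `control("beq", br_eq=True).PCSel` selects the ALU output (the branch target), while `control("lw").WBSel` selects the memory output for write-back.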
Control Realization Options
▪ ROM
  - “Read-Only Memory”
  - Regular structure
  - Can be easily reprogrammed
    - fix errors
    - add instructions
  - Popular when designing control logic manually
▪ Combinational Logic
  - Today, chip designers use logic synthesis tools to convert truth tables to networks of gates

RV32I, A Nine-Bit ISA!
▪ Instruction type encoded using only 9 bits:
  - inst[30]
  - inst[14:12]
  - inst[6:2]
[Figure: RV32I instruction encodings with the inst[6:2], inst[14:12], and inst[30] fields marked]

ROM-based Control
▪ 11-bit address (inputs)
  - Inst[30,14:12,6:2] (9 bits)
  - BrEq
  - BrLT
▪ 15 data bits (outputs)
  - PCSel
  - ImmSel[2:0] (3 bits)
  - BrUn
  - ASel
  - BSel
  - ALUSel[3:0] (4 bits)
  - MemRW
  - RegWEn
  - WBSel[1:0] (2 bits)
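
A short Python sketch of how the 11-bit ROM address could be assembled from an instruction word and the two comparator outputs. The bit ordering inside the address (inst[30] as the most significant bit, BrLT as the least significant) is an assumption made for illustration; the slide only lists which signals form the address.

```python
ROM_WORDS = 1 << 11      # 2048 addressable control words
ROM_WORD_BITS = 15       # width of each control word


def rom_address(inst, br_eq, br_lt):
    """Pack inst[30], inst[14:12], inst[6:2], BrEq, BrLT into 11 bits."""
    bit30 = (inst >> 30) & 0x1
    funct3 = (inst >> 12) & 0x7        # inst[14:12]
    opcode_hi = (inst >> 2) & 0x1F     # inst[6:2]
    key = (bit30 << 8) | (funct3 << 5) | opcode_hi      # 9 instruction bits
    return (key << 2) | (int(br_eq) << 1) | int(br_lt)  # 11-bit address


ADD_X1_X2_X3 = 0x003100B3   # add x1, x2, x3
SUB_X1_X2_X3 = 0x403100B3   # sub x1, x2, x3 (differs from add only in inst[30])

add_addr = rom_address(ADD_X1_X2_X3, br_eq=False, br_lt=False)
sub_addr = rom_address(SUB_X1_X2_X3, br_eq=False, br_lt=False)
```

Because add and sub differ only in inst[30], they land at different ROM addresses under this packing, so each can store its own ALUSel value.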

ROM Controller Implementation
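[Figure: ROM controller built from an AND plane and an OR plane. The 11 inputs (Inst[], BrEQ, BrLT) feed the address decoder (AND plane), which selects one line per instruction: add, sub, or, ..., jal. The OR plane stores the control word for each selected line (control word for add, for sub, for or, ...) and drives the controller output (PCSel, ImmSel, … ).]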


The control unit using an explicit counter to compute the next state

Combinational Logic Control
▪ Simplest example: BrUn
▪ How to decode whether BrUn is 1?
  - BrUn = Inst[13] • Branch   (• denotes AND)
[Figure: branch instruction encodings with the inst[14:12] and inst[6:2] fields marked]
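
The BrUn equation can be written directly as bit tests. In the Python sketch below, the `Branch` signal is taken to mean inst[6:2] = 11000 (the RV32I branch opcode without its two low bits); that reading, and the helper names, are assumptions for illustration.

```python
def is_branch(inst):
    """Branch = 1 when inst[6:2] is 11000 (the RV32I branch opcode)."""
    return ((inst >> 2) & 0x1F) == 0b11000


def br_un(inst):
    """BrUn = Inst[13] AND Branch."""
    return bool((inst >> 13) & 1) and is_branch(inst)


BLTU_X1_X2 = 0x0020E463   # bltu x1, x2, +8   (funct3 = 110)
BLT_X1_X2 = 0x0020C463    # blt  x1, x2, +8   (funct3 = 100)
SLTIU_X0 = 0x00003013     # sltiu x0, x0, 0   (funct3 = 011, not a branch)
```

`br_un(BLTU_X1_X2)` is true and `br_un(BLT_X1_X2)` is false. The sltiu word also has inst[13] set, which is why the Branch term is needed in the equation.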

Control Logic to Decode add
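Fields used: inst[30], inst[14:12], inst[6:2]   (¬ denotes NOT, • denotes AND)

add = ¬i[30]•¬i[14]•¬i[13]•¬i[12]•R-type        (inst[30] = 0, inst[14:12] = 000)
R-type = ¬i[6]•i[5]•i[4]•¬i[3]•¬i[2]•RV32I      (inst[6:2] = 01100)
RV32I = i[1]•i[0]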
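
The decode equations translate one-to-one into Boolean code. The following Python sketch assumes plain RV32I, where add is encoded with inst[30] = 0, inst[14:12] = 000, and opcode 0110011, so the bits that must be 0 enter the product complemented. Function names are illustrative.

```python
def bit(inst, n):
    """Return bit n of the instruction word as a bool."""
    return bool((inst >> n) & 1)


def is_rv32i(i):
    # RV32I = i[1] AND i[0]
    return bit(i, 1) and bit(i, 0)


def is_r_type(i):
    # inst[6:2] = 01100, combined with the RV32I term
    return (not bit(i, 6) and bit(i, 5) and bit(i, 4)
            and not bit(i, 3) and not bit(i, 2) and is_rv32i(i))


def is_add(i):
    # inst[30] = 0 and inst[14:12] = 000, combined with the R-type term
    return (not bit(i, 30) and not bit(i, 14) and not bit(i, 13)
            and not bit(i, 12) and is_r_type(i))


ADD_X1_X2_X3 = 0x003100B3    # add  x1, x2, x3
SUB_X1_X2_X3 = 0x403100B3    # sub  x1, x2, x3 (inst[30] = 1)
ADDI_X1_X2_3 = 0x00310093    # addi x1, x2, 3  (not R-type)
```

Only the add word satisfies `is_add`; sub fails on inst[30] and addi fails the R-type term.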
Call home, we’ve made HW/SW contact!
High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  Compiler
Assembly Language Program (e.g., RISC-V)
    lw x3, 0(x10)
    lw x4, 4(x10)
    sw x4, 0(x10)
    sw x3, 4(x10)
  Assembler
Machine Language Program (RISC-V)
    1000 1101 1110 0010 0000 0000 0000 0000
    1000 1110 0001 0000 0000 0000 0000 0100
    1010 1110 0001 0010 0000 0000 0000 0000
    1010 1101 1110 0010 0000 0000 0000 0100
Hardware Architecture Description (e.g., block diagrams)
    [Figure: single-cycle datapath with PC, +4 adder, IMEM, Reg[ ], Imm. Gen, Branch Comp., ALU, DMEM, and write-back mux]
  Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)
    [Figure: gate-level circuit with inputs A, B, C, D computing Out = AB+CD]

“And In conclusion…”
▪ We have built a processor!
  - Capable of executing all RISC-V instructions in one cycle each
  - Not all units (hardware) used by all instructions
  - Critical path changes from instruction to instruction
▪ 5 Phases of execution
  - IF, ID, EX, MEM, WB
  - Not all instructions are active in all phases
▪ Controller specifies how to execute instructions
  - Implemented as ROM or logic
