FPGA Lec02 DesignFlow
[Figure: FPGA design flow. Labels: RTL/Verilog, logic synthesis, circuit netlist, bit stream, program circuit, configured FPGA]
Lecture 2 2
FPGA Lookup Tables (LUTs)
Basic idea: Memory can implement combinational logic
  - Ex: memory with 2 address inputs (4 words) can implement any 2-input logic function
  - 1-bit wide memory: 1 function; 2-bit wide memory: 2 functions
Such memory in FPGA known as lookup table (LUT)
[Figure: F = x'y' + xy stored in a 4x1 memory; F and G = xy' stored together in a 4x2 memory]

x y | addr | 4x1 Mem. (F) | 4x2 Mem. (F G)
0 0 |  0   |      1       |      10
0 1 |  1   |      0       |      00
1 0 |  2   |      0       |      01
1 1 |  3   |      1       |      10

Address inputs: a1 = x, a0 = y. Data outputs: D = F (4x1); D1 = F, D0 = G (4x2).
Example read: x=0, y=0 selects word 0, so F=1.
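The memory-as-logic idea can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the helper names `build_lut` and `read_lut` are hypothetical, and the address ordering (first input = most significant address bit) is an assumption that matches a1 = x, a0 = y above.

```python
def build_lut(func, n_inputs):
    """Fill a memory with func's truth table: one word per address."""
    contents = []
    for addr in range(2 ** n_inputs):
        # Unpack the address into input bits, most significant bit first.
        bits = [(addr >> i) & 1 for i in reversed(range(n_inputs))]
        contents.append(func(*bits))
    return contents


def read_lut(contents, *bits):
    """Evaluate the logic by reading the word at the address formed by bits."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return contents[addr]


def F(x, y):
    return ((x ^ 1) & (y ^ 1)) | (x & y)   # F = x'y' + xy


def G(x, y):
    return x & (y ^ 1)                     # G = xy'


lut_f = build_lut(F, 2)                                  # 4x1 memory
lut_fg = build_lut(lambda x, y: (F(x, y), G(x, y)), 2)   # 4x2 memory

print(lut_f)    # [1, 0, 0, 1]
print(lut_fg)   # [(1, 0), (0, 0), (0, 1), (1, 0)]
```

Reading `read_lut(lut_f, 0, 0)` returns 1, matching the example read in the figure.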
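[Figure: seatbelt warning light circuit (inputs k, p, s; output w) implemented in an 8x1 memory. Address inputs: a2 = k, a1 = p, a0 = s; data output D = w. One label reads "Programming (seconds)".]

k p s | addr | w
0 0 0 |  0   | 0
0 0 1 |  1   | 0
0 1 0 |  2   | 0
0 1 1 |  3   | 0
1 0 0 |  4   | 0
1 0 1 |  5   | 0
1 1 0 |  6   | 1
1 1 1 |  7   | 0

Courtesy: Frank Vahid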
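As a small sketch of the same example, the 8x1 memory can be generated from the expression w = k AND p AND (NOT s), which is what the truth table encodes (w is 1 only at address 6). The names `SEATBELT_LUT` and `warning_light` are made up for illustration.

```python
# Address bits: a2 = k, a1 = p, a0 = s. The nested loops visit addresses 0..7 in order.
SEATBELT_LUT = [k & p & (s ^ 1) for k in (0, 1) for p in (0, 1) for s in (0, 1)]


def warning_light(k, p, s):
    """Read the 8x1 memory at the address formed by k, p, s."""
    return SEATBELT_LUT[(k << 2) | (p << 1) | s]


print(SEATBELT_LUT)            # [0, 0, 0, 0, 0, 0, 1, 0]
print(warning_light(1, 1, 0))  # 1
```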
FPGAs More Efficient With Numerous Small LUTs
Lookup tables become inefficient for more inputs
  - 3 inputs → only 8 words; 8 inputs → 256 words; 16 inputs → 65,536 words!
FPGAs thus have numerous small (3, 4, 5, or even 6-input) LUTs
  - If a circuit has more inputs, must partition the circuit among LUTs
  - Ex: 9-input circuit more efficient on 8x1 mems rather than 512x1
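[Figure: (a) 9-input circuit with inputs a-i and output F; (b) the same function in one 512x1 memory; (c) the circuit partitioned among four 3x1 LUTs, each an 8x1 memory. Here "3x1" means 3 inputs, 1 output.]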
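The growth in memory size is simple arithmetic: an n-input LUT needs 2^n words. The sketch below reproduces the numbers on this slide; the count of four small LUTs for the 9-input example is read off the figure labels and only holds for a circuit that actually partitions that way.

```python
def lut_words(n_inputs):
    """Number of words in a memory with n_inputs address lines."""
    return 2 ** n_inputs


for n in (3, 8, 16):
    print(n, "inputs ->", lut_words(n), "words")

single = lut_words(9)            # one 512x1 memory
partitioned = 4 * lut_words(3)   # four 8x1 memories (assumed partition)
print(single, "bits versus", partitioned, "bits")
```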
Mapping a Circuit to 3x1 LUTs
Divide circuit into 3-input sub-circuits
Map each sub-circuit to 3x1 LUT
(Assume for now that we can create any wires to/between
LUTs)
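[Figure: (a) extended seatbelt circuit with inputs k, p, s, t and output w; (b) the circuit divided into two sub-circuits joined by internal wire x, where the second sub-circuit has only 2 inputs (x and t); (c) the two sub-circuits mapped to two 8x1 memories]

addr | first Mem. (x) | second Mem. (w)
 0   |       0        |       0
 1   |       0        |       1
 2   |       0        |       1
 3   |       0        |       1
 4   |       0        |       0
 5   |       0        |       0
 6   |       1        |       0
 7   |       0        |       0

First memory address inputs: a2 = k, a1 = p, a0 = s.
Words 4-7 of the second memory are shown in italics on the slide: their contents don't matter, because the sub-circuit uses only 2 of the 3 inputs.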
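A short sketch of the two-LUT mapping follows. It assumes the second sub-circuit is w = x OR t (consistent with the contents 0, 1, 1, 1 at addresses 0-3), that x and t drive the two low address bits, and that the unused a2 input is tied to 0; those wiring details are assumptions, not taken from the slide.

```python
from itertools import product

LUT1 = [0, 0, 0, 0, 0, 0, 1, 0]   # x = k AND p AND (NOT s)
LUT2 = [0, 1, 1, 1, 0, 0, 0, 0]   # w = x OR t; words 4-7 are never read


def read(lut, a2, a1, a0):
    """Read a 3-input LUT at the address formed by a2 a1 a0."""
    return lut[(a2 << 2) | (a1 << 1) | a0]


def w_mapped(k, p, s, t):
    x = read(LUT1, k, p, s)
    return read(LUT2, 0, x, t)    # a2 tied to 0 (assumption)


def w_reference(k, p, s, t):
    return (k & p & (s ^ 1)) | t


# Exhaustively compare the mapped circuit against the gate-level expression.
mismatches = [v for v in product((0, 1), repeat=4)
              if w_mapped(*v) != w_reference(*v)]
print(mismatches)   # []
```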
Mapping to 3x2 LUTs
Example: Mapping a 2x4 decoder to 3-input 2-output LUTs
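[Figure: (a) 2x4 decoder with inputs i1, i0 and outputs d0-d3, divided into two sub-circuits; each sub-circuit has 2 inputs and 2 outputs; (b) the two sub-circuits mapped to two 8x2 memories]

addr | first Mem. (D1 D0 = d0 d1) | second Mem. (D1 D0 = d2 d3)
 0   |            10              |            00
 1   |            01              |            00
 2   |            00              |            10
 3   |            00              |            01
 4   |            00              |            00
 5   |            00              |            00
 6   |            00              |            00
 7   |            00              |            00

Address inputs of both memories: a2 = 0, a1 = i1, a0 = i0.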
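The decoder mapping can be checked with a small sketch. The memory contents are copied from the figure; the function name `decode` and the string encoding of the 2-bit words are illustrative choices only.

```python
# Each word is "D1 D0". First LUT drives d0, d1; second LUT drives d2, d3.
LUT_A = ["10", "01", "00", "00", "00", "00", "00", "00"]
LUT_B = ["00", "00", "10", "01", "00", "00", "00", "00"]


def decode(i1, i0):
    """Return (d0, d1, d2, d3) for the 2x4 decoder built from two 3x2 LUTs."""
    addr = (0 << 2) | (i1 << 1) | i0   # a2 is tied to 0
    d0, d1 = (int(c) for c in LUT_A[addr])
    d2, d3 = (int(c) for c in LUT_B[addr])
    return d0, d1, d2, d3


for i1 in (0, 1):
    for i0 in (0, 1):
        print(i1, i0, decode(i1, i0))
```

Exactly one output is 1 for each input combination, which is the defining behavior of a decoder.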
More Mapping Issues
Gate has more inputs than does LUT → Decompose gate first
Sub-circuit has fewer outputs than LUT → Just don't use output
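(Note: decomposed one 4-input AND into two smaller ANDs to enable partitioning into 3-input sub-circuits)

[Figure: (a) circuit with inputs a-e and output F containing a 4-input AND gate; (b) the AND decomposed into two smaller ANDs joined by internal wire t, then divided into sub-circuits; (c) the sub-circuits mapped to two 8x2 memories]

addr | first Mem. (D1 D0) | second Mem. (D1 D0)
 0   |        00          |        00
 1   |        00          |        10
 2   |        00          |        00
 3   |        00          |        10
 4   |        00          |        00
 5   |        00          |        10
 6   |        00          |        10
 7   |        01          |        10

First memory: first column unused; second column implements the AND (t).
Second memory: second column unused; first column implements the AND/OR sub-circuit (F).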
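The decomposition can be verified exhaustively. The sketch below assumes the original function is F = (a AND b AND c AND d) OR e and that the second LUT's address inputs are a2 = t, a1 = d, a0 = e; both are inferred from the memory contents rather than stated on the slide.

```python
from itertools import product

# Words are 2 bits wide: bit 1 is D1, bit 0 is D0.
LUT1 = [0b00] * 7 + [0b01]                               # D0 = t = a AND b AND c
LUT2 = [0b00, 0b10, 0b00, 0b10, 0b00, 0b10, 0b10, 0b10]  # D1 = F = (t AND d) OR e


def f_mapped(a, b, c, d, e):
    t = LUT1[(a << 2) | (b << 1) | c] & 1         # use D0, ignore D1
    return (LUT2[(t << 2) | (d << 1) | e] >> 1) & 1   # use D1, ignore D0


def f_reference(a, b, c, d, e):
    return (a & b & c & d) | e                    # assumed original function


mismatches = [v for v in product((0, 1), repeat=5)
              if f_mapped(*v) != f_reference(*v)]
print(mismatches)   # []
```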
FPGA Internals: Switch Matrices
Previous slides had hardwired connections between LUTs
Program connections using programmable interconnect
  - Simple mux-based version: each output can be set to any of the four inputs just by programming its 2-bit configuration memory
[Figure: (a) FPGA with two 8x2 memories (LUTs) connected through a switch matrix; external pins P0-P4, Q0, Q1; switch matrix inputs m0-m3 and outputs o0-o2. (b) Switch matrix internals: each output (o0, o1, and likewise o2) is driven by a 4x1 mux with data inputs i0-i3 and select inputs s1 s0 taken from that output's own 2-bit memory.]
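A mux-based switch matrix reduces to "each output picks one input by index". The sketch below models that behavior only; the function name and the example configuration values are hypothetical.

```python
def switch_matrix(inputs, config):
    """Model a mux-based switch matrix.

    inputs: the four values on m0..m3.
    config: one 2-bit select value per output (the output's configuration memory).
    Returns the list of output values o0, o1, ...
    """
    return [inputs[select & 0b11] for select in config]


# Route m0 to o0, m2 to o1, and m3 to o2 (example configuration).
outputs = switch_matrix(["m0", "m1", "m2", "m3"], [0b00, 0b10, 0b11])
print(outputs)   # ['m0', 'm2', 'm3']
```

Changing a connection means rewriting a 2-bit configuration word, with no change to the physical wiring.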
Ex: FPGA with Switch Matrix
Mapping the extended seatbelt warning light circuit onto an FPGA with a switch matrix
Configurable Logic Blocks (CLBs)
Include flip-flops to support sequential circuits
Muxes programmed to output registered or non-registered LUT output
  - Each CLB output has a flip-flop, a 2x1 mux, and a 1-bit CLB output configuration memory that drives the mux select

[Figure: FPGA with two CLBs and a switch matrix. Each CLB holds an 8x2 memory whose outputs D1 and D0 each feed a flip-flop and a 2x1 mux. External pins P0-P4, Q0, Q1; switch matrix inputs m0-m3 and outputs o0-o2.]
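One CLB output path (LUT, flip-flop, 2x1 mux) can be sketched as a small class. This is an illustrative behavioral model under assumptions: the class and method names are invented, and the polarity of the configuration bit (1 selects the registered output) is a choice made here, not taken from the slide.

```python
class CLBOutput:
    """One output of a CLB: a 3-input LUT column, a flip-flop, and a 2x1 mux."""

    def __init__(self, lut, registered):
        self.lut = lut                # 8 one-bit words
        self.registered = registered  # 1-bit configuration memory (mux select)
        self.ff = 0                   # flip-flop state

    def _lut_out(self, a2, a1, a0):
        return self.lut[(a2 << 2) | (a1 << 1) | a0]

    def clock(self, a2, a1, a0):
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self._lut_out(a2, a1, a0)

    def output(self, a2, a1, a0):
        """The mux picks the flip-flop (registered) or the LUT (combinational)."""
        return self.ff if self.registered else self._lut_out(a2, a1, a0)


AND3 = [0] * 7 + [1]
comb = CLBOutput(AND3, registered=0)
reg = CLBOutput(AND3, registered=1)

print(comb.output(1, 1, 1))   # 1: follows the inputs immediately
print(reg.output(1, 1, 1))    # 0: flip-flop has not been clocked yet
reg.clock(1, 1, 1)
print(reg.output(0, 0, 0))    # 1: holds the value captured at the clock edge
```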
Sequential Circuits Mapped to FPGA
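[Figure: (a) sequential circuit with inputs a, b, c, d, internal signals t, u, v, and outputs F, G; (b) the circuit mapped to an FPGA with two CLBs and a switch matrix. Inputs a, b, c enter on pins P0-P2 and d on P3; outputs F and G leave on Q0 and Q1.]

addr | first CLB Mem. (D1 D0) | second CLB Mem. (D1 D0)
 0   |          00            |          00
 1   |          01            |          01
 2   |          00            |          01
 3   |          01            |          01
 4   |          00            |          10
 5   |          01            |          11
 6   |          10            |          11
 7   |          01            |          11

CLB output mux configuration bits: first CLB 1, 1; second CLB 0, 0.
Switch matrix 2-bit memories: o0 = 00, o1 = 01, o2 = 10.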
FPGA Internals: Overall Architecture
Consists of hundreds or thousands of CLBs and switch
matrices (SMs) arranged in regular pattern on a chip
[Figure: grid of CLBs and switch matrices (SMs) joined by routing channels. Callouts: each channel line represents a channel with tens of wires; connections are shown for just one CLB, but all CLBs are connected to channels.]
Programming an FPGA
All configuration memory bits are connected as one big shift register
  - Known as scan chain

[Figure: FPGA with two CLBs (8x2 Mem. each) whose configuration memory bits are linked into a single chain; labels Pin and Pclk mark the programming pins.]
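Programming through a scan chain amounts to clocking the bit stream into a shift register one bit at a time. The sketch below models that; the function names and the 4-bit chain are illustrative, and treating Pin as the serial data input and Pclk as the shift clock is an assumption based on the figure labels.

```python
def shift_in(chain, bit):
    """One Pclk edge: the bit on Pin enters the first cell, all others move down one."""
    return [bit] + chain[:-1]


def program(chain_length, bitstream):
    """Clock a whole bit stream into an initially cleared scan chain."""
    chain = [0] * chain_length
    for bit in bitstream:
        chain = shift_in(chain, bit)
    return chain


print(program(4, [1, 0, 1, 1]))   # [1, 1, 0, 1]
```

The first bit sent ends up in the last cell of the chain, so in this model the bit stream must be ordered to match the position of each configuration bit.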