Computer Science 146
Computer Architecture
Spring 2004
Harvard University
Instructor: Prof. David Brooks
dbrooks@eecs.harvard.edu
Lecture 17: Main Memory
Course Outline Revisited
W1   Feb 4      Introduction                                     Ch. 1
W2   Feb 9/11   Measuring Perf.; ISA Design                      Ch. 2, A.1-3
W3   Feb 16/18  Holiday/No Class; Basic Pipelining               A.4-11, Ch. 3
W4   Feb 23/25  Multicycle/Scoreboard; Tomasulo's Algorithm      Ch. 3
W5   Mar 1/3    Branch Pred./Fetch; Mult. Issue/Speculation      Ch. 3
W6   Mar 8/10   Processor Case Studies; Static Issue             Ch. 3/4
W7   Mar 15/17  Static ILP vs. HW ILP; IA64 Study/Review         Ch. 4
W8   Mar 22/24  IA64 Study/Review; Midterm
W9   Mar 29/31  Spring Break
W10  Apr 5/7    Caches; Caches                                   Ch. 5
W11  Apr 12/14  Caches; Main Memory                              Ch. 5
W12  Apr 19/21  Virtual Memory; Shared Memory MPs                Papers
W13  Apr 26/28  Multithreading (SMT/MP); Storage, I/O, Clusters  Ch. 6/7
W14  May 3/5    Security Processors; Network/GPU Processors      Papers
Lecture Outline
Main Memory
Main Memory Background
Random Access Memory
Different flavors at different levels
Physical Makeup (CMOS, DRAM)
Low Level Architectures (FPM,EDO,SDRAM,RAMBUS)
Cache uses SRAM: Static Random Access Memory
No refresh (6 transistors/bit vs. 1 transistor/bit)
Size: DRAM/SRAM = 4-8x
Cost and cycle time: SRAM/DRAM = 8-16x
Main Memory is DRAM: Dynamic Random Access Memory
Dynamic since needs to be refreshed periodically (8 ms, 1% time)
Addresses divided into 2 halves (Memory as a 2D matrix):
RAS or Row Access Strobe
CAS or Column Access Strobe
Static RAM (SRAM)
Six transistors in a cross-connected fashion
Provides regular and inverted outputs
Implemented in a CMOS process
(Figure: single-port 6-T SRAM cell)
Dynamic RAM
SRAM cells exhibit high speed but poor density
DRAM: simple transistor/capacitor pairs in high-density form
(Figure: DRAM cells with word lines, a shared bit line, and a sense amp)
DRAM Operations
Write
  Charge the bit line HIGH or LOW and set the word line HIGH
Read
  Bit line is precharged to a voltage halfway between HIGH and LOW, then the word line is set HIGH
  Depending on the charge in the capacitor, the precharged bit line is pulled slightly higher or lower
  Sense amp detects the change
Reads are destructive (must follow with a write)
Must refresh the capacitor every so often
Access Time = time to read
Cycle Time = time between reads
(Figure: DRAM cell with word line, bit line, and sense amp)
DRAM logical organization
Square row/column matrix
Multiplex address lines to save pins
Internal row buffer
Put row address on the address lines
Set RAS
Read row into the row buffer
Put column address on the address lines
Set CAS
Read column bits out of the row buffer
(Figure: 2,048 x 2,048 memory array with row decoder, column decoder, sense amps & I/O, 11-bit address buffer, and word-line/storage-cell detail)
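As a concrete illustration of the row/column multiplexing above, here is a minimal C sketch (my own, not from the slide) that splits a 22-bit word address into an 11-bit row half (sent with RAS) and an 11-bit column half (sent with CAS) for a 2,048 x 2,048 array; the exact bit assignment is an assumption for illustration.

#include <stdio.h>

/* Assumed split: high 11 bits = row (RAS phase), low 11 bits = column (CAS phase).
 * Real DRAMs may permute the address bits differently. */
#define ROW_BITS 11
#define COL_BITS 11

int main(void) {
    unsigned addr = 0x2ABCD;                           /* an example 22-bit word address */
    unsigned row  = addr >> COL_BITS;                  /* driven while RAS is asserted */
    unsigned col  = addr & ((1u << COL_BITS) - 1);     /* driven while CAS is asserted */
    printf("addr=0x%05X -> row=%u, col=%u\n", addr, row, col);
    return 0;
}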
Vanilla DRAM Read
(Timing diagram of a basic DRAM read, showing access time and cycle time; timing diagrams from the Ars Technica RAM Guide)
Fast Page DRAM
Extended Data Out (EDO) DRAM
Synchronous DRAM
DDR SDRAM: transmit data on both clock edges
Comparison with SRAM
By its nature, DRAM isn't built for speed
  Response times depend on capacitive circuit properties, which get worse as density increases
  The DRAM process isn't easy to integrate into a standard CMOS process
SRAM:
  Optimized for speed (8x-16x vs. DRAM), not density
  Bits are not erased on a read
  No refresh, so access time = cycle time
Main Memory Organizations
Simple:
  CPU, cache, bus, and memory are all the same width (32 or 64 bits)
Wide:
  CPU/mux is 1 word; mux/cache, bus, and memory are N words (Alpha: 64 bits & 256 bits; UltraSPARC: 512 bits)
Interleaved:
  CPU, cache, and bus are 1 word; memory is N modules (4 modules here); the example is word-interleaved
Main Memory Configurations
Simple Main Memory
32-bit DRAM (1 word of data at a time)
Access time: 2 cycles (A)
Transfer time: 1 cycle (T)
Cycle time: 4 cycles (B = cycle time - access time)
Miss penalty for a 4-word block?
Simple Main Memory
(Timing diagram: per-cycle address and memory activity, showing access, transfer (T), and busy (B) phases for each of the four words)
4-word access = 15 cycles
4-word cycle = 16 cycles
How to improve?
  Lower latency? A, B, T are fixed
  Higher bandwidth?
Bandwidth: Wider DRAMs
(Timing diagram: same parameters with a 64-bit DRAM and bus, so only two accesses are needed)
64-bit DRAM instead
4-word access = 7 cycles
4-word cycle = 8 cycles
64-bit buses are more expensive (Pentium vs. 486)
Bandwidth: Interleaving/Banking
Use multiple DRAMs and exploit their aggregate bandwidth
Each DRAM is called a bank
M 32-bit banks
Word A in bank (A % M) at (A div M)
Simple interleaving: banks share address lines
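A minimal C illustration (mine, not from the slide) of this mapping for M = 4 word-interleaved banks, using the 4-word block starting at word address 12 that appears in the timing diagrams on the following slides:

#include <stdio.h>

#define M 4   /* number of word-interleaved 32-bit banks */

int main(void) {
    /* bank = A mod M, offset within bank = A div M */
    for (unsigned a = 12; a < 16; a++)
        printf("word %2u -> bank %u, offset %u\n", a, a % M, a / M);
    /* Words 12..15 land in banks 0..3, so a sequential 4-word block
     * can be accessed by all four banks in parallel. */
    return 0;
}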
Simple Interleaving
(Timing diagram: one address broadcast to all four banks, parallel access, then one word transferred per cycle over the bus)
4-word access = 6 cycles
4-word cycle = 4 cycles
Can start a new access in cycle 5
Overlap access with transfer (and still use a 32-bit bus!)
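As a sanity check on the cycle counts quoted for the simple, wide, and interleaved organizations, here is a small C sketch (my own reading of the diagrams, not taken from the slides) assuming A = 2, T = 1, a 4-cycle bank cycle time, and that back-to-back accesses to one bank must be a full cycle time apart:

#include <stdio.h>

enum { A = 2, T = 1, CYCLE = 4 };   /* access, transfer, bank cycle time (in bus cycles) */

/* One bank handling n back-to-back accesses (simple 32-bit or wide 64-bit memory). */
static int serial_access(int n) { return (n - 1) * CYCLE + A + T; }
static int serial_cycle(int n)  { return n * CYCLE; }

/* Simple interleaving, sequential block: banks access in parallel, then one word per cycle. */
static int interleaved_access(int words) { return A + words * T; }

int main(void) {
    printf("32-bit simple:      access=%2d  cycle=%2d\n", serial_access(4), serial_cycle(4));   /* 15, 16 */
    printf("64-bit wide:        access=%2d  cycle=%2d\n", serial_access(2), serial_cycle(2));   /*  7,  8 */
    printf("4-bank interleaved: access=%2d  cycle=%2d\n", interleaved_access(4), CYCLE);        /*  6,  4 */
    return 0;
}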
Complex Interleaving
Simple interleaving: banks share address lines
Complex interleaving: banks are independent
More expensive (separate address lines for each bank)
(Figure: simple interleaving with one shared address bus to banks B0-B3 vs. complex interleaving with a separate address per bank; the data bus is shared in both)
Complex Interleaving
(Timing diagram: independent banks, addresses 12-15 issued on successive cycles)
4-word access = 6 cycles
4-word cycle = 4 cycles
Same as simple interleaving
Simple Interleaving (Non-Sequential)
(Timing diagram: stride-3 word addresses 12, 15, 18, 21; the shared address lines limit overlap between accesses)
Non-sequential access, e.g. stride = 3
4-word access = 4-word cycle = 12 cycles
Complex Interleaving (Non-Sequential)
(Timing diagram: stride-3 word addresses 12, 15, 18, 21 issued to independent banks on successive cycles)
4-word access = 6 cycles
4-word cycle = 4 cycles
DMA (I/O), Multiprocessors are non-sequential
Want more banks than words in a cache line
Multiple cache misses in parallel (non-blocking caches)
Interleaving Problem
(Timing diagram: power-of-2 stride, e.g. word addresses 12, 20, ...; every access hits the same bank)
Power-of-2 strides are a problem: all addresses, same bank
4-word access = 15 cycles, 4-word cycle = 16 cycles
Solution: use a prime number of banks (e.g. 17)
Avoiding Bank Conflicts
Lots of banks
int x[256][512];
for (int j = 0; j < 512; j = j + 1)
    for (int i = 0; i < 256; i = i + 1)
        x[i][j] = 2 * x[i][j];   /* walks down a column: successive accesses are 512 words apart */
Even with 128 banks, since 512 is a multiple of 128, word accesses conflict
SW: loop interchange, or declare the array dimensions not a power of 2 (array padding)
HW: add more banks, or use a prime number of banks
  bank number = address mod number of banks
  address within bank = address / number of words in bank
  Modulo and divide on every memory access with a prime number of banks?
  Trick: address within bank = address mod number of words in bank
  Easy if there are 2^N words per bank (just the low-order address bits); the bank number is still address mod the prime number of banks
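A small C check (mine, not from the slides) of why this trick is safe: with a prime number of banks (17 here) and a power-of-2 number of words per bank, taking the bank number as address mod 17 and the within-bank index as just the low address bits still gives every address its own slot; this is the Chinese-remainder-theorem argument behind the shortcut.

#include <stdio.h>

#define BANKS          17    /* prime number of banks */
#define WORDS_PER_BANK 64    /* 2^6 words per bank */

int main(void) {
    static char used[BANKS][WORDS_PER_BANK];             /* zero-initialized */
    for (unsigned addr = 0; addr < BANKS * WORDS_PER_BANK; addr++) {
        unsigned bank   = addr % BANKS;                   /* mod with the prime */
        unsigned offset = addr & (WORDS_PER_BANK - 1);    /* low bits only, no divide */
        if (used[bank][offset]) { printf("collision at address %u\n", addr); return 1; }
        used[bank][offset] = 1;
    }
    printf("all %d addresses map to distinct (bank, offset) pairs\n", BANKS * WORDS_PER_BANK);
    return 0;
}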
Independent Memory Banks
How many banks?
Number of banks >= number of clocks to access a word in a bank
  Needed for sequential accesses; otherwise the request returns to the original bank before it has the next word ready
Increasing DRAM density => fewer chips => fewer banks
4 banks/chip * 1 rank = 4 total banks
16 banks/chip * 8 chips = 128 banks
Independent Memory Banks
DIMM (Dual-Inline Memory Module) configuration
Banking occurs at the chip, module, and system levels
1 Rank of devices responds to each access
All devices respond similarly
Single-Sided DIMM
4 banks per device => DIMM has 4 banks
512MB DIMM = 8x64Mx8, 4 Banks
RAMBUS (RDRAM)
Protocol-based RAM w/ a narrow (16-bit) bus
High clock rate (400 MHz), but long latency
Pipelined operation
Multiple arrays w/ data transferred on both edges of the clock
(Figures: RAMBUS bank; RDRAM memory system)
RDRAM Timing
Independent Memory Banks
Standard PC Upgrade Path
Traditional DIMMs => 8 devices at a time with 8-bit chips
Rambus RIMMs => one at a time
Successful markets: PlayStation 2 (high bandwidth, small memory)
Rambus: 400 MHz, 16 bits per channel, 2 bits per clock
  1.6 GB/sec per channel (only 1 chip needed)
  2 Rambus channels in parallel => 3.2 GB/sec memory bandwidth
Traditional: PC100 SDRAM: 100 MHz, 1 bit per clock
Would need 32 chips to achieve 3.2GB/sec bandwidth
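(Arithmetic behind these figures: 400 MHz x 2 transfers/clock x 16 bits = 1.6 GB/sec per RDRAM channel, so two channels give 3.2 GB/sec; a PC100 chip with an 8-bit interface delivers 100 MHz x 1 byte = 100 MB/sec, so 32 such chips are needed to reach 3.2 GB/sec.)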
Interleaving Summary
Banks
Method to get high bandwidth with cheap (narrow) bus
Bandwidth determines memory capacity
Hard to make many banks from narrow DIMMs
  32 64-bit banks from 1x64MB DRAMs => 2048 DIMMs => 4GB
  Can't force customers to buy so much memory to get good bandwidth
Must use wider DRAMs
RAMBUS does better with small memory systems (PS2)
Big servers have lots of memory so traditional banking works
Next Time
Multiprocessors