16. INTRODUCTION TO DSP PROCESSORS
16.1 DSP Applications
Voice mail
Digital cameras
Navigation equipment
Modems
Audio production
Noise cancellation
Videoconferencing
Medical ultrasound
Pagers
Patient monitoring
Music synthesis, effects
Radar
611 37100 Lecture 16-4
16.1 DSP Applications
16.2 The Evolution of DSP Processors
16.3 The TMS320C6xxx Core CPU Architecture
16.4 The TMS320C6xxx Memory Architecture
16.5 The TMS320C6xxx Peripherals
16.6 The TMS320C6xxx Software Development
DSP tasks for microprocessors
Speech and audio compression
Filtering
Modulation and demodulation
Error correction coding and decoding
Servo control
Audio processing (e.g., surround sound, noise reduction, equalization, sample rate conversion, echo cancellation)
Signaling (e.g., DTMF detection)
Speech recognition
Signal synthesis (e.g., music, speech synthesis)
Image processing
Digital cellular phones
Satellite communications
Automated inspection
Seismic analysis
Vehicle collision avoidance
Secure communications
Voice over Internet
Tapeless answering machines
Motor control
Sonar
Consumer audio
Cordless phones
DSP tasks requirements
Repetitive numeric computations
Attention to numeric fidelity
High memory bandwidth, mostly via array accesses
Real-time processing
Common DSP Features
Data path configured for DSP
Specialized instruction set
Multiple memory banks and buses
Specialized addressing modes
Specialized execution control
Specialized peripherals for DSP
16.2 The Evolution of DSP Processors
Advantages and disadvantages
Advantages
Flexibility: Easy to modify and upgrade in software while using the same hardware
Reproducibility: The performance of a DSP can be repeated precisely from one unit to another
Reliability: The memory and logic of DSP hardware do not deteriorate with age
Complexity: Allows sophisticated applications
Disadvantages
Speed and cost
Longer hardware and software design time
Finite word-length errors
DSP manufacturers and products
Data path
DSP Processors
Specialized hardware performs all key arithmetic operations in 1 cycle
Hardware support for managing numeric fidelity:
Shifters
Guard bits
Saturation
General-Purpose Processor
Multiplies often take >1 cycle
Shifts often take >1 cycle
Other operations (e.g., saturation, rounding) typically take multiple cycles
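The numeric-fidelity features named on the data path slide (guard bits and saturation) can be illustrated with a small Python model. This is only a sketch under stated assumptions: the 16-bit Q15 format, the rounding constant, and the function names wrap16, sat16, and mac_q15 are illustrative choices, not taken from any particular processor.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def wrap16(value):
    """Keep only the low 16 bits (two's complement): what happens without saturation."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def sat16(value):
    """Clamp to the 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, value))

def mac_q15(samples, coeffs):
    """Sum Q15 products in a wide accumulator, then round and saturate once.

    Python integers never overflow, so the accumulator here plays the role
    of an accumulator with guard bits (an assumption of this model).
    """
    acc = 0
    for s, c in zip(samples, coeffs):
        acc += s * c              # each product is in Q30 format
    acc += 1 << 14                # round to nearest before dropping 15 bits
    return sat16(acc >> 15)       # back to Q15, clamped to 16 bits
```

For example, adding 30000 and 10000 wraps to a negative number with wrap16 but clamps to 32767 with sat16, which is usually the less damaging error in a signal path.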
General-purpose DSP market share
Forward Concepts, 1998
Instruction set
DSP Processors
Specialized, complex instructions
Multiple operations per instruction
mac x0,y0,a x:(r0)+,x0 y:(r4)+,y0
General-Purpose Processor
General-purpose instructions
Typically only one operation per instruction
mov *r0,x0
mov *r1,y0
mpy x0,y0,a
add a,b
mov y0,*r2
inc r0
inc r1
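The multiply-accumulate (MAC) pattern behind both instruction sequences above is the inner loop of an FIR filter. The following Python sketch shows that loop in its simplest form; the function names fir_output and fir_filter and the list-based delay line are illustrative assumptions, not part of the lecture material.

```python
def fir_output(coeffs, delay_line):
    """One output sample: a chain of multiply-accumulate (MAC) steps."""
    acc = 0
    for c, x in zip(coeffs, delay_line):
        acc += c * x              # one MAC per filter tap
    return acc

def fir_filter(coeffs, samples):
    """Run the filter over a sequence, newest sample first in the delay line."""
    delay_line = [0] * len(coeffs)
    outputs = []
    for s in samples:
        delay_line = [s] + delay_line[:-1]   # shift in the new sample
        outputs.append(fir_output(coeffs, delay_line))
    return outputs
```

A DSP processor typically performs each pass of the inner loop (multiply, add, two operand fetches, two pointer updates) in a single instruction, whereas the general-purpose sequence above needs several.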
Typical DSP algorithms
Memory architecture
DSP Processors
Harvard architecture
2-4 memory accesses/cycle
No caches; on-chip SRAM
General-Purpose Processor
Von Neumann architecture
Typically 1 access/cycle
Characteristics of DSP-based systems
Algorithms
Specifies the arithmetic operations to be performed
Sampling rates
The rate at which samples are consumed, processed, or produced
Clock rates
MIPS (Million Instructions per Second)
MFLOPS (Million Floating-point Operations per Second)
Numeric representations
Fixed-point numeric
Floating-point numeric
Addressing
DSP Processors
Dedicated address generation units
Specialized addressing modes; e.g.:
Autoincrement
Modulo (circular)
Bit-reversed (for FFT)
Good immediate data support
General-Purpose Processor
Often, no separate address generation unit
General-purpose addressing modes
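The modulo (circular) and bit-reversed addressing modes listed under Addressing can be sketched in a few lines of Python. This is a software model of what a dedicated address generation unit computes in hardware; the function names circular_next, bit_reverse, and bit_reversed_order are hypothetical, and the buffer length is assumed to be a power of two for the bit-reversed case.

```python
def circular_next(index, step, length):
    """Modulo addressing: advance a pointer and wrap inside a buffer of `length`."""
    return (index + step) % length

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def bit_reversed_order(n):
    """Access order for an n-point radix-2 FFT (n assumed to be a power of two)."""
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]
```

With a 4-entry circular buffer, stepping from index 3 by 1 returns to index 0, so a delay line never needs to be copied. For an 8-point FFT the bit-reversed order is 0, 4, 2, 6, 1, 5, 3, 7.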
Specialized peripherals for DSP
Synchronous serial ports
Parallel ports
Timers
On-chip A/D, D/A converters
Host ports
Bit I/O ports
On-chip DMA controller
Clock generators
First generation DSP (Texas Instruments TMS32010)
Introduced in 1982
16-bit fixed-point
Harvard architecture
Accumulator
Specialized instruction set
390 ns MAC time (228 ns today)
Processor DSP speed
Second Generation DSPs
Example: Motorola DSP56001 (1987)
24-bit data, instructions
3 memory spaces (X, Y, P)
Parallel moves
Single- and multi-instruction hardware loops
Modulo addressing
75 ns MAC (21 ns today)
Other second-generation processors: AT&T DSP16A, Analog Devices ADSP-2100, Texas Instruments TMS320C50
Third generation DSPs (1995)
Examples: Motorola DSP56301, TI TMS320C541 (1995)
Enhanced conventional DSP architectures
3.0 or 3.3 volts
More on-chip memory
Application-specific function units in data path or as coprocessors
More sophisticated debugging and application development tools
DSP cores (Pine and Oak from DSP Group, cDSP from TI)
20 ns MAC (10 ns today)
Architectural innovation mostly limited to adding application-specific function units and miscellaneous minor refinements.
Also, multiple processors/chip (TI TMS320C80, Motorola MC68356)
TI TMS320 family evolution
Fourth Generation DSP
TI TMS320C6201, Intel Pentium with MMX (1997-1998)
Today's top DSP performers adopt architectures far different from conventional DSP processor designs.
Blazing clock speeds and superscalar architectures make some general-purpose processors, such as the PowerPC 604e, top floating-point performers, despite lack of many DSP features.
Multimedia SIMD extensions, such as MMX, offer strong fixed-point performance on general-purpose processors.
VLIW-like architectures, such as that of the TI TMS320C6201, achieve top performance via high parallelism and increased clock speeds.
3 ns MAC throughput... but expensive, power-hungry
16.3 The TMS320C6xxx Core CPU Architecture
The block diagram of TMS320C6000
[Figure: CPU, memory, peripherals, and EMIF connected by the internal data and address buses]
Core CPU
Eight functional units
Data path
Control unit
32 32-bit registers
On chip memory
Data RAM
Program RAM
On chip peripherals
Internal buses
EMIF (External Memory Interface)
The program control unit
These units operate in an assembly line fashion and are necessary for the CPU to execute instructions:
Program fetch unit
Instruction dispatch unit
Instruction decode unit
Program fetch unit
To retrieve a fetch packet (FP), which is a group of eight instructions, four phases are required:
PG phase: the CPU generates a fetch address
PS phase: the CPU sends the address to the memory
PW phase: the CPU waits for the data to be ready
PR phase: the CPU reads the opcode
Instruction dispatch unit
Eight instructions are received if the memory is internal, or a single instruction if the memory is external.
[Figure: fetch phases PG, PS, PW, PR* reading from memory, followed by DP (1 cycle) and DC (1 cycle), feeding the 8 units]
The Central Processing Unit (CPU)
[Figure: Program Control Unit (program fetch, instruction dispatch, instruction decode); Data Path 1; Data Path 2; control registers; test, emulation, control, and interrupt logic]
CPU data paths
Two general-purpose register files (A and B)
16 32-bit registers (A0~A15) for file A
16 32-bit registers (B0~B15) for file B
Eight functional units
.L1, .S1, .M1, and .D1 for A
.L2, .S2, .M2, and .D2 for B
Two load-from-memory paths (LD1 and LD2)
Two store-to-memory paths (ST1 and ST2)
Two register file cross paths (1X and 2X)
Two data address paths (DA1 and DA2)
CPU data paths
[Figure: Program Control Unit; Data Path 1 with Register File A and units .L1 .S1 .M1 .D1; Data Path 2 with Register File B and units .L2 .S2 .M2 .D2; control registers; test, emulation, control, and interrupt logic]
Data cross paths
The data cross paths can also be referred to as the register file cross paths.
They allow operands from one side to cross to the other side.
Basic arithmetic instructions: MPY, ADD, SUB
Only one cross path per direction per execute packet is permitted.
The destination register is always on the same side as the unit used.
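The two data cross path rules above (one cross path per direction per execute packet, destination on the same side as the unit) can be checked mechanically. The Python sketch below is a simplified model, not an assembler: the tuple layout (unit, source registers, destination register) and the function name cross_path_report are assumptions made for illustration, and it only covers arithmetic-style instructions, not loads and stores.

```python
def cross_path_report(packet):
    """Check one execute packet against the data cross path rules.

    packet: list of (unit, sources, destination) tuples, for example
    (".L1", ["A1", "B2"], "A3"). Returns a list of rule violations.
    """
    used = {"1X": 0, "2X": 0}
    errors = []
    for unit, sources, dest in packet:
        side = "A" if unit.endswith("1") else "B"   # .x1 units sit on side A
        path = "1X" if side == "A" else "2X"        # cross path feeding that side
        if dest[0] != side:
            errors.append(f"{unit}: destination {dest} is not on side {side}")
        if any(src[0] != side for src in sources):
            used[path] += 1                         # instruction reads across
    for path, count in used.items():
        if count > 1:
            errors.append(f"cross path {path} used {count} times")
    return errors
```

A packet in which .L1 reads B2 and .S2 reads A5 passes, because each direction is used once; a packet in which both .L1 and .M1 read a B register is reported, because both would need the 1X path.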
CPU data paths and control
Address cross paths
The advantage of using an address cross path is that the address can be generated using one register file while the data is accessed from the other register file.
[Figure: address cross paths; .D1 and DA1 on data path 1 (Register File A: A1, *A0), .D2 and DA2 on data path 2 (Register File B: B1, *B0)]
Cross paths
Cross paths enable linking of one side of the CPU to the other.
Types of operand cross paths:
Data cross path
Address cross path
Address cross paths
Only one address cross path per direction per execute packet is allowed.
When an address cross path is used, the destination register of a load (LD) instruction and the source register of a store (ST) instruction come from the side opposite to the .D unit; the register pointers must still come from the same side as the .D unit used.
If both .D units are to be used, then either none or both of the address cross paths should be used.
Functional units
There are four functional units for each data path:
.L units
.M units
.S units
.D units
All the units operate on 32-bit operands and execute instructions simultaneously.
Eight functional units can be used in parallel to execute up to 8 instructions in a given cycle.
If the cycle time for one instruction is 5 ns (200 MHz clock), this results in 1600 MIPS of performance.
Functional units - .M units
.M units are hardware multiplier units capable of performing 16-bit by 16-bit multiplications producing 32-bit results (for the C62xx).
The operands may come from the 16 MSB or the 16 LSB of the registers used.
In the table, A and B are the 16 MSB, and a and b the 16 LSB, of registers a and b; the result goes to register c.
Instruction      Register c
MPY a,b,c        a x b
MPYH a,b,c       A x B
MPYHL a,b,c      A x b
MPYLH a,b,c      a x B
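The four .M unit multiplies differ only in which 16-bit half of each source register they use. The Python sketch below models that selection; it treats registers as 32-bit unsigned bit patterns and the halves as signed 16-bit values, and the helper names to_signed16, low16, and high16 are hypothetical.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def low16(reg):
    """16 LSB of a 32-bit register value."""
    return to_signed16(reg)

def high16(reg):
    """16 MSB of a 32-bit register value."""
    return to_signed16(reg >> 16)

def mpy(a, b):
    return low16(a) * low16(b)      # a x b

def mpyh(a, b):
    return high16(a) * high16(b)    # A x B

def mpyhl(a, b):
    return high16(a) * low16(b)     # A x b

def mpylh(a, b):
    return low16(a) * high16(b)     # a x B
```

With a = 0x00020003 and b = 0x00040005, the four results are 15, 8, 10, and 12, so two packed 16-bit values per register can be multiplied without unpacking them first.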
Functional units - .S units
The .S units contain 32-bit integer ALUs and 40-bit shifters
32-bit arithmetic, logic and bit field operations
32/40-bit shifts
Branches (.S2 only when using a register)
Register transfers to and from control registers (.S2 only)
Constant generation
Functional units - .L units
40-bit integer Arithmetic and Logic Units (ALUs)
32/40-bit arithmetic and compare operations
32-bit logical operations
Normalization and bit count operations
Saturated arithmetic for 32/40-bit operations
Functional units - .D units
Load and store with 5-bit constant offset
Load and store with 15-bit constant offset (.D2 only)
32-bit additions/subtractions
Linear and circular address calculation
Control registers
The C62xx devices have 10 registers for control purposes, while the C67xx have 13 control registers.
Reading and writing to the control registers can only be performed via the .S2 unit.
All the control registers can only be accessed by the MVC (move between the control file and the register file) instruction.
Register files
Each data path contains a register file composed of 16 32-bit general-purpose registers (A0-A15 for data path 1 and B0-B15 for data path 2).
These registers can support 32- and 40-bit fixed-point data, or 64-bit double-precision floating-point data in the case of the C67xx.
The general-purpose registers can be used for:
Data
Data address pointers
Condition registers
16.4 The TMS320C6xxx Memory Architecture
Memory map
Memory map 0 (direct execution)
Internal program RAM: 64K/256K from 01400000H
Internal data RAM: 64K/256K from 80000000H
Internal peripherals: 8MB from 01800000H
Expansion bus: 1GB from 40000000H on C62
External memory space (CE0-CE3)
Memory map 1 (boot mode)
Internal program RAM: 64K/256K from 01400000H
Internal data RAM: 64K/256K from 80000000H
Internal peripherals: 8MB from 01800000H
Expansion bus: 1GB from 40000000H on C62
External memory space (CE0-CE3)
Control registers: extension for floating-point operations
Data memory access
The CPU and DMA controller requests are made through the Data MEMory Controller (DMEMC).
The DMA controller only uses the DMEMC for internal data memory access.
The CPU uses the DMEMC for data requests to internal data memory as well as to on-chip peripherals or to the External Memory InterFace (EMIF).
For CPU/DMA access, the arbitration is performed by the DMEMC.
Cache architecture
CPU-DMEMC memory bank connections (C6201 Rev.2)
External memory interface (EMIF)
The external memory interface (EMIF) connects the CPU and external memory, such as synchronous dynamic RAM (SDRAM), synchronous burst static RAM (SBSRAM), and asynchronous memory.
The EMIF also provides 8-bit-wide and 16-bit-wide memory read capability to support low-cost boot ROM memories (flash, EEPROM, EPROM, and PROM).
The EMIF supports high-throughput interfaces to SDRAM, including burst capability.
Internal memory
The amount and location of internal memory depend on the particular device. The C6201, C6202, and C6701 have separate program and data memories (Harvard architecture), while the C6211 has a portion of its internal memory that may be used for either program or data.
16.5 The TMS320C6xxx Peripherals
TMS320C6000 Peripherals
Direct Memory Access (DMA) controller
The direct memory access (DMA) controller transfers data between regions in the memory map without intervention by the CPU.
The DMA allows movement of data to and from internal memory, internal peripherals, or external devices to occur in the background of CPU operation.
The DMA has four independently programmable channels allowing four different contexts for operation. In addition, a fifth (auxiliary) channel allows the DMA to service requests from the host-port interface (HPI) or the Expansion Bus (XB).
Boot configuration logic
The C62x and C67x devices provide a variety of boot configurations for proper device initialization. These configurations determine what actions the C62x/C67x performs after device reset to prepare for initialization. These boot configurations, which are set by external input pins, determine:
The memory map the device selects. The memory map determines whether internal or external memory is mapped at address 0.
The type of external memory at address 0 (if external memory is mapped at address 0)
The boot process used to initialize the memory at address 0 before the CPU is allowed to run.
Host-Port Interface (HPI)
The Host-Port Interface (HPI) is a 16-bit-wide parallel port through which a host processor can directly access the CPU's memory space.
The host device functions as a master to the interface, which increases ease of access.
The host and CPU can exchange information via internal or external memory. The host also has direct access to memory-mapped peripherals.
The HPI is connected to the internal memory via a set of registers. Either the host or the CPU may use the HPI Control register (HPIC) to configure the interface.
The host can access the host address register (HPIA) and the host data register (HPID) to access the internal memory.
The HPIC is a memory-mapped register, which allows the CPU access.
Boot configuration logic
Three types of boot processes are available:
No boot process (direct-execution startup): The CPU simply starts running from the memory located at address 0. When this memory location resides on SDRAM, the CPU is held until SDRAM initialization finishes.
ROM boot process: A section of external memory is copied to address 0 by the DMA/EDMA controller. Although the boot process begins when the device is released from external reset, this transfer occurs while the CPU is held in reset internally.
Host-boot process: In the host-boot process, the CPU is held in reset while the remainder of the device is released from reset. During this period, an external host can initialize the CPU's memory space as necessary through the HPI or expansion bus.
Multichannel Buffered Serial Port (McBSP)
The standard serial port interface provides:
Full-duplex communication
Double-buffered data registers, which allow a continuous data stream
Independent framing and clocking for reception and transmission
Direct interface to industry-standard codecs, analog interface chips (AICs), and other serially connected A/D and D/A devices
External shift clock generation or an internal programmable frequency shift clock
Internal timers
The C62x/C67x has two 32-bit general-purpose timers that you can use to:
Time events
Count events
Generate pulses
Interrupt the CPU
Send synchronization events to the DMA controller
The timer has two signaling modes and can be clocked by an internal or an external source.
The timer has an input pin (TINP) and an output pin (TOUT). The TINP pin can be used as a general-purpose input, and the TOUT pin can be used as a general-purpose output.
With an internal clock, the timer can signal an external A/D converter to start a conversion, or it can trigger the DMA controller to begin a data transfer.
With an external clock, the timer can count external events and interrupt the CPU after a specified number of events.
Multichannel Buffered Serial Port (McBSP) Registers
Interrupts
The C62x/C67x CPU has 14 interrupts. These are reset, the nonmaskable interrupt (NMI), and interrupts 4-15. These interrupts correspond to the RESET, NMI, and INT4-INT15 signals on the CPU boundary.
In some C62x/C67x devices, these signals may be tied directly to pins on the device, connected to on-chip peripherals, or may be disabled permanently by being tied inactive on chip.
Generally, RESET and NMI are connected directly to pins on the device.
Characteristics of interrupt servicing include:
The IACK pin from the CPU is used to acknowledge an interrupt request.
The INUM0-INUM3 pins indicate which interrupt vector is being serviced.
Interrupt vectors are relocatable.
Interrupt vectors consist of one fetch packet, which provides for quick servicing.
Interrupt control registers
When an interrupt occurs, the CPU automatically recognizes the source of the interrupt and jumps to the interrupt vector location.
Each vector location can accommodate eight instructions, which correspond to a fetch packet. Such a location is known as the Interrupt Service Fetch Packet (ISFP) address.
When the CPU branches to the ISFP address, the interrupt service routine (ISR) can be fitted in a single fetch packet or in multiple (successive or non-successive) fetch packets.
Interrupt Service Table (IST)
Interrupt source    ISFP address
Reset               0x0000
NMI                 0x0020
Reserved            0x0040
Reserved            0x0060
INT4                0x0080
INT5                0x00A0
INT6                0x00C0
INT7                0x00E0
INT8                0x0100
INT9                0x0120
INT10               0x0140
INT11               0x0160
INT12               0x0180
INT13               0x01A0
INT14               0x01C0
INT15               0x01E0
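The ISFP addresses in the Interrupt Service Table follow a regular pattern: each entry is one fetch packet of eight 32-bit instructions, so consecutive entries are 0x20 bytes apart. The Python sketch below reproduces the table from that rule. The names isfp_offset and isfp_address are hypothetical, and the table_base parameter is an assumption standing in for wherever a relocated table is placed.

```python
INSTRUCTIONS_PER_FETCH_PACKET = 8
BYTES_PER_INSTRUCTION = 4          # C6000 instructions are 32 bits wide
FETCH_PACKET_BYTES = INSTRUCTIONS_PER_FETCH_PACKET * BYTES_PER_INSTRUCTION

# Slot order as listed in the Interrupt Service Table.
SOURCES = ["Reset", "NMI", "Reserved", "Reserved"] + [f"INT{n}" for n in range(4, 16)]

def isfp_offset(slot):
    """Byte offset of the ISFP for table slot 0..15."""
    if not 0 <= slot < len(SOURCES):
        raise ValueError(f"slot must be 0..{len(SOURCES) - 1}, got {slot}")
    return slot * FETCH_PACKET_BYTES

def isfp_address(slot, table_base=0):
    """ISFP address when the table starts at table_base (hypothetical parameter)."""
    return table_base + isfp_offset(slot)
```

For instance, INT4 occupies slot 4, so its offset is 4 x 0x20 = 0x80, matching the table entry.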
16.6 The TMS320C6000 Software Development
Instruction set
Instruction syntax
Label
Mnemonic
Instruction
Directive
Unit specifier
Operand field
A register (e.g. A1) or a register pointer (e.g. *A1)
A symbol (e.g. loop) or constant (e.g. 390)
An expression (e.g. if label >= 100)
Comment field
Example of assembly language syntax
Pipeline operation
The pipeline phases are divided into three stages:
Fetch
Decode
Execute
All instructions in the C62x/C64x instruction set flow through the fetch, decode, and execute stages of the pipeline.
The fetch stage of the pipeline has four phases for all instructions, and the decode stage has two phases for all instructions. The execute stage of the pipeline requires a varying number of phases, depending on the type of instruction.
Fetch phases of the pipeline
The fetch phases of the pipeline are:
PG: Program address generate
PS: Program address send
PW: Program access ready wait
PR: Program fetch packet receive
Decode phases of the pipeline
The decode phases of the pipeline are:
DP: Instruction dispatch
DC: Instruction decode
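The four fetch phases and two decode phases can be laid out as a simple timeline to see how successive fetch packets overlap. The Python sketch below is an idealized model that assumes one new fetch packet enters per cycle with no stalls; the execute phase labels E1, E2, ... and the function names phases and schedule are assumptions made for illustration.

```python
FETCH_PHASES = ["PG", "PS", "PW", "PR"]
DECODE_PHASES = ["DP", "DC"]

def phases(execute_phases=1):
    """Ordered pipeline phases for an instruction with the given execute length."""
    execute = [f"E{i}" for i in range(1, execute_phases + 1)]
    return FETCH_PHASES + DECODE_PHASES + execute

def schedule(num_packets, execute_phases=1):
    """Map cycle number to phase for each packet, one packet entering per cycle."""
    sequence = phases(execute_phases)
    return [
        {start + offset: phase for offset, phase in enumerate(sequence)}
        for start in range(num_packets)
    ]
```

In this model the first packet reaches its first execute phase in cycle 6 (counting from 0) while the second packet is still in DC, which is the overlap that lets the pipeline complete one packet per cycle once it is full.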
Execution phases of the pipeline
Operations occurring during pipeline phases
Pipeline operation: One execute packet per fetch packet
Software development tools
A complete development tool set for both the PC and Sun workstations includes the following:
C compiler
Assembly optimizer
Assembler
Linker
Evaluation tools
The C6000 platform's C compiler eliminates the need for extensive knowledge of DSP architecture while the inherent performance benefits of the advanced VLIW architecture are maintained.
Process to develop DSP code phase 1
Process to develop DSP code phase 2
Process to develop DSP code phase 3
TMS320C6201 Evaluation Module, EVM
The C6x EVM is a full-size PCI board with approximate dimensions of 4.2 inches wide, 12.28 inches long, and 0.49 inches high.
The C6x EVM can be operated stand-alone on a desktop with the use of an external power supply and XDS510 or XDS510WS emulator.
The C6x EVM has a C6201 or C6701 DSP onboard that allows full-speed verification of C6x code with the included source debugger.
EVM features
Uses the TMS320C6201 DSP processor
DSP clock: 32.25 MHz, 50 MHz, 133 MHz or 200 MHz.
External memory:
64K x 32, 133 MHz SBSRAM
1M x 32, 100 MHz SDRAM (bank 0)
1M x 32, 100 MHz SDRAM (bank 1)
Expansion memory interface provided
PCI or JTAG Emulator Interface
Stereo codec interface (44.1 kHz sampling)
Code generation tools and support library included
C6000 EVM