Design of 32 Bit RISC V Processor
Abstract: This paper presents the design and implementation of a RISC-V processor core with a single-stage architecture, focusing on the execution of the base RV32I instruction set. The processor core features a 32-bit address and data architecture, with 32 general-purpose registers for data manipulation. The design adheres strictly to the RISC-V ISA specifications, ensuring compatibility with existing software and toolchains. Key components of the processor core include a fetch unit, decode unit, ALU unit, register file, memory control unit, and instruction memory unit. Through rigorous design and simulation, the processor core achieves single-clock-cycle execution, processing instructions at an operating frequency of 2.2 kHz. The instruction memory unit retrieves instructions from a 4.5 KB memory array, while simultaneous read and write capabilities are supported for both registers and data memory. Additionally, 13 pseudo-instructions have been incorporated to enhance programming ease and readability. Performance evaluation of the processor core demonstrates its effectiveness in executing a variety of computational tasks, showcasing its reliability and efficiency within the specified operational parameters. The design serves as a foundation for future enhancements, including the integration of pipeline architecture, branch prediction mechanisms, and frequency optimization.

Keywords: PUF, Modeling attacks, Vulnerability, Enhanced Security, Unpredictable responses

I. INTRODUCTION

In the realm of processor architecture, this project delves into the design and realization of a 32-bit processor core adhering to the RISC-V RV32I ISA. The RISC-V architecture, known for its simplicity and versatility, serves as the foundational framework, encompassing the initial 37 instructions from the base instruction set. This endeavor stands as a testament to the intricate interplay between hardware design, assembly language, and simulation technologies. The processor core design is executed in Verilog, a hardware description language, with the customasm assembler providing flexibility and customizability. Simulation is facilitated through EDA Playground and Verilog, while the Digital open-source logic designer and simulator serves as the principal platform for comprehensive testing.

The core architecture avoids the complexities of a pipelined structure, opting for the straightforward approach of executing a single instruction per cycle. Memory is divided into separate 32-bit addressable spaces for program and data, each served by a dedicated bus to optimize data transfer efficiency. Instruction decoding is handled by a dedicated decoder, which identifies the instruction type from the opcode. Instructions are then routed to type-specific decoders, which generate the control signals and retrieve data from registers and memory. Execution units, in turn, operate on the acquired data to compute results, which are then written back to registers or memory, as dictated by the instruction.

II. LITERATURE REVIEW

Andrew Waterman [10] presented the RISC-V Instruction Set Manual, a foundational document that outlines the principles and design goals of the RISC-V instruction set architecture (ISA). Published by a team from the University of California at Berkeley, it serves as a comprehensive guide for computer architecture research and education. The RISC-V ISA is characterized by its open nature, simplicity, and efficiency, aiming to provide a realistic yet open ISA suitable for various microarchitecture styles and implementation
technologies. It supports both 32-bit and 64-bit address spaces, highly-parallel multicore implementations, and an efficient instruction encoding scheme with variable-length instructions. Additionally, the manual highlights the support for the revised IEEE 754 floating-point standard and emphasizes the importance of being fully virtualizable.

Diefendorff [1] proposed the AltiVec extension to the PowerPC architecture, known at Apple as the Velocity Engine, to enhance multimedia processing performance in future PCs. This extension is a SIMD (single-instruction, multiple-data) addition to general-purpose processors, specifically designed to cater to multimedia-rich applications like audio and video compression, image processing, 3D graphics, speech recognition, and more. Unlike other extensions that overload floating-point registers for multimedia data, AltiVec introduces a dedicated large register file exclusively for multimedia data, treating it as first-class data in the form of vectors. The design of AltiVec prioritizes high functionality over backward compatibility, aiming to significantly boost processing speed in critical loops that handle large input data sets commonly found in signal and image processing applications. The AltiVec extension's capabilities include full-range data-type support, a four-operand nondestructive instruction format, a powerful SIMD instruction set, and efficient data reorganization capabilities like permute operations. Overall, AltiVec's advancements enable media-rich applications to run efficiently on general-purpose PowerPC microprocessors without the need for specialized media processors or dedicated hardware accelerators.

John M. Frankovich [11] presented the Lincoln TX-2 computer, built at the Massachusetts Institute of Technology Lincoln Laboratory, a large-scale digital computer that incorporates new memory and circuit components along with innovative logical design concepts. This computer is intended for use as a research tool in scientific computations, data-handling, and real-time applications, reflecting both the available components and the intended application. The TX-2 is part of a series of experimental computers developed at the Lincoln Laboratory, emphasizing large-scale digital systems suitable for real-time control. It builds upon its predecessors, Whirlwind I and the Memory Test Computer, by introducing new developments in components, circuits, memories, and logical organization. The input-output system of the Lincoln TX-2 computer is designed with various devices suitable for research and control applications, allowing multiple input-output devices to operate simultaneously.

J. Goodacre and A.N. Sloss [12] authored "Parallelism and the ARM Instruction Set Architecture", which delves into the evolution of the ARM reduced-instruction-set computing (RISC) processor, which has advanced to encompass a range of chips, including multiprocessors. The ARM architecture has adapted to meet the escalating demands for performance in embedded applications, incorporating new technologies to enhance efficiency. Throughout its development, ARM has employed various techniques to exploit parallelism effectively, such as variable execution time, subword parallelism, digital signal processor-like operations, thread-level parallelism, exception handling, and multiprocessing. By leveraging parallelism at multiple levels, ARM's innovative chip designs have the potential to revolutionize technology accessibility. With over 1.5 billion ARM processors sold annually and a rapidly expanding market, software developers now have extensive opportunities to deploy ARM code across diverse sectors.

Zhang [13] conducted an in-depth examination of the RISC-V Instruction Set Architecture (ISA), emphasizing its adaptability and scalability. They stressed the crucial role of comprehending ISA specifications in the context of optimizing microarchitecture design. By elucidating the flexibility and extensibility of the RISC-V ISA, the study underscored how designers can tailor instruction sets to suit diverse application needs. Additionally, the authors emphasized the necessity of aligning microarchitectural features with the intricacies of the RISC-V ISA, illustrating how a deep understanding of ISA specifications informs decisions regarding pipeline structure, instruction scheduling, and memory management. In summary, the study highlights the symbiotic relationship between ISA specifications and microarchitecture design, showcasing the significance of a comprehensive approach to processor optimization.

A novel pipeline architecture tailored for 32-bit RISC-V microprocessors has also been proposed, with a specific emphasis on enhancing performance through optimized instruction scheduling. The proposal underscored the critical role of microarchitectural optimizations in achieving efficient processor designs. By prioritizing instruction scheduling within their pipeline architecture, the authors aimed to minimize pipeline stalls and maximize instruction throughput, thereby improving overall processor performance. The study highlights the importance of fine-tuning microarchitectural features to harness the full potential of RISC-V microprocessors, ultimately contributing to the development of more efficient computing systems [2].

Key challenges in the design of RISC-V microprocessors, including concerns related to power efficiency, security, and the accommodation of emerging workloads, are addressed in [3], [7]. To address these challenges, the authors proposed several solutions. Firstly, they suggested implementing
performance and functionality of the fetch unit have been thoroughly evaluated.

The waveform analysis of the fetch unit indicates its ability to increment the program counter (PC) with each clock cycle when enabled. This functionality aligns with the basic requirement of fetching sequential instructions for execution. Additionally, the fetch unit demonstrates robust behaviour in response to reset signals, promptly resetting the program counter to its initial state. Figure 4.1 shows the output of the fetch unit.

The decoder unit of the designed RISC-V processor core plays a pivotal role in interpreting the incoming instruction stream and preparing the necessary control signals for subsequent stages of execution. Through rigorous analysis and evaluation, the effectiveness and efficiency of the decoder unit can be assessed.

The decoder unit successfully identifies the type of instruction being processed, sorting instructions into distinct categories: register-type, immediate-type, load-type, store-type, branch-type, call-type, load-immediate-type, and jump-type. This categorization ensures proper handling of diverse instruction formats, facilitating seamless execution within the processor core. Figure 4.2 shows the output of the decoder unit.

Upon examination of the ALU unit's waveform, it is evident that the implemented operations, including addition, subtraction, logical left shift, logical right shift, arithmetic right shift, bitwise XOR, bitwise OR, and bitwise AND, are executed within the desired clock cycle. The waveform illustrates the timely generation of output data based on the specified ALU operation and input operands.

Furthermore, the ALU unit exhibits accurate handling of both signed and unsigned data, ensuring reliable computation across a wide range of applications. The inclusion of operations such as "set less than" and "set less than unsigned" enables efficient comparison of input data, facilitating conditional branching and decision-making processes within program execution. Figure 4.3 shows the output of the ALU unit.

Fig 4.3: Output of ALU Unit

The Register File unit, a critical component of the RISC-V processor core, serves as a storage mechanism for the 32 general-purpose registers, facilitating efficient data access and manipulation during program execution. Through rigorous testing and analysis, the Register File unit demonstrated robust functionality and performance in meeting the core's design objectives.

Upon examination of the waveform generated during simulation, it is evident that the Register File unit handles simultaneous read and write operations, ensuring seamless data flow within the processor core. The waveform illustrates the timely propagation of data from the designated source registers to the destination registers, validating the correctness of the read and write logic implemented within the unit. Figure 4.4 shows the output of the register file unit.

Fig 4.4: Output of Register File Unit

The instruction memory unit serves as a fundamental component of the RISC-V processor core, facilitating the retrieval of instructions stored in memory for execution. Upon evaluating the functionality of the instruction memory unit, it was observed that the implemented design successfully fetched instructions based on the provided address inputs.
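The behaviour reported above for the fetch unit, decoder, ALU, and instruction memory can be illustrated with a small behavioural model. The sketch below is written in Python rather than the Verilog used for the core, and it is not the authors' implementation. The class and function names, the PC step of 4 bytes, the word-indexed instruction list, and the way the category names from the text are attached to the standard RV32I opcodes (for example "call" to JALR and "load-immediate" to LUI/AUIPC) are assumptions made purely for illustration.

```python
"""Behavioural sketch of fetch, instruction memory, decode and ALU (illustrative only)."""

MASK32 = 0xFFFFFFFF  # keep every value within 32 bits


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


class FetchUnit:
    """Program counter that advances by one instruction per enabled cycle."""

    def __init__(self, reset_pc=0):  # reset_pc: assumed initial state
        self.reset_pc = reset_pc
        self.pc = reset_pc

    def tick(self, enable=True, reset=False):
        """Model one clock cycle and return the resulting PC."""
        if reset:
            self.pc = self.reset_pc
        elif enable:
            self.pc = (self.pc + 4) & MASK32  # assumed: byte-addressed, 4-byte step
        return self.pc


def read_instruction(memory, pc):
    """Return the word at byte address pc from a word-indexed list."""
    return memory[pc >> 2]


# Standard RV32I major opcodes (instruction bits 6:0). Which opcode the paper
# files under which category name is an assumption.
OPCODE_TYPES = {
    0b0110011: "register",
    0b0010011: "immediate",
    0b0000011: "load",
    0b0100011: "store",
    0b1100011: "branch",
    0b1101111: "jump",            # JAL
    0b1100111: "call",            # JALR (assumed mapping)
    0b0110111: "load-immediate",  # LUI (assumed mapping)
    0b0010111: "load-immediate",  # AUIPC (assumed mapping)
}


def instruction_type(instruction):
    """Classify a 32-bit instruction word by its 7-bit opcode field."""
    return OPCODE_TYPES.get(instruction & 0x7F, "unknown")


def alu(op, a, b):
    """Return the 32-bit result of one ALU operation on operands a and b."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F  # RV32I shifts use only the low 5 bits of the second operand
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "sll":
        return (a << shamt) & MASK32
    if op == "srl":
        return a >> shamt  # zero-filling shift
    if op == "sra":
        return (to_signed(a) >> shamt) & MASK32  # sign-filling shift
    if op == "xor":
        return a ^ b
    if op == "or":
        return a | b
    if op == "and":
        return a & b
    if op == "slt":
        return int(to_signed(a) < to_signed(b))  # signed comparison
    if op == "sltu":
        return int(a < b)  # unsigned comparison
    raise ValueError(f"unsupported ALU operation: {op}")
```

As a usage example, `instruction_type(0x002081B3)` (the encoding of `add x3, x1, x2`) falls in the register class. The signed and unsigned handling shows in the pair `alu("slt", 0xFFFFFFFF, 1)`, which gives 1 because the first operand is read as -1, and `alu("sltu", 0xFFFFFFFF, 1)`, which gives 0.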
The waveform analysis revealed a consistent and accurate retrieval of instructions from the memory array during each clock cycle. This indicates the proper functioning of the instruction memory unit in accessing the designated memory locations and delivering the corresponding instructions to downstream processing units. Figure 4.5 shows the output of the instruction memory unit.

Fig 4.5: Output of Instruction Memory Unit

V. CONCLUSION

In culmination, the development and evaluation of the RISC-V processor core represent a significant endeavour in the realm of computer architecture design. Through meticulous design choices and implementation efforts, a functional processor core has been realized, capable of executing a subset of RISC-V instructions with single-clock-cycle efficiency. The project's success hinges on the effective integration and coordination of various processing units, including the fetch unit, decode unit, ALU unit, register file, memory control unit, and instruction memory unit. Together, these components form a cohesive architecture that enables the execution of instructions within the specified operational parameters. The outcome is a functional RISC-V processor core that efficiently executes a subset of the RISC-V instructions in a single clock cycle.

Throughout the project, a thorough understanding of RISC-V architecture principles guided the design decisions, ensuring compliance with the base RV32I instruction set while also incorporating pseudo-instructions to enhance programmability and readability. The inclusion of simultaneous read and write capabilities for registers and data memory further enriches the processor core's functionality and versatility. The evaluation of the processor core's performance, as evidenced by waveform analysis and functional testing, demonstrates its reliability and efficiency in executing instructions within the specified frequency of 2.2 kHz. However, future iterations could explore avenues for performance optimization, such as increasing memory capacity, incorporating advanced instruction extensions, or implementing pipeline stages to exploit instruction-level parallelism.

In essence, the project underscores the iterative nature of processor design, wherein each iteration brings about refinements and enhancements aimed at achieving greater performance, efficiency, and versatility. As computing demands continue to evolve, the RISC-V processor core stands poised to adapt to the RISC-V architecture.

FUTURE SCOPE

While the current iteration of the processor core has demonstrated functionality within its design constraints, there exists a promising avenue for future enhancements and refinements to bolster its performance and capabilities. One significant aspect of future development involves transitioning the processor core from a single-stage architecture to a five-stage pipeline architecture. This transition will involve the introduction of distinct pipeline stages for fetch, decode, execute, memory operation, and write back to register. By breaking down the instruction execution process into discrete stages, the processor can exploit parallelism and achieve higher throughput, thereby enhancing overall performance.

In addition to pipeline implementation, incorporating a branch prediction module represents a critical advancement. Branch prediction techniques can mitigate the performance penalties associated with conditional branches by speculatively executing instructions based on predicted branch outcomes. This optimization can significantly enhance the efficiency of the processor core, especially in scenarios with frequent branch instructions. Furthermore, to address potential hazards arising from instruction dependencies within the pipeline, the inclusion of instruction stalls can preserve data dependencies and maintain program correctness. By managing instruction flow and resource utilization, instruction stalls can mitigate data hazards and control hazards, thus optimizing pipeline efficiency.

Another area of focus for future enhancements involves increasing the operating frequency of the processor core. By leveraging advancements in semiconductor technology and optimizing the design for higher clock speeds, the processor can achieve greater computational throughput and responsiveness, making it better suited for demanding computing tasks.

REFERENCES

[1] K. Diefendorff et al. (2000), 'AltiVec extension to PowerPC accelerates media processing', IEEE Micro, Vol. 20, No. 2, pp. 85-95.
[2] Anu Samanta et al. (2024), 'A Comprehensive Survey and Comparison on Pipelined RISC System Architectures', J. Electrical Systems, 20-3, pp. 2817-2825.
[3] Ravi Rajwar and James R. Goodman (2001), 'Speculative lock elision: enabling highly concurrent multithreaded execution', IEEE Computer Society, pp. 294-305.