A RISC-V SystemC-TLM simulator

Màrius Montón
marius.monton@uab.cat
Departament de microelectrònica i sistemes electrònics
Universitat Autònoma de Barcelona
Barcelona, Spain

arXiv:2010.10119v1 [cs.AR] 20 Oct 2020

ABSTRACT
This work presents a SystemC-TLM based simulator for a RISC-V microcontroller. The simulator focuses on simplicity and easy extensibility. It is built around a full RISC-V instruction set simulator (ISS) that supports the full RISC-V ISA and the extensions M, A, C, Zicsr and Zifencei.

The ISS is encapsulated in a TLM-2 wrapper that enables it to communicate with any other TLM-2 compatible module. The simulator also includes a very basic set of peripherals to form a complete SoC simulator. The code to be run can be compiled with standard tools and standard C libraries without modifications. The simulator correctly executes the riscv-compliance suite. The entire simulator is published as a Docker image to ease its installation and use by developers. A port of FreeRTOS v10.2.1 for the simulated SoC is also published.

CCS CONCEPTS
· Computer systems organization → Embedded hardware; High-level language architectures; · Hardware → Simulation and emulation.

KEYWORDS
RISC-V, SystemC, TLM-2.0, Simulation Infrastructure, ISS

1 INTRODUCTION
Many simulators have been published since the release of the first drafts of the RISC-V ISA [8]. These simulators use different techniques and technologies to meet different requirements: good performance, good visualization of the processor, architectural exploration, etc. Most of them conform to the RISC-V ISA specifications. Some of them build on an existing infrastructure, adapting the ISS to follow the RISC-V ISA and reusing peripherals that were already simulated [3, 4, 7, 19]. Others are written from scratch and include the ISS and a minimum set of peripherals [6, 11]. There are also FPGA-based simulators that increase performance and simulation speed [9] as well as the precision of the simulation results.

The Spike simulator is the most common simulator and is used as the reference model for the RISC-V ISA [6]. Other simulators are intended for graphical visualization of the entire execution of the instructions inside the CPU [15].

SystemC is a set of libraries for the C++ language that allows the description and simulation of hardware-based systems with an event-driven simulation model. These libraries add time management, concurrency and hardware-like data types to C++ [1].

Transaction Level Modelling (TLM) adds a layer to SystemC in order to model the interface between different modules in a lightweight way. This modeling technique uses transactions to abstract any kind of communication between modules, hiding or avoiding the details of the communication itself: a transaction is an access from a Master (called Initiator) to a Slave (called Target) at a memory address, with a length and some attributes. The Target responds to the transaction after some time (which can be 0 for basic modeling) and performs the write or read requested by the transaction. All other details of the transaction (bus access, signal changes, etc.) are not modeled. In more detailed modeling, the different phases of a bus access can be specified. Currently, the SystemC standard includes TLM modeling [1]. Modules can also exchange data using direct pointers to memory instead of transactions to increase simulation speed. This technique is named Direct Memory Interface (DMI).

TLM has boosted the interoperability between vendors' models and the appearance of many IPs that are interchangeable and fully compatible among different systems and vendors. The fundamental idea of this work is to bring all these features to a RISC-V simulator.

The source code of the entire project is open source and published [14].

The presented simulator is intended to be easy to use and simple to extend, with clear code, and able to simulate an entire SoC, like any embedded microcontroller on the market. To keep the code simple, meta-programming has been avoided and the use of C++ templates is kept as low as possible.

The paper is structured as follows: Section 2 describes the architecture of the entire simulator, Section 3 shows software particularities and tool-chain modifications, Section 4 shows simulation performance and compliance results, and Section 5 concludes the paper.

2 SIMULATOR ARCHITECTURE
One of the main goals of this simulator was to be easily extensible and modifiable. To achieve this objective, the original design was kept very simple and clear, using naive techniques and source code designed for simplicity.

The simulator architecture includes an ISS for the RV32I ISA [20], a bus controller, the main memory and peripherals. Communication between these modules is done through TLM-2 sockets (see Figure 1).

2.1 CPU
The ISS simulates a single hardware thread (HART) and includes privileged instructions. It is divided into three modules: Instruction, Execute and Registers:

• Instruction: Decodes instructions and checks for extensions. This module can access all fields of each instruction type (R, I, S, B, U and J type).
• Execute: Executes instructions, accessing registers and memory and performing operations. This module also executes the instructions of the M, A, C, Zicsr and Zifencei extensions [20].
• Registers: Implements the register file for the entire CPU, including the general-purpose registers (x0-x31), the program counter (PC) and all necessary Control and Status Registers (CSRs).

This CPU is a minimal, fully functional model with an endless loop that fetches and executes instructions, without pipeline, branch prediction or any other optimization technique. All instructions execute in a single cycle, but the model can easily be customized to use a per-instruction cycle count.

The Execute module implements each instruction with a class method that receives the instruction register. These methods perform all the steps necessary to execute the instruction. In the case of a branch instruction, these methods are able to change the PC value. For Load/Store instructions, the methods are in charge of accessing the required memory address.

The CPU is designed following a Harvard architecture, hence the ISS has separate TLM sockets to interface with external modules:

• Data bus: Simple initiator socket to access data memory.
• Instruction bus: Simple initiator socket to access instruction memory.
• IRQ line: Simple target socket to signal external IRQs.

2.2 Bus Controller
The simulator also includes a Bus controller in charge of the interconnection of all modules. The bus controller decodes the address of each access and forwards the communication to the proper module. In the current state of the project, it contains two target sockets (instruction and data buses) and three initiator sockets, one each for the Memory, Trace and Timer modules described below.

Figure 1: TLM diagram of the entire simulator

2.3 Peripherals
The Memory module simulates a simple RAM, which is the main memory of the SoC and acts as both instruction memory and data memory. This module can read a binary file in Intel HEX format obtained from the .elf file and load it as the main program for the ISS. The module is accessed through a Simple target socket that supports DMI to increase simulation speed.

The simulated SoC includes a very basic Timer module. This module includes two 64-bit registers mapped to four addresses. One of these registers (mtime) keeps the current simulated time with nanosecond resolution. The second register (mtimecmp) is intended to program a future IRQ. The module triggers an IRQ using its Simple initiator socket.

The Trace module is a very simple tracing device that outputs the characters it receives to an xterm window. This module is intended as a basic imitation of the ITM module of Cortex-M CPUs [10]. Figure 2 shows the simulator running with an xterm window as output console.

Figure 2: Simulator running with an xterm window as terminal

Two other modules are included in the simulator: Performance and Log. The Performance module gathers statistics of the simulation, such as instructions executed, registers accessed, memory accesses, etc. It dumps this information when the simulation ends. The Log module allows the simulator to create a log file with different levels of information.

At the maximum logging level, each executed instruction is logged to the file with the current time, the PC value, the instruction name and the register values or addresses accessed. Figure 3 shows an excerpt of a real log file.

Figure 3: Log file view
3 SOFTWARE IMPLEMENTATION AND TOOLCHAIN
The entire simulator is designed for pure bare-metal simulation. There is no direct communication between the simulator and the host machine, meaning, for instance, that there is no printf implementation that outputs directly to the host computer console. The intent is to make the simulation as similar to real hardware as possible, because exactly the same code and compiled binary that run in the simulator will run on the real SoC.

For this reason, the EBREAK and ECALL instructions are implemented as follows. EBREAK stops the simulation and dumps some statistics. In a real system it makes no sense to call the EBREAK instruction, and depending on the implementation it can trigger a system reset or act as a NOP. The ECALL instruction raises an exception, dumps the statistics of the simulation and continues execution, for the same reason.

To make up for the lack of semi-hosting options, the Trace module can be used to print out information. With the proper helper functions, it is possible to use printf()-like functions. In this case, the _write function must be written to send the received data to the Trace module as follows:

    /* newlib output stub: forwards every byte to the Trace module.
     * TRACE is assumed to be a memory-mapped register macro defined
     * elsewhere in the application code. */
    int _write(int file, const char *ptr, int len)
    {
        int x;

        (void)file; /* all output goes to the Trace module */

        for (x = 0; x < len; x++) {
            TRACE = *ptr++;
        }

        return (len);
    }

The initial value of the Program Counter register (PC) is obtained from the HEX binary file and set before starting the simulation. The stack pointer register (SP) is set to the last memory address.

This flexibility and the compatibility achieved enable the use of the standard GCC cross compiler with few options:

    -march=rv32imac -mabi=ilp32 --specs=nosys.specs

These options specify the architecture and the ABI (Application Binary Interface), and select the bare-metal option for the newlib standard C library.

This allows complete use of the C library in the application code, including the math, stdio and string libraries.

3.1 Docker version
A Docker version of the simulator is provided [12]. It eases the installation and use of the simulator, since the user does not need to compile it or gather all the necessary libraries.

This image has been used in conjunction with another Docker image that contains a riscv-toolchain.

The simulator image is published and available on Docker Hub [13].

3.2 FreeRTOS
A port of FreeRTOS version 10.2.1 was written for the simulated SoC [2]. The simulator is able to run this complex project without any error. The FreeRTOS test project includes three tasks that communicate and synchronize using one common queue. The two producer tasks use FreeRTOS' delay functions to suspend for an amount of time. Only one of the tasks prints out debug information.

4 TEST AND RESULTS
Different tests were done to ensure the compatibility of the simulator. Some performance results from the same tests are also presented. The RISC-V code is compiled with RISC-V GCC version 8.3.0, built with the ABI configured to ilp32 and the architecture set to rv32i.

4.1 Tests Compliance
The simulator implements the RISC-V RV32IMACZicsr_Zifencei V2.1 instruction set [20, 21] and it passes all tests in the riscv-tests and riscv-compliance suites [16, 17]. The riscv-compliance tests have a coverage of 97.23% for RV32I, 89.95% for RV32IM and 59.68% for RV32IMC. These percentages indicate how many of all possible instruction and register combinations are tested.

A more complex program, the Dhrystone benchmark, also runs with correct results.

The project code has been statically checked with Coverity by Synopsys. The analysis found only 1 minor error, in the TLM-2 library code, and no errors in the simulator code itself [18]. Code quality is also checked with the Codacy tool [5]. This tool checks for code quality, security, unused code, etc. The outcome of this tool is an A score, with only 10 minor warnings about code style.

The next section discusses the performance of this simulator.

4.2 Performance
A set of four programs was written to test the performance of the simulator. Of these tests, test 1 checks memory transfer between two memory locations; tests 2 and 3 perform arithmetic operations on three variables (one prints out the results and the other does not use the console); the last test uses string manipulation functions from the standard C library (printf, sprintf, strcpy).

All tests run an endless loop of some mathematical operations and print out the result using the Trace module. Each test is executed 3 times with different execution times (from 10 to approx. 60 seconds). Figure 4 shows the average of these 3 runs.

Table 1: Performance results. Values in instructions/second

    Test         Native        Docker
    Test1         8,252,929    3,854,110
    Test2         6,298,774    3,291,465
    Test3         8,921,763    3,754,295
    Test4        12,899,367    4,375,651
    Dhrystone    10,700,733    3,796,328

The simulator's performance varies mainly with the logging level, due to the huge I/O traffic to the log file. With the lowest level of logging, the performance of the simulator is about 8 million simulated
instructions per second (see Table 1 and Figure 4) on an Intel Core i7-8550U CPU @ 1.88 GHz with 16 GB of memory. As a reference, on the same computer the Spike simulator reaches about 170 million simulated instructions per second.

Figure 4: Execution results for all tests

The low performance of Test2 may be due to its intensive use of the Trace module and the overhead this implies.

The Dhrystone benchmark executes with good results, and the performance is about 7200 Dhrystones/second. It has been tested with 10,000, 250,000 and 500,000 loops of the Dhrystone test.

4.3 Docker version
The same tests have been run with the Docker version of the simulator. The results are summarized in Table 1 and depicted in Figure 5.

With the Docker version, the performance penalty ranges from 47% up to 69% depending on the test.

Figure 5: Execution results for all tests with the Docker version

5 CONCLUSIONS
This paper introduces a new RISC-V simulator. It has been designed from scratch to simulate an entire SoC with a focus on simplicity. It uses SystemC and TLM-2 as language and modeling scheme.

The paper has presented the main architecture of the simulator and the software configuration and tools required, followed by a brief discussion of the simulation performance and the conformance to the specifications.

The use of standards is important in every aspect of the engineering effort. In the case of system-level simulators, the TLM-2 and SystemC standards should be encouraged and used by vendors and researchers to increase the interoperability and reusability of components. This simple simulator is a first step towards this goal.

REFERENCES
[1] 2012. IEEE Standard for Standard SystemC Language Reference Manual. IEEE Std 1666-2011 (Revision of IEEE Std 1666-2005) (2012), 1-638.
[2] Inc Amazon Web Services. 2020. FreeRTOS HomePage. https://www.freertos.org/
[3] Fabrice Bellard. 2005. QEMU, a Fast and Portable Dynamic Translator. In Proceedings of the Annual Conference on USENIX Annual Technical Conference (Anaheim, CA) (ATEC '05). USENIX Association, USA, 41.
[4] Nathan Binkert, Bradford Beckmann, Gabriel Black, Steven K. Reinhardt, Ali Saidi, Arkaprava Basu, Joel Hestness, Derek R. Hower, Tushar Krishna, Somayeh Sardashti, Rathijit Sen, Korey Sewell, Muhammad Shoaib, Nilay Vaish, Mark D. Hill, and David A. Wood. 2011. The Gem5 Simulator. SIGARCH Comput. Archit. News 39, 2 (Aug. 2011), 1-7. https://doi.org/10.1145/2024716.2024718
[5] Codacy. 2020. Codacy - RISC-V-TLM Dashboard. https://app.codacy.com/manual/mariusmm/RISC-V-TLM/dashboard
[6] RISC-V foundation. 2020. RISC-V Spike. https://github.com/riscv/riscv-tools
[7] Imperas. 2020. A Complete, Fully Functional, Configurable RISC-V Simulator. https://github.com/riscv/riscv-ovpsim
[8] RISC-V International. 2020. RISC-V Software Ecosystem Overview - Simulators. https://riscv.org/software-status/#simulators
[9] Krste Asanović, Rimas Avižienis, Jonathan Bachrach, Scott Beamer, et al. 2016. The Rocket Chip Generator. Technical Report UCB/EECS-2016-17. EECS Department, University of California, Berkeley. https://github.com/chipsalliance/rocket-chip
[10] ARM Limited. 2020. Cortex™-M3 Technical Reference Manual - ITM. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0337e/BABCCDFD.html
[11] Neethu Bal Mallya, Cecilia Gonzalez-Alvarez, and Trevor E Carlson. 2018. Flexible timing simulation of RISC-V processors with sniper. Simulation 4 (2018), 1.
[12] Dirk Merkel. 2014. Docker: Lightweight Linux Containers for Consistent Development and Deployment. Linux J. 2014, 239, Article 2 (March 2014), 1 pages.
[13] Màrius Montón. 2020. mariusmm/riscv-tlm. https://hub.docker.com/repository/docker/mariusmm/riscv-tlm
[14] Màrius Montón. 2020. RISC-V-TLM Simulator. https://github.com/mariusmm/RISC-V-TLM
[15] Morten Petersen. 2020. Ripes. https://github.com/mortbopet/Ripes
[16] RISCV.org. 2020. RISC-V Compliance Task Group. https://github.com/riscv/riscv-compliance/
[17] RISCV.org. 2020. RISC-V Unit Tests. https://github.com/riscv/riscv-tests
[18] Synopsys. 2020. Coverity - RISC-V-TLM. https://scan.coverity.com/projects/mariusmm-risc-v-tlm
[19] Tuan Ta, Lin Cheng, and Christopher Batten. 2018. Simulating multi-core RISC-V systems in gem5. In Workshop on Computer Architecture Research with RISC-V.
[20] Andrew Waterman and Krste Asanović. 2019. The RISC-V Instruction Set Manual, Volume I: User-Level ISA. Technical Report Version 20191213. RISC-V Foundation.
[21] Andrew Waterman and Krste Asanović. 2019. The RISC-V Instruction Set Manual, Volume II: Privileged Architecture. Technical Report Version 20190608-Priv-MSU-Ratified. RISC-V Foundation.
