An Open-Source Hardware Module for High-Speed Network Monitoring on
NetFPGA
Article · January 2010
Gianni Antichi, Dept. Information Engineering, University of Pisa, gianni.antichi@iet.unipi.it
David J. Miller, Computer Laboratory, University of Cambridge, david.miller@cl.cam.ac.uk
Stefano Giordano, Dept. Information Engineering, University of Pisa, stefano.giordano@iet.unipi.it
ABSTRACT
We present a passive network measurement solution based on the low-cost NetFPGA, suitable for network research, security applications, and traffic engineering and management. Key features include accurate timestamping and the ability to filter traffic based on flow. In this paper, we describe our implementation.

Keywords
High Performance, FPGA, Low-cost, Monitoring, Timestamping

1. INTRODUCTION
Network measurement and monitoring has been an active area of research for at least the past 15 years. Applications include academic research, security, and traffic engineering and management.

An ideal measurement and monitoring solution would be accurate, guarantee no loss of information, and be inexpensive. To be cheap, a solution must use off-the-shelf components such as common PCs with their built-in network adaptors. Software running on the host timestamps each packet as it arrives, and stores it for later analysis. Applications such as tcpdump [11], Wireshark [5] and nTop [3] demonstrate how effective and flexible this approach can be for a large variety of monitoring tasks.

Many of the first solutions were like this, and some still are. This approach works well enough for low speed networks, or when timestamp accuracy is not too important, but it doesn't scale to high speed networks.

High speed network adaptors must employ a variety of techniques to manage the load imposed upon the host processor. One such technique, interrupt mitigation, notifies the host only about groups of packets, rather than when each individual packet arrives. Since the software (usually an OS kernel) can't timestamp packets until notified of their arrival, interrupt mitigation and interrupt latency both contribute to poor timestamp quality.

If traffic load exceeds a host's capacity to process and store it, data are lost, usually with no record of their ever having existed. Even when the host can keep up, time taken to process packets that are of no interest is time not spent on analysis of the traffic of interest [6, 7].

Host load can be relieved if the network interface card relays only traffic of interest, but most typical network interfaces aren't capable of filtering traffic based on flow (or class). They filter only on whether the packet is addressed to the host or not, which is never the case for a monitor.

As described later, a variety of specialised hardware-based solutions exist, but they come at a cost. The NetFPGA platform, which is open and low-cost, offers a new opportunity to achieve the performance of hardware-based measurement solutions, but at costs closer to those of software-only solutions.

Our monitoring solution addresses the criterion of cost by using the NetFPGA. Achieving high quality timestamp data is easy in hardware, and the degree to which an FPGA-based solution can be customised makes it possible to provide filtration such that only traffic of interest is sent to the host.

We describe a flexible monitoring solution offering high quality timestamps and flow-based filtration at line-rate Gigabit Ethernet based on NetFPGA.

2. RELATED WORK
Passive measurement systems have been an active area of research since at least the mid 1990s. The work done by Ian Graham, Stephen Donnelly [9], Jörg Micheel, their colleagues in the WAND Group at Waikato University, and then later Endace [1], produced the excellent, but expensive, DAG card.

A similar FPGA-based monitoring and filtering solution was employed by the SCAMPI project [4] using the COMBO6 card from CESNET. Filtration is performed by the FPGA, and only matching packets are forwarded to the host.

Ficara et al. [10] present an Intel IXP2400-based traffic monitoring device for Gigabit Ethernet capable of analysing up to 50,000 filtering rules at line rate.

Luca Deri's nCap [8] and related works provide software-based measurement techniques that work well, but are at the mercy of kernel-based timestamping.
Wolf et al. [13] propose the Distributed On-line Measurement Environment (DOME), a distributed network of passive measurement nodes based on an Intel IXP2400 Network Processor. DOME includes header anonymization, and performance is comparable with that of Endace DAG 4.3 cards. Both of the previous systems are able to analyse up to 500 MB/s of minimum-sized packets (64 bytes).

These software solutions are both inexpensive and flexible, but the traffic load they can handle and their timestamp quality are both limited by NIC hardware and kernel performance. Hardware solutions typically provide very good timestamp quality, but are typically expensive and offer limited flexibility (especially in the case of proprietary offerings).

A NetFPGA-based solution offers the accuracy of hardware timestamping on inexpensive hardware (thanks to support from Xilinx) with the flexibility of open firmware, together with a rapidly growing community of developers and academics.

3. THE NETFPGA NETWORK MONITOR
This section describes the design and implementation of our NetFPGA-based timestamping network monitor.

3.1 Deployment
A network monitor may either be installed in series with the link to be monitored, or connected by means of a network tap. Optical network links make the choice easy: passive optical splitters are inexpensive and, other than during initial installation, offer no possibility of interruption of the link.

Copper network links, on the other hand, are more challenging. Some protocols, such as 10/100 Ethernet, can be tapped using a passive resistive network, but others (including Gigabit Ethernet) require an expensive active tap, such as the NetOptics TP-CU3, or installation of the monitor in-line.

In-line monitoring is cheap, and offers the possibility of building an Intrusion Prevention System, but comes at the cost of significant extra latency and the risk of interruption of the link, should the monitor lose power, be misconfigured, or otherwise fail.

Since the NetFPGA has four ports, but supports only copper Ethernet, our monitoring solution integrates the function of an active copper tap by internally coupling two ports of the card. Traffic received on one port is retransmitted out of the other, and vice versa.

Where a deployment is especially cost-sensitive, a single NetFPGA-based monitor is sufficient for a single full-duplex link. (Our solution could be modified to monitor two full-duplex links with ease.) Where uptime is more critical, our monitor may also be used with a conventional active copper Gigabit Ethernet tap.

3.2 NetFPGA data path
Our monitor adds two new modules (shown in Figure 2) to the standard NetFPGA pipeline (Figure 1), described below, and makes minor modifications to the input arbiter and nf2_mac_grp.v.

Figure 1: The NetFPGA reference pipeline

Figure 2: Data path with timestamp modules

3.2.1 Timestamping module
The timestamping module attaches to the RGMII (Reduced Gigabit Media Independent Interface), as near as possible to the MAC (Media Access Controller), so that timestamps are recorded as soon as packets are received, in order to minimise timestamp error and jitter. The RGMII asserts its "data valid" signal when the SFD (Start of Frame Delimiter) of a frame is received at a physical interface. We sample the free-running timestamp counter on the low-to-high transition of this signal.

Timestamps are sampled from a 64-bit, free-running counter that is driven by the 125 MHz system clock and increments by 8 once every 8 ns. By using the system clock, the timestamp module can be made synchronous with the receive logic, and thereby avoid additional error associated with crossing clock domains.

At start-of-day, the timestamp counter can be initialised with the current time derived from the PC real-time clock (which may optionally be maintained using NTP), yielding absolute timestamps.

This timestamp counter is easy to implement, but provides no means of correcting for oscillator drift and yields data in units of whole nanoseconds. Since standard formats record time in units of seconds, conversion by a floating-point division is required. Both of these drawbacks can be addressed by using Direct Digital Synthesis (DDS) [12], a technique for producing arbitrarily variable frequencies using FPGA-friendly, purely synchronous, digital logic.

Using an external time reference, such as NTP or GPS, the rate of the DDS clock can be corrected in real time, and the counter can yield a fixed-point representation in seconds. We plan to add such a system, similar to that described by Stephen Donnelly [9], in a future revision of our measurement solution.

3.2.2 Selective, flow-based monitoring
Because the NetFPGA PCI interface lacks the bandwidth to record all traffic, we provide a 5-tuple (IP address pair, protocol and port pair) filter. As described in Section 3.1, all packets received are retransmitted. Packets that match one of up to 32 filter rules are also copied verbatim to the host, with their timestamp prepended (as shown in Figure 3). The timestamp is converted to Intel (little-endian) byte order in the card, to save the hosts most commonly used in these applications from having to do so.

Figure 3: Format of packets sent to host

The timestamps of packets that match no rule are made available to the host via registers. In future versions, we plan to pass unmatched timestamps, perhaps along with minimal information about the packet, to the host, in-band with those packets that did pass the filter.

Our initial implementation uses the TCAM (Ternary Content Addressable Memory) modules available in Xilinx CoreGen. Although these TCAMs are fast (one search every clock cycle), and permit on-the-fly rule updates, they are low density. Owing to problems with timing closure, we found it necessary to implement the filter using two 16-entry TCAMs, rather than one 32-entry TCAM.

In addition, our initial implementation doesn't automatically try both combinations of source and destination port and address, so two rule slots are required to specify a complete flow. Rather than address this limitation directly, we feel that a Bloom filter would provide considerably greater density, while also providing the flexibility of specifying only one half of a flow, should that be desirable.

Since the object of the filter is to manage PCI interface throughput by limiting irrelevant traffic, any false positive matches from the Bloom filters are harmless, and the host can simply throw them away.

3.2.3 Pipeline changes
Our initial implementation passes timestamps in a side-channel, parallel with the main packet data path, as shown in Figure 4. This required minor modifications to parts of the standard data path. Future versions of this code will probably pass timestamp information in-band by means of a new module header.

Figure 4: Packet data and side-band timestamps

3.3 Software
libpcap is the de facto standard capture API but, for now, libpcap applications cannot be used directly with our monitoring solution. Simple packet recorders should work, but the 8-byte timestamp prepended to each packet will confound protocol analysis.

Instead, we include a libpcap-based capture programme which converts and removes the hardware timestamp, overwrites the PCAP timestamp, and records a standard PCAP trace. It is our ambition to support a live PCAPng-style capture interface, as well as libtrace [2] (and thereby the Endace [1] ERF format).

We also provide auxiliary command-line tools for TCAM rule management, and for the initialisation of the hardware timestamp.

4. PRELIMINARY RESULTS
We characterised the latency through the NetFPGA with an Endace DAG 4.3ge SX, as shown in Figure 5. Because the DAG card is optical, we were obliged to use a pair of media converters (Allied-Telesyn AT-MC-1004), and we didn't have the means at our disposal to calibrate out the latency contributed by these devices.

We measured latency through the two converters and the NetFPGA card at a constant 2.4 µs, irrespective of whether the test packets matched a filter rule, or how many rules were programmed into the filter.
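The host-side handling of the prepended timestamp described in Sections 3.2.2 and 3.3 can be illustrated with a short sketch. It is not the capture programme itself. It assumes that the 8-byte prefix is a single unsigned 64-bit little-endian count of nanoseconds, and that the count is relative to the Unix epoch. The function names are hypothetical. Integer arithmetic is used for the conversion to seconds, which avoids the rounding that a floating-point division would introduce.

```python
import struct

NS_PER_SEC = 1_000_000_000
PREFIX_LEN = 8  # size of the hardware timestamp prepended by the card


def split_hw_timestamp(record: bytes):
    """Split a host-bound record into (timestamp_ns, original_frame).

    Assumes the prefix is one unsigned 64-bit little-endian integer.
    """
    if len(record) < PREFIX_LEN:
        raise ValueError("record shorter than the 8-byte timestamp prefix")
    (ts_ns,) = struct.unpack_from("<Q", record, 0)
    return ts_ns, record[PREFIX_LEN:]


def to_pcap_time(ts_ns: int):
    """Convert a nanosecond count to the (seconds, microseconds) pair
    used by classic PCAP record headers, using integer arithmetic only."""
    seconds, remainder_ns = divmod(ts_ns, NS_PER_SEC)
    return seconds, remainder_ns // 1000


# Example: a made-up record carrying a two-byte payload.
record = struct.pack("<Q", 1_264_982_400_123_456_789) + b"\xaa\xbb"
ts_ns, frame = split_hw_timestamp(record)
seconds, microseconds = to_pcap_time(ts_ns)
```

Truncating to microseconds discards the last three decimal digits of the hardware timestamp, which is one reason an extended-precision capture format is attractive.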
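The Bloom filter proposed in Section 3.2.2 can likewise be sketched in software. This is an illustration of the general technique only, not the planned hardware design. The filter size, the number of hash functions, the use of SHA-256, and all names below are assumptions. The sketch also orders the two endpoints of a flow before hashing, so that a single entry matches both directions. That is one possible answer to the two-rule-slot limitation, and it gives up the option of matching only one half of a flow.

```python
import hashlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int


def canonical_key(ft: FiveTuple) -> bytes:
    """Order the two endpoints so both directions of a flow share a key."""
    lo, hi = sorted([(ft.src_ip, ft.src_port), (ft.dst_ip, ft.dst_port)])
    return f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{ft.proto}".encode()


class BloomFilter:
    """Set membership with possible false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: bytes):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def __contains__(self, key: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


flow = FiveTuple("10.0.0.1", "10.0.0.2", 6, 1234, 80)
reverse = FiveTuple("10.0.0.2", "10.0.0.1", 6, 80, 1234)

flows_of_interest = BloomFilter()
flows_of_interest.add(canonical_key(flow))
reverse_matches = canonical_key(reverse) in flows_of_interest
```

A packet from an unrelated flow may occasionally match as well. As noted in Section 3.2.2, such false positives only cost some PCI bandwidth, and the host can discard them.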
Figure 5: Latency measurement apparatus

At the time of writing, we are setting up an experiment in which we will test the quality of timestamps returned to the host against the DAG card.

5. CONCLUSION AND FUTURE WORK
We present a flexible and cheap passive NetFPGA-based monitoring system. Our proof-of-concept implementation so far shows promising results.

Further development and rigorous quality evaluation against the respected industry standard Endace DAG card are on-going. We identify a number of extensions and enhancements, many of which are already being implemented:

• Use of Direct Digital Synthesis, together with an external time-base, to provide error-corrected timestamps in a convenient format;
• Re-implementation of the flow filter using Bloom filters in order to support substantially more than 32 flows;
• Optional in-band markers for packets belonging to unmatched flows;
• Refactoring to include the timestamp in a module header;
• Live libpcap support with extended precision timestamps; and
• Endace ERF format and libtrace support.

The NetFPGA is a promising platform on which to develop exciting new, low-cost instrumented devices. For example, whereas conventional techniques can instrument only link-level behaviour, the NetFPGA monitoring platform offers the possibility of instrumenting router behaviour right at the routing plane.

6. REFERENCES
[1] Endace. http://www.endace.com.
[2] libtrace. http://research.wand.net.nz/software/libtrace.php.
[3] nTop network traffic probe. http://www.ntop.org.
[4] Scampi project. http://www.ist-scampi.org.
[5] Wireshark protocol analyzer (was ethereal). http://www.wireshark.org.
[6] L. Deri. Improving passive packet capture: Beyond device polling. In SANE 2004.
[7] L. Deri. Passively monitoring networks at gigabit speeds using commodity hardware and open source software. In PAM 2003.
[8] L. Deri. nCap: Wire-speed packet capture and transmission. In End-to-End Monitoring, May 2005.
[9] S. F. Donnelly. High precision timing in passive measurements of data networks. PhD thesis, University of Waikato, 2002.
[10] D. Ficara, S. Giordano, F. Oppedisano, G. Procissi, and F. Vitucci. A cooperative PC/network-processor architecture for multi gigabit traffic analysis. In QoS-IP 2008.
[11] N. R. G. Lawrence Berkeley National Labs. tcpdump, libpcap. http://www.tcpdump.org.
[12] P. Saul. Direct digital synthesis. Circuits and systems tutorials, page 393, 1996.
[13] T. Wolf, R. Ramaswamy, S. Bunga, and N. Yang. An architecture for distributed real-time passive network measurement. In MASCOTS 2006.