www.uandistar.org
Embedded & Real Time Systems Notes
Need for Communication Interfaces:
The need for communication interfaces arises for the following reasons:
- The embedded system needs to send data to a host. The host analyzes the data and presents it through a Graphical User Interface (GUI).
- The embedded system may need to communicate with another embedded system to transmit or receive data. Providing a standard communication interface is preferable to providing a proprietary interface.
- A number of embedded systems may need to be networked to share data. Network interfaces need to be provided in such a case.
- An embedded system may need to be connected to the Internet so that anyone can access it. An example is a real-time weather monitoring system, which can be Internet-enabled using a TCP/IP protocol stack and an HTTP server.
- Mobile devices such as cell phones and palmtops need to interact with other devices such as PCs and laptops for data synchronization. When the palmtop comes near the laptop, the two can automatically form a network to exchange data.
- For some embedded systems, the software may need to be upgraded after it is installed in the field. The software can be upgraded through communication interfaces.
For these reasons, providing communication interfaces based on standard protocols is a must. Not surprisingly, many microcontrollers have on-chip communication interfaces, such as a serial interface, to meet these requirements.

RS232:
In telecommunications, RS-232 (Recommended Standard 232) is the traditional name for a series of standards for serial binary single-ended data and control signals connecting a DTE (Data Terminal Equipment) and a DCE (Data Circuit-terminating Equipment). It is commonly used in computer serial ports. The standard defines the electrical characteristics and timing of signals, the meaning of signals, and the physical size and pin-out of connectors.
The current version of the standard is TIA-232-F, "Interface between DTE and DCE Employing Serial Binary Data Interchange", issued in 1997.

RS232 Connector Configurations:
Two types of connectors are used with RS232: a 25-pin connector and a 9-pin connector.
25-pin Connector:

RS232 Communication Parameters:
When two devices communicate through RS232, the sending device sends the data character by character. The bits corresponding to the character are called data bits. The data bits are prefixed with a bit called the start bit and suffixed with one or two bits called stop bits. The receiving device decodes the data bits using the start bit and stop bits. This mode of communication is called asynchronous communication because no clock signal is transmitted. In addition to the start bit and stop bits, an additional bit called the parity bit may also be sent. The parity bit is used for error detection at the receiving end.

For two devices to communicate with each other using RS232, the communication parameters have to be set on both systems, and for meaningful communication these parameters have to be the same. The various communication parameters are:
- Data Rate: The rate at which data communication takes place.
- Data Bits: The number of bits transmitted for each character.
- Start Bit: The bit that is prefixed to the data bits to identify the beginning of the character.
- Stop Bits: The bits appended to the data bits to identify the end of the character.
- Parity: The bit appended to the character for error checking. The parity can be even or odd. For even parity, the parity bit is chosen so that the total number of 1 bits (data bits plus parity bit) is even. For odd parity, the parity bit makes the total number of 1 bits odd. If the parity is set to none, no parity bit is sent.
- Flow Control: If one device sends data at a very fast rate and the other device cannot absorb the data at that rate, flow control is used. Flow control is a protocol to stop or resume data transmission. It is also known as handshaking. Hardware handshaking in RS232 uses two signals: Request to Send (RTS) and Clear to Send (CTS). When a device has data to send, it asserts RTS, and the receiving device asserts CTS when it is ready to accept the data. Software handshaking is also possible: a device requests that data transmission be suspended by sending the character Control-S (0x13), and signals that transmission may resume by sending the character Control-Q (0x11). This software handshaking is also known as XON/XOFF.
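The framing described above (start bit, data bits, optional parity bit, stop bits) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `parity_bit` and `build_frame` are hypothetical, and the sketch assumes the usual UART conventions, which the notes do not spell out, that the start bit is logic 0, the stop bits are logic 1, and the data bits are sent least significant bit first.

```python
from typing import List


def parity_bit(data_bits: List[int], parity: str) -> int:
    """Return the parity bit for the given data bits ("even" or "odd")."""
    ones = sum(data_bits)
    if parity == "even":
        return ones % 2          # makes the total count of 1 bits even
    if parity == "odd":
        return (ones + 1) % 2    # makes the total count of 1 bits odd
    raise ValueError("parity must be 'even' or 'odd'")


def build_frame(byte: int, data_bits: int = 8,
                parity: str = "none", stop_bits: int = 1) -> List[int]:
    """Build one asynchronous frame as a list of logic levels.

    Assumed conventions (not stated in the notes): start bit = 0,
    stop bits = 1, data sent least significant bit first.
    """
    bits = [(byte >> i) & 1 for i in range(data_bits)]
    frame = [0] + bits                       # start bit, then data bits
    if parity != "none":
        frame.append(parity_bit(bits, parity))
    frame += [1] * stop_bits                 # one or two stop bits
    return frame


# Character 'A' (0x41) with 8 data bits, even parity, 1 stop bit
print(build_frame(0x41, parity="even"))
```

Both ends of the link would have to agree on `data_bits`, `parity` and `stop_bits` (and on the data rate, which this sketch does not model) for the receiver to decode the frame correctly.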
By Mr. Suman Kalyan Oduri
9-pin Connector: The 25-way connector recommended in the standard is large compared to current practice, so the smaller 9-pin connector is commonly used instead.
UART:
A Universal Asynchronous Receiver/Transmitter (UART) is a piece of computer hardware that translates data between parallel and serial forms. UARTs are commonly used in conjunction with communication standards such as EIA RS-232, RS-422 or RS-485. The "universal" designation indicates that the data format and transmission speeds are configurable, and that the actual electrical signaling levels and methods are typically handled by a special driver circuit external to the UART. A UART is usually an individual integrated circuit (or part of one) used for serial communications over a computer or peripheral device serial port. UARTs are now commonly included in microcontrollers. A dual UART, or DUART, combines two UARTs in a single chip. Many modern ICs come with a UART that can also communicate synchronously; these devices are called USARTs (Universal Synchronous/Asynchronous Receiver/Transmitters).
Block Diagram of UART:
For transmission of 1s and 0s, the voltage levels are defined in the RS232 standard. The voltage levels are different for control signals and data signals.

Signal            Voltage Level
Data Input        +3V and above for 0; -3V and below for 1
Data Output       +5V and above for 0; -5V and below for 1
Control Input     +3V and above for 1 (ON); -3V and below for 0 (OFF)
Control Output    +5V and above for 1 (ON); -5V and below for 0 (OFF)
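As a rough illustration of the input thresholds listed for RS232 data and control signals, the following Python sketch maps a measured input voltage to a logic value. The function names are hypothetical, and treating the band between -3V and +3V as undefined (returning `None`) is an assumption of this sketch rather than something stated in the notes.

```python
from typing import Optional


def decode_data_input(volts: float) -> Optional[int]:
    """Logic value seen on an RS232 data input, or None if undefined."""
    if volts >= 3.0:
        return 0        # +3V and above is read as 0
    if volts <= -3.0:
        return 1        # -3V and below is read as 1
    return None         # assumed: in-between voltages are undefined


def decode_control_input(volts: float) -> Optional[bool]:
    """State seen on an RS232 control input: True = ON, False = OFF."""
    if volts >= 3.0:
        return True     # +3V and above is ON
    if volts <= -3.0:
        return False    # -3V and below is OFF
    return None         # assumed: in-between voltages are undefined


print(decode_data_input(-12.0), decode_control_input(12.0))
```

Note that the polarity for data signals is inverted relative to control signals: a positive voltage is a 0 on a data line but ON on a control line.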
RS232 uses unbalanced transmission, i.e., the voltage levels are measured with reference to the local ground. Hence, this transmission is susceptible to noise.

Limitations of RS232:
Because the application of RS-232 has extended far beyond the original purpose of interconnecting a terminal with a modem, successor standards have been developed to address its limitations. Issues with the RS-232 standard include:
- The large voltage swings and the requirement for positive and negative supplies increase the power consumption of the interface and complicate power supply design. The voltage swing requirement also limits the upper speed of a compatible interface.
- Single-ended signaling referred to a common signal ground limits the noise immunity and transmission distance.
- Multi-drop connection among more than two devices is not defined. While multi-drop "work-arounds" have been devised, they have limitations in speed and compatibility.
- Asymmetrical definitions of the two ends of the link make the assignment of the role of a newly developed device problematic; the designer must decide on either a DTE-like or DCE-like interface and which connector pin assignments to use.
- The handshaking and control lines of the interface are intended for the setup and takedown of a dial-up communication circuit; in particular, the use of handshake lines for flow control is not reliably implemented in many devices.
- No method is specified for sending power to a device. While a small amount of current can be extracted from the DTR and RTS lines, this is only suitable for low-power devices such as mice.
The necessary voltage level conversion has to be done to meet the voltage levels of RS232. This is achieved using a level shifter, as shown in the figure below.
The UART chip operates at 5V. The level conversion to the desired voltage is done by the level shifter, and the signals are then passed on to the RS232 connector. Most processors, including Digital Signal Processors, have an on-chip UART. ICs such as the MAX3222 and MAX3241 from Maxim can be used as level shifters.

RS422:
RS-422 is a telecommunications standard for binary serial communications between devices. It is the protocol or specification that must be followed to allow two devices that implement this standard to speak to each other. RS-422 is an updated version of the original serial protocol known as RS-232. One device is known as the data terminal equipment (DTE) and the other as the data communications equipment (DCE). "RS" stands for Recommended Standard, which allows manufacturers some room for interpretation of the standard, though the differences are usually small. However, to prevent
problems the standard has been formalized by the Electronic Industries Alliance (EIA) and the Telecommunications Industry Association (TIA) and is now referred to as the EIA/TIA-422 standard.
RS-422 is a balanced four-wire system. The signal sent from the DTE device is transmitted to the DCE device through two wires, and the signal sent from the DCE device to the DTE device is transmitted through the other two wires. The signals on each pair of wires are the mirror opposite of each other, i.e., a "1" datum is transmitted as a +2V reference on one wire and a -2V reference on the other wire. To send a "0" datum, the polarities are reversed: a -2V reference is transmitted on the first wire and a +2V reference on the second. This balanced differential approach allows for much longer distances between the DCE device and the DTE device than was possible with the earlier 3-wire RS-232 communication standard.
The RS-422 standard states which signaling voltages must be used, what connectors are to be used, what pins on those connectors are to be used for each function, and also recommends maximum distances over which this technology can be reliably used.
Standards are very important in industry, as they allow the consumer to purchase a computer (DTE device) from one manufacturer and a printer (DCE device) from a different manufacturer and expect them to work together. Standards also keep changing as limitations are encountered and new solutions are proposed. If the changes become too drastic, a new standard evolves. In this way the RS-422 standard has evolved from, and in many cases replaced, the original RS-232 standard. The RS-485 standard in turn adds further features to the RS-422 standard.

RS485:
RS-485 is a telecommunications standard for binary serial communications between devices.
It is the protocol or specification that needs to be followed to allow devices that implement this standard to speak to each other. This protocol is an updated version of the original serial protocol known as RS-232. While the original RS-232 standard allowed for the connection of two devices through a serial link, RS-485 allows for serial connections between more than two devices on a networked system.
An RS-485 compliant network is a multipoint communications network. The RS-485 standard specifies up to 32 drivers and 32 receivers on a single (2-wire) bus. Newer technology has since introduced "automatic" repeaters and high-impedance drivers and receivers, so that the number of drivers and receivers can be extended to hundreds of nodes on a network. RS-485 drivers are now even able to withstand bus contention problems and bus fault conditions.
An RS-485 network can be constructed as either a balanced 2-wire system or a 4-wire system. If an RS-485 network is constructed as a 2-wire system, then all of the nodes have equal ranking. An RS-485 network constructed as a 4-wire system has one node designated as the master, and the remaining nodes are designated as slaves. Communication in such a system is only between master and slaves and never between slaves. This approach simplifies the software protocol that needs to be used, at the cost of slightly increasing the complexity of the wiring system.
The RS-485 standard states what signaling voltages must be used, what connectors are to be used, what pins on those connectors are to be used for each function, and also recommends maximum distances over which this technology can be reliably used.
Standards are very important in industry, as they allow the consumer to purchase different devices from different manufacturers and expect them to work together. Standards also keep changing as limitations are encountered and new solutions are proposed. If the changes become too drastic then a new standard evolves.
It is because of this that the RS-485 standard evolved from the original RS-232 standard.

USB:
Universal Serial Bus (USB) is a specification to establish communication between devices and a host controller, co-invented by Ajay Bhatt while working for Intel. USB has effectively replaced a variety of interfaces such as serial and parallel ports. USB can connect computer peripherals such as mice, keyboards, digital cameras, printers, personal media players, flash drives, network adapters, and external hard drives. For many of those devices, USB has become the standard connection method. USB was designed for personal computers, but it has become commonplace on other devices such as smartphones, PDAs and video game consoles, and as a power cord. As of 2008, about 2 billion USB devices were sold per year, and approximately 6 billion had been sold in total.
Unlike the older connection standards such as RS232 or the parallel port, USB connectors also supply electric power, so many devices connected by USB do not need a power source of their own.
A number of devices, up to a maximum of 127, can be connected in the form of an inverted tree. On the host, such as a PC, there is a host controller to control all the USB devices. Devices can be connected to the host controller either directly or through a hub. A hub is also a USB device; it extends the number of ports available to connect other USB devices. A USB device can be self-powered or powered by the bus. USB can supply up to 500mA of current to the devices.
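The inverted-tree topology and the two limits quoted above (127 devices, 500mA from the bus) can be illustrated with a small Python model. This is a minimal sketch, not a description of how a real host controller enumerates devices: the class `UsbDevice` and the helper functions are hypothetical names, and the sketch simply counts every hub as one device, as the notes describe.

```python
from dataclasses import dataclass, field
from typing import List

MAX_DEVICES = 127            # limit quoted in the notes (hubs included)
BUS_CURRENT_LIMIT_MA = 500   # current the bus can supply, per the notes


@dataclass
class UsbDevice:
    name: str
    is_hub: bool = False
    current_ma: int = 0                      # current the device needs
    children: List["UsbDevice"] = field(default_factory=list)

    def attach(self, child: "UsbDevice") -> "UsbDevice":
        """Plug a device into this hub; only hubs have downstream ports."""
        if not self.is_hub:
            raise ValueError(f"{self.name} is not a hub")
        self.children.append(child)
        return child


def count_devices(node: UsbDevice) -> int:
    """Count this device plus everything connected below it."""
    return 1 + sum(count_devices(c) for c in node.children)


def within_device_limit(root_ports: List[UsbDevice]) -> bool:
    """True if the devices hanging off the host fit within the limit."""
    return sum(count_devices(n) for n in root_ports) <= MAX_DEVICES


def can_be_bus_powered(device: UsbDevice) -> bool:
    """True if the device's demand fits within the bus current limit."""
    return device.current_ma <= BUS_CURRENT_LIMIT_MA


hub = UsbDevice("hub", is_hub=True)
hub.attach(UsbDevice("mouse", current_ma=100))
hub.attach(UsbDevice("external disk", current_ma=900))
print(count_devices(hub), within_device_limit([hub]))
print([can_be_bus_powered(d) for d in hub.children])
```

In this model the external disk exceeds the bus limit, so it would need to be self-powered, while the mouse can draw its power from the bus.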
USB uses a twisted pair of wires as the medium. Data is transmitted over the twisted pair
using differential data lines, and hence balanced communication is used.

Features of USB:
1. The computer acts as the host.
2. Up to 127 devices can connect to the host, either directly or by way of USB hubs.
3. Individual USB cables can run as long as 5 meters; with hubs, devices can be up to 30 meters away from the host.
4. With USB 2.0, the bus has a maximum data rate of 480 megabits per second.
5. A USB cable has two wires for power (+5V and ground) and a twisted pair of wires to carry the data.
6. On the power wires, the computer can supply up to 500 milliamps of current at 5 volts.
7. Low-power devices can draw their power directly from the bus. High-power devices have their own power supplies and draw minimal power from the bus. Hubs can have their own power supplies to provide power to devices connected to the hub.
8. USB devices are hot-swappable, meaning you can plug them into the bus and unplug them at any time.
9. Many USB devices can be put to sleep by the host computer when the computer enters a power-saving mode.

Infrared:
Infrared (IR) light is electromagnetic radiation with a wavelength longer than that of visible light, starting from the nominal edge of visible red light at 0.7 µm and extending conventionally to 300 µm. These wavelengths correspond to a frequency range of approximately 430 THz down to 1 THz, and include most of the thermal radiation emitted by objects near room temperature. Microscopically, IR light is typically emitted or absorbed by molecules when they change their rotational-vibrational movements.
Infrared interfaces are now commonplace on a number of devices such as palmtops, cell phones, digital cameras, printers, keyboards, mice, LCD projectors, ATMs and smart cards. An infrared interface provides low-cost, short-range, point-to-point communication between two devices. The main drawback of infrared is that it operates in line-of-sight mode and cannot penetrate walls.
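The frequency range quoted for infrared light follows from the relation frequency = speed of light / wavelength. The short Python sketch below checks the two band edges, assuming they are 0.7 and 300 micrometres; the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum


def wavelength_um_to_thz(wavelength_um: float) -> float:
    """Convert a wavelength in micrometres to a frequency in terahertz."""
    wavelength_m = wavelength_um * 1e-6
    return SPEED_OF_LIGHT_M_PER_S / wavelength_m / 1e12


# Band edges of the infrared range (assumed to be in micrometres)
print(round(wavelength_um_to_thz(0.7), 1))   # short-wavelength edge
print(round(wavelength_um_to_thz(300), 2))   # long-wavelength edge
```

The short-wavelength edge comes out a little under 430 THz and the long-wavelength edge at about 1 THz, consistent with the approximate figures given in the text.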
Infrared Data Association (IrDA), a nonprofit industry association founded in 1993, released the specifications for low-cost infrared communication between devices. The block diagram of an IrDA module is:
As shown in the figure, the device has an infrared transceiver. The transmitter is an LED and the receiver is a photodiode. For low data rates, the processor of the embedded system itself can be used, whereas for high data rates a different processor may be needed. The data to be sent on the infrared link is packetized and encoded as per the IrDA protocols and sent over the air to the other device. The receiving device detects the signal, then decodes and de-packetizes the data.

Protocol Architecture:
As shown in the figure below, for communication through the infrared interface, the physical layer (IrPHY) and data link layer (IrLAP) are specified in the standards. Link management is done through IrLMP, above which the application layer protocols run.

Higher Layers (Application, IrCOMM)
Link Management Protocol (IrLMP)
Data Link Layer (IrLAP)
Physical Layer (IrPHY)

Physical layer: IrPHY specifies the data rates and the mode of communication. IrDA has two specifications, viz., IrDA data and IrDA control. IrDA data has a range of 1m, whereas IrDA control has a range of 5m, both with bidirectional communication. A host such as a PC can communicate with 8 peripherals using IrDA protocols.
Data link layer: The data link layer is called IrLAP. A master/slave protocol is used for communication between two devices. The device that starts the communication is the master. The master sends a command and the slave sends a response.
Link management layer: This layer allows a device to query the capabilities of other devices. It also provides the software capability to share IrLAP between multiple tasks.
Higher layers: The higher layer protocols are application specific. The IrCOMM protocol emulates the standard serial port. When two devices, such as a palmtop and a mobile phone both fitted with an infrared interface, come face to face, they can exchange data using the application layer protocol.

IEEE 1394 Firewire:
Apple Computer Inc. initiated the development of a mechanism to interconnect consumer devices such as a PC, printer, TV, VCR, digital camera and CD player using a serial bus known as FireWire. Later on, it led to the development of the standard IEEE 1394.
As shown in the above figure, the consumer devices can be connected using this serial bus. The cable length can be up to 4.5m. The only restriction is that the devices cannot be connected in loops. IEEE 1394 provides plug and play capability and hot insertion capability: a device can be inserted or removed even when the bus is active. Another feature is that peer-to-peer communication is supported, and hence any two devices can be connected even if no PC is present. Each device is given a 6-bit identification number, and hence a maximum of 63 devices can be interconnected on a single bus. Using bridges, multiple buses can be connected. Each bus is given a 10-bit identification number and hence 1023
buses can be interconnected. The standard specifies copper wire or optical fiber as the transmission medium, with data rates of 100, 200, 400, 800, 1600 and 3200 Mbps. The protocol architecture between devices is as follows:
Physical layer: This layer specifies the electrical and mechanical connections. Bus initialization and arbitration are the functions of this layer. These functions ensure that only one device transmits data at a time.
Data link layer: This layer takes care of packet delivery, acknowledgements and addressing of the devices.
Transaction layer: This layer handles the writing and reading of data to and from the devices.
Management protocols: These protocols are used to manage the bus, and they run on each of the devices. They do the necessary resource management and control the nodes.

Ethernet:
Ethernet is a family of frame-based computer networking technologies for local area networks (LANs). It defines a number of wiring and signaling standards for the Physical Layer of the OSI networking model, as well as a common addressing format and a variety of Medium Access Control procedures at the lower part of the Data Link Layer. Ethernet is standardized as IEEE 802.3.
The Ethernet interface is now ubiquitous. It is available on nearly every desktop and laptop. With the availability of low-cost Ethernet chips and the associated protocol stack, providing an Ethernet interface on an embedded system is easy and useful. Through the Ethernet interface, the embedded system can be connected to a LAN. So, a number of embedded systems in a manufacturing unit can be connected as a LAN, and another node on the LAN, a desktop computer, can monitor all these embedded systems. The data collected by an embedded system can be transferred to a database on the LAN.
The Ethernet interface provides the physical layer and data link layer functionality. Above the data link layer, the TCP/IP protocol stack and the application layer protocols run.

Application Layer (SMTP, FTP, HTTP)
TCP Layer
IP Layer
Logical Link Control
Medium Access Control
Physical Layer

Physical layer: The Ethernet physical layer specifies an RJ-45 jack through which the device is connected to the LAN. Unshielded twisted pair or coaxial cable can be used as the medium. Two pairs of wires are used for transmission: one for the transmit path and one for the receive path. Ethernet transmits balanced differential signals. In each pair, one wire carries a signal voltage between 0 and +2.5V and the second wire carries a signal voltage between -2.5V and 0V, and hence the signal difference is 5V.
Data link layer: The data link layer is divided into the medium access control (MAC) layer and the logical link control (LLC) layer. The MAC layer uses the carrier sense multiple access/collision detection (CSMA/CD) protocol to access the shared medium. The LLC layer specifies the protocol for logical connection establishment, flow control, error control and acknowledgements. Each Ethernet interface has a unique 48-bit Ethernet address.
IEEE 802.11:
IEEE 802.11 is a set of standards for implementing wireless local area network (WLAN) computer communication in the 2.4, 3.6 and 5 GHz frequency bands. They are created and maintained by the IEEE LAN/MAN Standards Committee (IEEE 802). The current base version of the standard is IEEE 802.11-2007.
Each wireless LAN node has a radio and an antenna. All the nodes running the same MAC protocol and competing to access the same medium form a basic service set (BSS). This BSS can interface to a backbone LAN through an access point (AP). The backbone LAN can be a wired LAN such as an Ethernet LAN. Two or more BSSs can be interconnected through the backbone LAN. In trade magazines, the access points are referred to as hotspots.
The MAC protocol used in 802.11 is called CSMA/CA. Carrier sense multiple access with collision avoidance (CSMA/CA) is a wireless network multiple access method in which:
- A carrier sensing scheme is used.
- A node wishing to transmit data first listens to the channel for a predetermined amount of time to determine whether or not another node is transmitting on the channel within the wireless range.
- If the channel is sensed as "idle", the node is permitted to begin the transmission process. If the channel is sensed as "busy", the node defers its transmission for a random period of time.
- Once the transmission process begins, it is still possible for the actual transmission of application data not to occur.
An important feature of the IEEE 802.11 wireless LAN is that two or more nodes can also communicate directly, without the need for centralized control. The two configurations in which the wireless LAN can operate are:
Communication through access point:
Direct Communication:
When two or more devices form a network without the need for centralized control, the network is called an ad-hoc network. For instance, a mobile phone can form a network with a laptop and synchronize data automatically. Embedded systems are now being provided with wireless LAN connectivity to exchange data. The main attraction of wireless connectivity is that it can be used in environments where running a cable is difficult, such as the shop floors of manufacturing units.

Bluetooth:
Bluetooth is a proprietary open wireless technology standard for exchanging data over short distances from fixed and mobile devices, creating personal area networks (PANs) with high levels of security. Created by telecoms vendor Ericsson in 1994, it was originally conceived as a wireless alternative to RS-232 data cables. It can connect several devices, overcoming problems of synchronization.
A number of technologies have been proposed for PANs. Notable among them are Bluetooth, IrDA and IEEE 802.11. Bluetooth holds great promise because it can provide wireless connectivity to embedded systems at a very low cost.

Features of Bluetooth:
- It is a low-cost technology.
- It has very low power consumption.
- It is based on radio transmission in the ISM band.
- It caters to short ranges.
- It is based on open standards formulated by a consortium of industries, and a large number of equipment vendors are committed to this technology.

Bluetooth System Specifications:
Radio frequency: Bluetooth uses the unlicensed ISM (Industrial, Scientific and Medical) band, 2400 - 2483.5 MHz, thereby maximizing communication compatibility worldwide. This requirement for global usage was the key driver for this choice of spectrum.
Modulation: Bluetooth uses Gaussian frequency shift keying (GFSK), with the modulation index limited to 0.28-0.35 (corresponding to a maximum frequency deviation of 140-175 kHz).
Frequency Hopping Spread Spectrum: Bluetooth uses FHSS to hop at up to 1600 hops/sec among 79 channels spaced 1 MHz apart.
Transmission Power: Three power classes are supported by the standard. In practice almost all Bluetooth devices developed to date support only one of these options, most being the low-power, short-range option.
Class 3 - 1mW (0dBm) with a 'typical range' of 10m
Class 2 - 2.5mW (4dBm) with a 'typical range' of 20m
Class 1 - 100mW (20dBm) with a 'typical range' of 100m
Link data rate: A maximum link baseband data rate of 723.2 kb/s is supported, with options for 1/3 bit repetition and 2/3 Hamming FEC (Forward Error Correction).
Speech coding: CVSD (Continuously Variable Slope Delta Modulation) at 64 kbps gives acceptable speech quality even with a 1-3% bit error rate (BER).

Network Topology:
Bluetooth-enabled devices form a network known as a piconet. In a piconet, there is one master and up to seven active slaves. Communication between the master and the slaves can be point-to-point or point-to-multipoint.
Point-to-point communications between Master and Slave:
Point-to-multipoint communications between master and multiple slaves:
Scatternet:

Communication between Master and Slave:
The Master and Slave communicate in the form of packets. Each packet is transmitted in a time slot. Each time slot is of 625 µs duration. These slots are numbered from 0 to 2^27 - 1. The Master starts the transmission in even slots by sending a packet addressed to a slave, and the slave sends its packets in odd-numbered slots. A packet generally occupies one time slot; the hop frequency is the same for the entire packet. If the Master starts the transmission in slot 0 using frequency f1, the slave transmits in slot 1 using frequency f2, the master transmits in slot 2 using frequency f3, and so on.

Bluetooth State Transition Diagram:
A Bluetooth device can be in different states as shown below:
Bluetooth Protocol Architecture: Bluetooth communication occurs between a master radio and a slave radio. Bluetooth radios are symmetric in that the same device may operate as a master and also the slave. Each radio has a 48 bit unique device address (BD_ADDR) that is fixed. Two or more radio devices together form ad-hoc networks called piconet. All units within a piconet share the same channel. Each piconet has one master device and one or more slaves. There may be up to seven active slaves at a time within a piconet. Thus each active device within a piconet is identifiable by a 3 bit active device address. Inactive slaves in unconnected modes may continue to reside within the piconet. A master is the only one that may initiate a Bluetooth communication link. However, once a link is established, the slave may request a master/slave switch to become the master. Slaves are not allowed to talk to each other directly. All communication occurs within the slave and the master. Slaves within a piconet must also synchronize their internal clocks and frequency hops with that of the master. Each piconet uses a different frequency hopping sequence. Radio devices used Time Division Multiplexing (TDM). A master device in a piconet transmits on even numbered slots and the slaves may transmit on odd numbered slots. Multiple piconets with overlapping coverage areas form a scatternet. Each piconet may have only one master, but slaves may participate in different piconets on a time-division multiplex basis. A device may be a master in one piconet and a slave in another or a slave in more than one piconet. The complete protocol stack of Bluetooth system is as follows:
To start with an application program in a Bluetooth device can enter the inquiry state to enquire about other devices in the vicinity. To respond to an inquiry, the devices should periodically enter into inquiry scan state and when the inquiry is successfully completed, they enter the inquiry response state. When a device wants to get connected to another device, it enters the page state. In this state, the device will become the Master and page for other devices. The command for this paging has to come from an application program running on this Bluetooth device. When the device pages for the other device, the other device may respond and the Master enters the master response state. Devices should enter the page scan state periodically to check whether other devices are paging for them. When device receives the page scan packet, it enters the slave response state. Once paging of devices is completed, the master and the slave establish a connection. Therefore, the connection is in active state, during which the packet transmission takes place. The connection can also be put in one of the three modes: hold or sniff or park modes. In hold mode, the device will stop receiving the data traffic for a specific amount of time so that other devices in the piconet can use the channel. After the expiry of the specific time, the device will start listening to traffic again. In sniff mode, a slave will be given an instruction like listen starting with slot number S every T slots for a period of N slots. So, the device need not listen to all the packets, but only as specified through the above parameters called sniff parameters. The connection can be in park mode when the device only listens to a beacon signal from the master occasionally, and it synchronizes with the master but does not do any data transmission. 
Bluetooth Profiles: To use Bluetooth wireless technology, a device shall be able to interpret certain Bluetooth profiles, which are definitions of possible applications and specify general behaviors that Bluetooth enabled devices use to communicate with other Bluetooth devices. These profiles include settings to parameterize and to control the communication from start. Adherence to profiles saves the time for transmitting the parameters anew before the bi-directional link becomes effective. There are a wide range of Bluetooth profiles that describe many different types of applications or use cases for devices.
Baseband & RF: The baseband and the link control layers enable the physical RF link between Bluetooth devices to form a piconet. Both circuit switching and packet switching are used. These layers provide two kinds of physical links using the baseband packets: Synchronous Connection Oriented (SCO) and Asynchronous Connectionless (ACL). ACL packets are used for data only, while SCO packets may contain audio only or a combination of audio and data.
Link Manager Protocol (LMP): The link manager protocol is responsible for link setup between Bluetooth units. This protocol layer handles security issues such as authentication and encryption by generating, exchanging and checking the link and encryption keys. It also deals with the control and negotiation of baseband packet sizes.
By Mr. Suman Kalyan Oduri
Logical Link Control and Adaptation Layer (L2CAP): The Bluetooth logical link control and adaptation layer supports higher-level multiplexing, segmentation and reassembly of packets, Quality of Service (QoS) communication and groups. This layer does not provide any reliability of its own and relies on the baseband ARQ to ensure reliability. Channel identifiers are used to label each connection end point.
Service Discovery Protocol (SDP): SDP is the basis for discovery of services on all Bluetooth devices. This is essential for all Bluetooth models. Using SDP, device information, services and the characteristics of the services can be queried, after which a connection between two or more Bluetooth devices may be established. Other service discovery protocols such as Jini and UPnP may be used in conjunction with the Bluetooth SDP protocol.
RFCOMM: The RFCOMM protocol is the basis for the cable replacement usage of Bluetooth. It is a simple transport protocol with additional provisions for emulating the 9 circuits of RS-232 serial ports over the L2CAP part of the Bluetooth protocol stack. It is based on the ETSI standard TS 07.10 and supports a large base of legacy applications that use serial communication. It provides a reliable data stream, multiple concurrent connections, flow control and serial cable line settings.
Telephony Control Protocol (TCS): The TCS binary protocol defines the call control signaling for the establishment of speech and data calls between Bluetooth devices. It is based on the ITU-T Q.931 recommendation. It is a bit-oriented protocol and also provides group management.
Host Controller Interface (HCI): The HCI provides a command interface to the baseband controller and link manager, and access to the hardware status and control registers. The interface provides a uniform method of accessing the Bluetooth baseband capabilities.
The Host Controller transport layer abstracts away transport dependencies and provides a common device driver interface to various interfaces. Three interfaces are defined in the core specification: USB, RS-232 and UART.
Architecture of the Kernel: The kernel is the core of an operating system. It is a piece of software responsible for providing secure access to the machine's hardware and for running computer programs. Since there are many programs and hardware access is limited, the kernel also decides when and for how long a program should run. This is called scheduling. Accessing the hardware directly can be very complex, since there are many different hardware designs for the same type of component. Kernels usually implement some level of hardware abstraction (a set of instructions universal to all devices of a certain type) to hide the underlying complexity from the operating system and provide a clean and uniform interface. This helps application programmers to develop programs without having to know how to program specific devices. The kernel relies upon software drivers that translate the generic command into instructions specific to that device.
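One common way to express this kind of hardware abstraction in C is a table of function pointers: the kernel exposes one generic call, and each driver supplies the device-specific routine behind it. The sketch below is illustrative only; every name in it (char_driver_t, dev_write, membuf_write) is hypothetical, and the "device" is just a memory buffer standing in for real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic interface for any character-output device (hypothetical). */
typedef struct {
    const char *name;
    /* Device-specific routine that knows how to emit bytes. */
    size_t (*write)(void *ctx, const char *data, size_t len);
    void *ctx;  /* private driver state */
} char_driver_t;

/* The generic command: callers never see device details. */
size_t dev_write(const char_driver_t *drv, const char *data, size_t len)
{
    assert(drv != NULL && drv->write != NULL);
    return drv->write(drv->ctx, data, len);
}

/* Example driver: a memory-backed "device" that stores bytes in a buffer. */
typedef struct {
    char buf[32];
    size_t used;
} membuf_t;

size_t membuf_write(void *ctx, const char *data, size_t len)
{
    membuf_t *m = ctx;
    size_t room = sizeof m->buf - m->used;
    size_t n = len < room ? len : room;  /* never overrun the buffer */
    memcpy(m->buf + m->used, data, n);
    m->used += n;
    return n;  /* number of bytes actually accepted */
}
```

An application would fill in a char_driver_t once (for example with membuf_write and a membuf_t as context) and from then on call only dev_write; swapping in a UART or USB driver would not change the calling code.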
An operating system kernel is not strictly needed to run a computer. Programs can be directly loaded and executed on the "bare metal" machine, provided that the authors of those programs are willing to do without any hardware abstraction or operating system support. This was the normal operating method of many early computers, which were reset and reloaded between runs of different programs. Eventually, small ancillary programs such as program loaders and debuggers were typically left in-core between runs, or loaded from read-only memory. As these were developed, they formed the basis of what became early operating system kernels. The "bare metal" approach is still used today on many video game consoles and embedded systems, but in general, newer systems use modern kernels and operating systems.
Four broad categories of kernels:
Monolithic kernels provide rich and powerful abstractions of the underlying hardware.
Microkernels provide a small set of simple hardware abstractions and use applications called servers to provide more functionality.
Hybrid kernels (modified microkernels) are much like pure microkernels, except that they include some additional code in kernel space to increase performance.
Exokernels provide minimal abstractions, allowing low-level hardware access. In exokernel systems, library operating systems provide the abstractions typically present in monolithic kernels.
Kernel Objects: The various kernel objects are Tasks, Task Scheduler, Interrupt Service Routines, Semaphores, Mutexes, Mailboxes, Message Queues, Pipes, Event Registers, Signals and Timers.
Task: A task is the execution of a sequential program. It starts with reading of the input data and of the internal state of the task, and terminates with the production of the results and updating the internal state. The control signal that initiates the execution of a task must be provided by the operating system. The time interval between the start of the task and its termination, given an input data set x, is called the actual duration d_act(task, x) of the task on a given target machine. A task that does not have an internal state at its point of invocation is called a stateless task; otherwise, it is called a task with state.
Simple Task (S-task): If there is no synchronization point within a task, we call it a simple task (S-task), i.e., whenever an S-task is started, it can continue until its termination point is reached. Because an S-task cannot be blocked within the body of the task, the execution time of an S-task is not directly dependent on the progress of the other tasks in the node, and can be determined in isolation. It is possible for the execution time of an S-task to be extended by indirect interactions, such as preemption by a task with higher priority.
Complex Task (C-task): A task is called a complex task (C-task) if it contains a blocking synchronization statement (e.g., a semaphore operation "wait") within the task body. Such a "wait" operation may be required because the task must wait until a condition outside the task is satisfied, e.g., until another task has finished updating a common data structure, or until input from a terminal has arrived. If a common data structure is implemented as a protected shared object, only one task may access the data at any particular moment (mutual exclusion). All other tasks must be delayed by the "wait" operation until the currently active task finishes its critical section.
The worst-case execution time of a complex task in a node is therefore a global issue because it depends directly on the progress of the other tasks within the node, or within the environment of the node.
A task can be in one of the following states: running, waiting or ready-to-run. A task is said to be in the running state if it is being executed by the CPU. A task is said to be in the waiting state if it is waiting for another event to occur. A task is said to be in the ready-to-run state if it is waiting in a queue for CPU time.
Task Scheduler: An application in a real-time embedded system can always be broken down into a number of distinctly different tasks, for example: keyboard scanning; display control; input data collection and processing; responding to and processing external events; communicating with a host or other devices. Each of the tasks can be represented by a state machine. However, implementing a single sequential loop for the entire application can prove to be a formidable task. This is because of the various time constraints in the tasks: the keyboard has to be scanned, the display controlled, the input channel monitored, and so on. One method of solving this problem is to use a simple task scheduler. The various tasks are handled by the scheduler in an orderly manner. This produces the effect of simple multitasking with a single processor. A bonus of using a scheduler is the ease of implementing the sleep mode in microcontrollers, which reduces the power consumption dramatically (from mA to µA). This is important in battery-operated embedded systems.
There are several ways of implementing the scheduler: preemptive or cooperative, round robin or with priority. In a cooperative or nonpreemptive system, tasks cooperate with one another and relinquish control of the CPU themselves. In a preemptive system, a task may be preempted or suspended by a different task, either because the latter has a higher priority or because the time slice of the former is used up.
A round robin scheduler switches in one task after another in turn, whereas a system with priority switches in the highest priority task. For many small microcontroller-based embedded systems, a cooperative (or nonpreemptive) round robin scheduler is adequate. It is the simplest to implement and does not take up much memory. Ravindra Karnad has implemented such a scheduler for the 8051 and other microcontrollers. In his implementation, all tasks must behave cooperatively. A task waiting for an input event thus cannot contain an infinite waiting loop such as the following:
    while (TRUE) {
        /* check input */
        ...
    }
This will hog processor time and deprive other tasks of the chance to run. Instead, it may be written as:
    if (input == TRUE) {
        ...
    } else {
        timer[i] = 100;  /* check again in 100 ms, assuming the timer counts milliseconds */
    }
In this case, task i will check the input condition every 100 ms, as set in the associated timer[i]. When the input condition is false, other tasks have a chance to run. The job of the scheduler is thus rather simple. When there is a clock interrupt, all task timers are decremented. The task whose timer reaches 0 will be run. The greatest virtue of the simple task scheduler really lies in the smallness of the code, which is of course very important in the
case of microcontrollers. The code size ranges from 200 to 400 bytes.
Context Switching: A context switch is the computing process of storing and restoring the state (context) of a CPU so that execution can be resumed from the same point at a later time. This enables multiple processes to share a single CPU. The context switch is an essential feature of a multitasking operating system. Context switches are usually computationally intensive, and much of the design of operating systems is aimed at optimizing the use of context switches. A context switch can mean a register context switch, a task context switch, a thread context switch, or a process context switch. What constitutes the context is determined by the processor and the operating system. Switching from one process to another requires a certain amount of time for administration: saving and loading registers and memory maps, updating various tables and lists, etc.
Scheduling Algorithms: A scheduling algorithm is the method by which threads, processes or data flows are given access to system resources (e.g. processor time, communications bandwidth). This is usually done to load balance a system effectively or to achieve a target quality of service. The need for a scheduling algorithm arises from the requirement for most modern systems to perform multitasking (execute more than one process at a time) and multiplexing (transmit multiple flows simultaneously). The various scheduling algorithms are: First In First Out, Round-Robin, Round-Robin with Priority, Shortest Job First, Non-preemptive Multitasking and Preemptive Multitasking.
First In First Out (FIFO): In the FIFO scheduling algorithm, the tasks which are ready-to-run are kept in a queue and the CPU serves the tasks on a first-come-first-served basis.
Round-Robin Algorithm: In this scheduling algorithm, the kernel allocates a certain amount of time to each task waiting in the queue. For example, if three tasks 1, 2 and 3 are waiting in the queue, the CPU first executes task 1, then task 2, then task 3, and then task 1 again.
Round-Robin with Priority: The round-robin algorithm can be slightly modified by assigning priority levels to some or all of the tasks. A high priority task can interrupt the CPU so that it can be executed. This scheduling algorithm can meet the desired response time for a high priority task.
Shortest-Job First: In this scheduling algorithm, the task that will take the minimum time to execute is given priority. This approach satisfies the maximum number of tasks, but some tasks may have to wait forever.
Non-Preemptive Multitasking: Assume that we are making a telephone call at a public call office. We need to make many calls, but we see another person waiting. We may make one call, let the other person make his call, and then make our next call. This is non-preemptive multitasking. It is also called cooperative multitasking, as the tasks have to cooperate with one another to share the CPU time.
Preemptive Multitasking: In preemptive multitasking, the highest priority task is always executed by the CPU, by preempting the lower priority task. Most real-time operating systems implement this scheduling algorithm.
The various function calls provided by the OS API for task management are: create a task, delete a task, suspend a task, resume a task, change the priority of a task, and query a task.
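The simple cooperative, round robin scheduler described earlier (every task timer is decremented on each clock interrupt, and a task runs when its timer reaches 0) might look roughly like the sketch below. It is not Karnad's implementation: all names are hypothetical, tasks are simply re-armed with a fixed period, and for brevity the tasks are run directly from the tick function, whereas a real system would normally only decrement timers in the interrupt and run the tasks from the main loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 4

typedef void (*task_fn_t)(size_t id);

typedef struct {
    task_fn_t fn;      /* task body; must return quickly (cooperative) */
    uint16_t  timer;   /* ticks remaining until the task runs again */
    uint16_t  period;  /* reload value, in ticks */
} task_slot_t;

static task_slot_t tasks[MAX_TASKS];
static size_t task_count;

/* Registers a task that should run every `period` ticks; returns its id. */
size_t sched_add(task_fn_t fn, uint16_t period)
{
    assert(fn != NULL && period > 0 && task_count < MAX_TASKS);
    tasks[task_count].fn = fn;
    tasks[task_count].timer = period;
    tasks[task_count].period = period;
    return task_count++;
}

/* Called once per clock tick: decrement every timer and run, in
 * round robin order, each task whose timer has reached 0. */
void sched_tick(void)
{
    for (size_t i = 0; i < task_count; i++) {
        if (--tasks[i].timer == 0) {
            tasks[i].timer = tasks[i].period;  /* re-arm the task */
            tasks[i].fn(i);
        }
    }
}

/* Example task: counts how many times it has been run. */
static unsigned run_count[MAX_TASKS];
void counting_task(size_t id) { run_count[id]++; }
```

Registering counting_task twice, with periods of 2 and 3 ticks, and then calling sched_tick six times runs the first instance three times and the second twice. Because no task is ever preempted, one task that loops forever would stall all the others, which is exactly why the waiting loop shown earlier must be avoided.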
Interrupt Service Routines: An interrupt service routine (ISR), also known as an interrupt handler, is a callback subroutine in an operating system or device driver whose execution is triggered by the reception of an interrupt. Interrupt handlers have a multitude of functions, which vary based on the reason the interrupt was generated and the speed at which the interrupt handler completes its task. An interrupt handler is a low-level counterpart of event handlers. These handlers are initiated by either hardware interrupts or interrupt instructions in software, and are used for servicing hardware devices and transitions between protected modes of operation such as system calls. In real-time operating systems, the interrupt latency, interrupt response time and the interrupt recovery time are very important.
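The idea of an ISR as a callback triggered by an interrupt can be illustrated with a software model of a vector table. This is only a sketch: on real hardware the CPU or interrupt controller performs the dispatch, and the names irq_register, irq_dispatch and button_isr are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_IRQS 8

typedef void (*isr_t)(void);

/* One handler slot per interrupt number; NULL means "no handler". */
static isr_t vector_table[NUM_IRQS];

/* Installs `handler` as the ISR for interrupt number `irq`. */
void irq_register(unsigned irq, isr_t handler)
{
    assert(irq < NUM_IRQS);
    vector_table[irq] = handler;
}

/* Models the reception of interrupt `irq`.
 * Returns 1 if a handler was called, 0 if none is installed. */
int irq_dispatch(unsigned irq)
{
    assert(irq < NUM_IRQS);
    if (vector_table[irq] == NULL) {
        return 0;
    }
    vector_table[irq]();
    return 1;
}

/* Example ISR: kept short, it only records the event so that a task
 * can process it later. */
static volatile unsigned button_events;
void button_isr(void) { button_events++; }
```

Keeping the handler this small is a common design choice, since the time spent inside an ISR adds directly to the latency seen by other interrupts and tasks.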
Interrupt Latency: It is the time between the generation of an interrupt by a device and the servicing of the device which generated the interrupt. For many operating systems, devices are serviced as soon as the device's interrupt handler is executed. Interrupt latency may be affected by interrupt controllers, interrupt masking, and the operating system's (OS) interrupt handling methods.
Interrupt Response Time: The time between receipt of the interrupt signal and the start of the code that handles the interrupt is called the interrupt response time.
Interrupt Recovery Time: The time required for the CPU to return to the interrupted code or to the highest priority task is called the interrupt recovery time.
Semaphores: A semaphore is a protected variable or abstract data type that provides a simple but useful abstraction for controlling access by multiple processes to a common resource in a parallel programming environment. A useful way to think of a semaphore is as a record of how many units of a particular resource are available, coupled with operations to safely (i.e., without race conditions) adjust that record as units are required or become free, and if necessary to wait until a unit of the resource becomes available. Semaphores are a useful tool in the prevention of race conditions and deadlocks; however, their use is by no means a guarantee that a program is free from these problems. Semaphores which allow an arbitrary resource count are called counting semaphores, whilst semaphores which are restricted to the values 0 and 1 (or locked/unlocked, unavailable/available) are called binary semaphores. The semaphore concept was invented by the Dutch computer scientist Edsger Dijkstra, and it has found widespread use in a variety of operating systems.
The OS function calls provided for semaphore management are: create a semaphore, delete a semaphore, acquire a semaphore, release a semaphore, and query a semaphore.
Types of Semaphores: We will consider three different types of semaphores: binary semaphores, counting semaphores and mutexes.
Binary Semaphores: A binary semaphore is a synchronization object that can have only two states: not taken and taken. Two operations are defined:
Take: Taking a binary semaphore brings it into the taken state. Trying to take a semaphore that is already taken places the invoking thread in a waiting queue.
Release: Releasing a binary semaphore brings it into the not taken state if there are no queued threads. If there are queued threads, a thread is removed from the queue and resumed, and the binary semaphore remains in the taken state. Releasing a semaphore that is already in its not taken state has no effect.
Binary semaphores have no ownership attribute and can be released by any thread or interrupt handler regardless of who performed the last take operation. Because of this, binary semaphores are often used to synchronize threads with external events implemented as ISRs, for example waiting for a packet from a network or waiting for a button to be pressed. Because there is no ownership concept, a binary semaphore object can be created in either the taken or the not taken state initially.
Counting Semaphores: A counting semaphore is a synchronization object that can have an arbitrarily large number of states. The internal state is defined by a signed integer variable, the counter. The counter value (N) has a precise meaning:
Negative: there are exactly -N threads queued on the semaphore.
Zero: there are no waiting threads; a wait operation would queue the invoking thread.
Positive: there are no waiting threads; a wait operation would not queue the invoking thread.
Two operations are defined for counting semaphores:
Wait: This operation decreases the semaphore counter; if the result is negative, the invoking thread is queued.
Signal: This operation increases the semaphore counter; if the result is non-positive (zero or negative), a waiting thread is removed from the queue and resumed.
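The counter rules for a counting semaphore can be checked with a small single-threaded model. This sketch tracks only the counter: a real kernel would also maintain the queue of blocked threads and perform the context switches. The names are hypothetical. Note that a queued thread is resumed by a signal exactly when the incremented counter is still zero or negative, because only then was there at least one waiter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of the counter only; negative N means -N threads are queued. */
typedef struct {
    int counter;
} csem_t;

/* The initial counter value must be non-negative. */
void csem_init(csem_t *s, int initial)
{
    assert(s != NULL && initial >= 0);
    s->counter = initial;
}

/* Wait: decrement; returns true if the caller would have to be queued. */
bool csem_wait(csem_t *s)
{
    s->counter--;
    return s->counter < 0;
}

/* Signal: increment; returns true if a queued thread would be resumed. */
bool csem_signal(csem_t *s)
{
    s->counter++;
    return s->counter <= 0;
}

/* Number of threads currently queued on the semaphore. */
int csem_waiters(const csem_t *s)
{
    return s->counter < 0 ? -s->counter : 0;
}
```

Starting from a counter of 1, the first wait succeeds immediately, the second would block its caller (the counter becomes -1, so one waiter), and the next signal would resume that waiter while bringing the counter back to 0.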
Counting semaphores have no ownership attribute and can be signaled by any thread or
interrupt handler regardless of who performed the last wait operation. Because there is no ownership concept, a counting semaphore object can be created with any initial counter value as long as it is non-negative. Counting semaphores are usually used as guards of resources available in a discrete quantity. For example, the counter may represent the number of used slots in a circular queue: producer threads signal the semaphore when inserting items in the queue, and consumer threads wait for an item to appear in the queue. This ensures that no consumer can fetch an item from the queue when no items are available.
Mutexes: A mutex is a synchronization object that can have only two states: not owned and owned. Two operations are defined for mutexes:
Lock: This operation attempts to take ownership of a mutex. If the mutex is already owned by another thread, the invoking thread is queued.
Unlock: This operation relinquishes ownership of a mutex. If there are queued threads, a thread is removed from the queue and resumed, and ownership is implicitly assigned to that thread.
Note that, unlike semaphores, mutexes do have owners. A mutex can be unlocked only by the thread that owns it. This precludes the use of mutexes from interrupt handlers but enables the implementation of the Priority Inheritance protocol, and most RTOSs implement this protocol in order to address the Priority Inversion problem. It must be said that few RTOSs implement this protocol fully (with any number of threads and mutexes involved), and even fewer do so efficiently. Mutexes have one single use, mutual exclusion, and are optimized for that. Semaphores can also handle mutual exclusion scenarios but are best used as a communication mechanism between threads or between ISRs and threads.
The OS function calls provided for mutex management are: create a mutex, delete a mutex, acquire a mutex, release a mutex, query a mutex, and wait on a mutex.
Difference between Mutex & Semaphore: Mutexes are typically used to serialize access to a section of code that is not re-entrant, i.e. that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section. A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore).
Deadlock: A condition that occurs when two processes are each waiting for the other to complete before proceeding. The result is that both processes hang. Deadlocks occur most commonly in multitasking and client/server environments. Ideally, the programs that are deadlocked, or the operating system, should resolve the deadlock, but this doesn't always happen. In order for deadlock to occur, four conditions must be true:
Mutual Exclusion: Each resource is either currently allocated to exactly one process or it is available.
Hold and Wait: Processes currently holding resources can request new resources.
No Preemption: Once a process holds a resource, it cannot be taken away by another process or the kernel.
Circular Wait: Each process is waiting to obtain a resource which is held by another process.
Mailboxes: Messages are sent to a task using a kernel service called a message mailbox. A mailbox is basically a pointer-sized variable. Tasks or ISRs can deposit and receive messages (the pointer) through the mailbox. A task looking for a message from an empty mailbox is blocked and placed on a waiting list for a time (the time-out specified by the task) or until a message is received. When a message is sent to the mailbox, either the highest priority task waiting for the message is given the message (priority-based mailbox) or the first task to request the message is given the message (FIFO-based mailbox).
A mailbox object is just like our postal mailbox. Someone posts a message in our mailbox and we take out the message. A task can have a mailbox into which others can post a mail. A task or ISR sends the message to the mailbox. To manage the mailbox object, the following function calls are provided in the OS API: create a mailbox, delete a mailbox,
query a mailbox, post a message in a mailbox, and read a message from a mailbox.
Message Queues: A message queue is used to send one or more messages to a task. Basically, a queue is an array of mailboxes. Tasks and ISRs can send messages to and receive messages from the queue through services provided by the kernel. Extraction of messages from a queue may follow FIFO or LIFO order. When a message is delivered to the queue, either the highest priority task (priority based) or the first task that requested the message (FIFO based) is given the message. Some of the applications of message queues are: taking input from a keyboard; displaying output; reading voltages from sensors or transducers; data packet transmission in a network. In each of these applications, a task or an ISR deposits the message in the message queue, and other tasks can take the messages. Depending on the application, the highest priority task or the first task waiting in the queue can take the message. At the time of creating a queue, the queue is given a name or ID, a queue length, a sending task waiting list and a receiving task waiting list.
The following function calls are provided to manage message queues: create a queue, delete a queue, flush a queue, post a message in a queue, post a message at the front of a queue, read a message from a queue, broadcast a message, show queue information, and show the queue waiting list.
Event Registers: Some kernels provide a special register as part of each task's control block, as shown below. This register, called an event register, is an object belonging to a task and consists of a group of binary event flags used to track the occurrence of specific events. Depending on a given kernel's implementation of this mechanism, an event register can be 8, 16 or 32 bits wide, maybe even more. Each bit in the event register is treated like a binary flag and can be either set or cleared. Through the event register, a task can check for the presence of particular events that can control its execution. An external source, such as another task or an ISR, can set bits in the event register to inform the task that a particular event has occurred.
For managing the event registers, the following function calls are provided: create an event register, delete an event register, query an event register, set an event register, and clear an event flag.
Pipes: Pipes are kernel objects that provide unstructured data exchange and facilitate synchronization among tasks. In a traditional implementation, a pipe is a unidirectional data exchange facility, as shown in the figure below. Two descriptors, one for each end of the pipe (one end for reading and one for writing), are returned when the pipe is created. Data is written via one descriptor and read via the other. The data remains in the pipe as an unstructured byte stream. Data is read from the pipe in FIFO order. A pipe provides a simple data flow facility so that the reader becomes blocked when the pipe is empty, and the writer becomes blocked when the pipe is full. Typically, a pipe is used to exchange data between a data-producing task and a data-consuming task, as shown in the figure below. It is also permissible to have several writers for the pipe with multiple readers on it.
The function calls in the OS API to manage the pipes are: create a pipe, open a pipe, close a pipe, read from the pipe, and write to the pipe.
Signals: A signal is a software interrupt that is generated when an event has occurred. It diverts the signal receiver from its normal execution path and triggers the associated asynchronous processing. Essentially, signals notify tasks of events that occurred during the execution of other tasks or ISRs. As with normal interrupts, these events are asynchronous to the notified task and do not occur at any predetermined point in the task's execution. The difference between a signal and a normal interrupt is that signals are so-called software interrupts, which are generated via the execution of some software within the system. By contrast, normal interrupts are usually generated
by the arrival of an interrupt signal on one of the CPU's external pins. They are not generated by software within the system but by external devices. The number and type of signals defined is both system-dependent and RTOS-dependent. An easy way to understand signals is to remember that each signal is associated with an event. The event can be either unintentional, such as an illegal instruction encountered during program execution, or intentional, such as a notification to one task from another that it is about to terminate. While a task can specify the particular actions to undertake when a signal arrives, the task has no control over when it receives signals. Consequently, the signal arrivals often appear quite random, as shown in the figure below.
When a signal arrives, the task is diverted from its normal execution path, and the corresponding signal routine is invoked. The terms signal routine, signal handler, asynchronous event handler, and asynchronous signal routine are interchangeable. These notes use the term asynchronous signal routine (ASR). Each signal is identified by an integer value, which is the signal number or vector number.
The function calls to manage a signal are: install a signal handler, remove an installed signal handler, send a signal to another task, block a signal from being delivered, unblock a blocked signal, and ignore a signal.
Timers: A timer is the scheduling of an event according to a predefined time value in the future, similar to setting an alarm clock. For instance, the kernel has to keep track of different times:
A particular task may need to be executed periodically, say, every 10 ms. A timer is used to keep track of this periodicity.
A task may be waiting in a queue for an event to occur. If the event does not occur within a specified time, the task has to take appropriate action.
A task may be waiting in a queue for a shared resource. If the resource is not available within a specified time, an appropriate action has to be taken.
The following function calls are provided to manage the timer: get time, set time, time delay (in system clock ticks), time delay (in seconds), and reset timer.
Memory Management: Most RTOSs have some kind of memory management subsystem. Although some offer the equivalent of the C library functions malloc and free, real-time systems engineers often avoid these two functions because they are typically slow and because their execution times are unpredictable. They favor instead functions that allocate and free fixed-size buffers, and most RTOSs offer fast and predictable functions for the purpose. The memory manager allocates memory to the processes and manages it with appropriate protection. There may be static and dynamic allocations of memory. The manager optimizes the memory needs and memory utilization. An RTOS may disable support for dynamic block allocation, MMU support for dynamic page allocation, and dynamic binding, as these increase the latency of servicing the tasks and ISRs. An RTOS may or may not support memory protection, in order to reduce the latency and memory needs of the processes.
The API provides the following function calls to manage memory: create a memory block, get data from memory, post data in the memory, query a memory block, and free the memory block.
Priority Inversion Problem: Priority inversion is a situation in which a low-priority task executes while a higher priority task waits on it due to resource contention.
In scheduling, priority inversion is the scenario where a low priority task holds a shared resource that is required by a high priority task. This causes the execution of the high priority task to be blocked until the low priority task has released the resource, effectively inverting the relative priorities of the two tasks. If some other medium priority task, one that does not depend on the shared resource, attempts to run in the interim, it will take precedence over both the low priority task and the high priority task. Priority inversion will cause problems in real-time systems, reduce the performance of the system, and may reduce the system responsiveness, which leads to the violation of response time guarantees. This problem can be avoided by implementing the Priority Inheritance Protocol (PIP): The Priority Inheritance Protocol is a resource access control protocol that raises the priority of a task, if that task holds a resource
being requested by a higher priority task, to the same priority level as the higher priority task.

Priority Ceiling Protocol (PCP): The priority ceiling protocol is a synchronization protocol for shared resources that avoids unbounded priority inversion and the mutual deadlock caused by incorrectly nested critical sections. In this protocol each resource is assigned a priority ceiling, which is a priority equal to the highest priority of any task which may lock the resource.

Embedded Operating Systems: An embedded operating system is an operating system for embedded computer systems. These operating systems are designed to be compact, efficient, and reliable, forsaking many functions that non-embedded computer operating systems provide, and which may not be used by the specialized applications they run. They are frequently also real-time operating systems, and the term RTOS is often used as a synonym for embedded operating system. An important difference between most embedded operating systems and desktop operating systems is that the application, including the operating system, is usually statically linked together into a single executable image. Unlike a desktop operating system, the embedded operating system does not load and execute applications. This means that the system is only able to run a single application.

Embedded Linux: Embedded Linux is the use of Linux in embedded computer systems such as mobile phones, personal digital assistants, media players, set-top boxes, and other consumer electronics devices, networking equipment, machine control, industrial automation, navigation equipment and medical instruments. The open source software movement gained wide momentum with the development of Linux. It is expected that open source software will have a profound impact on the industry in the coming years, and Embedded Linux is likely to be used extensively in embedded applications. It is now being used on a number of devices such as PDAs, set-top boxes, Internet appliances, and cellular phones, and also in major mission-critical equipment such as telecommunication switches and routers. The Embedded Linux Consortium released the Embedded Linux Consortium Platform Specifications (ELCPS) to bring in standardization for using Linux in embedded systems. ELCPS defines three environments to cater to different embedded system requirements: the minimal system environment, the intermediate system environment and the full system environment.

Real-Time Operating System (RTOS): A real-time operating system (RTOS) is an operating system (OS) intended to serve real-time application requests. A key characteristic of an RTOS is the level of its consistency concerning the amount of time it takes to accept and complete an application's task; the variability is called jitter. A hard real-time operating system has less jitter than a soft real-time operating system. The chief design goal is not high throughput, but rather a guarantee of a soft or hard performance category. An RTOS that can usually or generally meet a deadline is a soft real-time OS, but if it can meet a deadline deterministically it is a hard real-time OS. A real-time OS has an advanced algorithm for scheduling. Scheduler flexibility enables a wider, computer-system orchestration of process priorities, but a real-time OS is more frequently dedicated to a narrow set of applications. Key factors in a real-time OS are minimal interrupt latency and minimal thread switching latency, but a real-time OS is valued more for how quickly or how predictably it can respond than for the amount of work it can perform in a given period of time.

RTLinux: RTLinux or RTCore is a hard real-time RTOS microkernel that runs the entire Linux operating system as a fully preemptive process. It was developed by Victor Yodaiken (Yodaiken 1999), Michael Barabanov (Barabanov 1996), Cort Dougan and others at the New Mexico Institute of Mining and Technology and then as a commercial product at FSMLabs. Wind River Systems acquired FSMLabs embedded technology in February 2007 and now makes a version available as Wind River Real-Time Core for Wind River Linux. RTLinux was based on a lightweight virtual machine where the Linux "guest" was given a virtualized interrupt controller and timer, and all other hardware access was direct. From the point of view of the real-time "host", the Linux kernel is a thread. Interrupts needed for deterministic processing are processed by the real-time core, while other interrupts are forwarded to Linux, which runs at a lower priority than real-time threads. Linux drivers handle almost all I/O. First-In-First-Out pipes (FIFOs) or shared memory can be used to share data between the operating system and RTCore. RTLinux runs underneath the Linux OS, and Linux runs as the idle task of RTLinux. The real-time software running under RTLinux is given priority over the non-real-time threads running under Linux. This OS is an excellent choice for 32-bit processor based embedded systems.
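The priority inversion scenario and the priority inheritance protocol described earlier can be illustrated with a toy discrete-time simulation. This is a sketch under simplifying assumptions (one processor, one shared mutex, one operation per tick, a larger number meaning a higher priority); the `Task` class, the `simulate` function and the task programs are hypothetical and do not correspond to any specific RTOS.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Task:
    name: str
    priority: int   # larger number = higher priority
    release: int    # tick at which the task becomes ready
    program: str    # one char per tick: 'c' = compute, 'x' = critical section
    pc: int = 0     # index of the next operation

    def done(self):
        return self.pc >= len(self.program)

    def wants_lock(self):
        return not self.done() and self.program[self.pc] == "x"


def simulate(tasks, inheritance):
    """Return {task name: finish tick} for tasks sharing a single mutex."""
    owner = None
    finish = {}
    tick = 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t.release <= tick and not t.done()]
        blocked = [t for t in ready
                   if t.wants_lock() and owner is not None and owner is not t]
        runnable = [t for t in ready if t not in blocked]

        def effective(t):
            # With inheritance, the mutex owner runs at the priority of the
            # highest priority task it is currently blocking.
            if inheritance and t is owner and blocked:
                return max([t.priority] + [b.priority for b in blocked])
            return t.priority

        if runnable:
            cur = max(runnable, key=effective)
            if cur.wants_lock():
                owner = cur          # enter (or stay in) the critical section
            cur.pc += 1
            if owner is cur and not cur.wants_lock():
                owner = None         # left the critical section: release mutex
            if cur.done():
                finish[cur.name] = tick + 1
        tick += 1
    return finish


def make_tasks():
    return [
        Task("L", priority=1, release=0, program="xxxx"),
        Task("H", priority=3, release=1, program="xx"),
        Task("M", priority=2, release=2, program="cccc"),
    ]


without_pip = simulate(make_tasks(), inheritance=False)
with_pip = simulate(make_tasks(), inheritance=True)
print(without_pip)  # {'M': 6, 'L': 8, 'H': 10}
print(with_pip)     # {'L': 4, 'H': 6, 'M': 10}
```

Without inheritance, the medium priority task M delays the high priority task H until tick 10 even though M never touches the shared resource. With inheritance, L is temporarily raised to the priority of H, finishes its critical section first, and H completes at tick 6.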
Handheld Operating Systems: Handheld computers are becoming very popular as their capabilities are increasing day by day. A handheld operating system, also known as a mobile OS or mobile platform, is the operating system that controls a mobile device or information appliance, similar in principle to an operating system such as Windows, Mac OS, or Linux that controls a desktop computer or laptop. However, handheld operating systems are currently somewhat simpler, and deal more with the wireless versions of broadband and local connectivity, mobile multimedia formats, and different input methods. Typical examples of devices running a mobile operating system are smart phones, personal digital assistants (PDAs), tablet computers and information appliances, or what are sometimes referred to as smart devices, which may also include embedded systems, or other mobile devices and wireless devices. The important requirements for a mobile operating system are:
- To keep the cost of the handheld computer low, a small OS footprint is required.
- The OS should have support for soft real-time performance.
- A TCP/IP stack needs to be integrated along with the OS.
- Communication protocol stacks for Infrared, Bluetooth, and IEEE 802.11 interfaces need to be integrated.
- There should be support for data synchronization.

Windows CE: The Handheld PC (H/PC) was a hardware design for PDA devices running Windows CE. It provides the appointment calendar functions usual for any PDA. The intent of Windows CE was to provide an environment for applications compatible with the Microsoft Windows operating system, on processors better suited to low-power operation in a portable device. Originally announced in 1996, the Handheld PC is distinct from its more recent counterparts such as the Palm-Size PC, Pocket PC, or Smartphone in that the specification provides for larger screen sizes as well as a keyboard.
To be classed as a Windows CE Handheld PC the device must:
- Run Microsoft's Windows CE
- Be bundled with an application suite only found through an OEM Platform Release and not in Windows CE itself
- Use ROM
- Have a screen supporting a resolution greater than or equal to 480×240
- Include a keyboard
- Include a PC Card slot
- Include an infrared (IrDA) port
- Provide wired serial and/or USB connectivity

Examples of Handheld PC devices are the NEC MobilePro 900c, HP 320LX, HP Jornada 720, and Vadem Clio. Microsoft stopped developing for the Handheld PC in 2000, instead focusing development on the Pocket PC and Windows Mobile. Other Handheld PCs may not use Windows CE. Windows CE devices which match all of the hardware requirements of the H/PC specification but lack a keyboard are known as Windows CE Tablet PC or Webpad devices.

Design Technology: There has been tremendous interest over the past few decades in developing design technologies that will enable designers to produce transistors more rapidly. These technologies have been developed for both software and hardware, but the recent developments in hardware design technology deserve special attention since they've brought us to a new era in embedded system design. Design is the task of defining a system's functionality and converting that functionality into a physical implementation, while satisfying certain constrained design metrics and optimizing other design metrics. Design is hard. Just getting the functionality right is tricky because embedded system functionality can be very complex, with millions of possible environment scenarios that must be responded to properly. Not only is getting the functionality right hard, but creating a physical implementation that satisfies constraints is also very difficult because there are so many competing, tightly constrained metrics. These difficulties slow designer productivity.
Embedded system designer productivity can be measured by software lines of code produced per month or hardware transistors produced per month. Productivity numbers are surprisingly low, with some studies showing just tens of lines of code or just hundreds of transistors produced per designer-day. In response to low production rates, the design community has focused much effort and resources on developing design technologies that improve productivity. We can classify many of those technologies into three general techniques:
- Automation is the task of using a computer program to replace manual design effort.
- Reuse is the process of using predesigned components rather than designing those components oneself.
- Verification is the task of ensuring the correctness and completeness of each design step.

Automation: Synthesis:

The Parallel Evolution of Compilation and Synthesis: Because of the different techniques used to design software and hardware, a division between the fields of hardware design and software design occurred. Design tools simultaneously evolved in both fields.
As shown in the above figure, early software consisted of machine instructions, coded as sequences of 0s and 1s, necessary to carry out the desired system behavior on a general-purpose processor. A collection of machine instructions was called a program. As program sizes grew from 100s of instructions to 1000s of instructions, the tediousness of dealing with 0s and 1s became evident, resulting in the use of assemblers and linkers. These tools automatically translate assembly instructions, consisting of instructions written using letters and numbers to represent symbols, into equivalent machine instructions. Soon, the limitations of assembly instructions became evident for programs consisting of 10s of 1000s of instructions, resulting in the development of compilers. Compilers automatically translate sequential programs, written in a high-level language like C, into equivalent assembly instructions. Compilers became quite popular starting in the 1960s, and their popularity has continued to grow. Tools like assemblers/linkers, and then compilers, helped software designers climb to higher abstraction levels.

Early hardware consisted of circuits of interconnected logic gates. As circuit sizes grew from 1000s of gates to 10s of 1000s, the tediousness of dealing with gates became apparent, resulting in the development of logic synthesis tools. These tools automatically convert logic equations or finite-state machines into logic gates. As circuit sizes continued to grow, register-transfer (RT) synthesis tools evolved. These tools automatically convert finite-state machines with datapaths (FSMDs) into finite-state machines (FSMs), logic equations, and predesigned RT components like registers and adders. In the 1990s, behavioral synthesis tools started to appear, which convert sequential programs into FSMDs.

Hardware design involves many more design dimensions. While a compiler must generate assembly instructions to implement a sequential program on a given processor, a synthesis tool must actually design the processor itself. To implement behavior in hardware rather than software implies that one is extremely concerned about size, performance, power, and/or other design metrics. Therefore, optimization is crucial, and humans tend to be far better at multidimensional optimization than are computers, as long as the problem size is not too large and enough design time is available.

Both hardware and software design fields have continued to focus design effort on increasingly higher abstraction levels. Starting design from a higher abstraction level has two advantages. First, descriptions at higher levels tend to be smaller and easier to capture. Second, a description at a higher abstraction level has many more possible implementations than those at lower levels. By contrast, a logic-level description may have transistor implementations varying in performance and size by only a factor of two or so.

Synthesis Levels (Gajski Chart or Y-Chart): A more detailed abstraction model is the Y-chart, introduced by Gajski and Kuhn in 1983. With this well-known chart it is possible to visualize design views as well as design hierarchies. It is widely used within VHDL design and can give us an idea for modeling abstraction levels, too. The name Y-chart arises from the three different design views, which are shown as radial axes forming a Y.
Five concentric circles characterize the hierarchical levels within the design process, with increasing abstraction from the inner to the outer circle. Each circle characterizes a model. Before turning to the models, we first look at the three domains:
- Behavior: This domain describes the temporal and functional behavior of a system.
- Structure: A system is assembled from subsystems. Here the different subsystems and their interconnections are considered for each level of abstraction.
- Geometry: Important in this domain are the geometric properties of the system and its subsystems, so there is information about the size, the shape and the physical placement. This is where the restrictions on what can be implemented appear, e.g., with respect to the length of connections.

With these three domains the most important properties of a system can be well specified. The domain axes intersect with the circles that show the abstraction levels. The five circles from highest to lowest level of abstraction (outer to inner circles) are:
- Architectural: A system's requirements and its basic concepts for meeting the requirements are specified here.
- Algorithmic: The "how" aspect of a solution is refined. Functional descriptions of how the different subsystems interact, etc., are included.
- Functional Block (Register-Transfer): This level contains detailed descriptions of what is going on: from which register, over which line, and to where data is transferred.
- Logic: The single logic cell is the focus here. This is not limited to AND and OR gates; flip-flops and the interconnections are also specified.
- Circuit: This is the actual hardware level. The transistor with its electrical characteristics is used to describe the system. Information from this level printed on silicon results in the chip.
Logic Synthesis: Logic synthesis automatically converts a logic-level behavior, consisting of logic equations and/or an FSM, into a structural implementation, consisting of connected gates. Let us divide logic synthesis into combinational-logic synthesis and FSM synthesis. Combinational logic synthesis can be further subdivided into two-level minimization and multilevel minimization.

Two-Level Logic Minimization: We can represent any logic function as a sum of products (or product of sums). We can implement this function directly using a level consisting of AND gates, one for each product term, and a second level consisting of a single OR gate. Thus, we have two levels, plus inverters necessary to complement some inputs to the AND gates. The longest possible path from an input signal to an output signal passes through at most two gates, not counting inverters. We cannot in general obtain faster performance.

Multi-Level Logic Minimization: Typical practical implementations of a logic function utilize a multi-level network of logic elements. Starting from an RTL description of a design, the synthesis tool constructs a corresponding multilevel Boolean network. Next, this network is optimized using several technology-independent techniques before technology-dependent optimizations are performed. The typical cost function during technology-independent optimizations is total literal count of the factored representation of the logic function. Finally, technology-dependent optimization transforms the technology-independent circuit into a network of gates in a given technology. The simple cost estimates are replaced by more concrete, implementation-driven estimates during and after technology mapping. Mapping is constrained by factors such as the available gates (logic functions) in the technology library, the drive sizes for each gate, and the delay, power, and area characteristics of each gate.
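To make the literal-count cost function concrete, the sketch below compares a two-level sum-of-products form with a factored multi-level form of the same function. The function f = ac + ad + bc + bd + e is an arbitrary illustration chosen here, not one taken from these notes, and counting letters in an expression string is only a simple stand-in for how a synthesis tool would count literals.

```python
from itertools import product


def sop(a, b, c, d, e):
    """Two-level form: four AND gates and e feeding a single OR gate."""
    return (a and c) or (a and d) or (b and c) or (b and d) or e


def factored(a, b, c, d, e):
    """Multi-level factored form of the same function (OR, AND, OR levels)."""
    return ((a or b) and (c or d)) or e


def literal_count(expression):
    """Count variable occurrences in an expression written with single letters."""
    return sum(ch.isalpha() for ch in expression)


# Exhaustive check over all 32 input combinations.
equivalent = all(
    bool(sop(*bits)) == bool(factored(*bits))
    for bits in product([False, True], repeat=5)
)

sop_literals = literal_count("ac + ad + bc + bd + e")
factored_literals = literal_count("(a + b)(c + d) + e")

print(equivalent)          # True
print(sop_literals)        # 9
print(factored_literals)   # 5
```

The factored form needs fewer literals (5 instead of 9) at the price of a longer path: three gate levels instead of two. This is the size versus delay trade-off between multi-level and two-level implementations.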
FSM logic synthesis can be further subdivided into state minimization and state encoding.

State Minimization: It reduces the number of FSM states by identifying and merging equivalent states. Reducing the number of states may result in a smaller state register and fewer gates. Two states are equivalent if their output and next states are equivalent for all possible inputs.

State Encoding: It encodes each state as a unique bit sequence, such that some design metric like size is optimized.

Register-Transfer Synthesis: Logic synthesis allowed us to describe our system as Boolean equations, or as an FSM. However, many systems are too complex to initially describe at this logic level of abstraction. Instead, we often describe our system using a more abstract computation model, such as an FSMD. Recall that an FSMD allows variable declarations of complex data types, and allows arithmetic actions and conditions. Clearly, more work is necessary to convert an FSMD to gates than to convert an FSM to gates, and this extra work is performed by RT synthesis. RT synthesis takes an FSMD and converts it to a custom single-purpose processor, consisting of a data-path and an FSM controller. In particular, it generates a complete data-path, consisting of register units to store variables, functional units to implement arithmetic operations, and connection units to connect these other units. It also generates an FSM that controls this data-path. Creating the data-path requires solving two key sub-problems: allocation and binding. Allocation is the problem of instantiating storage units, functional units, and connection units. Binding is the problem of mapping FSMD operations to specific units. As in logic synthesis, both of these RT synthesis problems are hard to solve optimally.
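The state-equivalence rule given under State Minimization above can be turned into a small algorithm. The sketch below uses iterative partition refinement, a standard textbook technique and not necessarily what a given synthesis tool does, on a hypothetical five-state Moore machine. The function `minimize` and the example machine are assumptions made for illustration.

```python
def minimize(states, inputs, output, next_state):
    """Group equivalent Moore-machine states by iterative partition refinement.

    Returns a dict mapping each state to the id of its equivalence class.
    """
    # Start by separating states only by their output.
    block = {s: output[s] for s in states}
    while True:
        # Signature: the state's current block plus the blocks of its successors.
        signature = {
            s: (block[s],) + tuple(block[next_state[s][i]] for i in inputs)
            for s in states
        }
        # Renumber the distinct signatures into compact block ids.
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        # Each pass only splits blocks, so an unchanged count means a fixed point.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


# Hypothetical machine: A and B behave identically, and so do C and D.
states = ["A", "B", "C", "D", "E"]
inputs = [0, 1]
output = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 1}
next_state = {
    "A": {0: "B", 1: "C"},
    "B": {0: "A", 1: "D"},
    "C": {0: "E", 1: "A"},
    "D": {0: "E", 1: "B"},
    "E": {0: "E", 1: "A"},
}

classes = minimize(states, inputs, output, next_state)
print(classes)  # {'A': 0, 'B': 0, 'C': 1, 'D': 1, 'E': 2}
```

Five states collapse to three classes, so the state register shrinks from three bits to two under a plain binary encoding.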
Behavioral Synthesis: Behavioral synthesis, sometimes referred to as C synthesis, electronic system level (ESL) synthesis, algorithmic synthesis, or high-level synthesis (HLS), is an automated design process that interprets an algorithmic description of a desired behavior and creates hardware that implements that behavior. Behavioral synthesis is the enabling technology for implementing a practical methodology for high-level design. It fits in with existing design flows and can be selectively applied to portions of a design that will derive the greatest benefit from using a higher level of abstraction and the automation that it provides.

Behavioral synthesis allows design at higher levels of abstraction by automating the translation and optimization of a behavioral description, or high-level model, into an RTL implementation. It transforms un-timed or partially timed functional models into fully timed RTL implementations. Because a micro-architecture is generated automatically, designers can focus on designing and verifying the module functionality. Design teams can create and verify their designs in as much as an order of magnitude less time, because behavioral synthesis eliminates the need to fully schedule and allocate design resources by hand, as existing RTL methods require. This behavioral design flow increases design productivity, reduces errors, and speeds verification.
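Turning an un-timed description into a fully timed one means assigning every operation to a clock cycle. The sketch below shows one simple way to do this, as-soon-as-possible (ASAP) scheduling, followed by a count of the functional units the schedule implies. It assumes single-cycle operations and no resource limits. The dataflow graph for y = (a*b) + (c*d) + e and the names `asap_schedule` and `units_needed` are hypothetical, and real behavioral synthesis tools use considerably more elaborate algorithms.

```python
from collections import Counter

# Hypothetical dataflow graph for y = (a*b) + (c*d) + e.
# operation name -> (operation type, operations whose results it consumes)
ops = {
    "m1": ("mul", []),            # a * b
    "m2": ("mul", []),            # c * d
    "s1": ("add", ["m1", "m2"]),  # m1 + m2
    "s2": ("add", ["s1"]),        # s1 + e
}


def asap_schedule(ops):
    """Assign each operation the earliest cycle allowed by its dependencies."""
    cycle = {}

    def visit(name):
        if name not in cycle:
            preds = ops[name][1]
            # One cycle after the latest predecessor; cycle 0 if there is none.
            cycle[name] = 1 + max((visit(p) for p in preds), default=-1)
        return cycle[name]

    for name in ops:
        visit(name)
    return cycle


def units_needed(ops, cycle):
    """Functional units per type = peak number of same-type ops in one cycle."""
    usage = Counter((cycle[name], ops[name][0]) for name in ops)
    need = {}
    for (_, kind), count in usage.items():
        need[kind] = max(need.get(kind, 0), count)
    return need


schedule = asap_schedule(ops)
units = units_needed(ops, schedule)
print(schedule)  # {'m1': 0, 'm2': 0, 's1': 1, 's2': 2}
print(units)     # {'mul': 2, 'add': 1}
```

The result takes three cycles and needs two multipliers and one adder. Moving `m2` to cycle 1 would save a multiplier but lengthen the schedule, which is the kind of trade-off that timing directives steer.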
The behavioral synthesis process incorporates a number of complex stages. This process starts with a high-level language description of a module's behavior, including I/O actions and computational functionality. Several algorithmic optimizations are performed to reduce the complexity of a result and then the description is analyzed to determine the essential operations and the dataflow dependencies between them. The other inputs to the synthesis process are a target technology library, for the selected fabrication process, and a set of directives that will influence the resulting architecture. The directives include timing constraints used by the algorithms, as they create a cycle-by-cycle schedule of the required operations. The allocation and binding algorithms assign these operations to specific functional units such as adders, multipliers, comparators, etc. Finally, a state machine is generated that will control the resulting datapath's implementation of the desired functionality. The datapath and state machine outputs are in RTL code, optimized for use with conventional logic synthesis or physical synthesis tools.

System Synthesis & Hardware/Software Codesign: Behavioral synthesis converts a single sequential program (behavior) to a single-purpose processor (structure). The original behavior may be better described using multiple concurrently executing sequential programs, known as processes. System synthesis converts multiple processes into multiple processors. The term system here refers to a collection of processors. Given one or more processes, system synthesis involves several tasks:
- Transformation is the task of rewriting the processes to be more amenable to synthesis. Transformations include procedure inlining and loop unrolling.
- Allocation is the task of selecting the numbers and types of processors to use to implement the processes. Allocation includes selecting processors, memories, and buses. Allocation is essentially the design of the system architecture.
- Partitioning is the task of mapping the processes to processors. One process can be implemented on multiple processors, and multiple processes can be implemented on a single processor.
- Scheduling is the task of determining when each of the multiple processes on a single processor will have its chance to execute on the processor.

These tasks may be performed in a variety of orders, and iteration among the tasks is common. System synthesis is driven by constraints. For example, a minimum number of single-purpose processors might be used to meet the performance requirements. System synthesis for general-purpose processors only (software) has been around for a few decades, but hasn't been called system synthesis. Names like multiprocessing, parallel processing, and real-time scheduling have been more common. The joint consideration of general-purpose and single-purpose processors by the same automatic tools was in stark contrast to the prior art. Thus, the term hardware/software co-design has been used extensively in the research community, to highlight research that focuses on the unique requirements of such simultaneous consideration of both hardware and software during synthesis.

Verification: Verification is the task of ensuring that a design is correct and complete. Correctness means that the design implements its specification accurately. Completeness means that the design's specification describes appropriate output responses to all relevant input sequences. The two main approaches to verification are known as formal verification and simulation. Formal verification is an approach to verification that analyzes a design to prove or disprove certain properties. Simulation is an approach in which we create a model of the design that can be executed on a computer. We provide sample input values to this model, and check that the output values generated by the model match our expectations.
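As a small illustration of verification by simulation, the sketch below builds an executable gate-level model of a 4-bit ripple-carry adder and checks it against the expected arithmetic result for every input pair. The adder is a hypothetical design chosen for brevity. A design this small can be simulated exhaustively, whereas real designs usually rely on a selected set of sample inputs.

```python
def full_adder(a, b, cin):
    """Gate-level model of a one-bit full adder (XOR, AND and OR gates)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_adder(x, y, width=4):
    """Structural model: `width` full adders chained through their carries."""
    carry, result = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


# Test bench: apply every input pair and compare with the expected output.
mismatches = [
    (x, y)
    for x in range(16)
    for y in range(16)
    if ripple_adder(x, y) != ((x + y) & 0xF, (x + y) >> 4)
]

print(ripple_adder(9, 8))  # (1, 1): 9 + 8 = 17 = 16 + 1
print(mismatches)          # []
```

Because the model runs in software, any internal value, such as the carry between two stages, can be printed or stopped on at will.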
Compared with a physical implementation, simulation has several advantages with respect to testing and debugging a system. The two most important advantages are excellent controllability and observability. Controllability is the ability to control the execution of the system. Observability is the ability to examine system values. Simulation also has several disadvantages compared with a physical implementation:
- Setting up simulation could take much time for systems with complex external environments. A designer may spend more time modeling the external environment than the system itself.
- The models of the environment will likely be somewhat incomplete, so they may not model complex environment behavior correctly, especially when that behavior is undocumented.
- Simulation speed can be quite slow compared to execution of a physical implementation.

Hardware-Software Co-Simulation: An instruction-level model of a general-purpose processor is known as an instruction-set simulator (ISS). A simulator that is designed to hide the details of the integration of an ISS and an HDL simulator is known as a hardware-software co-simulator. While faster than HDL-only simulation and while capitalizing on the popularity of ISSs, co-simulators can still be quite slow if the general-purpose and single-purpose processors must communicate with one another frequently. As it turns out, in many embedded systems, those processors do have frequent communication. Therefore, modern hardware-software co-simulators do more than just integrate two simulators. They also seek to minimize the communication between those simulators.

Reuse: Intellectual Property Cores: Designers have always had at their disposal commercial off-the-shelf (COTS)
components, which they could purchase and use in building a given system. Using such predesigned and prepackaged ICs, each implementing general-purpose or single-purpose processors, greatly reduced design and debug time, as compared to building all system components from scratch. The trend of growing IC capacities is leading to all the components of a system being implemented on a single chip, known as a system-on-chip (SOC). This trend, therefore, is leading to a major change in the distribution of such off-the-shelf components. Rather than being sold as ICs, such components are increasingly being sold in the form of intellectual property, or IP. Specifically, they are sold as behavioral, structural or physical descriptions, rather than actual ICs. A designer can integrate those descriptions with other descriptions to form one large SOC description that can then be fabricated into a new IC. Processor-level components that are available in the form of IP are known as cores. Initially, the term core referred only to microprocessors, but now is used for nearly any general-purpose or single-purpose processor IP component. Cores come in three forms:
- A soft core is a synthesizable behavioral description of a component, typically written in an HDL like VHDL or Verilog.
- A firm core is a structural description of a component, again typically provided in an HDL.
- A hard core is a physical description, provided in any of a variety of physical layout file formats.
New Challenges Posed by Cores to Processor Providers: The advent of cores has dramatically changed the business model of vendors of general-purpose processors and standard single-purpose processors. Two key areas that have changed significantly are pricing models and IP protection.

New Challenges Posed by Cores to Processor Users: The advent of cores also poses new challenges to designers seeking to use a general-purpose or standard single-purpose processor. These include licensing arrangements, extra design effort, and verification.