Data Representation Notes
Binary System
- In the binary system, data is represented using two possible states, 0 and 1, that is, in
base 2.
- Most computing devices use binary numbering because it maps directly onto the voltage states
of an electronic circuit (like an on/off switch): no voltage is treated as 0 (off) and a voltage
input as 1 (on).
Advantages of binary codes
- Binary codes are suitable for the computer applications and digital communications.
- Binary codes make the analysis and designing of digital circuits easier since only 0 and 1 are
used.
- Calculations can be carried out using reasonably simple active electronics (simple transistor
circuits), since it only requires on and off (1 and 0) signals.
- Binary data is also reasonably simple to store, again needing only two-state storage (on/off,
1/0).
Disadvantages of binary codes
- Difficult for most people to read.
- It takes a lot of digits to represent any reasonable number (for instance up to 99 million takes 8
digits in Decimal and 27 digits in Binary).
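The digit counts quoted above can be checked with a few lines of Python. This is only a small sketch; the variable names are illustrative and not part of the notes.

```python
# Compare how many digits the same value needs in decimal and in binary.
value = 99_999_999  # the largest whole number below 100 million

decimal_digits = len(str(value))    # number of base-10 digits
binary_digits = value.bit_length()  # number of base-2 digits (bits)

print(decimal_digits)  # 8
print(binary_digits)   # 27
print(bin(value))      # the binary form, prefixed with 0b
```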
Octal Representation
- It is commonly used as a shorter representation of binary numbers by grouping binary digits
into threes.
- It is also used as a shorthand for representing file permissions on UNIX systems and for
writing UTF-8 byte values (for example, in octal escape sequences).
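As a rough illustration of grouping binary digits into threes, the sketch below converts a binary string to octal one group at a time. The function name binary_to_octal is made up for this example, and the permission pattern rwxr-xr-x is just a common sample value.

```python
def binary_to_octal(bits: str) -> str:
    """Convert a binary string to octal by grouping bits into threes."""
    # Pad on the left with zeros so the length is a multiple of 3.
    padded = bits.zfill((len(bits) + 2) // 3 * 3)
    groups = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    # Each 3-bit group maps to exactly one octal digit (0-7).
    return "".join(str(int(group, 2)) for group in groups)


# UNIX permissions rwxr-xr-x correspond to the bits 111 101 101.
print(binary_to_octal("111101101"))  # 755
print(oct(0b111101101))              # 0o755 (Python's built-in conversion)
```

Each octal digit stands for one set of read/write/execute bits, which is why permissions are usually written as three octal digits.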
Advantages of octal representation
- Easier to work with than binary.
- Conversions to and from binary are straightforward.
Disadvantages of octal representation
- Computers do not work with the octal number system directly, so octal values must be
converted to binary.
Hexadecimal Representation
- Hexadecimal can be used to write large binary numbers in just a few digits.
- It makes life easier as it allows grouping of binary numbers which makes it easier to read,
write and understand. It is more human-friendly, as humans are used to grouping together
numbers and things for easier understanding.
- Since computers can't understand hexadecimal, we get a piece of software to convert it to
binary for us.
- Hexadecimal is used to write instruction codes in the computer.
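The grouping idea can be sketched in Python: each group of four binary digits becomes one hexadecimal digit. The function name binary_to_hex and the sample bit pattern are assumptions made for this example.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hexadecimal by grouping bits into fours."""
    # Pad on the left with zeros so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each 4-bit group maps to exactly one hexadecimal digit (0-9, A-F).
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("1111111100001010"))  # FF0A
print(int("FF0A", 16))                    # 65290
print(format(0xFF0A, "016b"))             # 1111111100001010
```

Sixteen binary digits shrink to four hexadecimal digits, and the built-in int and format calls show the software conversion back to binary.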
Advantages of hexadecimal representation
- Easier to work with and read than a chain of binary digits.
- Easy to convert to and from binary.
- It is easier to read, write and understand.
- It is more human-friendly, as humans are used to grouping together numbers and things for
easier understanding.
Disadvantages of hexadecimal representation
- Computers do not work with the hexadecimal number system directly, so hexadecimal values
must be converted to binary.
Character set
- A character set refers to all the characters that a computer system can recognise, which often
equates to characters on the keyboard.
- The characters within a character set can be text, number or even symbols.
- Each character is represented as a number.
- Examples of character set used by general computers to represent characters are ASCII,
UNICODE, BCD and EBCDIC.
American Standard Code for Information Interchange (ASCII)
- The ASCII code associates an integer value for each symbol in the character set, such as letters,
digits, punctuation marks, special characters, and control characters.
- ASCII is widely used in computers of all types and communications equipment.
- Each ASCII character is normally stored in one byte, so large amounts of text can still occupy
a lot of disc storage space.
- The ASCII codes are of two types namely ASCII-7 and ASCII-8.
• ASCII-7
- ASCII-7 is a 7-bit standard ASCII code.
- ASCII-7 allows 2^7 = 128 combinations, so 128 unique symbols can be represented.
• ASCII-8
- It is an extended version of ASCII-7.
- ASCII-8 allows 2^8 = 256 combinations, so 256 unique symbols can be represented.
- The ASCII table below has 128 characters, with values from 0 through 127.
Table: ASCII character set
o Codes 0 to 31 and 127 represent non-printable control characters.
o Codes 48 to 57 stand for numeric 0-9.
o Codes 65 to 90 stand for uppercase letters A-Z.
o Codes 97 to 122 stand for lowercase letters a-z.
o Codes 128 to 255 are the extended ASCII codes.
- The space character (code 32) is the 33rd value, followed by punctuation marks, digits,
uppercase characters and lowercase characters.
- It is important to note that digit characters have code values that differ from their numeric
equivalents: the code value of '0' is 48, that of '1' is 49, that of '2' is 50, and so forth.
- For example, 32 is the ASCII code for a space. So 32 = 00100000 (in binary)
- The name Adria in ASCII is represented as 65 100 114 105 97 since 65 represents A, 100
represents d, 114 represents r, 105 represents i and 97 represents a.
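These code values can be looked up with Python's built-in ord and chr functions. The sketch below reproduces the examples above; the variable names are illustrative.

```python
name = "Adria"

# ord() gives the code value of a character, chr() goes the other way.
codes = [ord(ch) for ch in name]
print(codes)                           # [65, 100, 114, 105, 97]
print("".join(chr(c) for c in codes))  # Adria

# The space character has code 32, shown here as an 8-bit binary pattern.
print(format(ord(" "), "08b"))         # 00100000

# Digit characters have codes that differ from their numeric values.
print(ord("0"), ord("1"), ord("2"))    # 48 49 50
print(ord("7") - ord("0"))             # 7 (subtracting the code of '0' recovers the digit)
```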
Advantages of using ASCII
- It covers the English alphabet and can hold all the alphanumeric characters.
- It uses less space than Unicode because it has far fewer characters.
- 7-bit ASCII leaves one "free" bit in each 8-bit byte, which can be used for error detection
(for example, as a parity bit) during data transmission.
- Most computers use ASCII codes, which makes it possible to transfer data from one computer
to another.
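One common way to use the "free" bit is as a parity bit. The sketch below assumes even parity and places the parity bit in the top position of the byte. Both choices, and the function names, are assumptions for illustration and not something the notes specify.

```python
def add_even_parity(code: int) -> int:
    """Return an 8-bit value: a 7-bit ASCII code plus an even-parity bit on top."""
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit ASCII code")
    # The parity bit is 1 only when the code has an odd number of 1-bits,
    # so the full byte always ends up with an even number of 1-bits.
    parity = bin(code).count("1") % 2
    return (parity << 7) | code


def parity_ok(byte: int) -> bool:
    """Check that a received byte still has an even number of 1-bits."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("A"))  # 65 = 1000001 has two 1-bits, so parity is 0
print(format(sent, "08b"))        # 01000001

corrupted = sent ^ 0b00000100     # flip one bit to simulate a transmission error
print(parity_ok(sent), parity_ok(corrupted))  # True False
```

A single parity bit detects any odd number of flipped bits but cannot tell which bit changed, and it misses errors that flip an even number of bits.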
Disadvantages of using ASCII
- It supports only the English alphabet, so characters from other languages cannot be represented.
- ASCII does not display characters read from a UNICODE document correctly.
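The second point can be reproduced in Python by reading Unicode (UTF-8) bytes as if they were ASCII. The sample word is an arbitrary choice for this sketch.

```python
text = "caf\u00e9"           # "café", where \u00e9 is the accented e
data = text.encode("utf-8")  # bytes as they would be stored in a UTF-8 document
print(data)                  # b'caf\xc3\xa9'

# Strict ASCII decoding fails because the last two bytes are above 127.
try:
    data.decode("ascii")
    decoded_cleanly = True
except UnicodeDecodeError:
    decoded_cleanly = False
print(decoded_cleanly)       # False

# With errors="replace", each unreadable byte becomes the replacement character.
shown = data.decode("ascii", errors="replace")
print(len(shown))            # 5: 'c', 'a', 'f' and two replacement characters
```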
Binary Coded Decimal (BCD)
- BCD is used to represent digit strings that are not used as quantities in arithmetic (numbers
that don't behave like numbers), for example, the digits of a barcode.
- In BCD, each decimal digit is simply changed into a four-bit binary number, and these groups
are then placed one after another in order.
- The following shows decimal digits in BCD:
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
- The decimal number 41 is coded as 0100 0001 in BCD.
- The decimal number 9212 is thus coded as 1001 0010 0001 0010 in BCD.
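The digit-by-digit conversion can be sketched as follows. The function names to_bcd and from_bcd are hypothetical helpers written for this example.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative integer as BCD, one 4-bit group per decimal digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(format(int(digit), "04b") for digit in str(number))


def from_bcd(bcd: str) -> int:
    """Decode space-separated 4-bit BCD groups back to an integer."""
    digits = []
    for group in bcd.split():
        value = int(group, 2)
        if value > 9:
            # Patterns 1010 to 1111 do not correspond to any decimal digit.
            raise ValueError(f"{group} is not a valid BCD digit")
        digits.append(str(value))
    return int("".join(digits))


print(to_bcd(41))                       # 0100 0001
print(to_bcd(9212))                     # 1001 0010 0001 0010
print(from_bcd("1001 0010 0001 0010"))  # 9212
```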
Advantages of using BCD
- It is easy to convert a number from BCD to decimal form and vice versa.
- It is also simple to implement a hardware algorithm for the BCD converter.
- No rounding errors when computing with decimal fractions, since every decimal digit is stored
exactly.
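The rounding point can be illustrated by comparing binary floating point with decimal-digit arithmetic. Python's decimal module is used here only as a stand-in for decimal arithmetic; it is not presented as a BCD implementation.

```python
from decimal import Decimal

# 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
binary_sum = 0.1 + 0.2
print(binary_sum == 0.3)              # False

# Working digit by digit in decimal keeps these fractions exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True
```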
Disadvantages of using BCD
- The BCD code for a given decimal number requires more bits than the straight binary code, which
makes BCD less efficient to store and to use in arithmetic operations on high-speed digital
computers.
- Arithmetic operations using BCD code require a more complex design of the Arithmetic and Logic
Unit (ALU) than the straight binary number system.
- Arithmetic operations using BCD code are therefore slower, due to the complex hardware
circuitry involved.
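The extra storage cost can be estimated with a short sketch that counts the bits needed in each form. The helper names bcd_bits and binary_bits are made up for this example.

```python
def bcd_bits(number: int) -> int:
    """Bits needed in BCD: four bits for every decimal digit."""
    return 4 * len(str(number))


def binary_bits(number: int) -> int:
    """Bits needed in straight binary (at least one bit, for zero)."""
    return max(1, number.bit_length())


for n in (41, 9212, 99_999_999):
    print(n, bcd_bits(n), binary_bits(n))
# 41 8 6
# 9212 16 14
# 99999999 32 27
```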
Extended Binary Coded Decimal Interchange Code (EBCDIC)
- The Extended Binary Coded Decimal Interchange Code (EBCDIC) uses 8 bits to represent a
symbol in the data.
- EBCDIC allows 2^8 = 256 different characters to be represented.
- EBCDIC codes are mainly used in mainframes and minicomputers.
Advantages of using EBCDIC
- It is an eight-bit code, giving a total of 256 different combinations.
- Code words are all of equal length, so they are relatively easy to decode.
Disadvantages of using EBCDIC
- Unlike ASCII, there is no "free" bit which can be used for error detection. If a computer using
this code for internal representation is to be connected to a data transmission system, the data
will generally have to be converted to ASCII before transmission.
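A conversion of this kind can be sketched with Python's built-in codecs. The example assumes code page 037 (Python's "cp037" codec), which is one of several EBCDIC variants; the notes do not say which variant is meant.

```python
text = "A1"

# The same two characters have different byte values in EBCDIC and ASCII.
ebcdic_bytes = text.encode("cp037")
ascii_bytes = text.encode("ascii")
print(ebcdic_bytes.hex())  # c1f1
print(ascii_bytes.hex())   # 4131

# Converting EBCDIC data to ASCII: decode to text, then re-encode.
converted = ebcdic_bytes.decode("cp037").encode("ascii")
print(converted == ascii_bytes)  # True
```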
UNICODE
- Unicode is a universal character encoding standard for the representation of text which
includes letters, numbers and symbols in multi-lingual environments.
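- Early versions of Unicode used 16 bits per character, which gives 65,536 different characters.
The current standard defines more than a million code points (U+0000 to U+10FFFF), stored using
the UTF-8, UTF-16 or UTF-32 encodings.
- This is enough to represent characters in any language, even Chinese and hieroglyphics.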
- Most Microsoft Software and Linux distributions use UNICODE.
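Python strings are Unicode, so the number assigned to each character (its code point) can be inspected with ord. The sample characters below are arbitrary choices for this sketch, written as escapes so the code does not depend on the file's own encoding.

```python
samples = [
    "A",           # Latin capital letter A
    "\u00e9",      # Latin small letter e with acute accent
    "\u4e2d",      # a Chinese character
    "\U00013000",  # an Egyptian hieroglyph
]

for ch in samples:
    # Show each code point in the usual U+XXXX notation.
    print(f"U+{ord(ch):04X}")
# U+0041
# U+00E9
# U+4E2D
# U+13000
```

The last value is larger than 65,535, so it does not fit in a single 16-bit unit.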
Advantages of using UNICODE
- It can store characters from more than one language
- It can store characters from languages with more than 250 characters
- Unicode is almost universally accepted by computing platforms, browsers, and mobile devices
Disadvantages of using UNICODE
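- Unicode text can take up more space than ASCII, since its much larger character range needs up
to four bytes per character (UTF-16 uses at least two bytes and UTF-32 always uses four).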
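The space difference can be measured by encoding the same text in several ways. This is a minimal sketch; the "-le" codec variants are used so that no byte-order mark is added to the counts.

```python
name = "Adria"

print(len(name.encode("ascii")))      # 5  (one byte per character)
print(len(name.encode("utf-8")))      # 5  (ASCII characters stay one byte each)
print(len(name.encode("utf-16-le")))  # 10 (two bytes per character)
print(len(name.encode("utf-32-le")))  # 20 (four bytes per character)

# A non-English character needs more than one byte even in UTF-8.
print(len("\u4e2d".encode("utf-8")))  # 3
```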