THE COPPERBELT UNIVERSITY
SCHOOL OF INFORMATION AND COMMUNICATION TECHNOLOGY
                        DEPARTMENT OF COMPUTER SCIENCE
UNIT 3: Data Representation for Processing
In the realm of Computer Science, data representation plays a paramount role. It refers to the
methods or techniques used to represent, or express, information in a computer system. This
encompasses everything from text and numbers to images, audio, and beyond.
3.1 Basic Concepts of Data Representation
Data representation in computer science is about how a computer interprets and works
with different types of information. Different information types require different
representation techniques. For instance, a video is represented differently from a text
document. When working with various forms of data, it is important to understand the
following fundamentals:
                     Binary system
                     Bits and Bytes
                     Number systems: decimal, hexadecimal
                     Character encoding: ASCII, Unicode
Data in a computer system is represented in binary format, as a sequence of 0s and 1s,
denoting 'off' and 'on' states respectively. The smallest component of this binary
representation is known as a bit, which stands for 'binary digit'. A byte, on the other hand,
consists of 8 bits on virtually all modern systems. Two further essentials for expressing
numbers and text in a computer system are number systems such as decimal and hexadecimal,
and character encodings like ASCII and Unicode.
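As a rough illustration of how one value looks in the three number systems mentioned above, the short Python sketch below writes the same number in binary, decimal, and hexadecimal. The value 13 and the variable names are chosen only for illustration.

```python
# Illustrative value; any non-negative integer would work.
value = 13

binary_text = format(value, "b")   # base 2 digits
hex_text = format(value, "X")      # base 16 digits, uppercase

# Positional expansion: each binary digit is weighted by a power of two.
expanded = sum(int(bit) * 2 ** power
               for power, bit in enumerate(reversed(binary_text)))

print(binary_text)      # 1101
print(hex_text)         # D
print(expanded)         # 13
print(int("1101", 2))   # 13, parsing binary text back to a number
print(int("D", 16))     # 13, parsing hexadecimal text back to a number
```

One hexadecimal digit covers exactly four bits, which is why hexadecimal is a convenient shorthand for long binary strings.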
Role of Data Representation
Data Representation is the foundation of computing systems and affects both hardware and
software designs. It enables both logic and arithmetic operations to be performed in
the binary number system, on which computers are based.
Writing a text document illustrates why data representation matters. The characters you type
are represented by a character code such as ASCII, a set of binary numbers. Each number is
stored in memory as electrical signals; everything you see on your screen is a rendering of
the underlying binary data. Computing operations such as searching, sorting, or adding rely
heavily on appropriate data representation for efficient execution. Likewise, programming
languages and compilers depend on precisely defined data representations to interpret and
execute commands correctly.
As technology evolves, so do our data representation techniques. Quantum computing, for
example, uses quantum bits, or "qubits". A qubit can represent a 0, a 1, or a superposition
of both, thanks to the phenomenon of quantum superposition.
Types of Data Representation
In computer systems, various types of data representation techniques are utilized:
           o      Numbers can be represented in integer, real (floating-point), and rational
                  formats.
           o      Text is represented by using different types of encodings, such as ASCII or
                  Unicode.
           o      Images can be represented in various formats like JPG, PNG, or GIF, each
                  with its own encoding and compression techniques.
           o      Tables are another important form of data representation, especially in the
                  realm of databases.
           Name                                  Email
           John Doe                              john@gmail.com
           Jane Doe                              jane@gmail.com
This tabular approach is particularly effective for storing structured data, making information readily
accessible and easy to handle. By understanding the principles of data representation, you can
better appreciate the complexity and sophistication behind our everyday interactions with
technology.
Data Representation and Interpretation
The core of data representation and interpretation is founded on the binary system.
Represented by 0s and 1s, the binary system signifies the 'off' and 'on' states of electric
current, seamlessly translating them into a language comprehensible to computing hardware.
For instance, 1101 in binary is equivalent to 13 in decimal (1×8 + 1×4 + 0×2 + 1×1). This
interpretation happens consistently in the background during all of your interactions with a
computer system. Now, try imagining a vast array of these binary numbers. It could quickly
become overwhelming. To bring order and efficiency to this chaos, binary digits (or bits) are
grouped into larger sets like bytes, kilobytes, and so on. A single byte, the most commonly
used set, contains eight bits.
Here's a simplified representation of how bits are grouped:
1 bit = 1 binary digit (0 or 1)
8 bits = 1 byte
1024 bytes = 1 kilobyte (KB)
1024 KB = 1 megabyte (MB)
1024 MB = 1 gigabyte (GB)
1024 GB = 1 terabyte (TB)
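The grouping above can be sketched as a small helper that expresses a byte count in the largest fitting unit. This is only a sketch: `describe_size` is a hypothetical name, and it assumes the 1024-based convention used in the list (the IEC standard calls these units KiB, MiB, and so on, while storage vendors often use multiples of 1000).

```python
BITS_PER_BYTE = 8
UNITS = ["bytes", "KB", "MB", "GB", "TB"]

def describe_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit, stepping up by factors of 1024."""
    size = float(num_bytes)
    unit_index = 0
    while size >= 1024 and unit_index < len(UNITS) - 1:
        size /= 1024
        unit_index += 1
    return f"{size:g} {UNITS[unit_index]}"

print(describe_size(500))          # 500 bytes
print(describe_size(1024))         # 1 KB
print(describe_size(1536))         # 1.5 KB
print(describe_size(1024 ** 3))    # 1 GB
print(1024 * BITS_PER_BYTE)        # 8192 bits in one kilobyte
```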
However, the binary system isn't the only number system pivotal for data interpretation. Both
decimal (base 10) and hexadecimal (base 16) systems play significant roles in processing
numbers and text data. Moreover, translating human-readable language into a
computer-interpretable format involves character encodings like ASCII (American Standard Code
for Information Interchange) and Unicode.
These encodings map alphabetic characters, numerals, punctuation marks, and other
common symbols into binary code. For example, the ASCII value for capital 'A' is 65, which
corresponds to 01000001 in binary.
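The mapping from 'A' to 65 to 01000001 can be checked with a few lines of Python. The euro-sign lines are an added illustration, assuming UTF-8 as the Unicode encoding, of a character that lies outside ASCII.

```python
letter = "A"
code_point = ord(letter)            # numeric code assigned to the character
bits = format(code_point, "08b")    # the same number written as 8 binary digits

print(code_point)   # 65
print(bits)         # 01000001

# Unicode extends the same idea beyond ASCII. UTF-8 is one common way to
# turn a Unicode code point into bytes (assumed here for illustration).
euro = "\u20ac"                     # the euro sign
print(ord(euro))                    # 8364
print(euro.encode("utf-8").hex())   # e282ac, three bytes instead of one
```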
In the world of images, different encoding schemes interpret pixel data; JPG, PNG, and GIF
are common examples of such encoded formats. Similarly, audio files use encoding
formats like MP3 and WAV to store sound data.
Understanding Binary Data Representation
Binary data representation is the most fundamental form of data representation in
computing systems. At the lowest level, every piece of information processed by a
computer is converted into a binary format. Binary data representation is based on the
binary numeral system. This system, also known as the base-2 system, uses only two digits,
0 and 1, to represent all kinds of data. The concept dates back to early 18th-century
mathematics and has since found its place as the bedrock of modern computers. In
computing, the binary system's digits are called bits (short for 'binary digit'), and they are the
smallest indivisible unit of data.
Each bit can be in one of two states representing 0 ('off') or 1 ('on'). This mathematical
translation makes it possible for computing machines to perform complex operations even
though they understand only the simple language of 'on' and 'off' signals.
When representing character data, computing systems use binary-encoded formats such as ASCII
and Unicode. In ASCII, each character is assigned a unique 7-bit binary code. For example,
the 7-bit code for the uppercase letter 'A' is 1000001 (decimal 65), which is usually stored
in an 8-bit byte as 01000001.
Interpreting such encoded data back to a human-readable format is a core responsibility of
computing systems and forms the basis for the exchange of digital information globally.
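A minimal sketch of this round trip, from text to 7-bit codes and back, might look like the following. The function names `to_ascii_bits` and `from_ascii_bits` are hypothetical, and the sketch assumes the input contains only ASCII characters.

```python
def to_ascii_bits(text: str) -> list[str]:
    """Encode each character as a 7-bit binary string (ASCII input only)."""
    codes = text.encode("ascii")   # raises UnicodeEncodeError for non-ASCII text
    return [format(code, "07b") for code in codes]

def from_ascii_bits(chunks: list[str]) -> str:
    """Interpret each 7-bit binary string as a character code again."""
    return "".join(chr(int(chunk, 2)) for chunk in chunks)

encoded = to_ascii_bits("Hi")
print(encoded)                    # ['1001000', '1101001']
print(from_ascii_bits(encoded))   # Hi
```

The decoder only works because both sides agree on the encoding; the same bits read under a different scheme would yield different characters.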
Practical Application of Binary Data Representation
Binary data representation is used across every single aspect of digital computing. From
simple calculations performed by a digital calculator to the complex animations rendered in a
high-definition video game, binary data representation is at play in the background.
Consider a simple calculation like 7+5. When you input this into a digital calculator, the
numbers and the operation get converted into their binary equivalents. The microcontroller
inside the calculator processes these binary inputs, performs the sum operation in binary, and
finally, returns the result as a binary output. This binary output is then converted back into a
decimal number which you see displayed on the calculator screen.
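The calculator example can be sketched in Python as follows. This is an illustration of digit-by-digit binary addition with a carry, not a description of how any particular calculator chip is built; `add_binary` is a hypothetical helper.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings right to left, carrying as in pencil-and-paper addition."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # pad the shorter input with leading zeros
    carry = 0
    digits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        digits.append(str(total % 2))       # the digit written in this column
        carry = total // 2                  # the carry passed to the next column
    if carry:
        digits.append("1")
    return "".join(reversed(digits))

seven = format(7, "b")             # '111'
five = format(5, "b")              # '101'
result = add_binary(seven, five)

print(result)           # 1100
print(int(result, 2))   # 12, converted back to decimal for display
```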
When it comes to text files, every character typed into the document is converted to its binary
equivalent using a character encoding system, typically ASCII or Unicode. It is then saved
onto your storage device as a sequence of binary digits.
Similarly, in image files, every pixel is ultimately represented as a binary number that
specifies the color and intensity of that pixel. The full grid of pixel values is called a
'bitmap'. When you open the image file, the computer reads the binary data and presents it on
your screen as a colorful, coherent image.
The concept extends even further into the internet and network communications, data
encryption, data compression, and more. When you download a file over the internet,
it is sent to your system as a stream of binary data. The web browser on your system receives
this data, recognizes the type of file, and interprets the binary data back into the
intended format. In essence, every operation that you can perform on a computer system, no
matter how simple or complex, boils down to large-scale manipulation of binary
data. That sums up the practical application and universal significance of binary data
representation in digital computing.
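To make the pixel description above concrete, the sketch below stores a tiny 2×2 image as raw bytes and reads it back. It assumes an uncompressed layout with three bytes per pixel (red, green, blue); real formats such as JPG or PNG add headers and compression on top of this idea, and `decode_pixels` is a hypothetical helper.

```python
# A 2x2 image: each pixel is a (red, green, blue) triple with values 0..255.
pixels = [
    [(255, 0, 0), (0, 255, 0)],        # red,  green
    [(0, 0, 255), (255, 255, 255)],    # blue, white
]

# Flatten the grid row by row into one sequence of bytes.
raw = bytes(channel for row in pixels for pixel in row for channel in pixel)

def decode_pixels(data: bytes, width: int) -> list[list[tuple[int, ...]]]:
    """Rebuild the pixel grid from raw bytes, assuming 3 bytes per pixel."""
    flat = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
    return [flat[start:start + width] for start in range(0, len(flat), width)]

print(len(raw))                          # 12 bytes: 4 pixels x 3 channels
print(raw.hex())                         # ff000000ff000000ffffffff
print(decode_pixels(raw, 2) == pixels)   # True
```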
3.2 Data Model Representation
When dealing with vast amounts of data, organizing and understanding the relationships
between different pieces of data is of utmost importance. This is where data model
representation comes into play in computer science. A data model provides an abstract,
simplified view of real-world data. It defines the data elements and the relationships among
them, providing an organized and consistent representation of data.
Exploring Different Types of Data Models
Understanding the intricacies of data models will equip you with a solid foundation in
making sense of complex data relationships. Some of the most commonly used data models
include:
           o   Hierarchical Model
           o   Network Model
           o   Relational Model
           o   Entity-Relationship Model
           o   Object-Oriented Model
           o   Semantic Model
The Hierarchical Model presents data in a tree-like structure, where each record has exactly
one parent record (except the root) and may have many children. This model is largely applied
in file systems and XML documents. Its main limitation is that a child cannot have multiple
parents, which restricts its real-world applications.
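A hierarchical structure of this kind can be sketched with a small tree class. The class `Node` and the folder names are hypothetical, chosen to resemble a file system.

```python
class Node:
    """A record in a tree: at most one parent, any number of children."""

    def __init__(self, name: str, parent: "Node | None" = None) -> None:
        self.name = name
        self.parent = parent            # a single parent: the hierarchical constraint
        self.children: list["Node"] = []
        if parent is not None:
            parent.children.append(self)

    def path(self) -> str:
        """Walk up through the parents to build a file-system-style path."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

root = Node("home")
docs = Node("docs", parent=root)
report = Node("report.txt", parent=docs)

print(report.path())                        # home/docs/report.txt
print([child.name for child in root.children])   # ['docs']
```

Because each node stores only one `parent`, a record that belongs to two groups at once cannot be expressed without duplication.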
The Network Model, an enhancement of the hierarchical model, allows a child node to have
multiple parent nodes, resulting in a graph structure. This model is suitable for representing
complex relationships, but iterating over and navigating such a graph can be
intricate.
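As a sketch of the network model, the example below records parent-child links in two dictionaries so that a record may have several parents. All names (`link`, `reachable_from`, the sample records) are hypothetical.

```python
from collections import defaultdict

parents_of = defaultdict(set)
children_of = defaultdict(set)

def link(parent: str, child: str) -> None:
    """Record one parent-child relationship in both directions."""
    children_of[parent].add(child)
    parents_of[child].add(parent)

def reachable_from(start: str) -> set:
    """Collect every record that can be reached by following child links."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in children_of.get(node, ()):
            if child not in seen:       # guards against visiting a record twice
                seen.add(child)
                stack.append(child)
    return seen

link("University", "Engineering")
link("University", "Robotics Club")
link("Engineering", "Alice")
link("Robotics Club", "Alice")          # Alice now has two parents
link("Engineering", "Bob")

print(sorted(parents_of["Alice"]))          # ['Engineering', 'Robotics Club']
print(sorted(reachable_from("University")))
# ['Alice', 'Bob', 'Engineering', 'Robotics Club']
```

The `seen` set hints at the navigation difficulty: a record with several parents is reachable along more than one path.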
The Relational Model, created by E.F. Codd, uses a tabular structure to depict data and their
relationships. Each row represents a collection of related data values, and each column
represents a particular attribute. This is the most widely used model due to its simplicity and
flexibility.
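A minimal relational example, using Python's built-in sqlite3 module and the Name/Email rows from the table earlier in this unit, might look like this. The table name `contacts` and the in-memory database are assumptions made for the sketch.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Columns are attributes; each row is one collection of related values.
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO contacts (name, email) VALUES (?, ?)",
    [("John Doe", "john@gmail.com"), ("Jane Doe", "jane@gmail.com")],
)

# Query by attribute value rather than by navigating links between records.
rows = conn.execute(
    "SELECT name, email FROM contacts WHERE email = ?",
    ("jane@gmail.com",),
).fetchall()

print(rows)   # [('Jane Doe', 'jane@gmail.com')]
conn.close()
```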
The Entity-Relationship Model illustrates the conceptual view of a database. It uses three
basic concepts: Entities, Attributes (the properties of these entities), and Relationships among
entities. This model is most commonly used in database design.
The Object-Oriented Model goes a step further and adds methods (functions) to the entities
in addition to attributes. This data model integrates the data and the operations applicable to the
data into a single component known as an object. Such an approach enables encapsulation, a
significant characteristic of object-oriented programming.
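The idea of bundling data with its operations can be sketched with a small class. `BankAccount` and its members are hypothetical and serve only to show encapsulation.

```python
class BankAccount:
    """An object: attributes (owner, balance) plus the methods that act on them."""

    def __init__(self, owner: str, balance: int = 0) -> None:
        self.owner = owner
        self._balance = balance     # leading underscore: internal by convention

    def deposit(self, amount: int) -> None:
        """The only supported way to increase the balance."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    @property
    def balance(self) -> int:
        """Read-only view of the internal balance."""
        return self._balance

account = BankAccount("Jane Doe")
account.deposit(50)
print(account.balance)   # 50
```

Callers change the balance only through `deposit`, so the validation rule lives in one place alongside the data it protects.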
The Semantic Model aims to capture more of the meaning of data by defining the nature of the
data and the relationships that exist among data items. This model is beneficial in representing
complex data interrelations and is used in expert systems and artificial intelligence fields.
The Role of Data Models in Data Representation
Data models provide a method for representing data elements and their interactions
efficiently, thus forming an integral part of any database system. They provide the theoretical
foundation for designing databases, thereby playing an essential role in the development of
applications. A data model is a set of concepts and rules for formally describing and
representing real-world data. It serves as a blueprint for designing and implementing
databases and assists communication between system developers and end-users. Databases
serve as vast repositories, storing a plethora of data. Such vast data needs effective
organization and management for optimal access and usage. Here, data models come into
play, providing a structural view of data, thereby enabling the efficient organization, storage
and retrieval of data.
Consider a library system. The system needs to record data about books, authors, publishers,
members, and loans. All these items represent different entities. Relationships exist between
these entities. For example, a book is published by a publisher, an author writes a book, or a
member borrows a book. Using an Entity-Relationship Model, we can effectively represent
all these entities and relationships, aiding the library system's development process.
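The library example can be sketched with Python dataclasses, where each class is an entity, each field an attribute, and references between classes stand for relationships. All class names, field names, and sample values below are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Publisher:
    name: str

@dataclass(frozen=True)
class Author:
    name: str

@dataclass(frozen=True)
class Book:
    title: str
    author: Author          # relationship: an author writes a book
    publisher: Publisher    # relationship: a book is published by a publisher

@dataclass(frozen=True)
class Member:
    name: str

@dataclass(frozen=True)
class Loan:
    member: Member          # relationship: a member borrows a book
    book: Book

# Sample data, invented for the sketch.
book = Book("Sample Title", Author("A. Writer"), Publisher("Example Press"))
member = Member("Jane Doe")
loans = [Loan(member, book)]

borrowed_titles = [loan.book.title for loan in loans if loan.member == member]
print(borrowed_titles)   # ['Sample Title']
```

In a real design, the `Loan` relationship would usually carry its own attributes, such as a due date, which is one reason relationships are modeled explicitly.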
Designing such a model requires careful consideration of what data is required to be stored
and how different data elements relate to each other. Depending on their specific
requirements, database developers can select the most suitable data model representation.
This choice can significantly affect the functionality, performance, and scalability of the
resulting databases.
From decision-support systems and expert systems to distributed databases and data
warehouses, data models find a place in various applications. Modern NoSQL systems often
combine several models to meet their needs, for example, a document-based model for
semi-structured data and a column-based model for analyzing large data sets. In this
way, data models continue to evolve and adapt to the burgeoning needs of the digital world.
Therefore, acquiring a strong understanding of data model representations and their roles
forms an integral part of the database management and design process. It empowers you with
the ability to handle large volumes of diverse data efficiently and effectively.
Importance of Data Interpretation
Understanding data interpretation is integral to unlocking the potential of any computing
process or system. When coded data is input into a system, your computer must interpret this
data accurately to make it usable. Consider typing a document in a word processor like
Microsoft Word. As you type, each keystroke is translated by your keyboard and operating
system into a character code such as ASCII. Stored as binary, these codes are transmitted to
the active word processing
software. The word processor interprets these codes back into alphabetic characters, enabling
the correct letters to appear on your screen, as per your keystrokes. Data interpretation is not
just an isolated occurrence, but a recurring necessity - needed every time a computing process
must deal with data. This is no different when you're watching a video, browsing a website,
or even when the computer boots up. Rendering images and videos is an ideal illustration of
the importance of data interpretation.
Digital photos and videos are composed of tiny dots, or pixels, each encoded with specific
numbers to denote color composition and intensity. Every time you view a photo or play a
video, your computer interprets the underlying data and reassembles the pixels to form a
comprehensible image or video sequence on your screen. Data interpretation further extends
to more complex territories like facial recognition, bioinformatics, data mining, and even
artificial intelligence. In these applications, data from various sources is collected, converted
into machine-acceptable format, processed, and interpreted to provide meaningful outputs.
In summary, data interpretation is vital for the functionality, efficiency, and progress
of computer systems and the services they provide. Understanding the basics of data
representation and interpretation therefore forms the backbone of computer science studies.
Data Representation - Key takeaways
             -   Data representation refers to techniques used to express information in
                 computer systems, encompassing text, numbers, images, audio, and more.
             -   Data Representation is about how computers interpret and function with
                 different information types, including binary systems, bits and bytes,
                 number systems (decimal, hexadecimal) and character encoding (ASCII,
                 Unicode).
             -   Binary Data Representation is the conversion of all kinds of information
                 processed by a computer into binary format.
             -   Data Model Representation is an abstract, simplified view of real-world data
                 that defines the data elements and their relationships, and provides a
                 consistently organized way of representing data.