IE 3425 Eng’g.
Database
Lecture 1 Examples
Topics: Data & Information, Why Databases?, Types of
DB, Relational DB, Data Modeling, Tables, Keys (PK,
FK, CPK, secondary key, candidate key)
© Professor Zeid, Northeastern University
What is Data?
Data are raw facts about objects of a system with no context
If you were told, “15, Roger, Wristwatch, $85,” you would
not have learned anything. But you would have some data
Data does not say much about the system, in fact, data are
just raw bits
Information is knowledge or processed data that is useful in
decision making
Now, if you were told that in a queue at the checkout
counter, the 15th person in line is Roger, and he is waiting in
line to buy a Wristwatch that costs $85, you have information
2
Example 1: Data & Info
System: Retail Store
Objects: products, customers, employees, etc.
Data: products → TVs, cell phones, computer, etc.
customers → name, address, etc.
Info: products → best selling products, inventory
level of products, Products out of
stock
customers → demographics (aggregate data for
privacy)
Exercise: Apply above concept to a bank operation 3
Why do we need databases?
Why can’t we just use the file system?
Sometimes using the file system for your applications
is just fine: Word processing, Pictures, Games
As the complexity of the application and the amount
of data it works with increases, some disadvantages
of solely relying on the file system start to surface.
For Example – consider the retail store system in
Example 1
4
File System: Examples and Complexities
Example 2 Envelope with Business Cards
Using an envelope of business cards is useful as long as it
doesn’t contain too many cards.
You can find a particular piece of information (for example, a
person’s phone number) by looking through all of the cards.
The envelope is easy to expand by shoving more cards into the
envelope, at least up to a point. If you have more than a dozen
or so business cards, finding a particular card can be time
consuming.
You can even rearrange the cards a bit to improve performance
for cards you use often. Each time you use a card, move it to
the front of the pile. Over time, those that are used most will
be in front.
5
File System: Examples and Complexities
Example 3 Paper Notebook with data
A paper notebook is small, easy to use, easy to carry, doesn’t
require electricity, and doesn’t need to boot before you can use it.
A paper notebook is also easily extensible because you can buy
another notebook to add to your collection when the first one is
full.
However, a notebook’s contents are arranged sequentially.
If you want to find information about a particular topic, you’ll
have to look through the pages one at a time until you find what
you want.
The more data you have, the harder this kind of search
becomes. 6
File System: Examples and Complexities
Example 4 Filing Cabinet System
A filing cabinet can store a lot more data than a notebook and
you can easily expand the system by adding more files or
cabinets.
Finding a particular piece of data in the filing cabinet can
be easier than finding it in a notebook as long as you are
searching for the type of data used to arrange the records.
If the filing cabinet is full of customer data sorted by
customer name, and you want to find a particular customer’s
data, you’re in luck.
If you want to find all of the customers that live in a certain
city, you’ll have to dig through the files one at a time.
7
What’s the difference between a database
and a spreadsheet?
Databases and spreadsheets (such as Microsoft Excel) are both
convenient ways to store data. The primary differences between the
two are:
How the data is stored and manipulated
Who can access the data
How much data can be stored
Spreadsheets are generally designed for a single user or a small
number of users. Databases, on the other hand, are designed to hold
large amounts of organized data for multiple users in a secure
manner.
8
Databases (DBs)
A database is an organized collection of structured data
typically stored electronically in a computer system
All data in a database should be related, and separate
databases should be created to manage unrelated data.
For example, a database that contains data about students
should not also hold data about company stock prices
A database is usually controlled by a Database Management
System (DBMS)
Databases have evolved dramatically since their inception in
the early 1960s – from Relational Databases to Object
Oriented Databases and more recently, NoSQL Databases
9
Relational Databases (DBs)
A relational database is a type of database that stores and
provides access to data that are related to one another
Relational databases are based on the relational model,
organized as a set of tables with columns and rows
In a relational database, all the tables are related by one or
more fields, so that it is possible to connect all the tables in
the database through the field(s) they have in common
We shall talk more about fields, keys, etc.
Relational DB uses relations (tables)
10
Other Types of Databases
Many other database models exist that provide different strengths
than the relational model.
The most interesting new development is the concept of NoSQL (from
the phrase “not only SQL”).
A NoSQL database can work with data in a looser way, allowing for a
more unstructured environment, communicating changes to the data
over time to all the servers that are part of the database.
The relational database model is the most used database model today
and that’s what we will stick to in this course
Read more here: https://aws.amazon.com/nosql/ 11
and here: https://www.w3resource.com/mongodb/nosql.php
Data Modeling for a Relational DB
Data is stored in tables
A table has rows (records)
Each row consists of columns/fields (attributes)
A row has many names: row, record, tuple (4-tuple, n-tuple)
A column has many names: column, attribute, field
Example 5-field student Table Attributes/Columns/Fields
Records/Rows/Tuples
12
Data Modeling for a DB
The data model for a DB is as follows:
Data Model (DB)
Table 1 Table 2 …. Table n
Row 1 …. Row n
Attribute 1 …. Attribute n
13
Goals of Effective Database Design
Using modern database tools, just about anyone can build a
database. The question is, will the resulting database be useful?
A database won’t do you much good if you can’t get
information out of it quickly, reliably, and consistently.
It won’t be useful if it’s full of incorrect or contradictory data.
It also won’t be useful if it is stolen, lost, or corrupted by data
that was only half written when the system crashed.
All these potential problems can be addressed by using modern
design tools and good design principles, but only if you understand
what these problems are so that they can be avoided
14
Identifiers (Keys) of DB Tables
If a DB is to be searchable, each row must be uniquely identified,
i.e., each rows has a unique key
*There are different types of keys for a DB (key finds a row(s)):
Primary, Composite, Foreign, Secondary, and Candidate
PK (Primary Key): uniquely identifies a row
CPK (Composite PK): while a PK is one attribute, a CPK is a
combination of attributes
FK (Foreign Key): an attribute in a table that is used in another table.
The FK is usually the PK of the other table
To help you understand these terms further, let’s walk through an
example
15
Example 6: Identifiers (Keys) of DB Tables –
Primary/Foreign Key
[include in different table]
employee
PK employee_id employee_first_ employee_last_ employee_phone employee_address
name name
Must be
Unique 100 Roger Shah 000-999-2323 Apt 1, X Ave, Boston,
MA
THINK: Is this a bad design. Why? What is used in
practice?
16
Example 6: Identifiers (Keys) of DB Tables –
Primary/Foreign Key [include in different table]
employee
employee_id employee_first_ employee_last_ employee_phone employee_address
PK name name
100 Roger Shah 000-999-2323 Apt 1, X Ave, Boston,
MA
New Design
employee
employee_id employee_first_name employee_last_name
PK
100 Roger Shah
address
FK employee_id address_id address_1 address_2 address_city address_stat address_zip
17 e
100 1001 Apt 1 X Ave Boston MA 02115
Identifiers (Keys) of DB Tables
Secondary Key: a table attribute that you can use
if you do not know or remember the table PK. Use
a secondary key to search a table. A secondary
key may return multiple rows (records) as search
results
Candidate Key: a table attribute that could be
used as a possible PK, but not selected
See Example 7
18
Example 7: Identifiers (Keys) of DB Tables –
Secondary/Candidate Key
employee
Secondary keys Candidate key
PK employee_id employee_first_ employee_last_ employee_ssn
name name
100 Abe Zeid 12357
110 Meg Smith 58910
THINK: Provide another example of a candidate key
19
Additional Resources:
Videos:
YouTube: What are Database & SQL?
YouTube: Introduction to Databases
YouTube: The Relational Model
Web-based:
https://www.quackit.com/database/tutorial/
https://en.wikipedia.org/wiki/Comparison_of_relational_database_managem
ent_systems
20