Database Management
COURSE MATERIAL
FOR
ACKNOWLEDGEMENT
We acknowledge the use of the Courseware of the National Open University of Nigeria
(NOUN) as the primary resource. Internal reviewers in the Ahmadu Bello University
have also been duly listed.
COPYRIGHT PAGE
© 2018 Ahmadu Bello University (ABU) Zaria, Nigeria
All rights reserved. No part of this publication may be reproduced in any form or by any
means, electronic, mechanical, photocopying, recording or otherwise without the prior
permission of the Ahmadu Bello University, Zaria, Nigeria.
ISBN:
Ahmadu Bello University e-Learning project,
Ahmadu Bello University
Zaria, Nigeria.
Tel: +234
E-mail:
COURSE WRITERS/DEVELOPMENT TEAM
Editor
Prof. M.I Sule
Course Materials Development Overseer
Dr. Usman Abubakar Zaria
Subject Matter Expert
Dr. Musa Hayatu
Subject Matter Reviewer
Abdullahi Hussaini
Language Reviewer
Enegoloinu Adakole
Instructional Designers/Graphics
Ibrahim Otukoya, Abubakar Haruna
Proposed Course Coordinator
Abdullahi Hussaini
ODL Expert
Dr. Musa Hayatu
CONTENTS
Title Page
Acknowledgement Page
Copyright Page
Course Writers/Development Team
Table of Contents
MODULE 2:
Study Session 1: Entities and Entity Sets
Study Session 2: Structure of Relational Database
Study Session 3: The Relational Algebra
MODULE 3:
Study Session 1: Structured Query Language (SQL) Fundamentals
Study Session 2: SQL Expressions
Study Session 3: Database Modification
Study Session 4: Integrity Constraints
Study Session 5: Fundamentals of XML
Study Session 6: Significance of XML
Study Session 7: XML Document
MODULE 4:
Study Session 1: Computer Data Storage and Levels
Study Session 2: Features of Storage Technologies
Study Session 3: Common Storage Technologies
Study Session 4: File Organisation
Study Session 5: Document Type Declaration
Study Session 6: Introduction to Web Services
XIII. Glossary
COURSE STUDY GUIDE
i. COURSE INFORMATION
Course Code: LIBS 867
Course Title: Database System and Management
Credit Units: 2
Semester: First
Description:
This module has been designed to enhance your database management expertise.
Even if you already have database management experience, I recommend that you
go through the entire course to benefit fully.
iv. COURSE LEARNING RESOURCES
i. Course Textbooks
Avi, S. et al. (N.D.). Database System Concepts (6th ed.). McGraw-Hill.
Beaulieu, A. & Mary, E. T. (Eds.). Learning SQL (2nd ed.). Sebastopol, CA, USA: O'Reilly.
Beynon-Davies, P. (2004). Database Systems (3rd ed.). Palgrave: Basingstoke, UK.
Byers, F. R. (2003). Care and Handling of CDs and DVDs: A Guide for Librarians and Archivists. National Institute of Standards and Technology.
Codd, E. F. (1970). “A Relational Model of Data for Large Shared Data Banks”. Communications of the ACM, 13(6), pp. 377-387.
“Database Design Basics”. (N.D.). Retrieved May 1, 2010, from http://office.microsoft.com/en-us/access/HA012242471033.aspx
Development of an Object-Oriented DBMS; Portland, Oregon, United States. (1986). pp. 472-482.
Doll, S. (2002). “Is SQL a Standard Anymore?” TechRepublic’s Builder.com. TechRepublic. http://articles.techrepublic.com.com/5100-10878_11-1046268.html. Retrieved 2010-01-07.
Gehani, N. (2006). The Database Book: Principles and Practice using MySQL. Summit, NJ: Silicon Press.
itl.nist.gov (1993). Integration Definition for Information Modeling (IDEF1X). 21 December 1993.
Lightstone, S.; et al. (2007). Physical Database Design: The Database Professional's Guide to Exploiting Indexes, Views, Storage, and More. Morgan Kaufmann Press.
Kawash, J. (2004). “Complex Quantification in Structured Query Language (SQL): a Tutorial using Relational Calculus”. Journal of Computers in Mathematics and Science Teaching. Volume 23, Issue 2, 2004. AACE Norfolk, Virginia.
Mike, C. (2011). “Referential Integrity”. http://databases.about.com/: About.com. http://databases.about.com/cs/administration/g/refintegrity.htm. Retrieved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM. (1986). pp. 325-336.
Ramakrishnan, R. & Gehrke, J. (2000). Database Management Systems (2nd ed.). McGraw-Hill Higher Education.
Seltzer, M. (2008). “Beyond Relational Databases”. Communications of the ACM, 51(7), pp. 52-58.
Teorey, T.; et al. (2005). Database Modeling & Design: Logical Design (4th ed.). Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA: Morgan Kaufmann Publishers.
Thomas, C.; et al. (2009). Database Systems: A Practical Approach to Design, Implementation and Management.
Tsitchizris, D. C. & Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs: Prentice-Hall.
Introduce the concepts associated with system development;
1. define the term ‘database management system’
2. state typical examples of database management systems
3. identify the categories of database management systems
4. explain the concept ‘database servers’
5. outline the evolution of database management systems
6. state the common features of database management systems
7. explain the notion ‘data models’
8. identify the common categories of data models
9. explain the concept of entity-relationship models
10. describe the concept of mapping cardinalities with respect to the
entity-relationship diagram
11. identify the link between instances and schemes
12. list and describe the components of a data structure
13. state the data structures required for a physical system
14. describe the notion of relationship sets
15. mention a formal definition of a relational algebra
16. state the key role of each component of an SQL expression
17. identify the common forms of database modification
18. discuss the link between main Constraints and Integrity Constraints
19. itemise the core characteristics of storage technologies
20. sketch a succinct description of computer data storage
21. classify the levels of storage
22. distinguish between volatile and non-volatile memory
vi. ACTIVITIES TO MEET COURSE OBJECTIVES
Specifically, this course shall comprise the following activities:
1. Studying courseware
2. Listening to course audios
3. Watching relevant course videos
4. Field activities, industrial attachment or internship, laboratory or studio
work (whichever is applicable)
5. Course assignments (individual and group)
6. Forum discussion participation
7. Tutorials (optional)
8. Semester examinations (CBT and essay based).
TOTAL 100%
C. Grading Scale:
A = 70-100
B = 60-69
C = 50-59
D = 45-49
F = 0-44
D. Feedback
Courseware based:
1. In-text questions and answers (answers preceding references)
2. Self-assessment questions and answers (answers preceding references)
Tutor based:
1. Discussion Forum tutor input
2. Graded Continuous assessments
Student based:
1. Online programme assessment (administration, learning resource, deployment, and
assessment).
ix. LINKS TO OPEN EDUCATION RESOURCES
OSS Watch provides tips for selecting open source, or for procuring free or open
software.
SchoolForge and SourceForge are good places to find, create, and publish open software.
SourceForge, for one, has millions of downloads each day.
Open Source Education Foundation and Open Source Initiative, and other organisations
like these, help disseminate knowledge.
Creative Commons has a number of open projects from Khan Academy to Curriki where
teachers and parents can find educational materials for children or learn about Creative
Commons licenses. Also, they recently launched the School of Open that offers courses
on the meaning, application, and impact of "openness."
Numerous open or open educational resource databases and search engines exist. Some
examples include:
• OEDb: over 10,000 free courses from universities as well as reviews of colleges
and rankings of college degree programmes
• Open Tapestry: over 100,000 open licensed online learning resources for an
academic and general audience
• OER Commons: over 40,000 open educational resources from elementary school
through to higher education; many of the elementary, middle, and high school
resources are aligned to the Common Core State Standards
• Open Content: a blog, definition, and game of open source as well as a friendly
search engine for open educational resources from MIT, Stanford, and other
universities with subject and description listings
• Academic Earth: over 1,500 video lectures from MIT, Stanford, Berkeley,
Harvard, Princeton, and Yale
• JISC: Joint Information Systems Committee works on behalf of UK higher
education and is involved in many open resources and open projects including
digitising British newspapers from 1620-1900!
Other sources for open education resources
Universities
• The University of Cambridge's guide on Open Educational Resources for Teacher
Education (ORBIT)
• OpenLearn from Open University in the UK
Global
• Unesco's searchable open database is a portal to worldwide courses and research
initiatives
• African Virtual University (http://oer.avu.org/) has numerous modules on subjects
in English, French, and Portuguese
• https://code.google.com/p/course-builder/ is Google's open source software that is
designed to let anyone create online education courses
• Global Voices (http://globalvoicesonline.org/) is an international community of
bloggers who report on blogs and citizen media from around the world, including
on open source and open educational resources
Individuals (which include OERs)
• Librarian Chick: everything from books to quizzes and videos here, includes
directories on open source and open educational resources
• K-12 Tech Tools: OERs, from art to special education
• Web 2.0: Cool Tools for Schools: audio and video tools
• Web 2.0 Guru: animation and various collections of free open source software
• Livebinders: search, create, or organise digital information binders by age, grade,
or subject (why re-invent the wheel?)
x. ABU DLC ACADEMIC CALENDAR/PLANNER
PERIOD
Semester Semester 1 Semester 2 Semester 3
Activity JAN FEB MAR APR MAY JUN JUL AUG SEPT OCT NOV DEC
Registration
Resumption
Late Registn.
Facilitation
Revision/Consolidation
Semester Examination
xi. COURSE STRUCTURE AND OUTLINE
Course Structure
MODULE, STUDY SESSION (WEEK, PAGES) AND ACTIVITY

Module 2, Study Session 1: Entities and Entity Sets (Week 3, Pp. 70)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2TxhDb9)
5. View referred Animation (address/site https://bit.ly/2SqWwY6)

Module 2, Study Session 2: Structure of Relational Database (Pp. 76)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2iawCra)
5. View referred Animation (address/site https://bit.ly/2iawCra)

Module 2, Study Session 3: The Relational Algebra (Week 4, Pp. 82)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2RId0sS)
5. View referred Animation (address/site https://bit.ly/2Y0OQCd)

Module 3, Study Session 2: SQL Expressions (Pp. 92)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2Go0qgx)
5. View referred Animation (address/site https://bit.ly/2Z8ypjK)

Module 3, Study Session 3: Database Modification (Week 6, Pp. 99)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2MR79AO)
5. View referred Animation (address/site https://bit.ly/2YihEoS)

Module 3, Study Session 4: Integrity Constraints (Pp. 107)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2WLLcb5)
5. View referred Animation (address/site https://bit.ly/2Lx73jR)

Module 3, Study Session 5: Fundamentals of XML (Week 7, Pp. 117)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2Gc4Zv5)
5. View referred Animation (address/site https://bit.ly/2Y6SSns)

Module 3, Study Session 6: Significance of XML (Pp. 124)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2TwJqs3)
5. View referred Animation (address/site https://bit.ly/2nIIvnB)

Module 3, Study Session 7: XML Document (Week 8, Pp. 131)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2MNYzCX)
5. View referred Animation (address/site https://bit.ly/2y1i3ND)

Module 4, Study Session 1: Computer Data Storage and Levels (Week 9, Pp. 140)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2GpvUmv)
5. View referred Animation (address/site https://bit.ly/2Z2PImb)

(Pp. 169)
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2DW23Ap)
5. View referred Animation (address/site https://bit.ly/2JFZfKG)

Module 4, Study Session 5: Document Type Declaration (Week 11, Pp. 178)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2MOHkkZ)
5. View referred Animation (address/site https://bit.ly/2y1q3OU)

Module 4, Study Session 6: Introduction to Web Services (Week 12, Pp. 194)
1. Read Courseware for the corresponding Study Session.
2. View the Video(s) on this Study Session
3. Listen to the Audio on this Study Session
4. View any other Video/YouTube (address/site https://bit.ly/2D9VYyJ)
5. View referred Animation (address/site https://bit.ly/2Z1Rgwx)

Week 13: REVISION/TUTORIALS (On Campus or Online) & CONSOLIDATION WEEK
Course Outline
MODULE 1:
Study Session 1: Basic Concepts of Database Management Systems
Study Session 2: Data Models
Study Session 3: Instances and Schemes
Study Session 4: Overall System Structure
MODULE 2:
Study Session 1: Entities and Entity Sets
Study Session 2: Relationships and Relationship Sets
Study Session 3: Structure of Relational Database
Study Session 4: The Relational Algebra
MODULE 3:
Study Session 1: Structured Query Language (SQL) Fundamentals
Study Session 2: SQL Expressions
Study Session 3: Database Modification
Study Session 4: Integrity Constraints
Study Session 5: Fundamentals of XML
Study Session 6: Significance of XML
Study Session 7: XML Document
MODULE 4:
Study Session 1: Computer Data Storage and Levels
Study Session 2: Features of Storage Technologies
Study Session 3: Common Storage Technologies
Study Session 4: File Organisation
Study Session 5: Document Type Declaration
Study Session 6: Introduction to Web Services
XII. STUDY MODULES
MODULE 1: Introduction to Database Management Systems
Contents:
Study Session 1: Basic Concepts of Database Management Systems
Study Session 2: Data Models
Study Session 3: Instances and Schemes
Study Session 4: Overall System Structure
STUDY SESSION 1
Basic Concepts of Database Management Systems
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Overview of a Database Management System
2.1.1- What is a Database Management System?
2.1.2- Categories of Database Management Systems
2.1.3- Relational Database Management Systems
2.1.4- Hierarchical Database Management System
2.1.5- Object-Oriented Database Management System
2.1.6- Features of Database Management Systems
2.1.7- Database Servers
2.2- Evolution of Database Management Systems (DBMS)
2.2.1- Navigational DBMS
2.2.2- Relational DBMS
2.2.3- SQL DBMS
2.2.4- Object-Oriented Database
2.2.5- Current Trends
2.3- Components of Database Management Systems (DBMS)
2.3.1- Modeling Language
2.3.2- Data Structure
2.3.3- Data Query Language
2.3.4- Transaction Mechanism
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
As you will know by now, this course (LIBS 867) is about Database System and
Management. It provides a fundamental overview of the concepts, principles and
techniques of modern database management systems and of database (data-driven)
business application system development. The most important point to grasp from
this module is that database management systems make the logical presentation of
database information to users possible. This involves much more than learning new
functions and syntax; database management systems require a logical way of
thinking.
This module has been designed to enhance your database management expertise.
Even if you already have database management experience, I recommend that you
go through the entire module systematically to gain some insight into the course.
4. Explain the concept of ‘database servers’
5. Outline the evolution of database management systems
6. Mention the common features of database management systems
7. Describe the main components of a database management system.
used by the great majority of database applications to manage practically all the
world’s databases.
In-text Question
What is a Database Management System (DBMS)?
Answer
A Database Management System (DBMS) is a set of software programmes that controls the
organisation, storage, management, and retrieval of data in a database.
Querying is the process of requesting attribute information from various
perspectives and combinations of factors. Example: “How many 2-door cars in
Texas are green?” A database query language and report writer allow users to
interactively interrogate the database, analyse its data and update it according to
the users’ privileges on data.
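The example question above can be sketched as a query. This is only an illustration using Python's built-in sqlite3 module with an in-memory database; the cars table, its column names and the sample rows are assumptions invented for the sketch and are not part of the course material.

```python
import sqlite3

# Hypothetical table and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cars (id INTEGER PRIMARY KEY, doors INTEGER, state TEXT, colour TEXT)"
)
conn.executemany(
    "INSERT INTO cars (doors, state, colour) VALUES (?, ?, ?)",
    [
        (2, "Texas", "green"),
        (4, "Texas", "green"),
        (2, "Texas", "red"),
        (2, "Ohio", "green"),
        (2, "Texas", "green"),
    ],
)


def count_cars(doors, state, colour):
    """Answer 'How many <doors>-door cars in <state> are <colour>?'."""
    row = conn.execute(
        "SELECT COUNT(*) FROM cars WHERE doors = ? AND state = ? AND colour = ?",
        (doors, state, colour),
    ).fetchone()
    return row[0]


print(count_cars(2, "Texas", "green"))
```

The query combines three factors (doors, state, colour) in one WHERE clause, which is the "combinations of factors" idea described above.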
B. Backup and replication
It is wise to make copies of attributes regularly in case primary disks or other
equipment fails. A periodic copy of attributes may also be created for a distant
organisation that cannot readily access the original. DBMS usually provide
utilities to facilitate the process of extracting and disseminating attribute sets.
When data is replicated between database servers, so that the information
remains consistent throughout the database system and users cannot tell or even
know which server in the DBMS they are using, the system is said to exhibit
replication transparency.
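As a rough sketch of a periodic copy, the example below uses the backup method of Python's sqlite3 connections (available from Python 3.7). The table and rows are hypothetical, and copying one in-memory database into another only stands in for copying to a second disk or a distant server; real replication between database servers involves much more than this.

```python
import sqlite3

# Hypothetical primary database with a little sample data.
primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, colour TEXT)")
primary.executemany("INSERT INTO cars (colour) VALUES (?)", [("green",), ("red",)])
primary.commit()

# The copy; in practice this would be a file on another disk or machine.
replica = sqlite3.connect(":memory:")
primary.backup(replica)  # copies every page of the primary into the replica


def colours(conn):
    """Return the colours stored in a database, in insertion order."""
    return [row[0] for row in conn.execute("SELECT colour FROM cars ORDER BY id")]


print(colours(replica))
```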
C. Rule enforcement
Often one wants to apply rules to attributes so that the attributes are clean and
reliable. For example, we may have a rule that says each car can have only one
engine associated with it (identified by Engine Number). If somebody tries to
associate a second engine with a given car, we want the DBMS to deny such
a request and display an error message. However, with changes in the model
specification such as, in this example, hybrid gas-electric cars, rules may need to
change. Ideally such rules should be able to be added and removed as needed
without significant data layout redesign.
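The one-engine-per-car rule can be sketched with a UNIQUE constraint. This is a minimal illustration using Python's sqlite3 module; the engines table and the fit_engine helper are hypothetical names chosen for the sketch, and other DBMSs may express the same rule differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE engines ("
    " engine_number TEXT PRIMARY KEY,"
    " car_id INTEGER NOT NULL UNIQUE)"  # UNIQUE: at most one engine per car
)


def fit_engine(engine_number, car_id):
    """Try to associate an engine with a car; report whether the DBMS allowed it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO engines (engine_number, car_id) VALUES (?, ?)",
                (engine_number, car_id),
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


print(fit_engine("E-100", 1))  # first engine for car 1
print(fit_engine("E-200", 1))  # second engine for the same car is denied
```

Here the rule lives in the database definition rather than in each application, so every programme that inserts rows is subject to it.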
D. Security
Often it is desirable to limit who can see or change which attributes or groups of
attributes. This may be managed directly by an individual, or by the assignment
of individuals and privileges to groups, or (in the most elaborate models)
through the assignment of individuals and groups to roles which are then
granted entitlements.
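The role-based model in the last sentence can be sketched in plain Python. The users, roles and privileges below are made-up examples, and the dictionaries only mimic the idea; many SQL DBMSs express it through GRANT and REVOKE statements instead.

```python
# Hypothetical roles, users and privileges, invented for illustration.
ROLE_PRIVILEGES = {
    "clerk": {("cars", "read")},
    "manager": {("cars", "read"), ("cars", "write")},
}
USER_ROLES = {
    "amina": {"clerk"},
    "bello": {"clerk", "manager"},
}


def is_allowed(user, table, action):
    """Return True if any role assigned to the user grants the action."""
    return any(
        (table, action) in ROLE_PRIVILEGES.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("amina", "cars", "write"))
print(is_allowed("bello", "cars", "write"))
```

Granting a new entitlement to a role changes what every user in that role may do, without editing each user individually.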
E. Computation
There are common computations requested on attributes such as counting,
summing, averaging, sorting, grouping, cross-referencing, etc. Rather than have
each computer application implement these from scratch, they can rely on the
DBMS to supply such calculations.
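A short sketch of counting, summing, averaging and grouping done by the DBMS rather than by the application, using Python's sqlite3 module. The sales table and its values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("north", 300.0), ("south", 50.0)],
)

# The DBMS counts, sums, averages, groups and sorts in a single statement.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount), AVG(amount) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()

for region, count, total, average in summary:
    print(region, count, total, average)
```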
F. Change and access logging
Often one wants to know who accessed what attributes, what was changed, and
when it was changed. Logging services allow this by keeping a record of access
occurrences and changes.
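One possible way to keep such a record is a trigger that writes to a log table. The sketch below uses Python's sqlite3 module; the table names, the trigger name and the sample data are hypothetical. It records what was changed and when, but not who changed it, because SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cars (id INTEGER PRIMARY KEY, colour TEXT);
    CREATE TABLE change_log (
        car_id     INTEGER,
        old_colour TEXT,
        new_colour TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Every colour update on cars leaves one row in change_log.
    CREATE TRIGGER log_colour_change AFTER UPDATE OF colour ON cars
    BEGIN
        INSERT INTO change_log (car_id, old_colour, new_colour)
        VALUES (OLD.id, OLD.colour, NEW.colour);
    END;
    """
)

conn.execute("INSERT INTO cars (id, colour) VALUES (1, 'green')")
conn.execute("UPDATE cars SET colour = 'blue' WHERE id = 1")

log = conn.execute(
    "SELECT car_id, old_colour, new_colour FROM change_log"
).fetchall()
print(log)
```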
G. Automated optimisation
If there are frequently occurring usage patterns or requests, some DBMS can
adjust themselves to improve the speed of those interactions. In some cases the
DBMS will merely provide tools to monitor performance, allowing a human
expert to make the necessary adjustments after reviewing the statistics collected.
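The second case, where the DBMS only provides a monitoring tool and a person makes the adjustment, can be sketched with SQLite's EXPLAIN QUERY PLAN statement. The table, the index name and the plan helper are assumptions for illustration, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, state TEXT, colour TEXT)")


def plan(sql):
    """Return the plan text SQLite reports for a statement (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM cars WHERE state = 'Texas'"

before = plan(query)  # expected to describe a scan of the whole table
conn.execute("CREATE INDEX idx_cars_state ON cars (state)")  # the human adjustment
after = plan(query)   # expected to mention the new index

print(before)
print(after)
```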
Fig 1.1.2: Database Servers (Source: webclasses.net)
In-text Question
Define a relational database management system.
Answer
A relational database management system organises data in tabular files. Most modern database
management systems (for example Oracle, Sybase, and Microsoft SQL Server) are relational.
The Codasyl approach was based on the “manual” navigation of a linked data set
which was formed into a large network. To find any particular record the programmer
had to step through these pointers one at a time until the required record was returned.
Simple queries like “find all the people in India” required the programme to walk the
entire data set and collect the matching results. There was, essentially, no concept of
“find” or “search”. This might sound like a serious limitation today, but in an era when
the data was most often stored on magnetic tape such operations were too expensive to
contemplate anyway.
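To make the navigational idea concrete, here is a small sketch in Python. It is not Codasyl code; the Record class, the sample people and the people_in function are hypothetical, and a simple chain of pointers stands in for the linked network of records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    name: str
    country: str
    next: Optional["Record"] = None  # pointer to the next record in the set


# Hypothetical linked data set of three records.
third = Record("Chen", "China")
second = Record("Ravi", "India", third)
first = Record("Asha", "India", second)


def people_in(country, head):
    """Walk the entire data set one pointer at a time, collecting matches."""
    matches = []
    current = head
    while current is not None:
        if current.country == country:
            matches.append(current.name)
        current = current.next  # step to the next record
    return matches


print(people_in("India", first))
```

Note that the programme itself does the stepping; there is no "find" request that the system answers on its behalf.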
IBM also had its own DBMS in 1968, known as IMS. IMS was a development of software
written for the Apollo programme on the System/360. IMS was generally similar in
concept to Codasyl, but used a strict hierarchy for its model of data navigation
instead of Codasyl’s network model. Both concepts later became known as navigational
databases due to the way data was accessed, and Bachman’s 1973 Turing Award
presentation was titled The Programmer as Navigator. IMS is classified as a
hierarchical database. IDMS, a CODASYL database, and Cincom’s TOTAL database are
classified as network databases.
In-text Question
Mention the four major DBMS
Answer
I. Navigational DBMS
II. Relational DBMS
III. Multidimensional DBMS
IV. Object DBMS
a new system for storing and working with large databases. Instead of records being
stored in some sort of linked list of free-form records as in Codasyl, Codd's idea was
to use a “table” of fixed-length records. A linked-list system would be very inefficient
when storing “sparse” databases where some of the data for any one record could be
left empty. The relational model solved this by splitting the data into a series of
normalised tables, with optional elements being moved out of the main table to where
they would take up room only if needed.
In the relational model, related records are linked together with a “key”. For instance,
a common use of a database system is to track information about users, their name,
login information, various addresses and phone numbers. In the navigational approach
all of these data would be placed in a single record, and unused items would simply
not be placed in the database. In the relational approach, the data would be normalised
into a user table, an address table and a phone number table (for instance). Records
would be created in these optional tables only if the address or phone numbers were
actually provided.
Linking the information back together is the key to this system. In the relational
model, some bit of information was used as a “key”, uniquely defining a particular
record. When information was being collected about a user, information stored in the
optional (or related) tables would be found by searching for this key. For instance, if
the login name of a user is unique, addresses and phone numbers for that user would be
recorded with the login name as its key. This “re-linking” of related data back into a
single collection is something that traditional computer languages are not designed for.
Just as the navigational approach would require programmes to loop in order to collect
records, the relational approach would require loops to collect information about any
one record. Codd’s solution to the necessary looping was a set-oriented language, a
suggestion that would later spawn the ubiquitous SQL. Using a branch of mathematics
known as tuple calculus, he demonstrated that such a system could support all the
operations of normal databases (inserting, updating etc.) as well as providing a
simple system for finding and returning sets of data in a single operation.
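The user and phone number example can be sketched with two normalised tables linked by a key. This uses Python's sqlite3 module; the table names, logins and phone numbers are invented for illustration, and SQL here stands in for the set-oriented language described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users  (login TEXT PRIMARY KEY, name TEXT);
    -- Optional table: a row exists only if a phone number was provided.
    CREATE TABLE phones (login TEXT REFERENCES users(login), number TEXT);
    INSERT INTO users  VALUES ('asha', 'Asha'), ('ravi', 'Ravi');
    INSERT INTO phones VALUES ('asha', '555-0100'), ('asha', '555-0101');
    """
)

# One set-oriented statement re-links users with their optional phone rows
# through the login key; the programmer writes no loop over records.
contacts = conn.execute(
    "SELECT u.name, p.number FROM users u "
    "LEFT JOIN phones p ON p.login = u.login "
    "ORDER BY u.login, p.number"
).fetchall()

print(contacts)
```

Ravi has no phone number, so no row is stored for him in the phones table, yet he still appears in the result with an empty value.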
IBM itself did one test implementation of the relational model, PRTV, and a
production one, Business System 12, both now discontinued. All other DBMS
implementations usually called relational are actually SQL DBMSs. In 1968, the
University of Michigan began development of the Micro DBMS relational database
management system. It was used to manage very large data sets by the US Department
of Labor, the Environmental Protection Agency and researchers from the University of
Alberta, the University of Michigan and Wayne State University. It ran on mainframe
computers using the Michigan Terminal System. The system remained in production until
1996.
2.2.3 SQL DBMS
IBM started working on a prototype system loosely based on Codd’s concepts as
System R in the early 1970s. The first version was ready in 1974/5, and work then
started on multi-table systems in which the data could be split so that all of the data for
a record (much of which is often optional) did not have to be stored in a single large
“chunk”. Subsequent multi-user versions were tested by customers in 1978 and 1979,
by which time a standardised query language, SQL, had been added. Codd's ideas
were establishing themselves as both workable and superior to Codasyl, pushing IBM
to develop a true production version of System R, known as SQL/DS, and, later,
Database 2 (DB2).
Many of the people involved with INGRES became convinced of the future
commercial success of such systems, and formed their own companies to
commercialise the work but with an SQL interface. Sybase, Informix, Nonstop SQL
and eventually Ingres itself were all being sold as offshoots to the original INGRES
product in the 1980s. Even Microsoft SQL Server is actually a re-built version of
Sybase, and thus, INGRES.
Stonebraker went on to apply the lessons from INGRES to develop a new database,
Postgres, which is now known as PostgreSQL. PostgreSQL is often used for global
mission critical applications (the .org and .info domain name registries use it as
their primary data store, as do many large companies and financial institutions).
In Sweden, Codd’s paper was also read and Mimer SQL was developed from the mid-
70s at Uppsala University. In 1984, this project was consolidated into an independent
enterprise. In the early 1980s, Mimer introduced transaction handling for high
robustness in applications, an idea that was subsequently implemented on most other
DBMS.
Another big game changer for databases in the 1980s was the focus on increasing
reliability and access speeds. In 1989, two professors from the University of Wisconsin
at Madison published an article at an ACM-associated conference outlining their
methods for increasing database performance. The idea was to replicate specific
important and often-queried information, and store it in a smaller temporary database
that linked these key features back to the main database. This meant that a query could
search the smaller database much more quickly, rather than search the entire dataset.
This eventually led to the practice of indexing, which is used by almost every operating
system from Windows to the system that operates Apple iPod devices.
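The idea of a smaller structure that links back to the main data can be sketched in plain Python. The records and function names below are hypothetical, and a dictionary is only one simple way to build such an index; it is not the method from the 1989 article.

```python
# Hypothetical main data set, invented for illustration.
records = [
    {"id": 17, "name": "Asha"},
    {"id": 42, "name": "Ravi"},
    {"id": 99, "name": "Chen"},
]


def full_scan(record_id):
    """Search the entire data set, one record at a time."""
    for record in records:
        if record["id"] == record_id:
            return record
    return None


# The index: a small structure mapping each key to its position in the main data.
index = {record["id"]: position for position, record in enumerate(records)}


def indexed_lookup(record_id):
    """Consult the small index first, then jump straight to the record."""
    position = index.get(record_id)
    return None if position is None else records[position]


print(indexed_lookup(42))
```

Both functions return the same record; the indexed version avoids visiting every record, which matters as the data set grows.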
These companies are able to record customer transactions made within their business.
Online transactions have become tremendously popular with the e-business world.
Consumers and businesses are able to make payments securely on company websites.
None of these current developments would have been possible without the evolution
of database management. Even with all the progress and current trends of database
management, there will always be a need for new development as specifications and
needs grow.
As the speeds of consumer internet connectivity increase, and as data availability and
computing become more ubiquitous, databases are migrating to web services.
Web-based languages such as XML and PHP are being used to process databases over
web-based services. These languages allow databases to live in "the cloud". As with
many other products, such as Google's Gmail, Microsoft's Office 2010 and Carbonite’s
online backup services, many services are beginning to move to web-based services
due to increasing internet reliability, data storage efficiency, and the lack of a need for
dedicated IT staff to manage the hardware. Faculty at the Rochester Institute of
Technology published a paper on the use of databases in the cloud and stated
that their school planned to add cloud-based database computing to their curriculum to
"keep [their] information technology (IT) curriculum at the forefront of technology".
language of each database hosted via the DBMS. There are several approaches
currently in use, with hierarchical, network, relational, and object examples.
Essentially, the modelling language ensures the ability of the databases to
communicate with the DBMS and thus operate on the system.
Answer
1. Modelling language,
2. Data structure,
3. Database query language, and
4. Transaction mechanisms
3.0 Conclusion/Summary
I hope you enjoyed this study session. We defined some basic concepts of database
management systems and looked at their categories and main components. In addition,
we went through an overview of database management systems: basic definition,
examples, evolution, features and key components. Now, let us attempt the questions
below.
5.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube at https://bit.ly/2GnVw33. Watch the video and summarise it in one
paragraph.
b. View the animation at https://bit.ly/2GnVw33 and critique it in the
discussion forum.
36
Codd, E. F. (1970).”A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM 13 (6):pp. 377–387.
Codd, E. F. (1970). “A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM archive. Vol. 13. Issue 6.pp. 377-387.
“Database Design Basics”. (N.D.). Retrieved May 1, 2010, from
http://office.microsoft.com/en-us/access/HA012242471033.aspx
Development of an Object-Oriented DBMS; Portland, Oregon, United States. (1986).
pp. 472 – 482.
Doll, S. (2002). “Is SQL a Standard Anymore?” TechRepublic’sBuilder.com.
TechRepublic.
http://articles.techrepublic.com.com/5100-10878_11-1046268.html. Retrieved
2010 01-07.
Gehani, N. (2006). The Database Book: Principles and Practice using MySQL.
Summit, NJ: Silicon Press.
itl.nist.gov (1993) Integration Definition for Information Modelling(IDEFIX). 21
December 1993.
Lightstone, S.; et al. (2007). Physical Database Design: The Database Professional's
Guide to Exploiting Indexes, Views, Storage, and More. Morgan Kaufmann
Press.
Kawash, J. (2004). “Complex Quantification in Structured QueryLanguage (SQL): a
Tutorial using Relational Calculus”. Journal of Computers in Mathematics and
ScienceTeaching. Volume 23, Issue 2, 2004 AACE Norfolk, Virginia.
Mike, C. (2011). “Referential
Integrity”.http://databases.about.com/:About.com.http://databases.about.com/cs/
administration/g/refintegrity.htm. Retrieved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill Osborne
Media.
Performance Enhancement through Replication in an Object-Oriented
DBM.(1986).pp. 325-336.
37
Ramakrishnan, R. &Gehrke, J. (2000). .Database Management Systems(2nd ed.).
McGraw-Hill Higher Education.
Seltzer, M. (2008).“Beyond Relational Databases”. Communications of the ACM,
51(7), pp. 52-58.
Teorey, T.; et al. (2005). Database Modelling& Design: Logical Design ( 4th ed.).
Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA: Morgan
Kaufmann Publishers.
Thomas, C.; et al. (2009). Database Systems: A Practical Approach to Design,
Implementation and Management.
Tsitchizris, D. C. &Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs:
Prentice-Hall.
38
STUDY SESSION 2
Data Models
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Object-Based Logical Models
2.1.1- The E-R Model
2.1.2- The Object-Oriented Model
2.2- Record-based Logical Models
2.2.1 The Relational Model
2.2.2- The Network Model
2.2.3- The Hierarchical Model
2.3- Physical Data Models
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
I welcome you to this study session, which provides a general idea of data models as
well as the different categories of data models. Data models are a collection of
conceptual tools for describing data, data relationships, data semantics and data
constraints. They fall into three groups (object-based logical models, record-based
logical models and physical data models), and we will look at them in more detail
subsequently.
2. Identify common categories of data models
3. Describe object-based logical models
4. List the common types of object-based logical models
5. Discuss the concept of entity-relationship models
6. State the essential elements of an entity-relationship diagram
7. Describe the concept of mapping cardinalities with respect to entity-relationship
diagrams
8. Specify classical record-based logical models.
entities. For example, a customer_account relationship associates a customer with
each account he or she has. The set of all entities of the same type is called an
entity set, and the set of all relationships of the same type is called a relationship set.
Another essential element of the E-R diagram is the mapping cardinalities, which
express the number of entities to which another entity can be associated via a
relationship set. We will see later how well this model works to describe real-world
situations. The overall logical structure of a database can be expressed graphically by
an E-R diagram, as depicted in Figure 1.2.1 below:
parts of the object, the instance variables and method code, are not visible externally.
Answer
Object-based logical models describe data at the conceptual and view levels. They provide fairly
flexible structuring capabilities and facilitate the explicit specification of data constraints.
The three most widely accepted record-based models are the relational, network and
hierarchical models. We will now briefly consider each of these models.
2.2.1 The Relational Model
In the relational model, data and relationships are represented by a collection of
tables. Each table has a number of columns with unique names, e.g. customer, account.
Figure 1.2.3 shows a sample relational database.
Fig. 1.2.3: A Sample Relational Database
2.2.3 The Hierarchical Model
The hierarchical model is similar to the network model. However, in this model
records are organised as a collection of trees, rather than arbitrary graphs.
Figure 1.2.5 below shows a sample hierarchical database.
Fig. 1.2.5: A Sample Hierarchical Database
In contrast, the relational model does not use pointers or links, but relates records
by the values they contain. This allows a formal mathematical foundation to be defined.
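To see what "relating records by the values they contain" means in practice, here is a minimal sketch. It assumes SQLite driven from Python's standard sqlite3 module, and the customer and account tables, their columns and their rows are all invented for illustration; they are not taken from the figures in this session.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cname TEXT, city TEXT);
    CREATE TABLE account  (account_no TEXT, cname TEXT, balance INTEGER);
    INSERT INTO customer VALUES ('Hodges', 'Brooklyn'), ('Lowery', 'Queens');
    INSERT INTO account  VALUES ('A-647', 'Hodges', 1050), ('A-900', 'Lowery', 55);
""")

# No pointer or link is stored anywhere: an account row belongs to a customer
# row only because both rows hold the same cname value.
rows = conn.execute("""
    SELECT customer.cname, customer.city, account.balance
    FROM customer, account
    WHERE customer.cname = account.cname
    ORDER BY customer.cname
""").fetchall()
print(rows)
```

In a network or hierarchical database the same association would be stored as a link from one record to another; here it is recovered purely by comparing values.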
In-text Question
What is the relational model?
Answer
In the relational model, data and relationships are represented by a collection of tables.
3.0 Conclusion/Summary
From our studies in this session, it is vital to remember that object-based logical
models describe data at the conceptual and view levels. Two common types of
object-based logical models are entity-relationship models and object-oriented models.
It is equally worth noting that each data model plays a specific role.
In this study session, we introduced the concept of data models and identified the
common categories of data models. We also described entity-relationship models,
stating the essential elements of an entity-relationship diagram. Furthermore, we
described the concept of mapping cardinalities with respect to entity-relationship
diagrams. I hope you found the session enlightening. To assess your comprehension,
attempt the questions below.
6.0References/Further Readings
Avi, S.; et al. (N.D).Database System Concepts(6th ed.). McGraw-Hill.
45
Communications of the ACM 13 (6):pp. 377–387.
Codd, E. F. (1970). “A Relational Model of Data for Large Shared DataBanks”.
Communications of the ACM archive. Vol. 13. Issue6.pp. 377-387
“Database Design Basics”. (N.D.). Retrieved May 1, 2010, from
http://office.microsoft.com/en-us/access/HA012242471033.aspx
Development of an Object-Oriented DBMS; Portland, Oregon, United States.
(1986). pp. 472 – 482.
Doll, S. (2002). “Is SQL a Standard Anymore?” TechRepublic’s
Builder.com.TechRepublic.http://articles.techrepublic.com.com/5100-
10878_11-1046268.html. Retrieved 2010-01-07.
Gehani, N. (2006). The Database Book: Principles and Practice usingMySQL.
Summit, NJ: Silicon Press.
itl.nist.gov (1993) Integration Definition for Information Modelling(IDEFIX).
21 December 1993.
Lightstone, S. et al. (2007). Physical Database Design: The Database
Professional's Guide to Exploiting Indexes, Views, Storage, andMore. Morgan
Kaufmann Press.
Kawash, J. (2004). “Complex Quantification in Structured QueryLanguage
(SQL): a Tutorial using Relational Calculus”. Journalof Computers in
Mathematics and Science Teaching. Volume 23,Issue 2, 2004 AACE Norfolk,
Virginia.
Mike, C. (2011). “Referential Integrity”.http://databases.about.com/:
About.com.http://databases.about.com/cs/administration/g/refintegrity.htm.Retri
eved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill
Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM.
(1986).pp. 325-336.
Ramakrishnan, R. &Gehrke, J. (2000). .Database Management Systems(2nd
46
ed.). McGraw-Hill Higher Education.
Seltzer, M. (2008).“Beyond Relational Databases”. Communications ofthe
ACM, 51(7), pp. 52-58.
Teorey, T.; et al. (2005). Database Modelling& Design: Logical Design( 4th
ed.). Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington,MA:
Morgan Kaufmann Publishers.
Thomas, C.; et al. (2009).Database Systems: A Practical Approach toDesign,
Implementation and Management.
Tsitchizris, D. C. &Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs:
Prentice-Hall.
47
STUDY SESSION 3
Instances and Schemes
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Concepts of Instances and Schemes
2.1.1- Link between Instances and Schemes
2.1.2- Analogy with Programming Languages
2.1.3- Categories of Schemes
2.2- Data Independence
2.2.1- Classes of Data Independence
2.3- Data Definition Language (DDL)
2.4- Data Manipulation Language (DML)
2.4.1- Data Manipulation
2.4.2- Data Manipulation Language (DML)
2.4.3- Types of Data Manipulation Language
2.5- Database Administrator
2.5.1- Duties of a Database Administrator
2.6- Database Users
2.6.1- Application Programmers
2.6.2- Sophisticated Users
2.6.3- Specialised Users
2.6.4- Naïve Users
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
Our first task in this session is to describe the notion of instances and schemes, so
that you gain a broader understanding of how data changes over time. I will also
introduce data manipulation and other associated terms.
2.1.1 Link between Instances and Schemes
In order to grasp the key aspects of instances and schemes, let us first identify the
link between these two concepts. Instances and schemes are two terms closely
associated with the way a database changes over time.
Answer
The information in a database at a particular point in time is called an instance of the database,
while the overall design of the database is called the database scheme.
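The difference between a scheme and an instance can be sketched in a few lines. This is only an illustration, assuming SQLite through Python's sqlite3 module; the account table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The scheme: the overall design, written once and changed rarely.
conn.execute("CREATE TABLE account (account_no TEXT, balance INTEGER)")

def instance(connection):
    """Return the instance: the rows stored in the database at this moment."""
    return connection.execute(
        "SELECT account_no, balance FROM account ORDER BY account_no"
    ).fetchall()

instance_t0 = instance(conn)   # nothing stored yet
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
instance_t1 = instance(conn)   # one row
conn.execute("UPDATE account SET balance = 700 WHERE account_no = 'A-101'")
instance_t2 = instance(conn)   # same scheme, different instance

# The scheme (here, the column names) is the same at every point in time.
scheme = [row[1] for row in conn.execute("PRAGMA table_info(account)")]
print(scheme, instance_t0, instance_t1, instance_t2)
```

The three snapshots differ, but the list of columns does not: the instance changes often, while the scheme stays fixed.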
2.2 Data Independence
Data independence refers to the ability to modify a scheme definition in one level
without affecting a scheme definition in a higher level.
are to a large extent heavily dependent on the logical structure of the data.
In-text Question
What is Data independence?
Answer
Data independence refers to the ability to modify a scheme definition in one level without affecting a
scheme definition in a higher level.
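As a small illustration of this idea, the sketch below changes something at the physical level (it adds an index) and checks that a query written against the logical scheme returns the same answer before and after. It assumes SQLite through Python's sqlite3 module; the table, the index name idx_balance and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_no TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700), ("A-305", 350)],
)

# A query written against the logical scheme only.
query = "SELECT account_no FROM account WHERE balance > 400 ORDER BY account_no"
before = conn.execute(query).fetchall()

# A change at the physical level: how the data is organised for access.
conn.execute("CREATE INDEX idx_balance ON account (balance)")
after = conn.execute(query).fetchall()

print(before == after)
```

The query text did not have to change when the storage organisation changed, which is the essence of data independence at the physical level.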
2.4 Data Manipulation Language (DML)
to the fact that a query language is a portion of a DML involving information
retrieval only.
In-text Question
What is a data definition language (DDL)?
Answer
A data definition language (DDL) is a language used to specify a database scheme as a set of
definitions.
consulted by the database manager module whenever updates occur.
permanent application programmes (e.g. automated teller machine).
In-text Question
Who is a database administrator?
Answer
A database administrator is a person who has central control over the data and the
programmes accessing that data.
3.0 Conclusion/Summary
Instances and schemes are two terms closely associated with the way a database
changes over time. The information in a database at a particular point in time is called
an instance of the database, while the overall design of the database is called the
database scheme. A database administrator is a person having central control over
data and the programmes accessing that data. The main goal of the DML is to provide
efficient human interaction with the system.
In this study session, we considered instances and schemes, stating the classical
categories of schemes. We also looked at data definition language and data
manipulation language, as well as database users and the duties of a database
administrator. I hope that you understood the topics discussed. You may now attempt
the questions below.
4.0 Self-Assessment Questions
1. State the categories of data independence.
2. How would a database administrator define a scheme?
3. Describe a data dictionary.
4. How do sophisticated users interact with database systems?
STUDY SESSION 4
Overall System Structure
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Partitioning of Databases
2.2- Components of Data Structure
2.3- Data Structures for Physical Implementation
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
In the previous unit we discussed instances and schemes, data manipulation and data
definition language. This unit presents the components of data structure and data
structures required for physical system implementation.
2.0 Main Content
2.1 Partitioning of Databases
Database systems are normally partitioned into modules for different functions. Some
functions (e.g. file systems) may be provided by the operating system.
Fig 1.4.1: Partitioning of Databases
Source: sqishack.com
efficient implementation of the dictionary.
3. Indices: provide fast access to data items holding particular values.
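The role of an index can be sketched with ordinary Python data structures. This is only a toy model under stated assumptions: the "data file" is a list of invented records, and the index is a dictionary from a value to the positions of the records holding that value. Real database systems use more elaborate structures.

```python
from collections import defaultdict

# Hypothetical data file: a list of (account_no, branch, balance) records.
data_file = [
    ("A-101", "Downtown", 500),
    ("A-215", "Mianus", 700),
    ("A-102", "Perryridge", 400),
    ("A-305", "Downtown", 350),
]

# Build an index on branch: value -> positions of the records holding it.
branch_index = defaultdict(list)
for position, record in enumerate(data_file):
    branch_index[record[1]].append(position)

def lookup_by_branch(branch):
    """Fetch records through the index instead of scanning the whole file."""
    return [data_file[pos] for pos in branch_index.get(branch, [])]

downtown = lookup_by_branch("Downtown")
print(downtown)
```

A lookup now goes straight to the matching positions rather than examining every record, which is the "fast access to data items holding particular values" described above.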
In-text Question
What are the components of data structure?
Answer
1. File manager: manages allocation of disk space and the data structures used to represent
information on disk.
2. Database manager: the interface between low-level data and application programmes and
queries.
3. Query processor: translates statements in a query language into low-level instructions the
database manager understands. (It may also attempt to find an equivalent but more efficient form.)
4. DML pre-compiler: converts DML statements embedded in an application programme to normal
procedure calls in a host language. The pre-compiler interacts with the query processor.
5. DDL compiler: converts DDL statements to a set of tables containing metadata stored in a data
dictionary.
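The last point, that DDL statements end up as metadata in a data dictionary, can be observed directly in a small system. The sketch below assumes SQLite through Python's sqlite3 module, where the catalogue table sqlite_master plays the role of the data dictionary; other systems organise their dictionaries differently. The table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two DDL statements are handed to the system.
conn.execute("CREATE TABLE account (account_no TEXT, balance INTEGER)")
conn.execute("CREATE INDEX idx_balance ON account (balance)")

# The system records them as rows of metadata in its own catalogue table.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)
```

Each DDL statement produced one row of metadata, which is the kind of information the database manager consults when it processes later statements.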
3.0 Conclusion/Summary
You should take note that a typical data structure consists of the file manager, database
manager, query processor, data manipulation language pre-compiler and data
definition language compiler. Database systems are partitioned into modules for
different functions. Data structures required for physical system implementation
include data files, indices and the data dictionary.
In this session, we focused on databases and data structures. We were also able to
highlight the components of a data structure. In order to assess your understanding of
this session, you need to attempt the questions below.
discussion forum
6.0References/Further Readings
Avi, S.; et al. (N.D). Database System Concepts(6th ed.). McGraw-Hill
Beaulieu, A. & Mary, E. T. (Eds.). Learning SQL (2nd ed.). Sebastapol, CA,
USA: O'Reilly.
Beynon-Davies, P. (2004).Database Systems (3rd ed.).Palgrave: Basingstoke,
UK.
Byers, F. R. (2003). Care and Handling of CDs and DVDs — A Guide for
Librarians and Archivists.National Institute of Standards and Technology.
Codd, E.F. (1970).”A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM 13 (6):pp. 377–387.
Codd, E.F. (1970). “A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM archive. Vol. 13. Issue 6.pp. 377-387
“Database Design Basics”. (N.D.). Retrieved May 1, 2010, from
http://office.microsoft.com/en-us/access/HA012242471033.aspx
Development of an Object-Oriented DBMS; Portland, Oregon, United States.
(1986). pp. 472 – 482.
Doll, S. (2002). “Is SQL a Standard Anymore?”.TechRepublic’s Builder.com.
TechRepublic.http://articles.techrepublic.com.com/5100-10878_11-
1046268.html.Retrieved 2010-01-07.
Gehani, N. (2006). The Database Book: Principles and Practice using MySQL.
Summit, NJ: Silicon Press.
itl.nist.gov (1993) Integration Definition for Information Modelling(IDEFIX).
21 December 1993.
Lightstone, S.; et al. (2007). Physical Database Design: The Database
Professional's Guide to Exploiting Indexes, Views, Storage, and More. Morgan
Kaufmann Press.
63
Kawash, J. (2004). “Complex Quantification in Structured Query Language
(SQL): a Tutorial using Relational Calculus”. Journal of Computers in
Mathematics and Science Teaching.Volume 23, Issue 2, 2004 AACE Norfolk,
Virginia.
Mike, C. (2011). “Referential Integrity”.http://databases.about.com/:
About.com.
http://databases.about.com/cs/administration/g/refintegrity.htm.Retrieved 2011-
03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill
Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM.
(1986).pp. 325-336.
Ramakrishnan, R. &Gehrke, J. (2000). .Database Management Systems (2nd
ed.). McGraw-Hill Higher Education.
Seltzer, M. (2008).“Beyond Relational Databases”. Communications of the
ACM, 51(7), pp. 52-58.
Teorey, T.; et al. (2005). Database Modelling& Design: Logical Design (4th
ed.). Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA:
Morgan Kaufmann Publishers.
Thomas, C.; et al. (2009).Database Systems: A Practical Approach to Design,
Implementation and Management.
Tsitchizris, D. C. &Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs:
Prentice-Hall.
64
MODULE 2
The Entity-Relationship Data Model
Contents:
Study Session 1: Entities and Entity Sets
Study Session 2: Structure of Relational Database
Study Session 3: The Relational Algebra
STUDY SESSION 1
Entities and Entity Sets
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- What is an Entity?
2.2- Types of Entities
2.3- Entity Set
2.4- Entity Representation
2.5- Entity and Programming Languages
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
The E-R (entity-relationship) data model views the real world as a set of basic
objects (entities) and relationships among these objects. It is intended primarily for the
database design process, allowing the specification of an enterprise scheme that
represents the overall logical structure of the database. In this session, we will consider
entities, their types and their representation. A brief analogy will be made with
programming languages.
An entity may be concrete (a person or a book, for example) or abstract (like a
holiday or a concept).
Source: gatevidyalay.com
Define an entity
Answer
An entity simply refers to an object that exists and is distinguishable from other objects.
3.0 Conclusion/Summary
We discovered that the E-R (entity-relationship) data model is intended primarily for
the database design process by allowing the specification of an enterprise scheme.
This represents the overall logical structure of the database. The concept of an entity
set corresponds to the programming language type definition. A variable of a given
type has a particular value at a point in time. Thus, a programming language variable
corresponds to an entity in the E-R model. An entity is represented by a set of
attributes.
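The analogy with programming languages can be made concrete with a short sketch. It uses a Python dataclass as the type definition; the Customer type, its attributes and the two sample customers are hypothetical and serve only to illustrate the correspondence.

```python
from dataclasses import dataclass

# The type definition plays the role of the entity set's description:
# it names the attributes every customer entity must have.
@dataclass(frozen=True)
class Customer:
    customer_id: str
    cname: str
    city: str

# Each variable holds one entity, represented by its attribute values.
amina = Customer("C-001", "Amina", "Zaria")
bello = Customer("C-002", "Bello", "Kaduna")

# The entity set is the collection of all entities of that type.
customer_set = {amina, bello}
print(len(customer_set), amina.city)
```

Two entities are distinguishable because their attribute values differ, just as two variables of the same type can hold different values at a given point in time.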
In this study session, we learnt about the entity-relationship data model, giving a
concise description of an entity. We equally considered entity types and
representations. A brief analogy was made with respect to programming languages. Be
assured that the facts gathered from this unit will be valuable for understanding other
aspects of database systems. Now, attempt the questions below.
68
UK.
Byers, F. R. (2003). Care and Handling of CDs and DVDs — A Guide for
Librarians and Archivists.National Institute of Standards and Technology.
Codd, E.F. (1970).”A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM 13 (6):pp. 377–387.
Codd, E.F. (1970). “A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM archive. Vol 13. Issue 6.pp. 377-387
“Database Design Basics”. (N.D.). Retrieved May 1, 2010, from
http://office.microsoft.com/en-us/access/HA012242471033.aspx
Development of an Object-Oriented DBMS; Portland, Oregon, United States.
(1986). pp. 472 – 482.
Doll, S. (2002). “Is SQL a Standard Anymore?” TechRepublic’s Builder.com.
TechRepublic.http://articles.techrepublic.com.com/5100-10878_11-
1046268.html.Retrieved 2010-01-07.
Gehani, N. (2006). The Database Book: Principles and Practice using MySQL.
Summit, NJ: Silicon Press.
itl.nist.gov (1993) Integration Definition for Information Modeling(IDEFIX).
21 December 1993.
Lightstone, S.; et al. (2007). Physical Database Design: The Database
Professional's Guide to Exploiting Indexes, Views, Storage, and More. Morgan
Kaufmann Press.
Kawash, J. (2004). “Complex Quantification in Structured Query Language
(SQL): a Tutorial using Relational Calculus”. Journal of Computers in
Mathematics and Science Teaching ISSN 0731-9258 0731-9258 Volume 23,
Issue 2, 2004 AACENorfolk, Virginia.
Mike, C. (2011). “Referential Integrity”.http://databases.about.com/:
About.com.http://databases.about.com/cs/administration/g/refintegrity.htm.
Retrieved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill
69
Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM.
(1986).pp. 325-336.
Ramakrishnan, R. &Gehrke, J. (2000). Database Management Systems (2nd
ed.). McGraw-Hill Higher Education.
Seltzer, M. (2008).“Beyond Relational Databases”. Communications of the
ACM, 51(7),pp. 52-58.
Teorey, T.; et al. (2005). Database Modeling& Design: Logical Design. (4th
ed.). Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA:
Morgan Kaufmann Publishers.
Thomas, C.; et al. (2009).Database Systems: A Practical Approach to Design,
Implementation and Management.
Tsitchizris, D. C. &Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs:
Prentice-Hall.
70
STUDY SESSION 2
Structure of Relational Database
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- The Entity Relationship Diagram
2.2- Components of an Entity Relationship Diagram
2.3- Design of an E-R Database Scheme
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
The previous session introduced entities, relationships and relationship sets. We will
consider specific entity-relationship diagrams in this session.
2.2 Components of an Entity Relationship Diagram
The components of an entity-relationship diagram and their corresponding roles are as
follows:
1. Rectangles representing entity sets
2. Ellipses representing attributes
3. Diamonds representing relationship sets
4. Lines linking attributes to entity sets and entity sets to relationship sets.
Fig 2.3.1: The Entity Relationship Diagram
In the course of our studies, lines may be directed (with an arrow at the end)
to signify mapping cardinalities for relationship sets.
2. Whether an entity set or a relationship set best fits a real-world concept.
3. Whether to use an attribute or an entity set.
4. Use of a strong or weak entity set.
5. Appropriateness of generalization.
6. Appropriateness of aggregation.
In-text Question
Itemize the components of an entity-relationship diagram and their corresponding roles
Answer
1. Rectangles representing entity sets
2. Ellipses representing attributes
3. Diamonds representing relationship sets
4. Lines linking attributes to entity sets and entity sets to relationship sets.
3.0 Conclusion/Summary
In conclusion, the entire logical structure of a relational database can be represented
graphically by means of an Entity-Relationship (E-R) diagram. Components of an
entity-relationship diagram and their corresponding roles are as follows:
1. Rectangles representing entity sets
2. Ellipses representing attributes
3. Diamonds representing relationship sets
4. Lines linking attributes to entity sets and entity sets to relationship sets.
In this session, we considered specific entity-relationship diagrams and their
components. We equally looked at models for designing E-R database schemes. Now,
please attempt the questions below.
2. List the components of an E-R diagram.
3. State the specific roles of each component of an E-R diagram.
5.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube at https://bit.ly/2iawCra. Watch the video and summarise it in one
paragraph.
b. View the animation at https://bit.ly/2iawCra and critique it in the
discussion forum.
74
Summit, NJ: Silicon Press.
itl.nist.gov (1993) Integration Definition for Information Modeling(IDEFIX).
21 December 1993.
Lightstone, S.; et al. (2007). Physical Database Design: The Database
Professional's Guide to Exploiting Indexes, Views, Storage, and More. Morgan
Kaufmann Press.
Kawash, J. (2004). “Complex Quantification in Structured Query Language
(SQL): a Tutorial using Relational Calculus”. Journal of Computers in
Mathematics and Science Teaching. Volume 23, Issue 2, 2004 AACE Norfolk,
Virginia.
Mike, C. (2011). “Referential Integrity”.http://databases.about.com/:
About.com.http://databases.about.com/cs/administration/g/refintegrity.htm.
Retrieved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill
Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM.
(1986).pp. 325-336.
Ramakrishnan, R. &Gehrke, J. (2000). .Database Management Systems (2nd
ed.). McGraw-Hill Higher Education.
Seltzer, M. (2008).“Beyond Relational Databases”. Communications of the
ACM, 51(7), pp. 52-58.
Teorey, T.; et al. (2005). Database Modeling& Design: Logical Design. (4th
ed.). Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA:
Morgan Kaufmann Publishers.
Thomas, C.; et al. (2009).Database Systems: A Practical Approach to Design,
Implementation and Management.
Tsitchizris, D. C. &Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs:
Prentice-Hall.
75
STUDY SESSION 3
The Relational Algebra
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Formal Definition of a Relational Algebra
2.2- Fundamental Operations
2.2.1- The Select Operation
2.2.2- The Project Operation
2.2.3- The Cartesian product Operation
2.2.4- The Rename Operation
2.2.5- The Union Operation
2.2.6- The Set Difference Operation
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
Welcome to a new session. Here we shall be looking into relational algebra. We are
going to highlight the fundamental operations and also specify the outcome that
corresponds to each operation. As you study this session, I urge you to take note of the
key points.
3. State the specific role of each operation
4. Identify the operators that correspond to each operation.
General expressions of a relational algebra are formed out of smaller subexpressions
using:
1. $\sigma_p(E_1)$ select ($p$ a predicate)
2. $\Pi_s(E_1)$ project ($s$ a list of attributes)
3. $\rho_x(E_1)$ rename ($x$ a relation name)
4. $E_1 \cup E_2$ union
5. $E_1 - E_2$ set difference
6. $E_1 \times E_2$ Cartesian product
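The six fundamental operations listed above can be sketched in plain Python. This is a minimal illustration, not a definitive implementation: a relation is assumed to be a pair (attribute names, set of tuples), the function names are my own, and the sample relations are hypothetical.

```python
def select(pred, rel):
    """Keep the tuples for which the predicate holds."""
    attrs, rows = rel
    return attrs, {r for r in rows if pred(dict(zip(attrs, r)))}

def project(names, rel):
    """Keep only the listed attributes; duplicates vanish because rows form a set."""
    attrs, rows = rel
    idx = [attrs.index(n) for n in names]
    return tuple(names), {tuple(r[i] for i in idx) for r in rows}

def rename(x, rel):
    """Prefix every attribute with the new relation name x."""
    attrs, rows = rel
    return tuple(f"{x}.{a}" for a in attrs), rows

def union(r1, r2):
    """Tuples in either relation; both must have the same attributes."""
    if r1[0] != r2[0]:
        raise ValueError("relations are not compatible")
    return r1[0], r1[1] | r2[1]

def difference(r1, r2):
    """Tuples in the first relation but not in the second."""
    if r1[0] != r2[0]:
        raise ValueError("relations are not compatible")
    return r1[0], r1[1] - r2[1]

def cartesian(r1, r2):
    """Every tuple of r1 paired with every tuple of r2."""
    return r1[0] + r2[0], {a + b for a in r1[1] for b in r2[1]}

# Hypothetical sample relations.
account = (("account_no", "bname", "balance"),
           {("A-101", "Downtown", 500), ("A-215", "Mianus", 700),
            ("A-305", "Downtown", 350)})
borrower = (("cname", "loan_no"), {("Jones", "L-17"), ("Smith", "L-23")})
depositor = (("cname", "account_no"), {("Jones", "A-101"), ("Lindsay", "A-215")})

rich = select(lambda t: t["balance"] > 400, account)
branches = project(["bname"], account)
all_customers = union(project(["cname"], borrower), project(["cname"], depositor))
only_borrowers = difference(project(["cname"], borrower), project(["cname"], depositor))
pairs = cartesian(rename("b", borrower), rename("d", depositor))
print(branches, only_borrowers)
```

Because each function returns a relation, the operations can be nested freely, which is how larger expressions are built from smaller subexpressions.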
In-text Question
Define Relational Algebra
Answer
Relational algebra is a procedural query language.
b. View the animation at https://bit.ly/2Y0OQCd and critique it in the
discussion forum.
MODULE 3
SQL and Integrity Constraints
Contents:
Study Session 1: Structured Query Language (SQL) Fundamentals
Study Session 2: SQL Expressions
Study Session 3: Database Modification
Study Session 4: Integrity Constraints
Study Session 5: Fundamentals of XML
Study Session 6: Significance of XML
Study Session 7: XML Document
STUDY SESSION 1
Structured Query Language (SQL) Fundamentals
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Structured Query Language (SQL)
2.2- Structural Components of SQL
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
This session delves into an important aspect of a database management system: the
Structured Query Language (SQL). We will briefly consider the structural
components of SQL.
1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Identify the structural components of SQL
2. State the specific roles of a data definition language
3. Describe the concept of an interactive data manipulation language.
3. Interactive data manipulation language (DML): a query language based on both
relational algebra and tuple relational calculus, plus commands to insert, delete and
modify tuples.
Answer
1. Define relation schemes.
2. Delete relations.
3. Create indices.
4. Modify schemes.
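The four kinds of DDL command listed above can be tried out in a few lines. The sketch assumes SQLite through Python's sqlite3 module; the branch relation, its columns and the index name idx_bcity are hypothetical, and the exact DDL syntax differs somewhat between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def relation_names(connection):
    """List the relations currently defined in the database."""
    return [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )]

conn.execute("CREATE TABLE branch (bname TEXT, bcity TEXT)")   # define a relation scheme
conn.execute("ALTER TABLE branch ADD COLUMN assets INTEGER")   # modify the scheme
conn.execute("CREATE INDEX idx_bcity ON branch (bcity)")       # create an index

columns = [row[1] for row in conn.execute("PRAGMA table_info(branch)")]
tables_before = relation_names(conn)

conn.execute("DROP TABLE branch")                              # delete the relation
tables_after = relation_names(conn)
print(columns, tables_before, tables_after)
```

None of these statements stores or retrieves any customer data; they only change the design of the database, which is what separates the DDL from the DML.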
3.0 Conclusion/Summary
To wind up, let us go over the main points of this session. The Structured Query
Language (SQL) is regarded as the standard relational database language. SQL
consists of several parts: the data definition language (DDL), the interactive data
manipulation language (DML), the embedded data manipulation language, view
definition, authorisation, integrity and transaction control. The DDL provides
commands to define relation schemes, delete relations, create indices and modify
schemes. The interactive DML is a query language based on both relational algebra
and tuple relational calculus, plus commands to insert, delete and modify tuples.
This session provided an overview of SQL, specifying its structural components. I
hope you have found this session interesting.
(1986). pp. 472 – 482.
Doll, S. (2002). “Is SQL a Standard Anymore?”.TechRepublic’sBuilder.com.
TechRepublic.
http://articles.techrepublic.com.com/5100-10878_11-1046268.html. Retrieved
2010-01-07.
Gehani, N. (2006). The Database Book: Principles and Practice using MySQL.
Summit, NJ: Silicon Press.
itl.nist.gov (1993) Integration Definition for Information Modeling(IDEFIX).
21 December 1993.
Lightstone, S.; et al. (2007). Physical Database Design: The Database
Professional's Guide to Exploiting Indexes, Views, Storage, and More. Morgan
Kaufmann Press.
Kawash, J. (2004). “Complex Quantification in Structured Query Language
(SQL): a Tutorial using Relational Calculus”. Journal of Computers in
Mathematics and Science Teaching. Volume 23, Issue 2, 2004 AACE Norfolk,
Virginia.
Mike, C. (2011). “Referential Integrity”.http://databases.about.com/:
About.com.http://databases.about.com/cs/administration/g/refintegrity.htm.
Retrieved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill
Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM.
(1986).pp. 325-336.
Ramakrishnan, R. &Gehrke, J. (2000). Database Management Systems (2nd
Ed.). McGraw-Hill Higher Education.
Seltzer, M. (2008).“Beyond Relational Databases”. Communications of the
ACM, 51(7), pp. 52-58.
Teorey, T.; et al. (2005). Database Modeling& Design: Logical Design (4th
ed.). Morgan Kaufmann Press.
83
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA:
Morgan Kaufmann Publishers.
Thomas, C.; et al. (2009). Database Systems: A Practical Approach to Design,
Implementation and Management.
Tsitchizris, D. C. &Lochovsky, F.H. (1982). Data Models. Englewood-Cliffs:
Prentice-Hall.
84
STUDY SESSION 2
SQL Expressions
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Components of an SQL Expression
2.2- The ‘SELECT’ Clause
2.3- The ‘WHERE’ Clause
2.4- The ‘FROM’ Clause
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
In the previous session, we established the structural components of the Structured
Query Language (SQL). In this unit, we will take a closer look at the
components of a classical SQL expression.
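An SQL expression consists of ‘select’, ‘from’ and ‘where’ clauses. Ordinarily, a
query has the form
select A1, A2, ..., An
from r1, r2, ..., rm
where P
Fig 3.2.1: General form of an SQL query
Here each Ai is an attribute, each ri is a relation, and P is a predicate.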
An example of the select clause is as follows: Find the names of all branches in the
account relation.
select bname from account
distinct vs. all: the keyword distinct eliminates duplicates from the result, while all
retains them.
Find the names of all branches in the account relation, with duplicates removed.
select distinct bname from account
By default, duplicates are not removed. We can state this explicitly using all.
select all bname from account
When an asterisk follows the select keyword, i.e. select *, it denotes all the
attributes of the relations named in the from clause. Arithmetic operations can also be
used in the selection list.
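The difference between distinct and all can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in database system (the course does not prescribe one); the sample rows and the column name account_no, used here in place of account#, are illustrative and not taken from the course material.

```python
import sqlite3

# Hypothetical sample data for the account relation (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute("create table account (bname text, account_no text, balance integer)")
conn.executemany(
    "insert into account values (?, ?, ?)",
    [("SFU", "A-101", 500), ("SFU", "A-102", 700), ("Surrey", "A-201", 900)],
)

# 'all' is the default behaviour: one result row per tuple, duplicates kept.
all_names = [row[0] for row in conn.execute("select all bname from account")]

# 'distinct' removes duplicate branch names from the result.
distinct_names = [row[0] for row in conn.execute("select distinct bname from account")]

# Arithmetic in the selection list: each balance increased by 5 percent.
raised = [row[0] for row in conn.execute("select balance * 1.05 from account")]

print(sorted(all_names))
print(sorted(distinct_names))
print(raised)
```

With these rows, the first query returns the SFU branch twice because it holds two accounts, while the second returns each branch name once.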
In-text Question
What is the ‘select’ clause?
Answer
The select clause lists the attributes to be copied into the result. It corresponds to the
projection operation of relational algebra.
The ‘where’ clause corresponds to the selection predicate of relational algebra. It can
be applied as in the following example: Find the account numbers of
accounts with balances between $90,000 and $100,000.
select account#
from account
where balance between 90000 and 100000
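A short sketch of the between comparison follows. It again assumes Python's sqlite3 module as the database, with made-up rows and the column name account_no standing in for account#. The point it illustrates is that between includes both end values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (bname text, account_no text, balance integer)")
# Hypothetical balances chosen to sit just outside, on, and inside the range.
conn.executemany(
    "insert into account values (?, ?, ?)",
    [
        ("SFU", "A-101", 89999),
        ("SFU", "A-102", 90000),
        ("Surrey", "A-201", 95000),
        ("Surrey", "A-202", 100000),
        ("Surrey", "A-203", 100001),
    ],
)

# 'between 90000 and 100000' means balance >= 90000 and balance <= 100000.
in_range = [
    row[0]
    for row in conn.execute(
        "select account_no from account "
        "where balance between 90000 and 100000 order by account_no"
    )
]
print(in_range)
```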
The ‘from’ clause corresponds to the Cartesian product of relational algebra: it lists
the relations to be used. By itself, the ‘from’ clause defines a Cartesian product of the
relations it lists.
Early versions of SQL had no direct equivalent of the natural join (SQL-92 later added
a natural join operator). However, a natural join can always be expressed
in terms of a Cartesian product, selection, and projection. For example, the relational
algebra expression Π cname, loan# (borrower ⋈ loan) can be written as the following
SQL statement:
select distinct cname, borrower.loan#
from borrower, loan
where borrower.loan# = loan.loan#
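The way a join is built from a Cartesian product and a selection can be sketched as follows. The sketch assumes Python's sqlite3 module, invented sample rows, and the column name loan_no in place of loan#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table borrower (cname text, loan_no text)")
conn.execute("create table loan (loan_no text, bname text, amount integer)")
# Hypothetical rows: two borrowers and three loans.
conn.executemany("insert into borrower values (?, ?)",
                 [("Smith", "L-11"), ("Jones", "L-17")])
conn.executemany("insert into loan values (?, ?, ?)",
                 [("L-11", "SFU", 900), ("L-17", "Surrey", 1000), ("L-23", "SFU", 2000)])

# The from clause alone pairs every borrower tuple with every loan tuple.
product_size = conn.execute("select count(*) from borrower, loan").fetchone()[0]

# The where clause keeps only the pairs that agree on the loan number,
# and the select clause projects the attributes we want.
joined = conn.execute(
    "select distinct cname, borrower.loan_no "
    "from borrower, loan "
    "where borrower.loan_no = loan.loan_no "
    "order by cname"
).fetchall()

print(product_size)
print(joined)
```

The Cartesian product has 2 x 3 = 6 tuples; the selection reduces these to the two pairs whose loan numbers match.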
In-text Question
Define the ‘from’ clause.
Answer
This is the clause that corresponds to the Cartesian product; it lists the relations to be used. The
‘from’ clause by itself defines a Cartesian product of the relations in the clause.
3.0 Conclusion/Summary
In conclusion, an SQL expression consists of select, from and where clauses. The select
clause lists the attributes to be copied into the result. The ‘where’ clause corresponds
to the selection predicate of relational algebra, while the ‘from’ clause corresponds to
the Cartesian product and lists the relations to be used.
STUDY SESSION 3
Database Modification
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Modes of Database Modification
2.1.1- Deletion
2.1.2- Insertion
2.1.3- Updates
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
Up until now, we have looked at extracting information from the database.
In this unit, we shall consider the common modes of database modification.
An interesting feature of databases is their capacity to be modified. We will consider
the common forms of database modification in the sections that follow.
2.1.1 Deletion
Deletion is expressed in much the same way as a query. Instead of being displayed, the
selected tuples are removed from the database. We can only delete whole tuples. The
syntax for deletion in SQL is given as:
delete from r
where P
Tuples in r for which P is true are deleted. If the ‘where’ clause is
omitted, all tuples are deleted. For instance, the request delete from loan deletes all
tuples from the relation loan. Other examples are as follows:
1. Delete all of Smith’s account records.
delete from depositor
where cname = 'Smith'
2. Delete all loans with loan numbers between 1300 and 1500.
delete from loan
where loan# between 1300 and 1500
3. Delete all accounts at branches located in Surrey.
delete from account
where bname in
(select bname
from branch
where bcity = 'Surrey')
Tuples can only be deleted from one relation at a time, but we may reference any
number of relations in a select-from-where clause embedded in the where clause of a
delete. However, if the delete request contains an embedded select that references the
relation from which tuples are to be deleted, ambiguities may result. For example, to
delete the records of all accounts with balances below the average, we might write
delete from account where balance < (select avg(balance) from account)
In this case, when we delete tuples from account, the average balance changes! SQL
avoids the ambiguity by first testing every tuple against the predicate and only then
deleting the tuples that satisfied it, so the average is computed over the original
relation.
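The ambiguity can be made concrete with a small experiment. The sketch below assumes Python's sqlite3 module and invented balances; naive_delete is a hypothetical helper written only to show what would happen if the average were recomputed after every single deletion.

```python
import sqlite3

balances = [100, 200, 300, 400, 1000]  # hypothetical balances; the average is 400

conn = sqlite3.connect(":memory:")
conn.execute("create table account (balance integer)")
conn.executemany("insert into account values (?)", [(b,) for b in balances])

# The average is taken over the original relation before any tuple is removed.
conn.execute("delete from account where balance < (select avg(balance) from account)")
remaining = [row[0] for row in conn.execute("select balance from account order by balance")]


def naive_delete(values):
    """Hypothetical faulty strategy: recompute the average after each deletion."""
    kept = list(values)
    for value in values:
        average = sum(kept) / len(kept)
        if value < average:
            kept.remove(value)
    return kept


print(remaining)
print(naive_delete(balances))
```

With these figures the SQL statement keeps the accounts holding 400 and 1000, whereas the naive strategy would also delete the account holding 400, because the average rises each time a low balance is removed.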
2.1.2 Insertion
To insert data into a relation, we either specify a tuple, or write a query whose result is
the set of tuples to be inserted. Attribute values for inserted tuples must be members of
the attribute's domain.
Here are some examples:
1. To insert a tuple for Smith, who has $1200 in account A-9372 at the SFU
branch:
insert into account
values ('SFU', 'A-9372', 1200)
2. To give every loan at the SFU branch a new $200 savings account, with the loan
number serving as the account number:
insert into account
select bname, loan#, 200
from loan
where bname = 'SFU'
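Both forms of insertion can be sketched as follows, assuming Python's sqlite3 module, invented loan rows, and the column names account_no and loan_no in place of account# and loan#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (bname text, account_no text, balance integer)")
conn.execute("create table loan (bname text, loan_no text, amount integer)")
# Hypothetical loans: two at the SFU branch and one elsewhere.
conn.executemany("insert into loan values (?, ?, ?)",
                 [("SFU", "L-11", 900), ("SFU", "L-12", 500), ("Surrey", "L-17", 1000)])

# Form 1: specify a single tuple.
conn.execute("insert into account values ('SFU', 'A-9372', 1200)")

# Form 2: insert the result of a query, one new account per SFU loan.
conn.execute(
    "insert into account "
    "select bname, loan_no, 200 from loan where bname = 'SFU'"
)

accounts = conn.execute(
    "select bname, account_no, balance from account order by account_no"
).fetchall()
print(accounts)
```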
2.1.3 Updates
The ‘update’ statement allows us to change some values in a tuple without necessarily
changing all. The examples below demonstrate this:
1. To increase all account balances by 5 percent:
update account
set balance = balance * 1.05
This statement is applied to every tuple in account.
2. To pay two different rates of interest, depending on the balance amount:
update account
set balance = balance * 1.06
where balance > 10000
update account
set balance = balance * 1.05
where balance <= 10000
The order of the two statements matters. If they were reversed, an account just below
10000 could be pushed above 10000 by the 5 percent update and would then receive the
6 percent increase as well.
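Why the order of two conditional updates matters can be checked with a short sketch. It assumes Python's sqlite3 module, two invented accounts, a threshold of 10000 with the lower rate applied to balances of 10000 or less, and a hypothetical helper named run that applies the statements in a given order.

```python
import sqlite3

HIGH = "update account set balance = balance * 1.06 where balance > 10000"
LOW = "update account set balance = balance * 1.05 where balance <= 10000"


def run(statements):
    """Apply the statements in order to a fresh copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table account (account_no text, balance real)")
    # Hypothetical accounts: one just under the threshold, one well above it.
    conn.executemany("insert into account values (?, ?)",
                     [("A-1", 9600.0), ("A-2", 20000.0)])
    for statement in statements:
        conn.execute(statement)
    rows = conn.execute("select account_no, balance from account")
    return {number: round(balance, 2) for number, balance in rows}


higher_rate_first = run([HIGH, LOW])   # 6 percent update applied first
lower_rate_first = run([LOW, HIGH])    # reversed order

print(higher_rate_first)
print(lower_rate_first)
```

When the lower rate is applied first, account A-1 rises from 9600 to 10080, crosses the threshold, and is then increased a second time by the 6 percent statement.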
In-text Question
Define Deletion
Answer
Deletion is expressed in much the same way as a query. Instead of being displayed, the selected
tuples are removed from the database.
3.0 Conclusion/Summary
To wrap up, tuples can only be deleted from one relation at a time. Data is inserted
into a relation by either specifying a tuple, or writing a query whose result is the set of
tuples to be inserted. Selective alteration of tuples is made possible by means of
‘Update’ statements.
We considered the common forms of database modification, specifying the general
syntax of some common database statements. To test your knowledge, attempt the
exercise below.
STUDY SESSION 4
Integrity Constraints
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Domain Constraints
2.1.1- Domain/Key Normal Form (DKNF)
2.1.2- Domain Constraints and Integrity Constraints
2.1.3- Domain Constraint Guidelines
2.2- The ‘Check’ Clause
2.3- Referential Integrity
2.3.1- Referential Integrity in the E-R Model
2.3.2- Referential Integrity in SQL
2.4- Foreign Keys
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
In this session we will learn about the concept of Integrity constraints. We will also
study two core aspects of referential integrity: referential integrity in the E-R Model
and referential integrity in SQL. I hope you are able to grasp the main ideas. Don’t
forget that you can always go back a step or two for revision whenever you are feeling
lost.
1. Define domain constraints
2. Show the link between Domain Constraints and Integrity Constraints
3. Distinguish between referential integrity in the E-R Model and referential
integrity in SQL.
1. Attributes may have the same domain, e.g. cname and employeename.
2. It is not as clear whether bname and cname domains ought to be distinct.
3. At the implementation level, they are both character strings.
4. At the conceptual level, we do not expect customers to have the same names as
branches, in general.
5. Strong typing of domains enables one to test for values inserted, and whether
queries make sense. Newer systems, particularly object-oriented database
systems, offer a rich set of domain types that can be extended easily.
In-text Question
What do you understand by domain constraints?
Answer
A domain constraint specifies the permissible values for a given attribute, while a key constraint
specifies the attributes that uniquely identify a row in a given table.
Fig. 3.4.3 shows an n-ary relationship set R relating entity sets E1, E2, ..., En. Let Ki
denote the primary key of Ei. The attributes of the relation
scheme for relationship set R include K1 ∪ K2 ∪ ... ∪ Kn. Each Ki in the
scheme for R is a foreign key that leads to a referential integrity constraint.
Relation schemes for weak entity sets must include the primary key of the strong entity
set on which they are existence dependent. This is a foreign key, which leads to
another referential integrity constraint.
2. unique key clause includes a list of attributes forming a candidate key.
3. foreign key clause includes a list of attributes forming the foreign key, and the
name of the relation referenced by the foreign key.
The example below summarises the features mentioned so far:
create table customer
(cname char(20) not null,
street char(30),
city char(30),
primary key (cname))
create table branch
(bname char(15) not null,
bcity char(30),
assets integer,
primary key (bname),
check (assets >= 0))
create table account
(account# char(10) not null,
bname char(15),
balance integer,
primary key (account#),
foreign key (bname) references branch,
check (balance >= 0))
create table depositor
(cname char(20) not null,
account# char(10) not null,
primary key (cname, account#),
foreign key (cname) references customer,
foreign key (account#) references account)
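The effect of the check and foreign key clauses can be observed with a small sketch. It assumes Python's sqlite3 module as the database; in SQLite, foreign keys are enforced only after the pragma shown is issued. The column name account_no replaces account#, and the sample rows and the helper named rejected are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    create table branch
    (bname char(15) not null,
     bcity char(30),
     assets integer,
     primary key (bname),
     check (assets >= 0))
""")
conn.execute("""
    create table account
    (account_no char(10) not null,
     bname char(15),
     balance integer,
     primary key (account_no),
     foreign key (bname) references branch,
     check (balance >= 0))
""")

# Hypothetical valid data.
conn.execute("insert into branch values ('SFU', 'Burnaby', 1000000)")
conn.execute("insert into account values ('A-9372', 'SFU', 1200)")


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Violates check (balance >= 0).
negative_balance_rejected = rejected("insert into account values ('A-1', 'SFU', -5)")
# Violates the foreign key: there is no branch named 'Nowhere'.
unknown_branch_rejected = rejected("insert into account values ('A-2', 'Nowhere', 10)")

print(negative_balance_rejected, unknown_branch_rejected)
```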
2.4 Foreign Keys
In SQL, there is a short form for declaring that a single column is a foreign key.
For example:
bname char(15) references branch
Normally, when a referential integrity constraint is violated, the action is rejected.
However, a foreign key clause in SQL-92 can specify steps to be taken to change the
tuples in the referencing relation to restore the constraint.
For example:
create table account
...
foreign key (bname) references branch
on delete cascade
on update cascade,
...
If a delete of a tuple in branch results in the preceding referential integrity constraint
being violated, the delete is not rejected. Instead, the delete “cascades” to the account
relation, deleting the tuples that refer to the branch that was deleted.
Updates are cascaded in the same way: the referencing tuples in account are changed
to the new value of the branch name.
SQL-92 also allows the foreign key clause to specify actions other than cascade, such
as setting the referencing field to null, or to a default value, if the constraint is
violated. If there is a chain of foreign key dependencies across multiple relations, a
deletion or update at one end of the chain can propagate across the entire chain.
If a cascading update or delete causes a constraint violation that cannot be handled by
a further cascading operation, the system aborts the transaction and all the changes
caused by the transaction and its cascading actions are undone.
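Cascading actions can be sketched as follows, assuming Python's sqlite3 module (with foreign key enforcement switched on), invented rows, and the column name account_no in place of account#. The on update cascade clause is used here for the update case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # needed for SQLite to enforce foreign keys

conn.execute("create table branch (bname text primary key)")
conn.execute("""
    create table account
    (account_no text primary key,
     bname text references branch
         on delete cascade
         on update cascade)
""")
# Hypothetical branches and accounts.
conn.executemany("insert into branch values (?)", [("SFU",), ("Surrey",)])
conn.executemany("insert into account values (?, ?)",
                 [("A-1", "SFU"), ("A-2", "Surrey")])


def accounts():
    return conn.execute(
        "select account_no, bname from account order by account_no"
    ).fetchall()


# Renaming a branch is propagated to the accounts that refer to it.
conn.execute("update branch set bname = 'Burnaby' where bname = 'SFU'")
after_update = accounts()

# Deleting a branch also deletes the accounts that refer to it.
conn.execute("delete from branch where bname = 'Surrey'")
after_delete = accounts()

print(after_update)
print(after_delete)
```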
Given the complexity and arbitrary nature of the way constraints in SQL behave with
null values, it is best to ensure that all columns of unique and foreign key
specifications are declared to be non-null.
In-text Question
Define Referential Integrity
Answer
Often we wish to ensure that a value appearing in a relation for a given set of attributes also appears
for another set of attributes in another relation.
3.0 Conclusion/Summary
In this session, we discovered that permissible values for a given attribute are
specified by the domain constraint, while a key constraint specifies the attributes that
uniquely identify a row in a given table. We also considered the fact that the
Domain/Key Normal Form is required in databases to prevent the occurrence of
general constraints that are not clear domain or key constraints. The
‘check’ clause enables the schema designer to specify a predicate that must be satisfied
by any value assigned to a variable whose type is the domain. The fact that strong typing
of domains enables one to test the values to be inserted was highlighted. We also saw
that SQL provides a short form for declaring that a single column is a foreign key.
In this session we also introduced the concept of domain constraints and integrity
constraints. The ‘check’ clause and referential integrity were equally highlighted. We
hope you enjoyed this unit.
5.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube at https://bit.ly/2WLLcb5. Watch the video and summarise it in one
paragraph.
b. View the animation at https://bit.ly/2Lx73jR and critique it in the
discussion forum.
STUDY SESSION 5
Fundamentals of XML
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- What is XML?
2.2- Common Concepts of XML
2.2.1- Documents Concept
2.2.2- XML and HTML
2.2.3- XML and SGML
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations &Out of Class activities)
6.0 References/Further Readings
Introduction:
This unit presents an introduction to the Extensible Markup Language (XML) at a
reasonably technical level, in order to give more insight into the subject of structured
documents. In addition to covering the XML 1.0 Specification, we shall also
highlight related XML specifications, which are still evolving. I would like you to take
note of these key points.
5. State the difference between XML and SGML.
Structured information contains both content (words, pictures, etc.) and some
indication of what role that content plays (for example, content in a section heading
has a different meaning from content in a footnote, which means something different
from content in a figure caption or content in a database table, etc.). Almost all
documents have some structure.
Fig 3.5.1: What is XML?
Source: staff.washington.edu
2.2.2 XML and HTML
In HTML, both the tag semantics and the tag set are fixed. An <h1> is always a first
level heading and the tag <ati.product.code> is meaningless. The W3C (this means the
World Wide Web Consortium), in conjunction with browser vendors and the WWW
community, is constantly working to extend the definition of HTML to allow new tags
to keep pace with changing technology and to bring variations in presentation
(stylesheets) to the Web. However, these changes are always rigidly confined by what
the browser vendors have implemented and by the fact that backward compatibility is
paramount. And for people who want to disseminate information widely, features
supported by only the latest releases of Netscape and Internet Explorer are not useful.
On the other hand, XML specifies neither semantics nor a tag set. In fact
XML is really a meta-language for describing markup languages. In other words,
XML provides a facility to define tags and the structural relationships between them.
Since there is no predefined tag set, there cannot be any preconceived semantics. All
of the semantics of an XML document will either be defined by the applications that
process them or by stylesheets.
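The idea that XML has no predefined tag set, and that the processing application supplies the meaning, can be sketched in a few lines. The sketch assumes Python's standard xml.etree.ElementTree module; the document, the order and quantity tags, and the values are invented for illustration, while ati.product.code is the tag mentioned above.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document using tags that HTML does not define.
document = """<order>
  <ati.product.code>X-42</ati.product.code>
  <quantity>3</quantity>
</order>"""

root = ET.fromstring(document)

# The parser only recovers the structure: tag names and their content.
fields = {child.tag: child.text for child in root}

# The application decides what the tags mean, e.g. that quantity is a number.
product_code = fields["ati.product.code"]
quantity = int(fields["quantity"])

print(root.tag, product_code, quantity)
```

The parser accepts the tags without knowing anything about them; treating quantity as an integer is a decision made by this program, not by XML.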
In-text Question
What is XML?
Answer
XML is an extensible markup language for documents containing structured information.
Defining XML as an application profile of SGML means that any fully conformant
SGML system will be able to read XML documents. However, using and
understanding XML documents does not require a system that is capable of
understanding the full generality of SGML. XML is, roughly speaking, a restricted
form of SGML.
Fig 3.5.2: XML and SGML
Source: cscie12dce.harvard.edu
For technical purists, it is important to note that there may also be subtle differences
between documents as understood by XML systems and SGML systems. In
particular, treatment of white space immediately adjacent to tags may be different.
In-text Question
Define XML and HTML
Answer
XML is defined as an application profile of SGML, the Standard Generalized
Markup Language defined by ISO 8879. XML specifies neither semantics nor a tag set,
whereas in HTML both the tag semantics and the tag set are fixed.
3.0 Conclusion/Summary
We learnt that XML is an extensible mark-up language for documents containing
structured information. Structured information contains both content (words, pictures,
etc.) and some indication of what role that content plays. Almost all documents have
some structure. A mark-up language is a mechanism to identify structures in a
document. The XML specification defines a standard way to add mark-up to
documents.
The term “document” refers not only to traditional documents, but also to the myriad
of other XML “data formats”. These include vector graphics, e-commerce
transactions, mathematical equations, object meta-data, server APIs, and other kinds
of structured information.
In HTML, both the tag semantics and the tag set are fixed. On the other hand, XML
specifies neither semantics nor a tag set. In fact XML is really a meta-language for
describing markup languages. In other words, XML provides a facility to define tags
and the structural relationships between them. Since there is no predefined tag set,
there cannot be any preconceived semantics. All of the semantics of an XML
document will either be defined by the applications that process them or by
stylesheets.
a. Visit YouTube at https://bit.ly/2Gc4Zv5. Watch the video and summarise it in one
paragraph.
b. View the animation at https://bit.ly/2Y6SSns and critique it in the
discussion forum.
STUDY SESSION 6
Significance of XML
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Why XML?
2.2- XML Development Goals
2.3- Defining XML
2.3.1- Extensible Markup Language (XML) 1.0
2.3.2- XML Pointer Language (XPointer) and XML Linking
Language (XLink)
2.3.3- Extensible Style Language
2.3.4- Understanding the Specs
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations &Out of Class activities)
6.0 References/Further Readings
Introduction:
The previous unit introduced some general concepts of the Extensible Markup
Language (XML). In this unit, we will learn the relevance of XML and its development
goals. In order to understand the XML specifications, we shall also discuss the
XML Pointer Language (XPointer), the XML Linking Language (XLink) and the
Extensible Style Language.
2. Identify the development goals of the Extensible Markup Language (XML)
3. Know how you would view an XML document, assuming you do not have an
XML browser
4. State the relationship between the XML Pointer Language and the XML Linking
Language.
This is not to say that XML can be expected to completely replace SGML. While
XML is being designed to deliver structured content over the web, some of the very
features it lacks to make this practical, make SGML a more satisfactory solution for
the creation and long-time storage of complex documents. In many organisations,
filtering SGML to XML will be the standard procedure for web delivery.
2.2 XML Development Goals
The W3C Recommendation, Extensible Markup Language (XML) 1.0, sets out the
following goals for XML:
1. It shall be straightforward to use XML over the Internet. Users must be able to
view XML documents as quickly and easily as HTML documents. In practice, this
will only be possible when XML browsers are as robust and widely available as
HTML browsers, but the principle remains.
2. XML shall support a wide variety of applications. XML should be beneficial to
a wide variety of diverse applications: authoring, browsing, content analysis, etc.
Although the initial focus is on serving structured documents over the web, it is not
meant to narrowly define XML.
3. XML shall be compatible with SGML. Most of the people involved in the XML
effort come from organisations that have a large, in some cases staggering, amount of
material in SGML. XML was designed pragmatically, to be compatible with existing
standards while solving the relatively new problem of sending richly structured
documents over the web.
4. It shall be easy to write programs that process XML documents. The
colloquial way of expressing this goal while the spec was being developed was that it
ought to take about two weeks for a competent computer science graduate student to
build a program that can process XML documents.
5. The number of optional features in XML is to be kept to an absolute minimum,
ideally zero. Optional features inevitably raise compatibility problems when users
want to share documents, which sometimes leads to confusion and frustration.
6. XML documents should be human-legible and reasonably clear. If you do not
have an XML browser and you have received a chunk of XML from somewhere, you
ought to be able to look at it in your favourite text editor and actually figure out what
the content means.
7. The XML design should be prepared quickly. Standards efforts are notoriously
slow. XML was needed immediately and was developed as quickly as possible.
8. The design of XML shall be formal and concise. In many ways a corollary to
goal 4, it essentially means that XML must be expressed in EBNF and must be
amenable to modern compiler tools and techniques. There are a number of technical
reasons why the SGML grammar cannot be expressed in EBNF. Writing a proper
SGML parser requires handling a variety of rarely used and difficult-to-parse language
features. XML does not.
9. XML documents should be easy to create. Although, there will eventually be
sophisticated editors to create and edit XML content, they may not appear
immediately. In the interim, it must be possible to create XML documents in other
ways: directly in a text editor, with simple shell and Perl scripts, etc.
10. Terseness in XML mark-up is of minimal importance. Several SGML language
features were designed to minimise the amount of typing required to key in
SGML documents manually. These features are not supported in XML. From an abstract
point of view, documents that use these features are indistinguishable from their more
fully specified forms, but supporting the features adds a considerable burden to the
SGML parser (or the person writing it, anyway). In addition, most modern editors offer
better facilities to define shortcuts when entering text.
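Goals 4 and 9 above (easy to process, easy to create) can be illustrated with a short script. This is only a sketch, assuming Python's standard xml.etree.ElementTree module; the memo, to and body tags and their text are invented for the example.

```python
import xml.etree.ElementTree as ET

# Create a small XML document from a script (goal 9).
memo = ET.Element("memo")
ET.SubElement(memo, "to").text = "All students"
ET.SubElement(memo, "body").text = "Session 6 covers the XML design goals."
text = ET.tostring(memo, encoding="unicode")

# Process the document again with a few lines of code (goal 4).
parsed = ET.fromstring(text)
recovered = {child.tag: child.text for child in parsed}

# The serialised form is plain text that can be read in any editor (goal 6).
print(text)
print(recovered)
```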
2.3.2 XML Pointer Language (XPointer) and XML Linking Language (XLink)
These specifications define a standard way to represent links between resources. In
addition to simple links, like HTML’s <A> tag, XML has mechanisms for links between
multiple resources and links between read-only resources. XPointer describes how to
address a resource; XLink describes how to associate two or more resources.
In-text Question
Define extensible markup language (XML) 1.0
Answer
This defines the syntax of XML. The XML specification is the primary focus of this unit.
3.0 Conclusion/Summary
In this session, we learnt the significance of XML and its development goals. We
also considered the XML Pointer Language, the XML Linking Language and
the Extensible Style Language. You can now attempt the tutor-marked assignment
below. Good luck!
2. State the development goals of the Extensible Markup Language (XML).
3. Assuming you do not have an XML browser, how else would you view an XML
document?
4. Identify the relationship between the XML Pointer Language and the XML Linking
Language.
STUDY SESSION 7
XML Documents
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- A Simple XML Document
2.2- Model of an XML Document
2.3- XML Markup and Content
2.3.1- Elements
2.3.2- Attributes
2.3.3- Entity References
2.3.4- Comments
2.3.5- Processing Instructions
2.3.6- CDATA Section
2.3.7- Document Type Declaration
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations &Out of Class activities)
6.0 References/Further Readings
Introduction:
Welcome to yet another study session! I bet you are already feeling like a computer
expert. Well, not so fast. There is still a lot you need to learn, which is why, in this
unit, we shall be looking into the simple XML document as well as XML Mark-up
and Content. You will equally learn about six kinds of mark-up that can occur in an
XML document: elements, entity references, comments, processing instructions,
marked sections, and document type declarations.
1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Describe a simple XML document
2. Outline the main ideas of a model XML document
3. Explain the term ‘character references’
4. Discuss the main role of a CDATA section in a document
5. Mention the core components of XML documents
6. Identify six kinds of markup that can occur in an XML document.
The document begins with a processing instruction: <?xml ...?>. This is the XML
declaration. While it is not required, its presence explicitly identifies the document as
an XML document and indicates the version of XML for which it was authored.
There is no document type declaration. Unlike SGML, XML does not require a
document type declaration. However, a document type declaration can be supplied,
and some documents will require one in order to be understood unambiguously.
Empty elements (<applause/> in this example) have a modified syntax. While most
elements in a document are wrappers around some content, empty elements are simply
markers where something occurs (a horizontal rule for HTML's <hr> tag, for example,
or a cross reference for DocBook’s <xref> tag). The trailing /> in the modified syntax
indicates to a programme processing the XML document that the element is empty and
no matching end-tag should be sought.
Since XML documents do not require a document type declaration, without this clue it
could be impossible for an XML parser to determine which tags were intentionally
empty and which had been left empty by mistake. XML has softened the distinction
between elements which are declared as EMPTY and elements which merely have no
content. In XML, it is legal to use the empty-element tag syntax in either case. It’s also
legal to use a start-tag/end-tag pair for empty elements: <applause></applause>. If
interoperability is of any concern, it is best to reserve empty-element tag syntax for
elements which are declared as EMPTY and to only use the empty-element tag form
for those elements.
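The equivalence of the two spellings can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree parser, which is not part of the course text, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same empty element.
short_form = ET.fromstring("<oldjoke><applause/></oldjoke>")
long_form = ET.fromstring("<oldjoke><applause></applause></oldjoke>")

for root in (short_form, long_form):
    applause = root.find("applause")
    # An empty element has no text and no child elements.
    print(applause.tag, applause.text, len(applause))
```

Both forms reach the programme as the same kind of element, which is why the choice between them is mainly a matter of interoperability and style.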
2.3.1 Elements
Elements are the most common form of mark-up. Delimited by angle brackets, most
elements identify the nature of the content they surround. Some elements may be
empty, as seen above, in which case they have no content. If an element is not empty, it
begins with a start-tag and ends with a matching end-tag.
Fig 3.7.2: Elements
Source: homesciencetool.com
2.3.2 Attributes
Attributes are name-value pairs that occur inside start-tags after the element name. For
example,
<div class="preface">
is a div element with the attribute class having the value preface. In XML, all attribute
values must be quoted.
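As a rough illustration, assuming Python's standard xml.etree.ElementTree parser (an assumption of this sketch, not something the course prescribes), the attribute can be read back by name, and an unquoted value is rejected as not well-formed:

```python
import xml.etree.ElementTree as ET

# A quoted attribute value parses and can be read back by name.
div = ET.fromstring('<div class="preface"></div>')
print(div.get("class"))

# An unquoted attribute value is not well-formed XML, so parsing fails.
try:
    ET.fromstring("<div class=preface></div>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False
print(unquoted_accepted)
```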
In-text Question
What is the Typical Model of an XML Document?
Answer
<?xml version="1.0"?>
<oldjoke>
<burns>Say <quote>goodnight</quote>,
Gracie.</burns>
<allen><quote>Goodnight,
Gracie.</quote></allen>
<applause/>
</oldjoke>
document as content, there must be an alternative way to represent them. In XML,
entities are used to represent these special characters. Entities are also used to refer to
often repeated or varying text and to include the content of external files.
Every entity must have a unique name. Defining your own entity names is discussed in
the section on entity declarations. In order to use an entity, you simply reference it by
name. Entity references begin with an ampersand and end with a semicolon. For
example, the lt entity inserts a literal < into a document. So the string <element> can
be represented in an XML document as &lt;element>.
A special form of entity reference, called a character reference, can be used to insert
arbitrary Unicode characters into your document. This is a mechanism for inserting
characters that cannot be typed directly on your keyboard.
Character references take one of two forms: decimal references, &#8478;, and
hexadecimal references, &#x211E;. Both of these refer to character number U+211E
from Unicode (which is the standard Rx prescription symbol, in case you were
wondering).
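The following minimal sketch, assuming Python's standard xml.etree.ElementTree parser, shows a parser turning entity references and both forms of character reference back into the characters they stand for. The sample sentence is made up for illustration.

```python
import xml.etree.ElementTree as ET

# &lt; and &gt; are predefined entities; &#8478; and &#x211E; are the decimal
# and hexadecimal character references for the same character, U+211E.
element = ET.fromstring("<p>&lt;element&gt; costs &#8478; or &#x211E;</p>")
print(element.text)

# The decimal and hexadecimal numbers name the same code point.
same_code_point = (8478 == 0x211E)
print(same_code_point)
```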
2.3.4 Comments
Comments begin with <!-- and end with -->. Comments can contain any data except
the literal string --. You can place comments between markup anywhere in your
document. Comments are not part of the textual content of an XML document. An
XML processor is not required to pass them along to an application.
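As an illustration of a processor that does not pass comments along, the sketch below assumes Python's standard xml.etree.ElementTree parser with its default settings, which discard comments. The sample markup is invented for the example.

```python
import xml.etree.ElementTree as ET

# The comment is legal markup, but it is not part of the element's text.
root = ET.fromstring("<burns><!-- stage direction -->Say goodnight</burns>")
print(root.text)   # only the character data survives
print(len(root))   # no child nodes were created for the comment
```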
Processing instructions have the form: <?name pidata?>. The name, called the PI
target, identifies the PI to the application. Applications should process only the targets
they recognise and ignore all other PIs. Any data that follows the PI target is optional;
it is for the application that recognises the target. The names used in PIs may be
declared as notations in order to formally identify them.
PI names beginning with xml are reserved for XML standardisation.
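A minimal sketch of how an application might see a processing instruction, assuming Python's standard xml.dom.minidom module. The PI target render and its data are hypothetical, chosen only for this example.

```python
from xml.dom import minidom

# "render" is a made-up PI target; mode="draft" is its optional data.
document = minidom.parseString('<?render mode="draft"?><oldjoke/>')
instruction = document.firstChild

print(instruction.target)  # the PI target that identifies the PI
print(instruction.data)    # everything after the target, left for the application
```

An application that does not recognise the target would simply skip this node.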
Answer
Attributes are name-value pairs that occur inside start-tags after the element name
3.0 Conclusion/Summary
In this session, we were made to understand that XML documents appear similar to
HTML or SGML. We equally noted that in an XML Document:
1. The document begins with a processing instruction: <?xml ...?>.
2. There’s no document type declaration.
3. Empty elements (<applause/> in this example) have a modified syntax.
The trailing /> in the modified syntax indicates to a programme processing the XML
document that the element is empty and no matching end-tag should be sought. Finally,
we discovered that XML documents are composed of markup and content and that
there are six kinds of markup that can occur in an XML document: elements, entity
references, comments, processing instructions, marked sections, and document type
declarations.
This unit presented the simple XML document as well as XML Mark-up and
Content. We equally considered six kinds of mark-up that can occur in an XML
document: elements, entity references, comments, processing instructions, marked
sections, and document type declarations. In order to assess your comprehension of
the just concluded unit, you need to try out the questions that ensue.
5.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube at https://bit.ly/2MNYzCX. Watch the video and summarise it in one
paragraph.
b. View the animation at https://bit.ly/2y1i3ND and critique it in the
discussion forum.
Codd, E. F. (1970). “A Relational Model of Data for Large Shared Data Banks”.
Communications of the ACM 13 (6): pp. 377–387.
Mike, C. (2011). “Referential Integrity”. About.com.
http://databases.about.com/cs/administration/g/refintegrity.htm. Retrieved 2011-03-17.
Oppel, A. (2004). Databases Demystified. San Francisco, CA: McGraw-Hill
Osborne Media.
Performance Enhancement through Replication in an Object-Oriented DBM.
(1986).pp. 325-336.
Ramakrishnan, R. & Gehrke, J. (2000). Database Management Systems (2nd
ed.). McGraw-Hill Higher Education.
Teorey, T.; et al. (2005). Database Modelling & Design: Logical Design (4th
ed.). Morgan Kaufmann Press.
Teorey, T.J.; et al. (2009). Database Design: Know it All. Burlington, MA:
Morgan Kaufmann Publishers.
MODULE 4
Computer Data Storage and File Structure
Contents:
Study Session 1: Computer Data Storage and Levels
Study Session 2: Features of Storage Technologies
Study Session 3: Common Storage Technologies
Study Session 4: File Organisation
Study Session 5: Document Type Declaration
Study Session 6: Introduction to Web Services
STUDY SESSION 1
Computer Data Storage and File Structure
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Computer Data Storage
2.2- Levels of Storage
2.2.1- Primary Storage
2.2.2- Secondary Storage
2.2.3- Types of Secondary Storage
2.2.4- Tertiary Storage
2.2.5- Off-line Storage
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
In the previous modules, we considered domain constraints, integrity constraints and
referential integrity with respect to Structured Query Language (SQL). In this unit, we
shall be looking at the concept of computer data storage, the different levels of
storage and corresponding forms of storage.
Fig. 4.1.1: Levels of Storage
Historically, early computers used rotating magnetic drums as primary storage. These
were later replaced by magnetic core memory and subsequently by integrated circuits.
This led to modern random-access memory (RAM). It is small and light, but also quite
expensive. (The particular types of RAM used for primary storage are also volatile,
i.e. they lose the information when not powered.)
Main memory is directly or indirectly connected to the central processing unit via a
memory bus. This consists of two buses: an address bus and a data bus. The
CPU first sends a number, called the memory address, over the address bus to indicate
the desired location of the data. It then reads or writes the data itself using the data
bus.
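The two-step pattern of first giving an address and then moving the data can be pictured with a toy model. This is only a sketch: the class name MainMemory and its methods are hypothetical, and real hardware buses are far more involved.

```python
class MainMemory:
    """Toy main memory: a fixed number of addressable byte cells."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def read(self, address):
        # The address selects the cell; the value travels back as the data.
        return self.cells[address]

    def write(self, address, value):
        # The address selects the cell; the value is the data being stored.
        self.cells[address] = value


memory = MainMemory(size=16)
memory.write(address=5, value=42)
print(memory.read(address=5))
```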
As the RAM types used for primary storage are volatile (cleared at start-up), a
computer containing only such storage would not have a source to read instructions
from in order to start the computer. Hence, non-volatile primary storage containing a
small start-up programme (BIOS) is used to bootstrap the computer, that is, to read a
larger programme from non-volatile secondary storage to RAM and start to execute it.
A non-volatile technology used for this purpose is called ROM, which stands for
read-only memory (the terminology may be somewhat confusing as most ROM types are
also capable of random access).
Many types of ‘ROM’ are not literally read only, as updates are possible; however,
updating is slow and memory must be erased in large portions before it can be
re-written. Some embedded systems run programmes directly from ROM (or similar),
because such programmes are rarely changed. Standard computers do not store
non-rudimentary programmes in ROM; rather, they use large capacities of secondary
storage, which is non-volatile as well, and not as costly.
What is Computer Data Storage?
Answer
Computer data storage refers to computer components and recording media that retain digital data
used for computing for some interval of time.
Rotating optical storage devices such as CD and DVD drives have even longer access
times. With disk drives, once the disk read/write head reaches the proper placement
and the data of interest rotates under it, subsequent data on the track are very fast to
access. As a result, in order to hide the initial seek time and rotational latency, data is
transferred to and from disks in large contiguous blocks.
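A small calculation can show why large contiguous blocks hide the initial delay. The figures below (9 ms seek, 4 ms rotational latency, 100 MB/s transfer rate) are assumed purely for illustration and do not describe any particular drive. The function name is also hypothetical.

```python
def effective_throughput_mb_per_s(block_mb, seek_ms, rotational_ms, transfer_mb_per_s):
    """Data moved per second once seek and rotational delays are included."""
    overhead_s = (seek_ms + rotational_ms) / 1000.0
    transfer_s = block_mb / transfer_mb_per_s
    return block_mb / (overhead_s + transfer_s)


# Assumed, illustrative figures: 9 ms seek, 4 ms rotational latency, 100 MB/s.
for block_mb in (0.004, 1.0, 64.0):
    rate = effective_throughput_mb_per_s(block_mb, 9.0, 4.0, 100.0)
    print(block_mb, "MB block:", round(rate, 2), "MB/s")
```

The larger the block, the closer the effective rate comes to the raw transfer rate, because the fixed delay is paid only once per block.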
Some other examples of secondary storage technologies are: flash memory (e.g. USB
flash drives or keys), floppy disks, magnetic tape, paper tape, punched cards,
standalone RAM disks, and Iomega Zip drives. Secondary storage is normally
formatted according to a file system format, which provides the abstraction
necessary to organise data into files and directories, and also provides additional
information (called metadata) which describes the owner of a certain file, the access
time, the access permissions, and other information.
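The metadata kept by a file system can be inspected from a programme. The sketch below assumes Python's standard os, stat, tempfile and time modules and creates a throwaway file just for the demonstration. The exact values printed depend on the operating system.

```python
import os
import stat
import tempfile
import time

# Create a small temporary file so there is something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"sample record")
    path = handle.name

info = os.stat(path)  # ask the file system for the file's metadata
print("owner id:", info.st_uid)
print("permissions:", stat.filemode(info.st_mode))
print("last access:", time.ctime(info.st_atime))
print("size in bytes:", info.st_size)

os.remove(path)  # tidy up the temporary file
```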
Most computer operating systems use the concept of virtual memory, allowing
utilisation of more primary storage capacity than is physically available in the system.
As the primary memory fills up, the system moves the least-used chunks (pages) to
secondary storage devices (to a swap file or page file), retrieving them later when
they are needed. The more of these retrievals from slower secondary storage are
necessary, the more the overall system performance is degraded.
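The effect of limited primary storage can be sketched with a small simulation. It assumes a least-recently-used replacement rule, which is one common reading of "least-used" and is an assumption of this example rather than something the text specifies. The function name and the page numbers are hypothetical.

```python
from collections import OrderedDict


def count_page_faults(reference_string, frame_count):
    """Count how often a page must be fetched from secondary storage."""
    frames = OrderedDict()  # pages in memory, least recently used first
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)        # page was used again
        else:
            faults += 1                     # page must be retrieved
            if len(frames) >= frame_count:
                frames.popitem(last=False)  # move least recently used page out
            frames[page] = None
    return faults


references = [1, 2, 3, 1, 4, 2]
print(count_page_faults(references, frame_count=3))
print(count_page_faults(references, frame_count=4))
```

With fewer frames available, more retrievals from secondary storage are needed for the same sequence of page uses.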
Tape cartridges placed on shelves in the front, robotic arm moving in the back. Visible
height of the library is about 180 cm.
When a computer needs to read information from the tertiary storage, it will first
consult a catalogue database to determine which tape or disc contains the information.
Next, the computer will instruct a robotic arm to fetch the medium and place it in a
drive. When the computer has finished reading the information, the robotic arm will
return the medium to its place in the library.
In modern personal computers, most secondary and tertiary storage media are also
used for off-line storage. Optical discs and flash memory devices are most popular,
and, to a much lesser extent, removable hard disk drives. In enterprise uses, magnetic
tape is predominant. Older examples are floppy disks, Zip disks, or punched cards.
In-text Question
Define Primary Storage
Answer
Primary storage (or main memory or internal memory), commonly referred to as memory, is the only
memory directly accessible to the CPU.
3.0 Conclusion/Summary
On the whole, we learnt that Computer data storage refers to computer components
and recording media that retain digital data used for computing for some interval of
time. Don’t forget that I mentioned that there are different forms of storage, divided
according to their distance from the central processing unit. The fundamental
components of a general-purpose computer are arithmetic and logic unit, control
circuitry, storage space, and input/output devices. The main memory is the only
memory directly accessible to the CPU. It is directly or indirectly connected to the
central processing unit via a memory bus. The CPU continuously reads instructions
stored in the main memory and executes them as required.
On the other hand, the Secondary storage is not directly accessible by the CPU.
Common forms of Secondary Storage include: hard disk drives, rotating optical
storage devices, flash memory, floppy disks, magnetic tape, paper tape, punched
cards, and standalone RAM disks, Iomega Zip drives and a host of others. Most
computer operating systems use the concept of virtual memory. The tertiary storage or
memory involves a robotic storage mechanism which mounts and dismounts
removable mass storage media into a storage device according to the system's
demands. We equally identified the off-line storage as a form of computer data storage
on a device that is not under the control of a processing session. This storage is used to
transfer information, since the detached medium can be easily physically transported.
In this session, we learnt about computer data storage and the levels of storage as well
as classical examples of each form of storage. Now to discover if you have been
following, please answer the questions below.
4.0 Self-Assessment Questions
1. What is the core distinction between the auxiliary and internal memory?
2. Explain the implication of primary storage being volatile.
3. In the data context, list the levels of storage.
STUDY SESSION 2
Features of Storage Technologies
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Volatility
2.1.1- Non-Volatile Memory
2.1.2- Volatile Memory
2.2- Differentiation
2.2.1- Dynamic Random Access Memory
2.2.2- Static Memory
2.3- Mutability
2.3.1- Read/Write Storage or Mutable Storage
2.3.2- Read Only Storage
2.3.3- Slow Write, Fast Read Storage
2.4- Accessibility
2.4.1- Random Access
2.4.2- Sequential Access
2.5- Addressability
2.5.1- Location Addressable
2.5.2- File Addressable
2.5.3- Content Addressable
2.6- Capacity
2.6.1- Raw Capacity
2.6.2- Memory Storage Density
2.7- Performance
2.7.1- Latency
2.7.2- Throughput
2.8- Energy Use
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
In this session, we will be learning about the core aspects of storage technologies
which include: volatility, mutability, accessibility, and addressability. However, for
the implementation of any storage technology, the characteristics worth measuring are
capacity and performance. Keep these ideas in mind and do enjoy your studies.
Fig 4.2.1: Volatility
Source: commodity.com
2.2 Differentiation
2.2.1 Dynamic Random Access Memory
This is a form of volatile memory which also requires the stored information to be
periodically re-read and re-written, or refreshed, otherwise it would vanish.
Answer
A form of volatile memory which also requires the stored information to be periodically re-read and
re-written, or refreshed, otherwise it would vanish.
2.3 Mutability
2.3.1 Read/Write Storage or Mutable Storage
This form of storage allows information to be overwritten at any time. A computer
without some amount of read/write storage for primary storage purposes would be
useless for many tasks. Modern computers typically use read/write storage also for
secondary storage.
Answer
This form of storage allows information to be overwritten at any time.
2.4 Accessibility
This feature can be categorised in two ways:
Fig 4.2.2: Accessibility
Source: medium.com
2.5 Addressability
2.5.1 Location-Addressable
Each individually accessible unit of information in storage is selected with its
numerical memory address. In modern computers, location-addressable storage
is usually limited to primary storage, accessed internally by computer programmes,
since location-addressability is very efficient, but burdensome for humans.
2.5.2 File-Addressable
Information is divided into files of variable length, and a particular file is selected with
human-readable directory and file names. The underlying device is still
location-addressable, but the operating system of a computer provides the file system
abstraction to make the operation more understandable. In modern computers,
secondary, tertiary and off-line storage use file systems.
2.5.3 Content-Addressable
Each individually accessible unit of information is selected on the basis of (part
of) the contents stored there. Content-addressable storage can be implemented using
software (computer programme) or hardware (computer device), with hardware
being the faster but more expensive option. Hardware content-addressable memory is
often used in a computer’s CPU cache.
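The three kinds of addressability can be contrasted in a few lines. This is a software sketch only: the file names are invented, and using a SHA-256 digest as the content-derived key is an assumption of this example, not a description of hardware content-addressable memory.

```python
import hashlib

# Location-addressable: a unit is selected by its numeric address.
cells = [b"alpha", b"beta", b"gamma"]
by_location = cells[1]

# File-addressable: a unit is selected by a human-readable name.
files = {"notes/session1.txt": b"alpha", "notes/session2.txt": b"beta"}
by_name = files["notes/session2.txt"]

# Content-addressable (software sketch): the key is derived from the content.
content_store = {}


def put(data):
    key = hashlib.sha256(data).hexdigest()
    content_store[key] = data
    return key


key = put(b"beta")
by_content = content_store[hashlib.sha256(b"beta").hexdigest()]
print(by_location, by_name, by_content)
```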
In-text Question
What is Random Access?
Answer
Random access means that any location in storage can be accessed at any moment in approximately
the same amount of time.
2.6 Capacity
2.6.1 Raw Capacity
This is the total amount of stored information that a storage device or medium can
hold. It is expressed as a quantity of bits or bytes (e.g. 10.4 megabytes).
2.7 Performance
Fig 4.2.3: Performance
Source: blogs.thomsonreuters.com
2.7.1 Latency
This refers to the time it takes to access a particular location in storage. The relevant
unit of measurement is typically nanosecond for primary storage, millisecond for
secondary storage, and second for tertiary storage. It may make sense to separate read
latency and write latency, and in case of sequential access storage, minimum,
maximum and average latency.
2.7.2 Throughput
The term ‘throughput’ simply refers to the rate at which information can be read from
or written to the storage. In computer data storage, throughput is usually expressed in
terms of megabytes per second or MB/s, though bit rate may also be used. As with
latency, read rate and write rate may need to be differentiated. Also, accessing media
sequentially, as opposed to randomly, typically yields maximum throughput.
Answer
This refers to the compactness of stored information. It is the storage capacity of a medium divided
by a unit of length, area or volume (e.g. 1.2 megabytes per square inch).
3.0 Conclusion/Summary
To wrap up, we discovered that while volatile memory requires constant power to
maintain the stored information, non-volatile memory retains stored information even
if it is not constantly supplied with electric power. Modern computers typically use
read/write storage also for secondary storage. We also learnt that the time it takes to
access a particular location in storage is referred to as latency.
In this unit, we were able to distinguish between volatile and non-volatile memory,
explain the mechanism of read/write storage, identify the difference between
read/write and read-only storage, describe slow write, fast read storage, identify the
forms of accessibility and addressability, and explain the notions of latency and
throughput.
STUDY SESSION 3
Common Storage Technologies
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Semi Conductors
2.2- Methods and Design Paradigm
2.3- Optical
2.4- Magneto-Optical Disc Storage
2.5- Paper Data Storage
2.6- Uncommon
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
We are still on storage. In this session, you will learn the different forms of storage
technologies. We will consider specific examples and some real-life applications. Do
enjoy your studies.
6. Cite common examples of optical disc storage
7. Mention the regular forms of uncommon storage.
2.2.1 Magnetic Disk
1. Floppy disk, used for off-line storage
2. Hard disk drive, used for secondary storage
2.2.2 Magnetic Tape Data Storage (Used for Tertiary and Off-Line Storage)
Previously, magnetic storage was also used for primary storage in the form of magnetic
drum, core memory, core rope memory, thin-film memory, twistor memory or
bubble memory. Also, unlike today, magnetic tape was often used for secondary
storage.
In-text Question
Define Semi-Conductors
Answer
Semiconductor memory uses semiconductor-based integrated circuits to store information.
2. CD-R, DVD-R, DVD+R, BD-R: Write once storage, used for tertiary and off-
line storage
3. CD-RW, DVD-RW, DVD+RW, DVD-RAM, BD-RE: Slow write, fast read
storage, used for tertiary and off-line storage
4. Ultra Density Optical or UDO is similar in capacity to BD-R or BD-RE and is
slow write, fast read storage used for tertiary and off-line storage.
2.6.2 Electro-Acoustic Memory
Delay line memory used sound waves in a substance such as mercury to store
information. Delay line memory was dynamic volatile, cycle sequential read/write
storage, and was used for primary storage.
Fig 4.3.4: Electro-Acoustic Memory
Answer
A typical optical disc stores information on the surface of a circular disc and reads this information
by illuminating the surface with a laser diode and observing the reflection. Optical disc storage is
non-volatile.
3.0 Conclusion/Summary
In conclusion, we have seen that at the moment, the most commonly used data storage
technologies are semiconductor, magnetic, and optical, while paper still finds some
limited usage. Some other fundamental storage technologies have also been used in
the past or are proposed for development.
In sum, we discovered the common types of data storage technologies. We equally
considered the uncommon storage technologies. You can now attempt the questions
below.
http://databases.about.com/cs/administration/g/refintegrity.htm. Retrieved 2011-03-17.
STUDY SESSION 4
File Organisation
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Introduction to File Organisation
2.2- Methods and Design Paradigms
2.3- System File Organisation Specifics
2.4- Factors that affect File Organisation
2.5- File Organisation Techniques
2.5.1- Sequential Organisation
2.5.2- Line-Sequential Paradigms
2.5.3- Indexed-Sequential Organisation
2.5.4- Inverted List Technique
2.5.5- Direct or Hashed Access
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
This session presents key considerations in specifying a system of file organisation as
well as techniques of file organisation. You will actually find this aspect simple and
interesting. Enjoy your studies!
2. Identify the components of file organisation
3. State the key considerations in specifying a system of file organisation
4. Outline the common methods of organising files.
various processes like those found in a typical distributed system or standalone.
Whether the file is on a network and used by a number of users and whether it may be
accessed internally or remotely and how often it is accessed must also be determined.
1. Rapid access to a record or a number of records which are related to each other.
2. The adding, modification, or deletion of records.
3. Efficiency of storage and retrieval of records.
4. Redundancy, being the method of ensuring data integrity.
Thus, a file should be organised in such a way that the records are always available for
processing with no delay. This should be done in line with the activity and volatility of
the information.
Answer
File organisation is the methodology which is applied to structured computer files.
because searching is fast. However, updating is much slower. Content-based queries in
text retrieval systems use inverted indexes as their preferred mechanism. Data items in
these systems are usually stored compressed which would normally slow the retrieval
process, but the compression algorithm will be chosen to support this technique.
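The trade-off between fast searching and slower updating can be seen in a minimal inverted index. This is a sketch under simple assumptions (lower-cased words split on white space, no compression). The function name and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


documents = {
    1: "storage of data on disk",
    2: "retrieval of records from a file",
    3: "file storage and retrieval",
}
index = build_inverted_index(documents)

# A content-based query is a lookup, or an intersection of lookups.
print(sorted(index["storage"]))
print(sorted(index["file"] & index["retrieval"]))
```

Searching only touches the entries for the query words, whereas adding or changing a document means updating the entry of every word it contains.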
Answer
Line-sequential files are similar to sequential files, except that the records can contain only
characters as data.
3.0 Conclusion/Summary
To wrap up, in this session, we discovered that file organisation primarily refers to the
logical arrangement of data in a file system. Two significant components of file
organisation are: the way the internal file structure is arranged and the external file as
it is presented to the operating system or programme that calls it. The design of the file
organisation depends mainly on the system environment. Other design considerations
include whether the file is on a network and used by a number of users, whether it
may be accessed internally or remotely, and how often it is accessed.
The key considerations in specifying a system of file organisation are as follows:
Rapid access to a record or a number of records which are related to each other; the
adding, modification, or deletion of records; Efficiency of storage and retrieval of
records; and Redundancy, being the method of ensuring data integrity. In sum, a file
should be organised in such a way that the records are always available for processing
with no delay. Organising a file depends on what kind of file it happens to be. The
common methods of organising files are: Sequential, Line-sequential, Indexed-
sequential, Inverted list and Direct or hashed access organisation.
We also learnt about file organisation techniques and the factors affecting file
organisation. By now, you must certainly be ready to answer the questions below.
http://office.microsoft.com/en-us/access/HA012242471033.aspx
TechRepublic. http://articles.techrepublic.com.com/5100-10878_11-1046268.html.
Retrieved 2010-01-07.
STUDY SESSION 5
Document Type Declaration
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- XML Document Declarations
2.2- Types of Declarations in XML
2.2.1- Element Type Declarations
2.2.2- Attribute List Declarations
2.2.3- Entity Declarations
2.2.4- Typical Entity Declarations
2.2.5- Types of Entities
2.2.6- Notation Declarations
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
This unit presents the notion of document type declarations with respect to XML
documents. Four main types of XML declarations are identified as follows: Element
Type Declarations, Attribute List Declarations, Entity Declarations and Notation
Declarations. Do take note of these main ideas as we proceed. And I urge you to
enjoy your studies.
2. Identify the components of meta-information
3. Describe the four main types of declarations in XML
4. List and describe the common attribute types.
<gracie><quote><oldjoke>Goodnight,
<applause/>Gracie</oldjoke></quote>
<burns><gracie>Say <quote>goodnight</quote>,
</gracie>Gracie.</burns></gracie>
It is so far outside the bounds of what we normally expect that it appears nonsensical.
It just does not mean anything. However, from a strictly syntactic point of view, there
is nothing wrong with that XML document. So, if the document is to have meaning,
and certainly if you are writing a stylesheet or application to process it, there must be
some constraint on the sequence and nesting of tags. Declarations are where these
constraints can be expressed. Generally, declarations allow a document to
communicate meta-information to the parser about its content. Meta-information
includes the allowed sequence and nesting of tags, attribute values and their types and
defaults, the names of external files that may be referenced and whether or not they
contain XML, the formats of some external (non-XML) data that may be referenced,
and the entities that may be encountered.
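That the scrambled document is syntactically acceptable can be confirmed with a short sketch, assuming Python's standard xml.etree.ElementTree parser, which checks well-formedness only and knows nothing about the intended order of the tags.

```python
import xml.etree.ElementTree as ET

# The nonsensical document from the text, joined into one string.
scrambled = (
    "<gracie><quote><oldjoke>Goodnight, "
    "<applause/>Gracie</oldjoke></quote>"
    "<burns><gracie>Say <quote>goodnight</quote>, "
    "</gracie>Gracie.</burns></gracie>"
)

root = ET.fromstring(scrambled)  # no error: the tags are properly nested
print(root.tag, [child.tag for child in root])
```

The parser accepts the document without complaint, which is exactly why declarations are needed to express what a sensible document looks like.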
<!ELEMENT oldjoke (burns+, allen, applause?)>
This declaration identifies the element named oldjoke. Its content model follows the
element name. The content model defines what an element may contain. In this case,
an oldjoke must contain burns and allen and may contain applause. The commas
between element names indicate that they must occur in succession. The plus after
burns indicates that it may be repeated but must occur at least once. The question
mark after applause indicates that it is optional (it may be absent, or it may occur
exactly once). A name with no punctuation, such as allen, must occur exactly once.
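Python's standard library does not validate documents against declarations, so the sketch below imitates the content model described above (one or more burns, then exactly one allen, then an optional applause) with a regular expression over the child element names. The function name and the regular expression are assumptions of this example, not a real validator.

```python
import re
import xml.etree.ElementTree as ET

# One or more burns, then exactly one allen, then an optional applause.
OLDJOKE_MODEL = re.compile(r"(burns )+allen (applause )?")


def matches_oldjoke_model(xml_text):
    """Check the order and number of an oldjoke element's children."""
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in root)
    return root.tag == "oldjoke" and OLDJOKE_MODEL.fullmatch(child_names) is not None


print(matches_oldjoke_model("<oldjoke><burns/><burns/><allen/><applause/></oldjoke>"))
print(matches_oldjoke_model("<oldjoke><allen/></oldjoke>"))
```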
Declarations for burns, allen, applause and all other elements used in any content
model must also be present for an XML processor to check the validity of a
document. In addition to element names, the special symbol #PCDATA is reserved to
indicate character data. The moniker PCDATA stands for parsed character
data. Elements that contain only other elements are said to have element content.
Elements that contain both other elements and #PCDATA are said to have mixed
content. For example, the definition for burns might be
<!ELEMENT burns (#PCDATA | quote)*>
The vertical bar indicates an “or” relationship, and the asterisk indicates that the content
is optional (may occur zero or more times); therefore, by this definition, burns may
contain zero or more characters and quote tags, mixed in any order. All mixed content
models must have this form: #PCDATA must come first, all of the elements must be
separated by vertical bars, and the entire group must be optional.
Two other content models are possible: EMPTY indicates that the element has no
content (and consequently no end-tag), and ANY indicates that any content is
allowed. The ANY content model is sometimes useful during document conversion,
but should be avoided at almost any cost in a production environment because it
disables all content checking in that element.
Here is a complete set of element declarations for Example 1:
<!ATTLIST oldjoke
name ID #REQUIRED
label CDATA #IMPLIED
status (funny | notfunny) 'funny'>
In this example, the oldjoke element has three attributes: name, which is an ID and is
required; label, which is a string (character data) and is not required; and status,
which must be either funny or notfunny and defaults to funny, if no value is specified.
Each attribute in a declaration has three parts: a name, a type, and a default
value. You are free to select any name you wish, subject to some slight restrictions, but
names cannot be repeated on the same element.
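The three attributes described above (a required name, an optional label, and a status limited to funny or notfunny with funny as the default) can be imitated in a small sketch. The rule table and the function apply_attribute_rules are hypothetical stand-ins for what a validating XML processor would do. They are not part of any XML library.

```python
# Each attribute has a name, a type (here reduced to an allowed set), and a default.
OLDJOKE_ATTRIBUTES = {
    "name": {"required": True, "allowed": None, "default": None},
    "label": {"required": False, "allowed": None, "default": None},
    "status": {"required": False, "allowed": {"funny", "notfunny"}, "default": "funny"},
}


def apply_attribute_rules(attributes, rules=OLDJOKE_ATTRIBUTES):
    """Return the attributes with defaults filled in, or raise ValueError."""
    result = dict(attributes)
    for name, rule in rules.items():
        if name not in result:
            if rule["required"]:
                raise ValueError("missing required attribute: " + name)
            if rule["default"] is not None:
                result[name] = rule["default"]
        elif rule["allowed"] is not None and result[name] not in rule["allowed"]:
            raise ValueError("value not allowed for attribute: " + name)
    return result


print(apply_attribute_rules({"name": "joke1"}))
```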
ii. ID
The value of an ID attribute must be a name [Section 2.3, production 5]. All of the ID
values used in a document must be different. IDs uniquely identify individual elements
in a document. Elements can have only a single ID attribute.
iii. IDREF or IDREFS
An IDREF attribute value must be that of a single ID attribute on some element in the
document. The value of an IDREFS attribute may contain multiple IDREF values
separated by white space.
iv. ENTITY or ENTITIES
An ENTITY attribute value must be the name of a single entity (see the discussion of
entity declarations below). The value of an ENTITIES attribute may contain multiple
entity names separated by white space.
v. NMTOKEN or NMTOKENS
Name token attributes are a restricted form of string attribute. In general, an
NMTOKEN attribute must consist of a single word, but there are no additional
constraints on the word; it does not have to match another attribute or declaration. The
value of an NMTOKENS attribute may contain multiple NMTOKEN values
separated by white space.
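The ID and IDREF rules above can be checked by hand, as the following sketch suggests. It assumes that an attribute called id plays the role of an ID attribute and one called ref plays the role of an IDREF attribute; these names, the seealso element and the sample document are hypothetical, and a validating parser would take the roles from the attribute list declarations instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "id" stands in for an attribute declared as ID,
# "ref" for one declared as IDREF. ElementTree does not read these
# declarations, so the roles are assumed here rather than taken from a DTD.
DOC = """<jokes>
  <oldjoke id="j1"/>
  <oldjoke id="j2"/>
  <seealso ref="j1"/>
  <seealso ref="j9"/>
</jokes>"""

root = ET.fromstring(DOC)

# All ID values used in a document must be different.
ids = [el.get("id") for el in root.iter() if el.get("id") is not None]
duplicate_ids = sorted({value for value in ids if ids.count(value) > 1})

# Every IDREF value must match the ID of some element in the document.
dangling_refs = [
    el.get("ref")
    for el in root.iter()
    if el.get("ref") is not None and el.get("ref") not in ids
]
```

In this sample the reference to j9 has no matching ID, so a validating processor would report the document as invalid.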
Answer
One of the greatest strengths of XML is that it enables one to create personal tag names. However,
for any given application, it is probably not meaningful for tags to occur in a completely arbitrary
order.
Entity declarations enable us to associate a name with some other fragment of content.
That construct can be a chunk of regular text, a chunk of the document type
declaration, or a reference to an external file containing either text or binary data.
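<!ENTITY ATI "ArborText, Inc.">
<!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
<!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
Internal Entities
Using &ATI; anywhere in the document will
insert ArborText, Inc. at that location. Internal entities allow you to define shortcuts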
for frequently typed text or text that is expected to change, such as the revision status
of a document. Internal entities can include references to other internal entities, but it
is an error for them to be recursive.
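A minimal sketch of internal entity expansion, using Python's standard xml.etree.ElementTree module, is given here. The note element and the owner entity are assumptions added for illustration; the owner entity shows one internal entity referring to another.

```python
import xml.etree.ElementTree as ET

# The internal subset declares two internal entities; "owner" (an assumed
# name) refers to the ATI entity inside its replacement text.
DOC = """<!DOCTYPE note [
<!ENTITY ATI "ArborText, Inc.">
<!ENTITY owner "Copyright &ATI;">
]>
<note>&owner;</note>"""

note = ET.fromstring(DOC)

# The parser replaces each entity reference with its replacement text.
expanded = note.text
```

If the company name changes, only the declaration of ATI needs to be edited; every reference picks up the new text.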
External Entities
External entities associate a name with the content of another file. External entities
allow an XML document to refer to the contents of another file. External entities
contain either text or binary data. If they contain text, the content of the external file is
inserted at the point of reference and parsed as part of the referring document. Binary
data is not parsed and may only be referenced in an attribute. Binary data is used to
reference figures and other non-XML content in the document.
The second and third entities in the entity declaration example are external entities.
Using &boilerplate; will insert the contents of the file
/standard/legalnotice.xml at the location of the entity reference. The XML processor
will parse the content of that file as if it occurred literally at that location.
The entity ATIlogo is also an external entity, but its content is binary. The ATIlogo
entity can only be used as the value of an ENTITY (or ENTITIES) attribute (on a
graphic element, perhaps). The XML processor will pass this information along to an
application, but it does not attempt to process the content of /standard/logo.gif.
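The way a processor passes binary entity information along, without reading the file, can be sketched with Python's standard xml.parsers.expat module. The doc and graphic elements, the image attribute and the notation's system identifier are assumptions made for the example.

```python
import xml.parsers.expat

# ATIlogo is an unparsed (binary) entity; "doc", "graphic" and "image"
# are assumed names used only for this illustration.
DOC = """<!DOCTYPE doc [
<!NOTATION GIF87A SYSTEM "GIF">
<!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
<!ATTLIST graphic image ENTITY #REQUIRED>
]>
<doc><graphic image="ATIlogo"/></doc>"""

unparsed = []

def on_entity_decl(name, is_parameter, value, base,
                   system_id, public_id, notation):
    # Unparsed entities are the ones that carry a notation name.
    # The parser only reports the declaration; it never opens the file.
    if notation is not None:
        unparsed.append((name, system_id, notation))

parser = xml.parsers.expat.ParserCreate()
parser.EntityDeclHandler = on_entity_decl
parser.Parse(DOC, True)
```

The application receives the entity name, the file location and the notation, and decides for itself how to display or process the image.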
Parameter Entities
Parameter entities can only occur in the document type declaration. A parameter entity
declaration is identified by placing % (percent-space) in front of its name in the
declaration. The percent sign is also used in references to parameter entities, instead of
the ampersand. Parameter entity references are immediately expanded in the document
type declaration and their replacement text is part of the declaration, whereas normal
entity references are not expanded. Parameter entities are not recognised in the body
of a document.
Looking back at the element declarations in Ex 2, you will notice that two of them
have the same content model:
At the moment, these two elements are the same only because they happen to have the
same literal definition. In order to make more explicit the fact that these two elements
are semantically the same, use a parameter entity to define their content model. The
advantage of using a parameter entity is two-fold. First, it allows you to give a
descriptive name to the content, and secondly, it allows you to change the content
model in only a single place, if you wish to update the element declarations, assuring
that they always stay the same:
<!ELEMENT allen (%personcontent;)*>
Authoring Environments
Most authoring environments need to read and process document type declarations in
order to understand and enforce the content models of the document.
Including a Document Type Declaration
If present, the document type declaration must be the first thing in the document after
optional processing instructions and comments. The document type declaration
identifies the root element of the document and may contain additional declarations.
All XML documents must have a single root element that contains all of the content of
the document. Additional declarations may come from an external DTD, called the
external subset, or be included directly in the document, the internal subset, or both:
<!DOCTYPE chapter SYSTEM "dbook.dtd" [
<!ELEMENT ulink (#PCDATA)*>
<!ATTLIST ulink ...>
]>
<chapter>...</chapter>
This example references an external DTD, dbook.dtd, and includes element and
attribute declarations for the ulink element in the internal subset. In this case, ulink is
being given the semantics of a simple link from the XLink specification.
Note that declarations in the internal subset override declarations in the external
subset. The XML processor reads the internal subset before the external subset and the
first declaration takes precedence.
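The rule that the first declaration takes precedence can be observed within a single internal subset, as in this sketch based on Python's standard xml.etree.ElementTree module. The status element and the revision entity are hypothetical names. Because the internal subset is read before the external subset, the same rule is what allows internal declarations to override external ones.

```python
import xml.etree.ElementTree as ET

# The same entity is declared twice; the first declaration is binding
# and the second one is ignored.
DOC = """<!DOCTYPE status [
<!ENTITY revision "draft">
<!ENTITY revision "final">
]>
<status>&revision;</status>"""

status = ET.fromstring(DOC)
```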
In order to determine if a document is valid, the XML processor must read the entire
document type declaration (both internal and external subsets). But for some
applications, validity may not be required, and it may be sufficient for the processor to
read only the internal subset. In the example above, if validity is unimportant and the
only reason to read the doctype declaration is to identify the semantics of ulink,
reading the external subset is not necessary.
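A non-validating parser behaves in just this way. The sketch below assumes Python's standard xml.parsers.expat module, which by default does not fetch the external subset; the content of the chapter is made up, and no file called dbook.dtd needs to exist for the parse to succeed.

```python
import xml.parsers.expat

# The document names an external DTD and also carries an internal subset.
DOC = """<!DOCTYPE chapter SYSTEM "dbook.dtd" [
<!ELEMENT ulink (#PCDATA)*>
]>
<chapter><ulink>XLink</ulink></chapter>"""

doctype = {}
declared_elements = []

def on_doctype(name, system_id, public_id, has_internal_subset):
    # Reported when the parser meets the document type declaration.
    doctype.update(root=name, system_id=system_id,
                   has_internal_subset=bool(has_internal_subset))

def on_element_decl(name, model):
    # Called once for each element declaration the parser actually reads.
    declared_elements.append(name)

parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)
```

Only the ulink declaration from the internal subset is seen; the declarations in the external subset are never read, so the document cannot be declared valid on this basis.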
In-text Question
What do typical entity declarations look like?
Answer
<!ENTITY ATI "ArborText, Inc.">
<!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
3.0 Conclusion/Summary
In this session, we learnt that XML enables one to create personal
tag names. We also learnt that, to ensure that documents are meaningful, there must
be some constraint on the sequence and nesting of tags. These constraints can be
expressed by means of declarations. Generally, declarations allow a document to
communicate meta-information to the parser about its content.
Meta-information includes the allowed sequence and nesting of tags, attributes, values
and their types and defaults, the names of external files that may be referenced and
whether or not they contain XML, the formats of some external (non-XML) data that
may be referenced, and the entities that may be encountered.
The four kinds of declarations in XML identified are as follows: element type
declarations, attribute list declarations, entity declarations, and notation
declarations. Element type declarations identify the names of elements and the nature
of their content. Attribute list declarations identify which elements may have
attributes, what attributes they may have, what values the attributes may hold, and what
value is the default.
Six possible attribute types were specified as follows: CDATA, ID, IDREF or
IDREFS, ENTITY or ENTITIES, NMTOKEN or NMTOKENS, and a list of names.
Entity declarations enable one to associate a name with some other fragment of content.
We discovered three kinds of entities: internal entities, external entities and parameter
entities. Notation declarations identify specific types of external binary data. This
information is passed to the processing application, which may make whatever use of
it it wishes.
In sum, the document type declaration identifies the root element of the document and
may contain additional declarations. All XML documents must have a single root
element that contains all of the content of the document. Additional declarations may
come from an external DTD, called the external subset, or be included directly in the
document, the internal subset, or both.
Note that declarations in the internal subset override declarations in the external
subset. The XML processor reads the internal subset before the external subset and the
first declaration takes precedence.
We equally learnt that in order to determine if a document is valid, the XML processor
must read the entire document type declaration (both internal and external subsets).
But for some applications, validity may not be required, and it may be sufficient for
the processor to read only the internal subset. In the example above, if validity is
unimportant and the only reason to read the doctype declaration is to identify the
semantics of ulink, reading the external subset is not necessary.
This unit presented details of the concept of document type declarations with respect
to XML documents. Four main types of XML declarations were equally specified as
follows: Element Type Declarations, Attribute List Declarations, Entity Declarations
and Notation Declarations. We hope you enjoyed your studies. It is time to test your
knowledge on this subject, so do attempt all the tasks listed in the tutor-marked
assignment below.
a. Visit YouTube at https://bit.ly/2MOHkkZ. Watch the video and summarise it in one
paragraph.
b. View the animation at https://bit.ly/2y1q3OU and critique it in the
discussion forum.
STUDY SESSION 6
Introduction to Web Services
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Web Services Background
2.2- Website or Web Services Publishing
2.3- Accessing Information from Web Services
2.4- Advantages of Web Services
2.5- Disadvantages of Web Services
2.6- Typical Web Service Invocation
2.7- Web Services Architecture
2.7.1- Service Process
2.7.2- Service Description
2.7.3- Service Invocation
2.7.4- Transport
3.0 Study Session Summary and Conclusion
4.0 Self-Assessment Questions
5.0 Additional Activities (Videos, Animations & Out of Class activities)
6.0 References/Further Readings
Introduction:
This is the last session in this course. I hope that the time we have spent on the
previous sessions has been worthwhile. This session provides an overview of the basic concepts
of Web Services. A basic understanding of how Web Services work will enable
you to appreciate how the Web Services Resource Framework (WSRF) extends Web
Services. Even if you think you already know about Web Services, going through this
section will enhance your understanding of the topic. Enjoy your studies!
1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Describe how Web Services are published
2. Discuss the procedure involved in accessing information from Web
Services
3. Specify the common advantages and disadvantages of Web Services
4. List the components of Web Services Architecture
5. Describe each component of Web Services Architecture.
For example, let us suppose a database is kept with up-to-date information about
weather in the United States, and that information has to be distributed to everyone in
the world. To do so, the weather information could be published through a Web
Service that, given a ZIP code, will provide the weather information for that ZIP code.
Although Web Services rely heavily on existing Web technologies (such as HTTP),
they have no relation to web browsers and HTML. Thus, while websites are for users,
Web Services are for software.
b. Most Web Services use HTTP for transmitting messages (such as the service
request and response). This is a major advantage if you want to build an Internet-scale
application, since most of the Internet's proxies and firewalls will not mess with HTTP
traffic (unlike CORBA, which usually has trouble with firewalls).
2.5 Disadvantages of Web Services
a. Overhead. Transmitting all your data in XML is obviously not as efficient as using a
proprietary binary code. What you win in portability, you lose in efficiency. Even so,
this overhead is usually acceptable for most applications, but you will probably never
find a critical real-time application that uses Web Services.
b. Lack of versatility. Currently, Web Services are not very versatile, since they only
allow for some very basic forms of service invocation. CORBA, for example, offers
programmers a lot of supporting services (such as persistency, notifications, lifecycle
management, transactions, etc.). Fortunately, there are many emerging Web Services
specifications (including WSRF) that are helping to make Web Services more and
more versatile.
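The overhead mentioned under point (a) can be made concrete with a small comparison. This is only a sketch: the weather reading, the element names and the binary layout are all assumptions, and real messages would also carry envelope and protocol data.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical weather reading: a ZIP code and a temperature.
zip_code, temperature = 90210, 21.5

# Binary encoding: one 4-byte unsigned integer and one 4-byte float.
binary_message = struct.pack("<If", zip_code, temperature)

# XML encoding of the same two values.
reading = ET.Element("reading")
ET.SubElement(reading, "zip").text = str(zip_code)
ET.SubElement(reading, "temperature").text = str(temperature)
xml_message = ET.tostring(reading)

# Both forms carry the same information.
decoded = struct.unpack("<If", binary_message)
parsed = ET.fromstring(xml_message)
```

The XML message is several times larger than the binary one, but any platform with an XML parser can read it without knowing the byte layout in advance.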
However, there is one important characteristic that distinguishes Web Services. While
technologies such as CORBA and EJB are geared towards highly coupled distributed
systems, where the client and the server are very dependent on each other, Web
Services are better suited to loosely coupled systems, where the client might have
no prior knowledge of the Web Service until it actually invokes it. Highly coupled
systems are ideal for intranet applications, but perform poorly on an Internet scale.
Web Services, however, are better suited to meet the demands of an Internet-wide
application, such as grid-oriented applications.
In-text Question
Define Website or Web Services Publishing
Answer
Information on a website is intended for users. Conversely, information which is available through a
Web Service will always be accessed by software, never directly by a user (in spite of the fact that
there might be a user using the software).
In order to understand what a Web Service invocation entails, we will need to take a
look at all the steps involved in a complete Web Service invocation.
2. The discovery service will reply, telling us what servers can provide us the service
we require.
3. We now know the location of a Web Service, but we have no idea of how to
actually invoke it. Sure, we know it can give us the forecast for a Nigerian city, but
how do we perform the actual service invocation? The method we have to invoke
might be called “string getCityForecast(int CityPostalCode)”, but it could also be
called “string getNigerianCityWeather(string cityName, bool isFarenheit)”. We have
to ask the Web Service to describe itself (i.e. tell us how exactly we should invoke it).
5. We finally know where the Web Service is located and how to invoke it. The
invocation itself is done in a language called SOAP. Therefore, we will first send a
SOAP request asking for the weather forecast of a certain city.
6. The Web Service will kindly reply with a SOAP response which includes the
forecast we asked for, or maybe an error message if our SOAP request was incorrect.
Invoking a Web Service (and, in general, any kind of distributed service such as a
CORBA object or an Enterprise Java Bean) involves passing messages between the
client and the server. SOAP (Simple Object Access Protocol) specifies how we should
format requests to the server, and how the server should format its responses. In
theory, we could use other service invocation languages (such as XML-RPC, or even
some ad hoc XML language). However, SOAP is by far the most popular choice for
Web Services.
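To give a feel for what a SOAP request might look like, here is a minimal sketch that builds and then reads a SOAP 1.1 envelope with Python's standard xml.etree.ElementTree module. The getCityForecast operation is borrowed from the invocation example; the service namespace, the cityPostalCode element and the postal code value are assumptions, since the real names would come from the service description.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "http://example.org/weather"  # hypothetical service namespace

def build_request(postal_code):
    """Client side: wrap the call to getCityForecast in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}getCityForecast")
    argument = ET.SubElement(call, f"{{{SERVICE_NS}}}cityPostalCode")
    argument.text = str(postal_code)
    return ET.tostring(envelope, encoding="utf-8")

def read_postal_code(message):
    """Server side: extract the argument from the request body."""
    envelope = ET.fromstring(message)
    path = (f"{{{SOAP_NS}}}Body/{{{SERVICE_NS}}}getCityForecast/"
            f"{{{SERVICE_NS}}}cityPostalCode")
    return int(envelope.find(path).text)

request_message = build_request(810001)
```

The response from the server would be another envelope of the same shape, carrying either the forecast or a fault element describing the error.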
2.7.4 Transport
Finally, all these messages must be transmitted somehow between the server and the
client. The protocol of choice for this part of the architecture is HTTP (HyperText
Transfer Protocol), the same protocol used to access conventional web pages on the
Internet. Again, in theory we could use other protocols, but HTTP is
currently the most widely used.
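The transport step amounts to placing the message in the body of an HTTP POST request. The sketch below prepares such a request with Python's standard urllib.request module without sending it; the endpoint address and the shortened message body are hypothetical.

```python
import urllib.request

# Hypothetical endpoint and message body; nothing is sent over the
# network in this sketch.
ENDPOINT = "http://example.org/weather"
soap_message = b"<Envelope>...</Envelope>"

http_request = urllib.request.Request(
    ENDPOINT,
    data=soap_message,
    headers={"Content-Type": "text/xml; charset=utf-8"},
    method="POST",
)
# Calling urllib.request.urlopen(http_request) would transmit the message
# and return the server's HTTP response.
```

Because the request is ordinary HTTP traffic, it passes through proxies and firewalls in the same way as a request for a web page.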
In-text Question
What do you understand by a Service Process?
Answer
This part of the architecture generally involves more than one Web service. For example, discovery
belongs in this part of the architecture, since it allows us to locate one particular service from
among a collection of Web services.
3.0 Conclusion/Summary
To wrap up, we will go over the main ideas discussed in this session. Fundamentally,
Web Services are a distributed computing technology (like CORBA, RMI, EJB, etc.)
that enables one to create client/server applications.
While websites are for users, Web Services are for software.
Most of the Web Services Architecture is specified and standardised by the World
Wide Web Consortium, the same organisation responsible for XML, HTML, CSS, etc.
The Web Services Architecture essentially consists of four parts: Service Processes,
Service Descriptions, Service Invocations and Transport.
We hope that the ideas presented in this unit and the previous units have enlightened
you on database systems, their structures, implementations and
management, as well as the basic concepts of XML and Web Services. We
recommend that you go over the course material and all the references for further
reading.
This unit presented the background of Web Services as well as Website/Web Services
Publishing. We also learnt how to access information from Web Services. Web
Services Invocation and Architecture were also highlighted. We hope that you
found this course interesting and wish you the very best in your studies! However,
before you close this page, do attempt the tasks specified in the tutor-marked
assignment below.
http://office.microsoft.com/en-us/access/HA012242471033.aspx
Gehani, N. (2006). The Database Book: Principles and Practice using MySQL.
Summit, NJ: Silicon Press
XIII Glossary
ACID - An acronym for the properties maintained by standard database
management systems: Atomicity, Consistency, Isolation, and Durability.
Application Server - A server that processes application-specific database operations
made from application client programs. The DBMS is in-process with the application
code for very fast internal access.
Aperiodic Server - Software that is specific to a particular embedded system. Such
application-specific code is generally built on a layered architecture of reusable
components, such as a real-time operating system and network protocol stack or other
middleware.
Atomicity - The property of a transaction that guarantees that either all or none of the
changes made by the transaction are written to the database.
AVL-Tree - An AVL tree is a self-balancing binary search tree.
BLOB - An abbreviation for Binary Large OBject. In SQL, BLOB can be a general
term for any data of type long varbinary, long varchar, or long wvarchar. It is also a
specific term (and synonym) for data of type long varbinary.
Breakpoint - A location in a program at which execution is to be stopped and control
of the processor switched to the debugger. Mechanisms for creating and removing
breakpoints are provided by most debugging tools.
B-tree - An indexing method in which the values of the columns used in the index are
efficiently maintained in sorted order and which also provides fast access (three or four
additional disk accesses) to an individual index entry.
Cache - The computer memory that is set aside to contain a portion of the database
data that has most recently been accessed by the database application program. A
cache is used to minimize the amount of physical disk I/O performed by the DBMS.
Cascade - A foreign key attribute that automatically migrates the changes made to a
referenced (i.e., primary key) table to all of the referencing (foreign key) table rows.
Catalog - A repository for the computer-readable form of a database's data
definition meta-data. Sometimes called the system catalog or just syscat.
Checksum - A numerical check value calculated from a larger set of data. A
checksum is most often used when sending a packet of data over a network or other
communications channel.
Client/Server - A server is a program that runs on a computer that directly manages
the database. A client is a separate program (or process) that communicates with the
database server through some kind of Remote Procedure Call (RPC) in order to
perform application-specific database operations.
Cloud - A recently coined term used to describe an execution model for
computing systems where functions and data are invoked by a name that refers to a
remote system whose location is irrelevant (hence the concept of it being "out there
somewhere", like a cloud).
Column - A single unit of named data that has a particular data type (e.g., number,
text, or date). Columns only exist in tables.
Compiler - A software-development tool that translates high-level language programs
into the machine-language instructions that a particular processor can understand and
execute. However, the object code that results is not yet ready to be run; at least a
linker or link-step must follow.
Commit - The action that causes all of the changes made by a particular
transaction to be reliably written to the database files and made visible to other users.
Concurrency - The property in which two or more computing processes are executing
at the same time.
Connection - The means of communication between a client and a server. A process
may have multiple connections opened, each in its own thread, to one or more
databases at a time.
Consistency - The property of a transaction that guarantees that the state of the
database both before and after execution of the transaction remains consistent (i.e.,
free of any data integrity errors) whether or not the transaction commits or is rolled
back.
Core/Core-level - A lower-level set of database primitives in the form of a complete
API, used by database processors such as SQL or Cursors.
Cost-based Optimization - The process where data distribution statistics (e.g., the
number of rows in a table) are used to guide the SQL query optimizer's choice of the
best way to retrieve the needed data from the database.
Cross-compiler - A compiler that runs on a different platform from the one for which
it produces object code. Often even the processor architecture/family of the host and
target platforms differ.
Cursor - A collection of rows grouped by common criteria (key sequence, set
membership, SELECT result set) that can be navigated and updated.
Data Type - The basic kind of data that can be stored in a column. The data types that
are available in RDM SQL are: char, wchar, varchar, wvarchar, binary, varbinary,
boolean, tinyint, smallint, integer, bigint, real, float, double, date, time, timestamp,
long varbinary, long varchar, and long wvarchar.
Database Instance - An independent database that shares the same schema as another
database. Used only in RDM.
db_VISTA - Original name from 1984 for the Raima DBMS product now
called RDM.
DBMS - An acronym for Database Management System.
DDL - Data Definition Language.
Deadlock - A situation in which resources (i.e. locks) are held by two or more
connections that are each needed by the other connections so that they are stuck in an
infinite wait loop. For example, connection 1 has a lock on table1 and is requesting a
lock on table2 that is currently held by connection 2, which is also requesting a lock
on table1.
Debugger - A tool used to test and debug software. A typical remote debugger runs on
a host computer and connects to the target through a serial port or over a network.
Using the debugger, you can download software to the target for immediate execution.
Deterministic - An attribute of a section of code whereby the limit on the time
required to execute the code is known, or determined, ahead of time. This is
commonly associated with real-time software.
Distributed Database - A database in which data is distributed among multiple
computers or devices (nodes), allowing multiple computers to simultaneously access
data residing on separate nodes. The Internet of Things (IoT) is frequently considered
a vast grid of data collection devices, requiring distributed database functionality to
manage.
DLL - Dynamic Link Library. A library of related functions that are not loaded into
memory until they are called by the application program. All RDM APIs are contained
in DLLs on those operating systems that support them (e.g., MS-Windows). These are
sometimes called shared libraries on some systems.
DML - Data Manipulation Language. In SQL, such statements as UPDATE,
INSERT and DELETE are considered DML.
Documentation - All product-related materials, specifications, technical manuals, user
manuals, flow diagrams, file descriptions, or other written information either included
with products or otherwise. Raima's documentation is online.
Domain - An alternate name for a base data type that is defined using the RDM SQL
create domain statement.
Durability - The property of a transaction in which the DBMS guarantees that all
committed transactions will survive any kind of system failure.
Dynamic DDL - The ability to change the definition of a database (its schema) after
data has been stored in the database without having to take the database off-line or
restructure its files.
Edge Computing - Edge computing refers to the computing infrastructure at the edge
of the network, close to the sources of data. Edge computing reduces the
communications bandwidth needed between sensors and the datacenter. Databases
with tiny footprints, e.g., RDM, are optimized for edge computing.
Embedded Database - An embedded database is the combination of a database and
the database software which typically resides within an application. The database
holds information and the software controls the database to access or store information.
Encryption - The encoding of data so that it cannot be understood by a human reader.
This usually requires the use of an encryption key. A common encryption algorithm is
called AES, which uses encryption keys of 128, 192 or 256 bits.
End-User - An entity that licenses an Application for its own use from Licensee or its
Additional Reseller.
Fog Computing - An architecture that distributes computing, storage, and networking
closer to users, and anywhere along the Cloud-to-Thing continuum. Fog computing is
necessary to run IoT, IIoT, 5G and AI applications.
Foreign Key - One or more columns in a table intended to contain only values that
match the related primary/unique key column(s) in the referenced table. Foreign and
primary keys explicitly define the direct relationships between tables. Referential
Integrity is maintained when every foreign key refers to one and only one existing
primary key.
Geospatial datatypes - Data types which are specifically optimized for storage of
geographic coordinate based data.
Grouped Lock Request - A single operation that requests locks on more than one
table or row at a time. Either all or none of the requested locks will be granted.
Issuing a grouped lock request at the beginning of a transaction that includes all of the
tables/rows that can potentially be accessed by the transaction guarantees that a
deadlock will not occur.
GUI - Graphical User Interface.
Handle - A software identification variable that is used to identify and manage the
context associated with a particular computing process or thread. For example, SQL
uses handles for each user connection (connection handle) and SQL statement
(statement handle) among other things.
Hash - An indexing method that provides for a fast retrieval (usually in only one
additional disk access) of the row that has a matching column value.
Hierarchical Model - A special case of a network model database in which each
record type can participate only as the member of one set.
Hot Spot - In a database, a hot spot is a single shared row of a table that is used and
updated so often that it creates a performance bottleneck on the system.
I/O - Input/output. For a DBMS, this is normally a disk drive, used to create
database durability.
IEC - International Electrotechnical Commission. Along with the ISO, the IEC
controls the SQL standard (ISO/IEC 9075) and many others as well.
IIOT - Abbreviation of Industrial Internet of Things.
Implicit Locking - Done by SQL to automatically apply the locks needed to safely
execute an SQL statement in a multiuser (i.e., shared database) operational
environment.
IMDB - Abbreviation of In-Memory Database
Index - A separate structure that allows fast access to a table's rows based on the data
values of the columns used in the index. RDM supports two indexing types: hash and
b-tree. A SQL key (not foreign key) is implemented using an index.
In-Memory (Inmemory) - A feature in which the DBMS keeps the entire contents of
a database or table available in computer memory at all times while the database is
opened. Frequently, in-memory databases are volatile, meaning that they have little or
no durability if the computer malfunctions. Durability issues are frequently prioritized
below performance, which increases substantially with memory as the storage media.
In-process - When referring to a DBMS, it is in-process when the DBMS code resides
in the process space of the application program that is using it. If the process is single
threaded, then this is a single-user usage of the database(s). A process may have
multiple threads with individual connections to a shared database, making it a multi-
user database. In-process uses Local Procedure Calls (LPC) vs Remote Procedure
Calls (RPC) to a database server in a separate process.
Inner Join - A join between two tables where only the rows with matching foreign
and primary key values are returned.
Internet of Things - A recently coined phrase describing the extended reach of
connected devices. In particular, devices that use computing power to control or sense
their environment and use wifi or wires to connect to the internet.
IoT - Abbreviation of Internet Of Things
IP Address - A numerical identification tag assigned to a computing device in a
network. Originally, internet IP addresses consisted of 32 bits of data, displayed as a
set of four decimal numbers, each from 0 to 255, separated by periods (e.g., 113.12.214.2).
ISO - International Organization for Standardization. Along with the IEC, the ISO
controls the SQL standard (ISO/IEC 9075) and many others as well.
Isolation - The property of a transaction that guarantees that the changes made by a
transaction are isolated from the rest of the system until after the transaction has
committed.
Java - A multi-platform, object-oriented programming language, similar to C++,
which is freely available to any and all software developers. It is particularly important
in the development of internet/web and mobile applications.
JDBC - Java Database Connectivity API. JDBC provides a standard database access
and manipulations API for Java programs. RDM supports JDBC.
Join - An operation in which the rows of one table are related to the rows of another
through common column values.
JSON - A data representation offered as a more compact but still humanly readable
alternative to XML. JSON is the acronym for JavaScript Object Notation, and is
frequently utilized in web/cloud-based applications.
Key - A column or columns on which an index is constructed to allow rapid and/or
sorted access to a table's row.
LAN - A Local Area Network is used to interconnect the computers in a single
geographic location. Contrasted to Wide Area Networks (WAN). Bandwidth (speed)
is a primary difference between local and wide-area networking.
Library - The container for a set of common software API functions. Frequently, a
library is contained in a DLL or Shared Library.
Licensee - A customer that has obtained the right to use and/or distribute Raima
Product(s).
Little-Endian - The little-endian convention is a type of addressing that refers to the
order of data stored in memory. In this convention, the least significant byte (or "littlest"
end) is stored first, at the lowest address, and subsequent bytes are stored at increasing addresses.
Little-endian is the opposite of big-endian, which stores the most significant byte first.
Because they are opposites, it is difficult to integrate two systems that use different
endian conventions.
Local Procedure Call - A software function call to a library function that exists in-
process (same computer, same process space). This is in contrast to Remote Procedure
Calls (RPC), which are made to functions that reside in a different process, whether on the
same computer (using interprocess communication) or on a remote computer (using
networking).
Locking - A method for safely protecting objects from being changed by two or more
users (processes/threads) at the same time. A write (exclusive) lock allows access
from only one user (process/thread) at a time. A read (shared) lock allows read-only
access from multiple users (processes/threads).
Maintenance and Support - The maintenance and support services for a Product
under an Agreement (Maintenance and Support Addendum).
Marks - Trademarks, trade names, service marks or logos identified on a
company's website and/or printed material.
Memory Database - A DBMS that keeps the entire contents of a database or table
available in computer memory at all times while the database is opened.
Frequently, in-memory databases are volatile, meaning that they have little or
no durability if the computer malfunctions. Durability issues are frequently prioritized
below performance, which increases substantially with memory as the storage medium.
Meta-data - "Data about data." In a DBMS context, data stored in columns of a table
have certain attributes, such as the type, length, description or other characteristics that
allow the DBMS to process the data meaningfully, or allow the users to understand it
better.
Mirroring - The ability to copy the changes each transaction made to the database
from the master database to one or more slave databases so that exact copies of the
master database are always available on the slaves.
MMDB - An acronym for Main Memory Database, also called In-memory Database.
Modification Stored Procedure - An SQL stored procedure that contains one or
more INSERT, UPDATE, and/or DELETE statements.
Multi-platform - The ability for a software system to run on different computer
hardware and operating systems with little or no change.
Multi-version Concurrency Control (MVCC) - MVCC is a concurrency control
method which allows for multiple types of database access to occur
simultaneously. RDM implements this through the use of database snapshots.
Natural Join - A join formed between two tables where the values of
identically named and defined columns are equal.
Network Model - A database in which inter-record type relationships are organized
using one-to-many sets. This differs from a Hierarchical Model in that it allows a
record type to be a member of more than one set. Individual rows can be retrieved
using API functions that allow an application to navigate through individual set
instances.
Network - An inter-connection of computers and computing devices, all of which can
send and receive messages from one another. The world's largest network is the
internet, in which billions of computers are connected.
NoSQL - A classification of data storage systems that are not primarily designed to be
relationally accessed through the common SQL language. NoSQL systems are
characterized by dynamic creation and deletion of key/value pairs and are structured to
be highly scalable to multiple computers.
Object-oriented - A computing programming paradigm that defines the computing
problem to be solved as a set of objects that are members of various object classes
each with its own set of data manipulation methods. Individual objects which have
been instantiated (created) can be manipulated only using those prescribed methods.
Open Source Software (OSS) - Software that is released under a Software License
that (1) permits each recipient of the software to copy and modify the software; (2)
permits each recipient to distribute the software in modified or unmodified form; and
(3) does not require recipients to pay a fee or royalty for the permission to copy,
modify, or distribute the software.
Optimizer - A component of the SQL system that estimates the optimum (i.e., fastest)
method to access the database data requested by a particular SQL SELECT,
UPDATE, or DELETE statement.
Outer Join - A join formed between two tables that, in addition to including the rows
from the two tables with matching join column values, will also include the rows
from one table that do not have matching rows in the other.
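The difference between an ordinary (inner) join and an outer join can be seen in a minimal sketch using Python's built-in sqlite3 module. The author and book tables and their contents are hypothetical; SQLite stands in for any SQL DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO author VALUES (1, 'Ada'), (2, 'Bola'), (3, 'Chidi');
    INSERT INTO book VALUES (10, 'Book One', 1), (11, 'Book Two', 1),
                            (12, 'Book Three', 2);
""")

# Inner join: only authors that have at least one matching book row.
inner_rows = conn.execute("""
    SELECT a.name, b.title
    FROM author a JOIN book b ON b.author_id = a.author_id
    ORDER BY a.name, b.book_id
""").fetchall()

# Left outer join: every author, with NULL (None) where no book matches.
outer_rows = conn.execute("""
    SELECT a.name, b.title
    FROM author a LEFT OUTER JOIN book b ON b.author_id = a.author_id
    ORDER BY a.name, b.book_id
""").fetchall()

print(inner_rows)
print(outer_rows)
```

Author 'Chidi' has no books, so that row appears only in the outer join result, paired with None.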
Page Size - The size in bytes of a database page.
Page - The basic unit of database file input/output. Database files may be organized
into a set of fixed-sized pages containing data associated with one or more record
occurrences (table rows).
Party - A party to an Agreement (between Raima and Customer).
PLC - Programmable Logic Controller.
Port - A network portal through which two computing processes can communicate.
Where one IP Address typically identifies a device, a Port on that device identifies one
of multiple potential communication channels.
Portable - Software that has been developed to be able to run on many different
computer hardware and operating systems with little or no change.
Positioned Update/Delete - An SQL UPDATE or DELETE statement that modifies
the current row of a cursor.
Primary Key - A column or group of columns in a given table that uniquely identifies
each row of the table. The primary key is used in conjunction with a foreign key in
another (or even the same) table to relate the two tables together. For example, the
primary key in an author table would match the foreign key in a book table in order to
relate a particular author to that author's books.
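The author/book example can be sketched as follows with Python's built-in sqlite3 module. Table names, column names and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE book (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER REFERENCES author(author_id)  -- foreign key
    )
""")
conn.execute("INSERT INTO author VALUES (1, 'Ada')")
conn.execute("INSERT INTO book VALUES (10, 'First Book', 1)")
conn.execute("INSERT INTO book VALUES (11, 'Second Book', 1)")

# A primary key value must be unique: a second author with id 1 is refused.
try:
    conn.execute("INSERT INTO author VALUES (1, 'Bola')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The foreign key value in book relates each book to its author.
titles = [row[0] for row in conn.execute(
    "SELECT title FROM book WHERE author_id = ? ORDER BY book_id", (1,))]

print(duplicate_rejected)
print(titles)
```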
Process - An instance of the execution of a program on a single computer. A process
can consist of one or more threads executing, more or less, concurrently. The private
memory used by a process cannot be accessed by any other process.
Product - The Raima software product(s) licensed to Licensee under an Agreement,
including all bug fixes, upgrades, updates, and releases. Product(s) does not include
any Third Party Software or any OSS that may be included and distributed with the
Product(s).
Protocol - A specific method in which messages are formulated, formatted, and
passed between computers in a network. Internet messages are passed between
computers using the TCP/IP protocol.
Query - A complete SELECT statement that specifies 1) the columns and tables from
which data is to be retrieved; 2) optionally, conditions that the data must satisfy; 3)
optionally, computations that are to be performed on the retrieved column values; and
4) optionally, a desired ordering of the result set.
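The four parts of a query can be seen in one SELECT statement. The sketch below runs it through Python's built-in sqlite3 module against a hypothetical item table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (name TEXT, unit_price REAL, quantity INTEGER);
    INSERT INTO item VALUES ('pen', 2.0, 10), ('book', 15.0, 3), ('bag', 40.0, 1);
""")

rows = conn.execute("""
    SELECT name, unit_price * quantity AS total  -- 1) columns, 3) computation
    FROM item                                    -- 1) table
    WHERE quantity > 1                           -- 2) condition
    ORDER BY total DESC                          -- 4) ordering of the result set
""").fetchall()

print(rows)  # [('book', 45.0), ('pen', 20.0)]
```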
RDM - Raima Database Manager.
RDM Server - Raima's client/server DBMS originally released in 1993, named RDS
(Raima Database Server), Velocis, and finally RDM Server. Still supported for
existing customers.
Read-only Transaction - A Multi-Version Concurrency Control (MVCC) feature that
allows database data to be read by one process without blocking another process's
modification of that same data. Frequently referred to as a "snapshot."
Real-time - A real-time environment is one in which specific tasks must be
guaranteed to execute within a specified time interval. For a DBMS to be considered
truly real-time, it must be able to perform specific database-related tasks in a time that
can be deterministically demonstrated.
Record Instance/Occurrence - One set of related data field values associated with a
specific record type—equivalent to an SQL row.
Record Type - A collection of closely related data fields—equivalent to an SQL table.
Similar to a C struct, a record type is defined by a set of closely related data fields.
Referential Integrity - A condition in which the foreign key column values in all of
the rows in one table have matching rows in the referenced primary key table.
Referential integrity is maintained by SQL during the processing of an INSERT and
DELETE statement and any UPDATE statement that modifies a foreign or primary
key value.
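A minimal sketch of referential integrity checking, using Python's built-in sqlite3 module. Note one SQLite-specific assumption: SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection. The tables, data and the helper is_rejected are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE book (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT,
        author_id INTEGER NOT NULL REFERENCES author(author_id)
    )
""")
conn.execute("INSERT INTO author VALUES (1, 'Ada')")
conn.execute("INSERT INTO book VALUES (10, 'First Book', 1)")
conn.commit()

def is_rejected(sql):
    """Run one statement; report whether the DBMS refused it."""
    try:
        conn.execute(sql)
        conn.commit()
        return False
    except sqlite3.IntegrityError:
        conn.rollback()
        return True

# INSERT of a book whose author_id has no matching author row.
orphan_insert_rejected = is_rejected("INSERT INTO book VALUES (11, 'Orphan', 99)")
# DELETE of an author that is still referenced by a book row.
parent_delete_rejected = is_rejected("DELETE FROM author WHERE author_id = 1")

print(orphan_insert_rejected, parent_delete_rejected)  # True True
```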
Relational Model - A database in which inter-table relationships are organized
primarily through common data columns, which define a one-to-many relationship
between a row of the primary key table and one or more rows of the matching foreign
key table. Equi-joins relate tables that have matching primary/foreign key values, but
other comparisons (relationships) may be defined.
Remote Procedure Call - A method of interprocess communication where a function
residing within another process is called as though it is a local (in-process) function.
The method is implemented through a local proxy function and a remote stub function.
Replication - A process where selected modifications in a master database are
replicated (re-played) into another database.
Restriction Factor - Each relational expression specified in the WHERE clause of a
query has an associated restriction factor that is estimated by the SQL optimizer,
which specifies the fraction (or percentage) of the table for which the expression will
be true.
Result Set - The complete set of rows that is returned by a particular
SELECT statement.
Rollback - An operation, usually performed by the SQL ROLLBACK statement, that
discards all of the changes made by all INSERT, UPDATE and DELETE statements
that have been executed since the most recently started transaction (e.g., START
TRANSACTION statement).
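The effect of a rollback can be sketched with Python's built-in sqlite3 module. Two assumptions apply: SQLite starts a transaction with BEGIN rather than START TRANSACTION, and the account table and its values are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the ones written out.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")

conn.execute("BEGIN")  # start of the transaction
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("INSERT INTO account VALUES (2, 30)")
conn.execute("ROLLBACK")  # discard both changes made since BEGIN

rows = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(rows)  # [(1, 100)]
```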
Row - One set of related values for all of the columns declared in a given table. Also
known as a record occurrence.
Royalty - A License fee set forth in an Agreement (Product and Pricing Addendum).
RTOS - A common abbreviation for real-time operating system. Raima Database
Manager runs on most RTOS, like VxWorks, Integrity, Embedded Linux and QNX.
Scalar Function - Either a built-in SQL function or a user-defined function that
returns a single value computed only from the values of any required arguments at the
time the function is called.
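Both kinds of scalar function can be sketched with Python's built-in sqlite3 module. UPPER and LENGTH are built-in SQL functions; fahrenheit is a hypothetical user-defined function, registered here from Python through SQLite's create_function mechanism, which is not necessarily how other DBMS products register such functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def fahrenheit(celsius):
    """Hypothetical user-defined scalar function: one value in, one value out."""
    return celsius * 9 / 5 + 32

# Register the Python function under the SQL name "fahrenheit" (1 argument).
conn.create_function("fahrenheit", 1, fahrenheit)

row = conn.execute(
    "SELECT UPPER('abc'), LENGTH('abc'), fahrenheit(100)"
).fetchone()
print(row)  # ('ABC', 3, 212.0)
```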
Schema - A representation of the structure of a database. It may be graphical or
textual. Graphical representations typically involve the use of boxes that represent
database tables and arrows that represent inter-table relationships. Textual schema
representations utilize Database Definition Language (DDL) statements to describe a
database design.
Searched Update/Delete - An SQL update or delete statement in which the rows to be
updated/deleted are those for which the conditional expression specified in the
WHERE clause is true.
Seat - A copy of a Product, or any of its components, installed on a single machine.
Semaphore - A primitive computing operation that is used to synchronize shared
access to data. Sometimes called a "mutex" meaning a "mutually exclusive section."
Semaphores control concurrent access to data by restricting access to critical sections
of code that manipulate that data.
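A minimal sketch of a semaphore restricting access to a critical section, using Python's standard threading module. The limit of two concurrent workers and all names are hypothetical. A semaphore with a count of one behaves like the mutex mentioned above.

```python
import threading
import time

slots = threading.Semaphore(2)  # at most two threads inside at once
state_lock = threading.Lock()   # protects the two bookkeeping counters
active = 0                      # threads currently inside the section
peak = 0                        # highest value of 'active' ever observed

def worker():
    global active, peak
    with slots:                 # blocks while two threads are already inside
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)        # simulated work on the shared data
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(peak <= 2)  # True
```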
Server (Software) - A Seat that resides on a single Server machine and is capable of
accepting connections from one or more Seats residing on Client machines.
Set - A method used to implement the one-to-many relationship formed between two
tables based on their foreign and primary key declarations. The term "set" comes from
the CODASYL Network Model definition.
Snapshot Isolation - When a snapshot of the database is taken, an instance of the
database is frozen and concurrent reads are allowed to occur on that snapshot.
Database writes are allowed to continue while reads on the snapshot are happening.
Source Code (Raima) - The English language source code version of a Product, and
any accompanying comments or other programmer Documentation, provided by
Raima to Licensee pursuant to the terms of an Agreement. The capitalized term
Source Code as used in an Agreement does not include OSS.
SQL - The standardized and commonly accepted language used for defining, querying
and manipulating a relational database. The etymology of "SQL" is unclear, possibly a
progression from "QueL" (Query Language) to "SeQueL" to "SQL." However, some
experts don't like the expansion "Structured Query Language" because its structure is
inconsistent and a historical patchwork.
SQL PL - A SQL based programming language. This allows for a SQL programmer
to use programming constructs like variables, conditionals and loops purely through
the use of SQL statements.
Stack - A stack is a conceptual structure consisting of a set of homogeneous elements
and is based on the principle of last in first out (LIFO).
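The last in, first out principle can be sketched with a plain Python list, where append pushes an element onto the top of the stack and pop removes the most recently pushed one. The element values are arbitrary.

```python
stack = []

# Push three elements; 'c' is pushed last, so it ends up on top.
for element in ["a", "b", "c"]:
    stack.append(element)

# Pop until empty: elements come back in reverse order of insertion.
popped = []
while stack:
    popped.append(stack.pop())

print(popped)  # ['c', 'b', 'a']
```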
Upgrade (of Product) - A Product that has been modified in a major way, and is
released as a new version of the Product. An Upgrade is represented by a Product
version number that increments to the left of the decimal point.
Use - Storing, loading, installing, and/or running a Product, or displaying screens
generated by a Product.
User-defined Function - An application-specific SQL callable scalar or aggregate
function written in C.
User-defined Procedure - An application-specific function written in C and
invocable through use of the SQL call statement.
Vacuuming - Databases that use MVCC to isolate transactions from each other need
to periodically scan the tables to delete outdated copies of rows.
Velocis - Former name of a DBMS product, now called RDM Server.
Virtual Table - An SQL table that is defined through a set of application-specific C
functions that conform to a particular interface specification, allowing a non-database
data source (e.g., a device, etc.) to be accessed as if it were a conventional SQL table.
WAN - A Wide Area Network, as contrasted with a Local Area Network (LAN).
Normally, WAN refers to the internet. Bandwidth (speed) is a primary difference
between local and wide-area networking.
Wi-Fi - The common name for standardized local-area wireless technology.
XML - Extensible Markup Language. XML documents are much used in the internet's
World Wide Web but are also used in many computing contexts in which data needs
to be shared.