Codd 1981 Data Model

DATA MODELS in DATABASE MANAGEMENT

E. F. Codd
IBM Research Laboratory
San Jose, California 95193

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1980 89791-031-1/80/0600-0112 $00.75

1 WHAT IS A DATA MODEL?

It is a combination of three components:

1) a collection of data structure types (the building blocks of any database that conforms to the model);

2) a collection of operators or inferencing rules, which can be applied to any valid instances of the data types listed in (1), to retrieve or derive data from any parts of those structures in any combinations desired;

3) a collection of general integrity rules, which implicitly or explicitly define the set of consistent database states or changes of state or both; these rules may sometimes be expressed as insert-update-delete rules.

Note that in any particular application of a data model it may be necessary to impose further (application-specific) integrity constraints, and thereby define a smaller set of consistent database states or changes of state. Note also that a database system must normally permit states other than the consistent ones to exist transiently during the execution of a program. It is imperative that the program tell the system at which steps it is permissible for the system to check integrity. There may exist programming languages which permit the intermixing of integrity assertions and commands, but I do not know of any (other than database sublanguages) which permit the specification of integrity points at which a set of community-specified integrity rules are to be checked.

Numerous authors appear to think of a data model as nothing more than a collection of data structure types. This is like trying to understand the way the human body functions by studying anatomy but omitting physiology. The operators and integrity rules (items 2, 3 in the definition above) are essential to any understanding of how the structures behave. In comparing data models people often ignore the operators and integrity rules altogether. When this occurs, the resulting comparisons run the risk of being meaningless.

A flagrant example of such a comparison is the statement in a panel discussion on Standards in ACM SIGMOD 1979 (recorded in the
Supplement to the Proceedings, page 55): "the relational model is considered to be a constrained version of the flat file data model." What are the operations that are allowed on flat files? What are the general integrity constraints on flat files? Is there even a generally accepted definition of the structure of flat files that is sufficiently precise so that we can tell for sure whether a flat file can contain records of more than one type?

Note that the authors of many of the data models of the past five years defined the data structures only, omitting the operators and integrity rules. Such models should therefore be regarded as partial or incomplete data models.

2 PURPOSES OF A DATA MODEL

A data model may be used in any of the following ways:

1) as a tool for specifying the kinds of data and data organization that are permissible in a specific database;

2) as a basis for developing a general design methodology for databases;

3) as a basis for coping with evolution of databases so as to have minimal logical impact on existing application programs and terminal activities;

4) as a basis for the development of families of very high level languages for query and data manipulation;

5) as a focus for DBMS architecture;

6) as a vehicle for research into the behavioral properties of alternative organizations of data.

Re item 4), a data model
need not (and probably should not) dictate a single language for data manipulation and query, since different kinds of users are likely to need different kinds of languages. The operators or inference rules should, however, provide a yardstick of manipulative and query power.

The extent to which data models have influenced the field of database management can be seen by observing the new database systems (experimental and product) that have been developed during the last ten years. It is hard to find one that is not based on either the CODASYL network model or the relational model. The number of CODASYL implementations and installations is often attributed (and I think correctly) to a desire to conform to a committee-defined data definition language. However, this raison d'être certainly does not apply to existing relational systems.

The increasingly widespread use of the relational model as a vehicle for logical database design (regardless of the target database management system by which the data is to be ultimately managed) provides additional evidence of the impact of data models on the database field. Substantial developments in the theory of database structure have been triggered by the work on normalization of relations in the relational model.

The relational model has
also spurred vigorous and widespread research into techniques for optimizing the execution of statements in very high level database languages. Other models are seldom, if ever, used for such investigations, because their high level languages (when such exist and I know of only one that has been implemented) are necessarily more complicated.

Finally, it appears that database models have influenced programming language research, providing early examples of data abstractions. Data models have paved the way for the much clearer separation of semantic issues from implementation issues in programming languages. Data models can also be expected to bring about a belated recognition that general purpose programming languages need to distinguish shared variables from private variables.

3 HISTORY OF DATA MODEL DEVELOPMENT

As of 1979, some 40 or more data models (mostly incomplete in the sense defined above) have been proposed for the management of formatted data. The first such data model to be developed was the relational model (developed in 1969). Many people have the erroneous impression that the hierarchical and network models preceded the relational model. This is due to a confusion between language specification and implementation on the one hand and data models on the other. Hierarchical and network systems were
developed prior to 1970, but it was not until 1973 that data models for these systems were defined. It is a little known fact that the hierarchic model (incomplete as it is) was defined by a process of abstraction from IBM's IMS. Similarly, the network model (incomplete as it is) was defined by abstraction from the CODASYL DBTG language proposals of 1969. The purpose of these definitions was to provide a basis for comparing the three approaches on a common level of abstraction. Thus, hierarchic and network systems preceded the hierarchic and network models, whereas relational systems came after the relational model and used the model as a foundation.

4 COMMON MISUNDERSTANDINGS

Many people fail to separate in their minds different levels of abstraction. A specific example of this is the failure to realize that tuples are at a higher level of abstraction than records (one is not allowed to use the contiguity of components of tuples, whereas one can use the contiguity of fields in a record).

Likewise, primary keys (whether they have system-controlled surrogates or user-controlled identifiers as values) are at a higher level than pointers. A particular occurrence of a value V of a primary key makes reference to all other occurrences of V in the database that are drawn from the domain of that primary key. Surrogates
have the property that they are distinct if they represent distinct objects in the real world. They are at a higher level than DBTG database keys, which are record identifiers that are distinct for distinct records. Note that there may be two or more records describing a single real world object, in which case there are two or more database keys corresponding to one surrogate. Moreover, within one record there may be two or more surrogates and only one database key.

Another kind of confusion concerns the exclusion of relations of degree higher than two from the principal schema. This exclusion is sometimes claimed to remove all concern for anomalies in insertion, update, and deletion. To see that this claim is false one need only compare two alternative anchored binary schemas for an n-ary relation R that is known to possess such anomalies. In the first schema there are n binary relations, each corresponding to one of the n attributes (columns) of R. In the second schema R is first of all non-loss decomposed (by projection, for example) into two or more relations of lesser degree, and then these relations are converted to anchored binary form. These two schemas give rise to databases with entirely different insertion, update, and deletion behavior.

Whether binary relations (carefully defined with due regard to possible anomalies) are better than relations of
higher degree (similarly carefully defined) is a separate question which can be argued at length. My present position is that this is largely a subjective question. The differences between these two views of data are not significant for formatted databases, in which the data exhibit a great deal of regularity. Moreover, operators that generate and manipulate n-ary relations are unavoidable if one is to support a variety of user views and a variety of queries.

A common error is to confuse the concepts of attribute (column) and domain (the set of all those values which can ever occur in a given column). This is perhaps due to the fact that there is no counterpart to the domain concept in the most widely used programming languages. PASCAL may be the first programming language to incorporate some aspects of the database domain concept. In a highly shared and dynamic environment it is important that the system keep track of which columns of which tables draw their values from any given domain. Otherwise, it is impossible to write a reliable program to remove from a database all references to a particular entity (see Appendix).

Another error is that of identifying the join operator of the relational model with the links (sometimes known as fan sets) of the DBTG model. These concepts are different in ways too numerous to mention. Suffice it to say that, while the
relational join operates on a pair of tables to yield a table as the result, DBTG links are not operators, but merely structures that by themselves yield nothing (operators must act upon them if any information is to be extracted from them or per them).

Different users need to see the database in different ways. As Smith and Smith point out, the concepts of entity, relationship, property, category and even database value represent different perceptions of common abstract objects. A data model that does not permit a relationship to be viewed as an entity is clearly inadequate to support these different perceptions (several such models have been proposed). Recent investigations into semantic data models represent an important contribution to the understanding of the meaning of data in formatted databases. However, this work is sorely in need of some objective criteria for completeness: i.e., knowing when to stop. At present, this is a matter of taste.

5 THE FUTURE

The subject of data modeling will be a fertile area for research, development, and application for many years to come. This is due principally to the fact that the meaning of data and the manipulation of this meaning are still so poorly understood. Further, the impact of data modeling on database management will continue to be high, affecting
both the design of databases and the design of database management systems. Gradually, designers are becoming aware of the need for very high level data sublanguages:

1) to support efficient communication of data between distributed databases;

2) to enable the system to determine access strategy in the face of data representation which is subject to dynamic change from time to time (System R does this).

If a user at one node needs a collection of records from another node and he is able to specify that collection in a single statement, it is absurd for him to engage in a sequence of single record requests, each followed by single record replies. One reason the relational model is in such a dominant position today is that, when originally introduced, two radically different, very high level (set-oriented) data sublanguages (the relational algebra and predicate-logic-based ALPHA) were defined for it. There are now approximately twenty such data sublanguages for the relational model. By the end of the 80's it is reasonable to expect the relational model to have outstripped every other data model in terms of the number of users, the number of databases, and the number of database systems.

APPENDIX: The crucial role of domains

An important part of keeping a relational database in a state of
integrity is keeping track of which attributes (columns) are defined on which domains. This information is needed to support the global removal of all occurrences of a value V where it occurs as a value from domain D. Here, we are referring to domain in a semantic, not a syntactic, sense. That is, we wish to discuss domains such as supplier serial numbers, quantities of parts, names of projects, rather than domains such as alphanumeric character strings, floating point numbers, and integers.

For example, we may wish to remove supplier Jones from the database. He happens to have the serial number 3, and we want to remove his serial number and descriptive properties from the SUPPLIER relation, but in addition we want to remove all occurrences of 3 as a supplier serial number in all other relations. These latter occurrences are called referential occurrences, and are to be replaced by null, except where the integrity constraints demand that a null is unacceptable, in which case the tuple containing the component that references supplier 3 is to be deleted.

Suppose that on a certain day a programmer writes a program to carry out the removal of any specified supplier, and this program is based on his knowledge as of that time concerning which columns are defined on the supplier serial number domain. Such a program will fail to operate correctly, if
at a later time one or more new tables are created that have columns defined on this domain. Clearly, for such a program to operate correctly regardless of changes that may be made in the tables referencing the supplier serial number domain, the database system must have the knowledge regarding which columns use which domains, and this knowledge must be kept up-to-date by the system every time a new table is created, extended (by adding a new column), or destroyed.

When a new table is declared, accompanying each column name should be the name of the semantic domain on which that column is defined. The system should keep this information in the column-domain table of the database catalog. The addition of a new column should be similarly treated. Corresponding deletions in the column-domain table should occur as a side effect of any command that drops tables.

The proposal that all columns on a given domain be identically named throughout the database is not a feasible solution, since any table may have more than one column defined on the given domain.
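
The catalog-driven removal described in this appendix can be illustrated with a minimal Python sketch. It is not taken from the paper or from any real system: the class names (Column, Database), the method names, and the table, column, and domain names are all hypothetical. As a simplifying assumption, the supplier's own tuple in SUPPLIER is handled by the same rule as referential occurrences, because its serial number column is declared non-nullable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    domain: str            # semantic domain name, e.g. "SUPPLIER_SERIAL" (hypothetical)
    nullable: bool = True  # False where integrity constraints forbid a null


@dataclass
class Database:
    rows: dict = field(default_factory=dict)     # table name -> list of dict rows
    catalog: dict = field(default_factory=dict)  # column-domain table: table name -> Columns

    def create_table(self, table, columns):
        # Declaring a table records each column's domain in the catalog.
        self.catalog[table] = tuple(columns)
        self.rows[table] = []

    def add_column(self, table, column, default=None):
        # Extending a table updates the catalog in the same step.
        self.catalog[table] += (column,)
        for row in self.rows[table]:
            row[column.name] = default

    def drop_table(self, table):
        # Dropping a table deletes its catalog entries as a side effect.
        del self.catalog[table]
        del self.rows[table]

    def insert(self, table, **values):
        self.rows[table].append(dict(values))

    def remove_value(self, domain, value):
        """Remove every occurrence of `value` that is drawn from `domain`."""
        for table, columns in self.catalog.items():
            on_domain = [c for c in columns if c.domain == domain]
            kept = []
            for row in self.rows[table]:
                matched = [c for c in on_domain if row.get(c.name) == value]
                if any(not c.nullable for c in matched):
                    continue            # null unacceptable: delete the tuple
                for c in matched:
                    row[c.name] = None  # referential occurrence becomes null
                kept.append(row)
            self.rows[table] = kept


db = Database()
db.create_table("SUPPLIER", [Column("SNO", "SUPPLIER_SERIAL", nullable=False),
                             Column("NAME", "SUPPLIER_NAME")])
db.create_table("SHIPMENT", [Column("SNO", "SUPPLIER_SERIAL", nullable=False),
                             Column("QTY", "QUANTITY")])
db.insert("SUPPLIER", SNO=3, NAME="Jones")
db.insert("SUPPLIER", SNO=4, NAME="Blake")
db.insert("SHIPMENT", SNO=3, QTY=3)  # the quantity 3 is on a different domain
db.insert("SHIPMENT", SNO=4, QTY=3)

# A table declared later, with two columns on the same domain.
db.create_table("PROJECT", [Column("PNAME", "PROJECT_NAME", nullable=False),
                            Column("LEAD_SNO", "SUPPLIER_SERIAL"),
                            Column("BACKUP_SNO", "SUPPLIER_SERIAL")])
db.insert("PROJECT", PNAME="Dam", LEAD_SNO=3, BACKUP_SNO=4)

db.remove_value("SUPPLIER_SERIAL", 3)
for name, table_rows in db.rows.items():
    print(name, table_rows)
```

In this sketch the removal routine never names a table or column; it consults the catalog each time it runs. PROJECT is therefore handled even though it was declared after the routine was written. Its LEAD_SNO is set to null while BACKUP_SNO keeps the value 4, the Jones tuple and his shipment are deleted, and the quantity 3 in the remaining shipment is untouched because QTY is defined on a different domain.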