KEMBAR78
Normalization | PPT
Database Normalization TJ Racoosin 2 Dec 1998 CPCUG Access SIG Racoosin Solutions rSolutions@erols.com 703 849-1997
Overview Introductions The Normal Forms Primary Key Relationships and Referential Integrity When NOT to Normalize Real World Exercise Resources Racoosin Solutions
Introductions TJ Racoosin You Are you familiar with normalization? Used the relationship window ? Enforce referential integrity? Cascade Delete? Any issues with normalizing data? Racoosin Solutions
Why Normalize? Flexibility Structure supports many ways to look at the data Data Integrity “ Modification Anomalies” Deletion Insertion Update Efficiency Eliminate redundant data and save space Racoosin Solutions
Normalization Defined “  In relational database design, the process of organizing data to minimize duplication.  Normalization  usually involves dividing a database into two or more tables and defining relationships between the tables.  The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships .” -  Webopedia, http://webopedia.internet.com/TERM/n/normalization.html Racoosin Solutions
Another Definition "Normalization" refers to the process of creating an efficient, reliable, flexible, and appropriate "relational" structure for storing information. Normalized data must be in a "relational" data structure. -  Reid Software Development,  http://www.accessdatabase.com/normalize.html Racoosin Solutions
The Normal Forms A series of logical steps to take to normalize data tables First Normal Form Second Third Boyce Codd There’s more, but beyond scope of this Racoosin Solutions
First Normal Form (1NF) All columns (fields) must be atomic Means : no repeating items in columns Solution: make a separate table for each set of attributes with a primary key (parser, append query) Customers CustomerID Name Orders OrderID Item  CustomerID OrderDate Racoosin Solutions
Second Normal Form (2NF) In 1NF and every non-key column is fully dependent on the (entire) primary key Means : Do(es) the key field(s) imply the rest of the fields? Do we need to know both  OrderID  and  Item  to know the Customer and Date?  Clue: repeating fields Solution: Remove to a separate table (Make Table) OrderDetails OrderID Item Orders OrderID CustomerID OrderDate Racoosin Solutions
Third Normal Form (3NF) In 2NF and every non-key column is mutually independent  means : Calculations Solution: Put calculations in queries and forms OrderDetails OrderID Item Quantity Price Put expression in text control or in query: =Quantity * Price Racoosin Solutions
Boyce-Codd Form (3NF) - Examples A more restricted version of 3NF (known as  Boyce-Codd Normal Form ) requires that the determinant of every functional dependency in a relation be a key - for every FD: X => Y, X is a key Consider the following relation: STU-MAJ-ADV ( Student-Id , Major, Advisor) Advisor => Major, but Advisor is not a key Boyce-Codd Normal Form for above: STU-ADV ( Student-Id , Advisor) ADV-MAJ ( Advisor , Major) 2/16/98 10 MGS 404 2/16/98 10 Kumar Madurai:  http://www.mgt.buffalo.edu/courses/mgs/404/mfc/lecture4.ppt
Primary Key Unique Identifier for every row in the table Integers vice Text to save memory, increase speed Can be “composite” Surrogate is best bet! Meaningless, numeric column acting as primary key in lieu of something like SSN or phone number - (both can be reissued!) Racoosin Solutions
Relationships One to many to enforce “Referential Integrity” Two “foreign” keys make a composite primary key and “relate” many to many tables A look up table - it doesn’t reference any others Racoosin Solutions
Table Prefixes Aid Development First, we’ll get replace text PK with number  The Items table is a “look up” with tlkp prefix tlkp “lookup” table (no “foreign keys”) OrderDetails is renamed “trelOrderItem” a “relational” table trel “relational” (or junction or linking)  two foreign keys make a primary tblOrders OrderID CustomerID OrderDate OrderDetails OrderID Item trelOrderItem OrderID ItemID tlkpItems ItemID ItemName Racoosin Solutions
Referential Integrity Every piece of “foreign” key data has a primary key on the one site of the relationship No “orphan” records. Every child has a parent Can ’t delete records from primary table if in related table Benefits - Data Integrity and Propagation If update fields in main table, reflected in all queries Can’t add a record in related table without adding it to main Cascade Delete : If delete record from primary table, all children deleted -  use with care ! Better idea to “archive” Cascade Update : If change the primary key field, will change foreign   key Racoosin Solutions
When Not to Normalize Want to keep tables simple so user can make their own queries Avoid processing multiple tables  Archiving Records If No need to perform complex queries or “resurrect” Flatten and store in one or more tables Testing shows Normalization has poorer performance “ Sounds Like”  field example Can also try temp tables produced from Make Table queries Racoosin Solutions
Real World - School Data Student Student  Previous Current Last  First  Parent 1 Parent 2  Teacher  Teacher Smith Renee Ann Jones Theodore Smith Hamil Burke Mills  Lucy Barbara Mills Steve Mills  Hamil  Burke  Jones Brendan Jennifer Jones  Stephen Jones  Hamil  Burke  …. Street Address City State Postal Code Home Phone 5551 Private Hill Annandale Virginia 22003-  (703) 323-0893 4902 Acme Ct Annandale Virginia 22003-  (703) 764-5829 5304 Gains Street Fairfax Virginia 22032-  (703) 978-1083 …. First Year Last Year  Age Program Enrolled Attended  Birthday inSept Map Coord Notes PF   / 0   0  6/25/93 5 22 A-3 PF 96/97   0 8/14/93 5 21 F-3 PH 96/97   0  6/13/94 4 21 A-4 Racoosin Solutions
One Possible Design Racoosin Solutions
Books Access97 Developers Handbook  Litwin,Getz & Gilbert Chapter  4 Access and SQL Server Developers Handbook  Viescas, Gunderloy and Chipman Chapter 2 Access97 Expert Solutions  Lezynski Chapter 10 Racoosin Solutions
Internet Papers http://www.mtjeff.com/~calvin/devhbook/databasedesign.html  http://www.swin.edu.au/infotech/subjects/bt220/bt220s1.html http://www.bus.okstate.edu/lhammer/AISweb/Normaliz.htm http://www.inetspace.com/database.html Slides http://www.mgt.buffalo.edu/courses/mgs/404/mfc/lecture4.ppt http://www.state.sd.us/people/colink/datanorm.htm http://www.cba.nau.edu/morgan-j/class/subtop2_3/sld001.htm Racoosin Solutions

Normalization

  • 1.
    Database Normalization TJRacoosin 2 Dec 1998 CPCUG Access SIG Racoosin Solutions rSolutions@erols.com 703 849-1997
  • 2.
    Overview Introductions TheNormal Forms Primary Key Relationships and Referential Integrity When NOT to Normalize Real World Exercise Resources Racoosin Solutions
  • 3.
    Introductions TJ RacoosinYou Are you familiar with normalization? Used the relationship window ? Enforce referential integrity? Cascade Delete? Any issues with normalizing data? Racoosin Solutions
  • 4.
    Why Normalize? FlexibilityStructure supports many ways to look at the data Data Integrity “ Modification Anomalies” Deletion Insertion Update Efficiency Eliminate redundant data and save space Racoosin Solutions
  • 5.
    Normalization Defined “ In relational database design, the process of organizing data to minimize duplication. Normalization usually involves dividing a database into two or more tables and defining relationships between the tables. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships .” - Webopedia, http://webopedia.internet.com/TERM/n/normalization.html Racoosin Solutions
  • 6.
    Another Definition "Normalization"refers to the process of creating an efficient, reliable, flexible, and appropriate "relational" structure for storing information. Normalized data must be in a "relational" data structure. - Reid Software Development, http://www.accessdatabase.com/normalize.html Racoosin Solutions
  • 7.
    The Normal FormsA series of logical steps to take to normalize data tables First Normal Form Second Third Boyce Codd There’s more, but beyond scope of this Racoosin Solutions
  • 8.
    First Normal Form(1NF) All columns (fields) must be atomic Means : no repeating items in columns Solution: make a separate table for each set of attributes with a primary key (parser, append query) Customers CustomerID Name Orders OrderID Item CustomerID OrderDate Racoosin Solutions
  • 9.
    Second Normal Form(2NF) In 1NF and every non-key column is fully dependent on the (entire) primary key Means : Do(es) the key field(s) imply the rest of the fields? Do we need to know both OrderID and Item to know the Customer and Date? Clue: repeating fields Solution: Remove to a separate table (Make Table) OrderDetails OrderID Item Orders OrderID CustomerID OrderDate Racoosin Solutions
  • 10.
    Third Normal Form(3NF) In 2NF and every non-key column is mutually independent means : Calculations Solution: Put calculations in queries and forms OrderDetails OrderID Item Quantity Price Put expression in text control or in query: =Quantity * Price Racoosin Solutions
  • 11.
    Boyce-Codd Form (3NF)- Examples A more restricted version of 3NF (known as Boyce-Codd Normal Form ) requires that the determinant of every functional dependency in a relation be a key - for every FD: X => Y, X is a key Consider the following relation: STU-MAJ-ADV ( Student-Id , Major, Advisor) Advisor => Major, but Advisor is not a key Boyce-Codd Normal Form for above: STU-ADV ( Student-Id , Advisor) ADV-MAJ ( Advisor , Major) 2/16/98 10 MGS 404 2/16/98 10 Kumar Madurai: http://www.mgt.buffalo.edu/courses/mgs/404/mfc/lecture4.ppt
  • 12.
    Primary Key UniqueIdentifier for every row in the table Integers vice Text to save memory, increase speed Can be “composite” Surrogate is best bet! Meaningless, numeric column acting as primary key in lieu of something like SSN or phone number - (both can be reissued!) Racoosin Solutions
  • 13.
    Relationships One tomany to enforce “Referential Integrity” Two “foreign” keys make a composite primary key and “relate” many to many tables A look up table - it doesn’t reference any others Racoosin Solutions
  • 14.
    Table Prefixes AidDevelopment First, we’ll get replace text PK with number The Items table is a “look up” with tlkp prefix tlkp “lookup” table (no “foreign keys”) OrderDetails is renamed “trelOrderItem” a “relational” table trel “relational” (or junction or linking) two foreign keys make a primary tblOrders OrderID CustomerID OrderDate OrderDetails OrderID Item trelOrderItem OrderID ItemID tlkpItems ItemID ItemName Racoosin Solutions
  • 15.
    Referential Integrity Everypiece of “foreign” key data has a primary key on the one site of the relationship No “orphan” records. Every child has a parent Can ’t delete records from primary table if in related table Benefits - Data Integrity and Propagation If update fields in main table, reflected in all queries Can’t add a record in related table without adding it to main Cascade Delete : If delete record from primary table, all children deleted - use with care ! Better idea to “archive” Cascade Update : If change the primary key field, will change foreign key Racoosin Solutions
  • 16.
    When Not toNormalize Want to keep tables simple so user can make their own queries Avoid processing multiple tables Archiving Records If No need to perform complex queries or “resurrect” Flatten and store in one or more tables Testing shows Normalization has poorer performance “ Sounds Like” field example Can also try temp tables produced from Make Table queries Racoosin Solutions
  • 17.
    Real World -School Data Student Student Previous Current Last First Parent 1 Parent 2 Teacher Teacher Smith Renee Ann Jones Theodore Smith Hamil Burke Mills Lucy Barbara Mills Steve Mills Hamil Burke Jones Brendan Jennifer Jones Stephen Jones Hamil Burke …. Street Address City State Postal Code Home Phone 5551 Private Hill Annandale Virginia 22003- (703) 323-0893 4902 Acme Ct Annandale Virginia 22003- (703) 764-5829 5304 Gains Street Fairfax Virginia 22032- (703) 978-1083 …. First Year Last Year Age Program Enrolled Attended Birthday inSept Map Coord Notes PF / 0 0 6/25/93 5 22 A-3 PF 96/97 0 8/14/93 5 21 F-3 PH 96/97 0 6/13/94 4 21 A-4 Racoosin Solutions
  • 18.
    One Possible DesignRacoosin Solutions
  • 19.
    Books Access97 DevelopersHandbook Litwin,Getz & Gilbert Chapter 4 Access and SQL Server Developers Handbook Viescas, Gunderloy and Chipman Chapter 2 Access97 Expert Solutions Lezynski Chapter 10 Racoosin Solutions
  • 20.
    Internet Papers http://www.mtjeff.com/~calvin/devhbook/databasedesign.html http://www.swin.edu.au/infotech/subjects/bt220/bt220s1.html http://www.bus.okstate.edu/lhammer/AISweb/Normaliz.htm http://www.inetspace.com/database.html Slides http://www.mgt.buffalo.edu/courses/mgs/404/mfc/lecture4.ppt http://www.state.sd.us/people/colink/datanorm.htm http://www.cba.nau.edu/morgan-j/class/subtop2_3/sld001.htm Racoosin Solutions

Editor's Notes

  • #2 Fundamental to Database Design Gives a place to start when you have a mess of data