DATABASE OBJECT
By
            Team Teaching
Agenda
■ Schema
■ Database Objects
■ Alias
■ Table
■ Data Organizations
■ Range-clustered Table (RCT)
■ Multidimensional Clustering (MDC) Tables
Schemas
■ A schema is a collection of named objects;
   – it provides a way to group those objects logically.
   – A schema is also a name qualifier;
   – it provides a way to use the same natural name for several
       objects, and to prevent ambiguous references to those
       objects.
   – For example, the schema names 'INTERNAL' and
       'EXTERNAL' make it easy to distinguish two different SALES
       tables (INTERNAL.SALES, EXTERNAL.SALES).
■ Schemas also enable multiple applications to store data in a
  single database without encountering namespace collisions.
Schemas, Cont’d…
■ A schema can contain tables, views, nicknames, triggers,
  functions, packages, and other objects. A schema is itself a
  database object.
■ It is explicitly created using the CREATE SCHEMA statement,
  with the current user or a specified authorization ID recorded
  as the schema owner.
■ It can also be implicitly created when another object is
  created, if the user has IMPLICIT_SCHEMA authority.
Schemas, Cont’d…
■   There are some restrictions and recommendations that you must be aware of
    when naming schemas.
     – User-defined types (UDTs) cannot have schema names longer than the
         schema length listed in SQL and XML limits.
     – The following schema names are reserved words and must not be used:
         SYSCAT, SYSFUN, SYSIBM, SYSSTAT, SYSPROC.
     – To avoid potential problems upgrading databases in the future, do not use
         schema names that begin with SYS. The database manager will not allow
         you to create modules, procedures, triggers, user-defined types or user-
         defined functions using a schema name beginning with SYS.
     – It is recommended that you not use SESSION as a schema name.
         Declared temporary tables must be qualified by SESSION. It is therefore
         possible to have an application declare a temporary table with a name
         identical to that of a persistent table, in which case the application logic
         can become overly complicated. Avoid the use of the schema SESSION,
         except when dealing with declared temporary tables.
Database objects
■ Physical database design consists of defining database objects and
  their relationships.
■ You can create the following database objects in a DBMS:
   – Aliases
   – Tables
   – Constraints
   – Indexes
   – Triggers
   – Sequences
   – Views
Aliases
■ An alias is an alternative name for an object such as a
  module, table or another alias.
   – It can be used to reference an object wherever that
       object can be referenced directly.
■ An alias cannot be used in all contexts;
■ for example, it cannot be used in the check condition of a
  check constraint. An alias cannot reference a declared
  temporary table but it can reference a created temporary
  table.
Aliases, Cont’d…
■   To create an alias using the command line, enter:
     – CREATE ALIAS alias_name FOR table_name
■   The following SQL statement creates an alias WORKERS for the EMPLOYEE
    table:
      – CREATE ALIAS WORKERS FOR EMPLOYEE
■   The alias is replaced at statement compilation time by the table or view name.
■   If the alias or alias chain cannot be resolved to a table or view name, an error
    results.
■   For example, if WORKERS is an alias for EMPLOYEE, then at compilation time:
     – SELECT * FROM WORKERS
■   Becomes in effect
     – SELECT * FROM EMPLOYEE
Tables
■ Tables are logical structures maintained by the database
  manager.
■ Tables are made up of columns and rows.
■ At the intersection of every column and row is a specific data
  item called a value.
■ A column is a set of values of the same type or one of its
  subtypes.
■ A row is a sequence of values arranged so that the nth value
  is a value of the nth column of the table.
Types of tables
■ Base tables
   – These types of tables hold persistent data. There are different
      kinds of base tables, including
         ■   Regular tables
              – Regular tables with indexes are the "general purpose" table
                  choice.
         ■   Multidimensional clustering (MDC) tables
         ■   Insert time clustering (ITC) tables
         ■   Range-clustered tables (RCT)
         ■   Partitioned tables
         ■   Temporal tables
■ Temporary tables
■ Materialized query tables
Data types
Table partitioning and data
organization schemes
■ Table partitioning is a data organization scheme in which
  table data is divided across multiple data partitions according
  to values in one or more partitioning columns of the table.
■ Data from a given table is partitioned into multiple storage
  objects, which can be in different table spaces.
Data organization schemes
■   With the introduction of table partitioning, a DB2® database offers a
    three-level data organization scheme.
■   There are three clauses of the CREATE TABLE statement that include an
    algorithm to indicate how the data is to be organized.
■   The following three clauses demonstrate the levels of data organization
    that can be used together in any combination:
      – DISTRIBUTE BY to spread data evenly across database partitions
          (to enable intraquery parallelism and to balance the load across
          each database partition) (database partitioning)
      – PARTITION BY to group rows with similar values of a single
          dimension in the same data partition (table partitioning)
      – ORGANIZE BY to group rows with similar values on multiple
          dimensions in the same table extent (multidimensional clustering)
          or to group rows according to the time of the insert operation
          (insert time clustering table).
Data organization schemes, Cont’d…
Demonstrating the table partitioning
organization scheme where a table
representing monthly sales data is
partitioned into multiple data
partitions. The table also spans two
table spaces (ts1 and ts2).
Data organization schemes, Cont’d…
Demonstrating the complementary
organization schemes of database
partitioning and table partitioning. A
table representing monthly sales
data is partitioned into multiple data
partitions, spanning two table
spaces (ts1 and ts2) that are
distributed across multiple database
partitions (dbpart1, dbpart2,
dbpart3) of a database partition
group (dbgroup1).
Data organization schemes, Cont’d…
A representation of the database
partitioning, table partitioning and
multidimensional organization
schemes where data from table
SALES is not only distributed across
multiple database partitions,
partitioned across table spaces ts1
and ts2, but also groups rows with
similar values on both the date and
region dimensions.
A range-clustered table (RCT)
■   A range-clustered table (RCT) has a table layout scheme in which each
    record in the table has a predetermined record ID (RID).
      – The RID is an internal identifier that is used to locate a record in
          the table.
■   An algorithm is used to associate a record key value with the location of
    a specific table row.
■   This approach provides exceptionally fast access to specific table rows.
■   The algorithm does not use hashing, because hashing does not preserve
    key-value order.
■   Preserving this order eliminates the need to reorganize the table data
    over time.
A range-clustered table (RCT), Cont’d…
■ Each record key value in the table must be:
   – Unique
   – Not null
   – An integer (SMALLINT, INTEGER, or BIGINT)
   – Monotonically increasing
   – Within a predetermined set of ranges based on each
      column in the key. (If necessary, use the ALLOW
      OVERFLOW option on the CREATE TABLE statement to
      allow rows with key values that are outside of the defined
      range of values.)
A range-clustered table (RCT), Cont’d…
■ In addition to direct access to specific table rows, there are
  other advantages to using range-clustered tables.
    – Less maintenance is required. A secondary structure,
       such as a B+ tree index, which would need to be
       updated after every insert, update, or delete operation,
       does not exist.
    – Less logging is required for RCTs, when compared to
       similarly-sized regular tables with B+ tree indexes.
    – Less buffer pool memory is required. There is no
       additional memory required to store a secondary
       structure, such as a B+ tree index
Guidelines for using
range-clustered tables
■ Because the process of creating a range-clustered table pre-
  allocates the required disk space, that space must be available.
■ When defining the range of key values, the minimum value is
  optional; if it is not specified, the default is 1. A negative
  minimum value must be specified explicitly. For example:
   – ORGANIZE BY KEY SEQUENCE (f1 STARTING FROM -100
       ENDING AT -10)
■ You cannot create a regular index on the same key values that
  are used to define the range-clustered table.
■ ALTER TABLE statement options that affect the physical structure
  of the table are not allowed.
Scenarios: Range-clustered tables
■ Range-clustered tables can have single-column or multiple-
  column keys, and can allow or disallow rows with key values
  that are outside of the defined range of values.
■ This section contains scenarios that illustrate how such tables
  can be created.
   – Scenario 1: Creating a range-clustered table (overflow
       allowed)
   – Scenario 2: Creating a range-clustered table (overflow
       not allowed)
Scenario 1:
Creating a range-clustered table
(overflow allowed)
■ The following example shows a range-clustered table that can be
  used to retrieve information about a specific student.
■ Each student record contains the following information:
   – School ID
   – Program ID
   – Student number
   – Student ID
   – Student first name
   – Student last name
   – Student grade point average (GPA)
Scenario 1:
Creating a range-clustered table
(overflow allowed)
CREATE TABLE students (
school_id     INT NOT NULL,
program_id     INT NOT NULL,
student_num      INT NOT NULL,
student_id     INT NOT NULL,
first_name     CHAR(30),
last_name      CHAR(30),
gpa          FLOAT
)
ORGANIZE BY KEY SEQUENCE
    (student_id STARTING FROM 1 ENDING AT 1000000)
    ALLOW OVERFLOW;
Scenario 2:
Creating a range-clustered table
(overflow not allowed)
■ In the following example, a school board administers 200
  schools, each having 20 classrooms with a capacity of 35
  students per classroom.
■ This school board can accommodate a maximum of 140,000
  students.
Multidimensional clustering
(MDC) tables
CREATE TABLE students (
 school_id     INT NOT NULL,
 class_id     INT NOT NULL,
 student_num INT NOT NULL,
 student_id INT NOT NULL,
 first_name CHAR(30),
 last_name      CHAR(30),
 gpa         FLOAT
)
 ORGANIZE BY KEY SEQUENCE
  (school_id STARTING FROM 1 ENDING AT 200,
   class_id STARTING FROM 1 ENDING AT 20,
   student_num STARTING FROM 1 ENDING AT 35)
  DISALLOW OVERFLOW;
Comparison of regular and MDC tables
• Regular tables
    • have indexes that are
        record-based.
    • Any clustering of the
        indexes is restricted to a
        single dimension
• A regular table with a
  clustering index. has two
  record-based indexes defined
  on it:
    • A clustering index on
        "Region“
MDC Block Index
■ MDC introduces indexes that are block-based. "Block indexes"
  point to blocks or groups of records instead of to individual
  records.
■ By physically organizing data in an MDC table into blocks
  according to clustering values, and then accessing these
  blocks using block indexes, MDC is able not only to address
  all of the drawbacks of clustering indexes, but to provide
  significant additional performance benefits
MDC Block Index
• The MDC table shown is
  physically organized such that
  records having the same
  "Region" and "Year" values are
  grouped together into separate
  blocks, or extents.
• An extent is a set of contiguous
  pages on disk, so these groups
  of records are clustered on
  physically contiguous data pages
How row indexes differ from
block indexes
CREATE MDC
CREATE TABLE "SATRIO"."EMPMDC" (
                    "EMPNO" INTEGER,
                    "DEPT" INTEGER,
                    "DIV" INTEGER
        )
        ORGANIZE BY ROW USING DIMENSIONS (
        ("DEPT"),
        ("DIV"))
        DATA CAPTURE NONE
COMPRESS NO;
#Sesi2
■ Constraints
■ Indexes
■ Triggers
■ Sequences
CONSTRAINT
■ DB2 provides three types of constraints:
   – Unique constraints, which are used to
     ensure that values in a column are
     unique.
   – Referential integrity constraints, which
     are used to define relationships
     between tables and ensure that these
     relationships remain valid.
   – Table check constraints, which are used
     verify that column data does not violate
     rules defined for the column.
INDEXES
■ create an index:
   – To ensure uniqueness of values in a
     column or columns.
   – To improve performance of queries
     against the table. The DB2 optimizer
     will use indexes to improve
     performance when performing
     queries, or to present results of a
     query in the order of the index.
TRIGGERS
■ There are two types of triggers:
1. BEFORE triggers
    – They are executed before any SQL
      operation.
2. AFTER triggers
    – They are executed after any SQL
      operation.
SEQUENCE
■ Sequences are
  similar to
  identity columns
  in that they both
  generate unique
  values.
CREATE SEQUENCE
ORDER_SEQ START
WITH 500
INCREMENT BY 1
MAXVALUE 1000
CYCLE CACHE 24
    VIEW
■   In SQL, a VIEW is a virtual relation based on the result-
    set of a SELECT statement.
■   A view contains rows and columns, just like a real table.
    The fields in a view are fields from one or more real
    tables in the database. In some cases, we can modify a
    view and present the data as if the data were coming
    from a single table.
■   Syntax:
       CREATE VIEW view_name AS
       SELECT column_name(s)
       FROM table_name
       WHERE condition
DATABASE OBJECT#2
■ View
■ Partitioned Table
■ History Table + Time Travelling Query
■ Temporary Table
■ Materialized Query Table (MQT)