Indexing in DBMS
What is Indexing?
• Indexing is used to optimize the performance of a database by
  minimizing the number of disk accesses required when a query is
  processed.
• The index is a type of data structure.
• It is used to locate and access the data in a database table quickly.
• Indexes can be created using some database columns.
     What is Indexing?
• Indexing is a data structure technique which allows you to quickly retrieve
  records from a database file.
• An Index is a small table having only two columns.
• The first column of the database is the search key that contains a copy of the
  primary key or candidate key of the table. The values of the primary key are
  stored in sorted order so that the corresponding data can be accessed easily.
• The second column of the database is the data reference. It contains a set of
  pointers holding the address of the disk block where the value of the particular
  key can be found.
• An index –
   • Takes a search key as input
   • Efficiently returns a collection of matching records.
• Index structure:
Types of Indexing in DBMS
• Indexing in Database is defined based on its indexing attributes. Three
  main types of indexing methods are:
   • Primary Indexing
   • Secondary Indexing
   • Clustering index
Primary Index in DBMS
• Primary Index is an ordered file which is fixed length size with two
  fields.
• The first field is the same a primary key and second field is pointed
  to that specific data block.
• In the primary Index, there is always one to one relationship
  between the entries in the index table.
• The primary Indexing in DBMS is also further divided into two types:
   • Dense Index
   • Sparse Index
Primary Index in DBMS
Dense Index
• The dense index contains an index
  record for every search key value in
  the data file. It makes searching
  faster.
• In this, the number of records in the
  index table is same as the number
  of records in the main table.
• It needs more space to store index
  record itself. The index records have
  the search key and a pointer to the
  actual record on the disk.
     Primary Index in DBMS
Sparse Index
• It is an index record that appears for only
  some of the values in the file.
• Sparse Index helps you to resolve the issues
  of dense Indexing in DBMS. In this method of
  indexing technique, a range of index columns
  stores the same data block address, and
  when data needs to be retrieved, the block
  address will be fetched.
• However, sparse Index stores index records
  for only some search-key values. It needs less
  space, less maintenance overhead for
  insertion, and deletions but It is slower
  compared to the dense Index for locating
  records.
Secondary Index in DBMS
• The secondary Index in DBMS can be generated by a field which has a
  unique value for each record, and it should be a candidate key.
• It is also known as a non-clustering index.
• This two-level database indexing technique is used to reduce the
  mapping size of the first level.
• For the first level, a large range of numbers is selected because of
  this; the mapping size always remains small.
     Secondary Index in DBMS
Secondary Index Example
• Let’s understand secondary indexing
  with a database index example:
• In a bank account database, data is
  stored sequentially by acc_no; you may
  want to find all accounts in a specific
  branch of ABC bank.
• Here, you can have a secondary index in
  DBMS for every search-key. Index record
  is a record point to a bucket that
  contains pointers to all the records with
  their specific search-key value.
Secondary Index in DBMS
     Clustering Index in DBMS
• In a clustered index, records are stored in the Index and not pointers.
• Sometimes the Index is created on non-primary key columns which might
  not be unique for each record. In such a situation, you can group two or
  more columns to get the unique values and create an index which is called
  clustered Index. This also helps you to identify the record faster.
Example:
• Let’s assume that a company recruited many employees in various
  departments. In this case, clustering indexing in DBMS should be created
  for all employees who belong to the same dept.
• It is considered in a single cluster, and index points point to the cluster as a
  whole. Here, Department _no is a non-unique key.
Clustering Index in DBMS
                The previous schema is little
                confusing because one disk block
                is shared by records which belong
                to the different cluster. If we use
                separate disk block for separate
                clusters, then it is called better
                technique.
  What is Multilevel Index?
• Multilevel Indexing in Database is
  created when a primary index does
  not fit in memory.
• In this type of indexing method, you
  can reduce the number of disk
  accesses to short any record and kept
  on a disk as a sequential file and
  create a sparse base on that file.
       B-Tree Index
• B-tree index is the widely used data
  structures for tree based indexing in
  DBMS. It is a multilevel format of tree
  based indexing in DBMS technique which
  has balanced binary search trees. All leaf
  nodes of the B tree signify actual data
  pointers.
• Moreover, all leaf nodes are interlinked
  with a link list, which allows a B tree to
  support both random and sequential
  access.
                               •   Lead nodes must have between 2 and 4 values.
                               •   Every path from the root to leaf are mostly on an equal length.
                               •   Non-leaf nodes apart from the root node have between 3 and 5 children nodes.
                               •   Every node which is not a root or a leaf has between n/2] and n children.
Advantages of Indexing
  Important pros/ advantage of Indexing are:
  • It helps you to reduce the total number of I/O operations needed
    to retrieve that data, so you don’t need to access a row in the
    database from an index structure.
  • Offers Faster search and retrieval of data to users.
  • Indexing also helps you to reduce tablespace as you don’t need
    to link to a row in a table, as there is no need to store the ROWID
    in the Index. Thus you will able to reduce the tablespace.
  • You can’t sort data in the lead nodes as the value of the primary
    key classifies it.
Disadvantages of Indexing
  Important drawbacks/cons of Indexing are:
  • To perform the indexing database management system, you
    need a primary key on the table with a unique value.
  • You can’t perform any other indexes in Database on the Indexed
    data.
  • You are not allowed to partition an index-organized table.
  • SQL Indexing Decrease performance in INSERT, DELETE, and
    UPDATE query.
Cukup untuk hari ini
QUIZ (15 Menit)
1. Apa kelemahan Indexing
2. Jelaskan perbedaan antara Dense dan sparse index
3. Gambarkan struktur index B-Tree