Parallel Database
             What Is a Parallel Database?
▪ A parallel database system seeks to improve performance
  through parallelization of various operations, such as loading
  data, building indexes and evaluating queries.
▪ Parallel databases improve processing and input/output speeds
  by using multiple CPUs and disks in parallel.
▪ Centralized and client–server database systems are not
  powerful enough to handle applications that query very large
  databases or process very large numbers of transactions.
▪ In parallel processing, many operations are performed
  simultaneously.
               Architecture of parallel database
Shared memory architecture:
Where multiple processors share the main memory
(RAM) space as well as the disks (HDD).
If many processes run simultaneously, they contend for
the shared memory and the speed is reduced, just as a single
computer slows down when it runs many tasks at once.
Shared disk architecture:
Where each node has its own main memory,
but all nodes share mass storage, usually
a storage area network.
Shared nothing architecture:
Where each node has its own mass storage as well as main memory.
                                   Speedup
➢ Speedup
    ▪ Speedup is the extent to which more hardware can perform the same task in less
      time than the original system. Speedup holds the task constant and measures
      the time saved by adding hardware.
    ▪ With good speedup, additional processors reduce system response time. You
      can measure speedup using this formula:
          Speedup = Time_Original / Time_Parallel
➢ Time_Original is the elapsed time spent by the original, smaller system on the
  given task.
➢ Time_Parallel is the elapsed time spent by a larger, parallel system on the given
  task.
➢ For example, if the original system took 60 seconds to perform a task, and a
  parallel system with twice the hardware took 30 seconds, then the speedup would
  be 60/30 = 2.
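The example above can be checked with a short Python sketch. The function name `speedup` and its parameter names are illustrative only, and the sketch assumes the ratio of original time to parallel time implied by the 60-second example.

```python
def speedup(time_original: float, time_parallel: float) -> float:
    """Ratio of elapsed time on the original system to elapsed time on the parallel system."""
    if time_parallel <= 0:
        raise ValueError("time_parallel must be positive")
    return time_original / time_parallel


# Values taken from the slide: 60 seconds originally, 30 seconds in parallel.
result = speedup(60, 30)
print(result)
```

A value above 1 means the larger system finished the same task faster than the original one.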
                                          Scaleup
➢ Scaleup
      ▪ Scaleup is the factor m that expresses how much more work can be done in the same time period
        by a system n times larger. Scaleup holds the time constant and measures the increase in the
        size of the job that the added hardware can handle.
      ▪ With good scaleup, if transaction volumes grow, you can keep response time constant by adding
        hardware resources such as CPUs.
      ▪ You can measure scaleup using this formula:
            Scaleup = Volume_Parallel / Volume_Original
➢ Volume_Original is the transaction volume processed in a given amount of time on the original
        system.
➢ Volume_Parallel is the transaction volume processed in the same amount of time on a parallel
        system.
      ▪ For example, if the original system can process 100 transactions in a given amount of time, and
        the parallel system can process 200 transactions in this amount of time, then the value of
        scaleup would be equal to 2. That is, 200/100 = 2.
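The same calculation can be sketched in Python. The function name `scaleup` is illustrative, and the value n = 2 (how many times larger the parallel system is) is an assumption made for this sketch; the slide does not state it.

```python
def scaleup(volume_original: float, volume_parallel: float) -> float:
    """Factor m: work done by the parallel system relative to the original, in the same time."""
    if volume_original <= 0:
        raise ValueError("volume_original must be positive")
    return volume_parallel / volume_original


# Values taken from the slide: 100 transactions originally, 200 in parallel.
m = scaleup(100, 200)

# Assumption for illustration: the parallel system is n = 2 times larger.
n = 2
efficiency = m / n  # 1.0 would mean scaleup grows in step with the hardware

print(m, efficiency)
```

Comparing m with n shows how much of the added hardware turns into extra throughput.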
Intraquery Parallelism
Intraquery parallelism means executing a single query in parallel using multiple processors or disks.
This can be done by dividing the data into many smaller units and executing the query on
  those smaller units.
It is most useful for complex queries that consume a lot of time and resources.
For example: SELECT * FROM Email ORDER BY Start_Date;
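This query sorts the records of the Email table in ascending order on the
  attribute Start_Date.
Assume that the Email table has 10,000 records. The table can be divided into smaller
  units, each unit sorted on a different processor, and the sorted units then merged.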
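The idea of dividing the Email table into smaller units and sorting them in parallel can be sketched in Python as follows. This is a minimal illustration, not how any particular database engine works: the `emails` list is a made-up stand-in for the table, the function name `parallel_order_by` is hypothetical, and threads stand in for the separate processors (in CPython they do not give true CPU parallelism).

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

# Hypothetical stand-in for the Email table: 10,000 (id, start_date) records.
random.seed(0)
emails = [(i, date(2020, 1, 1) + timedelta(days=random.randrange(1000)))
          for i in range(10_000)]


def by_start_date(record):
    return record[1]


def sort_chunk(chunk):
    return sorted(chunk, key=by_start_date)


def parallel_order_by(records, workers=4):
    # 1. Partition the table into roughly equal units, one per worker.
    size = max(1, -(-len(records) // workers))  # ceiling division
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    # 2. Sort every unit concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sort_chunk, chunks))
    # 3. Merge the sorted units into one ordered result.
    return list(heapq.merge(*sorted_chunks, key=by_start_date))


result = parallel_order_by(emails)
print(result[0], result[-1])
```

With four workers, each unit holds 2,500 records, so each sort handles a quarter of the table before the final merge.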
Independent parallelism and Pipe-lined parallelism
Independent parallelism:
Different operations of a query are executed on different processors at the same time,
  provided they do not depend on each other's results.
For example, if we need to join four tables, then two can be joined at one processor
  and the other two can be joined at another processor. The final join is done afterwards
  on the two intermediate results.
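The four-table example can be sketched in Python. The tables `r1` to `r4`, the join key `id` and the helper `hash_join` are all hypothetical, and threads stand in for the two processors.

```python
from concurrent.futures import ThreadPoolExecutor


def hash_join(left, right, key):
    """Join two lists of dict rows on `key` using an in-memory hash table."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical tables, invented for this sketch.
r1 = [{"id": 1, "a": "a1"}, {"id": 2, "a": "a2"}]
r2 = [{"id": 1, "b": "b1"}, {"id": 2, "b": "b2"}]
r3 = [{"id": 1, "c": "c1"}, {"id": 2, "c": "c2"}]
r4 = [{"id": 1, "d": "d1"}, {"id": 3, "d": "d3"}]

# The two joins do not depend on each other, so they can run at the same time.
with ThreadPoolExecutor(max_workers=2) as pool:
    left_future = pool.submit(hash_join, r1, r2, "id")
    right_future = pool.submit(hash_join, r3, r4, "id")
    left, right = left_future.result(), right_future.result()

# The final join needs both intermediate results, so it runs afterwards.
final = hash_join(left, right, "id")
print(final)
```

Only the last step has to wait; the first two joins share no inputs or outputs.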
Pipe-lined parallelism:
Different operations are executed in pipe-lined fashion: one operation consumes the
   output of another while that output is still being produced.
For example, if we need to join three tables, one processor may join two tables and
   send the result records to the other processor as soon as they are produced.
The other processor joins the third table with the incoming records and produces
   the final result.
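The three-table example can be sketched in Python with two threads connected by a queue. The tables, the join key `id`, and the names `join_stage`, `probe_stage` and `DONE` are all hypothetical; the threads stand in for the two processors and the queue for the link between them.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of the record stream


def build_index(rows, key):
    index = {}
    for row in rows:
        index.setdefault(row[key], []).append(row)
    return index


def join_stage(left, right, key, out):
    """First processor: join two tables, sending each row on as soon as it exists."""
    index = build_index(right, key)
    for l in left:
        for r in index.get(l[key], []):
            out.put({**l, **r})
    out.put(DONE)


def probe_stage(inp, third, key):
    """Second processor: join incoming rows with the third table as they arrive."""
    index = build_index(third, key)
    results = []
    while (row := inp.get()) is not DONE:
        results.extend({**row, **t} for t in index.get(row[key], []))
    return results


# Hypothetical tables, invented for this sketch.
r1 = [{"id": 1, "a": "a1"}, {"id": 2, "a": "a2"}, {"id": 3, "a": "a3"}]
r2 = [{"id": 1, "b": "b1"}, {"id": 2, "b": "b2"}, {"id": 3, "b": "b3"}]
r3 = [{"id": 1, "c": "c1"}, {"id": 3, "c": "c3"}]

pipe = queue.Queue(maxsize=2)  # small buffer between the two stages
producer = threading.Thread(target=join_stage, args=(r1, r2, "id", pipe))
producer.start()
final = probe_stage(pipe, r3, "id")
producer.join()
print(final)
```

The small queue size means the first stage never gets far ahead of the second, so the intermediate join result is never held in full.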
        What Are the Benefits of Parallel Database?
➢ Higher Performance
    ▪ With more CPUs available to an application, higher speedup and
      scaleup can be attained.
➢ Higher Availability
    ▪ Nodes are isolated from each other, so a failure at one node does not
      bring the whole system down.
➢ Greater Flexibility
    ▪ Instances can be allocated or deallocated as necessary. When there is
      high demand for the database, more instances can be temporarily
      allocated.
➢ More Users
    ▪ Parallel database technology can make it possible to overcome
      memory limits, enabling a single system to serve thousands of users.