Distributed Database Management Systems
• A distributed database management system (DDBMS) governs the storage and
processing of logically related data over interconnected computer systems in which
both data and processing are distributed among several sites.
• The use of a centralized database required that corporate data be stored in a single
central site, usually a mainframe computer. Data access was provided through dumb
terminals. The centralized approach worked well to fill the structured information
needs of corporations, but it fell short when quickly moving events required faster
response times and equally quick access to information. The slow progression from
information request to approval to specialist to user simply did not serve decision
makers well in a dynamic environment.
Factors that influenced the evolution of the DDBMS:
• The growing acceptance of the Internet as the platform for data access and distribution,
which creates a need to maintain repositories of distributed data.
• The wireless revolution. The widespread use of wireless digital devices, such as smart
phones like the iPhone and BlackBerry and personal digital assistants (PDAs), has created
high demand for data access. Such devices access data from geographically dispersed
locations and require varied data exchanges in multiple formats (data, voice, video,
music, pictures, etc.). Although distributed data access does not necessarily imply
distributed databases, performance and failure tolerance requirements often make use
of data replication techniques similar to the ones found in distributed databases.
• The accelerated growth of companies providing “application as a service” offerings.
This type of service provides remote applications to companies
wanting to outsource their application development, maintenance, and operations.
The company data is generally stored on central servers and is not necessarily
distributed. Just as with wireless data access, this type of service may not require fully
distributed data functionality; however, other factors such as performance and failure
tolerance often require the use of data replication techniques similar to the ones
found in distributed databases.
• The increased focus on data analysis that led to data mining and data warehousing.
Although a data warehouse is not usually a distributed database, it does rely on
techniques such as data replication and distributed queries that facilitate data extraction
and integration.
The Problems with the Centralized Database Management System:
• Performance degradation because of a growing number of remote locations over
greater distances.
• High costs associated with maintaining and operating large central (mainframe)
database systems.
• Reliability problems created by dependence on a central site (single point of failure
syndrome) and the need for data replication.
• Scalability problems associated with the physical limits imposed by a single location
(power, temperature conditioning, and power consumption).
• Organizational rigidity imposed by the database, which might not support the flexibility
and agility required by modern global organizations.
Advantages & Disadvantages of DDBMS
Advantages of DDBMS are as follows:
1. Data are located near the greatest demand site. The data in a
distributed database system are dispersed to match business
requirements, which reduces the cost of data access.
2. Faster data access. End users often work with only a locally stored
subset of the company’s data.
3. Faster data processing. A distributed database system spreads out the
system's workload by processing data at several sites.
4. Growth facilitation. New sites can be added to the network without
affecting the operations of other sites.
5. Improved communications. Because local sites are smaller and
located closer to customers, local sites foster better communication
among departments and between customers and company staff.
6. Reduced operating costs. It is more cost-effective to add workstations
to a network than to update a mainframe system. Development work is
done more cheaply and more quickly on low-cost PCs than on mainframes.
7. User-friendly interface. PCs and workstations are usually equipped with
an easy-to-use graphical user interface (GUI). The GUI simplifies training
and use for end users.
8. Less danger of a single-point failure. When one of the computers fails,
the workload is picked up by other workstations. Data are also distributed
at multiple sites.
9. Processor independence. The end user is able to access any available
copy of the data, and an end user's request is processed by any processor
at the data location.
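Points 8 and 9 can be illustrated with a minimal sketch, assuming every site holds a full copy of the data. The `Site` class, the `SiteDown` exception, and the `read_any_copy` helper are hypothetical names invented for this illustration; they do not come from any particular DDBMS product.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached (hypothetical failure signal)."""


class Site:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = data  # this site's copy of the data
        self.up = up      # False simulates a failed computer

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


def read_any_copy(sites, key):
    """Return (site name, value) from the first reachable copy."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except SiteDown:
            continue  # this copy is unavailable; try the next site
    raise SiteDown("no reachable copy of " + repr(key))


replicas = [
    Site("site_a", {"cust_42": "Ramirez"}, up=False),  # simulated failure
    Site("site_b", {"cust_42": "Ramirez"}),
]
served_by, value = read_any_copy(replicas, "cust_42")
print(served_by, value)
```

The request is served by whichever site is still reachable, so the failure of one computer does not stop the read. Keeping the copies consistent with each other is a separate problem that this sketch does not address.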
Disadvantages of DDBMS
1. Complexity of management and control. Applications must recognize data location, and
they must be able to stitch together data from various sites. Database administrators must
have the ability to coordinate database activities to prevent database degradation due to
data anomalies.
2. Technological difficulty. Data integrity, transaction management, concurrency control,
security, backup, recovery, query optimization, access path selection, and so on, must all be
addressed and resolved.
3. Security. The probability of security lapses increases when data are located at multiple
sites. The responsibility of data management will be shared by different people at several
sites.
4. Lack of standards. There are no standard communication protocols at the database
level. (Although TCP/IP is the de facto standard at the network level, there is no standard at
the application level.) For example, different database vendors employ different—and
often incompatible—techniques to manage the distribution of data and processing in a
DDBMS environment.
5. Increased storage and infrastructure requirements. Multiple copies of data are required
at different sites, thus requiring additional disk storage space.
6. Increased training cost. Training costs are generally higher in a distributed model than
they would be in a centralized model, sometimes even to the extent of offsetting
operational and hardware savings.
7. Costs. Distributed databases require duplicated infrastructure to operate (physical
location, environment, personnel, software, licensing, etc.).
Characteristics of Distributed Database Management Systems:
A DDBMS governs the storage and processing of logically related data over
interconnected computer systems in which both data and processing functions
are distributed among several sites. A DBMS must have at least the following
functions to be classified as distributed:
• Application interface to interact with the end user, application programs, and
other DBMSs within the distributed database.
• Validation to analyze data requests for syntax correctness.
• Transformation to decompose complex requests into atomic data request
components.
• Query optimization to find the best access strategy. (Which database
fragments must be accessed by the query, and how must data updates, if any,
be synchronized?)
• Mapping to determine the data location of local and remote fragments.
• I/O interface to read or write data from or to permanent local storage.
• Formatting to prepare the data for presentation to the end user or to an
application program.
• Security to provide data privacy at both local and remote databases.
• Backup and recovery to ensure the availability and recoverability of the database in case
of a failure.
• DB administration features for the database administrator.
• Concurrency control to manage simultaneous data access and to ensure data consistency
across database fragments in the DDBMS.
• Transaction management to ensure that the data moves from one consistent state to
another. This activity includes the synchronization of local and remote transactions as well
as transactions across multiple distributed segments.
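The validation, transformation, and mapping functions listed above can be sketched as a small request-planning pipeline. This is only an illustration under stated assumptions: the `CATALOG` dictionary, the table, fragment, and site names, and the request format are all hypothetical, and a real DDBMS would parse SQL and choose among access strategies rather than walk a fixed dictionary.

```python
# Hypothetical catalog: which site holds each fragment of each table.
CATALOG = {
    "CUSTOMER": {"CUSTOMER_EAST": "site_a", "CUSTOMER_WEST": "site_b"},
    "PRODUCT": {"PRODUCT_ALL": "site_c"},
}


def validate(request):
    """Validation: reject malformed requests and unknown tables."""
    if not isinstance(request, dict) or "tables" not in request:
        raise ValueError("request must be a dict with a 'tables' list")
    for table in request["tables"]:
        if table not in CATALOG:
            raise ValueError("unknown table: " + table)


def transform(request):
    """Transformation: split the request into one atomic request per fragment."""
    return [
        {"table": table, "fragment": fragment}
        for table in request["tables"]
        for fragment in sorted(CATALOG[table])
    ]


def map_to_sites(atomic_requests, local_site):
    """Mapping: attach each fragment's site and mark it local or remote."""
    plan = []
    for req in atomic_requests:
        site = CATALOG[req["table"]][req["fragment"]]
        plan.append({**req, "site": site, "remote": site != local_site})
    return plan


def plan_request(request, local_site):
    validate(request)
    return map_to_sites(transform(request), local_site)


plan = plan_request({"tables": ["CUSTOMER", "PRODUCT"]}, local_site="site_a")
for step in plan:
    print(step)
```

Each entry in the resulting plan names one fragment, the site that holds it, and whether the access is remote. That is the information later functions such as query optimization and the I/O interface would work from.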
Distributed Processing and Distributed Databases
Components of DDBMS
The different components of a DDBMS are as follows:
• Computer workstations or remote devices (sites or nodes) that form the network
system. The distributed database system must be independent of the computer system
hardware.
• Network hardware and software components that reside in each workstation or device.
The network components allow all sites to interact and exchange data. Because the
components—computers, operating systems, network hardware, and so on—are likely to
be supplied by different vendors, it is best to ensure that distributed database functions
can be run on multiple platforms.
• Communications media that carry the data from one node to another. The DDBMS must
be communications media-independent; that is, it must be able to support several types
of communications media.
• The transaction processor (TP), which is the software component found in each
computer or device that requests data. The transaction processor receives and processes
the application’s data requests (remote and local). The TP is also known as the application
processor (AP) or the transaction manager (TM).
• The data processor (DP), which is the software component residing on each computer or
device that stores and retrieves data located at the site. The DP is also known as the data
manager (DM). A data processor may even be a centralized DBMS.
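The division of work between the TP and the DP can be sketched as two small classes. This is a simplified, single-process illustration: the class and method names are hypothetical, and the direct method call from TP to DP stands in for what would be a network exchange between sites in a real system.

```python
class DataProcessor:
    """DP: stores and retrieves the data located at one site."""

    def __init__(self, site, rows):
        self.site = site
        self._rows = dict(rows)

    def retrieve(self, key):
        return self._rows.get(key)

    def store(self, key, value):
        self._rows[key] = value


class TransactionProcessor:
    """TP: receives an application's data requests and forwards each to a DP."""

    def __init__(self, site, processors):
        self.site = site
        self._dps = {dp.site: dp for dp in processors}

    def request(self, target_site, key):
        dp = self._dps[target_site]  # assumption: stands in for a network call
        kind = "local" if target_site == self.site else "remote"
        return kind, dp.retrieve(key)


dp_a = DataProcessor("site_a", {"inv_1": 120})
dp_b = DataProcessor("site_b", {"inv_2": 75})
tp = TransactionProcessor("site_a", [dp_a, dp_b])

print(tp.request("site_a", "inv_1"))  # handled by the DP at the TP's own site
print(tp.request("site_b", "inv_2"))  # forwarded to the DP at another site
```

The application only talks to the TP. Whether the data are local or remote is decided inside the TP, while each DP deals only with the data stored at its own site.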
Query Optimization in DDBMS
Client/Server Vs. DDBMS
