A data warehouse is a collection of data drawn from heterogeneous sources and
organised under a unified schema. There are two approaches to constructing a
data warehouse: the top-down approach and the bottom-up approach. Both are
explained below.
Top-down approach:
The essential components are discussed below:
1. External Sources –
An external source is any source from which data is collected, irrespective
of the type of data. The data can be structured, semi-structured or
unstructured.
2. Staging Area –
Since the data extracted from the external sources does not follow a
particular format, it must be validated before it is loaded into the
data warehouse. An ETL tool is recommended for this purpose.
● E (Extract): Data is extracted from the external data sources.
● T (Transform): Data is transformed into the standard format.
● L (Load): Data is loaded into the data warehouse after it has been
transformed into the standard format.
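The three ETL steps above can be sketched in a few lines of Python. This is
only an illustration, not how any particular ETL tool works: the two inline
sources, the sales table and the function names are all assumptions made for
the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw inputs from two external sources in different formats.
CSV_SOURCE = "id,name,amount\n1, alice ,10.50\n2,BOB,7\n"
JSON_SOURCE = '[{"id": 3, "name": "Carol", "amount": "12.25"}]'


def extract():
    """E: pull raw records out of each external source as dictionaries."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(json.loads(JSON_SOURCE))
    return rows


def transform(rows):
    """T: coerce every record into one standard (id, name, amount) format."""
    return [
        (int(r["id"]), str(r["name"]).strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, records):
    """L: write the standardised records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run extract, transform and load against an in-memory database."""
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    load(conn, transform(extract()))
    return conn
```

Calling run_etl() returns a connection whose sales table holds all three
records in the same format, regardless of which source they came from.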
3. Data Warehouse –
After cleansing, the data is stored in the data warehouse, which acts as the
central repository. It holds the detailed data together with its metadata,
and the data marts are later populated from it. Note that in this top-down
approach the data warehouse stores the data in its purest form.
4. Data Marts –
A data mart is also part of the storage component. It stores the information
for a particular function of an organisation that is handled by a single
authority. An organisation can have as many data marts as it has functions.
We can also say that a data mart contains a subset of the data stored in the
data warehouse.
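One way to picture a data mart as a subset of the warehouse is a view that
exposes only one function's rows. The sketch below assumes a hypothetical
facts table with a department column. The table, the sample rows and the
create_data_mart helper are invented for illustration, and real data marts
are often separate physical stores rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
conn.execute("CREATE TABLE facts (department TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("sales", "laptop", 900.0),
        ("sales", "mouse", 20.0),
        ("hr", "training", 300.0),
    ],
)


def create_data_mart(conn, department):
    """Expose one department's rows as a view named mart_<department>."""
    # SQLite views cannot take bound parameters, so the name is validated
    # before being placed into the SQL text.
    if not department.isidentifier():
        raise ValueError("department must be a simple name")
    view = f"mart_{department}"
    conn.execute(
        f"CREATE VIEW {view} AS "
        f"SELECT item, amount FROM facts WHERE department = '{department}'"
    )
    return view


sales_mart = create_data_mart(conn, "sales")
sales_rows = conn.execute(
    f"SELECT item, amount FROM {sales_mart} ORDER BY item"
).fetchall()
```

Here sales_rows contains only the two sales records, while the hr record
stays visible only through the warehouse table or its own mart.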
5. Data Mining –
The practice of analysing the large volumes of data held in the data
warehouse is called data mining. It uses data mining algorithms to find
hidden patterns in a database or data warehouse.
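As a small illustration of finding a hidden pattern, the sketch below counts
which pairs of items are bought together most often. It is a simplified
co-occurrence count, not a full data mining algorithm, and the basket data
and the frequent_pairs function are assumptions made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, as they might be read from the warehouse.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(BASKETS, min_support=3)
```

With this sample data the only pair that reaches the threshold is bread and
milk, which appear together in three of the four baskets.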
Inmon defines this approach as follows: the data warehouse is a central
repository for the complete organisation, and data marts are created from
it after the complete data warehouse has been created.
Advantages of Top-Down Approach –
1. Since the data marts are created from the data warehouse, they provide a
consistent dimensional view of the data.
2. This model is also considered the strongest model for handling business
changes, which is why big organisations prefer to follow this approach.
3. Creating a data mart from the data warehouse is easy.
Disadvantages of Top-Down Approach –
1. The cost and time required to design and maintain the data warehouse are
very high.
2. Complexity: The top-down approach can be complex to implement and
maintain, particularly for large organisations with complex data needs.
Bottom-up approach:
1. First, the data is extracted from external sources (the same as in the
top-down approach).
2. Then, the data goes through the staging area (as explained above) and is
loaded into data marts instead of the data warehouse. The data marts are
created first and provide reporting capability. Each data mart addresses a
single business area.
3. These data marts are then integrated into the data warehouse.
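The bottom-up flow above can be sketched as follows: marts are built first
and can already serve reports, and only afterwards are they combined into a
warehouse. The staged records and the function names are hypothetical, and
real integration would also need to reconcile keys and dimensions across
marts, which this sketch leaves out.

```python
from collections import defaultdict

# Hypothetical staged records, already cleaned by the staging area.
STAGED = [
    {"area": "sales", "key": "order-1", "value": 120.0},
    {"area": "finance", "key": "invoice-7", "value": 120.0},
    {"area": "sales", "key": "order-2", "value": 80.0},
]


def build_data_marts(staged):
    """Step 2: load each record into the mart for its business area."""
    marts = defaultdict(list)
    for record in staged:
        marts[record["area"]].append(record)
    return dict(marts)


def report(mart):
    """Marts can serve reports before any warehouse exists."""
    return sum(record["value"] for record in mart)


def integrate(marts):
    """Step 3: combine the marts into one warehouse-level collection."""
    return [record for area in sorted(marts) for record in marts[area]]


marts = build_data_marts(STAGED)
sales_total = report(marts["sales"])  # available before integration
warehouse = integrate(marts)
```

Note that the sales report is computed from the mart alone; the warehouse
is assembled only in the last line.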
Advantages of Bottom-Up Approach –
1. As the data marts are created first, reports can be generated quickly.
2. More data marts can be added later, and in this way the data warehouse
can be extended.
3. The cost and time taken to design this model are comparatively low.
Disadvantage of Bottom-Up Approach –
1. This model is not as strong as the top-down approach, because the
dimensional view of the data marts is not as consistent as it is in the
top-down approach.