Future of Data: Big Data
Shankar Radhakrishnan
Cognizant
Topics
• How did we get here?
• Data Explosion
• Big Data
• Big Data in an Enterprise
• Big Data Platform - Hadoop
• Big Data Adoption
• Q&A
How did we get here?
Familiar World
• EDW
• Datamarts
Familiar Problems (Data warehouse)
• Data Integration Problems
• Data Processing Problems
• Storage Management
• Performance Problems
• Limitations out of Complexity
New World
• Newer types of data to integrate
• Increase in volume
• Newer analytical requirements
Data Explosion
Newer Interests
• Social Intelligence
   o DBIM, Sentiment Analysis, Social Customer Care
• Predictive Analytics
   o Propensity, Price Elasticity, Anti-Fraud Analytics
• Segmentation Insights
   o Funnel Analysis, Behavioral Patterns, Cohort Analysis
• Mobile Analytics
   o Ad-Targeting, Geo-spatial Analytics
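To make one of the terms above concrete, funnel analysis counts how many users survive each step of a fixed sequence. The sketch below is only an illustration: the event log, the step names, and the function name `funnel_counts` are hypothetical and do not come from the slides.

```python
# Hypothetical event log of (user_id, step) pairs; illustrative data only.
EVENTS = [
    ("u1", "visit"), ("u1", "add_to_cart"), ("u1", "purchase"),
    ("u2", "visit"), ("u2", "add_to_cart"),
    ("u3", "visit"),
    ("u4", "visit"), ("u4", "add_to_cart"), ("u4", "purchase"),
]

# Assumed funnel definition, ordered from entry to conversion.
FUNNEL_STEPS = ["visit", "add_to_cart", "purchase"]


def funnel_counts(events, steps):
    """Return (step, user_count) pairs for users who hit this and all earlier steps.

    Simplification: event order within a user is ignored; only membership counts.
    """
    steps_by_user = {}
    for user, step in events:
        steps_by_user.setdefault(user, set()).add(step)

    remaining = set(steps_by_user)
    counts = []
    for step in steps:
        # Keep only users who also performed the current step.
        remaining = {u for u in remaining if step in steps_by_user[u]}
        counts.append((step, len(remaining)))
    return counts


if __name__ == "__main__":
    for step, count in funnel_counts(EVENTS, FUNNEL_STEPS):
        print(step, count)
```

The drop between consecutive counts is the quantity of interest. A production version would also respect event timestamps, which this sketch leaves out.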
Categories
• Structured Data
   o Enterprise Data (CRM, ERP, Data Stores, Reference Data)
• Semi-structured Data
   o Machine Generated Data (Sensor Data, RFIDs)
• Unstructured Data
   o Social Data (Comments, Tweets), Blog posts
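The three categories above differ mainly in how much structure arrives with the data. The sketch below illustrates this with one sample per category. All sample values and field names are made up for illustration, and only the Python standard library is used.

```python
import csv
import io
import json

# Structured: fixed, known columns, e.g. a CRM export (hypothetical sample).
crm_csv = "customer_id,name,segment\n101,Asha,retail\n"
rows = list(csv.DictReader(io.StringIO(crm_csv)))

# Semi-structured: self-describing fields that may vary per record,
# e.g. a sensor or RFID reading (hypothetical sample).
sensor_json = '{"device": "rfid-7", "ts": 1350000000, "reading": {"temp_c": 21.5}}'
reading = json.loads(sensor_json)

# Unstructured: free text with no schema, so any structure has to be derived.
# Here that is a naive whitespace tokenizer (hypothetical sample).
tweet = "Loving the new phone, battery life is great!"
tokens = [word.strip(",.!?").lower() for word in tweet.split()]

if __name__ == "__main__":
    print(rows[0]["segment"])
    print(reading["reading"]["temp_c"])
    print(tokens)
```

The structured row can be queried by column name immediately and the semi-structured record by key path, while the unstructured text needs processing before any analysis is possible.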
Big Data
Dimensions of Big Data: Volume, Velocity, Variety, Complexity
“Big Data” refers to high-volume, high-velocity, high-variety, and complex information
assets that demand cost-effective, innovative forms of information processing for
enhanced insight and decision making.
Big Data Platforms
• Data Integration
   o Informatica, InfoSphere
   o Talend, Pentaho, Karmasphere, Apache Sqoop, Apache Flume

• Database Framework
   o Hadoop (Distributions: Cloudera, Hortonworks, MapR)
   o HBase
   o Hive

• NoSQL Databases
   o MongoDB, CouchDB

• Machine Data Processing
   o Splunk, Mahout

• Text Analytics
   o Clarabridge, Lexalytics
Big Data in an Enterprise
• Big Data Sources → ETL → Big Data Platform
• Data Sources → ETL → Data warehouse
• ETL → Datamarts → Analytical Applications
Hadoop - Ecosystem
Big Data: Adoption Drivers
• Platform
   o Cluster, Distributed, Storage, Scalable, Process, Availability, Performance
• Possibilities
   o Data Integration, Augmented, Data Processing, TCO, Ecosystem, Actionable Insights, ROI
Big Data – Adoption Scenarios

• Replatforming to Big Data (Hadoop, MapR)
• Archival Solution (Hadoop)
• Offloading Data warehouse, EDW (Hadoop, Hive)
• Social Media Integration
• Machine Data Analysis (Splunk, Mahout)
• Complex Analytical Requirements (HBase)
Q&A
