Platform introduction & Summary
Big Data Europe Platform Release, May 3rd 2017
Big Data Europe Integrator Platform
Empowering Communities with Data Technologies
Platform release
Dr. Hajira Jabeen
Senior researcher
University of Bonn
Platform Goals
◎Open source
◎Ease of Use
◎Support a variety of use cases
◎Embrace emerging Big Data Technologies
◎Simple integration with custom components
Key actors
Platform Architecture Evolution (slides 4-6, figures only)
Platform Architecture Existing (slides 7-8, figures only)
Platform Architecture Alternate View
◎ Support Layer
o Init Daemon
o GUIs
o Monitor
◎ App Layer
o Traffic Forecast
o Satellite Image Analysis
o Real-time Stream Monitoring
o ...
◎ Platform Layer
o Spark
o Flink
o Semantic Layer (Ontario, SANSA, Semagrow)
o Kafka
o ...
◎ Resource Management Layer (Swarm)
◎ Hardware Layer
o Premises
o Cloud (AWS, GCE, MS Azure, …)
◎ Data Layer
o Hadoop
o NoSQL Store
o Cassandra
o Elasticsearch
o RDF Store
o ...
BDE Supported Frameworks
◎ Search/indexing
o Apache Solr
◎ Data acquisition
o Apache Flume
◎ Message passing
o Apache Kafka
◎ Data storage
o Hue
o Apache Cassandra
o ScyllaDB
o Apache Hive
o PostGIS
◎ Data processing
o Apache Spark
o Apache Flink
◎ Semantic Components
o Strabon
o Sextant
o GeoTriples
o Silk
o SEMAGROW
o LIMES
o 4Store
o OpenLink Virtuoso
10
Platform features
◎ BDE Development Environment
o Stack builder
o Workflow builder
o Instructions to add custom components to the BDE stack
◎ Administrator Interface
o SwarmUI
◎ UI Integrator
o Workflow monitor
o Integrated web interface
11
What BDE Provides?
◎Platform Installation Instructions
◎Usage Instructions
o Creating a stack
o Creating a workflow
o Monitoring the Stack
o Integration of Custom Components
12
Platform installation
◎Manual installation guide
◎Using Docker Machine
o On local machine (VirtualBox)
o In cloud (AWS, DigitalOcean, Azure)
o Bare metal
◎Screencasts
13
Deploying a Big Data Stack
◎ Stack Builder
◎ Stack
o Collection of communicating components to solve a specific problem
◎ Described in Docker Compose
o Component configuration
o Application topology
14
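The idea of a stack as a set of configured, communicating components can be sketched in a few lines of Python. This is only an illustration of the concept, not the Stack Builder itself: the dictionary loosely mirrors the shape of a Docker Compose file, and every service name, image name, and environment value below is a made-up placeholder.

```python
import json

# Hypothetical stack: service names, images, and settings are illustrative only.
stack = {
    "services": {
        "kafka": {"image": "example/kafka", "environment": {"TOPIC": "traffic"}},
        "spark-master": {"image": "example/spark-master"},
        "spark-worker": {
            "image": "example/spark-worker",
            "depends_on": ["spark-master"],
            "environment": {"SPARK_MASTER": "spark://spark-master:7077"},
        },
        "forecast-app": {
            "image": "example/forecast-app",
            "depends_on": ["kafka", "spark-master"],
        },
    }
}


def topology(stack):
    """Return the application topology as sorted (service, dependency) edges."""
    return sorted(
        (name, dep)
        for name, spec in stack["services"].items()
        for dep in spec.get("depends_on", [])
    )


def undefined_dependencies(stack):
    """Return (service, dependency) pairs whose dependency is not declared in the stack."""
    services = stack["services"]
    return [(name, dep) for name, dep in topology(stack) if dep not in services]


if __name__ == "__main__":
    print(json.dumps(stack, indent=2))          # component configuration
    print("topology:", topology(stack))         # who talks to whom
    print("undefined:", undefined_dependencies(stack))
```

Component configuration lives in each service entry, while the `depends_on` edges carry the application topology; checking for undefined dependencies is one simple sanity test a stack description could be put through before deployment.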
Creation of Workflows
◎Pipeline Builder
o Allows creation of dependencies among different applications
◎ Workflow Monitor
o Monitoring of pipeline-workflow using
15
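Dependencies among applications amount to a directed graph, and a valid run order is a topological ordering of that graph. The sketch below shows the idea with Python's standard `graphlib` module (Python 3.9 or later); it is not the Pipeline Builder's implementation, and the step names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
pipeline = {
    "start-hdfs": set(),
    "load-input-data": {"start-hdfs"},
    "start-spark": set(),
    "run-forecast-job": {"load-input-data", "start-spark"},
}


def execution_order(steps):
    """Return one valid order in which every step runs after its dependencies."""
    try:
        return list(TopologicalSorter(steps).static_order())
    except CycleError as err:
        # The detected cycle is the second element of the exception arguments.
        raise ValueError(f"circular dependency in pipeline: {err.args[1]}") from err


if __name__ == "__main__":
    print(execution_order(pipeline))
```

A circular dependency has no valid order, so the sketch reports it as an error instead of returning a partial plan.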
Integrating Custom Components
◎Instructions
o Orchestrator required for initialization process (init_daemon)
❖ Components may depend on each other
❖ Components may require manual intervention
o User Interface Integration
❖ Standard Interfaces from components
❖ Combine and align the interfaces
16
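The two initialization constraints named above (components depending on each other, and steps that need manual intervention) can be sketched as a small in-memory coordinator. This is an assumption-laden illustration only: the class, its method names, and the step names are hypothetical and do not reflect the actual init_daemon interface.

```python
from enum import Enum


class Status(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    DONE = "done"


class InitCoordinator:
    """In-memory stand-in for an initialization orchestrator (hypothetical API)."""

    def __init__(self, requires, manual=()):
        self.requires = requires        # step -> list of prerequisite steps
        self.manual = set(manual)       # steps that need operator approval
        self.approved = set()
        self.status = {step: Status.WAITING for step in requires}

    def approve(self, step):
        """Record the operator's manual go-ahead for a step."""
        self.approved.add(step)

    def can_start(self, step):
        """A step may start once its prerequisites are done and any manual gate is cleared."""
        ready = all(self.status[dep] is Status.DONE for dep in self.requires[step])
        gated = step in self.manual and step not in self.approved
        return self.status[step] is Status.WAITING and ready and not gated

    def start(self, step):
        if not self.can_start(step):
            raise RuntimeError(f"{step} is not allowed to start yet")
        self.status[step] = Status.RUNNING

    def finish(self, step):
        self.status[step] = Status.DONE


if __name__ == "__main__":
    coordinator = InitCoordinator(
        requires={"create-schema": [], "import-data": ["create-schema"]},
        manual=["import-data"],
    )
    coordinator.start("create-schema")
    coordinator.finish("create-schema")
    coordinator.approve("import-data")      # the manual intervention
    print(coordinator.can_start("import-data"))
```

Each component would ask the coordinator whether it may start and report back when it finishes, which keeps the ordering logic out of the components themselves.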
User Interfaces
◎Target: Facilitate the use of the platform
o User Interface Adaptation
◎Available interfaces
o Workflow UIs
❖ Workflow Builder
❖ Workflow Monitor
o Swarm UI
o Integrator UI
17
Details!
18
Presentations by Ivan, Aad and Jens
19
Summary
20
Platform Architecture
21
Pilot Show Cases
22
SC1 - Open PHACTS discovery platform relating to biological/medical questions
SC2 - Discovery and linking of viticulture-relevant information
SC3 - System monitoring in energy production units
SC4 - Short-term traffic flow forecasting
SC5 - Supporting data-intensive climate research
SC6 - Citizens & Researchers Budget on Municipal Level
SC7 - Ingestion of remote sensing images and social sensing data to detect and verify changes on the Earth surface for security applications
SC1 - Health
23
SC2 - Food
24
SC3 - Energy
25
26
SC4 - Transport
FCD: Floating Car Data
NRT: Near Real Time
SC5 - Climate
27
SC6 - Social Sciences
28
SC7 - Security
29
BDE vs Hadoop distributions
| | Hortonworks | Cloudera | MapR | Bigtop | BDE |
| File system | HDFS | HDFS | NFS | HDFS | HDFS |
| Installation | Native | Native | Native | Native | Lightweight virtualization |
| Flexible modular architecture | No | No | No | No | Yes |
| High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self healing, multiple failure recovery | Single failure recovery (YARN) | Failure recovery |
| Cost | Commercial | Commercial | Commercial | Free | Free |
| Scaling | Freemium | Freemium | Freemium | Free | Free |
| Addition of custom components | Not easy | No | No | No | Yes |
| Integration testing | Yes | Yes | Yes | Yes | -- |
| Operating systems | Linux | Linux | Linux | Linux | Windows/Mac/Linux |
| Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm UI + custom |
30
BDE vs Hadoop distributions
◎BDE is not built on top of existing distributions
◎Targets
o Communities
o Research Institutions
◎Bridges scientists and open data
◎Multi Tier research efforts towards Smart Data
31
Maintenance and Uptake
◎Community Driven
◎Adopters
■ Feuga, Eurostat, ILVO
■ I2cat, Vicomtech, IoF
◎Follow Up Projects
■ HOBBIT
■ Special
■ Big Ocean
■ Qrowd
32
Wrap up
33
◎Big Data Europe - Platform
o Containerized Components
o Development and Runtime Facilities
◎Show Cases
o From Components to Architectures
o Evolving Microservice Architectures

Editor's Notes

  • #7 Explain Docker, compose, and Swarm
  • #23 Viticulture (German: Weinanbau, i.e. wine growing)