Big data + data science startup focus points

BIG DATA & DATA SCIENCE
START-UP FOCUS POINTS
+ BUSINESS AND TECHNOLOGY
REFERENCE ARCHITECTURE
@TomZorde

I HAVE AN IDEA FOR A DATA SCIENCE START-UP
• Use these slides to focus conversation
• What stage are you at?
• What is the problem you’re trying to solve?
• What type of business model would work?
• Tools? – A rapidly evolving space.
• Reference Architecture helps identify what level of the stack
we’re talking about.

AREAS OF EARLY FOCUS
SEED STAGE - Research & Development
1. Research and define the concept, business model, and internal & sourced capabilities
2. Define customer value proposition and identify target market
ANGEL – Business Planning & Product Development
1. Identify services and products required and evaluate gaps for go-to-market readiness
2. Source funding partner to build minimum viable product and get commitment for round 2 funding
3. Assemble team and build MVP prototype exceeding expectations
ROUND 1 / SERIES A FUNDING – Commercially operational
ROUND 2 / SERIES B FUNDING – Fully Operational
ROUND 3 / SERIES C FUNDING – Expansion
IPO / ACQUISITION

BUSINESS PLANNING & DEVELOPMENT - LOGICAL STEPS
1. Full business needs and information requirements
analysis. Business drivers:
• Revenue generation? Cost reduction? Customer
retention? Compliance?
• Process Improvement? Fraud detection?
Analytics? Dashboard?
• Solving a tough problem? Retiring/replacing
assets, technologies and systems?
2. Technology Evaluation and Selection
• Define requirements and objectives first
• Evaluate a variety of technology stacks –
develop a framework first
3. Board Support for Start-up Resources
4. Prototyping, Discovery, and Planning
• Rent infrastructure in the cloud – VMware, AWS, MS
Azure and others
• Use Spare Hardware and Network Bandwidth
• Assessment, proposal, and project/program plan for
next steps
• Start small and keep delivering
5. Architecture Design, Estimation, Business Case
6. Obtain funding and executive sponsorships,
owners, etc.
7. SDLC: don’t forget hardware, security, testing,
data governance, etc.
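
One way to make the "develop a framework first" point in step 2 concrete is a weighted scoring matrix. The sketch below is only an illustration: the criteria, the weights, the 1-5 scale and the candidate names (stack_a, stack_b) are all assumptions, not something the slides prescribe.

```python
# Hypothetical criteria and weights; replace them with your own requirements.
# The weights are chosen to sum to 1.0 so the result stays on the 1-5 scale.
WEIGHTS = {
    "fit_to_requirements": 0.4,
    "maturity": 0.2,
    "security": 0.2,
    "cost": 0.2,
}


def weighted_score(scores, weights=WEIGHTS):
    """Return the weighted sum of per-criterion scores (each on a 1-5 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_stacks(candidates, weights=WEIGHTS):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Illustrative placeholder scores only, not assessments of any real product.
candidates = {
    "stack_a": {"fit_to_requirements": 4, "maturity": 3, "security": 4, "cost": 2},
    "stack_b": {"fit_to_requirements": 3, "maturity": 5, "security": 3, "cost": 4},
}

if __name__ == "__main__":
    for name in rank_stacks(candidates):
        print(name, round(weighted_score(candidates[name]), 2))
```

Agreeing the criteria and weights before scoring any vendor keeps the evaluation tied to the requirements defined up front rather than to whichever tool was demonstrated last.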

FORESEEABLE CHALLENGES
Business urgency, time to market pressures
• A Big Data / Data Science start-up needs careful planning
• Big Data needs infrastructure, software stacks, people, and a start-up plan
Lack of Big Data Resources, Lack of Sponsorships (except in some companies)
• Big Data is complex and requires multiple skill sets (mostly new to many companies): infrastructure, administration,
security, programming, testing, etc.
• Skepticism about Big Data
Integration with Existing Technologies and Systems
• Big data solutions cannot be developed in isolation
• Integration with existing systems will be a top challenge (requires both sides to do additional work)
Open Source: Stability, Maturity, and Security

INFORMATION AS A PRODUCT/SERVICE
TYPES OF RELEVANT BUSINESS MODELS
Differentiation
New Services
Customer Experience
Contextual Relevance
Brokering
Raw Data
Benchmarking
Analysis and Insight
(Meta Data)
Delivery
Market Place
Facilitator
Advertising

Decisions & Insight
Analytics & Discovery
Data Access and Distribution
Data Collection & Organisation
Infrastructure Platform
Monitoring, Alerts, Tools, Security, Governance
• The technology stack is rapidly evolving, with traditional as well as new vendors all providing offerings.
• Open source tools remain at the foundation layers.
• Different use cases will require different technology tools.

Decisions & Insight
• IBM Watson
• Industry Specific
Analytics & Discovery
• SAP Business Objects
• IBM Cognos
• SAS Analytics
• Dell Statistica
• Oracle Hyperion
• Microsoft BI
• KNIME
• Pentaho
• Informatica

Data Access and Distribution
• Document: MongoDB, CouchDB
• Graph: Neo4j, Titan
• Key Value Pair: Riak, Redis
• Columnar: Cassandra, HBase
• Search: Lucene, Solr, Elasticsearch
Monitoring, Alerts, Tools, Security, Governance:
• Hadoop: Apache, Cloudera, Hortonworks, MapR, IBM
• SQL Mapping: Hive
• Big Data Transformation: Pig
• Hadoop Load: Sqoop
• Real-time ETL: Storm
• Cluster Computing: Apache Spark
• Languages: Python, Java, R, Scala

Data Collection & Organisation (Batch & Real-Time)
• Hadoop
• Hadoop MapReduce
• Mahout
Infrastructure Platform
• AWS
• Azure
• Mortar
• Google BigQuery
• Qubole
• Dell
• HP
• IBM
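
To help identify which level of the stack a conversation is about, the layers and example tools listed above could be held in a simple lookup. This is a minimal sketch: the layer and tool names come from the slides (only a subset is included), while the dictionary layout and the layers_for helper are assumptions made for illustration.

```python
# Layers of the reference architecture, ordered from the top of the stack down.
# Only a subset of the tools named on the slides is included here.
REFERENCE_ARCHITECTURE = {
    "Decisions & Insight": ["IBM Watson"],
    "Analytics & Discovery": ["SAP Business Objects", "IBM Cognos", "SAS Analytics", "KNIME", "Pentaho"],
    "Data Access and Distribution": ["MongoDB", "CouchDB", "Neo4j", "Riak", "Redis", "Cassandra", "Solr"],
    "Data Collection & Organisation": ["Hadoop", "Hadoop MapReduce", "Mahout"],
    "Infrastructure Platform": ["AWS", "Azure", "Google BigQuery", "Qubole"],
}


def layers_for(tool):
    """Return every layer that lists the given tool (case-insensitive match)."""
    needle = tool.casefold()
    return [
        layer
        for layer, tools in REFERENCE_ARCHITECTURE.items()
        if needle in (name.casefold() for name in tools)
    ]


if __name__ == "__main__":
    for tool in ("neo4j", "Mahout", "AWS"):
        print(tool, "->", layers_for(tool))
```

A tool that is not in the mapping returns an empty list, which is a useful prompt to ask where a proposed technology sits before discussing it further.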

BIG DATA & DATA SCIENCE
START-UP FOCUS POINTS
@TomZorde
Thank you
