Agile development of data science projects | Part 1
Anubhav Dhiman | July 18, 2018 | Berlin
What is data science?
Data science focuses on predicting something,
prescribing something, or in some cases explaining
something, making it distinct from Business Intelligence
(BI), which focuses on backward-looking factual
reporting (describing something that happened).
It is also distinct from big data storage and processing
technologies like Hadoop and Spark. These tools are
valuable inputs into the quantitative research process
but are insufficient to realise the full potential of data
science.
Successful organizations coordinate all three areas
(data science, BI, and big data) to achieve maximum
value.
Broadly, data science encompasses
quantitative research, advanced analytics,
predictive modelling, and machine learning.
How reliably and sustainably can a data science team
deliver value for organizations?
Source: Domino Data Lab
Delivery
9. System proven in operational environment
8. System complete and qualified
7. Prototype demonstrated in operational environment
6. Algorithm integrated in development
5. Algorithm validated against production data
Discovery
4. Algorithm validated against sample data
3. Experimental proof of concept
2. Data explored and described
1. Algorithm design and development
Data Science Readiness Levels
Source: Emily Gorcenski
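
The nine readiness levels and their two phases can be written down as a small lookup. This is only an illustrative sketch: the enum member names and the `phase` helper are hypothetical, and the split (levels 1 to 4 as Discovery, 5 to 9 as Delivery) is read directly off the slide.

```python
from enum import IntEnum


class ReadinessLevel(IntEnum):
    """Levels as listed on the slide; member names are illustrative."""
    ALGORITHM_DESIGN_AND_DEVELOPMENT = 1
    DATA_EXPLORED_AND_DESCRIBED = 2
    EXPERIMENTAL_PROOF_OF_CONCEPT = 3
    VALIDATED_AGAINST_SAMPLE_DATA = 4
    VALIDATED_AGAINST_PRODUCTION_DATA = 5
    INTEGRATED_IN_DEVELOPMENT = 6
    PROTOTYPE_IN_OPERATIONAL_ENVIRONMENT = 7
    SYSTEM_COMPLETE_AND_QUALIFIED = 8
    SYSTEM_PROVEN_IN_OPERATIONAL_ENVIRONMENT = 9


def phase(level: ReadinessLevel) -> str:
    """Return the phase a level belongs to: Discovery (1-4) or Delivery (5-9)."""
    if level <= ReadinessLevel.VALIDATED_AGAINST_SAMPLE_DATA:
        return "Discovery"
    return "Delivery"


if __name__ == "__main__":
    for level in ReadinessLevel:
        print(level.value, level.name, phase(level))
```

A project tracker could store one such level per model and use `phase` to decide which review applies.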
Can we solve the problem as stated?
Data Scientists, Data Engineers
What does an MVP look like?
Data Scientists, Data Engineers
+ Designers, Product Managers
How do we build the MVP?
Data Scientists, Data Engineers
+ Designers, Product Managers
+ Infra, Backend, Frontend
How do we ship the MVP?
Data Scientists, Data Engineers
+ Designers, Product Managers
+ Infra, Backend, Frontend
+ QA, Legal
How do we improve the MVP?
Data Scientists, Data Engineers
+ Designers, Product Managers
+ Infra, Backend, Frontend
+ QA, Legal
+ CR, Analytics
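
The way the team grows from one question to the next can be sketched as an ordered list of stages, each adding roles to the team. The order in which groups join is inferred from the slide sequence, and `STAGES` and `team_at` are hypothetical names used only for illustration.

```python
# Each stage pairs the guiding question with the roles that join at that point.
# The joining order is an assumption read off the sequence of slides.
STAGES = [
    ("Can we solve the problem as stated?", ["Data Scientists", "Data Engineers"]),
    ("What does an MVP look like?", ["Designers", "Product Managers"]),
    ("How do we build the MVP?", ["Infra", "Backend", "Frontend"]),
    ("How do we ship the MVP?", ["QA", "Legal"]),
    ("How do we improve the MVP?", ["CR", "Analytics"]),
]


def team_at(stage_index: int) -> list[str]:
    """Return every role involved up to and including the given stage (0-based)."""
    team: list[str] = []
    for _question, joiners in STAGES[: stage_index + 1]:
        team.extend(joiners)
    return team


if __name__ == "__main__":
    for index, (question, _joiners) in enumerate(STAGES):
        print(f"{question} -> {', '.join(team_at(index))}")
```

Keeping the stages in one ordered structure makes it easy to see who needs to be in the room at each step.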
How to make collaboration easier across the organization?
Source: Louis Dorard
From:
1. background to specifics
2. domain integration to predictive engine
[Figure: diagram with boxes numbered 1 to 10]
Source: Louis Dorard
Up Next … Part 2
- Data Science Lifecycle
- Developing and Deploying AI solutions
