11/4/22, 12:10 PM UMQAV - Jupyter Notebook
NumPy, Data Science, and IMQAV
Ingest
Model
Query
Analyze
Visualize
Application of IMQAV
Organization
Architecture
Set of Tasks
Ingest
Ingestion is a set of software engineering techniques for adapting to high volumes of data that arrive rapidly (often via
streaming).
Kafka
RabbitMQ
Fluentd
Sqoop
Kinesis (AWS)
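One common way to cope with rapidly arriving data is micro-batching: grouping records into fixed-size batches before passing them downstream. The sketch below is a minimal illustration using only the Python standard library. The names `micro_batches` and `sensor_stream` are hypothetical, and the code does not use the API of Kafka or any other tool listed above.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(stream: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group records from a (possibly unbounded) stream into lists of at most batch_size."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:  # the stream is exhausted
            return
        yield batch


def sensor_stream(n: int) -> Iterator[dict]:
    """Hypothetical source that simulates n readings arriving one at a time."""
    for i in range(n):
        yield {"id": i, "value": i * 0.5}


# Seven records with a batch size of three: two full batches and one partial batch.
batch_sizes = [len(batch) for batch in micro_batches(sensor_stream(7), batch_size=3)]
print(batch_sizes)
```

Because `micro_batches` is a generator, it never holds more than one batch in memory, which is the property that matters when the source is effectively unbounded.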
Model
Modeling is a set of data architecture techniques to create data storage that is appropriate for a particular
domain.
Relational
MySQL
Postgres
RDS (AWS)
Key Value
Redis
Riak
DynamoDB (AWS)
Columnar
Cassandra
HBase
Redshift (AWS)
Document
MongoDB
Elasticsearch
Couchbase
Graph
localhost:8888/notebooks/Desktop/Ex_Files_NumPy_Data_EssT/Exercise Files/Ch 0/00_03/Finish/UMQAV.ipynb 1/3
Neo4j
OrientDB
ArangoDB
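To make the difference between storage models concrete, the sketch below stores the same record three ways: relational, key-value, and document. It is only an illustration under stated assumptions: in-memory SQLite stands in for a relational server, a plain `dict` stands in for a key-value store, and parsed JSON stands in for a document store. The `users` table and its fields are hypothetical, and the columnar and graph models are not covered.

```python
import json
import sqlite3

# Relational: fixed schema, queried with SQL (in-memory SQLite as a stand-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?, ?)", (1, "Ada", "London"))
relational_row = conn.execute(
    "SELECT name, city FROM users WHERE id = ?", (1,)
).fetchone()
conn.close()

# Key-value: an opaque value looked up by its key (a dict as a stand-in).
kv_store = {"user:1": "Ada|London"}
kv_value = kv_store["user:1"]

# Document: a self-describing, nested record (parsed JSON as a stand-in).
document = json.loads('{"_id": 1, "name": "Ada", "address": {"city": "London"}}')
doc_city = document["address"]["city"]

print(relational_row, kv_value, doc_city)
```

The relational form lets the database filter on any column, the key-value form is fast but leaves parsing to the application, and the document form keeps nested structure without requiring a fixed schema.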
Query
Query refers to extracting data (from storage) and modifying that data to accommodate anomalies such as
missing data.
Batch
MapReduce
Spark
Elastic MapReduce (AWS)
Batch SQL
Hive
Presto
Drill
Streaming
Storm
Spark Streaming
Samza
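On a small scale, accommodating missing data can be sketched with NumPy. The example below assumes a hypothetical column of readings in which `np.nan` marks missing values, and shows two common options: dropping the missing entries or filling them with the mean of the observed ones.

```python
import numpy as np

# Hypothetical extracted column; np.nan marks the missing readings.
readings = np.array([21.5, np.nan, 19.0, 22.5, np.nan])

missing = np.isnan(readings)          # boolean mask of missing entries
mean_observed = np.nanmean(readings)  # mean that ignores NaN values

dropped = readings[~missing]                         # option 1: drop missing values
filled = np.where(missing, mean_observed, readings)  # option 2: impute with the mean

print(dropped)
print(filled)
```

Which option is appropriate depends on the analysis: dropping changes the sample size, while mean imputation keeps the size but reduces the variance.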
Analyze
Analyze is a broad category that includes techniques from computer science, mathematical modeling, artificial
intelligence, statistics, and other disciplines.
NumPy is included within 'Analyze'
Statistics
SPSS
SAS
R
Statsmodels
SciPy
Pandas
Optimization and Mathematical Modeling (SciPy and other libraries)
Linear, Integer, and Dynamic Programming
Gradient and Lagrange methods
Machine Learning
Batch
H2O
Mahout
SparkML
Interactive
scikit-learn
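As a small illustration of how NumPy supports the Analyze step, the sketch below fits a straight line to noisy data with a plain gradient method and compares the result to NumPy's closed-form least-squares solver. The synthetic data, the learning rate, and the iteration count are assumptions chosen for the example, not values from the course material.

```python
import numpy as np

# Synthetic data (assumed for illustration): y = 2x + 1 plus small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.05, size=x.size)

# Design matrix with a slope column and an intercept column.
A = np.column_stack([x, np.ones_like(x)])

# Gradient descent on the mean squared error ||A w - y||^2 / n.
w = np.zeros(2)
learning_rate = 0.5
for _ in range(5000):
    grad = (2.0 / len(y)) * A.T @ (A @ w - y)
    w -= learning_rate * grad

# Closed-form least-squares solution for comparison.
w_closed, *_ = np.linalg.lstsq(A, y, rcond=None)

print("gradient descent:", w)
print("least squares:   ", w_closed)
```

Both approaches should agree closely on the slope and intercept. In practice the closed-form solver is preferable for a linear problem like this one; the gradient loop is shown because the same pattern extends to models with no closed-form solution.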
Visualize
Visualize refers to transforming data into visually attractive and informative formats.
matplotlib
seaborn
bokeh
pandas
D3
Tableau
Leaflet
Highcharts
Kibana
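A minimal matplotlib sketch of this step follows. It assumes matplotlib and NumPy are installed, selects the non-interactive `Agg` backend so it runs without a display, and renders to an in-memory buffer rather than a file. The sine-wave data is only a placeholder for real results.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: one period of a sine wave.
x = np.linspace(0.0, 2.0 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x (radians)")
ax.set_ylabel("sin(x)")
ax.set_title("Sine wave")
ax.legend()

# Render the figure as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

In a Jupyter notebook the figure is displayed inline, so the backend selection and the buffer are unnecessary there.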