Data Analysis Library
By
Muthu Priya J
19MZ06
S.NO. Library Name Functions
1 Pandas
• Pandas is designed for quick and easy data manipulation, reading, aggregation and visualization (plotting the data through a histogram or box plot).
• Pandas takes data from a CSV or TSV file or a SQL database and creates a Python object with rows and columns called a data frame.
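As a minimal sketch of the read-then-aggregate workflow described for Pandas, the snippet below loads CSV text into a data frame and sums one column per group. The column names and values are hypothetical, and an in-memory string stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a path to a CSV file.
csv_text = "city,sales\nChennai,10\nChennai,20\nMadurai,5\n"

# read_csv builds a DataFrame (rows and columns) from the CSV data.
df = pd.read_csv(io.StringIO(csv_text))

# Aggregate: total sales per city.
totals = df.groupby("city")["sales"].sum()
print(totals)
```

The same `groupby` pattern should apply to data loaded from TSV files (`sep="\t"`) or from a SQL query.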
2 NumPy
• The main object of NumPy is the homogeneous multidimensional array.
• The dimensions are called axes.
• The number of axes is called the rank.
• NumPy facilitates math operations on arrays and their vectorization, which significantly enhances performance.
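The axes, rank and vectorization ideas above can be illustrated with a small array. This is only a sketch; the array contents are arbitrary.

```python
import numpy as np

# A homogeneous 2-D array: 2 rows, 3 columns, all float64.
a = np.arange(6, dtype=np.float64).reshape(2, 3)

print(a.ndim)   # number of axes (the rank)
print(a.shape)  # length of each axis

# Reduce along axis 0 (down the rows): one sum per column.
col_sums = a.sum(axis=0)

# Vectorized arithmetic: applied to every element without a Python loop.
scaled = a * 2.0 + 1.0
```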
3 SciPy
• SciPy uses arrays as its basic data structure.
• It has various modules to perform common scientific programming tasks like:
  • Linear algebra
  • Ordinary differential equations
  • Integration
  • Signal processing
  • Calculus
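Two of the tasks listed for SciPy, linear algebra and integration, might look like the following sketch. The system of equations and the integrand are made up for illustration.

```python
import numpy as np
from scipy import integrate, linalg

# Linear algebra: solve 3x + y = 9 and x + 2y = 8.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Integration: numerically integrate t^2 from 0 to 1.
# quad returns the estimate and an estimate of the absolute error.
area, abs_err = integrate.quad(lambda t: t**2, 0.0, 1.0)

print(x, area)
```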
4 Matplotlib
• Matplotlib is the plotting library for Python.
• It provides an object-oriented API for embedding plots into applications.
• It helps to generate data visualizations like:
  • Line plots
  • Scatter plots
  • Area plots
  • Bar charts and histograms
  • Pie charts
  • Stem plots
  • Contour plots
  • Quiver plots
  • Spectrograms
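A short sketch of the object-oriented API mentioned above, drawing a line plot and a histogram on two axes of one figure. The data is arbitrary, and the non-interactive `Agg` backend is an assumption made so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt

# One Figure holding two Axes objects side by side.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

xs = [0, 1, 2, 3, 4]
ys = [x**2 for x in xs]
ax_line.plot(xs, ys, marker="o")
ax_line.set_title("Line plot")

ax_hist.hist([1, 2, 2, 3, 3, 3, 4], bins=4)
ax_hist.set_title("Histogram")

fig.tight_layout()

# Render to an in-memory PNG; a file path would work the same way.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```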
5 Seaborn
• Seaborn is an extension of Matplotlib with advanced features.
• Seaborn is used:
  • To determine relationships between multiple variables
  • To analyze uni-variate or bi-variate distributions and compare them between different data subsets
  • To plot linear regression models for dependent variables
6 Scikit-learn
• Scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface.
• Classification, clustering, regression, dimensionality reduction, visualization, model selection and pre-processing can be done with Scikit-learn.
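The "consistent interface" can be seen in the sketch below: a supervised regressor and an unsupervised clusterer are both trained with `fit`. The toy data is invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Supervised: fit y = 2x + 1 from four points, then predict at x = 4.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
reg = LinearRegression().fit(X, y)
pred = reg.predict(np.array([[4.0]]))

# Unsupervised: group four 2-D points into two clusters.
points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = km.labels_

print(pred, labels)
```

Which cluster receives label 0 is arbitrary; only the grouping of points is meaningful.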
7 TensorFlow
• TensorFlow is an AI library that helps developers create large-scale neural networks with many layers using data flow graphs.
• TensorFlow also facilitates the building of deep learning models and allows easy deployment of ML-powered applications.
8 Keras
• Keras is TensorFlow's high-level API for building and training deep neural networks.
• It is an open-source neural network library in Python.
• Its simplified coding for deep learning makes statistical modeling and working with images and text a lot easier.
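A minimal sketch of building and training a small network through Keras as bundled with TensorFlow (`tf.keras`). The layer sizes, the random training data and the two-epoch run are arbitrary choices for illustration, not a recommended configuration.

```python
import numpy as np
import tensorflow as tf

# A small feed-forward network: 4 inputs -> 8 hidden units -> 1 output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Hypothetical training data: the target is the sum of the four inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4)).astype("float32")
y = X.sum(axis=1, keepdims=True)

model.fit(X, y, epochs=2, verbose=0)
preds = model.predict(X[:3], verbose=0)
print(preds.shape)
```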
9 Statsmodels
• Statsmodels provides easy computations for descriptive statistics, and for estimation and inference for statistical models.
• It is used for:
  • Linear regression
  • Correlation
  • Generalized linear models and Bayesian models
  • Uni-variate & bi-variate analysis
  • Hypothesis testing
10 Plotly
• Plotly is a quintessential graph plotting library.
• Users can import, copy, paste or stream data that is to be analyzed and visualized.
• The Plotly graph library has a wide range of graphs including:
  • Basic charts
  • Statistical and Seaborn-style charts
  • Scientific charts
  • Financial charts
  • Maps
  • Subplots
  • Transforms
  • Jupyter widgets interaction