Francisco Villarreal-Valderrama
Dec 15, 2021
Introduction to data-science tools
Data science is an interdisciplinary approach to extracting
knowledge from large volumes of noisy, structured, and unstructured
data. It encompasses preparing data for analysis and processing,
performing advanced data analysis, and presenting the results to
reveal patterns.
The process of data mining and analysis applies
mathematics, statistics, computer science, information science,
and domain knowledge to build stories that clearly convey the
meaning of the results to decision-makers and stakeholders at
every level of technical knowledge and understanding. This is the
role of the data scientist: someone who writes programming code
and combines it with statistical knowledge to explain how the
obtained results can be used to solve business problems.
As a scientific field, data science unifies scientific methods,
processes, algorithms, and systems into a set of tools based on
statistics, data analysis, and informatics. Data science is
closely related to data mining, machine learning, and big data. The
most common tools include:
Linear algorithms
Linear regression
It creates numerical predictions using the best linear fit of a
data set. The resulting model is easy to understand and shows the
biggest drivers of the results. Nonetheless, it can be too simple to
capture more complex relationships among the variables.
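For illustration, a minimal sketch of fitting a linear regression with scikit-learn follows; the data set and coefficients below are made up only to show the workflow.

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: one feature with a roughly linear relationship to the target
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1, size=100)

model = LinearRegression()
model.fit(X, y)

# The fitted slope and intercept show the biggest drivers of the prediction
print(model.coef_, model.intercept_)
print(model.predict([[4.0]]))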
Logistic regression
This is an adaptation of linear regression to classification problems.
Similarly, it is easy to understand but not powerful enough to handle
complex relationships between the variables.
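A minimal sketch with scikit-learn, again on made-up data, shows how the same fitting interface adapts to a yes/no classification problem.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary labels: class 1 when the two features add up to a positive value
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = LogisticRegression()
clf.fit(X, y)

# Predicted class and class probabilities for a new sample
print(clf.predict([[0.5, 0.5]]))
print(clf.predict_proba([[0.5, 0.5]]))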
Principal Component Analysis
It is a data-compression tool based on the correlation among the
data variables. Its applications include anomaly detection and
prediction. It’s often combined with other tools to yield better
results.
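As a rough sketch, scikit-learn's PCA can compress a set of correlated features into a few principal components; the correlated data below is fabricated for the example.

import numpy as np
from sklearn.decomposition import PCA

# Build 5 features that are linear mixtures of 2 underlying signals
rng = np.random.default_rng(2)
base = rng.normal(size=(100, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 3))])

# Compress the 5 correlated features down to 2 principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Fraction of the original variance captured by each component
print(pca.explained_variance_ratio_)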
Tree-based
Decision tree
This algorithm consists of a series of yes/no rules based on the
data features, forming a decision tree that covers all the possible
outcomes of the process. It’s an easy-to-understand algorithm but
can become very large when handling complex data sets.
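A short sketch using scikit-learn's built-in iris data set shows how the learned yes/no rules can be printed and inspected; the depth limit here is an arbitrary choice for readability.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a small tree on the classic iris data set
data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Print the learned yes/no rules, which keeps the model easy to understand
print(export_text(tree, feature_names=list(data.feature_names)))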
Random forest
It combines many decision trees, each with rules learned from the
data itself, into a single, more powerful predictor with better
overall performance. It tends to give high-quality results at the
cost of large models that are hard to interpret.
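The sketch below, using a standard scikit-learn data set, combines two hundred trees into one ensemble; the number of trees is an illustrative value, not a tuned setting.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hold out part of the data to measure overall performance
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree learns its own rules from a random sample of the data, then they vote
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))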
Gradient boosting
It uses a sequence of simpler decision trees, each one increasingly
focused on the examples the previous trees got wrong. It is a
high-performance tool but gives very case-specific results. That is,
a small change in the feature set can create radical changes in the
model.
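A minimal sketch with scikit-learn's GradientBoostingClassifier illustrates the idea on the same kind of standard data set; the hyperparameters are illustrative, not tuned.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each shallow tree is fit to correct the errors left by the previous trees
booster = GradientBoostingClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
booster.fit(X_train, y_train)
print(booster.score(X_test, y_test))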
Neural networks
General neural network models
It consists of interconnected neurons that pass messages to each
other, with layers of neurons stacked on top of one another. These
models can handle extremely complex tasks but are very slow to
train and often have a complex architecture. Neural network models
stand out in image recognition and classification problems.
Nonetheless, their use as predictors is limited, since it is very
hard to understand how they arrive at their outcomes.
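As a rough sketch, scikit-learn's MLPClassifier stacks layers of neurons for a small image-classification task; the layer sizes and iteration limit are arbitrary example values.

from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Small image-recognition task: 8x8 pixel images of handwritten digits
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two stacked hidden layers of neurons; training is slower than the models above
net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
net.fit(X_train, y_train)
print(net.score(X_test, y_test))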