Types of Data
A common term in data science and computer science is the data type, which refers to the nature of the values in the data: character, integer, float, boolean, date, time, and so on.
Going forward in this article, when a data type is mentioned, it will refer to these types.
The categories discussed ahead, however, refer to the variables/samples you come across when performing statistical computing or other data science operations.
While one can divide variables based on the aforementioned data types, the following categorization is more useful from a data mining and statistical computing point of view:
1. Quantitative
2. Qualitative
Quantitative Data
Quantitative data refers to data that is generated by counting or measuring something. Generally,
quantitative refers to numerical data.
It’s important to note that ‘numerical’ does not mean the data type will always be numeric; in some cases it can be a string. What matters is that the value represents a number that can be counted or measured.
For example, if a variable in a dataframe provides the number of cars a household has and has
values such as ‘one,’ ‘two,’ ‘three,’ and so on, then this variable will be considered quantitative
data even when its data type is ‘string’.
Therefore, quantitative data can have either a numeric or a string/character data type. In both cases the information is gathered by counting or measuring something, and after a suitable transformation, statistical analysis can be performed on it (see the sketch below).
Thus quantitative data answers questions of the type ‘how many’, ‘how much’, ‘how often’ etc.
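As an illustration, here is a minimal Python/pandas sketch of how such string-valued counts can be mapped to numbers so they can be analyzed quantitatively; the num_cars column name and values are hypothetical.

import pandas as pd

# Hypothetical dataframe: the car counts are stored as words (string data type)
df = pd.DataFrame({"num_cars": ["one", "two", "three", "two", "one"]})

# Map the word labels to integers so the variable can be treated as quantitative
word_to_int = {"one": 1, "two": 2, "three": 3}
df["num_cars_numeric"] = df["num_cars"].map(word_to_int)

print(df["num_cars_numeric"].mean())  # statistical analysis is now possible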
Quantitative data can be categorized as:
1. Continuous vs. discrete
2. Ratio vs. interval
A discrete variable expresses data in countable numbers, i.e., integers. Therefore, data is known
as discrete data whenever a count of individual items is recorded.
Typically discrete data includes numeric, finite, non-negative countable integers. Examples of such
data include the number of houses, the number of students in a class, etc.
Continuous data differs from discrete data as it may or may not be comprised of whole numbers,
can consider any value in a specific range, and can take on an infinite number of values.
Another difference is that decimal numbers or fractions can appear in continuous data, whereas this is not the case with discrete data. Examples of such data include weight, height, distance, volume, etc.
Another way quantitative data can be categorized is Ratio and Interval data.
Ratio data is where a true zero exists, and there is an equal interval between neighboring points.
Here a zero indicates a total absence of a value.
For example, a zero measurement of population, length, or area would mean an absence of the
subject. Another example can be temperature measured in Kelvin, where zero indicates a
complete lack of thermal energy.
Interval data is similar to ratio data, with the difference being that there is no true zero point in
such data.
For example, the temperature measured in Celsius is an example of interval data where zero
doesn’t indicate an absence of temperature.
Qualitative Data
Qualitative data is fundamentally different from quantitative data because it cannot simply be measured or counted; it is more descriptive. It relies on language, i.e., characters, rather than numerical values to provide information.
However, when the qualitative data is encoded, it can be represented through numbers.
Note: Qualitative data is often called categorical data. In contrast, quantitative data is often referred to as numerical data (not to be confused with the numeric data type; as mentioned earlier, in special cases quantitative data can have a string data type).
Qualitative data can be categorized into:
1. Binary
2. Ordinal
3. Nominal
Binary Data
Binary Data is also known as dichotomous variables, where there are always two possible
outcomes. For example, Yes/No, Heads/Tails, Win/Lose, etc.
Ordinal Data
Ordinal data is where we have different categories with a natural rank order. For example, a
variable with pain level has five categories: ‘no pain’, ‘mild pain’, ‘moderate pain’, ‘high pain’, and
‘severe pain’.
Here if you notice, there is a natural order to the categories as the categories can be ranked from
the category indicating lower pain intensity to higher.
Identifying ordinal variables is important because when the data is being prepared for statistical
modeling, the categorical variable is often encoded, i.e., represented in numbers. The categorical
variable type dictates the encoding mechanism to be used.
Label encoding is used for ordinal variables where the categories are ranked, and value labels are
provided to them from 1 to N (N being the number of unique categories in the variable).
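As a minimal sketch in Python/pandas, ordinal categories can be declared with an explicit order and then label-encoded; the pain-level categories are the ones from the example above, while the variable itself is hypothetical.

import pandas as pd

# Hypothetical ordinal variable with a natural rank order
pain = pd.Series(["mild pain", "no pain", "severe pain", "moderate pain", "high pain"])

# Declare the ordering explicitly, then encode the categories as 1..N
levels = ["no pain", "mild pain", "moderate pain", "high pain", "severe pain"]
pain_cat = pd.Categorical(pain, categories=levels, ordered=True)
pain_encoded = pain_cat.codes + 1  # codes are 0-based, so add 1 to get labels 1..N

print(list(pain_encoded))  # [2, 1, 5, 3, 4]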
An important thing to note here is that ordinal data can be confused with several other data types.
For example, ordinal data, when encoded, can resemble discrete data.
Ordinal data can be confused with interval data.
The difference is that the distance between the two categories is unknown in ordinal data. In
contrast, the distance between two adjacent values is known and fixed in interval data.
For example, if the data provides a pain scale from 0 to 10, with 0 indicating no pain and 10 indicating severe pain, then such data is interval.
Here, the values have fixed measurement units that are of equal and known size.
On the other hand, if the data has five categories (‘no pain’, ‘mild pain’, ‘moderate pain’, ‘high pain’, and ‘severe pain’), then the data is considered ordinal because we can’t quantify the difference between one category and another.
Nominal Data
Another type of qualitative data is nominal data. Here the categories of the data are mutually
exclusive and cannot be ordered in a meaningful way.
For example, if a variable indicates a mode of transportation with categories like ‘bus’, ‘car’, ‘train’,
and ‘motorcycle’, then such data is nominal. Other examples can include zip code and genre of
music.
Nominal variables are encoded during the data preparation process using a method known as
one-hot encoding (also known as dummy variable creation).
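A minimal pandas sketch of one-hot encoding, using the mode-of-transportation example above (the column name is hypothetical):

import pandas as pd

# Hypothetical nominal variable: mode of transportation
df = pd.DataFrame({"transport": ["bus", "car", "train", "motorcycle", "car"]})

# One-hot encoding (dummy variables): one 0/1 column per category
dummies = pd.get_dummies(df["transport"], prefix="transport")
print(dummies.head())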
It’s important to note that you can get confused between the two types of categorical variables.
For example, the variable indicating the color of a car and having categories like ‘green’, ‘yellow’,
and ‘red’ is nominal. In contrast, the same categories can be a part of an ordinal variable where
the colors indicate a place’s danger level, with ‘green’ indicating safe and ‘red’ indicating unsafe.
Similarly, a variable indicating the temperature of an object with categories like ‘cold’ and ‘hot’ may seem binary. Still, it is possible to have another category, such as ‘mild’, in which case the variable is no longer binary.
6 steps in data processing
1. Data collection
The first stage of data collection involves gathering and discovering raw data from various
sources, such as sensors, databases, or customer surveys. It is essential to ensure the collected
data is accurate, complete, and relevant to the analysis or processing goals. Care must be taken
to avoid selection bias, where the method of collecting data inadvertently favors certain outcomes
or groups, potentially skewing results and leading to inaccurate conclusions.
2. Data preparation
Once the data is collected, it moves to the data preparation stage. Here, the raw data is cleaned
up, organized, and often enriched for further processing. This stage involves checking for errors,
removing any bad data (redundant, incomplete, or incorrect), and enhancing the dataset with
additional relevant information from external sources, a process known as data enrichment. Data
preparation aims to create high-quality, reliable, and comprehensive data for subsequent
processing steps.
3. Data input
The next stage is data input. In this stage, the clean and prepped data is fed into a processing
system, which could be software or an algorithm designed for specific data types or analysis
goals. Various methods, such as manual entry, data import from external sources, or automatic
data capture, can be used to input data into the processing system.
4. Data processing
In the data processing stage, the input data is transformed, analyzed, and organized to produce
relevant information. Several data processing techniques, like filtering, sorting, aggregation, or
classification, may be employed to process the data. The choice of methods depends on the
desired outcome or insights from the data.
5. Data output and interpretation
The data output and interpretation stage deals with presenting the processed data in an easily
digestible format. This could involve generating reports, graphs, or visualizations that simplify
complex data patterns and help with decision-making. Furthermore, the output data should be
interpreted and analyzed to extract valuable insights and knowledge.
6. Data storage
Finally, in the data storage stage, the processed information is securely stored in databases
or data warehouses for future retrieval, analysis, or use. Proper storage ensures data longevity,
availability, and accessibility while maintaining data privacy and security.
Batch processing
Batch processing involves handling large volumes of data collectively at predetermined times,
making it ideal for non-time-sensitive tasks. This method allows organizations to efficiently
manage data by aggregating it and processing it during off-peak hours to minimize the impact on
daily operations.
Example: Financial institutions batch process checks and transactions overnight, updating account
balances in one comprehensive sweep to ensure accuracy and efficiency.
Real-time processing
Real-time processing is essential for tasks that require immediate handling of data upon receipt,
providing instant processing and feedback. This type of processing is crucial for applications
where delays cannot be tolerated, ensuring timely decisions and responses.
Example: GPS navigation systems rely on real-time processing to offer turn-by-turn directions,
adjusting routes based on live traffic and road conditions to ensure the fastest path.
Multiprocessing
Multiprocessing divides a single task across multiple processors or machines that work on it simultaneously, shortening the time needed for computation-heavy workloads.
Example: Movie production often utilizes multiprocessing for rendering complex 3D animations. By distributing the rendering across multiple computers, the overall project's completion time is significantly reduced, leading to faster production cycles and improved visual quality.
Online processing
Online processing facilitates the interactive processing of data over a network, with continuous
input and output for instant responses. It enables systems to handle user requests immediately,
making it an essential component of e-commerce and online services.
Example: Online banking systems utilize online processing for real-time financial transactions,
allowing users to transfer funds, pay bills, and check account balances with immediate updates.
Manual data processing
Manual data processing requires human intervention for the input, processing, and output of data,
typically without the aid of electronic devices. This labor-intensive method is prone to errors but
was common before the advent of computerized systems.
Example: Before the widespread use of computers, libraries cataloged books manually, requiring
librarians to carefully record each book's details by hand for inventory and retrieval purposes.
Mechanical data processing
Mechanical data processing uses machines or equipment to manage and process data tasks, a
prevalent method before the digital era. This approach involved using tangible, mechanical
devices to input, process, and output data.
Example: Voting in the early 20th century often involved mechanical lever machines, where votes
were tallied by pulling levers for each choice, simplifying vote counting and reducing the potential
for errors.
Electronic data processing
Electronic data processing employs computers and digital technology to process, store, and
communicate data with efficiency and accuracy. This modern approach to data handling allows for
rapid processing speeds, vast storage capabilities, and easy data retrieval.
Example: Retailers use electronic data processing at checkouts, where barcode scans instantly
update inventory systems and process sales, enhancing checkout speed and inventory
management.
Distributed processing
Distributed processing spreads data and computation across multiple interconnected machines that work on a task in parallel, improving throughput and fault tolerance.
Example: Video streaming services use distributed processing to deliver content efficiently. By
storing videos on multiple servers, they ensure smooth playback and quick access for users
worldwide.
Cloud computing
Cloud computing offers computing resources, such as servers, storage, and databases, over the
internet, providing flexibility and scalability. This model enables users to access and utilize
computing resources as needed, without the burden of maintaining physical infrastructure.
Example: Small businesses leverage cloud computing for data storage and software services,
avoiding the need for significant upfront hardware investments and allowing easy scaling as the
business grows.
Automatic data processing
Automatic data processing uses software to automate routine tasks, reducing the need for manual
input and increasing operational efficiency. This method streamlines repetitive processes,
minimizes human error, and frees up personnel for more strategic tasks.
Example: Automated billing systems in telecommunications automatically calculate and send out
monthly charges to customers, streamlining billing operations and reducing errors.
Data processing is the key to unlocking the potential of raw data, transforming it into the
knowledge that shapes our future. By systematically analyzing and interpreting data, organizations
gain critical insights that inform strategic decisions, streamline processes, and drive innovation.
As the volume and complexity of data continue to expand, the ability to understand and effectively
process it will become even more essential for success in a data-driven world.
One sample median test
A one sample median test allows us to test whether a sample median differs significantly from a
hypothesized value. We will use the same variable, write, as we did in the one sample t-
test example above, but we do not need to assume that it is interval and normally distributed (we
only need to assume that write is an ordinal variable).
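The article discusses SPSS; as an illustrative alternative, a sign test (one form of the one sample median test) can be sketched in Python with SciPy (>= 1.7). The file name and the hypothesized median below are assumptions.

from scipy.stats import binomtest
import pandas as pd

# Assumes the hsb2 data have been loaded into a DataFrame with a 'write' column
df = pd.read_csv("hsb2.csv")          # hypothetical file name
hypothesized_median = 50              # hypothetical value to test against

above = (df["write"] > hypothesized_median).sum()
below = (df["write"] < hypothesized_median).sum()

# Sign test: under H0 the median equals the hypothesized value,
# so observations fall above/below it with probability 0.5
result = binomtest(above, n=above + below, p=0.5)
print(result.pvalue)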
Binomial test
A one sample binomial test allows us to test whether the proportion of successes on a two-level
categorical dependent variable significantly differs from a hypothesized value. For example, using
the hsb2 data file, say we wish to test whether the proportion of females (female) differs
significantly from 50%, i.e., from .5. We can do this as shown below.
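A sketch of the same binomial test in Python with SciPy (>= 1.7), assuming hsb2 has been exported to a CSV file and female is coded 0/1:

from scipy.stats import binomtest
import pandas as pd

# Assumes hsb2 is loaded and 'female' is coded 0/1 (1 = female)
df = pd.read_csv("hsb2.csv")  # hypothetical file name

successes = int(df["female"].sum())   # number of females
n = len(df)

# Test whether the proportion of females differs from .5
result = binomtest(successes, n=n, p=0.5)
print(result.pvalue)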
Chi-square goodness of fit
A chi-square goodness of fit test allows us to test whether the observed proportions for a
categorical variable differ from hypothesized proportions. For example, let’s suppose that we
believe that the general population consists of 10% Hispanic, 10% Asian, 10% African American
and 70% White folks. We want to test whether the observed proportions from our sample differ
significantly from these hypothesized proportions.
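A SciPy sketch of the goodness of fit test; the observed counts below are made up for illustration, while the hypothesized proportions are the ones stated above.

from scipy.stats import chisquare
import numpy as np

# Hypothetical observed counts for Hispanic, Asian, African American, White
observed = np.array([20, 25, 30, 125])
n = observed.sum()

# Hypothesized proportions: 10%, 10%, 10%, 70%
expected = n * np.array([0.10, 0.10, 0.10, 0.70])

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(stat, p)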
Independent samples t-test
An independent samples t-test is used when you want to compare the means of a normally
distributed interval dependent variable for two independent groups. For example, using the hsb2
data file, say we wish to test whether the mean for write is the same for males and females.
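An illustrative SciPy version of this independent samples t-test, assuming hsb2 is available as a CSV with write and female columns:

from scipy.stats import ttest_ind
import pandas as pd

# Assumes hsb2 is loaded with 'write' (score) and 'female' (0 = male, 1 = female)
df = pd.read_csv("hsb2.csv")  # hypothetical file name

males = df.loc[df["female"] == 0, "write"]
females = df.loc[df["female"] == 1, "write"]

# Two-sample t-test assuming equal variances
stat, p = ttest_ind(males, females)
print(stat, p)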
Wilcoxon-Mann-Whitney test
The Wilcoxon-Mann-Whitney test is the non-parametric analog of the independent samples t-test; it can be used when you do not assume that the dependent variable is a normally distributed interval variable (you only assume that the variable is at least ordinal).
Chi-square test
A chi-square test is used when you want to see if there is a relationship between two categorical
variables. In SPSS, the chisq option is used on the statistics subcommand of
the crosstabs command to obtain the test statistic and its associated p-value. Using the hsb2
data file, let’s see if there is a relationship between the type of school attended (schtyp) and
students’ gender (female). Remember that the chi-square test assumes that the expected value
for each cell is five or higher. This assumption is easily met in the examples below. However, if
this assumption is not met in your data, please see the section on Fisher’s exact test below.
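Outside SPSS, the same chi-square test of independence can be sketched in Python with pandas and SciPy (file name assumed):

from scipy.stats import chi2_contingency
import pandas as pd

# Assumes hsb2 is loaded with 'schtyp' (school type) and 'female'
df = pd.read_csv("hsb2.csv")  # hypothetical file name

# Build the crosstab and run the chi-square test of independence
table = pd.crosstab(df["schtyp"], df["female"])
chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p)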
Fisher’s exact test
The Fisher’s exact test is used when you want to conduct a chi-square test but one or more of
your cells has an expected frequency of five or less. Remember that the chi-square test assumes
that each cell has an expected frequency of five or more, but the Fisher’s exact test has no such
assumption and can be used regardless of how small the expected frequency is. In SPSS unless
you have the SPSS Exact Test Module, you can only perform a Fisher’s exact test on a 2×2 table,
and these results are presented by default. Please see the results from the chi squared example
above.
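A SciPy sketch of Fisher's exact test on the same 2×2 table (SciPy's fisher_exact likewise handles only 2×2 tables); the file name is an assumption.

from scipy.stats import fisher_exact
import pandas as pd

# Assumes hsb2 is loaded; the schtyp-by-female crosstab is 2x2
df = pd.read_csv("hsb2.csv")  # hypothetical file name

table = pd.crosstab(df["schtyp"], df["female"])
odds_ratio, p = fisher_exact(table)
print(odds_ratio, p)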
One-way ANOVA
A one-way analysis of variance (ANOVA) is used when you have a categorical independent
variable (with two or more categories) and a normally distributed interval dependent variable and
you wish to test for differences in the means of the dependent variable broken down by the levels
of the independent variable. For example, using the hsb2 data file, say we wish to test whether
the mean of write differs between the three program types (prog).
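As an illustrative alternative to the SPSS command, a one-way ANOVA of write by prog can be sketched with SciPy (file name assumed):

from scipy.stats import f_oneway
import pandas as pd

# Assumes hsb2 is loaded with 'write' and 'prog' (program type with three levels)
df = pd.read_csv("hsb2.csv")  # hypothetical file name

# One list of write scores per program type
groups = [g["write"].values for _, g in df.groupby("prog")]
stat, p = f_oneway(*groups)
print(stat, p)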
Kruskal Wallis test
The Kruskal Wallis test is used when you have one independent variable with two or more levels
and an ordinal dependent variable. In other words, it is the non-parametric version of ANOVA and
a generalized form of the Mann-Whitney test method since it permits two or more groups. We will
use the same data file as the one way ANOVA example above (the hsb2 data file) and the same
variables as in the example above, but we will not assume that write is a normally distributed
interval variable.
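A SciPy sketch of the Kruskal Wallis test for the same variables (file name assumed):

from scipy.stats import kruskal
import pandas as pd

# Assumes hsb2 is loaded with 'write' and 'prog'
df = pd.read_csv("hsb2.csv")  # hypothetical file name

groups = [g["write"].values for _, g in df.groupby("prog")]
stat, p = kruskal(*groups)
print(stat, p)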
Paired t-test
A paired samples t-test is used when you have two related observations (i.e., two observations per subject) and you want to see if the means of two normally distributed interval variables differ from one another. For example, using the hsb2 data file, we can test whether the mean of read is equal to the mean of write; in these data the difference is not statistically significant (p = .387).
Wilcoxon signed rank sum test
The Wilcoxon signed rank sum test is the non-parametric version of a paired samples t-test. You
use the Wilcoxon signed rank sum test when you do not wish to assume that the difference
between the two variables is interval and normally distributed (but you do assume the difference is
ordinal). We will use the same example as above, but we will not assume that the difference
between read and write is interval and normally distributed.
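A SciPy sketch of the Wilcoxon signed rank test for read versus write (file name assumed):

from scipy.stats import wilcoxon
import pandas as pd

# Assumes hsb2 is loaded with 'read' and 'write'
df = pd.read_csv("hsb2.csv")  # hypothetical file name

stat, p = wilcoxon(df["read"], df["write"])
print(stat, p)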
McNemar test
You would perform McNemar’s test if you were interested in the marginal frequencies of two
binary outcomes. These binary outcomes may be the same outcome variable on matched pairs
(like a case-control study) or two outcome variables from a single group. Continuing with
the hsb2 dataset used in several above examples, let us create two binary outcomes in our
dataset: himath and hiread. These outcomes can be considered in a two-way contingency table.
The null hypothesis is that the proportion of students in the himath group is the same as the
proportion of students in hiread group (i.e., that the contingency table is symmetric).
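A sketch of McNemar's test using statsmodels; the cut-off of 60 used to create himath and hiread below is an assumption for illustration.

import pandas as pd
from statsmodels.stats.contingency_tables import mcnemar

# Assumes hsb2 is loaded; himath/hiread are created with a hypothetical cut-off of 60
df = pd.read_csv("hsb2.csv")  # hypothetical file name
df["himath"] = (df["math"] > 60).astype(int)
df["hiread"] = (df["read"] > 60).astype(int)

table = pd.crosstab(df["himath"], df["hiread"])
result = mcnemar(table, exact=True)
print(result.statistic, result.pvalue)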
Repeated measures ANOVA
You would perform a one-way repeated measures analysis of variance if you had one categorical
independent variable and a normally distributed interval dependent variable that was repeated at
least twice for each subject. This is the equivalent of the paired samples t-test, but allows for two
or more levels of the categorical variable. This tests whether the mean of the dependent variable
differs by the categorical variable. We have an example data set called rb4wide, which is used in
Kirk’s book Experimental Design. In this data set, y is the dependent variable, a is the repeated
measure and s is the variable that identifies the subject.
Repeated measures logistic regression
If you have a binary outcome measured repeatedly for each subject and you wish to run a logistic
regression that accounts for the effect of multiple measures from single subjects, you can perform
a repeated measures logistic regression. In SPSS, this can be done using the GENLIN command
and indicating binomial as the probability distribution and logit as the link function to be used in the
model. The exercise data file contains 3 pulse measurements from each of 30 people assigned to
2 different diet regimens and 3 different exercise regimens. If we define a “high” pulse as being over 100, we can then predict the probability of a high pulse using diet regimen.
Factorial ANOVA
A factorial ANOVA has two or more categorical independent variables (either with or without the
interactions) and a single normally distributed interval dependent variable. For example, using
the hsb2 data file we will look at writing scores (write) as the dependent variable and gender
(female) and socio-economic status (ses) as independent variables, and we will include an
interaction of female by ses. Note that in SPSS, you do not need to have the interaction term(s)
in your data set. Rather, you can have SPSS create it/them temporarily by placing an asterisk
between the variables that will make up the interaction term(s).
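A statsmodels sketch of this factorial ANOVA with the female-by-ses interaction (file name assumed):

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Assumes hsb2 is loaded with 'write', 'female', and 'ses'
df = pd.read_csv("hsb2.csv")  # hypothetical file name

# Two-way (factorial) ANOVA including the interaction term
model = smf.ols("write ~ C(female) * C(ses)", data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)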
Friedman test
You perform a Friedman test when you have one within-subjects independent variable with two or
more levels and a dependent variable that is not interval and normally distributed (but at least
ordinal). We will use this test to determine if there is a difference in the reading, writing and math
scores. The null hypothesis in this test is that the distribution of the ranks of each type of score
(i.e., reading, writing and math) are the same. To conduct a Friedman test, the data need to be in
a long format. SPSS handles this for you, but in other statistical packages you will have to
reshape the data before you can conduct this test.
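A SciPy sketch of the Friedman test on the three score columns (file name assumed); SciPy accepts the repeated measurements as separate arrays, so no reshaping is needed.

from scipy.stats import friedmanchisquare
import pandas as pd

# Assumes hsb2 is loaded with 'read', 'write', and 'math' scores for each student
df = pd.read_csv("hsb2.csv")  # hypothetical file name

stat, p = friedmanchisquare(df["read"], df["write"], df["math"])
print(stat, p)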
Friedman’s chi-square has a value of 0.645 and a p-value of 0.724 and is not statistically
significant. Hence, there is no evidence that the distributions of the three types of scores are
different.
Ordered logistic regression
Ordered logistic regression is used when the dependent variable is ordered, but not continuous.
For example, using the hsb2 data file we will create an ordered variable called write3. This
variable will have the values 1, 2 and 3, indicating a low, medium or high writing score. We do not
generally recommend categorizing a continuous variable in this way; we are simply creating a
variable to use for this example. We will use gender (female), reading score (read) and social
studies score (socst) as predictor variables in this model. We will use a logit link and on
the print subcommand we have requested the parameter estimates, the (model) summary
statistics and the test of the parallel lines assumption.
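A sketch of an ordered logistic regression in Python using statsmodels' OrderedModel (available in statsmodels >= 0.12); the file name and the three-level recoding of write below are assumptions for illustration.

import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Assumes hsb2 is loaded; write3 is a hypothetical 3-level ordered recoding of write
df = pd.read_csv("hsb2.csv")  # hypothetical file name
df["write3"] = pd.cut(df["write"], bins=3, labels=["low", "medium", "high"])

# Ordered logit: write3 predicted from female, read, and socst
model = OrderedModel(df["write3"], df[["female", "read", "socst"]], distr="logit")
result = model.fit(method="bfgs")
print(result.summary())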
Factorial logistic regression
A factorial logistic regression is used when you have two or more categorical independent
variables but a dichotomous dependent variable. For example, using the hsb2 data file we will
use female as our dependent variable, because it is the only dichotomous variable in our data set;
certainly not because it is common practice to use gender as an outcome variable. We will use type
of program (prog) and school type (schtyp) as our predictor variables. Because prog is a
categorical variable (it has three levels), we need to create dummy codes for it. SPSS will do this
for you by making dummy codes for all variables listed after the keyword with. SPSS will also
create the interaction term; simply list the two variables that will make up the interaction separated
by the keyword by.
Correlation
A correlation is useful when you want to see the relationship between two (or more) normally
distributed interval variables. For example, using the hsb2 data file we can run a correlation
between two continuous variables, read and write.
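A SciPy sketch of the Pearson correlation between read and write (file name assumed):

from scipy.stats import pearsonr
import pandas as pd

# Assumes hsb2 is loaded with 'read' and 'write'
df = pd.read_csv("hsb2.csv")  # hypothetical file name

r, p = pearsonr(df["read"], df["write"])
print(r, p)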
Simple linear regression
Simple linear regression allows us to look at the linear relationship between one normally
distributed interval predictor and one normally distributed interval outcome variable. For example,
using the hsb2 data file, say we wish to look at the relationship between writing scores (write) and
reading scores (read); in other words, predicting write from read.
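A SciPy sketch of this simple linear regression, predicting write from read (file name assumed):

from scipy.stats import linregress
import pandas as pd

# Assumes hsb2 is loaded; predict write from read
df = pd.read_csv("hsb2.csv")  # hypothetical file name

result = linregress(df["read"], df["write"])
print(result.slope, result.intercept, result.rvalue, result.pvalue)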
Non-parametric correlation
A Spearman correlation is used when one or both of the variables are not assumed to be normally
distributed and interval (but are assumed to be ordinal). The values of the variables are converted into ranks and then correlated. In our example, we will look for a relationship
between read and write. We will not assume that both of these variables are normal and interval.
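A SciPy sketch of the Spearman correlation between read and write (file name assumed):

from scipy.stats import spearmanr
import pandas as pd

# Assumes hsb2 is loaded with 'read' and 'write'
df = pd.read_csv("hsb2.csv")  # hypothetical file name

rho, p = spearmanr(df["read"], df["write"])
print(rho, p)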
Simple logistic regression
Logistic regression assumes that the outcome variable is binary (i.e., coded as 0 and 1). We have
only one variable in the hsb2 data file that is coded 0 and 1, and that is female. We understand
that female is a silly outcome variable (it would make more sense to use it as a predictor variable),
but we can use female as the outcome variable to illustrate how the code for this command is
structured and how to interpret the output. The first variable listed after the logistic command is
the outcome (or dependent) variable, and all of the rest of the variables are predictor (or
independent) variables. In our example, female will be the outcome variable, and read will be the
predictor variable. As with OLS regression, the predictor variables must be either dichotomous or
continuous; they cannot be categorical.
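A statsmodels sketch of this logistic regression of female on read (file name assumed):

import pandas as pd
import statsmodels.formula.api as smf

# Assumes hsb2 is loaded; female (0/1) is the outcome, read the predictor
df = pd.read_csv("hsb2.csv")  # hypothetical file name

model = smf.logit("female ~ read", data=df).fit()
print(model.summary())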
Multiple regression
Multiple regression is very similar to simple regression, except that in multiple regression you have
more than one predictor variable in the equation. For example, using the hsb2 data file we will
predict writing score from gender (female), reading, math, science and social studies (socst)
scores.
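A statsmodels sketch of this multiple regression (file name assumed):

import pandas as pd
import statsmodels.formula.api as smf

# Assumes hsb2 is loaded with write, female, read, math, science, and socst
df = pd.read_csv("hsb2.csv")  # hypothetical file name

model = smf.ols("write ~ female + read + math + science + socst", data=df).fit()
print(model.summary())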
Multiple logistic regression
Multiple logistic regression is like simple logistic regression, except that there are two or more
predictors. The predictors can be interval variables or dummy variables, but cannot be categorical
variables. If you have categorical predictors, they should be coded into one or more dummy
variables. We have only one variable in our data set that is coded 0 and 1, and that is female. We
understand that female is a silly outcome variable (it would make more sense to use it as a
predictor variable), but we can use female as the outcome variable to illustrate how the code for
this command is structured and how to interpret the output. The first variable listed after
the logistic regression command is the outcome (or dependent) variable, and all of the rest of
the variables are predictor (or independent) variables (listed after the keyword with). In our
example, female will be the outcome variable, and read and write will be the predictor variables.
Discriminant analysis
Discriminant analysis is used when you have one or more normally distributed interval
independent variables and a categorical dependent variable. It is a multivariate technique that
considers the latent dimensions in the independent variables for predicting group membership in
the categorical dependent variable. For example, using the hsb2 data file, say we wish to
use read, write and math scores to predict the type of program a student belongs to (prog).
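A scikit-learn sketch of a linear discriminant analysis predicting prog from read, write, and math (file name assumed):

import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Assumes hsb2 is loaded; predict program type from read, write, and math
df = pd.read_csv("hsb2.csv")  # hypothetical file name

X = df[["read", "write", "math"]]
y = df["prog"]

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.score(X, y))  # in-sample classification accuracy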
One-way MANOVA
MANOVA (multivariate analysis of variance) is like ANOVA, except that there are two or more
dependent variables. In a one-way MANOVA, there is one categorical independent variable and
two or more dependent variables.
Multivariate multiple regression
Multivariate multiple regression is used when you have two or more dependent variables that are
to be predicted from two or more independent variables. In our example using the hsb2 data file,
we will predict write and read from female, math, science and social studies (socst) scores.
Canonical correlation
Canonical correlation is a multivariate technique used to examine the relationship between two
groups of variables. For each set of variables, it creates latent variables and looks at the
relationships among the latent variables. It assumes that all variables in the model are interval and
normally distributed. SPSS requires that each of the two groups of variables be separated by the
keyword with. There need not be an equal number of variables in the two groups (before and
after the with).
Factor analysis
Factor analysis is a form of exploratory multivariate analysis that is used to either reduce the
number of variables in a model or to detect relationships among variables. All variables involved
in the factor analysis need to be interval and are assumed to be normally distributed. The goal of
the analysis is to try to identify factors which underlie the variables. There may be fewer factors
than variables, but there may not be more factors than variables. For our example using the hsb2
data file, let’s suppose that we think that there are some common factors underlying the various
test scores. We will include subcommands for varimax rotation and a plot of the eigenvalues. We
will use a principal components extraction and will retain two factors.
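As an illustrative alternative to the SPSS factor procedure, scikit-learn's FactorAnalysis can fit two factors with varimax rotation (the rotation argument requires scikit-learn >= 0.24; note it uses maximum-likelihood extraction rather than principal components, and the file name is an assumption).

import pandas as pd
from sklearn.decomposition import FactorAnalysis

# Assumes hsb2 is loaded with the five test scores
df = pd.read_csv("hsb2.csv")  # hypothetical file name

scores = df[["read", "write", "math", "science", "socst"]]

# Two factors with varimax rotation
fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(scores)
print(fa.components_)  # factor loadings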