Unit 2 - Data Literacy
2.1 - Basics of Data Literacy
Q1. What is Data Literacy?
Data literacy means knowing how to understand, work with, and talk about data. It's about being able to collect, analyse, and
show data in ways that make sense.
Q2. What is the Data Pyramid? Explain with the help of an example.
The Data Pyramid is made up of the different stages of working with data. Moving up from the bottom:
1. Data: Data is available in a raw form. Data in this form is not very useful.
2. Information: Data is processed to give us information about the world.
3. Knowledge: Information about the world leads to knowledge of how things are happening.
4. Wisdom: Wisdom allows us to understand why things are happening in a particular way.
Example:
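A minimal sketch of the stages, assuming made-up daily temperature readings (all values are hypothetical and only for illustration):

```python
# Data: raw daily temperature readings in °C (hypothetical values).
raw_data = [31, 33, 35, 36, 38, 39, 41]

# Information: processing the raw data gives a summary of it.
average = sum(raw_data) / len(raw_data)
hottest = max(raw_data)

# Knowledge: comparing readings shows how things are happening.
rising_every_day = all(b > a for a, b in zip(raw_data, raw_data[1:]))

print(f"Information: average {average:.1f} °C, hottest {hottest} °C")
print(f"Knowledge: temperature rose every day -> {rising_every_day}")

# Wisdom (understanding why it is rising, e.g. a change of season)
# comes from people reasoning about this knowledge, not from the code.
```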
Q3. Who is data literate? How does data literacy help a person make informed decisions when shopping online?
A data-literate person is someone who can interact with data to understand the world around them. When shopping online, data
literacy enables a person to filter products based on price, check user ratings for quality, and compare features to ensure the
product meets all requirements.
Q4. What is the Data Literacy Process Framework?
The data literacy framework provides guidance on using data efficiently and with all levels of awareness. The data literacy
framework is an iterative process with six steps:
1. Plan: Set a clear goal, know who is involved, and decide how and when to complete the program.
2. Communicate: Share the goal and ask everyone to support it.
3. Assess: Help people check how comfortable they are with data using a simple tool.
4. Develop Culture: Over time, the program will help people improve their data skills and make it part of their daily work.
5. Prescriptive Learning: Give different learning options based on how people like to learn.
6. Evaluate: Set up ways to measure progress and decide how often to review it.
Q5. Compare Data Privacy and Data Security
Q6. What is cybersecurity?
Cybersecurity involves protecting computers, servers, mobile devices, electronic systems, networks, and data from harmful
attacks.
Q7. What are some do's for ensuring cybersecurity?
Use strong, unique passwords with a mix of characters for each account.
Activate Two-Factor Authentication (2FA) for added security.
Download software from trusted sources and scan files before opening.
Prioritize websites with "https://" for secure logins.
Keep your browser, operating system (OS), and antivirus updated regularly.
Adjust social media privacy settings to limit visibility to close contacts.
Always lock your screen when away.
Connect only with trusted individuals online.
Use secure Wi-Fi networks.
Report online bullying to a trusted adult immediately.
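The first point (strong passwords with a mix of characters) can be sketched in code. This is only an illustration: the function name looks_strong and the minimum length of 12 are assumptions, not an official rule.

```python
import string

def looks_strong(password: str, min_length: int = 12) -> bool:
    """Return True if the password is long enough and mixes character types.

    The 12-character minimum is an assumed threshold for this sketch.
    """
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_symbol = any(c in string.punctuation for c in password)
    return (len(password) >= min_length
            and has_lower and has_upper and has_digit and has_symbol)

print(looks_strong("password"))         # too short, only lowercase letters
print(looks_strong("Tr33-house!Lamp"))  # long, with a mix of character types
```

A check like this only looks at the mix of characters. It cannot tell whether the same password is reused on another account, so the "unique" part of the rule still depends on the user.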
Q8. What are some don’ts to follow for cybersecurity?
Avoid sharing personal information like your real name or phone number.
Don't send pictures to strangers or post them on social media.
Don't open emails or attachments from unknown sources.
Ignore suspicious requests for personal information, such as bank account details.
Keep passwords and security questions private.
Don't copy copyrighted software without permission.
Avoid cyberbullying or using offensive language online.
2.2 Acquiring Data, Processing, and Interpreting Data
Q9. What are the different types of data? Explain.
Types of Data
Data is essential for AI, and understanding its types is important. It can be broadly divided into three categories:
1. Textual Data (Qualitative Data)
Description: This type of data includes non-numeric information, usually in the form of text. It is used in Natural
Language Processing (NLP) to interpret and work with language-based data.
Examples:
o Descriptive text like search queries on the internet.
o Sentences from chat messages or documents.
2. Numeric Data (Quantitative Data)
Description: Numeric data involves measurable quantities that can be expressed in numbers. It is used in Data Science
to interpret and work with statistical data. It is further classified into two types:
1. Continuous Data
2. Discrete Data
Continuous Data
Definition: Continuous data refers to data that can take any value within a given range. It can include decimal points
and fractions.
Examples:
o Height, weight, temperature, or voltage.
o These measurements can vary continuously within a range, such as 5.5 feet to 6.3 feet in height.
Discrete Data
Definition: Discrete data includes data that consists of whole numbers. It does not allow for fractions or decimals.
Examples:
o The number of students in a classroom or the number of cars in a parking lot.
o These are counted in whole units (e.g., 15 students, 50 cars).
3. Visual Data: Image and video data used in Computer Vision, enabling machines to interpret and understand visual content
(e.g., face recognition systems, image classification).
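The difference between continuous and discrete numeric data can be sketched as follows. The values and the helper name is_discrete are made up for illustration, and the check is only a rough heuristic: a continuous measurement can happen to be a whole number.

```python
# Hypothetical values for illustration.
heights_in_feet = [5.5, 5.9, 6.3]   # continuous: decimals are meaningful
students_per_class = [15, 32, 28]   # discrete: counted in whole units

def is_discrete(values):
    """Return True when every value is a whole number (looks like a count)."""
    return all(float(v).is_integer() for v in values)

print(is_discrete(heights_in_feet))
print(is_discrete(students_per_class))
```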
Q10. Differentiate between Qualitative and Quantitative data
Q11. What is data acquisition? What are the three key steps in data acquisition, and explain each with an example?
Data Acquisition, also known as acquiring data, refers to the procedure of gathering data. This involves searching for datasets
suitable for training AI models. The process typically comprises three key steps:
1. Data Discovery: Searching for and downloading datasets, like images of roads for a self-driving car model.
2. Data Augmentation: Increasing the amount of data by making small changes to existing data, such as altering the color
or brightness of images.
3. Data Generation: Recording new data using sensors, for example, gathering temperature readings of a building and
storing them in a computer.
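Data augmentation (step 2) can be sketched with a tiny grayscale image stored as a list of pixel values. The pixel values, the function name brighten, and the shift of 40 are assumptions for illustration; real projects usually use an image library instead.

```python
# A tiny 2x3 grayscale "image": each number is a pixel brightness (0-255).
# The pixel values are made up for illustration.
image = [
    [10, 120, 250],
    [60, 200, 30],
]

def brighten(img, amount):
    """Return a new image with every pixel shifted by `amount`, kept within 0-255."""
    return [[min(255, max(0, pixel + amount)) for pixel in row] for row in img]

# One original image becomes three training samples:
# the original, a brighter copy, and a darker copy.
augmented = [image, brighten(image, 40), brighten(image, -40)]
print(len(augmented), "samples from 1 original image")
```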
Q12. What are the two main types of data sources? Explain
1. Primary Data Sources
This is original data that you collect directly. Examples include:
Surveys: Asking questions to gather information.
Interviews: Talking to people to get their insights.
Experiments: Conducting tests to generate data.
2. Secondary Data Sources
This data comes from external sources instead of being collected personally. Examples include:
Books: Information from published literature.
Articles: Research or news written by others.
Online databases: Websites that compile data from various sources, e.g., Kaggle and .gov datasets.
Q13. What is Good Data and Bad Data?
Q14. How is data acquired from websites? What are Ethical concerns in data acquisition?
1. The process of collecting data from websites using software is called web scraping.
2. There are different tools that can help us collect data from websites.
3. Web scraping itself is not illegal, but using data without permission is.
4. During data acquisition, we need to make sure that the data source allows data scraping.
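Point 4 can be partly checked in code. Many websites publish a robots.txt file that states which pages automated tools may visit. A minimal sketch using Python's standard urllib.robotparser module follows; the robots.txt content, the bot name my-study-bot, and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, written inline so the sketch runs offline.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether our (hypothetical) bot may fetch each page.
allowed = parser.can_fetch("my-study-bot", "https://example.com/data/scores.html")
blocked = parser.can_fetch("my-study-bot", "https://example.com/private/users.html")

print("scores page allowed:", allowed)
print("private page allowed:", blocked)
```

A robots.txt file is only one signal. The website's terms of use and the permissions mentioned above still apply.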
Ethical concerns in data acquisition are:
1. Bias: Take steps to understand and avoid any preferences or partiality in data.
2. Consent: Take the necessary permissions before collecting or using an individual's data.
3. Transparency: Explain how you intend to use the collected data and do not hide intentions.
4. Anonymity: Protect the identity of the person who is the source of data.
5. Accountability: Take responsibility for your actions in case of misuse of data.
Q15. What are the three main factors that affect the usability of data?
The three main factors are:
1. Structure: This refers to how data is organized and stored, making it easier to access and use.
2. Cleanliness: Clean data is free from errors like duplicates and missing values, which helps ensure reliable results.
3. Accuracy: This measures how closely the data reflects real-world values, ensuring it is trustworthy.
Example: If we collect data on students' test scores, well-structured data (like organized tables), clean data (with no missing or
repeated scores), and accurate data (that correctly reflects each student’s actual score) will help us analyze performance
effectively.
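The cleanliness factor can be sketched with the test-score example. The names and scores are hypothetical, and None is assumed to mark a missing value.

```python
# Hypothetical test scores; None marks a missing value.
raw_scores = [
    ("Asha", 82), ("Ravi", None), ("Meena", 91), ("Asha", 82), ("Kabir", 76),
]

clean_scores = []
seen = set()
for name, score in raw_scores:
    if score is None:            # drop rows with a missing score
        continue
    if (name, score) in seen:    # drop exact duplicates
        continue
    seen.add((name, score))
    clean_scores.append((name, score))

# With clean data, a simple analysis such as the class average is reliable.
average = sum(score for _, score in clean_scores) / len(clean_scores)
print(clean_scores)
print("Average score:", average)
```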
Q16. What are data features, and what types are important for AI models?
Data features are the characteristics or properties that describe each piece of information in a dataset. For example, in a table
of student records, features might include the student’s name, age, and grade. In a photo dataset, features could be the colors
present in each image.
In AI models, there are two important types of features:
1. Independent Features: These are the inputs to the model, representing the information we provide to make
predictions.
2. Dependent Features: These are the outputs or results of the model, representing what we are trying to predict.
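A minimal sketch of separating the two types of features, assuming a made-up table of student records (the column names and values are hypothetical):

```python
# Hypothetical student records. Hours studied and attendance are the inputs;
# the exam score is what a model would try to predict.
records = [
    {"hours_studied": 2, "attendance_pct": 70, "exam_score": 55},
    {"hours_studied": 5, "attendance_pct": 85, "exam_score": 72},
    {"hours_studied": 8, "attendance_pct": 95, "exam_score": 90},
]

independent_names = ["hours_studied", "attendance_pct"]  # inputs
dependent_name = "exam_score"                            # output

# X holds the independent features, y holds the dependent feature.
X = [[row[name] for name in independent_names] for row in records]
y = [row[dependent_name] for row in records]

print("Inputs (X):", X)
print("Output (y):", y)
```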
Q17. List and explain some keywords related to data.
Acquire Data: Acquiring data means collecting data from various data sources.
Data Processing: After raw data is collected, it is processed to derive meaningful information from it.
Data Analysis: Data analysis means examining each component of the data in order to draw conclusions.
Data Interpretation: This means being able to explain what these findings or conclusions mean in a given context.
Data Presentation: In this step, you select, organize, and group ideas and evidence in a logical way.
Q18. What are data processing and data interpretation?
Data Processing
Data processing helps computers understand raw data.
Use of computers to perform different operations on data is included under data processing.
Data Interpretation
It is the process of making sense out of data that has been processed.
The interpretation of data helps us answer critical questions using data.
Q19. How can we interpret Data?
Based on the two types of data, there are two ways to interpret data:
1. Quantitative Data Interpretation
2. Qualitative Data Interpretation
Qualitative Data Interpretation
Qualitative data tells us about the emotions and feelings of people.
Qualitative data interpretation is focused on the insights and motivations of people.
Quantitative Data Interpretation
Quantitative data interpretation is carried out on numerical data.
It helps us answer questions like "when," "how many," and "how often."
For example, the number of likes on an Instagram post answers "how many."
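A minimal sketch of answering "how many" and "when" from numerical data, assuming a made-up log of the day on which each like on a post was received:

```python
from collections import Counter

# Hypothetical log: the day on which each like on a post was received.
like_days = ["Mon", "Mon", "Tue", "Mon", "Wed", "Tue", "Mon"]

how_many = len(like_days)            # "how many" likes in total
likes_per_day = Counter(like_days)   # "how often" likes arrived on each day
busiest_day, busiest_count = likes_per_day.most_common(1)[0]  # "when"

print("Total likes:", how_many)
print("Likes per day:", dict(likes_per_day))
print("Busiest day:", busiest_day, "with", busiest_count, "likes")
```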
Q20. Write Data Collection Methods and steps of Qualitative and Quantitative data interpretation.
Data Collection Methods: Qualitative
Record Keeping: Uses reliable documents as data sources, similar to library research.
Observation: Involves carefully observing participants' behavior and emotions.
Case Studies: Data is gathered from individual case studies.
Focus Groups: Collects data through group discussions on relevant topics.
Longitudinal Studies: Repeatedly collects data from the same source over time.
One-to-One Interviews: Data is gathered through individual interviews.
Data Collection Methods: Quantitative
Interviews: Key for gathering quantitative info.
Polls: Simple, often one-question surveys.
Observations: Collects data within a set time.
Trending Athletes & Movies: Lists of top 5.
Website Counter: Tracks visits.
CGPA: Tracks academic performance.
Student Height: Used to design suitable furniture.
Longitudinal Studies: Long-term data collection.
Survey: Collects data from large groups.
5 Steps to Qualitative Data Analysis
1. Collect Data
2. Organize
3. Assign a code to the data collected
4. Analyse your data
5. Reporting
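Step 3 (coding the data) means tagging each response with a short label so that similar responses can be grouped. A minimal sketch, assuming made-up survey responses and a hypothetical keyword-based codebook (in practice, codes are usually assigned by a person reading each response):

```python
from collections import Counter

# Steps 1-2: collected and organized responses (hypothetical).
responses = [
    "The canteen food is too expensive",
    "Food tastes great but the queue is long",
    "Long waiting time at lunch",
    "Prices are fair and the taste is good",
]

# Step 3: a hypothetical codebook mapping each code to keywords.
codebook = {
    "cost": ["expensive", "price"],
    "taste": ["taste"],
    "waiting": ["queue", "waiting"],
}

def assign_codes(text):
    """Return every code whose keywords appear in the response."""
    lowered = text.lower()
    return [code for code, keywords in codebook.items()
            if any(keyword in lowered for keyword in keywords)]

# Step 4: analyse by counting how often each code appears.
code_counts = Counter(code for r in responses for code in assign_codes(r))

# Step 5: report.
for code, count in code_counts.items():
    print(f"{code}: mentioned in {count} responses")
```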
4 Steps to Quantitative Data Analysis
1. Relate measurement scales with variables
2. Connect descriptive statistics with data
3. Decide a measurement scale
4. Represent data in an appropriate format
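Connecting descriptive statistics with data can be sketched using Python's standard statistics module. The student heights below are hypothetical.

```python
import statistics

# Hypothetical student heights in centimetres.
heights_cm = [150, 152, 155, 155, 158, 160, 162]

mean_height = statistics.mean(heights_cm)      # average value
median_height = statistics.median(heights_cm)  # middle value when sorted
mode_height = statistics.mode(heights_cm)      # most frequent value
height_range = max(heights_cm) - min(heights_cm)

print("Mean:", mean_height)
print("Median:", median_height)
print("Mode:", mode_height)
print("Range:", height_range)
```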
Q21. Compare Qualitative & Quantitative Data Interpretation
Q22. What are the different types of Data Interpretation (DI), and how are they presented?
Answer: There are three main types of Data Interpretation:
1. Textual DI
o Data is presented in text or paragraph form.
o Used when the data is small and easy to understand by reading.
o Not suitable for large datasets.
o Example:
"More than 60% of students scored over 80% in the Olympiad! In a class of 45, 3 scored a perfect 50, 10 scored
45 and above, 15 scored 40 and above, 8 scored 30 and above, 6 scored 20 and above, and 3 scored below 19."
2. Tabular DI
o Data is shown in a structured table with rows and columns.
o The title describes the table's content (e.g., "Item of Expenditure").
o Column headings describe the information (e.g., "Year," "Salary," "Fuel and Transport").
3. Graphical DI
o Data is presented using visuals:
Bar Graphs: Vertical or horizontal bars represent data.
Pie Charts: A circular chart divided into slices, where each slice represents a proportion of the whole.
Line Graphs: Data points are connected by a line to show trends over time.
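The Olympiad figures from the textual example can be rearranged into tabular form. In the sketch below, the counts come from that example. It assumes each student is counted in exactly one band and that marks are out of 50, which is consistent with the counts adding up to the class size of 45.

```python
# Counts taken from the textual Olympiad example; marks are assumed to be out of 50.
score_bands = [
    ("50 (perfect)", 3),
    ("45 and above", 10),
    ("40 and above", 15),
    ("30 and above", 8),
    ("20 and above", 6),
    ("below 19", 3),
]

total_students = sum(count for _, count in score_bands)

# 80% of 50 marks is 40, so the first three bands are at or above 80%.
at_or_above_80_pct = sum(count for _, count in score_bands[:3])
share = at_or_above_80_pct / total_students * 100

# Tabular DI: a title row, then one row per score band.
print(f"{'Score band':<15}{'Students':>8}")
for band, count in score_bands:
    print(f"{band:<15}{count:>8}")
print(f"{'Total':<15}{total_students:>8}")
print(f"Share at 40 marks or more: {share:.1f}%")
```

The table makes it easy to confirm the claim in the text: 28 of the 45 students scored 40 or more, which is a little over 60%.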
Q23. Why is Data Interpretation important?
1. Informed Decision Making: A decision is only as good as the knowledge it is based on
2. Reduced Cost: Identifying needs can lead to reduction in cost
3. Identifying Needs: We can identify needs of people by data interpretation
2.3 Project Interactive Data Dashboard & Presentation
Q24. What is data visualization? Write some tools for visualizing the data.
Data visualization is the process of displaying data in graphical formats like charts, graphs, and tables. It helps to easily identify
patterns, trends, and insights from the data.
Tools for Visualizing Data:
Microsoft Excel: A spreadsheet tool used to organize and manipulate data, perform calculations, and create basic charts,
serving as a foundation for preparing data for more advanced visualizations.
Tableau: A powerful data visualization tool for analyzing, visualizing, and sharing insights through interactive,
shareable dashboards with visually appealing charts, graphs, and maps.
Both tools help simplify data analysis and decision-making.
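The idea behind a bar graph can also be sketched in plain Python, without Excel or Tableau. The subject counts and the function name text_bar_chart are made up for illustration.

```python
# Hypothetical counts of favourite subjects in a class.
favourite_subject = {"Maths": 12, "Science": 9, "English": 6, "Art": 3}

def text_bar_chart(data):
    """Build a horizontal bar chart as text: one '#' per unit."""
    width = max(len(label) for label in data)  # align the bars
    lines = []
    for label, value in data.items():
        lines.append(f"{label:<{width}} | {'#' * value} {value}")
    return "\n".join(lines)

chart = text_bar_chart(favourite_subject)
print(chart)
```

Even in this simple form, the longest bar shows the most popular subject at a glance, which is the pattern-spotting benefit described above.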
BOOK EXERCISE
1. Cultivating Data Literacy means:
a) Utilize vocabulary and analytical skills b) Acquire, develop, and improve data literacy skills
c) Develop skills in statistical methodologies d) Develop skills in Math
2. Data Privacy and Data Security are often used interchangeably but they are different from each other. True / False
3. The _____________________ provides guidance on using data efficiently and with all levels of awareness.
a) data security framework b) data literacy framework
c) data privacy framework d) data acquisition framework
4. _____________ allows us to understand why things are happening in a particular way
a) data b) information c) knowledge d) wisdom
5. __________ is the practice of protecting digital information from unauthorized access, corruption, or theft throughout its
entire lifecycle.
a) data security b) data literacy c) data privacy d) data acquisition
6. What are the basic building blocks of qualitative data?
a) Individuals b) Units c) Categories d) Measurements
7. Which among these is not a type of data interpretation?
a) Textual b) Tabular c) Graphical d) Raw data
8. Quantitative data is numerical in nature. True / False
9. A bar graph is an example of which type of data interpretation?
a) Textual b) Tabular c) Graphical d) None of the above
10. _____________ relates to the manipulation of data to produce meaningful insights.
a) Data Interpretation b) Data Analysis c) Data Presentation d) Data Processing
11. At which stage of the AI project cycle does Tableau software prove useful?
Tableau software proves useful during the Data Exploration stage of the AI project cycle.
12. Name any five graphs that can be made using Tableau software
Five types of graphs that can be created using Tableau software:
1) Bar Chart
2) Line Chart
3) Pie Chart
4) Scatter Plot
5) Heat Map