Slide 1 – Title Slide:
Hello everyone,
My name is Utku Alperen
Today, I’ll be giving a preliminary presentation on my thesis project, which
explores how machine learning — specifically anomaly detection
algorithms — can help uncover students who perform unexpectedly high
or low in the PISA 2022 mathematics assessment.
This is a work in progress, so I’ll be focusing on the context, motivation,
and planned approach, rather than final results.
Slide 2 – Outline:
Before we dive into the details, here’s a quick overview of what I’ll be
covering in this presentation.
First, I’ll introduce the motivation behind this research.
Then, I’ll share the central research question that guides this project.
I’ll explain what PISA is, and why mathematical literacy — particularly in
this context — is socially and economically important.
After that, I’ll introduce the concept of anomaly detection and how it
applies to education.
Then I’ll walk through the methodology I plan to use, highlight some of
the candidate algorithms I’m exploring, and finally discuss the kinds of
insights I hope to uncover.
I’ll conclude with the next steps as this research develops further.
Slide 3 – What Is This Project About?:
This project aims to detect students who perform unexpectedly well or
poorly in mathematics, based on the PISA 2022 assessment.
Instead of focusing on average scores or trends, I’m interested in the
exceptions — students who defy what their background, attitudes, or
school conditions would predict.
To do this, I plan to use machine learning anomaly detection algorithms,
which are designed to find unusual patterns or outliers in large datasets.
The focus is specifically on mathematics literacy, which is the main
domain assessed in the 2022 PISA cycle.
I want to explore not just which students are outliers, but also what
characteristics define them — are they highly motivated, underserved,
disengaged, or something else entirely?
Since I’m still in the early stages of this thesis, I’ll be sharing a design-
focused view today — including background, planned methods, and
expected contributions.
Slide 4 – Main Research Question:
At the core of this project is a guiding research question:
“How effective are anomaly detection algorithms in identifying
students with unexpectedly high or low mathematics literacy scores in
PISA 2022, and what characteristics distinguish these outliers?”
This question has two key parts.
First, it's about algorithmic performance: Can these models actually
detect students who are behaving in surprising ways — for example, a
student with low socioeconomic status scoring extremely high, or the other
way around?
Second, it’s about understanding those outliers once they’ve been
identified. What do they have in common? Are there recurring patterns in
their backgrounds, attitudes, or school environments?
The project aims to bridge both machine learning evaluation and
educational interpretation — not just identifying anomalies, but making
sense of them in a meaningful way.
Slide 5 – What is PISA?:
To ground this project, it's important to briefly explain what PISA is.
PISA stands for the Programme for International Student Assessment, an
international standardized assessment organized by the Organisation for
Economic Co-operation and Development (OECD).
It takes place every three years and evaluates the performance of 15-year-
old students across more than 80 countries. The core domains are reading,
science, and mathematics, and each cycle emphasizes one of these.
In the 2022 cycle, mathematics is the primary focus, which makes it
particularly relevant for this project.
Beyond scores, PISA collects a wide array of contextual information from
students, parents, and schools — such as socioeconomic background,
access to learning resources, teaching quality, and student attitudes.
This makes it not just a test dataset, but a rich educational database — ideal
for identifying complex patterns, including outlier behavior.
Slide 6 – Why Does Mathematical Literacy Matter?:
So, why focus on mathematical literacy in particular?
First, on a personal level, students with stronger math skills tend to have
better outcomes in life — including higher income, greater access to stable
employment, and better preparation for careers in science, technology,
engineering, and math (STEM).
But there’s also a national-level impact. Researchers have found a strong
link between math scores and a country’s economic growth.
Specifically, they estimate that when a country’s average math score improves
by one standard deviation, its GDP growth rate — GDP being the Gross
Domestic Product, the total value of all goods and services produced in a
country — increases by around two percentage points per year.
In other words, better math education doesn't just help individuals — it
strengthens economies.
However, math performance is also an equity issue. Students from low
socioeconomic backgrounds — often measured in PISA using indicators
like HISEI, the Highest International Socio-Economic Index of occupational
status — tend to have lower achievement levels.
That said, there are exceptions. Some disadvantaged students succeed
greatly, while others with many advantages fall behind. These outliers are
precisely the focus of this project. By identifying them, we can uncover
stories of resilience or hidden barriers — and that insight could be valuable
for both policy and pedagogy.
Slide 7 – What is Anomaly Detection?:
With that context in mind, let’s turn to the machine learning concept at the
core of this project — anomaly detection.
Anomaly detection refers to a group of techniques that identify data points
that deviate significantly from the rest. In other words, they're used to
detect unusual or unexpected patterns in large datasets.
It’s used in many areas: for example, to detect credit card fraud, spot
malfunctions in medical devices, or even to flag security breaches in
computer networks.
What makes it powerful in education is that it allows us to spot students
who don't follow the typical pattern — without needing to predefine what
counts as “good” or “bad” performance.
There are different types of anomalies:
A point anomaly is a single data point that looks very different — for
example, a student who performs way above or below the average.
A contextual anomaly is one that’s only unusual given the
surrounding information — like a high score that’s surprising given a
student's background, school environment, or socioeconomic status.
A collective anomaly involves a group that behaves differently — for
instance, a school where all students underperform despite high
resources.
In this project, I'm especially interested in contextual anomalies, because
PISA provides a lot of background data that helps us ask:
“Is this student’s performance surprising given their context?”
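To make the idea of a contextual anomaly more concrete, here is a small, purely illustrative Python sketch — synthetic data and a simple regression residual, not the method this project will actually use:

```python
# Toy illustration of a contextual anomaly (synthetic data, not the project's method).
import numpy as np

rng = np.random.default_rng(42)
ses = rng.normal(0, 1, 500)                       # stand-in for a socioeconomic index
score = 480 + 35 * ses + rng.normal(0, 40, 500)   # math scores loosely tied to SES

# Add one student: low SES but a high score. The score alone is not extreme
# (no point anomaly), yet it is surprising given the student's background.
ses = np.append(ses, -2.0)
score = np.append(score, 600.0)

# Residuals from a simple linear fit of score on SES expose the contextual outlier.
slope, intercept = np.polyfit(ses, score, 1)
residuals = score - (intercept + slope * ses)
print("residual of the added student:", round(residuals[-1], 1))
```

The added student’s residual is several times larger than the typical noise in the data, even though their raw score would not stand out on its own — that is exactly the kind of case contextual anomaly detection is meant to surface.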
Slide 8 – How Will This Project Be Approached?:
Now let me walk you through the general approach I plan to take for this
project.
First, I’ll be working with the Turkey dataset from PISA 2022. This subset
provides enough student responses, along with rich contextual information,
to apply machine learning meaningfully.
Second, the approach will focus on unsupervised learning, meaning that we
won’t label students in advance as outliers or not. Instead, the algorithms
will detect anomalies based on patterns in the data.
I plan to explore and compare multiple anomaly detection algorithms. The
idea isn’t just to apply one method, but to investigate how different
techniques identify outliers — and whether they agree or not.
Once students are flagged as outliers, I’ll analyze their characteristics:
Do they have low or high HISEI scores (a measure of parental occupational
status)?
Are they highly motivated or disengaged, based on PISA survey items like
interest in math and self-efficacy?
Do they come from supportive or challenging school environments?
The goal is not just to detect outliers, but to interpret the findings through
an educational lens — asking what these outliers reveal about the
education system and student resilience.
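As a small illustration of that last step, here is a hypothetical sketch of how flagged students’ profiles could be compared with everyone else’s once an algorithm has produced the flags (the column names and values below are made up):

```python
# Hypothetical profiling step: compare backgrounds of flagged vs. non-flagged students.
import pandas as pd

students = pd.DataFrame({
    "HISEI":         [32, 78, 45, 61, 29, 88],               # parental occupational status
    "MATH_INTEREST": [3.1, 1.2, 2.5, 2.8, 3.6, 1.0],         # made-up attitude scale
    "flagged":       [True, True, False, False, True, False] # output of an anomaly detector
})

# Average profile of flagged vs. non-flagged students
profile = students.groupby("flagged")[["HISEI", "MATH_INTEREST"]].mean()
print(profile)
```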
Slide 9 – Overview of Candidate Algorithms:
In this slide, I’ll briefly introduce the three anomaly detection algorithms
I’m currently exploring.
These methods are quite different from one another — which is intentional
— because I want to compare how well each performs in the context of
educational data.
First is Isolation Forest, which isolates data points using random tree
structures. The idea is that anomalies can be separated from the rest of the
data with fewer splits — so the model measures how “easy” it is to isolate
a student from the crowd.
Second is One-Class SVM — a kernel-based algorithm that learns the
shape of the normal data and then flags anything that falls outside this
shape. It's commonly used in fraud detection and bioinformatics, but it is
increasingly being applied to behavioral data too.
And finally, I plan to explore Autoencoders — a type of neural network
that attempts to reconstruct its input data. If the reconstruction error is
high, it suggests that the input didn’t follow the learned patterns — making
it a likely anomaly.
These three models offer a mix of interpretability, efficiency, and
sensitivity to different types of patterns. As I move forward, I’ll evaluate
which one (or combination) performs best on the PISA dataset.
Slide 10 – Isolation Forest:
Let’s start with the first algorithm: Isolation Forest.
The key idea here is surprisingly simple: anomalies — by definition — are
rare and different. So, they’re easier to separate from the rest of the data.
This method works by randomly selecting a feature and then randomly
choosing a split value between the maximum and minimum of that feature.
It continues splitting until the point is isolated.
A normal student might require many splits to be separated from others,
because they share similar patterns. But an anomalous student will likely
be isolated with fewer cuts.
The model then calculates something called the average path length —
basically, how deep down the tree a point had to go before being isolated.
Points with shorter path lengths are more likely to be outliers.
What makes Isolation Forest suitable here is that it is:
Fast and scalable,
Unsupervised, so it requires no prior labeling of students, and
Effective in high-dimensional spaces, which is the case with PISA data,
since students are described by dozens of background variables.
In the context of this project, it helps answer the question:
“Which students can be separated from the norm based on a few critical
features?”
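As a minimal sketch of how this could look in practice — with a synthetic stand-in for the preprocessed PISA feature matrix, and a contamination value that is only an assumed tuning choice, not a final decision — scikit-learn’s implementation would be used roughly like this:

```python
# Minimal Isolation Forest sketch; X is a synthetic stand-in for preprocessed features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))

iso = IsolationForest(
    n_estimators=200,
    contamination=0.05,   # assumed share of outliers; a tuning choice, not a known value
    random_state=0,
)
labels = iso.fit_predict(X)          # -1 = flagged as an anomaly, 1 = normal
scores = iso.decision_function(X)    # lower score = shorter average path = more anomalous
flagged_students = np.where(labels == -1)[0]
```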
Slide 11 – One-Class SVM:
The second algorithm I plan to explore is One-Class Support Vector
Machine, or One-Class SVM.
This model works quite differently from Isolation Forest. Rather than
splitting the data, it tries to learn the boundary that encloses most of the
“normal” data.
Once that boundary is established, any data point — in our case, any
student — that falls outside the boundary is labeled as an anomaly.
The reason this is called a “One-Class” SVM is because it doesn’t require
examples of outliers during training — it only uses the “normal” class to
learn the data distribution.
One of the powerful aspects of this method is the kernel trick, which
allows it to capture non-linear relationships by projecting data into a
higher-dimensional space. This is useful because educational data often
contains complex, non-linear structures — for example, a student might
perform poorly not just due to one factor, but because of an interaction
between multiple background variables.
However, One-Class SVM has some challenges:
It can be sensitive to parameter choices, especially the nu and gamma
values: nu roughly bounds the fraction of points treated as outliers, while
gamma controls how tightly the boundary follows the data.
It works best when the normal data is well-clustered — which may
or may not hold true for a diverse student population.
Still, it's a valuable algorithm to consider — particularly because it offers a
geometric perspective on what counts as “normal” performance.
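A comparable sketch for One-Class SVM, again on synthetic stand-in data and with nu and gamma values that are only starting points rather than tuned choices, might look like this:

```python
# Minimal One-Class SVM sketch; scaling matters because the RBF kernel is distance-based.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))               # stand-in for preprocessed student features
X_scaled = StandardScaler().fit_transform(X)

ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")  # nu ~ assumed outlier fraction
labels = ocsvm.fit_predict(X_scaled)          # -1 = outside the learned boundary
scores = ocsvm.decision_function(X_scaled)    # signed distance to the boundary
```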
Slide 12 – Autoencoders:
The third method I’m considering is Autoencoders, a type of neural
network designed to reconstruct its input.
In simple terms, an autoencoder has two parts:
An encoder that compresses the input into a smaller latent
representation, and
A decoder that tries to reconstruct the original data from that
compressed form.
During training, the model learns to minimize the reconstruction error —
which is the difference between the original input and the reconstructed
output.
Here’s where anomaly detection comes in: if the model has been trained
mostly on “normal” data, it will reconstruct those cases well.
But when it sees something unusual or not seen before, the reconstruction
error will be high — and that’s how we can detect an outlier.
In the context of this project, autoencoders are attractive because they:
Handle high-dimensional data very well (which is the case in
PISA),
Can learn non-linear relationships between variables, and
Are particularly useful when the structure of “normal” is complex
and multi-factorial — which is often true in education.
The main downside is interpretability. Neural networks are less
transparent than tree-based or geometric models like the previous two.
However, I plan to address that using tools like SHAP or layer-wise
relevance propagation, if needed.
Overall, autoencoders add depth to the modeling toolkit — especially
when patterns are too subtle for simpler models to capture.
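To make the reconstruction-error idea concrete, here is a minimal Keras sketch on synthetic data; the layer sizes and the 95th-percentile cut-off are assumptions for illustration, not design decisions:

```python
# Minimal autoencoder sketch: train to reconstruct the input, then flag high-error cases.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")   # stand-in for standardized features

inputs = keras.Input(shape=(20,))
encoded = layers.Dense(16, activation="relu")(inputs)
latent = layers.Dense(8, activation="relu")(encoded)       # compressed representation
decoded = layers.Dense(16, activation="relu")(latent)
outputs = layers.Dense(20, activation="linear")(decoded)   # reconstruction of the input

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=30, batch_size=64, verbose=0)

reconstruction = autoencoder.predict(X, verbose=0)
errors = np.mean((X - reconstruction) ** 2, axis=1)   # per-student reconstruction error
threshold = np.quantile(errors, 0.95)                 # assumed cut-off: top 5% flagged
flagged = errors > threshold
```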
Slide 13 – What Might Outliers Look Like?:
Now let me illustrate what kind of students might be identified as outliers
by these models, using hypothetical examples based on how PISA
variables typically appear.
Student A is a positive outlier.
Despite coming from a lower socioeconomic background — indicated by a
low HISEI score (which reflects parental occupational status) and minimal
resources at home (HOMEPOS) — this student scores well above
average in mathematics.
Their PISA responses show strong self-efficacy (ST90Q01) and high
interest in math (ST88Q01), suggesting strong internal motivation and
possibly good teaching support.
In contrast, Student B is a negative outlier.
They come from a well-off background — high HISEI, access to books,
internet, and study space — yet their math score is significantly below
what we’d expect.
They report low self-confidence and disengagement, despite attending a
school with good academic support ratings (SC04Q01).
These are the kinds of students that anomaly detection can help identify —
those who might otherwise go unnoticed because they don’t match the
typical trend.
Exploring their profiles more closely could provide insights into resilience,
underachievement, and how school and family environments shape
performance.
Slide 14 – How Will This Be Evaluated?:
One of the challenges in this project is that we’re working with
unsupervised learning — which means we don’t have a “correct answer”
or labeled data to tell us who the outliers are.
Because of this, traditional accuracy metrics like precision or recall are
difficult to apply directly. Instead, my evaluation strategy will include a
mix of quantitative and qualitative approaches.
On the quantitative side:
I will look at how many students are flagged by each algorithm,
and whether there is overlap or disagreement between them.
I’ll also examine the distribution of anomaly scores — for
example, are they clustered tightly, or do we see a clear separation
between normal and anomalous students?
On the qualitative side:
I will analyze the profiles of flagged students to see if they are
meaningfully unexpected.
For interpretability, I plan to use tools like SHAP (SHapley Additive
exPlanations), which assign importance values to each feature for a
given prediction. This helps to understand why a student was
flagged as an outlier.
Ultimately, the goal is not just to find anomalies mathematically, but to
ensure that the results are educationally meaningful — that they highlight
students whose performance deserves attention, support, or further study.
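As a small sketch of the quantitative side — how many students each model flags and how strongly the flagged sets overlap — the comparison could be as simple as this (the index sets below are invented for illustration):

```python
# Hypothetical comparison of which students two models flag, using set overlap (Jaccard).
iso_flagged = {12, 47, 88, 150, 203}   # indices flagged by Isolation Forest (made up)
svm_flagged = {12, 47, 99, 150, 310}   # indices flagged by One-Class SVM (made up)

overlap = iso_flagged & svm_flagged
jaccard = len(overlap) / len(iso_flagged | svm_flagged)
print(f"flagged by both: {sorted(overlap)}  Jaccard overlap: {jaccard:.2f}")
```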
Slide 15 – Why Does This Matter for Educators and Policymakers?:
Although this is a technically focused project, the end goal is educational
— to provide meaningful insights that could help teachers, school leaders,
and policymakers.
For example, if a student is underperforming despite having strong
family and school resources, it might indicate issues like mental health
challenges, disengagement, or classroom mismatches. That’s someone
who could benefit from personalized support.
On the other hand, if a student is excelling despite structural
disadvantages — low socioeconomic status, lack of learning materials at
home, etc. — it’s important to understand what’s working for them. Are
they internally motivated? Is there a particularly effective teacher or school
program supporting them?
These are not just interesting stories — they’re actionable cases that can
improve how we design interventions.
The bigger picture is that outlier detection, when done thoughtfully, can
enhance fairness. Instead of treating all students with a “one-size-fits-all”
model, we can differentiate based on actual needs and strengths.
For policymakers, this kind of analysis could help refine how resources
are distributed — by identifying schools or student populations that
require attention not because of averages, but because of anomalies.
In short, the aim is to use data science not just to predict, but to support
human development in more precise and inclusive ways.
Slide 16 – What Comes Next?:
As I mentioned earlier, this presentation reflects an early stage of the
research. The work is ongoing, and the next steps are focused on moving
from planning into actual model development and analysis.
The first task is to finalize data preprocessing — making decisions on
how to handle missing data, normalize variables, and select which features
will go into the models. Since PISA includes a lot of categorical and
ordinal variables, this step requires careful thought.
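As a rough sketch of what that preprocessing could look like — with a tiny synthetic table and illustrative PISA-style column names rather than a finalized feature list — one option using scikit-learn is:

```python
# Sketch of a preprocessing pipeline: impute and scale numeric indices, impute and
# one-hot encode categorical ones. The data frame is a tiny synthetic stand-in.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

students_df = pd.DataFrame({
    "HISEI":   [32.0, np.nan, 61.0, 78.0],
    "HOMEPOS": [-0.5, 0.3, np.nan, 1.2],
    "GENDER":  ["female", "male", np.nan, "female"],
})

numeric_cols = ["HISEI", "HOMEPOS"]
categorical_cols = ["GENDER"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical_cols),
])
X_prepared = preprocess.fit_transform(students_df)
```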
Once the data is prepared, I will move forward with selecting and
implementing the anomaly detection algorithms. While I’ve introduced
three strong candidates, I may adjust or combine them based on
experimentation.
After running the models, I’ll compare their outputs — not just
quantitatively, but also by analyzing the students they flag. The goal is to
see if the flagged outliers really are unexpected based on their context, and
what traits set them apart.
If time allows and the models prove successful, a longer-term goal is to
expand the analysis across countries. Since PISA is a global dataset, it
offers a unique opportunity to investigate how outlier patterns differ by
cultural, economic, or policy context.
Overall, the focus is on building a robust, interpretable, and educationally
meaningful tool for understanding exceptional student performance —
both positive and negative.
Slide 17 – Summary/Conclusion:
To wrap up, let me briefly summarize what this project is all about.
The core idea is to use machine learning — specifically anomaly
detection — to identify students who perform unexpectedly in the PISA
2022 mathematics assessment.
That includes both:
Positive outliers, who do better than expected given their
background, and
Negative outliers, who underperform despite apparent advantages.
The plan is to explore and compare several anomaly detection algorithms,
including Isolation Forest, One-Class SVM, and Autoencoders. Each
brings something different to the table — speed, flexibility, or complexity
— and the aim is to see which yields the most meaningful and interpretable
insights.
Although this work is in its early, exploratory phase, I’ve tried to
establish a clear motivation, research question, and methodological
roadmap.
Ultimately, this project is about using AI tools to support educational
equity — to identify students whose performance might otherwise be
missed and understand the factors that shape their success or struggle.
Slide 18 – Thank You:
That concludes my presentation.
Thank you for your attention.
I’m happy to answer any questions.
SHAP (SHapley Additive exPlanations) is a method used to explain how
much each feature (like SES or motivation) contributes to a model’s
prediction.
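For reference, a minimal, hypothetical sketch of how SHAP’s model-agnostic KernelExplainer could be pointed at an anomaly score (here an Isolation Forest on synthetic data) might look like this:

```python
# Hypothetical SHAP sketch: explain which features drive one student's anomaly score.
import numpy as np
import shap
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))                 # synthetic stand-in for preprocessed features
iso = IsolationForest(random_state=0).fit(X)

# KernelExplainer is model-agnostic: it only needs a scoring function and background data.
explainer = shap.KernelExplainer(iso.decision_function, shap.sample(X, 50))
shap_values = explainer.shap_values(X[:1])    # per-feature contributions for one student
```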
1. Why is anomaly detection a better fit than traditional regression or
classification?
Because I’m not trying to predict average scores — I want to find students
whose results are unexpected. Anomaly detection helps flag those rare
cases without needing labels.
2. How do you define a “true outlier” in this context?
A student whose math score is very high or low compared to what we’d
expect based on their background — like SES or motivation. I check if
flagged cases make sense based on these variables.
3. How will you handle missing data in PISA?
Most likely with imputation — for example, using the median for
numerical features. I’ll avoid dropping rows unless missingness is very
high.
4. How do you prepare PISA’s categorical data for your models?
I’ll use encoding techniques. One-hot encoding for algorithms like
Isolation Forest, and label encoding or embeddings for neural networks
like autoencoders.
5. How will you select features from so many variables?
I’ll use variables that are supported in the literature — like socioeconomic
indicators and math-related attitudes. Later, I might use SHAP to refine
them.
6. What’s the main challenge of using unsupervised learning in
education?
There’s no clear label, so it’s hard to measure accuracy. Also, some outliers
might not be meaningful without context.
7. What assumptions do the algorithms make?
Isolation Forest assumes anomalies are easy to isolate. One-Class SVM
assumes normal data is dense. Autoencoders assume anomalies are hard to
reconstruct.
8. How will you compare the models without labeled data?
By comparing how many outliers they each detect, looking at overlap, and
interpreting the flagged students’ backgrounds.
9. What if two algorithms flag different outliers?
That’s possible. It could mean they’re picking up on different types of
anomalies — which might both be valid in different ways.
10. Can cultural factors affect what we see as an outlier?
Yes — what’s normal or expected varies by country. That’s why I’m
starting with just Turkey.
🧩 Practical & Ethical Questions
11. How could this help teachers?
It could help spot students who are struggling unexpectedly — or those
doing really well and deserve more support.
12. What ethical concerns come with labeling students as outliers?
We need to be careful not to stigmatize students. These models should
guide attention, not judgment.
13. How do you avoid misinterpreting outliers?
I’ll always check flagged cases in context. A high anomaly score doesn’t
mean something’s wrong — just that it’s unusual.
14. Could some outliers just be data errors?
Yes — I’ll cross-check with multiple variables and look for signs of
inconsistency or missing values.
15. Would this work for all countries in PISA?
Technically yes, but interpretation would vary. Cultural and systemic
differences affect how we define “unexpected” performance.
🚀 Future Direction Questions
16. Could you apply this to reading or science scores too?
Definitely. The method would be similar — just with domain-specific
features.
17. Can you combine the models for better results?
Yes, an ensemble could improve reliability — for example, by flagging
students only when two models agree.
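A tiny sketch of that agreement rule (with made-up boolean flags) would look like:

```python
# Flag a student only if at least two of the three models mark them as an outlier.
import numpy as np

iso_flags = np.array([True, False, True, False])
svm_flags = np.array([True, True, False, False])
ae_flags  = np.array([False, True, True, False])

votes = iso_flags.astype(int) + svm_flags.astype(int) + ae_flags.astype(int)
consensus_flags = votes >= 2
```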
18. How could you know if flagged students are actually important
long-term?
If we had follow-up data, we could track their outcomes. Right now, it’s
more of a descriptive insight.
19. Would longitudinal data improve your results?
Yes, absolutely. Seeing how a student changes over time would help
identify sudden drops or gains more accurately.
20. Could this be useful for school administrators without technical
knowledge?
Yes — if packaged as a simple dashboard that highlights unusual cases and
explains why they were flagged.
❓ Why are you interested in this project?
Suggested Answer (Graduate Student Style):
I’ve always been interested in both education and technology, and this
project combines those two areas.
I find it fascinating how data science — especially machine learning —
can be used to uncover patterns that might not be visible otherwise.
The idea that we can identify students who are quietly succeeding or
silently struggling, just by analyzing patterns in data, really motivates me.
It’s not just about the algorithms — it’s about what we can do with those
insights to support students better.