Slide 1 – Title Slide:
Hello everyone,
My name is Utku Alperen
Today, I’ll be giving a preliminary presentation on my thesis project, which
explores how machine learning — specifically anomaly detection
algorithms — can help uncover students who perform unexpectedly high
or low in the PISA 2022 mathematics assessment.
This is a work in progress, so I’ll be focusing on the context, motivation,
and planned approach, rather than final results.
Slide 2 – Outline:
Before we dive into the details, here’s a quick overview of what I’ll be
covering in this presentation.
First, I’ll introduce the motivation behind this research.
Then, I’ll share the central research question that guides this project.
I’ll explain what PISA is, and why mathematical literacy — particularly in
this context — is socially and economically important.
After that, I’ll introduce the concept of anomaly detection and how it
applies to education.
Then I’ll walk through the methodology I plan to use, highlight some of
the candidate algorithms I’m exploring, and finally discuss the kinds of
insights I hope to uncover.
I’ll conclude with the next steps as this research develops further.
Slide 3 – What Is This Project About?:
This project aims to detect students who perform unexpectedly well or
poorly in mathematics, based on the PISA 2022 assessment.
Instead of focusing on average scores or trends, I’m interested in the
exceptions — students who defy what their background, attitudes, or
school conditions would predict.
To do this, I plan to use machine learning anomaly detection algorithms,
which are designed to find unusual patterns or outliers in large datasets.
The focus is specifically on mathematics literacy, which is the main
domain assessed in the 2022 PISA cycle.
I want to explore not just which students are outliers, but also what
characteristics define them — are they highly motivated, underserved,
disengaged, or something else entirely?
Since I’m still in the early stages of this thesis, I’ll be sharing a design-
focused view today — including background, planned methods, and
expected contributions.
Slide 4 – Main Research Question:
At the core of this project is a guiding research question:
“How effective are anomaly detection algorithms in identifying
students with unexpectedly high or low mathematics literacy scores in
PISA 2022, and what characteristics distinguish these outliers?”
This question has two key parts.
First, it's about algorithmic performance: Can these models actually
detect students who are behaving in surprising ways — for example, a
student with low socioeconomic status scoring extremely high, or the other
way around?
Second, it’s about understanding those outliers once they’ve been
identified. What do they have in common? Are there recurring patterns in
their backgrounds, attitudes, or school environments?
The project aims to bridge both machine learning evaluation and
educational interpretation — not just identifying anomalies, but making
sense of them in a meaningful way.
Slide 5 – What is PISA?:
To ground this project, it's important to briefly explain what PISA is.
PISA stands for the Programme for International Student Assessment, an
international standardized assessment organized by the Organisation for
Economic Co-operation and Development (OECD).
It takes place every three years and evaluates the performance of 15-year-
old students across more than 80 countries. The core domains are reading,
science, and mathematics, and each cycle emphasizes one of these.
In the 2022 cycle, mathematics is the primary focus, which makes it
particularly relevant for this project.
Beyond scores, PISA collects a wide array of contextual information from
students, parents, and schools — such as socioeconomic background,
access to learning resources, teaching quality, and student attitudes.
This makes it not just a test dataset, but a rich educational database — ideal
for identifying complex patterns, including outlier behavior.
Slide 6 – Why Does Mathematical Literacy Matter?:
So, why focus on mathematical literacy in particular?
First, on a personal level, students with stronger math skills tend to have
better outcomes in life — including higher income, greater access to stable
employment, and better preparation for careers in science, technology,
engineering, and math (STEM).
But there’s also a national-level impact. Researchers have found a strong
link between math scores and a country’s economic growth.
Specifically, they estimate that when a country’s average math score improves
by one standard deviation, its GDP growth rate — GDP being the Gross
Domestic Product, the total value of all goods and services produced in a
country — increases by around two percentage points per year.
In other words, better math education doesn't just help individuals — it
strengthens economies.
However, math performance is also an equity issue. Students from low
socioeconomic backgrounds — often measured in PISA using indicators
like HISEI, the Highest International Socio-Economic Index of occupational
status — tend to have lower achievement levels.
That said, there are exceptions. Some disadvantaged students succeed
greatly, while others with many advantages fall behind. These outliers are
precisely the focus of this project. By identifying them, we can uncover
stories of resilience or hidden barriers — and that insight could be valuable
for both policy and pedagogy.
Slide 7 – What is Anomaly Detection?:
With that context in mind, let’s turn to the machine learning concept at the
core of this project — anomaly detection.
Anomaly detection refers to a group of techniques that identify data points
that deviate significantly from the rest. In other words, they're used to
detect unusual or unexpected patterns in large datasets.
It’s used in many areas: for example, to detect credit card fraud, spot
malfunctions in medical devices, or even to flag security breaches in
computer networks.
What makes it powerful in education is that it allows us to spot students
who don't follow the typical pattern — without needing to predefine what
counts as “good” or “bad” performance.
There are different types of anomalies:
A point anomaly is a single data point that looks very different — for
example, a student who performs way above or below the average.
A contextual anomaly is one that’s only unusual given the
surrounding information — like a high score that’s surprising given a
student's background, school environment, or socioeconomic status.
A collective anomaly involves a group that behaves differently — for
instance, a school where all students underperform despite high
resources.
In this project, I'm especially interested in contextual anomalies, because
PISA provides a lot of background data that helps us ask:
“Is this student’s performance surprising given their context?”
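To make the idea of a contextual anomaly more concrete, here is a small, purely illustrative Python sketch — synthetic data and a simple regression residual, not the method this project will actually use:

```python
# Toy illustration of a contextual anomaly (synthetic data, not the project's method).
import numpy as np

rng = np.random.default_rng(42)
ses = rng.normal(0, 1, 500)                       # stand-in for a socioeconomic index
score = 480 + 35 * ses + rng.normal(0, 40, 500)   # math scores loosely tied to SES

# Add one student: low SES but a high score. The score alone is not extreme
# (no point anomaly), yet it is surprising given the student's background.
ses = np.append(ses, -2.0)
score = np.append(score, 600.0)

# Residuals from a simple linear fit of score on SES expose the contextual outlier.
slope, intercept = np.polyfit(ses, score, 1)
residuals = score - (intercept + slope * ses)
print("residual of the added student:", round(residuals[-1], 1))
```

The added student’s residual is several times larger than the typical noise in the data, even though their raw score would not stand out on its own — that is exactly the kind of case contextual anomaly detection is meant to surface.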
Slide 8 – How Will This Project Be Approached?:
Now let me walk you through the general approach I plan to take for this
project.
First, I’ll be working with the Turkey dataset from PISA 2022. This subset
provides enough student responses, along with rich contextual information,
to apply machine learning meaningfully.
Second, the approach will focus on unsupervised learning, meaning that we
won’t label students in advance as outliers or not. Instead, the algorithms
will detect anomalies based on patterns in the data.
I plan to explore and compare multiple anomaly detection algorithms. The
idea isn’t just to apply one method, but to investigate how different
techniques identify outliers — and whether they agree or not.
Once students are flagged as outliers, I’ll analyze their characteristics:
Do they have low or high HISEI scores (a measure of parental occupational
status)?
Are they highly motivated or disengaged, based on PISA survey items like
interest in math and self-efficacy?
Do they come from supportive or challenging school environments?
The goal is not just to detect outliers, but to interpret the findings through
an educational lens — asking what these outliers reveal about the
education system and student resilience.
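As a small illustration of that last step, here is a hypothetical sketch of how flagged students’ profiles could be compared with everyone else’s once an algorithm has produced the flags (the column names and values below are made up):

```python
# Hypothetical profiling step: compare backgrounds of flagged vs. non-flagged students.
import pandas as pd

students = pd.DataFrame({
    "HISEI":         [32, 78, 45, 61, 29, 88],               # parental occupational status
    "MATH_INTEREST": [3.1, 1.2, 2.5, 2.8, 3.6, 1.0],         # made-up attitude scale
    "flagged":       [True, True, False, False, True, False] # output of an anomaly detector
})

# Average profile of flagged vs. non-flagged students
profile = students.groupby("flagged")[["HISEI", "MATH_INTEREST"]].mean()
print(profile)
```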
Slide 9 – Overview of Candidate Algorithms:
In this slide, I’ll briefly introduce the three anomaly detection algorithms
I’m currently exploring.
These methods are quite different from one another — which is intentional
— because I want to compare how well each performs in the context of
educational data.
First is Isolation Forest, which isolates data points using random tree
structures. The idea is that anomalies can be separated from the rest of the
data with fewer splits — so the model measures how “easy” it is to isolate
a student from the crowd.
Second is One-Class SVM — a kernel-based algorithm that learns the
shape of the normal data and then flags anything that falls outside this
shape. It's commonly used in fraud detection and bioinformatics, but it is
increasingly being applied to behavioral data too.
And finally, I plan to explore Autoencoders — a type of neural network
that attempts to reconstruct its input data. If the reconstruction error is
high, it suggests that the input didn’t follow the learned patterns — making
it a likely anomaly.
These three models offer a mix of interpretability, efficiency, and
sensitivity to different types of patterns. As I move forward, I’ll evaluate
which one (or combination) performs best on the PISA dataset.
Slide 10 – Isolation Forest:
Let’s start with the first algorithm: Isolation Forest.
The key idea here is surprisingly simple: anomalies — by definition — are
rare and different. So, they’re easier to separate from the rest of the data.
This method works by randomly selecting a feature and then randomly
choosing a split value between the maximum and minimum of that feature.
It continues splitting until the point is isolated.
A normal student might require many splits to be separated from others,
because they share similar patterns. But an anomalous student will likely
be isolated with fewer cuts.
The model then calculates something called the average path length —
basically, how deep down the tree a point had to go before being isolated.
Points with shorter path lengths are more likely to be outliers.
What makes Isolation Forest suitable here is that it is:
Fast and scalable,
Unsupervised, so it requires no prior labeling of students, and
Effective in high-dimensional spaces, which is the case with PISA data,
since students are described by dozens of background variables.
In the context of this project, it helps answer the question:
“Which students can be separated from the norm based on a few critical
features?”
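As a minimal sketch of how this could look in practice — with a synthetic stand-in for the preprocessed PISA feature matrix, and a contamination value that is only an assumed tuning choice, not a final decision — scikit-learn’s implementation would be used roughly like this:

```python
# Minimal Isolation Forest sketch; X is a synthetic stand-in for preprocessed features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))

iso = IsolationForest(
    n_estimators=200,
    contamination=0.05,   # assumed share of outliers; a tuning choice, not a known value
    random_state=0,
)
labels = iso.fit_predict(X)          # -1 = flagged as an anomaly, 1 = normal
scores = iso.decision_function(X)    # lower score = shorter average path = more anomalous
flagged_students = np.where(labels == -1)[0]
```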
Slide 11 – One-Class SVM:
The second algorithm I plan to explore is One-Class Support Vector
Machine, or One-Class SVM.
This model works quite differently from Isolation Forest. Rather than
splitting the data, it tries to learn the boundary that encloses most of the
“normal” data.
Once that boundary is established, any data point — in our case, any
student — that falls outside the boundary is labeled as an anomaly.
The reason this is called a “One-Class” SVM is because it doesn’t require
examples of outliers during training — it only uses the “normal” class to
learn the data distribution.
One of the powerful aspects of this method is the kernel trick, which
allows it to capture non-linear relationships by projecting data into a
higher-dimensional space. This is useful because educational data often
contains complex, non-linear structures — for example, a student might
perform poorly not just due to one factor, but because of an interaction
between multiple background variables.
However, One-Class SVM has some challenges:
It can be sensitive to parameter choices, especially the nu and gamma
values: nu roughly bounds the fraction of points treated as outliers, while
gamma controls how tightly the boundary follows the data.
It works best when the normal data is well-clustered — which may
or may not hold true for a diverse student population.
Still, it's a valuable algorithm to consider — particularly because it offers a
geometric perspective on what counts as “normal” performance.
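A comparable sketch for One-Class SVM, again on synthetic stand-in data and with nu and gamma values that are only starting points rather than tuned choices, might look like this:

```python
# Minimal One-Class SVM sketch; scaling matters because the RBF kernel is distance-based.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))               # stand-in for preprocessed student features
X_scaled = StandardScaler().fit_transform(X)

ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")  # nu ~ assumed outlier fraction
labels = ocsvm.fit_predict(X_scaled)          # -1 = outside the learned boundary
scores = ocsvm.decision_function(X_scaled)    # signed distance to the boundary
```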
Slide 12 – Autoencoders:
The third method I’m considering is Autoencoders, a type of neural
network designed to reconstruct its input.
In simple terms, an autoencoder has two parts:
An encoder that compresses the input into a smaller latent
representation, and
A decoder that tries to reconstruct the original data from that
compressed form.
During training, the model learns to minimize the reconstruction error —
which is the difference between the original input and the reconstructed
output.
Here’s where anomaly detection comes in: if the model has been trained
mostly on “normal” data, it will reconstruct those cases well.
But when it sees something unusual or not seen before, the reconstruction
error will be high — and that’s how we can detect an outlier.
In the context of this project, autoencoders are attractive because they:
Handle high-dimensional data very well (which is the case in
PISA),
Can learn non-linear relationships between variables, and
Are particularly useful when the structure of “normal” is complex
and multi-factorial — which is often true in education.
The main downside is interpretability. Neural networks are less
transparent than tree-based or geometric models like the previous two.
However, I plan to address that using tools like SHAP or layer-wise
relevance propagation, if needed.
Overall, autoencoders add depth to the modeling toolkit — especially
when patterns are too subtle for simpler models to capture.
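To make the reconstruction-error idea concrete, here is a minimal Keras sketch on synthetic data; the layer sizes and the 95th-percentile cut-off are assumptions for illustration, not design decisions:

```python
# Minimal autoencoder sketch: train to reconstruct the input, then flag high-error cases.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")   # stand-in for standardized features

inputs = keras.Input(shape=(20,))
encoded = layers.Dense(16, activation="relu")(inputs)
latent = layers.Dense(8, activation="relu")(encoded)       # compressed representation
decoded = layers.Dense(16, activation="relu")(latent)
outputs = layers.Dense(20, activation="linear")(decoded)   # reconstruction of the input

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=30, batch_size=64, verbose=0)

reconstruction = autoencoder.predict(X, verbose=0)
errors = np.mean((X - reconstruction) ** 2, axis=1)   # per-student reconstruction error
threshold = np.quantile(errors, 0.95)                 # assumed cut-off: top 5% flagged
flagged = errors > threshold
```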
Slide 13 – What Might Outliers Look Like?:
Now let me illustrate what kind of students might be identified as outliers
by these models, using hypothetical examples based on how PISA
variables typically appear.
Student A is a positive outlier.
Despite coming from a lower socioeconomic background — indicated by a
low HISEI score (which reflects parental occupational status) and minimal
resources at home (HOMEPOS) — this student scores well above
average in mathematics.
Their PISA responses show strong self-efficacy (ST90Q01) and high
interest in math (ST88Q01), suggesting strong internal motivation and
possibly good teaching support.
In contrast, Student B is a negative outlier.
They come from a well-off background — high HISEI, access to books,
internet, and study space — yet their math score is significantly below
what we’d expect.
They report low self-confidence and disengagement, despite attending a
school with good academic support ratings (SC04Q01).
These are the kinds of students that anomaly detection can help identify —
those who might otherwise go unnoticed because they don’t match the
typical trend.
Exploring their profiles more closely could provide insights into resilience,
underachievement, and how school and family environments shape
performance.
Slide 14 – How Will This Be Evaluated?:
One of the challenges in this project is that we’re working with
unsupervised learning — which means we don’t have a “correct answer”
or labeled data to tell us who the outliers are.
Because of this, traditional accuracy metrics like precision or recall are
difficult to apply directly. Instead, my evaluation strategy will include a
mix of quantitative and qualitative approaches.
On the quantitative side:
I will look at how many students are flagged by each algorithm,
and whether there is overlap or disagreement between them.
I’ll also examine the distribution of anomaly scores — for
example, are they clustered tightly, or do we see a clear separation
between normal and anomalous students?
On the qualitative side:
I will analyze the profiles of flagged students to see if they are
meaningfully unexpected.
For interpretability, I plan to use tools like SHAP (SHapley Additive
exPlanations), which assign importance values to each feature for a
given prediction. This helps to understand why a student was
flagged as an outlier.
Ultimately, the goal is not just to find anomalies mathematically, but to
ensure that the results are educationally meaningful — that they highlight
students whose performance deserves attention, support, or further study.
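As a small sketch of the quantitative side — how many students each model flags and how strongly the flagged sets overlap — the comparison could be as simple as this (the index sets below are invented for illustration):

```python
# Hypothetical comparison of which students two models flag, using set overlap (Jaccard).
iso_flagged = {12, 47, 88, 150, 203}   # indices flagged by Isolation Forest (made up)
svm_flagged = {12, 47, 99, 150, 310}   # indices flagged by One-Class SVM (made up)

overlap = iso_flagged & svm_flagged
jaccard = len(overlap) / len(iso_flagged | svm_flagged)
print(f"flagged by both: {sorted(overlap)}  Jaccard overlap: {jaccard:.2f}")
```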
Slide 15 – Why Does This Matter for Educators and Policymakers?:
Although this is a technically focused project, the end goal is educational
— to provide meaningful insights that could help teachers, school leaders,
and policymakers.
For example, if a student is underperforming despite having strong
family and school resources, it might indicate issues like mental health
challenges, disengagement, or classroom mismatches. That’s someone
who could benefit from personalized support.
On the other hand, if a student is excelling despite structural
disadvantages — low socioeconomic status, lack of learning materials at
home, etc. — it’s important to understand what’s working for them. Are
they internally motivated? Is there a particularly effective teacher or school
program supporting them?
These are not just interesting stories — they’re actionable cases that can
improve how we design interventions.
The bigger picture is that outlier detection, when done thoughtfully, can
enhance fairness. Instead of treating all students with a “one-size-fits-all”
model, we can differentiate based on actual needs and strengths.
For policymakers, this kind of analysis could help refine how resources
are distributed — by identifying schools or student populations that
require attention not because of averages, but because of anomalies.
In short, the aim is to use data science not just to predict, but to support
human development in more precise and inclusive ways.
Slide 16 – What Comes Next?:
As I mentioned earlier, this presentation reflects an early stage of the
research. The work is ongoing, and the next steps are focused on moving
from planning into actual model development and analysis.
The first task is to finalize data preprocessing — making decisions on
how to handle missing data, normalize variables, and select which features
will go into the models. Since PISA includes a lot of categorical and
ordinal variables, this step requires careful thought.
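As a rough sketch of what that preprocessing could look like — with a tiny synthetic table and illustrative PISA-style column names rather than a finalized feature list — one option using scikit-learn is:

```python
# Sketch of a preprocessing pipeline: impute and scale numeric indices, impute and
# one-hot encode categorical ones. The data frame is a tiny synthetic stand-in.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

students_df = pd.DataFrame({
    "HISEI":   [32.0, np.nan, 61.0, 78.0],
    "HOMEPOS": [-0.5, 0.3, np.nan, 1.2],
    "GENDER":  ["female", "male", np.nan, "female"],
})

numeric_cols = ["HISEI", "HOMEPOS"]
categorical_cols = ["GENDER"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical_cols),
])
X_prepared = preprocess.fit_transform(students_df)
```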
Once the data is prepared, I will move forward with selecting and
implementing the anomaly detection algorithms. While I’ve introduced
three strong candidates, I may adjust or combine them based on
experimentation.
After running the models, I’ll compare their outputs — not just
quantitatively, but also by analyzing the students they flag. The goal is to
see if the flagged outliers really are unexpected based on their context, and
what traits set them apart.
If time allows and the models prove successful, a longer-term goal is to
expand the analysis across countries. Since PISA is a global dataset, it
offers a unique opportunity to investigate how outlier patterns differ by
cultural, economic, or policy context.
Overall, the focus is on building a robust, interpretable, and educationally
meaningful tool for understanding exceptional student performance —
both positive and negative.
Slide 17 – Summary/Conclusion:
To wrap up, let me briefly summarize what this project is all about.
The core idea is to use machine learning — specifically anomaly
detection — to identify students who perform unexpectedly in the PISA
2022 mathematics assessment.
That includes both:
Positive outliers, who do better than expected given their
background, and
Negative outliers, who underperform despite apparent advantages.
The plan is to explore and compare several anomaly detection algorithms,
including Isolation Forest, One-Class SVM, and Autoencoders. Each
brings something different to the table — speed, flexibility, or complexity
— and the aim is to see which yields the most meaningful and interpretable
insights.
Although this work is in its early, exploratory phase, I’ve tried to
establish a clear motivation, research question, and methodological
roadmap.
Ultimately, this project is about using AI tools to support educational
equity — to identify students whose performance might otherwise be
missed and understand the factors that shape their success or struggle.
Slide 18 – Thank You:
That concludes my presentation.
Thank you for your attention.
I’m happy to answer any questions.
SHAP (SHapley Additive exPlanations) is a method used to explain how
much each feature (like SES or motivation) contributes to a model’s
prediction.
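For reference, a minimal, hypothetical sketch of how SHAP’s model-agnostic KernelExplainer could be pointed at an anomaly score (here an Isolation Forest on synthetic data) might look like this:

```python
# Hypothetical SHAP sketch: explain which features drive one student's anomaly score.
import numpy as np
import shap
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))                 # synthetic stand-in for preprocessed features
iso = IsolationForest(random_state=0).fit(X)

# KernelExplainer is model-agnostic: it only needs a scoring function and background data.
explainer = shap.KernelExplainer(iso.decision_function, shap.sample(X, 50))
shap_values = explainer.shap_values(X[:1])    # per-feature contributions for one student
```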
1. Why is anomaly detection a better fit than traditional regression or
classification?
Because I’m not trying to predict average scores — I want to find students
whose results are unexpected. Anomaly detection helps flag those rare
cases without needing labels.
2. How do you define a “true outlier” in this context?
A student whose math score is very high or low compared to what we’d
expect based on their background — like SES or motivation. I check if
flagged cases make sense based on these variables.
3. How will you handle missing data in PISA?
Most likely with imputation — for example, using the median for
numerical features. I’ll avoid dropping rows unless missingness is very
high.
4. How do you prepare PISA’s categorical data for your models?
I’ll use encoding techniques. One-hot encoding for algorithms like
Isolation Forest, and label encoding or embeddings for neural networks
like autoencoders.
5. How will you select features from so many variables?
I’ll use variables that are supported in the literature — like socioeconomic
indicators and math-related attitudes. Later, I might use SHAP to refine
them.
6. What’s the main challenge of using unsupervised learning in
education?
There’s no clear label, so it’s hard to measure accuracy. Also, some outliers
might not be meaningful without context.
7. What assumptions do the algorithms make?
Isolation Forest assumes anomalies are easy to isolate. One-Class SVM
assumes normal data is dense. Autoencoders assume anomalies are hard to
reconstruct.
8. How will you compare the models without labeled data?
By comparing how many outliers they each detect, looking at overlap, and
interpreting the flagged students’ backgrounds.
9. What if two algorithms flag different outliers?
That’s possible. It could mean they’re picking up on different types of
anomalies — which might both be valid in different ways.
10. Can cultural factors affect what we see as an outlier?
Yes — what’s normal or expected varies by country. That’s why I’m
starting with just Turkey.
🧩 Practical & Ethical Questions
11. How could this help teachers?
It could help spot students who are struggling unexpectedly — or those
doing really well and deserve more support.
12. What ethical concerns come with labeling students as outliers?
We need to be careful not to stigmatize students. These models should
guide attention, not judgment.
13. How do you avoid misinterpreting outliers?
I’ll always check flagged cases in context. A high anomaly score doesn’t
mean something’s wrong — just that it’s unusual.
14. Could some outliers just be data errors?
Yes — I’ll cross-check with multiple variables and look for signs of
inconsistency or missing values.
15. Would this work for all countries in PISA?
Technically yes, but interpretation would vary. Cultural and systemic
differences affect how we define “unexpected” performance.
🚀 Future Direction Questions
16. Could you apply this to reading or science scores too?
Definitely. The method would be similar — just with domain-specific
features.
17. Can you combine the models for better results?
Yes, an ensemble could improve reliability — for example, by flagging
students only when two models agree.
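A tiny sketch of that agreement rule (with made-up boolean flags) would look like:

```python
# Flag a student only if at least two of the three models mark them as an outlier.
import numpy as np

iso_flags = np.array([True, False, True, False])
svm_flags = np.array([True, True, False, False])
ae_flags  = np.array([False, True, True, False])

votes = iso_flags.astype(int) + svm_flags.astype(int) + ae_flags.astype(int)
consensus_flags = votes >= 2
```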
18. How could you know if flagged students are actually important
long-term?
If we had follow-up data, we could track their outcomes. Right now, it’s
more of a descriptive insight.
19. Would longitudinal data improve your results?
Yes, absolutely. Seeing how a student changes over time would help
identify sudden drops or gains more accurately.
20. Could this be useful for school administrators without technical
knowledge?
Yes — if packaged as a simple dashboard that highlights unusual cases and
explains why they were flagged.
❓ Why are you interested in this project?
Suggested Answer (Graduate Student Style):
I’ve always been interested in both education and technology, and this
project combines those two areas.
I find it fascinating how data science — especially machine learning —
can be used to uncover patterns that might not be visible otherwise.
The idea that we can identify students who are quietly succeeding or
silently struggling, just by analyzing patterns in data, really motivates me.
It’s not just about the algorithms — it’s about what we can do with those
insights to support students better.