Deploying Open Learning Analytics at a National Scale
Lessons from the Real World
Oct 2016
Michael Webb
Director of Technology and Analytics, Jisc
Eitel J.M. Lauría, PhD
Professor of Information Technology & Systems, Marist College
Kate Valenti
Vice President of Operations, Unicon
Agenda
» Strategic View
›A brief introduction to Learning Analytics
›National issues in the UK
» Technical View
›Open architecture
›Predictive modeling
» Implementation View
›Trends and tactics from the field
» Discussion
A brief introduction to Learning Analytics
Our working definition...
What do we mean by Learning Analytics?
» The application of big data techniques such as machine learning and data mining to help learners and institutions meet their goals:
› For our project:
– Improve retention (current project)
– Improve achievement (current project)
– Improve employability (current project)
– Improve learning design (later stage)
Learning Analytics stages get progressively “smarter”
» Basic Analytics: what has happened
» Automated Analytics: what is happening
» Predictive Analytics: what might happen
A Strategic Overview
National and institutional strategic issues
National issues in the UK: Retention
» 16-18 Education:
› 178,100 students aged 16-18 failed to finish (2012/13)
› costing the UK £814 million a year
» Undergraduates:
› 8% of undergraduates drop out in their first year of study
› This costs universities up to £33,000 per student
National issues in the UK: Differential achievement
» Parental background and ethnicity impact achievement
» Which behaviours are associated with lower-than-expected academic achievement?
National issues in the UK: Teaching excellence framework
Technical Overview
What does the architecture look like?
Jisc’s Learning Analytics project
Three core strands:
» Learning Analytics architecture and service
» Toolkit
» Community
Toolkit: Code of practice
Jisc Learning Analytics architecture
What?
» Building a national architecture
» Defined standards and models
» Implementation with core services
Why?
» Standards mean models, visualisations and so on can be shared
» Lower cost per institution through shared infrastructure
» Lower barrier to innovation: the underpinning work is already done
What do we mean by an open architecture?
» All APIs published, with a process for engaging in their development
» Open standards and definitions
›Data models and definitions released under Creative Commons
›Developed openly on GitHub
» All core elements open source or open specification (e.g. Creative Commons)
» Freedom to implement both commercial and open solutions as the non-core elements
Jisc Learning Analytics open architecture: core
» Data collection: Student Records, VLE, Library, Self Declared Data
» Data storage and analysis: Learning Records Warehouse, Learning Analytics Processor
» Presentation and action: Staff Dashboards, Student App, Alert and Intervention system, DataExplorer, Consent
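The data-collection layer rests on open standards such as xAPI (mentioned later in this deck) feeding a Learning Records Warehouse such as Learning Locker. As a minimal sketch, here is how a VLE event might be posted as an xAPI statement from Python; the LRS endpoint URL, credentials, and activity identifiers are hypothetical placeholders.

```python
# Minimal sketch: send an xAPI statement to a Learning Records Store (LRS).
# The endpoint URL, credentials, and identifiers below are hypothetical.
import requests

LRS_ENDPOINT = "https://lrs.example.ac.uk/data/xAPI/statements"  # hypothetical
AUTH = ("lrs_key", "lrs_secret")  # hypothetical Basic-auth credentials

statement = {
    "actor": {
        "objectType": "Agent",
        "mbox": "mailto:student@example.ac.uk",
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/launched",
        "display": {"en-GB": "launched"},
    },
    "object": {
        "objectType": "Activity",
        "id": "https://vle.example.ac.uk/course/econ101",
        "definition": {"name": {"en-GB": "ECON101 course site"}},
    },
}

response = requests.post(
    LRS_ENDPOINT,
    json=statement,
    auth=AUTH,
    headers={"X-Experience-API-Version": "1.0.3"},
)
response.raise_for_status()  # the LRS returns the new statement id(s) on success
print(response.json())
```

Because every institution emits statements against the same open data model, dashboards and predictive models built on the warehouse can be shared rather than rebuilt per institution.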
Meanwhile, in the US...
Learning Analytics Processor: Predictive Modeling Framework
Motivation: Alarming Stats in 2010
» 36% 4-year completion rate across all four-year institutions in the US
›21% for Black students
›25% for Hispanic students
» 58% 6-year completion rate for four-year institutions
›40% for Black students
›49% for Hispanic students
» 41% of 25-to-34-year-olds hold an associate degree or higher (US ranked 12th among 36 developed nations)
Sources: U.S. Dept. of Education, Postsecondary Education Data System (2009); College Board, Advocacy & Policy Center, The Completion Agenda 2011 Progress Report
Open Academic Analytics Initiative @ Marist
EDUCAUSE Next Generation Learning Challenges (NGLC) grant
Funded by Bill and Melinda Gates Foundation
Use machine learning to find patterns in large datasets as a means to predict student academic performance.
Create an “early alert” framework:
• Predict academically at-risk students in the initial weeks of a course
• Deploy interventions to improve their chances of success
Based on an open ecosystem for academic analytics:
• Sakai Collaboration and Learning Environment
• Pentaho Business Intelligence Suite (Kettle + Weka)
• Collaboration with commercial vendors (IBM SPSS Modeler)
Learning Analytics Processor @ Marist: Early Alert
How does it actually work? (binary classification problem)
Single-node architecture:
» Hardware platform: IBM zEnterprise 114 with BladeCenter Extension (zBX); virtualized 64-bit servers, 16/32 GB RAM, Red Hat Linux
» Data sources, loaded into relational storage via extraction, transformation & loading (ETL):
›Student academic data: SATs, GPA, HS ranking, course size, course grade (the target feature)
›Student demographic data: age, gender, ethnicity, income level
›LMS event log data: sessions, resources, lessons, assignments, forums, tests
›LMS gradebook data: partial contributions to the final grade
» Predictive model building: classifiers learnt from data (logistic regression, SVMs, naïve Bayes, J48 decision trees)
» Scoring: predictions on new student data, early in the semester, using a library of persisted learnt classifiers
» Output: prediction of at-risk students, followed by intervention
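To make the binary classification step concrete, here is a minimal sketch in Python with scikit-learn. The project itself used Weka and IBM SPSS Modeler; the CSV file and feature names below are hypothetical stand-ins for the ETL output described above.

```python
# Minimal sketch of the early-alert binary classification step, in
# scikit-learn rather than the Weka/SPSS tooling the project actually used.
# The CSV file and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# One row per student-course enrolment; 'at_risk' is the binary target
# derived from the final course grade.
df = pd.read_csv("student_course_features.csv")  # hypothetical ETL extract
features = ["gpa", "sat_score", "lms_sessions", "forum_posts", "partial_grade"]

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["at_risk"], test_size=0.25, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Persist scores for the intervention workflow: probability of being at risk.
risk_scores = model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, model.predict(X_test)))
```

In production the trained classifiers are persisted and reused to score new-student data early in each semester, so interventions can start before grades suffer.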
Learning Analytics Processor @ Marist: Early Alert
New iteration: cluster computing architecture
» Hardware platform (dev): Linux VMs (32 GB RAM) running on an IBM PureFlex System
» Distributed storage (HDFS) & processing, with job scheduling
» Data sources, via extraction, transformation & loading:
›Current: student academic data, student demographic data, LMS event log data, LMS gradebook data
›Future: library data, student engagement data, social network data, and more…
» Predictive model building: classifiers learnt from data (logistic regression, random forests, naïve Bayes)
» Scoring: predictions on new student data, early in the semester, using a library of persisted learnt classifiers
» Output: prediction of at-risk students, followed by intervention
» Scales well for Big Data use cases (more volume & variety)
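A minimal sketch of the same training-and-scoring flow on the cluster architecture, using Apache Spark’s MLlib (Spark is named later in this deck); the HDFS paths and column names are hypothetical.

```python
# Minimal sketch of the cluster iteration with Apache Spark MLlib.
# HDFS paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier

spark = SparkSession.builder.appName("early-alert").getOrCreate()

# Feature table produced by the ETL stage, stored on HDFS.
df = spark.read.parquet("hdfs:///analytics/student_course_features")

assembler = VectorAssembler(
    inputCols=["gpa", "sat_score", "lms_sessions", "forum_posts", "partial_grade"],
    outputCol="features",
)

rf = RandomForestClassifier(labelCol="at_risk", featuresCol="features", numTrees=100)
model = rf.fit(assembler.transform(df))

# Score new-student data early in the semester and keep the at-risk probability.
new_students = assembler.transform(
    spark.read.parquet("hdfs:///analytics/new_student_features")
)
predictions = model.transform(new_students).select(
    "student_id", "probability", "prediction"
)
predictions.write.mode("overwrite").parquet("hdfs:///analytics/at_risk_predictions")
```

Distributing both storage (HDFS) and model building across the cluster is what lets the same pipeline absorb the planned future sources (library, engagement, social network data) without a redesign.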
Promising Outcomes
Phase II: Cluster Computing                  Accuracy  Recall  FP Rate
Marist
- 3 semesters, 25K records each                 86%      87%     14%
North Carolina State University
- 3 semesters, 160K records each                81%      77%     18%
- 3 semesters, online, 85K records each         80%      82%     19%
Jisc Project:
• 260,000 records
• 4 institutions (Aberystwyth University, University of Gloucestershire, Cardiff
Metropolitan University, University of Greenwich)
• Results due in December 2016
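For reference, the accuracy, recall, and false-positive-rate columns in the table above are all derived from a confusion matrix. A small worked example, with made-up counts:

```python
# Illustrative only: how accuracy, recall, and false-positive rate in the
# results table are defined, using a made-up confusion matrix.
tp, fn = 860, 140   # at-risk students correctly / incorrectly classified
fp, tn = 140, 860   # not-at-risk students incorrectly / correctly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions correct
recall = tp / (tp + fn)                      # share of at-risk students caught
fp_rate = fp / (fp + tn)                     # share of safe students flagged

print(f"accuracy={accuracy:.0%} recall={recall:.0%} FP rate={fp_rate:.0%}")
# accuracy=86% recall=86% FP rate=14%
```

For an early-alert system the trade-off between recall (at-risk students caught) and false-positive rate (advisers’ time spent on students who were fine) matters more than raw accuracy.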
Implementation View
Trends and tactics from the field
Jisc project in numbers: 101 · 35 · 24 · 12 (+ 20)
Discovery activity assesses institutional readiness
» Goal: assess institutional readiness (think organizational maturity)
» Measured on 26 factors crossing organizational and technical considerations
» Approximately 60% of the first 11 institutions are ready to implement Learning Analytics technology solutions
Source: Moving the Red Queen Forward, Educause Review, September/October 2016, Dahlstrom
Varied activities show adoption flexibility

Profile: Russell Group
Aim: Retention of widening-participation students + support for students to achieve a 2.1 or better
Activity: Discovery + Tribal Insight + Learning Locker
Data sources: Moodle + Student Records

Profile: Research-led university
Aim: Retention, improve teaching, empower students
Activity: Discovery + Open Source Suite + Student App
Data sources: Moodle + Attendance + Student Records

Profile: Teaching-led university with WP mission
Aim: Retention; make identifying at-risk students more efficient so staff can focus on interventions
Activity: Tribal Insight + Learning Locker
Data sources: Blackboard + Attendance + Student Records

Profile: Research-led university
Aim: Student engagement
Activity: Discovery + Student App + Learning Locker
Data sources: Moodle + Student Records

Profile: Teaching-led university
Aim: Understanding of how Learning Analytics can be used
Activity: Discovery + Technical Integration
Data sources: Moodle
Organizational Trends
» Top-level support is critical
» A change culture makes things easier
» Red tape is real (in policy management)
» Academics are looking for evidence-based results
It’s (almost) all about change management
Tactics
» Convene a Learning Analytics committee (include students)
» Identify champions and advocates
» Adjust existing policies rather than creating new ones
» Pilot the solution
Technical Trends
» Institutional infrastructure for data collection requires improvement
» Unified data management is desired but not yet realized
» Data quality issues are common
» Integration with existing infrastructure is a challenge
» Doing more with the same technical staff
It’s (almost) all about the data
Tactics
» Perform data audits and quality checks, early and often (see the sketch below)
» Look for “all-inclusive” offerings (predictions)
» Look for integration options
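As one illustration of “audit early and often”, here is a minimal data-quality check in Python with pandas; the extract file and column names are hypothetical.

```python
# Minimal data-audit sketch for the "early and often" tactic.
# The extract file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("student_records_extract.csv")  # hypothetical extract

# Completeness: missing values per column.
print(df.isna().sum())

# Uniqueness: duplicate student identifiers usually signal a broken join.
print("duplicate ids:", df["student_id"].duplicated().sum())

# Validity: flag values outside the expected range.
print("bad GPAs:", ((df["gpa"] < 0) | (df["gpa"] > 4.0)).sum())
```

Running checks like these before each pilot iteration catches broken feeds and bad joins while they are still cheap to fix.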
Pilot Trends
» Customizations are required to meet institutional needs
» Ditto for integrations
» Data gathering effort is considerable
» Did we mention data quality?
Keep it simple, snowflakes
Tactics
» Simplify for the pilot; add complexity later
» Overall, only add components you don’t already have
» Flexibility (by institution and by vendors) is key
Q&A
Interested in more detail?
» Data quality challenges
» Predictive model research
» Data collection, UDD, xAPI recipes, use of standards
» Spark, ETL flows
» ?
Michael Webb
michael.webb@jisc.ac.uk
Eitel J.M. Lauría, PhD
eitel.lauria@marist.edu
Kate Valenti
kvalenti@unicon.net
