Data Analytics for Healthcare Staff

Vanderbilt University Medical Centre (VUMC) faces challenges in staff scheduling due to large variations in daily surgical case volumes. The objectives are to resolve issues related to overstaffing or understaffing operating rooms. Auto Finance Ltd. experiences high default rates from customers, with 70% delaying repayments. The objective is to reduce losses from defaults. Scaleneworks aims to reduce costs from candidates accepting offers but then not joining companies. Easy Shopping wants to identify high-value customers for targeted marketing. Data analytics can help address these issues by finding patterns in data and making predictions to help decision making.

Uploaded by

Aditya Singh
DATA ANALYTICS USING R (DA-R)

SCENARIO 1
Staff Scheduling at VUMC
STAFF SCHEDULING AT VUMC

• Vanderbilt University Medical Centre (VUMC) is one of the leading hospitals.
• VUMC maintains 55 operating rooms across different sites.
• VUMC schedules elective (non-emergency) surgeries primarily on
weekdays.
VUMC OPERATIONS

• The charge nurse reports the schedule for the next day to the admin director.
• If the number of cases booked is low, the admin director decides to close some
operating rooms.
• The charge nurse also asks some operating room nurses to take a paid holiday.
• If the number of booked cases is high, the admin director asks the charge nurse to
call in extra operating room nurses.
[Diagram: Charge Nurse and Admin Director]
CHALLENGES AT VUMC

• VUMC assumes that surgeries would occur equally across all weekdays in a month.
• Recently, VUMC has observed a large variation in daily surgical case volume
(number of surgeries) to be performed.
• This is creating a major problem for surgical staff scheduling.
[Chart: Elective Surgeries 94%; Add-on Cases 6%]
POTENTIAL CAUSES OF VARIATION

• Surgeries are generally scheduled earlier in the week and earlier in the day.
• Sometimes no surgeries are scheduled in a week for various reasons.
• 6% of cases are add-on cases.
WHY IS STAFF SCHEDULING SO IMPORTANT?

Overstaffing
• It may not be possible to cancel staff at late notice (labour relations). Even if possible,
last-minute changes may hurt employee satisfaction, as most employees want
predictable schedules.

Understaffing
• May not be able to find someone available to work on short notice.
Understaffing of nurses may delay the surgeries.
OBJECTIVE

• To resolve issues related to staff scheduling.


SCENARIO 2
Finding Right Customers at Auto Finance Ltd.
FINDING RIGHT CUSTOMERS AT AUTO FINANCE LTD.
• Auto Finance Ltd. is a major player in the two-wheeler business in India.
• Many of the people buying two-wheelers belong to the lower-middle class of India
and do not have access to enough capital.
• Auto Finance Ltd. provides loans, typically at a fixed interest rate for 3-5 years,
to enable cash-strapped customers to buy the vehicle.
• The loan facility has enabled Auto Finance Ltd. to attract a new customer
segment.
CHALLENGES AT AUTO FINANCE LTD.

• Recently, Auto Finance Ltd. has faced a major issue.
• Around 70% of the customers have delayed the repayments.
• In order to decide whether to grant credit, the credit provider considers the
trade-off between the interest income and the possibility of borrower defaulting.
[Chart: Timely Payment 30%; Delayed Payment 70%]
OBJECTIVE

• To reduce the loss due to high default rate.


SCENARIO 3
Talent Acquisition by Scaleneworks
TALENT ACQUISITION BY SCALENEWORKS

• Scaleneworks, a Bangalore based start-up company, supports a number of IT
companies in India with talent acquisition.
• Advises its customers on status of modern talent acquisition practices.
• Recommends and implements individually tailored, viable solutions.
• Recently, the top management has observed that a number of persons have not
joined the organization even after accepting the offer.
BUSINESS PROBLEM

• A significant proportion of candidates do not join the company that made them
an offer.
• Owing to this, the cost of hiring has increased by between 10% and 15%.

Example:
• In an IT firm, suppose 12,000 offers are rolled out every year.
• At a 30% renege rate, approximately 3,600 candidates accept the offer and
then do not join the company.
• The company would have spent 15 man-hours per candidate in the recruitment
lifecycle.
• That is 54,000 man-hours wasted by one client alone.
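
The renege arithmetic above can be checked with a minimal Python sketch. The three input figures come from the slide; the variable names are illustrative assumptions.

```python
# Figures quoted on the slide; the variable names are illustrative only.
offers_per_year = 12_000      # offers rolled out every year
renege_rate = 0.30            # share of candidates who accept but never join
hours_per_candidate = 15      # man-hours spent per candidate in recruitment

# Candidates who accept the offer and then do not join.
reneged_candidates = round(offers_per_year * renege_rate)

# Recruitment effort spent on those candidates.
wasted_hours = reneged_candidates * hours_per_candidate

print(f"Candidates who accept but do not join: {reneged_candidates:,}")
print(f"Man-hours wasted: {wasted_hours:,}")
```

Changing `renege_rate` shows how sensitive the wasted effort is to even a small reduction in reneging.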
OBJECTIVE

• To reduce the cost associated with the candidates not joining the company even after
accepting the offer.
SCENARIO 4
Market Segmentation at Easy Shopping
BACKGROUND DETAILS

• Easy Shopping is a registered non-store online retail company.


• The company mainly sells unique all-occasion gifts.
BUSINESS PROBLEM

• Easy Shopping wants to run a targeted marketing campaign that will resonate with
high-value customers, but not with others.
• This targeted group will receive messages tailored to their needs and interests.
• However, Easy Shopping is clueless about the process of dividing customers into groups.
OBJECTIVE

• To identify high-value customers.


WHAT IS ANALYTICS?

Data → Analytics → Decision

Analytics is the discovery, interpretation, and communication of meaningful patterns in data; and the process of
applying those patterns towards effective decision making.
OBJECTIVE

• By using analytics, one can


✓Find Patterns
✓Understand the meaning of underlying patterns
✓Make Predictions
✓Recommend Decisions
SOME APPLICATIONS

• Hollywood studios predict the success of a screenplay, if produced.
• Netflix uses a recommendation system that predicts which movies you will like.
• Hewlett-Packard (HP) earmarks each and every one of its more than 300,000
worldwide employees according to “Flight Risk,” the expected chance he or she
will quit the job.
• Insurance companies predict who is going to crash a car or hurt themselves
another way.
• Target predicted pregnancy.

Source: Siegel, E. (2016). Predictive Analytics. Wiley.


ANALYTICS BASED DECISION-MAKING: KEY STEPS

1. Recognize the problem or question
2. Review previous findings
3. Select the variables and build hypothesis
4. Collect Data
5. Analyse Data
6. Present and act on the results
DIFFERENT TYPES OF DATA ANALYTICS

Source: https://www.sciencedirect.com/science/article/pii/S0148296318302480
DESCRIPTIVE ANALYTICS

• Descriptive Analytics consists of a set of techniques that describe what has happened in the past.
• Examples: Data Queries, Reports, Descriptive Statistics, Data Visualization, etc.
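
As a small illustration of descriptive statistics, the sketch below summarises a list of daily surgical case counts with Python's standard `statistics` module. The ten daily counts are made-up numbers for illustration only, not VUMC data.

```python
import statistics

# Hypothetical daily surgical case volumes (invented for illustration).
daily_cases = [98, 112, 105, 87, 120, 76, 109, 94, 101, 118]

# Descriptive analytics: summarise what has already happened.
mean_cases = statistics.mean(daily_cases)
median_cases = statistics.median(daily_cases)
spread = statistics.stdev(daily_cases)       # sample standard deviation
lowest, highest = min(daily_cases), max(daily_cases)

print(f"mean={mean_cases}, median={median_cases}, sd={spread:.1f}")
print(f"range: {lowest} to {highest}")
```

A wide range and a large standard deviation relative to the mean would be the kind of day-to-day variation that makes staff scheduling difficult.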
DIAGNOSTIC ANALYTICS

• Diagnostic analytics (as a natural extension of descriptive analytics) examines data or content to
answer the question “why did it happen?”
• It requires exploratory data analysis of the existing data or sometimes additional data
using tools and techniques such as visualization, data discovery, and data mining in order to
discover the root causes of a problem.
PREDICTIVE ANALYTICS

• Predictive analytics comprises the set of techniques that use models constructed from
past data to predict the future or to study the impact of one variable on another.
• Examples: Linear Regression, Logistic Regression, etc.
PRESCRIPTIVE ANALYTICS

• Prescriptive analytics provides a best course of action to take, i.e., the output from a
prescriptive analytics model is the best solution.
• A common example is portfolio models in finance, which determine the mix of
investments that yield the highest expected return while limiting the exposure to risk.
PREDICTION EFFECT
Application: Direct Marketing

Source: Siegel, E. (2016). Predictive Analytics. Wiley.


PREDICTION EFFECT

• Imagine you have a company with a mailing list of a million customers.
• Cost of sending a mail to each one is $2.
• Suppose you have observed that one out of 100 of them will buy your product
(i.e., 10,000 responses).
• Also your profit is $220 for each positive response.


PREDICTION EFFECT

• Overall Profit
= Revenue − Cost
= ($220 × 10,000 responses) − ($2 × 1 million)
= $200,000.
PREDICTION EFFECT

• Now suppose you use Predictive Analytics (PA) in the same context.
• Suppose PA earmarks a quarter of the entire list and says: “These folks are three
times more likely to respond than average!”
• So you now have a short list of 250,000 customers.
• At a 3% response rate, that gives 7,500 responses.


PREDICTION EFFECT

• Overall Profit
= Revenue − Cost

= ($220 × 7,500 responses) − ($2 × 250,000)
= $1,150,000.
• You just improved your profit 5.75 times
by mailing to fewer people.
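
The two profit calculations can be reproduced with a minimal Python sketch. The list sizes, response rates, profit per response and mailing cost come from the slides; the function name `campaign_profit` and its parameter names are illustrative assumptions.

```python
def campaign_profit(list_size, response_rate, profit_per_response, cost_per_mail):
    """Overall profit = profit from responses minus total mailing cost."""
    responses = round(list_size * response_rate)
    return profit_per_response * responses - cost_per_mail * list_size

# Mass mailing: one million customers, 1 in 100 responds.
mass_profit = campaign_profit(1_000_000, 0.01, 220, 2)

# Targeted mailing: a quarter of the list, three times the average response rate.
targeted_profit = campaign_profit(250_000, 0.03, 220, 2)

print(f"Mass mailing profit:     ${mass_profit:,}")
print(f"Targeted mailing profit: ${targeted_profit:,}")
print(f"Improvement factor:      {targeted_profit / mass_profit}")
```

Varying the targeted list size or the lift makes it easy to see when targeting stops paying off.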
DIRECT MARKETING

What’s predicted?
• Which customers will respond to marketing contact?

What’s done about it?


• Contact customers more likely to respond.
TYPES OF PROBLEMS
SUPERVISED LEARNING VS UNSUPERVISED LEARNING

• Supervised Learning: both X and Y are known
• Unsupervised Learning: only X is known
SUPERVISED LEARNING
• Supervised Learning is where both the predictor(s), 𝑋, and the response, 𝑌, are
observed.
• Main purpose is either to predict 𝑌 based on 𝑋 or to understand the relationship
between 𝑌 and 𝑋.
• Supervised learning problems can be further divided into regression and
classification problems based on the nature of 𝑌.
SUPERVISED LEARNING: TECHNIQUES

• Linear Regression
• Logistic Regression
• Decision Trees
• Bagging
• Random Forest
• Boosting
• Support Vector Machines

TYPES OF SUPERVISED LEARNING
REGRESSION PROBLEM

[Diagram: a Regression Model built from predictors 𝑋1, 𝑋2, 𝑋3 and a quantitative response 𝑌 produces a quantitative prediction Ŷ.]
REGRESSION PROBLEM: EXAMPLES

• Staff Scheduling at VUMC
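
To make the idea of a regression problem concrete, here is a minimal Python sketch of simple linear regression by ordinary least squares, written with the standard library only. The predictor (cases already booked the day before) and all data values are hypothetical assumptions for illustration; they are not VUMC data, and the slides do not say which predictors VUMC uses.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: cases booked the day before (X) and cases performed (Y).
booked_day_before = [80, 90, 100, 110, 120]
cases_performed = [88, 95, 107, 115, 127]

intercept, slope = fit_line(booked_day_before, cases_performed)

# Predict the quantitative response for a new day with 105 cases booked.
predicted_cases = intercept + slope * 105
print(f"Y_hat = {intercept:.2f} + {slope:.2f} * X")
print(f"Predicted cases for 105 bookings: {predicted_cases:.1f}")
```

The response here is quantitative (a count of surgeries), which is what makes this a regression rather than a classification problem.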


CLASSIFICATION PROBLEM

[Diagram: a Classification Model built from predictors 𝑋1, 𝑋2, 𝑋3 and a qualitative response 𝑌 produces qualitative predicted class labels.]
CLASSIFICATION PROBLEM: EXAMPLES

• Finding Right Customers at Auto Finance Ltd.


• Talent Acquisition by Scaleneworks
UNSUPERVISED LEARNING

• A set of statistical tools intended for the setting in which we have only a set of
features 𝑋1 , 𝑋2 , … , 𝑋𝑝 measured on 𝑛 observations.
• We are not interested in prediction, because we do not have an associated
response variable 𝑌.
• The goal is to discover interesting things about the measurements on
𝑋1 , 𝑋2 , … , 𝑋𝑝 .
OBJECTIVES IN UNSUPERVISED LEARNING

• Is there any informative way to visualize the data?
• Can we discover the subgroups among the variables or among the observations?
UNSUPERVISED LEARNING: TECHNIQUES

Principal Component Analysis


• a tool used for data visualization or data pre-processing
before supervised techniques are applied.
Clustering
• a broad class of methods for discovering unknown
subgroups in data.
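
One widely used clustering method is k-means. The sketch below is a bare-bones Python version for illustration, assuming two-dimensional points and a naive initialisation that takes the first k points as starting centroids; the data points are invented.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    # Naive deterministic start: the first k points act as initial centroids.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

# Invented observations forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels, centres = kmeans(points, k=2)
print(labels)
print(centres)
```

No response variable is involved: the subgroups are discovered from the features alone.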
UNSUPERVISED LEARNING: EXAMPLES

• Customer segmentation is the process of dividing customers into groups based on common
characteristics.
• The most common characteristics are demographics (e.g., age, gender, marital status,
income), psychographics (e.g., interests, lifestyle, group affiliations), geographical region,
and purchase behaviour (e.g., previously purchased items, shipping preferences, page
views on your website, etc.).
CUSTOMER SEGMENTATION AT EASY SHOPPING

• Easy Shopping decides to work with metrics such as each customer’s recency of last
purchase, frequency of purchase, and monetary value.
• These three variables, collectively known as RFM, are often used in customer
segmentation for marketing purposes.
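
The three RFM variables can be derived from a transaction log as sketched below in Python. The transaction records, customer IDs and snapshot date are hypothetical, and the function name `rfm_table` is an illustrative assumption. Recency is measured here in days since the last purchase, frequency as the number of purchases, and monetary value as total spend.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer id, purchase date, amount spent).
transactions = [
    ("C1", date(2023, 1, 5), 40.0),
    ("C1", date(2023, 3, 20), 60.0),
    ("C2", date(2022, 11, 2), 15.0),
    ("C3", date(2023, 3, 28), 200.0),
    ("C3", date(2023, 2, 14), 120.0),
    ("C3", date(2023, 1, 30), 80.0),
]
snapshot = date(2023, 4, 1)  # assumed "today" for measuring recency

def rfm_table(transactions, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, when, amount in transactions:
        if customer not in last_purchase or when > last_purchase[customer]:
            last_purchase[customer] = when
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }

rfm = rfm_table(transactions, snapshot)
for customer, (recency, freq, money) in sorted(rfm.items()):
    print(customer, recency, freq, money)
```

A table like this, one row per customer, is the kind of input a clustering method would then segment.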
UNSUPERVISED LEARNING: CHALLENGES

• The exercise tends to be more subjective, and there is no simple goal for the analysis,
such as prediction of a response.
• Unsupervised learning is often performed as a part of an exploratory data analysis.
• It can be very hard to assess the results obtained from the unsupervised learning
methods since there is no universally accepted mechanism for validating results on an
independent data set.
OBJECTIVES OF THIS COURSE

Introducing a number of supervised and unsupervised learning techniques.

Implementing all these techniques in R.


R: SOME OF THE BEST FEATURES

• Open Source Software, available on every major platform.


• Massive set of packages for visualization, statistical modelling, machine learning, and
importing and manipulating data.
• Readily available tools for data analysis.
• A great community. Easy to get help from experts.
• Readily available tools for communicating results.
• Can connect to high-performance programming languages like C, C++, Fortran.
(Check Advanced R by Hadley Wickham for more details)
COURSE PLAN

• Introduction to DA and R
• DOE
• Linear Regression, Best Subset Selection
• Logistic Regression
• Decision Trees, Bagging, Random Forest
• SVM
• PCA, Clustering
• Time Series Analysis
BOOKS

1. Seema Acharya (2018). Data Analytics using R. McGraw Hill Education [Ref 1]
2. James, G., Witten, D., Hastie, T. & Tibshirani, R. (2013). An Introduction to Statistical Learning:
with Applications in R. New York: Springer-Verlag. (web: http://www-bcf.usc.edu/~gareth/ISL/).
[Ref 2]
3. Hyndman, R. J. & Athanasopoulos, G. (2016). Forecasting: Principles and Practice. Otexts. (web:
https://www.otexts.org/fpp/) [Ref 3]
4. Lander, J. (2013). R for Everyone: Advanced Analytics and Graphics. New Jersey: Addison-Wesley.
5. Siegel, E. (2016). Predictive Analytics. Wiley.
INTERNET WEBSITES

• http://analytics-magazine.org/
• http://www.r-bloggers.com/
• https://stat.ethz.ch/mailman/listinfo/r-help
• http://stackoverflow.com/questions/tagged/r
• http://blog.revolutionanalytics.com/r/
• http://chance.amstat.org/
• http://www.statslife.org.uk/significance
JOURNALS

• Computational Statistics and Data Analysis (web:
https://www.journals.elsevier.com/computational-statistics-and-data-analysis/)
• Computational Statistics (web: https://link.springer.com/journal/180)
• Interfaces (web: https://pubsonline.informs.org/journal/inte)
• The R Journal (web: https://journal.r-project.org/)
• Journal of Statistical Software (web: http://www.jstatsoft.org/index)
EVALUATION COMPONENTS

• End Term Exam: 50%


• Class Participation: 10%
• Assignments: 20%
• Project: 20%
READING MATERIAL

• Section 2.1 of Ref 2


• Sub-sections 2.1.4 and 2.1.5 of Ref 2
• Section 2.3 of Ref 2
