Individual Assignments
Course: BUSI 4063 Business Intelligence and Analytics 11/11(20U-C-BC-9A)
Assignments will be marked on comprehensiveness, presentation quality, form, and content. The extent
to which a student has demonstrated or mastered these criteria will be assessed according to the
following grade standards. Submissions must be presented in the manner requested for each particular
assignment. Unless otherwise directed, assignments should be uploaded in R Markdown format, or as an
R script with accompanying files if required.
Unit 2: Values, Data Types and Data Structures in R,
Assignment 1
In this assignment students are to apply introductory analytical skills and become
familiar with R and the RStudio environment.
Frank has 3 bags of gold, each containing 10 ounces.
Sally adds 5 ounces to one of his bags, 3 ounces to another, and 12 ounces to the last.
Use vectors to show how much gold Frank has in each bag.
Upload R code to Assignment 1 using R Markdown in a Word (docx) format.
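A minimal sketch of the element-wise vector arithmetic involved, assuming the three additions are applied to the bags in the order listed. The names `bags`, `added`, and `bags_after` are illustrative and not part of the assignment.

```r
# Starting contents: three bags of 10 ounces each
bags <- rep(10, 3)

# Ounces Sally adds to each bag, in the order given in the prompt
added <- c(5, 3, 12)

# Vector addition is element-wise, so each bag gets its own increment
bags_after <- bags + added
print(bags_after)
```

Because `+` works position by position on vectors of equal length, no loop is needed.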
Unit 3: Input, Output, and Control Structures in R,
Assignment 2
In this assignment students become familiar with packages, importing data, and
manipulating dataframes.
Import a csv file into R as a dataframe.
Print out one cell from the dataframe.
Split the dataframe into two pieces, and print both halves out separately.
Add up all the values in one column of the dataframe.
Upload R code to Assignment 2 using R Markdown in a Word (docx) format.
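The steps above could be sketched as follows. To keep the example self-contained it first writes a small made-up CSV to a temporary file; the column names (`id`, `score`, `group`) and values are hypothetical, and in practice `read.csv()` would point at your own file.

```r
# Write a small example CSV to a temporary file so the sketch runs on its own
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:6,
             score = c(70, 85, 90, 65, 80, 95),
             group = c("a", "b", "a", "b", "a", "b")),
  csv_path, row.names = FALSE
)

# Import the CSV as a dataframe
df <- read.csv(csv_path)

# Print one cell: row 2 of the "score" column
print(df[2, "score"])

# Split the dataframe into two halves by row and print each
half <- nrow(df) %/% 2
first_half  <- df[1:half, ]
second_half <- df[(half + 1):nrow(df), ]
print(first_half)
print(second_half)

# Add up all the values in one column
total_score <- sum(df$score)
print(total_score)
```

Splitting by row is only one reading of "two pieces"; splitting by column (for example `df[, 1:2]` and `df[, 3, drop = FALSE]`) would also fit the wording.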
Unit 4: Describing and Visualizing Data,
Assignment 3
In this assignment students become familiar with measures of central tendency and with
plotting data visually.
Create a dataframe with 5 columns.
Generate the following using your dataframe:
Boxplot
Scatterplot
Histogram
Calculate the standard deviation of the data in one column.
Replace one of the datapoints with an outlier, and generate a new boxplot showing the
outlier.
Upload R code to Assignment 3 using R Markdown in a Word (docx) format.
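One possible shape for these steps, using base R graphics only. The dataframe is filled with randomly generated, made-up values; the column names `a` to `e` and the outlier value `500` are assumptions chosen for illustration.

```r
set.seed(42)  # make the random example data reproducible

# Illustrative dataframe with 5 numeric columns (values are made up)
df <- data.frame(
  a = rnorm(30, mean = 50, sd = 5),
  b = rnorm(30, mean = 60, sd = 8),
  c = runif(30, min = 0, max = 100),
  d = rpois(30, lambda = 20),
  e = seq_len(30)
)

# The three requested plots
boxplot(df, main = "Boxplot of all columns")
plot(df$a, df$b, main = "Scatterplot of a vs b", xlab = "a", ylab = "b")
hist(df$c, main = "Histogram of c", xlab = "c")

# Standard deviation of one column
sd_a <- sd(df$a)
print(sd_a)

# Replace one data point with an extreme value and redraw the boxplot
df_outlier <- df
df_outlier$a[1] <- 500
boxplot(df_outlier$a, main = "Column a with an outlier")

# boxplot.stats() lists the points that the boxplot draws as outliers
print(boxplot.stats(df_outlier$a)$out)
```

Working on a copy (`df_outlier`) keeps the original data intact, so the two boxplots can be compared side by side.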
https://www.youtube.com/watch?v=ePD96i0YHII
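# Example vectors (placeholder values) so the plots below run as written
some_numbers <- c(2, 4, 3, 7, 5)
different_numbers <- c(9, 6, 8, 4, 1)
# plot() on a single vector puts the index on the x-axis and the values on the y-axis
plot(some_numbers, type = "b", col = "green", main = "Class Examples",
     xlab = "Number Sequence", ylab = "Value of Numbers")
# Other types: "l" (lines), "b" (both points and lines), "h" (vertical lines), "s" (stair steps)
plot(different_numbers, type = "b", col = "red", main = "Class Examples",
     xlab = "Number Sequence", ylab = "Value of Numbers")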
Unit 5: Introduction to Machine Learning,
Assignment 4
In this assignment students become familiar with the various data types, and how
to change those types within a set of data. They also grow their knowledge of
supervised machine learning.
Create a dataframe with 5 columns.
Change the data type in only two of the columns.
Describe in comments why predictive models must be supervised.
Upload R code to Assignment 4 using R Markdown in a Word (docx) format.
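A short sketch of changing column types with base R's `as.*` conversion functions. The dataframe, its column names, and the choice of which two columns to convert are all hypothetical.

```r
# Made-up dataframe with 5 columns of mixed types
df <- data.frame(
  id     = 1:4,
  name   = c("Ann", "Bob", "Cy", "Dee"),
  region = c("east", "west", "east", "west"),
  sales  = c("100", "250", "175", "300"),  # numbers stored as text
  active = c(TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE                 # keep text as character on older R versions
)
str(df)

# Change the type of exactly two columns
df$region <- as.factor(df$region)   # character -> factor
df$sales  <- as.numeric(df$sales)   # character -> numeric
str(df)
```

Calling `str()` before and after makes the type change visible in the knitted output.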
Unit 7: Naïve Bayes Classifier and K-Nearest Neighbors
Supervised Machine Learning,
Assignment 5
In this assignment students learn to analyze data using a machine learning
classifier. Students will predict an outcome based on input data.
Build a Naïve Bayes Classifier.
Use three columns of data (variables) to predict a value in a fourth column.
Build it entirely in R so it will run standing alone (i.e., do not import an Excel file).
Value to be predicted: Will Nathan mow the lawn? (a FACTOR variable with two
levels).
Use a confusion matrix to demonstrate how accurate the model is, given the variables
available.
Upload R code to Assignment 5 using R Markdown in a Word (docx) format.
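To illustrate the idea, here is a hand-rolled categorical Naïve Bayes in base R, so that it runs without installing anything. This is a sketch under stated assumptions, not the course's required method: the data, the predictor names (`weather`, `grass`, `paid`), and the helper `predict_nb` are invented, add-one smoothing is an added choice, and the model is scored on its own training rows. Packages such as e1071 offer a ready-made `naiveBayes()` function that may be what the course expects.

```r
# Made-up data: three categorical predictors and a two-level factor outcome
lawn <- data.frame(
  weather = factor(c("sunny", "sunny", "rainy", "rainy", "sunny", "rainy",
                     "sunny", "sunny", "rainy", "sunny", "rainy", "sunny")),
  grass   = factor(c("long", "short", "long", "short", "long", "long",
                     "short", "long", "short", "long", "long", "short")),
  paid    = factor(c("yes", "no", "no", "no", "yes", "yes",
                     "yes", "no", "no", "yes", "no", "no")),
  mows    = factor(c("yes", "no", "no", "no", "yes", "no",
                     "yes", "yes", "no", "yes", "no", "no"))
)
predictors <- c("weather", "grass", "paid")

# Prior P(class), and for each predictor P(value | class) with add-one smoothing
prior <- prop.table(table(lawn$mows))
cond <- lapply(predictors, function(p) {
  counts <- table(lawn[[p]], lawn$mows) + 1
  prop.table(counts, margin = 2)   # each class column sums to 1
})
names(cond) <- predictors

# Score each class as prior * product of conditionals; return the best class
predict_nb <- function(row) {
  scores <- sapply(names(prior), function(cls) {
    lik <- sapply(predictors, function(p) cond[[p]][as.character(row[[p]]), cls])
    as.numeric(prior[cls]) * prod(lik)
  })
  names(scores)[which.max(scores)]
}

predicted <- factor(
  sapply(seq_len(nrow(lawn)), function(i) predict_nb(lawn[i, ])),
  levels = levels(lawn$mows)
)

# Confusion matrix: rows are actual outcomes, columns are predictions
confusion <- table(actual = lawn$mows, predicted = predicted)
print(confusion)

accuracy <- sum(diag(confusion)) / sum(confusion)
print(accuracy)
```

The diagonal of the confusion matrix counts correct predictions. Accuracy measured on the training rows tends to be optimistic, so a held-out test set would give a fairer picture.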
Unit 8: Decision Trees and Random Forests Supervised Machine
Learning,
Assignment 6
In this assignment students display their knowledge of dissimilar classifiers, as
well as their methods and uses.
In essay format explain the differences and similarities between how a decision tree
functions and how K-NN functions.
Upload your assignment in Word format to Assignment 6.
Unit 9: Hierarchical Clustering, K-Means Clustering, and
Market Basket Analysis Unsupervised Machine Learning,
Assignment 7
In this assignment students explore unsupervised learning models and clustering
methods.
RStudio has provided a number of demonstrations on how to develop machine
learning models. Follow the k-Means Clustering demo using the Iris dataset here:
https://rpubs.com/Nitika/kmeans_Iris#:~:text=Let%E2%80%99s%20begin%20with%20our%20clustering%20task%20on%20Iris,R%20Studio%20Console.%201.%20Load%20and%20view%20dataset.
Build the model, and produce commented code explaining each step.
Upload R code to Assignment 7 using R Markdown in a Word (docx) format.
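For orientation, a minimal k-means run on the built-in `iris` data using base R's `kmeans()`. This is not a reproduction of the linked demo; the choice of `centers = 3`, `nstart = 20`, and the seed are assumptions made for illustration.

```r
# Load the built-in Iris dataset
data(iris)

# Keep only the four numeric measurements; the Species label is left out
# because clustering is unsupervised
features <- iris[, 1:4]

# k-means starts from random centers, so fix the seed for reproducibility
set.seed(123)

# Fit 3 clusters, trying 20 random starts and keeping the best one
km <- kmeans(features, centers = 3, nstart = 20)

# Cluster centers in the original measurement units
print(km$centers)

# Compare the clusters found with the known species (for inspection only)
print(table(species = iris$Species, cluster = km$cluster))
```

The cross-table is a sanity check rather than part of the model: the algorithm never sees `Species`, and the cluster numbers themselves are arbitrary labels.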
Unit 10: Unsupervised Machine Learning and Future Trends,
Assignment 8
In this assignment students summarize their machine learning knowledge, and
refresh their exploratory data analysis and visualization skills in a practical
business example.
Explain the difference between supervised and unsupervised machine learning.
Imagine you work in a marketing department and need to divide your customers into
market segments. Which type of machine learning would you use, supervised or
unsupervised, and which R package would you implement?
Import the data1.csv file into RStudio.
Summarize the data.
Remove the last column, and create a boxplot from the remaining columns.
Create a scatterplot of column 1 and column 3.
Calculate the correlation between column 1 and 3.
Explain the relationship between column 1 and 3 using your scatterplot and
correlation calculation as evidence.
If you applied a Naïve Bayes classifier to this data to predict the last column, would
you include the last column in the training set? Why or why not?
Upload R code to Assignment 8 using R Markdown in a Word (docx) format.
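The data-handling steps above might look like the following. The contents of data1.csv are not shown in this document, so the sketch builds a stand-in dataframe with made-up columns (`col1`, `col2`, `col3`, `label`); with the real file, the stand-in would be replaced by a `read.csv()` call.

```r
set.seed(1)  # reproducible stand-in data

# Stand-in for data1.csv (hypothetical columns; the real file will differ).
# With the real file: data1 <- read.csv("data1.csv")
x <- rnorm(50, mean = 10, sd = 2)
data1 <- data.frame(
  col1  = x,
  col2  = rnorm(50, mean = 5, sd = 1),
  col3  = 2 * x + rnorm(50, mean = 0, sd = 1),
  label = factor(sample(c("yes", "no"), 50, replace = TRUE))
)

# Summarize the data
print(summary(data1))

# Remove the last column, then boxplot the remaining columns
features <- data1[, -ncol(data1)]
boxplot(features, main = "Remaining columns")

# Scatterplot of column 1 against column 3
plot(features[[1]], features[[3]],
     main = "Column 1 vs column 3", xlab = "Column 1", ylab = "Column 3")

# Pearson correlation between column 1 and column 3
r <- cor(features[[1]], features[[3]])
print(r)
```

Using `-ncol(data1)` drops the last column by position, so the code does not depend on the column's name.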