R Programming Lab

The document outlines the vision and mission of a center focused on nurturing computer professionals through research, innovation, and ethical values. It details program educational objectives and specific outcomes for students, emphasizing skills in software engineering, higher education, and societal contributions. Additionally, it provides a comprehensive list of R programming tasks and concepts, including installation, data manipulation, and statistical analysis.

Uploaded by

kondaveetiaruna

VISION

To evolve as a centre of excellence for nurturing computer professionals with research and
innovation skills, inculcating moral values and societal concerns.

MISSION
M1: To transform students into creative computer engineers to meet global challenges.
M2: To produce competent and quality professionals by imparting computer concepts and
techniques and a zest for research and higher studies.
M3: To build entrepreneur skills and leadership qualities in the students by inculcating the spirit
of ethical values.

Program Educational Objectives

PEO 1: To excel in their careers as competent software engineers in IT and allied organizations with enriched curriculum and pedagogical initiatives.
PEO 2: To pursue higher education and demonstrate a research temper in providing solutions to engineering problems.
PEO 3: To contribute to societal development and engage in lifelong learning by exhibiting leadership through professional, social and ethical values.

Program Specific Outcomes


PSO1: Apply principles and practices of computer science and engineering to design computational solutions.
PSO2: Develop solutions in the areas of database management, software design and computing systems using machine intelligence.

LIST OF PROGRAMS:
1. Download and install the R programming environment and install basic packages using the install.packages() command in R.
2. Learn the basics of R programming (data types, variables, operators, etc.).
3. Implement R loops with different examples.
4. Learn the basics of functions in R and implement them with examples.
5. Implement data frames in R. Write a program to join columns and rows in a data frame using cbind() and rbind() in R.
6. Implement different string manipulation functions in R.
7. Implement different data structures in R (vectors, lists, data frames).
8. Write a program to read a CSV file and analyze the data in the file in R.
9. Create pie charts and bar charts using R.
10. Create a dataset and do statistical analysis on the data using R.
11. Write an R program to find correlation and covariance.
12. Write an R program for regression modeling.
13. Write an R program to build a classification model using the KNN algorithm.
14. Write an R program to build a clustering model using the K-means algorithm.

Brief Introduction of R Programming Language:

R is an open-source programming language that is widely used as statistical software and as a data analysis tool. R ships with a command-line interface and is available on widely used platforms such as Windows, Linux, and macOS.

It was designed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team. R is an implementation of the S programming language, combined with lexical scoping semantics inspired by Scheme. The project was conceived in 1992, with an initial version released in 1995 and a stable beta version in 2000.

Use of R Programming:
• It is a platform-independent language, which means it can be used on all major operating systems.
• It is a free, open-source language, so anyone can install it in any organization without purchasing a license.
• R is used as a leading tool for machine learning, statistics, and data analysis. Objects, functions, and packages can easily be created in R.
• R is not only a statistics package; it can also be integrated with other languages (C, C++), so it can easily interact with many data sources and statistical packages.
• R has a vast community of users, and it is growing day by day.
• R is currently one of the most requested programming languages in the data science job market.

1. Installation of RStudio on Windows:

Step 1: With R-base installed, let's move on to installing RStudio. To begin, go to the RStudio download page and click the download button for RStudio Desktop.

Step 2: Click the link for the Windows version of RStudio and save the .exe file.

Step 3: Run the .exe and follow the installation instructions:
• Click Next on the welcome window.
• Enter or browse to the installation folder and click Next to proceed.
• Select the folder for the Start menu shortcut, or choose not to create shortcuts, and then click Next.
• Wait for the installation process to complete.
• Click Finish to end the installation.

Install the R Packages:

• First, run RStudio.
• Click the Packages tab, then click Install. The Install Packages dialog box will appear.
• In the Install Packages dialog, type the name of the package you want to install in the Packages field and then click Install. This installs the package you searched for, or gives you a list of matching packages based on the text you typed.

Installing Packages:
install.packages("PackageName")
# Install the package named "XML".
install.packages("XML")

Loading Packages:
Once the package is downloaded to your computer, you can access the functions and resources provided by the package by loading it:
# Load the package to use in the current R session
library(packagename)

Getting Help on Packages:
Installed packages are stored in the R library directory, for example:
"C:/Program Files/R/R-3.2.2/library"

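The install-then-load workflow described above can be sketched as a single script. This is only an illustration: `ensure_package` is a hypothetical helper name, and the built-in `stats` package is used so that nothing actually has to be downloaded.

```r
# Hypothetical helper: install a package only when it is not already available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs network access and a configured CRAN mirror
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

ok <- ensure_package("stats")   # "stats" ships with R, so nothing is downloaded
print(ok)
print(.libPaths())              # library directories that R searches for packages
```

For a package that is not yet installed, the same call would trigger install.packages() first; library(packagename) is then used to attach it for the session.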
2. Learn all the basics of R programming (data types, variables, operators, etc.)

Program Description:

Variables are reserved memory locations used to store values: when you create a variable, you reserve some space in memory.
A variable provides named storage that our programs can manipulate. A variable in R can store an atomic vector, a group of atomic vectors, or a combination of many R objects. A valid variable name consists of letters, numbers, and the dot or underscore characters. The name starts with a letter, or with a dot not followed by a number.
An operator is a symbol that tells the interpreter to perform specific mathematical or logical manipulations. R is rich in built-in operators and provides arithmetic, relational, logical, and assignment operators.

Data Types:

Numeric:
v <- 23.5
print(class(v))

Logical:
v <- TRUE
print(class(v))

Integer:
v <- 2L
print(class(v))

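As a complement to the three data types shown above, the following sketch lists one value of each atomic type and compares class() with typeof(). The variable names are illustrative only.

```r
# One value of each atomic type; class() reports the high-level type,
# typeof() the internal storage mode.
values <- list(
  numeric   = 23.5,
  integer   = 2L,
  logical   = TRUE,
  complex   = 2 + 5i,
  character = "TRUE",
  raw       = charToRaw("A")
)

classes <- sapply(values, class)
types   <- sapply(values, typeof)
print(classes)
print(types)   # note: typeof(23.5) is "double", while class(23.5) is "numeric"
```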
R-objects:
• Vectors
• Lists
• Matrices
• Arrays
• Factors
• Data Frames

Vectors
When you want to create a vector with more than one element, use the c() function, which combines the elements into a vector.
# Create a vector.
apple <- c('red', 'green', "yellow")
print(apple)

# Get the class of the vector.
print(class(apple))

Lists
A list is an R object which can contain many different types of elements, such as vectors, functions, and even another list.

# Create a list.
list1 <- list(c(2, 5, 3), 21.3, sin)

# Print the list.
print(list1)

Matrices

A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix() function.

# Create a matrix.
M = matrix(c('a', 'a', 'b', 'c', 'b', 'a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)

Arrays
While matrices are confined to two dimensions, arrays can have any number of dimensions. The array() function takes a dim attribute, which creates the required number of dimensions. In the example below we create an array with two elements, each of which is a 3x3 matrix.

# Create an array.
a <- array(c('green', 'yellow'), dim = c(3, 3, 2))
print(a)

Factors
Factors are R objects created from a vector. A factor stores the vector along with the distinct values of its elements as labels. The labels are always character, irrespective of whether the input vector is numeric, character, or Boolean. Factors are useful in statistical modeling.

Factors are created using the factor() function. The nlevels() function gives the count of levels.

# Create a vector.
apple_colors <- c('green', 'green', 'yellow', 'red', 'red', 'red', 'green')

# Create a factor object.
factor_apple <- factor(apple_colors)

# Print the factor.
print(factor_apple)
print(nlevels(factor_apple))

Output:
[1] green  green  yellow red    red    red    green
Levels: green red yellow
[1] 3

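To make the "vector plus labels" description of factors concrete, this short sketch inspects the levels, the per-level counts, and the integer codes of the same apple_colors factor. It is an illustration only, using base R functions.

```r
apple_colors <- c('green', 'green', 'yellow', 'red', 'red', 'red', 'green')
factor_apple <- factor(apple_colors)

lv     <- levels(factor_apple)      # distinct labels, sorted alphabetically
counts <- table(factor_apple)       # number of observations per level
codes  <- as.integer(factor_apple)  # integer code stored for each element

print(lv)
print(counts)
print(codes)
```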
Variables:

Variables can be assigned values using the leftward, rightward, and equal-to operators. The values of variables can be printed using the print() or cat() function. The cat() function combines multiple items into a continuous print output.

# Assignment using equal operator.
var.1 = c(0, 1, 2, 3)
# Assignment using leftward operator.
var.2 <- c("learn", "R")
# Assignment using rightward operator.
c(TRUE, 1) -> var.3

print(var.1)
cat("var.1 is", var.1, "\n")
cat("var.2 is", var.2, "\n")
cat("var.3 is", var.3, "\n")

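Two details of variables are easy to check in code: how c() coerces mixed values, and which names are syntactically valid. The sketch below is illustrative; `var.4` and the candidate names are made up, and make.names() is used here only as a validity check.

```r
# Mixing types inside c() coerces everything to the most general type.
var.3 <- c(TRUE, 1)        # logical + numeric   -> numeric
var.4 <- c("learn", 1)     # character + numeric -> character
print(class(var.3))
print(class(var.4))

# make.names() returns its input unchanged only for syntactically valid names.
candidates <- c("var_name", ".var", "var.1", "2var", ".2var", "_var")
is_valid <- candidates == make.names(candidates)
print(data.frame(candidates, is_valid))
```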
R Operators:

Types of Operators

Arithmetic Operators
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v + t)

Relational Operators
v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)
print(v > t)

Logical Operators
v <- c(3, 1, TRUE, 2+3i)
t <- c(4, 1, FALSE, 2+3i)
print(v & t)

Assignment Operators
v1 <- c(3, 1, TRUE, 2+3i)
v2 <<- c(3, 1, TRUE, 2+3i)
v3 = c(3, 1, TRUE, 2+3i)
print(v1)
print(v2)
print(v3)

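The operator examples above use only + , > and &. As a hedged supplement, this sketch shows a few more base R operators and the recycling rule that applies when vectors differ in length.

```r
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)

print(v %% t)    # remainder of v divided by t
print(v %/% t)   # integer (floor) division
print(v ^ t)     # exponentiation

# A shorter vector is recycled to match the longer one.
recycled <- c(1, 2, 3, 4) + c(10, 20)
print(recycled)

# & compares element by element; && works on single values and returns one value.
print(c(TRUE, FALSE) & c(TRUE, TRUE))
print(TRUE && FALSE)
```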
3. Implement R loops with different examples.

Program Description:
A for loop is one of the most commonly used control flow statements. It iterates over the elements of a vector (or list) and runs its body once for each element. It differs from the while loop: a for loop has no condition to test and simply stops when the elements are exhausted, whereas a while loop checks its condition before every execution of the body.

# Create fruit vector
fruit <- c('Apple', 'Orange', "Guava", 'Pineapple', 'Banana', 'Grapes')
# Create the for statement
for (i in fruit) {
  print(i)
}

# Creating a matrix
mat <- matrix(data = seq(10, 21, by = 1), nrow = 6, ncol = 2)
# Creating the loop with r and c to iterate over the matrix
for (r in 1:nrow(mat))
  for (c in 1:ncol(mat))
    print(paste("mat[", r, ",", c, "]=", mat[r, c]))
print(mat)

R while loop:

A while loop is a control flow statement that repeats a block of code as long as a condition holds. The loop terminates when the Boolean expression evaluates to FALSE.

In a while loop, the condition is checked first and the body is executed only if it is TRUE. As a result, for a body that runs n times the condition is checked n+1 times.

v <- c("Hello", "while loop")
cnt <- 2
while (cnt < 7) {
  print(v)
  cnt = cnt + 1
}

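The "n+1 checks" behaviour of while, and the next and break statements that often accompany loops, can be sketched as follows. The names `counter`, `still_running` and `odd_squares` are hypothetical and chosen only for this illustration.

```r
# Count how often the while condition is evaluated (counter is a plain environment).
counter <- new.env()
counter$checks <- 0
cnt <- 0
still_running <- function() {
  counter$checks <- counter$checks + 1
  cnt < 3
}
while (still_running()) {
  cnt <- cnt + 1
}
print(counter$checks)   # the body ran 3 times, the condition was checked 4 times

# next skips the rest of the current iteration; break leaves the loop.
odd_squares <- c()
for (i in 1:10) {
  if (i %% 2 == 0) next
  if (i > 7) break
  odd_squares <- c(odd_squares, i^2)
}
print(odd_squares)
```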
4. Learn the basics of functions in R and implement them with examples.

Program Description:

A function is a set of statements organized together to perform a specific task. R has a large number of built-in functions, and users can create their own functions.

In R, a function is an object, so the R interpreter is able to pass control to the function, along with any arguments the function needs to accomplish its actions.

The function in turn performs its task and returns control to the interpreter, along with any result, which may be stored in other objects.

Built-in Function

# Create a sequence of numbers from 32 to 44.
print(seq(32, 44))

# Find mean of numbers from 25 to 82.
print(mean(25:82))

# Find sum of numbers from 41 to 68.
print(sum(41:68))

User-defined Function
We can create user-defined functions in R. They are specific to what a user wants, and once created they can be used like the built-in functions. Below is an example of how a function is created and used.

# Create a function to print squares of numbers in sequence.
new.function <- function(a) {
  for (i in 1:a) {
    b <- i^2
    print(b)
  }
}

# Call the function new.function supplying 6 as an argument.
new.function(6)

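User-defined functions can also take default and named arguments and return a value instead of printing it. A minimal sketch, where `power_sum` is a hypothetical function written only for this example:

```r
# Hypothetical example: sum of the p-th powers of 1..n, with p defaulting to 2.
power_sum <- function(n, p = 2) {
  total <- sum((1:n)^p)   # vectorised: no explicit loop needed
  return(total)
}

print(power_sum(6))             # uses the default p = 2
print(power_sum(3, p = 3))      # arguments can be passed by name
print(power_sum(p = 1, n = 4))  # named arguments may appear in any order
```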
5. Implement data frames in R. Write a program to join columns and rows in a data frame using cbind() and rbind() in R.

Program Description:

A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.

# Creating vector objects
Name <- c("Shubham Rastogi", "Nishka Jain", "Gunjan Garg", "Sumit Chaudhary")
Address <- c("Moradabad", "Etah", "Sambhal", "Khurja")
Marks <- c(255, 355, 455, 655)

# Combining vectors into one data frame.
# cbind() on plain vectors would return a character matrix, so start from a data frame.
info <- cbind(data.frame(Name, Address, stringsAsFactors = FALSE), Marks)

# Printing data frame
print(info)

# Creating another data frame with similar columns
new.stuinfo <- data.frame(
  Name = c("Deepmala", "Arun"),
  Address = c("Khurja", "Moradabad"),
  Marks = c(755, 855),
  stringsAsFactors = FALSE
)

# Printing a header.
cat("### The second data frame\n")

# Printing the data frame.
print(new.stuinfo)

# Combining rows from both the data frames.
all.info <- rbind(info, new.stuinfo)

# Printing a header.
cat("### The combined data frame\n")

# Printing the result.
print(all.info)

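One point worth checking when joining columns and rows is what type of object comes back. The sketch below, with made-up sample values, contrasts cbind() on plain vectors with cbind() on a data frame, and shows that rbind() on data frames matches columns by name.

```r
Name  <- c("A", "B")          # made-up sample values
Marks <- c(255, 355)

m  <- cbind(Name, Marks)      # plain vectors -> character matrix
df <- cbind(data.frame(Name, stringsAsFactors = FALSE), Marks)  # Marks stays numeric

print(class(m[, "Marks"]))
print(class(df$Marks))

# rbind() on data frames matches columns by name, so the column order may differ.
extra <- data.frame(Marks = 455, Name = "C", stringsAsFactors = FALSE)
combined <- rbind(df, extra)
print(combined)
print(dim(combined))
```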
6. Implement different string manipulation functions in R

Program Description:
String manipulation refers to the process of handling and analyzing strings. It involves operations that modify and parse strings in order to use and change their data. R offers a series of built-in functions to manipulate the contents of a string. In this section, we will study different functions concerned with the manipulation of strings in R.

Concatenation of Strings
String concatenation is the technique of combining two or more strings. It can be done in several ways:

• paste() function: Any number of strings can be concatenated using the paste() function to form a larger string. The function takes a separator argument, 'sep', which is placed between the individual strings, and another argument, 'collapse', which controls whether the elements of the result are joined into a single larger string. By default, the value of collapse is NULL.

pr-1

# R program for string concatenation
# Concatenation using paste() function
str <- paste("Learn", "Code")
print(str)

pr-2

# Concatenation using cat() function
# cat() prints its arguments directly and returns NULL, so the result is not stored in a variable.
cat("learn", "code", "tech", sep = ":")

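The difference between the sep and collapse arguments of paste(), along with a few other base R string functions, can be sketched as follows. The sample strings are illustrative only.

```r
words <- c("learn", "code", "tech")

joined_sep      <- paste("Learn", "Code", sep = "-")   # sep goes between arguments
joined_collapse <- paste(words, collapse = ":")        # collapse joins a vector into one string
print(joined_sep)
print(joined_collapse)

# A few other base R string functions.
s <- "R Programming Lab"
print(nchar(s))                 # number of characters
print(toupper(s))               # upper-case copy
print(substr(s, 3, 13))         # characters 3 to 13
print(strsplit(s, " ")[[1]])    # split on spaces
print(sub("Lab", "Manual", s))  # replace the first match
```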
7. Implement different data structures in R (vectors, lists, data frames)

Program Description:

Vectors are the most basic R data objects, and there are six types of atomic vectors: logical, integer, double, complex, character and raw.
Lists are R objects which contain elements of different types, such as numbers, strings, vectors and even another list. A list can also contain a matrix or a function as its elements. A list is created using the list() function.

Vectors
# Create a vector.
apple <- c('red', 'green', "yellow")
print(apple)

# Get the class of the vector.
print(class(apple))

Lists
A list is an R object which can contain many different types of elements, such as vectors, functions, and even another list.

# Create a list.
list1 <- list(c(2, 5, 3), 21.3, sin)

# Print the list.
print(list1)

Output:
[[1]]
[1] 2 5 3

[[2]]
[1] 21.3

[[3]]
function (x)  .Primitive("sin")

Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix() function.
# Create a matrix.
M = matrix(c('a', 'a', 'b', 'c', 'b', 'a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)

Data Frames:

# Create a data frame
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# Print the data frame
Data_Frame

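Once a list or data frame exists, its elements are usually accessed by name. A minimal sketch, where the `student` list holds made-up values and Data_Frame repeats the data frame from this section:

```r
# Named list elements can be read with $ or [[ ]].
student <- list(name = "Asha", scores = c(2, 5, 3))   # made-up values
print(student$name)
print(student[["scores"]][2])

Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse    = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

print(Data_Frame$Pulse)                        # one column as a vector
high_pulse <- Data_Frame[Data_Frame$Pulse > 110, ]
print(high_pulse)                              # rows that satisfy a condition
print(summary(Data_Frame$Duration))            # quick numeric summary
str(Data_Frame)                                # column types at a glance
```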
8. Write a program to read a CSV file and analyze the data in the file in R

Program Description:
In R, we can read data from files stored outside the R environment. We can also write data into files, which will be stored and accessed by the operating system. R can read and write various file formats such as CSV, Excel, and XML.

# Getting and printing the current working directory.
print(getwd())
# Setting the current working directory.
# Use forward slashes (or doubled backslashes) in Windows paths.
setwd("C:/Users/sreek/OneDrive/Desktop/SAISANTHOSHI-MRCET-2023")
# Getting and printing the current working directory.
print(getwd())

Reading a CSV file

data <- read.csv("record.csv")
print(data)

Analyzing the CSV File

csv_data <- read.csv("record.csv")
print(is.data.frame(csv_data))
print(ncol(csv_data))
print(nrow(csv_data))

Getting the maximum salary

# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the maximum salary from data frame.
max_sal <- max(csv_data$salary)
print(max_sal)

Getting the details of all the persons who are working in the IT department

# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the details of all the persons who are working in the IT department
details <- subset(csv_data, dept == "IT")
print(details)

Getting the details of the persons whose salary is greater than 600 and who are working in the IT department

# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the details of the persons in the IT department whose salary is greater than 600
details <- subset(csv_data, dept == "IT" & salary > 600)
print(details)

Getting details of those people who joined on or after 2014

# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting details of those people who joined on or after 2014
# (>= keeps people who joined exactly on 2014-01-01)
details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
print(details)

Writing into a CSV file:
csv_data <- read.csv("record.csv")
# Getting details of those people who joined on or after 2014
details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
# Writing filtered data into a new file.
write.csv(details, "output.csv")
new_details <- read.csv("output.csv")
print(new_details)

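Because record.csv itself is not reproduced in this manual, the steps above can be tried end to end with a small stand-in file. In the sketch below the records are made up; only the column names salary, dept and start_date come from the programs above, and a temporary file is used instead of the working directory.

```r
# Made-up records standing in for record.csv (the real file is not shown in this manual).
records <- data.frame(
  name       = c("Rick", "Dan", "Michelle", "Ryan"),
  salary     = c(623.30, 515.20, 611.00, 729.00),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11"),
  dept       = c("IT", "Operations", "IT", "HR"),
  stringsAsFactors = FALSE
)

csv_path <- tempfile(fileext = ".csv")            # temporary file, removed at the end
write.csv(records, csv_path, row.names = FALSE)   # row.names = FALSE avoids an extra index column

csv_data <- read.csv(csv_path, stringsAsFactors = FALSE)
max_sal  <- max(csv_data$salary)
it_staff <- subset(csv_data, dept == "IT" & salary > 600)
recent   <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))

print(max_sal)
print(it_staff)
print(recent)
unlink(csv_path)
```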
9. Create pie charts and bar charts using R

Program Description:

A pie chart represents values as slices of a circle, drawn in different colors. The slices are labeled, and the number corresponding to each slice can also be shown in the chart.

# Create data for the graph.
geeks <- c(23, 56, 20, 63)
labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")

# Plot the chart.
pie(geeks, labels)

# Create the data for the chart
A <- c(17, 32, 8, 53, 1)

# Plot the bar chart
barplot(A, xlab = "X-axis", ylab = "Y-axis", main = "Bar-Chart")

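The description mentions showing the number for each slice. One way to do that, sketched here with the same city values, is to build percentage labels before calling pie(), and to label bars with names.arg. The titles and colors are arbitrary choices for this illustration.

```r
values <- c(23, 56, 20, 63)
cities <- c("Mumbai", "Pune", "Chennai", "Bangalore")

# Percentage share of each slice, rounded to whole numbers.
pct <- round(100 * values / sum(values))
slice_labels <- paste0(cities, " (", pct, "%)")

pie(values, labels = slice_labels, col = rainbow(length(values)),
    main = "Share by city")

# names.arg labels each bar; barplot() returns the x positions of the bar midpoints.
mids <- barplot(values, names.arg = cities, col = "steelblue",
                xlab = "City", ylab = "Value", main = "Values by city")
text(mids, values / 2, labels = values)   # write each value inside its bar
```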
10. Create a dataset and do statistical analysis on the data using R

Program Description:

R provides quick and easy tools for converting data into visually insightful elements such as graphs.

# ? is used before a function
# to get help on that function
?plot
?chickwts
data(chickwts)          # loading data into workspace
plot(chickwts$feed)     # plot feed from chickwts
feeds = table(chickwts$feed)
# plots graph in decreasing order
barplot(feeds[order(feeds, decreasing = TRUE)])

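Alongside the plots, basic descriptive statistics can be computed on the same chickwts dataset. A minimal sketch using only base R; the choice of statistics is illustrative and not taken from the original program.

```r
data(chickwts)   # built-in dataset: chick weights by feed type

print(summary(chickwts$weight))            # min, quartiles, mean, max
print(sd(chickwts$weight))                 # standard deviation
mean_by_feed <- tapply(chickwts$weight, chickwts$feed, mean)   # mean weight per feed
print(sort(mean_by_feed, decreasing = TRUE))

boxplot(weight ~ feed, data = chickwts,
        xlab = "Feed", ylab = "Weight", main = "Chick weight by feed")
```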
11. Write an R program to find correlation and covariance

Program Description:

Covariance indicates the direction of the linear relationship between two variables: a positive value means they tend to increase together, and a negative value means one tends to decrease as the other increases.

Correlation, by contrast, measures both the strength and the direction of the linear relationship between two variables.

# R program to illustrate
# Pearson correlation testing
# using cor()

# Taking two numeric
# vectors with the same length
x = c(1, 2, 3, 4, 5, 6, 7)
y = c(1, 3, 6, 2, 7, 4, 5)

# Calculating the
# correlation coefficient
# using the cor() method
result = cor(x, y, method = "pearson")
# Print the result
cat("Pearson correlation coefficient is:", result)

Covariance

# Data vectors
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Print covariance using different methods
print(cov(x, y))
print(cov(x, y, method = "pearson"))
print(cov(x, y, method = "kendall"))
print(cov(x, y, method = "spearman"))

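To see what cov() and cor() compute, the sample covariance and the Pearson coefficient can be worked out by hand and compared with the built-in results. This sketch reuses the x and y vectors from the correlation program; `cov_manual` and `cor_manual` are illustrative names.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)
n <- length(x)

# Sample covariance: sum of products of deviations divided by (n - 1).
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Pearson correlation: covariance scaled by both standard deviations.
cor_manual <- cov_manual / (sd(x) * sd(y))

print(c(manual = cov_manual, builtin = cov(x, y)))
print(c(manual = cor_manual, builtin = cor(x, y)))
```

For these vectors the deviation products sum to 15, so the covariance is 15/6 = 2.5 and the correlation is 15/28, roughly 0.536.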
12. Write an R program for regression modeling

Program Description:

Regression analysis is a widely used statistical tool for establishing a relationship model between two variables. One of these variables is called the predictor variable, whose value is gathered through experiments. The other is called the response variable, whose value is derived from the predictor variable.

# Generate random IQ values with mean = 30 and sd = 2
IQ <- rnorm(40, 30, 2)

# Sorting IQ level in ascending order
IQ <- sort(IQ)

# Generate vector with pass and fail values of 40 students
result <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
            1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
            0, 0, 1, 0, 0, 1, 1, 0, 1, 1,
            1, 1, 1, 0, 1, 1, 1, 1, 0, 1)
# Data frame
df <- as.data.frame(cbind(IQ, result))
# Print data frame
print(df)

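The program above stops at building the data frame. Since the response is a pass/fail value, one plausible next step is a logistic regression with glm(); this is an assumption, as the source does not name the model. The seed and the two new IQ values below are also made up for the illustration.

```r
set.seed(42)                       # assumed seed, only for reproducibility
IQ <- sort(rnorm(40, 30, 2))
result <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
            1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
            0, 0, 1, 0, 0, 1, 1, 0, 1, 1,
            1, 1, 1, 0, 1, 1, 1, 1, 0, 1)
df <- data.frame(IQ, result)

# Logistic regression: models the probability of result == 1 as a function of IQ.
model <- glm(result ~ IQ, family = binomial, data = df)
print(summary(model))

# Predicted pass probability for two hypothetical IQ values.
new_students <- data.frame(IQ = c(27, 33))
probs <- predict(model, newdata = new_students, type = "response")
print(probs)
```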
13. Write an R program to build a classification model using the KNN algorithm

Program Description:

K-Nearest Neighbor (K-NN) is a supervised, non-linear classification algorithm. K-NN is non-parametric, i.e. it makes no assumption about the underlying data or its distribution. It is one of the simplest and most widely used algorithms; its behavior depends on its k value (the number of neighbors), and it finds applications in many industries, such as finance and healthcare.

# Loading data
data(iris)
# Structure
str(iris)
# Installing packages (needed only once)
install.packages("e1071")
install.packages("caTools")
install.packages("class")
# Loading packages
library(e1071)
library(caTools)
library(class)
# Loading data
data(iris)
head(iris)
# Splitting data into train
# and test data (split on the label vector so that rows, not columns, are sampled)
split <- sample.split(iris$Species, SplitRatio = 0.7)
train_cl <- subset(iris, split == TRUE)
test_cl <- subset(iris, split == FALSE)

# Feature scaling (the test set is scaled with the training set's mean and sd)
train_scale <- scale(train_cl[, 1:4])
test_scale <- scale(test_cl[, 1:4],
                    center = attr(train_scale, "scaled:center"),
                    scale = attr(train_scale, "scaled:scale"))

# Fitting KNN model
# to training dataset
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 1)
classifier_knn

# Confusion matrix
cm <- table(test_cl$Species, classifier_knn)
cm

# Model evaluation - choosing K
# Calculate out-of-sample error
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))

# K =3
classifier_knn<-knn(train=train_scale,
test =test_scale,
cl=train_cl$Species,
k = 3)
misClassError<-mean(classifier_knn!=test_cl$Species)
print(paste('Accuracy =', 1-misClassError))

# K =5
classifier_knn<-knn(train=train_scale,
test =test_scale,
cl=train_cl$Species,
k = 5)
misClassError<-mean(classifier_knn!=test_cl$Species)
print(paste('Accuracy =', 1-misClassError))

# K =7
classifier_knn<-knn(train=train_scale,
test =test_scale,
cl=train_cl$Species,
k = 7)
misClassError<-mean(classifier_knn!=test_cl$Species)
print(paste('Accuracy=', 1-misClassError))

# K =15
classifier_knn<-knn(train=train_scale,
test =test_scale,
cl=train_cl$Species,
k = 15)
misClassError<-mean(classifier_knn!=test_cl$Species)
print(paste('Accuracy =', 1-misClassError))

# K =19
classifier_knn<-knn(train=train_scale,
test =test_scale,
cl=train_cl$Species,
k = 19)
misClassError<-mean(classifier_knn!=test_cl$Species)
print(paste('Accuracy=', 1-misClassError))

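The repeated blocks for different K values can also be written as one loop. The sketch below is an alternative arrangement, not the original program: it assumes the class package is available, uses base R's sample() for the split instead of caTools, and picks an arbitrary seed.

```r
library(class)   # provides knn(); a recommended package bundled with most R installations

set.seed(123)                                   # assumed seed for a repeatable split
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train_cl  <- iris[train_idx, ]
test_cl   <- iris[-train_idx, ]

# Scale both sets with the training mean and standard deviation.
train_scale <- scale(train_cl[, 1:4])
test_scale  <- scale(test_cl[, 1:4],
                     center = attr(train_scale, "scaled:center"),
                     scale  = attr(train_scale, "scaled:scale"))

k_values <- c(1, 3, 5, 7, 15, 19)
accuracy <- sapply(k_values, function(k) {
  pred <- knn(train = train_scale, test = test_scale,
              cl = train_cl$Species, k = k)
  mean(pred == test_cl$Species)
})

print(data.frame(k = k_values, accuracy = accuracy))
```

Comparing the accuracy column across rows gives a quick way to choose K for this particular split; a different seed may favor a different value.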
14. Write an R program to build a clustering model using the K-means algorithm

Program Description:

K-means clustering is an unsupervised, non-linear algorithm that clusters data based on similarity. It seeks to partition the observations into a pre-specified number of clusters. The data are segmented by assigning each training example to a segment called a cluster.

# Loading data
data(iris)

# Structure
str(iris)
# Installing packages (needed only once)
install.packages("ClusterR")
install.packages("cluster")

# Loading packages
library(ClusterR)
library(cluster)

# Removing the initial label of
# Species from the original dataset
iris_1 <- iris[, -5]

# Fitting K-means clustering model
# to training dataset
set.seed(240) # Setting seed
kmeans.re <- kmeans(iris_1, centers = 3, nstart = 20)
kmeans.re

# Cluster identification for
# each observation
kmeans.re$cluster

# Confusion matrix
cm <- table(iris$Species, kmeans.re$cluster)
cm

# Model evaluation and visualization
plot(iris_1[c("Sepal.Length", "Sepal.Width")])
plot(iris_1[c("Sepal.Length", "Sepal.Width")],
     col = kmeans.re$cluster)
plot(iris_1[c("Sepal.Length", "Sepal.Width")],
     col = kmeans.re$cluster,
     main = "K-means with 3 clusters")

## Plotting cluster centers
kmeans.re$centers
kmeans.re$centers[, c("Sepal.Length", "Sepal.Width")]

# cex is font size, pch is symbol
points(kmeans.re$centers[, c("Sepal.Length", "Sepal.Width")],
       col = 1:3, pch = 8, cex = 3)

## Visualizing clusters
y_kmeans <- kmeans.re$cluster
clusplot(iris_1[, c("Sepal.Length", "Sepal.Width")],
         y_kmeans,
         lines = 0,
         shade = TRUE,
         color = TRUE,
         labels = 2,
         plotchar = FALSE,
         span = TRUE,
         main = paste("Cluster iris"),
         xlab = 'Sepal.Length',
         ylab = 'Sepal.Width')
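K-means needs the number of clusters up front. A common heuristic for choosing it, sketched here as a supplement rather than as part of the original program, is the elbow plot of the total within-cluster sum of squares. Only base R's kmeans() is used, and the range 1 to 6 is an arbitrary choice.

```r
iris_1 <- iris[, -5]   # numeric columns only

set.seed(240)
k_values <- 1:6
# Total within-cluster sum of squares for each candidate number of clusters.
wss <- sapply(k_values, function(k) {
  kmeans(iris_1, centers = k, nstart = 20)$tot.withinss
})

print(round(wss, 2))
plot(k_values, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares",
     main = "Elbow plot for iris")
```

The value of k at which the curve stops dropping sharply is a reasonable candidate for the centers argument.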
