Introduction to R Programming
Overview of R
R is a free, open-source programming language and environment designed for statistical computing and
graphics. It’s widely used in data analysis, machine learning, and scientific research due to its powerful
statistical tools and flexibility. R is highly extensible through packages and supports data visualization, making
it a favorite among data scientists and statisticians.
Example: Imagine you're analyzing sales data for a store. R can help you calculate average sales, create
visualizations like bar charts, and predict future trends.
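The store scenario above can be sketched in a few lines of base R. The monthly sales figures below are invented for illustration, and a straight-line fit with lm() is only one simple way to estimate a trend.

```r
# Hypothetical monthly sales figures (made up for this sketch)
month <- 1:5
sales <- c(120, 135, 150, 165, 180)

# Average sales across the five months
avg_sales <- mean(sales)
print(avg_sales)

# Fit a straight-line trend and estimate sales for month 6
trend <- lm(sales ~ month)
forecast <- predict(trend, newdata = data.frame(month = 6))
print(forecast)

# A bar chart of the same data could be drawn with:
# barplot(sales, names.arg = month)
```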
Installing R
To use R, download it from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/.
Choose the version for your operating system (Windows, macOS, or Linux) and follow the installation
instructions.
Example: After installing R, you can verify it by typing R in your terminal or opening the R GUI, which opens
an interactive console.
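Once the console is open, one quick way to confirm which version is running is to print the built-in version information. The exact text depends on the installation, so no output is shown here.

```r
# Print the version string of the running R installation
print(R.version.string)

# Major and minor components are also available separately
print(paste(R.version$major, R.version$minor, sep = "."))
```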
Running R
R can be run via:
R Console: A command-line interface for running R commands.
RStudio: A popular integrated development environment (IDE) that makes R easier to use with
features like code highlighting and visualization tools.
Scripts: Save R code in .R files and run them using source("filename.R").
Example: In RStudio, type print("Hello, R!") in the console and press Enter.
print("Hello, R!")
Output:
[1] "Hello, R!"
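Running a script with source() can be tried without creating a file by hand. The sketch below writes a one-line script to a temporary location and then runs it. The file name hello_script.R and the variable greeting are hypothetical names chosen for this illustration.

```r
# Write a one-line script to a temporary file (hypothetical file name)
script_path <- file.path(tempdir(), "hello_script.R")
writeLines('greeting <- "Hello from a script"', script_path)

# Run the script; objects it creates are available afterwards
source(script_path)
print(greeting)
```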
Packages in R
Packages are collections of R functions and data sets that extend R’s capabilities. Install packages using
install.packages("package_name") and load them with library(package_name).
Example: Install and load the ggplot2 package for data visualization.
install.packages("ggplot2") # Run once to install
library(ggplot2) # Load the package
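Because install.packages() only needs to run once, a script can check first whether a package is already present. The sketch below uses requireNamespace() for that check. The helper name is_installed is hypothetical, and the install and load lines are left commented out so the sketch runs without network access.

```r
# Helper (hypothetical name) that reports whether a package is installed
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with every R installation, so this should be TRUE
print(is_installed("stats"))

# Install ggplot2 only when it is missing, then load it
# if (!is_installed("ggplot2")) install.packages("ggplot2")
# library(ggplot2)
```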
R Data Types
R supports several basic data types:
Numeric: Numbers like 10.5, 42 (stored as doubles by default, so typeof() reports "double").
Integer: Whole numbers like 1L, 100L (use L to specify integers).
Character: Text strings like "hello", "data".
Logical: TRUE or FALSE.
Complex: Numbers with real and imaginary parts, e.g., 3 + 4i.
Example:
x <- 10.5 # Numeric
y <- 5L # Integer
z <- "R is fun" # Character
w <- TRUE # Logical
v <- 3 + 4i # Complex
print(c(typeof(x), typeof(y), typeof(z), typeof(w), typeof(v)))
Output:
[1] "double" "integer" "character" "logical" "complex"
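The type names above can be explored further with class(), the is.*() checks, and the as.*() conversion functions. This is a small sketch of those base R calls, not an exhaustive list.

```r
x <- 10.5
y <- 5L

# class() gives the higher-level label, typeof() the storage type
print(class(x))   # "numeric"
print(typeof(x))  # "double"

# Both doubles and integers count as numeric
print(is.numeric(y))

# Conversions between types
print(as.character(x))     # number to text
print(as.numeric("42"))    # text to number
print(as.logical("TRUE"))  # text to logical
```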
R Objects
R organizes data into objects like:
Vectors: Ordered collections of elements of the same type (e.g., c(1, 2, 3)).
Matrices: 2D arrays of the same data type.
Data Frames: Tables with rows and columns, like spreadsheets.
Lists: Collections of elements of different types.
Arrays: Multi-dimensional data structures.
Example: Create a vector and a data frame.
vec <- c(1, 2, 3, 4) # Vector
df <- data.frame(name = c("Alice", "Bob"), age = c(25, 30)) # Data frame
print(vec)
print(df)
Output:
[1] 1 2 3 4
   name age
1 Alice  25
2   Bob  30
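The remaining object types from the list above (matrices, lists, and arrays) can be created in a similar way. The values and names in this sketch are made up for illustration.

```r
# Matrix: 2 rows x 3 columns, filled column by column
m <- matrix(1:6, nrow = 2, ncol = 3)
print(dim(m))

# List: elements of different types, accessed by name
person <- list(name = "Alice", age = 25, scores = c(90, 85))
print(person$scores)

# Array: three dimensions (2 x 3 x 2)
arr <- array(1:12, dim = c(2, 3, 2))
print(arr[1, 2, 2])
```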
Reading and Writing Data
R can read and write data from various file formats:
Reading: Use read.csv("file.csv") for CSV files or read.table() for text files.
Writing: Use write.csv(data, "file.csv") to save data as CSV.
Example: Read and write a CSV file.
# Create a sample data frame
data <- data.frame(id = 1:3, name = c("John", "Jane", "Doe"))
write.csv(data, "sample.csv", row.names = FALSE)
# Read it back
data_read <- read.csv("sample.csv")
print(data_read)
Output:
  id name
1  1 John
2  2 Jane
3  3  Doe
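For text files that are not comma-separated, read.table() and write.table() take the separator as an argument. The sketch below assumes a tab-separated file and writes it to a temporary path so that it does not touch the working directory.

```r
# Sample data written to a temporary tab-separated file
data <- data.frame(id = 1:3, name = c("John", "Jane", "Doe"))
tsv_path <- tempfile(fileext = ".tsv")
write.table(data, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# header = TRUE treats the first line as column names
data_tsv <- read.table(tsv_path, header = TRUE, sep = "\t")
print(data_tsv)

# str() gives a compact summary of the columns that were read
str(data_tsv)
```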
Subsetting R Objects
Subsetting extracts specific elements from objects using indices, names, or conditions.
Vectors: Use [index] (e.g., vec[1]).
Data Frames: Use [row, column] or $ for columns (e.g., df$name).
Lists: Use [[index]] or $name.
Example:
vec <- c(10, 20, 30, 40)
df <- data.frame(name = c("Alice", "Bob"), score = c(85, 90))
print(vec[2]) # Second element
print(df$name[1]) # First name
print(df[1, ]) # First row
Output:
[1] 20
[1] "Alice"
   name score
1 Alice    85
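Subsetting by condition, by name, and on lists follows the same pattern. The list info and its contents are hypothetical values used only for this sketch.

```r
vec <- c(10, 20, 30, 40)
df <- data.frame(name = c("Alice", "Bob"), score = c(85, 90))
info <- list(city = "Paris", codes = c(1, 2, 3))

print(vec[vec > 15])        # Elements matching a condition
print(vec[-1])              # Everything except the first element
print(df[df$score > 85, ])  # Rows matching a condition
print(df[, "score"])        # Column selected by name
print(info[["codes"]])      # List element by name
print(info$city)            # Same idea with $
```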
Essentials of the R Language
R is case-sensitive, uses <- or = for assignment, and supports functions like mean(), sum(), and length().
Comments start with #.
Example:
x <- 5 # Assignment
# Calculate square
square <- x^2
print(square)
Output:
[1] 25
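The built-in functions and the case-sensitivity rule mentioned above can be seen in a short sketch. The variable names total and Total are chosen here only to show that R treats them as different objects.

```r
values <- c(4, 8, 15)

print(mean(values))    # Average: 9
print(sum(values))     # Total: 27
print(length(values))  # Number of elements: 3

# Names differ by case: total and Total are two separate objects
total <- sum(values)
Total <- 100
print(total == Total)  # FALSE
```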
Calculations
R supports arithmetic operations: +, -, *, /, ^ (exponentiation).
Example:
a <- 10
b <- 3
print(a + b) # Addition
print(a * b) # Multiplication
print(a ^ 2) # Square
Output:
[1] 13
[1] 30
[1] 100
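Subtraction, division, and the order in which operators are evaluated are not covered by the example above. The sketch below reuses the same a and b to show them.

```r
a <- 10
b <- 3

print(a - b)        # Subtraction: 7
print(a / b)        # Division returns a decimal result
print((a + b) * 2)  # Parentheses are evaluated first: 26
print(a + b * 2)    # Multiplication before addition: 16
print(-2^2)         # Exponentiation before unary minus: -4
```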
Complex Numbers in R
R supports complex numbers using i for the imaginary part.
Example:
z <- 3 + 4i
print(Re(z)) # Real part
print(Im(z)) # Imaginary part
print(z * z) # Square of complex number
Output:
[1] 3
[1] 4
[1] -7+24i
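Base R also provides Mod() and Conj() for the modulus and complex conjugate. The sketch below additionally shows that a negative number needs to be converted with as.complex() before sqrt() returns a complex result.

```r
z <- 3 + 4i

print(Mod(z))   # Modulus (distance from zero): 5
print(Conj(z))  # Complex conjugate: 3-4i

# sqrt() of a plain negative number gives NaN with a warning,
# so convert to complex first
root <- sqrt(as.complex(-1))
print(root)
```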
Rounding
Use round(), ceiling(), floor(), or trunc() for rounding numbers.
round(x, digits): Rounds to the specified number of decimal places.
ceiling(x): Rounds up to the nearest integer.
floor(x): Rounds down to the nearest integer.
trunc(x): Drops the fractional part (rounds toward zero).
Example:
x <- 3.14159
print(round(x, 2)) # Round to 2 decimals
print(ceiling(x)) # Round up
print(floor(x)) # Round down
Output:
[1] 3.14
[1] 4
[1] 3
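The difference between these functions shows up most clearly with negative numbers and with values exactly halfway between two integers. A short sketch:

```r
x <- -3.7

print(floor(x))    # Rounds toward negative infinity: -4
print(ceiling(x))  # Rounds toward positive infinity: -3
print(trunc(x))    # Drops the fractional part: -3

# round() sends exact halves to the nearest even integer
print(round(2.5))  # 2
print(round(3.5))  # 4
```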
Arithmetic, Modulo, and Integer Quotients
Modulo: Use %% to get the remainder.
Integer Quotient: Use %/% for integer division.
Example:
a <- 17
b <- 5
print(a %% b) # Modulo
print(a %/% b) # Integer quotient
Output:
[1] 2
[1] 3
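With a negative operand, %/% rounds the quotient down and %% takes the sign of the divisor, so the two results always recombine into the original number. The names q and r are chosen for this sketch.

```r
a <- -17
b <- 5

q <- a %/% b  # Integer quotient, rounded down
r <- a %% b   # Remainder, same sign as the divisor

print(q)               # -4
print(r)               # 3
print(q * b + r == a)  # The two parts recombine: TRUE

# A common use of modulo: checking whether a number is odd
print(7 %% 2 == 1)     # TRUE
```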
Variable Names and Assignment
Variable names in R:
Can include letters, numbers, dots (.), and underscores (_).
Must start with a letter, or with a dot that is not followed by a digit.
Use <- or = for assignment.
Example:
my_var <- 42
my.var2 <- "Hello"
print(my_var)
print(my.var2)
Output:
[1] 42
[1] "Hello"
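One way to test whether a name follows these rules is to compare it with the output of make.names(), which converts any string into a syntactically valid name. The candidate names and the variable is_valid are hypothetical examples.

```r
candidates <- c("my_var", "2var", "my var")

# A name is valid if make.names() leaves it unchanged
is_valid <- make.names(candidates) == candidates
print(is_valid)                # TRUE FALSE FALSE
print(make.names(candidates))  # "my_var" "X2var" "my.var"

# Non-syntactic names can still be used when wrapped in backticks
`2var` <- 10
print(`2var`)
```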
Operators
R includes:
Arithmetic: +, -, *, /, ^, %%, %/%.
Comparison: ==, !=, <, >, <=, >=.
Logical: & (and), | (or), ! (not).
Example:
x <- 10
y <- 5
print(x > y) # Comparison
print(x + y > 12) # Combined with arithmetic
Output:
[1] TRUE
[1] TRUE
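These operators also work element by element on vectors, which makes it possible to test many values at once. The scores below are invented for this sketch.

```r
scores <- c(45, 72, 90)

print(scores >= 60)                # FALSE TRUE TRUE
print(scores != 72)                # TRUE FALSE TRUE
print(scores >= 60 & scores < 90)  # FALSE TRUE FALSE
print(!(scores >= 60))             # TRUE FALSE FALSE
```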
Integers
Integers are whole numbers, specified with the L suffix (e.g., 5L). Use as.integer() to convert numerics to
integers; the fractional part is dropped rather than rounded.
Example:
x <- 10.7
y <- as.integer(x)
print(y)
print(typeof(y))
Output:
[1] 10
[1] "integer"
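A few more checks illustrate how integers behave next to ordinary numerics. This is a small sketch using base R only.

```r
print(as.integer(-10.7))     # Truncates toward zero: -10
print(is.integer(5))         # FALSE: a plain 5 is a double
print(is.integer(5L))        # TRUE
print(typeof(5L / 2L))       # "double": / always gives a double
print(typeof(5L %/% 2L))     # "integer": integer division keeps the type
print(.Machine$integer.max)  # Largest representable integer: 2147483647
```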
Factors
Factors represent categorical data (e.g., "male", "female"). Create them with factor().
Example:
grades <- factor(c("A", "B", "A", "C"))
print(grades)
print(levels(grades))
Output:
[1] A B A C
Levels: A B C
[1] "A" "B" "C"
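Factors are often summarized with table(), and their levels can be given an explicit order. The sizes factor below is a hypothetical example of an ordered factor.

```r
grades <- factor(c("A", "B", "A", "C"))

# Count how many times each level occurs
print(table(grades))

# Internally, each value is stored as the index of its level
print(as.integer(grades))  # 1 2 1 3

# Ordered factor: levels have a defined ranking
sizes <- factor(c("low", "high", "medium"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)
print(sizes < "high")      # TRUE FALSE TRUE
```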
Logical Operations
Logical operations use TRUE/FALSE with operators like &, |, and !.
Example:
a <- TRUE
b <- FALSE
print(a & b) # AND
print(a | b) # OR
print(!a) # NOT
Output:
[1] FALSE
[1] TRUE
[1] FALSE
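The same operators apply element by element to logical vectors, and base R adds helpers such as any(), all(), and xor(). The sketch also shows &&, which works on single values and skips the right-hand side when the left-hand side already decides the result.

```r
flags <- c(TRUE, FALSE, TRUE)

print(flags & c(TRUE, TRUE, FALSE))  # TRUE FALSE FALSE
print(any(flags))                    # At least one TRUE: TRUE
print(all(flags))                    # Every value TRUE: FALSE
print(sum(flags))                    # TRUE counts as 1: 2
print(xor(TRUE, FALSE))              # Exactly one is TRUE: TRUE

# && stops early: the division is never evaluated when x is 0
x <- 0
print(x != 0 && 10 / x > 1)          # FALSE
```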