Top 80 R Programming Interview Questions

Core Concepts
1. What is R and why is it so popular for data science?
R is an open-source language built specifically for statistical analysis and data visualization. It
has over 18,000 CRAN packages, rich graphics capabilities, strong community support, and is
easily extendable using C/C++, Python, and more.

2. Name three disadvantages of using R in production.

● Higher memory usage compared to Python or Java

● Slower for large-scale computations

● Less robust error handling and package QA in community contributions

3. What are the five atomic data types in R?

● Numeric: 3.14

● Integer: 2L

● Character: "hello"

● Logical: TRUE, FALSE

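● Complex: 2 + 3i

Strictly, R has a sixth atomic type, raw (e.g., as.raw(255)), which stores bytes and is rarely used in analysis code.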
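As a quick check of the types listed above, the following sketch asks R for the storage type of one example value per type. The variable names are illustrative only.

```r
# One example value per atomic type; raw is included for completeness
values <- list(3.14, 2L, "hello", TRUE, 2 + 3i, as.raw(255))
types <- vapply(values, typeof, character(1))
print(types)

# Mixing types in c() coerces everything to the most general type present
mixed <- c(1L, 2.5, "a")
print(typeof(mixed))
```

Note that typeof() reports "double" for what the list above calls numeric.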

4. Difference between vector, list, matrix, and data frame:

● Vector: 1D homogeneous data

● List: 1D heterogeneous

● Matrix: 2D homogeneous

● Data frame: 2D list with equal-length columns (heterogeneous)
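A minimal sketch of the four structures, using made-up values, may help make the distinctions concrete:

```r
v  <- c(1.5, 2.5, 3.5)                    # vector: one type, no dimensions
l  <- list(1.5, "a", TRUE)                # list: elements may differ in type
m  <- matrix(1:6, nrow = 2)               # matrix: one type, two dimensions
df <- data.frame(id = 1:3,                # data frame: a list of equal-length columns
                 name = c("a", "b", "c"))

print(is.null(dim(v)))   # vectors carry no dim attribute
print(dim(m))            # rows, then columns
print(is.list(df))       # a data frame is a list underneath
```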

5. How to read CSV and tab-delimited files?
Use read.csv("file.csv") for CSV and read.delim("file.txt") for tab-delimited
files.
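The round trip below is a small sketch of both readers. It writes a made-up data frame to temporary files first so that it runs without any external data; the file paths and column names are assumptions for illustration.

```r
# Create small sample files in a temporary location
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".txt")
sample_data <- data.frame(id = 1:3, score = c(90.5, 72, 88))
write.csv(sample_data, csv_path, row.names = FALSE)
write.table(sample_data, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.csv() expects commas, read.delim() expects tabs
from_csv <- read.csv(csv_path)
from_tsv <- read.delim(tsv_path)
print(from_csv)
```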

6. How do you import an Excel sheet into R?

Use readxl::read_excel("file.xlsx") or openxlsx::read.xlsx().

7. Difference between library() and require()?

Both attach packages. library() throws an error if the package is missing; require()
returns FALSE, allowing conditional logic.
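The difference can be seen with a package that is not installed. The package name below is hypothetical and assumed to be absent from the library.

```r
pkg <- "notARealPackage123"  # hypothetical name, assumed not installed

# require() returns FALSE (with a warning, suppressed here) instead of stopping
loaded <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))

# library() signals an error, which is caught here for demonstration
library_result <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "attached"
  },
  error = function(e) "error"
)

print(loaded)
print(library_result)
```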

8. How does R handle missing values and NaNs?

Missing values are represented as NA. Undefined numeric values (e.g., 0/0) are NaN. Use
is.na() or is.nan() to detect them.
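One detail worth knowing is that is.na() also returns TRUE for NaN, while is.nan() is TRUE only for NaN. A short sketch with made-up values:

```r
x <- c(1, NA, 0 / 0)   # a number, a missing value, and NaN

na_flags  <- is.na(x)   # TRUE for both NA and NaN
nan_flags <- is.nan(x)  # TRUE only for NaN

# na.rm = TRUE drops both NA and NaN before computing
clean_mean <- mean(x, na.rm = TRUE)

print(na_flags)
print(nan_flags)
print(clean_mean)
```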

9. Why do R programmers prefer <- over = for assignment?

Because <- is always assignment, while = also binds arguments in functions, so <- avoids
ambiguity.
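The ambiguity shows up inside a function call. In this sketch (the demo function is illustrative only), = names an argument, while <- creates a variable as a side effect:

```r
demo <- function() {
  mean(x = 1:10)    # '=' only binds the argument x; nothing is created here
  before <- exists("x", envir = environment(), inherits = FALSE)

  mean(x <- 1:10)   # '<-' assigns x in this function, then passes its value
  after <- exists("x", envir = environment(), inherits = FALSE)

  c(before = before, after = after)
}

result <- demo()
print(result)
```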

10. Two ways to open help for a function in R?

Use ?mean or help(mean).

11. How to append a row to a data frame?

Use rbind(df, new_row) or bind_rows(df, new_row) from dplyr.

12. How to filter and select columns from a data frame?

df[df$score > 90, "name"] or df %>% filter(score > 90) %>% pull(name)
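A base R sketch of the first form, on a small made-up data frame, shows that selecting a single column returns a plain vector unless drop = FALSE is given:

```r
df <- data.frame(
  name  = c("Ana", "Ben", "Cy"),
  score = c(95, 82, 91),
  stringsAsFactors = FALSE
)

# Rows where score > 90, column "name" only: returns a character vector
top_names <- df[df$score > 90, "name"]

# Keep the result as a one-column data frame instead
top_df <- df[df$score > 90, "name", drop = FALSE]

print(top_names)
print(top_df)
```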

13. What is the pipe operator %>% used for?

Used to chain operations, where the output of one function becomes the input of the next.

14. When should you use data.table instead of dplyr?

When working with very large datasets needing in-place updates or fastest performance.

15. How do you convert a string to a date?

Use as.Date("2025-06-27") or lubridate::ymd("20250627").
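For strings that are not in ISO year-month-day order, base R needs an explicit format. A minimal sketch, with example dates chosen for illustration:

```r
d1 <- as.Date("2025-06-27")                       # ISO format is parsed by default
d2 <- as.Date("27/06/2025", format = "%d/%m/%Y")  # other layouts need a format string

year_part <- format(d1, "%Y")                          # extract the year as text
days_apart <- as.numeric(d1 - as.Date("2025-06-20"))   # date arithmetic gives days

print(d2)
print(year_part)
print(days_apart)
```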

Data Manipulation
16. What is the apply() family used for?
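To apply a function repeatedly without writing explicit loops. apply() works over the margins (rows or columns) of a matrix or array; lapply() and sapply() work over the elements of a list or vector (sapply() simplifies the result); tapply() works over groups defined by a factor.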
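The following sketch runs each member of the family on small made-up inputs:

```r
m <- matrix(1:6, nrow = 2)        # rows: (1, 3, 5) and (2, 4, 6)
row_sums <- apply(m, 1, sum)      # margin 1 = rows
col_sums <- apply(m, 2, sum)      # margin 2 = columns

parts <- list(a = 1:3, b = 4:6)
means_list <- lapply(parts, mean) # always returns a list
means_vec  <- sapply(parts, mean) # simplifies to a named vector

values <- c(10, 20, 30, 40)
groups <- c("x", "y", "x", "y")
group_totals <- tapply(values, groups, sum)  # one result per group

print(row_sums)
print(col_sums)
print(means_vec)
print(group_totals)
```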

17. How to join two data frames by column id?

Using dplyr: left_join(df1, df2, by = "id"). In base R, merge(df1, df2, by = "id") performs an inner join by default; add all.x = TRUE for the equivalent of a left join.
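A base R sketch with two made-up tables shows how the join type changes the row count:

```r
df1 <- data.frame(id = 1:3, name = c("a", "b", "c"))
df2 <- data.frame(id = c(1L, 2L), score = c(90, 80))

inner <- merge(df1, df2, by = "id")                # only ids present in both
left  <- merge(df1, df2, by = "id", all.x = TRUE)  # keeps every row of df1

print(inner)
print(left)
```

The unmatched id in the left join gets NA for the columns that come from df2.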

18. Difference between rbind() and cbind()?

● rbind() adds rows

● cbind() adds columns

19. What does aggregate(score ~ class, df, mean) do?

Returns average scores for each class as a summary data frame.
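On a small made-up data frame, the call looks like this:

```r
df <- data.frame(
  class = c("A", "A", "B", "B"),
  score = c(80, 90, 70, 60)
)

# Formula interface: summarise score, grouped by class
avg_by_class <- aggregate(score ~ class, df, mean)
print(avg_by_class)
```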

20. How to sort a data frame by column descending?

Base R: df[order(df$col, decreasing = TRUE), ]
Dplyr: df %>% arrange(desc(col))

21. Name five dplyr verbs and their uses:

● filter(): filter rows

● select(): pick columns

● mutate(): add/modify columns

● summarise(): summary stats

● arrange(): reorder rows

22. How to pivot data to wide format?

Use pivot_wider(names_from = category, values_from = value).

23. How does lubridate help with dates?
Simplifies date parsing and transformations with functions like ymd(), mdy_hms(), year(),
month().

24. What is a factor and when to use it?

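A factor stores categorical data as integer codes plus a set of levels. Use it for variables with a fixed set of categories: modeling functions treat factors as categorical predictors, and the level order controls sorting and plotting.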
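A short sketch with made-up categories shows the integer codes behind a factor and how the level order is set explicitly:

```r
sizes <- factor(
  c("low", "high", "medium", "low"),
  levels = c("low", "medium", "high")   # explicit order instead of alphabetical
)

codes  <- as.integer(sizes)   # position of each value in the level set
counts <- table(sizes)        # counts are reported in level order

print(levels(sizes))
print(codes)
print(counts)
```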

25. Memory difference between list and data frame?

A data frame is a list of equal-length column vectors, so the underlying memory is similar. Data frames add class, names, and row.names attributes and allow column-level operations.

Visualization
26. What are ggplot2’s main layers?

● Data

● Aesthetics

● Geometries

● Scales

● Themes

● Coordinates

27. Create a histogram of prices with binwidth = 5:

ggplot(df, aes(price)) + geom_histogram(binwidth = 5)

28. How to add regression line to scatterplot?

Add: + geom_smooth(method = "lm", se = FALSE)

29. When to use base plotting over ggplot?

For quick EDA, scripting, or for-loop plotting where speed matters more than aesthetics.

30. How to make a correlation heatmap?

Use heatmap(cor(df)) or melt + geom_tile() in ggplot2.

Programming & Functions
31. Define a function for geometric mean:

geom_mean <- function(x) exp(mean(log(x), na.rm = TRUE))

32. What is lexical scoping in R?

Functions remember variables from their defining environment, not where they’re called.
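A small sketch makes the rule visible. The function names f and g are illustrative only:

```r
x <- "global"

f <- function() x    # x is looked up where f was defined

g <- function() {
  x <- "local"       # this x lives in g's environment only
  f()                # f still sees the x from its defining environment
}

result <- g()
print(result)
```

Under dynamic scoping the call would have returned "local"; R's lexical scoping returns the value from the defining environment.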

33. What does <<- do?

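Assigns to an existing variable in an enclosing (parent) environment instead of creating a local one; if no such variable is found, it creates one in the global environment. Dangerous if overused.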

34. Difference between S3, S4, R6?

● S3: informal OOP

● S4: formal slots, type checking

● R6: reference objects (mutable), better for apps

35. How do you debug R code in a loop?

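Insert browser() inside the loop to pause at a given iteration, use debug(fn) to step through a function line by line, and call traceback() after an error to see the call stack.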

36. What’s Rprof() used for?

Profiles code performance to find bottlenecks. Use summaryRprof() to analyze.

37. Why prefer vectorization over loops?

Faster and more readable. Avoids interpreter overhead.
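As an illustration, here is the same sum of squares written both ways. The input is made up, and actual timings depend on the machine, so none are claimed here.

```r
x <- 1:1000

# Loop version: one interpreted iteration per element
loop_total <- 0
for (v in x) {
  loop_total <- loop_total + v^2
}

# Vectorized version: the arithmetic runs in compiled code
vec_total <- sum(x^2)

print(loop_total)
print(vec_total)
```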

38. Show a closure example:

counter <- local({ i <- 0; function() { i <<- i + 1; i } })
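Calling the closure a few times shows the state it keeps. This sketch repeats the definition above so that it runs on its own:

```r
counter <- local({
  i <- 0                 # private state, visible only to the function below
  function() {
    i <<- i + 1          # updates i in the enclosing environment
    i
  }
})

first  <- counter()
second <- counter()
print(c(first, second))
```

The variable i is not reachable from outside; only calls to counter() can change it.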

39. How to write and document a package?

Use usethis::create_package(), add code in R/, use roxygen2, document via
devtools::document().

40. What is non-standard evaluation in dplyr?

Allows column names to be used without quotes. Enables tidy programming with {{ }}.

Statistics & Machine Learning
41. Code for two-sample t-test:
t.test(x, y, var.equal = TRUE)

42. How to extract R-squared from linear model?

summary(lm_model)$r.squared
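A runnable sketch using the built-in mtcars data set (the choice of variables is an assumption for illustration):

```r
# Simple linear regression of fuel economy on weight
fit <- lm(mpg ~ wt, data = mtcars)

fit_summary <- summary(fit)
r2     <- fit_summary$r.squared       # proportion of variance explained
adj_r2 <- fit_summary$adj.r.squared   # penalised for the number of predictors

print(r2)
print(adj_r2)
```

With a single predictor, R-squared equals the squared correlation between the two variables.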

43. Fit a logistic regression:

glm(default ~ income + balance, data = df, family = binomial)

44. How to choose best k for k-means?

Use elbow plot, silhouette score, or gap statistic.
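The elbow approach can be sketched in base R by recording the total within-cluster sum of squares for several values of k. The data set (iris), the range of k, and the seed are assumptions for illustration.

```r
set.seed(42)                       # k-means starts from random centers
features <- scale(iris[, 1:4])     # standardise so no column dominates

# Total within-cluster sum of squares for k = 1..6
wss <- sapply(1:6, function(k) {
  kmeans(features, centers = k, nstart = 10)$tot.withinss
})

print(round(wss, 1))
# plot(1:6, wss, type = "b") would draw the elbow curve
```

The "elbow" is the k after which additional clusters reduce the value only slightly.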

45. PCA code:

prcomp(df, scale. = TRUE)

46. Random forest with 1000 trees:

randomForest(y ~ ., data = df, ntree = 1000)

47. Techniques to handle class imbalance:

SMOTE, downsampling or oversampling, and class weights in the loss function.

48. What is AUC?

Area under ROC curve; measures classification performance.
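AUC can also be read as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The sketch below computes it from ranks in base R; the helper name auc_rank and the toy data are assumptions, and a dedicated package such as pROC or yardstick would normally be used in practice.

```r
# Rank-based AUC (equivalent to the Mann-Whitney U statistic, ties get average ranks)
auc_rank <- function(scores, labels) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

scores <- c(0.1, 0.4, 0.35, 0.8)
labels <- c(0, 0, 1, 1)

auc_value <- auc_rank(scores, labels)
print(auc_value)
```

Three of the four positive-negative pairs in the toy data are ordered correctly, which is why the result is 0.75.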

49. Cross-validation using tidymodels:

Use vfold_cv(), workflow(), and fit_resamples().

50. Time series forecasting function?

forecast::auto.arima()

Deployment & Reproducibility
51. REST API with R?
Use plumber:

#* @post /predict
function(input) { predict(model, input) }

52. How to monitor model drift?

Track changes in input distributions or prediction errors; retrain if thresholds are crossed.

53. What’s the difference between caret and tidymodels?

caret is older, single-interface; tidymodels is modern, modular, tidyverse-aligned.

54. Matching for causal inference in R?

Use MatchIt for propensity matching.

55. What does vetiver do?

Helps standardize and deploy R models via APIs or dashboards.

56. What does renv do?

Manages per-project package versions and dependencies. Use renv::init().

57. Automate reports daily?

Use cronR or RStudio Connect to schedule rendering .Rmd.

58. How to revert last Git commit but keep changes?

git reset --soft HEAD~1

59. Why use Docker for R?

Creates reproducible environments, especially for Shiny/Plumber apps.

60. Minimal Shiny app example:

library(shiny)

ui <- fluidPage(textInput("x", ""), textOutput("y"))

server <- function(input, output) { output$y <- renderText(input$x) }
shinyApp(ui, server)

61. What is DBI in R?

It provides a backend-agnostic interface to SQL databases, used by dplyr and dbplyr.

62. Why use Apache Arrow in R?
For reading/writing Parquet and large data files efficiently.

63. Parameterized R Markdown example:

params:
  region: "Asia"

Access with params$region.

64. What is the Rocker project?

Official Docker images for R, e.g., rocker/verse for tidyverse and R Markdown.

65. Outline your complete data pipeline in R:

Ingest (DBI) → Clean (dplyr) → Model (tidymodels) → Deploy (plumber/Shiny) → Monitor
(logs/alerts/Grafana).

66. How do you monitor concept drift once a model is deployed?

Monitor distribution changes in input features (e.g., using Jensen-Shannon divergence or the population stability index) and in prediction outcomes. Set up alerts when thresholds are crossed. You may use packages like drifter or tools like Prometheus + Grafana.
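As a rough illustration of one such check, the sketch below computes a population stability index (PSI) between a baseline sample and a newer sample. The helper name psi, the decile binning, the small floor used to avoid log(0), and the simulated data are all assumptions; alert thresholds are a project decision and are not set here.

```r
# PSI between a baseline (expected) and a new (actual) sample over shared bins
psi <- function(expected, actual, breaks) {
  e <- as.numeric(table(cut(expected, breaks))) / length(expected)
  a <- as.numeric(table(cut(actual, breaks))) / length(actual)
  eps <- 1e-6                       # floor so empty bins do not produce log(0)
  e <- pmax(e, eps)
  a <- pmax(a, eps)
  sum((a - e) * log(a / e))
}

set.seed(1)
baseline <- rnorm(1000)               # simulated training-time feature
shifted  <- rnorm(1000, mean = 0.5)   # simulated production feature with drift

# Decile bins from the baseline, open-ended at both extremes
breaks <- c(-Inf, unname(quantile(baseline, probs = seq(0.1, 0.9, by = 0.1))), Inf)

psi_same    <- psi(baseline, baseline, breaks)
psi_shifted <- psi(baseline, shifted, breaks)
print(c(psi_same, psi_shifted))
```

A larger value indicates a bigger difference between the two distributions.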

67. Explain the difference between caret and tidymodels.

● caret is a single package that wraps multiple ML models but is less modular.

● tidymodels is a modern, tidyverse-style framework with separate packages for preprocessing (recipes), modeling (parsnip), resampling (rsample), and evaluation (yardstick), providing better structure and extensibility.

68. Which R package helps with causal inference through matching?

Use the MatchIt package to perform propensity score matching, which balances covariates
across treatment groups in observational data.

69. What is the purpose of the vetiver package?

vetiver streamlines the deployment of R and Python models by versioning them, storing
metadata, and providing prediction APIs using plumber. It supports reproducible and secure
MLOps workflows.

70. Describe how to deploy an R model as a REST API using plumber.

● Create a plumber file with annotated endpoints

#* @post /predict
function(req) {
  # req$body holds the parsed request body (plumber 1.0 or later);
  # this assumes it maps column names to values
  predict(model, newdata = as.data.frame(req$body))
}

● Use plumber::plumb("file.R")$run(port = 8000) to start the API

● Host using RStudio Connect, Docker, or cloud platforms like Heroku

Reproducibility & DevOps

71. What problem does the renv package solve and how is it used?
renv manages project-specific R package environments, ensuring consistent dependencies
across collaborators or servers. Use renv::init() to start and renv::snapshot() to lock
versions.

72. How can you schedule an R Markdown report to run daily?

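● Use the cronR package to create and manage cron jobs. cron_rscript() builds the command for an R script (here a hypothetical render_report.R that calls rmarkdown::render("report.Rmd")), and cron_add() schedules it:

cmd <- cronR::cron_rscript("render_report.R")
cronR::cron_add(cmd, frequency = "daily", at = "7AM")

● Or use RStudio Connect to schedule automatic rendering.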

73. What Git command reverts the last commit but retains the changes staged?

git reset --soft HEAD~1

74. Why is Docker useful for deploying R applications?

Docker ensures consistent environments by bundling R, its packages, system libraries, and
configurations. It simplifies deployment of Shiny apps, Plumber APIs, and reports across
different systems.

75. Write a minimal working Shiny app that echoes text input.

library(shiny)

ui <- fluidPage(
  textInput("text", "Enter something"),
  textOutput("output")
)

server <- function(input, output) {
  output$output <- renderText(input$text)
}

shinyApp(ui, server)

76. What is the DBI package in R and how is it different from dplyr's DB tools?
DBI is a database interface providing low-level access to databases (e.g., using dbConnect, dbGetQuery). dplyr's database tools (the dbplyr package) build on top of DBI and translate dplyr verbs (filter, mutate, etc.) into SQL in a backend-agnostic way.

77. What benefit does Apache Arrow provide for R users?

Arrow provides a columnar in-memory format that can be shared between languages (Python, R, Java, etc.) without copying, plus fast readers and writers for file formats such as Parquet. It improves speed and scalability when working with large data files.

78. How do you create a parameterized R Markdown report?

● Define parameters in the YAML header:

params:
  region: "Asia"

● Use params$region inside the document

● Render with:

rmarkdown::render("report.Rmd", params = list(region = "Europe"))

79. What does the Rocker project offer?
The Rocker project provides prebuilt Docker containers for R. Examples include:

● rocker/r-ver: base R

● rocker/verse: includes tidyverse, RMarkdown

● rocker/shiny: includes Shiny Server for deployment

80. Describe an end-to-end R data science pipeline you have built.

A complete R pipeline typically involves:

● Data Ingestion using DBI or readr

● Data Cleaning & Wrangling using dplyr or data.table

● EDA with ggplot2 or DataExplorer

● Model Building using tidymodels

● Model Evaluation via yardstick, cross-validation

● Deployment using plumber, shiny, or vetiver

● Monitoring using logs, dashboards (e.g., Grafana), or concept drift detection
