
Regression

1. What are the measures in regression?

 In regression analysis, there are several measures used to evaluate the performance
of the model and the relationship between variables. Some common measures
include:

 Mean Squared Error (MSE)

 Root Mean Squared Error (RMSE)

 Mean Absolute Error (MAE)

 R-squared (R²)

 Adjusted R-squared

 Residuals
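
As a quick illustration, here is a minimal scikit-learn sketch that computes these measures; the y_true and y_pred arrays are made-up values, not from any real model.

import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Illustrative true values and model predictions
y_true = np.array([3.0, 5.0, 7.5, 9.0, 11.0])
y_pred = np.array([2.8, 5.4, 7.0, 9.3, 10.5])

mse = mean_squared_error(y_true, y_pred)      # Mean Squared Error
rmse = np.sqrt(mse)                           # Root Mean Squared Error
mae = mean_absolute_error(y_true, y_pred)     # Mean Absolute Error
r2 = r2_score(y_true, y_pred)                 # R-squared
residuals = y_true - y_pred                   # Residuals
# Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1), with n samples and p predictors

print(f"MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")
print("Residuals:", residuals)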

2. What are the assumptions in OLS regression?

 Ordinary Least Squares (OLS) regression relies on several assumptions:

1. Linearity: The relationship between the independent and dependent variables is linear.

2. Independence of Errors: The errors (residuals) are independent of each other.

3. Homoscedasticity: The variance of the errors is constant across all levels of the independent variables.

4. Normality of Errors: The errors follow a normal distribution.

5. No Multicollinearity: There is no perfect multicollinearity among the independent variables.

3. What is the R-squared (R²) value?

 R-squared (R²) is a statistical measure that represents the proportion of the variance
in the dependent variable that is explained by the independent variables in the
model. It ranges from 0 to 1, where 1 indicates a perfect fit and 0 indicates no linear
relationship between the variables.

4. What is overfitting and underfitting and how do they happen? How to solve them?

 Overfitting: Overfitting occurs when a model learns the training data too well,
capturing noise and random fluctuations that are not representative of the true
relationship. It happens when the model is too complex relative to the amount of
data. To solve overfitting, one can:

 Use simpler models.

 Increase the amount of training data.

 Regularize the model by adding penalties to the coefficients.


 Underfitting: Underfitting occurs when a model is too simple to capture the
underlying structure of the data. It happens when the model is not complex enough
to learn from the data. To solve underfitting, one can:

 Use more complex models.

 Add more features or polynomial features.

 Reduce regularization.

5. What is Gradient Descent Algorithm (GDA)?

 Gradient Descent Algorithm is an optimization algorithm used to minimize the loss
function in regression models. It works by iteratively adjusting the parameters
(coefficients) of the model in the direction of the steepest descent of the cost
function. The goal is to find the optimal parameters that minimize the error.
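
A minimal NumPy sketch of gradient descent fitted to simple linear regression; the data, learning rate, and iteration count are illustrative assumptions.

import numpy as np

# Illustrative data: y ≈ 2x + 1 with a little noise
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
y = 2 * x + 1 + rng.normal(0, 0.3, size=x.size)

w, b = 0.0, 0.0          # initial parameters
lr = 0.01                # learning rate (step size)

for _ in range(2000):    # number of iterations
    y_pred = w * x + b
    error = y_pred - y
    # Gradients of the MSE cost with respect to w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Step in the direction of steepest descent
    w -= lr * grad_w
    b -= lr * grad_b

print(f"Learned w={w:.2f}, b={b:.2f}")  # should end up close to 2 and 1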

6. What are the hyperparameters used in regression?

 Some common hyperparameters used in regression models include:

 Regularization Parameter: Controls the amount of regularization applied to the model.

 Learning Rate (for gradient-based algorithms): Determines the step size for
updating the parameters during optimization.

 Number of Iterations: Specifies the maximum number of iterations for
optimization algorithms like gradient descent.

 Penalty Type (L1, L2): Specifies the type of penalty used in regularization (L1
for Lasso, L2 for Ridge).

7. What is cross-validation?

 Cross-validation is a technique used to assess the performance of a predictive model.
It involves splitting the dataset into multiple subsets (folds), training the model on
some of the folds, and evaluating it on the remaining fold. This process is repeated
multiple times, with each fold serving as the test set exactly once. Cross-validation
helps to assess how well the model generalizes to unseen data.
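
A minimal sketch of 5-fold cross-validation with scikit-learn's cross_val_score; the synthetic dataset is purely illustrative.

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

# Each fold serves as the test set exactly once
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print("R² per fold:", scores.round(3))
print("Mean R²:", scores.mean().round(3))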

8. What is grid search?

 Grid search is a technique used for hyperparameter tuning, where a set of
hyperparameters and their values are specified, and the model is trained and
evaluated for all possible combinations of these hyperparameters. The combination
of hyperparameters that yields the best performance on the validation set is then
selected as the optimal set of hyperparameters for the model.
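
A minimal sketch of grid search with scikit-learn's GridSearchCV; the Ridge model and the alpha grid are illustrative choices, not recommendations.

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}   # regularization strengths to try
search = GridSearchCV(Ridge(), param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)   # trains and cross-validates every combination in the grid

print("Best alpha:", search.best_params_)
print("Best CV score:", search.best_score_)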

Classification
1. Which features impact "Y", and by how much?
 In classification, it's important to understand the significance of each feature
in predicting the target variable "Y." Feature importance can be assessed
through methods like coefficients in logistic regression, feature importance
scores in decision trees, or permutation importance in random forests.
2. What is the best way to check feature significance?
 Feature significance can be checked using various techniques such as:
 Coefficient significance in logistic regression.
 Feature importance scores in decision trees or random forests.
 Chi-square test for categorical variables.
 Correlation analysis for continuous variables.
 Feature selection algorithms like Recursive Feature Elimination (RFE)
or SelectKBest.
3. What is logistic regression?
 Logistic regression is a statistical method used for binary classification tasks. It
models the probability that a given input belongs to a certain category (class).
It estimates the probability using a logistic function and is commonly used for
predicting binary outcomes.
4. What are the measures of classifications?
 Common measures of classification performance include:
 Accuracy: Overall correctness of the model.
 Precision: Proportion of true positive predictions out of all positive
predictions.
 Recall (Sensitivity): Proportion of true positive predictions out of all
actual positives.
 F1-score: Harmonic mean of precision and recall.
 Specificity: Proportion of true negative predictions out of all actual
negatives.
 ROC-AUC: Area under the Receiver Operating Characteristic curve.
5. In which data is precision more important than accuracy?
 Precision is more important than accuracy when the cost of false positives is
high. For example, in medical diagnosis, minimizing false positives (i.e., keeping
precision high) helps avoid unnecessary treatments, even if it means sacrificing
some overall accuracy.
6. What is fraud detection?
 Fraud detection is the process of identifying and preventing fraudulent
activities. In finance, it involves using data analysis and machine learning
techniques to detect fraudulent transactions, such as credit card fraud,
identity theft, or money laundering.
7. What is a confusion matrix?
 A confusion matrix is a table used to evaluate the performance of a
classification model. It shows the counts of true positive, false positive, true
negative, and false negative predictions, allowing for the calculation of
various performance metrics like accuracy, precision, recall, and F1-score.
8. What is grid search?
 Grid search is a technique used for hyperparameter tuning in machine
learning models. It involves defining a grid of hyperparameters and evaluating
the model's performance for all possible combinations of hyperparameters
using cross-validation. The combination that gives the best performance is
then selected.
9. What are the hyperparameters used in Classification?
 Hyperparameters used in classification models include:
 Regularization parameter: Controls overfitting in models like logistic
regression (C parameter).
 Learning rate: Controls the step size in gradient-based optimization
algorithms.
 Number of trees: In ensemble methods like random forests or
boosting.
 Kernel type: In Support Vector Machines (SVM).
 Number of neighbors: In k-nearest neighbors (KNN).
 Depth of trees: In decision trees.
10. What are overfitting and underfitting, and how do they happen? How to solve
them in classification?
 Overfitting: Overfitting occurs when a model learns the training data too well,
capturing noise and random fluctuations that are not representative of the
true relationship. It happens when the model is too complex relative to the
amount of data. To solve overfitting:
 Use simpler models.
 Use regularization techniques.
 Increase the size of the training dataset.
 Underfitting: Underfitting occurs when a model is too simple to capture the
underlying structure of the data. It happens when the model is not complex
enough to learn from the data. To solve underfitting:
 Use more complex models.
 Add more features or polynomial features.
 Reduce regularization.
11. What is bias and variance?
 Bias: Bias is the error introduced by approximating a real-world problem with
a simplified model. High bias means the model is too simplistic and fails to
capture the underlying structure of the data.
 Variance: Variance is the error introduced by the model's sensitivity to
fluctuations in the training dataset. High variance means the model is too
sensitive to noise in the training data and may not generalize well to unseen
data.
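
To make the classification measures above concrete, here is a minimal scikit-learn sketch that fits a logistic regression model to illustrative synthetic data and reports the metrics from questions 4 and 7; all values and settings are illustrative.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_prob = clf.predict_proba(X_test)[:, 1]   # class-1 probabilities for ROC-AUC

print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1-score :", f1_score(y_test, y_pred))
print("ROC-AUC  :", roc_auc_score(y_test, y_prob))
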
Decision Tree
1. What are the hyperparameters in decision tree?

 Some common hyperparameters in decision trees include:

 Criterion: The function used to measure the quality of a split. It can be "gini"
for Gini impurity or "entropy" for information gain.

 Max depth: The maximum depth of the tree.

 Min samples split: The minimum number of samples required to split an internal node.

 Min samples leaf: The minimum number of samples required to be at a leaf node.

 Max features: The number of features to consider when looking for the best
split.

 Min impurity decrease: A node will be split if this split induces a decrease of
the impurity greater than or equal to this value.
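
A minimal sketch of how these hyperparameters map onto scikit-learn's DecisionTreeClassifier; the specific values are illustrative, not recommendations.

from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(
    criterion="gini",          # or "entropy" for information gain
    max_depth=5,               # maximum depth of the tree
    min_samples_split=10,      # min samples required to split an internal node
    min_samples_leaf=4,        # min samples required at a leaf node
    max_features="sqrt",       # features considered when looking for the best split
    min_impurity_decrease=0.0, # impurity decrease required for a split
    random_state=0,
)
# tree.fit(X_train, y_train) would then train it on a labelled dataset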

2. What is gini impurity?

 Gini impurity is a measure of how often a randomly chosen element from the set
would be incorrectly labeled if it was randomly labeled according to the distribution
of labels in the subset. It is used as a criterion for splitting in decision trees.

3. What is information gain?

 Information gain is a measure of the reduction in entropy or Gini impurity achieved
by splitting a dataset based on a particular attribute. It quantifies the effectiveness of
a particular attribute in classifying the data.
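
A small NumPy sketch computing Gini impurity, entropy, and the information gain of a candidate split; the class labels are made up for illustration.

import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Shannon entropy of a set of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])   # labels before the split
left   = np.array([0, 0, 0, 0, 1])                  # one side of a candidate split
right  = np.array([1, 1, 1, 1, 1])                  # other side of the split

# Information gain = parent entropy minus the weighted child entropies
n = len(parent)
gain = entropy(parent) - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
print(f"Gini(parent)={gini(parent):.3f}, information gain of split={gain:.3f}")
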
4. Why are decision trees effective for classification?

 Decision trees use criteria like Gini impurity or information gain to decide the best
split for classifying the data. By recursively splitting the data based on the chosen
criteria, decision trees create a tree structure that can efficiently classify data into
different classes. This approach is simple, interpretable, and can handle both
numerical and categorical data, making it useful for classification tasks.

5. What are leaf nodes and root nodes?

 Root Node: The root node is the topmost node in a decision tree, representing the
entire dataset before any split. It contains the feature that best splits the dataset
according to the selected criterion.

 Leaf Nodes: Leaf nodes are the terminal nodes of a decision tree where no further
splits occur. Each leaf node represents a class label, and instances reaching that leaf
node are classified as belonging to that class.

KNN
1. Number of Neighbors (k):

 The number of nearest neighbors to consider when making predictions. Choosing
the right value of k is crucial, as a small k can lead to noise sensitivity while a large k
may smooth out decision boundaries too much.

2. Distance Metric:

 KNN uses distance metrics to measure the similarity between data points. Common
distance metrics include:

 Euclidean distance: \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}

 Manhattan (or City Block) distance: \sum_{i=1}^{n} |x_i - y_i|

 Minkowski distance: \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}, a generalization of the
Euclidean and Manhattan distances, where p is a parameter; p = 2 is equivalent to
Euclidean distance and p = 1 is equivalent to Manhattan distance.

3. Weighting Scheme:

 KNN can assign different weights to neighboring points when making predictions.
The two common weighting schemes are:

 Uniform: All neighbors have the same weight.

 Distance-based: The weight of each neighbor is inversely proportional to its
distance from the query point.

4. Algorithm:

 KNN algorithms can use different strategies to find the nearest neighbors efficiently.
Common algorithms include:

 Brute force: Computes distances between all pairs of points.


 KD tree: Uses a tree data structure to organize points in multi-dimensional
space for efficient nearest neighbor searches.

 Ball tree: Divides the space into nested hyper-spheres for nearest neighbor
searches.

5. Leaf Size (for KD tree and Ball tree):

 The maximum number of points in a leaf node of the tree. Smaller leaf size can lead
to a more accurate but slower search.

6. Parallelization:

 Some implementations of KNN allow parallelization to speed up computation,
especially for large datasets.

Optimizing these hyperparameters through techniques like grid search or random search can
significantly improve the performance of KNN models.
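
A minimal scikit-learn sketch wiring these hyperparameters into KNeighborsClassifier and tuning two of them with grid search; the dataset and values are illustrative.

from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

knn = KNeighborsClassifier(
    n_neighbors=5,        # k, the number of neighbors
    weights="distance",   # "uniform" or distance-based weighting
    metric="minkowski",   # distance metric
    p=2,                  # p=2 -> Euclidean, p=1 -> Manhattan
    algorithm="kd_tree",  # "brute", "kd_tree", or "ball_tree"
    leaf_size=30,         # leaf size for the tree-based algorithms
    n_jobs=-1,            # use all CPU cores (parallelization)
)

# Tuning k and the weighting scheme with grid search
grid = GridSearchCV(knn, {"n_neighbors": [3, 5, 7, 11], "weights": ["uniform", "distance"]}, cv=5)
grid.fit(X, y)
print("Best hyperparameters:", grid.best_params_)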

Naïve bayes
Naive Bayes is a simple but powerful classification algorithm based on Bayes' Theorem. Despite its
simplicity, it has been quite effective in many real-world applications, especially in text classification
and spam filtering. However, Naive Bayes makes certain assumptions, which are essential for its
functioning. These assumptions include:

1. Independence of Features:

 The most crucial assumption of Naive Bayes is that all features are independent of
each other given the class label. In other words, the presence or absence of a
particular feature is unrelated to the presence or absence of any other feature.

 For example, in a spam classification problem, the algorithm assumes that the
occurrence of the word "money" in an email is independent of the occurrence of the
word "free".

2. Distributional Assumptions (e.g., Gaussian Features):

 Each Naive Bayes variant assumes a particular distribution for the features. Gaussian
Naive Bayes, for example, assumes that within each class the continuous features
follow a Gaussian (normal) distribution.

 In practice, this assumption may not always hold, and performance can suffer when
the features deviate strongly from the assumed distribution.

3. No Correlation Between Features:

 Because Naive Bayes assumes conditional independence, it implicitly assumes that
features are uncorrelated given the class label. In practice, features are often
correlated; Naive Bayes can still perform reasonably well in that case, although not as
well as when the features are truly independent.

4. Presence of Sufficient Training Data:

 Naive Bayes assumes that there is enough training data available to accurately
estimate the probabilities of different classes and features.
 In cases where there is limited data, Naive Bayes may not perform as well, as it
heavily relies on the probabilities estimated from the training data.

Despite these assumptions, Naive Bayes often performs remarkably well in practice, especially when
the assumptions are approximately met. However, it's essential to be aware of these assumptions
and evaluate the model's performance accordingly. In situations where the assumptions don't hold,
other algorithms might be more appropriate.
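
A minimal scikit-learn sketch of Gaussian Naive Bayes on illustrative synthetic numeric data (for text classification, MultinomialNB with count features is the usual choice).

from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB()            # assumes Gaussian features within each class
nb.fit(X_train, y_train)     # estimates per-class priors, means, and variances
print("Accuracy:", accuracy_score(y_test, nb.predict(X_test)))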

What is the difference between bagging and boosting models?

How are random forest models better than decision trees?

Both bagging and boosting are ensemble learning techniques used to improve the performance of
machine learning models, but they differ in their approach and implementation.

Bagging (Bootstrap Aggregating):

 Approach: Bagging involves training multiple models independently on different subsets of
the training data and then combining their predictions.

 Training Process:

 Random subsets of the training data are sampled with replacement (bootstrap
samples).

 Each subset is used to train a base model (e.g., decision tree).

 Combining Predictions:

 Predictions from all models are averaged (for regression) or majority-voted (for
classification).

 Key Characteristics:

 Bagging reduces variance and helps to alleviate overfitting by averaging the
predictions of multiple models trained on different subsets of data.

 Examples of bagging algorithms include Random Forest.

Boosting:

 Approach: Boosting involves sequentially training multiple weak learners (models that are
slightly better than random guessing) and adjusting the weights of training instances based
on the performance of previous models.

 Training Process:

 Each model is trained sequentially, and the training instances are re-weighted such
that misclassified instances receive higher weights.

 Subsequent models focus more on the instances that previous models struggled
with.

 Combining Predictions:

 Predictions are combined by giving more weight to the predictions of models that
perform better on the training data.
 Key Characteristics:

 Boosting reduces bias and can achieve better performance than bagging by focusing
on difficult-to-classify instances.

 Examples of boosting algorithms include AdaBoost, Gradient Boosting Machines
(GBM), and XGBoost.

Difference between Bagging and Boosting:

1. Training Process: Bagging trains models independently in parallel, while boosting trains
models sequentially, with each model correcting the errors of its predecessors.

2. Weighting of Instances: Bagging assigns equal weight to all training instances, whereas
boosting assigns higher weights to misclassified instances.

3. Bias-Variance Tradeoff: Bagging reduces variance but may not significantly reduce bias, while
boosting reduces bias and variance.

4. Model Complexity: Bagging typically uses high-variance, low-bias models (e.g., deep decision
trees), while boosting uses simple models (e.g., shallow decision trees or stumps).
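
A minimal scikit-learn sketch contrasting a bagging-style ensemble (Random Forest) with a boosting ensemble (Gradient Boosting) on the same illustrative data.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Bagging-style ensemble: many deep trees trained independently on bootstrap samples
bagging_model = RandomForestClassifier(n_estimators=200, random_state=0)

# Boosting ensemble: shallow trees trained sequentially, each correcting its predecessors
boosting_model = GradientBoostingClassifier(n_estimators=200, max_depth=2, random_state=0)

for name, model in [("Random Forest (bagging)", bagging_model),
                    ("Gradient Boosting (boosting)", boosting_model)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")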

Random Forest vs. Decision Trees:

 Random Forest:

 Random Forest is an ensemble learning method based on bagging.

 It builds multiple decision trees on random subsets of the data and combines their
predictions through averaging or voting.

 Advantages:

 Reduces overfitting compared to individual decision trees.

 Handles high-dimensional data well.

 Provides estimates of feature importance.

 Disadvantages:

 Computationally more expensive than individual decision trees.

 May not provide as interpretable models as single decision trees.

 Decision Trees:

 Decision trees are simple, interpretable models that recursively split the data based
on feature conditions.

 Advantages:

 Easy to understand and interpret.

 Can handle both numerical and categorical data.

 No assumptions about data distribution.

 Disadvantages:
 Prone to overfitting, especially with deep trees.

 Lack of generalization; may not perform well on unseen data if overfitting
occurs.

Overall, Random Forest models are often better than individual decision trees because they reduce
overfitting by averaging predictions from multiple trees trained on different subsets of data, while
still maintaining the interpretability and flexibility of decision trees. Additionally, Random Forests
provide robustness against noise and outliers and are less sensitive to hyperparameters compared to
individual decision trees.
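
A minimal sketch comparing a single decision tree with a random forest on illustrative data, including the feature-importance estimates mentioned above; the train/test gap of the single tree is what overfitting looks like in practice.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=10, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# A fully grown single tree typically fits the training set almost perfectly (overfitting),
# while the forest usually generalizes better to the held-out test set.
print("Tree  : train", round(tree.score(X_train, y_train), 3), "test", round(tree.score(X_test, y_test), 3))
print("Forest: train", round(forest.score(X_train, y_train), 3), "test", round(forest.score(X_test, y_test), 3))
print("Feature importances:", forest.feature_importances_.round(2))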

What is feature engineering in short?


Feature engineering is the process of selecting, creating, or transforming features (input variables) in
a dataset to improve the performance of machine learning models. It involves:

1. Feature Selection: Choosing the most relevant features that have the most predictive power
for the target variable. This can involve removing irrelevant or redundant features that do
not contribute much to the model's performance.

2. Feature Creation: Creating new features from existing ones that may capture additional
information or patterns in the data. This can include mathematical transformations,
combining features, or generating new features from domain knowledge.

3. Feature Transformation: Transforming features to make them more suitable for modeling.
This can include scaling features to a similar range, encoding categorical variables, or
handling missing values.

In short, feature engineering aims to make the data more informative and representative, ultimately
improving the model's ability to learn and make accurate predictions.
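
A minimal pandas/scikit-learn sketch of the three steps above on an illustrative toy table; the column names and values are made up.

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Illustrative raw data
df = pd.DataFrame({
    "price": [100.0, 250.0, None, 180.0],
    "quantity": [2, 1, 4, 3],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
})

# Feature transformation: handle missing values and scale numeric columns
df["price"] = df["price"].fillna(df["price"].median())
df[["price", "quantity"]] = StandardScaler().fit_transform(df[["price", "quantity"]])

# Feature creation: a new feature combining existing ones
df["price_x_quantity"] = df["price"] * df["quantity"]

# Feature transformation: encode the categorical column
df = pd.get_dummies(df, columns=["city"])

print(df.head())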

------------------------------------------------------------------------------------------------------------------------------------

1. What is PCA (Principal Component Analysis)?

 PCA is a dimensionality reduction technique used to simplify complex datasets by
reducing the number of features while preserving most of the original information. It
achieves this by transforming the original features into a new set of orthogonal
(uncorrelated) features called principal components. These components are ordered
by the amount of variance they explain in the data, allowing for the retention of the
most important information in fewer dimensions.
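
A minimal scikit-learn sketch of PCA on the Iris data; keeping two components is an illustrative choice.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = load_iris().data                          # 4 original features
X_scaled = StandardScaler().fit_transform(X)  # PCA is sensitive to feature scale

pca = PCA(n_components=2)                     # keep the 2 strongest components
X_reduced = pca.fit_transform(X_scaled)

print("Reduced shape:", X_reduced.shape)                        # (150, 2)
print("Explained variance ratio:", pca.explained_variance_ratio_.round(3))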

2. What is dimensionality reduction?

 Dimensionality reduction refers to the process of reducing the number of features
(or dimensions) in a dataset while preserving as much information as possible. It is
done to address the curse of dimensionality, improve computational efficiency, and
reduce overfitting in machine learning models.

3. What are the measures to get the best number of clusters?


 There are several methods to determine the optimal number of clusters in a dataset:

 Elbow Method: Plot the within-cluster sum of squares (WCSS) against the
number of clusters, and identify the "elbow" point where the rate of
decrease slows down.

 Silhouette Score: Calculate the average silhouette score for different
numbers of clusters, where a higher score indicates better-defined clusters.

 Gap Statistics: Compare the within-cluster dispersion of the data to a null
reference distribution to find the optimal number of clusters.

 Davies-Bouldin Index: Minimize the Davies-Bouldin index, which measures
the average similarity between each cluster and its most similar cluster, with
lower values indicating better clustering.

 Calinski-Harabasz Index: Maximize the Calinski-Harabasz index, which
measures the ratio of between-cluster dispersion to within-cluster
dispersion, with higher values indicating better clustering.
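
A minimal scikit-learn sketch of the elbow method (WCSS, exposed as KMeans inertia) and the silhouette score over a range of k values; the blob data is illustrative.

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss = km.inertia_                     # within-cluster sum of squares (elbow method)
    sil = silhouette_score(X, km.labels_)  # higher = better-defined clusters
    print(f"k={k}: WCSS={wcss:.1f}, silhouette={sil:.3f}")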

4. What are inter-cluster distance and intra-cluster distance?

 Inter-cluster Distance: Inter-cluster distance refers to the distance between different
clusters in a dataset. It measures how distinct clusters are from each other. In
hierarchical clustering, inter-cluster distance is used to determine which clusters to
merge.

 Intra-cluster Distance: Intra-cluster distance refers to the average distance between
data points within the same cluster. It measures the compactness or cohesion of a
cluster. Clusters with low intra-cluster distance have data points that are close to
each other, indicating a well-defined cluster.

--------------------------------------------------------------------------------------------------------------------------------------

Important Points for Forecasting:

1. Objective: Forecasting aims to predict future values based on past data and trends to
support decision-making.

2. Time Series Data: Forecasting deals with time series data, where observations are collected
at regular intervals over time.

3. Components of Time Series:

 Trend: The long-term movement or direction of the data.

 Seasonality: Periodic fluctuations that occur at fixed intervals.

 Cyclic Patterns: Non-periodic fluctuations that occur over a long period.

 Irregularity (Noise): Random fluctuations in the data.

4. Forecasting Methods:
 Qualitative Methods: Based on expert judgment, surveys, or market research.

 Quantitative Methods: Based on statistical and mathematical techniques.

5. Common Quantitative Forecasting Techniques:

 Moving Average

 Exponential Smoothing

 ARIMA Models (Autoregressive Integrated Moving Average)

 Seasonal Decomposition

6. Model Evaluation: Forecasting models should be evaluated using appropriate metrics like
Mean Absolute Error (MAE), Mean Squared Error (MSE), or Forecast Bias.

Moving Average Method:

 Definition: Moving average is a simple method of smoothing time series data by calculating
the average of consecutive data points within a sliding window.

 Calculation: For each time point, the moving average is calculated by taking the average of
the data points in the window.

 Purpose: Moving averages help to reduce noise and identify trends or patterns in the data.

 Types:

 Simple Moving Average (SMA): Uses the average of the last n observations.

 Weighted Moving Average (WMA): Assigns different weights to different
observations within the window.

 Exponential Moving Average (EMA): Assigns exponentially decreasing weights to
past observations.
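
A minimal pandas sketch of the three moving-average variants on an illustrative series; the window size and WMA weights are arbitrary choices.

import numpy as np
import pandas as pd

sales = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119])  # illustrative values

sma = sales.rolling(window=3).mean()                # Simple Moving Average over the last 3 points
weights = np.array([0.2, 0.3, 0.5])                 # more weight on recent observations
wma = sales.rolling(window=3).apply(lambda w: np.dot(w, weights), raw=True)  # Weighted Moving Average
ema = sales.ewm(span=3, adjust=False).mean()        # Exponential Moving Average

print(pd.DataFrame({"sales": sales, "SMA": sma, "WMA": wma, "EMA": ema}))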

Exponential Smoothing Method:

 Definition: Exponential smoothing is a forecasting method that assigns exponentially
decreasing weights to past observations.

 Calculation: The forecast for the next time period is calculated as a weighted average of the
current observation and the forecasted value from the previous time period.

 Purpose: Exponential smoothing is used to smooth out short-term fluctuations; extensions
such as Holt's and Holt-Winters methods additionally capture trend and seasonality.

 Parameters: It has a smoothing parameter (α) that controls the rate of decay of past
observations' influence.
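
A minimal sketch of simple exponential smoothing, where the next forecast is F(t+1) = α·y(t) + (1 − α)·F(t); the series and α value are illustrative.

def simple_exponential_smoothing(series, alpha=0.3):
    """Return one-step-ahead forecasts; forecasts[t] uses data up to time t-1."""
    forecasts = [series[0]]                  # initialize with the first observation
    for y in series[:-1]:
        # New forecast = alpha * latest observation + (1 - alpha) * previous forecast
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

demand = [112, 118, 132, 129, 121, 135, 148]   # illustrative values
print(simple_exponential_smoothing(demand, alpha=0.3))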

ARIMA Models (Autoregressive Integrated Moving Average):

 Definition: ARIMA models are a class of time series forecasting models that capture different
components of a time series: autoregressive (AR), differencing (I), and moving average (MA).

 Components:
 Autoregressive (AR): The value of the time series depends on its past values.

 Integrated (I): Differencing to make the time series stationary.

 Moving Average (MA): The value of the time series depends on past forecast errors.

 ARIMA Notation: ARIMA(p, d, q), where:

 p is the order of the autoregressive part.

 d is the degree of differencing.

 q is the order of the moving average part.

 Purpose: ARIMA models are suitable for forecasting time series data with trends; the
seasonal extension (SARIMA) additionally handles seasonality.

These methods and models are fundamental techniques used in time series forecasting to make
predictions based on historical data.
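
A minimal sketch using the ARIMA implementation in statsmodels; the series and the (1, 1, 1) order are illustrative assumptions, not a recommended specification.

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Illustrative monthly series with an upward trend
series = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
                    115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140])

model = ARIMA(series, order=(1, 1, 1))   # p=1 (AR), d=1 (differencing), q=1 (MA)
result = model.fit()

print(result.params)                                              # fitted coefficients
print("Next 3 forecasts:", result.forecast(steps=3).round(1).tolist())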

In the context of SVM (Support Vector Machine), "C" is a hyperparameter that controls the trade-off
between maximizing the margin and minimizing the classification error.

Here's what it represents:

C:

 Regularization Parameter: "C" is the regularization parameter in SVM.

 Trade-off Parameter: It determines the trade-off between allowing the model to fit the
training data as best as possible and keeping the model's complexity low to avoid overfitting.

 Penalty for Misclassification: A smaller value of "C" encourages a larger margin and allows
more misclassifications in the training data. Conversely, a larger value of "C" penalizes
misclassifications more heavily, leading to a smaller margin.

 Tuning Parameter: "C" needs to be tuned to find the optimal value that balances the margin
width and classification accuracy on the training data.

In summary, "C" in SVM is a tuning parameter that controls the regularization strength, influencing
the balance between the model's bias and variance.
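
A minimal scikit-learn sketch showing how different C values shift the trade-off between training fit and generalization; the dataset and C values are illustrative.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=5, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for C in [0.01, 1, 100]:
    clf = SVC(C=C, kernel="rbf").fit(X_train, y_train)
    # Small C: wider margin, more training errors tolerated.
    # Large C: misclassifications penalized heavily, narrower margin, higher overfitting risk.
    print(f"C={C}: train acc={clf.score(X_train, y_train):.3f}, test acc={clf.score(X_test, y_test):.3f}")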
