NUS| SCHOOL OF Computing
AI and ML
Fundamentals
         Dr. Ai Xin
                      Agenda
 Day 1: Introduction to AI and ML
  Introduction and AI applications case study
 Day 2: Machine Learning I
  Intro, K-Means, KNN and Decision Tree
 Day 3: Machine Learning II
  Linear Regression, Logistic Regression and Neural Networks
 Day 4: Deep Learning
  Convolutional Neural Networks for Computer Vision
  Transformer for Natural Language Processing
 Day 5: Project Presentation
Day 1: Introduction to AI and ML
WHAT IS ARTIFICIAL INTELLIGENCE?
    AI definition: the effort to automate intellectual tasks normally
    performed by humans
    AI is a general field that encompasses machine learning and deep
    learning, and many other approaches which don’t involve any learning.
             Symbolic AI (1950s-1980s)
  Rules + Data  ->  Classical Programming  ->  Answers
             Machine Learning (1990s-now)
  Data + Answers  ->  Machine Learning  ->  Rules
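The contrast between the two paradigms can be sketched in a few lines of Python. This is only an illustration: the temperature readings, the labels, and the function names `rule_based_is_hot` and `learn_threshold` are invented for this example and are not part of the course material.

```python
# Classical programming: a human writes the rule; rule + data -> answers.
def rule_based_is_hot(temperature_c):
    return temperature_c >= 30  # threshold picked by the programmer (assumed)


# Machine learning: data + answers -> rule. Here the "rule" is simply the
# threshold that makes the fewest mistakes on the labeled examples.
def learn_threshold(temperatures, labels):
    best_threshold, best_errors = None, len(labels) + 1
    for candidate in sorted(set(temperatures)):
        errors = sum((t >= candidate) != y for t, y in zip(temperatures, labels))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


temperatures = [18, 22, 25, 27, 31, 33, 36]                # hypothetical data
labels = [False, False, False, False, True, True, True]   # hypothetical answers

learned = learn_threshold(temperatures, labels)
print("hand-written rule says 32 is hot:", rule_based_is_hot(32))
print("threshold learned from data:", learned)
```

In the first function the rule is an input supplied by a person; in the second the rule is an output derived from examples.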
Deep Learning (2010s- now)
                  WHY NOW?
 Big Data
Computing Power
 Algorithms
                                                      Generative AI (2020s- now)
Source: “On the Opportunities and Risks of Foundation Models”, Stanford University
         Case Study:
How can AI help business in the
         real world?
 Case Study 1: Smart Farming
                   PROBLEM
 The efficient use of farmland is critical, especially in
 view of the increased use of pesticide chemicals,
 which brings environmental risks and direct hazards
 for human health.
                 SOLUTION
Use computer vision techniques to sense where crops
are threatened by pests, and control robotic
equipment to fire accurate blasts of pesticide
chemicals at the affected crops, while leaving others
untouched.
    Case Study 2: Smart Lift
              PROBLEM
   When a lift breaks down, it affects a lot
   of people significantly and it takes time
   for the workers to find the problem and
   fix it.
             SOLUTION
KONE has connected more than 1 million of
its escalators and elevators to the cloud.
They are fitted with sensors to collect the
data and machine learning algorithms are
trained on this data to “understand” when
faults or breakdowns are likely to occur.
Case Study 3: Autonomous Machine
              PROBLEM
   One of the biggest obstacles in
   space exploration is the limited
   amount of bandwidth available for
   sending information back to Earth.
             SOLUTION
Space exploration generates huge volumes
of data, and it is far more efficient to use
autonomous machines to work out what is
worth sending home and what can be
discarded.
Case Study 4: Pizza Checker
             PROBLEM
 Customers will be dissatisfied if pizzas are
 cooked or delivered that do not meet their
 expectations for consistency and quality.
           SOLUTION
 Photograph every pizza as it leaves the
 oven, then use deep learning algorithms
 to inspect it for quality before it reaches
 the customer.
Case Study 5: GenAI in Media
                    SUMMARY
 • Write stories for newsrooms
 • Help journalists digest information, create summaries,
   create video content and more
 • In the world of sports broadcasting, generate real-time
   commentary in multiple languages and create
   interactive, personalized features for viewers
 • Post-production processes for film: generate realistic
   visual effects and streamline editing
 • Revolutionize music creation
 • For artists, generate new images, manipulate existing
   images and even complete unfinished art works
Case Study 6: GenAI in Healthcare
                    SUMMARY
 • Deliver personalized advice to patients
 • Interpret medical images and generate reports
   based on the images
 • Personalize treatment
 • Streamline medical note-taking and handle routine
   calls and enquiries at clinics
 • In general, make healthcare more accessible and
   efficient
Each group to study one case and
      share with the others
                                           Project
 Each group is to propose a real-life AI and ML solution in one of the industries below
  Healthcare Industry (Group 1)
     E.g. diagnosing diseases, analyzing medical images, predicting patient outcomes, drug discovery,
      personalized medicine, and improving healthcare operations.
  Finance Industry (Group 2)
     E.g. fraud detection, algorithmic trading, risk assessment, credit scoring, customer service automation, and
      financial market analysis.
  Retail and E-commerce Industry (Group 3)
     E.g. personalized shopping experiences, recommendation systems, demand forecasting, inventory
      management, pricing optimization, and supply chain optimization.
  Manufacturing (Group 4)
     E.g. predictive maintenance, quality control, process optimization, supply chain management, robotics, and
      autonomous vehicles in manufacturing and industrial settings
  Transportation and Logistics (Group 5)
    E.g. route optimization, demand forecasting, fleet management, autonomous vehicles, traffic pattern
     analysis, and supply chain optimization
  Education (Group 6)
    E.g. adaptive learning platforms, intelligent tutoring systems, personalized learning experiences,
     automated grading, and plagiarism detection.
  Entertainment and Media (Group 7)
    E.g. content recommendation, personalized advertising, sentiment analysis, speech recognition, and
     natural language processing in the entertainment and media industry.
 Let me know if you want to work on problems in other areas
                                       Project
 Each group will need to present:
  Presentation Slides:
    Problem
    Challenges
    Proposed Solution
  Jupyter Notebook Demo: a simple implementation to show the feasibility of your
   proposal / solution
    Collect simple data/images/text from the internet or create them yourselves
    Build or load an AI/ML model and apply it to the data/images/text using, for example:
         scikit-learn (sklearn)
         TensorFlow
         Transformers (Hugging Face)
         etc.
      Day 2: Machine Learning I
(Intro, K-Means, KNN, Decision Tree)
   What is Machine Learning?
• Machine Learning is the science (and art) of programming computers
  so they can learn from data.
• A slightly more general definition:
    [Machine Learning is the] field of study that gives computers the ability to learn
    without being explicitly programmed.
    —Arthur Samuel, 1959
Types of Machine Learning
Supervised vs. unsupervised
DATA DEFINITIONS: LABELED AND UNLABELED DATA
   Unlabeled data: Variables or Features only, No Target (labels)
    We do not know what type of iris flowers they are.
    There are no labels.
   Labeled data: Variables or Features plus a Target (labels)
    We collected the data from a known group of iris flower types.
    The iris column is the target, or labels; hence labeled data.
    Classification or prediction techniques are used in these cases.
Classification vs. Regression
in Supervised Learning
• Regression
 month    town           flat_type  block  street_name        storey_range  floorAreaSqm  resale_price
 2012-09  CHOA CHU KANG  4 ROOM     119    TECK WHYE LANE     04 TO 06      104           400000
 2013-10  JURONG WEST    3 ROOM     510    JURONG WEST ST 52  13 TO 15      74            375000
 2013-03  JURONG EAST    5 ROOM     284    TOH GUAN RD        07 TO 09      120           655000
• Classification
 month    town           flat_type  block  street_name        storey_range  floorAreaSqm  resale_price
 2012-09  CHOA CHU KANG  4 ROOM     119    TECK WHYE LANE     04 TO 06      104           Medium
 2013-10  JURONG WEST    3 ROOM     510    JURONG WEST ST 52  13 TO 15      74            Low
 2013-03  JURONG EAST    5 ROOM     284    TOH GUAN RD        07 TO 09      120           High
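The difference between the two tables lies only in the target column: a number for regression, a category for classification. A minimal sketch of turning one into the other follows. The band cut-offs (`low_below`, `high_from`) are assumptions chosen so that the three example rows come out as Medium, Low and High; the slides do not state the actual cut-offs.

```python
# Three rows from the resale table; only a few columns are kept for brevity.
flats = [
    {"town": "CHOA CHU KANG", "floorAreaSqm": 104, "resale_price": 400000},
    {"town": "JURONG WEST", "floorAreaSqm": 74, "resale_price": 375000},
    {"town": "JURONG EAST", "floorAreaSqm": 120, "resale_price": 655000},
]


def price_band(price, low_below=400000, high_from=600000):
    """Map a numeric price to a category. Both cut-offs are hypothetical."""
    if price < low_below:
        return "Low"
    if price >= high_from:
        return "High"
    return "Medium"


# Regression target: the price itself (a continuous number).
regression_targets = [flat["resale_price"] for flat in flats]
# Classification target: a discrete label derived from the price.
classification_targets = [price_band(price) for price in regression_targets]

print(regression_targets)
print(classification_targets)
```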
                  MACHINE LEARNING APPROACHES
HTTPS://WWW.WORDSTREAM.COM/BLOG/WS/2017/07/28/MACHINE-LEARNING-APPLICATIONS, ABDUL WAHID
Machine Learning Process
  [Figure: the machine learning process drawn as a cycle]
  • Data Acquisition: identify, collect, clean, clean, clean
  • DISCOVERY
      Explore and Visualize: charts, tables, correlations, summaries
  • LEARNING
      Unsupervised learning to find patterns: K-Means, PCA, ARules
      Supervised learning to build models: kNN, LR, LM, SVM, Trees, Random Forest, NN, DL
  • Test and Score: evaluation metrics on goodness of model
  • DEPLOYMENT
      Predictions: deployment to predict, automate updates
              k-means Clustering
• Steps to perform k-means clustering:
  1. Define the number of clusters (k).
  2. Choose k data points randomly to serve as the initial
     centroids for the k clusters.
  3. Assign each data point to the cluster represented by its
     nearest centroid.
  4. Find a new centroid for each cluster by calculating the
     mean vector of its members.
  5. Undo the memberships of all data objects. Repeat steps
     3 to 5 until cluster membership no longer changes.
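The five steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the lab implementation: the six 2-D points are made up, the helper names are hypothetical, and an empty cluster simply keeps its previous centroid.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_vector(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, max_iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # step 2: k random data points
    assignments = None
    for _ in range(max_iterations):
        # Step 3: assign every point to its nearest centroid. Assignments are
        # rebuilt from scratch each round, which is the "undo" in step 5.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:     # membership unchanged: stop
            break
        assignments = new_assignments
        # Step 4: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                        # assumption: keep old centroid if empty
                centroids[c] = mean_vector(members)
    return centroids, assignments


# Step 1: choose k. The data below are two made-up, well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(points, k=2)
print(centroids)
print(assignments)
```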
k-means Clustering - Illustration
  [Figure panels, in order:]
  1. 3 randomly chosen centroids
  2. Assign each object to nearest centroid
  3. Recalculate centroid
  4. Reassign objects
  5. Stop if membership does not change or max iterations reached
              k-means Clustering
• Strengths:
  • Algorithm is efficient and easy to implement
• Weaknesses:
  • May not know what the value of k should be beforehand
  • Sensitive to the choice of initial k centroids: clusters tend
    to converge to a local optimum solution
  • Sensitive to noise
             k-means Clustering
• Determine the value of k using the Elbow Method
  • Sum of Squared Errors (SSE) at each number of clusters
    is calculated and graphed
  • Look for a change of slope from steep to shallow (an
    elbow) to determine the optimal number of clusters
  • This method is inexact, but still potentially helpful
               k-means Clustering
• Elbow Method
  •   SSE takes the distance from each
      data object to its cluster centroid,
      squares it, and sums the squared
      distances over all objects.
  •   The larger K is, the lower the SSE,
      but beyond a certain point (e.g.
      K=3) adding more clusters does not
      reduce SSE significantly; therefore
      K=3 is the desired K.
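A small sketch of how the SSE curve might be computed for several values of K is given below. To keep it short and reproducible it works on made-up 1-D data and starts k-means from evenly spaced points of the sorted data; that deterministic start is an assumption of this sketch, whereas the steps on the earlier slide pick the initial centroids at random.

```python
def sse(points, centroids, assignments):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum((p - centroids[a]) ** 2 for p, a in zip(points, assignments))


def k_means_1d(points, k, max_iterations=100):
    # Deterministic initial centroids (assumption made for reproducibility).
    ordered = sorted(points)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [
            min(range(k), key=lambda c: (p - centroids[c]) ** 2) for p in points
        ]
        if new_assignments == assignments:
            break
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignments


# Made-up data with three visible groups around 1, 5 and 9.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.2, 8.8]

sse_by_k = {}
for k in range(1, 7):
    centroids, assignments = k_means_1d(points, k)
    sse_by_k[k] = sse(points, centroids, assignments)

for k, value in sse_by_k.items():
    print(f"K={k}  SSE={value:.2f}")
```

On this toy data the SSE falls steeply up to K=3 and changes very little afterwards, which is the elbow the slide describes.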
Lab 1: K-Means
K-Nearest Neighbors (KNN)
 • The k-nearest neighbors algorithm (k-NN) is a non-parametric
   method used for classification and regression. In both cases,
   each input sample has k closest training samples (“neighbors”)
   in the feature space.
 • In k-NN classification, the output is a class membership. An
   object is classified by a majority vote of its neighbors, with the
   object being assigned to the class most common among its k
   nearest neighbors.
 • In k-NN regression, the output is the estimated value for the
   object. This value is the average of the values of k nearest
   neighbors (function estimation using KNN).
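Both variants can be sketched with the standard library alone. The training points, labels and helper names below are made up for illustration; the classification data are arranged so that the vote flips between k = 3 and k = 5.

```python
from collections import Counter


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_nearest(train_x, query, k):
    """Indices of the k training samples closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: squared_distance(train_x[i], query))
    return order[:k]


def knn_classify(train_x, train_y, query, k):
    """Majority vote among the k nearest neighbors."""
    votes = Counter(train_y[i] for i in k_nearest(train_x, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_x, train_y, query, k):
    """Average of the target values of the k nearest neighbors."""
    neighbors = k_nearest(train_x, query, k)
    return sum(train_y[i] for i in neighbors) / len(neighbors)


# Classification: hypothetical 2-D samples around the query point (0, 0).
shapes_x = [(0.5, 0.0), (0.0, 0.6), (-0.7, 0.0), (0.0, -1.0), (1.1, 0.0), (2.0, 2.0)]
shapes_y = ["triangle", "triangle", "square", "square", "square", "triangle"]
class_k3 = knn_classify(shapes_x, shapes_y, (0.0, 0.0), k=3)
class_k5 = knn_classify(shapes_x, shapes_y, (0.0, 0.0), k=5)

# Regression: hypothetical 1-D samples with numeric targets.
values_x = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
values_y = [10.0, 20.0, 30.0, 40.0, 50.0]
estimate = knn_regress(values_x, values_y, (2.9,), k=3)

print(class_k3, class_k5, estimate)
```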
           K-Nearest Neighbors (KNN)
                • Example of k-NN classification. The
                  test sample (green dot) should be
                  classified either to blue squares or to
                  red triangles.
                • If k = 3 (solid line circle) it is assigned
                  to the red triangles because there
                  are 2 triangles and only 1 square
                  inside the inner circle.
                • If k = 5 (dashed line circle) it is
                  assigned to the blue squares (3
                  squares vs. 2 triangles inside the
                  outer circle).
                [Figure: k=3 vs k=5, plotted on axes Feature 1 and Feature 2]
WWW.WIKIPEDIA.ORG
K-Nearest Neighbors (KNN)
 • Advantages:
   – Easy to implement
   – Simple and understandable
   – Completely data-driven
 • Disadvantages:
   – Slow at test (prediction) time
   – Sensitive to noisy or irrelevant data
   – Sensitive to K
Lab 2: KNN
Decision Tree
 •   Intuitive and easy to interpret
 •   Requires very little data preparation
 •   Easily deployed in rule-based systems
 •   Built-in variable selection
Source:
https://bigwhalelearning.files.wordpress.com/2014/11/titanic_heuristic.png
https://www.datacamp.com/community/tutorials/decision-tree-classification-python
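Two of the points above, built-in variable selection and easy translation into rules, can be illustrated by growing just the root node of a tree. The sketch below scores every candidate split with Gini impurity, which is one common criterion; the slides do not say which criterion the lab uses. The feature names, rows and labels are invented for this example.

```python
def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels, feature_names):
    """Try every feature and threshold; keep the lowest weighted impurity."""
    best = None  # (weighted_gini, feature_name, threshold)
    for j, name in enumerate(feature_names):
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must send samples to both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best


# Hypothetical data: (hours_studied, attended_review) -> exam result.
feature_names = ["hours_studied", "attended_review"]
rows = [(2, 0), (9, 1), (4, 1), (7, 1), (7, 0), (10, 0), (1, 0), (5, 1)]
labels = ["fail", "pass", "pass", "pass", "fail", "fail", "fail", "pass"]

score, name, threshold = best_split(rows, labels, feature_names)
print(f"IF {name} <= {threshold} THEN go left ELSE go right "
      f"(weighted Gini {score:.2f})")
```

The search picks the feature that separates the classes best without being told which one matters, and the result reads directly as an IF/ELSE rule.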
Lab 3: Decision Tree
            Day 3: Machine Learning II
(Linear and Logistic Regression, Neural Networks)
Linear Regression
Lab 4: Linear Regression
               Logistic Regression
• The logistic regression model is
  represented in terms of the logistic function as

  $$ y = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)}} $$

  [Diagram: inputs $x_1, x_2, \ldots, x_n$ -> Logistic Regression Model -> $y$]
• Here $x_i$ are the inputs to the model and $\beta_i$ are
  the coefficients estimated by the model
• This probability formula is the function
  known as the logistic curve
• The decision boundary is often set at 0.5
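The formula y = 1 / (1 + e^-(β0 + β1·x1 + … + βn·xn)) and the 0.5 cut-off can be written out directly. In the sketch below the coefficient values and the input vector are made-up numbers, not estimates from any dataset, and the function names are hypothetical.

```python
import math


def logistic(z):
    """The logistic curve: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, beta0, betas):
    """y = 1 / (1 + exp(-(beta0 + beta1*x1 + ... + betan*xn)))."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return logistic(z)


def predict_class(x, beta0, betas, threshold=0.5):
    """Class 1 when the predicted probability reaches the threshold."""
    return 1 if predict_probability(x, beta0, betas) >= threshold else 0


# Hypothetical coefficients and one hypothetical input sample.
beta0, betas = -4.0, [0.05, 1.0]
x = [60.0, 1.5]

probability = predict_probability(x, beta0, betas)
label = predict_class(x, beta0, betas)
print(f"probability={probability:.4f}  class={label}")
```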
Lab 5: Logistic Regression
              Perceptron: A Single Neuron
  [Figure: a single neuron and its activation function]
  Sources: Deep Learning: A Visual Approach, Andrew Glassner, 2021
https://www.simplilearn.com/tutorials/deep-learning-tutorial/what-is-neural-network
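As a rough sketch of what a single neuron computes: it takes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. The inputs, weights and bias below are arbitrary illustrative values, and sigmoid and ReLU are shown as two common activation choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


inputs = [1.0, 2.0]        # hypothetical feature values
weights = [0.5, -0.25]     # hypothetical weights
bias = 0.0

out_sigmoid = neuron(inputs, weights, bias, sigmoid)
out_relu = neuron(inputs, weights, bias, relu)
print(out_sigmoid, out_relu)
```

With a sigmoid activation, this computation has the same form as the logistic regression model.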
   Multilayer Perceptron: A Simple Neural Network
  [Figure: a simple neural network mapping Input Features to a Target]
How to train a Simple Neural Network
Lab 6: Neural Networks
Day 4: Deep Learning
Introduction to convnets
  [Figure: Conv2D layer (through filters) followed by MaxPooling, with an example]
  Sources: https://youtu.be/x_VrgWTKkiM
           Deep Learning: A Visual Approach, Andrew Glassner, 2021
           http://cs231n.github.io/convolutional-networks/
           http://cs231n.stanford.edu/
           https://www.nomidl.com/deep-learning/what-is-relu-and-sigmoid-activation-function/
A VISUALIZATION …
WWW.YOUTUBE.COM , S DMITRIEV
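The two building blocks named on the convnet slides, a Conv2D filter and max pooling, can be sketched on a tiny image in plain Python. The 5x5 image and the 2x2 vertical-edge filter are made up for illustration. The sketch uses no padding and a stride of 1 for the convolution, and slides the filter without flipping it, as deep learning libraries conventionally do.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical image: dark on the left (0), bright on the right (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical filter that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)   # 4x4 map, large where the edge is
pooled = max_pool2d(feature_map)      # 2x2 summary of the feature map
print(feature_map)
print(pooled)
```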
Lab 7: Fashion MNIST
Lab 8: Fashion MNIST using CNN
                           VGG16 and Transfer Learning
• 16 layers: 13 conv + 3 fc
• Trained on the ImageNet dataset
  – ImageNet holds more than 14 million images; the ILSVRC subset used to train VGG16 has about
    1.3 million training images in 1000 classes (e.g. animals, plants, vehicles, household items, and
    natural scenes)
FOOD IMAGES CLASSIFIER
         An Example Language Model
Input: Sequence of Words             Output: probability distribution
                                     Over the dictionary / vocabulary
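The input/output contract of a language model can be illustrated with a toy bigram model: it looks only at the last word of the input sequence and returns a probability for every word in the vocabulary. This is a deliberately simplified stand-in, not how Transformer models work internally; the two-sentence corpus and the function name `next_word_distribution` are made up for this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical training text, already split into words.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocabulary = sorted(set(corpus))

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for previous, current in zip(corpus, corpus[1:]):
    bigram_counts[previous][current] += 1


def next_word_distribution(words):
    """Input: a sequence of words. Output: a probability for every vocabulary word."""
    counts = bigram_counts[words[-1]]          # simplification: last word only
    total = sum(counts.values())
    if total == 0:                             # unseen context: fall back to uniform
        return {word: 1.0 / len(vocabulary) for word in vocabulary}
    return {word: counts[word] / total for word in vocabulary}


distribution = next_word_distribution(["the", "cat", "sat", "on", "the"])
for word, probability in distribution.items():
    print(f"{word:>4}  {probability:.2f}")
```

The probabilities over the whole vocabulary sum to 1; generating text amounts to repeatedly picking a word from this distribution and appending it to the input.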
Transformer: Attention is all you need!
             The Evolution of the Transformer Model
  Transformer (2017)
    • Surpassed CNN & RNN on NLP tasks
  GPT (2018)
    • Pre-trained to predict the next word
    • Fine-tuning using specific task data
  BERT (2019)
    • Pre-trained to encode the input sentence
    • Fine-tuning for downstream tasks
  GPT-2 (2019) & GPT-3 (2020)
    • Do not require fine-tuning
    • Instruction in a prompt -> detailed response
  ChatGPT (2022) & GPT-4 (2023)
    • Pre-trained to predict the next word
    • Fine-tuned using human feedback
    • Improvements in truthfulness and reductions in toxic outputs
Large Language Models
  GPT3 (2020, 175B parameters)
  ChatGPT / GPT3.5 (2022)
  GPT4 (2023)
https://huggingface.co/learn/nlp-course/chapter1/4?fw=pt
Lab 9: Transformer in NLP
                   A Brief History of Neural Networks
  [Timeline, 1950 to 2030]
  1958        a single neuron
  1982        a simple neural network
  1980s       RNN
  1997        LSTM
  1998        Convolutional Neural Networks (CNN)
  2012        ImageNet Competition
  2010s       Attention
  2017        Transformer
  2020        GPT3 (Generative Pre-Trained Transformer)
  2022-2023   ChatGPT, GPT4
  Eras marked along the timeline, in order: AI Birth, AI Winter, AI Renaissance
Project Presentation
Reference
  Book
    François Chollet, “Deep Learning with Python”, 2018
    Bernard Marr, “Artificial Intelligence in Practice”, 2019
    Stuart J. Russell and Peter Norvig, “Artificial Intelligence: A Modern Approach”, 1995
  Online Resource
    https://www.youtube.com/watch?v=t4K6lney7Zw
    https://en.wikipedia.org/wiki/Computer_chess
    https://dougenterprises.com/artificial-intelligence/should-you-use-an-expert-system-instead-of-machine-learning/
    https://www.saedsayad.com/decision_tree.htm
    https://huggingface.co/