Python AI ML LLM TrainingJun142024
Python AI ML LLM TrainingJun142024
3
Prerequisites Hardware/Software/Cloud
● Windows or Mac Laptop
● Python Package
● Python online Editor (only for initial days)
● Python IDE Vscode or Jupiter IDE
● Open AI account (Free or Paid)
● ChatGPT, MS Copilot, Meta AI, Google AI (FREE)
● Google Colab (Learning& Production)
● Gitlab or GitHub account (store code)
● Salesforce CRM (dev only for CRM apps)
● JRE as needed
4
Foreword
I am also writing a book and here are contents of same [DRAFT]. It will soon published in
packtpub or amazon kindle - Chapters getting ready for review -There are more to add
1. CHAPTER-I : Python Programming
4. CHAPTER-IV : Transformers
5
GSpecific AI and ML Domains
General Machine Learning Datasets Free
DATA 1. ImageNet
1. Kaggle Datasets ○ ImageNet: A large database of annotated images, widely
○ Kaggle: A vast repository of datasets across various domains, used for training image classification models.
often accompanied by challenges and competitions. 2. COCO Dataset
2. UCI Machine Learning Repository ○ COCO: A dataset for object detection, segmentation, and
○ UCI: A classic source for a wide range of well-documented captioning.
datasets used in academic research. 3. Librispeech
3. Google Dataset Search ○ Librispeech: A corpus of read English speech for training and
○ Google Dataset Search: A search engine for datasets across evaluating speech recognition models.
the web. 4. Open AI Gym
4. AWS Public Datasets ○ Open AI Gym: Provides environments for developing and
○ AWS: A collection of datasets available on Amazon Web comparing reinforcement learning algorithms.
Services. 5. Awesome Public Datasets
○ Awesome Public Datasets: A curated list of publicly available
Natural Language Processing (NLP) and Large Language Models (LLMs) datasets across various domains.
8
● Python & AI ( not just python alone)
● Why Python is needed
● What is happening in AI What course
● What is LLM, AI,ML, Neural Networks covers
9
● This is not about just python
● NO job support - i do not have any jobs
● I do not deep dive into everything
● I do not teach Fundamentals of computers
What course
● Not a paid course DO NOT
● Not ideal for pre-high school students cover
● Not ideal unless you are ready to code
10
Job Market:
Experienced jobs AI engineer or basic jobs like Prompt
● Overall AI knowledge
● Domain knowledge
Healthcare
Finance
Retail etc
● Business Analysis
● Software Dev
● Product Engineering
11
High Level Topics Covered
● Python
● Gen AI Concepts
● Neural Networks
● ML Algorithms
● OpenAI API framework
● LLM Architecture
● Prompt Engineering
● Gen AI LLM Projects
12
Computer Revolutions - AI Next steps
● Main frame 1950-1970
● Personal computer 1970-Present
● Client-server 1980-1990
● Web based 1990-present
● Cloud/Mobile 2000-Present
● AI - 2010 …and 2020 growing fast
13
What is
● AI (Artificial Intelligence)
● ML (Machine Learning)
● DL (Deep Learning)
● LLM (Large Language Model)
● Gen AI (Generative AI)
14
Artificial Intelligence
Artificial intelligence (AI), in its broadest sense, is
intelligence exhibited by machines, particularly
computer systems. It is a field of research in computer
science that develops and studies methods and
software that enable machines to perceive their
environment and uses learning and intelligence to
take actions that maximize their chances of achieving
defined goals.
15
Machine Learning
Machine learning (ML) is a field of study in
artificial intelligence concerned with the
development and study of statistical
algorithms that can learn from data and
generalize to unseen data, and thus perform
tasks without explicit instructions.
16
Deep Learning
Deep learning (DL) is the subset of machine
learning methods based on neural networks
with representation learning. The adjective
"deep" refers to the use of multiple layers in the
network. Methods used can be either
supervised, semi-supervised or unsupervised
17
Large Language Model
A large language model (LLM) is a language model
notable for its ability to achieve general-purpose language
understanding and generation. LLMs acquire these
abilities by learning statistical relationships from text
documents during a computationally intensive
self-supervised and semi-supervised training process.[1]
LLMs can be used for text generation, a form of generative
AI, by taking an input text and repeatedly predicting the next
token or word.
18
Generative AI
Generative artificial intelligence (Gen AI) is
artificial intelligence capable of generating text,
images, videos, or other data using generative
models, often in response to prompts. Generative AI
models learn the patterns and structure of their input
training data and then generate new
data/response that has similar characteristics.
19
DL AI
LLM/G
en AI
ML
LLM
Generative AI uses LLM based upon DL algorithms which is subset of ML which is further a subset of overall AI
20
Category Examples
AI Expert Systems, Robotics, Natural Language Processing
(NLP), Computer Vision, Autonomous Vehicles
23
AI Challenges
24
AI Challenges
25
AI Challenges
26
Sample LLM BOT Code
This is an example how - A simple 100 lines code can do so much
magic - At this stage do not worry about what the code is-just
think about we can develop such applications
27
This demo explains
28
Is
Architecture patient
at Heart
risk ?
Python
Not
really
but …
Application requests goes to RAG DB and sends details to LLM and LLM responds to chat
BOT - Python is involved 29
import streamlit as st
from dotenv import load_dotenv
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings,
HuggingFaceInstructEmbeddings
from langchain.vectorstores import FAISS Import Libraries
30
def get_vectorstore(text_chunks):
def get_pdf_text(pdf_docs):
# import pdb;pdb.set_trace()
text = ""
embeddings = OpenAIEmbeddings()
for pdf in pdf_docs:
# embeddings =
pdf_reader = PdfReader(pdf) HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl")
for page in pdf_reader.pages: vectorstore = FAISS.from_texts(texts=text_chunks,
text += page.extract_text() embedding=embeddings)
return text return vectorstore
Generative AI Use Case in Medical Field: Predicting Heart Disease Risk and
Providing Preventative Measures
33
Generative Use Case (Use Case)
Use Case:
1. Patient Data Collection: Gather past medical history and lifestyle data of
the patient.
2. Risk Assessment: Use an instruction-trained LLM to analyze the collected
data and predict the risk of heart disease.
3. Personalized Recommendations: Provide customized preventive measures
to mitigate the identified risks.
34
Generative Use Case (Solution)
Components:
35
Other use cases
1. Finance 2. Healthcare
Use Case: Fraud Detection and Prevention Use Case: Personalized Treatment
Recommendations
Description: Develop an AI system that analyzes
transaction patterns and flags suspicious activities Description: Use an LLM to analyze patient records
in real-time to prevent fraudulent transactions and and genetic data to suggest personalized
reduce financial losses. treatment plans and medication regimens tailored
to individual patient needs.
3. Retail
4. Manufacturing
Use Case: Customer Sentiment Analysis
Use Case: Predictive Maintenance
Description: Implement a system to analyze
customer reviews and feedback across multiple Description: Use AI to monitor machinery data and
platforms, providing insights into customer predict potential failures before they occur,
sentiment and helping businesses improve allowing for timely maintenance and reducing
products and services. downtime.
36
Other use cases
5. Education 6. Security
Use Case: Intelligent Tutoring Systems Use Case: Threat Intelligence Analysis
Description: Create an AI tutor that provides Description: Develop a system that gathers and
personalized learning experiences, adapts to analyzes cybersecurity threat data to provide
student progress, and offers targeted help in areas real-time alerts and insights, helping organizations
where students struggle. mitigate potential security breaches.
7. Transportation 8. Hospitality
Use Case: Route Optimization for Logistics Use Case: Personalized Travel Recommendations
37
Other use cases
9. Legal 10. Energy
Use Case: Contract Analysis and Management Use Case: Energy Consumption Optimization
Description: Develop a system that automates the Description: Use AI to analyze energy usage
analysis of legal contracts, identifying key terms patterns and suggest optimizations for reducing
and potential risks, and helping in contract energy consumption in homes and businesses,
management and compliance. leading to cost savings and reduced environmental
impact.
Use Case: Crop Disease Detection Use Case: Property Valuation and Market Analysis
Description: Implement an AI system that uses Description: Develop an AI tool that analyzes real estate
image recognition to analyze photos of crops and market trends, historical sales data, and property
detect early signs of diseases, enabling farmers to features to provide accurate property valuations and
market insights for buyers, sellers, and investors.
take timely actions to protect their yields.
38
Use cases in CRM
1. Automated Customer Support 2. Interactive Voice Response (IVR) System
Description: Develop a chatbot powered by Description: Implement an AI-driven IVR system to
generative AI to handle customer inquiries related
guide customers through their queries and
to bill payments and fund transfers, providing quick
and accurate responses 24/7. transactions over the phone, offering personalized
assistance based on the customer's account
history.
Description: Use AI to analyze the sentiment of Description: Create an AI system that analyzes
customer interactions across emails, chat support, transaction history and usage patterns to
and social media, helping identify dissatisfied recommend personalized financial products or
customers and improving service quality. services, such as bill payment plans or fund
transfer options.
.
39
Use cases in CRM
5. Fraud Detection in Real-time 6. Customer Feedback Analysis
Description: Deploy an AI system to monitor fund Description: Use AI to analyze feedback from surveys and
transfer transactions in real-time, detecting and follow-up interactions, extracting insights to improve
flagging suspicious activities to prevent fraud. services related to bill payments and fund transfers.
40
Python Loading!!
41
● Invented in 1980
● Invented by Guido van Rossum
● Simple & Easy to understand/Learn
● Object Oriented
● Gaining more popularity
● Corporate supported (Google..) Python
Background
● Increased speed & security
● Widespread community & libraries
● AI/ML/LLM libraries widely available
● Python must for AI jobs
● Python is for: Application, general, web, scripting, artificial
intelligence, scientific computing
42
● Python official guide
● Python University Course
● O’reilly Python Ultimate Guide
● Python downloads&installs Python
Resources
● Python online editor
● Python Projects to Try
● Python ML Projects to Try
43
44
SIMPLE
Syntax
45
A Program lang needs
● Basic Data types ( byte,string, in,float) - handle data
● Arithmetics - To do math on data
● Collections (data structures) - To hold data
● Objects - To combine data with operations
● Control statements - To communicate between objects
● File handling - To communicate external files
● Error handling, Reuse libraries, Debugging, Deploy etc
46
Python Syllabus
● Standard types and Variables (float,string,text)
● Data structures(array,list,tuple,dictionary,set)
● Control flows(if,for,while,pass,break,continue,match)
● Exceptions(try,except,raise)
● Functions,Modules & Packages
● File I/O
● Class (OOPs,inheritance,object,attributes,method,init)
● Logging,Template,multiThread
● Environment installs (Virtual),IDE
● Libraries-Next slide
47
● Inbuilt (math,string etc)
● Pandas (data manipulation)
● Numpy (ml) Python
● Matplotlib (plots) Popular
● Scipi (statistics) Libraries
● Scikit-learn/TensorFlow/Pytorch/Keras (ML)
● Flask/Django/Streamlit (web UI)
● Openai (LLM)
● Langchain (Framework)
● Huggingface/Transformers/datasets (Training datasets)
● Pinecone/chromedb/elasticsearch (VecorDB)
48
Python Basic Data Types
49
Python Basic data types and Variables
Example Data Type
x = "Hello World" str
x = 20 int
x = 20.5 float
x = 1j complex
% Modulus x%y
** Exponentiation x ** y
// Floor division x // y
52
Python Assignment Operators
Assignment operators are used to assign values to
Python Assignment Operators variables:
54
Python Logical Operators
Python Logical
Operators
Logical operators are
used to combine
conditional
statements:
Operator Description Example
Returns True if both statements are x < 5 and x <
and true 10
Returns True if one of the statements
or is true x < 5 or x < 4
Reverse the result, returns False if not(x < 5 and
not the result is true x < 10)
55
Data Type Conversion:
Function Description
int(x [,base]) Converts x to an integer. base specifies the base if x is a string.
long(x [,base] ) Converts x to a long integer. base specifies the base if x is a string.
float(x) Converts x to a floating-point number.
complex(real [,imag]) Creates a complex number.
57
Keywords contain lowercase letters only.
and
assert
Reserved Words:
exec
finally
not
or
break for pass
class from print
continue global raise
def if return
del import try
elif in while
else is with
except lambda yield
58
Variable
• Variables are nothing but reserved memory locations to store
values. This means that when you create a variable you
reserve some space in memory.
• Based on the data type of a variable, the interpreter allocates
memory and decides what can be stored in the reserved
memory. Therefore, by assigning different data types to
variables, you can store integers, decimals, or characters in
these variables.
59
Assigning Values to
Variables:
• Python variables do not have to be explicitly declared to reserve memory
space. The declaration happens automatically when you assign a value to a
variable. The equal sign (=) is used to assign values to variables.
counter = 100 # An integer assignment
miles = 1000.0 # A floating point
name = "Frank" # A string
print counter
print miles
print name
60
Multiple Assignment:
61
Numbers:
• Number data types store numeric values. They are immutable data types,
which means that changing the value of a number data type results in a
newly allocated object.
• Number objects are created when you assign a value to them. For example:
var1 = 1
var2 = 10
Python supports four different numerical types:
• int (signed integers)
• long (long integers [can also be represented in octal and hexadecimal])
• float (floating point real values)
• complex (complex numbers)
62
int long float complex
10
100
Number Examples:
51924361L
-0x19323L
0
15.2
3.14j
45.j
-786 0122L -21.9 9.322e-36j
80 0xDEFABCECBDAECBFBAEl 32.3+e18 .876j
-490 535633629843L -90 -.6545+0J
-0x260 -052318172735L -3.25E+101 3e+26J
0x69 -4721885298529L 70.2-E12 4.53e-7j
63
Strings:
• Strings in Python are identified as set of characters in
between quotation marks.
• Python allows for either pairs of single or double quotes.
Subsets of strings can be taken using the slice operator ( [ ]
and [ : ] ) with indexes starting at 0 in the beginning of the
string and working their way from -1 at the end.
• The plus ( + ) sign is the string concatenation operator, and
the asterisk ( * ) is the repetition operator.
64
str = 'Hello World!'
print str # Prints complete string
print str[0] # Prints first character of the string
print str[2:5] # Prints characters starting from 3rd to 6th
print str[2:] # Prints string starting from 3rd character
print str * 2 # Prints string two times
print str + "TEST" # Prints concatenated string
Output:
Hello World!
H
llo
llo World!
Hello World!Hello World!
Hello World!TEST
65
Lists:
• Lists are the most versatile of Python's compound data types. A list contains
items separated by commas and enclosed within square brackets ([]).
• To some extent, lists are similar to arrays in C. One difference between them
is that all the items belonging to a list can be of different data type.
• The values stored in a list can be accessed using the slice operator ( [ ] and [
: ] ) with indexes starting at 0 in the beginning of the list and working their
way to end-1.
• The plus ( + ) sign is the list concatenation operator, and the asterisk ( * ) is
the repetition operator.
66
list = [ 'abcd', 786 , 2.23, 'john', 70.2 ]
tinylist = [123, 'john']
Output:
['abcd', 786, 2.23, 'john', 70.2]
abcd
[786, 2.23]
[2.23, 'john', 70.2]
[123, 'john', 123, 'john']
['abcd', 786, 2.23, 'john', 70.2, 123, 'john']
67
Tuples:
• A tuple is another sequence data type that is similar to the
list. A tuple consists of a number of values separated by
commas. Unlike lists, however, tuples are enclosed within
parentheses.
• The main differences between lists and tuples are: Lists are
enclosed in brackets ( [ ] ), and their elements and size can
be changed, while tuples are enclosed in parentheses ( ( ) )
and cannot be updated. Tuples can be thought of as
read-only lists.
68
tuple = ( 'abcd', 786 , 2.23, 'john', 70.2 )
tinytuple = (123, 'john')
OUTPUT:
('abcd', 786, 2.23, 'john', 70.2)
abcd
(786, 2.23)
(2.23, 'john', 70.2)
(123, 'john', 123, 'john')
('abcd', 786, 2.23, 'john', 70.2, 123, 'john')
69
Dictionary:
• Python 's dictionaries are hash table type. They work like
associative arrays or hashes found in Perl and consist of
key-value pairs.
• Keys can be almost any Python type, but are usually numbers
or strings. Values, on the other hand, can be any arbitrary
Python object.
• Dictionaries are enclosed by curly braces ( { } ) and values can
be assigned and accessed using square braces ( [] ).
70
dict = {}
dict['one'] = "This is one"
dict[2] = "This is two“
tinydict = {'name': 'john','code':6734, 'dept': 'sales'}
print dict['one'] # Prints value for 'one' key
print dict[2] # Prints value for 2 key
print tinydict # Prints complete dictionary
print tinydict.keys() # Prints all the keys
print tinydict.values() # Prints all the values
OUTPUT:
This is one
This is two
{'dept': 'sales', 'code': 6734, 'name': 'john'}
['dept', 'code', 'name']
['sales', 6734, 'john']
71
OOPs in Python and code in Github
72
Python Libraries
73
Demo -sample code links
F” “ Positional, keyword
orgs
# inline comments
75
Next week Homework
Create a Python program to process a CSV file containing student data, including first
name, last name, and marks for three subjects: Maths, English, and Science. The
program should calculate statistical metrics (mean, median, maximum, and minimum)
for each subject and compute the GPA and overall class rank for each student. The
GPA should be on a scale of 4, with 4 being the highest, corresponding to perfect
scores in all subjects. The output should include the original student data with
additional columns for class rank and GPA.
76
Advanced Python Next week Homework
"I aim to develop a predictive model leveraging historical vehicle data. The model will utilize attributes such as vehicle
make, model, manufacturing year, and mileage to forecast the maximum potential lifespan of a vehicle.
Initially, I'll gather historical data for a specific vehicle make, recording various models along with their respective
manufacturing years and mileage. This data will serve as the foundation for training the predictive model.
Once the historical data is collected, I'll train a machine learning model using regression techniques, tailoring it to
predict the maximum lifespan of vehicles based on their manufacturing year and mileage. The model will learn from
the historical data to make accurate predictions.
Upon completion of model training, I'll deploy the model to accept new vehicle data as input. Users will provide details
such as vehicle make, model, manufacturing year, and mileage, for which the model will generate a forecast of the
vehicle's maximum potential lifespan.
By implementing this predictive model, I aim to assist users in estimating the longevity of their vehicles, enabling
informed decision-making regarding maintenance and replacement strategies."
77
Neural Network Project Using any ML pkg
Go to any public available data sets of reviews for movies or
restaurants and based on review classify the data is
Positive/Negative or Neutral
Positive Review
"This film is a masterpiece of storytelling, featuring exceptional performances by the cast. The cinematography is
stunning, and the direction is impeccable. It's a deeply emotional experience that will leave you thinking long after the
credits roll. Highly recommend to anyone looking for a powerful and moving cinematic journey."
Negative Review
"This movie was a huge disappointment. The plot was incoherent, and the characters were poorly developed. The
special effects looked cheap, and the dialogue was cringeworthy. It felt like a waste of time and money. I wouldn't
recommend this to anyone."
Neutral Review
"The movie had its moments of brilliance, with some well-executed scenes and decent acting. However, it struggled
with pacing issues, and some parts felt dragged out. It's an average film that might appeal to fans of the genre but
doesn't stand out as particularly memorable or groundbreaking."
78
Machine Learning
79
Types
1. Supervised Learning
2. Unsupervised Learning
3. Self-Supervised Learning
4. Semi-Supervised Learning
5. Reinforcement Learning
80
Courtesy: MS -
https://learn.microsoft.com/en-us/training/modules/fundamentals-machine-learning/3-types-of-machine-learning 81
Aspect Regression Classification Clustering Anomaly Detection Reinforcement Learning
Definition Predicts a continuous Predicts a discrete class Groups similar data points Identifies outliers or rare Learns optimal actions
output (numerical value). label. into clusters. events in data. through trial and error.
Output Type Continuous (e.g., price, Categorical (e.g., Cluster labels (e.g., Binary or anomaly score Policy or value function
temperature). spam/ham, disease/no cluster 1, cluster 2). (e.g., normal/anomaly). (e.g., action to take).
disease).
Goal Estimate relationships Categorize data into Discover natural Detect deviations from Maximize cumulative
between variables. predefined classes. groupings in data. normal patterns. reward.
Example Use Cases Predicting house prices, Email spam detection, Customer segmentation, Fraud detection, network Game playing, robotics.
forecasting sales. image recognition. document clustering. security.
Algorithms Linear regression, Logistic regression, SVM, K-means, hierarchical Isolation Forest, Q-Learning, Deep Q
polynomial regression, decision trees, KNN. clustering, DBSCAN. One-Class SVM, Networks (DQN), Policy
SVR. Autoencoders. Gradient Methods.
Evaluation Metrics Mean Squared Error Accuracy, precision, Silhouette score, Precision, recall, Cumulative reward, policy
(MSE), R-squared. recall, F1-score. Davies-Bouldin index. F1-score, ROC-AUC. improvement.
Training Data Requires labeled data Requires labeled data Does not require labeled Labeled
with continuous target with categorical target data, only features. (normal/anomalous) or
values. values. unlabeled data.
82
“ Regression”
83
“ Classification”
84
“ Clustering”
85
“ Neural Networks”
86
87
Use Case Algorithm
Weather Prediction Regression
Spam Email Detection Classification
Customer Segmentation Clustering
Fraud Detection Anomaly Detection
92
What is Neural Networks
A neural network is a method in artificial intelligence that teaches
computers to process data in a way that is inspired by the human
brain. It is a type of machine learning process, called deep learning,
that uses interconnected nodes or neurons in a layered structure that
resembles the human brain. It creates an adaptive system that
computers use to learn from their mistakes and improve
continuously. Thus, artificial neural networks attempt to solve
complicated problems, like summarizing documents or recognizing
faces, with greater accuracy.
93
Types
● Perceptron
● Multi Layer
● Feedforward
● Recurrent
● Modular
● Radial state machine
● Liquid state machine
● Residual
● Convolutional NN
94
Transformer Architecture
95
Transformer Neural Networks
Transformer neural networks are worth highlighting because they have assumed a place of outsized
importance in the AI models in widespread use today.
First proposed in 2017, transformer models are neural networks that use a technique called
"self-attention" to take into account the context of elements in a sequence, not just the elements
themselves. Via self-attention, they can detect even subtle ways that parts of a data set relate to each
other.
This ability makes them ideal for analyzing (for example) sentences and paragraphs of text, as opposed
to just individual words and phrases. Before transformer models were developed, AI models that
processed text would often "forget" the beginning of a sentence by the time they got to the end of it, with
the result that they would combine phrases and ideas in ways that did not make sense to human readers.
Transformer models, however, can process and generate human language in a much more natural way.
Transformer models are an integral component of generative AI, in particular LLMs that can produce text
in response to arbitrary human prompts.
96
ML Algorithms
97
ML Algorithms
Simple Analogies
1. Supervised Learning: A student learning math problems with the help of a teacher who provides the correct
answers.
2. Unsupervised Learning: A child sorting a mixed box of Lego pieces by shape and color without any instructions.
3. Self-Supervised Learning: Reading a book where some words are blanked out and using the surrounding text to
guess the missing words.
4. Semi-Supervised Learning: Learning to cook with a few recipes (labeled data) and experimenting with new
ingredients on your own (unlabeled data).
5. Reinforcement Learning: A video game player trying different strategies to win a game, learning from both wins
(rewards) and losses (penalties).
98
ML Algorithms
1. Supervised Learning
Title: Learning with a Teacher
Description: In supervised learning, the algorithm learns from labeled data. It's like having a teacher who provides both
questions and answers. For example, if you're learning to recognize animals, you get a bunch of pictures of animals (the
input) with labels like "cat" or "dog" (the output). The algorithm uses these examples to learn how to classify new images.
2. Unsupervised Learning
Title: Learning without a Teacher
Description: Unsupervised learning deals with unlabeled data. There are no answers provided, so the algorithm tries to
find patterns and structure on its own. It's like exploring a new place without a map and figuring out the layout by yourself.
An example is clustering, where the algorithm groups similar items together, like organizing a pile of mixed candies by
color or type.
1.
99
ML Algorithms
3. Self-Supervised Learning
Title: Learning from Itself
Description: In self-supervised learning, the algorithm generates its own labels from the data. It's like solving a puzzle where
some pieces are missing, and you have to figure out the missing parts from the surrounding pieces. An example is a language
model predicting the next word in a sentence, where the sentence itself provides context.
4. Semi-Supervised Learning
Title: A Mix of Learning with and without a Teacher
Description: Semi-supervised learning uses a combination of labeled and unlabeled data. It's like having a few questions with
answers and a lot of questions without answers. The algorithm learns from the labeled data and uses this knowledge to make
sense of the unlabeled data. This approach is useful when labeling data is expensive or time-consuming.
5. Reinforcement Learning
Title: Learning by Trial and Error
Description: Reinforcement learning involves learning by taking actions and receiving rewards or penalties. It's like training a100
dog where good behavior is rewarded with treats and bad behavior is discouraged. The algorithm learns the best actions to
ML Algorithms
1. Linear Regression
2. Logistic Regression
3. Decision Trees
4. Random Forest
5. K-Nearest Neighbors (KNN)
6. Support Vector Machines (SVM)
7. Naive Bayes
8. K-Means Clustering
9. Principal Component Analysis (PCA)
10. Neural Networks
11. Reinforcement Learning
101
ML Algorithms
1. Linear Regression
Title: Predicting a Continuous Value
Description: Linear Regression is like drawing a straight line through a set of points on a graph to find a trend. For example, if
you have data on house sizes and their prices, you can use linear regression to predict the price of a house based on its size.
2. Logistic Regression
Title: Predicting Categories
Description: Logistic Regression helps to categorize things into groups. It's like sorting emails into "spam" or "not spam" by
looking at the content. Even though it's called regression, it's used for classification.
3. Decision Trees
Title: Making Decisions with If-Else
Description: Decision Trees work like a flowchart of decisions. Each decision is a branch, and the final outcomes are leaves. For
instance, you might decide whether to play outside based on weather: if it's sunny, go play; if it's rainy, stay inside.
102
.
ML Algorithms
4. Random Forest
Title: A Forest of Decision Trees
Description: Random Forests are a bunch of decision trees working together. Each tree gives a vote, and the majority vote
decides the final outcome. It's like asking multiple experts for their opinion and choosing the most common answer.
8. K-Means Clustering
Title: Grouping Similar Items
Description: K-Means groups similar items into clusters. It's like organizing a pile of mixed candies into groups based on color.
The algorithm keeps refining the groups until the items in each group are very similar.
105
ML algorithms Code Sample Gitlab links
106
Transformers
107
What it is ? - READ
Transformers are a type of neural network architecture in
artificial intelligence (AI) that can change an input
sequence into an output sequence. They do this by learning
the context and relationships between the components of
the sequence. Transformer models are a key part of
generative AI and are used in many applications, including:
Natural language processing (NLP), Translating text and
speech, Understanding DNA and proteins, Drug discovery,
and Medical research.
108
The Sky is ?
How a transformer model processes an input
sentence, generates a context-aware response,
and learns to improve its predictions through
training. Understanding each key term and its
role helps demystify how advanced AI models
like transformers operate to generate accurate
and relevant responses
109
The Sky is ?
How a transformer model processes an input
sentence, generates a context-aware response,
and learns to improve its predictions through
training. Understanding each key term and its
role helps demystify how advanced AI models
like transformers operate to generate accurate
and relevant responses
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
Prompt Engineering
137
What is Prompt Engineering?
138
LLM System context (Act like a 5 year old boy)
The sky is …
139
LLM System context (Act like an astronaut)
The sky is …
140
LLM System context (Act like a metrologist)
The sky is …
141
Prompt Engineering Types Google Link
● Zero Shot
● One Shot
● Few Shot
● Chain of Thought
● Tree of Thoughts
142
Definitions of Types of Prompting styles
Zero Shot: Making predictions without prior task-specific training. Useful in sentiment analysis and classification tasks.
One Shot: Learning from a single example to generalize. Useful in tasks like text classification.
Few Shot: Learning from a few examples to understand and perform tasks. Useful in text completion and generation.
Chain of Thought: Step-by-step reasoning to solve complex tasks. Useful in arithmetic and logical problem solving.
Tree of Thought: Exploring multiple branches of reasoning for decision-making. Useful in strategic planning and game
theory.
143
144
145
146
147
References:
https://www.promptingguide.ai/
https://github.com/f/awesome-chatgpt-prompts
148
Where to start handson
149
Installation (Python/vscode)
● Vscode
○ https://code.visualstudio.com/docs/setup/setup-over
view
● Install Java/JRE
● Install Python extensions
● Virtual env :
○ python3 -m venv myenv1; source myenv1/bin/activate
150
LLM resources (OpenAI)
● OpenAI:
○ https://platform.openai.com/docs/quickstart
● API keys:
○ https://platform.openai.com/organization/api-keys
● Set env
○ Create .env and set OPENAI_API_KEY=’your key’
○ printenv,
○ export OPENAI_API_KEY=’your key’,
○ echo $OPENAI_API_KEY
151
LLM resources (Prompt Engineering)
● Sentimental prompts:
○ https://www.promptingguide.ai/prompts/classificati
on/sentiment
152
LLM resources (langchain)
Examples:
https://python.langchain.com/v0.2/docs/tutorials/llm_chain/
153
LLM resources (higgingface)
https://huggingface.co/docs/transformers/v4.15.0/en/examples
154
LLM
155
Large Lang Models are Trained
in Public huge datasets like
10TB approx with large GPU
machines like 6000 GPUs and
cost 2M takes to 12 days to
train
156
Popular Open Source Datasets for Training LLMs
1. Common Crawl
2. RefinedWeb
3. The Pile
4. C4
5. Starcoder Data
6. BookCorpus
7. ROOTS
8. Wikipedia
9. Red Pajama
157
Popular LLMs
● GPT-4: Best for creating marketing content.
● Falcon: Best for a human-like, conversational chatbot.
● Llama 2: Best for a free, resource-light, customizable
LLM.
● Cohere: Best enterprise LLM for building a
company-wide search engine.
● Gemini: Best for an AI assistant in Google Workspace.
● Claude 3: Best for a large context window.
158
● LLM (open sourced/comercial/closed/community)
● Public data sets
● Instruct based Datasets
● Private datasets(RAG)
● Benchmarks
● Evaluation Metrics
159
LLM Architecture
160
● Parameters
● Tokens
● Neural networks
● Multiple layers
● Transformers
● Encoder/decoder/encoder-decoder
● Activation functions
● Attention mechanism
● Gradient functions
● CNN/RNN
161
● ChatGPT
● MetaAI LLMs
● MS Copilot
● Google AI 162
● AI ML DL GemAI/LLM Basics
● Different types of LLM
● LLM Architecture
● LLM Frameworks
● LLM Stages
● LLM Industry Use cases
● LLM TUNING
● LLM product dev process
● LLM Devops
● Python Basics
● E2E project
● Internet Resources
163
LLM can do below but not limited to
164
SFDC/App RAG
Vector
DB
LLM
Architecture
LLM
LLM AGENTS
1
2
Instruction
Pre Trained
Trained
165
RAG
Architecture
166
LLM AI References
● https://www.elastic.co/what-is/large-language-models
● https://aws.amazon.com/what-is/large-language-model/
● https://en.wikipedia.org/wiki/Large_language_model
● https://medium.com/@vipra_singh/building-llm-applications-intr
oduction-part-1-1c90294b155b
● https://www.elastic.co/what-is/semantic-search
● https://arxiv.org/list/cs.AI/recent
● https://github.com/mlabonne/llm-course?tab=readme-ov-file
167
● empty
168
Advanced LLM
169
GAN: Generates realistic data (like images) using a
generator and discriminator.
ViT: Processes images by understanding relationships
between different parts simultaneously.
LoRA: Efficiently fine-tunes large models by adjusting a
smaller subset of parameters.
QLoRA: Further improves LoRA by reducing parameter
precision, making fine-tuning even more efficient.
170
Generative Adversarial Network (GAN)
Explanation:
A Generative Adversarial Network (GAN) is a type of artificial intelligence that can create new, synthetic data that resembles real
data. It consists of two parts: a generator and a discriminator.
● Generator: Think of this as a forger. It tries to create fake data (like fake images or fake text) that looks as real as
possible.
● Discriminator: Think of this as a detective. It tries to determine if the data it receives is real (from the training set) or fake
(from the generator).
The two parts play a game with each other: the generator gets better at creating realistic data, and the discriminator gets better at
detecting fakes. Over time, the generator creates data that is so realistic that the discriminator has a hard time telling if it's real or
fake.
Use Cases:
171
Variational Autoencoder (VAE)
Explanation:
A Variational Autoencoder (VAE) is a type of artificial intelligence used to generate new data and reduce data dimensionality. It
consists of two main parts: an encoder and a decoder.
● Encoder: This part compresses input data (like images) into a smaller, latent representation. It's like summarizing a big
book into a short synopsis.
● Decoder: This part takes the latent representation and reconstructs the original data from it. It's like expanding the short
synopsis back into a detailed book.
What makes a VAE special is that it adds some randomness (variability) into the encoding process. This randomness helps the
model to explore different variations and generate new, diverse outputs.
Use Cases:
1. Image Generation: Creating new images that are similar to those in the training set.
2. Anomaly Detection: Identifying outliers or anomalies in data by reconstructing input data and finding those that don't
match well.
3. Dimensionality Reduction: Reducing the number of features in data while preserving important information, useful for
visualization or further processing.
4. Data Denoising: Cleaning noisy data by reconstructing the original, noise-free version from the latent representation.
172
LORA
Explanation:
LoRA is a technique used to fine-tune large AI models efficiently by only adjusting a smaller, low-rank subset of their
parameters. This makes the fine-tuning process faster and requires less computational power.
Use Cases:
1. Quickly adapting large language models for specific tasks (e.g., chatbot personalization).
2. Efficiently updating models with new data without retraining the entire model.
173
qLORA
Quantized Low-Rank Adaptation (QLoRA)
Explanation:
QLoRA builds on LoRA by adding quantization, which involves reducing the precision of the model parameters to save
memory and computation while maintaining performance. This makes it even more efficient for fine-tuning large
models.
Use Cases:
174
OpenAI
175
OpenAI MODELS
MODELS LIST :
https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
MODEL PARAMETERS
● model=GPT_MODEL,
● temperature=0,
● max_tokens=200,
● top_p=0.5,
● n=1,
● stop="Translate:",
● frequency_penalty=0.6,
● presence_penalty=0.8,
REF: https://medium.com/@csakash03/model-parameters-in-openai-api-fe6101d3f813
MODEL PROMPTS:
https://platform.openai.com/docs/examples
176
Customer Conversation
Customer reached out support and chatted with agent and at the end of chat
or phone call LLM will summarize the results to have follow up actions
LLM reads the case record/email/chat/task- and
a) LLM determines sentiment of customer
b) LLM determines sentiment of customer and provides why it is so
c) LLM summarizes an apology/thank you note to customer
d) LLM escalates to manager to seem extra help to improve customer
satisfaction
1-177
Architecture Customer chats
Llm request
cases/
leads/
emails/
accounts/
quotes
LLM Prompts
1-178
For Example a Conversation like : (negative
sentiment)
1-179
Now llm looked at record and provided the below
1-180
For Example a Conversation like : (postive
sentiment)
1-181
LLM Analysis
1-182
Additional open AI LLM use cases in CRM world
1-183
Additional use cases -set1
1-184
Additional use cases -set2
1-185
Additional use cases -set3
1-187
IT operations
1-188
Sample open AI Code in repo
Category Github link Description/Purpose
Text Code This code will take customer chat/question and it automatically classify
Classification the intent and return classification criteria - very useful for agent
transfers-we can plugin to any CRM or IVR
Text
Summarization
Chat Sentiments
Voice Call
Summary
Meeting Min
summary
1-189
Sample open AI Code in repo
Category Github link Description/Purpose
Post Chat
Summary &
Follow up
1-190
EXTRA SLIDES
191
Training Community Rules
● Respect others
● This meeting is recorded.
● Mute Yourselves always
● Chat your question first
● Do not chat with other candidates
● Do not steal other candidate info(ph number/email etc)
192