KEMBAR78
MySQL - Learn Data Analytics Together's Group | PDF | Computer Programming | Data
0% found this document useful (0 votes)
17 views96 pages

MySQL - Learn Data Analytics Together's Group

Uploaded by

Phạm Tấn
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views96 pages

MySQL - Learn Data Analytics Together's Group

Uploaded by

Phạm Tấn
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 96

MySQL - Learn Data Analytics Together's Group

MySQL - Learn Data Analytics Together's Group


Disclaimer

I compiled the information you'll find here, drawing on my experience from a


fantastic MySQL challenge provided by Eric in Learn Data Analytics Together's
Group and the insightful teachings of Alex Freberg. Special thanks to Eric & Tho
Cao for their proofreading assistance.

Compiler: XuHi
Proofreaders: Eric and Tho Cao
All credit goes to Alex Freberg - our teacher.

Self-Study Data
Learn Data Analytics Together

1 / 96
MySQL - Learn Data Analytics Together's Group

MySQL UI walkthrough
Create a query
Click File icon button to create new query 1-1
Unsaved query has a little star after its unsaved name 1-2

Create a database
Database in MySQL is called Schema
Use Database button to create a database 2
Switch to Schema Tab to see new created database 3
Remember to choose a database when execute a query (especially with the
created queries to ensure that those queries are executed in the corresponding
database)

SQL

CREATE SCHEMA <DatabaseName>;

Create a table
Choose a schema you would like to create a table by double clicking on the
Schema name until it is bold 4
Use Table button to create a table 5 or we can create a Table by using This
query

SQL

CREATE TABLE <DatabaseName>.<TableName>


(
<Column1> DataType
<Column2> DataType
);

Right-click on any Schema → Choose Refresh All to enable the new created
table displayed in the Schema

2 / 96
MySQL - Learn Data Analytics Together's Group

Query in MySQL
What is querying in MySQL?
1. This is how data is retrieved from the tables in a database
2. Querying gives you an output of the results you're searching for
3. The basic syntax follows the order SELECT , FROM , WHERE , GROUP BY , and ORDER
BY

Element Expression Role


5 SELECT [select list] Defines which columns to return
1 FROM [table source] Defines table(s) to query
2 WHERE [search condition] Filter rows using a predicate
3 GROUP BY [group by list] Arranges rows by group
4 HAVING [search condition] Filters group using a predicate
6 ORDER BY [order by list] Sorts the output
7 LIMIT

Actual execution order


SQL queries adhere to a specific order when evaluating clauses. From the eyes of
the user, query begins from the first clause and end at the last clause. However,
queries aren’t read from top to bottom when carried out.

The order in which the clauses in queries are executed is as follows.

What the How it's Why it works this way


query looks executed
like
5 SELECT ▶ FROM ▷ SQL starts with which table your
query is taking data from
1 FROM ▶ WHERE ▷ This is how SQL filters on rows
2 WHERE ▶ GROUP ▷ This is where your SQL query checks
BY if you have an aggregation
3 GROUP BY ▶ HAVING ▷ HAVING requires a GROUP BY
statement
4 HAVING ▶ SELECT ▷ Only after all these calculations have
been made with SQL SELECT returned
6 ORDER BY ▶ ORDER BY ▷ This sorts the data returned
7 LIMIT ▶ LIMIT ▷ Lastly, you can limit the number of
rows returned

3 / 96
MySQL - Learn Data Analytics Together's Group

SELECT
SELECT determines which columns to include in the query's result set. It is the
foremost commonly utilized SQL command. The SELECT statement can be utilized to
choose columns from a table, such as a particular set of columns or all of the
columns from a table.

Field names are separated by commas


The order in which the fields are listed controls the order in which they display
in query output.
The FROM clause follows the last field name and ends with a semi-colon ;
Code lines can be split to improve readability

SQL

SELECT
<Column1>,
<Column2>,
<Column3>,
...
<ColumnN>
FROM <TableName>;

The first thing we need to do is make sure that our database is connected. Or else
you'll encounter error Table does not exist . (Refer to this section to resolve this
error.)

Always define the [Database] before you query any data in that database to
ensure SQL knows which exact location it should navigate to get data.

4 / 96
MySQL - Learn Data Analytics Together's Group

Select all columns in a table


SQL

SELECT *
FROM <TableName>;

Select specific columns in a table


SQL

SELECT
<Column1>,
<Column2>,
...,
<Columnn>
FROM <TableName>;

Select only unique value in a column from a table


SQL

SELECT DISTINCT
<Column1>,
<Column2>,
<Column3>
FROM <TableName>;

Best Practices to follow while using SELECT statement


1. Limit the number of columns: Only select the columns that are needed for the
query. This reduces the amount of data that needs to be retrieved from the
database, resulting in faster query execution times.
2. Use table aliases: Table aliases make the query easier to read and write. They
also reduce the amount of typing required when referencing columns in the
query.
3. Use indexes: Create indexes on columns that are frequently used in WHERE and
JOIN clauses. This improves query performance by reducing the amount of data
that needs to be scanned to find the required data.

5 / 96
MySQL - Learn Data Analytics Together's Group

Using Aliases
Aliases can be used for:

Column's name to retrieve a table with desired columns' name


Table's name to make queries more readable.

Example:
1/ This query renames FirstName by GivenName and LastName by FamilyName

SQL

SELECT
FirstName AS GivenName,
LastName AS FamilyName
FROM customer

2/ This query renames table when joining to make query more readable.

SQL

SELECT e.Name, e.Salary, d.DepartmentName


FROM employees AS e
INNER JOIN departments AS d
ON e.DepartmentID = d.DepartmentID;

3 ways to resolve error Table does not exist


Use USE command
Use USE command to locate a specific database where contains the table.

SQL

USE <Database_Name>;

Example: This query ensures MySQL defines the SELECT query is performed inside
bakery database.

SQL

USE bakery; --This query ensures below queries are performed in


bakery database

SELECT *
FROM customers;

6 / 96
MySQL - Learn Data Analytics Together's Group

Double click on the database


Double click on a specific database to locate a specific database where contains the
table.

Type Database_Name before Table_Name

SQL

SELECT *
FROM <DatabaseName>.<TableName>;

7 / 96
MySQL - Learn Data Analytics Together's Group

WHERE
WHERE clause is used to filter records or rows of the data. It returns only records that
fulfill a specified condition.

WHERE only affects rows at the row level, not the columns.

SQL

SELECT *
FROM <TableName>
WHERE <Condition>;

Comparison Operators - =, <>, >, <


Comparison Operators compares numbers or strings and perform evaluations.

> & < are exclusive operators


>= & <= are inclusive operators
<> or != is the negation operator

Logical Operators - AND, OR, NOT


In SQL, the AND & OR operators are used for filtering the data and getting precise
results based on conditions. The SQL AND & OR operators are also used to combine
multiple conditions. These two operators can be combined to test for multiple
conditions in a SELECT , INSERT , UPDATE , or DELETE statement.

When combining these conditions, it is important to use parentheses so that the


database knows what order to evaluate each condition.

The AND and OR operators are used with the WHERE clause.
These two operators are called conjunctive operators.

AND Operator
This operator displays only those records where both conditions condition1 and
condition2 evaluates to True.

table_name : name of the table


condition1,2,..N : first condition, second condition and so on

8 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT *
FROM <TableName>
WHERE <Condition1>
AND <Condition2>
AND …<ConditionN>;

OR Operator
This operator displays the records where either one of the conditions condition1 and
condition2 evaluates to True. That is, either condition1 is True or condition2 is True.

SQL

SELECT *
FROM <TableName>
WHERE <Condition1>
OR <Condition2>
OR …<ConditionN>;

Combining AND and OR


We can combine AND and OR operators to write complex queries.

SQL

SELECT *
FROM <TableName>
WHERE <Condition1>
AND (<Condition2> OR <Condition3>);

Conclusion - AND & OR

Any kind of condition, including equality, inequality, comparison, and logical


operators, can be utilized with the AND and OR operators.
The AND operator is more important than the OR operator. In other words, the
circumstances that are separated by AND are examined first, followed by the
conditions that are separated by OR.
You can employ the AND and OR operators inside of other conditions because
they can both be nested.

Remember to combine both AND & OR to generate powerful query.

In case you don't include parenthesis ( ) in the query containing both AND & OR to
specify the execution order, SQL runs query automatically from left to right.

9 / 96
MySQL - Learn Data Analytics Together's Group

NOT Operator
The NOT operator in SQL is used to display records for which a condition is NOT
TRUE.

SQL

SELECT *
FROM <TableName>
WHERE <Condition1>
NOT (<Condition2> OR <Condition3>);

Example: Here's what NOT looks like in action when querying a dataset of Amazon
reviews and filtering it to reviews that don't have a 5-star rating:

SQL

SELECT *
FROM reviews
WHERE NOT rating = 5;

We can retrieves same result when using this query also:

SQL

SELECT *
FROM Table
WHERE state != 'PA';

NOTE

'NOT' only negate the 1st condition

Example: Be careful when using NOT Operator with AND Operator. For instance, NOT
only affects the 1st condition in this query

SQL

--Only the 1st condition is affected by `NOT`, the 2nd condition


is a positive condition.

SELECT *
FROM customers
WHERE NOT total_money_spent > 1000 AND state = 'TX';

10 / 96
MySQL - Learn Data Analytics Together's Group

IN Operator
IN Operator allows us to specify multiple values in a WHERE clause. It's a shorthand
for a list of OR conditions.

IN Operator must be used with parenthesis () .


Of note, IN Operator cannot operate when you search multiple values in
multiple columns. IN Operator is useful only when you search multiple values in
one column.

SQL

SELECT *
FROM <TableName>
WHERE <ColumnName> IN (<Value1>, <Value2>, ... <ValueN>);

NOTE

String conditions must be covered in a single quotation mark (' ').

SQL is case-insensitive thus the upper/lower case characters in the String


condition do not impact the query's result.

Of note, be aware that using a large list in IN can be inefficient.

The NOT IN (...) operator is the negation of IN (...) operator.

SQL

SELECT *
FROM <TableName>
WHERE <ColumnName> NOT IN (<Value1>, <Value2>, ... <ValueN>);

Example: Here's what IN looks like when querying:

1/ This query retrieves rows where customerID is 1 or 2 or 3

11 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT *
FROM customers
WHERE CustomerID IN (1, 2, 3);

-- We can rewrite the above syntax using OR Operator. This example


illustrates IN Operator is a shorthand for a list of OR
conditions.
SELECT *
FROM customers
WHERE CustomerID = 1
OR CustomerID = 2
OR CustomerID = 3);

2/ This query retrieves rows where customerID is not 1 or not 2 or not 3

SQL

SELECT *
FROM customers
WHERE CustomerID NOT IN (1, 2, 3);

3/ This query retrieves all customers having names like John, Henry or Arial

SQL

-- When condition is a string, we must cover that string in a


quotation mark
SELECT *
FROM customers
WHERE CustomerName IN ('John', 'Henry', 'Ariel');

4/ To find all customers with ''john' somewhere in their names, this query would
return records where 'CustomerName' contains the substring 'join' in any case. The
result will be 'John', 'JOHn', 'JoHN', etc.

SQL

SELECT *
FROM customers
WHERE CustomerName IN ('John');

12 / 96
MySQL - Learn Data Analytics Together's Group

BETWEEN Operator
BETWEEN Operator selects values within a given range (that range can be numbers,
strings or dates).

BETWEEN Operator is inclusive so what value you have at the beginning or at the
end are going to be included in your output.
BETWEEN & AND operates as same as >= & <=
BETWEEN in MySQL works with strings.

SQL

SELECT *
FROM <TableName>
WHERE <ColumnName> BETWEEN <Condition1> AND <Condition2>;

Example: This query retrieves all cities having name between 'Dallas' & 'Scranton'

SQL

SELECT *
FROM customers
WHERE city BETWEEN 'Dallas' AND 'Scranton';

NOT BETWEEN Operator


We can use NOT to negate the BETWEEN ... AND condition.

SQL

SELECT *
FROM <TableName>
WHERE <ColumnName> NOT BETWEEN <Condition1> AND <Condition2>;

Example: This query retrieves Amazon reviews where the star count is NOT between
2 and 4 stars:

SQL

SELECT *
FROM reviews
WHERE stars NOT BETWEEN 2 AND 4;

13 / 96
MySQL - Learn Data Analytics Together's Group

LIKE Operator
LIKE Operator uses in WHERE clause to search for a specified pattern in a column.
There are 2 wildcards we can use in a LIKE Operator:

% represents for zero, one or multiple characters


_ represents for single character

Similar to IN , LIKE Operator is not case-sensitive. ( 'RED' matches with 'red' when
you use LIKE operator)

The NOT LIKE operator is the negation of LIKE operator.

Default MySQL uses collation case-insensitive and we can adjust it to case-sensitive


collation.

Example:
1/ To find all SKU with 'blue' somewhere in their names, this query would return
records where 'product_name' contains the substring 'blue' in any case. The result
will be 'Blue Jacket', 'Blue Skirt', and 'Blue T-shirt'.

SQL

SELECT product_name
FROM products
WHERE product_name LIKE '%blue%';

2/ This query retrieves all customers having name start with 'ro', e.g. 'rosa', 'Rosa',
'Rose', 'rose', 'Rosabell', etc.

SQL

SELECT *
FROM customers
WHERE FirstName LIKE 'ro%';

3/ This query retrieves all customers with names strictly having 4 characters and
starting with 'ro', e.g. 'rosa', 'Rosa', 'Rose', 'rose', etc.

SQL

SELECT *
FROM customers
WHERE FirstName LIKE 'ro__';

14 / 96
MySQL - Learn Data Analytics Together's Group

GROUP BY
GROUP BY is used to group a set of rows into a set of summary rows based on the
values in one or more columns.

Most often used in conjunction with Aggregate Functions like SUM, COUNT,
MAX, MIN, and AVG, which allow you to perform calculations on the grouped
data.
NULL values are treated as the same and be grouped together in GROUP BY .
When using GROUP BY with multiple columns, we have a distinct combinations of
the specified columns.

SQL

SELECT <Column1>
FROM <TableName>
GROUP BY <Column1>;

NOTE

If columns in SELECT clause do not cover inside an aggregation function, it must


be listed in GROUP BY clause.

Example:
1/ This query retrieves a table listing ProductIDs and its corresponding prices.

SQL

SELECT
ProductID,
ProductPrice
FROM orderdetails
GROUP BY ProductID, ProductPrice;

We can rewrite the above query as below:

SQL

SELECT DISTINCT
ProductID,
ProductPrice
FROM orderdetails;

15 / 96
MySQL - Learn Data Analytics Together's Group

2/ This query retrieves a table listing all departments in a company

SQL

SELECT department
FROM employees
GROUP BY department;

We can also rewrite the above query as below:

SQL

SELECT DISTINCT department


FROM employees;

3/ This query retrieves a table listing ProductIDs and its corresponding sales. (Refer
to SUM function at this link)

SQL

SELECT
ProductID,
SUM(Sales)
FROM orderdetails
GROUP BY ProductID;

Difference between GROUP BY and DISTINCT


GROUP BY groups rows that have the same values into summary rows and is
typically used with aggregate functions.
DISTINCT removes duplicate rows from the result set.

SQL

-- Using GROUP BY to count unique departments


SELECT department, COUNT(*) AS department_count
FROM employees
GROUP BY department;

-- Using DISTINCT to list unique departments


SELECT DISTINCT department
FROM employees;

16 / 96
MySQL - Learn Data Analytics Together's Group

GROUP BY ROLLUP
ROLLUP is used to do aggregate operation on multiple levels in a hierarchy.

ROLLUP is used with multiple columns, providing subtotals for each level of grouping
specified.

SQL

SELECT
<Column1>,
SUM (<Column2>)
FROM <TableName>
GROUP BY ROLLUP (<Column1>);

Or

SQL

SELECT
<Column1>,
SUM (<Column2>)
FROM <TableName>
GROUP BY <Column1> WITH ROLLUP;

Example: This query retrieves the grand totals of the grade per each student.

SQL

SELECT
student_name,
SUM(grade) sum_grade
FROM classes
GROUP BY ROLLUP (student_name);

17 / 96
MySQL - Learn Data Analytics Together's Group

Aggregate Functions
Aggregate Functions are essential tools for efficient data summarization and
analysis. They help to distill large datasets into meaningful insights, facilitate
statistical analysis, and simplify complex data structures.

The most frequently used SQL aggregate functions are:

SUM()
COUNT()
AVG()
MIN()
MAX()

SUM()
Returns the total of a numerical column. It is typically used to find the total of values,
e.g. sales income, quantities, or expenses.

SQL

SELECT SUM(<ColumnName>)
FROM TableName;

Example: ABC manufacturing company sold a total of 10,000 products in 2023, with a
breakdown of:

8,000 units of Product A - $150/unit


1,500 units of Product B - $700/unit
500 units of Product C - $1999/unit

The company records their sales in a Products Table, which includes two columns:
ProductID, TransactionDate, Price and UnitsSold for each type of product.

1/ This query retrieves the company's total units sold in 2023. It returns to a result of
10,000 units sold.

SQL

SELECT SUM(UnitsSold) AS Total_Units_Sold


FROM Products;

2/ This query retrieves the company's total units of Product A sold in 2023. It returns
to a result of 8,000 units of Product A being sold.

18 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT SUM(UnitsSold) AS Total_Units_Sold


FROM Products
WHERE ProductID = 'A';

COUNT()
Count the number of rows in a table or the number of non-null values in a column.

DISTINCT keyword is optionally used to count only the unique values.

SQL

SELECT COUNT([DISTINCT] <ColumnName>)


FROM TableName;

Example: Using the above example

1/ This query finds out how many sales transactions in 2024.

SQL

SELECT COUNT(TransactionDate) AS Total_Transactions


FROM Products;

Or we can use This query to retrieve the same result.

SQL

SELECT COUNT(*)
FROM Products;

2/ This query finds out how many types of products were sold. It returns to 3, i.e.
Product A, Product B and Product C.

SQL

SELECT COUNT(DISTINCT ProductID)


FROM Products;

19 / 96
MySQL - Learn Data Analytics Together's Group

AVG()
Calculate the average value of a numeric column.

SQL

SELECT AVG(<ColumnName>)
FROM TableName;

Example: To understand this with an example, consider the following table that
contains employee records.

employees

employeenumber lastname skill_level annual_salary department


1056 Patterson 10 10000 Finance
1076 Firrelli 10 NULL Marketing
1088 Patterson 60 12500 Finance
1102 Bondur 21 10000 Human Resources
1143 Bow 10 5000 Sales
1165 Jennings 10 5000 Sales
1166 Thompson 10 10000 Marketing

1/ This query finds out the average skill level of the employees. The result is 18.7143.

SQL

SELECT AVG(skill_level)
FROM employees;

To round this number to the nearest integer, AVG() is used nested in other SQL
functions like ROUND() as follow. The result now is 19.

SQL

SELECT ROUND(AVG(skill_level))
FROM employees;

2/ This query finds out the average salary of the employees in Finance Department.
The result is $11,250.

20 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT AVG(annual_salary)
FROM employees
WHERE department = 'Finance';

3/ This query finds out the average salary of the employees in Marketing
Department. Notice how AVG function ignores NULL value and returns to the result of
$10,000.

SQL

SELECT AVG(annual_salary)
FROM employees
WHERE department = 'Marketing';

4/ This query retrieves average salary for each department.

SQL

SELECT
department AS department,
AVG(annual_salary) AS AvgSalary
FROM employees
GROUP BY department;

Result is as below:

Department AvgSalary
Finance 11,250
Marketing 10,000
Human Resources 10,000
Sales 5,000

21 / 96
MySQL - Learn Data Analytics Together's Group

MIN()
Return the smallest value within a column.

SQL

SELECT MIN(<ColumnName>)
FROM TableName;

Example: Use the above example.

1/ This query retrieves the minimum salary across departments. The result is $5,000.

SQL

SELECT MIN(annual_salary)
FROM employees;

2/ The query retrieves the minimum salary for each department.

SQL

SELECT
department AS Department,
MIN(annual_salary) AS Min_Salary
FROM employees
GROUP BY department;

Result is as below:

Department Salary
Finance 10,000
Marketing 10,000
Human Resources 10,000
Sales 5,000

22 / 96
MySQL - Learn Data Analytics Together's Group

MAX()
Return the largest value within a column.

SQL

SELECT MAX(<ColumnName>)
FROM TableName;

Example: Use the above example. This query retrieves the minimum salary across
departments. The result is $12,500.

SQL

SELECT MAX(annual_salary)
FROM employees;

Tips

Multiple aggregate functions can be used in a single SQL query. Using several
aggregate functions provides a more comprehensive view of your data.

23 / 96
MySQL - Learn Data Analytics Together's Group

HAVING
Collaborate with the GROUP BY clause to narrow down results based on Aggregate
Functions. In other word, HAVING applies conditions on groups pivots by GROUP BY
clause.

SQL

SELECT
<Column1>,
<Column2>,
AGGREGATE_FUNCTION(<Column3>)
FROM <TableName>
GROUP BY <Column1>, <Column2>
HAVING <GroupCondition>;

Example: Using the same example in AVG() function. This query retrieves average
salary of each departments where average salary is greater than 6,000.

SQL

SELECT
department,
AVG(annual_salary)
FROM employees
GROUP BY department
HAVING AVG(annual_salary) > 6000;

Result is as below. Noting with the condition in HAVING clause, the average salary of
Sales Department (which is $5,000) is removed from the result.
(Refer to this link the query retrieves average salary of each department)

Department AvgSalary
Finance 11,250
Marketing 10,000
Human Resources 10,000

24 / 96
MySQL - Learn Data Analytics Together's Group

Data Types
What are data types in MySQL?
1. The data type is a guideline for SQL to understand what type of data is
expected inside of each column
2. It also identifies how SQL will interact with the stored data
3. There are main categories of Data Types - Strings, Numeric, Date and Time, and
then Boolean (which is technically under Numeric)

STRING Data Type


Data Type Details
CHAR Fixed_Length
VARCHAR Variable-Length - Max: 65,536
MEDIUMTEXT Max: 16MB
LONGTEXT Max: 4MB
TINYTEXT Max: 255 bytes
TEXT Max: 64KB

String Functions
LENGTH()

Return the length of the string, including spaces.

SQL

LENGTH(ColumnName)

Example: This query returns to a result of '15'

SQL

SELECT LENGTH('MySQL Challenge');

UPPER()

Upper all characters in the string.

SQL

UPPER(ColumnName)

25 / 96
MySQL - Learn Data Analytics Together's Group

Example: This query returns to 'MYSQL CHALLENGE'

SQL

SELECT UPPER('MySQL Challenge');

LOWER()

Lower all characters in the string.

SQL

LOWER(ColumnName)

Example: This query returns to 'mysql challenge'

SQL

SELECT LOWER('MySQL Challenge');

TRIM()

TRIM() removes trailing and leading blanks, not blanks between characters.

SQL

TRIM([{BOTH | LEADING | TRAILING} [removal_char] FROM] string)

Example:
1/ This query returns to 'I love MySQL Challenge'

SQL

SELECT TRIM(' I love MySQL Challenge ');

2/ This query returns to 'I love MySQL Challenge'

SQL

SELECT TRIM(' I love MySQL Challenge ');

26 / 96
MySQL - Learn Data Analytics Together's Group

3/ This query returns to 'MySQL Challenge'

SQL

SELECT TRIM(LEADING 'x' FROM 'xxxMySQL Challenge');

4/ This query returns to 'MySQL Challenge'

SQL

SELECT TRIM(TRAILING '!' FROM 'MySQL Challenge!');

5/ This query returns to 'MySQL Challenge'

SQL

SELECT TRIM(BOTH 'x' FROM 'xxxMySQL Challengexxx');

LTRIM()

LTRIM() removes leading blanks.

SQL

LTRIM(string)

Example: This query returns to 'MySQL Challenge '

SQL

SELECT LTRIM(' MySQL Challenge ');

RTRIM()

RTRIM() removes trailing blanks.

SQL

RTRIM(string)

Example: This query returns to ' MySQL Challenge'

SQL

SELECT RTRIM(' MySQL Challenge ');

27 / 96
MySQL - Learn Data Analytics Together's Group

LEFT()

Extract x characters from a string (starting from left)

SQL

LEFT(string, length)

Example: This query returns to 'MySQL'

SQL

SELECT LEFT('MySQL Challenge',5);

RIGHT()

Extract x characters from a string (starting from right)

SQL

RIGHT(string, length)

Example: This query returns to 'Challenge'

SQL

SELECT RIGHT('MySQL Challenge',9);

SUBSTRING()

Extract a substring from a string (start at position start , extract length characters)

SQL

SUBSTRING(string_expression, start, length)

Example: This query returns to 'Challenge'

SQL

SELECT SUBSTRING('MySQL Challenge',7,10);

28 / 96
MySQL - Learn Data Analytics Together's Group

SUBSTRING_INDEX()

Return a substring of a string before a specified number of delimiter occurs.

positive number counts delimiter from left to right


negative number counts delimiter from right to left

SQL

SUBSTRING_INDEX(string, delimiter, number)

Example:
1/ This query returns to '056'

SQL

SELECT SUBSTRING_INDEX('056-193-778-513', '-', 1);

2/ This query returns to '056-193'

SQL

SELECT SUBSTRING_INDEX('056-193-778-513', '-', 2);

3/ This query returns to '778-513'

SQL

SELECT SUBSTRING_INDEX('056-193-778-513', '-', 2);

REPLACE()

REPLACE() function replaces all occurrences of a substring within a string, with a


new substring. This function performs a case-sensitive replacement.

SQL

REPLACE(string, from_string, new_string)

Example: This query returns to '056193778513'

SQL

SELECT REPLACE('056-193-778-513','-','');

29 / 96
MySQL - Learn Data Analytics Together's Group

CONCAT()

CONCAT() function adds two or more expressions together.

SQL

CONCAT(Expression1, Expression2, Expression3, ...)

Example: This query concatenates 3 phrases to turn it into a meaningful sentence


'Shoot for the moon! Even if you miss, you'll land among the star!'

SQL

SELECT CONCAT('Shoot for the moon! ', 'Even if you miss,', '
you\'ll land among the star!');

-- Notice how spaces are added between phrases


-- If your string contains single quotation mark as a normal
character. To treat it as a literal character sequence, you need
to escape it using a backslash (`\`) before the single quote.

30 / 96
MySQL - Learn Data Analytics Together's Group

NUMERIC Data Type

Data Type Details


TINYINT 1b
SMALLINT 2b
MEDIUMINT 3b
INT 4b - Whole Numbers
BIGINT 8b
DECIMAL(precision, scale) DECIMAL(3,2) → 1.56

Numeric Function
ROUND

ROUND rounds a number to a specified number of decimal places.

SQL

SELECT ROUND(5.7897,2); -- Result = 5.79


SELECT ROUND(5.7897,1); -- Result = 5.8

CEILING

CEILING () rounds up to the nearest whole number.

SQL

SELECT CEILING(5.7); -- Result = 6


SELECT CEILING(5.3); -- Result = 6

FLOOR

FLOOR () rounds down to the nearest whole number.

SQL

SELECT FLOOR(5.7); -- Result = 5


SELECT FLOOR(5.3); -- Result = 5

ABSOLUTE

ABSOLUTE gives absolute whole number at a given value.

31 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT ABS(4.6); -- Result = 4.6


SELECT ABS(-4.6); -- Result = 4.6

32 / 96
MySQL - Learn Data Analytics Together's Group

DATE & TIME

Data Type Details


DATE 2022-01-01
TIME 22:52:21
DATETIME 2010-01-01 22:52:21
TIMESTAMP 2010-01-01 22:52:21
(limited from 1970-01-01 to 2038-01-01)
YEAR 2022

Date & Time Functions in MySQL

NOW()
NOW() function returns the current date and time.

Example:
1/ This query retrieves current date and time as 'YYYY-MM-DD HH-MM-SS' (string
format) → '2024-06-08 05:26:35'

SQL

SELECT NOW();

2/ The below queries retrieve current year, month, week and day .

SQL

SELECT YEAR(NOW());
SELECT MONTH(NOW());
SELECT WEEK(NOW());
SELECT DAY(NOW());

CURDATE()
CURDATE() function returns the current date.

Example: This query retrieves current date as 'YYYY-MM-DD' (string format) →


'2024-06-08'

SQL

SELECT CURDATE();

33 / 96
MySQL - Learn Data Analytics Together's Group

Or you can rewrite it using CURRENT_DATE() function.

SQL

SELECT CURRENT_DATE();

CURTIME()
CURTIME() function returns the current time.

Example: This query retrieves current time as 'HH-MM-SS' (string format) →


'05:26:35'

SQL

SELECT CURTIME();

Or you can rewrite it using CURRENT_TIME() function.

SQL

SELECT CURRENT_TIME();

DAYNAME()
DAYNAME() function returns the weekday name for a given date.

Example: This query returns to 'Sarturday'.

SQL

SELECT DAYNAME(NOW());

MONTHNAME()
MONTHNAME() function returns the month name for a given date.

Example: This query returns to 'June'.

SQL

SELECT MONTHNAME(NOW());

34 / 96
MySQL - Learn Data Analytics Together's Group

DATE_FORMAT()
DATE_FORMAT() function formats a date as specified.

SQL

DATE_FORMAT(date,format)

Example:

SQL

SELECT DATE_FORMAT(NOW(),'%M %d %Y'); -- 'June 08 2024'


SELECT DATE_FORMAT(NOW(),'%M %D, %Y'); -- 'June 8th, 2024'
SELECT DATE_FORMAT(NOW(),'%m-%d-%Y'); -- '06-08-2024'

DATEADD
DATE_ADD() function adds a time/date interval from a date and then returns the
date.

Use positive number to determine date after a specific date


Use negative number to determine date before a specific date

SQL

DATE_ADD(date, INTERVAL value interval)

Example:
1/ This query returns to next day of '2024-06-08' date → '2024-06-09'

SQL

SELECT DATE_ADD('2024-06-08', INTERVAL 1 DAY);

2/ This query returns to previous day of '2024-06-08' date → '2024-06-07'

SQL

SELECT DATE_ADD('2024-06-08', INTERVAL -1 DAY);

3/ This query returns to the timestamp which is 15 minutes after '2024-06-08


23:00:00' timestamp → '2024-06-08 23:15:00'

35 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT DATE_ADD('2024-06-08 23:00:00', INTERVAL 15 MINUTE);

NOTE

When using INTERVAL with time values like HOUR, MINUTE, SECOND, or
MICROSECOND without specifying a specific time in the date parameter, MySQL
automatically uses '00:00:00' of the same day as the base for calculation.

4/ This query returns to the timestamp which is 15 minutes after '2024-06-08


00:00:00' timestamp (the 'HH:MM:SS' = '00:00:00' part is implicitly defined by
MySQL)
→ '2024-06-08 00:15:00'

SQL

SELECT DATE_ADD('2024-06-08', INTERVAL 15 MINUTE);

This query returns to the timestamp which is 2 hours after '2024-06-08 00:00:00'
timestamp (the 'HH:MM:SS' = '00:00:00' part is implicitly defined by MySQL)
→ '2024-06-08 02:00:00'

SQL

SELECT DATE_ADD('2024-06-08', INTERVAL 2 HOUR);

DATE_SUB()

DATE_SUB() function subtracts a time/date interval from a date and then returns the
date.
- Use positive number to determine date before a specific date
- Use negative number to determine date after a specific date
(Think of it like Date - (-number) = Date + number )

SQL

DATE_SUB(date, INTERVAL value interval)

Example:
1/ This query returns to next day of '2024-06-08' date
→ '2024-06-08' - (-1) = '2024-06-08' + 1 = '2024-06-09'

36 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT DATE_SUB('2024-06-08', INTERVAL -1 DAY);

2/ This query returns to previous day of '2024-06-08' date → '2024-06-07'

SQL

SELECT DATE_SUB('2024-06-08', INTERVAL 1 DAY);

3/ This query returns to the timestamp which is 15 minutes prior to '2024-06-08


23:00:00' timestamp → '2024-06-08 22:45:00'

SQL

SELECT DATE_SUB('2024-06-08 23:00:00', INTERVAL 15 MINUTE);

NOTE

When using INTERVAL with time values like HOUR, MINUTE, SECOND, or
MICROSECOND without specifying a specific time in the date parameter, MySQL
automatically uses '00:00:00' of the same day as the base for calculation.

4/ This query returns to the timestamp which is 15 minutes prior to '2024-06-08


00:00:00' timestamp (the 'HH:MM:SS' = '00:00:00' part is implicitly defined by
MySQL)
→ '2024-06-07 22:45:00'

SQL

SELECT DATE_SUB('2024-06-08', INTERVAL 15 MINUTE);

This query returns to the timestamp which is 2 hours prior to '2024-06-08 00:00:00'
timestamp (the 'HH:MM:SS' = '00:00:00' part is implicitly defined by MySQL)
→ '2024-06-07 22:00:00'

SQL

SELECT DATE_SUB('2024-06-08', INTERVAL 2 HOUR);

37 / 96
MySQL - Learn Data Analytics Together's Group

DATEDIFF
Calculate the difference/ timespan between two dates based on the Unit of time
specified.

SQL

DATEDIFF(UnitOfTime, StartDate, EndDate)

Example:
1/ This query return to 1.

SQL

SELECT DATEDIFF('2024-06-08','2024-06-07');

2/ This query retrieves the total days between OrderDate and ShippedDate.
(OrderDate and ShippedDate are 2 columns in Sales Table)

SQL

SELECT DATEDIFF(OrderDate, ShippedDate);

BOOLEAN

Data Type Details


BOOL 0/1 - TRUE/FALSE

38 / 96
MySQL - Learn Data Analytics Together's Group

LIMIT
LIMIT clause helps to specify the number of rows you want to retrieve.

SQL

SELECT <ColumnName>
FROM <TableName>
WHERE <Condition>
LIMIT <Number>;

Example:
1/ This query retrieves top 5 records from customers Table

SQL

SELECT *
FROM customers
LIMIT 5

2/ This query retrieves 2 records from the 6th record. This query can be interpreted
as MySQL goes from 1st record to 5th record and take 2 records from 6th position,
retrieving 2 rows of 7th and 8th records.

SQL

SELECT *
FROM customers
LIMIT 5,2

39 / 96
MySQL - Learn Data Analytics Together's Group

IF Functions
IF return 'YES' if the condition is TRUE, or 'NO' if the condition is FALSE.

SQL

IF(condition, value_if_true, value_if_false)

We can use nested IF in our query.

Example:
1/ This query returns to '5'

SQL

SELECT IF(500<1000, 5, 10);

2/ This query returns to '1'.

Nested IF is executed from the inner IF to the outer IF :


After inner IF is executed we can read the query as SELECT IF(500<1000, 1,
10)

SQL

SELECT IF(500<1000, IF(100 > 50, 1,0), 10);

3/ This query labels Delivery status for each transaction:

If total days from ShippedDate to OrderDate are greater than 3 days, transaction
is labeld as 'Slow Delivery'
Otherwise (meaning total days from ShippedDate to OrderDate are less than 3
days), transaction is labeld as 'Fast Delivery'

SQL

SELECT IF(DATEDIFF(OrderDate, ShippedDate) > 3,


'Slow Delivery',
'Fast Delivery') AS category
FROM Delivery

40 / 96
MySQL - Learn Data Analytics Together's Group

NULLIF
NULLIF() function compares two expressions and returns NULL if they are equal.
Otherwise, the first expression is returned.

SQL

NULLIF(expr1, expr2)

We can use NULLIF() to check whether columns in a table contains NULL value or
not. This query returns to NULL if value in Column1 is NULL.

SQL

SELECT NULLIF(Column1,'NULL');

IFNULL
IFNULL() function returns a specified value if the expression is NULL. If the
expression is NOT NULL, this function returns the expression.

SQL

IFNULL(expression, alt_value)

41 / 96
MySQL - Learn Data Analytics Together's Group

CAST and CONVERT Functions


CONVERT function in MySQL is similar to CAST and is used for type conversion, but it
can also convert character sets.

The datatype to convert to. Can be one of the following:

Value Description
DATE Converts value to DATE. Format: "YYYY-MM-DD"
DATETIME Converts value to DATETIME. Format: "YYYY-MM-DD HH:MM:SS"
DECIMAL Converts value to DECIMAL. Use the optional M and D parameters to
specify the maximum number of digits (M) and the number of digits
following the decimal point (D).
TIME Converts value to TIME. Format: "HH:MM:SS"
CHAR Converts value to CHAR (a fixed length string)
NCHAR Converts value to NCHAR (like CHAR, but produces a string with the
national character set)
SIGNED Converts value to SIGNED (a signed 64-bit integer)
UNSIGNED Converts value to UNSIGNED (an unsigned 64-bit integer)
BINARY Converts value to BINARY (a binary string)

CAST
CAST() function converts a value (of any type) into the specified datatype.

SQL

CAST(value AS datatype)

Example: This query converts number '150' into a string '150'

SQL

SELECT CAST(150 AS CHAR);

CONVERT
CONVERT() function converts a value (of any type) into the specified datatype.

SQL

CONVERT(value , datatype)

Example: This query converts number '150' into a string '150'

42 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT CONVERT(150, CHAR);

43 / 96
MySQL - Learn Data Analytics Together's Group

CASE WHEN Statements


The CASE WHEN statement acts like an IF Functions. It goes through conditions and
returns a value when the first condition is met.

The output of CASE WHEN function can be used by another function. This is a very
useful function when combining with other functions!

SQL

SELECT <Column1>, <Column2>, ...


CASE
WHEN <Condition1> THEN <Result1>
WHEN <Condition2> THEN <Result2>
...
[ELSE <ResultN>]
END
FROM <TableName>;

Example:
1/ This query labels salary range for each employees' salary.

SQL

SELECT Name, Salary,


CASE
WHEN Salary <= 60000 THEN 'Low'
WHEN Salary BETWEEN 60001 AND 80000 THEN 'Medium'
ELSE 'High'
END AS SalaryBracket
FROM employees;

2/ This query combines CASE WHEN with SUM , returning a table calculates payment
against each year for each customer.

44 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
CustomerNumber,
SUM(CASE WHEN YEAR(paymentDate)='2021' THEN Sales ELSE 0 END)
AS'2021',
SUM(CASE WHEN YEAR(paymentDate)='2022' THEN Sales ELSE 0 END)
AS'2022',
SUM(CASE WHEN YEAR(paymentDate)='2023' THEN Sales ELSE 0 END)
AS'2023',
SUM(CASE WHEN YEAR(paymentDate)='2024' THEN Sales ELSE 0 END)
AS'2024',
FROM payments
GROUP BY CustomerNumber

45 / 96
MySQL - Learn Data Analytics Together's Group

JOIN
What are JOINs?
1. Joins are used to combine rows from two or more tables based on a related
column
2. There are several types of JOINs in SQL, including NATURAL JOIN, INNER JOIN,
LEFT AND RIGHT JOINS, CROSS JOIN, and SELF JOIN.

USING keywork in JOIN


USING is a shortcut syntax for ON in JOIN.

Instead of using this:

SQL

SELECT *
FROM Table1 T1
JOIN Table2 T2
ON T1.column1 = T2.column1;

We can use USING :

SQL

SELECT *
FROM Table1
JOIN Table2
USING (column1)

NATURAL JOIN
NATURAL JOIN is JOIN without ON condition. NATURAL JOIN links tables together
based on columns with the same name and automatically selects only the columns
that exist in both tables.

And it will automatically eliminate the redundant columns (same columns in 2 tables)

SQL

SELECT *
FROM <Table1>
NATURAL JOIN <Table2>;

46 / 96
MySQL - Learn Data Analytics Together's Group

INNER JOIN
1. Returns records that have matching values in both tables
2. Tables are joined by a common column in both tables and only rows that match
in each table's columns are returned
3. The INNER keyword is optional.

SQL

SELECT *
FROM <Table1>
[INNER] JOIN <Table2>
ON <Condition>;

LEFT AND RIGHT JOINS


1. RIGHT JOIN: Returns all records from the right table, and the matched records
from the left table
2. LEFT JOIN: Returns all records from the left table, and the matched records
from the right table

SQL

SELECT *
FROM <Table1>
LEFT JOIN <Table2>
ON <Condition>;

SQL

SELECT *
FROM <Table1>
RIGHT JOIN <Table2>
ON <Condition>;

Fun Fact

RIGHT JOIN is rarely used because it returns the same result as the LEFT JOIN.
On the other hand, queries which use LEFT JOIN are much easier to read because
we simply list tables one after the other.

47 / 96
MySQL - Learn Data Analytics Together's Group

SELF JOINS
1. A JOIN where a table is joined to itself.
2. Aliases are required to distinguish from which "table" is being selected or
filtered on.

SQL

SELECT *
FROM <Table> AS T1
[INNER/ LEFT/ RIGHT] JOIN <Table> AS T2
ON <Condition>;

CROSS JOINS
1. Returns all possible combinations of all rows
2. No "ON" clause is needed because all columns and rows from one table are
being combined with all columns and rows from the other table
3. Also called a "Cartesian" Join

SQL

SELECT *
FROM <Table1>
CROSS JOIN <Table2>
ON <Condition>;

JOIN vs UNION
1. A UNION is used to combine the results of two or more SELECT statements into
a single result set.
2. JOINS typically give a horizontal output, while UNIONs give a vertical output.

48 / 96
MySQL - Learn Data Analytics Together's Group

We can use JOIN and UNION to append 2 tables and create new column to label its
records. Of note, columns' name in the appended table is aligned with the columns'
name in the first table.

Example:

SQL

SELECT first_name, last_name, 'Old' AS Label


FROM Customers
WHERE YEAR(Birth_date) < 1950

UNION

SELECT first_name, last_name, 'New'


FROM Customers
WHERE YEAR(Birth_date) > 1950;

MySQL doesn't directly support a full outer join using the FULL OUTER JOIN keyword.
However, same result can be achieved by combining a LEFT JOIN and a RIGHT JOIN
with the UNION operator.

Example:

SQL

SELECT Employees.*, Departments.*


FROM Employees
LEFT JOIN Departments
ON Employees.department_id = Departments.id

UNION ALL

SELECT Employees.*, Departments.*


FROM Departments
RIGHT JOIN Employees
ON Departments.id = Employees.department_id;

49 / 96
MySQL - Learn Data Analytics Together's Group

Sub-queries
What are Sub-queries?
1. A Subquery is a Select statement that is nested within another query
2. A Subquery returns an output that is used by the "outer" query

Sub-queries in WHERE
SQL

SELECT *
FROM customers
WHERE customer_id IN
(SELECT customer_id
FROM customer_orders);

Aggregation Functions can be used with WHERE and be wrapped in a sub-query.

SQL

SELECT AVG(total_money_spent)
FROM customers
WHERE total_money_spent > (SELECT AVG(total_money_spent)
FROM customers);

ALL in Sub-queries
The ALL keyword returns true if the comparison operation evaluates to true for all of
the values returned by the subquery.

ALL means that the condition will be satisfied if the operation is true for all values in
this range.

The below syntax use MAX to manually pick the maximum sales and set it as a
condition for the outer SELECT statement.

50 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
shipper_id
, order_id
, quantity
, unit_price
, (quantity * unit_price) AS total_order_price
FROM ordered_items
WHERE (quantity * unit_price) > (SELECT MAX(quantity * unit_price)
AS total_order_price
FROM ordered_items
WHERE shipper_id = 1)

With ALL , it retrieves same result as compared to the above query.


The query means select sales which is greater than each values generated in inner
query. In other word, select sales which is greater than the maximum value
generated from query inside ALL clause.

SQL

SELECT
shipper_id
, order_id
, quantity
, unit_price
, (quantity * unit_price) AS total_order_price
FROM ordered_items
WHERE (quantity * unit_price) > ALL(SELECT (quantity * unit_price)
AS total_order_price
FROM ordered_items
WHERE shipper_id = 1)

ANY in Sub-queries
The ANY keyword returns true if the comparison operation evaluates to true for any
of the values returned by the subquery. It's functionally equivalent to using IN .

ANY means that the condition will be satisfied if the operation is true for any of the
values in this range.

This query selects the names of customers whose age is greater than the age of at
least one employee in the Sales department.

51 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT name
FROM customers
WHERE age > ANY (SELECT age FROM employees WHERE department =
'Sales');

EXISTS Operator
EXISTS Operator is same as IN Operator.

EXISTS is used to check for the existence of rows returned by a subquery. It returns
true if the subquery returns at least one row; otherwise, it returns false. It helps to
check if a value exists within that sub-query.

Once EXISTS find a row meet condition, it'll evaluate a true and stop all the process.

We use EXISTS in sub-query because it is faster (cause it's not evaluating the entire
dataset, it is just evaluating until it hits TRUE then the check will be stopped).

We use EXISTS in extreme large dataset.

Sub-query with SELECT and FROM


Example
1/ The below sub-query calculates the average quantity for all Quantity values in the
ordered_items table.

SQL

SELECT
product_id,
quantity,
(SELECT AVG(Quantity)
FROM ordered_items)
FROM ordered_items

We can also rewrite the above query using Window Functions. The OVER() clause
specifies that the average is calculated for each partition (in this case, there's no
partition specified, so it's calculated over the entire table).

52 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
product_id,
quantity,
AVG(Quantity) OVER()
FROM ordered_items;

2/ This query retrieves all records in the employee table where employees' salaries
are less than 40,000.

SQL

SELECT E1.*
FROM
(SELECT *
FROM employee
WHERE salary < 40000) E1;
;

53 / 96
MySQL - Learn Data Analytics Together's Group

Window Functions
What is a Window Functions?

1. A Window Function is a function that performs a calculation across a set of rows


that are related to the current row

2. Similar to an aggregate function, but a window function does not cause rows to
group into a single output row

3. Functions are applied to each row individually, and the result is displayed in a
separate column of the output

Window Functions vs GROUP BY

OVER Clause & PARTITION BY


OVER Clause
Aggregation Functions + OVER() means that aggregation functions is over every
columns in the dataset. It acts as Sub-query with SELECT & Aggregation.

Example:

SQL

SELECT
c.customer_id
,first_name
,order_total
,MAX(order_total) OVER() AS max_order_total
FROM customers C
JOIN customer_orders CO
ON C.customer_id = CO.customer_id;

54 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
c.customer_id
,first_name
,order_total
,SELECT(MAX(order_total) FROM customer_orders) AS
max_order_total
FROM customers C
JOIN customer_orders CO
ON C.customer_id = CO.customer_id;

PARTITION BY
PARTITION BY means dividing the query results set into partitions, same as GROUP
BY , we're grouping on it but we don't have the grouped final result sets. Instead, we
have a new column with values across each rows.

SQL

SELECT
c.customer_id
,first_name
,order_total
,MAX(order_total) OVER(PARTITION BY c.customer_id) AS
max_order_total
FROM customers C
JOIN customer_orders CO
ON C.customer_id = CO.customer_id;

ROW NUMBER
Assigns a unique number to each row starting from 1 for each partition.

SQL

ROW_NUMBER() OVER (PARTITION BY column ORDER BY column)

Example: This query assigns a row number to employees in each department based
on their salary.

55 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT Name, Department, Salary,


ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary
DESC) AS SalaryRank
FROM employees;

RANK
Assigns a rank to each row within a partition, with gaps in rank numbers for ties.

SQL

RANK() OVER (PARTITION BY column ORDER BY column)

Example: This query ranks sales records within each region based on the amount.

SQL

SELECT Region, SalesAmount,


RANK() OVER (PARTITION BY Region ORDER BY SalesAmount DESC)
AS SalesRank
FROM sales;

DENSE_RANK
Similar to RANK(), but ranks are consecutive with no gaps.

SQL

DENSE_RANK() OVER (PARTITION BY column ORDER BY column)

Example: This query ranks students within each class based on their grades.

SQL

SELECT Class, StudentName, Grade,


DENSE_RANK() OVER (PARTITION BY Class ORDER BY Grade DESC)
AS GradeRank
FROM students;

LAG
LAG() accesses data from a previous row in the same result set, without the need
for a self-join. It is commonly used to compare current row values with previous row
values.

56 / 96
MySQL - Learn Data Analytics Together's Group

SQL

LEAD(value_expression, offset, default) OVER ([partition_clause]


order_by_clause)

Example: This query compares the current employees' salary in each department
with their previous year salary and labels the increase, decrease as 'More' and 'Less'
respectively.

SQL

SELECT *, IF(salary > lead_col,'More','Less')


FROM
(SELECT
*,
LEAD(salary) OVER(PARTITION BY department ORDER BY
employee_id) AS lead_col
FROM employees) AS lead_table

LEAD
LEAD() accesses data from a subsequent row in the same result set. It is useful for
comparing the current row with upcoming row values.

SQL

LEAD(value_expression, offset, default) OVER ([partition_clause]


order_by_clause)

Example: This query compares the current employees' salary in each department
with their next year salary and labels the increase, decrease as 'More' and 'Less'
respectively.

SQL

SELECT *, IF(salary > lead_col,'More','Less')


FROM
(SELECT
*,
LEAD(salary) OVER(PARTITION BY department ORDER BY
employee_id) AS lead_col
FROM employees) AS lead_table

57 / 96
MySQL - Learn Data Analytics Together's Group

Regular Expression
Regular Expression is kind of like using LIKE statement but a lot more advanced and
you can specify the pattern you're looking for in a more specific way.

REGEXP is case-insentitive.

What is Regular Expression?


1. Regular Expressions, also called Regex, is a sequence of characters, used to
search and locate specific sequences of characters that match a provided
pattern.
2. Similar to the LIKE Statement, but more specific and complex.

The below syntaxes using LIKE + Wildcard retrieves same result when using REGEXP .

SQL

SELECT *
FROM customers
WHERE first_name LIKE '%k%'

SELECT *
FROM customers
WHERE first_name REGEXP 'k'

REGEXP_REPLACE
Replaces all occurrences of a substring within a string, with a new substring.

SQL

REGEXP_REPLACE(expression, pattern, replacement);

Example:
1/ This query replaces 'a' with 'b' in first_name

SQL

SELECT first_name, REGEXP_REPLACE(first_name,'a','b')


FROM customers;

2/ This query removes dash characters in a phone number. It returns to


'056193778513'.

58 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT REGEXP_REPLACE('056-193-778-513', '-','');

3/ More advanced, This query removes special characters like parentheses and
dashes in phone number by using pattern \D to match any non-digit character and
replacing them with an empty string ( '' )

SQL

SELECT REGEXP_REPLACE(phone_number, '\\D', '') AS clean_number


FROM users;

The result is as below:

original_number clean_number
+1 (202) 555-1212 12025551212
098-765-4321 0987654321
123.456.7890 1234567890

REGEXP_LIKE
Uses for pattern matching in strings based on regular expressions. It checks if a
string matches a specified pattern and returns 1 (true) if there's a match, and 0
(false) otherwise.

Example:

1/ This query should be executed in the bakery database. It returns to 1 if


first_name contains 'a' and 0 if it does not.

SQL

SELECT
first_name,
REGEXP_LIKE(first_name,'a')
FROM customers

2/ This query checks if the first name consists only of alphabetic characters and
returns a boolean result.

59 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
first_name,
REGEXP_LIKE(first_name, '^[A-Za-z]+$') AS is_alpha
FROM customers;

REGEXP_INSTR
Shows a starting index of a sub-string (like we use LOCATE / LOCATION of this
character.). The syntax ends when hitting the 1st matching.

Example: This query should be executed in the bakery database. It finds the 1st
position of character 'a' in each customer's name.

SQL

SELECT first_name, REGEXP_INSTR(first_name,'a')


FROM customers;

-- Some examples of result are as below:


Charles → Result = 3
Michael → Result = 5
Anakin → Result = 1

REGEXP_SUBSTR
REGEXP_SUBSTR is case-insensitive. If the string matches the searching sub-string, it
returns to the searching sub-string, otherwise, it returns NULL .

Example:
1/ This query should be executed in the bakery database. It returns to the substring
'char' if the searching values contain this substring, otherwise, it returns to NULL.

SQL

SELECT first_name, REGEXP_SUBSTR(first_name,'char')


FROM customers;

-- Some examples of result are as below:


Charles → Result = Char
Micheal → Result = NULL
Anakin → Result = NULL

60 / 96
MySQL - Learn Data Analytics Together's Group

Regular Expression Metacharacters


[-.^$*+?}|

[] - Square bracket
[] indicates we're looking for a specific set of characters. [] give us a special ability
to look for a range of string/ numbers, [] can be used with both numeric data and
string data.

REGEXP [a-c] = a range from a to c → retrieve any records containing a , b


and c
REGEXP [0-1] = a range from 0 to 1 → retrieve any records containing 0 and 1

Without a bracket, we're looking for an exact set of characters

REGEXP 'a-c' → retrieve any records containing a-c anywhere


REGEXP '0-1' → retrieve any records containing 0-1 anywhere

Example:
1/ This query should be executed in the bakery database. It finds all customers
having character 'a', 'b' or 'c' in their first names.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP '[a-c]';

2/ This query should be executed in the bakery database. It finds all customers
having number '0' or '1' in their phone numbers.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP '[0-1]';

. Dot
. searches any records except for NULL values.

Example:

1/ This query should be executed in the bakery database. It finds all customers
having phone number contain at least one character (except NULL values).

61 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT *
FROM customers
WHERE phone REGEXP '.';

2/ This query should be executed in the bakery database. It finds all customers
whose phone numbers contain a digit '6' followed by any single character.

SQL

SELECT *
FROM customers
WHERE phone REGEXP '6.';

3/ This query should be executed in the bakery database. It searches strings contain
k , tail by 3 characters and end by n . In this case, . acts like _ (underscore) in LIKE
statement.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'k...n';

^ Caret
^ called carrot but it is actually an upper arrow sign. ^ means you're searching the
start with that letter/ number.

Example:

1/ This query should be executed in the bakery database. It finds all customers
having first name starting with 'k'.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP '^k';

Result is: Kevin & Kelly.

2/ This query should be executed in the bakery database. Without ^ , REGEXP


searches string contains k anywhere.

62 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'k';

Result is: Kevin, Kelly and Anakin.

$ - Dollar sign
$ means you're searching the start with that letter/ number. It is crucial to put $ at
the end of the searching sub-string.

Example: This query should be executed in the bakery database. It finds all
customers having first name ending in 'n'.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'n$';

Result is: Kevin, Don and Anakin.

* - Asterisk
* represents for zero or more occurrences of the preceding element.

Example: This query should be executed in the bakery database. It searches for
string contains Obi anywhere and that Obi must end with 0, 1, or more characters
after it. It is case-insensitive, meaning condition Obi.* retrieves the same result as
condition obi.* .

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'Obi.*';

+ - Plus
+ represents for one or more occurrences of the preceding element.

Example:

1/ This query should be executed in the bakery database. It finds whose first names

63 / 96
MySQL - Learn Data Analytics Together's Group

start with 'Obi' followed by one or more characters of any type. It is case-insensitive,
meaning condition Obi.+ retrieves same result as condition obi.+ .

It returns nothing because the only customer having 'Obi' in their name is explicitly
named 'Obi', which does not entail any additional characters. To match this query,
customers must have a first name like 'Obin' or 'Obina'.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'Obi.+';

2/ This query should be executed in the bakery database. It finds customers having
their first name start with 'K' and end with 'n', with any characters in between. It is
case-insensitive, meaning condition K.+n retrieves same result as condition k.+n .

Result is Kevin & Anakin.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'K.+n';

Noting that omit any of . or + in the above query leading to different results

Using 'K.+n' → Result = Kevin, Anakin


This matches any first name that starts with 'K', followed by one or more
characters of any type, and ends with 'n'. This pattern will match names like
"Kevin" and "Anakin", where there are one or more characters between 'K' and 'n'.

Using 'K.n' → Result = Anakin


This matches any first name that starts with 'K', followed by exactly one character
of any type, and ends with 'n'. This pattern will only match "Anakin" because it has
exactly one character between 'K' and 'n'.

Using 'K+n' -> Result = nothing


This matches any first name that starts with 'K', followed by one or more
preceding character which is 'k' in this case, and ends with 'n'. It returns nothing
because there is no person whose name like 'Kkevin' unless this is a typing error.

64 / 96
MySQL - Learn Data Analytics Together's Group

? - Question mark
? limits how many you search for.

Example:
1/ This query should be executed in the bakery database. It finds customers whose
first names start with 'Obi' and are followed by either zero or one occurrence of any
character.

It returns customer named 'Obi' (unlike the same query with '+').

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'Obi.?';

2/ This query should be executed in the bakery database. It finds customers whose
first name k , followed by zero, one or more characters of any type and ends with n .

It returns customer named 'Anakin'.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'K.?n';

{} - Curly Braces
{} specifies how many previous characters are.

Example: {3} means 3 characters of .

1/ This query should be executed in the bakery database. It finds customers whose
first name k , followed by exactly 3 characters of any type and ends with n .

It returns customer named 'Kevin'.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'K.{3}n';

The above query can be rewritten as below, retrieving same result.

65 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'K...n';

| - Pipe (Vertical bar)


| represents for either or.

Example: This query should be executed in the bakery database. It finds customers
whose first name starting with kev or fro .

It returns customers named 'Kevin' and 'Frodo'.

SQL

SELECT *
FROM customers
WHERE first_name REGEXP 'kev|fro';

66 / 96
MySQL - Learn Data Analytics Together's Group

Data Cleaning
What is Data Cleaning?

1. Data Cleaning is the process of identifying and correcting or removing invalid,


incorrect, or incomplete data from a dataset

2. There are several techniques including removing duplicated data, normalization,


standardization, populating values, and more

Rule of thumb in Data Cleaning


Below is a rule of thumb before removing any duplicates in your dataset:

Before making any changes that might remove duplicates from your original
database, always create a copy. Retrieving the original data later can be
difficult, so working on a copy ensures you have a backup.
Double-check what you're deleting before proceeding. Use SELECT
statements to preview the data that will be affected by your deletion operation.
This helps avoid accidental data loss.

67 / 96
MySQL - Learn Data Analytics Together's Group

Removing Duplicates
SQL

DELETE FROM <TableName>


WHERE <ColumnName> IN <Condition>

Example: This query uses sweepstakes table. It identifies duplicated sweepstake_id


entries for each customer in the sweepstakes table. It then deletes all those entries
except for the first occurrence (which has row_num = 1 ) for each customer.

SQL

DELETE FROM sweepstakes


WHERE sweepstake_id IN (
SELECT sweepstake_id
FROM (
SELECT
sweepstake_id,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER
BY customer_id) AS row_num
FROM bakery.sweepstakes) AS table_row
WHERE row_num > 1
);

Standardize Data
SQL

UPDATE <TableName>
SET <ColumnName> = <Condition>;

Example:

1/ This query uses sweepstakes table. It updates the phone column in the
sweepstake table.

SQL

UPDATE sweepstake
SET phone = REGEXP_REPLACE(phone, '[()-/+]', '')
WHERE phone <> '';

Before updating, you can double check the above query by this query:

68 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
*,
REGEXP_REPLACE(phone, '[()-/+]', '')
FROM sweepstake
WHERE phone <> '';

2/ This query uses sweepstakes table. It sets phone number at a standardize format.

SQL

UPDATE sweepstake
SET phone = CONCAT(SUBSTRING(phone,1,3),'-
',SUBSTRING(phone,4,3),'-',SUBSTRING(phone,7,4))
WHERE phone <> '';

Before updating, you can double check the above query by this query:

SQL

SELECT
*
,phone = CONCAT(SUBSTRING(phone,1,3),'-
',SUBSTRING(phone,4,3),'-',SUBSTRING(phone,7,4))
FROM sweepstake
WHERE phone <> '';

3/ This query uses sweepstakes table. It sets birthdate at a standardize format.

SQL

UPDATE sweepstake
SET birth_date =
CONCAT(SUBSTRING(birth_date,9,2),'/',SUBSTRING(birth_date,6,2),'/'
,SUBSTRING(
birth_date,1,4))
WHERE sweepstake_id IN (9,11);

UPDATE sweepstake
SET birth_date = STR_TO_DATE(birth_date, '%m/%d/%Y');

Before updating, you can double check the above query by this query:

69 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SELECT
birth_date
, IF(STR_TO_DATE(birth_date, '%m/%d/%Y') IS NOT NULL
, STR_TO_DATE(birth_date, '%m/%d/%Y')
, STR_TO_DATE(birth_date, '%Y/%d/%m'))
FROM sweepstake;

Breaking Column into multiple columns


This query uses sweepstakes table. Tip: We use SUBSTRING_INDEX inside a
SUBSTRING_INDEX to break a string into substrings.

SQL

SELECT
address
,SUBSTRING_INDEX(address,',',1) Address
,SUBSTRING_INDEX(SUBSTRING_INDEX(address,',',2),',',-1) City
,SUBSTRING_INDEX(address,',',-1) State
FROM sweepstake;

Then actually break a column into multiple columns

AFTER address means the new column will be put after the address column
Because values in the state column does not 100% UPPER, thus we need to do it
for standardization

70 / 96
MySQL - Learn Data Analytics Together's Group

SQL

-- Alter table sweepstake to add in 3 more columns.


ALTER TABLE sweepstake
ADD COLUMN street VARCHAR(50) AFTER address;
,ADD COLUMN city VARCHAR(50) AFTER street;
,ADD COLUMN state VARCHAR(50) AFTER city;

-- Update values in the new columns


UPDATE sweepstake
SET street = SUBSTRING_INDEX(address,',',1);

UPDATE sweepstake
SET city = SUBSTRING_INDEX(SUBSTRING_INDEX(address,',',2),',',-1);

UPDATE sweepstake
SET state = SUBSTRING_INDEX(address,',',-1);

-- Standardize values in the new created columns


UPDATE sweepstake
SET state = UPPER(state);

UPDATE sweepstake
SET city = TRIM(city);

UPDATE sweepstake
SET state = TRIM(state);

Working with NULL values


When working with blank or NULL values you have to ask yourself:

Should we keep records where do not have values? Are those extremely
important?
If the answer is No, we can delete those records
Can we populate this data? Can we derived this field based off another field or
another table?
Can we populate data with an average amount?

Example: This query uses sweepstakes table. It turns blank values in phone column
and income column into NULL values.

71 / 96
MySQL - Learn Data Analytics Together's Group

SQL

UPDATE sweepstake
SET phone = NULL
WHERE phone = '';

UPDATE sweepstake
SET income = NULL
WHERE income = '';

Deleting unused columns


Using This query:

SQL

ALTER TABLE <TableName>


DROP COLUMN <ColumnName>;

72 / 96
MySQL - Learn Data Analytics Together's Group

CTEs & Temp Table


CTES
What are CTEs?
1. CTE stands for Common Table Expression (it is basically a name of temporary
result set)
2. CTE is not stored as an object or in memory and last only during the execution
of a query
3. Similar to a Subquery, but a CTE can be self-referencing (a recursive CTE) or
can be referenced multiple times in the same query
4. CTEs are often used for be better readability and recursion
5. CTEs, or Common Table Expressions, are temporary result sets that can be
referenced within a SELECT, INSERT, UPDATE, or DELETE statement.

Performance Considerations

CTEs can improve readability and maintainability of queries but may not always
provide performance benefits over subqueries or temporary tables.

Using a CTEs
CTE stands for Common Table Expression. CTE only lasts immediately after a CTE
( WITH AS ), a CTE will disappear if you separate it with a semi-colon.

Example: We can rename columns for a CTE as below:

SQL

WITH
CTE_Example(product_id,sum_order_total,sum_tip,count_tip) AS
(
SELECT product_id, SUM(order_total), SUM(tip), COUNT(tip)
FROM bakery. customer_orders
WHERE tip != 0
GROUP BY product_id
)

SELECT product_id, ROUND(sum_order_total/count_tip,2)


FROM CTE_Example;

73 / 96
MySQL - Learn Data Analytics Together's Group

Example: Or we can create 2 CTEs and JOIN them together

SQL

WITH
CTE_Example1 AS
(
SELECT product_id, SUM(order_total), SUM(tip), COUNT(tip)
FROM bakery. customer_orders
WHERE tip != 0
GROUP BY product_id
)

,CTE_Example2 AS
(
SELECT product_id, SUM(order_total), SUM(tip), COUNT(tip)
FROM bakery. customer_orders
GROUP BY product_id
)

SELECT *
FROM CTE_Example CTE1
RIGHT JOIN CTE_Example CTE2
ON CTE1.product_id = CTE2.product_id;

74 / 96
MySQL - Learn Data Analytics Together's Group

Recursion in CTEs
Recursion is what really set CTEs from other things like sub-query. Recursion means
it can reference itself.

During a process of recursion, the initial CTE that we build is going to repeatedly
execute and return subsets of the data until the complete result sets is returned.

Recursive CTEs start with WITH RECURSIVE keyword to inform SQL that this is a
recursive CTE and it acts differently from the typical CTEs. When you use a recursive
CTE, you have to use UNION ALL keyword.

75 / 96
MySQL - Learn Data Analytics Together's Group

Example 1: This recursive CTE retrieves incremental x values, starting from 1 and
ending at 10.

Recursive CTEs are powerful tools, but they require careful design to avoid infinite
loops. Without condition ( x < 10 ), this Recursive CTE will go forever and crash your
MySQL.

SQL

WITH RECURSIVE
CTE_Example AS
(
SELECT 1 AS x

UNION ALL

SELECT x + 1
FROM CTE_Example
WHERE x < 10
)

SELECT x
FROM CTE_Example

To elaborate the above, please find the below illustration for your ease of reference:

1/ SELECT 1 AS x is an anchor member and be executed only once. This query


generates the 1st row of our table.

x
1

2/ The below recursive member then runs multiple times until the whole
recursion is completed.

SQL

SELECT x + 1
FROM CTE_Example
WHERE x < 10

2.1/ A reference within the CTE is made, x is currently defined as 1 . Given the WHERE
condition is met: x < 10 (1 < 10) , the recursive member is executed.

Running the recursive member 1st time, x is now defined as 2 (former x = 1 → x + 1


= 1 + 1 → x = 2), the 2nd row of our table is generated:

76 / 96
MySQL - Learn Data Analytics Together's Group

UNION ALL now acts as a glue to append these two rows as a whole:

x
1
2

2.2/ A reference within the CTE is made, x is currently defined as 2 . Given the
WHERE condition is met: x < 10 (2 < 10) , the recursive member is executed.

Running the recursive member 2nd time, x is now defined as 3 (former x = 2 → x + 1


= 2 + 1 → x = 3), the 3rd row of our table is generated:

UNION ALL now acts as a glue to append these rows as a whole:

x
1
2
3

...

2.9/ A reference within the CTE is made, x is currently defined as 9 . Given the
WHERE condition is met: x < 10 (9 < 10) , the recursive member is executed.

Running the recursive member 9th time, x is now defined as 10 (former x = 9 → x +


1 = 9 + 1 → x = 10), the 10th row of our table is generated:

10

UNION ALL now acts as a glue to append these rows as a whole:

x
1
2
3
4
5
6
7

77 / 96
MySQL - Learn Data Analytics Together's Group

x
8
9
10

3/ Termination condition is met, the whole recursion is completed.

A reference within the CTE is made, x is currently defined as 10 . Given the WHERE
condition is NOT met: x = 10 , the recursive member is not executed.

Thus, the final result is:

x
1
2
3
4
5
6
7
8
9
10

78 / 96
MySQL - Learn Data Analytics Together's Group

Example 2: A common use of Recursive CTE is retrieving a list of employees along


with their manager's name and potentially additional levels up the hierarchy.

SQL

WITH RECURSIVE

CompanyHierarchy AS
(
SELECT
employee_id
,first_name
,last_name
,boss_id
,0 as HierarchyLevel
FROM Employees
WHERE boss_id IS NULL

UNION ALL

SELECT
E.employee_id
,E.first_name
,E.last_name
,E.boss_id
,HierarchyLevel + 1
FROM Employees E, CompanyHierarchy CH
WHERE E.boss_id = CH.employee_id
)

SELECT
CH.first_name EmployeeFirstName
,CH.last_name EmployeeLastName
,E.first_name BossFirstName
,E.last_name BossLastName
,HierarchyLevel
FROM CompanyHierarchy CH
JOIN Employees E
ON E.employee_id = CH.boss_id

79 / 96
MySQL - Learn Data Analytics Together's Group

Temp Tables
What are temp tables?
Temporary tables are tables that only visible in that session when you create them.
They don't let forever like a regular table, but they can be used over and over
repeatedly. They are often used to store intermediary results for complex queries.

Temporary tables are used to manipulate data before inserting it into a permanent
table.

1. Temporary Tables are special types of tables that let you store a temporary
result set in memory
2. You can reuse this temp table multiple times in a single session 222
3. Great for storing complex queries that you want to reuse 223
4. Temp tables are great for reducing the complexity of queries 224, storing
intermediate result sets, and improving performance 225

There are 2 ways to create a temp table:

(1) Actually creating a temporary table, almost like creating a real table and then
you insert data into that table.

SQL

CREATE TEMPORARY TABLE temp_table


(first_name varchar(50),
last_name varchar(50),
favorite_movie varchar(100)
);

INSERT INTO temp_table


VALUES ('Alex', 'Freberg', 'Lord of the Rings: The Twin Towers');

(2) Creating a temporary table but then selecting data from an existing table and
putting it into that temporary table.

SQL

CREATE TEMPORARY TABLE temp_table_2


SELECT *
FROM employees
WHERE salary > 50000;

80 / 96
MySQL - Learn Data Analytics Together's Group

Comparison between CTEs and Temp Tables


Both temporary tables and Common Table Expressions (CTEs) are used to store
and manipulate intermediate results in SQL, but they have different characteristics
and use cases. Here's a comparison:

Temporary Tables

Scope: Temporary tables are available for the duration of the session in which
they are created. They can be used across multiple queries and statements
within that session.
Performance: Temporary tables can be indexed, which can improve
performance for large datasets. They also allow for statistics, which can help
the query optimizer make better decisions.
Usage: Temporary tables are useful when you need to refer to the same
intermediate result set multiple times in different queries or procedures, or when
you're working with a very large amount of data.

Common Table Expressions (CTEs):

Scope: CTEs are available only for the duration of the single query in which they
are defined. They cannot be used across multiple queries.
Performance: CTEs cannot be indexed and do not have statistics. For large
datasets, this can lead to slower performance compared to temporary tables.
Usage: CTEs are useful for breaking down complex queries into simpler, more
readable parts. They also support recursion, which can be used to handle
hierarchical data.

Which is better?

The choice between temporary tables and CTEs depends on the specific
requirements of your task. If you're dealing with large amounts of data, need to use
the result set in multiple queries, or need to improve performance with indexing, a
temporary table might be the better choice. If you're looking to improve query
readability, simplify complex queries, or handle hierarchical data with recursion, a
CTE might be more suitable. Remember, these are not mutually exclusive. In many
real-world scenarios, you might find yourself using both temporary tables and CTEs
to accomplish different tasks.

81 / 96
MySQL - Learn Data Analytics Together's Group

Stored Procedures
Stored Procedures Introduction
What are Stored Procedures?
1. Stored Procedures are a way to save SQL code that can be reused over and
over again.
2. You use a Stored Procedure by "Calling" it which executes the code saved within
the Stored Procedure.
3. Great for storing complex SQL code, simplifying repetitive code, and enhanced
performance.

Parameters in Stored Procedures


1. Parameters are variables that are passed as an input or output to a stored
procedure.
2. They allow the stored procedure to accept input values from the calling
program, to perform certain actions based on those input values
3. IN Parameters are used to pass input values to the stored procedure
4. OUT Parameters are used to return output values from the stored procedure

Input Parameters in Stored Procedures

Input parameters in stored procedures are used to pass data into the procedure
when it is called. This allows the procedure to perform operations using the
provided data, making the procedure more flexible and reusable.
Each input parameter has a specific data type, such as INT, VARCHAR, or DATE.
The data type defines what kind of data the parameter can accept. When calling
the procedure, you must provide data that matches the data type of each
parameter.
Input parameters are part of the procedure's signature (i.e., its name and
parameter list). When calling the procedure, you must provide arguments that
match the number, order, and data types of the input parameters.

A query creates a stored procedure with a condition specified by the input


parameter INParameterName is as following:

82 / 96
MySQL - Learn Data Analytics Together's Group

SQL

DELIMITER $$

CREATE PROCEDURE <ProcedureName>(


IN INParameterName DataType
)
BEGIN
SELECT *
FROM <TableName>
WHERE <Condition> = INParameterName;
END $$

DELIMITER ;

Output Parameters in Stored Procedures

Output parameters in stored procedures are used to return data from the
procedure back to the caller. This allows the procedure to communicate results
back to the caller, such as the status of the procedure or any computed values.
Similar to input parameters, each output parameter has a specific data type,
such as INT, VARCHAR, or DATE. The data type defines what kind of data the
parameter can return.
When calling a stored procedure with output parameters, you must provide
variables to hold the output values. After the procedure is called, these variables
will contain the data that was output by the procedure.

A stored procedure has two parameters (IN, OUT) can be created as below:

SQL

DELIMITER $$

CREATE PROCEDURE <ProcedureName>(


IN INParameterName DataType,
OUT OUTParameterName DataType
)
BEGIN
SELECT *
FROM <TableName>
WHERE <Condition> = INParameterName;
END $$

DELIMITER ;

83 / 96
MySQL - Learn Data Analytics Together's Group

Session and Local Variables


Session Variables only last in your session, almost like a temporary table and a CTE

A Temp Table lasts as long as you're in that session, when you exit out, it is
deleted from the memory, whereas
A CTE can only be used in that moment

So that a local variable is much like a CTE because it's only be used in that moment.

Session Variables

Session variables are available throughout the current user session. This means
they can be accessed from anywhere within the session, including different
procedures, functions, and blocks of code.
The lifetime of a session variable extends for the duration of the user session.
The variable is destroyed when the session ends.
Session variables are useful when you need to maintain state or share data
across different parts of your application during a user session.

Local Variables

Local variables are only available within the block of code, function, or
procedure in which they are declared. They cannot be accessed outside of this
scope.
The lifetime of a local variable is limited to the execution of the block of code,
function, or procedure where it is declared. The variable is destroyed once the
execution of that block of code, function, or procedure completes.
Local variables are useful when you need to store temporary data within a
specific block of code, function, or procedure, and you don't want that data to
be accessible or interfere with other parts of your application.

In general, the choice between session and local variables depends on the specific
requirements of your task. If you need to maintain state across different parts of your
application during a user session, use session variables. If you need temporary
storage within a specific block of code, function, or procedure, use local variables.

The SET statement is used to assign values to local variables, but it is not the
standard way to assign values to output parameters in MySQL stored procedures.

84 / 96
MySQL - Learn Data Analytics Together's Group

Create Stored Procedure


Basic syntax to create a Stored Procedure:

SQL

DELIMITER $$

CREATE PROCEDURE <ProcedureName>()


BEGIN
SELECT *
FROM <TableName>
END $$

DELIMITER ;

After saving the stored procedure, you can invoke it by using the CALL statement:

CALL <ProcedureName>();

Modify Stored Procedure


SQL

ALTER PROCEDURE <ProcedureName> ...

Show Stored Procedure


To show all stored procedures in the current MySQL server:

SQL

SHOW PROCEDURE STATUS;

To show all stored procedures in a specific database:

SQL

SHOW PROCEDURE STATUS WHERE db = <SchemaName>;

To show all stored procedures matching a specific pattern:

85 / 96
MySQL - Learn Data Analytics Together's Group

SQL

SHOW PROCEDURE STATUS LIKE '%pattern%'

Drop Stored Procedure


SQL

DROP PROCEDURE [IF EXISTS] <ProcedureName>;

86 / 96
MySQL - Learn Data Analytics Together's Group

Triggers and Events


A trigger is triggered in a specific event while an event is scheduled to be executed
in a specific time.

Triggers
Triggers are a named database object that is associated with a table and
automatically executes in response to certain events on that table
Trigger can be INSERT, UPDATE or DELETE Operation
Triggers can be used for automation, auditing, replication, data integrity, and
more

Triggers are primarily used for maintaining the integrity of the information in the
database. They enable the database to automatically perform checks or changes to
data in response to certain changes in the database, such as inserts, updates, or
deletes. Here are some common uses:

Data Validation: Triggers can be used to validate data before it's inserted or
updated in the database. For example, you could use a trigger to check that a
new entry in a 'date_of_birth' field is a valid date and is in the past.
Automating Calculations: Triggers can be used to automatically calculate values.
For example, you could use a trigger to automatically update a 'total_price' field
whenever a new item is added to an 'order' table.
Auditing: Triggers can be used to automatically log changes to data. For
example, you could use a trigger to insert a record into an 'audit_log' table
whenever a record in an 'employees' table is updated.

Create Trigger
SQL

CREATE TRIGGER
trigger_name
trigger_time
trigger_event ON table_name
FOR EACH ROW trigger_body;

Show Trigger
To show all triggers in the current MySQL server:

SQL

SHOW TRIGGERS;

87 / 96
MySQL - Learn Data Analytics Together's Group

To show all trigger in a specific database:

SQL

SHOW TRIGGERS FROM/ IN <DatabaseName>;

To show a specific trigger:

SQL

SHOW CREATE TRIGGER <TriggerName>;

Drop Trigger
SQL

DROP TRIGGER [IF EXISTS] <TriggerName>;

Events
Events are pretty similar to triggers in that they automatically do something.

An event is a scheduled task that runs at a specific time or interval.


Events can be used to perform a variety of tasks, such as generating reports,
updating data, daily maintenance, or sending notifications, emails to somebody

Events are primarily used for scheduling tasks that need to be performed at specific
times or intervals. Here are some common uses:

Data Maintenance: Events can be used to perform regular maintenance tasks on


the database. For example, you could use an event to delete old records from a
'logs' table every night at midnight.
Data Aggregation: Events can be used to regularly calculate and store
aggregated data. For example, you could use an event to calculate daily sales
totals at the end of each day and store them in a 'daily_sales' table.
Scheduling Notifications: Events can be used to schedule tasks that need to
happen at a specific time. For example, you could use an event to send a
reminder email to users who have items in their shopping cart but haven't
completed their purchase after a certain amount of time.

88 / 96
MySQL - Learn Data Analytics Together's Group

Create Event
SQL

CREATE EVENT [IF NOT EXIST] <EventName>


ON SCHEDULE <Schedule>
DO
<EventBody>

Show Event
To show all events in the current MySQL server:

SQL

SHOW EVENTS;

To show all events in a specific database:

SQL

SHOW EVENTS FROM <DatabaseName>;

To show all events matching a specific pattern:

SQL

SHOW EVENTS LIKE 'event%';

Alter Events
SQL

ALTER EVENT [IF EXISTS] <EventName>


ON SCHEDULE <Schedule>
DO
<EventBody>

Example: This query alter an existing event to automatically deletes old customer
orders from the bakery.customer_orders table every 30 seconds

89 / 96
MySQL - Learn Data Analytics Together's Group

SQL

DELIMITER $$
ALTER EVENT delete_old_customer_orders
ON SCHEDULE EVERY 30 SECOND
DO
BEGIN
DELETE
FROM bakery.customer_orders
WHERE order_date < NOW() - INTERVAL 5 YEAR;
END $$

DELMITER ;

Drop Events
SQL

DROP EVENT [IF EXISTS] <EventName>;

90 / 96
MySQL - Learn Data Analytics Together's Group

Indexes
Indexes in MySQL are data structures (in the back-end that you cannot see)
that are used to improve the speed of queries on database tables.
MySQL can use the index to quickly find the rows that match the query criteria,
rather than scanning the entire table.
Indexes can significantly improve the performance of queries on large tables.

Create an Index
SQL

CREATE INDEX <IndexName>


ON <TableName> (<ColumnName>);

Prefix Index
A prefix index is a special type of index used to improve query performance when
searching for data based on the initial characters of a string column.

A prefix index is created on a string column (Example, names, descriptions,


email addresses).
It stores only the specified prefix length (a certain number of starting
characters) from each value in the column.
When a query searches for values based on a pattern that matches the prefix,
the index can quickly locate relevant rows without having to scan the entire
table.

91 / 96
MySQL - Learn Data Analytics Together's Group

Composite Index
Composite Index is when you create an Index based off multiple columns at the same
time.

When creating a Composite Index, it's the rule of thumb to set the column with
higher cardinality 1st (high cardinality means less distinct values). Higher cardinality
can helps Index generate more smaller chunks

Show Index
SQL

SHOW INDEXES FROM/ IN <TableName>;

Drop Index
SQL

DROP INDEX <IndexName>


ON <TableName>;

Best practices for using indexes in MySQL:


Understand Your Queries: Indexes should be designed around the queries that your
application uses most frequently. Understand the WHERE clauses, JOIN conditions,
and ORDER BY clauses in your queries, and design your indexes to support them.

Avoid Over-Indexing: While indexes can speed up read operations, they slow down
write operations (INSERT, UPDATE, DELETE) because the index also needs to be
updated. Therefore, avoid creating unnecessary indexes.

Use Composite Indexes Judiciously: Composite indexes (indexes on multiple


columns) can be very useful, but they should be used judiciously. The order of
columns in a composite index is important, and MySQL can only use a composite
index if the query conditions start with the leftmost column of the index.

Use Appropriate Data Types: The smaller your data types, the smaller your index,
and the faster your queries will run. So, use the smallest data type that can correctly
store your data.

Consider Using Partial Indexes: If you have a large text column, consider indexing
only the first few characters. This is often sufficient for finding the rows you need,
and it can significantly reduce the size of the index.

Regularly Monitor and Optimize Your Indexes: MySQL provides several tools for
monitoring index usage and performance, such as the EXPLAIN statement and the
Performance Schema. Regularly check your index usage and optimize your indexes

92 / 96
MySQL - Learn Data Analytics Together's Group

as necessary. Remove Unused Indexes: Unused indexes take up space and slow
down write operations, so if you have indexes that are no longer used, you should
remove them.

Primary Key Selection: Choose a primary key that is compact and contains the most
frequently queried column. InnoDB uses clustered indexes where the data is stored
in the leaf nodes of the primary key index, making primary key lookups very fast.

Remember, the effectiveness of an index depends on the nature of your data, the
distribution of values, and your application's query patterns. Always test your
assumptions about index performance with real data and queries.

93 / 96
MySQL - Learn Data Analytics Together's Group

Disclaimer

I compiled the information you'll find here, drawing on my experience from a


fantastic MySQL challenge provided by Eric in Learn Data Analytics Together's
Group and the insightful teachings of Alex Freberg. Special thanks to Eric & Tho
Cao for their proofreading assistance.

Compiler: XuHi
Proofreaders: Eric and Tho Cao
All credit goes to Alex Freberg - our teacher.

Self-Study Data
Learn Data Analytics Together

94 / 96
MySQL - Learn Data Analytics Together's Group

Footnote
1

95 / 96
MySQL - Learn Data Analytics Together's Group

96 / 96

You might also like