{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "view-in-github"
},
"source": [
"<a href=\"https://colab.research.google.com/github/mrdbourke/pytorch-deep-
learning/blob/main/00_pytorch_fundamentals.ipynb\" target=\"_parent\"><img
src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In
Colab\"/></a> \n",
"\n",
"[View Source
Code](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/
00_pytorch_fundamentals.ipynb) | [View
Slides](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/slides/
00_pytorch_and_deep_learning_fundamentals.pdf) | [Watch Video Walkthrough]
(https://youtu.be/Z_ikDlimN6A?t=76) "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jSNK7duj5SeU"
},
"source": [
"# 00. PyTorch Fundamentals\n",
"\n",
"## What is PyTorch?\n",
"\n",
"[PyTorch](https://pytorch.org/) is an open source machine learning and deep
learning framework.\n",
"\n",
"## What can PyTorch be used for?\n",
"\n",
"PyTorch allows you to manipulate and process data and write machine learning
algorithms using Python code.\n",
"\n",
"## Who uses PyTorch?\n",
"\n",
"Many of the world's largest technology companies such as [Meta (Facebook)]
(https://ai.facebook.com/blog/pytorch-builds-the-future-of-ai-and-machine-learning-
at-facebook/), Tesla and Microsoft as well as artificial intelligence research
companies such as [OpenAI use PyTorch](https://openai.com/blog/openai-pytorch/) to
power research and bring machine learning to their products.\n",
"\n",
"\n",
"\n",
"For example, Andrej Karpathy (head of AI at Tesla) has given several talks
([PyTorch DevCon 2019](https://youtu.be/oBklltKXtDE), [Tesla AI Day
2021](https://youtu.be/j0z4FweCy4M?t=2904)) about how Tesla uses PyTorch to power
their self-driving computer vision models.\n",
"\n",
"PyTorch is also used in other industries such as agriculture to [power
computer vision on tractors](https://medium.com/pytorch/ai-for-ag-production-
machine-learning-for-agriculture-e8cfdb9849a1).\n",
"\n",
"## Why use PyTorch?\n",
"\n",
"Machine learning researchers love using PyTorch. And as of February 2022,
PyTorch is the [most used deep learning framework on Papers With
Code](https://paperswithcode.com/trends), a website for tracking machine learning
research papers and the code repositories attached with them.\n",
"\n",
"PyTorch also helps take care of many things such as GPU acceleration (making
your code run faster) behind the scenes. \n",
"\n",
"So you can focus on manipulating data and writing algorithms and PyTorch will
make sure it runs fast.\n",
"\n",
"And if companies such as Tesla and Meta (Facebook) use it to build models they
deploy to power hundreds of applications, drive thousands of cars and deliver
content to billions of people, it's clearly capable on the development front too.\
n",
"\n",
"## What we're going to cover in this module\n",
"\n",
"This course is broken down into different sections (notebooks). \n",
"\n",
"Each notebook covers important ideas and concepts within PyTorch.\n",
"\n",
"Subsequent notebooks build upon knowledge from the previous one (numbering
starts at 00, 01, 02 and goes to whatever it ends up going to).\n",
"\n",
"This notebook deals with the basic building block of machine learning and deep
learning, the tensor.\n",
"\n",
"Specifically, we're going to cover:\n",
"\n",
"| **Topic** | **Contents** |\n",
"| ----- | ----- |\n",
"| **Introduction to tensors** | Tensors are the basic building block of all of
machine learning and deep learning. |\n",
"| **Creating tensors** | Tensors can represent almost any kind of data
(images, words, tables of numbers). |\n",
"| **Getting information from tensors** | If you can put information into a
tensor, you'll want to get it out too. |\n",
"| **Manipulating tensors** | Machine learning algorithms (like neural
networks) involve manipulating tensors in many different ways such as adding,
multiplying, combining. | \n",
"| **Dealing with tensor shapes** | One of the most common issues in machine
learning is dealing with shape mismatches (trying to mix wrong shaped tensors with
other tensors). |\n",
"| **Indexing on tensors** | If you've indexed on a Python list or NumPy array,
it's very similar with tensors, except they can have far more dimensions. |\n",
"| **Mixing PyTorch tensors and NumPy** | PyTorch plays with tensors
([`torch.Tensor`](https://pytorch.org/docs/stable/tensors.html)), NumPy likes
arrays
([`np.ndarray`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
)) sometimes you'll want to mix and match these. | \n",
"| **Reproducibility** | Machine learning is very experimental and since it
uses a lot of *randomness* to work, sometimes you'll want that *randomness* to not
be so random. |\n",
"| **Running tensors on GPU** | GPUs (Graphics Processing Units) make your code
faster, PyTorch makes it easy to run your code on GPUs. |\n",
"\n",
"## Where can you get help?\n",
"\n",
"All of the materials for this course [live on
GitHub](https://github.com/mrdbourke/pytorch-deep-learning).\n",
"\n",
"And if you run into trouble, you can ask a question on the [Discussions page]
(https://github.com/mrdbourke/pytorch-deep-learning/discussions) there too.\n",
"\n",
"There's also the [PyTorch developer forums](https://discuss.pytorch.org/), a
very helpful place for all things PyTorch. "
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5v3iRCRUTGeu"
},
"source": [
"## Importing PyTorch\n",
"\n",
"> **Note:** Before running any of the code in this notebook, you should have
gone through the [PyTorch setup steps](https://pytorch.org/get-started/locally/). \
n",
">\n",
"> However, **if you're running on Google Colab**, everything should work
(Google Colab comes with PyTorch and other libraries installed).\n",
"\n",
"Let's start by importing PyTorch and checking the version we're using."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "1VxEOik46Y4i",
"outputId": "f3141076-29bc-4600-c1c3-1586b1fe2292"
},
"outputs": [
{
"data": {
"text/plain": [
"'1.13.1+cu116'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import torch\n",
"torch.__version__"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_SqvI4S9TGew"
},
"source": [
"Wonderful, it looks like we've got PyTorch 1.10.0+. \n",
"\n",
"This means if you're going through these materials, you'll see most
compatability with PyTorch 1.10.0+, however if your version number is far higher
than that, you might notice some inconsistencies. \n",
"\n",
"And if you do have any issues, please post on the course [GitHub Discussions
page](https://github.com/mrdbourke/pytorch-deep-learning/discussions)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "i-33BKR16iWc"
},
"source": [
"## Introduction to tensors \n",
"\n",
"Now we've got PyTorch imported, it's time to learn about tensors.\n",
"\n",
"Tensors are the fundamental building block of machine learning.\n",
"\n",
"Their job is to represent data in a numerical way.\n",
"\n",
"For example, you could represent an image as a tensor with shape `[3, 224,
224]` which would mean `[colour_channels, height, width]`, as in the image has `3`
colour channels (red, green, blue), a height of `224` pixels and a width of `224`
pixels.\n",
"\n",
"\n",
"\n",
"In tensor-speak (the language used to describe tensors), the tensor would have
three dimensions, one for `colour_channels`, `height` and `width`.\n",
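    "\n",
    "For example, here's a quick sketch of creating a tensor in that shape (the values are random rather than a real image):\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "# A random [colour_channels, height, width] tensor standing in for an image\n",
    "image = torch.rand(3, 224, 224)\n",
    "image.shape, image.ndim # (torch.Size([3, 224, 224]), 3)\n",
    "```\n",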
"\n",
"But we're getting ahead of ourselves.\n",
"\n",
"Let's learn more about tensors by coding them.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gFF0N2TU7S7Q"
},
"source": [
"### Creating tensors \n",
"\n",
"PyTorch loves tensors. So much so there's a whole documentation page dedicated
to the [`torch.Tensor`](https://pytorch.org/docs/stable/tensors.html) class.\n",
"\n",
"Your first piece of homework is to [read through the documentation on
`torch.Tensor`](https://pytorch.org/docs/stable/tensors.html) for 10-minutes. But
you can get to that later.\n",
"\n",
"Let's code.\n",
"\n",
"The first thing we're going to create is a **scalar**.\n",
"\n",
"A scalar is a single number and in tensor-speak it's a zero dimension tensor.\
n",
"\n",
"> **Note:** That's a trend for this course. We'll focus on writing specific
code. But often I'll set exercises which involve reading and getting familiar with
the PyTorch documentation. Because after all, once you're finished this course,
you'll no doubt want to learn more. And the documentation is somewhere you'll be
finding yourself quite often."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "YUDgG2zk7Us5",
"outputId": "0ac22bd2-16bc-4307-f312-31ae89d6c375"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor(7)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Scalar\n",
"scalar = torch.tensor(7)\n",
"scalar"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JqSuhW7rTGey"
},
"source": [
"See how the above printed out `tensor(7)`?\n",
"\n",
"That means although `scalar` is a single number, it's of type `torch.Tensor`.\
n",
"\n",
"We can check the dimensions of a tensor using the `ndim` attribute."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "lV98Yz868bav",
"outputId": "502a625e-ff3c-4fc4-b523-f7634ea82128"
},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"scalar.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ZO2YW_QGTGez"
},
"source": [
"What if we wanted to retrieve the number from the tensor?\n",
"\n",
"As in, turn it from `torch.Tensor` to a Python integer?\n",
"\n",
"To do we can use the `item()` method."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-k4cyKumPfbE",
"outputId": "1f6a7916-0c7c-403f-8ebd-875454a94470"
},
"outputs": [
{
"data": {
"text/plain": [
"7"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the Python number within a tensor (only works with one-element tensors)\
n",
"scalar.item()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qYs7ulrATGe0"
},
"source": [
"Okay, now let's see a **vector**.\n",
"\n",
"A vector is a single dimension tensor but can contain many numbers.\n",
"\n",
"As in, you could have a vector `[3, 2]` to describe `[bedrooms, bathrooms]` in
your house. Or you could have `[3, 2, 2]` to describe `[bedrooms, bathrooms,
car_parks]` in your house.\n",
"\n",
"The important trend here is that a vector is flexible in what it can represent
(the same with tensors)."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-IZF6ASs8QH9",
"outputId": "e556ed2a-e58a-440f-b103-0f06c91bc75c"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([7, 7])"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Vector\n",
"vector = torch.tensor([7, 7])\n",
"vector"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mXxRUUW2TGe1"
},
"source": [
"Wonderful, `vector` now contains two 7's, my favourite number.\n",
"\n",
"How many dimensions do you think it'll have?"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "03hm3VVv8kr4",
"outputId": "2035bb26-0189-4b28-fa02-34220d44677f"
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check the number of dimensions of vector\n",
"vector.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "W0VYvSGbTGe1"
},
"source": [
"Hmm, that's strange, `vector` contains two numbers but only has a single
dimension.\n",
"\n",
"I'll let you in on a trick.\n",
"\n",
"You can tell the number of dimensions a tensor in PyTorch has by the number of
square brackets on the outside (`[`) and you only need to count one side.\n",
"\n",
"How many square brackets does `vector` have?\n",
"\n",
"Another important concept for tensors is their `shape` attribute. The shape
tells you how the elements inside them are arranged.\n",
"\n",
"Let's check out the shape of `vector`."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6zREV1bDTGe2",
"outputId": "2a6e7ceb-7eb2-422b-b006-2c6e4825272f"
},
"outputs": [
{
"data": {
"text/plain": [
"torch.Size([2])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check shape of vector\n",
"vector.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9aWKppNyTGe2"
},
"source": [
"The above returns `torch.Size([2])` which means our vector has a shape of
`[2]`. This is because of the two elements we placed inside the square brackets
(`[7, 7]`).\n",
"\n",
"Let's now see a **matrix**."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "D5iNwCYL8QO9",
"outputId": "88fc63a7-4130-4c7a-a574-c61e85d2e99e"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 7, 8],\n",
" [ 9, 10]])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Matrix\n",
"MATRIX = torch.tensor([[7, 8], \n",
" [9, 10]])\n",
"MATRIX"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "a3U1bCdjTGe3"
},
"source": [
"Wow! More numbers! Matrices are as flexible as vectors, except they've got an
extra dimension.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8LREUbeb8r8j",
"outputId": "636246b0-b109-472a-c6d5-8601a9e08654"
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check number of dimensions\n",
"MATRIX.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LhXXgq-dTGe3"
},
"source": [
"`MATRIX` has two dimensions (did you count the number of square brackets on
the outside of one side?).\n",
"\n",
"What `shape` do you think it will have?"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "_TL26I31TGe3",
"outputId": "f05ec0b6-0bc1-4381-9474-56cbe6c67139"
},
"outputs": [
{
"data": {
"text/plain": [
"torch.Size([2, 2])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"MATRIX.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dvLpUvrKTGe4"
},
"source": [
"We get the output `torch.Size([2, 2])` because `MATRIX` is two elements deep
and two elements wide.\n",
"\n",
"How about we create a **tensor**?"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wEMDQr188QWW",
"outputId": "4230e6bd-1844-4210-eea8-245bb8b8b265"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[1, 2, 3],\n",
" [3, 6, 9],\n",
" [2, 4, 5]]])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Tensor\n",
"TENSOR = torch.tensor([[[1, 2, 3],\n",
" [3, 6, 9],\n",
" [2, 4, 5]]])\n",
"TENSOR"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UmJKkXD7TGe4"
},
"source": [
"Woah! What a nice looking tensor.\n",
"\n",
"I want to stress that tensors can represent almost anything. \n",
"\n",
"The one we just created could be the sales numbers for a steak and almond
butter store (two of my favourite foods).\n",
"\n",
"\n",
"\n",
"How many dimensions do you think it has? (hint: use the square bracket
counting trick)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8dhuEsjS8QcT",
"outputId": "7a45df1b-fc32-4cc5-e330-527c6ef7ba5d"
},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check number of dimensions for TENSOR\n",
"TENSOR.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ln9dys5VTGe4"
},
"source": [
"And what about its shape?"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "hdVv4iNRTGe5",
"outputId": "d8ac706c-020b-4926-b145-d44e41f35e90"
},
"outputs": [
{
"data": {
"text/plain": [
"torch.Size([1, 3, 3])"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check shape of TENSOR\n",
"TENSOR.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zxk8GU7oTGe5"
},
"source": [
"Alright, it outputs `torch.Size([1, 3, 3])`.\n",
"\n",
"The dimensions go outer to inner.\n",
"\n",
"That means there's 1 dimension of 3 by 3.\n",
"\n",
"\n",
"\n",
"> **Note:** You might've noticed me using lowercase letters for `scalar` and
`vector` and uppercase letters for `MATRIX` and `TENSOR`. This was on purpose. In
practice, you'll often see scalars and vectors denoted as lowercase letters such as
`y` or `a`. And matrices and tensors denoted as uppercase letters such as `X` or
`W`.\n",
">\n",
"> You also might notice the names martrix and tensor used interchangably. This
is common. Since in PyTorch you're often dealing with `torch.Tensor`s (hence the
tensor name), however, the shape and dimensions of what's inside will dictate what
it actually is.\n",
"\n",
"Let's summarise.\n",
"\n",
"| Name | What is it? | Number of dimensions | Lower or upper (usually/example)
|\n",
"| ----- | ----- | ----- | ----- |\n",
"| **scalar** | a single number | 0 | Lower (`a`) | \n",
"| **vector** | a number with direction (e.g. wind speed with direction) but
can also have many other numbers | 1 | Lower (`y`) |\n",
"| **matrix** | a 2-dimensional array of numbers | 2 | Upper (`Q`) |\n",
"| **tensor** | an n-dimensional array of numbers | can be any number, a 0-
dimension tensor is a scalar, a 1-dimension tensor is a vector | Upper (`X`) | \n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dms7G4nkTGe5"
},
"source": [
"### Random tensors\n",
"\n",
"We've established tensors represent some form of data.\n",
"\n",
"And machine learning models such as neural networks manipulate and seek
patterns within tensors.\n",
"\n",
"But when building machine learning models with PyTorch, it's rare you'll
create tensors by hand (like what we've been doing).\n",
"\n",
"Instead, a machine learning model often starts out with large random tensors
of numbers and adjusts these random numbers as it works through data to better
represent it.\n",
"\n",
"In essence:\n",
"\n",
"`Start with random numbers -> look at data -> update random numbers -> look at
data -> update random numbers...`\n",
"\n",
"As a data scientist, you can define how the machine learning model starts
(initialization), looks at data (representation) and updates (optimization) its
random numbers.\n",
"\n",
"We'll get hands on with these steps later on.\n",
"\n",
"For now, let's see how to create a tensor of random numbers.\n",
"\n",
"We can do so using
[`torch.rand()`](https://pytorch.org/docs/stable/generated/torch.rand.html) and
passing in the `size` parameter."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "EOJEtDx--GnK",
"outputId": "2680d44b-e31c-4ab1-d5b1-c0cd76706a0d"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[0.6541, 0.4807, 0.2162, 0.6168],\n",
" [0.4428, 0.6608, 0.6194, 0.8620],\n",
" [0.2795, 0.6055, 0.4958, 0.5483]]),\n",
" torch.float32)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a random tensor of size (3, 4)\n",
"random_tensor = torch.rand(size=(3, 4))\n",
"random_tensor, random_tensor.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-wB1c_cXTGe5"
},
"source": [
"The flexibility of `torch.rand()` is that we can adjust the `size` to be
whatever we want.\n",
"\n",
"For example, say you wanted a random tensor in the common image shape of
`[224, 224, 3]` (`[height, width, color_channels`])."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xMF_NUp3Ym__",
"outputId": "8346b853-0b1e-481a-d9ee-a410ee21bab0"
},
"outputs": [
{
"data": {
"text/plain": [
"(torch.Size([224, 224, 3]), 3)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a random tensor of size (224, 224, 3)\n",
"random_image_size_tensor = torch.rand(size=(224, 224, 3))\n",
"random_image_size_tensor.shape, random_image_size_tensor.ndim"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0MQNTY0eTGe6"
},
"source": [
"### Zeros and ones\n",
"\n",
"Sometimes you'll just want to fill tensors with zeros or ones.\n",
"\n",
"This happens a lot with masking (like masking some of the values in one tensor
with zeros to let a model know not to learn them).\n",
"\n",
"Let's create a tensor full of zeros with
[`torch.zeros()`](https://pytorch.org/docs/stable/generated/torch.zeros.html)\n",
"\n",
"Again, the `size` parameter comes into play."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "oCzhd0hl9Vp6",
"outputId": "9c8ec87f-d8c9-4751-a13e-6a5e986daaa9"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[0., 0., 0., 0.],\n",
" [0., 0., 0., 0.],\n",
" [0., 0., 0., 0.]]),\n",
" torch.float32)"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor of all zeros\n",
"zeros = torch.zeros(size=(3, 4))\n",
"zeros, zeros.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "WDQBZJRUZWTN"
},
"source": [
"We can do the same to create a tensor of all ones except using [`torch.ones()`
](https://pytorch.org/docs/stable/generated/torch.ones.html) instead."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "HRe6sSXiTGe6",
"outputId": "3f45b0b8-7f65-423d-c664-f5b5f7866fd2"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[1., 1., 1., 1.],\n",
" [1., 1., 1., 1.],\n",
" [1., 1., 1., 1.]]),\n",
" torch.float32)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor of all ones\n",
"ones = torch.ones(size=(3, 4))\n",
"ones, ones.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hib1NYrSarL2"
},
"source": [
"### Creating a range and tensors like\n",
"\n",
"Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.\n",
"\n",
"You can use `torch.arange(start, end, step)` to do so.\n",
"\n",
"Where:\n",
"* `start` = start of range (e.g. 0)\n",
"* `end` = end of range (e.g. 10)\n",
"* `step` = how many steps in between each value (e.g. 1)\n",
"\n",
"> **Note:** In Python, you can use `range()` to create a range. However in
PyTorch, `torch.range()` is deprecated and may show an error in the future."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "1IqUs81d9W4W",
"outputId": "2a6f0c08-052e-4b36-b4eb-6a537239026f"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_3695928/193451495.py:2: UserWarning: torch.range is
deprecated and will be removed in a future release because its behavior is
inconsistent with Python's range builtin. Instead, use torch.arange, which produces
values in [start, end).\n",
" zero_to_ten_deprecated = torch.range(0, 10) # Note: this may return an
error in the future\n"
]
},
{
"data": {
"text/plain": [
"tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Use torch.arange(), torch.range() is deprecated \n",
"zero_to_ten_deprecated = torch.range(0, 10) # Note: this may return an error
in the future\n",
"\n",
"# Create a range of values 0 to 10\n",
"zero_to_ten = torch.arange(start=0, end=10, step=1)\n",
"zero_to_ten"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "i-bXf0Ugbh-D"
},
"source": [
"Sometimes you might want one tensor of a certain type with the same shape as
another tensor.\n",
"\n",
"For example, a tensor of all zeros with the same shape as a previous tensor. \
n",
"\n",
"To do so you can use
[`torch.zeros_like(input)`](https://pytorch.org/docs/stable/generated/
torch.zeros_like.html) or
[`torch.ones_like(input)`](https://pytorch.org/docs/1.9.1/generated/
torch.ones_like.html) which return a tensor filled with zeros or ones in the same
shape as the `input` respectively."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ZvXwUut5BhHq",
"outputId": "096b2f8e-8c21-4ace-97b9-c36b92b2fe77"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Can also create a tensor of zeros similar to another tensor\n",
"ten_zeros = torch.zeros_like(input=zero_to_ten) # will have same shape\n",
"ten_zeros"
]
},
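   {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, `torch.ones_like(input)` returns a tensor of ones in the same shape. A quick sketch (reusing `zero_to_ten` from above):\n",
    "\n",
    "```python\n",
    "# Tensor of ones with the same shape as zero_to_ten\n",
    "ten_ones = torch.ones_like(input=zero_to_ten)\n",
    "ten_ones # tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])\n",
    "```"
   ]
   },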
{
"cell_type": "markdown",
"metadata": {
"id": "huKZ6QlYTGe7"
},
"source": [
"### Tensor datatypes\n",
"\n",
"There are many different [tensor datatypes available in
PyTorch](https://pytorch.org/docs/stable/tensors.html#data-types).\n",
"\n",
"Some are specific for CPU and some are better for GPU.\n",
"\n",
"Getting to know which one can take some time.\n",
"\n",
"Generally if you see `torch.cuda` anywhere, the tensor is being used for GPU
(since Nvidia GPUs use a computing toolkit called CUDA).\n",
"\n",
"The most common type (and generally the default) is `torch.float32` or
`torch.float`.\n",
"\n",
"This is referred to as \"32-bit floating point\".\n",
"\n",
"But there's also 16-bit floating point (`torch.float16` or `torch.half`) and
64-bit floating point (`torch.float64` or `torch.double`).\n",
"\n",
"And to confuse things even more there's also 8-bit, 16-bit, 32-bit and 64-bit
integers.\n",
"\n",
"Plus more!\n",
"\n",
"> **Note:** An integer is a flat round number like `7` whereas a float has a
decimal `7.0`.\n",
"\n",
"The reason for all of these is to do with **precision in computing**.\n",
"\n",
"Precision is the amount of detail used to describe a number.\n",
"\n",
"The higher the precision value (8, 16, 32), the more detail and hence data
used to express a number.\n",
"\n",
"This matters in deep learning and numerical computing because you're making so
many operations, the more detail you have to calculate on, the more compute you
have to use.\n",
"\n",
"So lower precision datatypes are generally faster to compute on but sacrifice
some performance on evaluation metrics like accuracy (faster to compute but less
accurate).\n",
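    "\n",
    "One way to see the \"more data\" part directly is `element_size()`, which reports the bytes used per element (a quick sketch):\n",
    "\n",
    "```python\n",
    "# Bytes per element for a few datatypes\n",
    "torch.tensor([7.0], dtype=torch.float32).element_size() # 4\n",
    "torch.tensor([7.0], dtype=torch.float16).element_size() # 2\n",
    "torch.tensor([7], dtype=torch.int8).element_size()      # 1\n",
    "```\n",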
"\n",
"> **Resources:** \n",
" * See the [PyTorch documentation for a list of all available tensor
datatypes](https://pytorch.org/docs/stable/tensors.html#data-types).\n",
" * Read the [Wikipedia page for an overview of what precision in computing]
(https://en.wikipedia.org/wiki/Precision_(computer_science)) is.\n",
"\n",
"Let's see how to create some tensors with specific datatypes. We can do so
using the `dtype` parameter."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "q3MoGnpw9XaF",
"outputId": "61070939-8c52-4ac6-bed7-e64b3ce24615"
},
"outputs": [
{
"data": {
"text/plain": [
"(torch.Size([3]), torch.float32, device(type='cpu'))"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Default datatype for tensors is float32\n",
"float_32_tensor = torch.tensor([3.0, 6.0, 9.0],\n",
" dtype=None, # defaults to None, which is
torch.float32 or whatever datatype is passed\n",
" device=None, # defaults to None, which uses the
default tensor type\n",
" requires_grad=False) # if True, operations
performed on the tensor are recorded \n",
"\n",
"float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MhP8kzDfe_ty"
},
"source": [
"Aside from shape issues (tensor shapes don't match up), two of the other most
common issues you'll come across in PyTorch are datatype and device issues.\n",
"\n",
"For example, one of tensors is `torch.float32` and the other is
`torch.float16` (PyTorch often likes tensors to be the same format).\n",
"\n",
"Or one of your tensors is on the CPU and the other is on the GPU (PyTorch
likes calculations between tensors to be on the same device).\n",
"\n",
"We'll see more of this device talk later on.\n",
"\n",
"For now let's create a tensor with `dtype=torch.float16`."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "PKSuajld_09s",
"outputId": "cbac29d9-3371-4fe1-b47c-3af4623b5fbf"
},
"outputs": [
{
"data": {
"text/plain": [
"torch.float16"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"float_16_tensor = torch.tensor([3.0, 6.0, 9.0],\n",
" dtype=torch.float16) # torch.half would also
work\n",
"\n",
"float_16_tensor.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gUjkB2AX7Upz"
},
"source": [
"## Getting information from tensors\n",
"\n",
"Once you've created tensors (or someone else or a PyTorch module has created
them for you), you might want to get some information from them.\n",
"\n",
"We've seen these before but three of the most common attributes you'll want to
find out about tensors are:\n",
"* `shape` - what shape is the tensor? (some operations require specific shape
rules)\n",
"* `dtype` - what datatype are the elements within the tensor stored in?\n",
"* `device` - what device is the tensor stored on? (usually GPU or CPU)\n",
"\n",
"Let's create a random tensor and find out details about it."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "hd_X4D0j7Umq",
"outputId": "86045713-ab36-4c8e-840c-e788f80c5266"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[0.4688, 0.0055, 0.8551, 0.0646],\n",
" [0.6538, 0.5157, 0.4071, 0.2109],\n",
" [0.9960, 0.3061, 0.9369, 0.7008]])\n",
"Shape of tensor: torch.Size([3, 4])\n",
"Datatype of tensor: torch.float32\n",
"Device tensor is stored on: cpu\n"
]
}
],
"source": [
"# Create a tensor\n",
"some_tensor = torch.rand(3, 4)\n",
"\n",
"# Find out details about it\n",
"print(some_tensor)\n",
"print(f\"Shape of tensor: {some_tensor.shape}\")\n",
"print(f\"Datatype of tensor: {some_tensor.dtype}\")\n",
"print(f\"Device tensor is stored on: {some_tensor.device}\") # will default to
CPU"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "45K-E5uPg6cj"
},
"source": [
"> **Note:** When you run into issues in PyTorch, it's very often one to do
with one of the three attributes above. So when the error messages show up, sing
yourself a little song called \"what, what, where\": \n",
" * \"*what shape are my tensors? what datatype are they and where are they
stored? what shape, what datatype, where where where*\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BdiWvoAi7UjL"
},
"source": [
"## Manipulating tensors (tensor operations)\n",
"\n",
"In deep learning, data (images, text, video, audio, protein structures, etc)
gets represented as tensors.\n",
"\n",
"A model learns by investigating those tensors and performing a series of
operations (could be 1,000,000s+) on tensors to create a representation of the
patterns in the input data.\n",
"\n",
"These operations are often a wonderful dance between:\n",
"* Addition\n",
"* Substraction\n",
"* Multiplication (element-wise)\n",
"* Division\n",
"* Matrix multiplication\n",
"\n",
"And that's it. Sure there are a few more here and there but these are the
basic building blocks of neural networks.\n",
"\n",
"Stacking these building blocks in the right way, you can create the most
sophisticated of neural networks (just like lego!)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sk_6Dd7L7Uce"
},
"source": [
"### Basic operations\n",
"\n",
"Let's start with a few of the fundamental operations, addition (`+`),
subtraction (`-`), mutliplication (`*`).\n",
"\n",
"They work just as you think they would."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "X71WpQoPD7a4",
"outputId": "ab30f13e-fc67-4ae4-c5ce-1006410dba07"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([11, 12, 13])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor of values and add a number to it\n",
"tensor = torch.tensor([1, 2, 3])\n",
"tensor + 10"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Sp4TlTWWEFeO",
"outputId": "ce7d2296-881f-4eb3-802e-fd12bc25d6ea"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([10, 20, 30])"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Multiply it by 10\n",
"tensor * 10"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-1VEHnuRkn8Q"
},
"source": [
"Notice how the tensor values above didn't end up being `tensor([110, 120,
130])`, this is because the values inside the tensor don't change unless they're
reassigned."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XuB1UjCIEJIA",
"outputId": "57cae862-c145-4681-d74b-fe6d77f2125a"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([1, 2, 3])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Tensors don't change unless reassigned\n",
"tensor"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VYvqGpUTk1o6"
},
"source": [
"Let's subtract a number and this time we'll reassign the `tensor` variable. "
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "U4iWKoLsENry",
"outputId": "14d6771d-eb57-4b11-88a7-b1bb308ddc6e"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([-9, -8, -7])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Subtract and reassign\n",
"tensor = tensor - 10\n",
"tensor"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "tFgZY-PaFNXa",
"outputId": "3536ea54-a056-444c-cd5d-6d438ddda965"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([1, 2, 3])"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Add and reassign\n",
"tensor = tensor + 10\n",
"tensor"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "CYXDoIOzk-6I"
},
"source": [
"PyTorch also has a bunch of built-in functions like
[`torch.mul()`](https://pytorch.org/docs/stable/generated/torch.mul.html#torch.mul)
(short for multiplication) and
[`torch.add()`](https://pytorch.org/docs/stable/generated/torch.add.html) to
perform basic operations. "
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "uVysdk3kFWbY",
"outputId": "3a5bf687-cf24-4224-9e76-975f84638ca8"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([10, 20, 30])"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Can also use torch functions\n",
"torch.multiply(tensor, 10)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "IxuPJIpNFbqO",
"outputId": "f04cafd9-eaea-4254-df1a-5ab3b524d74e"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([1, 2, 3])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Original tensor is still unchanged \n",
"tensor"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "70UNL33AlVQq"
},
"source": [
"However, it's more common to use the operator symbols like `*` instead of
`torch.mul()`"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "S5v3RkR0F2Jq",
"outputId": "0137caab-5ea1-4d95-f4c5-a0baa0fd652d"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([1, 2, 3]) * tensor([1, 2, 3])\n",
"Equals: tensor([1, 4, 9])\n"
]
}
],
"source": [
"# Element-wise multiplication (each element multiplies its equivalent, index
0->0, 1->1, 2->2)\n",
"print(tensor, \"*\", tensor)\n",
"print(\"Equals:\", tensor * tensor)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TT5fVuyu7q5z"
},
"source": [
"### Matrix multiplication (is all you need)\n",
"\n",
"One of the most common operations in machine learning and deep learning
algorithms (like neural networks) is [matrix
multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html).\n",
"\n",
"PyTorch implements matrix multiplication functionality in the
[`torch.matmul()`](https://pytorch.org/docs/stable/generated/torch.matmul.html)
method.\n",
"\n",
"The main two rules for matrix multiplication to remember are:\n",
"\n",
"1. The **inner dimensions** must match:\n",
" * `(3, 2) @ (3, 2)` won't work\n",
" * `(2, 3) @ (3, 2)` will work\n",
" * `(3, 2) @ (2, 3)` will work\n",
"2. The resulting matrix has the shape of the **outer dimensions**:\n",
" * `(2, 3) @ (3, 2)` -> `(2, 2)`\n",
" * `(3, 2) @ (2, 3)` -> `(3, 3)`\n",
"\n",
"> **Note:** \"`@`\" in Python is the symbol for matrix multiplication.\n",
"\n",
"> **Resource:** You can see all of the rules for matrix multiplication using
`torch.matmul()` [in the PyTorch
documentation](https://pytorch.org/docs/stable/generated/torch.matmul.html).\n",
"\n",
"Let's create a tensor and perform element-wise multiplication and matrix
multiplication on it.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ZE7loucmDlEM",
"outputId": "44032bf9-c1f7-42fc-c842-dbe7a5c1221a"
},
"outputs": [
{
"data": {
"text/plain": [
"torch.Size([3])"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import torch\n",
"tensor = torch.tensor([1, 2, 3])\n",
"tensor.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VUAZ3_b0vOKv"
},
"source": [
"The difference between element-wise multiplication and matrix multiplication
is the addition of values.\n",
"\n",
"For our `tensor` variable with values `[1, 2, 3]`:\n",
"\n",
"| Operation | Calculation | Code |\n",
"| ----- | ----- | ----- |\n",
"| **Element-wise multiplication** | `[1*1, 2*2, 3*3]` = `[1, 4, 9]` | `tensor
* tensor` |\n",
"| **Matrix multiplication** | `[1*1 + 2*2 + 3*3]` = `[14]` |
`tensor.matmul(tensor)` |\n"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "i42gkUeHvI_1",
"outputId": "18a630ce-bb56-4c40-81b4-9fdbb2ed7a4f"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([1, 4, 9])"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Element-wise matrix multiplication\n",
"tensor * tensor"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "PvCBiiTTDk8y",
"outputId": "cf623247-8f1b-49f1-e788-16da3ed1e59c"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor(14)"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Matrix multiplication\n",
"torch.matmul(tensor, tensor)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "m4E_pROBDk2r",
"outputId": "a09af00f-277b-479e-b0a2-ad6311ee5413"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor(14)"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Can also use the \"@\" symbol for matrix multiplication, though not
recommended\n",
"tensor @ tensor"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "obbginUMv43A"
},
"source": [
"You can do matrix multiplication by hand but it's not recommended.\n",
"\n",
"The in-built `torch.matmul()` method is faster."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6qMSaLOoJscL",
"outputId": "8bcad8a2-c900-4966-e13c-ff2cc02b9207"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 773 µs, sys: 0 ns, total: 773 µs\n",
"Wall time: 499 µs\n"
]
},
{
"data": {
"text/plain": [
"tensor(14)"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# Matrix multiplication by hand \n",
"# (avoid doing operations with for loops at all cost, they are computationally
expensive)\n",
"value = 0\n",
"for i in range(len(tensor)):\n",
" value += tensor[i] * tensor[i]\n",
"value"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vVWiKB0KwH74",
"outputId": "fce58235-5c09-49ec-f34b-a90e5640281e"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 146 µs, sys: 83 µs, total: 229 µs\n",
"Wall time: 171 µs\n"
]
},
{
"data": {
"text/plain": [
"tensor(14)"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"torch.matmul(tensor, tensor)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "aJ4DDmo1TGe-"
},
"source": [
"## One of the most common errors in deep learning (shape errors)\n",
"\n",
"Because much of deep learning is multiplying and performing operations on
matrices and matrices have a strict rule about what shapes and sizes can be
combined, one of the most common errors you'll run into in deep learning is shape
mismatches."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "rN5RcoD4Jo6y",
"outputId": "20f6c65b-86f4-4903-d253-f6cbf0583934"
},
"outputs": [
{
"ename": "RuntimeError",
"evalue": "mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)",
"output_type": "error",
"traceback": [
"\
u001b[0;31m------------------------------------------------------------------------
---\u001b[0m",
"\u001b[0;31mRuntimeError\u001b[0m Traceback
(most recent call last)",
"\u001b[1;32m/home/daniel/code/pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb Cell 75\u001b[0m in \u001b[0;36m<cell line: 10>\
u001b[0;34m()\u001b[0m\n\u001b[1;32m <a href='vscode-notebook-cell://ssh-
remote%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/code/
pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=1'>2</a>\u001b[0m
tensor_A \u001b[39m=\u001b[39m torch\u001b[39m.\u001b[39mtensor([[\u001b[39m1\
u001b[39m, \u001b[39m2\u001b[39m],\n\u001b[1;32m <a href='vscode-notebook-
cell://ssh-remote%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/
code/pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=2'>3</a>\u001b[0m
[\u001b[39m3\u001b[39m, \u001b[39m4\u001b[39m],\n\u001b[1;32m <a href='vscode-
notebook-cell://ssh-remote%2B7b22686f73744e616d65223a22544954414e2d525458227d/
home/daniel/code/pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=3'>4</a>\u001b[0m
[\u001b[39m5\u001b[39m, \u001b[39m6\u001b[39m]], dtype\u001b[39m=\u001b[39mtorch\
u001b[39m.\u001b[39mfloat32)\n\u001b[1;32m <a
href='vscode-notebook-cell://ssh-remote
%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/code/pytorch/
pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=5'>6</a>\u001b[0m
tensor_B \u001b[39m=\u001b[39m torch\u001b[39m.\u001b[39mtensor([[\u001b[39m7\
u001b[39m, \u001b[39m10\u001b[39m],\n\u001b[1;32m <a href='vscode-notebook-
cell://ssh-remote%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/
code/pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=6'>7</a>\u001b[0m
[\u001b[39m8\u001b[39m, \u001b[39m11\u001b[39m], \n\u001b[1;32m <a
href='vscode-notebook-cell://ssh-remote
%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/code/pytorch/
pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=7'>8</a>\u001b[0m
[\u001b[39m9\u001b[39m, \u001b[39m12\u001b[39m]], dtype\u001b[39m=\u001b[39mtorch\
u001b[39m.\u001b[39mfloat32)\n\u001b[0;32m---> <a href='vscode-notebook-cell://ssh-
remote%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/code/
pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y134sdnNjb2RlLXJlbW90ZQ%3D%3D?line=9'>10</a>\u001b[0m
torch\u001b[39m.\u001b[39;49mmatmul(tensor_A, tensor_B)\n",
"\u001b[0;31mRuntimeError\u001b[0m: mat1 and mat2 shapes cannot be multiplied
(3x2 and 3x2)"
]
}
],
"source": [
"# Shapes need to be in the right way \n",
"tensor_A = torch.tensor([[1, 2],\n",
" [3, 4],\n",
" [5, 6]], dtype=torch.float32)\n",
"\n",
"tensor_B = torch.tensor([[7, 10],\n",
" [8, 11], \n",
" [9, 12]], dtype=torch.float32)\n",
"\n",
"torch.matmul(tensor_A, tensor_B) # (this will error)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HNA6MZEFxWVt"
},
"source": [
"We can make matrix multiplication work between `tensor_A` and `tensor_B` by
making their inner dimensions match.\n",
"\n",
"One of the ways to do this is with a **transpose** (switch the dimensions of a
given tensor).\n",
"\n",
"You can perform transposes in PyTorch using either:\n",
"* `torch.transpose(input, dim0, dim1)` - where `input` is the desired tensor
to transpose and `dim0` and `dim1` are the dimensions to be swapped.\n",
"* `tensor.T` - where `tensor` is the desired tensor to transpose.\n",
"\n",
"Let's try the latter."
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "lUqgaANiy1wq",
"outputId": "e48bbf0c-8008-434e-d372-caa658b2f36b"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[1., 2.],\n",
" [3., 4.],\n",
" [5., 6.]])\n",
"tensor([[ 7., 10.],\n",
" [ 8., 11.],\n",
" [ 9., 12.]])\n"
]
}
],
"source": [
"# View tensor_A and tensor_B\n",
"print(tensor_A)\n",
"print(tensor_B)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "DveqxO7iy_Fi",
"outputId": "1bd2e85b-ea4d-4948-c408-8eb46ef3534c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[1., 2.],\n",
" [3., 4.],\n",
" [5., 6.]])\n",
"tensor([[ 7., 8., 9.],\n",
" [10., 11., 12.]])\n"
]
}
],
"source": [
"# View tensor_A and tensor_B.T\n",
"print(tensor_A)\n",
"print(tensor_B.T)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "35rEIu-NKtVE",
"outputId": "0b32c7f1-556e-45d4-de22-388419e93dc2"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3,
2])\n",
"\n",
"New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T =
torch.Size([2, 3])\n",
"\n",
"Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions
match\n",
"\n",
"Output:\n",
"\n",
"tensor([[ 27., 30., 33.],\n",
" [ 61., 68., 75.],\n",
" [ 95., 106., 117.]])\n",
"\n",
"Output shape: torch.Size([3, 3])\n"
]
}
],
"source": [
"# The operation works when tensor_B is transposed\n",
"print(f\"Original shapes: tensor_A = {tensor_A.shape}, tensor_B =
{tensor_B.shape}\\n\")\n",
"print(f\"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T =
{tensor_B.T.shape}\\n\")\n",
"print(f\"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner
dimensions match\\n\")\n",
"print(\"Output:\\n\")\n",
"output = torch.matmul(tensor_A, tensor_B.T)\n",
"print(output) \n",
"print(f\"\\nOutput shape: {output.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MfcFEqfLjN24"
},
"source": [
"You can also use
[`torch.mm()`](https://pytorch.org/docs/stable/generated/torch.mm.html) which is a
short for `torch.matmul()`."
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "x3rJvW_TTGe_",
"outputId": "2c501972-20bf-4a83-ad4a-b5f1b2424097"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 27., 30., 33.],\n",
" [ 61., 68., 75.],\n",
" [ 95., 106., 117.]])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# torch.mm is a shortcut for matmul\n",
"torch.mm(tensor_A, tensor_B.T)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bXKozI4T0hFi"
},
"source": [
"Without the transpose, the rules of matrix multiplication aren't fulfilled and
we get an error like above.\n",
"\n",
"How about a visual? \n",
"\n",
"\n",
"\n",
"You can create your own matrix multiplication visuals like this at
http://matrixmultiplication.xyz/.\n",
"\n",
"> **Note:** A matrix multiplication like this is also referred to as the
[**dot product**](https://www.mathsisfun.com/algebra/vectors-dot-product.html) of
two matrices.\n",
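    "\n",
    "For two vectors, you can compute this directly with `torch.dot()` (equivalent to `torch.matmul()` for 1D tensors):\n",
    "\n",
    "```python\n",
    "# Dot product of two 1D tensors (same as torch.matmul for vectors)\n",
    "torch.dot(torch.tensor([1, 2, 3]), torch.tensor([1, 2, 3])) # tensor(14)\n",
    "```\n",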
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hA64Z4DmkB31"
},
"source": [
"Neural networks are full of matrix multiplications and dot products.\n",
"\n",
"The
[`torch.nn.Linear()`](https://pytorch.org/docs/1.9.1/generated/torch.nn.Linear.html
) module (we'll see this in action later on), also known as a feed-forward layer or
fully connected layer, implements a matrix multiplication between an input `x` and
a weights matrix `A`.\n",
"\n",
"$$\n",
"y = x\\cdot{A^T} + b\n",
"$$\n",
"\n",
"Where:\n",
"* `x` is the input to the layer (deep learning is a stack of layers like
`torch.nn.Linear()` and others on top of each other).\n",
"* `A` is the weights matrix created by the layer, this starts out as random
numbers that get adjusted as a neural network learns to better represent patterns
in the data (notice the \"`T`\", that's because the weights matrix gets
transposed).\n",
" * **Note:** You might also often see `W` or another letter like `X` used to
showcase the weights matrix.\n",
"* `b` is the bias term used to slightly offset the weights and inputs.\n",
"* `y` is the output (a manipulation of the input in the hopes to discover
patterns in it).\n",
"\n",
"This is a linear function (you may have seen something like $y = mx+b$ in high
school or elsewhere), and can be used to draw a straight line!\n",
"\n",
"Let's play around with a linear layer.\n",
"\n",
"Try changing the values of `in_features` and `out_features` below and see what
happens.\n",
"\n",
"Do you notice anything to do with the shapes?"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "mC_MjKW1LX7T",
"outputId": "768f75d2-c978-4df3-e18a-4684d46bdfa9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Input shape: torch.Size([3, 2])\n",
"\n",
"Output:\n",
"tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],\n",
" [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],\n",
" [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],\n",
" grad_fn=<AddmmBackward0>)\n",
"\n",
"Output shape: torch.Size([3, 6])\n"
]
}
],
"source": [
"# Since the linear layer starts with a random weights matrix, let's make it
reproducible (more on this later)\n",
"torch.manual_seed(42)\n",
"# This uses matrix multiplication\n",
"linear = torch.nn.Linear(in_features=2, # in_features = matches inner
dimension of input \n",
" out_features=6) # out_features = describes outer
value \n",
"x = tensor_A\n",
"output = linear(x)\n",
"print(f\"Input shape: {x.shape}\\n\")\n",
"print(f\"Output:\\n{output}\\n\\nOutput shape: {output.shape}\")"
]
},
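   {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Under the hood, the layer is doing the matrix multiplication from the formula above, where `linear.weight` is $A$ and `linear.bias` is $b$. A sketch to verify (run after the cell above):\n",
    "\n",
    "```python\n",
    "# Recreate the layer's output by hand: y = x @ A^T + b\n",
    "manual_output = x @ linear.weight.T + linear.bias\n",
    "torch.allclose(output, manual_output) # True\n",
    "```"
   ]
   },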
{
"cell_type": "markdown",
"metadata": {
"id": "zIGrP5j1pN7j"
},
"source": [
"> **Question:** What happens if you change `in_features` from 2 to 3 above?
Does it error? How could you change the shape of the input (`x`) to accommodate to
the error? Hint: what did we have to do to `tensor_B` above?"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EPNF0nMWoGEj"
},
"source": [
"If you've never done it before, matrix multiplication can be a confusing topic
at first.\n",
"\n",
"But after you've played around with it a few times and even cracked open a few
neural networks, you'll notice it's everywhere.\n",
"\n",
"Remember, matrix multiplication is all you need.\n",
"\n",
"\n",
"\n",
"*When you start digging into neural network layers and building your own,
you'll find matrix multiplications everywhere. **Source:**
https://marksaroufim.substack.com/p/working-class-deep-learner*"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pjMmrJOOPv5e"
},
"source": [
"### Finding the min, max, mean, sum, etc (aggregation)\n",
"\n",
"Now we've seen a few ways to manipulate tensors, let's run through a few ways
to aggregate them (go from more values to less values).\n",
"\n",
"First we'll create a tensor and then find the max, min, mean and sum of it.\
n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "jrFQbe5fP1Rk",
"outputId": "034013c1-b384-4a0d-edf8-295ed3a456f1"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor\n",
"x = torch.arange(0, 100, 10)\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-J-wfMdlsEco"
},
"source": [
"Now let's perform some aggregation."
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "e5wSP9YKP3Lb",
"outputId": "3aa238c7-646f-434f-a55c-292aabef7227"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Minimum: 0\n",
"Maximum: 90\n",
"Mean: 45.0\n",
"Sum: 450\n"
]
}
],
"source": [
"print(f\"Minimum: {x.min()}\")\n",
"print(f\"Maximum: {x.max()}\")\n",
"# print(f\"Mean: {x.mean()}\") # this will error\n",
"print(f\"Mean: {x.type(torch.float32).mean()}\") # won't work without float
datatype\n",
"print(f\"Sum: {x.sum()}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "JHoKpsg3sKQE"
},
"source": [
"> **Note:** You may find some methods such as `torch.mean()` require tensors
to be in `torch.float32` (the most common) or another specific datatype, otherwise
the operation will fail. \n",
"\n",
"You can also do the same as above with `torch` methods."
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "0Cr23Y9uP3HO",
"outputId": "9c86d805-eef2-465c-e2c8-2bccd515e6d5"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor(90), tensor(0), tensor(45.), tensor(450))"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "i7ApCaZjDkvp"
},
"source": [
"### Positional min/max\n",
"\n",
"You can also find the index of a tensor where the max or minimum occurs with
[`torch.argmax()`](https://pytorch.org/docs/stable/generated/torch.argmax.html) and
[`torch.argmin()`](https://pytorch.org/docs/stable/generated/torch.argmin.html)
respectively.\n",
"\n",
"This is helpful incase you just want the position where the highest (or
lowest) value is and not the actual value itself (we'll see this in a later section
when using the [softmax activation
function](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html))."
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FzNBl9JSGlHi",
"outputId": "01e0740e-c34f-469b-9c8f-9e6e5f0363af"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tensor: tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])\n",
"Index where max value occurs: 8\n",
"Index where min value occurs: 0\n"
]
}
],
"source": [
"# Create a tensor\n",
"tensor = torch.arange(10, 100, 10)\n",
"print(f\"Tensor: {tensor}\")\n",
"\n",
"# Returns index of max and min values\n",
"print(f\"Index where max value occurs: {tensor.argmax()}\")\n",
"print(f\"Index where min value occurs: {tensor.argmin()}\")"
]
},
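{
"cell_type": "markdown",
"metadata": {},
"source": [
"`torch.argmax()` and `torch.argmin()` also accept a `dim` parameter if you want positions along a particular dimension rather than across the flattened tensor. A quick sketch (the 2x3 tensor `t` is just for illustration):\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"# A 2x3 tensor just for illustration\n",
"t = torch.tensor([[1, 9, 2],\n",
"                  [8, 3, 7]])\n",
"\n",
"t.argmax() # index into the flattened tensor -> tensor(1)\n",
"t.argmax(dim=1) # index of the max within each row -> tensor([1, 0])\n",
"```"
]
},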
{
"cell_type": "markdown",
"metadata": {
"id": "QBu33WihOXBk"
},
"source": [
"### Change tensor datatype\n",
"\n",
"As mentioned, a common issue with deep learning operations is having your
tensors in different datatypes.\n",
"\n",
"If one tensor is in `torch.float64` and another is in `torch.float32`, you
might run into some errors.\n",
"\n",
"But there's a fix.\n",
"\n",
"You can change the datatypes of tensors using
[`torch.Tensor.type(dtype=None)`](https://pytorch.org/docs/stable/generated/
torch.Tensor.type.html) where the `dtype` parameter is the datatype you'd like to
use.\n",
"\n",
"First we'll create a tensor and check its datatype (the default is
`torch.float32`)."
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rY2FEsCAOaLu",
"outputId": "507f1ade-7c7a-4172-fa48-60c9ac4831c0"
},
"outputs": [
{
"data": {
"text/plain": [
"torch.float32"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor and check its datatype\n",
"tensor = torch.arange(10., 100., 10.)\n",
"tensor.dtype"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jR30FHEc92of"
},
"source": [
"Now we'll create another tensor the same as before but change its datatype to
`torch.float16`.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Cac8gRYjOeab",
"outputId": "96e5ce12-bc29-4a2b-f81c-bfc89ea2d075"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a float16 tensor\n",
"tensor_float16 = tensor.type(torch.float16)\n",
"tensor_float16"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ndVlKJZ4-7_5"
},
"source": [
"And we can do something similar to make a `torch.int8` tensor."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8Yqovld2Oj6s",
"outputId": "667da17f-e38f-404a-bd2d-63683e45c99a"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create an int8 tensor\n",
"tensor_int8 = tensor.type(torch.int8)\n",
"tensor_int8"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "44GxVabar-xe"
},
"source": [
 **Note:** Different datatypes">
"> **Note:** Different datatypes can be confusing to begin with. But think of it like this: the lower the number (e.g. 32, 16, 8), the less precisely a computer stores the value. Lower precision means less storage, which generally results in faster computation and a smaller overall model. Mobile-based neural networks often operate with 8-bit integers, which are smaller and faster to run but less accurate than their float32 counterparts. For more on this, I'd read up about [precision in computing](https://en.wikipedia.org/wiki/Precision_(computer_science)).\n",
"\n",
"> **Exercise:** So far we've covered a fair few tensor methods but there's a
bunch more in the [`torch.Tensor`
documentation](https://pytorch.org/docs/stable/tensors.html), I'd recommend
spending 10-minutes scrolling through and looking into any that catch your eye.
Click on them and then write them out in code yourself to see what happens."
]
},
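{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sketch of working across datatypes (the tensors below are just for illustration), one safe pattern is to convert tensors to a common datatype before combining them:\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"tensor_float16 = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float16)\n",
"tensor_int8 = torch.tensor([4, 5, 6], dtype=torch.int8)\n",
"\n",
"# Convert both to a common datatype before combining them\n",
"result = tensor_float16.type(torch.float32) * tensor_int8.type(torch.float32)\n",
"result # tensor([ 4., 10., 18.])\n",
"```"
]
},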
{
"cell_type": "markdown",
"metadata": {
"id": "7CkCtAYmGsHY"
},
"source": [
"### Reshaping, stacking, squeezing and unsqueezing\n",
"\n",
"Often times you'll want to reshape or change the dimensions of your tensors
without actually changing the values inside them.\n",
"\n",
"To do so, some popular methods are:\n",
"\n",
"| Method | One-line description |\n",
"| ----- | ----- |\n",
"| [`torch.reshape(input,
shape)`](https://pytorch.org/docs/stable/generated/torch.reshape.html#torch.reshape
) | Reshapes `input` to `shape` (if compatible), can also use
`torch.Tensor.reshape()`. |\n",
"|
[`Tensor.view(shape)`](https://pytorch.org/docs/stable/generated/torch.Tensor.view.
html) | Returns a view of the original tensor in a different `shape` but shares the
same data as the original tensor. |\n",
"| [`torch.stack(tensors,
dim=0)`](https://pytorch.org/docs/1.9.1/generated/torch.stack.html) | Concatenates
a sequence of `tensors` along a new dimension (`dim`), all `tensors` must be same
size. |\n",
"| [`torch.squeeze(input)`](https://pytorch.org/docs/stable/generated/
torch.squeeze.html) | Squeezes `input` to remove all the dimenions with value `1`.
|\n",
"| [`torch.unsqueeze(input,
dim)`](https://pytorch.org/docs/1.9.1/generated/torch.unsqueeze.html) | Returns
`input` with a dimension value of `1` added at `dim`. | \n",
"| [`torch.permute(input,
dims)`](https://pytorch.org/docs/stable/generated/torch.permute.html) | Returns a
*view* of the original `input` with its dimensions permuted (rearranged) to `dims`.
| \n",
"\n",
"Why do any of these?\n",
"\n",
"Because deep learning models (neural networks) are all about manipulating
tensors in some way. And because of the rules of matrix multiplication, if you've
got shape mismatches, you'll run into errors. These methods help you make sure the
right elements of your tensors are mixing with the right elements of other tensors.
\n",
"\n",
"Let's try them out.\n",
"\n",
"First, we'll create a tensor."
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "EYjRTLOzG4Ev",
"outputId": "f7f2719c-15ce-406b-dc8f-4477046cd5d9"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor\n",
"import torch\n",
"x = torch.arange(1., 8.)\n",
"x, x.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3_VarMO9CoT8"
},
"source": [
"Now let's add an extra dimension with `torch.reshape()`. "
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "US4WjpQ3SG-8",
"outputId": "c519d59e-85f1-4a10-eaaa-acb487028e3a"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Add an extra dimension\n",
"x_reshaped = x.reshape(1, 7)\n",
"x_reshaped, x_reshaped.shape"
]
},
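{
"cell_type": "markdown",
"metadata": {},
"source": [
"`torch.reshape()` can also infer one dimension for you if you pass `-1` for it. A small sketch using the same `x` as above (note the target shape still has to be compatible with the number of elements):\n",
"\n",
"```python\n",
"x.reshape(7, 1) # shape: torch.Size([7, 1])\n",
"x.reshape(-1, 7) # the -1 gets inferred as 1, same as x.reshape(1, 7)\n",
"# x.reshape(2, 4) # this would error: 7 elements can't fill shape (2, 4)\n",
"```"
]
},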
{
"cell_type": "markdown",
"metadata": {
"id": "tig5xm0jCxuU"
},
"source": [
"We can also change the view with `torch.view()`."
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "WDN2BNe5TGfB",
"outputId": "3df1b0d6-2548-4ecc-ca25-0c4e28a6e536"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Change view (keeps same data as original but changes view)\n",
"# See more: https://stackoverflow.com/a/54507446/7900723\n",
"z = x.view(1, 7)\n",
"z, z.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m8joAaUEC2NX"
},
"source": [
"Remember though, changing the view of a tensor with `torch.view()` really only
creates a new view of the *same* tensor.\n",
"\n",
"So changing the view changes the original tensor too. "
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "2DxURVvXTGfC",
"outputId": "668d194d-dd0a-4db1-da00-9c3fd8849186"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6.,
7.]))"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Changing z changes x\n",
"z[:, 0] = 5\n",
"z, x"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YxnqDBlpDDJ_"
},
"source": [
"If we wanted to stack our new tensor on top of itself five times, we could do
so with `torch.stack()`."
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "pX5Adf3ORiTK",
"outputId": "703e8568-61df-4ebd-f4d3-a6366dc265c0"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[5., 2., 3., 4., 5., 6., 7.],\n",
" [5., 2., 3., 4., 5., 6., 7.],\n",
" [5., 2., 3., 4., 5., 6., 7.],\n",
" [5., 2., 3., 4., 5., 6., 7.]])"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Stack tensors on top of each other\n",
"x_stacked = torch.stack([x, x, x, x], dim=0) # try changing dim to dim=1 and
see what happens\n",
"x_stacked"
]
},
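{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, [`torch.cat()`](https://pytorch.org/docs/stable/generated/torch.cat.html) joins tensors along an *existing* dimension rather than creating a new one. A quick sketch using the same `x` as above:\n",
"\n",
"```python\n",
"# torch.stack adds a new dimension, torch.cat joins along an existing one\n",
"torch.stack([x, x], dim=0).shape # torch.Size([2, 7])\n",
"torch.cat([x, x], dim=0).shape # torch.Size([14])\n",
"```"
]
},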
{
"cell_type": "markdown",
"metadata": {
"id": "ET56QzNHDuOI"
},
"source": [
"How about removing all single dimensions from a tensor?\n",
"\n",
"To do so you can use `torch.squeeze()` (I remember this as *squeezing* the
tensor to only have dimensions over 1)."
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "w2Y2HEoDRxJZ",
"outputId": "dd0645a6-1cdd-46bc-a3a2-433d9cd09336"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Previous tensor: tensor([[5., 2., 3., 4., 5., 6., 7.]])\n",
"Previous shape: torch.Size([1, 7])\n",
"\n",
"New tensor: tensor([5., 2., 3., 4., 5., 6., 7.])\n",
"New shape: torch.Size([7])\n"
]
}
],
"source": [
"print(f\"Previous tensor: {x_reshaped}\")\n",
"print(f\"Previous shape: {x_reshaped.shape}\")\n",
"\n",
"# Remove extra dimension from x_reshaped\n",
"x_squeezed = x_reshaped.squeeze()\n",
"print(f\"\\nNew tensor: {x_squeezed}\")\n",
"print(f\"New shape: {x_squeezed.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "acjDLk8WD8NC"
},
"source": [
"And to do the reverse of `torch.squeeze()` you can use `torch.unsqueeze()` to
add a dimension value of 1 at a specific index."
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "CUC-DEEwSYv7",
"outputId": "da60e019-3ea6-42f8-8e47-ba037ead737f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Previous tensor: tensor([5., 2., 3., 4., 5., 6., 7.])\n",
"Previous shape: torch.Size([7])\n",
"\n",
"New tensor: tensor([[5., 2., 3., 4., 5., 6., 7.]])\n",
"New shape: torch.Size([1, 7])\n"
]
}
],
"source": [
"print(f\"Previous tensor: {x_squeezed}\")\n",
"print(f\"Previous shape: {x_squeezed.shape}\")\n",
"\n",
"## Add an extra dimension with unsqueeze\n",
"x_unsqueezed = x_squeezed.unsqueeze(dim=0)\n",
"print(f\"\\nNew tensor: {x_unsqueezed}\")\n",
"print(f\"New shape: {x_unsqueezed.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "R9DuJzXgFbM5"
},
"source": [
"You can also rearrange the order of axes values with `torch.permute(input,
dims)`, where the `input` gets turned into a *view* with new `dims`."
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "fCRGCX8DTGfC",
"outputId": "6853328b-a1cf-4470-f366-106a231a189c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Previous shape: torch.Size([224, 224, 3])\n",
"New shape: torch.Size([3, 224, 224])\n"
]
}
],
"source": [
"# Create tensor with specific shape\n",
"x_original = torch.rand(size=(224, 224, 3))\n",
"\n",
"# Permute the original tensor to rearrange the axis order\n",
"x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0\n",
"\n",
"print(f\"Previous shape: {x_original.shape}\")\n",
"print(f\"New shape: {x_permuted.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "06LKaFemGBoE"
},
"source": [
"> **Note**: Because permuting returns a *view* (shares the same data as the
original), the values in the permuted tensor will be the same as the original
tensor and if you change the values in the view, it will change the values of the
original."
]
},
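{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small sketch to see this in action, using the `x_original` and `x_permuted` tensors from above:\n",
"\n",
"```python\n",
"# Changing a value in the original changes the permuted view too\n",
"x_original[0, 0, 0] = 728218\n",
"x_original[0, 0, 0], x_permuted[0, 0, 0] # both now show tensor(728218.)\n",
"```"
]
},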
{
"cell_type": "markdown",
"metadata": {
"id": "nEPqVL7fTGfC"
},
"source": [
"## Indexing (selecting data from tensors)\n",
"\n",
"Sometimes you'll want to select specific data from tensors (for example, only
the first column or second row).\n",
"\n",
"To do so, you can use indexing.\n",
"\n",
"If you've ever done indexing on Python lists or NumPy arrays, indexing in
PyTorch with tensors is very similar."
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "oSXzdxCQTGfD",
"outputId": "05a72c08-5f8c-433a-cd31-46065686f825"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([[[1, 2, 3],\n",
" [4, 5, 6],\n",
" [7, 8, 9]]]),\n",
" torch.Size([1, 3, 3]))"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a tensor \n",
"import torch\n",
"x = torch.arange(1, 10).reshape(1, 3, 3)\n",
"x, x.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xQG5krnKG43B"
},
"source": [
"Indexing values goes outer dimension -> inner dimension (check out the square
brackets)."
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zv_Z3IAzTGfD",
"outputId": "cf6c0936-7600-4af4-9b6f-f6b8ac9b4c05"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"First square bracket:\n",
"tensor([[1, 2, 3],\n",
" [4, 5, 6],\n",
" [7, 8, 9]])\n",
"Second square bracket: tensor([1, 2, 3])\n",
"Third square bracket: 1\n"
]
}
],
"source": [
"# Let's index bracket by bracket\n",
"print(f\"First square bracket:\\n{x[0]}\") \n",
"print(f\"Second square bracket: {x[0][0]}\") \n",
"print(f\"Third square bracket: {x[0][0][0]}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XaLjaIFxHe89"
},
"source": [
"You can also use `:` to specify \"all values in this dimension\" and then use
a comma (`,`) to add another dimension."
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gCT09pqeTGfD",
"outputId": "a91f9b73-f8f0-476a-9c69-fcd03b042f6b"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1, 2, 3]])"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get all values of 0th dimension and the 0 index of 1st dimension\n",
"x[:, 0]"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "dwDx_gMsTGfD",
"outputId": "8165cfd9-a88d-4212-8c45-1eb84ef5be83"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[2, 5, 8]])"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get all values of 0th & 1st dimensions but only index 1 of 2nd dimension\n",
"x[:, :, 1]"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xiw3_1E3TGfD",
"outputId": "12fa4749-cf52-4e88-c2c0-44d26aeb633c"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([5])"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get all values of the 0 dimension but only the 1 index value of the 1st and
2nd dimension\n",
"x[:, 1, 1]"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XFVEgrKhTGfD",
"outputId": "69eadeb9-11b3-4b48-cb95-0b3305c1274c"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([1, 2, 3])"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get index 0 of 0th and 1st dimension and all values of 2nd dimension \n",
"x[0, 0, :] # same as x[0][0]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6Ik0r11RIxtm"
},
"source": [
"Indexing can be quite confusing to begin with, especially with larger tensors
(I still have to try indexing multiple times to get it right). But with a bit of
practice and following the data explorer's motto (***visualize, visualize,
visualize***), you'll start to get the hang of it."
]
},
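{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, here are a couple of handy indexing patterns worth practicing, a quick sketch using the same `x` as above:\n",
"\n",
"```python\n",
"# Get the last value of each row using negative indexing\n",
"x[:, :, -1] # tensor([[3, 6, 9]])\n",
"\n",
"# Get every second row using a slice with a step\n",
"x[0, ::2, :] # tensor([[1, 2, 3], [7, 8, 9]])\n",
"```"
]
},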
{
"cell_type": "markdown",
"metadata": {
"id": "h8ZaW0Bq7rCm"
},
"source": [
"## PyTorch tensors & NumPy\n",
"\n",
"Since NumPy is a popular Python numerical computing library, PyTorch has
functionality to interact with it nicely. \n",
"\n",
"The two main methods you'll want to use for NumPy to PyTorch (and back again)
are: \n",
"* [`torch.from_numpy(ndarray)`](https://pytorch.org/docs/stable/generated/
torch.from_numpy.html) - NumPy array -> PyTorch tensor. \n",
"* [`torch.Tensor.numpy()`](https://pytorch.org/docs/stable/generated/
torch.Tensor.numpy.html) - PyTorch tensor -> NumPy array.\n",
"\n",
"Let's try them out."
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "yDrDCnvY7rKS",
"outputId": "86155a63-01f9-4372-e889-61a65ebf0fb1"
},
"outputs": [
{
"data": {
"text/plain": [
"(array([1., 2., 3., 4., 5., 6., 7.]),\n",
" tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# NumPy array to tensor\n",
"import torch\n",
"import numpy as np\n",
"array = np.arange(1.0, 8.0)\n",
"tensor = torch.from_numpy(array)\n",
"array, tensor"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "16JG6cONLPnO"
},
"source": [
 **Note:** By default, NumPy arrays">
"> **Note:** By default, NumPy arrays are created with the datatype `float64` and if you convert one to a PyTorch tensor, it'll keep the same datatype (as above).\n",
">\n",
"> However, many PyTorch calculations default to using `float32`. \n",
"> \n",
"> So if you want to convert your NumPy array (float64) -> PyTorch tensor (float64) -> PyTorch tensor (float32), you can use `tensor = torch.from_numpy(array).type(torch.float32)`.\n",
"\n",
"Because `array = array + 1` below *reassigns* `array` to a brand new NumPy array, changing the array this way leaves the tensor unchanged."
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ovwl7VCREv8L",
"outputId": "efd21eb9-0010-436a-dc29-f851e3d7d77a"
},
"outputs": [
{
"data": {
"text/plain": [
"(array([2., 3., 4., 5., 6., 7., 8.]),\n",
" tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Change the array, keep the tensor\n",
"array = array + 1\n",
"array, tensor"
]
},
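{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Note:** `torch.from_numpy()` actually *shares memory* with the NumPy array it came from. So while reassigning (as above) leaves the other object untouched, an *in-place* change shows up in both. A quick sketch:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import torch\n",
"\n",
"array = np.arange(1.0, 8.0)\n",
"tensor = torch.from_numpy(array)\n",
"\n",
"array += 1 # in-place change: memory is shared, so the tensor changes too\n",
"array, tensor # both now start at 2.0\n",
"```"
]
},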
{
"cell_type": "markdown",
"metadata": {
"id": "geVvu1p0MTWc"
},
"source": [
"And if you want to go from PyTorch tensor to NumPy array, you can call
`tensor.numpy()`."
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xw_7ZyVaTKxQ",
"outputId": "54d6f347-d3f6-44df-9155-83d980c31780"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([1., 1., 1., 1., 1., 1., 1.]),\n",
" array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Tensor to NumPy array\n",
"tensor = torch.ones(7) # create a tensor of ones with dtype=float32\n",
"numpy_tensor = tensor.numpy() # will be dtype=float32 unless changed\n",
"tensor, numpy_tensor"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Dt8yEV1jMfi2"
},
"source": [
"And the same rule applies as above, if you change the original `tensor`, the
new `numpy_tensor` stays the same."
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "mMp6ZSkET4_Y",
"outputId": "100678a4-c220-4a44-e4a5-0542359cb9de"
},
"outputs": [
{
"data": {
"text/plain": [
"(tensor([2., 2., 2., 2., 2., 2., 2.]),\n",
" array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Change the tensor, keep the array the same\n",
"tensor = tensor + 1\n",
"tensor, numpy_tensor"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7gU3ubCrUkI-"
},
"source": [
"## Reproducibility (trying to take the random out of random)\n",
"\n",
"As you learn more about neural networks and machine learning, you'll start to
discover how much randomness plays a part.\n",
"\n",
"Well, pseudorandomness that is. Because after all, as they're designed, a
computer is fundamentally deterministic (each step is predictable) so the
randomness they create are simulated randomness (though there is debate on this
too, but since I'm not a computer scientist, I'll let you find out more yourself).\
n",
"\n",
"How does this relate to neural networks and deep learning then?\n",
"\n",
"We've discussed neural networks start with random numbers to describe patterns
in data (these numbers are poor descriptions) and try to improve those random
numbers using tensor operations (and a few other things we haven't discussed yet)
to better describe patterns in data.\n",
"\n",
"In short: \n",
"\n",
"``start with random numbers -> tensor operations -> try to make better (again
and again and again)``\n",
"\n",
"Although randomness is nice and powerful, sometimes you'd like there to be a
little less randomness.\n",
"\n",
"Why?\n",
"\n",
"So you can perform repeatable experiments.\n",
"\n",
"For example, you create an algorithm capable of achieving X performance.\n",
"\n",
"And then your friend tries it out to verify you're not crazy.\n",
"\n",
"How could they do such a thing?\n",
"\n",
"That's where **reproducibility** comes in.\n",
"\n",
"In other words, can you get the same (or very similar) results on your
computer running the same code as I get on mine?\n",
"\n",
"Let's see a brief example of reproducibility in PyTorch.\n",
"\n",
"We'll start by creating two random tensors, since they're random, you'd expect
them to be different right? "
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "eSwxnwEbTGfF",
"outputId": "73b34154-734f-496f-9b55-b6aaa137e854"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tensor A:\n",
"tensor([[0.8016, 0.3649, 0.6286, 0.9663],\n",
" [0.7687, 0.4566, 0.5745, 0.9200],\n",
" [0.3230, 0.8613, 0.0919, 0.3102]])\n",
"\n",
"Tensor B:\n",
"tensor([[0.9536, 0.6002, 0.0351, 0.6826],\n",
" [0.3743, 0.5220, 0.1336, 0.9666],\n",
" [0.9754, 0.8474, 0.8988, 0.1105]])\n",
"\n",
"Does Tensor A equal Tensor B? (anywhere)\n"
]
},
{
"data": {
"text/plain": [
"tensor([[False, False, False, False],\n",
" [False, False, False, False],\n",
" [False, False, False, False]])"
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import torch\n",
"\n",
"# Create two random tensors\n",
"random_tensor_A = torch.rand(3, 4)\n",
"random_tensor_B = torch.rand(3, 4)\n",
"\n",
"print(f\"Tensor A:\\n{random_tensor_A}\\n\")\n",
"print(f\"Tensor B:\\n{random_tensor_B}\\n\")\n",
"print(f\"Does Tensor A equal Tensor B? (anywhere)\")\n",
"random_tensor_A == random_tensor_B"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nPU6mDKJnr8M"
},
"source": [
"Just as you might've expected, the tensors come out with different values.\n",
"\n",
"But what if you wanted to create two random tensors with the *same* values.\
n",
"\n",
"As in, the tensors would still contain random values but they would be of the
same flavour.\n",
"\n",
"That's where
[`torch.manual_seed(seed)`](https://pytorch.org/docs/stable/generated/
torch.manual_seed.html) comes in, where `seed` is an integer (like `42` but it
could be anything) that flavours the randomness.\n",
"\n",
"Let's try it out by creating some more *flavoured* random tensors."
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "sB6d1GfYTGfF",
"outputId": "4d11d38e-4406-4aff-9a81-cf13aa89ee5f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tensor C:\n",
"tensor([[0.8823, 0.9150, 0.3829, 0.9593],\n",
" [0.3904, 0.6009, 0.2566, 0.7936],\n",
" [0.9408, 0.1332, 0.9346, 0.5936]])\n",
"\n",
"Tensor D:\n",
"tensor([[0.8823, 0.9150, 0.3829, 0.9593],\n",
" [0.3904, 0.6009, 0.2566, 0.7936],\n",
" [0.9408, 0.1332, 0.9346, 0.5936]])\n",
"\n",
"Does Tensor C equal Tensor D? (anywhere)\n"
]
},
{
"data": {
"text/plain": [
"tensor([[True, True, True, True],\n",
" [True, True, True, True],\n",
" [True, True, True, True]])"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import torch\n",
"import random\n",
"\n",
"# # Set the random seed\n",
"RANDOM_SEED=42 # try changing this to different values and see what happens to
the numbers below\n",
"torch.manual_seed(seed=RANDOM_SEED) \n",
"random_tensor_C = torch.rand(3, 4)\n",
"\n",
"# Have to reset the seed every time a new rand() is called \n",
"# Without this, tensor_D would be different to tensor_C \n",
"torch.random.manual_seed(seed=RANDOM_SEED) # try commenting this line out and
seeing what happens\n",
"random_tensor_D = torch.rand(3, 4)\n",
"\n",
"print(f\"Tensor C:\\n{random_tensor_C}\\n\")\n",
"print(f\"Tensor D:\\n{random_tensor_D}\\n\")\n",
"print(f\"Does Tensor C equal Tensor D? (anywhere)\")\n",
"random_tensor_C == random_tensor_D"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uct53Xr5QRC_"
},
"source": [
"Nice!\n",
"\n",
"It looks like setting the seed worked. \n",
"\n",
 **Resource:** What we've just covered">
"> **Resource:** What we've just covered only scratches the surface of reproducibility in PyTorch. For more on reproducibility in general and random seeds, I'd check out:\n",
"> * [The PyTorch reproducibility
documentation](https://pytorch.org/docs/stable/notes/randomness.html) (a good
exercise would be to read through this for 10-minutes and even if you don't
understand it now, being aware of it is important).\n",
"> * [The Wikipedia random seed
page](https://en.wikipedia.org/wiki/Random_seed) (this'll give a good overview of
random seeds and pseudorandomness in general)."
]
},
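{
"cell_type": "markdown",
"metadata": {},
"source": [
"One more tool worth knowing about: instead of resetting the global seed before every call, you can pass an explicit [`torch.Generator`](https://pytorch.org/docs/stable/generated/torch.Generator.html) to random functions, which keeps the random state local. A brief sketch (the tensor names are just for illustration):\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"# A generator carries its own random state\n",
"gen = torch.Generator().manual_seed(42)\n",
"random_tensor_E = torch.rand(3, 4, generator=gen)\n",
"\n",
"gen = torch.Generator().manual_seed(42) # re-seed to reproduce\n",
"random_tensor_F = torch.rand(3, 4, generator=gen)\n",
"\n",
"random_tensor_E == random_tensor_F # all True\n",
"```"
]
},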
{
"cell_type": "markdown",
"metadata": {
"id": "hxIIM7t27rQ-"
},
"source": [
"## Running tensors on GPUs (and making faster computations)\n",
"\n",
"Deep learning algorithms require a lot of numerical operations.\n",
"\n",
"And by default these operations are often done on a CPU (computer processing
unit).\n",
"\n",
"However, there's another common piece of hardware called a GPU (graphics
processing unit), which is often much faster at performing the specific types of
operations neural networks need (matrix multiplications) than CPUs.\n",
"\n",
"Your computer might have one.\n",
"\n",
"If so, you should look to use it whenever you can to train neural networks
because chances are it'll speed up the training time dramatically.\n",
"\n",
"There are a few ways to first get access to a GPU and secondly get PyTorch to
use the GPU.\n",
"\n",
 **Note:** When I reference">
"> **Note:** When I reference \"GPU\" throughout this course, I'm referencing a [Nvidia GPU with CUDA](https://developer.nvidia.com/cuda-gpus) enabled (CUDA is a computing platform and API that allows GPUs to be used for general purpose computing & not just graphics) unless otherwise specified.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0UiR6QpoYQH_"
},
"source": [
"\n",
"### 1. Getting a GPU\n",
"\n",
"You may already know what's going on when I say GPU. But if not, there are a
few ways to get access to one.\n",
"\n",
"| **Method** | **Difficulty to setup** | **Pros** | **Cons** | **How to
setup** |\n",
"| ----- | ----- | ----- | ----- | ----- |\n",
"| Google Colab | Easy | Free to use, almost zero setup required, can share
work with others as easy as a link | Doesn't save your data outputs, limited
compute, subject to timeouts | [Follow the Google Colab
Guide](https://colab.research.google.com/notebooks/gpu.ipynb) |\n",
"| Use your own | Medium | Run everything locally on your own machine | GPUs
aren't free, require upfront cost | Follow the [PyTorch installation guidelines]
(https://pytorch.org/get-started/locally/) |\n",
"| Cloud computing (AWS, GCP, Azure) | Medium-Hard | Small upfront cost, access
to almost infinite compute | Can get expensive if running continually, takes some
time to setup right | Follow the [PyTorch installation
guidelines](https://pytorch.org/get-started/cloud-partners/) |\n",
"\n",
"There are more options for using GPUs but the above three will suffice for
now.\n",
"\n",
"Personally, I use a combination of Google Colab and my own personal computer
for small scale experiments (and creating this course) and go to cloud resources
when I need more compute power.\n",
"\n",
"> **Resource:** If you're looking to purchase a GPU of your own but not sure
what to get, [Tim Dettmers has an excellent
guide](https://timdettmers.com/2020/09/07/which-gpu-for-deep-learning/).\n",
"\n",
"To check if you've got access to a Nvidia GPU, you can run `!nvidia-smi` where
the `!` (also called bang) means \"run this on the command line\".\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vEMcO-9zYc-w",
"outputId": "77405db7-3494-4add-cfc7-8415e52a0412"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sat Jan 21 08:34:23 2023 \n",
"+-----------------------------------------------------------------------------+\
n",
"| NVIDIA-SMI 515.48.07 Driver Version: 515.48.07 CUDA Version: 11.7
|\n",
"|-------------------------------+----------------------
+----------------------+\n",
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr.
ECC |\n",
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute
M. |\n",
"| | | MIG
M. |\n",
"|
===============================+======================+======================|\n",
"| 0 NVIDIA TITAN RTX On | 00000000:01:00.0 Off |
N/A |\n",
"| 40% 30C P8 7W / 280W | 177MiB / 24576MiB | 0%
Default |\n",
"| | |
N/A |\n",
"+-------------------------------+----------------------
+----------------------+\n",
"
\n",
"+-----------------------------------------------------------------------------+\
n",
"| Processes:
|\n",
"| GPU GI CI PID Type Process name GPU
Memory |\n",
"| ID ID Usage
|\n",
"|
=============================================================================|\n",
"| 0 N/A N/A 1061 G /usr/lib/xorg/Xorg
53MiB |\n",
"| 0 N/A N/A 2671131 G /usr/lib/xorg/Xorg
97MiB |\n",
"| 0 N/A N/A 2671256 G /usr/bin/gnome-shell
9MiB |\n",
"+-----------------------------------------------------------------------------+\n"
]
}
],
"source": [
"!nvidia-smi"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HvkB9p5zYf8E"
},
"source": [
"If you don't have a Nvidia GPU accessible, the above will output something
like:\n",
"\n",
"```\n",
"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
Make sure that the latest NVIDIA driver is installed and running.\n",
"```\n",
"\n",
"In that case, go back up and follow the install steps.\n",
"\n",
"If you do have a GPU, the line above will output something like:\n",
"\n",
"```\n",
"Wed Jan 19 22:09:08 2022 \n",
"+-----------------------------------------------------------------------------
+\n",
"| NVIDIA-SMI 495.46 Driver Version: 460.32.03 CUDA Version: 11.2
|\n",
"|-------------------------------+----------------------+----------------------
+\n",
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC
|\n",
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M.
|\n",
"| | | MIG M.
|\n",
"|
===============================+======================+======================|\n",
"| 0 Tesla P100-PCIE... Off | 00000000:00:04.0 Off | 0
|\n",
"| N/A 35C P0 27W / 250W | 0MiB / 16280MiB | 0% Default
|\n",
"| | | N/A
|\n",
"+-------------------------------+----------------------+----------------------
+\n",
"
\n",
"+-----------------------------------------------------------------------------
+\n",
"| Processes:
|\n",
"| GPU GI CI PID Type Process name GPU Memory
|\n",
"| ID ID Usage
|\n",
"|
=============================================================================|\n",
"| No running processes found
|\n",
"+-----------------------------------------------------------------------------
+\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UvibZ6e0YcDk"
},
"source": [
"\n",
"\n",
"### 2. Getting PyTorch to run on the GPU\n",
"\n",
"Once you've got a GPU ready to access, the next step is getting PyTorch to use
for storing data (tensors) and computing on data (performing operations on
tensors).\n",
"\n",
"To do so, you can use the
[`torch.cuda`](https://pytorch.org/docs/stable/cuda.html) package.\n",
"\n",
"Rather than talk about it, let's try it out.\n",
"\n",
"You can test if PyTorch has access to a GPU using
[`torch.cuda.is_available()`](https://pytorch.org/docs/stable/generated/
torch.cuda.is_available.html#torch.cuda.is_available).\n"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OweDLgwjEvZ2",
"outputId": "3a278a24-3ec3-4b1f-8f96-298086fa6ea6"
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check for GPU\n",
"import torch\n",
"torch.cuda.is_available()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jedZcx2PZFpL"
},
"source": [
"If the above outputs `True`, PyTorch can see and use the GPU, if it outputs
`False`, it can't see the GPU and in that case, you'll have to go back through the
installation steps.\n",
"\n",
"Now, let's say you wanted to setup your code so it ran on CPU *or* the GPU if
it was available.\n",
"\n",
"That way, if you or someone decides to run your code, it'll work regardless of
the computing device they're using. \n",
"\n",
"Let's create a `device` variable to store what kind of device is available."
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "j92HBCKB7rYa",
"outputId": "8cca1643-645c-4b67-f1f5-37066f6b9549"
},
"outputs": [
{
"data": {
"text/plain": [
"'cuda'"
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Set device type\n",
"device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
"device"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FjFyPP2WaCch"
},
"source": [
"If the above output `\"cuda\"` it means we can set all of our PyTorch code to
use the available CUDA device (a GPU) and if it output `\"cpu\"`, our PyTorch code
will stick with the CPU.\n",
"\n",
"> **Note:** In PyTorch, it's best practice to write [**device agnostic code**]
(https://pytorch.org/docs/master/notes/cuda.html#device-agnostic-code). This means
code that'll run on CPU (always available) or GPU (if available).\n",
"\n",
"If you want to do faster computing you can use a GPU but if you want to do
*much* faster computing, you can use multiple GPUs.\n",
"\n",
"You can count the number of GPUs PyTorch has access to using
[`torch.cuda.device_count()`](https://pytorch.org/docs/stable/generated/
torch.cuda.device_count.html#torch.cuda.device_count)."
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "MArsn0DFTGfG",
"outputId": "de717df5-bb67-4900-805e-a6f00ad0b409"
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Count number of devices\n",
"torch.cuda.device_count()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xVNf1hiqa-gO"
},
"source": [
"Knowing the number of GPUs PyTorch has access to is helpful incase you wanted
to run a specific process on one GPU and another process on another (PyTorch also
has features to let you run a process across *all* GPUs)."
]
},
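{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you do have multiple GPUs, you can target a specific one by index with a device string such as `\"cuda:0\"` or `\"cuda:1\"`. A quick sketch (guarded so it only runs if a CUDA GPU is available):\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"if torch.cuda.is_available():\n",
"    # Create a tensor directly on the first GPU\n",
"    tensor_on_first_gpu = torch.tensor([1, 2, 3], device=\"cuda:0\")\n",
"    print(tensor_on_first_gpu.device) # cuda:0\n",
"```"
]
},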
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.1 Getting PyTorch to run on Apple Silicon\n",
"\n",
"In order to run PyTorch on Apple's M1/M2/M3 GPUs you can use the
[`torch.backends.mps`](https://pytorch.org/docs/stable/notes/mps.html) module.\n",
"\n",
"Be sure that the versions of the macOS and Pytorch are updated.\n",
"\n",
"You can test if PyTorch has access to a GPU using
`torch.backends.mps.is_available()`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check for Apple Silicon GPU\n",
"import torch\n",
"torch.backends.mps.is_available() # Note this will print false if you're not
running on a Mac"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'mps'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Set device type\n",
"device = \"mps\" if torch.backends.mps.is_available() else \"cpu\"\n",
"device"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As before, if the above output `\"mps\"` it means we can set all of our
PyTorch code to use the available Apple Silicon GPU."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"if torch.cuda.is_available():\n",
" device = \"cuda\" # Use NVIDIA GPU (if available)\n",
"elif torch.backends.mps.is_available():\n",
" device = \"mps\" # Use Apple Silicon GPU (if available)\n",
"else:\n",
" device = \"cpu\" # Default to CPU if no GPU is available"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XqQLcuj68OA-"
},
"source": [
"### 3. Putting tensors (and models) on the GPU\n",
"\n",
"You can put tensors (and models, we'll see this later) on a specific device by
calling
[`to(device)`](https://pytorch.org/docs/stable/generated/torch.Tensor.to.html) on
them. Where `device` is the target device you'd like the tensor (or model) to go
to.\n",
"\n",
"Why do this?\n",
"\n",
"GPUs offer far faster numerical computing than CPUs do and if a GPU isn't
available, because of our **device agnostic code** (see above), it'll run on the
CPU.\n",
"\n",
 **Note:** Putting a tensor on GPU">
"> **Note:** Putting a tensor on GPU using `to(device)` (e.g. `some_tensor.to(device)`) returns a copy of that tensor, i.e. the same tensor will exist on both CPU and GPU. To overwrite tensors, reassign them:\n",
">\n",
"> `some_tensor = some_tensor.to(device)`\n",
"\n",
"Let's try creating a tensor and putting it on the GPU (if it's available)."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FhI3srFXEHfP",
"outputId": "2f4f6435-fdc4-4e99-e87c-9421c2100f36"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([1, 2, 3]) cpu\n"
]
},
{
"data": {
"text/plain": [
"tensor([1, 2, 3], device='mps:0')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create tensor (default on CPU)\n",
"tensor = torch.tensor([1, 2, 3])\n",
"\n",
"# Tensor not on GPU\n",
"print(tensor, tensor.device)\n",
"\n",
"# Move tensor to GPU (if available)\n",
"tensor_on_gpu = tensor.to(device)\n",
"tensor_on_gpu"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DxXeRKO0TGfG"
},
"source": [
"If you have a GPU available, the above code will output something like:\n",
"\n",
"```\n",
"tensor([1, 2, 3]) cpu\n",
"tensor([1, 2, 3], device='cuda:0')\n",
"```\n",
"\n",
"Notice the second tensor has `device='cuda:0'`, this means it's stored on the
0th GPU available (GPUs are 0 indexed, if two GPUs were available, they'd be
`'cuda:0'` and `'cuda:1'` respectively, up to `'cuda:n'`).\n",
"\n"
]
},
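{
"cell_type": "markdown",
"metadata": {},
"source": [
"Models work the same way. Calling `to(device)` on a model moves its parameters to the target device (we'll cover models properly later, here's a tiny sketch for now, using the `device` variable from above):\n",
"\n",
"```python\n",
"import torch\n",
"from torch import nn\n",
"\n",
"# A single linear layer standing in for a full model\n",
"model = nn.Linear(in_features=3, out_features=1)\n",
"model = model.to(device) # move the model's parameters to the target device\n",
"next(model.parameters()).device # shows which device the model lives on\n",
"```"
]
},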
{
"cell_type": "markdown",
"metadata": {
"id": "4puyUX4Bci5D"
},
"source": [
"### 4. Moving tensors back to the CPU\n",
"\n",
"What if we wanted to move the tensor back to CPU?\n",
"\n",
"For example, you'll want to do this if you want to interact with your tensors
with NumPy (NumPy does not leverage the GPU).\n",
"\n",
"Let's try using the
[`torch.Tensor.numpy()`](https://pytorch.org/docs/stable/generated/
torch.Tensor.numpy.html) method on our `tensor_on_gpu`."
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 186
},
"id": "3ChSLJgPTGfG",
"outputId": "32e92f62-db28-4dc7-ce93-c2ab33229252"
},
"outputs": [
{
"ename": "TypeError",
"evalue": "can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu()
to copy the tensor to host memory first.",
"output_type": "error",
"traceback": [
"\
u001b[0;31m------------------------------------------------------------------------
---\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback
(most recent call last)",
"\u001b[1;32m/home/daniel/code/pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb Cell 157\u001b[0m in \u001b[0;36m<cell line: 2>\
u001b[0;34m()\u001b[0m\n\u001b[1;32m <a href='vscode-notebook-cell://ssh-
remote%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/code/
pytorch/pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y312sdnNjb2RlLXJlbW90ZQ%3D%3D?line=0'>1</a>\
u001b[0m \u001b[39m# If tensor is on GPU, can't transform it to NumPy (this will
error)\u001b[39;00m\n\u001b[0;32m----> <a href='vscode-notebook-cell://ssh-remote
%2B7b22686f73744e616d65223a22544954414e2d525458227d/home/daniel/code/pytorch/
pytorch-course/pytorch-deep-learning/
00_pytorch_fundamentals.ipynb#Y312sdnNjb2RlLXJlbW90ZQ%3D%3D?line=1'>2</a>\u001b[0m
tensor_on_gpu\u001b[39m.\u001b[39;49mnumpy()\n",
"\u001b[0;31mTypeError\u001b[0m: can't convert cuda:0 device type tensor to
numpy. Use Tensor.cpu() to copy the tensor to host memory first."
]
}
],
"source": [
"# If tensor is on GPU, can't transform it to NumPy (this will error)\n",
"tensor_on_gpu.numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "LhymtkRDTGfG"
},
"source": [
"Instead, to get a tensor back to CPU and usable with NumPy we can use
[`Tensor.cpu()`](https://pytorch.org/docs/stable/generated/torch.Tensor.cpu.html).\
n",
"\n",
"This copies the tensor to CPU memory so it's usable with CPUs."
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gN15s-NdTGfG",
"outputId": "9fffb6f2-c200-4f9c-d987-d9ab5d9cba49"
},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3])"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Instead, copy the tensor back to cpu\n",
"tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()\n",
"tensor_back_on_cpu"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "qyzNH5lrTGfH"
},
"source": [
"The above returns a copy of the GPU tensor in CPU memory so the original
tensor is still on GPU."
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "S5u83PCRTGfH",
"outputId": "4cb931e2-7c8d-49b9-a7de-db3d3c6589b5"
},
"outputs": [
{
"data": {
"text/plain": [
"tensor([1, 2, 3], device='cuda:0')"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor_on_gpu"
]
},
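{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Note:** Later on, tensors attached to a model will often have `requires_grad=True`. Those need to be detached from the computation graph before converting to NumPy, so a common pattern you'll see is `tensor.detach().cpu().numpy()`. A quick sketch:\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"grad_tensor = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)\n",
"# grad_tensor.numpy() # would error: can't call numpy() on a tensor that requires grad\n",
"grad_tensor.detach().cpu().numpy() # array([1., 2., 3.], dtype=float32)\n",
"```"
]
},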
{
"cell_type": "markdown",
"metadata": {
"id": "xlmBpnuPTGfH"
},
"source": [
"## Exercises\n",
"\n",
"All of the exercises are focused on practicing the code above.\n",
"\n",
"You should be able to complete them by referencing each section or by
following the resource(s) linked.\n",
"\n",
"**Resources:**\n",
"\n",
"* [Exercise template notebook for 00](https://github.com/mrdbourke/pytorch-
deep-learning/blob/main/extras/exercises/00_pytorch_fundamentals_exercises.ipynb).\
n",
"* [Example solutions notebook for 00](https://github.com/mrdbourke/pytorch-
deep-learning/blob/main/extras/solutions/
00_pytorch_fundamentals_exercise_solutions.ipynb) (try the exercises *before*
looking at this).\n",
"\n",
"1. Documentation reading - A big part of deep learning (and learning to code
in general) is getting familiar with the documentation of a certain framework
you're using. We'll be using the PyTorch documentation a lot throughout the rest of
this course. So I'd recommend spending 10-minutes reading the following (it's okay
if you don't get some things for now, the focus is not yet full understanding, it's
awareness). See the documentation on
[`torch.Tensor`](https://pytorch.org/docs/stable/tensors.html#torch-tensor) and for
[`torch.cuda`](https://pytorch.org/docs/master/notes/cuda.html#cuda-semantics).\n",
"2. Create a random tensor with shape `(7, 7)`.\n",
"3. Perform a matrix multiplication on the tensor from 2 with another random
tensor with shape `(1, 7)` (hint: you may have to transpose the second tensor).\n",
"4. Set the random seed to `0` and do exercises 2 & 3 over again.\n",
"5. Speaking of random seeds, we saw how to set it with `torch.manual_seed()`
but is there a GPU equivalent? (hint: you'll need to look into the documentation
for `torch.cuda` for this one). If there is, set the GPU random seed to `1234`.\n",
"6. Create two random tensors of shape `(2, 3)` and send them both to the GPU
(you'll need access to a GPU for this). Set `torch.manual_seed(1234)` when creating
the tensors (this doesn't have to be the GPU random seed).\n",
"7. Perform a matrix multiplication on the tensors you created in 6 (again, you
may have to adjust the shapes of one of the tensors).\n",
"8. Find the maximum and minimum values of the output of 7.\n",
"9. Find the maximum and minimum index values of the output of 7.\n",
"10. Make a random tensor with shape `(1, 1, 1, 10)` and then create a new
tensor with all the `1` dimensions removed to be left with a tensor of shape
`(10)`. Set the seed to `7` when you create it and print out the first tensor and
it's shape as well as the second tensor and it's shape."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "xlmBpnuPTGfH"
},
"source": [
"## Extra-curriculum\n",
"\n",
"* Spend 1-hour going through the [PyTorch basics
tutorial](https://pytorch.org/tutorials/beginner/basics/intro.html) (I'd recommend
the
[Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
) and
[Tensors](https://pytorch.org/tutorials/beginner/basics/tensorqs_tutorial.html)
sections).\n",
"* To learn more on how a tensor can represent data, see this video: [What's a
tensor?](https://youtu.be/f5liqUk0ZTw)"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
"include_colab_link": true,
"name": "00_pytorch_fundamentals.ipynb",
"provenance": [],
"toc_visible": true
},
"interpreter": {
"hash": "3fbe1355223f7b2ffc113ba3ade6a2b520cadace5d5ec3e828c83ce02eb221bf"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}