Python Data Structures Guide

Lists are mutable sequences of objects that can contain different data types. They are ordered and indexed starting at 0. Tuples are similar to lists but are immutable, so their elements cannot be changed. Sets are unordered collections of unique objects without duplicates.

Uploaded by

mike110*

Elitedatascience Python Crash Course:

1. Lists are mutable sequences of objects
Let's start with one of the most important data structures in Python.

1.1 - Lists

Lists are very versatile and you will use them all the time.

Lists are mutable sequences of objects, enclosed by square brackets: []

Let's break that statement down:

 Mutable: You can update elements of a list, add to the list, delete elements, etc.
 Sequence: Each element is ordered and indexed, starting at 0.
 Of Objects: You can put any other Python object in lists, including functions, other
lists, and custom objects.
 Square Brackets: ['element 0', 'element 1', 'element 2']

list is the official object type.

For example:

In [2]:
integer_list = [0, 1, 2, 3, 4]

print( integer_list )
print( type(integer_list) )
[0, 1, 2, 3, 4]
<type 'list'>

You can also have lists of mixed types:


In [3]:
# Create list of mixed types: strings, ints, and floats
my_list = ['hello', 1, 'world', 2, 3.0]

# Print entire list


print( my_list )
['hello', 1, 'world', 2, 3.0]
1.2 - Indexing
In Python, lists are zero-indexed.

 That means you access the first element using index 0, the second element using
index 1, and so on...
 Don't forget this fact. Index 1 is not the first element in a Python list.

For example:

In [4]:
print( my_list[0] ) # Print the first element
print( my_list[2] ) # Print the third element
hello
world

You can also select elements from the list using a range of indices.
They are denoted like start_index:end_index.

 This will select all elements starting from start_index, but before end_index.
 In other words, start_index is inclusive while end_index is exclusive.

In [5]:
# Selects all starting from the 2nd element, but BEFORE the 4th element
print( my_list[1:3] )
[1, 'world']

This is also called slicing the list.

In addition:

 You can omit start_index to select all elements before end_index.


 Or omit end_index to select all elements starting from start_index.

In [6]:
# Selects all BEFORE the 4th element
print( my_list[:3] )

# Selects all starting from the 2nd element


print( my_list[1:] )
['hello', 1, 'world']
[1, 'world', 2, 3.0]

Finally, negative indices will select from the reverse direction.

For example:
In [7]:
# Selects the last element
print( my_list[-1] )

# Selects all BEFORE the last element


print( my_list[:-1] )

# Selects all starting from the 2nd element, but before the last element
print( my_list[1:-1] )
3.0
['hello', 1, 'world', 2]
[1, 'world', 2]
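
To recap the slicing rules above, here is a small sketch. It assumes Python 3, and the letters list is a made-up example that does not appear elsewhere in this lesson.

```python
# Hypothetical list used only for this illustration
letters = ['a', 'b', 'c', 'd', 'e']

first_two = letters[:2]    # start omitted: begins at index 0, stops BEFORE index 2
last_two = letters[-2:]    # negative start: counts back from the end
middle = letters[1:-1]     # everything except the first and last elements
full_copy = letters[:]     # both omitted: a new list holding every element

print(first_two)             # ['a', 'b']
print(last_two)              # ['d', 'e']
print(middle)                # ['b', 'c', 'd']
print(full_copy is letters)  # False, because slicing builds a new list
```

Since a slice is a new list, changing full_copy afterwards leaves letters untouched.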
1.3 - Mutable

Because lists are mutable, you can change individual elements of the list.

Here's how to update an element in a list:


In [8]:
my_list[0] = 'bonjour' # Sets new value for the first element

print( my_list[0] ) # Print the first element


print( my_list[2] ) # Print the third element
bonjour
world

Appending to and removing elements from lists are both easy to do.
In [9]:
# Add to end of the list
my_list.append(99)
print(my_list)

# Remove an element from the list


my_list.remove(3.0)
print(my_list)
['bonjour', 1, 'world', 2, 3.0, 99]
['bonjour', 1, 'world', 2, 99]
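
A few more details about removing elements are worth sketching. The example below assumes Python 3 and uses a hypothetical scores list. It shows that remove() deletes only the first matching value, and it also uses pop() and del, two standard alternatives that are not covered in the text above.

```python
# Hypothetical list with a repeated value
scores = [10, 20, 10, 30]

scores.remove(10)      # removes only the FIRST 10
print(scores)          # [20, 10, 30]

last = scores.pop()    # removes and returns the last element
print(last, scores)    # 30 [20, 10]

del scores[0]          # deletes the element at index 0
print(scores)          # [10]

try:
    scores.remove(99)  # 99 is not in the list
except ValueError as error:
    print('Could not remove:', error)
```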
1.4 - List operations

Python offers an entire suite of list operations. Remember, operations for each type of object behave
differently.

Let's take two lists for example:

In [10]:
a = [1, 2, 3]
b = [4, 5, 6]

First, you can concatenate the lists.

In [11]:
print( a + b ) # Concatenation
[1, 2, 3, 4, 5, 6]

Or repeat a list.
In [12]:
print( a * 3 ) # Repetition
[1, 2, 3, 1, 2, 3, 1, 2, 3]

You can check for membership in a list.


In [13]:
print( 3 in a ) # Membership
True

You can find the min and max values in a list.


In [14]:
print( min(b), max(b) ) # Min, Max
4 6

And finally, you can check the length of a list, which is just the number of elements in the list.
In [15]:
print( len(a) ) # Length
3

2. Tuples are immutable sequences of objects
Next, we have tuples. Tuples are like lists in many ways... with one notable difference: they
are immutable.

Read on to find out what that means.

2.1 - Tuples

We won't use tuples as often as lists, but they are still fairly common.

Tuples are immutable sequences of objects, enclosed by parentheses: ().

Once again, let's break that statement down, and highlight the parts different from lists:

 Immutable: You cannot update elements of a tuple, add to the tuple, delete
elements, etc.
 Sequence: Each element is ordered and indexed, starting at 0.
 Of Objects: You can put any other Python object in tuples, including functions,
other lists, and custom objects.
 Parentheses: ('element 0', 'element 1', 'element 2')

tuple is the official type.

For example:

In [22]:
integer_tuple = (0, 1, 2, 3, 4)

print( integer_tuple )
print( type(integer_tuple) )
(0, 1, 2, 3, 4)
<type 'tuple'>

Just as with lists, you can have tuples of mixed types:


In [23]:
# Create tuple of mixed types: strings, ints, and floats
my_tuple = ('hello', 1, 'world', 2, 3.0)

# Print entire tuple


print( my_tuple )
('hello', 1, 'world', 2, 3.0)
2.2 - Indexing
Tuples are zero-indexed, just like lists.
In [24]:
print( my_tuple[0] ) # Print the first element
print( my_tuple[2] ) # Print the third element
hello
world
You can slice them the same way as well.
In [25]:
print( my_tuple[:3] ) # Select first 3 elements
('hello', 1, 'world')
2.3 - Immutable
However, unlike lists, tuples cannot be updated or added to.

 In fact, they don't even have an append() or remove() function.

Running the code cell below will give you this error:

TypeError: 'tuple' object does not support item assignment

In [26]:
# Tuples cannot be updated
my_tuple[0] = 'goodbye' # Will throw an error
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
1 # Tuples cannot be updated
----> 2 my_tuple[0] = 'goodbye' # Will throw an error
TypeError: 'tuple' object does not support item assignment
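
If you do need a changed version of a tuple, one common approach is to build a new tuple rather than modify the old one. The sketch below assumes Python 3; the names assignment_worked and new_tuple are introduced here only for illustration.

```python
my_tuple = ('hello', 1, 'world', 2, 3.0)

try:
    my_tuple[0] = 'goodbye'   # item assignment is not allowed on tuples
    assignment_worked = True
except TypeError as error:
    assignment_worked = False
    print('Caught:', error)

# Build a NEW tuple from a one-element tuple plus a slice of the old one
new_tuple = ('goodbye',) + my_tuple[1:]

print(new_tuple)   # ('goodbye', 1, 'world', 2, 3.0)
print(my_tuple)    # ('hello', 1, 'world', 2, 3.0), the original is unchanged
```

Note the trailing comma in ('goodbye',). Without it, the parentheses would just wrap a string.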
2.4 - Unpacking
One convenient tip about tuples is that you can unpack their individual elements.

 That just means setting them to different variables


 This is convenient for functions that return more than 1 value (more on this later).

In [27]:
# Unpack tuple
a, b, c, d, e = my_tuple

# Print a and c, which were the first and third elements from the tuple
print( a, c )
hello world

In case you're wondering, you can unpack lists too (it's just not as common).

In [28]:
my_list = ['hello', 1, 'world', 2, 99]

# Unpack list
a, b, c, d, e = my_list

# Print a and c, which were the first and third elements from the list
print( a, c )
hello world
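
Here is a minimal sketch of why unpacking is handy for functions that return more than one value. It assumes Python 3, and min_and_max is a hypothetical helper written just for this example (functions are covered in more detail later).

```python
def min_and_max(values):
    """Return the smallest and largest value together as a tuple."""
    return min(values), max(values)

# Unpack the returned tuple into two variables
lowest, highest = min_and_max([4, 9, 1, 7])
print(lowest, highest)   # 1 9

# Unpacking also lets you swap two variables in one line
x, y = 'left', 'right'
x, y = y, x
print(x, y)              # right left
```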
2.5 - Tuple operations

Tuples enjoy many of the same operations that lists do.

For example:

In [29]:
a = (1, 2, 3)
b = (4, 5, 6)

print( a + b ) # Concatenation
print( a * 3 ) # Repetition
print( 3 in a ) # Membership
print( len(a) ) # Length
print( min(b), max(b) ) # Min, Max
(1, 2, 3, 4, 5, 6)
(1, 2, 3, 1, 2, 3, 1, 2, 3)
True
3
4 6

For this course, we won't worry too much about the differences between lists and tuples, which
are:

 Tuples can be used as keys in dictionaries (as long as their elements are hashable); lists cannot.
 Tuples are typically slightly faster to create and use slightly less memory than lists.
 Tuples are immutable, while lists are mutable.

We will mostly use lists, but it's helpful to be able to spot tuples when they appear.
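
To illustrate the point about dictionary keys (dictionaries are introduced in section 4), here is a short sketch assuming Python 3. The labels dictionary and its coordinates are made up for this example.

```python
# Hypothetical mapping from (x, y) coordinates to names
labels = {(0, 0): 'origin', (1, 0): 'east'}
print(labels[(1, 0)])      # east

try:
    labels[[0, 1]] = 'north'   # a list is mutable, so it cannot be a key
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(list_key_allowed)    # False
```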

3. Sets are collections of unique objects


Next, we have sets. These are not used as frequently as the other 3 data structures in this lesson, but
they are very convenient for one (important) use in data science... removing duplicates.

3.1 - Sets

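Sets in Python mimic the sets from basic mathematics.

Sets are unordered collections of unique objects, enclosed by curly braces: {}.

Again, let's break that statement down:

 Unordered: Elements are not ordered (and can't be indexed).
 Unique: No duplicates are allowed.
 Of Objects: You can put any hashable Python object in sets (numbers, strings, and tuples of these, for example). Mutable objects such as lists and dictionaries are not allowed.
 Curly Braces: {'element 0', 'element 1', 'element 2'}. Note that empty braces {} create an empty dictionary; use set() for an empty set.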

set is the official type.

For example:

In [30]:
integer_set = {0, 1, 2, 3, 4}

print( integer_set )
print( type(integer_set) )
set([0, 1, 2, 3, 4])
<type 'set'>
You can also count the number of elements in the set with the same len() function you used for lists
and tuples.
In [31]:
# Print length of integer set
print( integer_set, 'has', len(integer_set), 'elements.' )
set([0, 1, 2, 3, 4]) has 5 elements.
3.2 - Removing duplicates

Because each element in a set must be unique, sets are a great tool for removing duplicates.

For example:

In [32]:
fibonacci_list = [ 1, 1, 2, 3, 5, 8, 13 ] # Both 1's will remain
fibonacci_set = { 1, 1, 2, 3, 5, 8, 13 } # Only one 1 will remain

print( fibonacci_list )
print( fibonacci_set )
[1, 1, 2, 3, 5, 8, 13]
set([1, 2, 3, 5, 8, 13])

Notice how the set removes duplicate 1's.
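
In practice you often want the de-duplicated result back as a list. The sketch below assumes Python 3.7 or newer (for the order-preserving variant) and uses a hypothetical cities list. The dict.fromkeys() approach is an alternative not mentioned in the text above.

```python
# Hypothetical list with duplicates
cities = ['Paris', 'Lima', 'Paris', 'Oslo', 'Lima']

# A set drops duplicates; sorted() turns it back into a sorted list
unique_sorted = sorted(set(cities))

# dict.fromkeys() keeps the first occurrence of each value, in original order
unique_in_order = list(dict.fromkeys(cities))

print(unique_sorted)     # ['Lima', 'Oslo', 'Paris']
print(unique_in_order)   # ['Paris', 'Lima', 'Oslo']
```

Use the set version when order does not matter, and the dict.fromkeys() version when it does.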

3.3 - No indexing

Because sets are unordered, they do not support indexing.

The code below will throw an error:

TypeError: 'set' object does not support indexing

In [33]:
# Throws an error
fibonacci_set[0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
1 # Throws an error
----> 2 fibonacci_set[0]

TypeError: 'set' object does not support indexing


3.4 - Converting to sets
You can create sets from other data structures using the set() function.

 You can use it like the int() and float() functions from the previous lesson.

In [34]:
# Create a list
fibonacci_list = [ 1, 1, 2, 3, 5, 8, 13 ]

# Convert it to a set
fibonacci_set = set(fibonacci_list)

print( fibonacci_set )
set([1, 2, 3, 5, 8, 13])
3.5 - Set operations

Finally, sets have their own operations.

You can easily take the union, intersection, and difference of sets.

 Union is performed through the union() method or the | operator.
 Intersection is performed through the intersection() method or the & operator.
 Difference is performed through the difference() method or the - operator.

Let's create two sets and look at some examples:

In [35]:
powers_of_two = { 1, 2, 4, 8, 16 }
fibonacci_set = { 1, 1, 2, 3, 5, 8, 13 }

First, the union.


In [36]:
# Union: Elements in either set
print( powers_of_two.union( fibonacci_set ) )
print( powers_of_two | fibonacci_set )
set([1, 2, 3, 4, 5, 8, 13, 16])
set([1, 2, 3, 4, 5, 8, 13, 16])

Next, the intersection.


In [37]:
# Intersection: Elements in both sets
print( powers_of_two.intersection( fibonacci_set ) )
print( powers_of_two & fibonacci_set )
set([8, 1, 2])
set([8, 1, 2])

Finally, the difference... By the way... watch out for the order of the sets!
In [38]:
# Difference
print( powers_of_two - fibonacci_set )
print( fibonacci_set - powers_of_two)
set([16, 4])
set([3, 5, 13])

4. Dictionaries are collections of key-value pairs
The last data structure in this lesson is the dictionary. These are quite different from the other 3 because
each item in a dictionary is actually a pair of elements.
4.1 - Dictionaries

Dictionaries are like mini-databases for storing information.

Dictionaries are unordered collections of key-value pairs, enclosed by curly braces, like {}.

Again, let's break that statement down:

 Unordered: Elements can't be indexed by position. (Since Python 3.7, dictionaries do remember insertion order, but you still look items up by key, not by position.)
 Of key-value pairs: Elements are indexed by their key. Each key stores a value.
 Curly Braces: {'key 0' : 44, 'key 1' : 'value_1', 'key 2' : 23}

dict is the official type.

For example:

In [41]:
integer_dict = {
    'zero' : 0,
    'one' : 1,
    'two' : 2,
    'three' : 3,
    'four' : 4
}

print( integer_dict )
print( type(integer_dict) )
{'four': 4, 'zero': 0, 'three': 3, 'two': 2, 'one': 1}
<type 'dict'>

Don't get dictionaries confused with sets. They both use curly braces, but that's about it.

4.2 - Keys and values

A dictionary, also called a "dict", is like a miniature database for storing and organizing data.

Each element in a dict is actually a key-value pair, and it's called an item.

 A key is like a name for the value. You should make them descriptive.
 A value is some other Python object to store. They can be floats, lists, functions,
and so on.

Here's an example with descriptive keys:


In [42]:
my_dict = {
    'title' : "Hitchhiker's Guide to the Galaxy",
    'author' : 'Dougie Adams',
    42 : ['A number.', 'The answer to 40 + 2 = ?'] # Keys can be integers too!
}

For the purposes of this course, we'll use strings for most of our keys, but they can be other data types as
well.

 For example, 42 is an integer key in my_dict.

The cool thing about dictionaries is that you can access values by their keys.

In [43]:
# Print the value for the 'title' key
print( my_dict['title'] )

# Print the value for the 'author' key


print( my_dict['author'] )
Hitchhiker's Guide to the Galaxy
Dougie Adams

But wait, the author field is wrong! His name should be "Douglas Adams," not "Dougie Adams."
4.3 - Updating values

No problem, it's pretty easy to update the values of existing keys...

Just overwrite the previous value, like so:

In [44]:
# Updating existing key-value pair
my_dict['author'] = 'Douglas Adams'

# Print the value for the 'author' key


print( my_dict['author'] )
Douglas Adams

Values can be any other Python object, even other lists or dictionaries.

For example, the key 42 contains a list as its value.


In [45]:
# Print value for the key 42
print( my_dict[42] )
['A number.', 'The answer to 40 + 2 = ?']

As with any other list, we can update or append values to it.

For example, those who have read Hitchhiker's Guide to the Galaxy will recognize 42 as not just any
old mundane number, but rather
The Answer to the Ultimate Question of Life, the Universe, and Everything.

(If you haven't read the book, you're seriously missing out!)
The Meaning of Life
Conveniently, we can .append() that to our list, just as we would with any other list.
In [46]:
# Append element to list
my_dict[42].append('Answer to the Ultimate Question of Life, the Universe, and Everything')

# Print value for the key 42


print( my_dict[42] )
['A number.', 'The answer to 40 + 2 = ?',
'Answer to the Ultimate Question of Life, the Universe, and Everything']
4.4 - Creating new items

You can also add a new item by simply setting a value to an unused key.

In [47]:
# Creating a new key-value pair
my_dict['year'] = 1979

Now that we have the year, we can print a summary of the book:

In [48]:
# Print summary of the book.
print('{} was written by {} in {}.'.format(my_dict['title'], my_dict['author'], my_dict['year']) )
Hitchhiker's Guide to the Galaxy was written by Douglas Adams in 1979.
4.5 - Convenience functions
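Finally, you can also access a list of all the keys in a dictionary using the .keys() method.
And you can get a list of all the values using the .values() method. (In Python 3, these return view objects rather than lists; wrap them in list() if you need an actual list.)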
In [49]:
# Keys
print( my_dict.keys() )

# Values
print( my_dict.values() )
[42, 'author', 'year', 'title']

[['A number.', 'The answer to 40 + 2 = ?', 'Answer to the Ultimate Question of Life,
the Universe, and Everything'],
'Douglas Adams',
1979,
"Hitchhiker's Guide to the Galaxy"]
You can access a list of all key-value pairs using the .items() method.
 This is very useful for iterating through a dictionary, which we'll see in Lesson 3: Flow and Functions.
 It will return a list of tuples (in Python 3, a view of tuples that you can iterate over the same way).

In [50]:
# All items (list of tuples)
print( my_dict.items() )
[(42, ['A number.', 'The answer to 40 + 2 = ?', 'Answer to the Ultimate Question of Life, the Universe, and Everything']),
('author', 'Douglas Adams'),
('year', 1979),
('title', "Hitchhiker's Guide to the Galaxy")]
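
As a preview of iterating with .items(), here is a minimal sketch assuming Python 3. The book dictionary is a trimmed-down stand-in for my_dict, and summary_lines is a name introduced only for this example.

```python
# Trimmed-down stand-in for my_dict
book = {
    'title': "Hitchhiker's Guide to the Galaxy",
    'author': 'Douglas Adams',
    'year': 1979,
}

# Each item is a (key, value) tuple, so it can be unpacked in the loop
summary_lines = []
for key, value in book.items():
    summary_lines.append('{}: {}'.format(key, value))

print(summary_lines)

# In Python 3, wrap the views in list() when you need real lists
print(list(book.keys()))     # ['title', 'author', 'year']
print(list(book.values()))
```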

Quick note on whitespace in Python:


In Python, whitespace (spaces before lines of code) is very important, and it's used to denote blocks of code.
Its usage replaces how curly braces { } are used in many other languages.

 When you indent a line of code, it becomes a child of the previous line.
 The parent has a colon following it.
 By convention, each indent is 4 spaces. (Python only requires that indentation be consistent within a block.)
 To end a block of code, you would simply outdent.
 In Jupyter Notebook, you can just press tab to indent 4 spaces.

For example:

parent line of code:
    child line of code:
        grand_child line of code
    sibling of first child

All of this will become very clear after this lesson.
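
The same parent and child structure can be sketched with real code. This example assumes Python 3 and uses a for loop and an if statement, both introduced below. The log list is a hypothetical helper that records which lines ran.

```python
log = []

for number in [1, 2]:            # parent: ends with a colon
    if number % 2 == 0:          # child of the for loop, and itself a parent
        log.append('even')       # grandchild: runs only when the condition is met
    log.append('checked')        # sibling of the if: runs on every pass
log.append('done')               # outdented: runs once, after the loop ends

print(log)   # ['checked', 'even', 'checked', 'done']
```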


1. if statements allow conditional logic
The first flow control tool we'll use is the if statement.
1.1 - If...

These are the basic building blocks of conditional logic.

If statements check if a condition is met before running a block of code.

You begin them with the if keyword. Then the statement has two parts:

1. The condition, which must evaluate to a boolean. (Technically, it's fine as long as
its "truthiness" can be evaluated, but let's not worry about that for now.)
2. The code block to run if the condition is met (indented with 4 spaces).

For example:

In [2]:
current_fuel = 85

# Condition
if current_fuel >= 80:
    # Code block to run if condition is met
    print( 'We have enough fuel to last the zombie apocalypse. ')
We have enough fuel to last the zombie apocalypse.

 current_fuel >= 80 is the condition.


 Since we set current_fuel = 85 earlier, the condition was met and the code
block was run.

However, what do you think happens if the condition is not met?


In [3]:
current_fuel = 50

if current_fuel >= 80:
    print( 'We have enough fuel to last the zombie apocalypse. ')
Hmm... it looks like it ignores the code block.

That seems OK, but what if we want to print another message, such as a warning to restock on fuel?

1.2 - If... Else...


We can follow an if statement with an else statement to tell the program what to do if the condition is
not met.

For example:

In [4]:
current_fuel = 50

# Condition
if current_fuel >= 80:
    # Do this when condition is met
    print( 'We have enough fuel to last the zombie apocalypse. ')
else:
    # Do this when condition is not met
    print( 'Restock! We need at least {} gallons.'.format(80 - current_fuel) )
Restock! We need at least 30 gallons.
1.3 - If... Elif... Else...

We can also check for multiple conditions, in sequence.

 The elif (short for else if) statement checks another condition if the first one is
not met.

For example:

In [5]:
current_fuel = 50

# First condition
if current_fuel >= 80:
    print( 'We have enough fuel to last the zombie apocalypse. ')
# If first condition is not met, check this condition
elif current_fuel < 60:
    print( 'ALERT: WE ARE WAY TOO LOW ON FUEL!' )
# If no conditions were met, perform this
else:
    print( 'Restock! We need at least {} gallons.'.format(80 - current_fuel) )
ALERT: WE ARE WAY TOO LOW ON FUEL!
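
To see all three branches fire, it helps to try several fuel levels. The sketch below assumes Python 3 and wraps the same conditions in a hypothetical fuel_status() function (functions are covered in section 4) that returns a short label instead of printing a message.

```python
def fuel_status(current_fuel):
    """Return a short label for the given fuel level."""
    if current_fuel >= 80:
        return 'enough'
    elif current_fuel < 60:
        return 'alert'
    else:
        # Reached only when 60 <= current_fuel < 80
        return 'restock {}'.format(80 - current_fuel)

for level in [85, 70, 50]:
    print(level, fuel_status(level))
```

Only the first condition that is met runs its block, so the order of the checks matters.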

2. for loops allow iteration


Next, it's time to introduce the concept of iteration.
2.1 - For...
When you need to repeat a process across many objects, you can write a for loop to do so.
For loops repeat a code block across a sequence of elements.

You begin them with the for keyword. Then the loop has three parts:

1. An iterable (list, tuple, etc.) object that contains the elements to loop through.
2. A named variable that represents each element in the iterable.
3. A code block to run for each element (indented with 4 spaces).

For example:

In [9]:
for number in [0, 1, 2, 3, 4]:
    print( number )
0
1
2
3
4

Let's break that down.

 First, we set the iterable list.

for number in [0, 1, 2, 3, 4]:
    print( number )

 Next, we set each single element that we loop through to a named variable. In this case, we named it number.

for number in [0, 1, 2, 3, 4]:
    print( number )

 Finally, we can access that named variable in the code block.

for number in [0, 1, 2, 3, 4]:
    print( number )

In this case, we simply printed each element in our list.

2.2 - Range
range() is a built-in Python function for generating sequences of consecutive integers.
 e.g. range(10) creates the list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 Note: In Python 3, it creates a lazy range object instead of a list (use list(range(10)) to see the values). We won't worry too much about the differences because we'll use them identically.

For example, this produces the same output as our for loop from earlier:
In [10]:
for number in range(5):
    print( number )
0
1
2
3
4
You can also iterate in reversed() order.
In [11]:
for number in reversed(range(5)):
    print( number )
4
3
2
1
0
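
range() also accepts optional start and step arguments, which the examples above do not use. Here is a short sketch assuming Python 3, where list() is needed to display the values. The variable names are illustrative only.

```python
first_five = list(range(5))          # stop only: starts at 0
two_to_five = list(range(2, 6))      # start and stop: stop is excluded
every_third = list(range(0, 10, 3))  # start, stop, and step
countdown = list(range(4, -1, -1))   # a negative step counts down

print(first_five)    # [0, 1, 2, 3, 4]
print(two_to_five)   # [2, 3, 4, 5]
print(every_third)   # [0, 3, 6, 9]
print(countdown)     # [4, 3, 2, 1, 0]
```

As with slicing, the start value is inclusive and the stop value is not.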
2.3 - Nested control flow
You can nest if statements within for loops to write more complex logic.

Here's an example:

 The % (modulo) operator appears again. It's used to check if a number is divisible
by another.

In [12]:
range_list = range(10)

for number in reversed(range_list):
    if number == 0:
        print( 'Liftoff!' )
    elif number % 3 == 0:
        print( 'Buzz' )
    else:
        print( number )
Buzz
8
7
Buzz
5
4
Buzz
2
1
Liftoff!
Of course, you can also nest for loops within other for loops.
In [13]:
list_a = [4, 3, 2]
list_b = [6, 3]

for a in list_a:
    for b in list_b:
        print( a, 'x', b, '=', a * b )
4 x 6 = 24
4 x 3 = 12
3 x 6 = 18
3 x 3 = 9
2 x 6 = 12
2 x 3 = 6
2.4 - Building new lists
for loops can be used to build new lists from scratch. Here's how:

1. First, set a variable to an empty list, which is just [].


2. Then, as you loop through elements, .append() the ones you want to your list
variable.

For example, let's say we want to separate our range_list into an evens_list and
an odds_list. We can do so like this:
In [14]:
range_list = range(10)

# Initialize empty lists


evens_list = []
odds_list = []

# Iterate through each number in range_list


for number in range_list:
    # check for divisibility by 2
    if number % 2 == 0:
        # If divisible by 2
        evens_list.append(number)
    else:
        # If not divisible by 2
        odds_list.append(number)

# Confirm our lists are correct


print( evens_list )
print( odds_list )
[0, 2, 4, 6, 8]
[1, 3, 5, 7, 9]

3. List comprehensions construct new lists elegantly
List comprehensions are one of the most wonderful tools in Python.

3.1 - List comprehensions

They are somewhat advanced, and you don't technically need to use them, but they help keep your code
clean and concise.

List comprehensions construct new lists out of existing ones after applying transformations
or conditions to them.

These are one of the trickier concepts in Python, so don't worry if it doesn't make sense right away. We'll
get plenty of practice with them in the projects.

Here's an example:

In [17]:
# Construct list of the squares in range(10) using list comprehension
squares_list = [number**2 for number in range(10)]

print( squares_list )
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Let's break that down.

 Every list comprehension has a for loop.

[number**2 for number in range(10)]

 They also have an output. In this case, it's the squared number in the list.

[number**2 for number in range(10)]

 Finally, they are surrounded by list brackets.

[number**2 for number in range(10)]
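
One way to read a list comprehension is as a compact version of the loop-and-append pattern from section 2.4. The sketch below assumes Python 3; squares_loop and squares_comp are names chosen for this comparison.

```python
# Loop-and-append version
squares_loop = []
for number in range(10):
    squares_loop.append(number**2)

# Equivalent list comprehension: output first, then the for loop
squares_comp = [number**2 for number in range(10)]

print(squares_loop == squares_comp)   # True
```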


3.2 - Conditional inclusion

You can even include conditional logic in list comprehensions!

 Simply add the if statement at the end, like so:


[number for number in range(10) if number % 2 == 0]
In [18]:
# Conditional inclusion
evens_list = [number for number in range(10) if number % 2 == 0]

print( evens_list )
[0, 2, 4, 6, 8]
3.3 - Conditional outputs

 You can also use if... else... in the output for conditional outputs.

['Even' if number % 2 == 0 else 'Odd' for number in range(10)]


In [19]:
# Conditional outputs
even_odd_labels = ['Even' if number % 2 == 0 else 'Odd' for number in range(10)]

print( even_odd_labels )
['Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd']
3.4 - Other comprehensions

Finally, comprehensions are not limited to lists!

 You can use them for other data structures too.
 The syntax is the same, except you would enclose it with curly braces for sets. Parentheses do not create a tuple; they create a generator expression, which you can pass to tuple() if you need a tuple.

For example, we can create a set like so:

In [20]:
# Construct set of doubles using set comprehension
doubles_set = { number * 2 for number in range(10) }

print( doubles_set )
set([0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
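
Here is a sketch of the comprehension syntax for a few other types, assuming Python 3. It includes a dictionary comprehension, which is not mentioned in the text above, and shows that parentheses give a generator expression rather than a tuple.

```python
doubles_set = {number * 2 for number in range(5)}           # set comprehension
squares_dict = {number: number**2 for number in range(5)}   # dict comprehension
lazy_squares = (number**2 for number in range(5))           # generator expression
squares_tuple = tuple(number**2 for number in range(5))     # tuple built from a generator

print(doubles_set == {0, 2, 4, 6, 8})   # True
print(squares_dict[3])                  # 9
print(type(lazy_squares).__name__)      # generator
print(squares_tuple)                    # (0, 1, 4, 9, 16)
```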

4. Functions are blocks of reusable code


Functions provide modularity for your code and make it much easier to stay organized.
Who doesn't use functions?
Barbarians... that's who.
4.1 - Functions

Functions allow you to reuse and quickly tailor code to different situations.

Functions are blocks of reusable code that can be called by name.

Functions even have their own type: function.

Here's an example:

In [23]:
def make_message_exciting(message='hello, world'):
    text = message + '!'
    return text

print( type(make_message_exciting) )
<type 'function'>

Let's break that down:

 Functions begin with the def keyword, followed by the function name, a pair of parentheses, and a colon.

def make_message_exciting(message='hello, world'):
    text = message + '!'
    return text

 They can take optional arguments (more on this later).

def make_message_exciting(message='hello, world'):
    text = message + '!'
    return text

 This is the default value for the argument (more on this later).

def make_message_exciting(message='hello, world'):
    text = message + '!'
    return text

 They are then followed by an indented code block.

def make_message_exciting(message='hello, world'):
    text = message + '!'
    return text

 Finally, they return a value, which is also indented.

def make_message_exciting(message='hello, world'):
    text = message + '!'
    return text
To call a function, simply type its name followed by parentheses.
In [24]:
# Call make_message_exciting() function
make_message_exciting()
Out[24]:
'hello, world!'

As you can see, if you don't pass the function any argument values, it will use the default ones.

4.2 - In practice
In practice, functions are ideal for isolating functionality.
In [25]:
def square(x):
    output = x*x
    return output

def cube(x):
    output = x*x*x
    return output

print( square(3) )
print( cube(2) )
print( square(3) + cube(2) )
9
8
17
4.3 - Optional parts
It's worth noting that the code block is actually optional, as long as you have a return statement.
In [26]:
# Example of function without a code block
def hello_world():
    return 'hello world'

# Call hello_world() function


hello_world()
Out[26]:
'hello world'
And also... the return statement is optional, as long as you have a code block.

 If no return statement is given, the function will return None by default.


 Code blocks in the function will still run.

In [27]:
# Example of function without a return statement
def print_hello_world():
    print( 'hello world' )

# Call print_hello_world() function


print_hello_world()
hello world

Basically, as long as you have either a code block or a return statement, you're good to go.
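
You can check the default None return value yourself. This is a minimal sketch assuming Python 3; the result variable is introduced only for the check.

```python
def print_hello_world():
    print('hello world')   # runs, but nothing is returned explicitly

result = print_hello_world()   # prints: hello world

print(result is None)          # True
```

This is a common source of confusion: printing a value inside a function is not the same as returning it.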
4.4 - Arguments
Finally, functions can have arguments, which are variables that you pass into the function.

 You can then use these variables in the code block.


 Arguments can also have default values, set using the = operator.

In [28]:
def print_message( message='Hello, world', punctuation='.' ):
    output = message + punctuation
    print( output )

# Print default message


print_message()
Hello, world.

To pass a new value for the argument, simply set it again when calling the function.
In [29]:
# Print new message with new punctuation
print_message( message='Nice to meet you', punctuation='...' )
Nice to meet you...

When passing a value to an argument, you don't have to write the argument's name if the values are in
order.

 The first value is for the first argument, second value for the second argument, etc.

In [30]:
# Print new message without explicitly naming the arguments
print_message( 'Where is everybody', '?' )
Where is everybody?
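
Positional and named arguments can also be mixed. The sketch below assumes Python 3 and uses a hypothetical build_message() function that returns the text instead of printing it, so the results are easy to compare.

```python
def build_message(message='Hello, world', punctuation='.'):
    return message + punctuation

print(build_message('Good morning'))                     # Good morning.
print(build_message(punctuation='!'))                    # Hello, world!
print(build_message(punctuation='?', message='Ready'))   # Ready?
print(build_message('Wait', punctuation='...'))          # Wait...
```

Named arguments can be given in any order, but positional values must come before named ones in the call.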

Before anything else, let's import the actual NumPy library.


In [2]:
import numpy as np

In the code above, we set an alias for NumPy:


import numpy as np
The NumPy library can now be called using just np, instead of typing the whole name.

 np is a commonly used alias for NumPy

1. Arrays are homogeneous


Remember our discussion of data structures in Lesson 2: Data Structures?

Well, one of the main reasons NumPy is so popular for scientific computing is because it provides a new
data structure that's optimized for calculations with arrays of data.
1.1 - NumPy Arrays

This new data structure is the NumPy array.


NumPy arrays are tables of elements that all share the same data type (usually numeric).

numpy.ndarray is its official type.

For example:

In [3]:
# Array of ints
array_a = np.array([0, 1, 2, 3])

print( array_a )
print( type(array_a) )
[0 1 2 3]
<type 'numpy.ndarray'>
You can see the data type of the contained elements using the .dtype attribute.
In [4]:
# Print data type of contained elements
print( array_a.dtype )
int64
In NumPy, integers have the dtype int64 by default on most 64-bit platforms (the default can differ on some platforms and NumPy versions).
Note: don't get it confused!

 The NumPy array itself has type numpy.ndarray.


 The elements contained inside the array have type int64.

1.2 - Homogeneous
NumPy arrays are homogeneous, which means all of their elements must have the same data type.

What do you think happens if we mix two data types?

In [5]:
# Mixed array with 1 string and 2 integers
array_b = np.array(['four', 5, 6])

# Print elements in array_b


print( array_b )
['four' '5' '6']
Because NumPy doesn't support mixed types, it converted all of the elements in array_b into strings.
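
The same conversion happens with numeric types. Here is a small sketch assuming Python 3 and a working NumPy installation; the array names are chosen for this example only.

```python
import numpy as np

ints_only = np.array([1, 2, 3])
ints_and_float = np.array([1, 2, 3.5])   # one float forces every element to float
with_string = np.array(['four', 5, 6])   # one string forces every element to string

print(ints_only.dtype.kind)       # i  (some integer type)
print(ints_and_float.dtype)       # float64
print(with_string.dtype.kind)     # U  (unicode string)
```

NumPy picks the most general type that can hold every element, which is why a single stray string can silently turn a numeric column into text.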
1.3 - Shape
The two arrays we created above, array_a and array_b both have only 1 axis.

 You can think of an axis as a direction in the coordinate plane.


 For example, lines have 1 axis, squares have 2 axes, cubes have 3 axes, etc.
 Of course, in data science, each axis can represent different aspects of the data
(we'll see how in the next section).

We can use the .shape attribute to see the axes for a NumPy array.
In [6]:
print( array_a.shape )
print( array_b.shape )
(4,)
(3,)
As you can see, .shape returns a tuple.
 The number of elements in the tuple is the number of axes.
 And each element's value is the length of that axis.
Together, these two pieces of information make up the shape, or dimensions, of the array.

 So array_a has shape (4,). It has 1 axis of length 4.
 And array_b has shape (3,). It has 1 axis of length 3.
 They are both considered "1-dimensional" arrays. (A 4x1 array would instead have 2 axes and shape (4, 1).)

If this seems confusing right now, don't worry. This will become clearer once we see more examples.
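
Comparing a 1-dimensional array with 2-dimensional ones may help. This sketch assumes Python 3 and NumPy; the names flat, table, and column are illustrative, and reshape() is a standard NumPy method not covered in the text above.

```python
import numpy as np

flat = np.array([0, 1, 2, 3])                # 1 axis of length 4
table = np.array([[1, 2, 3], [4, 5, 6]])     # 2 rows, 3 columns
column = flat.reshape(4, 1)                  # same data as 4 rows, 1 column

print(flat.shape, flat.ndim)       # (4,) 1
print(table.shape, table.ndim)     # (2, 3) 2
print(column.shape, column.ndim)   # (4, 1) 2
```

The .ndim attribute is simply the number of axes, which equals the length of the .shape tuple.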

1.4 - Indexing
Similar to the lists we saw in Lesson 2: Data Structures, we can access elements in NumPy arrays by
their indices.
In [7]:
# First element of array_a
print( array_a[0] )

# Last element of array_a


print( array_a[-1] )
0
3

Or by slicing.
In [8]:
# From the third element of array_a through the fourth
print( array_a[2:4] )
[2 3]
1.5 - Missing data

Finally, another reason NumPy is popular in the data science community is that it can indicate missing
data.

 As you'll see in the next 3 projects, most real world datasets are plagued by
missing data.
 Fortunately, NumPy has a special np.nan object for denoting missing values.

This object is called NaN, which stands for Not a Number.

For example, let's create an array with a missing value:

In [9]:
# Array with missing values
array_with_missing_value = np.array([1.2, 8.8, 4.0, np.nan, 6.1])

# Print array
print( array_with_missing_value )
[ 1.2 8.8 4. nan 6.1]
NaN allows you to indicate a value is missing while preserving the array's numeric dtype:
In [10]:
# Print array's dtype
print( array_with_missing_value.dtype )
float64
See how NumPy keeps a nan to indicate a missing value, but the .dtype is still float64?

This turns out to be a very useful property for data analysis, as you'll see soon!
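
As a quick preview, here is a sketch of working with an array that contains NaN. It assumes Python 3 and NumPy; np.isnan() and np.nanmean() are standard NumPy functions, though they are not introduced in the text above.

```python
import numpy as np

values = np.array([1.2, 8.8, 4.0, np.nan, 6.1])

missing_mask = np.isnan(values)      # True where a value is missing
missing_count = int(missing_mask.sum())

plain_mean = values.mean()           # NaN spreads through ordinary calculations
safe_mean = np.nanmean(values)       # ignores NaN: mean of the 4 present values

print(missing_count)                 # 1
print(np.isnan(plain_mean))          # True
print(round(float(safe_mean), 3))    # 5.025
print(np.nan == np.nan)              # False, so use np.isnan() to test for NaN
```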
