Workshop on Programming in Python - day II
Programming in Python
A Two Day Workshop
Satyaki Sikdar
Vice Chair
ACM Student Chapter
Heritage Institute of Technology
April 23 2016
Satyaki Sikdar© Programming in Python April 23 2016 1 / 62
hour 6: let’s get rich!
table of contents
1 hour 6: let’s get rich!
an elaborate example
inheritance
file handling 101
2 hour 7: algo design 101
3 hours 8: data viz 101
4 hours 9 - 11 SNA 101
hour 6: let’s get rich! an elaborate example
another example
There are 52 cards in a deck. There are 4 suits - Spades, Hearts, Diamonds and Clubs
Each suit has 13 cards - Ace, 2, ..., 10, Jack, Queen and King
We’ll create a Card class. The attributes could be strings, like 'Spades' for suits and 'Queen' for
ranks, but then we’d have trouble comparing the cards
We use integers to encode the ranks and suits
Spades → 3, Hearts → 2, Diamonds → 1 and Clubs → 0
Ace → 1, Jack → 11, Queen → 12 and King → 13
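To see why the integer encoding helps, here is a minimal sketch comparing the two representations. The lookup table, the helper rank_code and the sample list are hypothetical names introduced only for this illustration; the integer codes themselves are the ones listed above.

```python
# Integer codes from the slide; the table, helper and sample list are
# hypothetical names used only for this illustration.
RANK_CODES = {'Ace': 1, 'Jack': 11, 'Queen': 12, 'King': 13}


def rank_code(name):
    """Translate a rank name such as 'Queen' or '7' into its integer code."""
    if name in RANK_CODES:
        return RANK_CODES[name]
    return int(name)


rank_names = ['Queen', 'Ace', 'King', '10', 'Jack']

# Sorting the names directly is alphabetical, so King lands below Queen
sorted_as_strings = sorted(rank_names)

# Sorting by integer code gives the order a card game expects
sorted_by_code = sorted(rank_names, key=rank_code)

print(sorted_as_strings)
print(sorted_by_code)
```

With strings the comparison follows the alphabet, not the game; with integers the usual relational operators already do the right thing.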
the class definition
class Card:
    '''Represents a standard playing card'''
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                  'Jack', 'Queen', 'King']

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def __str__(self):
        return '%s of %s' % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])

>>> two_of_clubs = Card()
>>> queen_of_diamonds = Card(1, 12)
class and instance attributes
Class attribute                    Instance attribute
---------------------------------  ---------------------------------
Defined outside any method         Defined inside methods
Referred to by class.class_attr    Referred to by inst.inst_attr
One copy per class                 One copy per instance
E.g. suit_names and rank_names     E.g. suit and rank
Figure: Class and instance attributes
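The difference can be checked directly in a few lines. This is a minimal sketch using a cut-down Card class; the variable names first, second and shares_class_attr are assumptions made for the illustration.

```python
class Card:
    # Class attribute: defined outside any method, one copy shared by all cards
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']

    def __init__(self, suit=0, rank=2):
        # Instance attributes: defined inside a method, one copy per card
        self.suit = suit
        self.rank = rank


first = Card(0, 2)
second = Card(3, 13)

# Both instances see the very same list object that lives on the class
shares_class_attr = first.suit_names is Card.suit_names is second.suit_names

# Changing an instance attribute affects only that one instance
first.rank = 5

print(shares_class_attr)
print(first.rank, second.rank)
```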
comparing cards
For built-in types, there are relational operators (<, >, ==, etc.) that compare two things
to produce a boolean
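For user-defined types, we need to override the __cmp__ method (Python 2 only; Python 3 no longer
calls __cmp__ and uses rich comparison methods such as __lt__ and __eq__ instead). It takes two
parameters, self and other, and returns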
a positive number if the first object is greater
a negative number if the second object is greater
zero if they are equal
The ordering for cards is not obvious. Which is better, the 3 of Clubs or the 2 of
Diamonds? One has a higher rank, but the other has a higher suit
We arbitrarily choose that suit is more important, so all the Spades outrank all the
Diamonds and so on.
writing the __cmp__ method
#inside Card class
    def __cmp__(self, other):
        if self.suit > other.suit:      #check the suits
            return 1
        elif self.suit < other.suit:
            return -1
        elif self.rank > other.rank:    #check the ranks
            return 1
        elif self.rank < other.rank:
            return -1
        else:                           #both the suits and the ranks are the same
            return 0
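The method above works on Python 2, which these slides use. Python 3 ignores __cmp__, so a minimal sketch of the same suit-then-rank ordering for Python 3 might look like the following. It assumes functools.total_ordering and tuple comparison, neither of which appears in the slides, and the helper name _key is hypothetical.

```python
from functools import total_ordering


@total_ordering  # fills in <=, > and >= from __eq__ and __lt__
class Card:
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                  'Jack', 'Queen', 'King']

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def __str__(self):
        return '%s of %s' % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])

    def _key(self):
        # Tuples compare element by element: suit first, then rank
        return (self.suit, self.rank)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()


three_of_clubs = Card(0, 3)
two_of_diamonds = Card(1, 2)

# Suit matters more than rank, so the 2 of Diamonds wins
print(three_of_clubs < two_of_diamonds)
print(max(three_of_clubs, two_of_diamonds))
```

Comparing (suit, rank) tuples gives the same result as the chain of if/elif tests, because tuple comparison only looks at the rank when the suits are equal.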
decks
Now that we have Cards, we define Decks. It will contain a list of Cards
The init method creates the entire deck of 52 cards
class Deck:
    '''Represents a deck of cards'''
    def __init__(self):
        self.cards = []
        for suit in range(4):
            for rank in range(1, 14):
                card = Card(suit, rank)
                self.cards.append(card)
#inside class Deck (shuffle needs `import random` at the top of the file)
    def __str__(self):
        res = []
        for card in self.cards:
            res.append(str(card))
        return '\n'.join(res)

    def shuffle(self):
        random.shuffle(self.cards)

    def pop_card(self):
        return self.cards.pop()

    def add_card(self, card):
        self.cards.append(card)

    def sort(self):
        self.cards.sort()    #uses Card.__cmp__ to order the cards

>>> deck = Deck()
>>> print deck.pop_card()
King of Spades
hour 6: let’s get rich! inheritance
inheritance
The language feature most often associated with object-oriented programming is
inheritance
It’s the ability to define a new class that’s a modified version of an existing class
The existing class is called the parent and the new class is called the child
We want a class to represent a hand, that is, the set of cards held by a player
A hand is similar to a deck: both are made up of a set of cards, and both require
operations like adding and removing cards
A hand is also different from a deck; there are operations we want for hands that don’t
make sense for a deck
The definition of a child class is like other class definitions, but the name of the parent
class appears in parentheses
class Hand(Deck):
    '''Represents a hand of playing cards'''
This definition indicates that Hand inherits from Deck; that means we can use methods
like pop_card and add_card for Hands as well as Decks
Hand also inherits __init__ from Deck, but it doesn’t really do what we want: the init
method for Hands should initialize cards with an empty list
We can provide an init method, overriding the one in Deck
#inside class Hand
    def __init__(self, label=''):
        self.cards = []
        self.label = label
So when you create a Hand, Python invokes its own __init__
>>> hand = Hand('new hand')
>>> print hand.cards
[]
>>> print hand.label
new hand
But the other methods are inherited from Deck
>>> deck = Deck()
>>> card = deck.pop_card()
>>> hand.add_card(card)    #add_card is inherited from Deck
>>> print hand             #uses the __str__ inherited from Deck
King of Spades
A natural next step is to encapsulate this code in a method called move_cards
#inside class Deck
    def move_cards(self, hand, num):
        for i in xrange(num):
            hand.add_card(self.pop_card())
move_cards takes two arguments, a Hand object and the number of cards to deal. It
modifies both self and hand
#inside class Deck
    def deal_hands(self, num_hands, cards_per_hand):
        hands = []
        self.shuffle()    #shuffling the deck
        for i in range(num_hands):
            hand = Hand('player %d' % (i))
            for j in range(cards_per_hand):
                hand.add_card(self.pop_card())
            hands.append(hand)
        return hands
Now you have a proper framework for a card game, be it poker, blackjack or bridge!
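Putting the pieces together, a minimal end-to-end sketch might look like this. It is written for Python 3 (an assumption; the slides use Python 2), leaves out card comparison to stay short, and the seed value and the 4 hands of 5 cards are arbitrary choices made for the demonstration.

```python
import random


class Card:
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                  'Jack', 'Queen', 'King']

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def __str__(self):
        return '%s of %s' % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])


class Deck:
    def __init__(self):
        # Build all 52 cards: 4 suits times 13 ranks
        self.cards = [Card(suit, rank)
                      for suit in range(4) for rank in range(1, 14)]

    def __str__(self):
        return '\n'.join(str(card) for card in self.cards)

    def shuffle(self):
        random.shuffle(self.cards)

    def pop_card(self):
        return self.cards.pop()

    def add_card(self, card):
        self.cards.append(card)

    def deal_hands(self, num_hands, cards_per_hand):
        hands = []
        self.shuffle()
        for i in range(num_hands):
            hand = Hand('player %d' % i)
            for j in range(cards_per_hand):
                hand.add_card(self.pop_card())
            hands.append(hand)
        return hands


class Hand(Deck):
    def __init__(self, label=''):
        # Override Deck.__init__: a hand starts empty
        self.cards = []
        self.label = label


random.seed(2016)  # arbitrary seed so repeated runs deal the same cards
deck = Deck()
hands = deck.deal_hands(4, 5)
for hand in hands:
    print(hand.label)
    print(hand)  # __str__ is inherited from Deck
print('%d cards left in the deck' % len(deck.cards))
```

Dealing removes cards from the deck, so after 4 hands of 5 cards the deck holds 32 cards and no card appears in two hands.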
hour 6: let’s get rich! file handling 101
the need for file handling
Most of the programs we have seen so far are transient in the sense that they run for a
short time and produce some output, but when they end, their data disappears. If you run
the program again, it starts with a clean slate
Other programs are persistent: they run for a long time (or all the time); they keep at
least some of their data in permanent storage (a hard drive, for example); if they shut
down and restart, they pick up where they left off
Files also help when the input or output is too big to fit in main memory
Examples of persistent programs are operating systems, which run pretty much whenever a
computer is on, and web servers, which run all the time, waiting for requests to come in on
the network.
One of the simplest ways for programs to maintain their data is by reading and writing
text files.
fp_read = open('input.txt', 'r')
fp_write = open('output.txt', 'w')
reading from files
The built-in function open takes the name of the file as a parameter and returns a file
object you can use to read the file
>>> fin = open('input.txt', 'r')
>>> print fin
<open file 'input.txt', mode 'r' at 0xb7eb2410>
A few things to note: the file being opened must exist; otherwise an IOError is raised.
The path to the file must be exact, including the correct filename and its
extension (if any)
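A minimal sketch of guarding against a missing file is shown below. The helper read_first_line and the scratch directory are hypothetical, introduced only so the example can run anywhere; the sketch is written for Python 3, where IOError is an alias of OSError.

```python
import os
import tempfile


def read_first_line(path):
    """Return the first line of the file at path, or None if it cannot be opened."""
    try:
        fin = open(path, 'r')
    except IOError:  # raised when the file does not exist or is unreadable
        return None
    try:
        return fin.readline()
    finally:
        fin.close()  # always release the file, even if reading fails


# A fresh scratch directory, so 'input.txt' is guaranteed not to exist yet
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, 'input.txt')

result_missing = read_first_line(path)

with open(path, 'w') as fout:
    fout.write('the first line\n')

result_present = read_first_line(path)

print(result_missing)
print(repr(result_present))
```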
reading files
The file object provides several methods for reading, including readline, which reads
characters from the file until it gets to a newline and returns the result as a string:
>>> fin.readline()
'the first line \n'
If you keep on calling fin.readline(), you’d end up reading the whole file, one line at a time.
Let’s see a few examples of reading files.
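A few common reading patterns, as a minimal sketch written for Python 3. The scratch file and its three lines are made up for the demonstration so that the example does not depend on any existing file.

```python
import os
import tempfile

# Create a small sample file to read from (hypothetical contents)
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, 'input.txt')
with open(path, 'w') as fout:
    fout.write('the first line\nthe second line\nthe third line\n')

# 1. readline: one line per call, newline included
fin = open(path, 'r')
first = fin.readline()
second = fin.readline()
fin.close()

# 2. Looping over the file object yields the lines one at a time
with open(path, 'r') as fin:
    stripped = [line.strip() for line in fin]

# 3. readlines returns every line in a list
with open(path, 'r') as fin:
    all_lines = fin.readlines()

# 4. read returns the whole file as a single string
with open(path, 'r') as fin:
    whole_text = fin.read()

print(repr(first))
print(stripped)
print(len(all_lines), len(whole_text))
```

The with statement closes the file automatically when the block ends, which is why only the first pattern needs an explicit close.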
writing to files
>>> fout = open('output.txt', 'w')
>>> print fout
<open file 'output.txt', mode 'w' at 0xb7eb2410>
If the file already exists, opening it in write mode clears out the old data and starts fresh,
so be careful! If the file doesn’t exist, a new one is created
>>> line1 = 'He left yesterday behind him, you might say he was born again,\n'
>>> fout.write(line1)
Again, the file object keeps track of where it is, so if you call write again, it adds the new
data to the end
>>> line2 = 'you might say he found a key for every door.\n'
>>> fout.write(line2)
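The difference between write mode and append mode can be sketched as follows. This assumes Python 3; the scratch file and the sample strings are hypothetical, and append mode ('a') is not covered in the slides.

```python
import os
import tempfile

scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, 'output.txt')

# Successive writes on the same file object follow one another
with open(path, 'w') as fout:
    fout.write('first\n')
    fout.write('second\n')

# Reopening in 'w' mode throws away what was there before
with open(path, 'w') as fout:
    fout.write('fresh start\n')

# Reopening in 'a' mode keeps the old data and adds to the end
with open(path, 'a') as fout:
    fout.write('appended line\n')

with open(path, 'r') as fin:
    contents = fin.read()

print(contents)
```

Leaving the with block closes the file, which also makes sure the written data actually reaches the disk.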
using files for something meaningful
Let’s combine the knowledge of file handling with dictionaries to do some basic lexical analysis
import string

def char_freq(filename):
    counter = dict()
    with open(filename, 'r') as f:
        raw_text = f.read()
    for c in raw_text:
        c = c.lower()
        if c in string.ascii_lowercase:
            if c in counter:
                counter[c] += 1
            else:
                counter[c] = 1
    return counter

def normalize(counter):
    sum_values = float(sum(counter.values()))
    for key in counter:
        counter[key] /= sum_values
    return counter
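The same letter count can be written more compactly with collections.Counter from the standard library. This is an alternative sketch, not the version from the slides: it assumes Python 3, returns a new dictionary from normalize instead of modifying its argument, and uses a made-up sample file.

```python
import os
import string
import tempfile
from collections import Counter


def char_freq(filename):
    """Count how often each ASCII letter occurs in the file, ignoring case."""
    with open(filename, 'r') as f:
        raw_text = f.read()
    return Counter(c for c in raw_text.lower() if c in string.ascii_lowercase)


def normalize(counter):
    """Return relative frequencies that add up to 1."""
    total = float(sum(counter.values()))
    return {letter: count / total for letter, count in counter.items()}


# Hypothetical sample text written to a scratch file
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, 'sample.txt')
with open(path, 'w') as fout:
    fout.write('Hello, World!')

counts = char_freq(path)
relative = normalize(counts)

print(counts.most_common(2))
print(relative['l'])
```

"Hello, World!" contains 10 letters, 3 of which are 'l', so its relative frequency is 3 out of 10.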
hour 7: algo design 101
table of contents
1 hour 6: let’s get rich!
2 hour 7: algo design 101
merge sort
modules
3 hours 8: data viz 101
4 hours 9 - 11 SNA 101
hour 7: algo design 101 merge sort
algorithm design in Python
One of the strong points of Python is the ease of expression
Turning pseudocode into actual code is not difficult
Let’s try to implement the Merge Sort algorithm in Python
A high level idea of the algorithm
Divide: Divide the n-element sequence into two subsequences of n/2 elements
Conquer: Sort the subsequences recursively
Combine: Merge the two sorted subsequences to produce the sorted answer
Algorithm 1: MERGE(left, right)
begin
    Append ∞ to left and right
    i ← 0, j ← 0
    merged ← new list
    while len(merged) < len(left) + len(right) - 2 do
        if left[i] < right[j] then
            merged.append(left[i])
            i ← i + 1
        else
            merged.append(right[j])
            j ← j + 1
    return merged

Algorithm 2: MERGE-SORT(A)
begin
    if len(A) < 2 then
        return A
    else
        left ← first n/2 elements of A
        right ← last n/2 elements of A
        left ← MERGE-SORT(left)
        right ← MERGE-SORT(right)
        return MERGE(left, right)
the core idea
The algorithm is naturally recursive
The MERGE procedure takes two sorted lists and merges them into a single sorted list
MERGE-SORT sorts the list recursively by splitting it into two (nearly) equal halves and sorting each half
A list with fewer than 2 elements is trivially sorted - this is the base case
Smaller sorted lists are merged step by step to form the overall sorted list
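def merge(left, right):
    # Work on copies so the caller's lists are not modified;
    # the infinity sentinels mean neither index can run off the end.
    left = left + [float('inf')]
    right = right + [float('inf')]
    i = 0
    j = 0
    merged = []
    # Stop once every real (non-sentinel) element has been merged.
    while len(merged) < len(left) + len(right) - 2:
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged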
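def merge_sort(A):
    # Base case: a list with fewer than 2 elements is already sorted.
    if len(A) < 2:
        return A
    else:
        # Floor division keeps mid an integer in both Python 2 and Python 3.
        mid = len(A) // 2
        left = A[:mid]
        right = A[mid:]
        left = merge_sort(left)
        right = merge_sort(right)
        return merge(left, right)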
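The float('inf') sentinel only works for items that can be compared with a number; in Python 3, comparing a string with a float raises a TypeError. As a hedged sketch (a variant, not the version from the slides), the merge step can stop at the end of either list instead of relying on sentinels. The names merge_no_sentinel and merge_sort_any are made up for this illustration.

```python
def merge_no_sentinel(left, right):
    """Merge two sorted lists without appending sentinels (illustrative variant)."""
    merged = []
    i = j = 0
    # Compare the front items until one of the lists is used up.
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            # Taking from left on ties keeps equal items in their original order.
            merged.append(left[i])
            i += 1
    # At most one of these slices is non-empty, and it is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort_any(items):
    """Return a new sorted list; items only need to support the < operator."""
    if len(items) < 2:
        return list(items)
    mid = len(items) // 2
    return merge_no_sentinel(merge_sort_any(items[:mid]), merge_sort_any(items[mid:]))


numbers = [5, 2, 9, 1, 5, 6]
words = ["pear", "apple", "fig"]
print(merge_sort_any(numbers))  # [1, 2, 5, 5, 6, 9]
print(merge_sort_any(words))    # ['apple', 'fig', 'pear']
```

Comparing the result with the built-in sorted() on a few random lists is a quick way to check either version.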
hour 7: algo design 101 modules
modules: extending functionalities beyond basic Python
Modules are external files and libraries that provide additional functions and classes on top of bare-bones Python
Modules are files containing Python definitions and statements (e.g. name.py)
The interface is very simple. Definitions can be imported into other modules by using “import name”
To access a module’s functions, type “name.function()”
Each module is imported only once per session; later imports reuse the already loaded module
Give nicknames to modules by using “as”, e.g. “import name as n”
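A small sketch of the points above, using the standard math module as the example (any importable module would do). It illustrates name.function() access, a nickname given with as, and the fact that a repeated import reuses the module object cached in sys.modules.

```python
import sys
import math          # first import: the module is loaded and cached
import math as m     # "as" only binds a nickname to the same module object

# name.function() style access works through either name.
root = math.sqrt(16)
same_root = m.sqrt(16)

# Both names refer to the single cached module object.
loaded_once = (m is math) and (sys.modules["math"] is math)

print(root, same_root, loaded_once)  # 4.0 4.0 True
```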
Python has a lot of predefined modules - sys, __future__, math, random, re, ...
The Zen of Python: try “import this”
Each module is highly specialized
You have various choices when importing things from a module
Import the whole module, but preserve the namespace - important when dealing with a lot of modules and keeping track of things
import module_name
Import the whole module, but bring all of its public names into the current namespace
from module_name import *
Import only specific things - no module prefix is needed, and name lookups are slightly faster
from math import pi, sin, cos
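The import styles can be compared in a short sketch. The helper polar_to_cartesian is a made-up name for this illustration. The star import is left out on purpose, since it can silently overwrite names that already exist in the current namespace.

```python
import math                      # whole module; names stay behind the "math." prefix
from math import pi, sin, cos    # only the listed names enter this namespace


def polar_to_cartesian(radius, angle):
    """Convert polar coordinates (angle in radians) to an (x, y) pair."""
    return radius * cos(angle), radius * sin(angle)


x, y = polar_to_cartesian(2.0, pi / 2)

# The prefixed name and the directly imported name are the very same object.
print(math.pi is pi)               # True
print(round(x, 9), round(y, 9))    # 0.0 2.0
```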
the sys module
This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter
sys.argv - The list of command line arguments passed to a Python script
argv[0] is the script name
Further command line args are stored in argv[1] onwards. E.g. for
python test_prog.py arg1 arg2 arg3, sys.argv = ['test_prog.py', 'arg1', 'arg2', 'arg3']
sys.getrecursionlimit() - Return the current value of the recursion limit, the maximum depth of the Python interpreter stack
sys.setrecursionlimit(limit)
Set the maximum depth of the Python interpreter stack to limit
This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python
The highest possible limit is platform-dependent
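A minimal sketch of these sys features follows. The helpers split_argv, depth and run_with_limit are hypothetical names, and the argv list is written out by hand so the example runs without a command line. The depths 2000 and 5000 are arbitrary choices, picked on the assumption that the default limit is about 1000.

```python
import sys


def split_argv(argv):
    """Split an argv-style list into (script_name, list_of_arguments)."""
    return argv[0], argv[1:]


def depth(n):
    """Recurse n levels deep and return n; used only to exercise the stack."""
    return 0 if n == 0 else 1 + depth(n - 1)


def run_with_limit(limit, func, *args):
    """Call func(*args) under a temporary recursion limit, then restore the old one."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return func(*args)
    finally:
        sys.setrecursionlimit(old_limit)


script, args = split_argv(["test_prog.py", "arg1", "arg2", "arg3"])
print(script, args)  # test_prog.py ['arg1', 'arg2', 'arg3']

try:
    depth(sys.getrecursionlimit() + 100)   # deeper than the current limit allows
except RuntimeError:  # RecursionError in Python 3 is a subclass of RuntimeError
    print("hit the recursion limit")

print(run_with_limit(5000, depth, 2000))   # 2000
```

Raising the limit only moves the guard; very deep recursion can still exhaust the C stack, so an iterative rewrite is often the safer fix.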
make your own module
Making a module is very easy - at least a basic one
Create a script in IDLE or in a decent text editor
Write the functions, classes and variables you want the module to have (say, three functions f1, f2, f3 and two variables v1 and v2)
Save the script as my_mod.py
Create another Python script in the same directory; this is where you’ll use the module
Write import my_mod in that script and you’re done!
dir(my_mod) gives a sorted list of strings naming everything defined in the module
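The steps above can be tried in a single script. In this sketch the bodies of f1, f2, f3 and the values of v1, v2 are invented for illustration. A temporary folder added to sys.path stands in for "the same directory as your script", and write_and_import is a hypothetical helper.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical contents for my_mod.py: three functions and two variables.
MY_MOD_SOURCE = '''
v1 = 10
v2 = "workshop"

def f1(x):
    return x + v1

def f2(text):
    return text + " " + v2

def f3():
    return [v1, v2]
'''


def write_and_import(source, name="my_mod"):
    """Save source as <name>.py in a fresh folder and import it from there."""
    folder = tempfile.mkdtemp()
    with open(os.path.join(folder, name + ".py"), "w") as handle:
        handle.write(source)
    sys.path.insert(0, folder)        # stands in for "same directory as the script"
    importlib.invalidate_caches()     # make sure the new file is noticed
    try:
        return importlib.import_module(name)
    finally:
        sys.path.remove(folder)


my_mod = write_and_import(MY_MOD_SOURCE)
print(my_mod.f1(5))                                        # 15
print([n for n in dir(my_mod) if not n.startswith("_")])   # ['f1', 'f2', 'f3', 'v1', 'v2']
```

In everyday use none of the helper code is needed: with my_mod.py next to your script, a plain import my_mod is enough.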
hours 8: data viz 101
table of contents
1 hour 6: let’s get rich!
2 hour 7: algo design 101
3 hours 8: data viz 101
plotting
matplotlib
making plots prettier
4 hours 9 - 11 SNA 101
hours 8: data viz 101 plotting
data visualization
Data visualization turns numbers and letters into aesthetically pleasing visuals, making it easy to recognize patterns and find exceptions
Figure: US Census data (2010)
It is easy to see some general settlement patterns in the US
The East Coast has a much greater population density than the rest of America
The map also hints at racial homophily
love in the time of cholera
Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed
Constructed in 1973 by Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on statistical properties

Property                  Value
Mean of x (x̄)             9
Variance of x (σ²(x))     11
Mean of y (ȳ)             7.50
Variance of y (σ²(y))     4.122
Correlation of x and y    0.816
Linear regression line    y = 3 + 0.5x
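The table can be checked by hand for one of the four datasets. The sketch below uses the first dataset of the quartet and plain Python only; the helper names are made up for this illustration. The variance of y differs slightly between the four sets, so the value printed for this set (about 4.127) need not match the table to the third decimal.

```python
from math import sqrt

# First dataset of Anscombe's quartet (values as published in Anscombe, 1973).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]


def mean(values):
    return sum(values) / float(len(values))


def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


def summarize(xs, ys):
    """Return the summary statistics from the table for one dataset."""
    slope = covariance(xs, ys) / sample_variance(xs)
    return {
        "mean_x": mean(xs),
        "var_x": sample_variance(xs),
        "mean_y": mean(ys),
        "var_y": sample_variance(ys),
        "correlation": covariance(xs, ys) / sqrt(sample_variance(xs) * sample_variance(ys)),
        "slope": slope,
        "intercept": mean(ys) - slope * mean(xs),
    }


for name, value in sorted(summarize(x, y).items()):
    print("{:<12}{:.3f}".format(name, value))
```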
plotting the four datasets
more reasons to visualize the data
Visualization is the highest bandwidth channel into the human brain
The visual cortex is the largest system in the human brain; it’s wasteful not to make use of it
As data volumes grow, visualization becomes a necessity rather than a luxury
"A picture is worth a thousand words"
hours 8: data viz 101 matplotlib
matplotlib and pylab
Matplotlib is a third-party module that provides an interface for making plots in Python
It was inspired by MATLAB’s plotting library, hence the name
matplotlib.pyplot is the module defined by matplotlib that is used to make plots; pylab is an older interface that exposes the same plotting functions
I’ll cover three of the most used types of plots in some detail
line plots
scatter plots
histograms
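As a first taste, the three plot types can be drawn side by side. This is a minimal sketch that assumes matplotlib is installed. The data is random and made up for the demo, make_demo_figure is a hypothetical helper, and the Agg backend is selected so the script also runs on a machine without a display.

```python
import random

import matplotlib
matplotlib.use("Agg")            # draw off-screen so no display is needed
import matplotlib.pyplot as plt


def make_demo_figure(path=None, seed=0):
    """Draw a line plot, a scatter plot and a histogram side by side."""
    rng = random.Random(seed)
    xs = list(range(50))
    trend = [0.5 * x + 3 for x in xs]                   # a straight line
    noisy = [t + rng.gauss(0, 2) for t in trend]        # the same line plus noise
    samples = [rng.gauss(0, 1) for _ in range(1000)]    # values for the histogram

    fig, (ax_line, ax_scatter, ax_hist) = plt.subplots(1, 3, figsize=(12, 3.5))
    ax_line.plot(xs, trend)
    ax_line.set_title("line plot")
    ax_scatter.scatter(xs, noisy)
    ax_scatter.set_title("scatter plot")
    ax_hist.hist(samples, bins=30)
    ax_hist.set_title("histogram")
    if path is not None:
        fig.savefig(path)        # e.g. "demo.png"
    return fig


fig = make_demo_figure()
print(len(fig.axes))  # 3
```

In an interactive session, leave out the matplotlib.use("Agg") line and call plt.show() to open the figure in a window.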
hours 8: data viz 101 matplotlib
matplotlib and pylab
Matplotlib is a 3rd party module that provides an interface to make plots in Python
Inspired by Matlab’s plotting library and hence the name
pylab or equivalently matplotlib.pyplot is a module defined by matplotlib that is used to
make plots
I’ll cover two most used types of plots in some detail
line plots
scatter plots
histograms
Satyaki Sikdar© Programming in Python April 23 2016 36 / 62
hours 8: data viz 101 matplotlib
line plots
# lineplot.py
import pylab as pl
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
pl.plot(x, y)
pl.show() # show the plot on the screen
hours 8: data viz 101 matplotlib
scatter plots
# scatterplot.py
import pylab as pl
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
pl.scatter(x, y)
pl.show() # show the plot on the screen
hours 8: data viz 101 making plots prettier
tinkering parameters
Matplotlib offers a lot of customizations. Let’s look at the key ones.
Changing the line color - different datasets can have different colors
# at lineplot.py
# pl.plot(x, y)
pl.plot(x, y, c='r')
character color
b blue
g green
r red
c cyan
m magenta
y yellow
k black
w white
hours 8: data viz 101 making plots prettier
tinkering parameters
Changing the marker - marks the data points
# at lineplot.py
pl.plot(x, y, c='b', marker='*') # gives blue star shaped markers
pl.plot(x, y, 'b*-') # same plot as above, using a format string (color, marker, line style)
character marker shape
's' square
'o' circle
'p' pentagon
'*' star
'h' hexagon
'+' plus
'D' diamond
'd' thin diamond
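The two ways of choosing a marker can be compared in a short sketch. It assumes the matplotlib.pyplot interface (the same plotting functions as the pylab import used in these slides) and a non-interactive backend so that it runs without a display; the variable names kw_line and fmt_line are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed; drop this line to use a GUI backend
import matplotlib.pyplot as pl  # same plotting functions as `import pylab as pl`

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

pl.figure()
# keyword form: color and marker are given separately
kw_line, = pl.plot(x, y, c='b', marker='*')
# format-string form: color 'b', marker '*', solid line '-' packed into one string
fmt_line, = pl.plot(x, y, 'b*-')

print(kw_line.get_marker(), fmt_line.get_marker())
# pl.show()  # uncomment (without the Agg line) to see the window
```

Leaving the '-' out of the format string would draw the markers without a connecting line.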
hours 8: data viz 101 making plots prettier
tinkering parameters
Plot and axis titles and limits - It is very important to always label plots and the axes of
plots to tell the viewers what they are looking at
pl.xlabel('put label of x axis')
pl.ylabel('put label of y axis')
pl.title('put title here')
You can change the x and y ranges displayed on your plot by:
pl.xlim(x_low, x_high)
pl.ylim(y_low, y_high)
hours 8: data viz 101 making plots prettier
tinkering parameters
#lineplotAxis.py
import pylab as pl
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
pl.plot(x, y)
pl.title('Plot of y vs. x')
pl.xlabel('x axis')
pl.ylabel('y axis')
# set axis limits
pl.xlim(0.0, 7.0)
pl.ylim(0.0, 30.)
pl.show()
hours 8: data viz 101 making plots prettier
plotting more than one plot
#lineplot2Plots.py
import pylab as pl
x1 = [1, 2, 3, 4, 5]
y1 = [1, 4, 9, 16, 25]
x2 = [1, 2, 4, 6, 8]
y2 = [2, 4, 8, 12, 16]
pl.plot(x1, y1, 'r')
pl.plot(x2, y2, 'g')
pl.title('Plot of y vs. x')
pl.xlabel('x axis')
pl.ylabel('y axis')
pl.xlim(0.0, 9.0)
pl.ylim(0.0, 30.)
pl.show()
hours 8: data viz 101 making plots prettier
legen.. wait for it.. dary!
It’s very useful to add legends to plots to differentiate between the different lines or
quantities being plotted
pl.legend([plot1, plot2], ('label1', 'label2'), loc='best')
The first parameter is a list of the plots you want labeled
The second parameter is the list / tuple of labels
The loc parameter is where you would like matplotlib to place your legend. Options include
'upper right', 'upper left', 'center', 'lower left', 'lower right' and 'best'
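An alternative that avoids keeping track of plot handles is sketched here: each plot call gets a label= keyword, and pl.legend is then called with only loc. The sketch assumes the matplotlib.pyplot interface and a non-interactive backend; the data and the label texts are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed; drop this line to use a GUI backend
import matplotlib.pyplot as pl  # same plotting functions as `import pylab as pl`

x = [1, 2, 3, 4, 5]
squares = [v ** 2 for v in x]  # illustrative dataset 1
doubles = [2 * v for v in x]   # illustrative dataset 2

pl.figure()
pl.plot(x, squares, 'r', label='squares')   # red line
pl.plot(x, doubles, 'go', label='doubles')  # green circles

# with labels attached to the plots, legend() only needs a location
legend = pl.legend(loc='best')
legend_texts = [t.get_text() for t in legend.get_texts()]
print(legend_texts)
# pl.show()  # uncomment (without the Agg line) to see the window
```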
hours 8: data viz 101 making plots prettier
#lineplotFigLegend.py
import pylab as pl
x1 = [1, 2, 3, 4, 5]
y1 = [1, 4, 9, 16, 25]
x2 = [1, 2, 4, 6, 8]
y2 = [2, 4, 8, 12, 16]
# plot() returns a list of lines; the trailing comma unpacks the single line
plot1, = pl.plot(x1, y1, 'r')
plot2, = pl.plot(x2, y2, 'go') # 'go' draws green circles, matching the legend label
pl.title('Plot of y vs. x')
pl.xlabel('x axis')
pl.ylabel('y axis')
pl.xlim(0.0, 9.0)
pl.ylim(0.0, 30.)
pl.legend([plot1, plot2], ('red line', 'green circles'), loc='best')
pl.show()
hours 8: data viz 101 making plots prettier
histograms
They are very useful to plot distributions
In Matplotlib you use the hist command to make a histogram
import pylab as pl
from numpy import random
# mean, sigma, number of points
data = random.normal(5.0, 3.0, 1000)
pl.hist(data)
pl.title('a sample histogram')
pl.xlabel('data')
pl.show()
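As a small extension of the histogram above, the sketch below sets the number of bins and inspects what hist returns. It assumes the matplotlib.pyplot interface and a non-interactive backend; the fixed seed and the choice of 20 bins are illustrative assumptions, not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed; drop this line to use a GUI backend
import matplotlib.pyplot as pl  # same plotting functions as `import pylab as pl`
from numpy import random

random.seed(0)  # assumption: fixed seed so the sketch is repeatable
# mean, sigma, number of points
data = random.normal(5.0, 3.0, 1000)

pl.figure()
# hist returns the bin counts, the bin edges, and the drawn bars
counts, edges, bars = pl.hist(data, bins=20)
pl.title('a sample histogram with 20 bins')
pl.xlabel('data')

print(len(counts), len(edges), int(counts.sum()))
# pl.show()  # uncomment (without the Agg line) to see the window
```

There is always one more bin edge than there are bins, and the counts add up to the number of data points.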
hours 8: data viz 101 making plots prettier
subplots
Matplotlib is reasonably flexible about allowing multiple plots per canvas and it is easy to
set this up
You need to first make a figure and then specify subplots as follows
fig1 = pl.figure(1)
pl.subplot(211)
pl.subplot(211) - a figure with 2 rows, 1 column; selects the top plot (1)
pl.subplot(212) - a figure with 2 rows, 1 column; selects the bottom plot (2)
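Put together, a two-panel figure could look like the following sketch. It assumes the matplotlib.pyplot interface and a non-interactive backend; the data and the panel titles are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed; drop this line to use a GUI backend
import matplotlib.pyplot as pl  # same plotting functions as `import pylab as pl`

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

fig = pl.figure()

pl.subplot(211)  # 2 rows, 1 column, first (top) panel
pl.plot(x, y, 'r')
pl.title('top: line plot')

pl.subplot(212)  # 2 rows, 1 column, second (bottom) panel
pl.scatter(x, y, c='g')
pl.xlabel('bottom: scatter plot')

num_panels = len(fig.axes)
print(num_panels)
# pl.show()  # uncomment (without the Agg line) to see the window
```

Every plotting call goes to the most recently selected subplot, so the order of the calls decides which panel receives which data.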
hours 8: data viz 101 making plots prettier
handling data
So far, we have been hard coding the data sets
Actual datasets might be very large! We use file handling
import pylab as pl

def read_data(filename):
    X = []
    Y = []
    with open(filename, 'r') as f:
        for line in f:
            x, y = line.split()
            X.append(float(x))
            Y.append(float(y))
    return X, Y

def plot_data(filename):
    X, Y = read_data(filename)
    pl.scatter(X, Y, c='g')
    pl.xlabel('x')
    pl.ylabel('y')
    pl.title('y vs x')
    pl.show()
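To try the file-reading step without a real dataset, the sketch below writes a tiny temporary file and reads it back. The helper name read_pairs, the sample values, and the blank-line handling are assumptions added for illustration; the file format (one whitespace-separated "x y" pair per line) follows the example above.

```python
import os
import tempfile

def read_pairs(filename):
    """Read whitespace-separated 'x y' pairs, one pair per line."""
    xs, ys = [], []
    with open(filename, 'r') as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines instead of failing on them
            x, y = line.split()
            xs.append(float(x))
            ys.append(float(y))
    return xs, ys

# write a small illustrative data file (note the blank line in the middle)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('1 1\n2 4\n3 9\n\n4 16\n')
    sample_path = tmp.name

X, Y = read_pairs(sample_path)
os.remove(sample_path)  # clean up the temporary file

print(X)  # [1.0, 2.0, 3.0, 4.0]
print(Y)  # [1.0, 4.0, 9.0, 16.0]
```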
hours 9 - 11 SNA 101
table of contents
1 hour 6: let’s get rich!
2 hour 7: algo design 101
3 hours 8: data viz 101
4 hours 9 - 11 SNA 101
Introduction to SNA
Modelling - Introduction and Importance
Representing Networks
hours 9 - 11 SNA 101 Introduction to SNA
Social Network Analysis
Investigating social structures through the use of network and graph theory
Characterizes ties in settings such as friendships, webpages, and disease transmission
Analysis is crucial to understand the flow of influence or disease, or to investigate patterns like
voting patterns and food preferences
hours 9 - 11 SNA 101 Introduction to SNA
Citation and Email networks
hours 9 - 11 SNA 101 Introduction to SNA
Telecommunication and Protein networks
hours 9 - 11 SNA 101 Introduction to SNA
Friendship and Les Misérables
hours 9 - 11 SNA 101 Introduction to SNA
Blackout Aug ’96
A hot summer day. A single transmission line fails in Portland, Oregon
The power line fails. The load is distributed over the remaining lines which were operating
at almost max capacity
The system collapses. Much like a stack of dominoes.
OR => WA
WA => CA
CA => ID
ID => UT
UT => CO
CO => AZ
AZ => NM
NM => NV
hours 9 - 11 SNA 101 Introduction to SNA
The Aftermath
The skyline of San Francisco was dark
A total of 175 generating units failed
Some of the nuclear reactors took days to restart
Total cost of $2 billion
What caused this catastrophe?
Sloppy maintenance
Insufficient attention to warning signs
Pure chance - bad luck
Inadequate understanding of the interdependencies in the system
hours 9 - 11 SNA 101 Introduction to SNA
Complex networks
hours 9 - 11 SNA 101 Introduction to SNA
Complex networks in real life
Advances in gene sequencing reveal that every human has about 30,000 genes
The complexity arises from the interactions between different genes expressing different
characteristics
The parts making up the whole don’t sum up in any simple fashion
The building blocks interact with one another thus generating bewildering behavior
hours 9 - 11 SNA 101 Introduction to SNA
Open problems in SNA
Small outbreaks of diseases becoming epidemics
Resilience of networks - the internet, power grids
What makes some videos go viral?
How to find clusters of nodes that are similar to each other?
How to find a seed set of nodes to maximize influence?
Key question: How does individual behavior aggregate to collective behavior?
Modelling and pattern finding are critical. Random sampling (Leskovec et al.) is one way. The
other is to form synthetic statistical models of networks
hours 9 - 11 SNA 101 Modelling - Introduction and Importance
The Need for Modelling
The models discussed in the talk are simplified
Starting off simple is an essential stage of understanding anything complex
Results from the simple models are often intriguing and fascinating
The cost is abstraction - the results are often hard to apply in real life
Models provide a simple framework for experimentation
Models look to emulate the properties of actual networks to some extent
hours 9 - 11 SNA 101 Representing Networks
Network representation
Networks portray the interactions between different actors. Graphs hand us a valuable tool to process and handle networks.
Actors or individuals are nodes in the graph
If there’s interaction between two nodes, there’s an edge between them
The links can have weights or intensities signifying connection strength
The links can be directed, like in the web graph. There’s a directed link from page A to page B if A contains a hyperlink to B
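One way to hold such a network in plain Python is a dictionary of dictionaries. The sketch below is only an illustration: the page names, the weights and the helper names `add_edge` and `out_neighbours` are made up here, and a graph library would normally do this work.

```python
# A weighted, directed graph stored as a dict of dicts:
# graph[u][v] = weight of the link u -> v.
# Node names and weights below are invented for illustration.

def add_edge(graph, source, target, weight=1.0, directed=True):
    """Add a link; an undirected link is stored in both directions."""
    graph.setdefault(source, {})[target] = weight
    graph.setdefault(target, {})  # make sure the target is a node too
    if not directed:
        graph[target][source] = weight

def out_neighbours(graph, node):
    """Nodes reachable from `node` by following one link."""
    return sorted(graph[node])

web = {}
add_edge(web, 'A', 'B')              # page A has a hyperlink to page B
add_edge(web, 'B', 'C', weight=3.0)  # a stronger connection
add_edge(web, 'C', 'A', weight=0.5)

print(out_neighbours(web, 'A'))  # ['B']
print('A' in web['B'])           # False: the link A -> B is one-way
```

Passing `directed=False` stores the link both ways, which is how an undirected friendship network could be kept in the same structure.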
hours 9 - 11 SNA 101 Representing Networks
Please move to the pdf named tutorial_networkx for the rest of the slides
Thanks!

Workshop on Programming in Python - day II

  • 1.
    Programming in Python
    A Two Day Workshop
    Satyaki Sikdar
    Vice Chair
    ACM Student Chapter
    Heritage Institute of Technology
    April 23 2016
  • 2.
    hour 6: let’s get rich!
    table of contents
    1 hour 6: let’s get rich!
        an elaborate example
        inheritance
        file handling 101
    2 hour 7: algo design 101
    3 hours 8: data viz 101
    4 hours 9 - 11 SNA 101
  • 3.
    hour 6: let’s get rich! an elaborate example
    another example
    There are 52 cards in a deck. There are 4 suits - Spades, Hearts, Diamonds and Clubs
    Each suit has 13 cards - Ace, 2, ..., 10, Jack, Queen and King
    We’ll create a card class. Attributes can be strings, like ’Spade’ for suits and ’Queen’ for ranks. We’ll have trouble comparing the cards
    We use integers to encode the ranks and suits
    Spades → 3, Hearts → 2, Diamonds → 1 and Clubs → 0
    Ace → 1, Jack → 11, Queen → 12 and King → 13
  • 9.
    hour 6: let’s get rich! an elaborate example
    the class definition
    class Card:
        '''Represents a standard playing card'''
        suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
        rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                      'Jack', 'Queen', 'King']
        def __init__(self, suit=0, rank=2):
            self.suit = suit
            self.rank = rank
        def __str__(self):
            return '%s of %s' % (Card.rank_names[self.rank],
                                 Card.suit_names[self.suit])
    >>> two_of_clubs = Card()
    >>> queen_of_diamonds = Card(1, 12)
  • 10.
    hour 6: let’s get rich! an elaborate example
    class and instance attributes
    class attribute                  | instance attribute
    Defined outside any method       | Defined inside methods
    Referred by class.class_attr     | Referred by inst.inst_attr
    One copy per class               | One copy per instance
    Eg: suit_names and rank_names    | Eg: suit and rank
    Figure: Class and instance attributes
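The difference in the table can be checked directly. This is a small sketch using a cut-down Card (only `suit_names` is kept from the slides; the variable names `c1` and `c2` are made up for the example):

```python
class Card:
    '''Cut-down Card used only to show the two kinds of attribute'''
    # class attribute: defined outside any method, one copy per class
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']

    def __init__(self, suit=0, rank=2):
        # instance attributes: defined inside a method, one copy per instance
        self.suit = suit
        self.rank = rank

c1 = Card(1, 12)
c2 = Card(3, 1)

# every instance sees the very same class-level list
print(c1.suit_names is c2.suit_names is Card.suit_names)  # True
# each instance keeps its own suit
print('%d %d' % (c1.suit, c2.suit))  # 1 3
```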
  • 11.
    hour 6: let’s get rich! an elaborate example
    comparing cards
    For built-in types, there are relational operators (<, >, ==, etc.) that compare two things to produce a boolean
    For user-defined types, we need to override the __cmp__ method. It takes in two parameters, self and other, and returns
        a positive number if the first object is greater
        a negative number if the second object is greater
        zero if they are equal
    The ordering for cards is not obvious. Which is better, the 3 of Clubs or the 2 of Diamonds? One has a higher rank, but the other has a higher suit
    We arbitrarily choose that suit is more important, so all the Spades outrank all the Diamonds and so on.
  • 18.
    hour 6: let’s get rich! an elaborate example
    writing the __cmp__ method
    #inside Card class
    def __cmp__(self, other):
        if self.suit > other.suit:    #check the suits
            return 1
        elif self.suit < other.suit:
            return -1
        elif self.rank > other.rank:  #check the ranks
            return 1
        elif self.rank < other.rank:
            return -1
        else:                         #both the suits and the ranks are the same
            return 0
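Note that __cmp__ is only used by Python 2; Python 3 ignores it and relies on the rich comparison methods instead. As a sketch of the same suit-then-rank ordering for Python 3, one option is to compare (suit, rank) tuples and let functools.total_ordering derive the remaining operators. The helper name `_key` and the cut-down Card are assumptions made for this example, not part of the slides.

```python
from functools import total_ordering

@total_ordering
class Card:
    '''Cut-down Card: only the attributes needed for comparison'''
    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def _key(self):
        # tuples compare element by element, so suit is checked first
        return (self.suit, self.rank)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()

three_of_clubs = Card(0, 3)
two_of_diamonds = Card(1, 2)
print(two_of_diamonds > three_of_clubs)  # True: suit outweighs rank
```

With these two methods in place, list.sort() and sorted() order cards by suit first and rank second, just as the __cmp__ version does under Python 2.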
  • 19.
    hour 6: let’s get rich! an elaborate example
    decks
    Now that we have Cards, we define Decks. A Deck will contain a list of Cards
    The init method creates the entire deck of 52 cards
    class Deck:
        '''Represents a deck of cards'''
        def __init__(self):
            self.cards = []
            for suit in range(4):
                for rank in range(1, 14):
                    card = Card(suit, rank)
                    self.cards.append(card)
  • 21.
    hour 6: let’s get rich! an elaborate example
    decks
    import random  #at the top of the file, needed by shuffle
    #inside class Deck
    def __str__(self):
        res = []
        for card in self.cards:
            res.append(str(card))
        return '\n'.join(res)  #one card per line
    def shuffle(self):
        random.shuffle(self.cards)
    def pop_card(self):
        return self.cards.pop()
    def add_card(self, card):
        self.cards.append(card)
    def sort(self):
        self.cards.sort()
    >>> deck = Deck()
    >>> print deck.pop_card()
    King of Spades
  • 22.
    hour 6: let’s get rich! inheritance
    inheritance
    The language feature most often associated with object-oriented programming is inheritance
    It’s the ability to define a new class that’s a modified version of an existing class
    The existing class is called the parent and the new class is called the child
    We want a class to represent a hand, that is, the set of cards held by a player
    A hand is similar to a deck: both are made up of a set of cards, and both require operations like adding and removing cards
    A hand is also different from a deck; there are operations we want for hands that don’t make sense for a deck
  • 28.
    hour 6: let’s get rich! inheritance
    The definition of a child class is like other class definitions, but the name of the parent class appears in parentheses
    class Hand(Deck):
        '''Represents a hand of playing cards'''
    This definition indicates that Hand inherits from Deck; that means we can use methods like pop_card and add_card for Hands as well as Decks
    Hand also inherits __init__ from Deck, but it doesn’t really do what we want: the init method for Hands should initialize cards with an empty list
    We can provide an init method, overriding the one in Deck
    #inside class Hand
    def __init__(self, label=''):
        self.cards = []
        self.label = label
  • 32.
    hour 6: let’s get rich! inheritance
    So when you create a Hand, Python invokes its own init
    >>> hand = Hand('new hand')
    >>> print hand.cards
    []
    >>> print hand.label
    new hand
    But the other methods are inherited from Deck
    >>> deck = Deck()
    >>> card = deck.pop_card()
    >>> hand.add_card(card)  #add_card inherited from Deck
    >>> print hand  #using the __str__ inherited from Deck
    King of Spades
    A natural next step is to encapsulate this code in a method called move_cards
    #inside class Deck
    def move_cards(self, hand, num):
        for i in xrange(num):
            hand.add_card(self.pop_card())
    move_cards takes two arguments, a Hand object and the number of cards to deal. It modifies both self and hand
  • 36.
    hour 6: let’s get rich! inheritance
    #inside class Deck
    def deal_hands(self, num_hands, cards_per_hand):
        hands = []
        self.shuffle()  #shuffling the deck
        for i in range(num_hands):
            hand = Hand('player %d' % (i))
            for j in range(cards_per_hand):
                hand.add_card(self.pop_card())
            hands.append(hand)
        return hands
    Now you have a proper framework for a card game, be it poker, blackjack or bridge!
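To see the pieces working together, here is a condensed, self-contained sketch that stitches the Card, Deck and Hand classes from the slides into one file and deals four hands of five cards. It is trimmed to what the demo needs (no sorting or comparison), the list comprehension in Deck and the fixed random seed are choices made for this example only, and it is written to run under Python 3 as well.

```python
import random

class Card:
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                  'Jack', 'Queen', 'King']

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def __str__(self):
        return '%s of %s' % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])

class Deck:
    def __init__(self):
        # all 52 (suit, rank) combinations
        self.cards = [Card(suit, rank)
                      for suit in range(4) for rank in range(1, 14)]

    def shuffle(self):
        random.shuffle(self.cards)

    def pop_card(self):
        return self.cards.pop()

    def add_card(self, card):
        self.cards.append(card)

    def deal_hands(self, num_hands, cards_per_hand):
        hands = []
        self.shuffle()
        for i in range(num_hands):
            hand = Hand('player %d' % i)
            for j in range(cards_per_hand):
                hand.add_card(self.pop_card())
            hands.append(hand)
        return hands

class Hand(Deck):
    def __init__(self, label=''):
        self.cards = []  # a hand starts empty, unlike a deck
        self.label = label

random.seed(0)  # fixed seed only so repeated runs deal the same cards
deck = Deck()
hands = deck.deal_hands(4, 5)
for hand in hands:
    print('%s: %s' % (hand.label, ', '.join(str(c) for c in hand.cards)))
print('%d cards left in the deck' % len(deck.cards))  # 32 cards left in the deck
```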
  • 37.
    hour 6: let’s get rich! file handling 101
    the need for file handling
    Most of the programs we have seen so far are transient in the sense that they run for a short time and produce some output, but when they end, their data disappears. If you run the program again, it starts with a clean slate
    Other programs are persistent: they run for a long time (or all the time); they keep at least some of their data in permanent storage (a hard drive, for example); if they shut down and restart, they pick up where they left off
    Files are also needed when the input or output is too big to fit in the main memory
  • 40.
    hour 6: let’s get rich! file handling 101
    Examples of persistent programs are operating systems, which run pretty much whenever a computer is on, and web servers, which run all the time, waiting for requests to come in on the network.
    One of the simplest ways for programs to maintain their data is by reading and writing text files.
    fp_read = open('input.txt', 'r')
    fp_write = open('output.txt', 'w')
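A round trip makes the idea concrete: write a few lines, then read them back. This is a sketch only; the file name scores.txt and its contents are invented, and a temporary directory is used so that no real file is overwritten. The with statement is one common way to make sure the file gets closed.

```python
import os
import tempfile

# a throwaway directory so the sketch does not clobber real files;
# 'scores.txt' is a made-up name
path = os.path.join(tempfile.mkdtemp(), 'scores.txt')

with open(path, 'w') as fp_write:   # 'w' creates or truncates the file
    fp_write.write('alice 10\n')
    fp_write.write('bob 7\n')

totals = {}
with open(path, 'r') as fp_read:    # the file is closed when the block ends
    for line in fp_read:
        name, score = line.split()
        totals[name] = int(score)

print(totals['alice'] + totals['bob'])  # 17
```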
  • 42.
    file handling 101: reading from files
    The built-in function open takes the name of the file as a parameter and returns a file object you can use to read the file.
        >>> fin = open('input.txt', 'r')
        >>> print fin
        <open file 'input.txt', mode 'r' at 0xb7eb2410>
    A few things to note:
    The file being opened must exist; otherwise an IOError is raised.
    The exact path to the file must be provided, including the correct filename with its extension (if any).
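The first note can be tried out safely. What follows is only a sketch, assuming Python 3 (where IOError is an alias of OSError); the helper try_open and the file name no_such_file.txt are made up for illustration.

```python
import os
import tempfile

def try_open(path):
    """Return the first line of the file at path, or None if it cannot be opened."""
    try:
        fin = open(path, 'r')
    except IOError as err:  # raised when the file does not exist or is unreadable
        print('could not open %s: %s' % (path, err))
        return None
    try:
        return fin.readline()
    finally:
        fin.close()  # always release the file handle

# Hypothetical path inside a fresh temporary directory, so it cannot exist.
missing = os.path.join(tempfile.mkdtemp(), 'no_such_file.txt')
result = try_open(missing)
```

Catching the error lets the program report the problem and carry on instead of stopping with a traceback.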
  • 46.
    file handling 101: reading files
    The file object provides several methods for reading, including readline, which reads characters from the file until it gets to a newline and returns the result as a string:
        >>> fin.readline()
        'the first line\n'
    If you keep on calling fin.readline(), you’d end up reading the whole file, one line at a time. Let’s see a few examples of reading files.
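Here is one possible set of reading examples, sketched for Python 3. The sample file and its two lines are invented so that the snippet runs on its own; readlines (plural) and iteration over the file object are standard alternatives to calling readline repeatedly.

```python
import os
import tempfile

# Build a small sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'input.txt')
with open(path, 'w') as f:
    f.write('the first line\nthe second line\n')

# Repeated readline() calls: an empty string signals the end of the file.
lines_a = []
fin = open(path, 'r')
while True:
    line = fin.readline()
    if line == '':
        break
    lines_a.append(line)
fin.close()

# Iterating over the file object yields the same lines with less code.
with open(path, 'r') as fin:
    lines_b = [line for line in fin]

# readlines() (plural) returns all remaining lines at once, as a list.
with open(path, 'r') as fin:
    lines_c = fin.readlines()
```

All three approaches give the same list of lines; the loop over the file object is usually the most convenient for big files because it never holds more than one line at a time.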
  • 47.
    file handling 101: writing to files
        >>> fout = open('output.txt', 'w')
        >>> print fout
        <open file 'output.txt', mode 'w' at 0xb7eb2410>
    If the file already exists, opening it in write mode clears out the old data and starts fresh, so be careful! If the file doesn’t exist, a new one is created.
        >>> line1 = 'He left yesterday behind him, you might say he was born again,\n'
        >>> fout.write(line1)
    Again, the file object keeps track of where it is, so if you call write again, it adds the new data to the end.
        >>> line2 = 'you might say he found a key for every door.\n'
        >>> fout.write(line2)
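The difference between starting fresh and adding to the end can be checked with a short experiment. This is a sketch for Python 3 using a temporary file; append mode ('a') is not covered on the slide and is shown here only as a contrast to 'w'.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.txt')

# 'w' creates the file (or truncates an existing one).
with open(path, 'w') as fout:
    fout.write('first\n')
    fout.write('second\n')   # continues where the previous write stopped

# Reopening with 'w' discards the two lines written above.
with open(path, 'w') as fout:
    fout.write('fresh start\n')

# 'a' (append mode) keeps the old contents and adds to the end.
with open(path, 'a') as fout:
    fout.write('appended\n')

with open(path, 'r') as fin:
    contents = fin.read()
```

After running this, contents holds only the last two lines, which shows that the second open in write mode wiped the earlier data.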
  • 48.
    file handling 101: using files for something meaningful
    Let’s combine the knowledge of file handling with dictionaries to do some basic lexical analysis.
        import string

        def char_freq(filename):
            # count how often each ASCII letter occurs in the file, ignoring case
            counter = dict()
            with open(filename, 'r') as f:
                raw_text = f.read()
            for c in raw_text:
                c = c.lower()
                if c in string.ascii_lowercase:
                    if c in counter:
                        counter[c] += 1
                    else:
                        counter[c] = 1
            return counter

        def normalize(counter):
            # turn the counts into relative frequencies (modifies counter in place)
            sum_values = float(sum(counter.values()))
            for key in counter:
                counter[key] /= sum_values
            return counter
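The same letter-counting idea can be written more compactly with collections.Counter from the standard library. This is an alternative sketch, not the version from the slide: the helper names char_freq_text and normalized are made up, and the function takes a string rather than a filename so that it can be tried without a file.

```python
import collections
import string

def char_freq_text(raw_text):
    """Count ASCII letters in a string, ignoring case (hypothetical helper)."""
    letters = [c for c in raw_text.lower() if c in string.ascii_lowercase]
    return collections.Counter(letters)

def normalized(counter):
    """Return a new dict of relative frequencies; the input is left unchanged."""
    total = float(sum(counter.values()))
    return {key: count / total for key, count in counter.items()}

counts = char_freq_text('Hello, World!')
freqs = normalized(counts)
```

Returning a new dictionary from normalized keeps the raw counts available, which is handy if both the counts and the frequencies are needed later.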
  • 49.
    hour 7: algo design 101
    table of contents
    1 hour 6: let’s get rich!
    2 hour 7: algo design 101
        merge sort
        modules
    3 hours 8: data viz 101
    4 hours 9 - 11 SNA 101
  • 50.
    merge sort: algorithm design in Python
    One of the strong points of Python is the ease of expression.
    Turning pseudocode into actual code is not difficult.
    Let’s try to implement the Merge Sort algorithm in Python.
    A high-level idea of the algorithm:
    Divide: Divide the n-element sequence into two subsequences of n/2 elements each
    Conquer: Sort the subsequences recursively
    Combine: Merge the two sorted subsequences to produce the sorted answer
  • 56.
    merge sort
    Algorithm 1: MERGE(left, right)
        begin
            Append ∞ to left and right
            i ← 0, j ← 0
            merged ← new list
            while len(merged) < len(left) + len(right) - 2 do
                if left[i] < right[j] then
                    merged.append(left[i])
                    i ← i + 1
                else
                    merged.append(right[j])
                    j ← j + 1
            return merged
    Algorithm 2: MERGE-SORT(A), with n = len(A)
        begin
            if len(A) < 2 then
                return A
            else
                left ← first ⌊n/2⌋ elements of A
                right ← last ⌈n/2⌉ elements of A
                left ← MERGE-SORT(left)
                right ← MERGE-SORT(right)
                return MERGE(left, right)
  • 57.
    merge sort: the core idea
    The algorithm is naturally recursive.
    The MERGE method takes two sorted lists and merges them into a single sorted list.
    MERGE-SORT sorts the list recursively by breaking it into (nearly) equal-sized halves and sorting them.
    A list having fewer than 2 elements is trivially sorted - base case.
    Smaller sorted lists are merged to form the overall sorted list.
  • 62.
    merge sort
    Algorithm 3: MERGE(left, right), the same pseudocode as Algorithm 1, translated to Python:
        def merge(left, right):
            # sentinels: infinity is never the smaller item, so neither index runs off its list
            # (note: this appends to the caller's lists and assumes numeric items)
            left.append(float('inf'))
            right.append(float('inf'))
            i = 0
            j = 0
            merged = []
            # the two sentinels are not copied, hence the - 2
            while len(merged) < len(left) + len(right) - 2:
                if left[i] < right[j]:
                    merged.append(left[i])
                    i += 1
                else:
                    merged.append(right[j])
                    j += 1
            return merged
  • 63.
    merge sort
    Algorithm 4: MERGE-SORT(A), the same pseudocode as Algorithm 2, translated to Python:
        def merge_sort(A):
            if len(A) < 2:
                return A              # base case: already sorted
            else:
                mid = len(A) // 2     # integer division, so mid is a valid slice index
                left = A[: mid]
                right = A[mid: ]
                left = merge_sort(left)
                right = merge_sort(right)
                return merge(left, right)
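For comparison, here is a variant of the merge step that does not use the ∞ sentinel. It is a sketch of a different technique, not the version on the slides: it leaves its inputs unchanged and also works for items that cannot be compared with a float, such as strings.

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list without modifying them."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal items in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two slices is non-empty
    merged.extend(right[j:])
    return merged

def merge_sort(A):
    if len(A) < 2:
        return A
    mid = len(A) // 2
    return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))

numbers = merge_sort([5, 2, 4, 7, 1, 3, 2, 6])
words = merge_sort(['pear', 'apple', 'fig'])
```

The sentinel version is shorter and closer to the pseudocode; this one trades a couple of extra lines for fewer assumptions about the data.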
  • 64.
    modules: extending functionalities beyond basic Python
    Modules are external files and libraries that provide additional functions and classes to bare-bones Python.
    Modules are files containing Python definitions and statements (e.g. name.py).
    The interface is very simple. Definitions can be imported into other modules by using “import name”.
    To access a module’s functions, type “name.function()”.
    Each module is imported once per session.
    Give nicknames to modules by using as.
  • 70.
    modules: extending functionalities beyond basic Python
    Python has a lot of predefined modules - sys, __future__, math, random, re, ...
    The Zen of Python. Do import this.
    Each module is highly specialized.
    You have various choices when importing things from a module:
    Import the whole module, but preserve the namespace - important when dealing with a lot of modules and keeping track of things
        import module_name
    Import the whole module, but bring everything into the current namespace
        from module_name import *
    Import only specific things - often slightly faster to use, since each name is bound directly in the current namespace (the whole module is still loaded)
        from math import pi, sin, cos
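The import styles can be put side by side in a few lines. This is a small sketch using only the standard math and random modules; the nickname rnd is arbitrary.

```python
import math                      # whole module, names stay behind the math. prefix
import random as rnd             # same, under a shorter nickname
from math import pi, sin, cos    # only selected names, usable without a prefix

area = math.pi * 2 ** 2          # via the module namespace
same_pi = (pi is math.pi)        # both names refer to the same object
identity = sin(0.5) ** 2 + cos(0.5) ** 2   # approximately 1.0

rnd.seed(42)                     # the nickname is just another name for random
roll = rnd.randint(1, 6)
```

The star form (from module_name import *) is left out on purpose: it works, but it makes it hard to tell where a name came from.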
  • 81.
    modules: the sys module
    This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter.
    sys.argv - The list of command line arguments passed to a Python script
        argv[0] is the script name
        Further command line args are stored in argv[1] onwards. E.g. for python test_prog.py arg1 arg2 arg3, argv = ['test_prog.py', 'arg1', 'arg2', 'arg3']
    sys.getrecursionlimit() - Return the current value of the recursion limit, the maximum depth of the Python interpreter stack
    sys.setrecursionlimit(limit) - Set the maximum depth of the Python interpreter stack to limit
        This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python
        The highest possible limit is platform-dependent
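A short sketch of these three names in use. The toy function depth and the numbers 2000 and 1200 are arbitrary choices for illustration; a safe value for the limit depends on the platform, as noted above.

```python
import sys

def depth(n):
    """Recurse n levels deep and return n (a toy function for illustration)."""
    return 0 if n == 0 else 1 + depth(n - 1)

# argv[0] is the script name; guard against an empty list in embedded setups.
script_name = sys.argv[0] if sys.argv else ''
extra_args = sys.argv[1:]        # everything after the script name, as strings

old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(2000)      # temporary limit chosen for this example
try:
    reached = depth(1200)        # deeper than the usual default limit of 1000
finally:
    sys.setrecursionlimit(old_limit)   # restore the previous setting
```

Restoring the old limit in a finally block keeps the change local to the one call that needs it.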
  • 90.
    modules: make your own module
    Making modules is very easy - at least the basic ones anyway.
    Create a script in IDLE or in a decent text editor.
    Write the functions, classes and variables you want the module to have (say, three functions f1, f2, f3 and two variables v1 and v2).
    Save the script as my_mod.py.
    Create another Python script, in the same directory, where you’ll use the module.
    Write import my_mod anywhere before you use it and you’re done!
    dir(modulename) gives a sorted list of strings naming the things defined in the module.
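The steps above can be sketched in a single script. To keep it runnable on its own, the module file is written into a temporary directory and that directory is added to sys.path; in the workflow described on the slide you would simply save my_mod.py next to your script. The contents of my_mod (one variable v1 and one function f1) are invented for the example.

```python
import os
import sys
import tempfile

# Write a tiny module file; my_mod, f1, and v1 mirror the names used above.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'my_mod.py'), 'w') as f:
    f.write('v1 = 42\n')
    f.write('def f1(x):\n')
    f.write('    return x + v1\n')

# A script normally finds modules in its own directory; here the temporary
# directory is added to the search path to get the same effect.
sys.path.insert(0, module_dir)
import my_mod

answer = my_mod.f1(1)
public_names = [name for name in dir(my_mod) if not name.startswith('_')]
```

dir(my_mod) also lists names that Python adds automatically (such as __name__ and __file__), which is why the last line filters out names starting with an underscore.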
  • 97.
    hour 8: data viz 101, table of contents
    1 hour 6: let’s get rich!
    2 hour 7: algo design 101
    3 hour 8: data viz 101
        plotting
        matplotlib
        making plots prettier
    4 hours 9-11: SNA 101
  • 98.
    hour 8: data viz 101, plotting: data visualization
    Data visualization turns numbers and letters into aesthetically pleasing visuals, making it easy to recognize patterns and find exceptions.
    Figure: US Census data (2010)
    It is easy to see some general settlement patterns in the US.
    The East Coast has a much greater population density than the rest of America.
    Racial homophily is also apparent.
  • 102.
    hour 8: data viz 101, plotting: love in the time of cholera
  • 103.
    hour 8: data viz 101, plotting
    Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed.
    It was constructed in 1973 by Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on statistical properties.
    Property              Value
    mean of x             9
    variance of x         11
    mean of y             7.50
    variance of y         4.122
    correlation (x, y)    0.816
    regression line       y = 3 + 0.5x
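As a rough check of the table above, the sketch below recomputes the same properties for the first of the four datasets using only the standard library. The data values are the commonly published ones for Anscombe's dataset I and are not taken from the slides; the function name summarize is made up for illustration.

```python
import math
import statistics

# Anscombe's dataset I (commonly published values; an assumption, not from the slides).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]


def summarize(xs, ys):
    """Return mean, sample variance, correlation and least-squares line for (xs, ys)."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    # Sums of squared deviations and of cross products.
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {
        "mean_x": mean_x,
        "var_x": statistics.variance(xs),  # sample variance (divides by n - 1)
        "mean_y": mean_y,
        "var_y": statistics.variance(ys),
        "correlation": sxy / math.sqrt(sxx * syy),
        "slope": slope,
        "intercept": intercept,
    }


stats = summarize(x, y)
for key, value in stats.items():
    print(key, round(value, 3))
```

Running the same function on the other three datasets gives nearly the same numbers, which is the point of the quartet: the summary statistics agree while the plots do not.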
  • 105.
    hour 8: data viz 101, plotting: plotting the four datasets
  • 106.
    hour 8: data viz 101, plotting: more reasons to visualize the data
    Visualization is the highest-bandwidth channel into the human brain.
    The visual cortex is the largest system in the human brain; it’s wasteful not to make use of it.
    As data volumes grow, visualization becomes a necessity rather than a luxury.
    "A picture is worth a thousand words"
  • 109.
    hour 8: data viz 101, matplotlib: matplotlib and pylab
    Matplotlib is a third-party module that provides an interface for making plots in Python.
    It was inspired by Matlab’s plotting library, hence the name.
    pylab (or matplotlib.pyplot, which the Matplotlib documentation now recommends) is the module defined by matplotlib that is used to make plots.
    I’ll cover the three most used types of plots in some detail:
    line plots
    scatter plots
    histograms
  • 116.
    hour 8: data viz 101, matplotlib: line plots
        # lineplot.py
        import pylab as pl

        x = [1, 2, 3, 4, 5]
        y = [1, 4, 9, 16, 25]
        pl.plot(x, y)
        pl.show()  # show the plot on the screen
  • 117.
    hour 8: data viz 101, matplotlib: scatter plots
        # scatterplot.py
        import pylab as pl

        x = [1, 2, 3, 4, 5]
        y = [1, 4, 9, 16, 25]
        pl.scatter(x, y)
        pl.show()  # show the plot on the screen
  • 118.
    hour 8: data viz 101, making plots prettier: tinkering parameters
    Matplotlib offers a lot of customizations. Let’s look at the key ones.
    Changing the line color: different datasets can have different colors.
        # in lineplot.py, replace
        #     pl.plot(x, y)
        # with
        pl.plot(x, y, c='r')
    character   color
    b           blue
    g           green
    r           red
    c           cyan
    m           magenta
    y           yellow
    k           black
    w           white
  • 119.
    hour 8: data viz 101, making plots prettier: tinkering parameters
    Changing the marker: the marker marks the data points.
        # in lineplot.py
        pl.plot(x, y, c='b', marker='*')  # gives a blue line with star-shaped markers
        pl.plot(x, y, 'b-*')              # same plot as above, using a format string
    character   marker shape
    's'         square
    'o'         circle
    'p'         pentagon
    '*'         star
    'h'         hexagon
    '+'         plus
    'D'         diamond
    'd'         thin diamond
  • 120.
    hour 8: data viz 101, making plots prettier: tinkering parameters
    Plot and axis titles and limits: always label your plots and their axes to tell viewers what they are looking at.
        pl.xlabel('put label of x axis')
        pl.ylabel('put label of y axis')
        pl.title('put title here')
    You can change the x and y ranges displayed on your plot with:
        pl.xlim(x_low, x_high)
        pl.ylim(y_low, y_high)
  • 122.
    hour 8: data viz 101, making plots prettier: tinkering parameters
        # lineplotAxis.py
        import pylab as pl

        x = [1, 2, 3, 4, 5]
        y = [1, 4, 9, 16, 25]
        pl.plot(x, y)
        pl.title('Plot of y vs. x')
        pl.xlabel('x axis')
        pl.ylabel('y axis')
        # set axis limits
        pl.xlim(0.0, 7.0)
        pl.ylim(0.0, 30.0)
        pl.show()
  • 123.
    hour 8: data viz 101, making plots prettier: plotting more than one plot
        # lineplot2Plots.py
        import pylab as pl

        x1 = [1, 2, 3, 4, 5]
        y1 = [1, 4, 9, 16, 25]
        x2 = [1, 2, 4, 6, 8]
        y2 = [2, 4, 8, 12, 16]
        pl.plot(x1, y1, 'r')  # first dataset in red
        pl.plot(x2, y2, 'g')  # second dataset in green
        pl.title('Plot of y vs. x')
        pl.xlabel('x axis')
        pl.ylabel('y axis')
        pl.xlim(0.0, 9.0)
        pl.ylim(0.0, 30.0)
        pl.show()
  • 124.
    hour 8: data viz 101, making plots prettier: legen.. wait for it.. dary!
    It’s very useful to add legends to plots to differentiate between the different lines or quantities being plotted.
        pl.legend([plot1, plot2], ('label1', 'label2'), loc='best')
    The first parameter is a list of the plotted lines you want labeled. pl.plot returns a list of lines, so unpack the single line with plot1, = pl.plot(...).
    The second parameter is the list or tuple of labels.
    The loc parameter is where you would like matplotlib to place your legend. Options include 'upper right', 'upper left', 'center', 'lower left', 'lower right' and 'best'.
  • 128.
    hour 8: data viz 101, making plots prettier
        # lineplotFigLegend.py
        import pylab as pl

        x1 = [1, 2, 3, 4, 5]
        y1 = [1, 4, 9, 16, 25]
        x2 = [1, 2, 4, 6, 8]
        y2 = [2, 4, 8, 12, 16]
        # pl.plot returns a list of lines; the trailing comma unpacks the single line
        plot1, = pl.plot(x1, y1, 'r')
        plot2, = pl.plot(x2, y2, 'g')
        pl.title('Plot of y vs. x')
        pl.xlabel('x axis')
        pl.ylabel('y axis')
        pl.xlim(0.0, 9.0)
        pl.ylim(0.0, 30.0)
        pl.legend([plot1, plot2], ('red line', 'green line'), loc='best')
        pl.show()
  • 129.
    hour 8: data viz 101, making plots prettier: histograms
    Histograms are very useful for plotting distributions.
    In Matplotlib you use the hist command to make a histogram.
        import pylab as pl
        from numpy import random

        # arguments: mean, standard deviation (sigma), number of points
        data = random.normal(5.0, 3.0, 1000)
        pl.hist(data)
        pl.title('a sample histogram')
        pl.xlabel('data')
        pl.show()
  • 131.
    hour 8: data viz 101, making plots prettier: subplots
    Matplotlib is reasonably flexible about allowing multiple plots per canvas, and it is easy to set up.
    You first make a figure and then specify subplots as follows:
        fig1 = pl.figure(1)
        pl.subplot(211)
    pl.subplot(211): a grid with 2 rows and 1 column; selects the top plot (1)
    pl.subplot(212): a grid with 2 rows and 1 column; selects the bottom plot (2)
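A minimal sketch of the two-row layout described above. It imports matplotlib.pyplot under the same pl alias, selects the non-interactive Agg backend and saves to a temporary PNG file instead of calling pl.show(); those choices, the sample data and the file name subplots_demo.png are assumptions made so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen so no display is needed
import matplotlib.pyplot as pl

x = [1, 2, 3, 4, 5]
y_squares = [v ** 2 for v in x]
y_doubles = [2 * v for v in x]

fig = pl.figure(1)

# 2 rows, 1 column, plot number 1: the top panel.
top = pl.subplot(211)
top.plot(x, y_squares, 'r')
top.set_title('y = x^2')

# 2 rows, 1 column, plot number 2: the bottom panel.
bottom = pl.subplot(212)
bottom.scatter(x, y_doubles, c='g')
bottom.set_title('y = 2x')

fig.tight_layout()  # keep titles and tick labels from overlapping

output_path = os.path.join(tempfile.mkdtemp(), "subplots_demo.png")
fig.savefig(output_path)
print(output_path)
```

On a desktop you can drop the matplotlib.use("Agg") line and end with pl.show() to see both panels in a window.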
  • 136.
    hour 8: data viz 101, making plots prettier: handling data
    So far, we have been hard-coding the datasets.
    Actual datasets might be very large! We use file handling.
        import pylab as pl

        def read_data(filename):
            # each line of the file holds one "x y" pair separated by whitespace
            X = []
            Y = []
            with open(filename, 'r') as f:
                for line in f:
                    x, y = line.split()
                    X.append(float(x))
                    Y.append(float(y))
            return X, Y

        def plot_data(filename):
            X, Y = read_data(filename)
            pl.scatter(X, Y, c='g')
            pl.xlabel('x')
            pl.ylabel('y')
            pl.title('y vs x')
            pl.show()
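To try the reader without a real dataset, the sketch below writes a small sample file and reads it back. It is a slightly more defensive variant of read_data that skips blank or malformed lines, which is an added assumption about messy input; the helper write_sample and the file name points.txt are invented for illustration.

```python
import os
import tempfile


def read_data(filename):
    """Read whitespace-separated "x y" pairs, one pair per line."""
    xs, ys = [], []
    with open(filename, "r") as f:
        for line in f:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip blank or malformed lines instead of crashing
            xs.append(float(fields[0]))
            ys.append(float(fields[1]))
    return xs, ys


def write_sample(path):
    """Write a tiny dataset of squares, with one blank line thrown in."""
    with open(path, "w") as f:
        f.write("1 1\n2 4\n\n3 9\n4 16\n")


sample_path = os.path.join(tempfile.mkdtemp(), "points.txt")
write_sample(sample_path)
X, Y = read_data(sample_path)
print(X)
print(Y)
```

The returned lists can be passed straight to pl.scatter(X, Y), as plot_data does on the slide.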
  • 138.
    hours 9-11: SNA 101, table of contents
    1 hour 6: let’s get rich!
    2 hour 7: algo design 101
    3 hour 8: data viz 101
    4 hours 9-11: SNA 101
        Introduction to SNA
        Modelling - Introduction and Importance
        Representing Networks
  • 139.
    hours 9-11: SNA 101, Introduction to SNA: Social Network Analysis
    The investigation of social structures through the use of network and graph theory.
    It characterizes ties among, say, friends, webpages, or people in a chain of disease transmission.
    Analysis is crucial to understand the flow of influence or disease, and to investigate patterns such as voting patterns and food preferences.
  • 142.
    hours 9-11: SNA 101, Introduction to SNA: Citation and Email networks
  • 144.
    hours 9-11: SNA 101, Introduction to SNA: Telecommunication and Protein networks
  • 146.
    hours 9-11: SNA 101, Introduction to SNA: Friendship and Les Misérables
  • 148.
    hours 9-11: SNA 101, Introduction to SNA: Blackout Aug ’96
    A hot summer day. A single transmission line fails in Portland, Oregon.
    The load is redistributed over the remaining lines, which were already operating at almost maximum capacity.
    The system collapses, much like a row of dominoes.
    OR => WA, WA => CA, CA => ID, ID => UT, UT => CO, CO => AZ, AZ => NM, NM => NV
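One way to look at the cascade above as a network is to store each "A => B" step as a directed edge and walk the edges from the first failure. This is only an illustrative sketch: the adjacency-dictionary representation, the breadth-first walk and the function names are assumptions, not something the slides prescribe.

```python
from collections import deque

# Each pair (a, b) records that the failure spread from state a to state b.
edges = [
    ("OR", "WA"), ("WA", "CA"), ("CA", "ID"), ("ID", "UT"),
    ("UT", "CO"), ("CO", "AZ"), ("AZ", "NM"), ("NM", "NV"),
]


def build_adjacency(edge_list):
    """Map every node to the list of nodes it points to."""
    adjacency = {}
    for src, dst in edge_list:
        adjacency.setdefault(src, []).append(dst)
        adjacency.setdefault(dst, [])  # make sure sink nodes appear too
    return adjacency


def cascade_order(adjacency, start):
    """Breadth-first walk: the order in which nodes are reached from start."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


graph = build_adjacency(edges)
print(cascade_order(graph, "OR"))
```

Because this particular cascade is a single chain, the walk simply replays the slide's sequence; on a branching network the same function would list every node reachable from the first failure.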
  • 159.
    hours 9-11: SNA 101, Introduction to SNA: The Aftermath
    The skyline of San Francisco was dark.
    A total of 175 generating units failed.
    Some of the nuclear reactors took days to restart.
    Total cost of $2 billion.
    What caused this catastrophe?
    Sloppy maintenance
    Insufficient attention to warning signs
    Pure chance - bad luck
    Inadequate understanding of the interdependencies in the system
  • 167.
    hours 9-11: SNA 101, Introduction to SNA: Complex networks
  • 168.
    hours 9-11: SNA 101, Introduction to SNA: Complex networks in real life
    Advances in gene sequencing reveal that every human has about 30,000 genes.
    The complexity arises from the interactions between different genes expressing different characteristics.
    The parts making up the whole don’t sum up in any simple fashion.
    The building blocks interact with one another, generating bewildering behavior.
  • 172.
    hours 9 - 11 SNA 101: Introduction to SNA. Open problems in SNA: How do small outbreaks of disease become epidemics? How resilient are networks such as the internet and power grids? What makes some videos go viral? How do we find clusters of nodes that are similar to each other? How do we find a seed set of nodes that maximizes influence? Key question: how does individual behavior aggregate to collective behavior? Modelling and pattern finding are critical. Random sampling (Leskovec et al.) is one way; the other is to build synthetic statistical models of networks.
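The slide does not say which sampling scheme is meant, so the sketch below assumes the simplest one: pick nodes uniformly at random and keep only the edges among them (an induced subgraph). The function name `sample_induced_subgraph` and the toy ring network are hypothetical, chosen only for illustration.

```python
import random


def sample_induced_subgraph(graph, k, seed=None):
    """Pick k nodes uniformly at random and keep only the edges among them.

    `graph` is assumed to be an undirected network stored as
    {node: set_of_neighbours}.
    """
    rng = random.Random(seed)
    kept = set(rng.sample(sorted(graph), k))
    # Keep each sampled node, but drop neighbours that were not sampled.
    return {u: graph[u] & kept for u in kept}


# Hypothetical toy network: a ring of 10 nodes.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
sub = sample_induced_subgraph(ring, k=5, seed=1)
print(sub)
```

A sample like this is much smaller than the original network, so patterns can be studied cheaply, at the risk that the sample distorts properties such as degree or connectivity.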
  • 181.
    hours 9 - 11 SNA 101: Modelling - Introduction and Importance. The Need for Modelling: The models discussed in the talk are simplified. Starting off simple is an essential stage of understanding anything complex. Results from the simple models are often intriguing and fascinating. The cost is abstraction: the results are often hard to apply in real life. Models provide a simple framework for experimentation. Models try to emulate the properties of actual networks to some extent.
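As one illustration of how simple such a model can be, here is a minimal sketch of the classic G(n, p) random graph, where every possible edge is included independently with probability p. The slides do not name a specific model at this point, so this choice, the function names and the parameter values are all assumptions.

```python
import random
from itertools import combinations


def gnp_random_graph(n, p, seed=None):
    """Return an undirected G(n, p) graph as {node: set_of_neighbours}."""
    rng = random.Random(seed)
    graph = {node: set() for node in range(n)}
    # Consider every unordered pair of nodes exactly once.
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def average_degree(graph):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)


g = gnp_random_graph(n=200, p=0.05, seed=42)
# In expectation the average degree is p * (n - 1), here 9.95.
print(average_degree(g))
```

Even this toy model supports experiments: vary p and watch how the average degree or the connectivity of the graph changes, then compare against a real network.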
  • 187.
    hours 9 - 11 SNA 101: Representing Networks. Network representation: Networks portray the interactions between different actors. Graphs give us a valuable tool to process and handle networks. Actors or individuals are nodes in the graph. If there’s an interaction between two nodes, there’s an edge between them. The links can have weights or intensities signifying connection strength. The links can be directed, as in the web graph: there’s a directed link from node (page) A to node (page) B if A contains a hyperlink to B.
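The ideas on this slide (nodes, edges, weights, direction) can be sketched with nothing more than nested dictionaries. The `DiGraph` class below is a hypothetical, hand-rolled illustration written for this example; it is not the API of any library, and the page names and weights are made up.

```python
from collections import defaultdict


class DiGraph:
    """A tiny directed, weighted graph stored as a dict of dicts."""

    def __init__(self):
        # adjacency[u][v] holds the weight of the edge u -> v
        self.adjacency = defaultdict(dict)

    def add_edge(self, u, v, weight=1.0):
        self.adjacency[u][v] = weight
        # Register v as a node even if it has no outgoing links.
        self.adjacency.setdefault(v, {})

    def nodes(self):
        return list(self.adjacency)

    def out_degree(self, u):
        return len(self.adjacency[u])

    def in_degree(self, v):
        return sum(1 for nbrs in self.adjacency.values() if v in nbrs)


# Hypothetical web graph: an edge A -> B means page A links to page B.
web = DiGraph()
web.add_edge("A", "B")
web.add_edge("A", "C")
web.add_edge("C", "B", weight=2.0)  # weight: e.g. number of hyperlinks

print(web.nodes())
print(web.out_degree("A"), web.in_degree("B"))
```

For an undirected network, the same structure works if every edge is stored in both directions.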
  • 198.
    hours 9 - 11 SNA 101: Representing Networks. Please move to the PDF named tutorial_networkx for the rest of the slides. Thanks!