Object Oriented Programing in Python

Uploaded by

Garuma Abdisa
Copyright
© All Rights Reserved

Object-Oriented Programming (OOP) in

Python 3
by David Amos

Table of Contents

• What Is Object-Oriented Programming in Python?
• Define a Class in Python
    o Classes vs Instances
    o How to Define a Class
• Instantiate an Object in Python
    o Class and Instance Attributes
    o Instance Methods
• Inherit From Other Classes in Python
    o Dog Park Example
    o Parent Classes vs Child Classes
    o Extend the Functionality of a Parent Class
• Conclusion


Object-oriented programming (OOP) is a method of structuring a program by bundling related
properties and behaviors into individual objects. In this tutorial, you’ll learn the basics of
object-oriented programming in Python.

Conceptually, objects are like the components of a system. Think of a program as a factory
assembly line of sorts. At each step of the assembly line a system component processes some
material, ultimately transforming raw material into a finished product.

An object contains data, like the raw or preprocessed materials at each step on an assembly line,
and behavior, like the action each assembly line component performs.

In this tutorial, you’ll learn how to:

• Create a class, which is like a blueprint for creating an object
• Use classes to create new objects
• Model systems with class inheritance

Note: This tutorial is adapted from the chapter “Object-Oriented Programming (OOP)” in
Python Basics: A Practical Introduction to Python 3.

The book uses Python’s built-in IDLE editor to create and edit Python files and interact with the
Python shell, so you will see occasional references to IDLE throughout this tutorial. However,
you should have no problems running the example code from the editor and environment of your
choice.


What Is Object-Oriented Programming in Python?


Object-oriented programming is a programming paradigm that provides a means of structuring
programs so that properties and behaviors are bundled into individual objects.

For instance, an object could represent a person with properties like a name, age, and address
and behaviors such as walking, talking, breathing, and running. Or it could represent an email
with properties like a recipient list, subject, and body and behaviors like adding attachments and
sending.

Put another way, object-oriented programming is an approach for modeling concrete, real-world
things, like cars, as well as relations between things, like companies and employees, students and
teachers, and so on. OOP models real-world entities as software objects that have some data
associated with them and can perform certain functions.

Another common programming paradigm is procedural programming, which structures a
program like a recipe in that it provides a set of steps, in the form of functions and code blocks,
that flow sequentially in order to complete a task.

The key takeaway is that objects are at the center of object-oriented programming in Python.
They not only represent the data, as in procedural programming, but also shape the overall
structure of the program.

Define a Class in Python


Primitive data structures—like numbers, strings, and lists—are designed to represent simple
pieces of information, such as the cost of an apple, the name of a poem, or your favorite colors,
respectively. What if you want to represent something more complex?

For example, let’s say you want to track employees in an organization. You need to store some
basic information about each employee, such as their name, age, position, and the year they
started working.

One way to do this is to represent each employee as a list:

kirk = ["James Kirk", 34, "Captain", 2265]
spock = ["Spock", 35, "Science Officer", 2254]
mccoy = ["Leonard McCoy", "Chief Medical Officer", 2266]

There are a number of issues with this approach.

First, it can make larger code files more difficult to manage. If you reference kirk[0] several
lines away from where the kirk list is declared, will you remember that the element with index 0
is the employee’s name?

Second, it can introduce errors if not every employee has the same number of elements in the
list. In the mccoy list above, the age is missing, so mccoy[1] will return "Chief Medical
Officer" instead of Dr. McCoy’s age.

A great way to make this type of code more manageable and more maintainable is to use classes.

Classes vs Instances

Classes are used to create user-defined data structures. Classes define functions called methods,
which identify the behaviors and actions that an object created from the class can perform with
its data.

In this tutorial, you’ll create a Dog class that stores some information about the characteristics
and behaviors that an individual dog can have.

A class is a blueprint for how something should be defined. It doesn’t actually contain any data.
The Dog class specifies that a name and an age are necessary for defining a dog, but it doesn’t
contain the name or age of any specific dog.

While the class is the blueprint, an instance is an object that is built from a class and contains
real data. An instance of the Dog class is not a blueprint anymore. It’s an actual dog with a name,
like Miles, who’s four years old.

Put another way, a class is like a form or questionnaire. An instance is like a form that has been
filled out with information. Just like many people can fill out the same form with their own
unique information, many instances can be created from a single class.

How to Define a Class

All class definitions start with the class keyword, which is followed by the name of the class
and a colon. Any code that is indented below the class definition is considered part of the class’s
body.

Here’s an example of a Dog class:

class Dog:
    pass

The body of the Dog class consists of a single statement: the pass keyword. pass is often used as
a placeholder indicating where code will eventually go. It allows you to run this code without
Python throwing an error.

Note: Python class names are written in CapitalizedWords notation by convention. For example,
a class for a specific breed of dog like the Jack Russell Terrier would be written as
JackRussellTerrier.

The Dog class isn’t very interesting right now, so let’s spruce it up a bit by defining some
properties that all Dog objects should have. There are a number of properties that we can choose
from, including name, age, coat color, and breed. To keep things simple, we’ll just use name and
age.

The properties that all Dog objects must have are defined in a method called .__init__(). Every
time a new Dog object is created, .__init__() sets the initial state of the object by assigning the
values of the object’s properties. That is, .__init__() initializes each new instance of the class.

You can give .__init__() any number of parameters, but the first parameter always receives the
instance itself and is named self by convention. When a new class instance is created, the
instance is automatically passed to the self parameter in .__init__() so that new attributes can
be defined on the object.

Let’s update the Dog class with an .__init__() method that creates .name and .age attributes:

class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

Notice that the .__init__() method’s signature is indented four spaces. The body of the
method is indented by eight spaces. This indentation is vitally important. It tells Python that the
.__init__() method belongs to the Dog class.

In the body of .__init__(), there are two statements using the self variable:

1. self.name = name creates an attribute called name and assigns to it the value of the
name parameter.
2. self.age = age creates an attribute called age and assigns to it the value of the age
parameter.

Attributes created in .__init__() are called instance attributes. An instance attribute’s value
is specific to a particular instance of the class. All Dog objects have a name and an age, but the
values for the name and age attributes will vary depending on the Dog instance.

On the other hand, class attributes are attributes that have the same value for all class instances.
You can define a class attribute by assigning a value to a variable name outside of .__init__().

For example, the following Dog class has a class attribute called species with the value "Canis
familiaris":

class Dog:
    # Class attribute
    species = "Canis familiaris"

    def __init__(self, name, age):
        self.name = name
        self.age = age

Class attributes are defined directly beneath the first line of the class definition and are
indented by four spaces. They must always be assigned an initial value. A class attribute is
created once, when the class itself is defined, and every instance of the class can read it.

Use class attributes to define properties that should have the same value for every class instance.
Use instance attributes for properties that vary from one instance to another.
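
The difference between the two kinds of attributes can be sketched as follows. This is a minimal illustration that reuses the Dog class from above. The second species string and the choice to rebind Dog.species are assumptions made only for the demonstration.

```python
class Dog:
    # Class attribute: stored once on the class, shared by every instance
    species = "Canis familiaris"

    def __init__(self, name, age):
        # Instance attributes: stored separately on each object
        self.name = name
        self.age = age


buddy = Dog("Buddy", 9)
miles = Dog("Miles", 4)

# Instance attributes differ from one dog to the next
print(buddy.name, miles.name)

# Both instances read the very same class-level value
print(buddy.species is Dog.species)

# Rebinding the attribute on the class is visible through every instance
Dog.species = "Canis lupus familiaris"
print(buddy.species)
print(miles.species)
```

Neither buddy nor miles stores its own copy of species. Each lookup falls through to the class, which is why one assignment on Dog changes what both instances report.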

Now that we have a Dog class, let’s create some dogs!

Instantiate an Object in Python


Open IDLE’s interactive window and type the following:

>>> class Dog:
...     pass

This creates a new Dog class with no attributes or methods.

Creating a new object from a class is called instantiating an object. You can instantiate a new
Dog object by typing the name of the class, followed by opening and closing parentheses:

>>> Dog()
<__main__.Dog object at 0x106702d30>

You now have a new Dog object at 0x106702d30. This funny-looking string of letters and
numbers is a memory address that indicates where the Dog object is stored in your computer’s
memory. Note that the address you see on your screen will be different.

Now instantiate a second Dog object:

>>> Dog()
<__main__.Dog object at 0x0004ccc90>

The new Dog instance is located at a different memory address. That’s because it’s an entirely
new instance and is distinct from the first Dog object that you instantiated.

To see this another way, type the following:

>>> a = Dog()
>>> b = Dog()
>>> a == b
False

In this code, you create two new Dog objects and assign them to the variables a and b. When you
compare a and b using the == operator, the result is False. Even though a and b are both
instances of the Dog class, they represent two distinct objects in memory.
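
The comparison above returns False because a class that doesn’t define its own equality check inherits the default one, which compares object identity. A minimal sketch of that behavior, using the same empty Dog class:

```python
class Dog:
    pass


a = Dog()
b = Dog()

# Without a custom equality method, == falls back to identity
print(a == b)   # two separate objects, so False
print(a is b)   # the identity check gives the same answer
print(a == a)   # an object is always equal to itself

# id() exposes the identity that the default comparison relies on
print(id(a) != id(b))
```

So in this example == and is agree. They only start to differ once a class customizes how equality is computed.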

Class and Instance Attributes

Now create a new Dog class with a class attribute called .species and two instance attributes
called .name and .age:

>>> class Dog:
...     species = "Canis familiaris"
...     def __init__(self, name, age):
...         self.name = name
...         self.age = age

To instantiate objects of this Dog class, you need to provide values for the name and age. If you
don’t, then Python raises a TypeError:

>>> Dog()
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    Dog()
TypeError: __init__() missing 2 required positional arguments: 'name' and 'age'

To pass arguments to the name and age parameters, put values into the parentheses after the class
name:

>>> buddy = Dog("Buddy", 9)
>>> miles = Dog("Miles", 4)

This creates two new Dog instances—one for a nine-year-old dog named Buddy and one for a
four-year-old dog named Miles.

The Dog class’s .__init__() method has three parameters, so why are only two arguments
passed to it in the example?

When you instantiate a Dog object, Python creates a new instance and passes it to the first
parameter of .__init__(). This essentially removes the self parameter, so you only need to
worry about the name and age parameters.

After you create the Dog instances, you can access their instance attributes using dot notation:

>>> buddy.name
'Buddy'
>>> buddy.age
9

>>> miles.name
'Miles'
>>> miles.age
4

You can access class attributes the same way:

>>> buddy.species
'Canis familiaris'

One of the biggest advantages of using classes to organize data is that instances are guaranteed to
have the attributes you expect. All Dog instances have .species, .name, and .age attributes, so
you can use those attributes with confidence knowing that they will always return a value.

Although the attributes are guaranteed to exist, their values can be changed dynamically:

>>> buddy.age = 10
>>> buddy.age
10

>>> miles.species = "Felis silvestris"
>>> miles.species
'Felis silvestris'

In this example, you change the .age attribute of the buddy object to 10. Then you change the
.species attribute of the miles object to "Felis silvestris", which is a species of cat. That
makes Miles a pretty strange dog, but it is valid Python!

The key takeaway here is that custom objects are mutable by default. An object is mutable if it
can be altered dynamically. For example, lists and dictionaries are mutable, but strings and tuples
are immutable.
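
It may help to see what the assignment to miles.species actually does. The sketch below reuses the Dog class from this section and relies on the built-in vars() to inspect an instance. It suggests that the assignment creates a new instance attribute on miles that shadows the class attribute, rather than changing the class itself.

```python
class Dog:
    species = "Canis familiaris"

    def __init__(self, name, age):
        self.name = name
        self.age = age


buddy = Dog("Buddy", 9)
miles = Dog("Miles", 4)

# Assigning through the instance stores a new attribute on miles only
miles.species = "Felis silvestris"
print(vars(miles))    # the instance now carries its own species entry

# The class attribute, and every other instance, is unaffected
print(Dog.species)
print(buddy.species)

# Deleting the instance attribute reveals the class attribute again
del miles.species
print(miles.species)
```

This is why Miles can become a cat without turning Buddy into one.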

Instance Methods

Instance methods are functions that are defined inside a class and are normally called on an
instance of that class. Just like .__init__(), an instance method’s first parameter is always
self.
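
To make the role of self more concrete, here is a small sketch. The variable names via_instance and via_class are illustrative only. It compares the usual method call with the longhand form, in which the method is looked up on the class and the instance is passed explicitly.

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def description(self):
        return f"{self.name} is {self.age} years old"


miles = Dog("Miles", 4)

# Usual form: Python passes miles as self automatically
via_instance = miles.description()

# Longhand form: the same function, with the instance passed explicitly
via_class = Dog.description(miles)

print(via_instance)
print(via_class == via_instance)
```

You would rarely write the longhand form in practice. It is shown here only to illustrate where the value of self comes from.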

Open a new editor window in IDLE and type in the following Dog class:

class Dog:
    species = "Canis familiaris"

    def __init__(self, name, age):
        self.name = name
        self.age = age

    # Instance method
    def description(self):
        return f"{self.name} is {self.age} years old"

    # Another instance method
    def speak(self, sound):
        return f"{self.name} says {sound}"

This Dog class has two instance methods:

1. .description() returns a string displaying the name and age of the dog.
2. .speak() has one parameter called sound and returns a string containing the dog’s name
and the sound the dog makes.

Save the modified Dog class to a file called dog.py and press F5 to run the program. Then open
the interactive window and type the following to see your instance methods in action:

>>> miles = Dog("Miles", 4)

>>> miles.description()
'Miles is 4 years old'

>>> miles.speak("Woof Woof")
'Miles says Woof Woof'

>>> miles.speak("Bow Wow")
'Miles says Bow Wow'

In the above Dog class, .description() returns a string containing information about the Dog
instance miles. When writing your own classes, it’s a good idea to have a method that returns a
string containing useful information about an instance of the class. However, .description()
isn’t the most Pythonic way of doing this.

When you create a list object, you can use print() to display a string that looks like the list:

>>> names = ["Fletcher", "David", "Dan"]
>>> print(names)
['Fletcher', 'David', 'Dan']

Let’s see what happens when you print() the miles object:

>>> print(miles)
<__main__.Dog object at 0x00aeff70>

When you print(miles), you get a cryptic looking message telling you that miles is a Dog
object at the memory address 0x00aeff70. This message isn’t very helpful. You can change
what gets printed by defining a special instance method called .__str__().

In the editor window, change the name of the Dog class’s .description() method to
.__str__():

class Dog:
    # Leave other parts of Dog class as-is

    # Replace .description() with __str__()
    def __str__(self):
        return f"{self.name} is {self.age} years old"

Save the file and press F5. Now, when you print(miles), you get a much friendlier output:

>>> miles = Dog("Miles", 4)
>>> print(miles)
Miles is 4 years old

Methods like .__init__() and .__str__() are called dunder methods because they begin and
end with double underscores. There are many dunder methods that you can use to customize
classes in Python. Although too advanced a topic for a beginning Python book, understanding
dunder methods is an important part of mastering object-oriented programming in Python.
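
As a small illustration of how one dunder method plugs into several parts of the language, the sketch below defines .__str__() on a trimmed-down Dog class and shows that print(), str(), and f-strings all pick it up. The "Report:" label is just an example string.

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return f"{self.name} is {self.age} years old"


miles = Dog("Miles", 4)

print(miles)                  # print() converts the object with str()
text = str(miles)             # calling str() directly uses the same method
report = f"Report: {miles}"   # f-strings format the object the same way

print(text)
print(report)
```

You never call .__str__() yourself here. Python invokes it whenever it needs a readable string for the object.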

In the next section, you’ll see how to take your knowledge one step further and create classes
from other classes.

Inherit From Other Classes in Python


Inheritance is the process by which one class takes on the attributes and methods of another.
Newly formed classes are called child classes, and the classes that child classes are derived from
are called parent classes.

Child classes can override or extend the attributes and methods of parent classes. In other words,
child classes inherit all of the parent’s attributes and methods but can also specify attributes and
methods that are unique to themselves.

Although the analogy isn’t perfect, you can think of object inheritance sort of like genetic
inheritance.

You may have inherited your hair color from your mother. It’s an attribute you were born with.
Let’s say you decide to color your hair purple. Assuming your mother doesn’t have purple hair,
you’ve just overridden the hair color attribute that you inherited from your mom.

You also inherit, in a sense, your language from your parents. If your parents speak English, then
you’ll also speak English. Now imagine you decide to learn a second language, like German. In
this case you’ve extended your attributes because you’ve added an attribute that your parents
don’t have.

Dog Park Example

Pretend for a moment that you’re at a dog park. There are many dogs of different breeds at the
park, all engaging in various dog behaviors.

Suppose now that you want to model the dog park with Python classes. The Dog class that you
wrote in the previous section can distinguish dogs by name and age but not by breed.

You could modify the Dog class in the editor window by adding a .breed attribute:

class Dog:
    species = "Canis familiaris"

    def __init__(self, name, age, breed):
        self.name = name
        self.age = age
        self.breed = breed

The instance methods defined earlier are omitted here because they aren’t important for this
discussion.

Press F5 to save the file. Now you can model the dog park by instantiating a bunch of different
dogs in the interactive window:

>>> miles = Dog("Miles", 4, "Jack Russell Terrier")
>>> buddy = Dog("Buddy", 9, "Dachshund")
>>> jack = Dog("Jack", 3, "Bulldog")
>>> jim = Dog("Jim", 5, "Bulldog")

Each breed of dog has slightly different behaviors. For example, bulldogs have a low bark that
sounds like woof, but dachshunds have a higher-pitched bark that sounds more like yap.

Using just the Dog class, you must supply a string for the sound argument of .speak() every
time you call it on a Dog instance:

>>> buddy.speak("Yap")
'Buddy says Yap'

>>> jim.speak("Woof")
'Jim says Woof'

>>> jack.speak("Woof")
'Jack says Woof'

Passing a string to every call to .speak() is repetitive and inconvenient. Moreover, the string
representing the sound that each Dog instance makes should be determined by its .breed
attribute, but here you have to manually pass the correct string to .speak() every time it’s
called.

You can simplify the experience of working with the Dog class by creating a child class for each
breed of dog. This allows you to extend the functionality that each child class inherits, including
specifying a default argument for .speak().

Parent Classes vs Child Classes

Let’s create a child class for each of the three breeds mentioned above: Jack Russell Terrier,
Dachshund, and Bulldog.

For reference, here’s the full definition of the Dog class:

class Dog:
    species = "Canis familiaris"

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return f"{self.name} is {self.age} years old"

    def speak(self, sound):
        return f"{self.name} says {sound}"

To create a child class, you create a new class with its own name and then put the name
of the parent class in parentheses. Add the following to the dog.py file to create three new child
classes of the Dog class:

class JackRussellTerrier(Dog):
    pass

class Dachshund(Dog):
    pass

class Bulldog(Dog):
    pass

Press F5 to save and run the file. With the child classes defined, you can now instantiate some
dogs of specific breeds in the interactive window:

>>> miles = JackRussellTerrier("Miles", 4)
>>> buddy = Dachshund("Buddy", 9)
>>> jack = Bulldog("Jack", 3)
>>> jim = Bulldog("Jim", 5)

Instances of child classes inherit all of the attributes and methods of the parent class:

>>> miles.species
'Canis familiaris'

>>> buddy.name
'Buddy'

>>> print(jack)
Jack is 3 years old

>>> jim.speak("Woof")
'Jim says Woof'

To determine which class a given object belongs to, you can use the built-in type():

>>> type(miles)
<class '__main__.JackRussellTerrier'>

What if you want to determine if miles is also an instance of the Dog class? You can do this with
the built-in isinstance():

>>> isinstance(miles, Dog)
True

Notice that isinstance() takes two arguments, an object and a class. In the example above,
isinstance() checks if miles is an instance of the Dog class and returns True.

The miles, buddy, jack, and jim objects are all Dog instances, but miles is not a Bulldog
instance, and jack is not a Dachshund instance:

>>> isinstance(miles, Bulldog)
False

>>> isinstance(jack, Dachshund)
False

More generally, all objects created from a child class are instances of the parent class, although
they may not be instances of other child classes.
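
The relationship between type() and isinstance() can be sketched as below. The classes are stripped down to empty bodies for brevity, which is a simplification of the dog.py file. The built-in issubclass() is not used elsewhere in this tutorial and is included only as a related check on classes rather than objects.

```python
class Dog:
    pass


class JackRussellTerrier(Dog):
    pass


class Bulldog(Dog):
    pass


miles = JackRussellTerrier()

# type() reports only the exact class the object was created from
print(type(miles) is JackRussellTerrier)
print(type(miles) is Dog)

# isinstance() also accepts any parent class
print(isinstance(miles, Dog))
print(isinstance(miles, Bulldog))

# issubclass() asks the same question about two classes
print(issubclass(JackRussellTerrier, Dog))
print(issubclass(JackRussellTerrier, Bulldog))
```

In most code that needs such a check, isinstance() is the more forgiving choice because it keeps working when you later introduce child classes.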

Now that you’ve created child classes for some different breeds of dogs, let’s give each breed its
own sound.

Extend the Functionality of a Parent Class

Since different breeds of dogs have slightly different barks, you want to provide a default value
for the sound argument of their respective .speak() methods. To do this, you need to override
.speak() in the class definition for each breed.

To override a method defined on the parent class, you define a method with the same name on
the child class. Here’s what that looks like for the JackRussellTerrier class:

class JackRussellTerrier(Dog):
    def speak(self, sound="Arf"):
        return f"{self.name} says {sound}"

Now .speak() is defined on the JackRussellTerrier class with the default argument for
sound set to "Arf".

Update dog.py with the new JackRussellTerrier class and press F5 to save and run the file.
You can now call .speak() on a JackRussellTerrier instance without passing an argument to
sound:

>>> miles = JackRussellTerrier("Miles", 4)
>>> miles.speak()
'Miles says Arf'

Sometimes dogs make different barks, so if Miles gets angry and growls, you can still call
.speak() with a different sound:

>>> miles.speak("Grrr")
'Miles says Grrr'

One thing to keep in mind about class inheritance is that changes to the parent class
automatically propagate to child classes. This occurs as long as the attribute or method being
changed isn’t overridden in the child class.

For example, in the editor window, change the string returned by .speak() in the Dog class:

class Dog:
    # Leave other attributes and methods as they are

    # Change the string returned by .speak()
    def speak(self, sound):
        return f"{self.name} barks: {sound}"

Save the file and press F5. Now, when you create a new Bulldog instance named jim,
jim.speak() returns the new string:

>>> jim = Bulldog("Jim", 5)
>>> jim.speak("Woof")
'Jim barks: Woof'

However, calling .speak() on a JackRussellTerrier instance won’t show the new style of
output:

>>> miles = JackRussellTerrier("Miles", 4)
>>> miles.speak()
'Miles says Arf'

Sometimes it makes sense to completely override a method from a parent class. But in this
instance, we don’t want the JackRussellTerrier class to lose any changes that might be made
to the formatting of the output string of Dog.speak().

To do this, you still need to define a .speak() method on the child JackRussellTerrier class.
But instead of explicitly defining the output string, you need to call the Dog class’s .speak()
inside of the child class’s .speak() using the same arguments that you passed to
JackRussellTerrier.speak().

You can access the parent class from inside a method of a child class by using super():

class JackRussellTerrier(Dog):
    def speak(self, sound="Arf"):
        return super().speak(sound)

When you call super().speak(sound) inside JackRussellTerrier, Python searches the
parent class, Dog, for a .speak() method and calls it with the variable sound.

Update dog.py with the new JackRussellTerrier class. Save the file and press F5 so you can
test it in the interactive window:

>>> miles = JackRussellTerrier("Miles", 4)
>>> miles.speak()
'Miles barks: Arf'

Now when you call miles.speak(), you’ll see output reflecting the new formatting in the Dog
class.

Note: In the above examples, the class hierarchy is very straightforward. The
JackRussellTerrier class has a single parent class, Dog. In real-world examples, the class
hierarchy can get quite complicated.

super() does much more than just search the parent class for a method or an attribute. It
traverses the entire class hierarchy for a matching method or attribute. If you aren’t careful,
super() can have surprising results.
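
One way such a surprise can arise is with multiple inheritance. The sketch below is purely illustrative: the Loud, Repeating, and LoudRepeating classes are hypothetical and not part of dog.py, and Dog is reduced to a single method. It uses the __mro__ attribute to show the order in which Python searches the classes.

```python
class Dog:
    def speak(self, sound):
        return f"barks: {sound}"


class Loud(Dog):
    def speak(self, sound):
        # Looks like a call to Dog.speak(), but see the search order below
        return super().speak(sound.upper())


class Repeating(Dog):
    def speak(self, sound):
        return super().speak(f"{sound} {sound}")


class LoudRepeating(Loud, Repeating):
    pass


# The order in which classes are searched for attributes and methods
order = [cls.__name__ for cls in LoudRepeating.__mro__]
print(order)

# Inside Loud.speak(), super() continues with Repeating, not with Dog
result = LoudRepeating().speak("woof")
print(result)
```

For a LoudRepeating instance, the super() call in Loud hands control to Repeating, even though Loud itself only names Dog as its parent. That is the kind of behavior that can catch you off guard in larger hierarchies.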

Conclusion
In this tutorial, you learned about object-oriented programming (OOP) in Python. Many widely used
programming languages, such as Java, C#, and C++, support object-oriented programming, so the
knowledge you gained here will be applicable no matter where your programming career takes you.

In this tutorial, you learned how to:

• Define a class, which is a sort of blueprint for an object
• Instantiate an object from a class
• Use attributes and methods to define the properties and behaviors of an object
• Use inheritance to create child classes from a parent class
• Reference a method on a parent class using super()
• Check if an object inherits from another class using isinstance()

If you enjoyed what you learned in this sample from Python Basics: A Practical Introduction to
Python 3, then be sure to check out the rest of the book.

Operator and Function Overloading in
Custom Python Classes
by Malay Agarwal, May 08, 2018

Table of Contents

• The Python Data Model
• The Internals of Operations Like len() and []
• Overloading Built-in Functions
    o Giving a Length to Your Objects Using len()
    o Making Your Objects Work With abs()
    o Printing Your Objects Prettily Using str()
    o Representing Your Objects Using repr()
    o Making Your Objects Truthy or Falsey Using bool()
• Overloading Built-in Operators
    o Making Your Objects Capable of Being Added Using +
    o Shortcuts: the += Operator
    o Indexing and Slicing Your Objects Using []
    o Reverse Operators: Making Your Classes Mathematically Correct
• A Complete Example
• Recap and Resources

If you’ve used the + or * operator on a str object in Python, you must have noticed its different
behavior when compared to int or float objects:

>>> # Adds the two numbers
>>> 1 + 2
3

>>> # Concatenates the two strings
>>> 'Real' + 'Python'
'RealPython'

>>> # Gives the product
>>> 3 * 2
6

>>> # Repeats the string
>>> 'Python' * 3
'PythonPythonPython'

You might have wondered how the same built-in operator or function shows different behavior
for objects of different classes. This is called operator overloading or function overloading,
respectively. This article will help you understand this mechanism, so that you can do the same
in your own Python classes and make your objects more Pythonic.

You’ll learn the following:

• The API that handles operators and built-ins in Python
• The “secret” behind len() and other built-ins
• How to make your classes capable of using operators
• How to make your classes compatible with Python’s built-in functions


As a bonus, you’ll also see an example class, objects of which will be compatible with many of
these operators and functions. Let’s get started!

The Python Data Model


Say you have a class representing an online order having a cart (a list) and a customer (a str or
instance of another class which represents a customer).

Note: If you need a refresher on OOP in Python, check out this tutorial on Real Python: Object-
Oriented Programming (OOP) in Python 3

In such a case, it is quite natural to want to obtain the length of the cart list. Someone new to
Python might decide to implement a method called get_cart_len() in their class to do this. But
you can configure the built-in len() in such a way that it returns the length of the cart list when
given our object.

In another case, we might want to append something to the cart. Again, someone new to Python
would think of implementing a method called append_to_cart() that takes an item and
appends it to the cart list. But you can configure the + operator in such a way that it appends a
new item to the cart.

Python does all this using special methods. These special methods have a naming convention,
where the name starts with two underscores, followed by an identifier and ends with another pair
of underscores.

Essentially, each built-in function or operator has a special method corresponding to it. For
example, there’s __len__(), corresponding to len(), and __add__(), corresponding to the +
operator.

By default, most of the built-ins and operators will not work with objects of your classes. You
must add the corresponding special methods in your class definition to make your object
compatible with built-ins and operators.

When you do this, the behavior of the function or operator associated with it changes according
to that defined in the method.
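
The online-order scenario described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive design: the class name Order, its attribute names, and the choice to have + return a new order instead of modifying the existing one are all hypothetical.

```python
class Order:
    def __init__(self, cart, customer):
        self.cart = list(cart)    # copy, so the caller's list is left alone
        self.customer = customer

    def __len__(self):
        # Called by the built-in len()
        return len(self.cart)

    def __add__(self, item):
        # Called for order + item; returns a new Order with the item added
        return Order(self.cart + [item], self.customer)


order = Order(["banana", "apple"], "Real Python")
print(len(order))

bigger = order + "mango"
print(len(bigger))
print(bigger.cart)

# The original order is unchanged because + built a new object
print(len(order))
```

With these two methods in place, there is no need for helper methods such as get_cart_len() or append_to_cart(). The built-in len() and the + operator do the same jobs.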

This is exactly what the Data Model (Section 3 of the Python language reference) helps you
accomplish. It lists all the special methods available and provides you with the means of
overloading built-in functions and operators so that you can use them on your own objects.

Let’s see what this means.

Fun fact: Due to the naming convention used for these methods, they are also called dunder
methods which is a shorthand for double underscore methods. Sometimes they’re also referred to
as special methods or magic methods. We prefer dunder methods though!

The Internals of Operations Like len() and []


Every class in Python defines its own behavior for built-in functions and methods. When you
pass an instance of some class to a built-in function or use an operator on the instance, it is
actually equivalent to calling a special method with relevant arguments.

If there is a built-in function, func(), and the corresponding special method for the function is
__func__(), Python interprets a call to the function as obj.__func__(), where obj is the
object. In the case of operators, if you have an operator opr and the corresponding special
method for it is __opr__(), Python interprets something like obj1 <opr> obj2 as
obj1.__opr__(obj2).
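One refinement is worth knowing: for these implicit calls, Python looks the special method up on the object's type rather than on the instance itself, so len(obj) is closer to type(obj).__len__(obj) than to obj.__len__(). The sketch below illustrates this with a hypothetical Basket class, which is an assumption made for this example and not part of the tutorial's code:

```python
class Basket:
    """Hypothetical container used only for this illustration."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


basket = Basket(['banana', 'apple'])

# The built-in and the explicit type-level call agree.
via_builtin = len(basket)
via_type = type(basket).__len__(basket)

# Attaching a __len__ attribute to the instance does not change len(),
# because the implicit lookup goes through the type, not the instance.
basket.__len__ = lambda: 99
after_patch = len(basket)

print(via_builtin, via_type, after_patch)
```

In practice this means special methods belong in the class body; patching them onto a single instance has no effect on the built-in or operator.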

So, when you’re calling len() on an object, Python handles the call as obj.__len__(). When
you use the [] operator on an iterable to obtain the value at an index, Python handles it as
itr.__getitem__(index), where itr is the iterable object and index is the index you want to
obtain.

Therefore, when you define these special methods in your own class, you override the behavior
of the function or operator associated with them because, behind the scenes, Python is calling
your method. Let’s get a better understanding of this:

>>> a = 'Real Python'
>>> b = ['Real', 'Python']
>>> len(a)
11
>>> a.__len__()
11
>>> b[0]
'Real'
>>> b.__getitem__(0)
'Real'

As you can see, when you use the function or its corresponding special method, you get the same
result. In fact, when you obtain the list of attributes and methods of a str object using dir(),
you’ll see these special methods in the list in addition to the usual methods available on str
objects:

>>> dir(a)
['__add__',
'__class__',
'__contains__',
'__delattr__',
'__dir__',
...,
'__iter__',
'__le__',
'__len__',
'__lt__',
...,
'swapcase',
'title',
'translate',
'upper',
'zfill']

If the behavior of a built-in function or operator is not defined in the class by the special method,
then you will get a TypeError.

So, how can you use special methods in your classes?

Overloading Built-in Functions


Many of the special methods defined in the Data Model can be used to change the behavior of
functions such as len, abs, hash, divmod, and so on. To do this, you only need to define the
corresponding special method in your class. Let’s look at a few examples:

Giving a Length to Your Objects Using len()

To change the behavior of len(), you need to define the __len__() special method in your
class. Whenever you pass an object of your class to len(), your custom definition of __len__()
will be used to obtain the result. Let’s implement len() for the order class we talked about in the
beginning:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __len__(self):
...         return len(self.cart)
...
>>> order = Order(['banana', 'apple', 'mango'], 'Real Python')
>>> len(order)
3

As you can see, you can now use len() to directly obtain the length of the cart. Moreover, it
makes more intuitive sense to say “length of order” rather than calling something like
order.get_cart_len(). Your call is both Pythonic and more intuitive. When you don’t have
the __len__() method defined but still call len() on your object, you get a TypeError:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
>>> order = Order(['banana', 'apple', 'mango'], 'Real Python')
>>> len(order)  # Calling len when no __len__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'Order' has no len()

But, when overloading len(), you should keep in mind that Python requires the function to
return an integer. If your method were to return anything other than an integer, you would get a
TypeError. This, most probably, is to keep it consistent with the fact that len() is generally
used to obtain the length of a sequence, which can only be an integer:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __len__(self):
...         return float(len(self.cart))  # Return type changed to float
...
>>> order = Order(['banana', 'apple', 'mango'], 'Real Python')
>>> len(order)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'float' object cannot be interpreted as an integer
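The integer you return must also be non-negative. As a minimal sketch, using a hypothetical BrokenOrder class invented for this illustration, a negative return value leads to a ValueError rather than a TypeError:

```python
class BrokenOrder:
    """Hypothetical class whose __len__() returns a negative number."""

    def __len__(self):
        return -1


try:
    len(BrokenOrder())
except ValueError as exc:
    # len() rejects negative lengths even though -1 is an integer.
    error_message = str(exc)
    print(f"ValueError: {error_message}")
```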

Making Your Objects Work With abs()

You can dictate the behavior of the abs() built-in for instances of your class by defining the
__abs__() special method in the class. There are no restrictions on the return value of abs(),
and you get a TypeError when the special method is absent in your class definition.

In a class representing a vector in a two-dimensional space, abs() can be used to get the length
of the vector. Let’s see it in action:

>>> class Vector:
...     def __init__(self, x_comp, y_comp):
...         self.x_comp = x_comp
...         self.y_comp = y_comp
...
...     def __abs__(self):
...         return (self.x_comp ** 2 + self.y_comp ** 2) ** 0.5
...
>>> vector = Vector(3, 4)
>>> abs(vector)
5.0

It makes more intuitive sense to say “absolute value of vector” rather than calling something like
vector.get_mag().

Printing Your Objects Prettily Using str()

The str() built-in is used to cast an instance of a class to a str object, or more appropriately, to
obtain a user-friendly string representation of the object which can be read by a normal user
rather than the programmer. You can define the string format your object should be displayed in
when passed to str() by defining the __str__() method in your class. Moreover, __str__() is
the method that is used by Python when you call print() on your object.

Let’s implement this in the Vector class to format Vector objects as xi+yj. A negative y-
component will be handled using the format mini-language:

>>> class Vector:
...     def __init__(self, x_comp, y_comp):
...         self.x_comp = x_comp
...         self.y_comp = y_comp
...
...     def __str__(self):
...         # By default, sign of +ve number is not displayed
...         # Using `+`, sign is always displayed
...         return f'{self.x_comp}i{self.y_comp:+}j'
...
>>> vector = Vector(3, 4)
>>> str(vector)
'3i+4j'
>>> print(vector)
3i+4j

It is necessary that __str__() returns a str object, and we get a TypeError if the return type is
non-string.
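To make that restriction concrete, here is a small sketch with a hypothetical BadVector class (an assumption for this example only) whose __str__() returns an integer by mistake:

```python
class BadVector:
    """Hypothetical class whose __str__() returns an int by mistake."""

    def __str__(self):
        return 42


try:
    str(BadVector())
except TypeError as exc:
    # Python checks the return type of __str__() and refuses non-strings.
    message = str(exc)
    print(f"TypeError: {message}")
```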

Representing Your Objects Using repr()

The repr() built-in is used to obtain the parsable string representation of an object. If an object
is parsable, that means that Python should be able to recreate the object from the representation
when repr is used in conjunction with functions like eval(). To define the behavior of repr(),
you can use the __repr__() special method.

This is also the method Python uses to display the object in a REPL session. If the __repr__()
method is not defined, you will get something like <__main__.Vector object at 0x...> when
trying to look at the object in the REPL session. Let’s see it in action in the Vector class:

>>> class Vector:
...     def __init__(self, x_comp, y_comp):
...         self.x_comp = x_comp
...         self.y_comp = y_comp
...
...     def __repr__(self):
...         return f'Vector({self.x_comp}, {self.y_comp})'
...
>>> vector = Vector(3, 4)
>>> repr(vector)
'Vector(3, 4)'

>>> b = eval(repr(vector))
>>> type(b), b.x_comp, b.y_comp
(<class '__main__.Vector'>, 3, 4)

>>> vector  # Looking at object; __repr__ used
Vector(3, 4)

Note: In cases where the __str__() method is not defined, Python uses the __repr__() method
to print the object, as well as to represent the object when str() is called on it. If both the
methods are missing, it defaults to <__main__.Vector ...>. But __repr__() is the only
method that is used to display the object in an interactive session. Absence of it in the class
yields <__main__.Vector ...>.
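The fallback rules in this note can be checked with a short sketch. The three classes below are hypothetical and exist only to show which method str() and repr() end up using:

```python
class OnlyRepr:
    def __repr__(self):
        return 'OnlyRepr()'


class Both:
    def __repr__(self):
        return 'Both()'

    def __str__(self):
        return 'a friendly Both'


class Neither:
    pass


print(str(OnlyRepr()))             # No __str__(), so __repr__() is used
print(str(Both()), repr(Both()))   # Each built-in uses its own method
print(str(Neither()))              # Default representation from object
```

Note that the fallback only goes one way: repr() never falls back to __str__().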

Also, while this distinction between __str__() and __repr__() is the recommended behavior,
many of the popular libraries ignore this distinction and use the two methods interchangeably.

Here’s a recommended article on __repr__() and __str__() by our very own Dan Bader:
Python String Conversion 101: Why Every Class Needs a “repr”.

Making Your Objects Truthy or Falsey Using bool()

The bool() built-in can be used to obtain the truth value of an object. To define its behavior, you
can use the __bool__() (__nonzero__() in Python 2.x) special method.

The behavior defined here will determine the truth value of an instance in all contexts that
require obtaining a truth value such as in if statements.

As an example, for the Order class that was defined above, an instance can be considered to be
truthy if the length of the cart list is non-zero. This can be used to check whether an order should
be processed or not:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __bool__(self):
...         return len(self.cart) > 0
...
>>> order1 = Order(['banana', 'apple', 'mango'], 'Real Python')
>>> order2 = Order([], 'Python')

>>> bool(order1)
True
>>> bool(order2)
False

>>> for order in [order1, order2]:
...     if order:
...         print(f"{order.customer}'s order is processing...")
...     else:
...         print(f"Empty order for customer {order.customer}")
...
Real Python's order is processing...
Empty order for customer Python

Note: When the __bool__() special method is not implemented in a class, the value returned by
__len__() is used as the truth value, where a non-zero value indicates True and a zero value
indicates False. In case both the methods are not implemented, all instances of the class are
considered to be True.
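Here is a brief sketch of that fallback. SizedOnly and Plain are hypothetical classes made up for this illustration: the first defines only __len__(), the second defines neither method:

```python
class SizedOnly:
    """Defines __len__() but not __bool__()."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain:
    """Defines neither __bool__() nor __len__()."""


empty_is_truthy = bool(SizedOnly([]))          # __len__() returns 0
full_is_truthy = bool(SizedOnly(['banana']))   # __len__() returns 1
plain_is_truthy = bool(Plain())                # Always truthy by default

print(empty_is_truthy, full_is_truthy, plain_is_truthy)
```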

There are many more special methods that overload built-in functions. You can find them in the
documentation. Having discussed some of them, let’s move to operators.

Overloading Built-in Operators


Changing the behavior of operators is just as simple as changing the behavior of functions. You
define their corresponding special methods in your class, and the operators work according to the
behavior defined in these methods.

These are different from the above special methods in the sense that they need to accept another
argument in the definition other than self, generally referred to by the name other. Let’s look
at a few examples.

Making Your Objects Capable of Being Added Using +

The special method corresponding to the + operator is the __add__() method. Adding a custom
definition of __add__() changes the behavior of the operator. It is recommended that __add__()
returns a new instance of the class instead of modifying the calling instance itself. You’ll see this
behavior quite commonly in Python:

>>> a = 'Real'
>>> a + 'Python' # Gives new str instance
'RealPython'
>>> a # Values unchanged
'Real'
>>> a = a + 'Python' # Creates new instance and assigns a to it
>>> a
'RealPython'

You can see above that using the + operator on a str object actually returns a new str instance,
keeping the value of the calling instance (a) unmodified. To change it, we need to explicitly
assign the new instance to a.

Let’s implement the ability to append new items to our cart in the Order class using the operator.
We’ll follow the recommended practice and make the operator return a new Order instance that
has our required changes instead of making the changes directly to our instance:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __add__(self, other):
...         new_cart = self.cart.copy()
...         new_cart.append(other)
...         return Order(new_cart, self.customer)
...
>>> order = Order(['banana', 'apple'], 'Real Python')

>>> (order + 'orange').cart  # New Order instance
['banana', 'apple', 'orange']
>>> order.cart  # Original instance unchanged
['banana', 'apple']

>>> order = order + 'mango'  # Changing the original instance
>>> order.cart
['banana', 'apple', 'mango']

Similarly, you have the __sub__(), __mul__(), and other special methods which define the
behavior of -, *, and so on. These methods should return a new instance of the class as well.

Shortcuts: the += Operator

The += operator stands as a shortcut to the expression obj1 = obj1 + obj2. The special method
corresponding to it is __iadd__(). The __iadd__() method should make changes directly to the
self argument and return the result, which may or may not be self. This behavior is quite
different from __add__() since the latter creates a new object and returns that, as you saw
above.

Roughly, any += use on two objects is equivalent to this:

>>> result = obj1.__iadd__(obj2)
>>> obj1 = result

Here, result is the value returned by __iadd__(). The second assignment is taken care of
automatically by Python, meaning that you do not need to explicitly assign obj1 to the result as
in the case of obj1 = obj1 + obj2.

Let’s make this possible for the Order class so that new items can be appended to the cart using
+=:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __iadd__(self, other):
...         self.cart.append(other)
...         return self
...
>>> order = Order(['banana', 'apple'], 'Real Python')
>>> order += 'mango'
>>> order.cart
['banana', 'apple', 'mango']

As can be seen, any change is made directly to self and it is then returned. What happens when
you return some random value, like a string or an integer?

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __iadd__(self, other):
...         self.cart.append(other)
...         return 'Hey, I am string!'
...
>>> order = Order(['banana', 'apple'], 'Real Python')
>>> order += 'mango'
>>> order
'Hey, I am string!'

Even though the relevant item was appended to the cart, the value of order changed to what was
returned by __iadd__(). Python implicitly handled the assignment for you. This can lead to
surprising behavior if you forget to return something in your implementation:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __iadd__(self, other):
...         self.cart.append(other)
...
>>> order = Order(['banana', 'apple'], 'Real Python')
>>> order += 'mango'
>>> order  # No output
>>> type(order)
<class 'NoneType'>

Since all Python functions (or methods) return None implicitly when they have no explicit return
statement, order is reassigned to None and the REPL session doesn’t show any output when order
is inspected. Looking at the type of order, you see that it is now NoneType. Therefore, always
make sure that you’re returning something in your implementation of __iadd__() and that it is
the result of the operation and not anything else.

Similar to __iadd__(), you have __isub__(), __imul__(), __itruediv__() and other special
methods which define the behavior of -=, *=, /=, and others alike.

Note: When __iadd__() or its friends are missing from your class definition but you still use
their operators on your objects, Python uses __add__() and its friends to get the result of the
operation and assigns that to the calling instance. Generally speaking, it is safe to not implement
__iadd__() and its friends in your classes as long as __add__() and its friends work properly
(return something which is the result of the operation).
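The following sketch shows that fallback in action, using a cut-down version of the Order class that defines __add__() only. The extra name original is introduced here purely to make the rebinding visible:

```python
class Order:
    def __init__(self, cart, customer):
        self.cart = list(cart)
        self.customer = customer

    def __add__(self, other):
        # Return a new Order; the calling instance is left untouched.
        return Order(self.cart + [other], self.customer)


original = Order(['banana', 'apple'], 'Real Python')
order = original

# No __iadd__() is defined, so this behaves like: order = order + 'mango'
order += 'mango'

print(order.cart)          # The new instance has the extra item
print(original.cart)       # The first instance was not modified
print(order is original)   # The name order now points to a new object
```

One consequence of relying on the fallback is that += rebinds the name to a new object, so any other reference to the old object does not see the change.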

The Python documentation has a good explanation of these methods. Also, take a look at this
example which shows the caveats involved with += and the others when working with immutable
types.

Indexing and Slicing Your Objects Using []

The [] operator is called the indexing operator and is used in various contexts in Python such as
getting the value at an index in sequences, getting the value associated with a key in dictionaries,
or obtaining a part of a sequence through slicing. You can change its behavior using the
__getitem__() special method.

Let’s configure our Order class so that we can directly use the object and obtain an item from the
cart:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __getitem__(self, key):
...         return self.cart[key]
...
>>> order = Order(['banana', 'apple'], 'Real Python')
>>> order[0]
'banana'
>>> order[-1]
'apple'

You’ll notice that above, the name of the argument to __getitem__() is not index but key. This
is because the argument can be of mainly three forms: an integer value, in which case it is either
an index or a dictionary key, a string value, in which case it is a dictionary key, and a slice
object, in which case it will slice the sequence used by the class. While there are other
possibilities, these are the ones most commonly encountered.
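To see these forms for yourself, you can use a tiny probe class that simply returns whatever key it receives. KeyProbe is a hypothetical helper written for this illustration:

```python
class KeyProbe:
    """Hypothetical helper that returns whatever key [] passes in."""

    def __getitem__(self, key):
        return key


probe = KeyProbe()

int_key = probe[0]           # A plain integer
str_key = probe['banana']    # A string, as with dictionary keys
slice_key = probe[1:3]       # A slice object with start, stop, and step
tuple_key = probe[1, 2]      # Comma-separated keys arrive as one tuple

print(int_key, str_key, slice_key, tuple_key)
```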

Since our internal data structure is a list, we can use the [] operator to slice the list, as in this
case, the key argument will be a slice object. This is one of the biggest advantages of having a
__getitem__() definition in your class. As long as you’re using data structures that support
slicing (lists, tuples, strings, and so on), you can configure your objects to directly slice the
structure:

>>> order[1:]
['apple']
>>> order[::-1]
['apple', 'banana']

Note: There is a similar __setitem__() special method that is used to define the behavior of
obj[x] = y. This method takes two arguments in addition to self, generally called key and
value, and can be used to change the value at key to value.
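As a minimal sketch, assuming you want assignments to go straight to the underlying cart list, __setitem__() could be added to the Order class like this:

```python
class Order:
    def __init__(self, cart, customer):
        self.cart = list(cart)
        self.customer = customer

    def __getitem__(self, key):
        return self.cart[key]

    def __setitem__(self, key, value):
        # Delegate to the list, so integer indexes and slices both work.
        self.cart[key] = value


order = Order(['banana', 'apple'], 'Real Python')
order[0] = 'mango'   # Calls order.__setitem__(0, 'mango')

print(order.cart)
```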

Reverse Operators: Making Your Classes Mathematically Correct

While defining the __add__(), __sub__(), __mul__(), and similar special methods allows you
to use the operators when your class instance is the left-hand side operand, the operator will not
work if the class instance is the right-hand side operand:

>>> class Mock:
...     def __init__(self, num):
...         self.num = num
...     def __add__(self, other):
...         return Mock(self.num + other)
...
>>> mock = Mock(5)
>>> mock = mock + 6
>>> mock.num
11

>>> mock = 6 + Mock(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'Mock'

If your class represents a mathematical entity like a vector, a coordinate, or a complex number,
applying the operators should work in both the cases since it is a valid mathematical operation.

Moreover, if the operators work only when the instance is the left operand, we are violating the
fundamental principle of commutativity in many cases. Therefore, to help you make your classes
mathematically correct, Python provides you with reverse special methods such as
__radd__(), __rsub__(), __rmul__(), and so on.

These handle calls such as x + obj, x - obj, and x * obj, where x is not an instance of the
concerned class. Just like __add__() and the others, these reverse special methods should return
a new instance of class with the changes of the operation rather than modifying the calling
instance itself.
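It helps to know how Python decides to call a reverse method. For x + obj, Python first tries the left operand's __add__(). If that method is missing or returns the special value NotImplemented, Python then tries the right operand's __radd__(). The sketch below extends the Mock class from above under that assumption; the type check inside __add__() is an addition made for this example:

```python
class Mock:
    def __init__(self, num):
        self.num = num

    def __add__(self, other):
        if isinstance(other, (int, float)):
            return Mock(self.num + other)
        return NotImplemented  # Let Python try the other operand

    def __radd__(self, other):
        # Reached for `other + mock` once int.__add__() has given up.
        return self.__add__(other)


left = Mock(5) + 6
right = 6 + Mock(5)

# int does not know how to add a Mock, so it signals NotImplemented.
int_gives_up = (6).__add__(Mock(5)) is NotImplemented

print(left.num, right.num, int_gives_up)
```

Returning NotImplemented instead of raising an error gives the other operand a chance to handle the operation, and Python raises the TypeError itself if neither side can.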

Let’s configure __radd__() in the Order class in such a way that it will append something at the
front of the cart. This can be used in cases where the cart is organized in terms of the priority of
the orders:

>>> class Order:
...     def __init__(self, cart, customer):
...         self.cart = list(cart)
...         self.customer = customer
...
...     def __add__(self, other):
...         new_cart = self.cart.copy()
...         new_cart.append(other)
...         return Order(new_cart, self.customer)
...
...     def __radd__(self, other):
...         new_cart = self.cart.copy()
...         new_cart.insert(0, other)
...         return Order(new_cart, self.customer)
...
>>> order = Order(['banana', 'apple'], 'Real Python')

>>> order = order + 'orange'
>>> order.cart
['banana', 'apple', 'orange']

>>> order = 'mango' + order
>>> order.cart
['mango', 'banana', 'apple', 'orange']

A Complete Example
To drive all these points home, it’s better to look at an example class which implements these
operators together.

Let’s reinvent the wheel and implement our own class to represent complex numbers,
CustomComplex. Objects of our class will support a variety of built-in functions and operators,
making them behave very similar to the built-in complex numbers class:

from math import hypot, atan, sin, cos

class CustomComplex:
    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

The constructor handles only one kind of call, CustomComplex(a, b). It takes positional
arguments, representing the real and imaginary parts of the complex number.

Let’s define two methods inside the class, conjugate() and argz(), which will give us the
complex conjugate and the argument of a complex number respectively:

    def conjugate(self):
        return self.__class__(self.real, -self.imag)

    def argz(self):
        return atan(self.imag / self.real)
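One caveat about argz(): atan(imag / real) only gives the correct argument when the real part is positive, and it raises a ZeroDivisionError when the real part is zero. A possible alternative, which is not used in the rest of this tutorial, is math.atan2(), which takes both parts separately and handles every quadrant. The sketch below compares the two on plain numbers:

```python
from math import atan, atan2, pi

# -1 + 1j lies in the second quadrant, so its argument is 3*pi/4.
real, imag = -1, 1

naive = atan(imag / real)    # -pi/4: the quadrant information is lost
robust = atan2(imag, real)   # 3*pi/4: the expected argument

# A purely imaginary number has real == 0; atan2() still works.
on_axis = atan2(1, 0)        # pi/2

print(naive, robust, on_axis)
```

The examples later in this tutorial use numbers with a positive real part, so both approaches agree there.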

Note: __class__ is not a special method but a class attribute which is present by default. It has a
reference to the class. By using it here, we are obtaining that and then calling the constructor in
the usual manner. In other words, this is equivalent to CustomComplex(real, imag). This is
done here to avoid refactoring the code if the name of the class changes someday.
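There is a second effect of using self.__class__, sketched here with a hypothetical TaggedComplex subclass that is not part of the tutorial's code: methods that build their result this way return an instance of the subclass when called on one, whereas a hard-coded CustomComplex(...) would always return the base class.

```python
class CustomComplex:
    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

    def conjugate(self):
        # self.__class__ is the actual class of the instance at runtime.
        return self.__class__(self.real, -self.imag)


class TaggedComplex(CustomComplex):
    """Hypothetical subclass used only for this illustration."""


z = TaggedComplex(1, 2).conjugate()

print(type(z).__name__, z.real, z.imag)
```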

Next, we configure abs() to return the modulus of a complex number:

    def __abs__(self):
        return hypot(self.real, self.imag)

We will follow the recommended distinction between __repr__() and __str__() and use the
first for the parsable string representation and the second for a “pretty” representation.

The __repr__() method will simply return CustomComplex(a, b) in a string so that we can
call eval() to recreate the object, while the __str__() method will return the complex number
in brackets, as (a+bj):

    def __repr__(self):
        return f"{self.__class__.__name__}({self.real}, {self.imag})"

    def __str__(self):
        return f"({self.real}{self.imag:+}j)"

Mathematically, it is possible to add any two complex numbers or add a real number to a
complex number. Let’s configure the + operator in such a way that it works for both cases.

The method will check the type of the right-hand side operand. In case it is an int or a float, it
will increment only the real part (since any real number, a, is equivalent to a+0j), while in the
case of another complex number, it will change both the parts:

    def __add__(self, other):
        if isinstance(other, float) or isinstance(other, int):
            real_part = self.real + other
            imag_part = self.imag
        elif isinstance(other, CustomComplex):
            real_part = self.real + other.real
            imag_part = self.imag + other.imag
        else:
            return NotImplemented  # Unsupported operand type

        return self.__class__(real_part, imag_part)

Similarly, we define the behavior for - and *:

    def __sub__(self, other):
        if isinstance(other, float) or isinstance(other, int):
            real_part = self.real - other
            imag_part = self.imag
        elif isinstance(other, CustomComplex):
            real_part = self.real - other.real
            imag_part = self.imag - other.imag
        else:
            return NotImplemented  # Unsupported operand type

        return self.__class__(real_part, imag_part)

    def __mul__(self, other):
        if isinstance(other, int) or isinstance(other, float):
            real_part = self.real * other
            imag_part = self.imag * other
        elif isinstance(other, CustomComplex):
            real_part = (self.real * other.real) - (self.imag * other.imag)
            imag_part = (self.real * other.imag) + (self.imag * other.real)
        else:
            return NotImplemented  # Unsupported operand type

        return self.__class__(real_part, imag_part)

Since both addition and multiplication are commutative, we can define their reverse operators by
calling __add__() and __mul__() in __radd__() and __rmul__() respectively. On the other
hand, the behavior of __rsub__() needs to be defined since subtraction is not commutative:

    def __radd__(self, other):
        return self.__add__(other)

    def __rmul__(self, other):
        return self.__mul__(other)

    def __rsub__(self, other):
        # x - y != y - x
        if isinstance(other, float) or isinstance(other, int):
            real_part = other - self.real
            imag_part = -self.imag
        else:
            return NotImplemented  # Unsupported operand type

        return self.__class__(real_part, imag_part)

Note: You might have noticed that we didn’t add a construct to handle a CustomComplex
instance here. This is because, in such a case, both the operands are instances of our class, and
__rsub__() won’t be responsible for handling the operation. Instead, __sub__() will be called.
This is a subtle but important detail.

Now, we take care of the two operators, == and !=. The special methods used for them are
__eq__() and __ne__(), respectively. Two complex numbers are said to be equal if their
corresponding real and imaginary parts are both equal. They are said to be unequal when either
one of these is unequal:

    def __eq__(self, other):
        # Note: generally, floats should not be compared directly
        # due to floating-point precision
        return (self.real == other.real) and (self.imag == other.imag)

    def __ne__(self, other):
        return (self.real != other.real) or (self.imag != other.imag)

Note: The Floating-Point Guide is an article that talks about comparing floats and floating-point
precision. It highlights the caveats involved in comparing floats directly, which is something
we’re doing here.
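If you need a tolerance-based comparison instead, one option is math.isclose(). The sketch below is only an illustration: ApproxComplex is a hypothetical variant of the class, and the absolute tolerance of 1e-9 is an arbitrary assumption that you would tune for your own data:

```python
from math import isclose


class ApproxComplex:
    """Hypothetical variant with tolerance-based equality."""

    def __init__(self, real, imag):
        self.real = real
        self.imag = imag

    def __eq__(self, other):
        if not isinstance(other, ApproxComplex):
            return NotImplemented
        # abs_tol=1e-9 is an assumed tolerance, chosen for illustration.
        return (isclose(self.real, other.real, abs_tol=1e-9)
                and isclose(self.imag, other.imag, abs_tol=1e-9))


a = ApproxComplex(0.1 + 0.2, 1.0)
b = ApproxComplex(0.3, 1.0)

exact_match = (0.1 + 0.2) == 0.3   # Direct float comparison
approx_match = a == b              # Comparison within the tolerance

print(exact_match, approx_match)
```

In Python 3, when a class defines __eq__() but not __ne__(), the != operator inverts the result of __eq__() by default, so this sketch does not need a separate __ne__().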

It is also possible to raise a complex number to any power using a simple formula. We configure
the behavior for both the built-in pow() and the ** operator using the __pow__() special
method:

    def __pow__(self, other):
        r_raised = abs(self) ** other
        argz_multiplied = self.argz() * other

        real_part = round(r_raised * cos(argz_multiplied))
        imag_part = round(r_raised * sin(argz_multiplied))

        return self.__class__(real_part, imag_part)

Note: Take a close look at the definition of the method. We are calling abs() to obtain the
modulus of the complex number. So, once you’ve defined the special method for a particular
function or operator in your class, it can be used in other methods of the same class.

Let’s create two instances of this class, one having a positive imaginary part and one having a
negative imaginary part:

>>> a = CustomComplex(1, 2)
>>> b = CustomComplex(3, -4)

String representations:

>>> a
CustomComplex(1, 2)
>>> b
CustomComplex(3, -4)
>>> print(a)
(1+2j)
>>> print(b)
(3-4j)

Recreating the object using eval() with repr():

>>> b_copy = eval(repr(b))
>>> type(b_copy), b_copy.real, b_copy.imag
(<class '__main__.CustomComplex'>, 3, -4)

Addition, subtraction, and multiplication:

>>> a + b
CustomComplex(4, -2)
>>> a - b
CustomComplex(-2, 6)
>>> a + 5
CustomComplex(6, 2)
>>> 3 - a
CustomComplex(2, -2)
>>> a * 6
CustomComplex(6, 12)
>>> a * (-6)
CustomComplex(-6, -12)

Equality and inequality checks:

>>> a == CustomComplex(1, 2)
True
>>> a == b
False
>>> a != b
True
>>> a != CustomComplex(1, 2)
False

Finally, raising a complex number to some power:

>>> a ** 2
CustomComplex(-3, 4)
>>> b ** 5
CustomComplex(-237, 3116)

As you can see, objects of our custom class behave and look like those of a built-in class and are
very Pythonic.

Recap and Resources


In this tutorial, you learned about the Python Data Model and how the Data Model can be used to
build Pythonic classes. You learned about changing the behavior of built-in functions such as
len(), abs(), str(), bool(), and so on. You also learned about changing the behavior of built-
in operators like +, -, *, **, and so forth.


After reading this, you can confidently create classes that make use of the best idiomatic features
of Python and make your objects Pythonic!

For more information on the Data Model, and function and operator overloading, take a look at
these resources:

• Section 3.3, Special Method Names, of the Data Model section in the Python
  documentation
• Fluent Python by Luciano Ramalho
• Python Tricks: The Book

Supercharge Your Classes With Python
super()
by Kyle Stratis, Feb 12, 2019

Table of Contents

• An Overview of Python’s super() Function
• super() in Single Inheritance
• What Can super() Do for You?
• A super() Deep Dive
• super() in Multiple Inheritance
  o Multiple Inheritance Overview
  o Method Resolution Order
  o Multiple Inheritance Alternatives
• A super() Recap


While Python isn’t purely an object-oriented language, it’s flexible enough and powerful enough
to allow you to build your applications using the object-oriented paradigm. One of the ways in
which Python achieves this is by supporting inheritance, and super() is one of its main tools for
working with inherited behavior.

In this tutorial, you’ll learn about the following:

• The concept of inheritance in Python
• Multiple inheritance in Python
• How the super() function works
• How the super() function in single inheritance works
• How the super() function in multiple inheritance works


An Overview of Python’s super() Function


If you have experience with object-oriented languages, you may already be familiar with the
functionality of super().

If not, don’t fear! While the official documentation is fairly technical, at a high level super()
gives you access to methods in a superclass from the subclass that inherits from it.

super() alone returns a temporary proxy object that delegates to the superclass, which then
allows you to call that superclass’s methods.

Why would you want to do any of this? While the possibilities are limited by your imagination, a
common use case is building classes that extend the functionality of previously built classes.

Calling the previously built methods with super() saves you from needing to rewrite those
methods in your subclass, and allows you to swap out superclasses with minimal code changes.

super() in Single Inheritance


If you’re unfamiliar with object-oriented programming concepts, inheritance might be an
unfamiliar term. Inheritance is a concept in object-oriented programming in which a class derives
(or inherits) attributes and behaviors from another class without needing to implement them
again.

For me at least, it’s easier to understand these concepts when looking at code, so let’s write
classes describing some shapes:

class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

    def perimeter(self):
        return 2 * self.length + 2 * self.width

class Square:
    def __init__(self, length):
        self.length = length

    def area(self):
        return self.length * self.length

    def perimeter(self):
        return 4 * self.length

Here, there are two similar classes: Rectangle and Square.

You can use them as below:

>>> square = Square(4)
>>> square.area()
16
>>> rectangle = Rectangle(2, 4)
>>> rectangle.area()
8

In this example, you have two shapes that are related to each other: a square is a special kind of
rectangle. The code, however, doesn’t reflect that relationship and thus has code that is
essentially repeated.

By using inheritance, you can reduce the amount of code you write while simultaneously
reflecting the real-world relationship between rectangles and squares:

class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

    def perimeter(self):
        return 2 * self.length + 2 * self.width

# Here we declare that the Square class inherits from the Rectangle class
class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

Here, you’ve used super() to call the __init__() of the Rectangle class, allowing you to use
it in the Square class without repeating code. Below, the core functionality remains after making
changes:

>>> square = Square(4)
>>> square.area()
16

In this example, Rectangle is the superclass, and Square is the subclass.

Because the Square and Rectangle .__init__() methods are so similar, you can simply call
the superclass’s .__init__() method (Rectangle.__init__()) from that of Square by using
super(). This sets the .length and .width attributes even though you just had to supply a
single length parameter to the Square constructor.

When you run this, even though your Square class doesn’t explicitly implement it, the call to
.area() will use the .area() method in the superclass and return 16. The Square class inherited
.area() from the Rectangle class.
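You can confirm where .area() actually lives by inspecting the classes. The sketch below repeats the two class definitions so that it runs on its own, then looks at each class's own namespace and at the order in which Python searches the classes:

```python
class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width


class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)


square = Square(4)

# .area() is defined on Rectangle only; Square finds it by inheritance.
defined_on_square = 'area' in Square.__dict__
defined_on_rectangle = 'area' in Rectangle.__dict__

# The search order Python follows when looking up an attribute.
mro_names = [cls.__name__ for cls in Square.__mro__]

print(defined_on_square, defined_on_rectangle, mro_names)
```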

Note: To learn more about inheritance and object-oriented concepts in Python, be sure to check
out Inheritance and Composition: A Python OOP Guide and Object-Oriented Programming
(OOP) in Python 3.

What Can super() Do for You?


So what can super() do for you in single inheritance?

Like in other object-oriented languages, it allows you to call methods of the superclass in your
subclass. The primary use case of this is to extend the functionality of the inherited method.

In the example below, you will create a class Cube that inherits from Square and extends the
functionality of .area() (inherited from the Rectangle class through Square) to calculate the
surface area and volume of a Cube instance:

class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

class Cube(Square):
    def surface_area(self):
        face_area = super().area()
        return face_area * 6

    def volume(self):
        face_area = super().area()
        return face_area * self.length

Now that you’ve built the classes, let’s look at the surface area and volume of a cube with a side
length of 3:

>>> cube = Cube(3)
>>> cube.surface_area()
54
>>> cube.volume()
27

Caution: Note that in our example above, super() alone won’t make the method calls for you:
you have to call the method on the proxy object itself.

Here you have implemented two methods for the Cube class: .surface_area() and .volume().
Both of these calculations rely on calculating the area of a single face, so rather than
reimplementing the area calculation, you use super() to extend the area calculation.

Also notice that the Cube class definition does not have an .__init__(). Because Cube inherits
from Square and .__init__() doesn’t really do anything differently for Cube than it already
does for Square, you can skip defining it, and the .__init__() of the superclass (Square) will
be called automatically.

super() returns a delegate object to a parent class, so you call the method you want directly on
it: super().area().

Not only does this save us from having to rewrite the area calculations, but it also allows us to
change the internal .area() logic in a single location. This is especially handy when you have
a number of subclasses inheriting from one superclass.
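Extending an inherited method often means overriding it and calling the parent version from inside the override. The sketch below illustrates that pattern; the describe() method is a hypothetical helper added for illustration and is not part of the tutorial's classes.

```python
class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

    def describe(self):  # hypothetical helper, not in the tutorial
        return f"{self.length} x {self.width}, area {self.area()}"


class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

    def describe(self):
        # Reuse the parent's text and add to it instead of rewriting it.
        return "Square: " + super().describe()


print(Rectangle(2, 4).describe())  # 2 x 4, area 8
print(Square(3).describe())        # Square: 3 x 3, area 9
```

If the wording in Rectangle.describe() changes later, Square.describe() picks up the change without being edited.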

A super() Deep Dive
Before heading into multiple inheritance, let’s take a quick detour into the mechanics of
super().

While the examples above (and below) call super() without any parameters, super() can also
take two parameters: the first is the subclass, and the second parameter is an object that is an
instance of that subclass.

First, let’s see two examples showing what manipulating the first argument can do, using the
classes already shown:

class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

    def perimeter(self):
        return 2 * self.length + 2 * self.width

class Square(Rectangle):
    def __init__(self, length):
        super(Square, self).__init__(length, length)

In Python 3, the super(Square, self) call is equivalent to the parameterless super() call. The
first parameter refers to the subclass Square, while the second parameter refers to a Square
object which, in this case, is self. You can call super() with other classes as well:

class Cube(Square):
    def surface_area(self):
        face_area = super(Square, self).area()
        return face_area * 6

    def volume(self):
        face_area = super(Square, self).area()
        return face_area * self.length

In this example, you are setting Square as the subclass argument to super(), instead of Cube.
This causes super() to start searching for a matching method (in this case, .area()) at one
level above Square in the instance hierarchy, in this case Rectangle.

In this specific example, the behavior doesn’t change. But imagine that Square also
implemented an .area() function that you wanted to make sure Cube did not use. Calling
super() in this way allows you to do that.
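To make that scenario concrete, here is a minimal sketch in which Square does override .area(). The override itself (returning a labeled string) is a made-up example chosen only so the two versions are easy to tell apart.

```python
class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width


class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

    def area(self):
        # Hypothetical override that Cube should not pick up.
        return f"{super().area()} square units"


class Cube(Square):
    def surface_area(self):
        # Start the search after Square in the MRO, so Rectangle.area() runs.
        face_area = super(Square, self).area()
        return face_area * 6


cube = Cube(3)
print(cube.area())          # 9 square units  (Square.area)
print(cube.surface_area())  # 54              (Rectangle.area * 6)
```

With a plain super().area() inside Cube, the string-returning Square.area() would have been used instead.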

Caution: While we are doing a lot of fiddling with the parameters to super() in order to explore
how it works under the hood, I’d caution against doing this regularly.

The parameterless call to super() is recommended and sufficient for most use cases, and
needing to change the search hierarchy regularly could be indicative of a larger design issue.

What about the second parameter? Remember, this is an object that is an instance of the class
used as the first parameter. For example, with cube = Cube(3), isinstance(cube, Square) must
return True.

By including an instantiated object, super() gives you bound methods: methods that are bound to
the object, which gives them the object’s context, such as any instance attributes. If this
parameter is not included, super() returns an unbound proxy that isn’t associated with any
object’s context, so instance methods can’t be called through it directly.
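The two-argument form can also be tried outside a class body, which makes the role of the second argument easier to see. This is a small sketch reusing the Rectangle and Square classes; the variable names proxy and bound are illustrative.

```python
class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width


class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)


square = Square(4)

# Outside a class body the zero-argument form is unavailable,
# so both arguments are passed explicitly.
proxy = super(Square, square)
print(proxy.area())              # 16

# The method looked up through the proxy is bound to the object passed in.
bound = proxy.area
print(bound.__self__ is square)  # True

# The second argument must be an instance (or subclass) of the first.
try:
    super(Square, Rectangle(2, 4))
except TypeError as exc:
    print("TypeError:", exc)
```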

For more information about bound methods, unbound methods, and functions, read the Python
documentation on its descriptor system.

Note: Technically, super() doesn’t return a method. It returns a proxy object. This is an object
that delegates calls to the correct class methods without making an additional object in order to
do so.

super() in Multiple Inheritance


Now that you’ve worked through an overview and some examples of super() and single
inheritance, you will be introduced to an overview and some examples that will demonstrate how
multiple inheritance works and how super() enables that functionality.

Multiple Inheritance Overview

There is another use case in which super() really shines, and this one isn’t as common as the
single inheritance scenario. In addition to single inheritance, Python supports multiple
inheritance, in which a subclass can inherit from multiple superclasses that don’t necessarily
inherit from each other (also known as sibling classes).

I’m a very visual person, and I find diagrams are incredibly helpful to understand concepts like
this. The image below shows a very simple multiple inheritance scenario, where one class
inherits from two unrelated (sibling) superclasses:

[Figure: A diagrammed example of multiple inheritance (Image: Kyle Stratis)]

To better illustrate multiple inheritance in action, here is some code for you to try out, showing
how you can build a right pyramid with a square base out of a Triangle and a Square:

class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height

class RightPyramid(Triangle, Square):
    def __init__(self, base, slant_height):
        self.base = base
        self.slant_height = slant_height

    def area(self):
        base_area = super().area()
        perimeter = super().perimeter()
        return 0.5 * perimeter * self.slant_height + base_area

Note: The term slant height may be unfamiliar, especially if it’s been a while since you’ve taken
a geometry class or worked on any pyramids.

The slant height is the distance from the midpoint of a base edge of an object (like a pyramid)
up its face to the peak of that object. You can read more about slant heights at Wolfram MathWorld.

This example declares a Triangle class and a RightPyramid class that inherits from both
Square and Triangle.

You’ll see another .area() method that uses super() just like in single inheritance, with the
aim of it reaching the .perimeter() and .area() methods defined all the way up in the
Rectangle class.

Note: You may notice that the code above isn’t using any inherited properties from the
Triangle class yet. Later examples will fully take advantage of inheritance from both Triangle
and Square.

The problem, though, is that both superclasses (Triangle and Square) define a .area(). Take a
second and think about what might happen when you call .area() on RightPyramid, and then
try calling it like below:

>>> pyramid = RightPyramid(2, 4)
>>> pyramid.area()
Traceback (most recent call last):
  File "shapes.py", line 63, in <module>
    print(pyramid.area())
  File "shapes.py", line 47, in area
    base_area = super().area()
  File "shapes.py", line 38, in area
    return 0.5 * self.base * self.height
AttributeError: 'RightPyramid' object has no attribute 'height'

Did you guess that Python will try to call Triangle.area()? This is because of something
called the method resolution order.

Note: How did we notice that Triangle.area() was called and not, as we hoped,
Square.area()? If you look at the last line of the traceback (before the AttributeError),
you’ll see a reference to a specific line of code:

return 0.5 * self.base * self.height

You may recognize this from geometry class as the formula for the area of a triangle. Otherwise,
if you’re like me, you might have scrolled up to the Triangle and Rectangle class definitions
and seen this same code in Triangle.area().

Method Resolution Order

The method resolution order (or MRO) tells Python how to search for inherited methods. This
comes in handy when you’re using super() because the MRO tells you exactly where Python
will look for a method you’re calling with super() and in what order.

Every class has an .__mro__ attribute that allows us to inspect the order, so let’s do that:

>>> RightPyramid.__mro__
(<class '__main__.RightPyramid'>, <class '__main__.Triangle'>,
<class '__main__.Square'>, <class '__main__.Rectangle'>,
<class 'object'>)

This tells us that methods will be searched first in RightPyramid, then in Triangle, then in
Square, then Rectangle, and then, if nothing is found, in object, from which all classes
originate.

The problem here is that the interpreter is searching for .area() in Triangle before Square and
Rectangle, and upon finding .area() in Triangle, Python calls it instead of the one you want.
Because Triangle.area() expects there to be a .height and a .base attribute, and RightPyramid
never sets .height, Python raises an AttributeError.

Luckily, you have some control over how the MRO is constructed. Just by changing the
signature of the RightPyramid class, you can search in the order you want, and the methods will
resolve correctly:

class RightPyramid(Square, Triangle):
    def __init__(self, base, slant_height):
        self.base = base
        self.slant_height = slant_height
        super().__init__(self.base)

    def area(self):
        base_area = super().area()
        perimeter = super().perimeter()
        return 0.5 * perimeter * self.slant_height + base_area

Notice that RightPyramid initializes partially with the .__init__() from the Square class.
This allows .area() to use the .length on the object, as is designed.

Now, you can build a pyramid, inspect the MRO, and calculate the surface area:

>>> pyramid = RightPyramid(2, 4)
>>> RightPyramid.__mro__
(<class '__main__.RightPyramid'>, <class '__main__.Square'>,
<class '__main__.Rectangle'>, <class '__main__.Triangle'>,
<class 'object'>)
>>> pyramid.area()
20.0

You see that the MRO is now what you’d expect, and you can inspect the area of the pyramid as
well, thanks to .area() and .perimeter().
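The effect of base-class order on the MRO can be isolated with stripped-down stand-ins for the shape classes. In the sketch below, the class names TriangleFirst and SquareFirst and the helper mro_names() are hypothetical; the .area() methods just return a label so you can see which one was found.

```python
class Rectangle:
    def area(self):
        return "Rectangle.area"


class Square(Rectangle):
    pass


class Triangle:
    def area(self):
        return "Triangle.area"


class TriangleFirst(Triangle, Square):  # same base order as the failing example
    pass


class SquareFirst(Square, Triangle):    # same base order as the fixed example
    pass


def mro_names(cls):
    """Return the class names in cls.__mro__ as a list of strings."""
    return [c.__name__ for c in cls.__mro__]


print(mro_names(TriangleFirst))
# ['TriangleFirst', 'Triangle', 'Square', 'Rectangle', 'object']
print(TriangleFirst().area())  # Triangle.area

print(mro_names(SquareFirst))
# ['SquareFirst', 'Square', 'Rectangle', 'Triangle', 'object']
print(SquareFirst().area())    # Rectangle.area
```

Only the order of the base classes differs between the two subclasses, yet a different .area() is found first in each case.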

There’s still a problem here, though. For the sake of simplicity, I did a few things wrong in this
example: the first, and arguably the most important, was that I had two separate classes with the
same method name and signature.

This causes issues with method resolution, because the first instance of .area() that is
encountered in the MRO list will be called.

When you’re using super() with multiple inheritance, it’s imperative to design your classes to
cooperate. Part of this is ensuring that your methods are unique so that they get resolved in the
MRO, by making sure method signatures are unique, whether by using method names or
method parameters.

In this case, to avoid a complete overhaul of your code, you can rename the Triangle class’s
.area() method to .tri_area(). This way, the area methods can continue using class
properties rather than taking external parameters:

class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height
        super().__init__()

    def tri_area(self):
        return 0.5 * self.base * self.height

Let’s also go ahead and use this in the RightPyramid class:

class RightPyramid(Square, Triangle):
    def __init__(self, base, slant_height):
        self.base = base
        self.slant_height = slant_height
        super().__init__(self.base)

    def area(self):
        base_area = super().area()
        perimeter = super().perimeter()
        return 0.5 * perimeter * self.slant_height + base_area

    def area_2(self):
        base_area = super().area()
        triangle_area = super().tri_area()
        return triangle_area * 4 + base_area

The next issue here is that the code initializes the Square part of the object but never calls
Triangle.__init__(), so calling .area_2() will give us an AttributeError since .height is
never set.

You need to do two things to fix this:

1. All methods that are called with super() need to have a call to their superclass’s version
of that method. This means that you will need to add super().__init__() to the
.__init__() methods of Triangle and Rectangle.
2. Redesign all the .__init__() calls to take a keyword dictionary. See the complete code
below.
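The complete listing does not appear in this copy of the text. What follows is a reconstruction sketch that is consistent with the two steps above and with the table and output shown further down. Details such as the exact keyword names that RightPyramid.__init__() adds to kwargs are assumptions.

```python
class Rectangle:
    def __init__(self, length, width, **kwargs):
        self.length = length
        self.width = width
        # Pass whatever is left along to the next class in the MRO.
        super().__init__(**kwargs)

    def area(self):
        return self.length * self.width

    def perimeter(self):
        return 2 * self.length + 2 * self.width


class Square(Rectangle):
    def __init__(self, length, **kwargs):
        super().__init__(length=length, width=length, **kwargs)


class Triangle:
    def __init__(self, base, height, **kwargs):
        self.base = base
        self.height = height
        super().__init__(**kwargs)

    def tri_area(self):
        return 0.5 * self.base * self.height


class RightPyramid(Square, Triangle):
    def __init__(self, base, slant_height, **kwargs):
        self.base = base
        self.slant_height = slant_height
        # Assumed mapping: translate the pyramid's arguments into the
        # names that Square and Triangle expect.
        kwargs["height"] = slant_height
        kwargs["length"] = base
        super().__init__(base=base, **kwargs)

    def area(self):
        base_area = super().area()
        perimeter = super().perimeter()
        return 0.5 * perimeter * self.slant_height + base_area

    def area_2(self):
        base_area = super().area()
        triangle_area = super().tri_area()
        return triangle_area * 4 + base_area


pyramid = RightPyramid(base=2, slant_height=4)
print(pyramid.area())    # 20.0
print(pyramid.area_2())  # 20.0
```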

There are a number of important differences in this code:

 **kwargs is modified in some places (such as RightPyramid.__init__()): This will
allow users of these objects to instantiate them only with the arguments that make sense
for that particular object.
 Setting up named arguments before **kwargs: You can see this in
RightPyramid.__init__(). This has the neat effect of popping that key right out of the
**kwargs dictionary, so that by the time that it ends up at the end of the MRO in the
object class, **kwargs is empty.
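The "popping" effect is ordinary Python argument binding and can be seen without any classes. In this sketch the function names outer() and inner() are hypothetical stand-ins for two .__init__() methods in a chain.

```python
def inner(height, **kwargs):
    # 'height' is bound to the named parameter, leaving kwargs empty.
    print("inner kwargs:", kwargs)
    return kwargs


def outer(base, **kwargs):
    # 'base' is bound to the named parameter, so it never appears in kwargs.
    print("outer kwargs:", kwargs)
    return inner(**kwargs)


leftover = outer(base=2, height=4)
# outer kwargs: {'height': 4}
# inner kwargs: {}
```

Each function consumes the names it declares and forwards only the remainder.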

Note: Following the state of kwargs can be tricky here, so here’s a table of .__init__() calls in
order, showing the class that owns that call, and the contents of kwargs during that call:

Class          Named Arguments       kwargs
RightPyramid   base, slant_height    (empty)
Square         length                base, height
Rectangle      length, width         base, height
Triangle       base, height          (empty)

Now, when you use these updated classes, you have this:

>>> pyramid = RightPyramid(base=2, slant_height=4)
>>> pyramid.area()
20.0
>>> pyramid.area_2()
20.0

It works! You’ve used super() to successfully navigate a complicated class hierarchy while
using both inheritance and composition to create new classes with minimal reimplementation.

Multiple Inheritance Alternatives

As you can see, multiple inheritance can be useful but can also lead to very complicated situations
and code that is hard to read. It’s also rare to have objects that neatly inherit everything from
more than one other object.

If you see yourself beginning to use multiple inheritance and a complicated class hierarchy, it’s
worth asking yourself if you can achieve code that is cleaner and easier to understand by using
composition instead of inheritance. Since this article is focused on inheritance, I won’t go into
too much detail on composition and how to wield it in Python. Luckily, Real Python has
published a deep-dive guide to both inheritance and composition in Python that will make you an
OOP pro in no time.

There’s another technique that can help you get around the complexity of multiple inheritance
while still providing many of the benefits. This technique is in the form of a specialized, simple
class called a mixin.

A mixin works as a kind of inheritance, but instead of defining an “is-a” relationship it may be
more accurate to say that it defines an “includes-a” relationship. With a mixin you can write a
behavior that can be directly included in any number of other classes.

Below, you will see a short example using VolumeMixin to give specific functionality to our 3D
objects—in this case, a volume calculation:

class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

class Square(Rectangle):
    def __init__(self, length):
        super().__init__(length, length)

class VolumeMixin:
    def volume(self):
        return self.area() * self.height

class Cube(VolumeMixin, Square):
    def __init__(self, length):
        super().__init__(length)
        self.height = length

    def face_area(self):
        return super().area()

    def surface_area(self):
        return super().area() * 6

In this example, the code was reworked to include a mixin called VolumeMixin. The mixin is
then used by Cube and gives Cube the ability to calculate its volume, which is shown below:

>>> cube = Cube(2)
>>> cube.surface_area()
24
>>> cube.volume()
8

This mixin can be used the same way in any other class that has an area defined for it and for
which the formula area * height returns the correct volume.
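As a sketch of that reuse, the same mixin can be attached to a second class. The Box class below is hypothetical; it only needs to provide .area() and a .height attribute for the mixin to work.

```python
class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width


class VolumeMixin:
    def volume(self):
        # Relies on the host class supplying area() and height.
        return self.area() * self.height


class Box(VolumeMixin, Rectangle):  # hypothetical class for illustration
    def __init__(self, length, width, height):
        super().__init__(length, width)
        self.height = height


box = Box(2, 3, 4)
print(box.volume())  # 24
```

The mixin carries no state of its own, which is what keeps it easy to combine with other classes.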

A super() Recap
In this tutorial, you learned how to supercharge your classes with super(). Your journey started
with a review of single inheritance and then showed how to call superclass methods easily with
super().

You then learned how multiple inheritance works in Python, and techniques to combine super()
with multiple inheritance. You also learned about how Python resolves method calls using the
method resolution order (MRO), as well as how to inspect and modify the MRO to ensure
appropriate methods are called at appropriate times.

For more information about object-oriented programming in Python and using super(), check
out these resources:

 Official super() documentation


 Python’s super() Considered Super by Raymond Hettinger
 Object-Oriented Programming in Python 3

Inheritance and Composition: A Python OOP Guide
by Isaac Rodriguez, Aug 07, 2019

Table of Contents

 What Are Inheritance and Composition?


o What’s Inheritance?
o What’s Composition?
 An Overview of Inheritance in Python
o The Object Super Class
o Exceptions Are an Exception
o Creating Class Hierarchies
o Abstract Base Classes in Python
o Implementation Inheritance vs Interface Inheritance
o The Class Explosion Problem
o Inheriting Multiple Classes
 Composition in Python
o Flexible Designs With Composition
o Customizing Behavior With Composition
 Choosing Between Inheritance and Composition in Python
o Inheritance to Model “Is A” Relationship
o Mixing Features With Mixin Classes
o Composition to Model “Has A” Relationship
o Composition to Change Run-Time Behavior
o Choosing Between Inheritance and Composition in Python
 Conclusion
 Recommended Reading


In this article, you’ll explore inheritance and composition in Python. Inheritance and
composition are two important concepts in object oriented programming that model the
relationship between two classes. They are the building blocks of object oriented design, and
they help programmers to write reusable code.

By the end of this article, you’ll know how to:

 Use inheritance in Python
 Model class hierarchies using inheritance
 Use multiple inheritance in Python and understand its drawbacks
 Use composition to create complex objects
 Reuse existing code by applying composition
 Change application behavior at run-time through composition


What Are Inheritance and Composition?


Inheritance and composition are two major concepts in object oriented programming that
model the relationship between two classes. They drive the design of an application and
determine how the application should evolve as new features are added or requirements change.

Both of them enable code reuse, but they do it in different ways.

What’s Inheritance?

Inheritance models what is called an “is a” relationship. This means that when you have a
Derived class that inherits from a Base class, you create a relationship where Derived is a
specialized version of Base.

Inheritance is represented using the Unified Modeling Language or UML in the following way:

Classes are represented as boxes with the class name on top. The inheritance relationship is
represented by an arrow from the derived class pointing to the base class. The word extends is
usually added to the arrow.

Note: In an inheritance relationship:

 Classes that inherit from another are called derived classes, subclasses, or subtypes.
 Classes from which other classes are derived are called base classes or super classes.
 A derived class is said to derive, inherit, or extend a base class.

Let’s say you have a base class Animal and you derive from it to create a Horse class. The
inheritance relationship states that a Horse is an Animal. This means that Horse inherits the
interface and implementation of Animal, and Horse objects can be used to replace Animal
objects in the application.

This is known as the Liskov substitution principle. The principle states that “in a computer
program, if S is a subtype of T, then objects of type T may be replaced with objects of type S
without altering any of the desired properties of the program”.
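A minimal sketch of substitution using the Animal and Horse classes named above. The .speak() method and the introduce() function are hypothetical additions used only to show code written against the base class working unchanged with the derived class.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Horse(Animal):
    def speak(self):
        return f"{self.name} neighs"


def introduce(animal):
    # Written against the Animal interface only.
    return f"Meet {animal.name}: {animal.speak()}"


print(introduce(Animal("Generic")))  # Meet Generic: Generic makes a sound
print(introduce(Horse("Spirit")))    # Meet Spirit: Spirit neighs
print(isinstance(Horse("Spirit"), Animal))  # True
```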

You’ll see in this article why you should always follow the Liskov substitution principle when
creating your class hierarchies, and the problems you’ll run into if you don’t.

What’s Composition?

Composition is a concept that models a “has a” relationship. It enables creating complex types by
combining objects of other types. This means that a class Composite can contain an object of
another class Component. This relationship means that a Composite has a Component.

UML represents composition as follows:

Composition is represented through a line with a diamond at the composite class pointing to the
component class. The composite side can express the cardinality of the relationship. The
cardinality indicates the number or valid range of Component instances the Composite class will
contain.

In the diagram above, the 1 represents that the Composite class contains one object of type
Component. Cardinality can be expressed in the following ways:

 A number indicates the number of Component instances that are contained in the
Composite.
 The * symbol indicates that the Composite class can contain a variable number of
Component instances.
 A range 1..4 indicates that the Composite class can contain a range of Component
instances. The range is indicated with the minimum and maximum number of instances,
or minimum and many instances like in 1..*.

Note: Classes that contain objects of other classes are usually referred to as composites, whereas
classes that are used to create more complex types are referred to as components.

For example, your Horse class can be composed of another object of type Tail. Composition
allows you to express that relationship by saying a Horse has a Tail.

Composition enables you to reuse code by adding objects to other objects, as opposed to
inheriting the interface and implementation of other classes. Both Horse and Dog classes can
leverage the functionality of Tail through composition without deriving one class from the
other.
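Here is a small sketch of that idea. Tail, Horse, and Dog come from the text, while the attribute and method names (length_cm, wag(), swat_flies(), greet()) are made up for illustration.

```python
class Tail:
    def __init__(self, length_cm):
        self.length_cm = length_cm

    def wag(self):
        return f"wagging a {self.length_cm} cm tail"


class Horse:
    def __init__(self):
        # Horse has a Tail; it does not inherit from Tail.
        self.tail = Tail(90)

    def swat_flies(self):
        return "Horse is " + self.tail.wag()


class Dog:
    def __init__(self):
        self.tail = Tail(30)

    def greet(self):
        return "Dog is " + self.tail.wag()


print(Horse().swat_flies())  # Horse is wagging a 90 cm tail
print(Dog().greet())         # Dog is wagging a 30 cm tail
print(issubclass(Horse, Dog) or issubclass(Dog, Horse))  # False
```

Both classes reuse the Tail behavior, yet neither one derives from the other.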

An Overview of Inheritance in Python


Everything in Python is an object. Modules are objects, class definitions and functions are
objects, and of course, objects created from classes are objects too.

Inheritance is a required feature of every object oriented programming language. This means that
Python supports inheritance, and as you’ll see later, it’s one of the few languages that supports
multiple inheritance.

When you write Python code using classes, you are using inheritance even if you don’t know
you’re using it. Let’s take a look at what that means.

The Object Super Class

The easiest way to see inheritance in Python is to jump into the Python interactive shell and write
a little bit of code. You’ll start by writing the simplest class possible:

>>> class MyClass:
...     pass
...

You declared a class MyClass that doesn’t do much, but it will illustrate the most basic
inheritance concepts. Now that you have the class declared, you can use the dir() function to
list its members:

>>> c = MyClass()
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__',
'__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
'__str__', '__subclasshook__', '__weakref__']

dir() returns a list of all the members in the specified object. You have not declared any
members in MyClass, so where is the list coming from? You can find out using the interactive
interpreter:

>>> o = object()
>>> dir(o)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
'__ge__', '__getattribute__', '__gt__', '__hash__', '__init__',
'__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__',
'__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__']

As you can see, the two lists are nearly identical. There are some additional members in MyClass
like __dict__ and __weakref__, but every single member of the object class is also present in
MyClass.

This is because every class you create in Python implicitly derives from object. You could be
more explicit and write class MyClass(object):, but it’s redundant and unnecessary.
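The implicit base class can be checked directly rather than by comparing dir() listings by eye. In this sketch, Explicit is a hypothetical second class used only for comparison.

```python
class MyClass:
    pass


class Explicit(object):
    pass


# Both spellings produce the same single base class.
print(MyClass.__bases__)            # (<class 'object'>,)
print(Explicit.__bases__)           # (<class 'object'>,)
print(issubclass(MyClass, object))  # True

# Every member that a plain object has is also available on a MyClass instance.
inherited = set(dir(object())) <= set(dir(MyClass()))
print(inherited)                    # True
```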

Note: In Python 2, you have to explicitly derive from object for reasons beyond the scope of
this article, but you can read about it in the New-style and classic classes section of the Python 2
documentation.

Exceptions Are an Exception

Every class that you create in Python will implicitly derive from object. Classes used to
indicate errors by raising an exception are a special case: the implicit object base isn’t enough
for them.

You can see the problem using the Python interactive interpreter:

>>> class MyError:
...     pass
...
>>> raise MyError()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: exceptions must derive from BaseException

You created a new class to indicate a type of error. Then you tried to use it to raise an exception.
An exception is raised but the output states that the exception is of type TypeError not MyError
and that all exceptions must derive from BaseException.

BaseException is a base class provided for all error types. To create a new error type, you must
derive your class from BaseException or one of its derived classes. The convention in Python is
to derive your custom error types from Exception, which in turn derives from BaseException.

The correct way to define your error type is the following:

>>> class MyError(Exception):
...     pass
...
>>> raise MyError()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
__main__.MyError

As you can see, when you raise MyError, the output correctly states the type of error raised.
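The same hierarchy can be verified in a script, and a custom error can be caught like any built-in one. This is a short sketch; the error message text is arbitrary.

```python
class MyError(Exception):
    """Custom error type deriving from Exception."""


# Exception itself derives from BaseException, so MyError does too.
print(issubclass(Exception, BaseException))  # True
print(issubclass(MyError, BaseException))    # True

try:
    raise MyError("something went wrong")
except MyError as exc:
    caught = exc
    print(type(exc).__name__, exc)           # MyError something went wrong
```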

Creating Class Hierarchies

Inheritance is the mechanism you’ll use to create hierarchies of related classes. These related
classes will share a common interface that will be defined in the base classes. Derived classes
can specialize the interface by providing a particular implementation where applicable.

In this section, you’ll start modeling an HR system. The example will demonstrate the use of
inheritance and how derived classes can provide a concrete implementation of the base class
interface.

The HR system needs to process payroll for the company’s employees, but there are different
types of employees depending on how their payroll is calculated.

You start by implementing a PayrollSystem class that processes payroll:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')

The PayrollSystem implements a .calculate_payroll() method that takes a collection of
employees and prints their id, name, and check amount using the .calculate_payroll()
method exposed on each employee object.

Now, you implement a base class Employee that handles the common interface for every
employee type:

# In hr.py

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

Employee is the base class for all employee types. It is constructed with an id and a name. What
you are saying is that every Employee must have an id assigned as well as a name.

The HR system requires that every Employee processed must provide a .calculate_payroll()
interface that returns the weekly salary for the employee. The implementation of that interface
differs depending on the type of Employee.

For example, administrative workers have a fixed salary, so every week they get paid the same
amount:

# In hr.py

class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

You create a derived class SalaryEmployee that inherits Employee. The class is initialized with
the id and name required by the base class, and you use super() to initialize the members of the
base class. You can read all about super() in Supercharge Your Classes With Python super().

SalaryEmployee also requires a weekly_salary initialization parameter that represents the
amount the employee makes per week.

The class provides the required .calculate_payroll() method used by the HR system. The
implementation just returns the amount stored in weekly_salary.

The company also employs manufacturing workers that are paid by the hour, so you add an
HourlyEmployee to the HR system:

# In hr.py

class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

The HourlyEmployee class is initialized with id and name, like the base class, plus the
hours_worked and the hour_rate required to calculate the payroll. The
.calculate_payroll() method is implemented by returning the hours worked times the hour
rate.

Finally, the company employs sales associates that are paid through a fixed salary plus a
commission based on their sales, so you create a CommissionEmployee class:

# In hr.py

class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

You derive CommissionEmployee from SalaryEmployee because both classes have a
weekly_salary to consider. At the same time, CommissionEmployee is initialized with a
commission value that is based on the sales for the employee.

.calculate_payroll() leverages the implementation of the base class to retrieve the fixed
salary and adds the commission value.

Since CommissionEmployee derives from SalaryEmployee, you have access to the
weekly_salary property directly, and you could’ve implemented .calculate_payroll()
using the value of that property.

The problem with accessing the property directly is that if the implementation of
SalaryEmployee.calculate_payroll() changes, then you’ll have to also change the
implementation of CommissionEmployee.calculate_payroll(). It’s better to rely on the
already implemented method in the base class and extend the functionality as needed.
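To illustrate the difference, the sketch below applies a made-up rule change to SalaryEmployee.calculate_payroll() (a flat deduction of 50) and compares the two approaches. DirectCommissionEmployee is a hypothetical variant that reads weekly_salary directly; it is not part of the HR example.

```python
class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        # Hypothetical rule change: a flat 50 is withheld from every salary.
        return self.weekly_salary - 50


class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        # Delegates the fixed part, so the rule change is picked up for free.
        return super().calculate_payroll() + self.commission


class DirectCommissionEmployee(CommissionEmployee):
    def calculate_payroll(self):
        # Reads the attribute directly and silently misses the rule change.
        return self.weekly_salary + self.commission


via_super = CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
direct = DirectCommissionEmployee(4, 'Kevin Bacon', 1000, 250)
print(via_super.calculate_payroll())  # 1200
print(direct.calculate_payroll())     # 1250
```

Only the version that calls super().calculate_payroll() reflects the new rule without any further edits.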

You created your first class hierarchy for the system. The UML diagram of the classes looks like
this:

The diagram shows the inheritance hierarchy of the classes. The derived classes implement the
IPayrollCalculator interface, which is required by the PayrollSystem. The
PayrollSystem.calculate_payroll() implementation requires that the employee objects
passed contain an id, name, and calculate_payroll() implementation.

Interfaces are represented similarly to classes with the word interface above the interface name.
Interface names are usually prefixed with a capital I.

The application creates its employees and passes them to the payroll system to process payroll:

# In program.py

import hr

salary_employee = hr.SalaryEmployee(1, 'John Smith', 1500)
hourly_employee = hr.HourlyEmployee(2, 'Jane Doe', 40, 15)
commission_employee = hr.CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll([
    salary_employee,
    hourly_employee,
    commission_employee
])

You can run the program in the command line and see the results:

$ python program.py

Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500

Payroll for: 2 - Jane Doe
- Check amount: 600

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

The program creates three employee objects, one for each of the derived classes. Then, it creates
the payroll system and passes a list of the employees to its .calculate_payroll() method,
which calculates the payroll for each employee and prints the results.

Notice how the Employee base class doesn’t define a .calculate_payroll() method. This
means that if you were to create a plain Employee object and pass it to the PayrollSystem, then
you’d get an error. You can try it in the Python interactive interpreter:

>>> import hr
>>> employee = hr.Employee(1, 'Invalid')
>>> payroll_system = hr.PayrollSystem()
>>> payroll_system.calculate_payroll([employee])

Payroll for: 1 - Invalid
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/hr.py", line 39, in calculate_payroll
    print(f'- Check amount: {employee.calculate_payroll()}')
AttributeError: 'Employee' object has no attribute 'calculate_payroll'

While you can instantiate an Employee object, the object can’t be used by the PayrollSystem.
Why? Because the PayrollSystem can’t call .calculate_payroll() on a plain Employee. To meet
the requirements of PayrollSystem, you’ll want to convert the Employee class, which is
currently a concrete class, to an abstract class. That way, no employee is ever just an
Employee, but one that implements .calculate_payroll().

Abstract Base Classes in Python

The Employee class in the example above is what is called an abstract base class. Abstract base
classes exist to be inherited, but never instantiated. Python provides the abc module to define
abstract base classes.

You can use leading underscores in your class name to communicate that objects of that class
should not be created. Underscores provide a friendly way to prevent misuse of your code, but
they don’t prevent eager users from creating instances of that class.

The abc module in the Python standard library provides functionality to prevent creating objects
from abstract base classes.

You can modify the implementation of the Employee class to ensure that it can’t be instantiated:

# In hr.py

from abc import ABC, abstractmethod

class Employee(ABC):
    def __init__(self, id, name):
        self.id = id
        self.name = name

    @abstractmethod
    def calculate_payroll(self):
        pass

You derive Employee from ABC, making it an abstract base class. Then, you decorate the
.calculate_payroll() method with the @abstractmethod decorator.

This change has two nice side-effects:

1. You’re telling users of the module that objects of type Employee can’t be created.
2. You’re telling other developers working on the hr module that if they derive from
Employee, then they must override the .calculate_payroll() abstract method.

You can see that objects of type Employee can’t be created using the interactive interpreter:

>>> import hr
>>> employee = hr.Employee(1, 'abstract')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Employee with abstract methods calculate_payroll

The output shows that the class cannot be instantiated because it contains an abstract method
calculate_payroll(). Derived classes must override the method to allow creating objects of
their type.
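The same rule applies to derived classes. In the following sketch, Intern is a hypothetical class that forgets to override the abstract method, so it can't be instantiated either, while SalaryEmployee overrides it and works normally:

```python
from abc import ABC, abstractmethod


class Employee(ABC):
    def __init__(self, id, name):
        self.id = id
        self.name = name

    @abstractmethod
    def calculate_payroll(self):
        pass


class Intern(Employee):
    # Hypothetical class that does not override calculate_payroll().
    pass


class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary


try:
    Intern(1, 'No Payroll')
except TypeError as error:
    # Intern is still abstract, so creating it raises TypeError.
    print(f'Cannot create Intern: {error}')

employee = SalaryEmployee(2, 'John Smith', 1500)
print(employee.calculate_payroll())  # 1500
```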

Implementation Inheritance vs Interface Inheritance

When you derive one class from another, the derived class inherits both:

1. The base class interface: The derived class inherits all the methods, properties, and
attributes of the base class.
2. The base class implementation: The derived class inherits the code that implements the
class interface.

Most of the time, you’ll want to inherit the implementation of a class, but you will want to
implement multiple interfaces, so your objects can be used in different situations.

Modern programming languages are designed with this basic concept in mind. They allow you to
inherit from a single class, but you can implement multiple interfaces.

In Python, you don’t have to explicitly declare an interface. Any object that implements the
desired interface can be used in place of another object. This is known as duck typing. Duck
typing is usually explained as “if it behaves like a duck, then it’s a duck.”

To illustrate this, you will now add a DisgruntledEmployee class to the example above which
doesn’t derive from Employee:

# In disgruntled.py

class DisgruntledEmployee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def calculate_payroll(self):
        return 1000000

The DisgruntledEmployee class doesn’t derive from Employee, but it exposes the same
interface required by the PayrollSystem. The PayrollSystem.calculate_payroll() requires
a list of objects that implement the following interface:

• An id property or attribute that returns the employee’s id
• A name property or attribute that represents the employee’s name
• A .calculate_payroll() method that doesn’t take any parameters and returns the
payroll amount to process

All these requirements are met by the DisgruntledEmployee class, so the PayrollSystem can
still calculate its payroll.
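If you ever want to check those three requirements at runtime, a small helper can do it without involving any base class. The supports_payroll() function and the Customer class below are hypothetical and not part of the tutorial's modules:

```python
def supports_payroll(obj):
    """Return True if obj exposes what PayrollSystem relies on."""
    return (
        hasattr(obj, 'id')
        and hasattr(obj, 'name')
        and callable(getattr(obj, 'calculate_payroll', None))
    )


class DisgruntledEmployee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def calculate_payroll(self):
        return 1000000


class Customer:
    # Has id and name, but no calculate_payroll() method.
    def __init__(self, id, name):
        self.id = id
        self.name = name


print(supports_payroll(DisgruntledEmployee(20000, 'Anonymous')))  # True
print(supports_payroll(Customer(1, 'Jane Doe')))  # False
```

The check looks only at what the object can do, never at its type, which is the essence of duck typing.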

You can modify the program to use the DisgruntledEmployee class:

# In program.py

import hr
import disgruntled

salary_employee = hr.SalaryEmployee(1, 'John Smith', 1500)
hourly_employee = hr.HourlyEmployee(2, 'Jane Doe', 40, 15)
commission_employee = hr.CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
disgruntled_employee = disgruntled.DisgruntledEmployee(20000, 'Anonymous')
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll([
    salary_employee,
    hourly_employee,
    commission_employee,
    disgruntled_employee
])

The program creates a DisgruntledEmployee object and adds it to the list processed by the
PayrollSystem. You can now run the program and see its output:

$ python program.py

Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500

Payroll for: 2 - Jane Doe
- Check amount: 600

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 20000 - Anonymous
- Check amount: 1000000

As you can see, the PayrollSystem can still process the new object because it meets the desired
interface.

Since you don’t have to derive from a specific class for your objects to be reusable by the
program, you may be asking why you should use inheritance instead of just implementing the
desired interface. The following rules may help you:

• Use inheritance to reuse an implementation: Your derived classes should leverage
most of their base class implementation. They must also model an is a relationship. A
Customer class might also have an id and a name, but a Customer is not an Employee, so
you should not use inheritance.
• Implement an interface to be reused: When you want your class to be reused by a
specific part of your application, you implement the required interface in your class, but
you don’t need to provide a base class, or inherit from another class.

You can now clean up the example above to move onto the next topic. You can delete the
disgruntled.py file and then modify the hr module to its original state:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

You removed the import of the abc module since the Employee class doesn’t need to be abstract.
You also removed the abstract calculate_payroll() method from it since it doesn’t provide
any implementation.

Basically, you are inheriting the implementation of the id and name attributes of the Employee
class in your derived classes. Since .calculate_payroll() is just an interface to the
PayrollSystem.calculate_payroll() method, you don’t need to implement it in the
Employee base class.

Notice how the CommissionEmployee class derives from SalaryEmployee. This means that
CommissionEmployee inherits the implementation and interface of SalaryEmployee. You can
see how the CommissionEmployee.calculate_payroll() method leverages the base class
implementation because it relies on the result from super().calculate_payroll() to
implement its own version.

The Class Explosion Problem

If you are not careful, inheritance can lead you to a huge hierarchical structure of classes that is
hard to understand and maintain. This is known as the class explosion problem.

You started building a class hierarchy of Employee types used by the PayrollSystem to
calculate payroll. Now, you need to add some functionality to those classes, so they can be used
with the new ProductivitySystem.

The ProductivitySystem tracks productivity based on employee roles. There are different
employee roles:

• Managers: They walk around yelling at people telling them what to do. They are salaried
employees and make more money.
• Secretaries: They do all the paperwork for managers and ensure that everything gets
billed and paid on time. They are also salaried employees but make less money.
• Sales employees: They make a lot of phone calls to sell products. They have a salary, but
they also get commissions for sales.
• Factory workers: They manufacture the products for the company. They are paid by the
hour.

With those requirements, you start to see that Employee and its derived classes might belong
somewhere other than the hr module because now they’re also used by the
ProductivitySystem.

You create an employees module and move the classes there:

# In employees.py

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionEmployee(SalaryEmployee):
    def __init__(self, id, name, weekly_salary, commission):
        super().__init__(id, name, weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

The implementation remains the same, but you move the classes to the employees module. Now,
you change your program to support the change:

# In program.py

import hr
import employees

salary_employee = employees.SalaryEmployee(1, 'John Smith', 1500)
hourly_employee = employees.HourlyEmployee(2, 'Jane Doe', 40, 15)
commission_employee = employees.CommissionEmployee(3, 'Kevin Bacon', 1000, 250)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll([
    salary_employee,
    hourly_employee,
    commission_employee
])

You run the program and verify that it still works:

$ python program.py

Calculating Payroll
===================
Payroll for: 1 - John Smith
- Check amount: 1500

Payroll for: 2 - Jane Doe
- Check amount: 600

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

With everything in place, you start adding the new classes:

# In employees.py

class Manager(SalaryEmployee):
    def work(self, hours):
        print(f'{self.name} screams and yells for {hours} hours.')

class Secretary(SalaryEmployee):
    def work(self, hours):
        print(f'{self.name} expends {hours} hours doing office paperwork.')

class SalesPerson(CommissionEmployee):
    def work(self, hours):
        print(f'{self.name} expends {hours} hours on the phone.')

class FactoryWorker(HourlyEmployee):
    def work(self, hours):
        print(f'{self.name} manufactures gadgets for {hours} hours.')

First, you add a Manager class that derives from SalaryEmployee. The class exposes a method
work() that will be used by the productivity system. The method takes the hours the employee
worked.

Then you add Secretary, SalesPerson, and FactoryWorker and then implement the work()
interface, so they can be used by the productivity system.

Now, you can add the ProductivitySystem class:

# In productivity.py

class ProductivitySystem:
    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            employee.work(hours)
        print('')

The class tracks employees in the track() method that takes a list of employees and the number
of hours to track. You can now add the productivity system to your program:

# In program.py

import hr
import employees
import productivity

manager = employees.Manager(1, 'Mary Poppins', 3000)
secretary = employees.Secretary(2, 'John Smith', 1500)
sales_guy = employees.SalesPerson(3, 'Kevin Bacon', 1000, 250)
factory_worker = employees.FactoryWorker(4, 'Jane Doe', 40, 15)
# Use a distinct name so the list doesn't shadow the employees module.
company_employees = [
    manager,
    secretary,
    sales_guy,
    factory_worker,
]
productivity_system = productivity.ProductivitySystem()
productivity_system.track(company_employees, 40)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll(company_employees)

The program creates a list of employees of different types. The employee list is sent to the
productivity system to track their work for 40 hours. Then the same list of employees is sent to
the payroll system to calculate their payroll.

You can run the program to see the output:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins screams and yells for 40 hours.
John Smith expends 40 hours doing office paperwork.
Kevin Bacon expends 40 hours on the phone.
Jane Doe manufactures gadgets for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

The program shows the employees working for 40 hours through the productivity system. Then
it calculates and displays the payroll for each of the employees.

The program works as expected, but you had to add four new classes to support the changes. As
new requirements come, your class hierarchy will inevitably grow, leading to the class explosion
problem where your hierarchies will become so big that they’ll be hard to understand and
maintain.

The following diagram shows the new class hierarchy:

The diagram shows how the class hierarchy is growing. Additional requirements might have an
exponential effect in the number of classes with this design.
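A rough way to see the growth is to count combinations. The sketch below assumes the worst case, in which every role could be paired with every payroll policy and each pairing needs its own class. The generated class names are made up for illustration:

```python
from itertools import product

roles = ['Manager', 'Secretary', 'Sales', 'Factory']
payroll_policies = ['Salary', 'Hourly', 'Commission']

# One hypothetical class name per (role, policy) pairing.
combinations = [
    f'{policy}{role}' for role, policy in product(roles, payroll_policies)
]

print(len(combinations))  # 12
print(combinations[:3])   # ['SalaryManager', 'HourlyManager', 'CommissionManager']
```

Each new independent requirement multiplies the count again. A third axis with two options would already double it to 24.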

Inheriting Multiple Classes

Python is one of the few modern programming languages that supports multiple inheritance.
Multiple inheritance is the ability to derive a class from multiple base classes at the same time.

Multiple inheritance has a bad reputation to the extent that most modern programming languages
don’t support it. Instead, modern programming languages support the concept of interfaces. In
those languages, you inherit from a single base class and then implement multiple interfaces, so
your class can be re-used in different situations.

This approach puts some constraints in your designs. You can only inherit the implementation of
one class by directly deriving from it. You can implement multiple interfaces, but you can’t
inherit the implementation of multiple classes.

This constraint is good for software design because it forces you to design your classes with
fewer dependencies on each other. You will see later in this article that you can leverage multiple
implementations through composition, which makes software more flexible. This section,
however, is about multiple inheritance, so let’s take a look at how it works.

It turns out that sometimes temporary secretaries are hired when there is too much paperwork to
do. The TemporarySecretary class performs the role of a Secretary in the context of the
ProductivitySystem, but for payroll purposes, it is an HourlyEmployee.

You look at your class design. It has grown a little bit, but you can still understand how it works.
It seems you have two options:

1. Derive from Secretary: You can derive from Secretary to inherit the .work() method
for the role, and then override the .calculate_payroll() method to implement it as an
HourlyEmployee.
2. Derive from HourlyEmployee: You can derive from HourlyEmployee to inherit the
.calculate_payroll() method, and then override the .work() method to implement it
as a Secretary.

Then, you remember that Python supports multiple inheritance, so you decide to derive from
both Secretary and HourlyEmployee:

# In employees.py

class TemporarySecretary(Secretary, HourlyEmployee):
    pass

Python allows you to inherit from two different classes by listing them in parentheses
in the class declaration.
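As a quick illustration of the syntax only, here is a sketch with stripped-down stand-ins for the two base classes. They have no constructors and the payroll numbers are hard-coded, so this is not the employees module itself:

```python
class Secretary:
    def work(self, hours):
        return f'expends {hours} hours doing office paperwork.'


class HourlyEmployee:
    def calculate_payroll(self):
        return 40 * 9  # fixed numbers, for illustration only


class TemporarySecretary(Secretary, HourlyEmployee):
    pass


temp = TemporarySecretary()
print(temp.work(40))             # expends 40 hours doing office paperwork.
print(temp.calculate_payroll())  # 360
print([base.__name__ for base in TemporarySecretary.__bases__])
# ['Secretary', 'HourlyEmployee']
```

With bases this simple, the derived class just picks up one method from each parent. The real classes have constructors with different signatures, which is where things get more interesting.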

Now, you modify your program to add the new temporary secretary employee:

import hr
import employees
import productivity

manager = employees.Manager(1, 'Mary Poppins', 3000)
secretary = employees.Secretary(2, 'John Smith', 1500)
sales_guy = employees.SalesPerson(3, 'Kevin Bacon', 1000, 250)
factory_worker = employees.FactoryWorker(4, 'Jane Doe', 40, 15)
temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
company_employees = [
    manager,
    secretary,
    sales_guy,
    factory_worker,
    temporary_secretary,
]
productivity_system = productivity.ProductivitySystem()
productivity_system.track(company_employees, 40)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll(company_employees)

You run the program to test it:

$ python program.py

Traceback (most recent call last):
  File ".\program.py", line 9, in <module>
    temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
TypeError: __init__() takes 4 positional arguments but 5 were given

You get a TypeError exception saying that 4 positional arguments were expected, but 5 were
given.

This is because you derived TemporarySecretary first from Secretary and then from
HourlyEmployee, so the interpreter is trying to use Secretary.__init__() to initialize the
object.
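You can confirm which initializer Python picked with a small experiment. The sketch below rebuilds just enough of the hierarchy to be self-contained and uses inspect.signature() from the standard library to show the parameters the class expects:

```python
import inspect


class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name


class SalaryEmployee(Employee):
    def __init__(self, id, name, weekly_salary):
        super().__init__(id, name)
        self.weekly_salary = weekly_salary


class HourlyEmployee(Employee):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name)
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate


class Secretary(SalaryEmployee):
    pass


class TemporarySecretary(Secretary, HourlyEmployee):
    pass


# Secretary defines no __init__(), so the one from SalaryEmployee is found first.
print(TemporarySecretary.__init__ is SalaryEmployee.__init__)  # True
print(inspect.signature(TemporarySecretary))  # (id, name, weekly_salary)
```

Counting self, that initializer takes four positional arguments, which matches the error message.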

Okay, let’s reverse it:

class TemporarySecretary(HourlyEmployee, Secretary):
    pass

Now, run the program again and see what happens:

$ python program.py

Traceback (most recent call last):
  File ".\program.py", line 9, in <module>
    temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
  File "employees.py", line 16, in __init__
    super().__init__(id, name)
TypeError: __init__() missing 1 required positional argument: 'weekly_salary'

Now it seems you are missing a weekly_salary parameter, which is necessary to initialize
Secretary, but that parameter doesn’t make sense in the context of a TemporarySecretary
because it’s an HourlyEmployee.

Maybe implementing TemporarySecretary.__init__() will help:

# In employees.py

class TemporarySecretary(HourlyEmployee, Secretary):
    def __init__(self, id, name, hours_worked, hour_rate):
        super().__init__(id, name, hours_worked, hour_rate)

Try it:

$ python program.py

Traceback (most recent call last):
  File ".\program.py", line 9, in <module>
    temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
  File "employees.py", line 54, in __init__
    super().__init__(id, name, hours_worked, hour_rate)
  File "employees.py", line 16, in __init__
    super().__init__(id, name)
TypeError: __init__() missing 1 required positional argument: 'weekly_salary'

That didn’t work either. Okay, it’s time for you to dive into Python’s method resolution order
(MRO) to see what’s going on.

When a method or attribute of a class is accessed, Python uses the class MRO to find it. The
MRO is also used by super() to determine which method or attribute to invoke. You can learn
more about super() in Supercharge Your Classes With Python super().
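The key point is that super() doesn't mean "my parent class." It means "the next class in the MRO of the object's type." The following sketch uses hypothetical Base, Left, Right, and Child classes to show that the same super() call inside Left reaches different classes depending on the object that's running it:

```python
class Base:
    def describe(self):
        return ['Base']


class Left(Base):
    def describe(self):
        # The target of super() depends on the MRO of type(self).
        return ['Left'] + super().describe()


class Right(Base):
    def describe(self):
        return ['Right'] + super().describe()


class Child(Left, Right):
    def describe(self):
        return ['Child'] + super().describe()


print(Left().describe())   # ['Left', 'Base']
print(Child().describe())  # ['Child', 'Left', 'Right', 'Base']
```

For a plain Left object, super() in Left.describe() reaches Base. For a Child object, the very same call reaches Right, because Right follows Left in the MRO of Child.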

You can evaluate the TemporarySecretary class MRO using the interactive interpreter:

>>> from employees import TemporarySecretary
>>> TemporarySecretary.__mro__

(<class 'employees.TemporarySecretary'>,
<class 'employees.HourlyEmployee'>,
<class 'employees.Secretary'>,
<class 'employees.SalaryEmployee'>,
<class 'employees.Employee'>,
<class 'object'>
)

The MRO shows the order in which Python is going to look for a matching attribute or method.
In the example, this is what happens when you create the TemporarySecretary object:

1. The TemporarySecretary.__init__(self, id, name, hours_worked, hour_rate)
method is called.
2. The super().__init__(id, name, hours_worked, hour_rate) call matches
HourlyEmployee.__init__(self, id, name, hours_worked, hour_rate).
3. HourlyEmployee calls super().__init__(id, name), which the MRO is going to
match to Secretary.__init__(), which is inherited from
SalaryEmployee.__init__(self, id, name, weekly_salary).

Because the parameters don’t match, a TypeError exception is raised.

You can bypass the MRO by reversing the inheritance order and directly calling
HourlyEmployee.__init__() as follows:

class TemporarySecretary(Secretary, HourlyEmployee):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyEmployee.__init__(self, id, name, hours_worked, hour_rate)

That solves the problem of creating the object, but you will run into a similar problem when
trying to calculate payroll. You can run the program to see the problem:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins screams and yells for 40 hours.
John Smith expends 40 hours doing office paperwork.
Kevin Bacon expends 40 hours on the phone.
Jane Doe manufactures gadgets for 40 hours.
Robin Williams expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
Traceback (most recent call last):
  File ".\program.py", line 20, in <module>
    payroll_system.calculate_payroll(company_employees)
  File "hr.py", line 7, in calculate_payroll
    print(f'- Check amount: {employee.calculate_payroll()}')
  File "employees.py", line 12, in calculate_payroll
    return self.weekly_salary
AttributeError: 'TemporarySecretary' object has no attribute 'weekly_salary'

The problem now is that because you reversed the inheritance order, the MRO is finding the
.calculate_payroll() method of SalaryEmployee before the one in HourlyEmployee.
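When it isn't obvious which class supplies a method, you can walk the MRO yourself. The defining_class() helper below is a hypothetical utility, not something from the tutorial or the standard library, and the classes are trimmed down to the methods that matter here:

```python
def defining_class(cls, method_name):
    """Return the first class in the MRO of cls that defines method_name."""
    for klass in cls.__mro__:
        if method_name in vars(klass):
            return klass
    return None


class Employee:
    pass


class SalaryEmployee(Employee):
    def calculate_payroll(self):
        return self.weekly_salary


class HourlyEmployee(Employee):
    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate


class Secretary(SalaryEmployee):
    def work(self, hours):
        print(f'{self.name} expends {hours} hours doing office paperwork.')


class TemporarySecretary(Secretary, HourlyEmployee):
    pass


print(defining_class(TemporarySecretary, 'calculate_payroll').__name__)
# SalaryEmployee
print(defining_class(TemporarySecretary, 'work').__name__)
# Secretary
```

Because Secretary comes first in the list of bases, its whole branch, including SalaryEmployee, is searched before HourlyEmployee.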

You need to override .calculate_payroll() in TemporarySecretary and invoke the right
implementation from it:

class TemporarySecretary(Secretary, HourlyEmployee):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyEmployee.__init__(self, id, name, hours_worked, hour_rate)

    def calculate_payroll(self):
        return HourlyEmployee.calculate_payroll(self)

The calculate_payroll() method directly invokes HourlyEmployee.calculate_payroll()
to ensure that you get the correct result. You can run the program again to see it working:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins screams and yells for 40 hours.
John Smith expends 40 hours doing office paperwork.
Kevin Bacon expends 40 hours on the phone.
Jane Doe manufactures gadgets for 40 hours.
Robin Williams expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
- Check amount: 360

The program now works as expected because you’re forcing the method resolution order by
explicitly telling the interpreter which method you want to use.

As you can see, multiple inheritance can be confusing, especially when you run into the diamond
problem.

The following diagram shows the diamond problem in your class hierarchy:

The diagram shows the diamond problem with the current class design. TemporarySecretary
uses multiple inheritance to derive from two classes that ultimately also derive from Employee.
This causes two paths to reach the Employee base class, which is something you want to avoid in
your designs.

The diamond problem appears when you’re using multiple inheritance and deriving from two
classes that have a common base class. This can cause the wrong version of a method to be
called.

As you’ve seen, Python provides a way to force the right method to be invoked, and analyzing
the MRO can help you understand the problem.

Still, when you run into the diamond problem, it’s better to re-think the design. You will now
make some changes to leverage multiple inheritance, avoiding the diamond problem.

The Employee derived classes are used by two different systems:

1. The productivity system that tracks employee productivity.
2. The payroll system that calculates the employee payroll.

This means that everything related to productivity should be together in one module and
everything related to payroll should be together in another. You can start making changes to the
productivity module:

# In productivity.py

class ProductivitySystem:
    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            result = employee.work(hours)
            print(f'{employee.name}: {result}')
        print('')

class ManagerRole:
    def work(self, hours):
        return f'screams and yells for {hours} hours.'

class SecretaryRole:
    def work(self, hours):
        return f'expends {hours} hours doing office paperwork.'

class SalesRole:
    def work(self, hours):
        return f'expends {hours} hours on the phone.'

class FactoryRole:
    def work(self, hours):
        return f'manufactures gadgets for {hours} hours.'

The productivity module implements the ProductivitySystem class, as well as the related
roles it supports. The classes implement the work() interface required by the system, but they
don’t derive from Employee.

You can do the same with the hr module:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            print('')

class SalaryPolicy:
    def __init__(self, weekly_salary):
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyPolicy:
    def __init__(self, hours_worked, hour_rate):
        self.hours_worked = hours_worked
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionPolicy(SalaryPolicy):
    def __init__(self, weekly_salary, commission):
        super().__init__(weekly_salary)
        self.commission = commission

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

The hr module implements the PayrollSystem, which calculates payroll for the employees. It
also implements the policy classes for payroll. As you can see, the policy classes don’t derive
from Employee anymore.

You can now add the necessary classes to the employees module:

# In employees.py

from hr import (
    SalaryPolicy,
    CommissionPolicy,
    HourlyPolicy
)
from productivity import (
    ManagerRole,
    SecretaryRole,
    SalesRole,
    FactoryRole
)

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name

class Manager(Employee, ManagerRole, SalaryPolicy):
    def __init__(self, id, name, weekly_salary):
        SalaryPolicy.__init__(self, weekly_salary)
        super().__init__(id, name)

class Secretary(Employee, SecretaryRole, SalaryPolicy):
    def __init__(self, id, name, weekly_salary):
        SalaryPolicy.__init__(self, weekly_salary)
        super().__init__(id, name)

class SalesPerson(Employee, SalesRole, CommissionPolicy):
    def __init__(self, id, name, weekly_salary, commission):
        CommissionPolicy.__init__(self, weekly_salary, commission)
        super().__init__(id, name)

class FactoryWorker(Employee, FactoryRole, HourlyPolicy):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyPolicy.__init__(self, hours_worked, hour_rate)
        super().__init__(id, name)

class TemporarySecretary(Employee, SecretaryRole, HourlyPolicy):
    def __init__(self, id, name, hours_worked, hour_rate):
        HourlyPolicy.__init__(self, hours_worked, hour_rate)
        super().__init__(id, name)

The employees module imports policies and roles from the other modules and implements the
different Employee types. You are still using multiple inheritance to inherit the implementation
of the salary policy classes and the productivity roles, but the implementation of each class only
needs to deal with initialization.

Notice that you still need to explicitly initialize the salary policies in the constructors. You
probably saw that the initializations of Manager and Secretary are identical. Also, the
initializations of FactoryWorker and TemporarySecretary are the same.

You will not want to have this kind of code duplication in more complex designs, so you have to
be careful when designing class hierarchies.

Here’s the UML diagram for the new design:

The diagram shows the relationships to define the Secretary and TemporarySecretary using
multiple inheritance, but avoiding the diamond problem.
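You can also check for a diamond programmatically. The shared_ancestors() helper below is a hypothetical utility written for this sketch. It reports every class, other than object, that can be reached through more than one direct base. The class bodies are left empty because only the shape of the hierarchy matters:

```python
def shared_ancestors(cls):
    """Return names of classes (except object) reachable via 2+ direct bases."""
    reached_through = {}
    for base in cls.__bases__:
        for ancestor in base.__mro__:
            if ancestor is not object:
                reached_through.setdefault(ancestor, set()).add(base)
    return sorted(
        ancestor.__name__
        for ancestor, bases in reached_through.items()
        if len(bases) > 1
    )


# Earlier design: both bases lead back to Employee.
class Employee:
    pass


class SalaryEmployee(Employee):
    pass


class HourlyEmployee(Employee):
    pass


class Secretary(SalaryEmployee):
    pass


class OldTemporarySecretary(Secretary, HourlyEmployee):
    pass


# New design: roles and policies don't derive from Employee.
class SecretaryRole:
    pass


class HourlyPolicy:
    pass


class NewTemporarySecretary(Employee, SecretaryRole, HourlyPolicy):
    pass


print(shared_ancestors(OldTemporarySecretary))  # ['Employee']
print(shared_ancestors(NewTemporarySecretary))  # []
```

In the new design, the three bases only meet at object, so there is a single path to Employee.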

You can run the program and see how it works:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins: screams and yells for 40 hours.
John Smith: expends 40 hours doing office paperwork.
Kevin Bacon: expends 40 hours on the phone.
Jane Doe: manufactures gadgets for 40 hours.
Robin Williams: expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000

Payroll for: 2 - John Smith
- Check amount: 1500

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
- Check amount: 360

You’ve seen how inheritance and multiple inheritance work in Python. You can now explore the
topic of composition.

Composition in Python
Composition is an object-oriented design concept that models a has a relationship. In
composition, a class known as the composite contains an object of another class known as the
component. In other words, a composite class has a component of another class.

Composition allows composite classes to reuse the implementation of the components they
contain. The composite class doesn’t inherit the component class interface, but it can leverage
its implementation.

The composition relation between two classes is considered loosely coupled. That means that
changes to the component class rarely affect the composite class, and changes to the composite
class never affect the component class.

This provides better adaptability to change and allows applications to introduce new
requirements without affecting existing code.

When looking at two competing software designs, one based on inheritance and another based on
composition, the composition solution is usually the more flexible one. You can now look at how
composition works.

You’ve already used composition in our examples. If you look at the Employee class, you’ll see
that it contains two attributes:

1. id to identify an employee.
2. name to contain the name of the employee.

These two attributes are objects that the Employee class has. Therefore, you can say that an
Employee has an id and has a name.

Another attribute for an Employee might be an Address:


# In contacts.py

class Address:
    def __init__(self, street, city, state, zipcode, street2=''):
        self.street = street
        self.street2 = street2
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        lines = [self.street]
        if self.street2:
            lines.append(self.street2)
        lines.append(f'{self.city}, {self.state} {self.zipcode}')
        return '\n'.join(lines)

You implemented a basic address class that contains the usual components for an address. You
made the street2 attribute optional because not all addresses will have that component.

You implemented __str__() to provide a pretty representation of an Address. You can see this
implementation in the interactive interpreter:

>>> from contacts import Address
>>> address = Address('55 Main St.', 'Concord', 'NH', '03301')
>>> print(address)
55 Main St.
Concord, NH 03301

When you print() the address variable, the special method __str__() is invoked. Since you
implemented the method to return a string formatted as an address, you get a nice, readable
representation. Operator and Function Overloading in Custom Python Classes gives a good
overview of the special methods available in classes that can be implemented to customize the
behavior of your objects.

You can now add the Address to the Employee class through composition:

# In employees.py

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.address = None

You initialize the address attribute to None for now to make it optional, but you can now
assign an Address to an Employee. Also notice that there is no reference in the
employees module to the contacts module.

Composition is a loosely coupled relationship that often doesn’t require the composite class to
have knowledge of the component.
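
To illustrate the point, here is a minimal sketch in which Employee stores any object as its address and never inspects its type. The PoBox class is hypothetical and is not part of the contacts module; it only stands in for a component that Employee knows nothing about.

```python
# Sketch only: PoBox is a hypothetical component, not part of contacts.py.

class Employee:
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.address = None  # optional component, assigned later


class PoBox:
    """Hypothetical address-like component with its own representation."""

    def __init__(self, number, city, state, zipcode):
        self.number = number
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        return f'PO Box {self.number}\n{self.city}, {self.state} {self.zipcode}'


employee = Employee(1, 'Mary Poppins')
employee.address = PoBox(42, 'Concord', 'NH', '03301')

# Employee only holds the component; str() is answered by the component itself.
label = str(employee.address)
print(label)
```

Because Employee never imports or names the component class, swapping one address type for another requires no change to the employees module.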

The UML diagram representing the relationship between Employee and Address looks like this:

The diagram shows the basic composition relationship between Employee and Address.

You can now modify the PayrollSystem class to leverage the address attribute in Employee:

# In hr.py

class PayrollSystem:
    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            if employee.address:
                print('- Sent to:')
                print(employee.address)
            print('')

You check to see if the employee object has an address, and if it does, you print it. You can now
modify the program to assign some addresses to the employees:

# In program.py

import hr
import employees
import productivity
import contacts

manager = employees.Manager(1, 'Mary Poppins', 3000)
manager.address = contacts.Address(
    '121 Admin Rd',
    'Concord',
    'NH',
    '03301'
)
secretary = employees.Secretary(2, 'John Smith', 1500)
secretary.address = contacts.Address(
    '67 Paperwork Ave.',
    'Manchester',
    'NH',
    '03101'
)
sales_guy = employees.SalesPerson(3, 'Kevin Bacon', 1000, 250)
factory_worker = employees.FactoryWorker(4, 'Jane Doe', 40, 15)
temporary_secretary = employees.TemporarySecretary(5, 'Robin Williams', 40, 9)
employees = [
    manager,
    secretary,
    sales_guy,
    factory_worker,
    temporary_secretary,
]
productivity_system = productivity.ProductivitySystem()
productivity_system.track(employees, 40)
payroll_system = hr.PayrollSystem()
payroll_system.calculate_payroll(employees)

You added a couple of addresses to the manager and secretary objects. When you run the
program, you will see the addresses printed:

$ python program.py

Tracking Employee Productivity
==============================
Mary Poppins: screams and yells for 40 hours.
John Smith: expends 40 hours doing office paperwork.
Kevin Bacon: expends 40 hours on the phone.
Jane Doe: manufactures gadgets for 40 hours.
Robin Williams: expends 40 hours doing office paperwork.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave.
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1250

Payroll for: 4 - Jane Doe
- Check amount: 600

Payroll for: 5 - Robin Williams
- Check amount: 360

Notice how the payroll output for the manager and secretary objects shows the addresses where
the checks were sent.

The Employee class leverages the implementation of the Address class without any knowledge
of what an Address object is or how it’s represented. This type of design is so flexible that you
can change the Address class without any impact to the Employee class.

Flexible Designs With Composition

Composition is more flexible than inheritance because it models a loosely coupled relationship.
Changes to a component class have minimal or no effects on the composite class. Designs based
on composition adapt more easily to change.

You change behavior by providing new components that implement those behaviors instead of
adding new classes to your hierarchy.

Take a look at the multiple inheritance example above. Imagine how new payroll policies will
affect the design. Try to picture what the class hierarchy will look like if new roles are needed.
As you saw before, relying too heavily on inheritance can lead to class explosion.

The biggest problem is not so much the number of classes in your design, but how tightly
coupled the relationships between those classes are. Tightly coupled classes affect each other
when changes are introduced.

In this section, you are going to use composition to implement a better design that still fits the
requirements of the PayrollSystem and the ProductivitySystem.

You can start by implementing the functionality of the ProductivitySystem:

# In productivity.py

class ProductivitySystem:
    def __init__(self):
        self._roles = {
            'manager': ManagerRole,
            'secretary': SecretaryRole,
            'sales': SalesRole,
            'factory': FactoryRole,
        }

    def get_role(self, role_id):
        role_type = self._roles.get(role_id)
        if not role_type:
            # Report the identifier that could not be resolved
            raise ValueError(role_id)
        return role_type()

    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            employee.work(hours)
        print('')

The ProductivitySystem class defines some roles using a string identifier mapped to a role
class that implements the role. It exposes a .get_role() method that, given a role identifier,
returns a new instance of the matching role class. If the role is not found, then a ValueError exception is raised.

It also exposes the previous functionality in the .track() method, where given a list of
employees it tracks the productivity of those employees.
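
The lookup behavior can be sketched with a trimmed-down registry that holds a single role. This is only an illustration: the 'astronaut' identifier is made up, and passing the offending identifier to ValueError is an assumption of this sketch.

```python
# Sketch: a one-entry registry that shows both outcomes of .get_role().

class ManagerRole:
    def perform_duties(self, hours):
        return f'screams and yells for {hours} hours.'


class ProductivitySystem:
    def __init__(self):
        self._roles = {'manager': ManagerRole}

    def get_role(self, role_id):
        role_type = self._roles.get(role_id)
        if not role_type:
            raise ValueError(role_id)
        return role_type()  # a new role instance on every call


system = ProductivitySystem()
role = system.get_role('manager')
print(role.perform_duties(40))

try:
    system.get_role('astronaut')  # made-up identifier that is not registered
except ValueError as error:
    print(f'Unknown role: {error}')
```

Storing classes rather than instances in the dictionary means every employee receives its own role object.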

You can now implement the different role classes:

# In productivity.py

class ManagerRole:
    def perform_duties(self, hours):
        return f'screams and yells for {hours} hours.'

class SecretaryRole:
    def perform_duties(self, hours):
        return f'does paperwork for {hours} hours.'

class SalesRole:
    def perform_duties(self, hours):
        return f'expends {hours} hours on the phone.'

class FactoryRole:
    def perform_duties(self, hours):
        return f'manufactures gadgets for {hours} hours.'

Each of the roles you implemented exposes a .perform_duties() method that takes the number of
hours worked. The methods return a string representing the duties.

The role classes are independent of each other, but they expose the same interface, so they are
interchangeable. You’ll see later how they are used in the application.
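
As a quick sketch of what interchangeable means here, the function below works with any object that exposes .perform_duties(hours). InternRole and describe() are hypothetical names added for illustration; the other two classes mirror the roles above.

```python
# Sketch: InternRole and describe() are hypothetical, added for illustration.

class ManagerRole:
    def perform_duties(self, hours):
        return f'screams and yells for {hours} hours.'


class SecretaryRole:
    def perform_duties(self, hours):
        return f'does paperwork for {hours} hours.'


class InternRole:
    def perform_duties(self, hours):
        return f'shadows the team for {hours} hours.'


def describe(role, hours):
    # Relies only on the shared interface, not on a common base class.
    return role.perform_duties(hours)


roles = (ManagerRole(), SecretaryRole(), InternRole())
reports = [describe(role, 8) for role in roles]
for line in reports:
    print(line)
```

No role class inherits from another, yet each can be passed wherever a role is expected.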

Now, you can implement the PayrollSystem for the application:

# In hr.py

class PayrollSystem:
    def __init__(self):
        self._employee_policies = {
            1: SalaryPolicy(3000),
            2: SalaryPolicy(1500),
            3: CommissionPolicy(1000, 100),
            4: HourlyPolicy(15),
            5: HourlyPolicy(9)
        }

    def get_policy(self, employee_id):
        policy = self._employee_policies.get(employee_id)
        if not policy:
            # Raise, rather than return, the exception for unknown ids
            raise ValueError(employee_id)
        return policy

    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            if employee.address:
                print('- Sent to:')
                print(employee.address)
            print('')

The PayrollSystem keeps an internal database of payroll policies for each employee. It exposes
a .get_policy() that, given an employee id, returns its payroll policy. If a specified id doesn’t
exist in the system, then the method raises a ValueError exception.

The implementation of .calculate_payroll() works the same as before. It takes a list of
employees, calculates the payroll, and prints the results.

You can now implement the payroll policy classes:

# In hr.py

class PayrollPolicy:
    def __init__(self):
        self.hours_worked = 0

    def track_work(self, hours):
        self.hours_worked += hours

class SalaryPolicy(PayrollPolicy):
    def __init__(self, weekly_salary):
        super().__init__()
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary

class HourlyPolicy(PayrollPolicy):
    def __init__(self, hour_rate):
        super().__init__()
        self.hour_rate = hour_rate

    def calculate_payroll(self):
        return self.hours_worked * self.hour_rate

class CommissionPolicy(SalaryPolicy):
    def __init__(self, weekly_salary, commission_per_sale):
        super().__init__(weekly_salary)
        self.commission_per_sale = commission_per_sale

    @property
    def commission(self):
        sales = self.hours_worked / 5
        return sales * self.commission_per_sale

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission

You first implement a PayrollPolicy class that serves as a base class for all the payroll
policies. This class tracks the hours_worked, which is common to all payroll policies.

The other policy classes derive from PayrollPolicy. We use inheritance here because we want
to leverage the implementation of PayrollPolicy. Also, each of SalaryPolicy, HourlyPolicy, and
CommissionPolicy is a PayrollPolicy.

SalaryPolicy is initialized with a weekly_salary value that is then used in
.calculate_payroll(). HourlyPolicy is initialized with the hour_rate, and implements
.calculate_payroll() by leveraging the base class hours_worked.

The CommissionPolicy class derives from SalaryPolicy because it wants to inherit its
implementation. It is initialized with the weekly_salary parameter, but it also requires a
commission_per_sale parameter.

The commission_per_sale is used to calculate the .commission, which is implemented as a
property so it gets calculated when requested. In the example, we are assuming that a sale
happens every 5 hours worked, and the .commission is the number of sales times the
commission_per_sale value.

CommissionPolicy implements the .calculate_payroll() method by first leveraging the
implementation in SalaryPolicy and then adding the calculated commission.
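
The arithmetic can be traced with the values that the PayrollSystem assigns to employee 3, a weekly salary of 1000 and a commission of 100 per sale, over a 40-hour week. The sketch repeats the relevant policy classes so it runs on its own.

```python
# Worked example of the commission calculation for a 40-hour week.

class PayrollPolicy:
    def __init__(self):
        self.hours_worked = 0

    def track_work(self, hours):
        self.hours_worked += hours


class SalaryPolicy(PayrollPolicy):
    def __init__(self, weekly_salary):
        super().__init__()
        self.weekly_salary = weekly_salary

    def calculate_payroll(self):
        return self.weekly_salary


class CommissionPolicy(SalaryPolicy):
    def __init__(self, weekly_salary, commission_per_sale):
        super().__init__(weekly_salary)
        self.commission_per_sale = commission_per_sale

    @property
    def commission(self):
        sales = self.hours_worked / 5  # true division, so the result is a float
        return sales * self.commission_per_sale

    def calculate_payroll(self):
        fixed = super().calculate_payroll()
        return fixed + self.commission


policy = CommissionPolicy(1000, 100)
policy.track_work(40)

# 40 hours / 5 hours per sale = 8.0 sales
# 8.0 sales * 100 per sale   = 800.0 commission
# 1000 fixed + 800.0         = 1800.0
total = policy.calculate_payroll()
print(total)
```

Because the number of sales comes from a true division, the total is a float even when the commission is a whole number.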

You can now add an AddressBook class to manage employee addresses:

# In contacts.py

class AddressBook:
    def __init__(self):
        self._employee_addresses = {
            1: Address('121 Admin Rd.', 'Concord', 'NH', '03301'),
            2: Address('67 Paperwork Ave', 'Manchester', 'NH', '03101'),
            3: Address('15 Rose St', 'Concord', 'NH', '03301', 'Apt. B-1'),
            4: Address('39 Sole St.', 'Concord', 'NH', '03301'),
            5: Address('99 Mountain Rd.', 'Concord', 'NH', '03301'),
        }

    def get_employee_address(self, employee_id):
        address = self._employee_addresses.get(employee_id)
        if not address:
            raise ValueError(employee_id)
        return address

The AddressBook class keeps an internal database of Address objects for each employee. It
exposes a get_employee_address() method that returns the address of the specified employee
id. If the employee id doesn’t exist, then it raises a ValueError.

The Address class implementation remains the same as before:

# In contacts.py

class Address:
    def __init__(self, street, city, state, zipcode, street2=''):
        self.street = street
        self.street2 = street2
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        lines = [self.street]
        if self.street2:
            lines.append(self.street2)
        lines.append(f'{self.city}, {self.state} {self.zipcode}')
        return '\n'.join(lines)

The class manages the address components and provides a pretty representation of an address.

So far, the new classes have been extended to support more functionality, but there are no
significant changes to the previous design. This is going to change with the design of the
employees module and its classes.

You can start by implementing an EmployeeDatabase class:

# In employees.py

from productivity import ProductivitySystem
from hr import PayrollSystem
from contacts import AddressBook

class EmployeeDatabase:
    def __init__(self):
        self._employees = [
            {
                'id': 1,
                'name': 'Mary Poppins',
                'role': 'manager'
            },
            {
                'id': 2,
                'name': 'John Smith',
                'role': 'secretary'
            },
            {
                'id': 3,
                'name': 'Kevin Bacon',
                'role': 'sales'
            },
            {
                'id': 4,
                'name': 'Jane Doe',
                'role': 'factory'
            },
            {
                'id': 5,
                'name': 'Robin Williams',
                'role': 'secretary'
            },
        ]
        self.productivity = ProductivitySystem()
        self.payroll = PayrollSystem()
        self.employee_addresses = AddressBook()

    @property
    def employees(self):
        return [self._create_employee(**data) for data in self._employees]

    def _create_employee(self, id, name, role):
        address = self.employee_addresses.get_employee_address(id)
        employee_role = self.productivity.get_role(role)
        payroll_policy = self.payroll.get_policy(id)
        return Employee(id, name, address, employee_role, payroll_policy)

The EmployeeDatabase keeps track of all the employees in the company. For each employee, it
tracks the id, name, and role. It has an instance of the ProductivitySystem, the
PayrollSystem, and the AddressBook. These instances are used to create employees.

It exposes an .employees property that returns the list of employees. The Employee objects are
created in an internal method ._create_employee(). Notice that you don’t have different types
of Employee classes. You just need to implement a single Employee class:

# In employees.py

class Employee:
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self.role = role
        self.payroll = payroll

    def work(self, hours):
        duties = self.role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self.payroll.track_work(hours)

    def calculate_payroll(self):
        return self.payroll.calculate_payroll()

The Employee class is initialized with the id, name, and address attributes. It also requires the
productivity role for the employee and the payroll policy.

The class exposes a .work() method that takes the hours worked. This method first retrieves the
duties from the role. In other words, it delegates to the role object to perform its duties.

In the same way, it delegates to the payroll object to track the work hours. The payroll, as
you saw, uses those hours to calculate the payroll if needed.
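
One way to see the delegation at work is to hand Employee stand-in components. In the sketch below, StubRole and FlatPolicy are hypothetical classes that only implement the methods Employee calls; they are not part of the tutorial's modules.

```python
# Sketch: StubRole and FlatPolicy are hypothetical stand-ins for real components.

class Employee:
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self.role = role
        self.payroll = payroll

    def work(self, hours):
        duties = self.role.perform_duties(hours)  # delegated to the role
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self.payroll.track_work(hours)  # delegated to the payroll policy

    def calculate_payroll(self):
        return self.payroll.calculate_payroll()


class StubRole:
    def perform_duties(self, hours):
        return f'tests the system for {hours} hours.'


class FlatPolicy:
    def __init__(self, amount):
        self.amount = amount
        self.hours_worked = 0

    def track_work(self, hours):
        self.hours_worked += hours

    def calculate_payroll(self):
        return self.amount


policy = FlatPolicy(500)
employee = Employee(99, 'Test User', None, StubRole(), policy)
employee.work(8)
print(employee.calculate_payroll())
```

Employee contains no payroll or productivity logic of its own, which is why the components can be replaced without touching the class.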

The following diagram shows the composition design used:

The diagram shows the design of composition-based policies. There is a single Employee that is
composed of other data objects like Address and depends on the IRole and
IPayrollCalculator interfaces to delegate the work. There are multiple implementations of
these interfaces.
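
The tutorial's code relies on duck typing and never declares IRole or IPayrollCalculator. If you wanted to spell those interfaces out, one possible sketch uses typing.Protocol from the standard library. The protocol definitions below are an assumption based on the methods Employee calls, not code from the tutorial.

```python
from typing import Protocol, runtime_checkable


# Assumed interface definitions, inferred from the methods Employee calls.
@runtime_checkable
class IRole(Protocol):
    def perform_duties(self, hours): ...


@runtime_checkable
class IPayrollCalculator(Protocol):
    def track_work(self, hours): ...

    def calculate_payroll(self): ...


class ManagerRole:
    def perform_duties(self, hours):
        return f'screams and yells for {hours} hours.'


class SalaryPolicy:
    def __init__(self, weekly_salary):
        self.weekly_salary = weekly_salary
        self.hours_worked = 0

    def track_work(self, hours):
        self.hours_worked += hours

    def calculate_payroll(self):
        return self.weekly_salary


# Neither class inherits from a protocol; matching method names are enough.
role_ok = isinstance(ManagerRole(), IRole)
policy_ok = isinstance(SalaryPolicy(3000), IPayrollCalculator)
mismatch = isinstance(ManagerRole(), IPayrollCalculator)
print(role_ok, policy_ok, mismatch)
```

A runtime isinstance() check against a protocol only verifies that the methods exist, not their signatures, so it documents the interface rather than enforcing it.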

You can now use this design in your program:

# In program.py

from hr import PayrollSystem
from productivity import ProductivitySystem
from employees import EmployeeDatabase

productivity_system = ProductivitySystem()
payroll_system = PayrollSystem()
employee_database = EmployeeDatabase()
employees = employee_database.employees
productivity_system.track(employees, 40)
payroll_system.calculate_payroll(employees)

You can run the program to see its output:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1800.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

This design is what is called policy-based design, where classes are composed of policies, and
they delegate to those policies to do the work.

Policy-based design was introduced in the book Modern C++ Design, and it uses template
metaprogramming in C++ to achieve the results.

Python does not support templates, but you can achieve similar results using composition, as you
saw in the example above.

This type of design gives you all the flexibility you’ll need as requirements change. Imagine you
need to change the way payroll is calculated for an object at run-time.

Customizing Behavior With Composition

If your design relies on inheritance, you need to find a way to change the type of an object to
change its behavior. With composition, you just need to change the policy the object uses.

Imagine that our manager all of a sudden becomes a temporary employee that gets paid by the
hour. You can modify the object during the execution of the program in the following way:

# In program.py

from hr import PayrollSystem, HourlyPolicy
from productivity import ProductivitySystem
from employees import EmployeeDatabase

productivity_system = ProductivitySystem()
payroll_system = PayrollSystem()
employee_database = EmployeeDatabase()
employees = employee_database.employees
manager = employees[0]
manager.payroll = HourlyPolicy(55)

productivity_system.track(employees, 40)
payroll_system.calculate_payroll(employees)

The program gets the employee list from the EmployeeDatabase and retrieves the first
employee, which is the manager we want. Then it creates a new HourlyPolicy initialized at $55
per hour and assigns it to the manager object.

The PayrollSystem now picks up the new policy through the manager object, which changes the
existing behavior. You can run the program again to see the result:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 2200
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1800.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

The check for Mary Poppins, our manager, is now for $2200 instead of the fixed salary of $3000
that she had per week.

Notice how we added that business rule to the program without changing any of the existing
classes. Consider what type of changes would’ve been required with an inheritance design.

You would’ve had to create a new class and change the type of the manager employee. There is
no chance you could’ve changed the policy at run-time.

Choosing Between Inheritance and Composition in Python


So far, you’ve seen how inheritance and composition work in Python. You’ve seen that derived
classes inherit the interface and implementation of their base classes. You’ve also seen that
composition allows you to reuse the implementation of another class.

You’ve implemented two solutions to the same problem. The first solution used multiple
inheritance, and the second one used composition.

You’ve also seen that Python’s duck typing allows you to reuse objects with existing parts of a
program by implementing the desired interface. In Python, it isn’t necessary to derive from a
base class for your classes to be reused.

At this point, you might be asking when to use inheritance vs composition in Python. They both
enable code reuse. Inheritance and composition can tackle similar problems in your Python
programs.

The general advice is to use the relationship that creates fewer dependencies between two
classes. This relation is composition. Still, there will be times when inheritance makes more
sense.

The following sections provide some guidelines to help you make the right choice between
inheritance and composition in Python.

Inheritance to Model “Is A” Relationship

Inheritance should only be used to model an is a relationship. Liskov’s substitution principle
says that an object of type Derived, which inherits from Base, can replace an object of type
Base without altering the desirable properties of a program.

Liskov’s substitution principle is the most important guideline to determine if inheritance is the
appropriate design solution. Still, the answer might not be straightforward in all situations.

Fortunately, there is a simple test you can use to determine if your design follows Liskov’s
substitution principle.

Let’s say you have a class A that provides an implementation and interface you want to reuse in
another class B. Your initial thought is that you can derive B from A and inherit both the interface
and implementation. To be sure this is the right design, you follow these steps:

1. Evaluate B is an A: Think about this relationship and justify it. Does it make sense?
2. Evaluate A is a B: Reverse the relationship and justify it. Does it also make sense?

If you can justify both relationships, then you should never inherit those classes from one
another. Let’s look at a more concrete example.

You have a class Rectangle which exposes an .area property. You need a class Square, which
also has an .area. It seems that a Square is a special type of Rectangle, so maybe you can
derive from it and leverage both the interface and implementation.

Before you jump into the implementation, you use Liskov’s substitution principle to evaluate the
relationship.

A Square is a Rectangle because its area is calculated from the product of its height and its
length. The constraint is that Square.height and Square.length must be equal.

It makes sense. You can justify the relationship and explain why a Square is a Rectangle. Let’s
reverse the relationship to see if it makes sense.

A Rectangle is a Square because its area is calculated from the product of its height and its
length. The difference is that Rectangle.height and Rectangle.length can change
independently.

It also makes sense. You can justify the relationship and describe the special constraints for each
class. This is a good sign that these two classes should never derive from each other.

You might have seen other examples that derive Square from Rectangle to explain inheritance.
You might be skeptical of the little test you just did. Fair enough. Let’s write a program that
illustrates the problem with deriving Square from Rectangle.

First, you implement Rectangle. You’re even going to encapsulate the attributes to ensure that
all the constraints are met:

# In rectangle_square_demo.py

class Rectangle:
    def __init__(self, length, height):
        self._length = length
        self._height = height

    @property
    def area(self):
        return self._length * self._height

The Rectangle class is initialized with a length and a height, and it provides an .area
property that returns the area. The length and height are encapsulated to avoid changing them
directly.

Now, you derive Square from Rectangle and override the necessary interface to meet the
constraints of a Square:

# In rectangle_square_demo.py

class Square(Rectangle):
    def __init__(self, side_size):
        super().__init__(side_size, side_size)

The Square class is initialized with a side_size, which is used to initialize both components of
the base class. Now, you write a small program to test the behavior:

# In rectangle_square_demo.py

rectangle = Rectangle(2, 4)
assert rectangle.area == 8

square = Square(2)
assert square.area == 4

print('OK!')

The program creates a Rectangle and a Square and asserts that their .area is calculated
correctly. You can run the program and see that everything is OK so far:

$ python rectangle_square_demo.py

OK!

The program executes correctly, so it seems that Square is just a special case of a Rectangle.

Later on, you need to support resizing Rectangle objects, so you make the appropriate changes
to the class:

# In rectangle_square_demo.py

class Rectangle:
    def __init__(self, length, height):
        self._length = length
        self._height = height

    @property
    def area(self):
        return self._length * self._height

    def resize(self, new_length, new_height):
        self._length = new_length
        self._height = new_height

.resize() takes the new_length and new_height for the object. You can add the following code
to the program to verify that it works correctly:

# In rectangle_square_demo.py

rectangle.resize(3, 5)
assert rectangle.area == 15

print('OK!')

You resize the rectangle object and assert that the new area is correct. You can run the program
to verify the behavior:

$ python rectangle_square_demo.py

OK!

The assertion passes, and you see that the program runs correctly.

So, what happens if you resize a square? Modify the program, and try to modify the square
object:

# In rectangle_square_demo.py

square.resize(3, 5)
print(f'Square area: {square.area}')

You pass the same parameters to square.resize() that you used with rectangle, and print the
area. When you run the program you see:

$ python rectangle_square_demo.py

Square area: 15
OK!

The program shows that the new area is 15 like the rectangle object. The problem now is that
the square object no longer meets the Square class constraint that the length and height must
be equal.

How can you fix that problem? You can try several approaches, but all of them will be awkward.
You can override .resize() in Square and ignore the height parameter, but that will confuse
people looking at other parts of the program, where rectangles are being resized and some of
them don’t get the expected areas because they are really squares.

In a small program like this one, it might be easy to spot the causes of the weird behavior, but in
a more complex program, the problem will be harder to find.

The reality is that if you’re able to justify an inheritance relationship between two classes both
ways, you should not derive one class from another.

In the example, it doesn’t make sense that Square inherits the interface and implementation of
.resize() from Rectangle. That doesn’t mean that Square objects can’t be resized. It means
that the interface is different because it only needs a side_size parameter.

This difference in interface justifies not deriving Square from Rectangle like the test above
advised.
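
As a rough sketch of that different interface, here is one possible Square that stands on its own instead of deriving from Rectangle. The new_side_size parameter name is an assumption of this sketch.

```python
# Sketch: one possible Square with its own interface, not derived from Rectangle.

class Square:
    def __init__(self, side_size):
        self._side_size = side_size

    @property
    def area(self):
        return self._side_size ** 2

    def resize(self, new_side_size):
        # A single parameter, so length and height can never diverge.
        self._side_size = new_side_size


square = Square(2)
assert square.area == 4

square.resize(3)
print(f'Square area: {square.area}')
```

With this interface, a call like square.resize(3, 5) fails with a TypeError instead of silently producing a shape that is no longer a square.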

Mixing Features With Mixin Classes

One of the uses of multiple inheritance in Python is to extend a class’s features through mixins. A
mixin is a class that provides methods to other classes but is not considered a base class.

A mixin allows other classes to reuse its interface and implementation without becoming a super
class. Mixins implement a unique behavior that can be aggregated to other unrelated classes. They
are similar to composition, but they create a stronger relationship.

Let’s say you want to convert objects of certain types in your application to a dictionary
representation of the object. You could provide a .to_dict() method in every class that you
want to support this feature, but the implementation of .to_dict() seems to be very similar.

This could be a good candidate for a mixin. You start by slightly modifying the Employee class
from the composition example:

# In employees.py

class Employee:
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self._role = role
        self._payroll = payroll

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

The change is very small. You just changed the role and payroll attributes to be internal by
adding a leading underscore to their name. You will see soon why you are making that change.

Now, you add the AsDictionaryMixin class:

# In representations.py

class AsDictionaryMixin:
    def to_dict(self):
        return {
            prop: self._represent(value)
            for prop, value in self.__dict__.items()
            if not self._is_internal(prop)
        }

    def _represent(self, value):
        if isinstance(value, object):
            if hasattr(value, 'to_dict'):
                return value.to_dict()
            else:
                return str(value)
        else:
            return value

    def _is_internal(self, prop):
        return prop.startswith('_')

The AsDictionaryMixin class exposes a .to_dict() method that returns the representation of
itself as a dictionary. The method is implemented as a dict comprehension that says, “Create a
dictionary mapping prop to value for each item in self.__dict__.items() if the prop is not
internal.”

Note: This is why we made the role and payroll attributes internal in the Employee class,
because we don’t want to represent them in the dictionary.

As you saw at the beginning, creating a class inherits some members from object, and one of
those members is __dict__, which is basically a mapping of all the attributes in an object to
their value.

You iterate through all the items in __dict__ and filter out the ones that have a name that starts
with an underscore using ._is_internal().

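._represent() checks the specified value. If the value is an object, then it looks to see if it
also has a .to_dict() member and uses it to represent the object. Otherwise, it returns a string
representation. If the value is not an object, then it simply returns the value. Note that every
value in Python is an instance of object, so the first branch always runs, and values without a
.to_dict() member, including numbers, are represented as strings.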
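
The short check below confirms how isinstance(value, object) behaves for a few built-in values. It is followed by a hypothetical variant, named represent() here, that passes JSON-friendly values through unchanged. The variant is only a sketch of an alternative and is not the tutorial's implementation.

```python
# Every Python value is an instance of object, including ints, strings, and None.
checks = [isinstance(value, object) for value in (1, 'text', None, 3.5)]
print(checks)


# Hypothetical variant of _represent(); not the tutorial's code.
def represent(value):
    if hasattr(value, 'to_dict'):
        return value.to_dict()
    if isinstance(value, (str, int, float, bool, type(None))):
        return value  # JSON-friendly values are passed through unchanged
    return str(value)


print(repr(represent(1)))
print(repr(represent('Mary Poppins')))
```

With a variant like this, an integer id would stay an integer in the resulting dictionary rather than being converted to a string.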

You can modify the Employee class to support this mixin:

# In employees.py

from representations import AsDictionaryMixin

class Employee(AsDictionaryMixin):
    def __init__(self, id, name, address, role, payroll):
        self.id = id
        self.name = name
        self.address = address
        self._role = role
        self._payroll = payroll

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

All you have to do is inherit the AsDictionaryMixin to support the functionality. It would be nice
to support the same functionality in the Address class, so the Employee.address attribute is
represented in the same way:

# In contacts.py

from representations import AsDictionaryMixin

class Address(AsDictionaryMixin):
    def __init__(self, street, city, state, zipcode, street2=''):
        self.street = street
        self.street2 = street2
        self.city = city
        self.state = state
        self.zipcode = zipcode

    def __str__(self):
        lines = [self.street]
        if self.street2:
            lines.append(self.street2)
        lines.append(f'{self.city}, {self.state} {self.zipcode}')
        return '\n'.join(lines)

You apply the mixin to the Address class to support the feature. Now, you can write a small
program to test it:

# In program.py

import json
from employees import EmployeeDatabase

def print_dict(d):
    print(json.dumps(d, indent=2))

for employee in EmployeeDatabase().employees:
    print_dict(employee.to_dict())

The program implements a print_dict() that converts the dictionary to a JSON string using
indentation so the output looks better.

Then, it iterates through all the employees, printing the dictionary representation provided by
.to_dict(). You can run the program to see its output:

$ python program.py

{
  "id": "1",
  "name": "Mary Poppins",
  "address": {
    "street": "121 Admin Rd.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}
{
  "id": "2",
  "name": "John Smith",
  "address": {
    "street": "67 Paperwork Ave",
    "street2": "",
    "city": "Manchester",
    "state": "NH",
    "zipcode": "03101"
  }
}
{
  "id": "3",
  "name": "Kevin Bacon",
  "address": {
    "street": "15 Rose St",
    "street2": "Apt. B-1",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}
{
  "id": "4",
  "name": "Jane Doe",
  "address": {
    "street": "39 Sole St.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}
{
  "id": "5",
  "name": "Robin Williams",
  "address": {
    "street": "99 Mountain Rd.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}

You leveraged the implementation of AsDictionaryMixin in both the Employee and Address classes,
even though they are not related. Because AsDictionaryMixin only provides behavior, it is easy to
reuse with other classes without causing problems.

Composition to Model “Has A” Relationship

Composition models a has a relationship. With composition, a class Composite has an instance
of class Component and can leverage its implementation. The Component class can be reused in
other classes completely unrelated to the Composite.

In the composition example above, the Employee class has an Address object. Address
implements all the functionality to handle addresses, and it can be reused by other classes.

Other classes like Customer or Vendor can reuse Address without being related to Employee.
They can leverage the same implementation ensuring that addresses are handled consistently
across the application.

A problem you may run into when using composition is that some of your classes may start
growing by using multiple components. Your classes may require multiple parameters in the
constructor just to pass in the components they are made of. This can make your classes hard to
use.

A way to avoid the problem is by using the Factory Method to construct your objects. You did
that with the composition example.

If you look at the implementation of the EmployeeDatabase class, you’ll notice that it uses
._create_employee() to construct an Employee object with the right parameters.

This design will work, but ideally, you should be able to construct an Employee object just by
specifying an id, for example employee = Employee(1).

The following changes might improve your design. You can start with the productivity
module:

# In productivity.py

class _ProductivitySystem:
    def __init__(self):
        self._roles = {
            'manager': ManagerRole,
            'secretary': SecretaryRole,
            'sales': SalesRole,
            'factory': FactoryRole,
        }

    def get_role(self, role_id):
        role_type = self._roles.get(role_id)
        if not role_type:
            # Report the offending value rather than the literal string 'role_id'
            raise ValueError(role_id)
        return role_type()

    def track(self, employees, hours):
        print('Tracking Employee Productivity')
        print('==============================')
        for employee in employees:
            employee.work(hours)
        print('')

# Role classes implementation omitted

_productivity_system = _ProductivitySystem()

def get_role(role_id):
    return _productivity_system.get_role(role_id)

def track(employees, hours):
    _productivity_system.track(employees, hours)

First, you make the _ProductivitySystem class internal, and then provide a
_productivity_system internal variable to the module. You are communicating to other
developers that they should not create or use the _ProductivitySystem directly. Instead, you
provide two functions, get_role() and track(), as the public interface to the module. This is
what other modules should use.

What you are saying is that the _ProductivitySystem is a Singleton, and there should only be
one object created from it.

Now, you can do the same with the hr module:

# In hr.py

class _PayrollSystem:
    def __init__(self):
        self._employee_policies = {
            1: SalaryPolicy(3000),
            2: SalaryPolicy(1500),
            3: CommissionPolicy(1000, 100),
            4: HourlyPolicy(15),
            5: HourlyPolicy(9)
        }

    def get_policy(self, employee_id):
        policy = self._employee_policies.get(employee_id)
        if not policy:
            # Raise the error instead of returning it to the caller
            raise ValueError(employee_id)
        return policy

    def calculate_payroll(self, employees):
        print('Calculating Payroll')
        print('===================')
        for employee in employees:
            print(f'Payroll for: {employee.id} - {employee.name}')
            print(f'- Check amount: {employee.calculate_payroll()}')
            if employee.address:
                print('- Sent to:')
                print(employee.address)
            print('')

# Policy classes implementation omitted

_payroll_system = _PayrollSystem()

def get_policy(employee_id):
    return _payroll_system.get_policy(employee_id)

def calculate_payroll(employees):
    _payroll_system.calculate_payroll(employees)

Again, you make the _PayrollSystem internal and provide a public interface to it. The
application will use the public interface to get policies and calculate payroll.

You will now do the same with the contacts module:

# In contacts.py

class _AddressBook:
    def __init__(self):
        self._employee_addresses = {
            1: Address('121 Admin Rd.', 'Concord', 'NH', '03301'),
            2: Address('67 Paperwork Ave', 'Manchester', 'NH', '03101'),
            3: Address('15 Rose St', 'Concord', 'NH', '03301', 'Apt. B-1'),
            4: Address('39 Sole St.', 'Concord', 'NH', '03301'),
            5: Address('99 Mountain Rd.', 'Concord', 'NH', '03301'),
        }

    def get_employee_address(self, employee_id):
        address = self._employee_addresses.get(employee_id)
        if not address:
            raise ValueError(employee_id)
        return address

# Implementation of Address class omitted

_address_book = _AddressBook()

def get_employee_address(employee_id):
    return _address_book.get_employee_address(employee_id)

You are basically saying that there should only be one _AddressBook, one _PayrollSystem,
and one _ProductivitySystem. Again, this design pattern is called the Singleton design pattern,
which comes in handy for classes from which there should only be one, single instance.
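The module-level approach used in these three modules can be sketched in a single file. This is only an illustration under stated assumptions: _Registry, its contents, and get_item() are hypothetical names standing in for classes such as _AddressBook and their public functions.

```python
class _Registry:
    """Internal class: other modules are expected not to instantiate it."""

    def __init__(self):
        # Hypothetical data, standing in for addresses, policies, or roles.
        self._items = {1: "first", 2: "second"}

    def get_item(self, item_id):
        item = self._items.get(item_id)
        if item is None:
            raise ValueError(item_id)
        return item


# The one shared instance, created when the module is first imported.
_registry = _Registry()


def get_item(item_id):
    # Public function that delegates to the shared instance.
    return _registry.get_item(item_id)


print(get_item(1))
```

Note that the leading underscore is only a naming convention. Nothing prevents another module from calling _Registry() again, so this is a Singleton by agreement rather than by enforcement.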

Now, you can work on the employees module. You will also make a Singleton out of the
_EmployeeDatabase, but you will make some additional changes:

# In employees.py

from productivity import get_role
from hr import get_policy
from contacts import get_employee_address
from representations import AsDictionaryMixin

class _EmployeeDatabase:
    def __init__(self):
        self._employees = {
            1: {
                'name': 'Mary Poppins',
                'role': 'manager'
            },
            2: {
                'name': 'John Smith',
                'role': 'secretary'
            },
            3: {
                'name': 'Kevin Bacon',
                'role': 'sales'
            },
            4: {
                'name': 'Jane Doe',
                'role': 'factory'
            },
            5: {
                'name': 'Robin Williams',
                'role': 'secretary'
            }
        }

    @property
    def employees(self):
        return [Employee(id_) for id_ in sorted(self._employees)]

    def get_employee_info(self, employee_id):
        info = self._employees.get(employee_id)
        if not info:
            raise ValueError(employee_id)
        return info

class Employee(AsDictionaryMixin):
    def __init__(self, id):
        self.id = id
        info = employee_database.get_employee_info(self.id)
        self.name = info.get('name')
        self.address = get_employee_address(self.id)
        self._role = get_role(info.get('role'))
        self._payroll = get_policy(self.id)

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

employee_database = _EmployeeDatabase()

You first import the relevant functions and classes from other modules. The
_EmployeeDatabase is made internal, and at the bottom, you create a single instance. This
instance is public and part of the interface because you will want to use it in the application.

You changed the _EmployeeDatabase._employees attribute to be a dictionary where the key is the
employee id and the value is the employee information. You also exposed a
.get_employee_info() method to return the information for the specified employee_id.

The _EmployeeDatabase.employees property now sorts the keys to return the employees sorted
by their id. You replaced the method that constructed the Employee objects with calls to the
Employee initializer directly.

The Employee class now is initialized with the id and uses the public functions exposed in the
other modules to initialize its attributes.

You can now change the program to test the changes:

# In program.py

import json

from hr import calculate_payroll
from productivity import track
from employees import employee_database, Employee

def print_dict(d):
    print(json.dumps(d, indent=2))

employees = employee_database.employees

track(employees, 40)
calculate_payroll(employees)

temp_secretary = Employee(5)
print('Temporary Secretary:')
print_dict(temp_secretary.to_dict())

You import the relevant functions from the hr and productivity modules, as well as the
employee_database and Employee class. The program is cleaner because you exposed the
required interface and encapsulated how objects are accessed.

Notice that you can now create an Employee object directly just using its id. You can run the
program to see its output:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1800.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

Temporary Secretary:
{
  "id": "5",
  "name": "Robin Williams",
  "address": {
    "street": "99 Mountain Rd.",
    "street2": "",
    "city": "Concord",
    "state": "NH",
    "zipcode": "03301"
  }
}

The program works the same as before, but now you can see that a single Employee object can
be created from its id and display its dictionary representation.

Take a closer look at the Employee class:

# In employees.py

class Employee(AsDictionaryMixin):
    def __init__(self, id):
        self.id = id
        info = employee_database.get_employee_info(self.id)
        self.name = info.get('name')
        self.address = get_employee_address(self.id)
        self._role = get_role(info.get('role'))
        self._payroll = get_policy(self.id)

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

The Employee class is a composite that contains multiple objects providing different
functionality. It contains an Address that implements all the functionality related to where the
employee lives.

Employee also contains a productivity role provided by the productivity module, and a payroll
policy provided by the hr module. These two objects provide implementations that are leveraged
by the Employee class to track work in the .work() method and to calculate the payroll in the
.calculate_payroll() method.

You are using composition in two different ways. The Address class provides additional data to
Employee, whereas the role and payroll objects provide additional behavior.

Still, the relationship between Employee and those objects is loosely coupled, which provides
some interesting capabilities that you’ll see in the next section.

Composition to Change Run-Time Behavior

Inheritance, as opposed to composition, is a tightly coupled relationship. With inheritance, there
is only one way to change and customize behavior. Method overriding is the only way to customize
the behavior of a base class. This creates rigid designs that are difficult to change.

Composition, on the other hand, provides a loosely coupled relationship that enables flexible
designs and can be used to change behavior at run-time.

Imagine you need to support a long-term disability (LTD) policy when calculating payroll. The
policy states that an employee on LTD should be paid 60% of their weekly salary assuming 40
hours of work.

With an inheritance design, this can be a very difficult requirement to support. Adding it to the
composition example is a lot easier. Let’s start by adding the policy class:

# In hr.py

class LTDPolicy:
    def __init__(self):
        self._base_policy = None

    def track_work(self, hours):
        self._check_base_policy()
        return self._base_policy.track_work(hours)

    def calculate_payroll(self):
        self._check_base_policy()
        base_salary = self._base_policy.calculate_payroll()
        return base_salary * 0.6

    def apply_to_policy(self, base_policy):
        self._base_policy = base_policy

    def _check_base_policy(self):
        if not self._base_policy:
            raise RuntimeError('Base policy missing')

Notice that LTDPolicy doesn’t inherit PayrollPolicy, but implements the same interface. This
is because the implementation is completely different, so we don’t want to inherit any of the
PayrollPolicy implementation.

The LTDPolicy initializes _base_policy to None, and provides an internal
._check_base_policy() method that raises an exception if the ._base_policy has not been
applied. Then, it provides an .apply_to_policy() method to assign the _base_policy.

The public interface first checks that the _base_policy has been applied, and then implements
the functionality in terms of that base policy. The .track_work() method just delegates to the
base policy, and .calculate_payroll() uses it to calculate the base_salary and then return
the 60%.
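To see the delegation on its own, here is a small self-contained sketch. LTDPolicy follows the listing above, while FixedPolicy is a hypothetical stand-in for the policy classes whose implementation is omitted in this section. It only needs to offer the same two methods.

```python
class FixedPolicy:
    """Hypothetical stand-in for the omitted payroll policy classes."""

    def __init__(self, amount):
        self._amount = amount

    def track_work(self, hours):
        pass  # nothing to record for a fixed amount

    def calculate_payroll(self):
        return self._amount


class LTDPolicy:
    def __init__(self):
        self._base_policy = None

    def track_work(self, hours):
        self._check_base_policy()
        return self._base_policy.track_work(hours)

    def calculate_payroll(self):
        self._check_base_policy()
        # Pay 60% of whatever the wrapped policy would have paid.
        return self._base_policy.calculate_payroll() * 0.6

    def apply_to_policy(self, base_policy):
        self._base_policy = base_policy

    def _check_base_policy(self):
        if not self._base_policy:
            raise RuntimeError("Base policy missing")


ltd_policy = LTDPolicy()
ltd_policy.apply_to_policy(FixedPolicy(1800))
ltd_policy.track_work(40)
print(ltd_policy.calculate_payroll())  # 60% of the base amount of 1800
```

Because LTDPolicy only relies on .track_work() and .calculate_payroll(), any object providing those two methods can serve as the base policy, and calling either method before .apply_to_policy() raises RuntimeError.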

You can now make a small change to the Employee class:

# In employees.py

class Employee(AsDictionaryMixin):
    def __init__(self, id):
        self.id = id
        info = employee_database.get_employee_info(self.id)
        self.name = info.get('name')
        self.address = get_employee_address(self.id)
        self._role = get_role(info.get('role'))
        self._payroll = get_policy(self.id)

    def work(self, hours):
        duties = self._role.perform_duties(hours)
        print(f'Employee {self.id} - {self.name}:')
        print(f'- {duties}')
        print('')
        self._payroll.track_work(hours)

    def calculate_payroll(self):
        return self._payroll.calculate_payroll()

    def apply_payroll_policy(self, new_policy):
        new_policy.apply_to_policy(self._payroll)
        self._payroll = new_policy

You added an .apply_payroll_policy() method that applies the existing payroll policy to the
new policy and then substitutes it. You can now modify the program to apply the policy to an
Employee object:

# In program.py

from hr import calculate_payroll, LTDPolicy
from productivity import track
from employees import employee_database

employees = employee_database.employees

sales_employee = employees[2]
ltd_policy = LTDPolicy()
sales_employee.apply_payroll_policy(ltd_policy)

track(employees, 40)
calculate_payroll(employees)

The program accesses sales_employee, which is located at index 2, creates the LTDPolicy
object, and applies the policy to the employee. When .calculate_payroll() is called, the
change is reflected. You can run the program to evaluate the output:

$ python program.py

Tracking Employee Productivity
==============================
Employee 1 - Mary Poppins:
- screams and yells for 40 hours.

Employee 2 - John Smith:
- does paperwork for 40 hours.

Employee 3 - Kevin Bacon:
- expends 40 hours on the phone.

Employee 4 - Jane Doe:
- manufactures gadgets for 40 hours.

Employee 5 - Robin Williams:
- does paperwork for 40 hours.

Calculating Payroll
===================
Payroll for: 1 - Mary Poppins
- Check amount: 3000
- Sent to:
121 Admin Rd.
Concord, NH 03301

Payroll for: 2 - John Smith
- Check amount: 1500
- Sent to:
67 Paperwork Ave
Manchester, NH 03101

Payroll for: 3 - Kevin Bacon
- Check amount: 1080.0
- Sent to:
15 Rose St
Apt. B-1
Concord, NH 03301

Payroll for: 4 - Jane Doe
- Check amount: 600
- Sent to:
39 Sole St.
Concord, NH 03301

Payroll for: 5 - Robin Williams
- Check amount: 360
- Sent to:
99 Mountain Rd.
Concord, NH 03301

The check amount for employee Kevin Bacon, who is the sales employee, is now for $1080
instead of $1800. That’s because the LTDPolicy has been applied to the salary.

As you can see, you were able to support the changes just by adding a new policy and modifying
a couple of interfaces. This is the kind of flexibility that policy design based on composition
gives you.

Choosing Between Inheritance and Composition in Python

Python, as an object-oriented programming language, supports both inheritance and composition.
You saw that inheritance is best used to model an is a relationship, whereas composition models
a has a relationship.

Sometimes, it’s hard to see what the relationship between two classes should be, but you can
follow these guidelines:

• Use inheritance over composition in Python to model a clear is a relationship. First,
justify the relationship between the derived class and its base. Then, reverse the
relationship and try to justify it. If you can justify the relationship in both directions, then
you should not use inheritance between them.
• Use inheritance over composition in Python to leverage both the interface and
implementation of the base class.
• Use inheritance over composition in Python to provide mixin features to several
unrelated classes when there is only one implementation of that feature.
• Use composition over inheritance in Python to model a has a relationship that
leverages the implementation of the component class.
• Use composition over inheritance in Python to create components that can be reused
by multiple classes in your Python applications.
• Use composition over inheritance in Python to implement groups of behaviors and
policies that can be applied interchangeably to other classes to customize their behavior.
• Use composition over inheritance in Python to enable run-time behavior changes
without affecting existing classes.

Conclusion
You explored inheritance and composition in Python. You learned about the type of
relationships that inheritance and composition create. You also went through a series of exercises
to understand how inheritance and composition are implemented in Python.

In this article, you learned how to:

• Use inheritance to express an is a relationship between two classes
• Evaluate if inheritance is the right relationship
• Use multiple inheritance in Python and evaluate Python’s MRO to troubleshoot multiple
inheritance problems
• Extend classes with mixins and reuse their implementation
• Use composition to express a has a relationship between two classes
• Provide flexible designs using composition
• Reuse existing code through policy design based on composition

SOLID Principles: Improve Object-Oriented Design in Python
by Leodanis Pozo Ramos, May 01, 2023

Table of Contents

• Object-Oriented Design in Python: The SOLID Principles
• Single-Responsibility Principle (SRP)
• Open-Closed Principle (OCP)
• Liskov Substitution Principle (LSP)
• Interface Segregation Principle (ISP)
• Dependency Inversion Principle (DIP)
• Conclusion

When you build a Python project using object-oriented programming (OOP), planning how
the different classes and objects will interact to solve your specific problems is an important part
of the job. This planning is known as object-oriented design (OOD), and getting it right can be a
challenge. If you’re stuck while designing your Python classes, then the SOLID principles can
help you out.

SOLID is a set of five object-oriented design principles that can help you write more
maintainable, flexible, and scalable code based on well-designed, cleanly structured classes.
These principles are a fundamental part of object-oriented design best practices.

In this tutorial, you’ll:

• Understand the meaning and purpose of each SOLID principle
• Identify Python code that violates some of the SOLID principles
• Apply the SOLID principles to refactor your Python code and improve its design

Throughout your learning journey, you’ll code practical examples to discover how the SOLID
principles can lead to well-organized, flexible, maintainable, and scalable code.

To get the most out of this tutorial, you must have a good understanding of Python object-
oriented programming concepts, such as classes, interfaces, and inheritance.


Object-Oriented Design in Python: The SOLID Principles


When it comes to writing classes and designing their interactions in Python, you can follow a
series of principles that will help you build better object-oriented code. One of the most popular
and widely accepted sets of standards for object-oriented design (OOD) is known as the SOLID
principles.

If you’re coming from C++ or Java, you may already be familiar with these principles. Maybe
you’re wondering if the SOLID principles also apply to Python code. To that question, the
answer is a resounding yes. If you’re writing object-oriented code, then you should consider
applying these principles to your OOD.

But what are these SOLID principles? SOLID is an acronym that groups five core principles that
apply to object-oriented design. These principles are the following:

1. Single-responsibility principle (SRP)
2. Open-closed principle (OCP)
3. Liskov substitution principle (LSP)
4. Interface segregation principle (ISP)
5. Dependency inversion principle (DIP)

You’ll explore each of these principles in detail and code real-world examples of how to apply
them in Python. In the process, you’ll gain a strong understanding of how to write more
straightforward, organized, scalable, and reusable object-oriented code by applying the SOLID
principles. To kick things off, you’ll start with the first principle on the list.

Single-Responsibility Principle (SRP)


The single-responsibility principle (SRP) comes from Robert C. Martin, more commonly
known by his nickname Uncle Bob, who’s a well-respected figure in the software engineering
world and one of the original signatories of the Agile Manifesto. In fact, he coined the term
SOLID.

The single-responsibility principle states that:

A class should have only one reason to change.

This means that a class should have only one responsibility, as expressed through its methods. If
a class takes care of more than one task, then you should separate those tasks into separate
classes.

Note: You’ll find the SOLID principles worded in various ways out there. In this tutorial, you’ll
refer to them following the wording that Uncle Bob uses in his book Agile Software
Development: Principles, Patterns, and Practices. So, all the direct quotes come from this book.

If you want to read alternate wordings in a quick roundup of these and related principles, then
check out Uncle Bob’s The Principles of OOD.

This principle is closely related to the concept of separation of concerns, which suggests that you
should split your programs into different sections. Each section must address a separate concern.

To illustrate the single-responsibility principle and how it can help you improve your object-
oriented design, say that you have the following FileManager class:

# file_manager_srp.py

from pathlib import Path
from zipfile import ZipFile

class FileManager:
    def __init__(self, filename):
        self.path = Path(filename)

    def read(self, encoding="utf-8"):
        return self.path.read_text(encoding)

    def write(self, data, encoding="utf-8"):
        self.path.write_text(data, encoding)

    def compress(self):
        with ZipFile(self.path.with_suffix(".zip"), mode="w") as archive:
            archive.write(self.path)

    def decompress(self):
        with ZipFile(self.path.with_suffix(".zip"), mode="r") as archive:
            archive.extractall()

In this example, your FileManager class has two different responsibilities. It uses the .read()
and .write() methods to manage the file. It also deals with ZIP archives by providing the
.compress() and .decompress() methods.

This class violates the single-responsibility principle because it has two reasons for changing its
internal implementation. To fix this issue and make your design more robust, you can split the
class into two smaller, more focused classes, each with its own specific concern:

# file_manager_srp.py

from pathlib import Path
from zipfile import ZipFile

class FileManager:
    def __init__(self, filename):
        self.path = Path(filename)

    def read(self, encoding="utf-8"):
        return self.path.read_text(encoding)

    def write(self, data, encoding="utf-8"):
        self.path.write_text(data, encoding)

class ZipFileManager:
    def __init__(self, filename):
        self.path = Path(filename)

    def compress(self):
        with ZipFile(self.path.with_suffix(".zip"), mode="w") as archive:
            archive.write(self.path)

    def decompress(self):
        with ZipFile(self.path.with_suffix(".zip"), mode="r") as archive:
            archive.extractall()

Now you have two smaller classes, each having only a single responsibility. FileManager takes
care of managing a file, while ZipFileManager handles the compression and decompression of a
file using the ZIP format. These two classes are smaller, so they’re more manageable. They’re
also easier to reason about, test, and debug.
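A short usage sketch may help show the split in practice. The two classes are repeated from the listing above so the snippet runs on its own. The temporary directory and the notes.txt file name are assumptions made for the demonstration, and .decompress() is left out because this sketch doesn't call it.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


class FileManager:
    def __init__(self, filename):
        self.path = Path(filename)

    def read(self, encoding="utf-8"):
        return self.path.read_text(encoding)

    def write(self, data, encoding="utf-8"):
        self.path.write_text(data, encoding)


class ZipFileManager:
    def __init__(self, filename):
        self.path = Path(filename)

    def compress(self):
        with ZipFile(self.path.with_suffix(".zip"), mode="w") as archive:
            archive.write(self.path)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "notes.txt"  # hypothetical file name

    # Reading and writing text is FileManager's only job.
    manager = FileManager(target)
    manager.write("hello")
    text = manager.read()

    # Archiving is ZipFileManager's only job.
    ZipFileManager(target).compress()
    archive_exists = target.with_suffix(".zip").exists()

print(text, archive_exists)
```

A change to how archives are produced would now touch only ZipFileManager, while code that merely reads and writes text keeps depending on FileManager alone.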

The concept of responsibility in this context may be pretty subjective. Having a single
responsibility doesn’t necessarily mean having a single method. Responsibility isn’t directly tied
to the number of methods but to the core task that your class is responsible for, depending on
your idea of what the class represents in your code. However, that subjectivity shouldn’t stop
you from striving to use the SRP.

Open-Closed Principle (OCP)


The open-closed principle (OCP) for object-oriented design was originally introduced by
Bertrand Meyer in 1988 and means that:

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for
modification.

To understand what the open-closed principle is all about, consider the following Shape class:

# shapes_ocp.py

from math import pi

class Shape:
    def __init__(self, shape_type, **kwargs):
        self.shape_type = shape_type
        if self.shape_type == "rectangle":
            self.width = kwargs["width"]
            self.height = kwargs["height"]
        elif self.shape_type == "circle":
            self.radius = kwargs["radius"]

    def calculate_area(self):
        if self.shape_type == "rectangle":
            return self.width * self.height
        elif self.shape_type == "circle":
            return pi * self.radius**2

The initializer of Shape takes a shape_type argument that can be either "rectangle" or
"circle". It also takes a specific set of keyword arguments using the **kwargs syntax. If you
set the shape type to "rectangle", then you should also pass the width and height keyword
arguments so that you can construct a proper rectangle.

In contrast, if you set the shape type to "circle", then you must also pass a radius argument to
construct a circle.

Note: This example may seem a bit extreme. Its intention is to clearly expose the core idea
behind the open-closed principle.

Shape also has a .calculate_area() method that computes the area of the current shape
according to its .shape_type:

>>> from shapes_ocp import Shape

>>> rectangle = Shape("rectangle", width=10, height=5)
>>> rectangle.calculate_area()
50
>>> circle = Shape("circle", radius=5)
>>> circle.calculate_area()
78.53981633974483

The class works. You can create circles and rectangles, compute their area, and so on. However,
the class looks pretty bad. Something seems wrong with it at first sight.

Imagine that you need to add a new shape, maybe a square. How would you do that? Well, the
option here is to add another elif clause to .__init__() and to .calculate_area() so that
you can address the requirements of a square shape.

Having to make these changes to create new shapes means that your class is open to
modification. That violates the open-closed principle. How can you fix your class to make it
open to extension but closed to modification? Here’s a possible solution:

# shapes_ocp.py

from abc import ABC, abstractmethod
from math import pi

class Shape(ABC):
    def __init__(self, shape_type):
        self.shape_type = shape_type

    @abstractmethod
    def calculate_area(self):
        pass

class Circle(Shape):
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius

    def calculate_area(self):
        return pi * self.radius**2

class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def calculate_area(self):
        return self.width * self.height

class Square(Shape):
    def __init__(self, side):
        super().__init__("square")
        self.side = side

    def calculate_area(self):
        return self.side**2

In this code, you completely refactored the Shape class, turning it into an abstract base class
(ABC). This class provides the required interface (API) for any shape that you’d like to define.
That interface consists of a .shape_type attribute and a .calculate_area() method that you
must override in all the subclasses.

Note: The example above and some examples in the next sections use Python’s ABCs to provide
what’s called interface inheritance. In this type of inheritance, subclasses inherit interfaces
rather than functionality. In contrast, when classes inherit functionality, then you’re presented
with implementation inheritance.

This update closes the class to modifications. Now you can add new shapes to your class design
without the need to modify Shape. In every case, you’ll have to implement the required interface,
which also makes your classes polymorphic.
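As a rough illustration of what "open for extension" buys you, the sketch below repeats Shape and Rectangle from the listing above and then adds two things that are not part of the tutorial: a hypothetical Triangle subclass and a hypothetical total_area() helper. Neither addition requires editing Shape.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    def __init__(self, shape_type):
        self.shape_type = shape_type

    @abstractmethod
    def calculate_area(self):
        pass


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def calculate_area(self):
        return self.width * self.height


class Triangle(Shape):  # hypothetical new shape, added without touching Shape
    def __init__(self, base, height):
        super().__init__("triangle")
        self.base = base
        self.height = height

    def calculate_area(self):
        return self.base * self.height / 2


def total_area(shapes):
    # Hypothetical client code: it relies only on the Shape interface,
    # so it keeps working with subclasses written later.
    return sum(shape.calculate_area() for shape in shapes)


print(total_area([Rectangle(10, 5), Triangle(4, 3)]))
```

Because calculate_area() is abstract, trying to instantiate Shape directly raises a TypeError, which nudges every new shape toward implementing the full interface.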

Liskov Substitution Principle (LSP)


The Liskov substitution principle (LSP) was introduced by Barbara Liskov at an OOPSLA
conference in 1987. Since then, this principle has been a fundamental part of object-oriented
programming. The principle states that:

Subtypes must be substitutable for their base types.

For example, if you have a piece of code that works with a Shape class, then you should be able
to substitute that class with any of its subclasses, such as Circle or Rectangle, without
breaking the code.

Note: You can read the conference proceedings from the keynote where Barbara Liskov first
shared this principle, or you can watch a short fragment of an interview with her for more
context.

In practice, this principle is about making your subclasses behave like their base classes without
breaking anyone’s expectations when they call the same methods. To continue with shape-
related examples, say you have a Rectangle class like the following:

# shapes_lsp.py

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def calculate_area(self):
        return self.width * self.height

In Rectangle, you’ve provided the .calculate_area() method, which operates with the
.width and .height instance attributes.

Because a square is a special case of a rectangle with equal sides, you think of deriving a Square
class from Rectangle in order to reuse the code. Then, you override the .__setattr__() special
method so that when one of the .width and .height attributes changes, the other one also changes:

# shapes_lsp.py

# ...

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)

    def __setattr__(self, key, value):
        super().__setattr__(key, value)
        if key in ("width", "height"):
            self.__dict__["width"] = value
            self.__dict__["height"] = value

In this snippet of code, you’ve defined Square as a subclass of Rectangle. As a user might
expect, the class constructor takes only the side of the square as an argument. Internally, the
.__init__() method initializes the parent’s attributes, .width and .height, with the side
argument.

You’ve also defined a special method, .__setattr__(), to hook into Python’s attribute-setting
mechanism and intercept the assignment of a new value to either the .width or .height
attribute. Specifically, when you set one of those attributes, the other attribute is also set to the
same value:

>>> from shapes_lsp import Square

>>> square = Square(5)
>>> vars(square)
{'width': 5, 'height': 5}

>>> square.width = 7
>>> vars(square)
{'width': 7, 'height': 7}

>>> square.height = 9
>>> vars(square)
{'width': 9, 'height': 9}

Now you’ve ensured that the Square object always remains a valid square, making your life
easier for the small price of a bit of wasted memory. Unfortunately, this violates the Liskov
substitution principle because you can’t replace instances of Rectangle with their Square
counterparts.

When someone expects a rectangle object in their code, they might assume that it’ll behave like
one by exposing two independent .width and .height attributes. Meanwhile, your Square class
breaks that assumption by changing the behavior promised by the object’s interface. That could
have surprising and unwanted consequences, which would likely be hard to debug.
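
To see how that broken assumption can surface, consider a minimal sketch. The double_width() function below is a hypothetical piece of client code that isn’t part of the original example. It assumes that changing .width leaves .height untouched:

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def calculate_area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)

    def __setattr__(self, key, value):
        super().__setattr__(key, value)
        if key in ("width", "height"):
            self.__dict__["width"] = value
            self.__dict__["height"] = value


def double_width(rectangle):
    """Hypothetical client code: widen a rectangle and return its new area."""
    rectangle.width *= 2
    return rectangle.calculate_area()


print(double_width(Rectangle(3, 4)))  # 24
print(double_width(Square(3)))  # 36
```

With a Rectangle, the result matches what the caller expects. With a Square of side 3, a caller that expects an area of 6 * 3 = 18 gets 36 instead, because the height changed behind the scenes.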

While a square is a specific type of rectangle in mathematics, the classes that represent those
shapes shouldn’t be in a parent-child relationship if you want them to comply with the Liskov
substitution principle. One way to solve this problem is to create a base class for both Rectangle
and Square to extend:

# shapes_lsp.py

from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def calculate_area(self):
        pass

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def calculate_area(self):
        return self.width * self.height

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def calculate_area(self):
        return self.side ** 2

Shape becomes the type that you can substitute through polymorphism with either Rectangle or
Square, which are now siblings rather than a parent and a child. Notice that both concrete shape
types have distinct sets of attributes, different initializer methods, and could potentially
implement even more separate behaviors. The only thing that they have in common is the ability
to calculate their area.

With this implementation in place, you can use the Shape type interchangeably with its Square
and Rectangle subtypes when you only care about their common behavior:

>>> from shapes_lsp import Rectangle, Square

>>> def get_total_area(shapes):
...     return sum(shape.calculate_area() for shape in shapes)
...

>>> get_total_area([Rectangle(10, 5), Square(5)])
75

Here, you pass a pair consisting of a rectangle and a square into a function that calculates their
total area. Because the function only cares about the .calculate_area() method, it doesn’t
matter that the shapes are different. This is the essence of the Liskov substitution principle.
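
Because Shape is an abstract base class, Python also refuses to instantiate a subclass that doesn’t provide the shared behavior. Here’s a small sketch of that safeguard, where Triangle is a hypothetical class that forgets to implement .calculate_area():

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def calculate_area(self):
        pass


class Triangle(Shape):  # Hypothetical shape without .calculate_area()
    def __init__(self, base, height):
        self.base = base
        self.height = height


try:
    Triangle(3, 4)
except TypeError as error:
    # The exact wording of the message depends on the Python version
    print(f"Can't create the shape: {error}")
```

This way, an incomplete subtype fails early, at instantiation time, rather than later inside code that relies on the common interface.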

Interface Segregation Principle (ISP)


The interface segregation principle (ISP) comes from the same mind as the single-
responsibility principle. Yes, it’s another feather in Uncle Bob’s cap. The principle’s main idea is
that:

Clients should not be forced to depend upon methods that they do not use. Interfaces belong to
clients, not to hierarchies.

In this case, clients are classes and subclasses, and interfaces consist of methods and attributes.
In other words, if a class doesn’t use particular methods or attributes, then those methods and
attributes should be segregated into more specific classes.

Consider the following example of class hierarchy to model printing machines:

# printers_isp.py

from abc import ABC, abstractmethod

class Printer(ABC):
    @abstractmethod
    def print(self, document):
        pass

    @abstractmethod
    def fax(self, document):
        pass

    @abstractmethod
    def scan(self, document):
        pass

class OldPrinter(Printer):
    def print(self, document):
        print(f"Printing {document} in black and white...")

    def fax(self, document):
        raise NotImplementedError("Fax functionality not supported")

    def scan(self, document):
        raise NotImplementedError("Scan functionality not supported")

class ModernPrinter(Printer):
    def print(self, document):
        print(f"Printing {document} in color...")

    def fax(self, document):
        print(f"Faxing {document}...")

    def scan(self, document):
        print(f"Scanning {document}...")

In this example, the base class, Printer, provides the interface that its subclasses must
implement. OldPrinter inherits from Printer and must implement the same interface.
However, OldPrinter doesn’t use the .fax() and .scan() methods because this type of printer
doesn’t support these functionalities.
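
The following sketch shows how this design can surprise client code. The send_fax() helper is a hypothetical function, and the classes are trimmed-down versions of the ones above, without .scan(), to keep the example short:

```python
from abc import ABC, abstractmethod


class Printer(ABC):
    @abstractmethod
    def print(self, document):
        pass

    @abstractmethod
    def fax(self, document):
        pass


class OldPrinter(Printer):
    def print(self, document):
        print(f"Printing {document} in black and white...")

    def fax(self, document):
        raise NotImplementedError("Fax functionality not supported")


def send_fax(printer, document):
    """Hypothetical client code that trusts the Printer interface."""
    try:
        printer.fax(document)
    except NotImplementedError as error:
        return f"Failed: {error}"
    return "Sent"


print(send_fax(OldPrinter(), "report.pdf"))
```

The interface promises a .fax() method, but the call only fails at runtime, so every caller has to guard against an error that the class hierarchy itself suggests can’t happen.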

This implementation violates the ISP because it forces OldPrinter to expose an interface that
the class doesn’t implement or need. To fix this issue, you should separate the interfaces into
smaller and more specific classes. Then you can create concrete classes by inheriting from
multiple interface classes as needed:

# printers_isp.py

from abc import ABC, abstractmethod

class Printer(ABC):
    @abstractmethod
    def print(self, document):
        pass

class Fax(ABC):
    @abstractmethod
    def fax(self, document):
        pass

class Scanner(ABC):
    @abstractmethod
    def scan(self, document):
        pass

class OldPrinter(Printer):
    def print(self, document):
        print(f"Printing {document} in black and white...")

class ModernPrinter(Printer, Fax, Scanner):
    def print(self, document):
        print(f"Printing {document} in color...")

    def fax(self, document):
        print(f"Faxing {document}...")

    def scan(self, document):
        print(f"Scanning {document}...")

Now Printer, Fax, and Scanner are base classes that provide specific interfaces with a single
responsibility each. To create OldPrinter, you only inherit the Printer interface. This way, the
class won’t have unused methods. To create the ModernPrinter class, you need to inherit from
all the interfaces. In short, you’ve segregated the Printer interface.

This class design allows you to create different machines with different sets of functionalities,
making your design more flexible and extensible.
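
One way to take advantage of the segregated interfaces is to check for a capability before using it. The sketch below is only an assumption about how client code might look. The fax_if_possible() helper is hypothetical, and the classes are reduced to two interfaces for brevity:

```python
from abc import ABC, abstractmethod


class Printer(ABC):
    @abstractmethod
    def print(self, document):
        pass


class Fax(ABC):
    @abstractmethod
    def fax(self, document):
        pass


class OldPrinter(Printer):
    def print(self, document):
        print(f"Printing {document} in black and white...")


class ModernPrinter(Printer, Fax):
    def print(self, document):
        print(f"Printing {document} in color...")

    def fax(self, document):
        print(f"Faxing {document}...")


def fax_if_possible(machine, document):
    """Hypothetical helper: only fax with machines that implement Fax."""
    if isinstance(machine, Fax):
        machine.fax(document)
        return True
    return False


print(fax_if_possible(OldPrinter(), "invoice"))
print(fax_if_possible(ModernPrinter(), "invoice"))
```

Because OldPrinter no longer carries a .fax() method at all, the type itself tells you what a machine can do, and no method needs to raise NotImplementedError.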

Dependency Inversion Principle (DIP)


The dependency inversion principle (DIP) is the last principle in the SOLID set. This principle
states that:

Abstractions should not depend upon details. Details should depend upon abstractions.

That sounds pretty complex. Here’s an example that will help to clarify it. Say you’re building an
application and have a FrontEnd class to display data to the users in a friendly way. The app
currently gets its data from a database, so you end up with the following code:

# app_dip.py

class FrontEnd:
    def __init__(self, back_end):
        self.back_end = back_end

    def display_data(self):
        data = self.back_end.get_data_from_database()
        print("Display data:", data)

class BackEnd:
    def get_data_from_database(self):
        return "Data from the database"

In this example, the FrontEnd class depends on the BackEnd class and its concrete
implementation. You can say that both classes are tightly coupled. This coupling can lead to
scalability issues. For example, say that your app is growing fast, and you want the app to be
able to read data from a REST API. How would you do that?

You may think of adding a new method to BackEnd to retrieve the data from the REST API.
However, that will also require you to modify FrontEnd, which should be closed to
modification, according to the open-closed principle.

To fix the issue, you can apply the dependency inversion principle and make your classes depend
on abstractions rather than on concrete implementations like BackEnd. In this specific example,
you can introduce a DataSource class that provides the interface to use in your concrete classes:

# app_dip.py

from abc import ABC, abstractmethod

class FrontEnd:
    def __init__(self, data_source):
        self.data_source = data_source

    def display_data(self):
        data = self.data_source.get_data()
        print("Display data:", data)

class DataSource(ABC):
    @abstractmethod
    def get_data(self):
        pass

class Database(DataSource):
    def get_data(self):
        return "Data from the database"

class API(DataSource):
    def get_data(self):
        return "Data from the API"

In this redesign of your classes, you’ve added a DataSource class as an abstraction that provides
the required interface, or the .get_data() method. Note how FrontEnd now depends on the
interface provided by DataSource, which is an abstraction.

Then you define the Database class, which is a concrete implementation for those cases where
you want to retrieve the data from your database. This class depends on the DataSource
abstraction through inheritance. Finally, you define the API class to support retrieving the data
from the REST API. This class also depends on the DataSource abstraction.

Here’s how you can use the FrontEnd class in your code:

>>> from app_dip import API, Database, FrontEnd

>>> db_front_end = FrontEnd(Database())
>>> db_front_end.display_data()
Display data: Data from the database

>>> api_front_end = FrontEnd(API())
>>> api_front_end.display_data()
Display data: Data from the API

Here, you first initialize FrontEnd using a Database object and then again using an API object.
Every time you call .display_data(), the result will depend on the concrete data source that
you use. Note that you can also change the data source dynamically by reassigning the
.data_source attribute in your FrontEnd instance.
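
As a minimal sketch of that last point, the following code repeats the classes from app_dip.py so that it runs on its own, and then swaps the data source on an existing FrontEnd instance:

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    @abstractmethod
    def get_data(self):
        pass


class Database(DataSource):
    def get_data(self):
        return "Data from the database"


class API(DataSource):
    def get_data(self):
        return "Data from the API"


class FrontEnd:
    def __init__(self, data_source):
        self.data_source = data_source

    def display_data(self):
        data = self.data_source.get_data()
        print("Display data:", data)


front_end = FrontEnd(Database())
front_end.display_data()

# Swap the data source on the existing instance
front_end.data_source = API()
front_end.display_data()
```

FrontEnd doesn’t change at all. It only relies on the .get_data() method that every DataSource subclass provides.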

Conclusion
You’ve learned a lot about the five SOLID principles, including how to identify code that
violates them and how to refactor the code in adherence to best design practices. You saw good
and bad examples related to each principle and learned that applying the SOLID principles can
help you improve your object-oriented design in Python.

In this tutorial, you’ve learned how to:

 Understand the meaning and purpose of each SOLID principle
 Identify class designs that violate some of the SOLID principles in Python
 Use the SOLID principles to help you refactor Python code and improve its OOD

With this knowledge, you have a strong foundation of well-established best practices that you
should apply when designing your classes and their relationships in Python. By applying these
principles, you can create code that’s more maintainable, extensible, scalable, and testable.

Python Classes: The Power of Object-Oriented Programming
by Leodanis Pozo Ramos, Apr 26, 2023

Table of Contents

 Getting Started With Python Classes
o Defining a Class in Python
o Creating Objects From a Class in Python
o Accessing Attributes and Methods
 Naming Conventions in Python Classes
o Public vs Non-Public Members
o Name Mangling
 Understanding the Benefits of Using Classes in Python
 Deciding When to Avoid Using Classes
 Attaching Data to Classes and Instances
o Class Attributes
o Instance Attributes
o The .__dict__ Attribute
o Dynamic Class and Instance Attributes
o Property and Descriptor-Based Attributes
o Lightweight Classes With .__slots__
 Providing Behavior With Methods
o Instance Methods With self
o Special Methods and Protocols
o Class Methods With @classmethod
o Static Methods With @staticmethod
o Getter and Setter Methods vs Properties
 Summarizing Class Syntax and Usage: A Complete Example
 Debugging Python Classes
 Exploring Specialized Classes From the Standard Library
o Data Classes
o Enumerations
 Using Inheritance and Building Class Hierarchies
o Simple Inheritance
o Class Hierarchies
o Extended vs Overridden Methods
o Multiple Inheritance
o Method Resolution Order (MRO)
o Mixin Classes
o Benefits of Using Inheritance
 Using Alternatives to Inheritance
o Composition
o Delegation
o Dependency Injection
 Creating Abstract Base Classes (ABCs) and Interfaces
 Unlocking Polymorphism With Common Interfaces
 Conclusion


Python supports the object-oriented programming paradigm through classes. They provide an
elegant way to define reusable pieces of code that encapsulate data and behavior in a single
entity. With classes, you can quickly and intuitively model real-world objects and solve complex
problems.

If you’re new to classes, need to refresh your knowledge, or want to dive deeper into them, then
this tutorial is for you!

In this tutorial, you’ll learn how to:

 Define Python classes with the class keyword
 Add state to your classes with class and instance attributes
 Provide behavior to your classes with methods
 Use inheritance to build hierarchies of classes
 Provide interfaces with abstract classes

To get the most out of this tutorial, you should know about Python variables, data types, and
functions. Some experience with object-oriented programming (OOP) is also a plus. Don’t worry
if you’re not an OOP expert yet. In this tutorial, you’ll learn the key concepts that you need to
get started and more. You’ll also write several practical examples to help reinforce your
knowledge of Python classes.


Getting Started With Python Classes


Python is a multiparadigm programming language that supports object-oriented programming
(OOP) through classes that you can define with the class keyword. You can think of a class as a
piece of code that specifies the data and behavior that represent and model a particular type of
object.

What is a class in Python? A common analogy is that a class is like the blueprint for a house.
You can use the blueprint to create several houses and even a complete neighborhood. Each
concrete house is an object or instance that’s derived from the blueprint.

Each instance can have its own properties, such as color, owner, and interior design. These
properties carry what’s commonly known as the object’s state. Instances can also have different
behaviors, such as locking the doors and windows, opening the garage door, turning the lights on
and off, watering the garden, and more.

In OOP, you commonly use the term attributes to refer to the properties or data associated with
a specific object of a given class. In Python, attributes are variables defined inside a class with
the purpose of storing all the required data for the class to work.

Similarly, you’ll use the term methods to refer to the different behaviors that objects will show.
Methods are functions that you define within a class. These functions typically operate on or
with the attributes of the underlying instance or class. Attributes and methods are collectively
referred to as members of a class or object.

You can write fully functional classes to model the real world. These classes will help you better
organize your code and solve complex programming problems.

For example, you can use classes to create objects that emulate people, animals, vehicles, books,
buildings, or other objects. You can also model virtual objects, such as a web server,
directory tree, chatbot, file manager, and more.

Finally, you can use classes to build class hierarchies. This way, you’ll promote code reuse and
remove repetition throughout your codebase.

In this tutorial, you’ll learn a lot about classes and all the cool things that you can do with them.
To kick things off, you’ll start by defining your first class in Python. Then you’ll dive into other
topics related to instances, attributes, and methods.


Defining a Class in Python

To define a class, you need to use the class keyword followed by the class name and a colon,
just like you’d do for other compound statements in Python. Then you must define the class
body, which will start at the next indentation level:

class ClassName:
    # Class body
    pass

In a class body, you can define attributes and methods as needed. As you already learned,
attributes are variables that hold the class data, while methods are functions that provide
behavior and typically act on the class data.

Note: In Python, the body of a given class works as a namespace where attributes and methods
live. You can only access those attributes and methods through the class or its objects.
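
To illustrate the namespace idea, here’s a short sketch that uses a hypothetical Greeter class. Its attribute and method are reachable through the class or one of its objects, but not as bare names at the module level:

```python
class Greeter:  # Hypothetical class for illustration
    greeting = "Hello"  # Attribute living in the class namespace

    def greet(self, name):  # Method living in the class namespace
        return f"{self.greeting}, {name}!"


print(Greeter.greeting)
print(Greeter().greet("Pythonista"))

try:
    greeting  # The bare name doesn't exist at the module level
except NameError as error:
    print(f"NameError: {error}")
```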

As an example of how to define attributes and methods, say that you need a Circle class to
model different circles in a drawing application. Initially, your class will have a single attribute
to hold the radius. It’ll also have a method to calculate the circle’s area:

# circle.py

import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    def calculate_area(self):
        return round(math.pi * self.radius ** 2, 2)

In this code snippet, you define Circle using the class keyword. Inside the class, you write two
methods. The .__init__() method has a special meaning in Python classes. This method is
known as the object initializer because it defines and sets the initial values for your attributes.
You’ll learn more about this method in the Instance Attributes section.

The second method of Circle is conveniently named .calculate_area() and will compute the
area of a specific circle by using its radius. It’s common for method names to contain a verb,
such as calculate, to describe an action the method performs. In this example, you’ve used the
math module to access the pi constant as it’s defined in that module.

Note: In Python, the first argument of most methods is self. This argument holds a reference to
the current object so that you can use it inside the class. You’ll learn more about this argument in
the section on instance methods with self.
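
As a quick sketch of what self stands for, the two calls below are equivalent. In the first one, Python passes the object as self automatically. In the second one, you pass it explicitly through the class:

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def calculate_area(self):
        return round(math.pi * self.radius ** 2, 2)


circle = Circle(2)

# Python passes circle as self automatically in this call
print(circle.calculate_area())  # 12.57

# The same call with self passed explicitly through the class
print(Circle.calculate_area(circle))  # 12.57
```

You’ll rarely write the second form in practice. It only serves to show where the value of self comes from.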

Cool! You’ve written your first class. Now, how can you use this class in your code to represent
several concrete circles? Well, you need to instantiate your class to create specific circle objects
from it.

Creating Objects From a Class in Python

The action of creating concrete objects from an existing class is known as instantiation. With
every instantiation, you create a new object of the target class. To get your hands dirty, go ahead
and make a couple of instances of Circle by running the following code in a Python REPL
session:

>>> from circle import Circle

>>> circle_1 = Circle(42)
>>> circle_2 = Circle(7)

>>> circle_1
<circle.Circle object at 0x102b835d0>
>>> circle_2
<circle.Circle object at 0x1035e3910>

To create an object of a Python class like Circle, you must call the Circle() class constructor
with a pair of parentheses and a set of appropriate arguments. What arguments? In Python, the
class constructor accepts the same arguments as the .__init__() method. In this example, the
Circle class expects the radius argument.

Calling the class constructor with different argument values will allow you to create different
objects or instances of the target class. In the above example, circle_1 and circle_2 are
separate instances of Circle. In other words, they’re two different and concrete circles, as you
can conclude from the code’s output.
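
If you want to confirm that every instantiation produces a distinct object, then you can compare identities with the is operator. The sketch below uses a reduced Circle class and adds a third, hypothetical circle_3 object with the same radius as circle_1:

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius


circle_1 = Circle(42)
circle_2 = Circle(7)
circle_3 = Circle(42)

print(circle_1 is circle_2)  # False
print(circle_1 is circle_3)  # False, even though the radius is the same
print(isinstance(circle_1, Circle))  # True
```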

Great! You already know how to create objects of an existing class by calling the class
constructor with the required arguments. Now, how can you access the attributes and methods of
a given class? That’s what you’ll learn in the next section.

Accessing Attributes and Methods

In Python, you can access the attributes and methods of an object by using dot notation with the
dot operator. The following snippet of code shows the required syntax:

obj.attribute_name

obj.method_name()

Note that the dot (.) in this syntax basically means give me the following attribute or method
from this object. The first line returns the value stored in the target attribute, while the second
line accesses the target method so that you can call it.

Note: Remember that to call a function or method, you need to use a pair of parentheses and a
series of arguments, if applicable.
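
The difference between accessing a method and calling it can be sketched as follows. The area_method name is just an illustrative variable:

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def calculate_area(self):
        return round(math.pi * self.radius ** 2, 2)


circle = Circle(7)

# Without parentheses, you get the method object itself
area_method = circle.calculate_area
print(callable(area_method))  # True

# With parentheses, you call the method and get its return value
print(area_method())  # 153.94
```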

Now get back to your circle objects and run the following code:

>>> from circle import Circle

>>> circle_1 = Circle(42)
>>> circle_2 = Circle(7)

>>> circle_1.radius
42
>>> circle_1.calculate_area()
5541.77

>>> circle_2.radius
7
>>> circle_2.calculate_area()
153.94

In the first couple of lines after the instantiation, you access the .radius attribute on your
circle_1 object. Then you call the .calculate_area() method to calculate the circle’s area. In
the second pair of statements, you do the same but on the circle_2 object.

You can also use dot notation and an assignment statement to change the current value of an
attribute:

>>> circle_1.radius = 100
>>> circle_1.radius
100
>>> circle_1.calculate_area()
31415.93

Now the radius of circle_1 is entirely different. When you call .calculate_area(), the result
immediately reflects this change. You’ve changed the object’s internal state or data, which
typically impacts its behaviors or methods.


Naming Conventions in Python Classes


Before diving deeper into classes, you’ll need to be aware of some important naming
conventions that Python uses in the context of classes. Python is a flexible language that loves
freedom and doesn’t like to have explicit restrictions. Because of that, the language and the
community rely on conventions rather than restrictions.

Note: Most Python programmers follow the snake_case naming convention, which involves
using underscores (_) to separate multiple words. However, the recommended naming
convention for Python classes is PascalCase, where each word is capitalized.

In the following two sections, you’ll learn about two important naming conventions that apply to
class attributes.

Public vs Non-Public Members

The first naming convention that you need to know about is related to the fact that Python
doesn’t distinguish between private, protected, and public attributes like Java and other
languages do. In Python, all attributes are accessible in one way or another. However, Python has
a well-established naming convention that you should use to communicate that an attribute or
method isn’t intended for use from outside its containing class or object.

The naming convention consists of adding a leading underscore to the member’s name. So, in a
Python class, you’ll have the following convention:

Member      Naming                                  Examples
Public      Use the normal naming pattern.          radius, calculate_area()
Non-public  Include a leading underscore in names.  _radius, _calculate_area()

Public members are part of the official interface or API of your classes, while non-public
members aren’t intended to be part of that API. This means that you shouldn’t use non-public
members outside their defining class.

It’s important to note that the second naming convention only indicates that the attribute isn’t
intended to be used directly from outside the containing class. It doesn’t prevent direct access,
though. For example, you can run obj._name, and you’ll access the content of ._name.
However, this is bad practice, and you should avoid it.
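
Here’s a minimal sketch of that point. This variant of Circle is only an illustration. It stores the radius in a non-public ._radius attribute and exposes .calculate_area() as its public API:

```python
import math


class Circle:
    def __init__(self, radius):
        self._radius = radius  # Non-public by convention

    def calculate_area(self):  # Part of the public API
        return round(math.pi * self._radius ** 2, 2)


circle = Circle(7)

print(circle.calculate_area())  # 153.94
print(circle._radius)  # 7, which works but bypasses the public API
```

Python doesn’t stop the second access. The leading underscore is a signal to other programmers, not an enforcement mechanism.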

Non-public members exist only to support the internal implementation of a given class and may
be removed at any time, so you shouldn’t rely on them. The existence of these members depends
on how the class is implemented. So, you shouldn’t use them directly in client code. If you do,
then your code could break at any moment.

When writing classes, sometimes it’s hard to decide if an attribute should be public or non-
public. This decision will depend on how you want your users to use your classes. In most cases,
attributes should be non-public to guarantee the safe use of your classes. A good approach will
be to start with all your attributes as non-public and only make them public if real use cases
appear.

Name Mangling

Another naming convention that you can see and use in Python classes is to add two leading
underscores to attribute and method names. This naming convention triggers what’s known as
name mangling.

Name mangling is an automatic name transformation that prepends the class’s name to the
member’s name, like in _ClassName__attribute or _ClassName__method. This results in
name hiding. In other words, mangled names aren’t available for direct access. They’re not part
of a class’s public API.

For example, consider the following sample class:

>>> class SampleClass:
...     def __init__(self, value):
...         self.__value = value
...     def __method(self):
...         print(self.__value)
...

>>> sample_instance = SampleClass("Hello!")
>>> vars(sample_instance)
{'_SampleClass__value': 'Hello!'}

>>> vars(SampleClass)
mappingproxy({
    ...
    '__init__': <function SampleClass.__init__ at 0x105dfd4e0>,
    '_SampleClass__method': <function SampleClass.__method at 0x105dfd760>,
    '__dict__': <attribute '__dict__' of 'SampleClass' objects>,
    ...
})

>>> sample_instance = SampleClass("Hello!")
>>> sample_instance.__value
Traceback (most recent call last):
    ...
AttributeError: 'SampleClass' object has no attribute '__value'

>>> sample_instance.__method()
Traceback (most recent call last):
    ...
AttributeError: 'SampleClass' object has no attribute '__method'

In this class, .__value and .__method() have two leading underscores, so their names are
mangled to ._SampleClass__value and ._SampleClass__method(), as you can see in the
output of the two vars() calls. Python has automatically added the prefix _SampleClass to both names.
Because of this internal renaming, you can’t access the attributes from outside the class using
their original names. If you try to do it, then you get an AttributeError.

Note: In the above example, you use the built-in vars() function, which returns a dictionary of
all the members associated with the given object. This dictionary plays an important role in
Python classes. You’ll learn more about it in the .__dict__ attribute section.

This internal behavior hides the names, creating the illusion of a private attribute or method.
However, they’re not strictly private. You can access them through their mangled names:

>>> sample_instance._SampleClass__value
'Hello!'

>>> sample_instance._SampleClass__method()
Hello!

It’s still possible to access name-mangled attributes or methods using their mangled names,
although this is bad practice, and you should avoid it in your code. If you see a name that uses
this convention in someone’s code, then don’t try to force the code to use the name from outside
its containing class.

Name mangling is particularly useful when you want to ensure that a given attribute or method
won’t get accidentally overwritten. It’s a way to avoid naming conflicts between classes or
subclasses. It’s also useful to prevent subclasses from overriding methods that have been
optimized for better performance.
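
The following sketch shows how name mangling avoids a clash between a class and its subclass. Base and Child are hypothetical classes that both define an attribute called .__token:

```python
class Base:
    def __init__(self):
        self.__token = "base"  # Stored as _Base__token

    def base_token(self):
        return self.__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"  # Stored as _Child__token

    def child_token(self):
        return self.__token


child = Child()

print(child.base_token())  # base
print(child.child_token())  # child
print(sorted(vars(child)))  # ['_Base__token', '_Child__token']
```

Without the double underscore, the assignment in Child would silently overwrite the value that Base relies on.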

Wow! Up to this point, you’ve learned the basics of Python classes and a bit more. You’ll dive
deeper into how Python classes work in a moment. But first, it’s time to jump into some reasons
why you should learn about classes and use them in your Python projects.


Understanding the Benefits of Using Classes in Python

Is it worth using classes in Python? Absolutely! Classes are the building blocks of object-
oriented programming in Python. They allow you to leverage the power of Python while writing
and organizing your code. By learning about classes, you’ll be able to take advantage of all the
benefits that they provide. With classes, you can:

 Model and solve complex real-world problems: You’ll find many situations where the
objects in your code map to real-world objects. This can help you think about complex
problems, which will result in better solutions to your programming problems.
 Reuse code and avoid repetition: You can define hierarchies of related classes. The
base classes at the top of a hierarchy provide common functionality that you can reuse
later in the subclasses down the hierarchy. This allows you to reduce code duplication
and promote code reuse.
 Encapsulate related data and behaviors in a single entity: You can use Python classes
to bundle together related attributes and methods in a single entity, the object. This helps
you better organize your code using modular and autonomous entities that you can even
reuse across multiple projects.
 Abstract away the implementation details of concepts and objects: You can use
classes to abstract away the implementation details of core concepts and objects. This
will help you provide your users with intuitive interfaces (APIs) to process complex data
and behaviors.
 Unlock polymorphism with common interfaces: You can implement a particular
interface in several slightly different classes and use them interchangeably in your code.
This will make your code more flexible and adaptable.

In short, Python classes can help you write more organized, structured, maintainable, reusable,
flexible, and user-friendly code. They’re a great tool to have under your belt. However, don’t be
tempted to use classes for everything in Python. In some situations, they’ll overcomplicate your
solutions.

Note: In Python, the public attributes and methods of a class make up what you’ll know as the
class’s interface or application programming interface (API). You’ll learn more about
interfaces throughout this tutorial. Stay tuned!

In the following section, you’ll explore some situations where you should avoid using classes in
your code.

Deciding When to Avoid Using Classes


Python classes are pretty cool and powerful tools that you can use in multiple scenarios. Because
of this, some people tend to overuse classes and solve all their coding problems using them.
However, sometimes using a class isn’t the best solution. Sometimes a couple of functions are
enough.

In practice, you’ll encounter a few situations in which you should avoid classes. For example,
you shouldn’t use regular classes when you need to:

 Store only data. Use a data class or a named tuple instead.
 Provide a single method. Use a function instead.

Data classes, enumerations, and named tuples are specially designed to store data. So, they might
be the best solution if your class doesn’t have any behavior attached.
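
As a brief sketch of these alternatives, the hypothetical Point and Color types below store data without any custom behavior. They rely on the standard-library dataclasses module and on typing.NamedTuple:

```python
from dataclasses import dataclass
from typing import NamedTuple


@dataclass
class Point:  # Hypothetical data class
    x: float
    y: float


class Color(NamedTuple):  # Hypothetical named tuple
    red: int
    green: int
    blue: int


point = Point(1.5, 2.5)
color = Color(255, 128, 0)

print(point)  # Point(x=1.5, y=2.5)
print(color)  # Color(red=255, green=128, blue=0)
print(point == Point(1.5, 2.5))  # True
```

Both tools generate the initializer, a readable string representation, and equality comparison for you, so there’s no boilerplate left to write by hand.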

If your class has a single method in its API, then you may not require a class. Instead, use a
function unless you need to retain a certain state between calls. If more methods appear later,
then you can always create a class. Remember the Python principle:

Simple is better than complex. (Source: The Zen of Python)
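
To make the single-method case concrete, here’s a small sketch. TemperatureConverter is a hypothetical class whose only method doesn’t depend on any state, so a plain function does the same job with less code:

```python
class TemperatureConverter:  # Hypothetical single-method class
    def to_fahrenheit(self, celsius):
        return celsius * 9 / 5 + 32


def to_fahrenheit(celsius):  # The same behavior as a plain function
    return celsius * 9 / 5 + 32


print(TemperatureConverter().to_fahrenheit(100))  # 212.0
print(to_fahrenheit(100))  # 212.0
```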

Additionally, you should avoid creating custom classes to wrap up functionality that’s available
through built-in types or third-party classes. Use the type or third-party class directly.

You’ll find many other general situations where you may not need to use classes in your Python
code. For example, classes aren’t necessary when you’re working with:

• A small and simple program or script that doesn’t require complex data structures or
logic. In this case, using classes may be overkill.
• A performance-critical program. Classes add overhead to your program, especially
when you need to create many objects. This may affect your code’s general performance.
• A legacy codebase. If an existing codebase doesn’t use classes, then you shouldn’t
introduce them. This will break the current coding style and disrupt the code’s
consistency.
• A team with a different coding style. If your current team doesn’t use classes, then stick
with their coding style. This will ensure consistency across the project.
• A codebase that uses functional programming. If a given codebase is currently written
with a functional approach, then you shouldn’t introduce classes. This will break the
underlying coding paradigm.

You may find yourself in many other situations where using classes will be overkill. Classes are
great, but don’t turn them into a one-size-fits-all type of tool. Start your code as simply as
possible. If the need for a class appears, then go for it.

Attaching Data to Classes and Instances


As you’ve learned, classes are great when you must bundle data and behavior together in a single
entity. The data will come in the form of attributes, while the behavior will come as methods.
You already have an idea of what an attribute is. Now it’s time to dive deeper into how you can
add, access, and modify attributes in your custom classes.

First, you need to know that your classes can have two types of attributes in Python:

1. Class attributes: A class attribute is a variable that you define in the class body directly.
Class attributes belong to their containing class. Their data is common to the class and all
its instances.
2. Instance attributes: An instance attribute is a variable that you define inside a method. Instance
attributes belong to a concrete instance of a given class. Their data is only available to
that instance and defines its state.

Both types of attributes have their specific use cases. Instance attributes are, by far, the most
common type of attribute that you’ll use in your day-to-day coding, but class attributes also come
in handy.


Class Attributes

Class attributes are variables that you define directly in the class body but outside of any method.
These attributes are tied to the class itself rather than to particular objects of that class.

All the objects that you create from a particular class share the same class attributes with the
same original values. Because of this, if you change a class attribute, then that change affects all
the derived objects.

As an example, say that you want to create a class that keeps an internal count of the instances
you’ve created. In that case, you can use a class attribute:

>>> class ObjectCounter:
...     num_instances = 0
...     def __init__(self):
...         ObjectCounter.num_instances += 1
...

>>> ObjectCounter()
<__main__.ObjectCounter object at 0x10392d810>
>>> ObjectCounter()
<__main__.ObjectCounter object at 0x1039810d0>
>>> ObjectCounter()
<__main__.ObjectCounter object at 0x10395b750>
>>> ObjectCounter()
<__main__.ObjectCounter object at 0x103959810>

>>> ObjectCounter.num_instances
4

>>> counter = ObjectCounter()


>>> counter.num_instances
5

ObjectCounter keeps a .num_instances class attribute that works as a counter of instances.
When Python parses this class, it initializes the counter to zero and leaves it alone. Creating
instances of this class means automatically calling the .__init__() method and
incrementing .num_instances by one.

Note: In the above example, you’ve used the class name to access .num_instances inside
.__init__(). However, using the built-in type() function is best because it’ll make your class
more flexible:

>>> class ObjectCounter:
...     num_instances = 0
...     def __init__(self):
...         type(self).num_instances += 1
...

The built-in type() function returns the class or type of self, which is ObjectCounter in this
example. This subtle change makes your class more robust and reliable by avoiding hard-coding
the class that provides the attribute.
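As a rough sketch of what that flexibility buys you, consider a hypothetical subclass that keeps its own counter. The SpecialCounter name is an assumption for illustration. Because .__init__() looks up the attribute on type(self), each class updates its own .num_instances:

```python
class ObjectCounter:
    num_instances = 0

    def __init__(self):
        # type(self) is the class of the object being created
        type(self).num_instances += 1


class SpecialCounter(ObjectCounter):
    # Hypothetical subclass with its own, separate counter
    num_instances = 0


ObjectCounter()
ObjectCounter()
SpecialCounter()

base_count = ObjectCounter.num_instances  # Counts only ObjectCounter objects
special_count = SpecialCounter.num_instances  # Counts only SpecialCounter objects
```

With the hard-coded ObjectCounter.num_instances version, all three objects would be counted on the base class instead.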

It’s important to note that you can access class attributes using either the class or one of its
instances. That’s why you can use the counter object to retrieve the value of .num_instances.
However, if you need to modify a class attribute, then you must use the class itself rather than
one of its instances.

For example, if you use self to modify .num_instances, then you’ll be overriding the original
class attribute by creating a new instance attribute:

>>> class ObjectCounter:
...     num_instances = 0
...     def __init__(self):
...         self.num_instances += 1
...

>>> ObjectCounter()
<__main__.ObjectCounter object at 0x103987550>
>>> ObjectCounter()
<__main__.ObjectCounter object at 0x1039c5890>
>>> ObjectCounter()
<__main__.ObjectCounter object at 0x10396a890>
>>> ObjectCounter()
<__main__.ObjectCounter object at 0x1036fa110>

>>> ObjectCounter.num_instances
0

You can’t rebind class attributes through instances of the containing class. Assigning through an
instance creates a new instance attribute with the same name as the original class attribute.
That’s why ObjectCounter.num_instances returns 0 in this example. The augmented assignment
self.num_instances += 1 reads the class attribute and then stores the result on the instance,
which shadows the class attribute for that object.

In general, you should use class attributes for sharing data between instances of a class. Any
changes on a class attribute will be visible to all the instances of that class.

Instance Attributes

Instance attributes are variables tied to a particular object of a given class. The value of an
instance attribute is attached to the object itself. So, the attribute’s value is specific to its
containing instance.

Python lets you dynamically attach attributes to existing objects that you’ve already created.
However, you most often define instance attributes inside instance methods, which are those
methods that receive self as their first argument.

Note: Even though you can define instance attributes inside any instance method, it’s best to
define all of them in the .__init__() method, which is the instance initializer. This ensures that
all of the attributes have the correct values when you create a new instance. Additionally, it
makes the code more organized and easier to debug.

Consider the following Car class, which defines a bunch of instance attributes:

# car.py

class Car:
    def __init__(self, make, model, year, color):
        self.make = make
        self.model = model
        self.year = year
        self.color = color
        self.started = False
        self.speed = 0
        self.max_speed = 200

In this class, you define a total of seven instance attributes inside .__init__(). The attributes
.make, .model, .year, and .color take values from the arguments to .__init__(), which are
the arguments that you must pass to the class constructor, Car(), to create concrete objects.

Then, you explicitly initialize the attributes .started, .speed, and .max_speed with sensible
values that don’t come from the user.

Note: Inside a class, you must access all instance attributes through the self argument. This
argument holds a reference to the current instance, which is where the attributes belong and
live. The self argument plays a fundamental role in Python classes. You’ll learn more about
self in the section Instance Methods With self.

Here’s how your Car class works in practice:

>>> from car import Car

>>> toyota_camry = Car("Toyota", "Camry", 2022, "Red")


>>> toyota_camry.make
'Toyota'
>>> toyota_camry.model
'Camry'
>>> toyota_camry.color
'Red'
>>> toyota_camry.speed
0

>>> ford_mustang = Car("Ford", "Mustang", 2022, "Black")


>>> ford_mustang.make
'Ford'
>>> ford_mustang.model
'Mustang'
>>> ford_mustang.year
2022
>>> ford_mustang.max_speed
200

In these examples, you create two different instances of Car. Each instance takes specific input
arguments at instantiation time to initialize part of its attributes. Note how the values of the
associated attributes are different and specific to the concrete instance.

Unlike class attributes, you can’t access instance attributes through the class. You need to access
them through their containing instance:

>>> Car.make
Traceback (most recent call last):
...
AttributeError: type object 'Car' has no attribute 'make'

Instance attributes are specific to a concrete instance of a given class. So, you can’t access them
through the class object. If you try to do that, then you get an AttributeError exception.


The .__dict__ Attribute

In Python, both classes and instances have a special attribute called .__dict__. This attribute
holds a dictionary containing the writable members of the underlying class or instance.
Remember, these members can be attributes or methods. Each key in .__dict__ represents an
attribute name. The value associated with a given key represents the value of the corresponding
attribute.

In a class, .__dict__ will contain class attributes and methods. In an instance, .__dict__ will
hold instance attributes.

When you access a class member through the class object, Python automatically searches for the
member’s name in the class .__dict__ and then in the .__dict__ of its parent classes. If the
name isn’t in any of them, then you get an AttributeError.

Similarly, when you access an instance member through a concrete instance of a class, Python
looks for the member’s name in the instance .__dict__. If the name doesn’t appear there, then
Python looks in the class .__dict__. If the name isn’t found, then you get an AttributeError.

Here’s a toy class that illustrates how this mechanism works:

# sample_dict.py

class SampleClass:
    class_attr = 100

    def __init__(self, instance_attr):
        self.instance_attr = instance_attr

    def method(self):
        print(f"Class attribute: {self.class_attr}")
        print(f"Instance attribute: {self.instance_attr}")

In this class, you define a class attribute with a value of 100. In the .__init__() method, you
define an instance attribute that takes its value from the user’s input. Finally, you define a
method to print both attributes.

Now it’s time to check the content of .__dict__ in the class object. Go ahead and run the
following code:

>>> from sample_dict import SampleClass

>>> SampleClass.class_attr
100

>>> SampleClass.__dict__
mappingproxy({
    '__module__': 'sample_dict',
    'class_attr': 100,
    '__init__': <function SampleClass.__init__ at 0x1036c62a0>,
    'method': <function SampleClass.method at 0x1036c56c0>,
    '__dict__': <attribute '__dict__' of 'SampleClass' objects>,
    '__weakref__': <attribute '__weakref__' of 'SampleClass' objects>,
    '__doc__': None
})

>>> SampleClass.__dict__["class_attr"]
100

The 'class_attr' and 'method' entries show that both the class attribute and the method are in
the class .__dict__ dictionary. Note how you can use .__dict__ to access the value of class
attributes by specifying the attribute’s name in square brackets, as you usually access keys in a
dictionary.

Note: You can access the same dictionary by calling the built-in vars() function on your class
or instance, as you did before.

In instances, the .__dict__ dictionary will contain instance attributes only:

>>> instance = SampleClass("Hello!")

>>> instance.instance_attr
'Hello!'
>>> instance.method()
Class attribute: 100
Instance attribute: Hello!

>>> instance.__dict__
{'instance_attr': 'Hello!'}

>>> instance.__dict__["instance_attr"]
'Hello!'

>>> instance.__dict__["instance_attr"] = "Hello, Pythonista!"


>>> instance.instance_attr
'Hello, Pythonista!'

The instance .__dict__ dictionary in this example holds .instance_attr and its specific value
for the object at hand. Again, you can access any existing instance attribute using .__dict__ and
the attribute name in square brackets.

You can modify the instance .__dict__ dynamically. This means that you can change the value
of existing instance attributes through .__dict__, as you did in the final example above. You
can even add new attributes to an instance using its .__dict__ dictionary.
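Here’s a small sketch of that last point. It repeats a trimmed SampleClass so that the snippet runs on its own, and the .extra_attr name is made up for illustration:

```python
class SampleClass:
    class_attr = 100

    def __init__(self, instance_attr):
        self.instance_attr = instance_attr


instance = SampleClass("Hello!")

# Add a brand-new attribute through the instance .__dict__
instance.__dict__["extra_attr"] = 42

# An instance entry named like a class attribute shadows it for this object only
instance.__dict__["class_attr"] = 200
```

After this code runs, instance.extra_attr and instance.class_attr read the new entries, while SampleClass.class_attr keeps its original value because the lookup checks the instance .__dict__ first.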

Using .__dict__ to change the value of instance attributes will allow you to avoid
RecursionError exceptions when you’re wiring descriptors in Python. You’ll learn more about
descriptors in the Property and Descriptor-Based Attributes section.
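As a preview, here’s a minimal sketch of the problem. The Recursive, Safe, and Demo class names are hypothetical. Inside .__set__(), calling setattr() on the instance triggers the same descriptor again, while writing to the instance .__dict__ bypasses it:

```python
class Recursive:
    def __set_name__(self, owner, name):
        self._name = name

    def __set__(self, instance, value):
        # setattr() goes through the descriptor again, so this never ends
        setattr(instance, self._name, value)


class Safe:
    def __set_name__(self, owner, name):
        self._name = name

    def __set__(self, instance, value):
        # Writing to .__dict__ directly skips the descriptor machinery
        instance.__dict__[self._name] = value


class Demo:
    broken = Recursive()
    working = Safe()


demo = Demo()
demo.working = 1  # Stored in demo.__dict__

try:
    demo.broken = 1
except RecursionError:
    recursion_detected = True
else:
    recursion_detected = False
```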

Dynamic Class and Instance Attributes

In Python, you can add new attributes to your classes and instances dynamically. This possibility
allows you to attach new data and behavior to your classes and objects in response to changing
requirements or contexts. It also allows you to adapt existing classes to specific and dynamic
needs.

For example, you can take advantage of this Python feature when you don’t know the required
attributes of a given class at the time when you’re defining that class itself.

Consider the following class, which aims to store a row of data from a database table or a CSV
file:

>>> class Record:
...     """Hold a record of data."""
...

In this class, you haven’t defined any attributes or methods because you don’t know what data
the class will store. Fortunately, you can add attributes and even methods to this class
dynamically.

For example, say that you’ve read a row of data from an employees.csv file using
csv.DictReader. This class reads the data and returns it in a dictionary-like object. Now
suppose that you have the following dictionary of data:

>>> john = {
...     "name": "John Doe",
...     "position": "Python Developer",
...     "department": "Engineering",
...     "salary": 80000,
...     "hire_date": "2020-01-01",
...     "is_manager": False,
... }

Next, you want to add this data to an instance of your Record class, and you need to represent
each data field as an instance attribute. Here’s how you can do it:

>>> john_record = Record()

>>> for field, value in john.items():
...     setattr(john_record, field, value)
...

>>> john_record.name
'John Doe'
>>> john_record.department
'Engineering'

>>> john_record.__dict__
{
    'name': 'John Doe',
    'position': 'Python Developer',
    'department': 'Engineering',
    'salary': 80000,
    'hire_date': '2020-01-01',
    'is_manager': False
}

In this code snippet, you first create an instance of Record called john_record. Then you run a
for loop to iterate over the items of your dictionary of data, john. Inside the loop, you use the
built-in setattr() function to sequentially add each field as an attribute to your john_record
object. If you inspect john_record, then you’ll note that it stores all the original data as
attributes.
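To connect this with the csv.DictReader scenario, here’s a sketch that reads rows from an in-memory CSV string instead of a real employees.csv file. The sample data is made up, and note that csv.DictReader yields every value as a string:

```python
import csv
import io


class Record:
    """Hold a record of data."""


# Stand-in for the content of employees.csv (hypothetical sample data)
csv_content = io.StringIO(
    "name,position,department\n"
    "John Doe,Python Developer,Engineering\n"
    "Jane Doe,Data Engineer,Analytics\n"
)

records = []
for row in csv.DictReader(csv_content):
    record = Record()
    # Turn every column of the current row into an instance attribute
    for field, value in row.items():
        setattr(record, field, value)
    records.append(record)
```

Each Record instance ends up with one attribute per CSV column, so the class never needs to know the column names in advance.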

You can also use dot notation and an assignment to add new attributes and methods to a class
dynamically:

>>> class User:
...     pass
...

>>> # Add instance attributes dynamically
>>> jane = User()
>>> jane.name = "Jane Doe"
>>> jane.job = "Data Engineer"
>>> jane.__dict__
{'name': 'Jane Doe', 'job': 'Data Engineer'}

>>> # Add methods dynamically
>>> def __init__(self, name, job):
...     self.name = name
...     self.job = job
...
>>> User.__init__ = __init__

>>> User.__dict__
mappingproxy({
    ...
    '__init__': <function __init__ at 0x1036ccae0>
})

>>> linda = User("Linda Smith", "Team Lead")
>>> linda.__dict__
{'name': 'Linda Smith', 'job': 'Team Lead'}

Here, you first create a minimal User class with no custom attributes or methods. To define the
class’s body, you’ve just used a pass statement as a placeholder, which is Python’s way of doing
nothing.

Then you create an object called jane. Note how you can use dot notation and an assignment to
add new attributes to the instance. In this example, you add .name and .job attributes with
appropriate values.

Then you provide the User class with an initializer or .__init__() method. In this method, you
take the name and job arguments, which you turn into instance attributes in the method’s body.
Then you add the method to User dynamically. After this addition, you can create User objects
by passing the name and job to the class constructor.

As you can conclude from the above example, you can construct an entire Python class
dynamically. Even though this capability of Python may seem neat, you must use it carefully
because it can make your code difficult to understand and reason about.
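Taking this idea one step further, the built-in type() function can build a class from a name, a tuple of base classes, and a namespace dictionary. The following is a sketch of that approach rather than something the example above relies on, and the describe() helper is hypothetical:

```python
def __init__(self, name, job):
    self.name = name
    self.job = job


def describe(self):
    return f"{self.name} works as a {self.job}"


# type(name, bases, namespace) creates a new class object at runtime
User = type("User", (), {"__init__": __init__, "describe": describe})

linda = User("Linda Smith", "Team Lead")
description = linda.describe()
```

The resulting User class behaves like one written with a class statement, but a reader has to trace the dictionary to learn what the class contains, which is the readability cost mentioned above.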


Property and Descriptor-Based Attributes

Python allows you to add function-like behavior on top of existing instance attributes and turn
them into managed attributes. This type of attribute prevents you from introducing breaking
changes into your APIs.

In other words, with managed attributes, you can have function-like behavior and attribute-like
access at the same time. You don’t need to change your APIs by replacing attributes with method
calls, which can potentially break your users’ code.

To create a managed attribute with function-like behavior in Python, you can use either a
property or a descriptor, depending on your specific needs.

Note: To dive deeper into Python properties, check out Python’s property(): Add Managed
Attributes to Your Classes.

As an example, get back to your Circle class and say that you need to validate the radius to
ensure that it only stores positive numbers. How would you do that without changing your class
interface? The quickest approach to this problem is to use a property and implement the
validation logic in the setter method.

Here’s what your new version of Circle can look like:

# circle.py

import math

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if not isinstance(value, int | float) or value <= 0:
            raise ValueError("positive number expected")
        self._radius = value

    def calculate_area(self):
        return round(math.pi * self._radius**2, 2)

To turn an existing attribute like .radius into a property, you typically use the @property
decorator to write the getter method. The getter method must return the value of the attribute. In
this example, the getter returns the circle’s radius, which is stored in the non-public ._radius
attribute.

To define the setter method of a property-based attribute, you need to use the decorator
@attr_name.setter. In the example, you use @radius.setter. Then you need to define the
method itself. Note that property setters need to take an argument providing the value that you
want to store in the underlying attribute.

Inside the setter method, you use a conditional to check whether the input value is an integer or a
floating-point number. You also check if the value is less than or equal to 0. If the value isn’t a
number or isn’t positive, then you raise a ValueError with a descriptive message about the actual
issue. Finally, you assign value to ._radius, and that’s it. Now, .radius is a property-based
attribute.

Here’s an example of this new version of Circle in action:

>>> from circle import Circle

>>> circle_1 = Circle(100)
>>> circle_1.radius
100
>>> circle_1.radius = 500
>>> circle_1.radius = 0
Traceback (most recent call last):
    ...
ValueError: positive number expected

>>> circle_2 = Circle(-100)
Traceback (most recent call last):
    ...
ValueError: positive number expected

>>> circle_3 = Circle("300")
Traceback (most recent call last):
    ...
ValueError: positive number expected

The first instance of Circle in this example takes a valid value for its radius. Note how you can
continue working with .radius as a regular attribute rather than as a method. If you try to assign
an invalid value to .radius, then you get a ValueError.

Note: Remember to reload the circle.py module if you’re working on the same REPL session
as before. This recommendation will also be valid for all the examples in this tutorial where you
change modules that you defined in previous examples.

It’s important to note that the validation also runs at instantiation time when you call the class
constructor to create new instances of Circle. This behavior is consistent with your validation
strategy.

Using a descriptor to create managed attributes is another powerful way to add function-like
behavior to your instance attributes without changing your APIs. Like properties, descriptors can
also have getter, setter, and other types of methods.

Note: To learn more about descriptors and how to use them, check out Python Descriptors: An
Introduction.

To explore how descriptors work, say that you’ve decided to continue creating classes for your
drawing application, and now you have the following Square class:

# square.py

class Square:
    def __init__(self, side):
        self.side = side

    @property
    def side(self):
        return self._side

    @side.setter
    def side(self, value):
        if not isinstance(value, int | float) or value <= 0:
            raise ValueError("positive number expected")
        self._side = value

    def calculate_area(self):
        return round(self._side**2, 2)

This class uses the same pattern as your Circle class. Instead of using radius, the Square class
takes a side argument and computes the area using the appropriate expression for a square.

This class is pretty similar to Circle, and the repetition starts looking odd. Then you think of
using a descriptor to abstract away the validation process. Here’s what you come up with:

# shapes.py

import math

class PositiveNumber:
    def __set_name__(self, owner, name):
        self._name = name

    def __get__(self, instance, owner):
        return instance.__dict__[self._name]

    def __set__(self, instance, value):
        if not isinstance(value, int | float) or value <= 0:
            raise ValueError("positive number expected")
        instance.__dict__[self._name] = value

class Circle:
    radius = PositiveNumber()

    def __init__(self, radius):
        self.radius = radius

    def calculate_area(self):
        return round(math.pi * self.radius**2, 2)

class Square:
    side = PositiveNumber()

    def __init__(self, side):
        self.side = side

    def calculate_area(self):
        return round(self.side**2, 2)

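The first thing to notice in this example is that you moved all the classes to a shapes.py file. In
that file, you define a descriptor class called PositiveNumber by implementing the .__get__()
and .__set__() special methods, which are part of the descriptor protocol. The .__set_name__()
method, which Python calls automatically when it creates the owner class, stores the name of the
managed attribute so that the descriptor knows which key to use in the instance .__dict__.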

Next, you remove the .radius property from Circle and the .side property from Square. In
Circle, you add a .radius class attribute, which holds an instance of PositiveNumber. You do
something similar in Square, but the class attribute is appropriately named .side.

Here are a few examples of how your classes work now:

>>> from shapes import Circle, Square

>>> circle = Circle(100)


>>> circle.radius
100
>>> circle.radius = 500
>>> circle.radius
500
>>> circle.radius = 0
Traceback (most recent call last):
...
ValueError: positive number expected

>>> square = Square(200)


>>> square.side
200
>>> square.side = 300
>>> square.side
300
>>> square.side = -100
Traceback (most recent call last):
...
ValueError: positive number expected

Python descriptors provide a powerful tool for adding function-like behavior on top of your
instance attributes. They can help you remove repetition from your code, making it cleaner and
more maintainable. They also promote code reuse.
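To see the reuse in action, here’s a sketch of a hypothetical Rectangle class that manages two attributes with the same descriptor. The snippet repeats PositiveNumber so that it runs on its own, with two small changes that aren’t in the version above: the instance is None check in .__get__(), which returns the descriptor itself when you access the attribute on the class, and a tuple of types in isinstance():

```python
class PositiveNumber:
    def __set_name__(self, owner, name):
        # Python calls this when the owner class is created
        self._name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self  # Accessed on the class, not on an instance
        return instance.__dict__[self._name]

    def __set__(self, instance, value):
        # (int, float) is equivalent to int | float on Python 3.10+
        if not isinstance(value, (int, float)) or value <= 0:
            raise ValueError("positive number expected")
        instance.__dict__[self._name] = value


class Rectangle:
    width = PositiveNumber()
    height = PositiveNumber()

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def calculate_area(self):
        return self.width * self.height


rectangle = Rectangle(3, 4)
area = rectangle.calculate_area()

try:
    rectangle.height = -1
except ValueError as error:
    message = str(error)
```

Both attributes get validation from a single class, and adding a third validated attribute would take one more line in Rectangle.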


Lightweight Classes With .__slots__

In a Python class, using the .__slots__ attribute can help you reduce the memory footprint of
the corresponding instances. This attribute prevents the automatic creation of an instance
.__dict__. Using .__slots__ is particularly handy when you have a class with a fixed set of
attributes, and you’ll use that class to create a large number of objects.

In the example below, you have a Point class with a .__slots__ attribute that consists of a
tuple of allowed attributes. Each attribute will represent a Cartesian coordinate:

>>> class Point:
...     __slots__ = ("x", "y")
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...

>>> point = Point(4, 8)
>>> point.__dict__
Traceback (most recent call last):
    ...
AttributeError: 'Point' object has no attribute '__dict__'

This Point class defines .__slots__ as a tuple with two items. Each item represents the name
of an instance attribute. So, they must be strings holding valid Python identifiers.

Note: Although .__slots__ can hold a list object, you should use a tuple object instead.
Changing the list after Python has processed the class body has no effect, so using a mutable
sequence there would be misleading.

Instances of your Point class don’t have a .__dict__, as the code shows. This feature makes
them memory-efficient. To illustrate this efficiency, you can measure the memory consumption
of an instance of Point. To do this, you can use the Pympler library, which you can install from
PyPI using the pip install pympler command.

Once you’ve installed Pympler with pip, then you can run the following code in your REPL:

>>> from pympler import asizeof

>>> asizeof.asizeof(Point(4, 8))
112

The asizeof() function from Pympler says that the object Point(4, 8) occupies 112 bytes in
your computer’s memory. Now get back to your REPL session and redefine Point without
providing a .__slots__ attribute. With this update in place, go ahead and run the memory check
again:

>>> class Point:
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...

>>> asizeof.asizeof(Point(4, 8))
528

The same object, Point(4, 8), now consumes 528 bytes of memory. This number is over four
times greater than what you got with the original implementation of Point. Imagine how much
memory .__slots__ would save if you had to create a million points in your code.
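If you’d rather not install a third-party package, then you can get a rough comparison with the standard-library tracemalloc module. This is a sketch under assumptions: the class names and the measure() helper are made up, and the exact byte counts depend on your Python version and platform, so no specific output is shown:

```python
import tracemalloc


class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def measure(cls, count=100_000):
    """Return the bytes still allocated after creating count instances."""
    tracemalloc.start()
    points = [cls(i, i) for i in range(count)]  # Keep the objects alive
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current


slotted_bytes = measure(SlottedPoint)
regular_bytes = measure(RegularPoint)
print(f"With .__slots__:    {slotted_bytes:,} bytes")
print(f"Without .__slots__: {regular_bytes:,} bytes")
```

Both measurements include the list and the integer objects, so the difference between the two numbers comes from the instances themselves.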

The .__slots__ attribute adds a second interesting behavior to your custom classes. It prevents
you from adding new instance attributes dynamically:

>>> class Point:
...     __slots__ = ("x", "y")
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...

>>> point = Point(4, 8)
>>> point.z = 16
Traceback (most recent call last):
    ...
AttributeError: 'Point' object has no attribute 'z'

Adding a .__slots__ to your classes allows you to provide a series of allowed attributes. This
means that you won’t be able to add new attributes to your instances dynamically. If you try to
do it, then you’ll get an AttributeError exception.

A word of caution is in order, as many of Python’s built-in mechanisms implicitly assume that
objects have the .__dict__ attribute. When you use .__slots__, you break that
assumption, which means that some of those mechanisms might not work as expected anymore.
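For instance, the built-in vars() function relies on .__dict__, so it fails for slotted instances. The sketch below shows that failure and one possible workaround that builds the dictionary from the names in .__slots__:

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


point = Point(4, 8)

try:
    vars(point)
except TypeError:
    vars_failed = True  # vars() needs an object with a .__dict__
else:
    vars_failed = False

# Workaround: collect the values by iterating over the slot names
as_dict = {name: getattr(point, name) for name in point.__slots__}
```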

Providing Behavior With Methods


Python classes allow you to bundle data and behavior together in a single entity through
attributes and methods, respectively. You’ll use the data to define the object’s current state and
the methods to operate on that data or state.

A method is just a function that you define inside a class. By defining it there, you make the
relationship between the class and the method explicit and clear.

Because they’re just functions, methods can take arguments and return values as functions do.
However, the syntax for calling a method is a bit different from the syntax for calling a function.
To call a method, you need to specify the class or instance in which that method is defined. To
do this, you need to use dot notation. Remember that classes are namespaces, and their members
aren’t directly accessible from the outside.

In a Python class, you can define three different types of methods:

1. Instance methods, which take the current instance, self, as their first argument
2. Class methods, which take the current class, cls, as their first argument
3. Static methods, which take neither the class nor the instance
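As a quick preview, the sketch below shows the three kinds of methods side by side in a hypothetical Temperature class. The class and its method names are made up for illustration:

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        # Instance method: works on the current instance through self
        return self.celsius * 9 / 5 + 32

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Class method: receives the class and can build new instances
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_valid(celsius):
        # Static method: receives neither the instance nor the class
        return celsius >= -273.15


boiling = Temperature(100)
freezing = Temperature.from_fahrenheit(32)
```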

Every type of method has its own characteristics and specific use cases. Instance methods are, by
far, the most common methods that you’ll use in your custom Python classes.

Note: To learn more about instance, class, and static methods, check out Python’s Instance,
Class, and Static Methods Demystified.

In the following sections, you’ll dive into how each of these methods works and how to create
them in your classes. To get started, you’ll begin with instance methods, which are the most
common methods that you’ll define in your classes.


Instance Methods With self

In a class, an instance method is a function that takes the current instance as its first argument. In
Python, this first argument is called self by convention.

Note: Naming the current instance self is a strong convention in Python. It’s so strong that it
may look like self is one of the Python keywords. However, you could use any other name
instead of self.

Even though it’s possible to use any name for the first argument of an instance method, using
self is definitely the right choice because it’ll make your code look like Python code in the eyes
of other developers.

The self argument holds a reference to the current instance, allowing you to access that instance
from within methods. More importantly, through self, you can access and modify the instance
attributes and call other methods within the class.

To define an instance method, you just need to write a regular function that accepts self as its
first argument. Up to this point, you’ve already written some instance methods. To continue
learning about them, turn back to your Car class.

Now say that you want to add methods to start, stop, accelerate, and brake the car. To kick things
off, you’ll begin by writing the .start() and .stop() methods:

# car.py

class Car:
    def __init__(self, make, model, year, color):
        self.make = make
        self.model = model
        self.year = year
        self.color = color
        self.started = False
        self.speed = 0
        self.max_speed = 200

    def start(self):
        print("Starting the car...")
        self.started = True

    def stop(self):
        print("Stopping the car...")
        self.started = False

The .start() and .stop() methods are pretty straightforward. They take the current instance,
self, as their first argument. Inside the methods, you use self to access the .started attribute
on the current instance using dot notation. Then you change the current value of this attribute to
True in .start() and to False in .stop(). Both methods print informative messages to
illustrate what your car is doing.

Note: Instance methods should act on instance attributes by either accessing them or changing
their values. If you find yourself writing an instance method that doesn’t use self in its body,
then that may not be an instance method. In this case, you should probably use a class method or
a static method, depending on your specific needs.

Now you can add the .accelerate() and .brake() methods, which will be a bit more
complex:

# car.py

class Car:
    def __init__(self, make, model, year, color):
        self.make = make
        self.model = model
        self.year = year
        self.color = color
        self.started = False
        self.speed = 0
        self.max_speed = 200

    # ...

    def accelerate(self, value):
        if not self.started:
            print("Car is not started!")
            return
        if self.speed + value <= self.max_speed:
            self.speed += value
        else:
            self.speed = self.max_speed
        print(f"Accelerating to {self.speed} km/h...")

    def brake(self, value):
        if self.speed - value >= 0:
            self.speed -= value
        else:
            self.speed = 0
        print(f"Braking to {self.speed} km/h...")

The .accelerate() method takes an argument that represents the increment of speed that
occurs when you call the method. For simplicity, you haven’t set any validation for the input
increment of speed, so using negative values can cause issues. Inside the method, you first check
whether the car’s engine is started, returning immediately if it’s not.

Then you check if the incremented speed is less than or equal to the allowed maximum speed for
your car. If this condition is true, then you increment the speed. Otherwise, you set the speed to
the allowed limit.

The .brake() method works similarly. This time, you compare the decremented speed to 0
because cars can’t have a negative speed. If the condition is true, then you decrement the speed
according to the input argument. Otherwise, you set the speed to its lower limit, 0. Again, you
have no validation on the input decrement of speed, so be careful with negative values.
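If you wanted to close that gap, then one option is to reject negative inputs up front. The following sketch uses a stripped-down, hypothetical version of Car that keeps only the attributes these two methods need. Raising a ValueError is an assumption about how you’d want to report the problem:

```python
class Car:
    def __init__(self, max_speed=200):
        self.started = True  # Assume a running engine to keep the sketch short
        self.speed = 0
        self.max_speed = max_speed

    def accelerate(self, value):
        if value < 0:
            raise ValueError("speed increment must be non-negative")
        if not self.started:
            print("Car is not started!")
            return
        # min() caps the speed at the allowed maximum
        self.speed = min(self.speed + value, self.max_speed)
        print(f"Accelerating to {self.speed} km/h...")

    def brake(self, value):
        if value < 0:
            raise ValueError("speed decrement must be non-negative")
        # max() keeps the speed from dropping below zero
        self.speed = max(self.speed - value, 0)
        print(f"Braking to {self.speed} km/h...")
```

The min() and max() calls replace the if and else branches of the original methods without changing the resulting speed.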

Your class now has four methods that operate on and with its attributes. It’s time for a drive:

>>> from car import Car

>>> ford_mustang = Car("Ford", "Mustang", 2022, "Black")


>>> ford_mustang.start()
Starting the car...

>>> ford_mustang.accelerate(100)
Accelerating to 100 km/h...
>>> ford_mustang.brake(50)
Braking to 50 km/h...
>>> ford_mustang.brake(80)
Braking to 0 km/h...
>>> ford_mustang.stop()
Stopping the car...

>>> ford_mustang.accelerate(100)
Car is not started!

Great! Your Car class works nicely! You can start your car’s engine, increment the speed, brake,
and stop the car’s engine. You’re also confident that no one can increment the car’s speed if the
car’s engine is stopped. How does that sound for minimal modeling of a car?

It’s important to note that when you call an instance method on a concrete instance like
ford_mustang using dot notation, you don’t have to provide a value for the self argument.
Python takes care of that step for you. It automatically passes the target instance to self. So, you
only have to provide the rest of the arguments.

However, you can manually provide the desired instance if you want. To do this, you need to call
the method on the class:

>>> ford_mustang = Car("Ford", "Mustang", 2022, "Black")

>>> Car.start(ford_mustang)
Starting the car...

>>> Car.accelerate(ford_mustang, 100)
Accelerating to 100 km/h...
>>> Car.brake(ford_mustang, 100)
Braking to 0 km/h...
>>> Car.stop(ford_mustang)
Stopping the car...

>>> Car.start()
Traceback (most recent call last):
...
TypeError: Car.start() missing 1 required positional argument: 'self'

In this example, you call instance methods directly on the class. For this type of call to work, you
need to explicitly provide an appropriate value to the self argument. In this example, that value
is the ford_mustang instance. Note that if you don’t provide a suitable instance, then the call
fails with a TypeError. The error message is pretty clear. There’s a missing positional argument,
self.
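To see that both call styles end up running the same function, you can inspect the bound method object that dot notation creates. This is a minimal sketch that uses a hypothetical Greeter class rather than Car:

```python
class Greeter:
    """Hypothetical class used only to compare the two call styles."""

    def greet(self, name):
        return f"Hello, {name}!"


greeter = Greeter()

bound = greeter.greet  # Bound method: self is already filled in
plain = Greeter.greet  # Plain function: self must be passed explicitly

via_instance = bound("Ana")
via_class = plain(greeter, "Ana")
```

The bound method keeps a reference to its instance in .__self__ and to the underlying function in .__func__, which is how Python can supply self for you.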


Special Methods and Protocols

Python supports what it calls special methods, which are also known as dunder or magic
methods. These methods are typically instance methods, and they’re a fundamental part of
Python’s internal class mechanism. They have an important feature in common: Python calls
them automatically in response to specific operations.

Python uses these methods for many different tasks. They provide a great set of tools that will
allow you to unlock the power of classes in Python.

You’ll recognize special methods because their names start and end with a double underscore,
which is the origin of their other name, dunder methods (double underscore). Arguably,
.__init__() is the most common special method in Python classes. As you already know, this
method works as the instance initializer. Python automatically calls it when you call a class
constructor.

You’ve already written a couple of .__init__() methods. So, you’re ready to learn about other
common and useful special methods. For example, the .__str__() and .__repr__() methods
provide string representations for your objects.

Go ahead and update your Car class to add these two methods:

# car.py

class Car:
    # ...

    def __str__(self):
        return f"{self.make}, {self.model}, {self.color}: ({self.year})"

    def __repr__(self):
        return (
            f"{type(self).__name__}"
            f'(make="{self.make}", '
            f'model="{self.model}", '
            f"year={self.year}, "
            f'color="{self.color}")'
        )

The .__str__() method provides what’s known as the informal string representation of an
object. This method must return a string that represents the object in a user-friendly manner. You
can access an object’s informal string representation using either str() or print().

The .__repr__() method is similar, but it must return a string that allows you to re-create the
object if possible. So, this method returns what’s known as the formal string representation of
an object. This string representation is mostly targeted at Python programmers, and it’s pretty
useful when you’re working in an interactive REPL session.

In interactive mode, Python falls back to calling .__repr__() when you access an object or
evaluate an expression, issuing the formal string representation of the resulting object. In script
mode, you can access an object’s formal string representation using the built-in repr() function.

Run the following code to give your new methods a try. Remember that you need to restart your
REPL or reload car.py:

>>> from car import Car

>>> toyota_camry = Car("Toyota", "Camry", 2022, "Red")

>>> str(toyota_camry)
'Toyota, Camry, Red: (2022)'
>>> print(toyota_camry)
Toyota, Camry, Red: (2022)

>>> toyota_camry
Car(make="Toyota", model="Camry", year=2022, color="Red")
>>> repr(toyota_camry)
'Car(make="Toyota", model="Camry", year=2022, color="Red")'

When you use an instance of Car as an argument to str() or print(), you get a user-friendly
string representation of the car at hand. This informal representation comes in handy when you
need your programs to present your users with information about specific objects.

If you access an instance of Car directly in a REPL session, then you get a formal string
representation of the object. You can copy and paste this representation to re-create the object in
an appropriate environment. That’s why this string representation is intended to be useful for
developers, who can take advantage of it while debugging and testing their code.
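The following sketch illustrates two related details, using a hypothetical Color class that isn’t part of the tutorial’s code. First, when a class defines .__repr__() but not .__str__(), str() falls back to the formal representation. Second, a well-formed formal representation can be evaluated to re-create the object:

```python
class Color:
    """Hypothetical class that defines only .__repr__()."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # !r inserts repr(self.name), so the quotes are included
        return f"{type(self).__name__}(name={self.name!r})"


red = Color("red")

formal = repr(red)
informal = str(red)  # No .__str__(), so str() falls back to .__repr__()
clone = eval(formal)  # Only reasonable because you control the string
```

Using eval() here is just a quick way to demonstrate the round trip. You should avoid calling it on strings that you don’t fully trust.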

Python protocols are another fundamental topic that’s closely related to special methods.
Protocols consist of one or more special methods that support a given feature or functionality.
Common examples of protocols include:

Protocol         Provided Feature                              Special Methods
Iterator         Allows you to create iterator objects         .__iter__() and .__next__()
Iterable         Makes your objects iterable                   .__iter__()
Descriptor       Lets you write managed attributes             .__get__() and optionally .__set__(), .__delete__(), and .__set_name__()
Context manager  Enables an object to work on with statements  .__enter__() and .__exit__()

Of course, Python has many other protocols that support cool features of the language. You
already coded an example of using the descriptor protocol in the Property and Descriptor-Based
Attributes section.

Here’s an example of a minimal ThreeDPoint class that implements the iterable protocol:

>>> class ThreeDPoint:
...     def __init__(self, x, y, z):
...         self.x = x
...         self.y = y
...         self.z = z
...     def __iter__(self):
...         yield from (self.x, self.y, self.z)
...

>>> list(ThreeDPoint(4, 8, 16))
[4, 8, 16]

This class takes three arguments representing the space coordinates of a given point. The
.__iter__() method is a generator function that returns an iterator. The resulting iterator yields
the coordinates of ThreeDPoint on demand.

The call to list() iterates over the attributes .x, .y, and .z, returning a list object. You don’t
need to call .__iter__() directly. Python calls it automatically when you use an instance of
ThreeDPoint in an iteration.
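The other protocols work in a similar way. As a rough sketch of the context manager protocol, consider the hypothetical ManagedResource class below, which only records the order in which Python calls its special methods:

```python
class ManagedResource:
    """Hypothetical resource that records its life cycle."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self  # This value is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # Don't suppress exceptions raised in the block


with ManagedResource() as resource:
    resource.events.append("work")
```

Python calls .__enter__() when the with block starts and .__exit__() when it ends, even if the block raises an exception. You never call either method yourself.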


Class Methods With @classmethod

You can also add class methods to your custom Python classes. A class method is a method that
takes the class object as its first argument instead of taking self. In this case, the argument
should be called cls, which is also a strong convention in Python. So, you should stick to it.

You can create class methods using the @classmethod decorator. Providing your classes with
multiple constructors is one of the most common use cases of class methods in Python.

For example, say you want to add an alternative constructor to your ThreeDPoint so that you
can quickly create points from tuples or lists of coordinates:

# point.py

class ThreeDPoint:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __iter__(self):
        yield from (self.x, self.y, self.z)

    @classmethod
    def from_sequence(cls, sequence):
        return cls(*sequence)

    def __repr__(self):
        return f"{type(self).__name__}({self.x}, {self.y}, {self.z})"

In the .from_sequence() class method, you take a sequence of coordinates as an argument,
create a ThreeDPoint object from it, and return the object to the caller. To create the new object,
you use the cls argument, which holds a reference to the current class. Python passes this
reference to your method automatically.

Here’s how this class method works:

>>> from point import ThreeDPoint

>>> ThreeDPoint.from_sequence((4, 8, 16))


ThreeDPoint(4, 8, 16)

>>> point = ThreeDPoint(7, 14, 21)


>>> point.from_sequence((3, 6, 9))
ThreeDPoint(3, 6, 9)

In this example, you use the ThreeDPoint class directly to access the class method
.from_sequence(). Note that you can also access the method using a concrete instance, like
point in the example. In each of the calls to .from_sequence(), you’ll get a completely new
instance of ThreeDPoint. However, class methods should be accessed through the
corresponding class name for better clarity and to avoid confusion.
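Using cls instead of hard-coding the class name pays off once subclasses enter the picture. In the sketch below, LabeledPoint is a hypothetical subclass added only for illustration. Because .from_sequence() builds the object through cls, calling it on the subclass returns an instance of the subclass:

```python
class ThreeDPoint:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def from_sequence(cls, sequence):
        # cls is whichever class the method was called on
        return cls(*sequence)


class LabeledPoint(ThreeDPoint):
    """Hypothetical subclass, used only for this illustration."""


base_point = ThreeDPoint.from_sequence([1, 2, 3])
sub_point = LabeledPoint.from_sequence([1, 2, 3])
```

If the method returned ThreeDPoint(*sequence) instead, then sub_point would be a plain ThreeDPoint, which is rarely what you want.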

Static Methods With @staticmethod

Your Python classes can also have static methods. These methods don’t take the instance or the
class as an argument. So, they’re regular functions defined within a class. You could’ve also
defined them outside the class as stand-alone functions.

You’ll typically define a static method instead of a regular function outside the class when that
function is closely related to your class, and you want to bundle it together for convenience or for
consistency with your code’s API. Remember that calling a function is a bit different from
calling a method. To call a method, you need to specify a class or object that provides that
method.

If you want to write a static method in one of your custom classes, then you need to use the
@staticmethod decorator. Check out the .show_intro_message() method below:

# point.py

class ThreeDPoint:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __iter__(self):
        yield from (self.x, self.y, self.z)

    @classmethod
    def from_sequence(cls, sequence):
        return cls(*sequence)

    @staticmethod
    def show_intro_message(name):
        print(f"Hey {name}! This is your 3D Point!")

    def __repr__(self):
        return f"{type(self).__name__}({self.x}, {self.y}, {self.z})"

The .show_intro_message() static method takes a name as an argument and prints a message
on the screen. Note that this is only a toy example of how to write static methods in your classes.

Static methods like .show_intro_message() don’t operate on the current instance, self, or the
current class, cls. They work as independent functions enclosed in a class. You’ll typically put
them inside a class when they’re closely related to that class but don’t necessarily affect the class
or its instances.

Here’s how the method works:

>>> from point import ThreeDPoint

>>> ThreeDPoint.show_intro_message("Pythonista")
Hey Pythonista! This is your 3D Point!

>>> point = ThreeDPoint(2, 4, 6)

>>> point.show_intro_message("Python developer")
Hey Python developer! This is your 3D Point!

As you already know, the .show_intro_message() method takes a name as an argument and
prints a message to your screen. Note that you can call the method using the class or any of its
instances. As with class methods, you should generally call static methods through the
corresponding class instead of one of its instances.

Getter and Setter Methods vs Properties

Programming languages like Java and C++ don’t expose attributes as part of their classes’ public
APIs. Instead, these programming languages make extensive use of getter and setter methods to
give you access to attributes.

Note: To dive deeper into the getter and setter pattern and how Python approaches it, check out
Getters and Setters: Manage Attributes in Python.

Using methods to access and update attributes promotes encapsulation. Encapsulation is a
fundamental OOP principle that recommends protecting an object’s state or data from the outside
world, preventing direct access. The object’s state should only be accessible through a public
interface consisting of getter and setter methods.

For example, say that you have a Person class with a .name instance attribute. You can make
.name a non-public attribute and provide getter and setter methods to access and change that
attribute:

# person.py

class Person:
    def __init__(self, name):
        self.set_name(name)

    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value

In this example, .get_name() is the getter method and allows you to access the underlying
._name attribute. Similarly, .set_name() is the setter method and allows you to change the
current value of ._name. The ._name attribute is non-public and is where the actual data is
stored.

Here’s how you can use your Person class:

>>> from person import Person

>>> jane = Person("Jane")


>>> jane.get_name()

'Jane'

>>> jane.set_name("Jane Doe")


>>> jane.get_name()
'Jane Doe'

Here, you create an instance of Person using the class constructor and "Jane" as the required
name. That means you can use the .get_name() method to access Jane’s name and the
.set_name() method to update it.

The getter and setter pattern is common in languages like Java and C++. Besides promoting
encapsulation and APIs centered on method calls, this pattern also allows you to quickly add
function-like behavior to your attributes without introducing breaking changes in your APIs.

However, this pattern is less popular in the Python community. In Python, it’s completely normal
to expose attributes as part of an object’s public API. If you ever need to add function-like
behavior on top of a public attribute, then you can turn it into a property instead of breaking the
API by replacing the attribute with a method.

Here’s how most Python developers would write the Person class:

# person.py

class Person:
    def __init__(self, name):
        self.name = name

This class doesn’t have getter and setter methods for the .name attribute. Instead, it exposes the
attribute as part of its API. So, you can use it directly:

>>> from person import Person

>>> jane = Person("Jane")


>>> jane.name
'Jane'

>>> jane.name = "Jane Doe"


>>> jane.name
'Jane Doe'

In this example, instead of using a setter method to change the value of .name, you use the
attribute directly in an assignment statement. This is common practice in Python code. If your
Person class evolves to a point where you need to add function-like behavior on top of .name,
then you can turn the attribute into a property.

For example, say that you need to store the attribute in uppercase letters. Then you can do
something like the following:

# person.py

class Person:
    def __init__(self, name):
        self.name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value.upper()

This class defines .name as a property with appropriate getter and setter methods. Python will
automatically call these methods, respectively, when you access or update the attribute’s value.
The setter method takes care of uppercasing the input value before assigning it back to ._name:

>>> from person import Person

>>> jane = Person("Jane")


>>> jane.name
'JANE'

>>> jane.name = "Jane Doe"


>>> jane.name
'JANE DOE'

Python properties allow you to add function-like behavior to your attributes while you continue
to use them as normal attributes instead of as methods. Note how you can still assign new values
to .name using an assignment instead of a method call. Running the assignment triggers the
setter method, which uppercases the input value.
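Because every assignment to .name goes through the setter, that method is also a natural place for validation. The following sketch extends the example under one assumption that the tutorial doesn’t make: that names must be non-empty strings.

```python
class Person:
    def __init__(self, name):
        self.name = name  # Goes through the setter below

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Assumed rule for this sketch: reject non-strings and blank names
        if not isinstance(value, str) or not value.strip():
            raise ValueError("name must be a non-empty string")
        self._name = value.upper()


jane = Person("Jane")
```

Since .__init__() assigns to self.name rather than to self._name, the same check also runs when you create the instance.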

Summarizing Class Syntax and Usage: A Complete Example


Up to this point, you’ve learned a lot about Python classes: how to create them, when and how to
use them in your code, and more. In this section, you’ll review that knowledge by writing a class
that integrates most of the syntax and features you’ve learned so far.

Your class will represent an employee of a given company and will implement attributes and
methods to manage some related tasks like keeping track of personal information and computing
the employee’s age. To kick things off, go ahead and fire up your favorite code editor or IDE and
create a file called employee.py. Then add the following code to it:

# employee.py

class Employee:
    company = "Example, Inc."

    def __init__(self, name, birth_date):
        self.name = name
        self.birth_date = birth_date

In this Employee class, you define a class attribute called .company. This attribute will hold the
company’s name, which is common to all employees on the payroll.

Then you define the initializer, .__init__(), which takes the employee’s name and birth date as
arguments. Remember that you must pass appropriate values for both arguments when you call
the class constructor, Employee().

Inside .__init__(), you define two public instance attributes to store the employee’s name and
birth date. These attributes will be part of the class API because they’re public attributes.

Now say that you want to turn .birth_date into a property to automatically convert the input
date in ISO format to a datetime object:

# employee.py

from datetime import datetime

class Employee:
    # ...

    @property
    def birth_date(self):
        return self._birth_date

    @birth_date.setter
    def birth_date(self, value):
        self._birth_date = datetime.fromisoformat(value)

Here, you define the .birth_date property through the @property decorator. The getter method
returns the content of ._birth_date. This non-public attribute will hold the concrete data.

To define the setter method, you use the @birth_date.setter decorator. In this method, you
assign a datetime.datetime object to ._birth_date. In this example, you don’t run any
validation on the input data, which should be a string holding the date in ISO format. You can
implement the validation as an exercise.
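If you want a starting point for that exercise, here’s one possible sketch. The specific rules, rejecting malformed input and dates in the future, as well as the error messages, are assumptions rather than requirements from the tutorial:

```python
from datetime import date, datetime


class Employee:
    def __init__(self, name, birth_date):
        self.name = name
        self.birth_date = birth_date

    @property
    def birth_date(self):
        return self._birth_date

    @birth_date.setter
    def birth_date(self, value):
        try:
            parsed = datetime.fromisoformat(value)
        except (TypeError, ValueError) as error:
            # TypeError: value isn't a string; ValueError: not ISO format
            raise ValueError(f"invalid ISO date: {value!r}") from error
        if parsed.date() > date.today():
            raise ValueError("birth date can't be in the future")
        self._birth_date = parsed


john = Employee("John Doe", "1998-12-04")
```

Converting both failure modes into a single ValueError keeps the error handling simple for callers, but you could just as well define a custom exception.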

Next, say you want to write a regular instance method to compute the employee’s age from their
birth date:

# employee.py

from datetime import datetime

class Employee:
    # ...

    def compute_age(self):
        today = datetime.today()
        age = today.year - self.birth_date.year
        birthday = datetime(
            today.year,
            self.birth_date.month,
            self.birth_date.day
        )
        if today < birthday:
            age -= 1
        return age

Here, .compute_age() is an instance method because it takes the current instance, self, as its
first argument. Inside the method, you compute the employee’s age using the .birth_date
property as a starting point.
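One edge case is worth keeping in mind. For an employee born on February 29, building datetime(today.year, 2, 29) raises a ValueError in any year that isn’t a leap year. A possible way around this, sketched below as a hypothetical stand-alone function rather than a change to the class, is to compare (month, day) tuples instead of constructing this year’s birthday:

```python
from datetime import date


def compute_age(birth_date, today):
    """Return the number of full years between birth_date and today."""
    age = today.year - birth_date.year
    # Tuples compare element by element: month first, then day
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        age -= 1
    return age


leap_day = date(2000, 2, 29)
age_before = compute_age(leap_day, date(2023, 2, 28))
age_after = compute_age(leap_day, date(2023, 3, 1))
```

Passing today as an argument is a design choice made for this sketch. It keeps the function deterministic and easy to test.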

Now say that you’ll often build instances of Employee from dictionaries containing the data of
your employees. You can add a convenient class method to quickly build objects that way:

# employee.py

from datetime import datetime

class Employee:
    # ...

    @classmethod
    def from_dict(cls, data_dict):
        return cls(**data_dict)

In this code snippet, you define a class method using the @classmethod decorator. The method
takes a dictionary object containing the data of a given employee. Then it builds an instance of
Employee using the cls argument and unpacking the dictionary.

Finally, you’ll add suitable .__str__() and .__repr__() special methods to make your class
friendly to users and developers, respectively:

# employee.py

from datetime import datetime

class Employee:
    # ...

    def __str__(self):
        return f"{self.name} is {self.compute_age()} years old"

    def __repr__(self):
        return (
            f"{type(self).__name__}("
            f"name='{self.name}', "
            f"birth_date='{self.birth_date.strftime('%Y-%m-%d')}')"
        )

The .__str__() method returns a string describing the current employee in a user-friendly
manner. Similarly, the .__repr__() method returns a string that will allow you to re-create the
current object, which is great from a developer’s perspective.

Here’s how you can use Employee in your code:

>>> from employee import Employee

>>> john = Employee("John Doe", "1998-12-04")


>>> john.company
'Example, Inc.'
>>> john.name
'John Doe'
>>> john.compute_age()
24
>>> print(john)
John Doe is 24 years old
>>> john
Employee(name='John Doe', birth_date='1998-12-04')

>>> jane_data = {"name": "Jane Doe", "birth_date": "2001-05-15"}


>>> jane = Employee.from_dict(jane_data)
>>> print(jane)
Jane Doe is 21 years old

Cool! Your Employee class works great so far! It allows you to represent employees, access their
attributes, and compute their ages. It also provides neat string representations that will make your
class look polished and reliable. Great job! Do you have any ideas of cool features that you could
add to Employee?

Debugging Python Classes


Debugging often represents a large portion of your coding time. You’ll probably spend long
hours tracking errors in the code that you’re working on and trying to fix them to make the code
more robust and reliable. When you start working with classes and objects in Python, you’re
likely to encounter some new exceptions.

For example, if you try to access an attribute or method that doesn’t exist, then you’ll get an
AttributeError:

>>> class Point:
...     def __init__(self, x, y):
...         self.x = x
...         self.y = y
...

>>> point = Point(4, 8)

>>> point.z
Traceback (most recent call last):
    ...
AttributeError: 'Point' object has no attribute 'z'

The Point class doesn’t define a .z instance attribute, so you get an AttributeError if you try
to access that attribute.

You’ll find a few exceptions that can occur when working with Python classes. These are some
of the most common ones:

• An AttributeError occurs when the specified object doesn’t define the attribute or
  method that you’re trying to access. Take, for example, accessing .z on the Point
  instance defined in the above example.
• A TypeError occurs when you apply an operation or function to an object that doesn’t
  support that operation. For example, consider calling the built-in len() function with a
  number as an argument.
• A NotImplementedError occurs when you call a method that a base class declares but
  leaves unimplemented, and the concrete subclass hasn’t overridden it. You’ll learn more
  about this exception in the section Creating Abstract Base Classes (ABC) and Interfaces.
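To watch these three exceptions appear in practice, you can run a small experiment like the one below. The Shape and Square classes and the error_name() helper are hypothetical names made up for this sketch:

```python
class Shape:
    def area(self):
        # Convention: signal that subclasses must override this method
        raise NotImplementedError("subclasses must implement .area()")


class Square(Shape):
    """Hypothetical subclass that forgets to override .area()."""

    def __init__(self, side):
        self.side = side


def error_name(func):
    """Call func() and return the name of the exception it raises."""
    try:
        func()
    except Exception as error:
        return type(error).__name__
    return None


results = [
    error_name(lambda: Square(2).color),   # Missing attribute
    error_name(lambda: len(42)),           # Unsupported operation
    error_name(lambda: Square(2).area()),  # Method not overridden
]
```

After running this code, results holds the names AttributeError, TypeError, and NotImplementedError, in that order.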

These are just a few examples of exceptions that can occur when you’re working with Python
classes. You’ll also find some common mistakes that people sometimes make when they start to
write their own classes:

• Forgetting to include the self argument in instance methods
• Forgetting to instantiate the class by calling its constructor with appropriate arguments
• Confusing and misusing class and instance attributes
• Not following or respecting naming conventions for members
• Accessing non-public members from outside the containing class
• Overusing and misusing inheritance

These are just a few common mistakes that people might make when they’re getting started with
Python classes. From this list, you haven’t learned about inheritance yet. Don’t worry about it for
now. Inheritance is an advanced topic that you’ll study later in this tutorial.
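Confusing class and instance attributes deserves a quick illustration because the resulting bug can be subtle. In the following sketch, Team and FixedTeam are hypothetical classes. The first one stores a mutable list as a class attribute, so every instance ends up sharing it:

```python
class Team:
    members = []  # Class attribute: one list shared by every instance

    def add(self, name):
        self.members.append(name)  # Mutates the shared list


class FixedTeam:
    def __init__(self):
        self.members = []  # Instance attribute: one list per instance

    def add(self, name):
        self.members.append(name)


team_a, team_b = Team(), Team()
team_a.add("Ana")

fixed_a, fixed_b = FixedTeam(), FixedTeam()
fixed_a.add("Ana")
```

After this code runs, team_b.members also contains "Ana", while fixed_b.members is still empty.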

Exploring Specialized Classes From the Standard Library


In the Python standard library, you’ll find many tools that solve different problems and deal with
different challenges. Among all these tools, you’ll find a few that will make you more productive
when writing custom classes.

For example, if you want a tool that saves you from writing a lot of class-related boilerplate
code, then you can take advantage of data classes and the dataclasses module.

Similarly, if you’re looking for a tool that allows you to quickly create class-based enumerations
of constants, then you can turn your eye to the enum module and its different types of
enumeration classes.

In the following sections, you’ll learn the basics of using data classes and enumerations to
efficiently write robust, reliable, and specialized classes in Python.

Data Classes

Python’s data classes specialize in storing data. However, they’re also code generators that
produce a lot of class-related boilerplate code for you behind the scenes.

For example, if you use the data class infrastructure to write a custom class, then you won’t have
to implement special methods like .__init__(), .__repr__(), and .__eq__(). The data class will
write them for you. It can also generate .__hash__() when you ask for it, for example, by passing
frozen=True to the decorator. More importantly, the data class will write these methods
applying best practices and avoiding potential errors.

Note: To learn more about data classes in Python, check out Data Classes in Python 3.7+
(Guide).

As you already know, special methods support important functionalities in Python classes. In the
case of data classes, you’ll have accurate string representation, comparison capabilities, and
more. Hashability is available as well if you make the data class immutable with frozen=True.

Even though the name data class may suggest that this type of class is limited to containing data,
it also offers methods. So, data classes are like regular classes but with superpowers.

To create a data class, go ahead and import the @dataclass decorator from the dataclasses
module. You’ll use this decorator in the definition of your class. This time, you won’t write an
.__init__() method. You’ll just define data fields as class attributes with type hints.

For example, here’s how you can write the ThreeDPoint class as a data class:

# point.py

from dataclasses import dataclass

@dataclass
class ThreeDPoint:
    x: int | float
    y: int | float
    z: int | float

    @classmethod
    def from_sequence(cls, sequence):
        return cls(*sequence)

    @staticmethod
    def show_intro_message(name):
        print(f"Hey {name}! This is your 3D Point!")

This new implementation of ThreeDPoint uses Python’s @dataclass decorator to turn the
regular class into a data class. Instead of defining an .__init__() method, you list the instance
attributes with their corresponding types. The data class will take care of writing a proper
initializer for you. Note that you don’t define .__iter__() or .__repr__() either.

Note: Data classes are pretty flexible when it comes to defining their fields or attributes. You can
declare them with the type annotation syntax. You can initialize them with a sensible default
value. You can also combine both approaches depending on your needs:

from dataclasses import dataclass

@dataclass
class ThreeDPoint:
    x: int | float
    y = 0.0
    z: int | float = 0.0

In this code snippet, you declare the first attribute using the type annotation syntax. The second
attribute has a default value with no type annotation. Finally, the third attribute has both type
annotation and a default value. However, when you don’t specify a type hint for an attribute,
@dataclass doesn’t treat it as a field. It stays a plain class attribute, so generated methods
like .__init__() and .__repr__() ignore it.
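You can confirm which attributes became fields with the fields() function from the dataclasses module. In this sketch, the annotations are simplified to float, which is an assumption made to keep the example short:

```python
from dataclasses import dataclass, fields


@dataclass
class ThreeDPoint:
    x: float
    y = 0.0  # No annotation: plain class attribute, not a field
    z: float = 0.0


field_names = [field.name for field in fields(ThreeDPoint)]
point = ThreeDPoint(1.0, 3.0)  # The second argument goes to z, not y
```

Here, field_names only contains "x" and "z". The generated initializer accepts just those two arguments, and .y remains at 0.0 for every instance unless you reassign it.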

Once you’ve defined the data fields or attributes, you can start adding the methods that you need.
In this example, you keep the .from_sequence() class method and the
.show_intro_message() static method.

Go ahead and run the following code to check the additional functionality that @dataclass has
added to this version of ThreeDPoint:

>>> from dataclasses import astuple


>>> from point import ThreeDPoint

>>> point_1 = ThreeDPoint(1.0, 2.0, 3.0)


>>> point_1
ThreeDPoint(x=1.0, y=2.0, z=3.0)
>>> astuple(point_1)
(1.0, 2.0, 3.0)

>>> point_2 = ThreeDPoint(2, 3, 4)


>>> point_1 == point_2
False

>>> point_3 = ThreeDPoint(1, 2, 3)


>>> point_1 == point_3
True

Your ThreeDPoint class works pretty well! It provides a suitable string representation with an
automatically generated .__repr__() method. You can iterate over the fields using the
astuple() function from the dataclasses module. Finally, you can compare two instances of
the class for equality (==). As you can conclude, this new version of ThreeDPoint has saved you
from writing several lines of tricky boilerplate code.
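One detail about hashing is worth checking for yourself. With its default settings, @dataclass generates .__eq__() but sets .__hash__ to None, so instances can’t be used in sets or as dictionary keys. Passing frozen=True makes instances immutable and hashable. The class names in this sketch are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class MutablePoint:
    x: float
    y: float


@dataclass(frozen=True)
class FrozenPoint:
    x: float
    y: float


# With eq=True and frozen=False, the decorator sets __hash__ to None
mutable_is_hashable = MutablePoint.__hash__ is not None

# Equal frozen instances hash alike, so the set keeps only one of them
unique_points = {FrozenPoint(1.0, 2.0), FrozenPoint(1.0, 2.0)}
```

The trade-off is that you can’t assign to the fields of a frozen instance after creating it.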

Enumerations

An enumeration, or just enum, is a data type that you’ll find in several programming languages.
Enums allow you to create sets of named constants, which are known as members and can be
accessed through the enumeration itself.

Python doesn’t have a built-in enum data type. Fortunately, Python 3.4 introduced the enum
module to provide the Enum class for supporting general-purpose enumerations.

Days of the week, months and seasons of the year, HTTP status codes, colors in a traffic light,
and pricing plans of a web service are all great examples of constants that you can group in an
enum. In short, you can use enums to represent variables that can take one of a limited set of
possible values.

The Enum class, among other similar classes in the enum module, allows you to quickly and
efficiently create custom enumerations or groups of similar constants with neat features that you
don’t have to code yourself. Apart from member constants, enums can also have methods to
operate with those constants.

Note: To learn more about how to create and use enumerations in your Python code, check out
Build Enumerations of Constants With Python’s Enum.

To define a custom enumeration, you can subclass the Enum class. Here’s an example of an
enumeration that groups the days of the week:

>>> from enum import Enum

>>> class WeekDay(Enum):
...     MONDAY = 1
...     TUESDAY = 2
...     WEDNESDAY = 3
...     THURSDAY = 4
...     FRIDAY = 5
...     SATURDAY = 6
...     SUNDAY = 7
...

In this code example, you define WeekDay by subclassing Enum from the enum module. This
specific enum groups seven constants representing the days of the week. These constants are the
enum members. Because they’re constants, you should follow the convention for naming any
constant in Python: uppercase letters and, if applicable, underscores between words.

Enumerations have a few cool features that you can take advantage of. For example, their
members are strict constants, so you can’t change their values. They’re also iterable by default:

>>> WeekDay.MONDAY = 0
Traceback (most recent call last):
...
AttributeError: cannot reassign member 'MONDAY'

>>> list(WeekDay)

[
<WeekDay.MONDAY: 1>,
<WeekDay.TUESDAY: 2>,
<WeekDay.WEDNESDAY: 3>,
<WeekDay.THURSDAY: 4>,
<WeekDay.FRIDAY: 5>,
<WeekDay.SATURDAY: 6>,
<WeekDay.SUNDAY: 7>
]

If you try to change the value of an enum member, then you get an AttributeError. So, enum
members are strictly constants. You can iterate over the members directly because enumerations
support iteration by default.

You can directly access their members using different syntax:

>>> # Dot notation


>>> WeekDay.MONDAY
<WeekDay.MONDAY: 1>

>>> # Call notation


>>> WeekDay(2)
<WeekDay.TUESDAY: 2>

>>> # Dictionary notation


>>> WeekDay["WEDNESDAY"]
<WeekDay.WEDNESDAY: 3>

In the first example, you access an enum member using dot notation, which is pretty intuitive
and readable. In the second example, you access a member by calling the enumeration with that
member’s value as an argument. Finally, you use a dictionary-like syntax to access another
member by name.
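The call and dictionary notations fail in different ways when the lookup doesn’t match any member. The sketch below uses a shortened WeekDay enum and a hypothetical lookup_error() helper to capture the exception type in each case:

```python
from enum import Enum


class WeekDay(Enum):
    MONDAY = 1
    TUESDAY = 2


def lookup_error(func):
    """Call func() and return the name of the exception it raises."""
    try:
        func()
    except Exception as error:
        return type(error).__name__
    return None


by_value = lookup_error(lambda: WeekDay(9))         # No member has value 9
by_name = lookup_error(lambda: WeekDay["FUNDAY"])   # No member has this name
same_member = WeekDay(1) is WeekDay["MONDAY"] is WeekDay.MONDAY
```

A failed lookup by value raises a ValueError, while a failed lookup by name raises a KeyError. All three notations return the very same member object when the lookup succeeds.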

If you want fine-grained access to a member’s components, then you can use the .name and
.value attributes, which are pretty handy in the context of iteration:

>>> WeekDay.THURSDAY.name
'THURSDAY'
>>> WeekDay.THURSDAY.value
4

>>> for day in WeekDay:
...     print(day.name, "->", day.value)
...
MONDAY -> 1
TUESDAY -> 2
WEDNESDAY -> 3
THURSDAY -> 4
FRIDAY -> 5
SATURDAY -> 6
SUNDAY -> 7

In these examples, you access the .name and .value attributes of specific members of WeekDay.
These attributes provide access to each member’s components: its name and its value.

Finally, you can also add custom behavior to your enumerations. To do that, you can use
methods as you’d do with regular classes:

# week.py

from enum import Enum

class WeekDay(Enum):
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5
    SATURDAY = 6
    SUNDAY = 7

    @classmethod
    def favorite_day(cls):
        return cls.FRIDAY

    def __str__(self):
        return f"Current day: {self.name}"

In this version of WeekDay, which you save to week.py, you add a class method called
.favorite_day(). This method will just return your favorite day of the week, which is
Friday, of course! Then you add a .__str__() method to provide a user-friendly string
representation for the current day.

Here’s how you can use these methods in your code:

>>> from week import WeekDay

>>> WeekDay.favorite_day()
<WeekDay.FRIDAY: 5>

>>> print(WeekDay.FRIDAY)
Current day: FRIDAY

You’ve added new functionality to your enumeration through class and instance methods. Isn’t
that cool?

Using Inheritance and Building Class Hierarchies


Inheritance is a powerful feature of object-oriented programming. It consists of creating
hierarchical relationships between classes, where child classes inherit attributes and methods
from their parent class. In Python, one class can have multiple parents or, more broadly,
ancestors.

This is called implementation inheritance, which allows you to reduce duplication and
repetition by code reuse. It can also make your code more modular, better organized, and more
scalable. However, classes also inherit the interface by becoming more specialized kinds of
their ancestors. In some cases, you’ll be able to use a child instance where an ancestor is
expected.
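
As a minimal sketch of that last point, consider the hypothetical Shape and Square classes below. These names are illustrative assumptions and aren't part of this tutorial's examples. The print_description() function is written against Shape, yet it accepts a Square just as well:

```python
class Shape:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"I am a {self.name}"


class Square(Shape):
    def __init__(self, side):
        super().__init__("square")
        self.side = side

    def area(self):
        return self.side**2


def print_description(shape):
    # Written with Shape in mind, but any subclass instance works too
    print(shape.describe())


square = Square(2)
print_description(square)  # Reuses the inherited Shape.describe()
print(isinstance(square, Shape))  # A Square is also a Shape
print(issubclass(Square, Shape))
```

Square reuses the implementation of .describe() and, at the same time, shares the interface of Shape, which is why you can pass it to code that expects the parent.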

In the following sections, you’ll learn how to use inheritance in Python. You’ll start with simple
inheritance and continue with more complex concepts. So, get ready! This is going to be fun!

Simple Inheritance

When you have a class that inherits from a single parent class, then you’re using single-base
inheritance or just simple inheritance. To make a Python class inherit from another, you need
to list the parent class’s name in parentheses after the child class’s name in the definition.

To make this clearer, here’s the syntax that you must use:

class Parent:
    # Parent's definition goes here...
    pass

class Child(Parent):
    # Child's definition goes here...
    pass

In this code snippet, Parent is the class you want to inherit from. Parent classes typically
provide generic and common functionality that you can reuse throughout multiple child classes.
Child is the class that inherits features and code from Parent. The class Child(Parent):
line shows the required syntax.

Note: In this tutorial, you’ll use the terms parent class, superclass, and base class
interchangeably to refer to the class that you inherit from.

Similarly, you’ll use the terms child class, derived class, and subclass to refer to classes that
inherit from other classes.

Here’s a practical example to get started with simple inheritance and how it works. Suppose
you’re building an app to track vehicles and routes. At first, the app will track cars and
motorcycles. You think of creating a Vehicle class and deriving two subclasses from it. One
subclass will represent a car, and the other will represent a motorcycle.

The Vehicle class will provide common attributes, such as .make, .model, and .year. It’ll also
provide the .start() and .stop() methods to start and stop the vehicle engine, respectively:

# vehicles.py

class Vehicle:
    def __init__(self, make, model, year):
        self.make = make
        self.model = model
        self.year = year
        self._started = False

    def start(self):
        print("Starting engine...")
        self._started = True

    def stop(self):
        print("Stopping engine...")
        self._started = False

In this code, you define the Vehicle class with attributes and methods that are common to all
your current vehicles. You can say that Vehicle provides a common interface for your vehicles.
You’ll inherit from this class to reuse this interface and its functionality in your subclasses.

Now you can define the Car and Motorcycle classes. Both of them will have some unique
attributes and methods specific to the vehicle type. For example, the Car will have a .num_seats
attribute and a .drive() method:

# vehicles.py

# ...

class Car(Vehicle):
    def __init__(self, make, model, year, num_seats):
        super().__init__(make, model, year)
        self.num_seats = num_seats

    def drive(self):
        print(f'Driving my "{self.make} - {self.model}" on the road')

    def __str__(self):
        return f'"{self.make} - {self.model}" has {self.num_seats} seats'

Your Car class uses Vehicle as its parent class. This means that Car will automatically inherit
the .make, .model, and .year attributes, as well as the non-public ._started attribute. It’ll also
inherit the .start() and .stop() methods.

Note: Like inheritance in nature, inheritance in OOP goes in a single direction, from the parents
to the children. In other words, children inherit from their parents and not the other way around.

The class defines a .num_seats attribute. As you already know, you should define and initialize
instance attributes in .__init__(). This requires you to provide a custom .__init__() method
in Car, which will shadow the superclass initializer.

How can you write an .__init__() method in Car and still guarantee that you initialize the
.make, .model, and .year attributes? That’s where the built-in super() function comes on the
scene. This function allows you to access members in the superclass, as its name suggests.

Note: To learn more about using super() in your classes, check out Supercharge Your Classes
With Python super().

In Car, you use super() to call the .__init__() method on Vehicle. Note that you pass the
input values for .make, .model, and .year so that Vehicle can initialize these attributes
correctly. After this call to super(), you add and initialize the .num_seats attribute, which is
specific to the Car class.
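
To see why the call to super() matters, here's a hedged sketch that uses three hypothetical stand-in classes, Base, Broken, and Fixed, which aren't part of vehicles.py. Broken provides its own .__init__() without calling the parent initializer, so the parent's attribute never gets set:

```python
class Base:
    def __init__(self, make):
        self.make = make


class Broken(Base):
    def __init__(self, make, num_seats):
        # Missing super().__init__(make), so .make is never assigned
        self.num_seats = num_seats


class Fixed(Base):
    def __init__(self, make, num_seats):
        super().__init__(make)  # Let Base initialize .make
        self.num_seats = num_seats


broken = Broken("Tesla", 5)
try:
    broken.make
except AttributeError as error:
    print(f"AttributeError: {error}")

fixed = Fixed("Tesla", 5)
print(fixed.make, fixed.num_seats)
```

Accessing .make on the Broken instance raises an AttributeError, while the Fixed instance has both attributes available.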

Finally, you write the .drive() method, which is also specific to Car. This method is just a
demonstrative example, so it only prints a message to your screen.

Now it’s time to define the Motorcycle class, which will inherit from Vehicle too. This class
will have a .num_wheels attribute and a .ride() method:

# vehicles.py

# ...

class Motorcycle(Vehicle):
    def __init__(self, make, model, year, num_wheels):
        super().__init__(make, model, year)
        self.num_wheels = num_wheels

    def ride(self):
        print(f'Riding my "{self.make} - {self.model}" on the road')

    def __str__(self):
        return f'"{self.make} - {self.model}" has {self.num_wheels} wheels'

Again, you call super() to initialize .make, .model, and .year. After that, you define and
initialize the .num_wheels attribute. Finally, you write the .ride() method. Again, this method
is just a demonstrative example.

With this code in place, you can start using Car and Motorcycle right away:

>>> from vehicles import Car, Motorcycle

>>> tesla = Car("Tesla", "Model S", 2022, 5)

>>> tesla.start()
Starting engine...
>>> tesla.drive()
Driving my "Tesla - Model S" on the road
>>> tesla.stop()
Stopping engine...
>>> print(tesla)
"Tesla - Model S" has 5 seats

>>> harley = Motorcycle("Harley-Davidson", "Iron 883", 2021, 2)

>>> harley.start()
Starting engine...
>>> harley.ride()
Riding my "Harley-Davidson - Iron 883" on the road
>>> harley.stop()
Stopping engine...
>>> print(harley)
"Harley-Davidson - Iron 883" has 2 wheels

Cool! Your Tesla and your Harley-Davidson work nicely. You can start their engines, drive or
ride them, and so on. Note how you can use both the inherited and specific attributes and
methods in both classes.

You’ll typically use single inheritance or inheritance in general when you have classes that share
common attributes and behaviors and want to reuse them in derived classes. So, inheritance is a
great tool for code reuse. Subclasses will inherit and reuse functionality from their parent.

Subclasses will frequently extend their parents’ interface with new attributes and methods. You
can use them as a new starting point to create another level of inheritance. This practice will lead
to the creation of class hierarchies.

Class Hierarchies

Using inheritance, you can design and build class hierarchies, also known as inheritance trees. A
class hierarchy is a set of closely related classes that are connected through inheritance and
arranged in a tree-like structure.

The class or classes at the top of the hierarchy are the base classes, while the classes below are
derived classes or subclasses. Inheritance-based hierarchies express an is-a-type-of relationship
between subclasses and their base classes.

Each level in the hierarchy will inherit attributes and behaviors from the above levels. Therefore,
classes at the top of the hierarchy are generic classes with common functionality, while classes
down the hierarchy are more specialized. They’ll inherit attributes and behaviors from their
superclasses and will also add their own.

Taxonomic classification of animals is a commonly used example to explain class hierarchies. In
this hierarchy, you’ll have a generic Animal class at the top. Below this class, you can have
subclasses like Mammal, Bird, Fish, and so on. These subclasses are more specific classes than
Animal and inherit the attributes and methods from it. They can also have their own attributes
and methods.

To continue with the hierarchy, you can subclass Mammal, Bird, and Fish and create derived
classes with even more specific characteristics. Here’s a short toy example:

# animals.py

class Animal:
    def __init__(self, name, sex, habitat):
        self.name = name
        self.sex = sex
        self.habitat = habitat

class Mammal(Animal):
    unique_feature = "Mammary glands"

class Bird(Animal):
    unique_feature = "Feathers"

class Fish(Animal):
    unique_feature = "Gills"

class Dog(Mammal):
    def walk(self):
        print("The dog is walking")

class Cat(Mammal):
    def walk(self):
        print("The cat is walking")

class Eagle(Bird):
    def fly(self):
        print("The eagle is flying")

class Penguin(Bird):
    def swim(self):
        print("The penguin is swimming")

class Salmon(Fish):
    def swim(self):
        print("The salmon is swimming")

class Shark(Fish):
    def swim(self):
        print("The shark is swimming")

At the top of the hierarchy, you have the Animal class. This is the base class of your hierarchy. It
has the .name, .sex, and .habitat attributes, which will be string objects. These attributes are
common to all animals.

Then you define the Mammal, Bird, and Fish classes by inheriting from Animal. These classes
have a .unique_feature class attribute that holds the distinguishing characteristic of each group
of animals.

Then you create concrete mammals like Dog and Cat. These classes have specific methods that
are common to all dogs and cats, respectively. Similarly, you define two classes that inherit from
Bird and two more that inherit from Fish.

Here’s a tree-like class diagram that will help you see the hierarchical relationship between
classes:

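[Class diagram: Animal at the top; Mammal, Bird, and Fish below it; Dog and Cat under Mammal,
Eagle and Penguin under Bird, and Salmon and Shark under Fish]
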
Each level in the hierarchy can—and typically will—add new attributes and functionality on top
of those that its parents already provide. If you walk through the diagram from top to bottom,
then you’ll move from generic to specialized classes.

These latter classes implement new methods that are specific to the class at hand. In this
example, the methods just print some information to the screen and automatically return None,
which is the null value in Python.
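
You can also inspect a hierarchy like this from code. The sketch below uses a trimmed-down copy of the animal classes with empty bodies, plus a hypothetical helper called hierarchy_lines(), which isn't part of animals.py. It relies on the built-in .__subclasses__() method, which returns the direct subclasses of a class:

```python
class Animal:
    pass


class Mammal(Animal):
    pass


class Bird(Animal):
    pass


class Dog(Mammal):
    pass


class Eagle(Bird):
    pass


def hierarchy_lines(cls, level=0):
    """Return one indented line per class, walking down the subclasses."""
    lines = ["    " * level + cls.__name__]
    for subclass in cls.__subclasses__():
        lines.extend(hierarchy_lines(subclass, level + 1))
    return lines


print("\n".join(hierarchy_lines(Animal)))
```

Each extra level of indentation in the output corresponds to one step down the hierarchy, from the generic Animal class to the more specialized ones.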

Note: You can create class diagrams to represent class hierarchies that are based on inheritance.
However, that’s not the only relationship that can appear between your classes.

With class diagrams, you can also represent other types of relationships, including:

• Composition, which expresses a strong has-a relationship. For example, a robot has an
arm. If the robot stops existing, then the arm stops existing too.
• Aggregation, which expresses a softer has-a relationship. For example, a university has
an instructor. If the university stops existing, the instructor doesn’t stop existing.
• Association, which expresses a uses-a relationship. For example, a student may be
associated with a course. They will use the course. This relationship is common in
database systems where you have one-to-one, one-to-many, and many-to-many
associations.

You’ll learn more about some of these types of relationships in the section called Using
Alternatives to Inheritance.
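
The three relationships from the note above can be sketched in plain Python. Keep in mind that Python doesn't enforce these distinctions, so they're design conventions rather than language features. All the classes below are hypothetical and only loosely follow the robot, university, and student examples:

```python
class Arm:
    def weld(self):
        return "Welding..."


class Robot:
    def __init__(self):
        # Composition: the robot creates and owns its arm
        self.arm = Arm()


class Instructor:
    def __init__(self, name):
        self.name = name


class University:
    def __init__(self, instructors):
        # Aggregation: instructors are created elsewhere and passed in
        self.instructors = list(instructors)


class Course:
    def __init__(self, title):
        self.title = title


class Student:
    def __init__(self, name):
        self.name = name

    def enroll(self, course):
        # Association: the student only uses the course object
        return f"{self.name} enrolled in {course.title}"


robot = Robot()
jane = Instructor("Jane")
university = University([jane])
del university  # The instructor object is still available afterward
print(jane.name)
print(Student("John").enroll(Course("Python OOP")))
```

The main difference lies in who creates and owns the related object: Robot builds its own Arm, University receives existing Instructor objects, and Student only works with a Course during a method call.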

That’s how you design and create class hierarchies to reuse code and functionality. Such
hierarchies also allow you to give your code a modular organization, making it more
maintainable and scalable.

Extended vs Overridden Methods

When you’re using inheritance, you can face an interesting and challenging issue. In some
situations, a parent class may provide a given functionality only at a basic level, and you may
want to extend that functionality in your subclasses. In other situations, the feature in the parent
class isn’t appropriate for the subclass.

In these situations, you can use one of the following strategies, depending on your specific case:

• Extending an inherited method in a subclass, which means that you’ll reuse the
functionality provided by the superclass and add new functionality on top
• Overriding an inherited method in a subclass, which means that you’ll completely
discard the functionality from the superclass and provide new functionality in the
subclass

Here’s an example of a small class hierarchy that applies the first strategy to provide extended
functionality based on the inherited one:

# aircrafts.py

class Aircraft:
    def __init__(self, thrust, lift, max_speed):
        self.thrust = thrust
        self.lift = lift
        self.max_speed = max_speed

    def show_technical_specs(self):
        print(f"Thrust: {self.thrust} kW")
        print(f"Lift: {self.lift} kg")
        print(f"Max speed: {self.max_speed} km/h")

class Helicopter(Aircraft):
    def __init__(self, thrust, lift, max_speed, num_rotors):
        super().__init__(thrust, lift, max_speed)
        self.num_rotors = num_rotors

    def show_technical_specs(self):
        super().show_technical_specs()
        print(f"Number of rotors: {self.num_rotors}")

In this example, you define Aircraft as the base class. In .__init__(), you create a few
instance attributes. Then you define the .show_technical_specs() method, which prints
information about the aircraft’s technical specifications.

Next, you define Helicopter, inheriting from Aircraft. The .__init__() method of
Helicopter extends the corresponding method of Aircraft by calling super() to initialize the
.thrust, .lift, and .max_speed attributes. You already saw something like this in the previous
section.

Helicopter also extends the functionality of .show_technical_specs(). In this case, you first
call .show_technical_specs() from Aircraft using super(). Then you add a new call to
print() that adds new information to the technical description of the helicopter at hand.

Here’s how Helicopter instances work in practice:

>>> from aircrafts import Helicopter

>>> sikorsky_UH60 = Helicopter(1490, 9979, 278, 2)

>>> sikorsky_UH60.show_technical_specs()
Thrust: 1490 kW
Lift: 9979 kg
Max speed: 278 km/h
Number of rotors: 2

When you call .show_technical_specs() on a Helicopter instance, you get the information
provided by the base class, Aircraft, and also the specific information added by Helicopter
itself. You’ve extended the functionality of Aircraft in its subclass Helicopter.

Now it’s time to take a look at how you can override a method in a subclass. As an example, say
that you have a base class called Worker that defines several attributes and methods like in the
following example:

# workers.py

class Worker:
    def __init__(self, name, address, hourly_salary):
        self.name = name
        self.address = address
        self.hourly_salary = hourly_salary

    def show_profile(self):
        print("== Worker profile ==")
        print(f"Name: {self.name}")
        print(f"Address: {self.address}")
        print(f"Hourly salary: {self.hourly_salary}")

    def calculate_payroll(self, hours=40):
        return self.hourly_salary * hours

In this class, you define a few instance attributes to store important data about the current
worker. You also provide the .show_profile() method to display relevant information about
the worker. Finally, you write a generic .calculate_payroll() method to compute the salary
of workers from their hourly salary and the number of hours worked.

Later in the development cycle, some requirements change. Now you realize that managers
compute their salaries in a different way. They’ll have an hourly bonus that you must add to the
normal hourly salary before computing the final amount.

After thinking a bit about the problem, you decide that Manager has to override
.calculate_payroll() completely. Here’s the implementation that you come up with:

# workers.py

# ...

class Manager(Worker):
    def __init__(self, name, address, hourly_salary, hourly_bonus):
        super().__init__(name, address, hourly_salary)
        self.hourly_bonus = hourly_bonus

    def calculate_payroll(self, hours=40):
        return (self.hourly_salary + self.hourly_bonus) * hours

In the Manager initializer, you take the hourly bonus as an argument. Then you call the parent’s
.__init__() method as usual and define the .hourly_bonus instance attribute. Finally, you
override .calculate_payroll() with a completely different implementation that doesn’t reuse
the inherited functionality.
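
Here's a short sketch of the overridden method in action. It repeats compact versions of Worker and Manager, without .show_profile(), so that you can run it on its own. The names, addresses, and amounts are made-up sample data:

```python
class Worker:
    def __init__(self, name, address, hourly_salary):
        self.name = name
        self.address = address
        self.hourly_salary = hourly_salary

    def calculate_payroll(self, hours=40):
        return self.hourly_salary * hours


class Manager(Worker):
    def __init__(self, name, address, hourly_salary, hourly_bonus):
        super().__init__(name, address, hourly_salary)
        self.hourly_bonus = hourly_bonus

    def calculate_payroll(self, hours=40):
        return (self.hourly_salary + self.hourly_bonus) * hours


# Made-up sample data for illustration
staff = [
    Worker("Alice", "123 Main St", 50),
    Manager("Bob", "456 Oak Ave", 50, 10),
]

for person in staff:
    # Python picks the method defined on each object's own class
    print(f"{person.name}: {person.calculate_payroll()}")
```

With the default 40 hours, the worker's payroll comes to 2000 and the manager's to 2400, even though the loop calls .calculate_payroll() in exactly the same way on both objects.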

Multiple Inheritance

In Python, you can use multiple inheritance. This type of inheritance allows you to create a
class that inherits from several parents. The subclass will have access to attributes and methods
from all its parents.

Multiple inheritance allows you to reuse code from several existing classes. However, you must
manage the complexity of multiple inheritance with care. Otherwise, you can face issues like the
diamond problem. You’ll learn more about this topic in the Method Resolution Order (MRO)
section.

Here’s a small example of multiple inheritance in Python:

# crafts.py

class Vehicle:
    def __init__(self, make, model, color):
        self.make = make
        self.model = model
        self.color = color

    def start(self):
        print("Starting the engine...")

    def stop(self):
        print("Stopping the engine...")

    def show_technical_specs(self):
        print(f"Make: {self.make}")
        print(f"Model: {self.model}")
        print(f"Color: {self.color}")

class Car(Vehicle):
    def drive(self):
        print("Driving on the road...")

class Aircraft(Vehicle):
    def fly(self):
        print("Flying in the sky...")

class FlyingCar(Car, Aircraft):
    pass

In this example, you write a Vehicle class with .make, .model, and .color attributes. The class
also has the .start(), .stop(), and .show_technical_specs() methods. Then you create a
Car class that inherits from Vehicle and extends it with a new method called .drive(). You
also create an Aircraft class that inherits from Vehicle and adds a .fly() method.

Finally, you define a FlyingCar class to represent a car that you can drive on the road or fly in
the sky. Isn’t that cool? Note that this class includes both Car and Aircraft in its list of parent
classes. So, it’ll inherit functionality from both superclasses.

Here’s how you can use the FlyingCar class:

>>> from crafts import FlyingCar

>>> space_flyer = FlyingCar("Space", "Flyer", "Black")

>>> space_flyer.show_technical_specs()
Make: Space
Model: Flyer
Color: Black

>>> space_flyer.start()
Starting the engine...
>>> space_flyer.drive()
Driving on the road...
>>> space_flyer.fly()
Flying in the sky...
>>> space_flyer.stop()
Stopping the engine...

In this code snippet, you first create an instance of FlyingCar. Then you call all its methods,
including the inherited ones. As you can see, multiple inheritance promotes code reuse, allowing
you to use functionality from several base classes at the same time. By the way, if you get this
FlyingCar to really fly, then make sure you don’t stop the engine while you’re flying!

Method Resolution Order (MRO)

When you’re using multiple inheritance, you can face situations where one class inherits from
two or more classes that have the same base class. This is known as the diamond problem. The
real issue appears when multiple parents provide specific versions of the same method. In this
case, it’d be difficult to determine which version of that method the subclass will end up using.

Python deals with this issue using a specific method resolution order (MRO). So, what is the
method resolution order in Python? It’s an algorithm that tells Python how to search for inherited
methods in a multiple inheritance context. Python’s MRO determines which implementation of a
method or attribute to use when there are multiple versions of it in a class hierarchy.

Python’s MRO is based on the order of parent classes in the subclass definition. For example,
Car comes before Aircraft in the FlyingCar class from the previous section. MRO also
considers the inheritance relationships between classes. In general, Python searches for methods
and attributes in the following order:

1. The current class
2. The leftmost superclass
3. The superclasses listed next, from left to right, up to the last superclass
4. The superclasses of inherited classes
5. The object class

It’s important to note that subclasses come first in the search. Additionally, if you have multiple
parents that implement a given method or attribute, then Python will search them in the same
order that they’re listed in the class definition.

To illustrate the MRO, consider the following sample class hierarchy:

# mro.py

class A:
    def method(self):
        print("A.method")

class B(A):
    def method(self):
        print("B.method")

class C(A):
    def method(self):
        print("C.method")

class D(B, C):
    pass

In this example, D inherits from B and C, which inherit from A. All the superclasses in the
hierarchy define a different version of .method(). Which of these versions will D end up calling?
To answer this question, go ahead and call .method() on a D instance:

>>> from mro import D

>>> D().method()
B.method

When you call .method() on an instance of D, you get B.method on your screen. This means
that Python found .method() on the B class first. That’s the version of .method() that you end
up calling. You ignore the versions from C and A.

Note: Sometimes, you may run into complex inheritance relationships where Python won’t be
able to create a consistent method resolution order. In those cases, you’ll get a TypeError
pointing out the issue.
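
As a minimal sketch of that situation, the hypothetical classes below list a parent before its own subclass in the list of bases. Python can't order them consistently, so the class statement itself fails:

```python
class A:
    pass


class B(A):
    pass


try:
    # A is listed before its own subclass B, so no consistent order exists
    class C(A, B):
        pass
except TypeError as error:
    mro_error = error  # Keep a reference for later inspection
    print(f"TypeError: {error}")
```

Swapping the bases to class C(B, A) would work, because then the subclass B comes before its parent A.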

You can check the current MRO of a given class by using the .__mro__ special attribute:

>>> D.__mro__
(
    <class 'mro.D'>,
    <class 'mro.B'>,
    <class 'mro.C'>,
    <class 'mro.A'>,
    <class 'object'>
)

In the output, you can see that Python searches for methods and attributes in D by going through
D itself, then B, then C, then A, and finally, object, which is the base class of all Python classes.

The .__mro__ attribute can help you tweak your classes and define the specific MRO that you
want your class to use. The way to tweak this is by moving and reordering the parent classes in
the subclass definition until you get the desired MRO.
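
The sketch below shows the effect of reordering the parents. It reuses the A, B, C, and D classes from mro.py, with the methods returning strings instead of printing them, and adds a hypothetical class E that lists the same parents in the opposite order:

```python
class A:
    def method(self):
        return "A.method"


class B(A):
    def method(self):
        return "B.method"


class C(A):
    def method(self):
        return "C.method"


class D(B, C):
    pass


class E(C, B):  # Same parents as D, but in reverse order
    pass


print([cls.__name__ for cls in D.__mro__])
print([cls.__name__ for cls in E.__mro__])
print(D().method())
print(E().method())
```

Because C comes first in the definition of E, Python finds C's version of .method() before B's, while D keeps using the version from B.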

Mixin Classes

A mixin class provides methods that you can reuse in many other classes. Mixin classes don’t
define new types, so they’re not intended to be instantiated. You use their functionality to attach
extra features to other classes quickly.

You can access the functionality of a mixin class in different ways. One of these ways is
inheritance. However, inheriting from mixin classes doesn’t imply an is-a relationship because
these classes don’t define concrete types. They just bundle specific functionality that’s intended
to be reused in other classes.

To illustrate how to use mixin classes, say that you’re building a class hierarchy with a Person
class at the top. From this class, you’ll derive classes like Employee, Student, Professor, and
several others. Then you realize that all the subclasses of Person need methods that serialize
their data into different formats, including JSON and pickle.

With this in mind, you think of writing a SerializerMixin class that takes care of this task.
Here’s what you come up with:

# mixins.py

import json
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class SerializerMixin:
    def to_json(self):
        return json.dumps(self.__dict__)

    def to_pickle(self):
        return pickle.dumps(self.__dict__)

class Employee(SerializerMixin, Person):
    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary

In this example, Person is the parent class, and SerializerMixin is a mixin class that provides
serialization functionality. The Employee class inherits from both SerializerMixin and
Person. Therefore, it’ll inherit the .to_json() and .to_pickle() methods, which you can use
to serialize instances of Employee in your code.

In this example, Employee is a Person. However, it’s not a SerializerMixin because this class
doesn’t define a type of object. It’s just a mixin class that packs serialization capabilities.

Note: Because of the method resolution order (MRO), which you learned about earlier, placing
your mixin classes before the base classes on the list of parents is often necessary. It’s especially
true for class-based views in the Django web framework, which uses mixins to modify the
behavior of a base view class.
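
Here's a hedged sketch of why the order can matter. ShoutMixin is a hypothetical mixin that overrides a method and calls super() to wrap whatever the next class in the MRO returns. It only takes effect when you list it before the base class:

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"Person: {self.name}"


class ShoutMixin:
    # Hypothetical mixin that post-processes the next class's result
    def describe(self):
        return super().describe().upper()


class MixinFirst(ShoutMixin, Person):
    pass


class MixinLast(Person, ShoutMixin):
    pass


print(MixinFirst("Jane").describe())  # The mixin wraps Person.describe()
print(MixinLast("Jane").describe())  # Person.describe() wins, mixin is skipped
```

In MixinLast, Python finds Person.describe() first and never reaches the mixin's version, so the extra behavior silently disappears.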

Here’s how Employee works in practice:

>>> from mixins import Employee

>>> john = Employee("John Doe", 30, 50000)

>>> john.to_json()
'{"name": "John Doe", "age": 30, "salary": 50000}'

>>> john.to_pickle()
b'...\x04name\x94\x8c\x08John Doe\x94\x8c\x03age\x94K\x1e\x8c\x06salary...'

Now your Employee class is able to serialize its data using JSON and pickle formats. That’s
great! Can you think of any other useful mixin classes?

Up to this point, you’ve learned a lot about simple and multiple inheritance in Python. In the
following section, you’ll go through some of the advantages of using inheritance when writing
and organizing your code.

Benefits of Using Inheritance

Inheritance is a powerful tool that you can use to model and solve many real-world problems in
your code. Some benefits of using inheritance include the following:

• Reusability: You can quickly inherit and reuse working code from one or more parent
classes in as many subclasses as you need.
• Modularity: You can use inheritance to organize your code in hierarchies of related
classes.
• Maintainability: You can quickly fix issues or add features to a parent class. These
changes will be automatically available in all its subclasses. Inheritance also reduces code
duplication.
• Polymorphism: You can create subclasses that can replace their parent class, providing
the same or equivalent functionality.
• Extensibility: You can quickly extend an existing class by adding new data and behavior
to its subclasses.

You can also use inheritance to define a uniform API for all the classes that belong to a given
hierarchy. This promotes consistency and leverages polymorphism.

Using classes and inheritance, you can make your code more modular, reusable, and extensible.
Inheritance enables you to apply good design principles, such as separation of concerns. This
principle states that you should organize code in small classes that each take care of a single task.

Even though inheritance comes with several benefits, it can also end up causing issues. If you
overuse it or use it incorrectly, then you can:

• Artificially increase your code’s complexity with multiple inheritance or multiple levels
of inheritance
• Face issues like the diamond problem where you’ll have to deal with the method
resolution order
• End up with fragile base classes where changes to a parent class produce unexpected
behaviors in subclasses

Of course, these aren’t the only potential pitfalls. For example, having multiple levels of
inheritance can make your code harder to reason about, which may impact your code’s
maintainability in the long term.

Another drawback of inheritance is that a class’s parents are fixed when the class is defined. So,
there’s no straightforward way to change the inherited functionality at runtime. Other techniques,
like composition, allow you to dynamically change the functionality of a given class by replacing
its components.

Using Alternatives to Inheritance

Inheritance, and especially multiple inheritance, can be a complex and hard-to-grasp topic.
Fortunately, inheritance isn’t the only technique that allows you to reuse functionality in object-
oriented programming. You also have composition, which represents a has-a relationship
between classes.

Composition allows you to build an object from its components. The composite object doesn’t
have direct access to each component’s interface. However, it can leverage each component’s
implementation.

Delegation is another technique that you can use to promote code reuse in your OOP programs.
With delegation, you can represent can-do relationships, where an object relies on another object
to perform a given task.

In the following sections, you’ll learn more about these techniques and how they can make your
object-oriented code more robust and flexible.

Composition

As you already know, you can use composition to model a has-a relationship between objects.
In other words, through composition, you can create complex objects by combining objects that
will work as components. Note that these components may not make sense as stand-alone
classes.

Favoring composition over inheritance leads to more flexible class designs. Unlike inheritance,
composition is defined at runtime, which means that you can dynamically replace a current
component with another component of the same type. This characteristic makes it possible to
change the composite’s behavior at runtime.

In the example below, you use composition to create an IndustrialRobot class from the Body
and Arm components:

# robot.py

class IndustrialRobot:
    def __init__(self):
        self.body = Body()
        self.arm = Arm()

    def rotate_body_left(self, degrees=10):
        self.body.rotate_left(degrees)

    def rotate_body_right(self, degrees=10):
        self.body.rotate_right(degrees)

    def move_arm_up(self, distance=10):
        self.arm.move_up(distance)

    def move_arm_down(self, distance=10):
        self.arm.move_down(distance)

    def weld(self):
        self.arm.weld()

class Body:
    def __init__(self):
        self.rotation = 0

    def rotate_left(self, degrees=10):
        self.rotation -= degrees
        print(f"Rotating body {degrees} degrees to the left...")

    def rotate_right(self, degrees=10):
        self.rotation += degrees
        print(f"Rotating body {degrees} degrees to the right...")

class Arm:
    def __init__(self):
        self.position = 0

    def move_up(self, distance=1):
        # Track the actual distance moved, not a fixed step of 1
        self.position += distance
        print(f"Moving arm {distance} cm up...")

    def move_down(self, distance=1):
        self.position -= distance
        print(f"Moving arm {distance} cm down...")

    def weld(self):
        print("Welding...")

In this example, you build an IndustrialRobot class out of its components, Body and Arm. The
Body class provides horizontal movements, while the Arm class represents the robot’s arm and
provides vertical movement and welding functionality.

Here’s how you can use IndustrialRobot in your code:

>>> from robot import IndustrialRobot

>>> robot = IndustrialRobot()

>>> robot.rotate_body_left()
Rotating body 10 degrees to the left...
>>> robot.move_arm_up(15)
Moving arm 15 cm up...
>>> robot.weld()
Welding...

>>> robot.rotate_body_right(20)
Rotating body 20 degrees to the right...
>>> robot.move_arm_down(5)
Moving arm 5 cm down...
>>> robot.weld()
Welding...

Great! Your robot works as expected. It allows you to move its body and arm according to your
movement needs. It also allows you to weld different mechanical pieces together.

An idea to make this robot even cooler is to implement several types of arms with different
welding technologies. Then you can change the arm by running robot.arm = NewArm(). You
can even add a .change_arm() method to your robot class. How does that sound as a learning
exercise?
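
If you'd like a starting point for that exercise, here's one possible sketch. It's a simplified, stand-alone variant and not the robot.py code from above: the arm classes are hypothetical, the robot receives its arm as an argument, and the methods return strings instead of printing them:

```python
class SpotWeldingArm:
    def weld(self):
        return "Spot welding..."


class LaserWeldingArm:
    def weld(self):
        return "Laser welding..."


class IndustrialRobot:
    def __init__(self, arm):
        self.arm = arm

    def change_arm(self, new_arm):
        # Swap the component at runtime. Any object with .weld() will do
        self.arm = new_arm

    def weld(self):
        return self.arm.weld()


robot = IndustrialRobot(SpotWeldingArm())
print(robot.weld())

robot.change_arm(LaserWeldingArm())
print(robot.weld())
```

The robot class never changes. Only the component behind .arm does, which is what makes the behavior replaceable while the program runs.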

Unlike inheritance, composition doesn’t expose the entire interface of components, so it
preserves encapsulation. Instead, the composite objects access and use only the required
functionality from their components. This characteristic makes your class design more robust
and reliable because it won’t expose unneeded members.

Following the robot example, say you have several different robots in a factory. Each robot can
have different capabilities like welding, cutting, shaping, polishing, and so on. You also have
several independent arms. Some of them can perform all those actions. Some of them can
perform just a subset of the actions.

Now say that a given robot can only weld. However, this robot can use different arms with
different welding technologies. If you use inheritance, then the robot will have access to other
operations like cutting and shaping, which can cause an accident or breakdown.

If you use composition, then the welder robot will only have access to the arm’s welding feature.
In this way, composition can help you protect your classes from unintended use.

Delegation

Delegation is another technique that you can use as an alternative to inheritance. With
delegation, you can model can-do relationships, where an object hands a task over to another
object, which takes care of executing the task. Note that the delegated object can exist
independently from the delegator.

You can use delegation to achieve code reuse, separation of concerns, and modularity. For
example, say that you want to create a stack data structure. You think of taking advantage of
Python’s list as a quick way to store and manipulate the underlying data.

Here’s how you end up writing your Stack class:

# stack.py

class Stack:
    def __init__(self, items=None):
        if items is None:
            self._items = []
        else:
            self._items = list(items)

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __repr__(self) -> str:
        return f"{type(self).__name__}({self._items})"

In .__init__(), you define a list object called ._items that can take its initial data from the
items argument. You’ll use this list to store the data in the containing Stack, so you delegate all
the operations related to storing, adding, and deleting data to this list object. Then you implement
the typical Stack operations, .push() and .pop().

Note how these operations conveniently delegate their responsibilities to ._items.append()
and ._items.pop(), respectively. Your Stack class has handed its operations over to the list
object, which already knows how to perform them.

It’s important to notice that this class is pretty flexible. You can replace the list object in
._items with any other object as long as it implements the .pop() and .append() methods. For
example, you can use a deque object from the collections module.
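Here’s a minimal sketch of that swap. It assumes a hypothetical DequeStack variant that stores its data in a collections.deque object instead of a list. The class name is made up for illustration:

```python
from collections import deque


class DequeStack:
    """Hypothetical stack that delegates storage to a deque."""

    def __init__(self, items=None):
        # deque() accepts any iterable, so an empty list works as a default.
        self._items = deque(items if items is not None else [])

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __repr__(self) -> str:
        return f"{type(self).__name__}({list(self._items)})"


stack = DequeStack([1, 2, 3])
stack.push(4)
top = stack.pop()
print(top, stack)
```

The .push() and .pop() methods are identical to the list-based version because deque also provides .append() and .pop().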

Because you’ve used delegation to write your class, the internal implementation of list isn’t
visible or directly accessible in Stack, which preserves encapsulation:

>>> from stack import Stack

>>> stack = Stack([1, 2, 3])
>>> stack
Stack([1, 2, 3])
>>> stack.push(4)
>>> stack
Stack([1, 2, 3, 4])
>>> stack.pop()
4
>>> stack.pop()
3
>>> stack
Stack([1, 2])

>>> dir(stack)
[
    ...
    '_items',
    'pop',
    'push'
]

The public interface of your Stack class only contains the stack-related methods .pop() and
.push(), as you can see in the dir() function’s output. This prevents the users of your class
from using list-specific methods that aren’t compatible with the classic stack data structure.

If you use inheritance, then your child class, Stack, will inherit all the functionality from its
parent class, list:

>>> class Stack(list):
...     def push(self, item):
...         self.append(item)
...     def pop(self):
...         return super().pop()
...     def __repr__(self) -> str:
...         return f"{type(self).__name__}({super().__repr__()})"
...

>>> stack = Stack()
>>> dir(stack)
[
    ...
    'append',
    'clear',
    'copy',
    'count',
    'extend',
    'index',
    'insert',
    'pop',
    'push',
    'remove',
    'reverse',
    'sort'
]

In this example, your Stack class has inherited all the methods from list. These methods are
exposed as part of your class’s public API, which may lead to incorrect uses of the class and its
instances.

With inheritance, the internals of parent classes are visible to subclasses, which breaks
encapsulation. If some of the parent’s functionality isn’t appropriate for the child, then you run
the risk of incorrect use. In this situation, composition and delegation are safer options.

Finally, in Python, you can quickly implement delegation through the .__getattr__() special
method. Python calls this method automatically only when the normal lookup fails to find the
attribute or method that you’re accessing on an instance. You can use this method to redirect
the request to another object that can provide the appropriate method or attribute.

To illustrate this technique, get back to the mixin example where you used a mixin class to
provide serialization capabilities to your Employee class. Here’s how to rewrite the example
using delegation:

# serializer_delegation.py

import json
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class Serializer:
    def __init__(self, instance):
        self.instance = instance

    def to_json(self):
        return json.dumps(self.instance.__dict__)

    def to_pickle(self):
        return pickle.dumps(self.instance.__dict__)

class Employee(Person):
    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary

    def __getattr__(self, attr):
        return getattr(Serializer(self), attr)

In this new implementation, the serializer class takes the instance that provides the data as an
argument. Employee defines a .__getattr__() method that uses the built-in getattr()
function to access the methods in the Serializer class.

For example, if you call .to_json() on an instance of Employee, then that call will be
automatically redirected to calling .to_json() on the instance of Serializer. Go ahead and try
it out! This is a pretty cool Python feature.
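If you’d like to see the redirection without creating a file, then here’s a condensed, self-contained sketch that keeps only the JSON part of the example. The sample values for the employee are made up for illustration:

```python
import json


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class Serializer:
    def __init__(self, instance):
        self.instance = instance

    def to_json(self):
        return json.dumps(self.instance.__dict__)


class Employee(Person):
    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary

    def __getattr__(self, attr):
        # Reached only for names that Employee doesn't define itself.
        return getattr(Serializer(self), attr)


john = Employee("John", 30, 50000)
serialized = john.to_json()  # Redirected to Serializer.to_json()
print(serialized)
```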

You’ve tried your hand at a quick example of delegation in Python to learn how a class can
delegate some of its responsibilities to another class, achieving code reuse and separation of
concerns. Again, you should note that this technique indirectly exposes all the delegated
attributes and methods. So, use it with care.

Dependency Injection

Dependency injection is a design pattern that you can use to achieve loose coupling between a
class and its components. With this technique, you can provide an object’s dependencies from
the outside, rather than inheriting or implementing them in the object itself. This practice allows
you to create flexible classes that are able to change their behavior dynamically, depending on
the injected functionality.

In your robot example, you can use dependency injection to decouple the Arm and Body classes
from IndustrialRobot, which will make your code more flexible and versatile.

Here’s the updated example:

# robot.py

class IndustrialRobot:
    def __init__(self, body, arm):
        self.body = body
        self.arm = arm

    # ...

# ...

In this new version of IndustrialRobot, you only made two small changes to .__init__().
Now this method takes body and arm as arguments and assigns their values to the corresponding
instance attributes, .body and .arm. This allows you to inject appropriate body and arm objects
into the class so that it can do its work.

Here’s how you can use IndustrialRobot with this new implementation:

>>> from robot import Arm, Body, IndustrialRobot

>>> robot = IndustrialRobot(Body(), Arm())

>>> robot.rotate_body_left()
Rotating body 10 degrees to the left...
>>> robot.move_arm_up(15)
Moving arm 15 cm up...
>>> robot.weld()
Welding...

>>> robot.rotate_body_right(20)
Rotating body 20 degrees to the right...
>>> robot.move_arm_down(5)
Moving arm 5 cm down...
>>> robot.weld()
Welding...

Overall, the class’s functionality remains the same as in your first version. The only difference is
that now you have to pass the body and arm objects to the class constructor. This step is a
common way of implementing dependency injection.
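To see why injecting the components is useful, consider the following sketch. It uses simplified stand-ins for Body, Arm, and IndustrialRobot rather than the full classes from robot.py, and it adds a hypothetical LaserArm class that isn’t part of the original example:

```python
class Body:
    """Simplified stand-in for the tutorial's Body class."""

    def rotate_left(self, degrees=10):
        return f"Rotating body {degrees} degrees to the left..."


class Arm:
    """Simplified stand-in for the tutorial's Arm class."""

    def weld(self):
        return "Welding..."


class LaserArm:
    """Hypothetical alternative arm with the same .weld() interface."""

    def weld(self):
        return "Laser welding..."


class IndustrialRobot:
    def __init__(self, body, arm):
        # The dependencies come from the outside.
        self.body = body
        self.arm = arm

    def weld(self):
        return self.arm.weld()


standard_robot = IndustrialRobot(Body(), Arm())
laser_robot = IndustrialRobot(Body(), LaserArm())
print(standard_robot.weld())
print(laser_robot.weld())
```

The robot class stays untouched while its behavior changes with the injected arm. The same approach lets you pass a fake arm in tests.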

Now that you know about a few techniques that you can use as alternatives to inheritance, it’s
time for you to learn about abstract base classes (ABCs) in Python. These classes allow you to
define consistent APIs for your classes.

Creating Abstract Base Classes (ABCs) and Interfaces


Sometimes, you want to create a class hierarchy in which all the classes implement a predefined
interface or API. In other words, you want to define the specific set of public methods and
attributes that all the classes in the hierarchy must implement. In Python, you can do this using
what’s known as an abstract base class (ABC).

The abc module in the standard library exports a couple of ABCs and other related tools that you
can use to define custom base classes that require all their subclasses to implement specific
interfaces.

You can’t instantiate ABCs directly. You must subclass them. In a sense, ABCs work as
templates for other classes to inherit from.

To illustrate how to use Python’s ABCs, say that you want to create a class hierarchy to represent
different shapes, such as Circle, Square, and so on. You decide that all the classes should have
the .get_area() and .get_perimeter() methods. In this situation, you can start with the
following base class:

# shapes_abc.py

from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def get_area(self):
        pass

    @abstractmethod
    def get_perimeter(self):
        pass

The Shape class inherits from abc.ABC, which means it’s an abstract base class. Then you define
the .get_area() and .get_perimeter() methods using the @abstractmethod decorator. By
using the @abstractmethod decorator, you declare that these two methods are the common
interface that all the subclasses of Shape must implement.
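As a quick check of the rule that you can’t instantiate ABCs directly, the sketch below repeats the Shape class and tries to create an instance of it. The exact wording of the error message varies between Python versions, so the code only captures it instead of relying on specific text:

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def get_area(self):
        pass

    @abstractmethod
    def get_perimeter(self):
        pass


try:
    Shape()
except TypeError as error:
    # Python refuses to instantiate a class with abstract methods.
    message = str(error)
else:
    message = None

print(message)
```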

Now you can create the Circle class. Here’s the first approach to this class:

# shapes_abc.py

from abc import ABC, abstractmethod
from math import pi

# ...

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def get_area(self):
        return pi * self.radius ** 2

In this code snippet, you define the Circle class by inheriting from Shape. At this point, you’ve
added the .get_area() method only. Now go ahead and run the following code:

>>> from shapes_abc import Circle

>>> circle = Circle(100)
Traceback (most recent call last):
    ...
TypeError: Can't instantiate abstract class Circle with abstract method get_perimeter

What just happened? You can’t instantiate Circle. That’s what ABCs are for. To be able to
instantiate Circle, you must provide suitable implementations for all its abstract methods, which
means you need to define a consistent interface:

# shapes_abc.py

from abc import ABC, abstractmethod
from math import pi

# ...

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def get_area(self):
        return pi * self.radius ** 2

    def get_perimeter(self):
        return 2 * pi * self.radius

This time, your Circle class implements all the required methods. These methods will be
common to all the classes in your shape hierarchy. Once you’ve defined suitable custom
implementations for all the abstract methods, you can proceed to instantiate Circle as in the
following example:

>>> from shapes_abc import Circle

>>> circle = Circle(100)
>>> circle.radius
100
>>> circle.get_area()
31415.926535897932
>>> circle.get_perimeter()
628.3185307179587

Once you’ve implemented custom methods to replace the abstract implementations of
.get_area() and .get_perimeter(), then you can instantiate and use Circle in your code.

If you want to add a Square class to your shape hierarchy, then that class must have custom
implementations of the .get_area() and .get_perimeter() methods:

# shapes_abc.py

from abc import ABC, abstractmethod
from math import pi

# ...

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def get_area(self):
        return self.side ** 2

    def get_perimeter(self):
        return 4 * self.side

This example demonstrates how you can use ABCs to define a common interface for a group of
related classes. Each subclass must provide its own implementation of the abstract methods in
the base class. Note that in recent Python versions, you can use static duck typing as an
alternative to abstract base classes.
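One way to get static duck typing is the typing.Protocol class, which is available in Python 3.8 and later. The sketch below is an assumption about how you might apply it to the shapes example. The SupportsShape protocol and the describe() function are hypothetical names, and the interface is only checked if you run a static type checker:

```python
from typing import Protocol


class SupportsShape(Protocol):
    """Hypothetical protocol describing the shape interface."""

    def get_area(self) -> float: ...

    def get_perimeter(self) -> float: ...


class Square:
    # Note that Square doesn't inherit from SupportsShape.
    def __init__(self, side: float) -> None:
        self.side = side

    def get_area(self) -> float:
        return self.side ** 2

    def get_perimeter(self) -> float:
        return 4 * self.side


def describe(shape: SupportsShape) -> str:
    # A type checker accepts any object with matching methods.
    return f"area={shape.get_area()}, perimeter={shape.get_perimeter()}"


summary = describe(Square(3))
print(summary)
```

Unlike an ABC, a protocol doesn’t stop you from instantiating an incomplete class at runtime. The mismatch is reported by the type checker instead.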

Unlocking Polymorphism With Common Interfaces


In the previous section, you learned about abstract base classes and explored how to use them to
promote the use of a common public interface across several related classes. Having a set of
classes implement the same interface, each with its own concrete behavior, is a great way to
unlock polymorphism.

Polymorphism is when you can use objects of different classes interchangeably because they
share a common interface. For example, Python strings, lists, and tuples are all sequence data
types. This means that they implement an interface that’s common to all sequences.

Because of this common interface, you can use them in similar ways. For example, you can:

• Use them in loops because they provide the .__iter__() method
• Access their items by index because they implement the .__getitem__() method
• Determine their number of items because they include the .__len__() method

These are just a few examples of common features of sequence data types. Note that you can run
all these operations and more without caring about which specific type you’re actually using in
your code. That’s possible because of polymorphism.

Consider the following examples, which use the built-in len() function:

>>> message = "Hello!"
>>> numbers = [1, 2, 3]
>>> letters = ("A", "B", "C")

>>> len(message)
6
>>> len(numbers)
3
>>> len(letters)
3

In these examples, you use len() with three different types of objects: a string, a list, and a
tuple. Even though these types are quite different, all of them implement the .__len__()
method, which provides support for len().
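The same mechanism works for your own classes. As an illustration, the following sketch defines a hypothetical Playlist class, which isn’t part of the tutorial’s examples, and uses it next to the built-in types:

```python
class Playlist:
    """Hypothetical container that implements part of the sequence interface."""

    def __init__(self, *songs):
        self._songs = list(songs)

    def __len__(self):
        return len(self._songs)

    def __getitem__(self, index):
        return self._songs[index]

    def __iter__(self):
        return iter(self._songs)


playlist = Playlist("Intro", "Outro")

# len() works the same way on all three objects.
sizes = [len(obj) for obj in ("Hello!", [1, 2, 3], playlist)]
first_song = playlist[0]
titles = [song.upper() for song in playlist]
print(sizes, first_song, titles)
```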

You can unlock polymorphism in your custom classes by making them share common attributes
and methods.

For example, take a look at your Vehicle class hierarchy. The Car class has a method called
.drive(), and the Motorcycle class has a method called .ride(). This API inconsistency
breaks polymorphism. To fix the issue and use these classes in a polymorphic way, you can
slightly change Motorcycle by renaming its .ride() method to .drive():

# vehicles.py

# ...

class Motorcycle(Vehicle):
    def __init__(self, make, model, year, num_wheels):
        super().__init__(make, model, year)
        self.num_wheels = num_wheels

    def drive(self):
        print(f'Riding my "{self.make} - {self.model}" on the road')

    def __str__(self):
        return f'"{self.make} - {self.model}" has {self.num_wheels} wheels'

Here, you rename the .ride() method to .drive() in the definition of Motorcycle. This small
change makes your classes have a common interface. So, you can use them interchangeably in
your code:

>>> from vehicles import Car, Motorcycle

>>> toyota = Car("Toyota", "Corolla", 2022, 5)
>>> honda = Car("Honda", "Civic", 2022, 4)
>>> harley = Motorcycle("Harley-Davidson", "Iron 883", 2022, 2)
>>> indian = Motorcycle("Indian", "Scout", 2022, 2)

>>> for vehicle in [toyota, honda, harley, indian]:
...     vehicle.drive()
...
Driving my "Toyota - Corolla" on the road
Driving my "Honda - Civic" on the road
Riding my "Harley-Davidson - Iron 883" on the road
Riding my "Indian - Scout" on the road

Now you can drive either a car or a motorcycle without having to worry about an
AttributeError because one of them doesn’t have the appropriate method. You’ve just made
your classes work in a polymorphic way, which is a great way to add flexibility to your code.

Conclusion
You now know a lot about Python classes and how to use them to make your code more
reusable, modular, flexible, and maintainable. Classes are the building blocks of object-oriented
programming in Python. With classes, you can solve complex problems by modeling real-world
objects, their properties, and their behaviors. Classes provide an intuitive and human-friendly
approach to complex programming problems, which will make your life more pleasant.

In this tutorial, you’ve learned how to:

• Write Python classes using the class keyword
• Add state to your classes with class and instance attributes
• Give concrete behaviors to your classes with different types of methods
• Build hierarchies of classes using inheritance
• Create interfaces with abstract classes

With all this knowledge, you can leverage the power of Python classes in your code. Now you’re
ready to start writing your own classes in Python.

Create a New Window in Tk (tkinter)
Python Assets

2022-03-23


Desktop applications can be made up of more than one window. The main window is created by
the tk.Tk class and controls the life cycle of the application. Secondary windows, also known as
popup or child windows, might be created on application startup or in response to an event (for
example, a button press) via the tk.Toplevel class. When the user closes the main window,
secondary windows are also closed, thus ending the program execution. However, child windows
can be opened and closed multiple times during the application life cycle.

The following code creates an application with a parent window and a child window. The main
window contains a button that opens the secondary window. Inside the secondary window there
is another button to close itself.

import tkinter as tk
from tkinter import ttk
def open_secondary_window():
    # Create secondary (or popup) window.
    secondary_window = tk.Toplevel()
    secondary_window.title("Secondary Window")
    secondary_window.config(width=300, height=200)
    # Create a button to close (destroy) this window.
    button_close = ttk.Button(
        secondary_window,
        text="Close window",
        command=secondary_window.destroy
    )
    button_close.place(x=75, y=75)
# Create the main window.
main_window = tk.Tk()
main_window.config(width=400, height=300)
main_window.title("Main Window")
# Create a button inside the main window that
# invokes the open_secondary_window() function
# when pressed.
button_open = ttk.Button(
    main_window,
    text="Open secondary window",
    command=open_secondary_window
)

button_open.place(x=100, y=100)
main_window.mainloop()

With two different windows, whenever we create a widget (be it a button or any other), we
must specify its parent window (i.e., the window in which it is located) as the first argument.
The button_open is inside the main window, so the main_window object is passed as the first
argument to ttk.Button. Likewise, button_close receives secondary_window as its first argument.
Running the program shows the main window, and pressing its button opens the secondary window.

In order for the child window to get focus automatically once created, we use the focus()
method:

def open_secondary_window():
    # Create secondary (or popup) window.
    secondary_window = tk.Toplevel()
    secondary_window.title("Secondary Window")
    secondary_window.config(width=300, height=200)
    # Create a button to close (destroy) this window.
    button_close = ttk.Button(
        secondary_window,
        text="Close window",
        command=secondary_window.destroy
    )
    button_close.place(x=75, y=75)
    secondary_window.focus()

When the two windows are open, the user will be able to interact with both of them. If we want
the user to be unable to use the main window while the secondary window is visible (known in
GUI jargon as a modal window), we call the grab_set() method:

def open_secondary_window():
    # Create secondary (or popup) window.
    secondary_window = tk.Toplevel()
    secondary_window.title("Secondary Window")
    secondary_window.config(width=300, height=200)
    # Create a button to close (destroy) this window.
    button_close = ttk.Button(
        secondary_window,
        text="Close window",
        command=secondary_window.destroy
    )

    button_close.place(x=75, y=75)
    secondary_window.focus()
    secondary_window.grab_set()  # Modal.

Both the main window and the child window provide the destroy() method to close them
programmatically. Note that main_window.destroy() finishes the whole application.

Although this way of organizing the code can be useful for small applications, a better solution is
to create a class for each window. So the code above in its object-oriented version would look
something like this:

import tkinter as tk
from tkinter import ttk
class SecondaryWindow(tk.Toplevel):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.config(width=300, height=200)
        self.title("Secondary Window")
        self.button_close = ttk.Button(
            self,
            text="Close window",
            command=self.destroy
        )
        self.button_close.place(x=75, y=75)
        self.focus()
        self.grab_set()
class MainWindow(tk.Tk):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.config(width=400, height=300)
        self.title("Main Window")
        self.button_open = ttk.Button(
            self,
            text="Open secondary window",
            command=self.open_secondary_window
        )
        self.button_open.place(x=100, y=100)
    def open_secondary_window(self):
        self.secondary_window = SecondaryWindow()
main_window = MainWindow()
main_window.mainloop()

This implementation has the benefit that widgets and methods of both windows are encapsulated
within their respective objects (main_window and secondary_window), avoiding name collisions
and reducing usage of global objects. The classes could even be in different modules: it's a
common pattern in GUI development to put each window in its own source code file.

Other features are also easier to implement when the windows are arranged into classes. For
example, what happens if the user presses button_open more than once? If the child window
is not modal (i.e., grab_set() has not been called), the user will be able to open an arbitrary
number of child windows. This is generally undesirable, so it is useful to add a constraint so
that SecondaryWindow cannot be opened more than once at the same time.

import tkinter as tk
from tkinter import ttk
class SecondaryWindow(tk.Toplevel):
    # Class attribute that indicates whether this child window
    # is being used (alive) or not.
    alive = False
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.config(width=300, height=200)
        self.title("Secondary Window")
        self.button_close = ttk.Button(
            self,
            text="Close window",
            command=self.destroy
        )
        self.button_close.place(x=75, y=75)
        self.focus()
        # Set the window as alive once created.
        self.__class__.alive = True
    def destroy(self):
        # Restore the attribute on close.
        self.__class__.alive = False
        return super().destroy()
class MainWindow(tk.Tk):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.config(width=400, height=300)
        self.title("Main Window")
        self.button_open = ttk.Button(
            self,
            text="Open secondary window",
            command=self.open_secondary_window
        )
        self.button_open.place(x=100, y=100)
    def open_secondary_window(self):
        if not SecondaryWindow.alive:
            self.secondary_window = SecondaryWindow()
main_window = MainWindow()
main_window.mainloop()

The logic behind this code is simple. We create the alive attribute, which is true when the
window is in use and false otherwise, and always check it before instantiating the child window.

But what if we want to access an object within a child window from the parent window? For
example, suppose we want to create a child window for the user to enter their name and then
display it in a label in the parent window.

An elegant solution for this scenario is to use a callback function (heavily used in event-driven
programming), which will be called by the child window when the entered name is available.
This approach is similar to that used in buttons when passing a function name to the command
argument.
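Stripped of the GUI, the callback mechanism can be sketched with plain classes. The NamePrompt and MainScreen classes below are hypothetical stand-ins for the child and parent windows, and .submit() plays the role of pressing a button:

```python
class NamePrompt:
    """Hypothetical stand-in for the child window."""

    def __init__(self, callback=None):
        # Function to call once the name is available.
        self.callback = callback

    def submit(self, name):
        # Simulates the user confirming the entered name.
        if self.callback is not None:
            self.callback(name)


class MainScreen:
    """Hypothetical stand-in for the parent window."""

    def __init__(self):
        self.label_text = "You have not entered your name yet."

    def request_name(self):
        # Pass a bound method so that the prompt can report back.
        return NamePrompt(callback=self.name_entered)

    def name_entered(self, name):
        self.label_text = "Your name is: " + name


screen = MainScreen()
prompt = screen.request_name()
prompt.submit("Ada")
print(screen.label_text)
```

The child object never needs a reference to the parent’s label. It only knows that it has a function to call with the result.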

import tkinter as tk
from tkinter import ttk
class InputWindow(tk.Toplevel):
    def __init__(self, *args, callback=None, **kwargs):
        super().__init__(*args, **kwargs)
        # callback is a function that this window will call
        # with the entered name as an argument once the button
        # has been pressed.
        self.callback = callback
        self.config(width=300, height=90)
        # Disable the button for resizing the window.
        self.resizable(0, 0)

        self.title("Enter Your Name")
        self.entry_name = ttk.Entry(self)
        self.entry_name.place(x=20, y=20, width=260)
        self.button_done = ttk.Button(
            self,
            text="Done!",
            command=self.button_done_pressed
        )
        self.button_done.place(x=20, y=50, width=260)
        self.focus()
        self.grab_set()
    def button_done_pressed(self):
        # Get the entered name and invoke the callback function
        # passed when creating this window (if one was given).
        if self.callback is not None:
            self.callback(self.entry_name.get())
        # Close the window.
        self.destroy()
class MainWindow(tk.Tk):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.config(width=400, height=300)
        self.title("Main Window")
        self.button_request_name = ttk.Button(
            self,
            text="Request name",
            command=self.request_name
        )
        self.button_request_name.place(x=50, y=50)
        self.label_name = ttk.Label(
            self,
            text="You have not entered your name yet."
        )
        self.label_name.place(x=50, y=150)
    def request_name(self):
        # Create the child window and pass the callback
        # function by which we want to receive the entered
        # name.
        self.ventana_nombre = InputWindow(
            callback=self.name_entered
        )
    def name_entered(self, name):
        # This function is invoked once the user presses the
        # "Done!" button within the secondary window. The entered
        # name will be in the "name" argument.
        self.label_name.config(
            text="Your name is: " + name
        )
main_window = MainWindow()
main_window.mainloop()

Python Histogram Plotting: NumPy, Matplotlib, pandas & Seaborn
by Brad Solomon Jul 02, 2018

Table of Contents

• Histograms in Pure Python
• Building Up From the Base: Histogram Calculations in NumPy
• Visualizing Histograms with Matplotlib and pandas
• Plotting a Kernel Density Estimate (KDE)
• A Fancy Alternative with Seaborn
• Other Tools in pandas
• Alright, So Which Should I Use?


In this tutorial, you’ll be equipped to make production-quality, presentation-ready Python
histogram plots with a range of choices and features.

If you have introductory to intermediate knowledge in Python and statistics, then you can use
this article as a one-stop shop for building and plotting histograms in Python using libraries from
its scientific stack, including NumPy, Matplotlib, pandas, and Seaborn.

A histogram is a great tool for quickly assessing a probability distribution that is intuitively
understood by almost any audience. Python offers a handful of different options for building and
plotting histograms. Most people know a histogram by its graphical representation, which is
similar to a bar graph:

This article will guide you through creating plots like the one above as well as more complex
ones. Here’s what you’ll cover:

• Building histograms in pure Python, without use of third party libraries
• Constructing histograms with NumPy to summarize the underlying data
• Plotting the resulting histogram with Matplotlib, pandas, and Seaborn


Histograms in Pure Python


When you are preparing to plot a histogram, it is simplest to not think in terms of bins but rather
to report how many times each value appears (a frequency table). A Python dictionary is well-
suited for this task:

>>> # Need not be sorted, necessarily
>>> a = (0, 1, 1, 1, 2, 3, 7, 7, 23)

>>> def count_elements(seq) -> dict:
...     """Tally elements from `seq`."""
...     hist = {}
...     for i in seq:
...         hist[i] = hist.get(i, 0) + 1
...     return hist

>>> counted = count_elements(a)
>>> counted
{0: 1, 1: 3, 2: 1, 3: 1, 7: 2, 23: 1}

count_elements() returns a dictionary with unique elements from the sequence as keys and
their frequencies (counts) as values. Within the loop over seq, hist[i] = hist.get(i, 0) +
1 says, “for each element of the sequence, increment its corresponding value in hist by 1.”

In fact, this is precisely what is done by the collections.Counter class from Python’s standard
library, which subclasses a Python dictionary and overrides its .update() method:

>>> from collections import Counter

>>> recounted = Counter(a)
>>> recounted
Counter({1: 3, 7: 2, 0: 1, 2: 1, 3: 1, 23: 1})

You can confirm that your handmade function does virtually the same thing as
collections.Counter by testing for equality between the two:

>>> recounted.items() == counted.items()
True

Technical Detail: The element-counting helper behind collections.Counter defaults to a more
highly optimized C function if it is available. Within the Python function count_elements(),
one micro-optimization you could make is to declare get = hist.get before the for loop. This
would bind the method to a variable for faster calls within the loop.
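A sketch of that micro-optimization might look like the following. The function name count_elements_fast() is made up here to keep it apart from the original, and any speedup is likely to be small, so measure before relying on it:

```python
from collections import Counter


def count_elements_fast(seq) -> dict:
    """Tally elements from `seq` using a pre-bound .get() method."""
    hist = {}
    get = hist.get  # Look up the bound method once, outside the loop
    for i in seq:
        hist[i] = get(i, 0) + 1
    return hist


a = (0, 1, 1, 1, 2, 3, 7, 7, 23)
counted_fast = count_elements_fast(a)
print(counted_fast)
print(counted_fast == Counter(a))
```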

It can be helpful to build simplified functions from scratch as a first step to understanding more
complex ones. Let’s further reinvent the wheel a bit with an ASCII histogram that takes
advantage of Python’s output formatting:

def ascii_histogram(seq) -> None:
    """A horizontal frequency-table/histogram plot."""
    counted = count_elements(seq)
    for k in sorted(counted):
        print('{0:5d} {1}'.format(k, '+' * counted[k]))

This function creates a sorted frequency plot where counts are represented as tallies of plus (+)
symbols. Calling sorted() on a dictionary returns a sorted list of its keys, and then you access
the corresponding value for each with counted[k]. To see this in action, you can create a
slightly larger dataset with Python’s random module:

>>> # No NumPy ... yet
>>> import random
>>> random.seed(1)

>>> vals = [1, 3, 4, 6, 8, 9, 10]
>>> # Each number in `vals` will occur between 5 and 15 times.
>>> freq = (random.randint(5, 15) for _ in vals)

>>> data = []
>>> for f, v in zip(freq, vals):
...     data.extend([v] * f)

>>> ascii_histogram(data)
    1 +++++++
    3 ++++++++++++++
    4 ++++++
    6 +++++++++
    8 ++++++
    9 ++++++++++++
   10 ++++++++++++

Here, you’re simulating plucking from vals with frequencies given by freq (a generator
expression). The resulting sample data repeats each value from vals a certain number of times
between 5 and 15.

Note: random.seed() is used to seed, or initialize, the underlying pseudorandom number
generator (PRNG) used by random. It may sound like an oxymoron, but this is a way of making
random data reproducible and deterministic. That is, if you copy the code here as is, you should
get exactly the same histogram because the calls to random.randint() after seeding the
generator will produce identical “random” data using the Mersenne Twister.
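You can check the reproducibility yourself with a short experiment. This sketch seeds the generator twice with the same value and compares the two sequences. The variable names are arbitrary:

```python
import random

random.seed(1)
first_run = [random.randint(5, 15) for _ in range(7)]

# Reseeding with the same value restarts the same sequence.
random.seed(1)
second_run = [random.randint(5, 15) for _ in range(7)]

print(first_run == second_run)
```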

Building Up From the Base: Histogram Calculations in NumPy
Thus far, you have been working with what could best be called “frequency tables.” But
mathematically, a histogram is a mapping of bins (intervals) to frequencies. More technically, it
can be used to approximate the probability density function (PDF) of the underlying variable.

Moving on from the “frequency table” above, a true histogram first “bins” the range of values
and then counts the number of values that fall into each bin. This is what NumPy’s histogram()
function does, and it is the basis for other functions you’ll see here later in Python libraries such
as Matplotlib and pandas.

Consider a sample of floats drawn from the Laplace distribution. This distribution has fatter tails
than a normal distribution and has two descriptive parameters (location and scale):

>>> import numpy as np
>>> # `numpy.random` uses its own PRNG.
>>> np.random.seed(444)
>>> np.set_printoptions(precision=3)

>>> d = np.random.laplace(loc=15, scale=3, size=500)
>>> d[:5]
array([18.406, 18.087, 16.004, 16.221,  7.358])

In this case, you’re working with a continuous distribution, and it wouldn’t be very helpful to
tally each float independently, down to the umpteenth decimal place. Instead, you can bin or
“bucket” the data and count the observations that fall into each bin. The histogram is the
resulting count of values within each bin:

>>> hist, bin_edges = np.histogram(d)

>>> hist
array([ 1, 0, 3, 4, 4, 10, 13, 9, 2, 4])

>>> bin_edges
array([ 3.217, 5.199, 7.181, 9.163, 11.145, 13.127, 15.109, 17.091,
19.073, 21.055, 23.037])

This result may not be immediately intuitive. np.histogram() by default uses 10 equally sized
bins and returns a tuple of the frequency counts and corresponding bin edges. They are edges in
the sense that there will be one more bin edge than there are members of the histogram:

>>> hist.size, bin_edges.size


(10, 11)

Technical Detail: All but the last (rightmost) bin is half-open. That is, all bins but the last are
[inclusive, exclusive), and the final bin is [inclusive, inclusive].
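
A small sketch makes the edge rule concrete. The four-value array below is made up for illustration; with three bins over the range 0 to 3, the value 3 sits exactly on the final edge and is still counted in the last bin:

```python
import numpy as np

values = np.array([0, 1, 2, 3])
counts, edges = np.histogram(values, bins=3)

print(edges)   # three bins: [0, 1), [1, 2), [2, 3]
print(counts)  # the last bin holds both 2 and 3
```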

A very condensed breakdown of how the bins are constructed by NumPy looks like this:

>>> # The leftmost and rightmost bin edges


>>> first_edge, last_edge = a.min(), a.max()

>>> n_equal_bins = 10 # NumPy's default


>>> bin_edges = np.linspace(start=first_edge, stop=last_edge,
... num=n_equal_bins + 1, endpoint=True)
...
>>> bin_edges
array([ 0. , 2.3, 4.6, 6.9, 9.2, 11.5, 13.8, 16.1, 18.4, 20.7, 23. ])

The case above makes a lot of sense: 10 equally spaced bins over a peak-to-peak range of 23
means intervals of width 2.3.
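
As a quick check, the hand-built edges can be compared against what np.histogram() returns. The array a below is an assumption, reconstructed from the Counter-style output shown later in this section:

```python
import numpy as np

# Assumed sample, reconstructed from the frequency table in this section.
a = np.array([0, 1, 1, 1, 2, 3, 7, 7, 23])

manual_edges = np.linspace(a.min(), a.max(), num=10 + 1, endpoint=True)
_, numpy_edges = np.histogram(a)  # default: 10 equal-width bins

print(np.allclose(manual_edges, numpy_edges))
print(np.diff(manual_edges))  # every bin is about 2.3 wide
```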

From there, the function delegates to either np.bincount() or np.searchsorted().


bincount() itself can be used to effectively construct the “frequency table” that you started off
with here, with the distinction that values with zero occurrences are included:

>>> bcounts = np.bincount(a)


>>> hist, _ = np.histogram(a, range=(0, a.max()), bins=a.max() + 1)

>>> np.array_equal(hist, bcounts)

True

>>> # Reproducing `collections.Counter`


>>> dict(zip(np.unique(a), bcounts[bcounts.nonzero()]))
{0: 1, 1: 3, 2: 1, 3: 1, 7: 2, 23: 1}

Note: hist here is really using bins of width 1.0 rather than “discrete” counts. Hence, this only
works for counting integers, not floats such as [3.9, 4.1, 4.15].
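
For float data, one alternative sketch is np.unique() with return_counts=True, which tallies exact values instead of unit-width bins. The sample floats are illustrative:

```python
import numpy as np

floats = np.array([3.9, 4.1, 4.1, 4.15])

# np.bincount() only accepts non-negative integers, so it rejects floats.
try:
    np.bincount(floats)
    bincount_ok = True
except TypeError:
    bincount_ok = False

# np.unique() counts each distinct value, whatever its dtype.
uniq, counts = np.unique(floats, return_counts=True)
freq = dict(zip(uniq.tolist(), counts.tolist()))
print(bincount_ok, freq)
```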

Visualizing Histograms with Matplotlib and pandas


Now that you’ve seen how to build a histogram in Python from the ground up, let’s see how
other Python packages can do the job for you. Matplotlib provides the functionality to visualize
Python histograms out of the box with a versatile wrapper around NumPy’s histogram():

import matplotlib.pyplot as plt

# An "interface" to matplotlib.axes.Axes.hist() method


n, bins, patches = plt.hist(x=d, bins='auto', color='#0504aa',
alpha=0.7, rwidth=0.85)
plt.grid(axis='y', alpha=0.75)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('My Very Own Histogram')
plt.text(23, 45, r'$\mu=15, b=3$')
maxfreq = n.max()
# Set a clean upper y-axis limit.
plt.ylim(ymax=np.ceil(maxfreq / 10) * 10 if maxfreq % 10 else maxfreq + 10)

As defined earlier, a plot of a histogram uses its bin edges on the x-axis and the corresponding
frequencies on the y-axis. In the chart above, passing bins='auto' chooses between two
algorithms to estimate the “ideal” number of bins. At a high level, the goal of the algorithm is to
choose a bin width that generates the most faithful representation of the data. For more on this
subject, which can get pretty technical, check out Choosing Histogram Bins from the Astropy
docs.
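
To see what 'auto' settles on without drawing anything, you can call NumPy's histogram_bin_edges() directly. In this sketch the sample is regenerated with np.random.default_rng(), which is an assumption made for self-containment and will not reproduce the exact values of d above. Per the NumPy documentation, 'auto' takes whichever of the Sturges and Freedman-Diaconis ('fd') estimators yields more bins:

```python
import numpy as np

rng = np.random.default_rng(444)
sample = rng.laplace(loc=15, scale=3, size=500)

# Number of bins chosen by each rule (edges minus one).
n_bins = {
    rule: len(np.histogram_bin_edges(sample, bins=rule)) - 1
    for rule in ('auto', 'fd', 'sturges')
}
print(n_bins)
```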

Staying in Python’s scientific stack, pandas’ Series.plot.hist() uses
matplotlib.pyplot.hist() to draw a Matplotlib histogram of the input Series:

import pandas as pd

# Generate data on commute times.


size, scale = 1000, 10
commutes = pd.Series(np.random.gamma(scale, size=size) ** 1.5)

commutes.plot.hist(grid=True, bins=20, rwidth=0.9,
                   color='#607c8e')
plt.title('Commute Times for 1,000 Commuters')
# The binned variable goes on the x-axis and the counts on the y-axis.
plt.xlabel('Commute Time')
plt.ylabel('Counts')
plt.grid(axis='y', alpha=0.75)

pandas.DataFrame.hist() is similar but produces a histogram for each column of data
in the DataFrame.

Plotting a Kernel Density Estimate (KDE)
In this tutorial, you’ve been working with samples, statistically speaking. Whether the data is
discrete or continuous, it’s assumed to be derived from a population that has a true, exact
distribution described by just a few parameters.

A kernel density estimation (KDE) is a way to estimate the probability density function (PDF) of
the random variable that “underlies” our sample. KDE is a means of data smoothing.
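
To make the idea concrete, here is a bare-bones sketch of a Gaussian KDE written directly in NumPy: the estimate at each grid point is the average of normal bumps centered on the observations. The function name kde_manual, the fixed bandwidth of 0.4, and the sample itself are illustrative assumptions; library implementations typically choose the bandwidth automatically.

```python
import numpy as np

def kde_manual(sample, grid, bandwidth):
    """Average a Gaussian kernel centered on every observation."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
sample = rng.normal(loc=0, scale=1, size=200)
grid = np.linspace(-6, 6, 601)
density = kde_manual(sample, grid, bandwidth=0.4)

# A density should integrate to roughly 1 over a wide enough grid.
area = float(density.sum() * (grid[1] - grid[0]))
print(round(area, 2))
```

A larger bandwidth gives a smoother but blurrier curve, while a smaller one follows the individual observations more closely.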

Sticking with the pandas library, you can create and overlay density plots using plot.kde(),
which is available for both Series and DataFrame objects. But first, let’s generate two distinct
data samples for comparison:

>>> # Sample from two different normal distributions


>>> means = 10, 20
>>> stdevs = 4, 2
>>> dist = pd.DataFrame(
... np.random.normal(loc=means, scale=stdevs, size=(1000, 2)),
... columns=['a', 'b'])
>>> dist.agg(['min', 'max', 'mean', 'std']).round(decimals=2)
          a      b
min   -1.57  12.46
max   25.32  26.44
mean  10.12  19.94
std    3.94   1.94

Now, to plot each histogram on the same Matplotlib axes:

fig, ax = plt.subplots()
dist.plot.kde(ax=ax, legend=False, title='Histogram: A vs. B')
dist.plot.hist(density=True, ax=ax)
ax.set_ylabel('Probability')
ax.grid(axis='y')
ax.set_facecolor('#d8dcd6')

These methods leverage SciPy’s gaussian_kde(), which results in a smoother-looking PDF.

If you take a closer look at this function, you can see how well it approximates the “true” PDF
for a relatively small sample of 1000 data points. Below, you can first build the “analytical”
distribution with scipy.stats.norm(). This is a class instance that encapsulates the statistical
standard normal distribution, its moments, and descriptive functions. Its PDF is “exact” in the
sense that it is defined precisely as norm.pdf(x) = exp(-x**2/2) / sqrt(2*pi).

Building from there, you can take a random sample of 1000 datapoints from this distribution,
then attempt to back into an estimation of the PDF with scipy.stats.gaussian_kde():

from scipy import stats

# An object representing the "frozen" analytical distribution


# Defaults to the standard normal distribution, N~(0, 1)
dist = stats.norm()

# Draw random samples from the population you built above.
# This is just a sample, so the mean and std. deviation should
# be close to (0, 1).
samp = dist.rvs(size=1000)

# `ppf()`: percent point function (inverse of cdf — percentiles).

x = np.linspace(start=stats.norm.ppf(0.01),
stop=stats.norm.ppf(0.99), num=250)
gkde = stats.gaussian_kde(dataset=samp)

# `gkde.evaluate()` estimates the PDF itself.


fig, ax = plt.subplots()
ax.plot(x, dist.pdf(x), linestyle='solid', c='red', lw=3,
alpha=0.8, label='Analytical (True) PDF')
ax.plot(x, gkde.evaluate(x), linestyle='dashed', c='black', lw=2,
label='PDF Estimated via KDE')
ax.legend(loc='best', frameon=False)
ax.set_title('Analytical vs. Estimated PDF')
ax.set_ylabel('Probability')
ax.text(-2., 0.35, r'$f(x) = \frac{\exp(-x^2/2)}{\sqrt{2*\pi}}$',
fontsize=12)

This is a bigger chunk of code, so let’s take a second to touch on a few key lines:

 SciPy’s stats subpackage lets you create Python objects that represent analytical
distributions that you can sample from to create actual data. So dist = stats.norm()
represents a normal continuous random variable, and you generate random numbers from
it with dist.rvs().
 To evaluate both the analytical PDF and the Gaussian KDE, you need an array x of
quantiles (standard deviations above/below the mean, for a normal distribution).
stats.gaussian_kde() represents an estimated PDF that you need to evaluate on an
array to produce something visually meaningful in this case.

 The last line contains some LaTeX, which integrates nicely with Matplotlib.

A Fancy Alternative with Seaborn


Let’s bring one more Python package into the mix. Seaborn has a distplot() function that plots
the histogram and KDE for a univariate distribution in one step (it is deprecated as of seaborn
0.11 in favor of histplot() and displot()). Using the NumPy array d from earlier:

import seaborn as sns

sns.set_style('darkgrid')
sns.distplot(d)

The call above produces a KDE. You can also fit a specific distribution to the data.
This is different from a KDE and consists of parameter estimation for generic data and a specified
distribution name:

sns.distplot(d, fit=stats.laplace, kde=False)

Again, note the slight difference. In the first case, you’re estimating some unknown PDF; in the
second, you’re taking a known distribution and finding what parameters best describe it given
the empirical data.

Other Tools in pandas


In addition to its plotting tools, pandas also offers a convenient .value_counts() method that
computes a histogram of non-null values and returns it as a pandas Series:

>>> import pandas as pd

>>> data = np.random.choice(np.arange(10), size=10000,
...                         p=np.linspace(1, 11, 10) / 60)
>>> s = pd.Series(data)

>>> s.value_counts()
9 1831
8 1624
7 1423
6 1323
5 1089

4 888
3 770
2 535
1 347
0 170
dtype: int64

>>> s.value_counts(normalize=True).head()
9 0.1831
8 0.1624
7 0.1423
6 0.1323
5 0.1089
dtype: float64
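
value_counts() can also bin numeric data itself through its bins parameter, which pandas documents as a convenience for pd.cut(). A minimal sketch on a small made-up Series:

```python
import pandas as pd

s = pd.Series([0, 1, 1, 2, 3, 5, 8, 9, 9, 9])

# Group the values into 3 equal-width intervals instead of counting each one.
binned = s.value_counts(bins=3, sort=False)
print(binned)
```

The index of the result holds the intervals, so the output reads like a small histogram table.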

Elsewhere, pandas.cut() is a convenient way to bin values into arbitrary intervals. Let’s say
you have some data on ages of individuals and want to bucket them sensibly:

>>> ages = pd.Series(
...     [1, 1, 3, 5, 8, 10, 12, 15, 18, 18, 19, 20, 25, 30, 40, 51, 52])
>>> bins = (0, 10, 13, 18, 21, np.inf) # The edges
>>> labels = ('child', 'preteen', 'teen', 'military_age', 'adult')
>>> groups = pd.cut(ages, bins=bins, labels=labels)

>>> groups.value_counts()
child 6
adult 5
teen 3
military_age 2
preteen 1
dtype: int64

>>> pd.concat((ages, groups), axis=1).rename(columns={0: 'age', 1: 'group'})
    age         group
0     1         child
1     1         child
2     3         child
3     5         child
4     8         child
5    10         child
6    12       preteen
7    15          teen
8    18          teen
9    18          teen
10   19  military_age
11   20  military_age
12   25         adult
13   30         adult
14   40         adult
15   51         adult
16   52         adult

What’s nice is that both of these operations ultimately utilize Cython code that makes them
competitive on speed while maintaining their flexibility.
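
One detail worth knowing: pd.cut() closes each interval on the right by default, which is why age 10 lands in 'child' above. The sketch below, using a shortened version of the ages data, shows how passing right=False shifts the boundary values into the next bucket:

```python
import numpy as np
import pandas as pd

ages = pd.Series([8, 10, 12, 18, 19])
bins = (0, 10, 13, 18, 21, np.inf)
labels = ('child', 'preteen', 'teen', 'military_age', 'adult')

# Default: intervals are (0, 10], (10, 13], (13, 18], ...
right_closed = pd.cut(ages, bins=bins, labels=labels)
# right=False: intervals are [0, 10), [10, 13), [13, 18), ...
left_closed = pd.cut(ages, bins=bins, labels=labels, right=False)

print(pd.DataFrame({'age': ages,
                    'right=True': right_closed,
                    'right=False': left_closed}))
```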

# Importing libraries
import pandas as pd
import docx

doc = docx.Document()

# Load the source data.
df = pd.read_excel("univinfo.xlsx")

#mystat = pd.crosstab(datafile['college'],datafile['sex'],margins=True)

# Cross-tabulate sex by center, college, and department, with totals.
mystat = pd.crosstab(columns=df['sex'],index=[df["center"],df["college"],
df["department"]],margins=True)
mystat.plot(kind='bar', stacked=True, color=['red','blue'], grid=False)

# Cross-tabulate sex by college only, without totals.
mystat2 = pd.crosstab(columns=df['sex'],index=df["college"],margins=False)
mystat2.plot(kind='barh', stacked=False, color=['red','blue'], grid=False).legend(
    loc='upper center', ncol=3, title="Year of Eating")

mystat.to_excel('outputstat2.xlsx', index=True)
print(mystat)

# ============ Legend `loc` values ============
# best, upper right, upper left, lower left, lower right, right,
# center left, center right, lower center, upper center, center
# =============================================

Set Pandas Conditional Column Based on Values of Another Column

There are many times when you may need to set a Pandas column value based on the condition
of another column. In this post, you’ll learn all the different ways in which you can create Pandas
conditional columns.

Loading a Sample Dataframe

Let’s begin by loading a sample Pandas dataframe that we can use throughout this tutorial.

We’ll begin by importing pandas and loading a dataframe using the .from_dict() method:

import pandas as pd

df = pd.DataFrame.from_dict(
{
'Name': ['Jane', 'Melissa', 'John', 'Matt'],
'Age': [23, 45, 35, 64],
'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
'Gender': ['F', 'F', 'M', 'M']
}
)

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender
0     Jane   23     London      F
1  Melissa   45      Paris      F
2     John   35    Toronto      M
3     Matt   64    Atlanta      M

Using Pandas loc to Set Pandas Conditional Column


Pandas loc is incredibly powerful! If you need a refresher on loc (or iloc), check out my tutorial
here. Pandas’ loc accepts a boolean mask built from a condition. Sometimes you use it simply to
select rows and columns, but it can also be used to filter dataframes. These filtered
dataframes can then have values assigned to them.

Let’s explore the syntax a little bit:

df.loc[df['column'] condition, 'new column name'] = 'value if condition is met'

With the syntax above, we filter the dataframe using .loc and then assign a value to any row in
the column (or columns) where the condition is met.

Let’s try this out by assigning the string ‘Under 30’ to anyone with an age less than 30, and
‘Over 30’ to anyone 30 or older.

df['Age Category'] = 'Over 30'


df.loc[df['Age'] < 30, 'Age Category'] = 'Under 30'

Let's take a look at what we did here:

1. We assigned the string 'Over 30' to every record in the dataframe. To learn more about
this, check out my post here on creating new columns.
2. We then used .loc with a boolean mask on the Age column to filter down to rows
where the age is less than 30. Where this condition is met, the Age Category column is
assigned the new value 'Under 30'.
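
To make the mask visible, the sketch below rebuilds a two-column version of the sample dataframe and stores the condition in a variable (the name under_30 is illustrative) before handing it to .loc:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jane', 'Melissa', 'John', 'Matt'],
    'Age': [23, 45, 35, 64],
})

# The comparison yields one True/False value per row.
under_30 = df['Age'] < 30
print(under_30.tolist())

df['Age Category'] = 'Over 30'                 # default for every row
df.loc[under_30, 'Age Category'] = 'Under 30'  # overwrite only where the mask is True
print(df)
```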

But what happens when you have multiple conditions? You could, of course, use .loc multiple
times, but this is difficult to read and fairly unpleasant to write. Let's see how we can accomplish
this using NumPy's np.select() function.

Using Numpy Select to Set Values using Multiple Conditions


Similar to using .loc above to create a conditional column in Pandas, we can use
NumPy's np.select() function.

Let's begin by importing numpy and we'll give it the conventional alias np :

import numpy as np

Now, say we wanted to apply a number of different age groups, as below:

 <20 years old,


 20-39 years old,
 40-59 years old,
 60+ years old

In order to do this, we'll create a list of conditions and corresponding values to fill:

conditions = [
    (df['Age'] < 20),
    (df['Age'] >= 20) & (df['Age'] < 40),
    (df['Age'] >= 40) & (df['Age'] < 60),  # upper bound is 60 so that age 59 is covered
    (df['Age'] >= 60)
]

values = ['<20 years old', '20-39 years old', '40-59 years old', '60+ years old']

df['Age Group'] = np.select(conditions, values)

print(df)

Running this returns the following dataframe:

      Name  Age Birth City Gender        Age Group
0     Jane   23     London      F  20-39 years old
1  Melissa   45      Paris      F  40-59 years old
2     John   35    Toronto      M  20-39 years old
3     Matt   64    Atlanta      M    60+ years old

Let's break down what happens here:

 We first define a list of conditions in which the criteria are specified. Recall that lists are
ordered meaning that they should be in the order in which you would like the
corresponding values to appear.
 We then define a list of values to use, which corresponds to the values you'd like applied
in your new column.
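
Two behaviors of np.select() are worth seeing in a small sketch: when several conditions are true for a row, the first one in the list wins, and rows matching no condition receive the default (0 unless you pass default=). The ages below are made up for illustration, and the explicit string default is an assumption added here to keep every output value a string:

```python
import numpy as np
import pandas as pd

ages = pd.Series([15, 23, 45, 64, 130])

conditions = [
    ages < 20,
    ages < 40,    # also true for 15, but the first match wins
    ages < 60,
    ages < 120,
]
values = ['<20 years old', '20-39 years old', '40-59 years old', '60+ years old']

# Rows that satisfy none of the conditions fall back to the default.
groups = np.select(conditions, values, default='Unknown')
print(groups.tolist())
```

Relying on the first-match rule lets you drop the lower bounds from each condition, at the cost of making the order of the list significant.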

Something to consider here is that this can be a bit counterintuitive to write. You can similarly
define a function to apply different values. We'll cover this in the section on using the
Pandas .apply() method below.

One of the key benefits is that NumPy is very fast, especially when compared to using
the .apply() method.

Using Pandas Map to Set Values in Another Column

The Pandas .map() method is very helpful when you're applying labels to another column. In
order to use this method, you define a dictionary to apply to the column.

For our sample dataframe, let's imagine that we have offices in America, Canada, and France.
We want to map the cities to their corresponding countries and apply an "Other" value for any
other city.

city_dict = {
'Paris': 'France',
'Toronto': 'Canada',
'Atlanta': 'USA'
}

df['Country'] = df['Birth City'].map(city_dict)

print(df)

When we print this out, we get the following dataframe returned:

      Name  Age Birth City Gender Country
0     Jane   23     London      F     NaN
1  Melissa   45      Paris      F  France
2     John   35    Toronto      M  Canada
3     Matt   64    Atlanta      M     USA

What we can see here is that there is a NaN value associated with any city that doesn't have a
corresponding country. If we want to apply "Other" to any missing values, we can chain the
.fillna() method:

city_dict = {
'Paris': 'France',
'Toronto': 'Canada',
'Atlanta': 'USA'
}

df['Country'] = df['Birth City'].map(city_dict).fillna('Other')

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender Country
0     Jane   23     London      F   Other
1  Melissa   45      Paris      F  France
2     John   35    Toronto      M  Canada
3     Matt   64    Atlanta      M     USA
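
An alternative sketch, if you would rather supply the fallback during the lookup itself: .map() also accepts a function, so the dictionary's .get() method with a default can stand in for the .fillna() step. The dataframe is trimmed to the one column needed here:

```python
import pandas as pd

df = pd.DataFrame({'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta']})
city_dict = {'Paris': 'France', 'Toronto': 'Canada', 'Atlanta': 'USA'}

# Dictionary lookup, then fill the gaps afterwards.
via_fillna = df['Birth City'].map(city_dict).fillna('Other')

# Function lookup with the fallback built in.
via_get = df['Birth City'].map(lambda city: city_dict.get(city, 'Other'))

print(via_fillna.tolist())
print(via_get.tolist())
```

The dictionary form is usually the more readable of the two; the function form is handy when the fallback depends on the value being looked up.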

Using Pandas Apply to Apply a Function to a Column


Finally, you can apply built-in or custom functions to a dataframe using the Pandas .apply()
method.

Let's take a look at both applying built-in functions such as len() and even applying custom
functions.

Applying Python Built-in Functions to a Column

We can easily apply a built-in function using the .apply() method. Let's see how we can use the
len() function to get the length of each string in a given column.

df['Name Length'] = df['Name'].apply(len)

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender  Name Length
0     Jane   23     London      F            4
1  Melissa   45      Paris      F            7
2     John   35    Toronto      M            4
3     Matt   64    Atlanta      M            4

Take note of a few things here:

 We call the .apply() method on a particular column,
 We pass len without the parentheses "()", because .apply() needs the function itself
rather than the result of calling it.
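
As a side-by-side sketch, the same lengths can be computed with the vectorized .str.len() accessor, which is a common alternative for string columns. The one-column dataframe is recreated here so the snippet runs on its own:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jane', 'Melissa', 'John', 'Matt']})

via_apply = df['Name'].apply(len)   # pass the function object, not len()
via_str = df['Name'].str.len()      # vectorized string accessor

print(via_apply.tolist())
print(via_str.tolist())
```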

Using Third-Party Packages in Pandas Apply

Similarly, you can use functions from third-party packages. Let's use NumPy's sqrt()
function to find the square root of a person's age.

import numpy as np

df['Age Squareroot'] = df['Age'].apply(np.sqrt)

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender  Age Squareroot
0     Jane   23     London      F        4.795832
1  Melissa   45      Paris      F        6.708204
2     John   35    Toronto      M        5.916080
3     Matt   64    Atlanta      M        8.000000

Using Custom Functions with Pandas Apply

Something that makes the .apply() method extremely powerful is the ability to define and
apply your own functions.

Let's revisit how we could use an if-else statement to create age categories as in our earlier
example:

def age_groups(x):
    if x < 20:
        return '<20 years old'
    elif x < 40:
        return '20-39 years old'
    elif x < 60:
        return '40-59 years old'
    else:
        return '60+ years old'

df['Age Group'] = df['Age'].apply(age_groups)

print(df)

This returns the following dataframe:

      Name  Age Birth City Gender        Age Group
0     Jane   23     London      F  20-39 years old
1  Melissa   45      Paris      F  40-59 years old
2     John   35    Toronto      M  20-39 years old
3     Matt   64    Atlanta      M    60+ years old

Conclusion
In this post, you learned a number of ways in which you can apply values to a dataframe column
to create a Pandas conditional column, including using .loc, np.select(), Pandas .map() and
Pandas .apply(). Each of these methods has a different use case that we explored throughout
this post.
