PP - Module-3 Notes

Uploaded by

bhuvanvasa23s

MODULE-3

REGULAR EXPRESSION (RE)


INTRODUCTION
 Manipulating text or data by hand is a big task that consumes a great deal of time and effort.
 Today computers handle word processing, "fill-out-form" Web pages, streams of information coming from a database dump, stock quote information, news feeds; the list goes on and on.
 Because we may not know the exact text or data that our machines will have to process, it is advantageous to describe this text or data as patterns that a machine can recognize and act upon.
 This raises the question of how we can program machines to look for patterns in text.
 Regular Expressions (REs) provide such an infrastructure for advanced text pattern matching, extraction, and search-and-replace functionality.
 A Regular Expression, also called a Regex or RE, is a special type of string that can be used for matching terms or words inside a string.
 REs are simply strings that use special symbols and characters to indicate pattern repetition or to represent multiple characters, so that they can "match" a set of strings with similar characteristics described by the pattern.
 In other words, they enable matching of multiple strings with the help of a single RE pattern.
 Python supports REs through the standard library re module.

Features of RegEx:
 Hundreds of lines of code can often be reduced to a one-line regular expression
 Used to construct compilers, interpreters and text editors
 Used to search and match text patterns
 Used to validate text data formats, especially input data
 Popular programming languages have Regex capabilities: Python, Perl, JavaScript, Ruby, C++, C#

General uses of RegEx:


 Search a string (search and match)
 Replace parts of a string (sub)
 Break a string into small pieces (split)
 Find all occurrences of a pattern in a string (findall)
 Before using regular expressions in your program, you must import the library using "import re"
Example of RegEx:
 Consider the RE "[A-Za-z]\w+". It means that the first character should be alphabetic, i.e., either A–Z or a–z, followed by at least one (+) word character (\w), that is, a letter, digit or underscore.
 An RE acts like a filter: many strings go into the filter, but the only ones to come out are the ones we asked for via the RE.
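The filtering idea can be sketched in a few lines. The list of input strings below is made up purely for illustration, and fullmatch() is used here so that the whole string has to fit the pattern; this is one possible way to apply the RE, not the only one.

```python
import re

# Hypothetical input strings, chosen only for illustration.
candidates = ["Python3", "x", "9lives", "regex_101", "_tmp"]

# First character alphabetic, then one or more word characters.
pattern = re.compile(r"[A-Za-z]\w+")

# fullmatch() succeeds only if the entire string fits the pattern.
accepted = [s for s in candidates if pattern.fullmatch(s)]
print(accepted)
```

Only the strings that start with a letter and contain at least two characters pass through the filter.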
SPECIAL SYMBOLS AND CHARACTERS
MetaCharacters
 Metacharacters are characters that are interpreted in a special way by a RegEx engine.
Here's a list of metacharacters:
[] . ^ $ * + ? {} () \ |

1. [] - Character class (Squarebrackets):


 Square brackets specify a set of characters you wish to match. A character class matches any single character from the set.
 Here, [abc] will match if the string you are trying to match contains any of a, b or c.
 You can also specify a range of characters using - inside square brackets.
[a-e] is the same as [abcde].
[1-4] is the same as [1234].
[0-9] is the same as [0123456789].
 You can complement (invert) the character set by using the caret ^ symbol at the start of a square bracket:
[^abc] means any character except a or b or c.
[^0-9] means any non-digit character.

Example to demonstrate a character class:


import re
pattern1 = "[abcdefghijklm]"
pattern2 = "[^i]"
pattern3 = "[0-9]"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string)
print("Printing all the characters from a to m in the given string:")
print(result)
result = re.findall(pattern2, string)
print("Printing all the characters of the string except i:")
print(result)
result = re.findall(pattern3, string)
print("Printing numbers from the given string:")
print(result)

Output:
Printing all the characters from a to m in the given string:
['h', 'i', 'i', 'h', 'g', 'a', 'm', 'm', 'i', 'g', 'd', 'l', 'e']
Printing all the characters of the string except i:
['T', 'h', 's', ' ', 's', ' ', 'P', 'y', 't', 'h', 'o', 'n', ' ', 'p', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'n', 'g', ' ', 'M', 'o', 'd', 'u', 'l',
'e', '-', '3']
Printing numbers from the given string:
['3']

2. Dot (.) or Period Symbol:


 A period matches any single character (except newline '\n').
 Whether letter, number, whitespace (not including '\n'), printable, nonprintable, or a symbol, the dot can match them all.
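The newline exception can be checked with a short sketch. The two-line test string is an assumption made for illustration; re.DOTALL is the standard flag that lifts the exception.

```python
import re

text = "a\nb"

# By default the dot refuses to match the newline between 'a' and 'b'.
default_match = re.search(r"a.b", text)

# With re.DOTALL (alias re.S) the dot matches the newline as well.
dotall_match = re.search(r"a.b", text, re.DOTALL)

print(default_match)
print(repr(dotall_match.group()))
```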
Example to demonstrate a dot or period symbol:
import re
pattern1 = ".o"
pattern2 = "p....n"
pattern3 = "p........n"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string)
print("Printing two-character matches that end with 'o':")
print(result)
result = re.findall(pattern2, string, re.I)
print("Printing a six-character match that starts with 'p/P' and ends with 'n':")
print(result)
result = re.findall(pattern3, string)
print("Printing a ten-character match that starts with 'p' and ends with 'n':")
print(result)

Output:
Printing two-character matches that end with 'o':
['ho', 'ro', 'Mo']
Printing a six-character match that starts with 'p/P' and ends with 'n':
['Python']
Printing a ten-character match that starts with 'p' and ends with 'n':
['programmin']

3. Caret (^) Symbol:


 The caret symbol ^ is used to check if a string starts with a certain character or sequence of characters.
Example to demonstrate a caret (^) symbol:
import re
pattern1 = "^T"
pattern2 = "^This is"
pattern3 = "^Python"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string)
print("Printing a first character of string:")
print(result)
result = re.findall(pattern2, string)
print("Printing the first two words of string:")
print(result)
result = re.findall(pattern3, string)
print("Printing an empty list:")
print(result)

Output:
Printing a first character of string:
['T']
Printing the first two words of string:
['This is']
Printing an empty list:
[]

4. Dollar ($) Symbol:
 The dollar symbol $ is used to check if a string ends with a certain character or sequence of characters.

Example to demonstrate a dollar ($) symbol:


import re
pattern1 = "3$"
pattern2 = "Module-3$"
pattern3 = "Python$"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string)
print("Printing last character of string:")
print(result)
result = re.findall(pattern2, string)
print("Printing last word of string:")
print(result)
result = re.findall(pattern3, string)
print("Printing an empty list:")
print(result)

Output:
Printing last character of string:
['3']
Printing last word of string:
['Module-3']
Printing an empty list:
[]

5. Star (*) Symbol:
 The star symbol * matches zero or more occurrences of the pattern left to it.

Example to demonstrate using asterisk or star (*) operator:


import re
pattern1 = "program*ii*ng"
pattern2 = "[A-Za-z]*"
pattern3 = "[^A-Za-z]*"
string = "This is Python programming Module-3"
result=re.findall(pattern1,string)
print("Printing a character where repetition of letter is obtained:")
print(result)
result=re.findall(pattern2,string)
print("Printing all alphabets of a string:")
print(result)
result=re.findall(pattern3,string)
print("Printing all characters except alphabets:")
print(result)
Output:
Printing a character where repetition of letter is obtained:
['programming']
Printing all alphabets of a string:
['This', '', 'is', '', 'Python', '', 'programming', '', 'Module', '', '', '']
Printing all characters except alphabets:
['', '', '', '', ' ', '', '', ' ', '', '', '', '', '', '', ' ', '', '', '', '', '', '', '', '', '', '', '', ' ', '', '', '', '', '', '', '-3', '']

6. Plus (+) symbol:
 The plus symbol + matches one or more occurrences of the pattern left to it.

Example to demonstrate using plus (+) operator:


import re
pattern1 = "program+ii+ng"
pattern2 = "[A-Za-z]+"
pattern3 = "[^A-Za-z]+"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string)
print("Printing an empty list:")
print(result)
result=re.findall(pattern2,string)
print("Printing all alphabets of a string:")
print(result)
result=re.findall(pattern3,string)
print("Printing all characters except alphabets:")
print(result)

Output:
Printing an empty list:
[]
Printing all alphabets of a string:
['This', 'is', 'Python', 'programming', 'Module']
Printing all characters except alphabets:
[' ', ' ', ' ', ' ', '-3']

7. Question Mark (?) Symbol:


 The question mark symbol (?) matches zero or one occurrence of the pattern left to it.
Example to demonstrate using question mark (?) symbol:
import re
pattern1 = "program?ing"
pattern2 = "Pythonn?"
pattern3 = "Python?"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string)
print("Printing an empty list:")
print(result)
result=re.findall(pattern2,string)
print("Printing a word of a string:")
print(result)
result=re.findall(pattern3,string)
print("Printing a word of a string:")
print(result)

Output:
Printing an empty list:
[]
Printing a word of a string:
['Python']
Printing a word of a string:
['Python']

8. Braces {}:
 Consider this code: {n,m}. This means at least n, and at most m repetitions of the pattern left to it. The form {n} means exactly n repetitions.
Example to demonstrate using braces ({}) symbol:
import re
pattern1 = "m{1,2}"
pattern2 = "m{2}"
pattern3 = "m{3}"
string = "This is Python programming Module-3"
result = re.findall(pattern1, string, re.I)
print("Printing m characters where minimum it contain 1 and maximum it contain 2 times:")
print(result)
result = re.findall(pattern2, string)
print("Printing m characters if a string contain word with m repeated for 2 times:")
print(result)
result = re.findall(pattern3, string)
print("Printing an empty list:")
print(result)

Output:
Printing m characters where minimum it contain 1 and maximum it contain 2 times:
['mm', 'M']
Printing m characters if a string contain word with m repeated for 2 times:
['mm']
Printing an empty list:
[]

9. Alternation ( | ) or Pipe symbol:


 The pipe symbol ( | ), a vertical bar on your keyboard, indicates an alternation operation, meaning that it is used to choose one of the different regular expressions separated by the pipe symbol.
 For example, the pattern at|home matches either the string at or the string home, and bat|bet|bit matches bat, bet or bit.
 With this one symbol, we have just increased the flexibility of our regular expressions, enabling the matching of more than just one string.
 Alternation is also sometimes called union or logical OR.

Example to demonstrate using alternation (pipe) | symbol:


import re
pattern = "RegExr|RegEx"
string="RegEx,RegExr,Hello"
result=re.findall(pattern,string)
print("The strings which matches with the given pattern are:")
print(result)

Output:
The strings which matches with the given pattern are:
['RegEx', 'RegExr']

10. Group - ():


 Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any string that contains either a or b or c followed by xz.
 A pair of parentheses ( ) can accomplish either (or both) of the following when used with regular expressions:
1. grouping regular expressions
2. matching subgroups
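Both roles of the parentheses can be seen in a minimal sketch that reuses the (a|b|c)xz pattern from above. The input string is an assumption chosen for illustration.

```python
import re

# The parentheses limit the alternation to a single character (grouping)
# and also remember which alternative was chosen (subgroup matching).
match = re.search(r"(a|b|c)xz", "try bxz here")

whole = match.group()     # the entire matched text
chosen = match.group(1)   # the text captured by the first pair of parentheses
print(whole, chosen)
```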

Example to demonstrate using group () - symbol:


import re
pattern = "g(oog)+le"
string = "gogle,googoogle"
result = re.search(pattern, string)
if result is not None:
    print("The string which matches with the given pattern is:")
    print(result.group())

Output:
The string which matches with the given pattern is:
googoogle

11. Backslash (\):


 Backslash \ is used to escape various characters, including all metacharacters.
 For example, \$a matches if a string contains $ followed by a.
 Here, $ is not interpreted by the RegEx engine in a special way.
 If you are unsure whether a character has a special meaning or not, you can put \ in front of it.
 This makes sure the character is not treated in a special way.
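A small sketch of the \$a case follows. The sample string is invented for illustration. re.escape() is a standard helper in the re module that inserts the backslashes for you when the text to match comes from a variable.

```python
import re

price = "cost: $a and $b"

# Unescaped, $ is the end-of-string anchor, so '$a' can never match here.
unescaped = re.findall(r"$a", price)

# Escaped, \$ stands for a literal dollar sign.
escaped = re.findall(r"\$a", price)

# re.escape() backslash-escapes the special characters in a string.
auto = re.escape("$a")
auto_result = re.findall(auto, price)
print(unescaped, escaped, auto_result)
```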
Example to demonstrate using Backslash (\) symbol:
import re
# A raw string keeps the backslashes intact for the RegEx engine.
pattern = r"Mr\.([A-Z]|[a-z]|\s)+[\(][\$][\)]"
string = """Mr.John is a famous physician
He earns huge amount in dollars($)
He is an USA citizen"""
result = re.search(pattern, string)
if result is not None:
    print("The string which matches with the given pattern is:")
    print(result.group())
Output:
The string which matches with the given pattern is:
Mr.John is a famous physician
He earns huge amount in dollars($)

Special Sequences:
 Special sequences make commonly used patterns easier to write. Here's a list of special sequences:
\A, \b, \B, \d, \D, \s, \S, \w, \W, \Z

1. \A - Matches if the specified characters are at the start of a string.

Example to demonstrate using \A- symbol:


import re
pattern1 = r"\APython"
pattern2 = r"\Aprogramming"
string = "Python is a good programming language"
result1 = re.findall(pattern1, string)
result2 = re.findall(pattern2, string)
print("Printing the first word of the string")
print(result1)
print("Printing any word other than first word of the string")
print(result2)
Output:
Printing the first word of the string
['Python']
Printing any word other than first word of the string
[]
2. \b - Matches if the specified characters are at the beginning or end of a word.

Example to demonstrate using \b- symbol at beginning of a word:


import re
string = input("Enter string:")
result1 = re.search(r"\bfoo", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: foot
Match

Output 2:
Enter string: afootball
No Match

Example to demonstrate using \b- symbol at end of a word:


import re
string = input("Enter string:")
result1 = re.search(r"foo\b", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: afoo
Match

Output 2:
Enter string: football
No Match
3. \B - Opposite of \b. Matches if the specified characters are not at the beginning or end of a
word.

Example to demonstrate using \B- symbol at beginning of a word:


import re
string = input("Enter string:")
result1 = re.search(r"\Bfoo", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: afootball
Match

Output 2:
Enter string: football
No Match

Example to demonstrate using \B- symbol at end of a word:


import re
string = input("Enter string:")
result1 = re.search(r"foo\B", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: foot
Match

Output 2:
Enter string: a foo
No Match
4. \d - Matches any decimal digit. For ASCII text this is equivalent to [0-9].

Example to demonstrate using \d-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"\d", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: 123
Match

Output 2:
Enter string: Python
No Match

5. \D - Matches any character that is not a decimal digit. For ASCII text this is equivalent to [^0-9].

Example to demonstrate using \D-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"\D", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: Python
Match

Output 2:
Enter string: 123
No Match
6. \s - Matches where a string contains any whitespace character. For ASCII text this is equivalent to [ \t\n\r\f\v] (note the space at the start of the set).

Example to demonstrate using \s-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"\s", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: Python Language
Match

Output 2:
Enter string: PythonLanguage
No Match

7. \S - Matches where a string contains any non-whitespace character. For ASCII text this is equivalent to [^ \t\n\r\f\v].

Example to demonstrate using \S-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"\S", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: PythonLanguage
Match

Output 2:
Enter string: (Empty Space)
No Match
8. \w - Matches any word character: letters, digits and the underscore _. For ASCII text this is equivalent to [a-zA-Z0-9_]. In Python 3 string patterns it also matches letters and digits from other scripts unless the re.ASCII flag is used.

Example to demonstrate using \w-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"\w", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: Python_3
Match

Output 2:
Enter string: !$#@|)
No Match

9. \W - Matches any non-word character. For ASCII text this is equivalent to [^a-zA-Z0-9_].

Example to demonstrate using \W-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"\W", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: !$#@|)
Match
Output 2:
Enter string: Python_3
No Match

10. \Z - Matches if the specified characters are at the end of a string.

Example to demonstrate using \Z-symbol:


import re
string = input("Enter string:")
result1 = re.search(r"Python\Z", string)
if result1 is not None:
    print("Match")
else:
    print("No Match")

Output 1:
Enter string: I like Python
Match
Output 2:
Enter string: Python is fun
No Match

REs AND PYTHON


 Now that we know all about regular expressions, we can examine how Python currently supports regular expressions through the re module.
 The re module was introduced to Python in version 1.5.
 Regular expressions are still regular expressions, so most of the basic concepts from this section also apply to the older regex and regsub modules.
 In contrast to those older modules, the re module supports the more powerful and regular Perl-style (Perl 5) REs, allows multiple threads to share the same compiled RE objects, and supports named subgroups.
 The re engine was rewritten in 1.6 for performance enhancements as well as adding Unicode support.
 The interface was not changed, which is why the module name was left alone.
re Module: (Core Functions and Methods)
 In this section, we introduce some of the basic functions and methods that use regular expressions to retrieve information from files or long text strings.
 The functions and methods of this module show how to use regular expressions in Python.
 First, Python regular expression matching is not available until the re module is loaded:
import re
 Second, while regular expressions can be defined with simple strings in the RE language, it is more efficient, for matches that will be used over and over, to compile the expression into a Python regular expression object. The regular expression class defines match, search, findall, and other useful methods.

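Common Regular Expression Functions and Methods

Function/Method                        Description
compile(pattern, flags=0)              Compile a pattern into a regular expression object
match(pattern, string, flags=0)        Match the pattern at the beginning of the string; return a match object or None
search(pattern, string, flags=0)       Find the first occurrence of the pattern anywhere in the string; return a match object or None
findall(pattern, string, flags=0)      Return a list of all non-overlapping matches of the pattern
split(pattern, string, maxsplit=0)     Split the string at each match of the pattern; return a list
sub(pattern, repl, string, count=0)    Replace matches of the pattern with repl; return the new string
group(), groups()                      Match object methods that return the whole match or its subgroups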

1. Compiling REs with compile():


 Compile a regular expression pattern into a regular expression object, which can be used for matching using its match(), search() and other methods, described below:
re.compile(pattern, flags=0)
 The expression's behavior can be modified by specifying a flags value.
 The sequence
pattern_object = re.compile(pattern)
result = pattern_object.match(string)
is equivalent to result = re.match(pattern, string), but using re.compile() and saving the resulting regular expression object for reuse is more efficient when the expression will be used several times in a single program.
 So with the help of re.compile() we can search with the same pattern again without rewriting it.
Program:
import re
pattern = re.compile(r'[A-Z]+')
result1 = pattern.findall('MREC is a top reputed college')
print(result1)
result2 = pattern.findall('MREC is affiliated to JNTU')
print(result2)

Output:
['MREC']
['MREC', 'JNTU']
RegEx Flags
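A few commonly used flags of the re module are re.I (ignore case), re.M (multi-line anchors) and re.S (dot matches newline). The sketch below tries them on a made-up two-line string; the sample text is an assumption, while the flag names are the standard ones.

```python
import re

text = "Python is fun\npython is free"

# re.I / re.IGNORECASE: letters match regardless of case.
ignore_case = re.findall(r"python", text, re.I)

# re.M / re.MULTILINE: ^ and $ also match at the start and end of each line.
line_starts = re.findall(r"^\w+", text, re.M)

# re.S / re.DOTALL: the dot also matches '\n'.
across_lines = re.search(r"fun.python", text, re.S)

# Flags can be combined with the | operator.
combined = re.findall(r"^python", text, re.I | re.M)
print(ignore_case, line_starts, combined)
```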

Regular expression objects:


 Python regular expression objects are compiled regular expressions created by re.compile.
 Consider an example where a RegEx (for example 'b.+') is compiled into an object:
import re
pattern = re.compile(r'b.+')
 The name pattern now refers to a Python regular expression object. Regular expression objects are then used to find patterns in and manipulate strings.
 The five most commonly used regular expression methods are:
1. search
2. match
3. findall
4. split
5. sub / subn

 Each of these methods on a compiled regular expression object has a corresponding module-level function which can be applied to strings.
 Thus, using the regular expression object pattern created previously we could do:
pattern.search('ba')
 Equivalently, we could bypass the step in which we created a regular expression object and just do:
re.search(r'b.+', 'ba')
 The same goes for match, split, findall and sub.
 The difference between these two is efficiency. Using the regular expression object avoids looking up the pattern again on every call. The re module caches recently compiled patterns, so the gain is usually small, but it can matter in tight loops.
 If you are doing multiple searches with the same pattern, use a regular expression object.
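The two styles give the same answers, as this small sketch suggests. The list of words is hypothetical.

```python
import re

words = ["ba", "bat", "b", "cab"]   # hypothetical inputs

# Compile once, reuse many times.
pattern = re.compile(r"b.+")
with_object = [bool(pattern.search(w)) for w in words]

# Module-level function: same result, pattern passed on every call.
with_function = [bool(re.search(r"b.+", w)) for w in words]
print(with_object, with_function)
```

Here "b" and "cab" fail because .+ needs at least one character after the b.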

1. search method: Finding Pattern in Text


re.search(): This function searches the string for the regular expression pattern and returns the first occurrence.
 Unlike re.match(), it scans the whole input string, not only its beginning.
 The re.search() function returns a match object when the pattern is found and None if the pattern is not found.
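Because search() may return None, the result should be tested before calling methods on it. A minimal sketch follows; the helper name find_number and the sample strings are assumptions for illustration.

```python
import re

def find_number(text):
    """Return the first run of digits in text, or None if there is none."""
    result = re.search(r"\d+", text)
    # search() returns None when nothing matches, so test before using it.
    if result is None:
        return None
    return result.group()

found = find_number("Room 42, floor 3")
missing = find_number("no digits here")
print(found, missing)
```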

Using RegEx Object:


 If we take a regular expression object as input, the syntax to perform the search method is stated below:
result = pattern.search(Text)
 The result is a match object or None.
Note: Here pattern is a Regular Expression object and Text can be a string on which we perform the search operation.

Example:
import re
pattern = re.compile(r"(\+\d+)-(\d{10})")
result = pattern.search("The contact number is:+91-9012345678")
print(result)

Output:
<_sre.SRE_Match object; span=(22, 36), match='+91-9012345678'>
Using RegEx:
 If we take a regular expression as input, the syntax to perform the search method is stated below:
result = re.search(pattern, text)
 The result is a match object or None.
Note: Here pattern is a Regular Expression and text can be a string on which we perform the search operation.

Example:
import re
result = re.search(r"(\+\d+)-(\d{10})", "The contact number is:+91-9012345678")
print(result)

Output:
<_sre.SRE_Match object; span=(22, 36), match='+91-9012345678'>

2. match method:
 re.match(): This function checks whether the regular expression pattern matches at the beginning of the string.
 The Python RegEx match method checks for a match only at the beginning of the string.
 So, if the pattern matches starting at the first character, it returns the match object.
 But if the pattern occurs only later in the string, including at the start of a later line, then the Python RegEx match function returns None.
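The difference between match() and search() on a multi-line string can be sketched as follows. The two-line text is an assumption for illustration.

```python
import re

text = "first line\nsecond line"

at_start = re.match(r"first", text)           # matches at index 0
not_at_start = re.match(r"second", text)      # None: not at the start

# Even with re.MULTILINE, match() only tries the very start of the string.
multiline_match = re.match(r"second", text, re.MULTILINE)

# search() scans the whole string, so it finds the word on line two.
found_by_search = re.search(r"second", text)
print(at_start, not_at_start, multiline_match, found_by_search)
```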

Using RegEx Object:


 If we take a regular expression object as input, the syntax to perform the match method is stated below:
result = pattern.match(Text)
 The result is a match object or None.
Note: Here pattern is a Regular Expression object and Text can be a string on which we perform the match operation.

Example:
import re
pattern = re.compile(r"(\+\d+)-(\d{10})")
result = pattern.match("The contact number is:+91-9012345678")
print(result)
Output:
None

Using RegEx:
 If we take a regular expression as input, the syntax to perform the match method is stated below:
result = re.match(pattern, text)
 The result is a match object or None.
Note: Here pattern is a Regular Expression and text can be a string on which we perform the match operation.
Example:
import re
result = re.match(r"(\+\d+)-(\d{10})", "+91-9012345678 is contact number")
print(result)
Output:
<_sre.SRE_Match object; span=(0, 14), match='+91-9012345678'>

Match objects
 Python match objects are what the regular expression methods search and match return.
 The two most important methods of match objects are group and groups.

group(*groupN):
 *groupN refers to any number of arguments, including none.
 Returns one or more subgroups of the match.
 If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument.
 With no arguments, *groupN defaults to a single zero argument.
 If a *groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the corresponding parenthesized group.
 If a group number is negative or larger than the number of groups defined in the pattern, an IndexError exception is raised.
 If a group is contained in a part of the pattern that did not match, the corresponding result is None.
 If a group is contained in a part of the pattern that matched multiple times, the last match is returned.
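The last three rules are easy to overlook, so here is a minimal sketch of them. The pattern and input are made up for illustration.

```python
import re

# Group 2 is optional, group 3 repeats.
m = re.match(r"(a)(b)?(c)+", "accc")

unmatched = m.group(2)    # the optional group took no part in the match
repeated = m.group(3)     # only the last repetition is kept
pair = m.group(1, 3)      # several arguments give a tuple

try:
    m.group(4)            # the pattern defines only three groups
    out_of_range = False
except IndexError:
    out_of_range = True
print(unmatched, repeated, pair, out_of_range)
```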

groups(default_value=None)
 Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.
 The default_value argument is used for groups that did not participate in the match; it defaults to None.
result = re.match(r"(\+\d+)-(\d{10})", "+91-9012345678 is contact number")
print(result.group()) #Entire Match Output: +91-9012345678
print(result.group(0)) #Entire Match Output: +91-9012345678
print(result.group(1)) #Sub group 1 Output: +91
print(result.group(2)) #Sub group 2 Output: 9012345678
print(result.group(1,2)) #Multiple arguments give a tuple Output: ('+91', '9012345678')
print(result.groups()) #All subgroups as a tuple Output: ('+91', '9012345678')
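The effect of the default_value argument of groups() shows up only when a group does not take part in the match. A small sketch with an invented pattern:

```python
import re

# The second group is optional and does not take part in this match.
m = re.match(r"(\d+)(?:-(\d+))?", "2021")

plain = m.groups()             # missing group reported as None
with_default = m.groups("0")   # missing group replaced by the default
print(plain, with_default)
```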
 Apart from these two most common methods, match objects have some other commonly used methods and attributes:

start(): The start() function returns the index of the start of the matched substring. For the match object result of the previous example:
print(result.start()) #Output: 0

end(): It returns the end index of the matched substring.
print(result.end()) #Output: 14

span(): The span() function returns a tuple containing the start and end index of the matched part.
print(result.span()) #Output: (0, 14)

re and string: The re attribute of a match object returns the regular expression object that produced it. Similarly, the string attribute returns the string that was passed to match() or search().
print(result.re) #Output: re.compile('(\\+\\d+)-(\\d{10})')
print(result.string) #Output: +91-9012345678 is contact number

Example: Write a Python program to display the mobile number from the given string
using different match object methods and attributes.
import re
pattern=re.compile(r"(\+\d+)-(\d{10})")
result=pattern.search("The contact number is +91-9012345678")
print("The country code of mobile number is: " +result.group(1))
print("The mobile number excluding country code is: " +result.group(2))
print("Complete mobile number is: " +result.group())
print("Displaying mobile number in groups: "+str(result.groups()))
print("The starting index of the matched substring in the given string is: "+str(result.start()))
print("The end index of the matched substring in the given string is: "+str(result.end()))
print("The range index of the matched substring in the given string is: "+str(result.span()))
print("The regular expression is: "+str(result.re))
print("The string(text)of this example is: "+str(result.string))
Output:
The country code of mobile number is: +91
The mobile number excluding country code is: 9012345678
Complete mobile number is: +91-9012345678
Displaying mobile number in groups: ('+91', '9012345678')
The starting index of the matched substring in the given string is: 22
The end index of the matched substring in the given string is: 36
The range index of the matched substring in the given string is: (22, 36)
The regular expression is: re.compile('(\\+\\d+)-(\\d{10})')
The string(text)of this example is: The contact number is +91-9012345678
Named Groups:
 Groups are used in Python in order to reference parts of a regular expression match.
 By default, groups without names are referenced according to numerical order, starting with 1.
 Let's say we have a regular expression that has 3 subexpressions.
 A user enters his birthdate as day, month, and year.
 Let's say the user must first enter the day, then the month, and then the year.
 Using the group() function in Python, without named groups, the first match (the day) would be referenced using the statement group(1). The second match (the month) would be referenced using the statement group(2). The third match (the year) would be referenced using the statement group(3).
 Now, with named groups, we can name each match in the regular expression. So instead of referencing matches of the regular expression with numbers (group(1), group(2), etc.), we can reference matches with names, such as group('day'), group('month'), group('year').
 Named groups make the code more organized and more readable.
 By seeing group(1), you don't really know what this represents.
 But if you see group('month') or group('year'), you know it's referencing the month or the year.
 So named groups make code more readable and more understandable than the default numerical referencing.

Syntax:
(?P<group_name>regexp)

Example: Design a code to demonstrate Named groups.


import re
string=input("Enter Date (DD-MM-YYYY): ")
pattern= r"^(?P<day>\d+)-(?P<month>\d+)-(?P<year>\d+)"
matches= re.search(pattern, string)
print("Day printed using name of group : ", matches.group('day'))
print("Month printed using name of group: ", matches.group('month'))
print("Year printed using name of group: ", matches.group('year'))
print("Day printed using group number: ", matches.group(1))
print("Month printed using group number: ", matches.group(2))
print("Year printed using group number: ", matches.group(3))
print("Complete Date: ",matches.group())
print("Complete Date in groups format: ",matches.groups())
print("Date represented in dictionary format: ",matches.groupdict())
Output:
Enter Date (DD-MM-YYYY): 15-04-2021
Day printed using name of group: 15
Month printed using name of group: 04
Year printed using name of group: 2021
Day printed using group number: 15
Month printed using group number: 04
Year printed using group number: 2021
Complete Date: 15-04-2021
Complete Date in groups format: ('15', '04', '2021')
Date represented in dictionary format: {'year': '2021', 'day': '15', 'month': '04'}

3. findall method:
re.findall(): The findall() function is used to search for all occurrences that match a given pattern.
 In contrast, the search() function will only return the first occurrence that matches the specified pattern.
 findall() scans the whole string, across all of its lines, and returns all non-overlapping matches of the pattern in a single step.
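What each list item looks like depends on how many groups the pattern contains. The sketch below shows the three cases on the contact-number text used in these notes.

```python
import re

text = "The contact numbers are: +91-9012345678 and +91-9876543210"

# No groups: each item is the whole match.
no_groups = re.findall(r"\+\d+-\d{10}", text)

# One group: each item is the text of that group.
one_group = re.findall(r"\+\d+-(\d{10})", text)

# Several groups: each item is a tuple with one string per group.
two_groups = re.findall(r"(\+\d+)-(\d{10})", text)
print(no_groups, one_group, two_groups)
```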

Using RegEx Object:


 If we take a regular expression object as input, the syntax to perform the findall method is stated below:
result = pattern.findall(Text)
 The result is a list of the matched strings, or an empty list if nothing matches.

Note: Here pattern is a Regular Expression object and Text can be a string on which we perform the findall operation.

Example:
import re
pattern = re.compile(r"(\+\d+\-\d{10})")
result = pattern.findall("The contact numbers are: +91-9012345678 and +91-9876543210")
for item in result:
    print(item)
Output:
+91-9012345678
+91-9876543210

Using RegEx:
 If we take a regular expression as input, the syntax to perform the findall method is stated below:
result = re.findall(pattern, text)
 The result is a list of the matched strings, or an empty list if nothing matches.

Note: Here pattern is a Regular Expression and text can be a string on which we perform the findall operation.

Example:
import re
result = re.findall(r"(\+\d+\-\d{10})", "The contact numbers are: +91-9012345678 and +91-9876543210")
for item in result:
    print(item)
Output:
+91-9012345678
+91-9876543210

4. split method:
re.split(): This function splits a string based on a regular expression pattern. The re module's re.split() function splits the string at the occurrences of the regex pattern, returning a list containing the resulting substrings.
Python RegEx split operations

Syntax:
re.split(pattern, string, maxsplit=0, flags=0)
 The regular expression pattern and target string are the mandatory arguments. The maxsplit and flags arguments are optional.
 pattern: the regular expression pattern used for splitting the target string.
 string: the variable pointing to the target string (i.e., the string we want to split).
 maxsplit: the number of splits you want to perform. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.
 flags: By default, no flags are applied.

Return value
 It splits the target string at the matches of the regular expression pattern, and the pieces between the matches are returned in the form of a list.
 If the specified pattern is not found inside the target string, then the string is not split in any way, but the split method still returns a list.
 However, the list contains just one element, the target string itself.
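The no-match case can be checked with a short sketch. The second call shows a related behavior of re.split() that is easy to trip over: a capturing group in the pattern keeps the delimiters in the result. The input strings are chosen for illustration.

```python
import re

# Pattern absent: the result is a one-element list holding the original string.
no_split = re.split(r";", "no semicolons here")

# A capturing group in the pattern keeps the delimiters in the result.
with_delims = re.split(r"(-)", "01-01-2000")
print(no_split, with_delims)
```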

Example: 1. Split a string into words


import re
target_string = """Students are learning Regular Expression and
this topic is of Module-3 of Python programming"""
# split on white-space
result = re.split(r"\s+", target_string)
print(result)

Output:
['Students', 'are', 'learning', 'Regular', 'Expression', 'and', 'this', 'topic', 'is', 'of', 'Module-3', 'of',
'Python', 'programming']

Example: 2. Limit the number of split


import re
target_string = """Students are learning Regular Expression and
this topic is of Module-3 of Python programming"""
# split on white-space
result = re.split(r"\s+", target_string,maxsplit=1)
print(result)

Output:
['Students', 'are learning Regular Expression and\nthis topic is of Module-3 of Python
programming']

Example: 3. Split a string into numbers


import re
target_string = "01-01-2000"
# split on non digit character
result = re.split(r"\D", target_string)
print(result)

Output:
['01', '01', '2000']

Example: 4. Split string by two delimiters


import re
target_string = "25,08,89,09-03-96"
# 2 delimiter - and ,
# use OR (|) operator to combine two pattern
result = re.split(r"-|,", target_string)
print(result)
Output:
['25', '08', '89', '09', '03', '96']

5. sub / subn method:
re.sub(): This method returns a string in which the matched occurrences are replaced with the content
of the replace variable. It substitutes all occurrences unless a maximum count is provided.
Syntax:
re.sub(pattern, replace, string, count=0)
 pattern: The regular expression pattern whose matches are to be replaced.
 replace: The value which is put in place of each match of the pattern.
 string: The source string on which the substitution is performed. The source string itself is
not modified; the result is returned as a new string.
 count: The maximum number of substitutions to perform. If this number is nonzero, at most
that many substitutions occur. By default it is zero, which means all occurrences are replaced.

Example
import re
string = "MREC is a reputed college"
# pattern = r"\s", replacement = "***"
result1 = re.sub(r"\s", "***", string)
print(result1)
# pattern = r"\s", replacement = "", at most 2 substitutions
result2 = re.sub(r"\s", "", string, count=2)
print(result2)

Output:
MREC***is***a***reputed***college
MRECisa reputed college

re.subn(): re.subn() is similar to re.sub() except that it returns a tuple of 2 items containing the
new string and the number of substitutions made.
Example
import re
string = "MREC is a reputed college"
# pattern = r"\s", replacement = "***"
result1 = re.subn(r"\s", "***", string)
print(result1)
# pattern = r"\s", replacement = "", at most 2 substitutions
result2 = re.subn(r"\s", "", string, count=2)
print(result2)
Output:
('MREC***is***a***reputed***college', 4)
('MRECisa reputed college', 2)

PROGRAMS
1. Develop a Python program to extract all email addresses from the given string.
Code:
import re
string = """This is Jack and anyone can communicate with this email id's,
which are jack_1234@rediffmail.com and jack2007@gmail.com"""
result = re.findall(r"([a-zA-Z0-9_.]+@[a-zA-Z0-9]+\.[a-z.]+)", string)
for email in result:
    print(email)

Output:
jack_1234@rediffmail.com
jack2007@gmail.com

2. Develop a Python program to check that a string contains only a certain set of characters (in this
case a-z, A-Z and 0-9)
Code:
import re
string = input("Enter any string: ")
# [^...] matches any single character that is NOT in the listed set
pattern = re.compile(r'[^a-zA-Z0-9]')
result = pattern.search(string)
if result is not None:
    print("The given string contains non-specified character(s)")
else:
    print("The given string contains only the specified character(s)")

Output 1:
Enter any string: Python3
The given string contains only the specified character(s)

Output 2:
Enter any string: !@#$
The given string contains non-specified character(s)
3. Develop a Python program that matches a string that has an a followed by zero or more
b's.
Code:
import re
string = input("Enter string: ")
pattern = 'ab*'
if re.search(pattern, string):
    print('Match')
else:
    print('No match')

Output 1:
Enter string: abba
Match

Output 2:
Enter string: bbbc
No match

4. Develop a Python program to replace a maximum of 2 occurrences of space, comma, or dot
with a colon.
Code:
import re
string = 'MREC,one of the top reputed college in Hyderabad'
result = re.sub("[ ,.]", ":", string, count=2)
print(result)

Output :
MREC:one:of the top reputed college in Hyderabad

5. Develop a Python program to match a string that contains only upper and lowercase
letters, numbers, and underscores.
Code:
import re
string = input("Enter any string: ")
pattern = '^[a-zA-Z0-9_]*$'
if re.search(pattern, string):
    print("Pattern matches")
else:
    print("No match")

Output 1:
Enter any string: MREC is a reputed college
No match

Output 2:
Enter any string: MREC_is_a_reputed_college
Pattern matches

6. Develop a Python program to convert a date of YYYY-MM-DD format to DD-MM-YYYY format.
Code:
import re
string = "2021-04-15"
result = re.sub(r'(\d{4})-(\d{1,2})-(\d{1,2})', r'\3-\2-\1', string)
print("Date in YYYY-MM-DD Format:", string)
print("Date in DD-MM-YYYY Format:", result)

Output:
Date in YYYY-MM-DD Format: 2021-04-15
Date in DD-MM-YYYY Format: 15-04-2021
MULTI THREADING
INTRODUCTION
Multi Tasking:
 Executing several tasks simultaneously is the concept of multitasking.
 There are 2 types of Multi Tasking:
1. Process based Multi Tasking
2. Thread based Multi Tasking

1. Process based Multi Tasking:
 Executing several tasks simultaneously where each task is a separate independent process is
called process based multitasking.
Eg: While typing a Python program in the editor we can listen to mp3 audio songs on the
same system. At the same time we can download a file from the internet.
 All these tasks execute simultaneously and independently of each other. Hence it is
process based multitasking.
 This type of multitasking is best suited to the operating system level.

2. Thread based Multi Tasking:
 Executing several tasks simultaneously where each task is a separate independent part of
the same program is called thread based multitasking, and each independent part is
called a Thread.
 This type of multitasking is best suited to the programmatic level.

Note: Whether it is process based or thread based, the main advantage of multitasking is to
improve the performance of the system by reducing response time.

The main application areas of multithreading are:
1. To implement multimedia graphics
2. To develop animations
3. To develop video games
4. To develop web and application servers, etc.

Note: Wherever a group of independent jobs is available, it is highly recommended to
execute them simultaneously instead of one by one.
 For such cases we should go for multithreading.
 Python provides one inbuilt module, "threading", to support developing
threads.
 Hence developing multithreaded programs is very easy in Python.
THREADS
 A thread has a beginning, an execution sequence, and a conclusion.
 It has an instruction pointer that keeps track of where within its context it is currently
running.
 It can be preempted (interrupted) and temporarily put on hold (also known as sleeping)
while other threads are running; this is called yielding.
 Multiple threads within a process share the same data space with the main thread and can
therefore share information or communicate with one another more easily than if they
were separate processes.
 Threads are generally executed in a concurrent fashion, and it is this parallelism and data
sharing that enable the coordination of multiple tasks.
 Naturally, it is impossible to run truly in a concurrent manner on a single-CPU system, so
threads are scheduled in such a way that they run for a little bit, then yield to other
threads (going to the proverbial "back of the line" to await more CPU time again).
Throughout the execution of the entire process, each thread performs its own, separate
tasks, and communicates the results with other threads as necessary.

PROCESSES
 Computer programs are merely executables, binary (or otherwise), which reside on disk.
 They do not take on a life of their own until loaded into memory and invoked by the
operating system.
 A process (sometimes called a heavyweight process) is a program in execution. Each
process has its own address space, memory, a data stack, and other auxiliary data to keep
track of execution.
 The operating system manages the execution of all processes on the system, dividing the
time fairly between all processes. Processes can also fork or spawn new processes to
perform other tasks, but each new process has its own memory, data stack, etc., and
cannot generally share information unless interprocess communication (IPC) is employed.

Main Thread: Every Python Program by default contains one thread which is nothing but Main
Thread.

Q. Program to print name of current executing thread:


import threading
print("Current Executing Thread:",threading.current_thread().getName())

o/p: Current Executing Thread: MainThread


Note: threading module contains function current_thread() which returns the current executing
Thread object. On this object if we call getName() method then we will get current executing
thread name.

Creating Thread:
The ways of Creating a Thread in Python:
We can create a thread in Python in 3 ways:
1. Creating a Thread without using any class
2. Creating a Thread by extending the Thread class
3. Creating a Thread without extending the Thread class

1. Creating a Thread without using any class:

from threading import Thread


thread_object = Thread(target=function_name, args=(arg1, arg2, …))

thread_object – It represents our thread.


target – It represents the function on which the thread will act.
args – It represents a tuple of arguments which are passed to the function.

Ex:-
t = Thread(target=disp, args=(10,20))

How to Start Thread
 Once a thread is created it should be started by calling the start() method.
Ex 1:
from threading import Thread
def disp(a, b):
    print("Thread Running:", a, b)
t = Thread(target=disp, args=(10, 20))
t.start()

from threading import Thread
def disp(a, b):
    print("Thread Running:", a, b)
for i in range(5):
    t = Thread(target=disp, args=(10, 20))
    t.start()
 The main thread is responsible for creating and starting the child thread. Once the child thread
has started, both threads behave separately.
Ex 2:
from threading import Thread
def disp():
    for i in range(5):
        print("Child Thread")
t = Thread(target=disp)
# Up to here there is only one thread: Main Thread
# All the above code is executed within Main Thread
t.start()
# Once we start the child thread, there are now two threads: Main Thread and Thread-1
# Child Thread is responsible to run the disp function
# and the code below will be run by Main Thread
for i in range(5):
    print("Main Thread")
Output:
Child Thread
Child Thread
Child Thread
Child Thread
Child Thread
Main Thread
Main Thread
Main Thread
Main Thread
Main Thread

 If multiple threads are present in our program, then we cannot predict the execution order and
hence we cannot predict the exact output of multithreaded programs.
 Because of this we cannot give an exact output for the above program.
 It varies from machine to machine and from run to run.

Note: Thread is a pre defined class present in threading module which can be used to create our
own Threads.

Demo Program 1:
from threading import Thread
def disp(a, b):
    print("Thread Running:", a, b)
t = Thread(target=disp, args=(10, 20))
t.start()
Output:
Thread Running: 10 20

Demo Program 2:
from threading import Thread
def disp(a, b):
    print("Thread Running:", a, b)
for i in range(5):
    t = Thread(target=disp, args=(10, 20))
    t.start()
Output:
Thread Running: 10 20
Thread Running: 10 20
Thread Running: 10 20
Thread Running: 10 20
Thread Running: 10 20

Set and Get Thread Name:
 current_thread() – This function returns the current thread object.
 getName() – Every thread has a name by default; to get the name of a thread we can use
this method.
 setName(name) – This method is used to set the name of a thread.
 name Property – This property is used to get or set the name of the thread. Since Python 3.10,
getName() and setName() are deprecated in favour of this property.

Ex:-
thread_object.name = 'String'
print(thread_object.name)

Demo Program on Set and Get Thread Name:


from threading import *
print(current_thread().getName())
current_thread().setName("MREC")
print(current_thread().getName())
print(current_thread().name)
current_thread().name = "MREC Campus"
print(current_thread().name)

Output:
MainThread
MREC
MREC
MREC Campus

2. Creating a Thread by extending Thread class:
 We have to create a child class of the Thread class.
 In that child class we have to override the run() method with our required job.
 Whenever we call the start() method, the run() method is automatically
executed and performs our job.
Syntax:
class ChildClassName(Thread):
    statements
Thread_object = ChildClassName()

Ex:-
class Mythread(Thread):
    pass
t = Mythread()

Demo Program 2:
from threading import *
class MyThread(Thread):
    def run(self):
        for i in range(5):
            print("Child Thread-1")
t = MyThread()
t.start()
t.join()

for i in range(5):
    print("Main Thread-1")

Output:
Child Thread-1
Child Thread-1
Child Thread-1
Child Thread-1
Child Thread-1
Main Thread-1
Main Thread-1
Main Thread-1
Main Thread-1
Main Thread-1

Thread Class’s Methods:
 start() – Once a thread is created it should be started by calling the start() method.
 run() – Every thread runs this method when it is started. We can override this
method and write our own code as the body of the method. A thread terminates
automatically when it comes out of the run() method.
 join() – This method is used to wait till the thread has completely executed the run()
method.
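
As a rough sketch of how start(), run() and join() fit together, the example below overrides run() to compute a value and uses join() so the main thread reads the value only after run() has finished. The class name SquareThread and the result attribute are hypothetical, introduced only for this illustration.

```python
from threading import Thread

class SquareThread(Thread):      # hypothetical class name
    def __init__(self, n):
        super().__init__()
        self.n = n
        self.result = None

    def run(self):
        # Executed in the new thread once start() is called.
        self.result = self.n * self.n

t = SquareThread(7)
t.start()       # schedules run() in a separate thread
t.join()        # main thread waits here until run() has returned
print(t.result)
```

Without the join() call, the main thread could reach print() before run() has stored the result.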
Thread Child Class with Constructor:
Demo Program:
from threading import *
class Mythread(Thread):
    def __init__(self, a, b):
        Thread.__init__(self)   # initialise the parent Thread class first
        self.a = a
        self.b = b
    def run(self):
        print("Child thread running:", self.a, self.b)
t = Mythread(10, 20)
t.start()

Output:
Child thread running: 10 20

3. Creating a Thread without extending Thread class:
We can create an independent class that does not inherit from the Thread class
of the threading module, and pass one of its methods as the target of a Thread.
class ClassName:
    statements
object_name = ClassName()
Thread_object = Thread(target=object_name.function_name, args=(arg1, arg2, …))

Demo Program:-
from threading import Thread
class Mythread:
    def disp(self, a, b):
        print(a, b)
myt = Mythread()
t = Thread(target=myt.disp, args=(10, 20))
t.start()

Output:
10 20

Thread Identification Number:
 Every thread has a unique identification number which can be accessed using the variable
ident.

Syntax:- Thread_object.ident
Ex:- t.ident

Demo_Ident Thread:
from threading import *
def test():
    print("Child Thread")
t = Thread(target=test)
t.start()
print("Main Thread Identification Number:", current_thread().ident)
print("Child Thread Identification Number:", t.ident)

Output (the numbers differ from run to run):
Child Thread
Main Thread Identification Number: 9484
Child Thread Identification Number: 6408

Race Condition:
 A race condition is a situation that occurs when threads act on shared data in an unexpected
sequence, thus leading to unreliable output.
 This can be eliminated using thread synchronization.

Demo_Program_Race_condition:
from threading import *

class Train:
    def __init__(self, available_seat):
        self.available_seat = available_seat

    def reserve(self, need_seat):
        print("Available seat:", self.available_seat)
        if self.available_seat >= need_seat:
            name = current_thread().name
            print(f"{need_seat} seat is allotted for {name}")
            self.available_seat -= need_seat
        else:
            print("Sorry! All seats have been allotted")

t = Train(1)
t1 = Thread(target=t.reserve, args=(1,), name="John")
t2 = Thread(target=t.reserve, args=(1,), name="Jack")
t1.start()
t2.start()

Output (one possible run; both threads see 1 available seat, so the single seat is allotted twice):
Available seat:Available seat:11
1 seat is allotted for John1 seat is allotted for Jack

Thread Synchronization:
 Many threads trying to access the same object can lead to problems like making data
inconsistent or getting unexpected output.
 So when a thread is already accessing an object, preventing any other thread from accessing
the same object is called Thread Synchronization.
 The object on which the threads are synchronized is called a Synchronized Object or
Mutually Exclusive Lock (mutex).
 Thread Synchronization is recommended when multiple threads are acting on the same
object simultaneously.

The following techniques are available for Thread Synchronization:
 Using Locks
 Using RLock (Re-Entrant Lock)
 Using Semaphores

Locks:
 Locks are typically used to synchronize access to a shared resource.
 A Lock can be used to lock the object on which the thread is acting.
 A Lock has only two states, locked and unlocked. It is created in the unlocked state.

acquire( ):
 This method is used to change the state to locked. When the state is unlocked, acquire()
changes it to locked and returns immediately.
 When the state is locked, acquire() blocks until a call to release() in another thread
changes it to unlocked; then the acquire() call resets it to locked and returns.

Syntax:- acquire(blocking=True, timeout=-1)

blocking=True – It blocks until the lock is unlocked, then sets it to locked and returns True.
blocking=False – It does not block. If a call with blocking set to True would block, it returns False
immediately; otherwise, it sets the lock to locked and returns True.

timeout – When invoked with the floating-point timeout argument set to a positive value, it blocks
for at most the number of seconds specified by timeout, as long as the lock cannot be
acquired. A timeout argument of -1 specifies an unbounded wait. It is forbidden to
specify a timeout when blocking is False.
 The return value is True if the lock is acquired successfully, False if not (for example if
the timeout expired).
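
The return values of acquire() under the different argument combinations can be checked with a small single-threaded sketch. The variable names are illustrative; since a plain Lock does not track its owner, a second acquire() from the same thread behaves like one from any other thread.

```python
from threading import Lock

lock = Lock()

first = lock.acquire()                    # lock was free: returns True at once
second = lock.acquire(blocking=False)     # already locked: returns False, no waiting
third = lock.acquire(timeout=0.1)         # waits up to 0.1 s, then gives up: False
print(first, second, third)

lock.release()
fourth = lock.acquire(blocking=False)     # free again, so this succeeds
print(fourth)
lock.release()
```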

release( ):
 This method is used to release a lock. It can be called from any thread, not only the
thread which has acquired the lock.
 When the lock is locked, it resets it to unlocked and returns. If any other threads are blocked
waiting for the lock to become unlocked, exactly one of them is allowed to proceed.
 When invoked on an unlocked lock, a RuntimeError is raised.
 There is no return value.

Syntax:- release( )
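
A short sketch of the release() behaviour described above. It also uses the with statement, which Lock objects support: the lock is acquired on entry to the block and released on exit, so a release() call cannot be forgotten. The variable names are illustrative.

```python
from threading import Lock

lock = Lock()

# The with statement acquires on entry and releases on exit,
# even if the body raises an exception.
with lock:
    held_inside = lock.locked()
held_after = lock.locked()
print(held_inside, held_after)

# Releasing a lock that is not held raises RuntimeError.
try:
    lock.release()
    error_raised = False
except RuntimeError:
    error_raised = True
print(error_raised)
```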

Lock_Demo Program:
from threading import *
class Train:
    def __init__(self, available_seat):
        self.available_seat = available_seat
        self.l = Lock()
        print(self.l)

    def reserve(self, need_seat):
        self.l.acquire()
        print("Available seat:", self.available_seat)
        if self.available_seat >= need_seat:
            name = current_thread().name
            print(f"{need_seat} seat is allotted for {name}")
            self.available_seat -= need_seat
        else:
            print("Sorry! All seats have been allotted")
        self.l.release()

t = Train(2)
t1 = Thread(target=t.reserve, args=(1,), name="John")
t2 = Thread(target=t.reserve, args=(1,), name="Jack")
t3 = Thread(target=t.reserve, args=(1,), name="Harry")
t1.start()
t2.start()
t3.start()

# wait for all three threads to finish
for th in (t1, t2, t3):
    th.join()
print("Main Thread")

Output:
<unlocked _thread.lock object at 0x0000028AE063ED80>
Available seat: 2
1 seat is allotted for John
Available seat: 1
1 seat is allotted for Jack
Available seat: 0
Sorry! All seats have been allotted
Main Thread

RLock:
 A reentrant lock is a synchronization primitive that may be acquired multiple times by the
same thread.
 The standard Lock doesn’t know which thread is currently holding the lock.
 If the lock is held, any thread that attempts to acquire it will block, even if the same
thread itself is already holding the lock. In such cases, RLock (re-entrant lock) is used.
 A reentrant lock must be released by the thread that acquired it. Once a thread has
acquired a reentrant lock, the same thread may acquire it again without blocking; the
thread must release it once for each time it has acquired it.
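
The re-entrant behaviour can be sketched with two nested functions that both take the same RLock. The function names outer and inner are hypothetical and used only for illustration.

```python
from threading import RLock

rlock = RLock()

def inner():
    # The same thread acquires the lock a second time; with a plain Lock
    # this call would block forever.
    with rlock:
        return "inner done"

def outer():
    with rlock:            # first acquisition
        return inner()     # nested acquisition by the same thread

result = outer()
print(result)
```

Each with block releases the lock once on exit, so the lock is fully released only after both blocks have finished.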

RLock_Demo Program:
from threading import *
class Train:
    def __init__(self, available_seat):
        self.available_seat = available_seat
        self.l = RLock()
        print(self.l)

    def reserve(self, need_seat):
        self.l.acquire(blocking=True, timeout=-1)
        print(self.l)
        print("Available seat:", self.available_seat)
        if self.available_seat >= need_seat:
            name = current_thread().name
            print(f"{need_seat} seat is allotted for {name}")
            self.available_seat -= need_seat
        else:
            print("Sorry! All seats have been allotted")
        self.l.release()

t = Train(2)
t1 = Thread(target=t.reserve, args=(1,), name="John")
t2 = Thread(target=t.reserve, args=(1,), name="Jack")
t3 = Thread(target=t.reserve, args=(1,), name="Harry")
t1.start()
t2.start()
t3.start()

Output:
<unlocked _thread.RLock object owner=0 count=0 at 0x0000028AE0315B40>
<locked _thread.RLock object owner=1712 count=1 at 0x0000028AE0315B40>
Available seat: 2
1 seat is allotted for John
<locked _thread.RLock object owner=1644 count=1 at 0x0000028AE0315B40>
Available seat: 1
1 seat is allotted for Jack
<locked _thread.RLock object owner=1452 count=1 at 0x0000028AE0315B40>
Available seat: 0
Sorry! All seats have been allotted

Semaphore:
 With Lock and RLock, only one thread at a time is allowed to execute, but sometimes our
requirement is to let a particular number of threads execute at a time.
 Suppose we have to allow 10 members at a time to access the database and only 4
members to access a network connection.
 To handle such requirements we cannot use the Lock and RLock concepts; here
we should go for Semaphore.
 A Semaphore can be used to limit access to shared resources with limited capacity.
It is an advanced part of synchronization.
 This is one of the oldest synchronization primitives in the history of computer science,
invented by the early Dutch computer scientist Edsger W. Dijkstra.

Create an object of Semaphore:
object_name = Semaphore(count)
 Here ‘count’ is the number of threads allowed to access simultaneously. The default
value of count is 1.
 Whenever a thread executes the acquire() method, the value of the “count” variable is
decremented by 1.
 Whenever a thread executes the release() method, the value of the “count” variable is
incremented by 1.
Ways to create an object of Semaphore:
Case 1:
object_name = Semaphore()
 In this case, the default value of the count variable is 1, due to which only one thread is
allowed to access.
 It is almost the same as the Lock concept.

Case 2:
object_name = Semaphore(n)
 In this case, a Semaphore object can be accessed by n threads at a time.
 The remaining threads have to wait until the semaphore is released.

Semaphore_Demo_Program:
# importing the modules
from threading import *
import time

# creating a semaphore where count = 3
obj = Semaphore(3)

# function executed by each thread
def display(name):
    # calling acquire method
    obj.acquire()
    print('Hello, ', end='')
    time.sleep(2)
    print(name)
    # calling release method
    obj.release()

# creating multiple threads
t1 = Thread(target=display, args=('Thread-1',))
t2 = Thread(target=display, args=('Thread-2',))
t3 = Thread(target=display, args=('Thread-3',))
t4 = Thread(target=display, args=('Thread-4',))
t5 = Thread(target=display, args=('Thread-5',))
t6 = Thread(target=display, args=('Thread-6',))

# starting the threads
t1.start()
t2.start()
t3.start()
t4.start()
t5.start()
t6.start()

Output:
Hello, Hello, Hello, Thread-2Thread-3
Thread-1
Hello, Hello,
Hello, Thread-4Thread-5

Thread-6
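
The interleaved output above makes it hard to see the limit being enforced. As a rough sketch, the variant below counts how many threads are inside the guarded region at once and records the peak. The names LIMIT, active, peak and counter_lock are assumptions made for this illustration; the extra Lock only protects the counters.

```python
from threading import Thread, Semaphore, Lock
import time

LIMIT = 3
sem = Semaphore(LIMIT)
counter_lock = Lock()     # protects the two counters below
active = 0                # threads currently inside the guarded region
peak = 0                  # highest value of active seen so far

def worker():
    global active, peak
    with sem:                       # acquire() on entry, release() on exit
        with counter_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)            # simulate some work
        with counter_lock:
            active -= 1

threads = [Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Peak concurrency:", peak)
```

Six threads are started, but the recorded peak never exceeds the semaphore count of 3.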

Dead Lock:
 A deadlock is a situation where each of the processes waits for a resource which is
assigned to some other process.
 In this situation, none of the processes gets executed since the resource it needs is held by
some other process which is also waiting for some other resource to be released.
 Deadlocks are among the most feared issues that developers face when writing
concurrent/multithreaded applications in Python. The best way to understand deadlocks is by
using the classic computer science example problem known as the Dining
Philosophers Problem.
 The problem statement for dining philosophers is as follows:
 Five philosophers are seated at a round table with five plates of spaghetti (a type of
pasta) and five forks, as shown in the diagram.
 At any given time, a philosopher must either be eating or thinking.
 Moreover, a philosopher must take the two forks adjacent to him (i.e., the left and right
forks) before he can eat the spaghetti. The problem of deadlock occurs when all five
philosophers pick up their right forks simultaneously.
 Since each of the philosophers has one fork, they will all wait for the others to put their
fork down. As a result, none of them will be able to eat spaghetti.
 Similarly, in a concurrent system, a deadlock occurs when different threads or processes
(philosophers) try to acquire the shared system resources (forks) at the same time.
 As a result, none of the processes get a chance to execute as they are waiting for another
resource held by some other process.
Dining Philosophers Problem
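
The same situation can be reproduced on a small scale with two threads and two locks taken in opposite order. The sketch below is an illustration under stated assumptions, not part of the original notes: the names fork_a, fork_b and philosopher are hypothetical, the two Barrier objects exist only to force the unlucky timing on every run, and acquire() is given a timeout so the program reports the deadlock instead of hanging.

```python
from threading import Thread, Lock, Barrier

fork_a = Lock()
fork_b = Lock()
both_holding = Barrier(2)   # passes once both threads hold their first fork
both_tried = Barrier(2)     # passes once both threads have tried the second fork
outcome = {}

def philosopher(name, first, second):
    with first:
        both_holding.wait()
        # A plain second.acquire() would block forever at this point.
        got_second = second.acquire(timeout=0.2)
        outcome[name] = got_second
        if got_second:
            second.release()
        both_tried.wait()

p1 = Thread(target=philosopher, args=("P1", fork_a, fork_b))
p2 = Thread(target=philosopher, args=("P2", fork_b, fork_a))
p1.start()
p2.start()
p1.join()
p2.join()
print(outcome)
```

Neither thread obtains its second fork. A common remedy is to make every thread acquire the locks in the same fixed order (for example always fork_a before fork_b), which removes the circular wait.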

GLOBAL INTERPRETER LOCK
 The Python Global Interpreter Lock or GIL, in simple words, is a mutex (or a process
lock) that allows only one thread to hold control of the Python interpreter.
 This means that only one thread can be in a state of execution at any point in time.
 The impact of the GIL isn’t visible to developers who execute single-threaded programs,
but it can be a performance bottleneck in CPU-bound, multi-threaded code.
 It is a mechanism to apply a global lock on an interpreter. It is used in computer-language
interpreters to synchronize and manage the execution of threads so that only one native
thread (scheduled by the operating system) can execute at a time.
 In a scenario where you have multiple threads, what can happen is that two threads
might try to modify the same memory at the same time, and as a result they would
overwrite each other's data in memory.
 Hence arises the need for a mechanism that helps prevent this.
 Some popular interpreters that have a GIL are CPython and Ruby MRI.
 As most of you would know, Python is an interpreted language, and it has various
implementations like CPython, Jython, and IronPython.
 Out of these, only CPython has a GIL, and it is also the most widely used
implementation of Python.
 CPython has been developed in both C and Python, primarily to support and
work with applications that have a lot of C language underneath the hood.
 Even if your processor has multiple cores, the global interpreter lock allows only one thread
to execute Python code at a time.
 For example, two threads each calling a CPU-bound factorial function may together take as
much time as, or more than, a single thread calling the function twice.
 This also tells you that the memory manager governed by the interpreter is not thread-safe,
which means that multiple threads cannot safely access the same shared data
simultaneously.
Hence, GIL:
 Limits the threading operation.
 Restricts parallel execution.
 Ensures that only one thread runs in the interpreter at once.
 Helps in simplifying various low-level details like memory management.
 With the GIL, cooperative computing or coordinated multitasking is achieved instead of
parallel computing.

THE THREAD AND THREADING MODULES
There are two main modules which can be used to implement multithreaded applications in Python:
 The thread module, and
 The threading module

The Thread Module
 The thread module has long been deprecated. Starting with Python 3, it has been
designated as obsolete and is only accessible as _thread for backward compatibility.
 We should use the higher-level threading module for applications which we intend to
deploy. However, the thread module is covered here for educational purposes.
Syntax:
The syntax to create a new thread using this module (named _thread in Python 3) is as follows:
_thread.start_new_thread(function_name, arguments)

The Threading Module
 This module is the high-level implementation of threading in Python and the de facto
standard for managing multithreaded applications. It provides a wide range of features
when compared to the thread module.

Daemon Thread:
 A daemon thread is a thread which runs continuously in the background.
 It provides support to non-daemon threads.
 When the last non-daemon thread terminates, all daemon threads are automatically
terminated. We are not required to terminate a daemon thread explicitly.
 The main objective of daemon threads is to provide support for non-daemon threads (like the
main thread).
Eg: Garbage Collector

Create Daemon Thread:
The setDaemon(True) method or the daemon = True property is used to make a thread a daemon
thread.
Ex:-
t1 = Thread(target=disp)
t1.setDaemon(True)    # method form
t1.daemon = True      # property form (either one is enough)

setDaemon(True/False) - This method is used to set a thread as a daemon thread.
You can set a thread as daemon only before starting that thread, which means the daemon status
of an active thread cannot be changed.
If we pass True, a non-daemon thread becomes daemon, and if we pass False, a daemon thread
becomes non-daemon.

daemon Property - This property is used to check whether a thread is daemon or not. It returns
True if the thread is daemon, else False.
We can also use the daemon property to set a thread as a daemon thread or vice versa.

isDaemon() - This method is used to check whether a thread is daemon or not. It returns True if
the thread is daemon, else False. Since Python 3.10, setDaemon() and isDaemon() are deprecated
in favour of the daemon property.
Eg:
from threading import *
print(current_thread().isDaemon()) #False
print(current_thread().daemon) #False

Default Nature of Thread:
 The main thread is always a non-daemon thread.
 The rest of the threads inherit their daemon nature from their parents.
 If the parent thread is non-daemon then the child thread will be a non-daemon thread.
 If the parent thread is daemon then the child thread will also be a daemon thread.
 When the last non-daemon thread terminates, all daemon threads are automatically
terminated. We are not required to terminate a daemon thread explicitly.
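
The inheritance rule can be observed with a thread that itself starts another thread. In this sketch the function names child and grandchild and the dictionary seen are hypothetical; the daemon=True keyword argument of the Thread constructor is used as an alternative to setting the property after creation.

```python
from threading import Thread, current_thread

seen = {}

def grandchild():
    seen["grandchild"] = current_thread().daemon

def child():
    seen["child"] = current_thread().daemon
    g = Thread(target=grandchild)   # no daemon argument: inherits from its creator
    g.start()
    g.join()

seen["main"] = current_thread().daemon
c = Thread(target=child, daemon=True)   # explicitly made a daemon thread
c.start()
c.join()
print(seen)
```

The grandchild thread is never marked as daemon explicitly, yet it reports daemon as True because the thread that created it is a daemon thread.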

is_alive():
The is_alive() method checks whether a thread is still executing or not. It was formerly spelled
isAlive(); that spelling was removed in Python 3.9.

Syntax
t1.is_alive()
It returns True or False.
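
A minimal sketch of how this check changes over a thread's lifetime, using the current spelling is_alive(). The function name wait_for_signal is hypothetical; an Event object (covered later in these notes) keeps the child thread running until the main thread allows it to finish, so the result does not depend on timing.

```python
from threading import Thread, Event

finish = Event()

def wait_for_signal():
    finish.wait()        # stay alive until the main thread sets the event

t = Thread(target=wait_for_signal)
before_start = t.is_alive()     # False: not started yet
t.start()
while_running = t.is_alive()    # True: run() has not returned
finish.set()
t.join()
after_join = t.is_alive()       # False: run() has finished
print(before_start, while_running, after_join)
```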

active_count():
This function returns the number of active threads currently running.

Active_count_demo_program:
from threading import *
import time
def display():
    print(current_thread().getName(), "...started")
    time.sleep(3)
    print(current_thread().getName(), "...ended")
print("The Number of active Threads:", active_count())
t1 = Thread(target=display, name="ChildThread1")
t2 = Thread(target=display, name="ChildThread2")
t3 = Thread(target=display, name="ChildThread3")
t1.start()
t2.start()
t3.start()
print("The Number of active Threads:", active_count())
time.sleep(5)
print("The Number of active Threads:", active_count())

Output:
The Number of active Threads: 1
ChildThread1 ...started
ChildThread2 ...started
ChildThread3 ...started
The Number of active Threads: 4
ChildThread1 ...ended
ChildThread2 ...ended
ChildThread3 ...ended
The Number of active Threads: 1

enumerate() function:
This function returns a list of all active threads currently running.

Enumerate_function_demo_program:
from threading import *
import time
def display():
    print(current_thread().getName(), "...started")
    time.sleep(3)
    print(current_thread().getName(), "...ended")
t1 = Thread(target=display, name="ChildThread1")
t2 = Thread(target=display, name="ChildThread2")
t3 = Thread(target=display, name="ChildThread3")
t1.start()
t2.start()
t3.start()
l = enumerate()
for t in l:
    print("Thread Name:", t.name)
time.sleep(5)
l = enumerate()
for t in l:
    print("Thread Name:", t.name)
Output:
ChildThread1 ...started
ChildThread2 ...started
ChildThread3 ...started
Thread Name: MainThread
Thread Name: ChildThread1
Thread Name: ChildThread2
Thread Name: ChildThread3
ChildThread1 ...ended
ChildThread2 ...ended
ChildThread3 ...ended
Thread Name: MainThread

Thread Communication:
Two or more threads can communicate with each other using the following:
 Event
 Condition
 Queue

Event:
 This is one of the simplest mechanisms for communication between threads: one thread
signals an event and other threads wait for it.
 An event object manages an internal flag that can be set to true with the set() method and
reset to false with the clear() method.
 The wait() method blocks until the flag is true.
 The flag is initially false.

Create Event Object:


from threading import Event
e = Event()

Event Methods:
set()- It sets the internal flag to true. All threads waiting for it to become true are awakened.
Threads that call wait() once the flag is true will not block at all.

clear()- It resets the internal flag to false. Subsequently, threads calling wait() will block until

set() is called to set the internal flag to true again.

is_set() – It returns true if and only if the internal flag is true.

wait(timeout=None) –
 It blocks until the internal flag is true. If the internal flag is true on entry, it returns
immediately. Otherwise, it blocks until another thread calls set() to set the flag to true, or
until the optional timeout occurs.
 When the timeout argument is present and not None, it should be a floating point number
specifying a timeout for the operation in seconds (or fractions thereof).
 This method returns true if and only if the internal flag has been set to true, either before
the wait call or after the wait starts, so it will always return True except if a timeout is
given and the operation times out.
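The return value of wait() can be seen in a short experiment. This is a minimal sketch, not taken from the original notes; the names ready and setter are hypothetical, and the timings are arbitrary small values.

```python
from threading import Thread, Event
from time import sleep

ready = Event()  # the internal flag starts out false

def setter():
    sleep(0.3)
    ready.set()  # wakes every thread blocked in ready.wait()

# Nobody has called set() yet, so this wait times out and returns False.
first = ready.wait(timeout=0.05)

t = Thread(target=setter)
t.start()
# set() is called after about 0.3 s, well inside the 2 s timeout: returns True.
second = ready.wait(timeout=2)
# The flag is already true, so wait() returns True without blocking.
third = ready.wait()
t.join()

ready.clear()  # reset the flag to false again
print(first, second, third, ready.is_set())
```

Checking the return value is the only way to tell a timeout apart from a real signal.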

Event_Demo_Program:
from threading import Thread, Event
from time import sleep

def traffic_lights():
    sleep(2)
    e.set()
    print("Green Light Enabled")
    sleep(4)
    print("Red Light Enabled")
    e.clear()

def traffic():
    e.wait()
    while e.is_set():
        print("You can Drive...")
        sleep(0.5)
    print("Program Completed")

e = Event()

t1 = Thread(target=traffic_lights)
t2 = Thread(target=traffic)
t1.start()
t2.start()

Output:
Green Light Enabled
You can Drive...
You can Drive...
You can Drive...
You can Drive...
You can Drive...
You can Drive...
You can Drive...
You can Drive...
Red Light Enabled
Program Completed

Condition:
 The Condition class lets a thread wait until another thread notifies it that some shared
state has changed, instead of repeatedly checking for the change. A Condition object is
called a condition variable.
 A condition variable is always associated with some kind of lock; this can be passed in or one will
be created by default.
 Passing one in is useful when several condition variables must share the same lock. The
lock is part of the condition object: you don’t have to track it separately.
 A condition is a more advanced version of the event object.

Create Condition Object


from threading import Condition
cv = Condition()

 notify(n=1) – This method wakes up at most n threads waiting on the condition (one by
default). A woken thread returns from wait() only after it has reacquired the lock, so the
notifying thread must release the lock afterwards.
 notify_all() – This method is used to wake up all threads waiting on the condition.
 wait(timeout=None) – This method waits until notified or until a timeout occurs. If the
calling thread has not acquired the lock when this method is called, a RuntimeError is
raised. wait() releases the lock while it blocks and reacquires it before returning. It returns
when another thread calls notify() or notify_all(). The return value is
True unless a given timeout expired, in which case it is False.
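A typical way to use these methods is to call wait() inside a loop that re-checks the shared state, because a thread can be woken when the state it needs is not (or is no longer) available. The following is a minimal sketch under that assumption, not the program from these notes; the names items, received, producer and consumer are hypothetical.

```python
from threading import Thread, Condition

items = []      # shared state protected by the condition's lock
received = []
cv = Condition()

def consumer(expected):
    for _ in range(expected):
        with cv:                  # acquires the underlying lock
            while not items:      # re-check the predicate after every wake-up
                cv.wait()         # releases the lock while blocked
            received.append(items.pop(0))

def producer(count):
    for i in range(1, count + 1):
        with cv:
            items.append(i)
            cv.notify()           # wake one waiting thread, if there is one

c = Thread(target=consumer, args=(4,))
p = Thread(target=producer, args=(4,))
c.start()
p.start()
p.join()
c.join()
print(received)
```

Using the condition as a context manager (with cv:) is equivalent to calling acquire() and release() around the block.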

Condition_Demo_Program:
from threading import Thread, Condition
from time import sleep

List = []

def produce():
    co.acquire()            # hold the lock while producing
    for i in range(1, 5):
        List.append(i)
        sleep(1)
        print("ItemProduced...")
    co.notify()
    co.release()

def consume():
    co.acquire()            # blocks until produce() releases the lock
    co.wait(timeout=0)      # returns at once: the items are already in List
    co.release()
    print(List)

co = Condition()

t1 = Thread(target=produce)
t2 = Thread(target=consume)

t1.start()
t2.start()

Output:
ItemProduced...
ItemProduced...
ItemProduced...
ItemProduced...
[1, 2, 3, 4]

Queue:
 The Queue class of the queue module is useful to create a queue that holds the data produced by
the producer.
 The data can be taken from the queue and utilized by the consumer.
 We need not use locks since queues are thread safe.

Create Queue Object:


from queue import Queue
q = Queue()

Queue Methods:
put() – This method is used by Producer to insert items into the queue.
Syntax:- queue_object.put(item)
Ex:- q.put(i)
get() – This method is used by Consumer to retrieve items from the queue. It takes no item
argument: it removes and returns the next item, blocking until one is available.
Syntax:- item = queue_object.get()
Ex:- x = q.get()
empty() – This method returns True if queue is Empty else returns False.
Ex:- q.empty()
full() – This method returns True if queue is Full else returns False. A queue created without a
maxsize is unbounded and is never full.
Ex:- q.full()
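The four methods can be tried on a small bounded queue in a single thread. This is a minimal sketch, not part of the original notes; maxsize=2 and the variable names are arbitrary choices for illustration. It also uses the block=False argument of put() and get(), which raises queue.Full or queue.Empty instead of waiting.

```python
from queue import Queue, Empty, Full

q = Queue(maxsize=2)       # bounded queue: holds at most two items

was_empty = q.empty()      # True, nothing has been put yet
q.put("a")
q.put("b")
was_full = q.full()        # True, both slots are taken

try:
    q.put("c", block=False)    # no free slot and we refuse to wait
    overflow = False
except Full:
    overflow = True

first = q.get()            # items come out in FIFO order
second = q.get()

try:
    q.get(block=False)     # queue is empty and we refuse to wait
    underflow = False
except Empty:
    underflow = True

print(was_empty, was_full, overflow, first, second, underflow)
```

In multithreaded code the results of empty() and full() can be out of date by the time they are used, so they are best treated as hints.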

Producer-Consumer Demo_Program:
from threading import Thread
from queue import Queue
from time import sleep

class Producer:
    def __init__(self):
        self.q = Queue()

    def produce(self):
        for i in range(1, 5):
            print("Item Produced", i)
            self.q.put(i)
            sleep(1)

class Consumer:
    def __init__(self, prod):
        self.prod = prod

    def consumer(self):
        for i in range(1, 5):
            # get() blocks until the producer has put the next item
            print("Item Received", self.prod.q.get())

p = Producer()
c = Consumer(p)
t1 = Thread(target=p.produce)
t2 = Thread(target=c.consumer)

t1.start()
t2.start()

Output:
Item Produced 1
Item Received 1
Item Produced 2
Item Received 2
Item Produced 3
Item Received 3
Item Produced 4
Item Received 4
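In the program above both threads must know in advance that exactly four items are exchanged. One common alternative is for the producer to put a special end marker on the queue. The following is a minimal sketch of that idea, not part of the original notes; SENTINEL, produce() and consume() are hypothetical names.

```python
from threading import Thread
from queue import Queue

SENTINEL = None  # hypothetical end-of-data marker; must never be a real item

def produce(q, count):
    for i in range(1, count + 1):
        q.put(i)
    q.put(SENTINEL)          # tell the consumer that nothing more will come

def consume(q, out):
    while True:
        item = q.get()       # blocks until an item is available
        if item is SENTINEL:
            break
        out.append(item)

q = Queue()
received = []
t1 = Thread(target=produce, args=(q, 4))
t2 = Thread(target=consume, args=(q, received))
t1.start()
t2.start()
t1.join()
t2.join()
print(received)
```

With this design the consumer does not need to know the item count; with several consumers, one marker per consumer would be needed.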

RELATED MODULES
 The table below lists some of the modules you may use when programming multithreaded
applications.

Threading-Related Standard Library Modules

Module          Description
thread          Basic, lower-level thread module; obsolete in Python 3.x (renamed _thread)
threading       Higher-level threading and synchronization objects
queue           Synchronized FIFO queue for multiple threads
socketserver    TCP and UDP managers with some threading control
