FILE HANDLING
A computer file (or simply “file”) is a resource for storing information, which is
available to a computer program, and is usually based on some kind of durable
electronic storage. A file is durable in the sense that it remains available for
programs to use after the current program has finished.
FILE?
At its core, a file is a contiguous sequence of bytes used to store data. This data is
organized in a specific format and can be anything from a simple text file to a
complicated program executable. Ultimately, these bytes are stored and processed
by the computer as binary 1s and 0s.
Files on most modern file systems are composed of three main parts:
• Header: metadata about the contents of the file (file name, size, type, and so
on)
• Data: contents of the file as written by the creator or editor
• End of file (EOF): special character that indicates the end of the file
TYPES OF FILE - There are two types of files:
1. Text Files - A text file is usually considered as a sequence of lines. A line is a
sequence of characters (ASCII or Unicode), stored on permanent storage
media.
In Python 3, strings are Unicode, and open() uses a platform-dependent default
encoding (often UTF-8) unless an encoding is specified. Each line is terminated by
a special character, known as End of Line (EOL). Text files are stored in
human-readable form and can be created using any text editor.
2. Binary Files - A binary file contains information in the
same format in which it is held in memory. In a binary file,
there is no delimiter for a line, and no translation of characters occurs. As a
result, binary files are faster and easier for a program to read and write
than text files. As long as a file doesn't need to be read by people
or ported to a different type of system, a binary file is the best
way to store program data.
Text File | Binary File
Its bits represent characters. | Its bits represent custom data.
Less prone to corruption, as a change shows up as soon as it is made and can be undone. | Can easily get corrupted, even by a single bit change.
Stores only plain text in a file. | Can store different types of data (audio, text, image) in a single file.
Widely used file format that can be opened in any text editor. | Developed for an application and can be opened in that application only.
Mostly .txt and .rtf are used as extensions to text files. | Can have any application-defined extension.
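The difference between the two file types can be seen in a minimal sketch. It assumes a throwaway temporary directory and the hypothetical file names number.txt and number.bin; the same integer is stored once as characters and once as raw bytes.

```python
import os
import tempfile

value = 1234567

with tempfile.TemporaryDirectory() as folder:
    text_path = os.path.join(folder, "number.txt")   # hypothetical file name
    bin_path = os.path.join(folder, "number.bin")    # hypothetical file name

    # Text file: the number is stored as the characters '1', '2', '3', ...
    with open(text_path, "w", encoding="ascii") as f:
        f.write(str(value))

    # Binary file: the number is stored as 4 raw bytes
    with open(bin_path, "wb") as f:
        f.write(value.to_bytes(4, "little"))

    # Read both files back as raw bytes to compare what is on disk
    with open(text_path, "rb") as f:
        text_bytes = f.read()
    with open(bin_path, "rb") as f:
        binary_bytes = f.read()

print(text_bytes)                               # b'1234567'
print(len(text_bytes), len(binary_bytes))       # 7 4
print(int.from_bytes(binary_bytes, "little"))   # 1234567
```

The text version can be read in any editor, while the binary version only makes sense to a program that knows it holds a 4-byte integer.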
OPENING FILES IN PYTHON
A common operation needed during program execution is to load data from an
existing file or to create a new file to store data. To accomplish this, the program
first needs to open a file. Opening a file refers to getting the file ready either for
reading or for writing. This can be done using the open() function. This function
returns a file object and takes the file name as its first argument, optionally
followed by the mode (access mode) and a buffering value:
file_object = open(file_name, access_mode, buffering)
Here
• file_name is the name of the file on secondary storage media, which can be a
string constant or a variable. The name can include the path,
in case the file does not reside in the same folder/directory in which we are
working.
• access_mode describes how the file will be used throughout the program. This
is an optional parameter and the default access_mode is reading (‘r’).
• buffering is an optional integer that sets the buffering policy, that is, how
data is buffered between the program and the file (for example, the size of the
buffer in bytes).
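A short sketch of the three arguments in use. The file name notes.txt and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_name = os.path.join(folder, "notes.txt")   # hypothetical file name

    # Mode "w" creates the file; the third argument is the buffering value
    f = open(file_name, "w", 1)      # 1 = line buffering (text mode only)
    f.write("first line\n")
    f.close()

    # Mode omitted: the file is opened for reading ("r") by default
    f = open(file_name)
    default_mode = f.mode
    content = f.read()
    f.close()

print(default_mode)   # r
print(content)        # first line
```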
FILE OBJECT
Finally, the function returns an object of file type, which allows us to
access and manipulate the file. Through it, we can read from and write to the
file. When a file operation fails for an I/O-related reason, the exception
OSError is raised (IOError is an alias of OSError in Python 3).
FILE PATHS
When you access a file on an operating system, a file path is required. The file path
is a string that represents the location of a file. It’s broken up into three major
parts:
• Folder Path: the file folder location on the file system where subsequent
folders are separated by a forward slash / (Unix) or backslash \ (Windows)
• File Name: the actual name of the file
• Extension: the end of the file path pre-pended with a period (.) used to
indicate the file type
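The three parts can be pulled out of a path with the os.path functions. This is a minimal sketch; the directory and file names are made up for illustration and the file does not need to exist.

```python
import os

# Build a path with the separator of the current operating system
path = os.path.join("Directory", "tests", "TestingText.txt")

folder_path, file_name = os.path.split(path)          # folder part and last component
base_name, extension = os.path.splitext(file_name)    # name and extension

print(folder_path)   # Directory/tests on Unix, Directory\tests on Windows
print(file_name)     # TestingText.txt
print(extension)     # .txt
```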
If the Python file being executed and the target file to read are not in the
same directory, we need to pass the full path of the file to
the open() function, as shown in the following code snippet:
file_example = open ("F:\\Directory\\AnotherDirectory\\tests\\TestingText.txt")
Thus the two ways to give paths in filenames correctly are:
i. Double the backslashes
f = open("c:\\temp\\record.txt", "r")
ii. Give a raw string by prefixing the file path string with r
f = open(r"c:\temp\record.txt", "r")
Note:
The r is placed before the filename string to make it a raw string, in which a
backslash is treated as an ordinary character and not as the start of an escape
sequence. For example, if the file address contains \temp and the string is not
raw, then \t is treated as a tab character and the address becomes invalid.
The r can be omitted if the file is in the same directory and no path is
given.
Something to keep in mind is to always make sure that both the file name and the
path given are correct. If either is incorrect or doesn't exist, the
error FileNotFoundError will be thrown, which needs to then be caught and
handled by your program to prevent it from crashing.
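One way to catch the error is sketched below. The helper name read_or_default and the path used in the call are hypothetical; the path is assumed not to exist.

```python
def read_or_default(path, default=""):
    """Return the text of the file, or `default` if the file does not exist."""
    try:
        with open(path, "r") as f:
            return f.read()
    except FileNotFoundError:
        # The file name or the path is wrong: report it instead of crashing
        print("File not found:", path)
        return default

# Assumed not to exist on the machine running this sketch
result = read_or_default("no_such_folder/no_such_file.txt", default="(empty)")
print(result)   # (empty)
```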
ACCESS MODE
Access modes govern the type of operations possible in the opened file. It refers
to how the file will be used once it’s opened. These modes also define the
location of the File Handle in the file. File handle is like a cursor, which defines
from where the data has to be read or written in the file.
There are 6 access modes in python.
• Read Only (‘r’): Open text file for reading. The handle is positioned at the
beginning of the file. If the file does not exist, FileNotFoundError is raised. This is also
the default mode in which the file is opened.
• Read and Write (‘r+’): Open the file for reading and writing. The handle is
positioned at the beginning of the file. FileNotFoundError is raised if the file does not
exist.
• Write Only (‘w’): Open the file for writing. For existing file, the data is
truncated and over-written. The handle is positioned at the beginning of the
file. Creates the file if the file does not exist.
• Write and Read (‘w+’): Open the file for reading and writing. For existing
file, data is truncated and over-written. The handle is positioned at the
beginning of the file.
• Append Only (‘a’): Open the file for writing. The file is created if it does
not exist. The handle is positioned at the end of the file. The data being
written will be inserted at the end, after the existing data.
• Append and Read (‘a+’): Open the file for reading and writing. The file is
created if it does not exist. The handle is positioned at the end of the file.
The data being written will be inserted at the end, after the existing data.
Note:
The above-mentioned modes are for opening, reading or writing text files only.
For binary files, we use the same modes with the letter ‘b’ added,
so that Python knows we are working with a binary file.
• ‘wb’ – Open a file for writing only in binary format.
• ‘rb’ – Open a file for reading only in binary format.
• ‘ab’ – Open a file for appending only in binary format.
• ‘rb+’ – Open a file for both reading and writing in binary format.
• ‘ab+’ – Open a file for both appending and reading in binary format.
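The behaviour of the modes can be checked with a small sketch. It assumes a temporary directory and the hypothetical file name modes.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "modes.txt")   # hypothetical file name

    with open(path, "w") as f:      # creates the file
        f.write("one\n")
    with open(path, "w") as f:      # truncates: "one" is lost
        f.write("two\n")
    with open(path, "a") as f:      # appends after the existing data
        f.write("three\n")
    with open(path, "r+") as f:     # read and write, handle at the beginning
        start_position = f.tell()
        after_append = f.read()
    with open(path, "a+") as f:     # handle starts at the end of the file
        f.seek(0)                   # move back before reading
        reread = f.read()
    with open(path, "rb") as f:     # binary mode returns bytes, not str
        first_bytes = f.read(3)

print(start_position)   # 0
print(after_append)     # two and three, on separate lines
print(first_bytes)      # b'two'
```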
OPENING A FILE USING WITH CLAUSE IN PYTHON
We can also open a file using the with clause. It is designed to provide cleaner
syntax and exception handling when working with files. The syntax of the
with clause is:
with open(file_name, access_mode) as file_object:
The advantage of using with clause is that any file that is opened using this clause
is closed automatically, once the control comes outside the with clause. In case the
user forgets to close the file explicitly or if an exception occurs, the file is closed
automatically.
with open("myfile.txt","r+") as myObject:
    content = myObject.read()
Here, we don’t have to close the file explicitly using close() statement. Python will
automatically close the file.
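The automatic closing can be verified with a short sketch. The temporary directory and the simulated error are assumptions added only to show the effect.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "myfile.txt")
    with open(path, "w") as f:
        f.write("some text\n")

    try:
        with open(path, "r") as my_object:
            content = my_object.read()
            # Simulate a failure while the file is still open
            raise RuntimeError("simulated failure inside the with block")
    except RuntimeError:
        pass

    # The file was closed even though the block ended with an exception
    closed_after_error = my_object.closed

print(content)              # some text
print(closed_after_error)   # True
```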
CLOSING A FILE - Python has a close() method to close a file. The close()
method can be called more than once, and if any operation is performed on a closed
file, a ValueError is raised. While closing a file, the system frees the resources
allocated to it. The syntax of close() is:
file_object.close()
Here, file_object is the object that was returned while opening the file. Python
makes sure that any unwritten or unsaved data is flushed (written) to the file
before it is closed. Hence, it is always advised to close the file once our work is
done. Also, if the file object is re-assigned to some other file, the previous file is
usually closed automatically once nothing else refers to it, but this should not
be relied upon.
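A minimal sketch of these rules, assuming a temporary directory and the hypothetical file name closing.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "closing.txt")   # hypothetical file name

    f = open(path, "w")
    f.write("data\n")
    f.close()
    f.close()                 # calling close() again is harmless
    is_closed = f.closed

    try:
        f.write("more")       # any operation on a closed file fails
        error_name = None
    except ValueError as err:
        error_name = type(err).__name__

print(is_closed)    # True
print(error_name)   # ValueError
```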
WRITING TO FILE
There are two ways to write in a file.
• write() : The write() method takes a string as a parameter and writes it to the file.
To store data with an end-of-line character, we have to add the ‘\n’
character to the end of the string. The argument has to be a
string, so to store a numeric value, we first have to convert it to a string.
File_object.write(str1)
• writelines() : For a list of string elements, each string is written to the text
file. It is used to write multiple strings at a single time. No end-of-line
character is added automatically.
File_object.writelines(L) for L = [str1, str2, str3]
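Both methods can be tried in a short sketch. The temporary directory and the file name written.txt are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "written.txt")   # hypothetical file name

    with open(path, "w") as f:
        # write() needs a string, so the number is converted with str()
        chars_written = f.write("Roll: " + str(101) + "\n")
        # writelines() writes each string as it is
        f.writelines(["Amit\n", "Sumit\n"])
        # without '\n' in the strings, everything ends up on one line
        f.writelines(["no", "newline", "added"])

    with open(path) as f:
        content = f.read()

print(chars_written)   # 10
print(content)
```

write() returns the number of characters written, which is why chars_written is 10 here.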
TO CREATE A TEXT FILE IN PYTHON
With Python, you can create a text file (file1.txt) by using the code
demonstrated here:
Step 1: f= open("file1.txt","w+")
We declared the variable f to open a file named file1.txt. open() takes 2 arguments here:
the file that we want to open and a string that represents the kind of permission or
operation we want on the file. Here, the letter "w" in the argument
indicates write mode, which creates the file if it does not exist
(and empties it if it does). The plus sign adds reading, so the file is opened
for both writing and reading.
Step 2: for i in range(10):
            f.write("This is line %d\n" % (i+1))
We have a for loop that runs over a range of 10 numbers and uses the write() function
to enter data into the file.
The text written on each iteration is "This is line", followed by %d, which is
replaced by an integer.
So basically we are putting in the line number that we are writing, then a newline
character.
Step 3 : f.close()
This closes the file file1.txt. After the code runs, file1.txt contains
ten lines, from "This is line 1" to "This is line 10".
Q1: Python program to write roll,name and per in the file named “abc1.txt”
f=open("abc1.txt","w")
roll=101
f.write("Amit\n")
f.write(str(roll))
per=98.99
f.write("\n")
f.write(str(per))
#close the file
f.close()
Q2: Python program to write list sequence (names) in the file “abc.txt”
f=open("abc.txt","w")
names=["Amit\n","Sumit\n","Rohit\n","Kapil\n","Neha\n"]
f.writelines(names)
#close the file
f.close()
Q3: Python program to write contents into a file named “abc.txt”
with open("abc.txt","w") as f:
    f.write("hello how are you\n")
    f.write("welcome to python file handling\n")
    f.write("end of the file\n")
#no explicit close() is needed: the with clause closes the file
READING FROM A FILE
Python provides various methods for reading data from a file. we can read
character data from a text file using the following read methods:
1. read() :
To read the entire data from the file. Starts reading from the cursor up to the end of
the file.
2. read(n):
To read “n” characters from the file. Starting from the cursor, if the file has
fewer characters than “n”, it will read until the end of the file.
3. readline():
To read only one line from the file; starts reading from the cursor up to and
including the end of line character.
4. readlines():
To read all the lines from the file into a list; starts reading from the cursor up to the
end of the file and return a list of lines.
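The four methods can be compared in one sketch. It assumes a temporary directory and writes a small three-line test.txt first, so that the reads have known content.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "test.txt")
    with open(path, "w") as f:
        f.write("This is line1\nThis is line2\nThis is line3\n")

    with open(path, "r") as f:
        first_seven = f.read(7)       # 7 characters from the cursor
        rest_of_line = f.readline()   # up to and including the end of line
        remaining = f.readlines()     # all remaining lines as a list
        at_end = f.read()             # nothing left: empty string

print(repr(first_seven))    # 'This is'
print(repr(rest_of_line))   # ' line1\n'
print(remaining)            # ['This is line2\n', 'This is line3\n']
print(repr(at_end))         # ''
```

Each call continues from where the previous one stopped, because all of them share the same cursor.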
Q 1: We have a file “test.txt” and we want to read the contents of the file using
read() method.
f=open("test.txt","r")
data=f.read()
print(data)
f.close()
Output:
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
This is line7
This is line8
This is line9
This is line10
Q2: We have a file “test.txt” and we want to read 1st 7 characters from the
file using read(n) method.
#to read the contents of the file
f=open("test.txt","r")
data=f.read(7)
print(data)
f.close()
Output:
This is
>>>
Q3: We have a file “test.txt” and we want to read 1st 3 lines from the file
using readline() method.
f=open("test.txt","r")
data=f.readline()
print(data,end='')
data=f.readline()
print(data,end='')
data=f.readline()
print(data,end='')
f.close()
Output:
This is line1
This is line2
This is line3
Q4: We have a file “test.txt” and we want to read contents from the file using
readlines() method.
#to read the contents of the file
f=open("test.txt","r")
data=f.readlines()
print(data)
f.close()
Output:
Since the output is stored in the form of list, so we get the following result.
['This is line1\n', 'This is line2\n', 'This is line3\n',
'This is line4\n', 'This is line5\n', 'This is line6\n',
'This is line7\n', 'This is line8\n', 'This is line9\n',
'This is line10']
>>>
Q5: To read the contents and display each line in proper format, we
modified the above program as:
#to read the contents of the file
f=open("test.txt","r")
data=f.readlines()
for i in data:
    print(i,end='')
f.close()
Output:
This is line1
This is line2
This is line3
This is line4
This is line5
This is line6
This is line7
This is line8
This is line9
This is line10
Q6: Write a python script to read the contents of a file character by character
and display them?
def file_read():
    with open("abc.txt") as f:
        while True:
            c=f.read(1)
            if not c:
                break
            print(c,end='')
#function calling
file_read()
Q7: Write a (python script) function named count_len() to read the contents
of the file “abc.txt” and print its length.
Sol:
def count_len():
    le=0
    with open("abc.txt") as f:
        while True:
            c=f.read(1)
            if not c:
                break
            print(c,end='')
            le=le+1
    print("length of file ",le)
#function calling
count_len()
Q8: write a (python script) function named count_lower() to read the contents
of the file “story.txt”. Further count and print total lower case alphabets in
the file
Sol:
def count_lower():
    lo=0
    with open("story.txt") as f:
        while True:
            c=f.read(1)
            if not c:
                break
            print(c,end='')
            if(c>='a' and c<='z'):
                lo=lo+1
    print("total lower case alphabets ",lo)
#function calling
count_lower()
Q9: Write a (python script) function named count_words() to read the
contents of the file “abc.txt” word by word and display the contents. Also
display the total number of words starting with a vowel (“a”, “e”, “i”, “o”, “u” or
“A”, “E”, “I”, “O”, “U”).
def count_words():
    w=0
    vowels="aeiouAEIOU"
    with open("abc.txt") as f:
        for line in f:
            for word in line.split():
                print(word)
                if(word[0] in vowels):
                    w=w+1
    print("total words starting with vowels ",w)
#function calling
count_words()
Q10: Write a (python script) function named count() to
read the contents of the file “story.txt”. Further
count and print the following:
#total length
#total alphabets
#total vowels
#total consonants
#total non alpha chars
Sol:
def count():
    d=le=c1=c2=a=0
    vowels="aeiouAEIOU"
    with open("story.txt") as f:
        while True:
            c=f.read(1)
            if not c:
                break
            print(c,end='')
            le=le+1
            if((c>='A' and c<='Z') or(c>='a' and c<='z')):
                a=a+1
                if c in vowels:
                    d=d+1
                else:
                    c1=c1+1
            else:
                c2=c2+1
    print("total vowels ",d)
    print("total consonants ",c1)
    print("total alphabets ",a)
    print("total chars which are not alphabets ",c2)
    print("length = ",le)
#function calling
count()
SETTING OFFSETS IN A FILE
The functions that we have learnt till now are used to access the
data sequentially from a file. But if we want to access data in a
random fashion, then Python gives us seek() and tell() functions to
do so.
The tell() method - This function returns an integer that specifies
the current position of the file object in the file. The position so
specified is the byte position from the beginning of the file till the
current position of the file object. The syntax of using tell() is:
file_object.tell()
The seek() method - This method is used to position the file object
at a particular position in a file. The syntax of seek() is:
file_object.seek(offset [, reference_point])
where
• offset is the number of bytes by which the file object is to be
moved.
• reference_point indicates the starting position of the file
object. That is, with reference to which position, the offset has
to be counted. It can have any of the following values:
0 - beginning of the file
1 - current position of the file
2 - end of file
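By default, the value of reference_point is 0, i.e. the offset is
counted from the beginning of the file. In text mode, only reference point 0
may be used with a non-zero offset; to seek relative to the current position
or the end of the file, open the file in binary mode.
For example, the statement fileObject.seek(5,0) will position the file
object at the 5th byte position from the beginning of the file. The following
example demonstrates tell() and seek():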
print("Learning to move the file object")
fileobject=open("testfile.txt","r+")
text=fileobject.read()
print(text)
print("Initially, the position of the file object is:",fileobject.tell())
fileobject.seek(0)
print("Now the file object is at the beginning of the file:",fileobject.tell())
fileobject.seek(5)
print("We are moving to 5th byte position from the beginning of file")
print("The position of the file object is at",fileobject.tell())
text=fileobject.read()
print(text)
fileobject.close()
Output:
RESTART: Path_to_file\Program2-2.py
Learning to move the file object
roll_numbers = [1, 2, 3, 4, 5, 6]
Initially, the position of the file object is: 33
Now the file object is at the beginning of the file: 0
We are moving to 5th byte position from the beginning of file
The position of the file object is at 5
numbers = [1, 2, 3, 4, 5, 6]
Example:
f = open("a.txt", 'w')
line = 'Welcome to python.mykvs.in\nRegularly visit python.mykvs.in'
f.write(line)
f.close()
f = open("a.txt", 'rb+')
print(f.tell())
print(f.read(7)) # read seven bytes
print(f.tell())
print(f.read())
print(f.tell())
f.seek(9,0) # moves to position 9 from the beginning
print(f.read(5))
f.seek(4, 1) # moves 4 positions forward from the current location
print(f.read(5))
f.seek(-5, 2) # go to the 5th byte before the end
print(f.read(5))
f.close()
OUTPUT (on Windows, where '\n' is written to a text file as '\r\n')
0
b'Welcome'
7
b' to python.mykvs.in\r\nRegularly visit python.mykvs.in'
59
b'o pyt'
b'mykvs'
b'vs.in'
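The three reference points can also be tried on a file written in binary mode, so that the byte positions are the same on every operating system. This is a minimal sketch; the temporary directory and the file name offsets.bin are assumptions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "offsets.bin")   # hypothetical file name

    with open(path, "wb") as f:
        f.write(b"roll_numbers = [1, 2, 3]")     # 24 bytes

    with open(path, "rb") as f:
        start = f.tell()          # position right after opening
        f.seek(5)                 # 5 bytes from the beginning (reference point 0)
        word = f.read(7)
        after_read = f.tell()     # reading moved the position forward by 7
        f.seek(-3, 1)             # 3 bytes back from the current position
        back = f.read(3)
        f.seek(-9, 2)             # 9 bytes before the end of the file
        tail = f.read()
        size = f.tell()           # position at the end = size of the file

print(start, after_read, size)   # 0 12 24
print(word)                      # b'numbers'
print(back)                      # b'ers'
print(tail)                      # b'[1, 2, 3]'
```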
PICKLING
Just as pickling preserves food for later use, the Python pickle
module preserves Python objects by serializing and de-serializing object
structures. The process of converting any kind of Python object (list,
dict, etc.) into a byte stream is called pickling or
serialization or flattening or marshalling. We can convert the
byte stream (generated through pickling) back into Python objects
by a process called unpickling.
Serialization is the process of transforming objects or data structures
into byte streams or strings. A byte stream is simply a sequence of
bytes, where one byte is composed of 8 bits of zeros and ones. These byte
streams can then be stored or transferred easily.
The pickle module deals with binary files. Here, data is not written
but dumped, and similarly, data is not read but loaded. The pickle
module must be imported to load and dump data. It
provides two methods, dump() and load(), to work with
binary files for pickling and unpickling, respectively.
PYTHON PICKLE DUMP
In this section, we are going to learn, how to store data using
Python pickle. To do so, we have to import the pickle module first.
Then use pickle.dump() function to store the object data to the
file. pickle.dump() function takes 2 arguments.
• The first argument is the object that you want to store.
• The second argument is the file object you get by opening the
desired file in write-binary (wb) mode.
import pickle
data = int(input('Enter the number of data : '))
lst = []
for i in range(data):
    raw = input('Enter data ' + str(i) + ' : ')
    lst.append(raw)
file = open('important', 'wb')
pickle.dump(lst, file)
file.close()
PYTHON PICKLE LOAD
To retrieve pickled data, the steps are quite simple. You have to
use pickle.load() function to do that. The primary argument of
pickle load function is the file object that you get by opening the file
in read-binary (rb) mode.
Example:
import pickle
file = open('important', 'rb')
data = pickle.load(file)
file.close()
print('Showing the pickled data:')
cnt = 0
for item in data:
    print('The data ', cnt, ' is : ', item)
    cnt += 1
Output:
Showing the pickled data:
The data 0 is : 123
The data 1 is : abc
The data 2 is : !@#$
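Dumping and loading can be combined into one round trip that needs no keyboard input. This is a sketch only; the record, the temporary directory and the file name student.dat are assumptions.

```python
import os
import pickle
import tempfile

record = {"roll": 101, "name": "Amit", "per": 98.99}   # sample data

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.dat")   # hypothetical file name

    with open(path, "wb") as f:     # write-binary mode for dump()
        pickle.dump(record, f)

    with open(path, "rb") as f:     # read-binary mode for load()
        loaded = pickle.load(f)

print(loaded == record)                # True
print(type(loaded["roll"]).__name__)   # int
```

Unlike writing with str(), pickling keeps the original data types, so roll comes back as an integer. Only unpickle files from a source you trust, since loading a pickle can run arbitrary code.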
Q: Python program to take input for roll,name and per
of students and further write them in a binary file.
Sol:
import pickle
def write_details():
    lst=[]
    while True:
        r=int(input("Enter roll "))
        n=input("Enter name ")
        p=float(input("Enter per "))
        d=str(r)+' '+n+' '+str(p)
        lst.append(d)
        ch=input("Like to add more student records(y/n)")
        if(ch=='y' or ch=='Y'):
            continue
        else:
            break
    fp=open("student","wb")
    pickle.dump(lst,fp)
    print("Names written in the file")
    fp.close()
#function calling
write_details()
OUTPUT:
Enter roll 101
Enter name kamal
Enter per 98.90
Like to add more student records (y/n) y
Enter roll 102
Enter name Kishan
Enter per 99.98
Like to add more student records (y/n) y
Enter roll 103
Enter name Mohan
Enter per 98.78
Like to add more student records (y/n) n
Names written in the file
>>>
Q2: Python program to read empno, name and salary stored in
a binary file named "emp" (written in the same way as above).
Sol:
import pickle
f=open("emp","rb")
temp=pickle.load(f)
f.close()
#print(temp)
for rec in temp:
    for field in rec.split():
        print(field,end='\t')
    print()
Output:
1001 abc 12000.0
1002 xyz 23000.0
1003 aaa 25000.0
RELATIVE AND ABSOLUTE PATHS
Your computer drive is organized in a hierarchical structure of files
and directories.
• files -- These contain information. Examples include csv
files and Python files.
• directories -- These contain files and directories inside of
them. If there are a large number of files to handle in our
Python program, we can arrange our code within different
directories to make things more manageable. A directory or
folder is a collection of files and subdirectories. Python has
the os module that provides us with many useful methods to
work with directories (and files as well).
Your filesystem starts from a root directory, notated by a forward
slash / on Unix and by a drive letter such as C:/ on Windows.
Path
Path is a sequence of directory names which give you the hierarchy
to access a particular directory or file name.
Absolute file paths are notated by a leading forward slash or
drive label. An absolute path is a path that describes the location
of a file or folder regardless of the current working directory; in fact,
it is relative to the root directory. It contains the complete location
of a file or directory, hence the name. It is also referred to as
absolute pathname or full path and it always starts at the same
place, which is the root directory.
Relative file paths are notated by the lack of a leading forward
slash or drive label. For example, example_directory. A relative file path is
interpreted from the perspective of your current working directory. If
you use a relative file path from the wrong directory, then the path
will refer to a different file than you intend, or it will refer to no file
at all.
The OS module provides functions for working with files and
directories. We can get the present working directory using
the getcwd() method of the os module. This method returns the
current working directory in the form of a string. We can also use
the getcwdb() method to get it as bytes object.
>>> import os
>>> os.getcwd()
'C:\\Program Files\\PyScripter'
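The link between the current working directory and the two kinds of path can be sketched with os.path. The relative path data/test.txt is hypothetical and the file does not need to exist.

```python
import os

cwd = os.getcwd()                             # current working directory
relative = os.path.join("data", "test.txt")   # hypothetical relative path

# abspath() interprets the relative path from the current working directory
absolute = os.path.abspath(relative)

print(os.path.isabs(relative))   # False
print(os.path.isabs(absolute))   # True
print(absolute)                  # depends on where the program is run
```

Running the same program from a different directory gives a different absolute path for the same relative path.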
Fill in the blanks:
1. A collection of bytes stored in computer’s secondary memory is
known as _______.
2. ___________ is a process of storing data into files and allows us to
perform various tasks such as read, write, append, search
and modify in files.
3. The transfer of data from a program's memory (RAM) to a permanent
storage device (hard disk) and vice versa is known as
__________.
4. A _______ is a file that stores data in a specific format on
secondary storage devices.
5. In ___________ files each line terminates with EOL or ‘\n’ or
carriage return, or ‘\r\n’.
6. To open file data.txt for reading, open function will be written as f
= _______.
7. To open file data.txt for writing, open function will be written as f
= ________.
8. In f=open(“data.txt”,”w”), f refers to ________.
9. To close file in a program _______ function is used.
10. A __________ function reads first 15 characters of file.
11. A _________ function reads at most n bytes and returns the read
bytes in the form of a string.
12. A _________ function reads all lines from the file.
13. A _______ function requires a string (File_Path) as parameter to
write in the file.
14. A _____ function requires a sequence of lines, lists, tuples etc. to
write data into file.
15. To add data into an existing file ________ mode is used.
16. A _________ function is used to write contents of buffer onto
storage.
17. A text file stores data in _________ or _________ form.
18. A ___________ is plain text file which contains list of data in
tabular form.
19. You can create a file using _________ function in python.
20. A __________ symbol is used to perform reading as well as writing
on files in python.
Answers:
1. File
2. File Handling
3. I/O Operations
4. Data file
5. Text File
6. open(“data.txt”,”r”)
7. open(“data.txt”,”w”)
8. File handle or File Object
9. close
10. read(15)
11. readline()
12. readlines()
13. write()
14. writelines()
15. append
16. flush()
17. ASCII, UNICODE
18. CSV
19. open()
20. +