
A useful program usually needs to interact with the outside world. Such interaction can involve receiving data or sending data outside the program. Data received from outside is called input data while the data the program sends outside is called output data. Together input and output operations are often referred to as I/O (Input/Output) operations.
The input may directly come from the user via the keyboard, an external file, or even the internet. We can display output directly to the console, which we have been doing so far using the print()
function. A program can also output data by storing data in a file, database, or sending data through the internet. Let's look into some basic I/O operations in Python.
Input and Output
In primary input and output operations, let's look into how we can get user data using the keyboard and output data into the console screen.
Getting User Data using input
Programs that require input data from the user can use the built-in function input(<prompt>). The input() function reads a line of input from the keyboard. When a program reaches an input() statement, Python pauses execution to allow the user to type a line of input. After the user types the characters and presses the Enter key, Python returns the typed characters as a string object.
Let's look at the usage of the input() function.
>>> your_name = input()
Primer # User Input
>>> your_name # User Input is stored in the name `your_name`
'Primer'
You can also include an optional prompt argument to the input()
function. The prompt argument displays a text string as a prompt to the user before pausing the program execution to read input. We can rewrite the above code listing to include a prompt.
>>> your_name = input("May I know your name? ")
May I know your name? Primer # Prompt is shown before the input
>>> your_name
'Primer'
The input() function always returns a string. If you wish to get a numeric value from the input data, you will need to convert the string to an appropriate type using the built-in int() or float() functions.
>>> your_age = input("May I know your age? ")
May I know your age? 16
>>> your_age
'16' # String
>>> int(your_age)
16 # Integer
We can convert the input directly to the given type by passing it to the built-in int() or float() functions.
>>> your_birth_year = int(input("May I know your year of birth ? "))
>>> your_birth_year
2001 # Integer
Take a look at the following code.
>>> options = ["A", "B", "C", "D"]
>>> choice = input()
1 # User types 1; `choice` now holds the string '1'
>>> choice = __X__
>>> options[choice]
'B'
What is the value of __X__ for which the above code is correct?
- int(choice)
- int(input)
- int('B')
- None of the above
Let's understand the input() function by creating a small program: a guessing game that asks the user to guess a random integer by providing hints.
from random import randint # To generate a random integer

secret_number = randint(1, 99) # A random secret number
while True:
    try:
        guess = int(input("Guess a number between 1 and 99: "))
    except ValueError: # In case the user types strings
        print("Your guess must be a number")
    except KeyboardInterrupt: # In case the user quits abruptly
        print("Quitting the game. Thank you for playing.")
        break
    else: # Will run if no exceptions are raised
        if guess < secret_number:
            print("Your guess is low")
        if guess > secret_number:
            print("Your guess is high")
        if guess == secret_number:
            print("You got it correct. It is {}".format(secret_number))
            break
The output of the script guessing_game.py
is shown below.
Guess a number between 1 and 99: 34
Your guess is high
Guess a number between 1 and 99: 25
Your guess is high
Guess a number between 1 and 99: 10
Your guess is low
Guess a number between 1 and 99: 11
You got it correct. It is 11
Can you explain the working of guessing_game.py in your own words?
The guessing_game script is an excellent demonstration of how the input() function can create interactive Python programs. It works in the following way:
- Python assigns a randomly generated integer between 1 and 99 to secret_number.
- The while loop keeps running until the user correctly guesses the secret_number or purposely quits the program by pressing Ctrl+C.
- The try block gets the input and converts it to an integer using the int function.
- If the user inputs anything other than a number, Python raises ValueError, which is caught by the except handler.
- The else clause only runs when the try block doesn't raise any exceptions.
- If the user guesses correctly, the program ends.
Now that we have a basic idea of getting input data through the input() function, let's do an exercise.
Take a look at the following code.
while True:
    try:
        mood = input("What's your mood : ")
        if mood.strip().lower() == 'q' or mood.strip().lower() == 'exit':
            print("Thank you for sharing.")
            break
        print(f"Aha ! You are {mood}")
    except KeyboardInterrupt:
        print("Thank you for sharing.")
        break
How can you quit the above script once you have started?
- Typing exit and pressing Enter.
- Typing Q and pressing Enter.
- Typing EXIT and pressing Enter.
- All of the above
Reading Command Line Arguments
When we execute a Python script from the command line, Python lets you access the command-line arguments via the sys module. To understand, let's create a small Python script and save it as check_arguments.py.
The check_arguments.py script below simply prints out sys.argv to the screen.
import sys
print(sys.argv)
If you execute the Python script without any extra arguments, you will receive the following output.
> python check_arguments.py
['check_arguments.py'] # Can be accessed by `sys.argv[0]`
The sys.argv attribute of the sys module is a list of the command-line arguments passed to a Python script. The first item is the script name; whether this is a full pathname depends on the operating system. Let's add some more arguments.
> python check_arguments.py hello world 1 2 3
['check_arguments.py', 'hello', 'world', '1', '2', '3']
All the arguments, separated by spaces, are presented as a Python list and can be accessed using their respective indexes.
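To see how indexing into that list works in practice, here is a minimal sketch. The script name sum_args.py and the helper function are hypothetical, not part of the sys module; the key point is that every argument arrives as a string and must be converted before arithmetic.

```python
import sys

# A hypothetical helper that adds up numeric command-line arguments.
# sys.argv[1:] skips the script name (index 0); every argument arrives
# as a string, so each one must be converted with int() first.
def sum_args(argv):
    return sum(int(arg) for arg in argv[1:])

# Running `python sum_args.py 1 2 3` would make sys.argv equal to
# ['sum_args.py', '1', '2', '3'], and sum_args(sys.argv) would return 6.
```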
Say we have the following script.
import sys

if len(sys.argv) == 4:
    _, dad, mom, son = sys.argv
    print(f"{dad} is married to {mom} and {son} is their son")
The output of the above script is shown below:
python3 happy_family.py __A__
Luffy is married to Boa and Toffee is their son
What is the value of __A__ in the above command?
- Luffy Boa Toffee
- Boa Luffy Toffee
- Toffee Luffy Boa
- Luffy Toffee Boa
Now, let's look into how we can read files in Python.
Reading and Writing Files
One common use of programs is to read and write files. Python provides the built-in function open() to open a file for reading as well as for writing. The open() function returns a file object with various methods for reading and writing operations. We will look into the file object later. First, let's look into how we can use the open() function.
Opening Files
To start reading files in Python, let us create a plain text file named epictetus.txt
and save the following text.
epictetus.txt
How long are you going to wait before you demand the best for yourself and, in no instance, bypass the discriminations of reason?
You have been given the principles that you ought to endorse, and you have endorsed them. What kind of teacher, then, are you still waiting for to refer your self-improvement to him?
You are no longer a boy but a full-grown man. If you are careless and lazy now and keep putting things off and always deferring the day after which you will attend to yourself, you will not notice that you are making no progress, but you will live and die as someone quite ordinary.
From now on, then, resolve to live as a grown-up who is making progress, and make whatever you think best a law that you never set aside. And whenever you encounter anything difficult or pleasurable, or highly or lowly regarded, remember that the contest is now: you are at the Olympic Games, you cannot wait any longer, and that your progress is wrecked or preserved by a single day and a single event.
That is how Socrates fulfilled himself by attending to nothing except reason in everything he encountered. And you, although you are not yet a Socrates, should live as someone who at least wants to be a Socrates.
In the same folder, start a Python interpreter. To read the epictetus.txt file in Python, use the following code listing.
>>> text_file = open("epictetus.txt") # Open the file
If the above code runs without raising a FileNotFoundError exception, it means Python successfully opened the file.
Opening a file in Python doesn't mean the same thing as opening it in a text editor. The open() function returns a file-object iterator, which knows how to get the text file's data.
To check that text_file is an iterator, let's pass the object to the next() function.
>>> next(text_file)
'How long are you going to wait before you demand the best for yourself and in no instance bypass the discriminations of reason? \n'
The above code illustrates that the text_file object is an iterator. We will look into more of its methods later on.
The first thing you should know after learning to open a file is how to close it. In Python, we can close the file using the close() method available on the file-object.
>>> text_file.close() # File is closed
Once you close the file, invoking next() on the file-object raises a ValueError.
>>> next(text_file)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.
In Python, you should always close the file you open.
When the script or application is terminated, Python closes the file on its own. However, there is no guarantee when that will happen. In the meantime, this can cause some unwanted behavior, including resource leaks.
It is in your best interest to always make sure that your code behaves in a way that is well-defined while trying to reduce unwanted behavior.
When you are working with files, there are two main ways that you can use to ensure that the files are closed properly, even when encountering an error.
The first way is to use the try-finally clause.
text_reader = open('epictetus.txt')
try:
    ...  # Operations using the file-object
finally:
    text_reader.close() # Always executes
The above code uses the try-finally clause to close an opened file. What happens in the finally clause when the try block encounters an error?
Even if Python encounters an error, it always executes the finally clause. This ensures that Python always closes the file properly after executing the statements.
The second way to close a file properly is to use the with statement to open it.
with open('epictetus.txt') as text_reader:
    ...  # Operations using the file-object
After Python executes the code inside the with statement, the with statement automatically closes the file.
Similar to the finally clause, the with statement closes files even when Python encounters an error.
Although you can use either of the two methods, we recommend using the with statement: it allows for cleaner code while handling any unexpected errors.
Next, let's look at the file objects.
File Objects
The open() function takes an argument mode, specifying how the file is opened, and returns a file-object. The default mode is r, which means open for reading in text mode. The available modes are shown in Table 3.
Character | Meaning |
---|---|
r | open for reading (default) |
w | open for writing, truncating the file first |
rb or wb | open for reading or writing in binary mode |
When we earlier opened a file without specifying the mode, Python assumed we wanted the r reading mode. Let's read the contents of the file epictetus.txt in Python.
>>> with open('epictetus.txt') as text_reader: # Same as open('epictetus.txt', 'r')
... for line in text_reader:
... line
'How long are you going to wait before you demand the best for yourself and in no instance bypass the discriminations of reason? \n'
'\n'
...
# Output shortened
As the text_reader file-object is an iterator, we can use the for statement to loop over it. The official documentation describes the file-object as follows:
An object exposing a file-oriented API (with methods such as read() or write()) to an underlying resource. Depending on how it was created, a file object can mediate access to a real on-disk file or another type of storage or communication device (for example, standard input/output, in-memory buffers, sockets, or pipes). File objects are also called file-like objects or streams.
There are three types of file-objects:
- raw binary files
- buffered binary files
- text files
For this course, we will focus only on file-objects of the text-file type.
Binary files are files whose format is made up of non-readable characters stored in binary form. They can range from image files like JPEGs or GIFs and audio files like MP3s to binary document formats like PDF. In Python, files are opened in read mode by default. Now, let's take a look at how to read files.
Reading Files
The file objects have methods relating to reading operations for files.
Method | Description |
---|---|
read() | Reads and returns the entire file |
readable() | Returns whether the object was opened for reading |
readline() | Reads and returns a line |
readlines() | Returns a list object with lines from the file |
read
The read(size=-1) method reads at most size characters from the file. If size is omitted, None, or -1, the entire file is read and returned as a single string.
>>> with open('epictetus.txt') as reader:
... print(reader.read(30))
...
How long are you going to wait
In the above code listing, we pass 30 as the size argument to specify how many characters to return.
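One detail worth noting: successive read(size) calls continue from where the previous one stopped, because the file-object keeps track of its current position. A minimal sketch, using a stand-in file sample.txt created just for this example:

```python
# Create a small stand-in file so the sketch is self-contained.
with open('sample.txt', 'w') as f:
    f.write('How long are you going to wait')

with open('sample.txt') as reader:
    first = reader.read(8)    # 'How long'
    second = reader.read(8)   # ' are you' -- picks up where read() left off
    rest = reader.read()      # ' going to wait' (everything remaining)
```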
readable
To check whether we opened the file for reading, we can use the readable() method, which returns either True or False.
>>> with open('scratch.txt', 'w') as reader:
# Open a throwaway file in `write` mode; `w` truncates the file,
# so don't use epictetus.txt here
...     reader.readable()
...
False
readline
The readline(size=-1) method reads and returns at most size characters of a single line. Python reads the entire line if size is -1, None, or not provided.
>>> with open('epictetus.txt') as reader:
... reader.readline()
...
'How long are you going to wait before you demand the best for yourself and in no instance bypass the discriminations of reason? \n'
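The size argument is easiest to see with two short lines. This sketch uses a stand-in file sample.txt created for the example; note how a truncated readline(size) call stays within the current line, and the next call finishes that same line.

```python
# Create a small two-line stand-in file.
with open('sample.txt', 'w') as f:
    f.write('first line\nsecond line\n')

with open('sample.txt') as reader:
    a = reader.readline()    # 'first line\n' (the whole first line)
    b = reader.readline(3)   # 'sec' (at most 3 characters of the next line)
    c = reader.readline()    # 'ond line\n' (the rest of that same line)
```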
readlines
The readlines() method reads all remaining lines from the file-object and returns them as a list object.
>>> with open('epictetus.txt') as reader:
... reader.readlines()
...
['How long are you going to wait before you demand the best for yourself and in no instance bypass the discriminations of reason? \n', ..., 'That is how Socrates fulfilled himself by attending to nothing except reason in everything he encountered. And you, although you are not yet a Socrates, should live as someone who at least wants to be a Socrates.'] # List Truncated
For which value of __A__ does the following code return a list object?
>>> with open('epictetus.txt') as reader:
... type(reader.__A__())
...
<class 'list'>
- readlines
- readline
- readable
- read
Iterating over lines
As we mentioned earlier, the file-object is an iterator, so we can read lines using the iterator itself. We can use a for loop to iterate over the file-object and read the lines.
>>> with open('epictetus.txt') as reader:
... for lines in reader:
... print(lines)
...
How long are you going to wait before you demand the best for yourself and, in no instance, bypass the discriminations of reason?
### Lines Truncated
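A small detail when printing lines this way: each line from the iterator keeps its trailing '\n', and print() adds another newline, which double-spaces the output. Passing end='' avoids this. A minimal sketch, with sample.txt as a stand-in file:

```python
# Create a stand-in file so the sketch is self-contained.
with open('sample.txt', 'w') as f:
    f.write('one\ntwo\n')

lines_seen = []
with open('sample.txt') as reader:
    for line in reader:
        lines_seen.append(line)   # the '\n' is still attached to each line
        print(line, end='')       # end='' prevents double spacing
```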
What's the output of the following code?
try:
    reader = open('epictetus.txt', 'w')
    s = ""
    for lines in reader:
        s += lines
    print(len(s))
finally:
    reader.close()
- 1238
- 1233
- 1235
- Raises Error
As we opened the file in writing mode by passing the w mode, Python raises an error when we try to read from it.
Now that we have covered how to read files, let's look into writing into files.
Writing Files
We can write to a file only when we open it in write mode. If you open an existing file in write mode, Python truncates it, removing all the previously stored content. The table below lists the methods available for writing to a file.
Methods | Description |
---|---|
write(text) | Writes text to the stream |
writelines(lines) | Writes the sequence lines to the file |
writable() | Returns whether the object was opened for writing |
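Parallel to the readable() check we saw earlier, writable() reports whether the file-object accepts write operations. A minimal sketch, using a throwaway file scratch.txt:

```python
# writable() mirrors readable(); 'scratch.txt' is a throwaway file.
with open('scratch.txt', 'w') as writer:
    can_write = writer.writable()   # True: opened in 'w' mode
    can_read = writer.readable()    # False: 'w' mode is write-only
```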
write
The write(text) method writes a string to a file. The following code listing illustrates writing to a file: let's add a serial number to each line in epictetus.txt by writing into a new file, numbered_epictetus.txt.
>>> with open('epictetus.txt') as reader:
...     text = reader.readlines()
>>> with open('numbered_epictetus.txt', 'w') as writer:
...     for index, line in enumerate(text):
...         writer.write("{}. {}".format(index + 1, line))
The resulting numbered_epictetus.txt is shown below.
1. How long are you going to wait before you demand the best for yourself and, in no instance, bypass the discriminations of reason?
2.
3. You have been given the principles that you ought to endorse, and you have endorsed them. What kind of teacher, then, are you still waiting for to refer your self-improvement to him?
4.
...
In the above code,
- we read all lines from epictetus.txt and name the resulting list text.
- Then, we open a new file, numbered_epictetus.txt, in write mode and write to it using the write() method.
- If you check the directory, you can see that Python created a new file, numbered_epictetus.txt, with each line's number in the first few characters of the line.
We can also work with two different files at the same time using a single with statement.
>>> with open('epictetus.txt') as reader, open('numbered_epictetus_2.txt', 'w') as writer:
...     for index, line in enumerate(reader.readlines()):
...         writer.write("{}. {}".format(index + 1, line))
The above code listing produces the same file as we saw earlier.
You might notice that the output numbers blank lines as well. We need to remove the blank lines.
with open('epictetus.txt') as __A__, open('numbered_epictetus_2.txt', 'w') as __B__:
    text = [line + "\n" for line in reader.readlines() if line != "\n"]
    for index, line in enumerate(__C__):
        writer.write(f"{index+1}.{line}")
The above code removes the blank lines and appends an additional newline to each remaining line for spacing between paragraphs. What are the values of A, B, and C?
- A: reader, B: writer and C: text
- A: reader, B: writer and C: reader.readlines()
- A: writer, B: reader and C: text
- A: text, B: writer and C: reader
Similar to the readlines method for reading files, file-objects also have a writelines method. Let's take a look.
writelines
The writelines method writes a sequence of strings to a file. Let's look at an example to understand more.
>>> with open('squares_and_cubes.txt', 'w') as writer:
...     a = [(x, x**2, x**3) for x in range(10)]
...     writer.writelines(["The square and cube of {} are {} and {} respectively.\n"
...                        .format(x, y, z) for x, y, z in a])
In the above code listing, we pass a list built by list comprehension as an argument to the writelines method. Once you have executed the code above and check the directory, you will find a text file, squares_and_cubes.txt, with the following text.
The square and cube of 0 are 0 and 0 respectively.
The square and cube of 1 are 1 and 1 respectively.
The square and cube of 2 are 4 and 8 respectively.
The square and cube of 3 are 9 and 27 respectively.
...
writelines and write are the two main methods for writing text to Python files. Can you think of the main difference between these two methods?
The write() method accepts only a string argument, while the writelines() method takes a sequence of strings as its argument.
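The difference is easy to demonstrate: writelines() with a list produces the same file as write() with the joined string. A minimal sketch (the file names a.txt and b.txt are made up); note that writelines() adds no newlines of its own, so they must already be in the strings.

```python
lines = ['alpha\n', 'beta\n', 'gamma\n']

# writelines() takes a sequence of strings...
with open('a.txt', 'w') as f:
    f.writelines(lines)

# ...while write() takes a single string, so we join the pieces first.
with open('b.txt', 'w') as f:
    f.write(''.join(lines))

# Both produce byte-for-byte identical files.
with open('a.txt') as fa, open('b.txt') as fb:
    identical = fa.read() == fb.read()   # True
```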
Python can also work with csv files, which we can open in spreadsheet software such as Microsoft Excel, Google Sheets, or LibreOffice Calc. Let's look into how we can do that.
Reading and Writing CSV
A csv file is a type of plain text file that uses a special formatting structure to represent tabular data. csv stands for Comma-Separated Values.
To understand why, let's look at an example of a csv file. In a csv file, each line corresponds to one row of the table, while a comma separates each cell. Because a comma separates each piece of data, the format is called Comma-Separated Values, and the comma is called the delimiter. You can use other delimiters as well, such as a hyphen (-) or a space.
Suppose we have the table of data shown in Table 4.
Name | City | Age |
---|---|---|
John | Berlin | 45 |
Mark | London | 34 |
Liu | Shanghai | 45 |
Balakrishna | Chennai | 33 |
Sofia | Istanbul | 34 |
We can represent the tabular data in csv form as follows.
person_data.csv
name,city,age
John,Berlin,45
Mark,London,34
Liu,Shanghai,45
Balakrishna,Chennai,33
Sofia,Istanbul,34
Save the file with the text shown above with the name person_data.csv
and start an interpreter in the same directory.
Reading CSV File
The csv module exposes a handy function, reader(<file-object>), which can parse a given csv file. Let's see it in action.
>>> import csv
>>> with open('person_data.csv') as csv_file:
... csv_reader = csv.reader(csv_file)
... print(list(csv_reader))
[['name', 'city', 'age'], ['John', 'Berlin', '45'], ['Mark', 'London', '34'], ['Liu', 'Shanghai', '45'], ['Balakrishna', 'Chennai', '33'], ['Sofia', 'Istanbul', '34']]
In the above code listing,
- we open the file person_data.csv using the with statement as csv_file.
- To parse the csv file, we pass csv_file to the reader() function of the csv module.
- The reader() function returns an iterator, which upon each iteration returns a row of the csv file.
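The reader() function also accepts a delimiter argument for files that separate values with something other than a comma. A minimal sketch, using a made-up tab-separated file scores.tsv created just for this example:

```python
import csv

# Create a small tab-separated stand-in file.
with open('scores.tsv', 'w') as f:
    f.write('name\tscore\nLuffy\t99\n')

# delimiter='\t' tells reader() to split on tabs instead of commas.
with open('scores.tsv') as f:
    rows = list(csv.reader(f, delimiter='\t'))
# rows is [['name', 'score'], ['Luffy', '99']]
```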
Let's parse and store the data in a Python dictionary using dictionary comprehension.
>>> from csv import reader
>>> with open('person_data.csv') as csv_file:
... csv_reader = reader(csv_file)
... person_dict = {
... name:{"city" : city, "age": int(age)} # Convert the age to integer
... for index, (name, city, age) in enumerate(tuple(csv_reader))
... if index != 0 # Skip first line
... }
>>> person_dict
{'John': {'city': 'Berlin', 'age': 45}, 'Mark': {'city': 'London', 'age': 34}, 'Liu': {'city': 'Shanghai', 'age': 45}, 'Balakrishna': {'city': 'Chennai', 'age': 33}, 'Sofia': {'city': 'Istanbul', 'age': 34}}
Parsing csv files using Python is pretty useful, especially when working with a large amount of data.
The following script generates text from the person_data.csv file we created earlier.
import csv

with open('person_data.csv') as csv_file:
    lines = [f"{age} year old {name} is from {city}"
             for (name, city, age) in list(__A__)[1:]] # A
    for line in lines:
        print(line) # Prints in the form '45 year old John is from Berlin', etc.
What is the value of __A__ in the above code?
- csv.reader(csv_file)
- csv_file
- reader(csv_file)
- writer(csv_file)
Apart from the reader function, the csv module provides DictReader, which helps read csv files and map the rows directly to Python dictionaries. Let's look into it in the next section.
Reading CSV using DictReader
The DictReader(fileobject, fieldnames) maps the information in each row to a dictionary whose keys are given by the optional fieldnames parameter. The fieldnames parameter is a sequence. If you omit fieldnames, Python uses the values in the first row of the file-object as the fieldnames.
Let's read person_data.csv using the DictReader from the csv module.
>>> from csv import DictReader
>>> with open('person_data.csv') as csv_file:
... csv_reader = DictReader(csv_file)
... for row in csv_reader:
... print("{} year old {} belongs to {}"
... .format(row["age"], row["name"], row["city"]))
45 year old John belongs to Berlin
34 year old Mark belongs to London
45 year old Liu belongs to Shanghai
33 year old Balakrishna belongs to Chennai
34 year old Sofia belongs to Istanbul
In the above code listing, csv_reader is the iterator returned by DictReader. Python references each item in a row using its column name. We have not provided the optional fieldnames parameter, so Python took the keys from the first row.
Let's provide a fieldnames parameter to DictReader.
>>> with open('person_data.csv') as csv_file:
...     next(csv_file) # Skip the header row
...     csv_reader = DictReader(csv_file, fieldnames=["Nom", "Ville", "Âge"])
...     print(list(csv_reader))
[{'Nom': 'John', 'Ville': 'Berlin', 'Âge': '45'}, {'Nom': 'Mark', 'Ville': 'London', 'Âge': '34'}, {'Nom': 'Liu', 'Ville': 'Shanghai', 'Âge': '45'}, {'Nom': 'Balakrishna', 'Ville': 'Chennai', 'Âge': '33'}, {'Nom': 'Sofia', 'Ville': 'Istanbul', 'Âge': '34'}]
When we pass the fieldnames parameter with our custom column names, DictReader generates dictionary objects with the specified keys.
Note that we have to skip the first row manually; otherwise, it would have been added to the dictionary object as well.
What is the value of __A__ in the code below?
>>> from csv import DictReader
>>> with open('person_data.csv') as csv_file:
...     next(csv_file)
...     csv_reader = DictReader(csv_file, fieldnames=__A__) # A
...     for row in csv_reader:
...         print(f"{row['age']} year old {row['first_name']} is from {row['location']}")
45 year old John is from Berlin
34 year old Mark is from London
45 year old Liu is from Shanghai
33 year old Balakrishna is from Chennai
34 year old Sofia is from Istanbul
- ["first_name", "location", "age"]
- ["age", "name", "location"]
- ["location", "first_name", "age"]
- ["name", "city", "age"]
We can also use the csv module to write csv files in Python. Let's take a look.
Writing a Python List to a csv file
Let's say we have the following list of data.
>>> person_details = [
['John Doe', 'Berlin', 'Germany', 45],
['Mark Waugh', 'London', 'United Kingdom', 34],
['Liu Xi', 'Shanghai', 'China', 45],
['Balakrishna Ram', 'Chennai', 'India', 33],
['Sofia Khan', 'Istanbul', 'Turkey', 34]
]
To save the list object as a csv file, we can use the writer function from the csv module.
>>> import csv
>>> with open('person_details.csv', mode='w') as csv_file:
...     csv_writer = csv.writer(csv_file)
...     csv_writer.writerow(['firstname', 'lastname', 'city', 'country', 'age'])
...     for person in person_details:
...         csv_writer.writerow([
...             person[0].split()[0], # Firstname
...             person[0].split()[1], # Lastname
...             person[1], # City
...             person[2], # Country
...             person[3] # Age
...         ])
The writer function takes the file-object as an argument and returns a writer object exposing a writerow method, which writes a single row to the csv file.
In the above code, we first write the column names in the first row. It is common practice to write the column names in the first row of a csv file. Then, for each item in person_details, we write a row in the csv file.
If you run the code successfully, you should see a file resembling the one below.
firstname,lastname,city,country,age
John,Doe,Berlin,Germany,45
Mark,Waugh,London,United Kingdom,34
Liu,Xi,Shanghai,China,45
Balakrishna,Ram,Chennai,India,33
Sofia,Khan,Istanbul,Turkey,34
If you can generate such a file, you have successfully written a Python list object to a csv file.
Let's say we need the data in the csv file in the following format.
Name, Location, Age
John Doe, Berlin-Germany, 45 years old
...
The following code achieves the same.
import csv

person_details = [
    ['John Doe', 'Berlin', 'Germany', 45],
    ... ] # Shortened for brevity

with open('person_details.csv', mode='w') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Name', 'Location', 'Age'])
    for person in person_details:
        csv_writer.writerow([
            person[0], # name
            __A__, # City - Country (A)
            person[3] # Age
        ])
What is the value of __A__?
- f"{person[1]}-{person[2]}"
- person[1]
- f"{person[2] - person[1]}"
- person[2]
The writerow method is useful if you want to write rows of sequence data into a csv file. If you wish to write sequences of dict objects into csv, the module offers DictWriter. Let's take a look.
Writing a Python Dict to a csv file
Let's say we have a list of dictionary objects, as shown below.
>>> persons_info = [{'firstname': 'John',
'lastname': 'Doe',
'city': 'Berlin',
'country': 'Germany',
'age': 45},
{'firstname': 'Mark',
'lastname': 'Waugh',
'city': 'London',
'country': 'United Kingdom',
'age': 34},
{'firstname': 'Liu',
'lastname': 'Xi',
'city': 'Shanghai',
'country': 'China',
'age': 45},
]
We can write the list of dictionaries to a csv file using DictWriter.
>>> from csv import DictWriter
>>> with open('persons_info.csv', 'w') as csv_file:
... fieldnames = ['firstname', 'lastname', 'city', 'country', 'age']
... writer = DictWriter(csv_file, fieldnames=fieldnames)
... writer.writeheader()
... for person in persons_info:
... writer.writerow(person)
DictWriter accepts a file-object and fieldnames as required parameters. DictWriter uses the fieldnames argument to determine which keys and values to take from the Python dictionaries and to write the header row. The above code generates the following csv file.
persons_info.csv
firstname,lastname,city,country,age
John,Doe,Berlin,Germany,45
Mark,Waugh,London,United Kingdom,34
Liu,Xi,Shanghai,China,45
We write the following script to generate a csv file.
from csv import DictWriter

pirates = [{"captain": "Luffy", "name": "Zorro", "group": "Strawhat"}]

with open('pirates_info.csv', 'w') as csv_file:
    fieldnames = ['name', 'captain', 'group']
    writer = DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    for pirate in pirates:
        writer.writerow(pirate)
What's the order of the items in the generated csv file?
- Luffy, Zorro, Strawhat
- Strawhat, Zorro, Luffy
- Zorro, Luffy, Strawhat
- Strawhat, Luffy, Zorro
As mentioned before, we use files in the csv format primarily to exchange data between different applications. There is another commonly used data-exchange format called JSON. Let's take a look at it next.
Reading and Writing json
JSON is a data format widely used to exchange data. JSON stands for JavaScript Object Notation and was inspired by a subset of the JavaScript programming language.
JSON has become a language-agnostic, or language-independent, data format due to its simplicity. Like files written in the csv format, files written in the JSON format are readable by both machines and humans. JSON is a format that encodes objects in a string.
Encoding data into the JSON format is called serialization, while decoding JSON data back into objects is called deserialization.
Say we have a Python dictionary object.
{"foo": [1, 2, 3], "bar": "Hello"}
You can serialize it into a string in the JSON format.
'{"foo": [1, 2, 3], "bar": "Hello"}'
This string can be stored or sent anywhere over the internet. The receiver can then recover the underlying data through deserialization. For instance, we can decode the above string in JavaScript in the following way. You can try it out in the browser console.
// Try out in your browser console.
> JSON.parse('{"foo": [1, 2, 3], "bar": "Hello"}')
foo: (3) [1, 2, 3]
bar: "Hello"
The Python list object becomes the equivalent array object in JavaScript.
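Python's counterpart to JavaScript's JSON.parse is the loads ("load from string") function of the built-in json module, shown here as a quick sketch:

```python
import json

# Deserialize the same JSON string in Python.
data = json.loads('{"foo": [1, 2, 3], "bar": "Hello"}')
# The JSON array becomes a Python list, and the JSON object a dict.
foo = data["foo"]    # [1, 2, 3]
bar = data["bar"]    # 'Hello'
```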
The JSON format is quite useful for exchanging data. How does it differ from csv?
CSV and JSON are both forms of structured data.
The primary difference is that CSV is a flat data format: you only need two values, the row number and the column number, to locate any value in the file. JSON, in contrast, is a hierarchical data format in which values can be nested underneath each other, so you may need several pieces of information to reach a particular value in the file.
Python has a built-in module, json, which can be used for encoding and decoding the JSON format. Let's look into how we can serialize data.
Serialise Data in JSON
The Python json module has the following functions to serialize data in the JSON format.
Function | Description |
---|---|
json.dumps(obj) | Serialise a Python object to a JSON formatted str |
json.dump(obj, fp) | Serialise a Python object to a file-object fp |
The json module encodes Python objects into the corresponding JSON objects. By default, Python converts objects using the conversion table shown in Table 6.
Python | JSON |
---|---|
dict | object |
list, tuple | array |
str | string |
int, float | number |
True | true |
False | false |
None | null |
You might recall that JSON is a data exchange format. Most programming languages have their own implementations of lists, dictionaries, and Booleans.
Therefore, to exchange data between programs written in two or more languages, it makes sense to convert them to a common data format (i.e., JSON format) for convenience.
We will start working with the JSON format by converting Python objects into JSON format. Let's look into how we can write JSON
to a string.
Writing json
to a string using dumps
The dumps
function lets you encode a Python object directly into the JSON formatted string.
>>> import json
>>> a = {"foo": (1, 2, 3, None, True), "bar": "Hello"}
>>> json.dumps(a)
'{"foo": [1, 2, 3, null, true], "bar": "Hello"}'
We can see that the dumps
function converts Python objects to the JSON format's respective objects in the above code.
For larger JSON
strings, it is often useful to print with indenting for better readability, since JSON is a hierarchical data format. Printing with auto-indenting is called pretty printing because, well, it looks pretty.
Let's convert to a JSON
and pretty-print the JSON
.
>>> person_details
{'name': 'John Doe', 'age': 44, 'family': ['Jane Doe', 'Another Doe', 'Yet Another Doe'], 'preferences': {'color': 'blue', 'music': 'country folk'}}
>>> print(json.dumps(person_details)) # Without indenting
{"name": "John Doe", "age": 44, "family": ["Jane Doe", "Another Doe", "Yet Another Doe"], "preferences": {"color": "blue", "music": "country folk"}}
The json.dumps()
function takes an optional argument indent
and prints the JSON array elements and object members at the particular indent.
>>> print(json.dumps(person_details, indent=4)) # Pretty print
{
"name": "John Doe",
"age": 44,
"family": [
"Jane Doe",
"Another Doe",
"Yet Another Doe"
],
"preferences": {
"color": "blue",
"music": "country folk"
}
}
Take a look at the following code listing.
>>> import json
>>> gift = __A__(name="RC Car", qty=4)
>>> json.dumps(gift)
'{"name": "RC Car", "qty": 4}'
What is the value of __A__
?
dict
tuple
list
set
We looked at how we can generate JSON-formatted strings. Let's look at how we can directly write JSON-formatted strings to a file.
Writing json
to a file using dump
The dump
function from the json
module lets us encode an object as JSON formatted stream to a file-like object which supports the write
operation.
>>> import json
>>> a = {"foo": (1, 2, 3, None, True), "bar": "Hello"}
>>> with open('sample.json', 'w') as writer:
... json.dump(a, writer)
The above results in creating a file sample.json
in the same directory with the following text.
{"foo": [1, 2, 3, null, true], "bar": "Hello"}
The dump
and dumps
functions use the same conversion table shown in table 6 to convert Python objects to the corresponding JSON format. They also take additional arguments, which we can see in the Python documentation[1] of the module.
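For instance, two of those documented optional arguments are sort_keys and separators. A quick sketch:

```python
import json

sample = {"b": 2, "a": 1}

# sort_keys orders the object's keys alphabetically; separators controls
# the characters placed between items and between keys and values.
compact = json.dumps(sample, sort_keys=True, separators=(",", ":"))
print(compact)  # '{"a":1,"b":2}'
```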
Let's say we wish to generate the following json
file.
{
"foo": [1, 2, 3, null, true],
"bar": "Hello"
}
We have written the following code listing, which generates the above file.
import json
sample = {"foo": (1, 2, 3, None, True), "bar": "Hello"}
with open('sample2.json', 'w') as writer:
json.dump(__A__)
What's the value of A
in the above code sample?
sample, writer, indent=2
writer, indent=4, sample
sample2, writer, indent=2
We have seen how to serialize Python objects to JSON data format. Let's see how we deserialize JSON objects to obtain python objects.
Deserialize Data in JSON
We can use the functions shown in table 7 to deserialize JSON formatted strings or JSON files.
Function | Description |
---|---|
json.loads(s) |
Deserialize string s to a Python object |
json.load(fp) |
Deserialize a file-like object fp to Python object |
By default, loads
and load
use the conversion table shown in table 8 to construct Python objects from a JSON formatted file or string.
JSON | Python |
---|---|
object | dict |
array | list |
string | str |
number(int) | int |
number(real) | float |
true | True |
false | False |
null | None |
When we convert a tuple
object into JSON, it is converted into an equivalent JSON array
object. However, when we convert back the same JSON array
into Python, it is converted to a list
object instead of a tuple
.
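Here is a quick sketch of that round trip, using the loads function from table 7:

```python
import json

point = (1, 2, 3)                 # a tuple
encoded = json.dumps(point)       # serialized as a JSON array
decoded = json.loads(encoded)     # deserialized back into Python

print(encoded)                    # '[1, 2, 3]'
print(type(decoded).__name__)     # 'list', not 'tuple'
```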
Now, let's look into deserialization in a bit more detail.
Deserialize objects from a json
string using loads
Let's dump a Python object to a JSON
formatted string and load it back using the loads
function.
>>> import json
>>> b = {"foo" : [(1.25, 2.4, 3/5), False]}
>>> c = json.dumps(b) # '{"foo": [[1.25, 2.4, 0.6], false]}'
>>> json.loads(c)
{'foo': [[1.25, 2.4, 0.6], False]}
The loads
function converts a JSON formatted string to a python object using the JSON-to-Python conversion table shown in table 8.
We can note that the loads
function converts the array object in JSON format to the list
object instead of the tuple
object.
What's the output of the following code?
>>> import json
>>> a = {"foo": [1/3, 2/3, 3/3]}
>>> b = json.dumps(a)
>>> c = json.loads(b)
>>> c == a
- True
- False
We have learned to deserialize objects from a JSON-formatted string. Let's look into how to deserialize objects from the json
file.
Deserialize objects from a json
file using load
We can also deserialize python objects directly from a JSON file using the load
function. To illustrate, let's create and store a JSON
file.
>>> import json
>>> person_details
{'name': 'John Doe', 'age': 44, 'family': ['Jane Doe', 'Another Doe', 'Yet Another Doe'], 'preferences': {'color': 'blue', 'music': 'country folk'}}
>>> with open('person_details.json', 'w') as writer:
... json.dump(person_details, writer)
...
The json.dump()
function serializes Python objects into the file person_details.json
. Now, let's load it back and print on the screen using the json.load()
function.
>>> import json
>>> with open('person_details.json') as reader:
... loaded = json.load(reader)
>>> loaded
{'name': 'John Doe', 'age': 44, 'family': ['Jane Doe', 'Another Doe', 'Yet Another Doe'], 'preferences': {'color': 'blue', 'music': 'country folk'}}
In the above code listing, we deserialize Python objects from a JSON formatted file and name the result loaded
so that we can access it later. As you can see, this can be useful in many scenarios.
There are additional features of the json module that you can read about in the official Python documentation.
In this section, we covered how to write and read files using Python. Next, let's look into how to work with files and directories in Python.

Working with Files and Directories
There are several operations that we perform on files and directories. However, in this topic, we will learn in detail only a few operations.
The following is a list of useful functionalities related to the files and directories that we can perform using Python.
- Getting the current working directory
- Listing contents of a directory
- Creating new directories
- Deleting Files and Directories
- Copying, Moving and Renaming Files and Directories
Simply put, a directory is a file that acts as a folder for other files and can also contain other directories.
If you run the Python interpreter in a directory, then that directory is referred to as the current working directory (cwd) of Python.
Let's look at how we can get the current working directory in Python.
Getting Current Working Directory
We can obtain the current working directory using the getcwd()
function from the os
module.
>>> import os
>>> os.getcwd()
'/home/primer' # Output on Unix-based OS
The getcwd()
function returns a string representing the current working directory. The result will be different for you, depending on your underlying operating system. The file system in Windows differs from that of a UNIX-based operating system. Operating systems such as Ubuntu, Arch, Debian, and even macOS are UNIX-based.
Windows-based OSes store files in folders on different data drives such as C:
, D:
or E:
. Unix-based OSes order files in a tree structure, starting with the root directory denoted by /
.
UNIX-based OSes use the forward slash /
to separate directories, while Windows uses the backslash \
. Linux creates the home directory at /home/<name>
, while for Windows, it is usually C:\Documents and Settings\<name>
.
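If you want code that runs on both families of OSes, the os.path submodule can build paths with the correct separator for the running system. A small sketch (the path components here are made up):

```python
import os

# os.path.join inserts the separator appropriate to the running OS,
# so the same code works on both Unix and Windows.
path = os.path.join("home", "primer", "notes.txt")
print(path)    # 'home/primer/notes.txt' on Unix, backslashes on Windows
print(os.sep)  # the separator itself: '/' on Unix, '\\' on Windows
```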
Take the following code listing. What's the output of the last statement?
>>> import os
>>> os.getcwd()
'/home/primer'
>>> len(getcwd().split('/'))
- 3
- 2
- 1
- Raises
NameError
I hope you got some idea about the different file-systems in different families of OSs. Next, let's look into how to list the content of a directory.
Listing contents of a Directory
We will be working from a directory, my_directory
, which has the following contents in it.
my_directory/
|
├── my_world/
| ├── details.json
|
├── expenses/
| ├── weekly_expenses.csv
| └── months_expenses.csv
|
└── automating_boring_work.py
Create an empty folder named my_directory
and create some files and folders, as shown in the above structure. Start an interpreter inside the folder my_directory.
>>> import os
>>> os.getcwd()
'/home/primer/my_directory' # On Unix-based
The rest of the topic will assume that you are working on Python Interpreter out of this particular directory.
To list out contents of the current directory, we can use
os.listdir(path='.')
os.scandir(path='.')
Let's look at each of the functions starting with os.listdir()
Listing directory using os.listdir()
To get the list of files and folders in the current working directory, we can use the os.listdir()
function by supplying it with a path string. The os.listdir()
returns a list containing entries in the directory given by the path
argument.
>>> import os
>>> current_directory = os.getcwd() # '/home/primer/my_directory'
>>> os.listdir(current_directory)
['automating_boring_work.py', 'my_world', 'expenses']
The os.listdir()
has default argument path="."
.
The operating system uses the .
character to refer to the current directory. At the same time, we can use the ..
characters to refer to the parent directory of the current directory.
If we call os.listdir()
without any arguments, we will get the same result as above.
>>> import os
>>> os.listdir() # Same as os.listdir('.')
['automating_boring_work.py', 'my_world', 'expenses']
What will be the output of the following code listing?
>>> import os
>>> os.getcwd()
'/home/primer/my_directory'
>>> os.listdir('..')
- Will show entries in the folder
primer
- Will show entries inside the folder
my_directory
- Will show entries inside the folder
home
- Will raise an error
As I mentioned previously, ..
is used to go back to the parent directory.
Now that we can list entries in a directory, let's look at the two main ways to reference directories outside or inside the current directory.
Suppose we want to access the my_world
directory.
- Absolute Path
The directory is referenced starting from the root folder, such as '/home/primer/my_directory/my_world'
- Relative Path
We can reference the directory starting from its position relative to the current directory: './my_world'
. We can even omit the './'
and reference it as just 'my_world'
, as the directory is a subdirectory of the current directory.
Let's see if both of these give the same result.
>>> import os
>>> os.listdir('/home/primer/my_directory/my_world') # Absolute Path
['details.json']
>>> os.listdir('./my_world') # Relative Path #1
['details.json']
>>> os.listdir('my_world') # Relative Path #2
['details.json']
We can see that both absolute and relative paths provide the same output.
What's the output of the following code?
>>> import os
>>> os.getcwd()
'/home/primer/my_directory'
>>> os.listdir(os.getcwd() + '/' + 'expenses')
['months_expenses.csv', 'weekly_expenses.csv']
['details.json']
- Raises Error
Another function that also allows us to list our directories is the scandir
from the os
module. Let's take a look.
Listing directory using os.scandir()
The os.listdir()
returns a list of items that can be slow for many operations. A new function os.scandir()
was introduced in Python 3.5, which returns an iterator instead of a list.
For instance,
>>> import os
>>> os.scandir() # Current Directory
<posix.ScandirIterator object at 0x7ff2ce925bd0> # Iterator
The os.listdir()
function simply returns the list of entries, while os.scandir()
returns attributes of the entries as well. The scandir
iterator returns the os.DirEntry
object for each file or directory entry, which provides information about the entry.
Earlier, we used the open()
function with the with
statement and mentioned that the with
statement automatically closes the files after use. This is because the function open()
supports the context manager protocol. When an iterator supports the context manager protocol, the iterator automatically frees up the handled resources when the iterator is exhausted.
The os.scandir()
function supports the context manager protocol and therefore can be used with the with
statement. We will learn more about the context manager protocol in later courses.
Let's take a look at the os.DirEntry
objects.
>>> import os
>>> for entry in os.scandir():
... print(entry)
<DirEntry 'automating_boring_work.py'> # DirEntry Object
<DirEntry 'my_world'>
<DirEntry 'expenses'>
The attributes and methods of os.DirEntry
is shown in table 9 which are useful for additional information about the entries.
Methods and Attributes | Description |
---|---|
name |
The entry's base filename, relative to the scandir path argument. |
path |
The entry's full path name |
is_dir() |
Return True if this entry is a directory |
is_file() |
Return True if this entry is a file |
Let's list the names of all the files and folders present in the current directory and their type.
>>> with os.scandir() as entries: # with statement can be used
... for entry in entries:
... print(f"Entry Name: {entry.name}") # Print entry name
... print(f"Entry Path: {entry.path}") # Print path of the entry
... print("Entry Type: {}".format("File" if entry.is_file() else "Directory")) # Print whether file
... print("{}".format("="*30)) # Divider
Entry Name: automating_boring_work.py
Entry Path: ./automating_boring_work.py
Entry Type: File
==============================
Entry Name: my_world
Entry Path: ./my_world
Entry Type: Directory
==============================
Entry Name: expenses
Entry Path: ./expenses
Entry Type: Directory
==============================
The entry.path
depends on the path
argument provided to the scandir()
function. If we pass the current directory's absolute path, we will get the absolute path of each entry as well.
>>> with os.scandir('/home/primer/my_directory') as entries:
... for entry in entries:
... print(f'Entry Absolute Path: {entry.path}')
Entry Absolute Path: /home/primer/my_directory/automating_boring_work.py
Entry Absolute Path: /home/primer/my_directory/my_world
Entry Absolute Path: /home/primer/my_directory/expenses
As you can see, using scandir
is often a better choice when working with files and directories in Python.
What's the output of the following:
>>> import os
>>> os.getcwd()
'/home/primer/my_directory'
>>> with os.scandir() as entries:
... [entry.name for entry in entries if entry.is_dir() ]
['automating_boring_work.py', 'my_world', 'expenses']
['my_world', 'expenses']
['automating_boring_work.py']
[]
Listing all files and sub-directories
Earlier, we represented the tree-structure of our folder with all of its sub-directories listed out. We can replicate the same using scandir()
. Let's create a function that lists out the files and sub-directories of a given folder.
We will name our function list_all()
, which takes two arguments, path
and indent
.
>>> def list_all(path=".", indent=0):
...     with os.scandir(path) as entries:
...         for entry in entries:
...             if entry.is_dir():
...                 # Directory
...                 print("{}+ {}".format("\t"*indent, entry.name))
...                 # Recursion
...                 list_all(path=entry.path, indent=indent + 1)
...             else:
...                 # File
...                 print("{}- {}".format("\t"*indent, entry.name))
When we call the function list_all
for the current directory, it gives the following result.
>>> list_all()
- automating_boring_work.py
+ my_world
- details.json
+ expenses
- weekly_expenses.csv
- months_expenses.csv
What do you think list_all()
is doing under the hood?
In the list_all
function, we use recursion to get files and directories of each directory. We represent the files using -
while representing the directories using the +
symbol.
We wrote a custom function, list_all()
, to get all the files and directories entries in a given folder. Python provides a function os.walk()
that we can use to check out the files and directories. Let's take a look.
Walking through Directories
The function os.walk(top)
returns an iterator that returns entries names in the path top
by walking the file tree.
Each directory in the tree rooted at directory path top
, including top
itself, yields a 3-tuple object consisting of (dirpath, dirnames, filenames)
. The os.walk(top, topdown=True)
accepts an optional argument topdown
whose value is True
by default.
If the optional argument is not specified or is True
, the 3-tuple for a directory top
is generated before those of its sub-directories. Python walks the directories in a top-down approach.
Let's check out an example.
>>> import os
>>> for dirpath, dirnames, filenames in os.walk('.'):
... # Top-down
... print(dirpath,dirnames, filenames)
. ['my_world', 'expenses'] ['automating_boring_work.py']
./my_world [] ['details.json'] # Entries in the './my_world'
./expenses [] ['weekly_expenses.csv', 'months_expenses.csv']
We can see that the above code returns the list of files filenames
and directories dirnames
in the directory path dirpath
. It starts with the path
provided, which is the current directory in the above code (.
). It then goes into each sub-folder and gets its list of files and directories.
If the topdown
argument is False
, the 3-tuple object for the directory top
is generated after the 3-tuple object for all of its subdirectories have been created. This is the bottom-up approach. Let's take a look.
>>> import os
>>> for dirpath, dirnames, filenames in os.walk('.', topdown=False): # Bottom-up
... print(dirpath,dirnames, filenames)
./my_world [] ['details.json']
./expenses [] ['weekly_expenses.csv', 'months_expenses.csv']
. ['my_world', 'expenses'] ['automating_boring_work.py']
# Root directory is listed at last
Let's look into how we can create directories using Python.
Making Directories
At some point, you would want to create directories using Python. The os
module provides two functions that can help you create directories, as shown in table 10.
Function | Description |
---|---|
os.mkdir() |
Create a single directory |
os.makedirs() |
Create multiple directories |
Let's look into how it works.
Creating a directory using mkdir
The os.mkdir(path)
accepts a path argument and creates a directory in the path. If the directory already exists, Python raises the FileExistsError
exception.
>>> import os
>>> os.mkdir('my_new_dir') # Directory Created
>>> os.mkdir('my_new_dir') # Raises Exception
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
FileExistsError: [Errno 17] File exists: 'my_new_dir'
You can use loops to create multiple directories.
Let's create ten new directories within our newly created my_new_dir
directory.
>>> for num in range(10):
... os.mkdir('my_new_dir/dir_{}'.format(num))
If you check in the my_new_dir
directory, you will find ten directories with names ranging from dir_0
to dir_9
.
The above code requires that the directory my_new_dir
exists before it can create other directories.
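One way to guard against the FileExistsError is to test with os.path.isdir() (covered later in this topic) or catch the exception; a sketch, using the made-up directory name my_new_dir:

```python
import os

target = "my_new_dir"  # hypothetical directory name

# Check first: only create the directory if it isn't already there.
if not os.path.isdir(target):
    os.mkdir(target)

# Or catch the exception on a second attempt.
try:
    os.mkdir(target)
except FileExistsError:
    print(f"{target} already exists")
```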
What is the output of the following?
>>> import os
>>> os.listdir()
['automating_boring_work.py', 'my_world', 'expenses']
>>> os.mkdir('2020/April/10/1200-1300')
- Creates successfully
- Raises
FileNotFoundError
- Raises
FileExistsError
- Raises
SyntaxError
Often, you will want to create a whole tree of intermediate directories with a single command.
We can do so using os.makedirs
. Let's look into that function next.
Creating directories using makedirs
Let's say you want to create directories with the following structure.
2020/
|
└── April/
|
└── 10/
|
└── 1200-1300/
We can do this using the os.makedirs(name, exist_ok=False)
.
>>> import os
>>> os.makedirs('2020/April/10/1200-1300')
>>> os.makedirs('2020/April/10/1200-1300') # Will raise FileExistsError
You can check in the current directory that Python has created the directory tree shown above. If you call the makedirs
function again with the same set of arguments, it will raise the FileExistsError
exception.
If you want to override the exception, the function makedirs
accepts exist_ok
, which is by default False
.
>>> import os
>>> os.makedirs('2020/April/10/1200-1300', exist_ok=True) # Won't raise Exception
The makedirs
is useful when you want to create intermediate folders such as we created above.
What's the difference between the makedirs
and mkdir
functions of the os
module?
We have now learned how to create directories using Python. For the next trick, let's learn how to delete files and directories.
Deleting Files and Directories
To delete files and directories, Python provides several built-in functions. Let's start with deleting a single file using Python.
Deleting Single Files
To delete single files, Python provides two identical functions in the os
module, as shown in table 11.
Function | Description |
---|---|
os.remove(path) |
Removes the file path |
os.unlink(path) |
Remove the file path |
Both remove(path)
and unlink(path)
are semantically identical, meaning they have the same functionality. The unlink
function is named after the Unix command of the same name used to remove files.
Let's create two empty files and then remove them using Python.
>>> with open('blank_1.txt', 'w') as file1, open('blank_2.txt', 'w') as file2:
... pass
The above code creates two empty files, blank_1.txt
and blank_2.txt
in the current directory. Now, let's remove these files.
>>> import os
>>> os.remove('blank_1.txt') # Removes file
>>> os.unlink('blank_2.txt') # Removes file
If you successfully execute the code above, Python would have successfully removed the files.
Suppose you try to pass a file-path that doesn't exist or is a directory. In that case, the above two functions will raise FileNotFoundError
or IsADirectoryError
, respectively. Therefore, while removing files, we need to ensure that:
- the path exists and,
- the path is not a directory
We can check both of these conditions using the path
sub-module of the os
module. The path
module has some interesting functions, some of which we can see in the table 12.
Function | Description |
---|---|
os.path.exists(path) |
Return True if path refers to an existing path |
os.path.isfile(path) |
Return True if path refers to an existing file; False if it doesn't exist or is not a file |
os.path.isdir(path) |
Return True if path refers to a directory |
os.path.abspath(path) |
Returns the absolute version of path |
os.path.relpath(path, start=os.curdir) |
Returns the relative file path either from current directory or optional start directory |
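The last two functions in the table convert between absolute and relative paths. A quick sketch, assuming a hypothetical file name notes.txt (the file doesn't need to exist for these calls):

```python
import os

cwd = os.getcwd()

# abspath resolves a relative path against the current working directory...
abs_path = os.path.abspath("notes.txt")
print(abs_path)                   # e.g. '/home/primer/notes.txt'

# ...and relpath goes the other way, relative to start (cwd by default).
print(os.path.relpath(abs_path))  # 'notes.txt'
```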
The functions exists()
and isfile()
in the os.path
module are useful when deleting a given file. Usually, we should check the path using the function isfile
as it checks both if a path exists and is a file.
import os
file_path = "some_file"
if os.path.isfile(file_path):
    os.remove(file_path)
The FileNotFoundError
, IsADirectoryError
can be caught using OSError
exception. We can use the try-except
block to handle any errors resulting from remove operation safely.
import os
file_path = 'some_file'
try:
    os.remove(file_path)
except OSError as e:
    print(f'Error while removing {file_path} : {e.strerror}')
What is the output of the code below?
>>> import os
>>> os.listdir() # Same directory as earlier
[ 'automating_boring_work.py', 'my_world', 'expenses',]
>>> for file in os.listdir():
...     if os.path.isfile(file):
...         os.remove(os.listdir()[0])
>>> len(os.listdir())
- 2
- 3
- 4
- 0
There is a difference between deleting empty directories and directories having files in them. Let's check how to remove empty directories in the next section.
Deleting Empty Directories
To delete empty directories, we can use the function os.rmdir(path)
. When the path
argument to the rmdir()
function doesn't exist or is not empty, FileNotFoundError
or OSError
exception is raised, respectively.
We can use the following code to safely remove a directory by handling errors in an except
block using OSError
exceptions.
import os
dir_path = "some_directory"
try:
    os.rmdir(dir_path)
except OSError as e:
    print(f'Error while removing {dir_path} : {e.strerror}')
When you try to remove a non-empty directory some_directory
, the above program prints the following message:
Error while removing some_directory : Directory not empty
Below is a code for removing all the empty directories inside the current working directory.
# DON'T EXECUTE THIS CODE
# YOU MIGHT END UP DELETING IMPORTANT FILES
# AND THEN MAIL ME SAYING I CAUSED IT :(
import os
for dirpath, dirnames, filenames in os.walk('.'):
    for dirname in dirnames:
        try:
            os.rmdir(__A__) # What's A?
            print(f'Removed empty dir: {__A__}')
        except OSError as e:
            print(f'Error while removing {__A__} : {e.strerror}')
What is the value of A?
dirname
dirpath
filenames
dirnames
We checked out how to delete empty directories. Now, let's look into how to delete non-empty directories in the next section.
Deleting Non-Empty Directories
Python has a built-in shutil
module, which contains several functions relating to file collections. To delete the non-empty directories, you can use the rmtree()
function in the built-in module shutil
.
Let's create some empty files in a folder.
>>> import os
>>> os.mkdir('test_dir')
>>> for tempfile in ["test_dir/file_{}.txt".format(num) for num in range(10)]:
... with open(tempfile, 'w') as writer:
... pass
The above code will create a directory test_dir
and populate it with some empty files. We can delete the folder test_dir
in the following way:
>>> import shutil
>>> shutil.rmtree('test_dir') # Folder is deleted
Python deletes the folder test_dir
and its contents when the shutil.rmtree()
function is invoked. We use the shutil.rmtree
to delete directories, empty or otherwise.
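As with os.rmdir(), it pays to guard the call. One sketch, using the documented ignore_errors argument and a made-up directory name:

```python
import shutil

# ignore_errors=True makes rmtree skip problems silently,
# such as the directory not existing in the first place.
shutil.rmtree("no_such_dir", ignore_errors=True)  # no exception raised

# Without it, a missing directory raises FileNotFoundError.
try:
    shutil.rmtree("no_such_dir")
    removed = True
except FileNotFoundError as e:
    removed = False
    print(f"Could not remove: {e.strerror}")
```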
The shutil
module also offers other functions that we can use to copy, move, and rename files and directories. We will look into that in the next section.
Copying, Moving and Renaming Files and Directories
Often, we need to copy, move, and rename sets of files and folders. The shutil
and os modules provide functions for doing that, and I have listed them in the table 13:
Function | Description |
---|---|
shutil.copy(src, dst) |
Copies the file src to the file or directory dst . |
shutil.copy2(src, dst) |
Identical to copy() except also attempts to preserve file metadata |
shutil.copytree(src, dst) |
Recursively copy an entire directory tree rooted at src to a directory named dst and return the destination directory |
shutil.move(src, dst) |
Recursively move a file or directory src to another location dst and return the destination directory |
os.rename(src, dst) |
Rename the file or directory src to dst |
Let's look into each of the functions starting with copying files.
Copying Files
The shutil
module provides two functions to copy files and directories: copy
and copy2
. The copy2()
function is identical to copy()
except it also attempts to preserve file metadata such as the creation date or last modified date.
Let's create two folders dir_1
and dir_2
and create an empty file test.txt
in dir_1
directory.
>>> import os
>>> os.mkdir('dir_1'); os.mkdir('dir_2')
>>> with open('dir_1/test.txt', 'w') as writer: # Create a file
... writer.write('Hello World')
We can copy the file test.txt
in the dir_1
using either shutil.copy()
or shutil.copy2()
.
>>> import shutil
>>> shutil.copy2('dir_1/test.txt', 'dir_2/test_copy.txt') # same as `shutil.copy`
'dir_2/test_copy.txt' # Returns path to newly created file
After running the above code, you can check the directory dir_2
to find a newly created file, test_copy.txt
. In the shutil.copy(src, dst)
function, the argument src
should be a file path while dst
can be file path or directory.
If you put the dst
argument as a directory, the name of the newly created file will be taken from the base filename of the src
file path.
>>> import shutil
>>> shutil.copy2('dir_1/test.txt', '.') # dst is current directory
'./test.txt' # file is copied to current directory.
In the above code, we provided the argument dst
to be the current directory (.
); therefore, we copied the file in the current directory. Let's do an exercise.
Take a look at the code below.
>>> import shutil, os
>>> os.makedirs('dir1/dir2/dir3')
>>> with open('dir1/text1.py', 'w') as writer:
...     writer.write('print("Hello World")')
>>> shutil.copy2(__A__, __B__) # What's A and B?
'dir1/dir2/dir3/text1.py'
- A:
'dir1/text1.py'
, B:'dir1/dir2/dir3'
- A:
'dir1/text1.py'
, B:'dir1/dir2/dir3/text1_copy.py'
- A:
'dir1/dir2/dir3/text1.py'
, B:'dir1/text1.py'
- A:
'dir1/dir2/dir3'
, B:'dir1/text1.py'
We can also copy entire directories using Python. Python's shutil
provides a copytree()
function to do that. Let's take a look.
Copying Directories
The shutil.copytree(src, dst)
function copies an entire directory along with its content rooted at path src
to a directory specified by path dst
. Earlier in the previous exercise, we created the directory dir_1
with a file test.txt
. Let's copy the entire directory to a new directory, dir_3
.
>>> import shutil
>>> shutil.copytree('dir_1', 'dir_3')
'dir_3' # Directory copied
The shutil.copytree(src, dst)
function creates a new directory at dst
if the directory doesn't exist. If a directory exists at path dst
, Python will raise FileExistsError
.
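Since Python 3.8, copytree() also accepts a dirs_exist_ok argument that, when True, copies into an existing destination instead of raising FileExistsError. A small sketch; the directory names src_dir and dst_dir are made up for illustration:

```python
import os
import shutil

# Build a hypothetical source tree with a single file in it.
os.makedirs("src_dir", exist_ok=True)
with open(os.path.join("src_dir", "test.txt"), "w") as writer:
    writer.write("Hello World")

shutil.rmtree("dst_dir", ignore_errors=True)  # start from a clean slate

shutil.copytree("src_dir", "dst_dir")                      # first copy works
shutil.copytree("src_dir", "dst_dir", dirs_exist_ok=True)  # no FileExistsError
print(os.listdir("dst_dir"))  # ['test.txt']
```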
Moving Files and Directories
The shutil.move(src, dst)
moves a file or directory at src
to another file path specified by dst
and returns the path to the newly moved file. If the destination dst
is an existing directory, then src
is moved inside that directory.
Let's move the entire directory dir_1
inside a dirs
folder.
>>> import shutil
>>> shutil.move('dir_1', 'dirs/dir_1') # works even if `dirs` doesn't exist yet
'dirs/dir_1'
You can check using the files and directories that the directory dir_1
no longer exists and has been successfully moved to the dirs
directory.
Suppose the destination is on the same filesystem as the source file or directory. In that case, Python uses os.rename()
to move files and folders in a single step. Otherwise, Python copies files and directories from src
to dst
and then removes the originals.
Let's look into renaming files and directories in the next section.
Rename Files and Directories
The function os.rename(src, dst)
can be used to rename files and folders. Let's create a directory with some temporary files.
>>> import os
>>> os.mkdir('logs')
>>> for tempfile in [f'logs/LoG_{num}.txt' for num in range(10)]:
... with open(tempfile, 'w') as writer:
... pass
>>> os.listdir('logs')
['LoG_3.txt', 'LoG_1.txt', 'LoG_8.txt', 'LoG_4.txt', 'LoG_0.txt', 'LoG_5.txt', 'LoG_9.txt', 'LoG_6.txt', 'LoG_7.txt', 'LoG_2.txt']
This spelling LoG_x
almost hurts the eyes. We need to rename all the LoG_x
format files to log_x
. Let's rename each file using os.rename(src, path)
>>> with os.scandir('logs') as entries:
...     for entry in entries:
...         os.rename(entry.path, entry.path.replace('LoG', 'log'))
>>> os.listdir('logs')
['log_2.txt', 'log_9.txt', 'log_5.txt', 'log_0.txt', 'log_1.txt', 'log_8.txt', 'log_7.txt', 'log_4.txt', 'log_3.txt', 'log_6.txt']
That looks much better.
We have finally completed basic operations related to files and directories using Python. In the next section, we will look into working with modules and creating our modules in Python.

Modules
So far, we have been importing standard built-in modules. We can write a custom module too.
If you write definitions in a file, you can use them in the interactive interpreter or another script by importing them. Such a file is called a module. The name of the file is the module name with the suffix .py
.
Let's create a module and add some definitions to it.
We will create a cases
module that will define functions to convert strings to the different case styles shown in Table 14.
Case Name | Example | Description
---|---|---
Snake Case | snake_case | Punctuation is removed, spaces are replaced by a single underscore _, and words are lowercased.
Camel Case | CamelCase | Spaces and punctuation are removed, and the first letter of each word is capitalized.
Kebab Case | kebab-case | Punctuation is removed, spaces are replaced by a single hyphen -, and words are lowercased.
Let's write functions to convert a given string to different cases and save it in a cases
file. This file is a module that we can import using the import
keyword followed by the filename cases
.
def snake_case(str, sep=" "):
    """
    Converts a given string to snake_case

    Parameters:
    str -- Required string to convert to snake case
    sep -- Optional delimiter for the passed string; defaults to " "
    """
    return "_".join([x.lower() for x in str.strip().split(sep)])

def kebab_case(str, sep=" "):
    """
    Converts a given string to kebab-case

    Parameters:
    str -- Required string to convert to kebab case
    sep -- Optional delimiter for the passed string; defaults to " "
    """
    return "-".join([x.lower() for x in str.strip().split(sep)])

def camel_case(str, sep=" "):
    """
    Converts a given string to camelCase

    Parameters:
    str -- Required string to convert to camel case
    sep -- Optional delimiter for the passed string; defaults to " "
    """
    return "".join([x.lower() if not index else x.capitalize()
                    for index, x in enumerate(str.strip().split(sep))])
Can you explain how the camel_case
function works in your own words?
In camel_case()
, we use a list comprehension to generate the list of words provided to the str.join()
method; the first word stays lowercase, while every subsequent word is capitalized.
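To see how the enumerate-based comprehension treats the first word differently, here is the same expression evaluated step by step on a shorter string:

```python
words = "The quick brown".strip().split(" ")
# index 0 is lowercased; every later word is capitalized
parts = [w.lower() if not i else w.capitalize() for i, w in enumerate(words)]
print(parts)           # ['the', 'Quick', 'Brown']
print("".join(parts))  # theQuickBrown
```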
To import our newly created module cases
, we will have to start a Python interpreter in the same directory. Then we can directly import our module cases
.
The code below uses the function snake_case()
from our newly created module cases
.
>>> import cases
>>> string = "The quick brown fox jumps over the lazy dog"
>>> cases.snake_case(string)
'the_quick_brown_fox_jumps_over_the_lazy_dog' # Snake Case
We can also directly import functions from the module cases
using the from
keyword.
>>> from cases import kebab_case, camel_case
>>> string = "The quick brown fox jumps over the lazy dog"
>>> kebab_case(string)
'the-quick-brown-fox-jumps-over-the-lazy-dog'
>>> camel_case(kebab_case(string), sep="-")
'theQuickBrownFoxJumpsOverTheLazyDog'
As you can see, we can import definitions from the module we defined earlier and use them.
As you might have guessed, we can simply create a module_name.py
and start the interpreter in the same directory to access the module. There are certain other directories as well, where you can store your modules, and you will be able to import them in Python.
In the next section, let's understand how Python imports modules.
Module Search
We imported the module, cases
using the import
statement.
>>> import cases
When the interpreter executes the import foo
statement,
- it first searches for a built-in module with the name foo.
- If no such built-in module exists, the interpreter searches for a file named foo.py in the list of directories given by sys.path.
Let's have a look at the list of the directories provided by sys.path
.
>>> import sys
>>> sys.path # The output will be different for everyone
['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages']
The list of directories in sys.path
is initialized from:
- the directory containing the input script, or the current directory when using the interactive interpreter
- the list of directories given by the PYTHONPATH environment variable
- the installation-dependent default directories configured at the time of installation
To ensure that Python can import your module, you should put it in one of the directories listed above.
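If your module lives elsewhere, you can also extend sys.path at runtime. A minimal sketch, where the directory path is purely hypothetical:

```python
import sys

# Hypothetical directory containing your modules
sys.path.append('/home/primer/my_modules')

# Python will now also search this directory on import
print('/home/primer/my_modules' in sys.path)   # True
```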
Let's say you created a script called math.py
with the following content.
pi = "I am pi, the irrational number"
You started an interpreter in the same directory as that of math.py
and wrote the following code.
>>> from math import pi
>>> pi
What's the output of the code above?
- 3.141592653589793
- 'I am pi, the irrational number'
- Raises NameError
- Raises ValueError
Python first looks for the module among the built-in modules and finds the standard math
module. Therefore, it imports the standard module instead of our custom math.py
module.
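One way to check which module Python will actually import is importlib.util.find_spec. The origin value varies by platform and build, so treat the comment as indicative, not exact:

```python
import importlib.util

spec = importlib.util.find_spec('math')
print(spec.name)    # math
print(spec.origin)  # 'built-in' or a path to a compiled extension
```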
Next, let's look more at importing definitions.
Importing Definitions
The cases
module had three function definitions. A module can contain other definitions of objects and expressions, and its content is made available with the import
statement. We can import and use its definitions in several ways.
Importing Modules
We can import the module name foo
and then access the definition bar
by dot notation.
>>> import cases
>>> cases.snake_case("Hello World") # Accessing a definition using dot-notation
'hello_world'
>>> cases
<module 'cases' from '/home/primer/Python-I/cases.py'>
We can import several modules in a single import
statement by separating them with commas.
>>> import foo, bar, foobar
Importing Modules with a different name
We can import modules with a different name using the as
keyword in the import
statement.
>>> import cases as c
>>> c
<module 'cases' from '/home/primer/Python-I/cases.py'>
>>> c.snake_case("Hello World")
'hello_world'
In the above instance, we import the cases
module under the name c
. In this case, Python doesn't add the name cases
to the current namespace. The imported module has the attributes __name__
and __file__
, which give the module's name and path, respectively.
>>> import cases as c
>>> c.__name__
'cases'
>>> c.__file__
'/home/primer/Python-I/cases.py'
By importing modules under a different name, we can avoid name collisions with other functions of the same name. It is also convenient, as in from decimal import Decimal as D
, which we saw in Chapter 2. Let's look at how to import definitions from modules directly.
Importing definitions from modules directly
We can also import the module's definitions directly using the from
keyword in the import
statement.
>>> from cases import snake_case, camel_case, kebab_case
In this case, the imported function definitions are directly available as callables in the current namespace.
>>> kebab_case("Hello World")
'hello-world'
Importing every definition from modules
We can use the wild card operator *
to import every definition. We can rewrite the above import statement as follows.
>>> from cases import * # Import everything
The wild card operator *
imports all names present in the module apart from those beginning with an underscore (_
).
It is generally discouraged to use the wildcard operator *
in your program. Can you think of a particular reason why?
When you import every definition from a module, you overwrite any pre-existing names that clash with the imported ones. Importing every definition also limits the names you can assign to new objects.
We don't recommend importing *
from a module, as it leads to poor code readability, although it is acceptable in the interactive interpreter.
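A quick demonstration of the overwriting problem: a name we defined ourselves is silently replaced by the wildcard import.

```python
pi = "my own pi"

# The wildcard import silently replaces our `pi` with math's constant
from math import *

print(pi)   # 3.141592653589793
```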
List of definitions in a module using the dir function
Earlier, we used the dir()
function to inspect the namespace. The dir()
function also returns all the properties and methods of an object, including the built-in defaults.
When used on a module object, it returns the module's definitions as well as its other attributes. Let's take a look.
>>> import cases
>>> dir(cases)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'camel_case', 'kebab_case', 'snake_case']
The names starting with double underscores __
are special (dunder) attributes of the module. We can also see the three functions that can be accessed through the module: camel_case
, kebab_case
, and snake_case
.
Reloading the module
While keeping the interactive interpreter running, let's add another function, say meow_case()
to our cases.py
and save it. cases.py
...
def meow_case(str):
    return 'Meow'
Now, let's re-import the cases
module.
... # Continued from previous session
>>> import cases # Re-import cases
>>> cases.meow_case('Hello World')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'cases' has no attribute 'meow_case'
For reasons of efficiency, a module is only loaded once per interpreter session. If you make changes to the cases.py
file and want to check them in the interactive interpreter, you will need to reload the module or restart the interpreter. To reload the module, use the reload()
function from the importlib
module.
>>> import cases, importlib
>>> importlib.reload(cases) # reload the cases module
<module 'cases' from '/home/primer/Python-I/cases.py'>
>>> cases.meow_case('Hello World')
'Meow'
The reload()
function only accepts a module object. Therefore, reloading a function that was imported directly will not work, even if its definition has changed.
>>> from cases import camel_case
>>> importlib.reload(camel_case) # Reloading an imported function will not work
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/importlib/__init__.py", line 140, in reload
raise TypeError("reload() argument must be a module")
TypeError: reload() argument must be a module
Executing Modules as Scripts
Earlier, we created the cases
module. What do you think happens when you execute the cases
module directly as a script?
When we execute the cases
module as a script, nothing happens, as the cases.py
module contains only function definitions.
> python3 cases.py # Nothing happens
There is a difference between executing the module as a script and importing it from another file.
To understand the difference, let us create a script greeter.py
.
greeter.py
def greet(name="World"):
    print(f"Hello {name}")
Let's execute the greeter.py
file as a script.
> python3 greeter.py # Nothing happens
Let's import the greeter
in the interactive interpreter.
# Changing the name of the imported module
>>> import greeter as G
>>> G.__name__
'greeter' # Original module name
When we execute a Python script directly, Python runs it under the name __main__
, irrespective of its filename. The module's name is always accessible inside the module through the __name__
attribute.
Therefore, we can use the __name__ == "__main__"
condition in an if
statement to check whether the module is being run in script mode.
Earlier, we saw that we could access the command-line arguments using the sys.argv
list from the sys
module. Let's rewrite the greeter
module to add some more functionality when it is executed in script mode.
import sys

def greet(name="World"):
    print(f"Hello {name}")

if __name__ == "__main__":
    try:
        greet(sys.argv[1])
    except IndexError:
        # Exception handling when no command-line arguments are given
        greet()
Statements inside the if
block will only execute if you execute the module in the script mode. Let's execute the greeter
as a script.
> python3 greeter.py
Hello World
Now our greeter
module accepts command-line arguments. Let's try some out.
> python3 greeter.py
Hello World
> python3 greeter.py there
Hello there
> python3 greeter.py "there. What is up?"
Hello there. What is up?
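For richer command-line handling, the standard argparse module is an alternative worth sketching. This is not part of the original greeter; here we simulate the arguments instead of reading sys.argv:

```python
import argparse

def greet(name="World"):
    print(f"Hello {name}")

parser = argparse.ArgumentParser(description="Greet someone")
# nargs="?" makes the positional argument optional, like our try/except did
parser.add_argument("name", nargs="?", default="World")

args = parser.parse_args(["there"])   # simulates: python3 greeter.py there
greet(args.name)                      # prints: Hello there
```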
Take a look at the script below.
import sys

def read_file(filename):
    with open(filename) as reader:
        return " ".join(reader.readlines())

if __name__ == "__main__":
    try:
        print(read_file(sys.argv[1]))
    except (IndexError, FileNotFoundError):
        print("Please enter a valid filename")
What does the above Python script do?
- Reads the file name and outputs the number of characters in it
- Reads the file given as argument and prints out the content
- Reads the file and outputs the number of words in it
- Reads the file and outputs the number of characters in it.
We can also execute the standard modules from the command line. To execute a standard module, we use the -m
flag followed by the module name.
Earlier, we used the tokenize
module to get the tokens generated for a file. Let's tokenize the greeter.py
file contents using the following command.
python3 -m tokenize greeter.py
Executing the above command might result in the following output.
0,0-0,0: ENCODING 'utf-8'
1,0-1,3: NAME 'def'
1,4-1,9: NAME 'greet'
1,9-1,10: OP '('
1,10-1,14: NAME 'name'
1,14-1,15: OP '='
1,15-1,22: STRING '"World"'
... # Shortened for brevity
You can look for other standard modules that can be executed as scripts.
One such application is formatting json
files. Earlier, we indented a json
file to make it more readable. We can create a custom script that helps us generate well-indented json
files.
This also brings an end to our lesson on modules. Now that we have covered the module, let's look into how we can handle a collection of modules or packages in the next section.

Packages
In Python, packages are a way of structuring a collection of modules using dotted module names.
To understand Python packages, let's create a sample package pkg
by creating an empty folder named pkg
. Create two empty files, module1.py
and module2.py
, in the newly created pkg
folder. Now, let's write the files with the following content.
module1.py
def greet():
    print("Hi from module1")
module2.py
def greet():
    print("Hi from module2")
Now, your directory structure should look something like this.
pkg
├── module1.py
└── module2.py
Open an interactive interpreter in the same directory as where folder pkg
is placed.
>>> import pkg.module1, pkg.module2
>>> pkg.module1.greet()
Hi from module1
>>> pkg.module2.greet()
Hi from module2
In the code listing, we can see that both module1
and module2
define the greet()
function; however, using the dot-notation to access each function helps us avoid name clashes.
We can also import the modules directly while giving them a different name to avoid name-clash.
>>> from pkg.module1 import greet as greet1
>>> from pkg.module2 import greet as greet2
We can also import the modules directly from the package.
>>> from pkg import module1, module2
Continuing from the previous package pkg we created, which of the following is the correct way to import the greet()
function from module1?
from pkg import module1.greet
from pkg.module1 import greet
from pkg from module1 import greet
from pkg import module1 as module.greet
If a file named __init__.py
is present in a package directory, Python invokes it when the package or a module in the package is imported.
You can use this for the execution of package initialization code. Let's understand what initialization means next.
Package initialization
An __init__.py
file was required to make Python treat directories containing modules as packages earlier in Python 2.7. The __init__.py
was required even though it was empty. However, from Python 3.3 onward, the __init__.py
file is not required.
But it's quite useful in certain scenarios. Let's take a look.
If an _ _init__.py
file is present in the package directory, Python invokes it when we import the package or module in the package.
In the simplest case, __init__.py
can just be an empty file. Let's initialize the pkg
package we created earlier by adding an __init__.py
file with the following content to the pkg
directory.
print(f"Initializing Package {__name__} ")
guest_names = ["Luffy", "Zorro", "Sanji"]
__init__.py
Now our directory tree would look as below.
pkg
├── __init__.py
├── module1.py
└── module2.py
Let's import the pkg
package again by opening an interactive interpreter in the same location as the pkg
directory.
>>> import pkg
Initializing Package pkg # __init__.py invoked
>>> pkg.guest_names
["Luffy", "Zorro", "Sanji"]
When we import the package pkg
, the statements inside its __init__.py
are automatically executed. We can also access the guest_names
list object defined in the __init__.py
using dot-notation. The pkg
package is a namespace.
Names defined in the __init__.py
act as global names for the package, which can be accessed by any of the modules inside the package directory.
To illustrate, let's import the guest_names
in a new module3.py
inside the pkg
directory.
from pkg import guest_names

def greet_guests(guests=guest_names):
    for guest in guests:
        print(f"Hello there, {guest}")
Now, let's test it out in our interactive interpreter.
# Imported from package-level namespace
>>> from pkg import module3
>>> module3.greet_guests()
Hello there, Luffy
Hello there, Zorro
Hello there, Sanji
If we import the package pkg in a freshly opened interpreter and pass it to the dir()
function, as shown below, which of the following will be present?
>>> import pkg
>>> dir(pkg)
module1
module2
module3
guest_names
Python doesn't automatically import all the modules in the package when the package is imported.
If we simply import the package pkg
, we will not be able to access the modules contained in it.
>>> import pkg
Initializing Package pkg
>>> dir(pkg)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'guest_names']
>>> pkg.module1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'pkg' has no attribute 'module1'
Only the names defined in the __init__.py
are imported automatically. We can have the modules imported automatically as well by importing them in the __init__.py
file.
To automatically import modules from pkg
, we will change the __init__.py
to look like below.
import pkg.module1, pkg.module2, pkg.module3
print(f"Initializing Package {__name__} ")
guest_names = ["Luffy", "Zorro", "Sanji"]
__init__.py
Now, let's save the __init__.py
file, restart our interpreter, and import the package again.
>>> import pkg
Initializing Package pkg
>>> dir(pkg)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'guest_names', 'module1', 'module2', 'module3', 'pkg']
As you can see, the modules are automatically imported now.
Now, we can run functions inside our package pkg
.
>>> pkg.module1.greet()
Hi from module1
The __init__.py
lets us automatically import modules into the package namespace but not into the current namespace.
Let's restart our interpreter and import the package pkg
again.
>>> import pkg
Initializing Package pkg
>>> dir() # Current Namespace
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'pkg']
We can check the current namespace by calling the dir()
function. As you can see, only the package pkg
is present in the namespace. To import all the definitions from the package, we can use the *
wildcard operator.
>>> from pkg import *
>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'guest_names', 'module1', 'module2', 'module3', 'pkg']
All the modules we have imported in the __init__.py
are imported into our current namespace. We can invoke them directly, as shown below.
>>> module1.greet()
Hi from module1
Suppose you delete the __init__.py
file entirely, restart the interpreter, and run from pkg import *
. Do you think all the modules are still going to be imported into the local namespace? Why or why not?
Python, by default, doesn't implicitly import any underlying module in a package.
Therefore, none of the modules in the package pkg
are going to be imported.
We can decide which modules will be imported while using the *
operator by specifying them in __all__
in the __init__.py
. Let's take a look.
Let's again create an __init__.py
file with the following code.
guest_names = ["Luffy", "Sanji", "Zorro"]
__all__ = ['module1', 'module2']
__init__.py
Save the file, restart the interpreter, and let's import everything from the package once again.
>>> from pkg import *
>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'module1', 'module2']
Since we specified only module1
and module2
in the __all__
, module3
and guest_names
were not imported into the local namespace.
We can also use __all__
in the module to specify what objects can be imported while using the wildcard operator.
To understand, let's modify the module1.py
to look like the below.
def greet():
    print("Hi from module1")

def _secret_number():
    return 42

def not_so_secret_number():
    return 43
Save the file and restart the interpreter. Let's now use the *
wildcard operator to import every object from the module1
module.
>>> from pkg.module1 import *
>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'greet', 'not_so_secret_number']
As you can see, both the greet
and not_so_secret_number
are imported, but not _secret_number
. If we don't specify __all__
in the module, Python imports everything except names starting with an underscore (_)
.
Let's change our module1
module again to add __all__
; your module should look like below.
__all__ = ["greet"]

def greet():
    print("Hi from module1")

def _secret_number():
    return 42

def not_so_secret_number():
    return 43
Let's save the file and restart the interpreter.
>>> from pkg.module1 import *
>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'greet']
As you can see, only the object specified in __all__,
i.e., greet()
, is imported.
So what have we learned about __all__
?
We can say that __all__
is used by both packages and modules to specify what to import when import *
is invoked.
For a package, when __all__
is not defined, import *
does not import anything.
While for a module, when __all__
is not defined, import *
imports everything except names starting with an underscore.
We can also nest packages inside another package to an arbitrary depth.
A nested package is called a sub-package.
Sub-package
Let's create a new directory, main_pkg
, with the following structure.
main_pkg
├── sub_pkg1
│ ├── mod1.py
│ └── mod2.py
└── sub_pkg2
├── mod3.py
└── mod4.py
You can add the following to each module (mod1-4
) in the structure above.
def greet():
    print(f"Hello from {__name__}!")
Restart the interpreter and try the following code listing. Importing from sub-packages works similarly, using dot notation.
>>> import main_pkg.sub_pkg1.mod1 # Notation 1
>>> main_pkg.sub_pkg1.mod1.greet()
Hello from main_pkg.sub_pkg1.mod1!
>>> from main_pkg.sub_pkg1 import mod2 # Notation 2
>>> mod2.greet()
Hello from main_pkg.sub_pkg1.mod2!
>>> from main_pkg.sub_pkg2.mod3 import greet # Notation 3
>>> greet()
Hello from main_pkg.sub_pkg2.mod3!
>>> from main_pkg.sub_pkg2.mod4 import greet as mod4_greet # Notation 4
>>> mod4_greet()
Hello from main_pkg.sub_pkg2.mod4!
You can also add an __init__.py
to each sub-package and the top-level main_pkg
package to initialize them.
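If you want to scaffold the main_pkg tree from Python itself, a sketch (writing into a temporary directory rather than your working directory) could look like this:

```python
import os
import tempfile

base = tempfile.mkdtemp()
for sub, mods in [('sub_pkg1', ['mod1', 'mod2']),
                  ('sub_pkg2', ['mod3', 'mod4'])]:
    d = os.path.join(base, 'main_pkg', sub)
    os.makedirs(d)
    # an (optional) __init__.py for each sub-package
    open(os.path.join(d, '__init__.py'), 'w').close()
    for mod in mods:
        with open(os.path.join(d, f'{mod}.py'), 'w') as f:
            f.write('def greet():\n    print(f"Hello from {__name__}!")\n')

print(sorted(os.listdir(os.path.join(base, 'main_pkg', 'sub_pkg1'))))
# ['__init__.py', 'mod1.py', 'mod2.py']
```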
That brings us to the end of this topic on modules and packages. We have covered most of the useful tools you will require to program using Python.
In the next chapter, we will look over the conventions used in the Python community and how to experience the Zen of Python.
JSON Module: https://docs.python.org/3/library/json.html
