AWS
PYTHON BOTO3
AWS Software Development Kit (SDK) for Python
Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python.
It allows Python developers to write software that makes use of services such as Amazon
S3 and Amazon EC2. Boto3 is maintained and published by Amazon Web Services.
Boto (pronounced boh-toh) was named after the freshwater dolphin native to the
Amazon River. The name was chosen by the author of the original Boto library, Mitch
Garnaat, as a reference to the company.
Boto provides an easy-to-use, object-oriented API as well as low-level direct service
access.
The AWS CLI and the Boto3 library are both built on top of a common module, botocore. This
low-level Python library takes care of the operations required
to send secure HTTPS API requests to AWS and to process the responses.
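As a quick illustration (a minimal sketch; the exact class name printed depends on your installed boto3/botocore versions, so treat it as an assumption), the client object Boto3 hands back is generated by botocore:

import boto3

# boto3 delegates low-level request handling to botocore;
# the client object it returns is generated by botocore at runtime
s3 = boto3.client('s3')
print(type(s3))  # e.g. <class 'botocore.client.S3'>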
PYTHON SDK
An SDK helps in the programming of applications by
bringing a group of development tools together in one place.
The Python SDK enables the user to write code that
manages OCI (Oracle Cloud Infrastructure)
resources; its primary purpose is to provide an
easy-to-use Python environment.
Furthermore, it enables developers to
authenticate users, retrieve data, and upload files.
An SDK contains instructions that allow developers to create systems and
develop applications.
SDK tools include different components such as libraries and code samples.
AWS BOTO3 SDK
• Boto3 allows you to write Python code to manage AWS
services programmatically
• Boto3 acts as an intermediary between developers and the AWS
services to be managed
• Boto3 contains many easy-to-use APIs that can interact with AWS
services such as S3, EC2, RDS, SNS, etc.
• Python and Boto3 combined allow you to automate processes
that manage the various AWS services (see the sketch below)
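As a minimal sketch of what this looks like in practice, a single script can talk to several services through their own clients:

import boto3

# one library, many services: each client wraps one AWS service
s3 = boto3.client('s3')
ec2 = boto3.client('ec2')

# list the S3 buckets in the account
for bucket in s3.list_buckets()['Buckets']:
    print(bucket['Name'])

# describe the EC2 instances in the current region
for reservation in ec2.describe_instances()['Reservations']:
    for instance in reservation['Instances']:
        print(instance['InstanceId'], instance['State']['Name'])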
AWS BOTO3 APPLICATIONS
• Automate the process of uploading files into an Amazon S3 bucket
• The Boto3 file upload method can be used to implement this (a sketch
follows this list)
• Maintain regular backups of sensitive information on EC2 instances
• The Boto3 snapshot method can be used to implement this
• Parse files and send email notifications to alert users for validation
• The Boto3 publish method of SNS can be used to implement this
• Manage uploads to an RDS database that in turn executes SQL
statements
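A minimal sketch of the first application (the file path, bucket name, and key below are hypothetical placeholders):

import boto3

s3 = boto3.client('s3')

# upload a local CSV file to a bucket with the upload_file method
s3.upload_file(
    Filename='customers.csv',        # local path
    Bucket='my-example-bucket',      # target bucket (must already exist)
    Key='uploads/customers.csv',     # object key inside the bucket
)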
CASE SCENARIO 1
Keywords: S3 Buckets, Parsing, SNS Topic, Publish
Suppose a company hosts some of its resources on the AWS Cloud. The
Sales team of the organization has a dedicated DCT (Data Centre
Technician) Amazon S3 bucket which hosts all the customer information.
The customer information is regularly analyzed and validated by the Sales
team. The data comes in the form of CSV files which are not
properly organized, so manual inspection is needed to figure out which
file belongs to which Sales representative.
The problem statement is to automate the entire process by writing Python
scripts that can parse the file names in the DCT Sales bucket and determine
which Sales representative in the organization each file belongs to. Once we
have that information, we need to publish it through an SNS topic to alert that
representative.
STEP1 – CREATE AN S3 BUCKET
Log in to https://aws.amazon.com/console using your AWS user credentials.
Search for the S3 service in the Services search bar.
In the S3 services dashboard, click the Create bucket button to create a new bucket.
Create a bucket named dct-allsales and select the region Asia
Pacific (Mumbai) ap-south-1.
Once the bucket is created successfully, a confirmation message appears and the new
bucket is listed in the account snapshot.
Create subfolders for customer details and individual sales representatives.
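The same setup can also be scripted with Boto3. A minimal sketch, using the scenario's bucket and folder names (note that S3 "folders" are just zero-byte objects whose keys end in '/'):

import boto3

s3 = boto3.client('s3', region_name='ap-south-1')

# create the bucket in the Mumbai region; outside us-east-1 the
# region must be passed as a LocationConstraint
s3.create_bucket(
    Bucket='dct-allsales',
    CreateBucketConfiguration={'LocationConstraint': 'ap-south-1'},
)

# create the "folders" as empty objects with trailing-slash keys
for prefix in ['allcustomer-details/', 'sr1/']:
    s3.put_object(Bucket='dct-allsales', Key=prefix)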
STEP2 – CREATE ENVIRONMENT FOR CLOUD9 IDE
In the AWS Cloud9 console, create a new environment in which to run the Python
script. The script below moves each matching file into the sales representative's
folder and then publishes an SNS notification.
import boto3

# create boto3 resource and client
s3 = boto3.resource('s3')
sns = boto3.client('sns')

# define the S3 bucket and folders
bucket_name = 'dct-allsales'
bucket = s3.Bucket(bucket_name)
source_folder = 'allcustomer-details/'
destination_folder = 'sr1/'

# define the SNS topic ARN
sns_topic_arn = 'arn:aws:sns:ap-south-1:574223795052:SR-File-Validation'

# define the sales representative prefix
sr_prefix = 'sr1_'

# initialize a flag
files_moved = False

# list all objects in the source folder that match the prefix
for obj in bucket.objects.filter(Prefix=source_folder + sr_prefix):
    print("Object = ", obj)

    # construct the destination key
    destination_key = destination_folder + obj.key.split('/')[-1]

    # copy the object to the destination folder
    copy_source = {
        'Bucket': bucket_name,
        'Key': obj.key,
    }
    bucket.copy(copy_source, destination_key)

    # delete the object from the source folder
    obj.delete()
    print(f"Moved file {obj.key} to {destination_key}")

    # set the flag to true
    files_moved = True

# if any files were moved, publish a message to the SNS topic
if files_moved:
    response = sns.publish(
        TopicArn=sns_topic_arn,
        Message='Sales Representative 1 you have new files to validate in your folder',
    )
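The script assumes the SNS topic already exists and has subscribers. A minimal sketch of that one-time setup (the email address is a placeholder):

import boto3

sns = boto3.client('sns', region_name='ap-south-1')

# create the topic (idempotent: returns the existing ARN if it already exists)
topic = sns.create_topic(Name='SR-File-Validation')

# subscribe the sales representative's email; AWS sends a confirmation mail
sns.subscribe(
    TopicArn=topic['TopicArn'],
    Protocol='email',
    Endpoint='sales-rep1@example.com',
)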
CASE SCENARIO 2
Operations on an AWS S3 Datastore using Python Boto3
S3 BUCKET
A bucket in S3 is similar to a database in other storage terminology.
Objects are individual pieces of data within an S3 bucket.
Some of the most important operations on an AWS S3 datastore can be performed using the
Python Boto3 library.
Boto3 is the Python library used to interact with all AWS services, for example AWS S3.
AWS S3 is a simple storage service which can store structured, semi-structured, and
unstructured data.
It's pretty common to use S3 for photos and videos, as well as to use S3 as a data lake - a
central data hub where all data flows in.
Let us create a bucket by the name boto-tutorial-bucket.
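A minimal sketch of that step (the Mumbai region is carried over from Scenario 1 as an assumption; also note that bucket names are globally unique, so this exact name may already be taken):

import boto3

s3 = boto3.client('s3', region_name='ap-south-1')

# create the tutorial bucket in the chosen region
s3.create_bucket(
    Bucket='boto-tutorial-bucket',
    CreateBucketConfiguration={'LocationConstraint': 'ap-south-1'},
)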