How to save an S3 object to a file using boto3

There is a customization that went into Boto3 recently which helps with this (among other things). It is currently exposed on the low-level S3 client, and can be used like this:

```python
import boto3

s3_client = boto3.client('s3')
open('hello.txt', 'w').write('Hello, world!')

# Upload the file to S3
s3_client.upload_file('hello.txt', 'MyBucket', 'hello-remote.txt')

# Download the file from S3
s3_client.download_file('MyBucket', 'hello-remote.txt', 'hello2.txt')
print(open('hello2.txt').read())
```

… Read more
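For reference, the same transfer methods are also exposed on the resource-level API; a minimal sketch, reusing the placeholder bucket and key names from above:

```python
import boto3

s3 = boto3.resource('s3')
# Bucket.download_file mirrors the client method shown above;
# 'MyBucket' and 'hello-remote.txt' are placeholder names.
s3.Bucket('MyBucket').download_file('hello-remote.txt', 'hello-local.txt')
```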

How to choose an AWS profile when using boto3 to connect to CloudFront

I think the docs aren't wonderful at exposing how to do this. It has been a supported feature for some time, however, and there are some details in this pull request. So there are three different ways to do this:

Option A) Create a new session with the profile

```python
dev = boto3.session.Session(profile_name='dev')
```

Option B) Change … Read more
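The remaining options are cut off above, but two other documented ways to point boto3 at a profile are changing the default session and setting an environment variable; a minimal sketch, assuming a profile named 'dev' exists in your AWS config:

```python
import os
import boto3

# Make 'dev' the default profile for subsequently created clients
# ('dev' is a placeholder profile name).
boto3.setup_default_session(profile_name='dev')
cloudfront = boto3.client('cloudfront')

# Or set AWS_PROFILE before boto3 resolves its configuration.
os.environ['AWS_PROFILE'] = 'dev'
cloudfront = boto3.client('cloudfront')
```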

How to handle errors with boto3?

Use the response contained within the exception. Here is an example:

```python
import boto3
from botocore.exceptions import ClientError

try:
    iam = boto3.client('iam')
    user = iam.create_user(UserName='fred')
    print("Created user: %s" % user)
except ClientError as e:
    if e.response['Error']['Code'] == 'EntityAlreadyExists':
        print("User already exists")
    else:
        print("Unexpected error: %s" % e)
```

The response dict in the exception will contain the … Read more
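The same pattern works with any service client. As a further illustration, the exception's response dict also carries request metadata alongside the error code; the field names below are the standard botocore ones, and the bucket name is a placeholder:

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
try:
    s3.head_bucket(Bucket='a-bucket-that-may-not-exist')
except ClientError as e:
    # Error details and HTTP metadata from the failed request.
    print(e.response['Error']['Code'])
    print(e.response['Error']['Message'])
    print(e.response['ResponseMetadata']['HTTPStatusCode'])
```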

Complete scan of DynamoDB with boto3

I think the Amazon DynamoDB documentation regarding table scanning answers your question. In short, you'll need to check for LastEvaluatedKey in the response. Here is an example using your code:

```python
import boto3

dynamodb = boto3.resource(
    'dynamodb',
    aws_session_token=aws_session_token,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region,
)

table = dynamodb.Table('widgetsTableName')

response = table.scan()
data = response['Items']

while 'LastEvaluatedKey' in response:
    response = …
```

… Read more
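The loop body is truncated above; as a sketch of the standard pagination pattern (not necessarily the answer's exact continuation, with credentials resolved from the environment for brevity and a placeholder table name):

```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('widgetsTableName')  # placeholder table name

response = table.scan()
data = response['Items']

# Scan returns at most 1 MB per page; LastEvaluatedKey is present
# whenever another page remains to be fetched.
while 'LastEvaluatedKey' in response:
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    data.extend(response['Items'])
```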

How to import a text file on AWS S3 into pandas without writing to disk

pandas uses boto for read_csv, so you should be able to:

```python
import boto
import pandas as pd

data = pd.read_csv('s3://bucket….csv')
```

If you need boto3 because you are on python3.4+, you can:

```python
import io

import boto3
import pandas as pd

s3 = boto3.client('s3')
obj = s3.get_object(Bucket='bucket', Key='key')
df = pd.read_csv(io.BytesIO(obj['Body'].read()))
```

Since version 0.20.1 pandas uses s3fs; see the answer below.
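Since the excerpt points at the s3fs route, a minimal sketch of that variant, assuming the s3fs package is installed and using a placeholder bucket/key:

```python
import pandas as pd

# With s3fs installed, pandas (>= 0.20.1) reads S3 URLs directly;
# 'my-bucket/my-key.csv' is a placeholder path.
df = pd.read_csv('s3://my-bucket/my-key.csv')
```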

Boto3 to download all files from an S3 bucket

I have the same need and created the following function, which downloads the files recursively. Directories are created locally only if they contain files.

```python
import boto3
import os

def download_dir(client, resource, dist, local='/tmp', bucket='your_bucket'):
    paginator = client.get_paginator('list_objects')
    for result in paginator.paginate(Bucket=bucket, Delimiter='/', Prefix=dist):
        if result.get('CommonPrefixes') is not None:
            for subdir in result.get('CommonPrefixes'):
                download_dir(client, resource, …
```

… Read more
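The function above is cut off; as an alternative, here is a minimal self-contained sketch of the same idea built on the list_objects_v2 paginator (bucket name and prefix are placeholders):

```python
import os

import boto3

def download_all(bucket, prefix='', local='/tmp'):
    """Download every object under `prefix` from `bucket` into `local`,
    recreating the key hierarchy as local directories."""
    s3 = boto3.client('s3')
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            key = obj['Key']
            if key.endswith('/'):  # skip "directory" placeholder keys
                continue
            target = os.path.join(local, key)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3.download_file(bucket, key, target)

download_all('your_bucket', prefix='some/prefix/')  # placeholder names
```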

How to write a file or data to an S3 object using boto3

In boto 3, the 'Key.set_contents_from_' methods were replaced by Object.put() and Client.put_object(). For example:

```python
import boto3

some_binary_data = b'Here we have some data'
more_binary_data = b'Here we have some more data'

# Method 1: Object.put()
s3 = boto3.resource('s3')
object = s3.Object('my_bucket_name', 'my/key/including/filename.txt')
object.put(Body=some_binary_data)

# Method 2: Client.put_object()
client = boto3.client('s3')
client.put_object(Body=more_binary_data, Bucket='my_bucket_name',
                  Key='my/key/including/anotherfilename.txt')
```

Alternatively, the binary … Read more
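Beyond raw bytes, the Body parameter also accepts encoded strings and seekable open file objects; a minimal sketch with placeholder bucket, key, and file names:

```python
import boto3

client = boto3.client('s3')

# Strings must be encoded to bytes before upload.
client.put_object(Body='hello, world'.encode('utf-8'),
                  Bucket='my_bucket_name', Key='greeting.txt')

# An open file object streams its contents as the object body.
with open('local-file.bin', 'rb') as f:
    client.put_object(Body=f, Bucket='my_bucket_name', Key='remote-file.bin')
```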