amazon web services - How to extract files in S3 on the fly with boto3?

1 Answer

深蓝 · Answer 1 · 2021-10-23T18:58:48+0000

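You can read the gzipped object from S3 into a BytesIO buffer, decompress it with gzip.GzipFile, and upload the decompressed data back to S3 with upload_fileobj. Note that this reads the whole compressed object into memory before decompressing it.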

# python imports
import boto3
from io import BytesIO
import gzip

# setup constants
bucket = '<bucket_name>'
gzipped_key = '<key_name.gz>'
uncompressed_key = '<key_name>'

# initialize the S3 client; this depends on your AWS credentials/config being set up
s3 = boto3.client('s3', use_ssl=False)  # use_ssl=False is optional (the default is True)
s3.upload_fileobj(                      # upload a new object to S3
    Fileobj=gzip.GzipFile(              # file-like object yielding the decompressed bytes (like gzip -d)
        None,                           # no filename; the data comes from fileobj
        'rb',                           # read binary
        fileobj=BytesIO(s3.get_object(Bucket=bucket, Key=gzipped_key)['Body'].read())),
    Bucket=bucket,                      # target bucket to write to
    Key=uncompressed_key)               # target key to write to
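
If the object is too large to hold in memory, the same idea can be sketched without the intermediate BytesIO by handing the response body directly to gzip.GzipFile. This is a minimal sketch, not a tested production recipe: the function name gunzip_s3_object is hypothetical, and it assumes the 'Body' returned by get_object behaves as a readable file-like object that GzipFile can consume in chunks.

```python
import gzip


def gunzip_s3_object(s3, bucket, gzipped_key, uncompressed_key):
    """Decompress one gzipped S3 object into another key (hypothetical helper).

    `s3` is expected to be a boto3 S3 client, or any object exposing the same
    get_object / upload_fileobj methods.
    """
    # Assumption: the response body exposes read(), so GzipFile can pull
    # compressed bytes from it as needed instead of loading everything up front.
    body = s3.get_object(Bucket=bucket, Key=gzipped_key)['Body']
    with gzip.GzipFile(fileobj=body, mode='rb') as decompressed:
        # upload_fileobj reads from the file-like object it is given
        s3.upload_fileobj(Fileobj=decompressed, Bucket=bucket, Key=uncompressed_key)


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client('s3')
#   gunzip_s3_object(s3, '<bucket_name>', '<key_name.gz>', '<key_name>')
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real bucket.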

To confirm that the gzipped key is being read correctly:

# read the body of the S3 object into memory (as bytes) to confirm the download works
s = s3.get_object(Bucket=bucket, Key=gzipped_key)['Body'].read()
print(len(s))  # check that some data was returned
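
Beyond checking the length, you could also check that the downloaded bytes actually look like gzip data. A rough sketch, assuming only that a gzip stream starts with the two magic bytes 0x1f 0x8b; the helper name looks_like_gzip is made up for illustration:

```python
import gzip

GZIP_MAGIC = b'\x1f\x8b'  # first two bytes of every gzip stream


def looks_like_gzip(data: bytes) -> bool:
    """Return True if `data` starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# Local demonstration with in-memory data (no S3 access needed)
sample = gzip.compress(b'example')
print(looks_like_gzip(sample))         # True
print(looks_like_gzip(b'plain text'))  # False
```

With the variable from the check above, this would be called as looks_like_gzip(s).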
