How to Encrypt Files Before Uploading to AWS S3 Using Python


Introduction

When uploading sensitive files to AWS S3, it is crucial that the data stays protected both in transit and at rest. Encrypting files on the client before uploading them keeps their contents confidential even if someone else gains access to the bucket. Note that encryption alone protects privacy, not integrity: detecting tampering requires an authentication step in addition to the cipher. In this tutorial, we will explore how to use Python to encrypt files before uploading them to an S3 bucket. We will use the AWS SDK for Python (Boto3) along with the cryptography library to handle encryption.

Prerequisites

  • Python 3.6 or above installed
  • AWS Account with S3 bucket created
  • AWS CLI configured with access keys
  • Boto3 and cryptography libraries installed

Install Necessary Libraries

We will be using the Boto3 library to interact with AWS S3 and the cryptography library to handle file encryption. Install these libraries using pip:

pip install boto3 cryptography

Encryption Strategy

For encrypting files, we will use symmetric encryption with the AES (Advanced Encryption Standard) algorithm. This means that the same key is used for both encrypting and decrypting the files, so anyone who holds the key can read the data, and losing the key makes the encrypted files unrecoverable. For the purposes of this example, we will generate a 256-bit key in memory. In a production environment, replace this with a proper key management solution so that the key is stored securely and is still available when you need to decrypt.
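As a minimal sketch of keeping the key around between runs, the helper below creates a key file on first use and reads it back afterwards. The function name `load_or_create_key` and the idea of a local key file are assumptions made for illustration, not part of the original script; a local file is only a small step up from an in-memory key and is not a substitute for a managed key store.

```python
import os

KEY_SIZE = 32  # 32 bytes = 256-bit AES key


def load_or_create_key(path):
    """Return the key stored at `path`, creating it on first use.

    Hypothetical helper: the key file is created with owner-only
    permissions (0o600) where the platform supports them.
    """
    try:
        # O_EXCL makes creation fail if the file already exists,
        # so an existing key is never silently overwritten.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        with open(path, 'rb') as f:
            key = f.read()
        if len(key) != KEY_SIZE:
            raise ValueError(f'Key file {path} does not hold a {KEY_SIZE}-byte key')
        return key

    key = os.urandom(KEY_SIZE)
    with os.fdopen(fd, 'wb') as f:
        f.write(key)
    return key
```

Calling `load_or_create_key('aes.key')` in place of `os.urandom(32)` would give the same key on every run, which is what makes later decryption possible.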

Step 1: Set Up AWS S3

Before we start with the encryption and upload process, make sure you have an S3 bucket set up and your AWS credentials configured. If you haven’t done so, you can configure them using the AWS CLI:

aws configure

Ensure that you have access to the bucket where you will upload the encrypted files.

Step 2: Import Required Libraries

Let’s start by importing the necessary libraries in our Python script:

import boto3
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import padding
import os

Step 3: Encrypting the File

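Now, we will create a function to encrypt the file. We will use AES encryption in CBC (Cipher Block Chaining) mode. AES works on 16-byte blocks, so the file contents are padded with PKCS7 to make their length a multiple of the block size before encryption. A fresh random IV (Initialization Vector) is generated for every file and written in front of the ciphertext so that it is available at decryption time. Keep in mind that CBC mode provides confidentiality only; it does not detect whether the ciphertext has been modified.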

def encrypt_file(input_file, output_file, key):
    # Generate a random IV (Initialization Vector) for CBC mode
    iv = os.urandom(16)
    
    # Create a cipher object using AES algorithm and CBC mode
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
    
    # Pad the input file to make its size a multiple of block size
    padder = padding.PKCS7(algorithms.AES.block_size).padder()
    
    with open(input_file, 'rb') as f:
        data = f.read()
    
    padded_data = padder.update(data) + padder.finalize()
    
    # Encrypt the padded data
    encryptor = cipher.encryptor()
    encrypted_data = encryptor.update(padded_data) + encryptor.finalize()
    
    # Write the encrypted data to the output file along with the IV
    with open(output_file, 'wb') as f:
        f.write(iv + encrypted_data)
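To make the padding step above more concrete, here is a plain-Python illustration of what PKCS7 padding does to the data. The functions `pkcs7_pad` and `pkcs7_unpad` are hypothetical names written for this sketch; the tutorial's code keeps using the padder from the cryptography library, and this version is for understanding only.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    # When the data is already aligned, a full extra block is added,
    # so the padding can always be removed unambiguously.
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Strip PKCS7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError('Padded data length is not a multiple of the block size')
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError('Invalid padding length')
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError('Invalid padding bytes')
    return padded[:-pad_len]
```

A 3-byte input therefore grows to 16 bytes, and a 16-byte input grows to 32. The encrypted file written by encrypt_file is 16 bytes (the IV) longer than the padded data.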

Step 4: Upload Encrypted File to AWS S3

Once the file is encrypted, we can upload it to AWS S3 using the Boto3 library. Make sure that the S3 bucket name is correct and that you have the necessary permissions.

def upload_to_s3(file_name, bucket_name, s3_key):
    s3 = boto3.client('s3')
    
    # Upload the file to S3
    s3.upload_file(file_name, bucket_name, s3_key)
    print(f'File {file_name} uploaded to S3 bucket {bucket_name} with key {s3_key}')
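As an optional extra layer, S3 can also encrypt the object on the server side. The sketch below passes `ExtraArgs` to Boto3's `upload_file` to request SSE-S3 (`AES256`). The function name `upload_encrypted` and the optional `s3_client` parameter are assumptions added here so that the client can be swapped out, for example in tests; server-side encryption complements the client-side encryption in this tutorial and does not replace it.

```python
def upload_encrypted(file_name, bucket_name, s3_key, s3_client=None):
    """Upload a file and ask S3 to apply server-side encryption as well.

    Hypothetical variant of upload_to_s3: `s3_client` may be any object
    exposing upload_file(); a real Boto3 client is created when omitted.
    """
    if s3_client is None:
        import boto3  # imported lazily so the function can be tested without AWS
        s3_client = boto3.client('s3')

    # Request SSE-S3 (AES-256 managed by S3) for the stored object
    extra_args = {'ServerSideEncryption': 'AES256'}
    s3_client.upload_file(file_name, bucket_name, s3_key, ExtraArgs=extra_args)
    return extra_args
```

Usage would mirror the original function, for example `upload_encrypted('encrypted_example.txt', 'my-s3-bucket', 'encrypted_files/encrypted_example.txt')`.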

Step 5: Putting It All Together

Now, we can combine everything into a single Python script. This script will take an input file, encrypt it, and then upload the encrypted file to AWS S3.

def main():
    input_file = 'example.txt'  # Specify the file to be encrypted
    output_file = 'encrypted_example.txt'  # Output encrypted file
    bucket_name = 'my-s3-bucket'  # Your S3 bucket name
    s3_key = 'encrypted_files/encrypted_example.txt'  # S3 path for storing the file
    key = os.urandom(32)  # Random 256-bit AES key; store it securely, or the file cannot be decrypted later

    # Encrypt the file
    encrypt_file(input_file, output_file, key)
    
    # Upload the encrypted file to S3
    upload_to_s3(output_file, bucket_name, s3_key)

if __name__ == '__main__':
    main()

Step 6: Decrypting the File (Optional)

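If you need to decrypt the file later, you can use a similar approach. You will need the same key that was used during encryption. The IV does not have to be stored separately, because encrypt_file writes it as the first 16 bytes of the encrypted file. Here is how you can decrypt the file: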

def decrypt_file(input_file, output_file, key):
    with open(input_file, 'rb') as f:
        iv = f.read(16)  # Read the first 16 bytes as the IV
        encrypted_data = f.read()  # Read the remaining data

    # Create the cipher object using the same key and IV
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
    
    # Decrypt the data
    decryptor = cipher.decryptor()
    decrypted_data = decryptor.update(encrypted_data) + decryptor.finalize()

    # Remove padding
    unpadder = padding.PKCS7(algorithms.AES.block_size).unpadder()
    data = unpadder.update(decrypted_data) + unpadder.finalize()

    with open(output_file, 'wb') as f:
        f.write(data)
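CBC mode on its own does not detect tampering, so a modified ciphertext may decrypt to garbage or fail with a padding error. One common remedy, shown here as a sketch that is not part of the original script, is encrypt-then-MAC: compute an HMAC over the IV and ciphertext and check it before decrypting. The names `add_tag` and `verify_and_strip`, and the use of a separate `mac_key`, are assumptions for illustration; the MAC key should be independent of the AES key.

```python
import hashlib
import hmac

TAG_SIZE = 32  # HMAC-SHA256 produces a 32-byte tag


def add_tag(blob, mac_key):
    """Append an HMAC-SHA256 tag computed over `blob` (IV + ciphertext)."""
    tag = hmac.new(mac_key, blob, hashlib.sha256).digest()
    return blob + tag


def verify_and_strip(tagged, mac_key):
    """Check the trailing tag and return the original blob.

    Raises ValueError if the data is too short or the tag does not match.
    """
    if len(tagged) < TAG_SIZE:
        raise ValueError('Data is too short to contain a tag')
    blob, tag = tagged[:-TAG_SIZE], tagged[-TAG_SIZE:]
    expected = hmac.new(mac_key, blob, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences
    if not hmac.compare_digest(tag, expected):
        raise ValueError('Authentication tag mismatch: data was modified or the key is wrong')
    return blob
```

With this in place, the bytes written by encrypt_file would be passed through `add_tag` before upload, and `verify_and_strip` would run before decrypt_file touches the data.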
