Introduction
When uploading sensitive files to AWS S3, it is crucial that the data stays encrypted both in transit and at rest. Encrypting files on your own machine before uploading them helps protect data privacy, because S3 only ever receives the ciphertext. In this tutorial, we will explore how to use Python to encrypt files before uploading them to an S3 bucket. We will use the AWS SDK for Python (Boto3) along with the cryptography library to handle encryption.
Prerequisites
- Python 3.6 or above installed
- AWS Account with S3 bucket created
- AWS CLI configured with access keys
- Boto3 and cryptography libraries installed
Install Necessary Libraries
We will be using the Boto3 library to interact with AWS S3 and the cryptography library to handle file encryption. Install these libraries using pip:
pip install boto3 cryptography
Encryption Strategy
For encrypting files, we will use symmetric encryption with the AES (Advanced Encryption Standard) algorithm. This means that the same key will be used for both encrypting and decrypting the files. The encryption key should be stored securely, and for the purposes of this example, we will generate a key in memory. You should replace this with a more secure key management solution in a production environment.
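As a step up from a key that exists only in memory, the key can at least be written to a local file so that the same key is available again for decryption. The following is a minimal sketch under that assumption; the helper names generate_key, save_key and load_key and the key file path are hypothetical and not part of Boto3 or the cryptography library. A local key file is still not a substitute for a dedicated key management service.

```python
import os

KEY_SIZE_BYTES = 32  # 256-bit AES key


def generate_key():
    """Return a fresh random 256-bit key from the OS random source."""
    return os.urandom(KEY_SIZE_BYTES)


def save_key(key, key_path):
    """Write the key to key_path, requesting owner-only permissions (0o600 on POSIX)."""
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)


def load_key(key_path):
    """Read the key back and check that it has the expected length."""
    with open(key_path, "rb") as f:
        key = f.read()
    if len(key) != KEY_SIZE_BYTES:
        raise ValueError("key file does not contain a 256-bit key")
    return key
```

With helpers like these, a script would call generate_key() and save_key() once, and use load_key() on later runs instead of generating a new key each time.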
Step 1: Set Up AWS S3
Before we start with the encryption and upload process, make sure you have an S3 bucket set up and your AWS credentials configured. If you haven’t done so, you can configure them using the AWS CLI:
aws configure
Ensure that you have access to the bucket where you will upload the encrypted files.
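One way to check bucket access from Python before encrypting anything is a HEAD request on the bucket. The sketch below assumes a Boto3 S3 client is passed in; the function name bucket_is_accessible is hypothetical. It treats any client error (for example a missing bucket or denied access) as "not accessible".

```python
def bucket_is_accessible(s3_client, bucket_name):
    """Return True if a HEAD request on the bucket succeeds, False on a client error."""
    try:
        s3_client.head_bucket(Bucket=bucket_name)
    except s3_client.exceptions.ClientError:
        # Raised, for example, when the bucket does not exist or access is denied
        return False
    return True


# Usage (requires boto3 and configured credentials):
#   import boto3
#   print(bucket_is_accessible(boto3.client('s3'), 'my-s3-bucket'))
```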
Step 2: Import Required Libraries
Let’s start by importing the necessary libraries in our Python script:
import boto3
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import padding
import os
Step 3: Encrypting the File
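Now, we will create a function to encrypt the file. We will use AES encryption in CBC (Cipher Block Chaining) mode. Because AES works on 16-byte blocks, the file contents are padded with PKCS7 so that their length is a multiple of the block size. A fresh random IV is generated for every file and written in front of the ciphertext. Note that CBC mode provides confidentiality only; on its own it does not detect tampering with the encrypted file.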
def encrypt_file(input_file, output_file, key):
    # Generate a random IV (Initialization Vector) for CBC mode
    iv = os.urandom(16)
    # Create a cipher object using AES algorithm and CBC mode
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
    # Pad the file contents to a multiple of the AES block size
    padder = padding.PKCS7(algorithms.AES.block_size).padder()
    with open(input_file, 'rb') as f:
        data = f.read()
    padded_data = padder.update(data) + padder.finalize()
    # Encrypt the padded data
    encryptor = cipher.encryptor()
    encrypted_data = encryptor.update(padded_data) + encryptor.finalize()
    # Write the IV followed by the encrypted data to the output file
    with open(output_file, 'wb') as f:
        f.write(iv + encrypted_data)
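To make the padding step more concrete, here is a small pure-Python illustration of what PKCS7 padding does to the data. It is only a sketch for explanation: the function names pkcs7_pad and pkcs7_unpad are hypothetical, and the tutorial's code should keep using the padder from the cryptography library.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes, each with value N, so the length becomes a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Check the padding and strip it again."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]


padded = pkcs7_pad(b"hello")
print(len(padded))  # 16
print(padded[-1])   # 11, because 11 padding bytes were added to the 5 data bytes
```

Data that is already a multiple of 16 bytes long still receives a full extra block of padding, which is what lets the unpadding step remove it unambiguously.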
Step 4: Upload Encrypted File to AWS S3
Once the file is encrypted, we can upload it to AWS S3 using the Boto3 library. Make sure that the S3 bucket name is correct and that you have the necessary permissions.
def upload_to_s3(file_name, bucket_name, s3_key):
    s3 = boto3.client('s3')
    # Upload the file to S3
    s3.upload_file(file_name, bucket_name, s3_key)
    print(f'File {file_name} uploaded to S3 bucket {bucket_name} with key {s3_key}')
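If you also want S3 to encrypt the stored object on its side, the upload call can request server-side encryption explicitly through ExtraArgs. The following is a sketch under that assumption; the function name upload_with_sse is hypothetical, and depending on the bucket's configuration this setting may already be applied by default.

```python
def upload_with_sse(s3_client, file_name, bucket_name, s3_key):
    """Upload a file and explicitly request S3-managed server-side encryption."""
    s3_client.upload_file(
        file_name,
        bucket_name,
        s3_key,
        ExtraArgs={'ServerSideEncryption': 'AES256'},
    )


# Usage (requires boto3 and configured credentials):
#   import boto3
#   upload_with_sse(boto3.client('s3'), 'encrypted_example.txt',
#                   'my-s3-bucket', 'encrypted_files/encrypted_example.txt')
```

Passing the client in as a parameter, rather than creating it inside the function, also makes the function easy to exercise with a stand-in client during testing.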
Step 5: Putting It All Together
Now, we can combine everything into a single Python script. This script will take an input file, encrypt it, and then upload the encrypted file to AWS S3.
def main():
    input_file = 'example.txt'  # Specify the file to be encrypted
    output_file = 'encrypted_example.txt'  # Output encrypted file
    bucket_name = 'my-s3-bucket'  # Your S3 bucket name
    s3_key = 'encrypted_files/encrypted_example.txt'  # S3 path for storing the file
    # Generate a random 256-bit key for AES encryption.
    # Store this key securely: the file cannot be decrypted without it.
    key = os.urandom(32)
    # Encrypt the file
    encrypt_file(input_file, output_file, key)
    # Upload the encrypted file to S3
    upload_to_s3(output_file, bucket_name, s3_key)

if __name__ == '__main__':
    main()
Step 6: Decrypting the File (Optional)
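If you need to decrypt the file later, you can use a similar approach. You need the same key that was used for encryption. The IV does not have to be stored separately, because encrypt_file writes it as the first 16 bytes of the encrypted file. Here is how you can decrypt the file: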
def decrypt_file(input_file, output_file, key):
    with open(input_file, 'rb') as f:
        iv = f.read(16)  # Read the first 16 bytes as the IV
        encrypted_data = f.read()  # Read the remaining data
    # Create the cipher object using the same key and IV
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
    # Decrypt the data
    decryptor = cipher.decryptor()
    decrypted_data = decryptor.update(encrypted_data) + decryptor.finalize()
    # Remove padding
    unpadder = padding.PKCS7(algorithms.AES.block_size).unpadder()
    data = unpadder.update(decrypted_data) + unpadder.finalize()
    with open(output_file, 'wb') as f:
        f.write(data)
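In practice, the encrypted file has to be fetched from S3 before it can be decrypted. The sketch below shows one possible way to combine the two steps; the function name download_and_decrypt is hypothetical, and it assumes that a Boto3 S3 client and a decryption function with the same signature as decrypt_file above are passed in.

```python
import os
import tempfile


def download_and_decrypt(s3_client, bucket_name, s3_key, key, output_file, decrypt_func):
    """Download an encrypted object to a temporary file, decrypt it, then clean up."""
    fd, tmp_path = tempfile.mkstemp(suffix='.enc')
    os.close(fd)  # Only the path is needed; the client writes the file itself
    try:
        s3_client.download_file(bucket_name, s3_key, tmp_path)
        decrypt_func(tmp_path, output_file, key)
    finally:
        # Remove the temporary ciphertext file even if decryption fails
        os.remove(tmp_path)


# Usage (requires boto3, configured credentials, and decrypt_file from above):
#   import boto3
#   download_and_decrypt(boto3.client('s3'), 'my-s3-bucket',
#                        'encrypted_files/encrypted_example.txt',
#                        key, 'decrypted_example.txt', decrypt_file)
```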