Automating Google Cloud Storage File Uploads with Python and gcloud SDK


Google Cloud Storage (GCS) provides scalable object storage, enabling users to store and retrieve large amounts of data quickly. Automating file uploads to GCS can be done efficiently using Python and the Google Cloud SDK. In this guide, we will explore how to set up and use Python to automate file uploads, leveraging the gcloud SDK for authentication and interaction with the GCS API.

Prerequisites

Before automating file uploads, ensure that you have the following:

  • A Google Cloud Platform account.
  • Google Cloud SDK installed and initialized.
  • Python 3.x installed on your system.
  • Access to a GCS bucket.
  • Proper permissions set for your Google Cloud account (e.g., roles/storage.objectAdmin).

Setting Up Google Cloud SDK

The first step in automating file uploads to GCS is ensuring that the Google Cloud SDK is installed and configured on your machine. You can download the SDK from the official [Google Cloud SDK](https://cloud.google.com/sdk/docs/install) page. After installation, initialize it using the following command:

```shell
gcloud init
```

This will prompt you to log in to your Google Cloud account and set the default project.

Installing Python Libraries

To interact with Google Cloud Storage using Python, you need to install the necessary libraries. The primary library used is google-cloud-storage. Install it via pip:
```shell
pip install google-cloud-storage
```
In addition to this, ensure you have the google-auth library installed for authentication purposes:
```shell
pip install google-auth google-auth-oauthlib google-auth-httplib2
```

Authentication with Google Cloud

Authentication to interact with Google Cloud is done using service account credentials. First, create a service account key from the Google Cloud Console under IAM & Admin > Service accounts. Download the JSON key file, and set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the downloaded key file:
```shell
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-file.json"
```
This step ensures that the Python script uses the correct credentials to authenticate with Google Cloud.
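Before running any uploads, it can help to sanity-check that this variable actually points to a usable key file. The helper below is a small stdlib-only sketch (the function name `check_credentials_path` is our own, not part of any Google library); it verifies that the variable is set, that the path exists, and that the file looks like a service-account key:

```python
import json
import os

def check_credentials_path():
    """Verify GOOGLE_APPLICATION_CREDENTIALS points to a readable key file."""
    path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"Key file not found: {path}"
    try:
        with open(path) as f:
            key = json.load(f)
    except (OSError, json.JSONDecodeError) as e:
        return f"Key file is not valid JSON: {e}"
    # Service-account key files carry a "type" field set to "service_account"
    if key.get("type") != "service_account":
        return "JSON file is not a service-account key"
    return "ok"
```

Calling `check_credentials_path()` before the first upload gives a clear message instead of a late authentication error deep inside the client library.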

Uploading Files with Python

With authentication set up, we can now write the Python code to automate file uploads to Google Cloud Storage. The following script uploads a local file to a specific GCS bucket:
```python
from google.cloud import storage

def upload_file_to_gcs(bucket_name, source_file_path, destination_blob_name):
    # Initialize the Cloud Storage client
    storage_client = storage.Client()

    # Get the bucket
    bucket = storage_client.get_bucket(bucket_name)

    # Create a blob object from the file's path
    blob = bucket.blob(destination_blob_name)

    # Upload the file
    blob.upload_from_filename(source_file_path)

    print(f"File {source_file_path} uploaded to {destination_blob_name}.")
```

Using the Upload Function

To use the above upload_file_to_gcs function, specify the bucket name, local file path, and the desired destination blob name (the name the file will have in GCS). Here’s how you can call this function:
```python
bucket_name = 'your-bucket-name'
source_file_path = 'path/to/local/file.txt'
destination_blob_name = 'destination/path/in/bucket.txt'

upload_file_to_gcs(bucket_name, source_file_path, destination_blob_name)
```
Ensure the file exists at the source_file_path and that the GCS bucket is correctly specified.
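One detail worth handling at this point is the object's content type: `upload_from_filename` accepts an optional `content_type` argument, and the stdlib `mimetypes` module can guess a sensible value from the file extension. The `plan_upload` helper below is a hypothetical sketch (the name and the `prefix` parameter are our own) that derives both the destination blob name and the content type from a local path:

```python
import mimetypes
import posixpath

def plan_upload(source_file_path, prefix="uploads"):
    """Return (destination_blob_name, content_type) for a local file.

    GCS object names use forward slashes regardless of the local OS,
    so the blob name is joined with posixpath.
    """
    filename = source_file_path.replace("\\", "/").rsplit("/", 1)[-1]
    blob_name = posixpath.join(prefix, filename)
    # guess_type returns None for unknown extensions; GCS then
    # defaults the object to application/octet-stream
    content_type, _ = mimetypes.guess_type(filename)
    return blob_name, content_type
```

You could then pass the guessed type along, e.g. `blob.upload_from_filename(source_file_path, content_type=content_type)`.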

Automating Uploads with Multiple Files

To automate uploads for multiple files, you can modify the script to loop through a directory and upload all files:
```python
import os

def upload_directory_to_gcs(bucket_name, source_directory):
    for filename in os.listdir(source_directory):
        source_file_path = os.path.join(source_directory, filename)
        if os.path.isfile(source_file_path):
            destination_blob_name = f"uploads/{filename}"
            upload_file_to_gcs(bucket_name, source_file_path, destination_blob_name)
```

Call the function by providing the bucket name and the directory containing the files:
Call the function by providing the bucket name and the directory containing the files:
```python
bucket_name = 'your-bucket-name'
source_directory = '/path/to/local/directory'

upload_directory_to_gcs(bucket_name, source_directory)
```
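Note that `os.listdir` only sees the top level of the directory, so files in subdirectories are skipped. If you need a recursive upload that preserves the folder structure, one stdlib-only way to plan it is with `pathlib`. The `plan_directory_upload` helper below is a hypothetical sketch (name and `prefix` parameter are our own) that maps each local file to a blob name, which you could then feed pair by pair to `upload_file_to_gcs`:

```python
from pathlib import Path

def plan_directory_upload(source_directory, prefix="uploads"):
    """Map every file under source_directory (recursively) to a blob name.

    Relative sub-paths are preserved, so <dir>/sub/b.txt becomes
    uploads/sub/b.txt. Returns a dict of {local_path: blob_name}.
    """
    root = Path(source_directory)
    plan = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # as_posix() guarantees forward slashes in the blob name
            relative = path.relative_to(root).as_posix()
            plan[str(path)] = f"{prefix}/{relative}"
    return plan
```

Separating the planning step from the upload loop also makes it easy to log or dry-run the batch before touching the bucket.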

Handling Errors and Logging

When automating uploads, handling errors is crucial. You can implement basic error handling in the upload function as follows:
```python
from google.cloud import storage
from google.cloud.exceptions import NotFound

def upload_file_to_gcs(bucket_name, source_file_path, destination_blob_name):
    try:
        # Initialize the Cloud Storage client
        storage_client = storage.Client()

        # Get the bucket
        bucket = storage_client.get_bucket(bucket_name)

        # Create a blob object from the file's path
        blob = bucket.blob(destination_blob_name)

        # Upload the file
        blob.upload_from_filename(source_file_path)

        print(f"File {source_file_path} uploaded to {destination_blob_name}.")
    except NotFound:
        print(f"Bucket {bucket_name} not found.")
    except Exception as e:
        print(f"An error occurred: {e}")
```
This error handling ensures that your script doesn’t crash when a bucket is not found or if there is an issue during the file upload process.
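Transient network failures are worth retrying rather than just logging. The google-cloud-storage client applies its own default retries to many operations, but for a fully controlled policy you can wrap the upload call yourself. `with_retries` below is a generic, stdlib-only sketch of exponential backoff (the name and parameters are our own, not part of the library):

```python
import time

def with_retries(operation, max_attempts=3, base_delay=1.0,
                 retriable=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run operation(), retrying transient failures with exponential backoff.

    Delays double between attempts: base_delay, 2*base_delay, 4*base_delay...
    Non-retriable exceptions propagate immediately; the last retriable one
    is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retriable:
            if attempt == max_attempts:
                raise
            sleep(base_delay * (2 ** (attempt - 1)))
```

It can wrap the existing upload function without changing it: `with_retries(lambda: upload_file_to_gcs(bucket_name, source_file_path, destination_blob_name))`.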

Conclusion

With Python and the Google Cloud SDK, automating file uploads to Google Cloud Storage is simple and efficient. The code examples provided enable you to easily upload single files, handle multiple files, and integrate error handling. By using the power of Python, you can streamline your workflow and ensure your files are uploaded securely to GCS automatically.
