Automating S3 to GCS Migration Using Bash Scripts
Introduction
Cloud storage plays a crucial role in modern infrastructure, providing scalable and reliable storage. Many businesses migrate from AWS S3 to Google Cloud Storage (GCS) to take advantage of cost benefits, integrate with Google Cloud services, or optimize their overall cloud strategy. However, when dealing with hundreds of S3 buckets, manual migration is inefficient and time-consuming.
To streamline the process, I automated the migration using Bash scripts and Google Cloud’s Storage Transfer Service. In this blog, I’ll walk you through the steps of automating S3 to GCS migration efficiently.
Why Automate S3 to GCS Migration?
Handling 200+ S3 buckets manually would involve:
- Repetitive tasks – Creating GCS buckets, setting permissions, and transferring data for each bucket.
- Human errors – Misconfiguration, incorrect bucket names, or missing files.
- Time-consuming process – Manual intervention would take days to complete.
By automating this process, we can:
- Save time – Script execution takes a few minutes instead of hours or days.
- Eliminate errors – Ensures all S3 buckets are correctly transferred.
- Enable monitoring & scheduling – Automate recurring data transfers with Google’s Storage Transfer Service.
Prerequisites
Before running the scripts, ensure you have:
- A Google Cloud project with billing enabled.
- An AWS IAM user (with access keys) that has s3:ListBucket and s3:GetObject permissions on the source buckets.
- The Google Cloud SDK (gcloud CLI) installed on your local machine.
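As a quick sanity check before running anything, confirm that the gcloud CLI is authenticated and pointed at the right project. A minimal sketch, using the placeholder project ID from the scripts below:
# Authenticate the gcloud CLI and select the target project
gcloud auth login
gcloud config set project ccd-poc-project
# Confirm the active account and project
gcloud config list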
Step 1: Creating Google Cloud Storage Buckets
Each S3 bucket requires a corresponding GCS bucket. The script below reads a list of bucket names from a file and creates a matching bucket for each in GCP.
create_gcs_bucket.sh
#!/bin/bash
# Variables
PROJECT_ID="ccd-poc-project" # Replace with your GCP project ID
BUCKET_LIST_FILE="bucket_names.txt" # File containing bucket names
OUTPUT_FILE="created_buckets.txt"
REGION="us-central1" # Change if needed
# Check if the bucket list file exists
if [ ! -f "$BUCKET_LIST_FILE" ]; then
  echo "Error: Bucket names file '$BUCKET_LIST_FILE' not found!"
  exit 1
fi
# Read bucket names and create GCS buckets
while IFS= read -r BUCKET_NAME || [[ -n "$BUCKET_NAME" ]]; do
  # Clean the bucket name (strip carriage returns and whitespace)
  BUCKET_NAME=$(echo "$BUCKET_NAME" | tr -d '\r' | tr -d '[:space:]')
  if [[ -z "$BUCKET_NAME" ]]; then
    continue # Skip empty lines
  fi
  echo "Creating bucket: $BUCKET_NAME"
  gcloud storage buckets create "gs://$BUCKET_NAME" --location="$REGION" --project="$PROJECT_ID"
  if [ $? -eq 0 ]; then
    echo "gs://$BUCKET_NAME" >> "$OUTPUT_FILE"
    echo "Bucket $BUCKET_NAME created successfully."
  else
    echo "Error: Failed to create bucket $BUCKET_NAME"
  fi
done < "$BUCKET_LIST_FILE"
Explanation:
- Reads bucket names from bucket_names.txt.
- Cleans up any unnecessary whitespace.
- Creates each GCS bucket in the specified region and project.
- Stores created bucket names in created_buckets.txt.
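For reference, bucket_names.txt is expected to contain one bucket name per line. A hypothetical example (the names below are placeholders):
bucket_names.txt
app-logs-prod
app-uploads-prod
analytics-exports
Keep in mind that GCS bucket names are globally unique, so reusing the S3 names only works if those names are not already taken by another GCS project; otherwise the create step will fail for that bucket.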
Step 2: Automating Data Transfer from S3 to GCS
After creating the required GCS buckets, the next step is to automate data transfer using the gcloud transfer jobs command.
s3_to_gcs_transfer.sh
#!/bin/bash
# Variables
AWS_ACCESS_KEY="YOUR_AWS_ACCESS_KEY"
AWS_SECRET_KEY="YOUR_AWS_SECRET_KEY"
PROJECT_ID="ccd-poc-project"
CREDS_FILE="aws-creds.json"
# Create the AWS credentials JSON file in the format expected by --source-creds-file
cat <<EOF > "$CREDS_FILE"
{
  "accessKeyId": "$AWS_ACCESS_KEY",
  "secretAccessKey": "$AWS_SECRET_KEY"
}
EOF
# Read bucket names and create transfer jobs
while IFS= read -r BUCKET_NAME || [[ -n "$BUCKET_NAME" ]]; do
  # Clean the bucket name and skip empty lines
  BUCKET_NAME=$(echo "$BUCKET_NAME" | tr -d '\r' | tr -d '[:space:]')
  if [[ -z "$BUCKET_NAME" ]]; then
    continue
  fi
  echo "Creating transfer job for S3 bucket: $BUCKET_NAME"
  JOB_NAME=$(gcloud transfer jobs create s3://"$BUCKET_NAME" gs://"$BUCKET_NAME" \
    --source-auth-method=AWS_SIGNATURE_V4 \
    --source-creds-file="$CREDS_FILE" \
    --schedule-repeats-every=1d \
    --project="$PROJECT_ID" \
    --format="value(name)")
  if [[ -n "$JOB_NAME" ]]; then
    echo "Transfer job created successfully: $JOB_NAME"
  else
    echo "Failed to create transfer job for $BUCKET_NAME"
  fi
done < bucket_names.txt
# Remove credentials file for security
rm "$CREDS_FILE"
echo "All transfer jobs have been processed."
Explanation:
- Generates a temporary AWS credentials file for the transfer jobs.
- Reads bucket names and creates a daily-repeating transfer job for each bucket.
- Confirms each job was created by checking that gcloud returned a job name.
- Deletes the credentials file after execution so the keys are not left on disk.
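Once the jobs exist, they can be monitored from the Google Cloud console or directly from the CLI. A quick sketch using the gcloud transfer command group (the job name below is a placeholder):
# List all transfer jobs in the project
gcloud transfer jobs list --project="ccd-poc-project"
# Show recent runs (operations) for one job
gcloud transfer operations list --job-names=transferJobs/1234567890 --project="ccd-poc-project"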
Step 3: Running the Migration
To execute the scripts, follow these steps:
- Save the S3 bucket names (one per line) in a file named bucket_names.txt.
- Run the GCS bucket creation script:
chmod +x create_gcs_bucket.sh
./create_gcs_bucket.sh
- Run the S3-to-GCS transfer script:
chmod +x s3_to_gcs_transfer.sh
./s3_to_gcs_transfer.sh
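After the first transfer run completes, it is worth spot-checking a bucket or two. A simple sanity check, assuming the AWS CLI is also installed (the bucket name is a placeholder); the object counts should match for a completed transfer:
# Count objects on the S3 side
aws s3 ls s3://app-logs-prod --recursive | wc -l
# Count objects on the GCS side
gcloud storage ls "gs://app-logs-prod/**" | wc -l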
Conclusion
By automating S3 to GCS migration, we:
- Eliminated manual effort for creating 200+ buckets.
- Ensured accurate and efficient data transfers.
- Scheduled daily syncs for incremental updates.
This solution scales easily and can be modified to include advanced features like logging, monitoring, and notifications.
If you found this guide helpful, feel free to share your thoughts and experiences in the comments. Happy migrating!