February 20, 2025


Introduction

Cloud storage plays a crucial role in modern infrastructure, providing scalable and reliable storage solutions. Many businesses migrate from AWS S3 to Google Cloud Storage (GCS) to leverage cost benefits, integration with Google Cloud services, or optimize their cloud strategies. However, when dealing with hundreds of S3 buckets, manual migration is inefficient and time-consuming.

To streamline the process, I automated the migration using Bash scripts and Google Cloud’s Storage Transfer Service. In this blog, I’ll walk you through the steps of automating S3 to GCS migration efficiently.

Why Automate S3 to GCS Migration?

Handling 200+ S3 buckets manually would involve:

  • Repetitive tasks – Creating GCS buckets, setting permissions, and transferring data for each bucket.
  • Human errors – Misconfiguration, incorrect bucket names, or missing files.
  • Time-consuming process – Manual intervention would take days to complete.

By automating this process, we can:

  • Save time – Script execution takes minutes instead of hours or days.
  • Eliminate errors – Ensures every S3 bucket is transferred correctly.
  • Enable monitoring & scheduling – Automate recurring data transfers with Google’s Storage Transfer Service.

Prerequisites

Before running the scripts, ensure you have:

  • A Google Cloud Project with billing enabled.
  • An AWS IAM user with s3:ListBucket and s3:GetObject permissions.
  • The Google Cloud SDK (gcloud CLI) installed on your local machine.
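
Before starting, it may help to confirm the required CLIs are actually on your PATH. A minimal sanity check, assuming only a POSIX shell (the aws CLI is optional for the transfer itself but handy for verification):

```shell
# Check that the CLIs used in this guide are installed.
REQUIRED_TOOLS="gcloud aws"
MISSING=""
for tool in $REQUIRED_TOOLS; do
    command -v "$tool" >/dev/null 2>&1 || MISSING="$MISSING $tool"
done
if [ -n "$MISSING" ]; then
    echo "Missing tools:$MISSING"
else
    echo "All prerequisite tools found"
fi
```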

Step 1: Creating Google Cloud Storage Buckets

Each S3 bucket requires a corresponding GCS bucket. The script below reads a list of bucket names from a file and creates them in GCP.

create_gcs_bucket.sh

#!/bin/bash

# Variables
PROJECT_ID="ccd-poc-project"          # Replace with your GCP project ID
BUCKET_LIST_FILE="bucket_names.txt"   # File containing bucket names
OUTPUT_FILE="created_buckets.txt"
REGION="us-central1"                  # Change if needed

# Check if the bucket list file exists
if [ ! -f "$BUCKET_LIST_FILE" ]; then
    echo "Error: Bucket names file '$BUCKET_LIST_FILE' not found!"
    exit 1
fi

# Read bucket names and create GCS buckets
while IFS= read -r BUCKET_NAME || [[ -n "$BUCKET_NAME" ]]; do
    if [[ -z "$BUCKET_NAME" ]]; then
        continue  # Skip empty lines
    fi

    # Clean bucket name of carriage returns and stray whitespace
    BUCKET_NAME=$(echo "$BUCKET_NAME" | tr -d '\r' | tr -d '[:space:]')

    echo "Creating bucket: $BUCKET_NAME"
    if gcloud storage buckets create "gs://$BUCKET_NAME" --location="$REGION" --project="$PROJECT_ID"; then
        echo "gs://$BUCKET_NAME" >> "$OUTPUT_FILE"
        echo "Bucket $BUCKET_NAME created successfully."
    else
        echo "Error: Failed to create bucket $BUCKET_NAME"
    fi
done < "$BUCKET_LIST_FILE"

  Explanation:

  • Reads bucket names from bucket_names.txt.
  • Cleans up any unnecessary whitespace.
  • Creates GCS buckets with the specified region.
  • Stores created bucket names in created_buckets.txt.
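
If you do not already have the list, bucket_names.txt can be generated from AWS itself. The sketch below assumes a configured aws CLI; the fallback names (my-app-logs, etc.) are made-up placeholders so the rest of the walkthrough can be dry-run:

```shell
# Build bucket_names.txt: one bucket name per line.
if command -v aws >/dev/null 2>&1 \
   && aws s3api list-buckets --query 'Buckets[].Name' --output text > /tmp/buckets.raw 2>/dev/null; then
    # list-buckets prints tab-separated names; convert to one per line
    tr '\t' '\n' < /tmp/buckets.raw > bucket_names.txt
else
    # Hypothetical sample names used only when the aws CLI is unavailable
    printf '%s\n' my-app-logs my-app-assets my-app-backups > bucket_names.txt
fi
echo "Wrote bucket_names.txt"
```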

Step 2: Automating Data Transfer from S3 to GCS

After creating the required GCS buckets, the next step is to automate data transfer using the gcloud transfer jobs command.

s3_to_gcs_transfer.sh

#!/bin/bash

# Variables
AWS_ACCESS_KEY="YOUR_AWS_ACCESS_KEY"
AWS_SECRET_KEY="YOUR_AWS_SECRET_KEY"
PROJECT_ID="ccd-poc-project"
CREDS_FILE="aws-creds.json"

# Create AWS credentials JSON file, readable only by the current user
cat <<EOF > "$CREDS_FILE"
{
  "awsAccessKeyId": "$AWS_ACCESS_KEY",
  "awsSecretAccessKey": "$AWS_SECRET_KEY"
}
EOF
chmod 600 "$CREDS_FILE"

# Read bucket names and create transfer jobs
while IFS= read -r BUCKET_NAME || [[ -n "$BUCKET_NAME" ]]; do
  BUCKET_NAME=$(echo "$BUCKET_NAME" | tr -d '[:space:]')
  [[ -z "$BUCKET_NAME" ]] && continue  # Skip empty lines

  echo "Creating transfer job for S3 bucket: $BUCKET_NAME"
  JOB_NAME=$(gcloud transfer jobs create "s3://$BUCKET_NAME" "gs://$BUCKET_NAME" \
    --source-auth-method=AWS_SIGNATURE_V4 \
    --source-creds-file="$CREDS_FILE" \
    --schedule-repeats-every=1d \
    --project="$PROJECT_ID" \
    --format="value(name)")

  if [[ -n "$JOB_NAME" ]]; then
    echo "Transfer job created successfully: $JOB_NAME"
  else
    echo "Failed to create transfer job for $BUCKET_NAME"
  fi
done < bucket_names.txt

# Remove credentials file for security
rm "$CREDS_FILE"

echo "Finished creating transfer jobs."

  Explanation:

  • Writes the AWS credentials to a JSON file and restricts its permissions.
  • Reads each bucket name and creates a daily-repeating transfer job for it.
  • Reports success or failure per bucket, based on whether gcloud returned a job name.
  • Deletes the credentials file after execution for security.
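
Once the jobs exist, Storage Transfer Service runs them on the daily schedule, and their status can be checked from the CLI. A guarded sketch (the subcommands are standard gcloud transfer commands; the block simply prints them when gcloud is unavailable):

```shell
# Inspect transfer jobs and their recent operations.
LIST_JOBS_CMD='gcloud transfer jobs list --format="table(name,status)"'
LIST_OPS_CMD='gcloud transfer operations list --limit=5'
if command -v gcloud >/dev/null 2>&1; then
    eval "$LIST_JOBS_CMD" || true   # may fail if not authenticated
    eval "$LIST_OPS_CMD" || true
else
    echo "gcloud not available; run these manually:"
    echo "  $LIST_JOBS_CMD"
    echo "  $LIST_OPS_CMD"
fi
```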

Step 3: Running the Migration

To execute the scripts, follow these steps:

  1. Save the S3 bucket names in a file named bucket_names.txt.
  2. Run the GCS bucket creation script:

chmod +x create_gcs_bucket.sh
./create_gcs_bucket.sh

  3. Run the S3-to-GCS transfer script:

chmod +x s3_to_gcs_transfer.sh
./s3_to_gcs_transfer.sh

Conclusion

By automating S3 to GCS migration, we:

  • Eliminated manual effort for creating 200+ buckets.
  • Ensured accurate and efficient data transfers.
  • Scheduled daily syncs for incremental updates.
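
Accuracy can be spot-checked per bucket by comparing object counts on both sides. A guarded sketch, where the bucket name is a hypothetical placeholder to replace with one of yours:

```shell
# Compare object counts for one migrated bucket.
BUCKET="my-app-logs"   # hypothetical example bucket
if command -v aws >/dev/null 2>&1 && command -v gcloud >/dev/null 2>&1; then
    S3_COUNT=$(aws s3 ls "s3://$BUCKET" --recursive 2>/dev/null | wc -l | tr -d ' ')
    GCS_COUNT=$(gcloud storage ls -r "gs://$BUCKET/**" 2>/dev/null | wc -l | tr -d ' ')
    echo "S3: $S3_COUNT objects, GCS: $GCS_COUNT objects"
else
    echo "aws/gcloud not available; skipping count comparison for $BUCKET"
fi
```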

This solution scales easily and can be modified to include advanced features like logging, monitoring, and notifications.

If you found this guide helpful, feel free to share your thoughts and experiences in the comments. Happy migrating!

S3 to GCS, Cloud Migration, Bash Automation, GCS Transfer

Automating S3 to GCS Migration Using Bash Scripts
