Effortless Data Migration — Multi TB migration from S3 to GCS

Shubhangi Thakur
Published in The Cloudside View · 4 min read · Mar 14, 2024


We recently migrated multiple terabytes of data from S3 to a GCS bucket for one of our clients, and in this article we will take you through the journey. The migration was carried out securely over a VPN/Interconnect connection.

Let’s configure the Storage Transfer Service environment and start the data transfer process:

Step-1: Prerequisites

Before we begin the setup, ensure you have completed the following prerequisites:

  1. A VPN connection between AWS and Google Cloud.

  2. A service account with the required roles, attached to the private VM (a gcloud sketch follows this list):

     - Storage Object Admin
     - Storage Transfer Admin
     - Storage Transfer Agent

  3. A private VM in the same Google Cloud VPC that is connected to AWS over the VPN.
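
If you prefer the CLI, the sketch below shows one way to set this up with gcloud. The project ID (my-project), service account name (sts-agent-sa), VM name (transfer-vm), and zone are hypothetical placeholders, not values from our setup:

# Create the service account (all names here are hypothetical placeholders)
gcloud iam service-accounts create sts-agent-sa --project=my-project

# Grant the three roles listed above
for role in roles/storage.objectAdmin roles/storagetransfer.admin roles/storagetransfer.transferAgent; do
  gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:sts-agent-sa@my-project.iam.gserviceaccount.com" \
    --role="$role"
done

# Attach the service account to the private VM (the VM must be stopped first)
gcloud compute instances set-service-account transfer-vm \
  --zone=us-central1-a \
  --service-account=sts-agent-sa@my-project.iam.gserviceaccount.com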

On the AWS side:

Step-2: Creating a VPC interface endpoint in AWS

VPC endpoint: In AWS, a VPC (Virtual Private Cloud) endpoint allows secure communication between instances within your VPC and AWS services without requiring traffic to traverse the public internet; it provides a private connection to supported AWS services from within your VPC.

  1. Log in to the AWS console -> click Create endpoint.

  2. Give it a name -> select AWS services -> select the S3 regional service -> choose the Interface type.

  3. Select the same VPC -> enable DNS names -> select the same subnet.

  4. Select a security group -> click Create.

  5. In this security group, remove the default outbound rule and allow the GCP subnet IP range in the inbound rules (an AWS CLI sketch follows this list).
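
The same endpoint can also be created with the AWS CLI. This is a minimal sketch: the VPC, subnet, and security group IDs are placeholders, and us-east-1 stands in for your region:

# Create an S3 interface endpoint (IDs below are hypothetical placeholders)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.s3 \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0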

On the GCP side:

Step-3: Test the endpoint to see if it’s accessible from the Google Cloud VM on port 443. Use telnet as follows:

telnet *bucketname*.bucket.vpce-xxxx-xxxx.s3.*region-name*.vpce.amazonaws.com 443

You should immediately receive a response similar to this:

Trying 172.31.35.33...
Connected to *bucketname*.bucket.vpce-xxxx-xxxx.s3.*region-name*.vpce.amazonaws.com.
Escape character is '^]'.
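
If telnet is not available on the VM, netcat gives an equivalent reachability check against the same endpoint DNS name:

# -v verbose, -z probe the port without sending data
nc -vz *bucketname*.bucket.vpce-xxxx-xxxx.s3.*region-name*.vpce.amazonaws.com 443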

Step-4: Creating an agent pool for the transfer job:

  1. In the GCP console, go to Data transfer -> Agent pools. You can install agents in the default pool or create a new agent pool.

  2. Click Create pool -> enter a pool name -> set the bandwidth limit -> No limit (a gcloud sketch follows below).

Note: you can set a bandwidth limit if required for the transfer.
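
For reference, the equivalent gcloud command is sketched below; the pool name my-transfer-pool is a placeholder:

# Create an agent pool; omit --bandwidth-limit to leave it unlimited
gcloud transfer agent-pools create my-transfer-pool \
  --display-name="my-transfer-pool"
# To cap throughput instead, add e.g. --bandwidth-limit=100 (MB/s)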

Step-5: Installing the transfer agents inside the Compute VM:

  1. Log in to the VM and execute the following commands to install Docker and authenticate Docker with your Google Cloud credentials.

# Install Docker and enable it at boot
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo systemctl enable docker
# Generate Application Default Credentials inside a Cloud SDK container
sudo docker run -ti --name gcloud-config google/cloud-sdk gcloud auth application-default login

  2. Navigate to Data transfer > Agent pools.

  3. After selecting the pool, click Install agent.

  4. Number of agents: 1 -> give an agent name -> select S3-compatible object storage.

  5. Keep your AWS Access Key ID and Secret Access Key handy; the key should have read permissions on the S3 bucket.

  6. Go to the service account attached to the VM -> create a JSON key and store the key file on the Compute Engine instance where the agents will be installed.

  7. Provide the JSON key file path -> copy the generated commands and run them inside the VM to install the agent (a CLI sketch follows this list).

  8. To check that the agent is running, use the following Docker commands:

sudo systemctl status docker
sudo docker ps

The Agent pools page in the console should now show the agent as running and connected.
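
The console generates the exact install commands for you; a rough sketch of what they look like, reusing the placeholder pool name and a placeholder key path, is:

# AWS credentials for the S3-compatible source (placeholders)
export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key

# Install one agent, authenticating with the service account JSON key
gcloud transfer agents install \
  --pool=my-transfer-pool \
  --count=1 \
  --creds-file=/path/to/key.json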

Step-6: Creating a transfer job:

  1. Go to Create transfer job -> choose the source type as S3-compatible object storage -> select the agent pool where the agents are installed.

  2. In Bucket or folder, type the name of the S3 bucket -> provide the VPC endpoint -> in Signing region, type the AWS region where the bucket is located.

  3. Choose a destination: browse to the destination bucket and folder path.

  4. In the next step, choose whether to run the job immediately or on a schedule. Optionally, review the remaining settings. After configuring your desired settings, click Create to start the transfer (a CLI sketch follows this list).

  5. Click on the job to check the status of the transfer. The transfer duration will depend on the size of the objects in the source bucket.
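
The same job can be created from the CLI. A minimal sketch, assuming the placeholder endpoint, pool, and bucket names used earlier (swap in your own values):

# Create the transfer job from the S3-compatible source to GCS
gcloud transfer jobs create s3://source-bucket gs://destination-bucket \
  --source-agent-pool=my-transfer-pool \
  --source-endpoint=bucket.vpce-xxxx-xxxx.s3.*region-name*.vpce.amazonaws.com \
  --source-signing-region=*region-name*

# Watch progress from the terminal
gcloud transfer jobs monitor <job-name>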

Once the job finishes, the console shows the data transfer as successfully completed.

Hope you find it helpful! Keep learning, until next time :)
