Deploying ML models and other Python apps to the cloud can be tedious. Compute instances need to be provisioned; networking needs to be sorted out; autoscaling needs to be configured; secrets and credentials need to be safely managed.
Rather than spending hours on these DevOps tasks (don’t get me wrong, DevOps and MLOps are important), I would like to focus on modeling: recipes that produce the best models and make them available for people to use. After years and many projects, I have found Google Cloud Run to be a low-maintenance solution, with CI/CD managed by GitHub Actions. Similar solutions can be had with AWS ECS and Azure Container Instances, but this post will focus on Cloud Run.
Prerequisites
To follow along with the tutorial, you need:
Sample App
Let’s start with a very simple HTTP server:
docker run --rm -it -p 801:801 python:3.8-slim python -m http.server 801 -d /home/
With the container running, we can verify that it works by visiting localhost:801 in a browser.
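The browser check can also be scripted, which is handy later when the same check is pointed at the deployed service URL. The following is a minimal sketch using only the Python standard library; the is_serving name and the 5-second timeout are illustrative assumptions, not part of the original setup.

```python
import urllib.error
import urllib.request


def is_serving(url, timeout=5.0):
    """Return True if a GET request to url is answered with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, DNS failures and HTTP error codes
        # all count as "not serving" for this simple check.
        return False
```

For the local container above, the call would be is_serving("http://localhost:801/").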
Deploy to Cloud Run manually
However, the above Docker image does not quite work for Cloud Run, because Cloud Run requires the app in the image to listen on the port given by the PORT environment variable.
To solve this, we need to build a simple Docker image with the following Dockerfile:
FROM python:3.8-slim
ENV PORT=8080
CMD python -m http.server $PORT -d /home
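If the app is a custom Python server rather than python -m http.server, the same PORT convention can be handled in code. Here is a minimal sketch using only the standard library; the resolve_port and serve names and the 8080 fallback for local runs are illustrative assumptions.

```python
import os
from http.server import HTTPServer, SimpleHTTPRequestHandler


def resolve_port(environ=os.environ, default=8080):
    """Return the port to listen on: the PORT variable if set, else a local default."""
    return int(environ.get("PORT", default))


def serve():
    """Serve the current directory at the resolved port until interrupted."""
    # Bind to 0.0.0.0 so the server is reachable from outside the container.
    server = HTTPServer(("0.0.0.0", resolve_port()), SimpleHTTPRequestHandler)
    server.serve_forever()
```

Calling serve() from the container's entry point would then take the place of the python -m http.server $PORT command in the Dockerfile.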
Install gcloud and authenticate. Then build and deploy the image with the following script:
# Make sure to fill in the GCP project id:
project=your-gcp-project-id
app=example-app
platform=linux/amd64
region=us-central1
docker build --platform $platform -t example-app-image .
image=us.gcr.io/$project/$app:latest
docker tag example-app-image $image
docker push $image
gcloud run deploy $app --image $image --cpu 1 --memory 1Gi --min-instances 1 --region $region --allow-unauthenticated
Note that there are a couple of hard-coded defaults, such as the region (us-central1) and the image registry host (us.gcr.io). Feel free to adjust them.
If successful, we will see something like this:
Deploying container to Cloud Run service [example-app] in project [your-project-id] region [us-central1]
✓ Deploying new service... Done.
✓ Creating Revision...
✓ Routing traffic...
✓ Setting IAM Policy...
Done.
Service [example-app] revision [example-app-...] has been deployed and is serving 100 percent of traffic.
Manage secrets
If the app needs to access secrets such as API keys and passwords, it is necessary to store and manage them securely.
Create a secret in GCP’s Secret Manager, and grant the minimal necessary access.
Each secret is versioned. For example, we may create a secret and refer to it as MY_API_KEY:latest, where latest is the alias for its most recent version.
When using gcloud run deploy to deploy the app, pass in additional arguments:
--update-secrets=MY_API_KEY=MY_API_KEY:latest,OTHER_API_KEY=OTHER_API_KEY:latest
In the Docker container, the secret value will be made available in the environment variable MY_API_KEY.
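On the application side, the secret is then read like any other environment variable. A small sketch that fails fast when the variable is missing, so a misconfigured deployment surfaces at startup rather than on the first request; require_secret is a hypothetical helper name, not something Cloud Run provides.

```python
import os


def require_secret(name, environ=os.environ):
    """Return the secret exposed in the named environment variable.

    Raises RuntimeError if the variable is unset or empty.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```

At startup the app would call, for example, require_secret("MY_API_KEY").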
Set up a secure GitHub Action for continuous deployment
While manually running the gcloud command is sufficient to deploy the app to Cloud Run, it can make sense to set up continuous deployment triggered by GitHub push or release events.
Service account
First, we need to follow these instructions to create a service account and grant some permissions:
Go to IAM, click “grant access” and set:
- principal: the new service account just created
- role: Cloud Run Admin
- role: roles/artifactregistry.createOnPushWriter
- role: Secret Manager Secret Accessor
Grant the default compute-engine service account the Secret Manager Secret Accessor role. Go to IAM and set:
- principal: the default compute-engine service account
- role: Secret Manager Secret Accessor
Go to IAM/service accounts, click into the default compute-engine service account, then allow the new service account to use this compute-engine service account:
- principal: the new service account just created
- role: Service Account User
Docker artifacts repository
A docker artifacts repository must be created in the same project as the Cloud Run service (we assume the location is “us-central1”):
gcloud artifacts repositories create slack-llm --location=us-central1 --repository-format=docker
This artifacts repository will hold the docker image of the app.
Workload identity federation and keyless authentication
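For better cloud security, Google recommends setting up keyless authentication from GitHub Actions. To do that, we need to create a workload identity pool, add GitHub as an OIDC provider to it, and allow identities from that pool to impersonate the service account: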
# 1. Create a workload identity pool.
gcloud iam workload-identity-pools create "my-pool" \
  --project="${PROJECT_ID}" \
  --location="global" \
  --display-name="Demo pool" \
  --description="My Identity Pool"

# 2. Add GitHub as an OIDC provider in that pool.
#    attribute.repository is mapped so that the binding in step 3 can filter on it.
gcloud iam workload-identity-pools providers create-oidc "my-provider" \
  --project="${PROJECT_ID}" \
  --location="global" \
  --workload-identity-pool="my-pool" \
  --display-name="Demo provider" \
  --attribute-mapping="google.subject=assertion.sub,attribute.actor=assertion.actor,attribute.aud=assertion.aud,attribute.repository=assertion.repository" \
  --issuer-uri="https://token.actions.githubusercontent.com"

# 3. Allow workflows from the my-org/my-repo repository to impersonate the service account.
gcloud iam service-accounts add-iam-policy-binding "my-service-account@${PROJECT_ID}.iam.gserviceaccount.com" \
  --project="${PROJECT_ID}" \
  --role="roles/iam.workloadIdentityUser" \
  --member="principalSet://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/my-pool/attribute.repository/my-org/my-repo"
Alternatively, if we do not want to restrict the binding to a specific GitHub repo, we can bind every identity in the pool, which is considerably more permissive:
gcloud iam service-accounts add-iam-policy-binding "my-service-account@${PROJECT_ID}.iam.gserviceaccount.com" \
--project="${PROJECT_ID}" \
--role="roles/iam.workloadIdentityUser" \
--member="principalSet://iam.googleapis.com/projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/my-pool/*"
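The two bindings differ only in the tail of the --member value. The sketch below spells out that difference; wif_member is a hypothetical helper written for illustration, and the pool location is assumed to be global, as in the commands above.

```python
def wif_member(project_number, pool, repository=None):
    """Build the --member value for the roles/iam.workloadIdentityUser binding.

    With repository set (e.g. "my-org/my-repo"), only identities whose
    attribute.repository matches are included; with None, the whole pool is.
    """
    base = (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool}"
    )
    if repository is None:
        return f"{base}/*"
    return f"{base}/attribute.repository/{repository}"
```

Note that the value is built from the numeric project number, not the project id.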
GitHub secrets
Add the following GitHub secrets (see instructions on how to add secrets to a GitHub repo):
WIF_PROVIDER=projects/my-gcp-project-number/locations/global/workloadIdentityPools/my-pool/providers/my-provider
WIF_SERVICE_ACCOUNT=my-service-account@my-project.iam.gserviceaccount.com
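It is easy to mix up the two identifiers here: the provider path uses the numeric project number, while the service account email uses the project id. A small sketch that derives both values from their parts; github_secrets is a hypothetical helper and all argument values in the usage are placeholders.

```python
def github_secrets(project_number, project_id, pool, provider, service_account):
    """Return the two GitHub secret values as a dict keyed by secret name."""
    return {
        # The provider resource name is keyed by the numeric project number.
        "WIF_PROVIDER": (
            f"projects/{project_number}/locations/global"
            f"/workloadIdentityPools/{pool}/providers/{provider}"
        ),
        # The service account email is keyed by the project id.
        "WIF_SERVICE_ACCOUNT": (
            f"{service_account}@{project_id}.iam.gserviceaccount.com"
        ),
    }
```

For example, github_secrets("123456", "my-project", "my-pool", "my-provider", "my-service-account") yields values in the same shape as the two lines above.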
GitHub Action YAML file
Now we should be ready to set up the actual GitHub Action. This is a redacted version of my working GitHub Action YAML file:
Put this in .github/workflows/deploy.yml and the next time you push a change to main, it should automatically deploy to Cloud Run.
Enjoy!