Deploying kumiho-FastAPI to Google Cloud Run & Cloudflare Edge
This guide covers deploying the stateless FastAPI service to Google Cloud Run in the us-central1 region, with Cloudflare Workers acting as the edge cache and routing layer for api.kumiho.cloud.
Architecture
User → Cloudflare Worker (api.kumiho.cloud) → Google Cloud Run (Origin)
Prerequisites
Google Cloud Account with billing enabled
Cloudflare Account with the kumiho.cloud zone
gcloud CLI and Wrangler CLI installed
A GCP project (we use kumiho-server)
CI/CD with GitHub Actions
The repository includes a workflow that automatically deploys both the Cloud Run origin and the Cloudflare Worker on push to main.
Required GitHub Secrets
| Secret | Description |
|---|---|
| | |
| | Full path to the Workload Identity Provider |
| | Service account email for deployment |
| | API Token with Workers deployment permissions |
| | Your Cloudflare Account ID |
Step-by-Step GCP Setup (Workload Identity Federation)
We use Workload Identity Federation to allow GitHub Actions to authenticate without long-lived JSON keys.
1. Create Service Account
gcloud iam service-accounts create github-deployer \
--display-name="GitHub Actions Deployer" \
--project=kumiho-server
2. Grant Required Roles
PROJECT_ID=kumiho-server
SA_EMAIL=github-deployer@${PROJECT_ID}.iam.gserviceaccount.com
# Cloud Run Admin
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.admin"
# Artifact Registry Writer
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/artifactregistry.writer"
# Service Account User (to act as the runtime SA)
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
3. Configure Workload Identity Federation
# Create Workload Identity Pool
gcloud iam workload-identity-pools create "github-pool" \
--project="${PROJECT_ID}" \
--location="global" \
--display-name="GitHub Actions Pool"
# Get the pool's full resource name (projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/github-pool)
POOL_ID=$(gcloud iam workload-identity-pools describe "github-pool" \
--project="${PROJECT_ID}" \
--location="global" \
--format='value(name)')
# Create Workload Identity Provider
# The attribute condition limits which GitHub tokens the provider accepts; replace OWNER/REPO
gcloud iam workload-identity-pools providers create-oidc "github-provider" \
--project="${PROJECT_ID}" \
--location="global" \
--workload-identity-pool="github-pool" \
--display-name="GitHub Actions Provider" \
--attribute-mapping="google.subject=assertion.sub,attribute.actor=assertion.actor,attribute.repository=assertion.repository" \
--attribute-condition="assertion.repository == 'OWNER/REPO'" \
--issuer-uri="https://token.actions.githubusercontent.com"
# Allow GitHub to impersonate the Service Account
# Replace OWNER/REPO with your actual repository path
gcloud iam service-accounts add-iam-policy-binding "${SA_EMAIL}" \
--project="${PROJECT_ID}" \
--role="roles/iam.workloadIdentityUser" \
--member="principalSet://iam.googleapis.com/${POOL_ID}/attribute.repository/OWNER/REPO"
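The binding above and the provider secret both rely on resource names derived from the pool name stored in POOL_ID. As a minimal sketch of how those strings are composed (the helper names `provider_path` and `repo_member` are hypothetical, and the project number `123456789` is a made-up placeholder):

```shell
# Sketch only: compose the resource names used with Workload Identity Federation.
# POOL_NAME has the shape returned by `describe ... --format='value(name)'`;
# the project number below is a placeholder, not a real project.

# provider_path POOL_NAME PROVIDER_ID -> full provider resource name
provider_path() {
  printf '%s/providers/%s\n' "$1" "$2"
}

# repo_member POOL_NAME OWNER/REPO -> principalSet member for one repository
repo_member() {
  printf 'principalSet://iam.googleapis.com/%s/attribute.repository/%s\n' "$1" "$2"
}

POOL_NAME="projects/123456789/locations/global/workloadIdentityPools/github-pool"
provider_path "$POOL_NAME" "github-provider"
repo_member "$POOL_NAME" "OWNER/REPO"
```

The first line printed has the shape expected for the "Full path to the Workload Identity Provider" secret; the second matches the --member value used in the binding.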
Cloudflare Setup
1. DNS Configuration
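Point api.kumiho.cloud at a placeholder address (e.g., an AAAA record for 100::) and make sure the record is Proxied (Orange Cloud). The Worker route handles every request, so the placeholder address is never contacted.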
2. Worker Configuration
The worker in kumiho-FastAPI/worker is configured via wrangler.toml.
# kumiho-FastAPI/worker/wrangler.toml
routes = [
{ pattern = "api.kumiho.cloud/*", zone_name = "kumiho.cloud" }
]
The ORIGIN_URL variable in the worker must point to the Cloud Run service URL. This is handled automatically by the CI/CD workflow.
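The workflow's own steps are not reproduced in this guide, so the following is only a sketch of how such an update could be scripted. The function name `set_origin_url` is hypothetical, and it assumes ORIGIN_URL appears in wrangler.toml as a quoted value on a single line:

```shell
# Sketch, not the repository's workflow: rewrite ORIGIN_URL in a wrangler.toml.
# set_origin_url FILE URL
set_origin_url() {
  file=$1
  url=$2
  tmp=$(mktemp) || return 1
  # Replace the quoted value on the ORIGIN_URL line; other lines pass through.
  # Writing back with cat keeps the original file's permissions.
  sed "s|ORIGIN_URL = \".*\"|ORIGIN_URL = \"${url}\"|" "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# In CI the URL could be read from the deployed service, for example:
#   URL=$(gcloud run services describe kumiho-fastapi \
#     --region us-central1 --format 'value(status.url)')
#   set_origin_url wrangler.toml "$URL"
```

Going through a temporary file avoids `sed -i`, whose syntax differs between GNU and BSD sed. The substitution would break on a URL containing `|` or `&`, which Cloud Run service URLs normally do not.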
Manual Deployment (Emergency Only)
1. Build and Push Origin
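cd kumiho-FastAPI
# One-time setup: let Docker authenticate to Artifact Registry
gcloud auth configure-docker us-central1-docker.pkg.dev
docker build -t us-central1-docker.pkg.dev/kumiho-server/kumiho-server/kumiho-fastapi:latest .
docker push us-central1-docker.pkg.dev/kumiho-server/kumiho-server/kumiho-fastapi:latest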
2. Deploy Origin
gcloud run deploy kumiho-fastapi \
--image us-central1-docker.pkg.dev/kumiho-server/kumiho-server/kumiho-fastapi:latest \
--region us-central1 \
--allow-unauthenticated
3. Deploy Worker
cd kumiho-FastAPI/worker
# Set the ORIGIN_URL to the URL from step 2
sed -i 's|ORIGIN_URL = ".*"|ORIGIN_URL = "https://kumiho-fastapi-xxxxx-uc.a.run.app"|' wrangler.toml
npx wrangler deploy
View Logs
gcloud run services logs read kumiho-fastapi --region us-central1
Or use the Cloud Console:
https://console.cloud.google.com/run/detail/us-central1/kumiho-fastapi/logs
View Metrics
https://console.cloud.google.com/run/detail/us-central1/kumiho-fastapi/metrics
Key metrics to monitor:
Request latency (p50, p95, p99)
Request count by status code
Instance count
Memory/CPU utilization
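To make the percentile labels concrete, here is a small sketch that computes a nearest-rank percentile from a list of latency samples. The Cloud Console derives its own percentiles; the `percentile` helper and the sample values below are made up for illustration only:

```shell
# percentile P : nearest-rank percentile of newline-separated numbers on stdin
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p/100 * NR) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Ten made-up latency samples in milliseconds.
printf '%s\n' 120 95 101 250 99 110 105 98 102 400 | percentile 95   # prints 400
```

With these samples the p50 is 102 ms while the p95 is 400 ms, which is why tail percentiles are worth watching alongside the median.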
Rollback
To roll back to a previous revision:
# List revisions
gcloud run revisions list --service kumiho-fastapi --region us-central1
# Route traffic to a specific revision
gcloud run services update-traffic kumiho-fastapi \
--region us-central1 \
--to-revisions kumiho-fastapi-00005-abc=100
Cost Optimization
With request-based billing, Cloud Run charges only while instances are handling requests (approximate rates):
Idle cost: $0 (with min-instances=0)
Request cost: ~$0.40 per million requests
Compute cost: ~$0.00002400 per vCPU-second
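As a rough worked example using only the two rates listed above, the sketch below estimates a monthly bill. It is an upper-bound approximation: the `estimate_cost` helper is hypothetical, it assumes each request is billed for its full duration with no concurrency, and it ignores memory charges and any free tier.

```shell
# estimate_cost REQUESTS AVG_SECONDS VCPUS -> dollars, two decimals
estimate_cost() {
  awk -v n="$1" -v s="$2" -v c="$3" 'BEGIN {
    request_cost = n / 1000000 * 0.40     # ~$0.40 per million requests
    compute_cost = n * s * c * 0.000024   # ~$0.000024 per vCPU-second
    printf "%.2f\n", request_cost + compute_cost
  }'
}

# One million requests, 200 ms each, on 1 vCPU: 0.40 + 4.80
estimate_cost 1000000 0.2 1   # prints 5.20
```

Under these assumptions compute time, not the per-request fee, dominates the bill, so reducing request duration matters more than reducing request count.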
For predictable traffic, consider setting --min-instances 1 to avoid cold starts. This keeps one instance warm at all times, so the idle cost is no longer $0:
gcloud run services update kumiho-fastapi \
--region us-central1 \
--min-instances 1
Troubleshooting
Cold Start Latency
If cold starts are slow:
Set --min-instances 1
Use a smaller base image (already using python:3.11-slim)
Reduce dependencies in requirements.txt
Container Fails to Start
Check logs:
gcloud run services logs read kumiho-fastapi --region us-central1 --limit 50
Common issues:
Port mismatch (ensure --port 8000 matches Dockerfile EXPOSE 8000)
Missing environment variables
Dependency installation failures
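The port check can be scripted. The sketch below reads the first EXPOSE instruction from a Dockerfile and compares it with the port you intend to pass to Cloud Run. The helper names are hypothetical, and note that EXPOSE is documentation only: the application itself must still listen on the configured port.

```shell
# dockerfile_port FILE -> first port named by an EXPOSE instruction
dockerfile_port() {
  awk 'toupper($1) == "EXPOSE" { sub(/\/.*/, "", $2); print $2; exit }' "$1"
}

# check_port FILE PORT -> succeeds when the Dockerfile exposes PORT
check_port() {
  [ "$(dockerfile_port "$1")" = "$2" ]
}

# Example usage from the directory containing the Dockerfile:
#   check_port Dockerfile 8000 || echo "EXPOSE does not match --port 8000"
```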
Permission Denied
Check which service account the service runs as, then confirm that account has access to the required resources:
gcloud run services describe kumiho-fastapi \
--region us-central1 \
--format 'value(spec.template.spec.serviceAccountName)'