Autoscaling on Google Cloud

The MonkeysLegion stack runs great on three GCP primitives—pick the one that matches your traffic profile and ops comfort level.

| Option | When to choose | Highlights |
| --- | --- | --- |
| Cloud Run | HTTP-only apps, <30-min requests | Per-request billing, zero-to-N autoscale, HTTPS out of the box |
| Managed Instance Group (MIG) | Stateful uploads, WebSockets, long-lived workers | VM-level control, custom machine types, autoscaling on CPU or load-balancer utilization |
| Google Kubernetes Engine (GKE) | Multiple services, background queues, complex networking | HorizontalPodAutoscaler, service mesh, spot node pools |

1 · Cloud Run (simplest)

Dockerfile

FROM php:8.2-cli

# Composer is not bundled with the official PHP image; copy the binary in
COPY --from=composer:2 /usr/bin/composer /usr/local/bin/composer

WORKDIR /app
COPY . /app

RUN composer install --no-dev --optimize-autoloader \
    && php vendor/bin/ml config:cache \
    && php vendor/bin/ml view:clear

CMD ["php", "-S", "0.0.0.0:8080", "-t", "public"]
gcloud run deploy myapp \
  --region=us-central1 \
  --source=. \
  --allow-unauthenticated \
  --min-instances=0 \
  --max-instances=20 \
  --set-env-vars "APP_ENV=prod,DB_HOST=/cloudsql/myproj:us-central1:mydb,JWT_SECRET=$(cat jwt.pem)"
  • Scales to zero when idle and up to 20 containers (adjustable).

  • Concurrency defaults to 80 requests per container; tune it (e.g. --concurrency=16) for predictable memory use, as sketched below.

  • Avoid inlining secrets such as the JWT key via --set-env-vars; they end up in shell history and the revision's config. Load them from Secret Manager instead (section 5).
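
Concurrency can be changed on the deployed service without rebuilding the image; a quick sketch using the service from above:

gcloud run services update myapp \
  --region=us-central1 \
  --concurrency=16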

2 · Compute Engine MIG + Load Balancer

Terraform snippet

resource "google_compute_instance_template" "ml" {
  name        = "ml-template"
  machine_type= "e2-micro"
  metadata_startup_script = <<-SH
    #!/bin/bash
    cd /var/www/myapp
    git pull
    composer install --no-dev --optimize-autoloader
    php vendor/bin/ml migrate --env=prod
    systemctl restart php-fpm nginx
  SH
  disk {
    source_image = "projects/debian-cloud/global/images/family/debian-12"
    auto_delete  = true
    boot         = true
  }
}

# Referenced by the MIG's auto-healing policy; reusable by the load balancer
resource "google_compute_health_check" "app" {
  name               = "ml-health"
  check_interval_sec = 10
  http_health_check {
    port         = 80
    request_path = "/healthz"
  }
}

resource "google_compute_region_instance_group_manager" "ml" {
  name               = "ml-group"
  base_instance_name = "ml"
  region             = "us-central1"
  versions {
    instance_template = google_compute_instance_template.ml.id
  }
  target_size = 2
  auto_healing_policies {
    health_check = google_compute_health_check.app.id
    initial_delay_sec = 90
  }
}

resource "google_compute_autoscaler" "ml" {
  name   = "ml-auto"
  region = "us-central1"
  target = google_compute_region_instance_group_manager.ml.id
  autoscaling_policy {
    max_replicas    = 10
    min_replicas    = 2
    cpu_utilization {
      target = 0.6
    }
  }
}
  • Horizontal scaling based on average CPU ≥ 60 %.

  • Put a Global HTTP(S) Load Balancer in front; its backend service can reuse the google_compute_health_check.app resource above, which probes /healthz.
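
After terraform apply, you can confirm the group and its instances are live (names taken from the snippet above):

gcloud compute instance-groups managed describe ml-group --region=us-central1
gcloud compute instance-groups managed list-instances ml-group --region=us-central1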

3 · GKE with HorizontalPodAutoscaler

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml
  template:
    metadata:
      labels:
        app: ml
    spec:
      containers:
      - name: php
        image: gcr.io/myproj/ml:1.4.0
        ports: [{ containerPort: 8080 }]
        resources:
          requests: { cpu: "100m", memory: "128Mi" }
          limits:   { cpu: "500m", memory: "512Mi" }
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ml-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ml
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  • Scales pods between 2 and 15 based on CPU.

  • Add Prometheus metrics + the Custom Metrics API if you prefer QPS-based scaling (e.g. a requests-per-second metric).

  • Run the Cloud SQL Auth Proxy as a sidecar container to reach managed databases.
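
If you prefer the imperative route, roughly the same HPA can be created with kubectl once the Deployment is applied:

kubectl autoscale deployment ml --cpu-percent=60 --min=2 --max=15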

4 · Sticky uploads & cache

If you store user-uploaded files or cache to disk, mount Filestore or Cloud Storage FUSE; both work with MIGs and GKE. Cloud Run's local filesystem is an in-memory, per-instance tmpfs, so persist uploads to Cloud Storage there.
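
A minimal sketch of the FUSE route on a MIG VM, assuming gcsfuse is installed in the image; the bucket and path below are placeholders:

# Mount the uploads bucket where the app expects local files
mkdir -p /var/www/myapp/storage/uploads
gcsfuse my-uploads-bucket /var/www/myapp/storage/uploads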

5 · Secrets and config

  • Store runtime secrets in Secret Manager; expose them via --set-secrets (Cloud Run) or reference synced Kubernetes Secrets with valueFrom: secretKeyRef (GKE).

  • Keep .mlc config in your repo and override sensitive keys with env vars (env("DB_PASS")).

  • The JWT private key can live in Secret Manager, mounted at /var/secrets/jwt.pem; a sketch follows.
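
A sketch of that flow on Cloud Run; the secret name jwt-key is an arbitrary choice:

# Store the signing key once
gcloud secrets create jwt-key --data-file=jwt.pem

# Mount the latest version as a file at the path the app reads
# (the runtime service account needs roles/secretmanager.secretAccessor)
gcloud run services update myapp \
  --region=us-central1 \
  --set-secrets=/var/secrets/jwt.pem=jwt-key:latest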

6 · Zero-downtime deploys

| Platform | Strategy |
| --- | --- |
| Cloud Run | Default gradual rollout (traffic split), or --revision-suffix $SHA + manual promotion (see the sketch below) |
| MIG | instance_group_manager supports rolling updates (update_policy with minimal_action = "REPLACE") |
| GKE | kubectl rollout restart deployment/ml triggers the rolling strategy (configurable maxUnavailable) |
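
For the Cloud Run row, a sketch of a no-traffic deploy plus manual promotion; $SHA stands for your build's commit hash:

# Deploy a new revision without shifting traffic to it
gcloud run deploy myapp --source=. --region=us-central1 \
  --revision-suffix=$SHA --no-traffic

# Promote once smoke tests pass (revisions are named SERVICE-SUFFIX)
gcloud run services update-traffic myapp --region=us-central1 \
  --to-revisions=myapp-$SHA=100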

7 · Observability

| Signal | GCP product | MonkeysLegion tie-in |
| --- | --- | --- |
| Metrics | Cloud Monitoring (Prometheus integration) | /metrics endpoint via the Telemetry package |
| Logs | Cloud Logging | Monolog GoogleCloudLoggingHandler |
| Traces | Cloud Trace | OpenTelemetry exporter (community recipe) |
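
To verify logs are flowing end to end (Cloud Run shown; swap resource.type for GCE or GKE):

gcloud logging read 'resource.type="cloud_run_revision"' --limit=10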

8 · Cost tips

  • Cloud Run: keep the default CPU allocation (CPU only during request processing) and min instances = 0 for side projects.

  • MIG: prefer e2-micro for steady low traffic; add a preemptible (Spot) instance group for bursty tasks.

  • GKE: use Autopilot, or enable the cluster autoscaler plus spot node pools (see the sketch below).
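
A sketch of the spot-pool approach on an existing Standard cluster; the cluster name and node limits are placeholders:

gcloud container node-pools create spot-pool \
  --cluster=my-cluster --region=us-central1 \
  --spot --enable-autoscaling --min-nodes=0 --max-nodes=5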