Building a Secure, AI-Powered Blog Platform on Cloud Run — From Zero to Production

A security-first serverless platform that publishes via API, narrates posts with AI, generates its own social cards, and ships a hidden, post-grounded AI assistant. Here's how every piece fits — and the bugs that shaped it.

FastAPI · Cloud Run · Terraform · Gemini · Google ADK · GCS · Cloud Build · Cloud Armor · Security

1. Why I Built This

I wanted a blog I could publish to by automation — drop an HTML file at an endpoint and have it go live, daily, without touching a CMS. But "just a blog" quickly became a canvas for everything I care about as a Cloud Architect, Security Specialist, and AI Architect: how do you make an upload endpoint safe? How do you add AI without exploding cost? How do you run a public AI agent that can't be turned against you?

The result is a serverless platform where security is a first-class concern at every boundary, AI features run at publish time (not per-view), and the whole thing is reproducible from Terraform and shipped through CI/CD.

Design principle
Stateless compute, stateful storage, security at every boundary. Cloud Run holds no state — everything lives in GCS. Secrets never touch code. The compute has no public surface; all traffic flows through a load balancer with a WAF.

2. Architecture

 GitHub ──push──▶ Cloud Build ──build/push──▶ Artifact Registry
                  │ deploy                │ pull
 Custom Domain         ▼                    ▼
 blog.domain ─▶ Global LB + Cloud Armor (WAF + rate limit)
                  │ internal-ingress-only
                  ▼
        ┌──────────────────────────────────────┐
        │            Cloud Run (FastAPI)        │
        │ public blog · auth upload API ·       │
        │ audio · social cards · hidden agent   │
        └───┬──────────┬───────────┬────────────┘
            │          │           │
      ┌─────▼───┐  ┌───▼─────┐  ┌──▼──────────────┐
      │ GCS     │  │ Secret  │  │ Vertex AI       │
      │ posts/  │  │ Manager │  │ (Gemini) + TTS  │
      │ audio/  │  └─────────┘  └─────────────────┘
      │ cards/  │
      │ contact/├─finalize─▶ EventArc ─▶ Cloud Function ─▶ Email
      └─────────┘
Figure 1 — Stateless Cloud Run behind a WAF; all state in GCS; AI via Vertex; events drive notifications.

3. Features

Feature | How it works
Authenticated HTML upload | POST /api/posts with an API key; designed for daily automation
HTML sanitization | Two-pass (BeautifulSoup decompose + bleach allowlist); posts render in the site theme
Categories + aliases | Allowlisted; AI auto-resolves to ai-ml
Search + autosuggest | Client-side over a cached index; zero server cost
AI audio summaries | Gemini summary → Cloud TTS → MP3, generated at publish time
Auto social cards | 1200×630 og:image per post via Pillow
SEO | RSS, sitemap, robots, Open Graph / Twitter cards
Hidden AI assistant | Token-gated chat grounded to one post
Contact form | Honeypot + rate limit → GCS → Cloud Function email
Modern UI | Dark/light, animations, mobile nav, syntax highlighting, a11y

4. Security: First-Class, Not Bolted On

Every boundary has a control. The threats I cared most about were the two public entry points: a public upload endpoint and a public AI endpoint are both attractive targets.

Control | What it does | Threat
API key (constant-time) | hmac.compare_digest on the upload key | Unauthorized publishing, timing attacks
Secret Manager | Secrets injected at runtime, never in code/state | Credential leakage
Two-pass sanitization | Decompose <script>/<style>/<iframe> then allowlist | Stored XSS
Security headers + CSP | HSTS, X-Frame-Options, script-src 'self' | XSS, clickjacking, MIME sniffing
LB-only ingress | *.run.app blocked; traffic must pass the WAF | WAF/rate-limit bypass
Cloud Armor | SQLi/XSS preconfigured rules + rate limiting | Exploit probes, DoS
Slug validation | Rejects ../ and absolute paths | Path traversal
Least-privilege SAs | Each service gets only the IAM it needs | Blast-radius containment
Non-root container | Runs as appuser | Container escape
Generic error handler | Tracebacks to logs, not clients | Information disclosure
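
Two of the controls above fit in a few lines of standard-library Python. This is a minimal sketch, not the platform's actual code: the function names and the slug pattern are assumptions for illustration, and the allowlist here is stricter than simply rejecting ../ and absolute paths.

```python
import hmac
import re

# Hypothetical slug policy: lowercase letters, digits, and single hyphens.
# An allowlist like this rejects "../" and absolute paths as a side effect.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
MAX_SLUG_LENGTH = 80  # assumed limit


def api_key_is_valid(presented: str, expected: str) -> bool:
    """Timing-safe comparison, so response time does not reveal a partial match."""
    # Encoding to bytes avoids the TypeError compare_digest raises on non-ASCII str.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def slug_is_valid(slug: str) -> bool:
    """Accept only slugs that match the allowlist pattern in full."""
    return len(slug) <= MAX_SLUG_LENGTH and SLUG_RE.fullmatch(slug) is not None
```

In a FastAPI handler, a failed key check would typically map to a 401 and a failed slug check to a 400, both before any storage call is made.
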
A bug that became a security lesson
My first sanitizer used bleach.clean(strip=True) — which removes disallowed tags but keeps their text content. A post's <style> block got stripped, but all its CSS dumped into the page as visible text. The same path would surface <script> source. Fix: a first pass decomposes dangerous elements entirely before bleach runs. Strip ≠ remove.
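
The difference between stripping a tag and removing an element can be shown without any third-party packages. The sketch below is a standard-library stand-in, not the BeautifulSoup + bleach code the platform uses: it folds both steps into one parser walk, and the tag allowlist and the strip_only flag are assumptions added to reproduce the bug side by side with the fix.

```python
from html import escape
from html.parser import HTMLParser

DROP_WITH_CONTENT = {"script", "style", "iframe"}  # removed together with their contents
ALLOWED_TAGS = {"p", "h2", "strong", "em", "code", "pre", "ul", "li"}  # hypothetical allowlist


class _Sanitizer(HTMLParser):
    def __init__(self, strip_only: bool):
        super().__init__(convert_charrefs=True)
        self.strip_only = strip_only  # True reproduces the "strip" bug
        self.skip_depth = 0           # > 0 while inside a dropped element
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            if not self.strip_only:
                self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # all attributes are discarded

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            if not self.strip_only and self.skip_depth:
                self.skip_depth -= 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data))  # text is re-escaped on the way out


def sanitize(html: str, strip_only: bool = False) -> str:
    parser = _Sanitizer(strip_only)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With strip_only=True the style and script tags disappear but their text survives, which is the behavior described above. With the default, the dangerous elements vanish along with everything inside them. An unclosed script tag leaves skip_depth above zero, so the rest of the input is dropped rather than rendered.
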

5. The Hidden AI Agent

The platform includes a hidden chat assistant (opened via a secret URL hash + access token) that answers questions about one post at a time. Its security is architectural: the agent has zero tools, so a prompt injection can make it say something off-policy but never do anything. The user supplies only a question; the server fetches and injects the single post.
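
Server-side grounding can be sketched as a single function: the caller passes a slug and a question, and only the server decides what text reaches the model. Everything named here (build_grounded_prompt, fetch_post, the prompt wording, the length cap) is hypothetical and stands in for whatever the real handler does.

```python
import re
from typing import Callable, Optional

SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")  # assumed slug policy
MAX_QUESTION_CHARS = 500                            # assumed cap on user input


def build_grounded_prompt(
    slug: str,
    question: str,
    fetch_post: Callable[[str], Optional[str]],
) -> str:
    """Build the model input from one server-fetched post plus the user's question."""
    if not SLUG_RE.fullmatch(slug):
        raise ValueError("invalid slug")
    post = fetch_post(slug)  # server-side lookup; the client never supplies post text
    if post is None:
        raise LookupError("post not found")
    question = question.strip()[:MAX_QUESTION_CHARS]
    return (
        "Answer using only the post below. If the answer is not in the post, say so.\n\n"
        f"<post>\n{post}\n</post>\n\n"
        f"Question: {question}"
    )
```

Because the post is looked up by a validated slug, the only attacker-controlled text in the prompt is the truncated question.
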

Layer | Mechanism | Guarantee
Capability starvation | Agent(tools=[]) | No data access, no actions; text only
Server-side grounding | One post fetched by validated slug | Can't reach other data or inject content
Pre-flight code gate | Regex refuses code at 0 tokens | No code generation + no cost-DoS
Output backstop | Code fences replaced with a refusal | Catches model disobedience
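
The last two layers can be sketched as a thin wrapper around the model call. The regex, the refusal text, and the call_model parameter are all assumptions for illustration; the real gate's pattern is not shown in this post.

```python
import re
from typing import Callable

REFUSAL = "I can discuss this post, but I can't write code."  # hypothetical wording
FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this listing clean

# Hypothetical pre-flight pattern: a request verb followed shortly by a code noun.
CODE_REQUEST_RE = re.compile(
    r"\b(write|generate|create|give me|show me)\b.{0,40}"
    r"\b(code|script|function|program|snippet)\b",
    re.IGNORECASE | re.DOTALL,
)
# Matches a fenced block, including one the model never closed.
FENCED_BLOCK_RE = re.compile(
    re.escape(FENCE) + r".*?(?:" + re.escape(FENCE) + r"|\Z)", re.DOTALL
)


def answer(question: str, call_model: Callable[[str], str]) -> str:
    # Pre-flight gate: refuse before the model is called, so the refusal costs 0 tokens.
    if CODE_REQUEST_RE.search(question):
        return REFUSAL
    reply = call_model(question)
    # Output backstop: if the model emitted code anyway, swap each block for the refusal.
    return FENCED_BLOCK_RE.sub(REFUSAL, reply)
```

A regex gate like this is easy to evade by rephrasing, which is why the backstop on the output exists: the gate saves cost on obvious requests, and the backstop enforces the policy.
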
Full deep-dive
The agent's complete threat model, three-layer defense, and the 6,159-token cost bug are covered in a dedicated post: "Securing an AI Agent."

6. Automated Publishing & CI/CD

Posting is a single authenticated API call — ideal for a daily cron that generates and uploads HTML. Infrastructure and app both ship through Cloud Build:

  • App pipeline: on push to main, test → build → push → deploy, with the image pinned to the git SHA (immutable).
  • Infra pipeline: Terraform plan on PR (read-only SA), apply on merge.
  • Path-filtered triggers: app/** and infra/** changes run independent pipelines.

# Publish a post (what the daily automation calls)
curl -X POST https://blog.domain/api/posts \
  -H "X-API-Key: $KEY" \
  -F "file=@post.html;type=text/html" \
  -F "category=ai-ml"
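
If the daily automation is written in Python rather than shell, the same request can be built with the standard library alone. This is a sketch under that assumption; build_publish_request is a hypothetical helper, and the field names mirror the curl call above.

```python
import urllib.request
import uuid


def build_publish_request(
    base_url: str,
    api_key: str,
    html: bytes,
    category: str,
    filename: str = "post.html",
) -> urllib.request.Request:
    """Build the multipart/form-data POST that the curl command sends."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: text/html",
        b"",
        html,
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="category"',
        b"",
        category.encode(),
        f"--{boundary}--".encode(),
        b"",
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/posts",
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending it is one call, urllib.request.urlopen(request, timeout=30), with the key read from the environment or a secret store rather than hard-coded.
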

7. Scalability & Reliability

Scalability

  • Cloud Run auto-scales 0→N; scale-to-zero when idle
  • GCS scales without capacity planning; no DB to outgrow
  • AI runs at publish time, not per-view — cost stays flat with traffic
  • Client-side search = zero server cost
  • CDN-ready global LB

Reliability

  • Fail-fast startup → a revision with bad config never becomes ready, so traffic stays on the last healthy revision
  • Graceful degradation → audio/card failure doesn't block publishing
  • GCS versioning → accidental-delete recovery
  • Immutable SHA-pinned deploys
  • Tested CI gate → broken code never reaches prod
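
The first two reliability points can be sketched together. The environment variable names, the Settings class, and the publish signature are assumptions; the real startup check also verifies the bucket itself, which is left out here to keep the example free of cloud dependencies.

```python
import logging
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping

log = logging.getLogger("blog")
REQUIRED_ENV = ("POSTS_BUCKET", "GEMINI_MODEL_ID")  # hypothetical variable names


@dataclass(frozen=True)
class Settings:
    bucket: str
    model_id: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Fail fast: raise at startup instead of returning a 500 on the first upload."""
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return Settings(bucket=env["POSTS_BUCKET"], model_id=env["GEMINI_MODEL_ID"])


def publish(
    slug: str,
    html: str,
    save_post: Callable[[str, str], None],
    enrichers: Dict[str, Callable[[str, str], None]],
) -> Dict[str, str]:
    """Save the post first, then run optional extras that are allowed to fail."""
    save_post(slug, html)  # the one step that must succeed; errors propagate
    status = {}
    for name, enrich in enrichers.items():
        try:
            enrich(slug, html)
            status[name] = "ok"
        except Exception:
            # Graceful degradation: log the failure and keep the post published.
            log.exception("%s generation failed for %s", name, slug)
            status[name] = "failed"
    return status
```

Calling load_settings() at import time of the app module means a typo in an environment variable stops the container from starting, which is the point of the fail-fast rule.
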

8. Problems Faced

Problem | Root Cause | Fix
500 on upload | Env var typo | Fail-fast bucket check at startup
Raw CSS dumped into post | bleach keeps stripped-tag content | Two-pass decompose-then-clean
FastAPI ↔ ADK dependency clash | starlette version conflict | Bumped FastAPI to compatible release
Gemini model 404 mid-build | Provider deprecated the model | Model ID → env var
Audio/agent routes 404 | include_router missing | Registered + route-existence test
Agent refusal cost 6k tokens | Refused after the LLM call | Pre-flight 0-token gate
UI cramped & center-aligned | One width for all content | Per-page responsive containers
Meta-lesson
Many breakages traced to applying partial code edits. The durable fix was process: file-level writes, a route-existence test, and pre-commit hooks running py_compile plus the test suite — so a dropped feature or syntax error can't reach production.
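
A route-existence test can be very small. The sketch below is one way to write it, with assumed route paths (only /api/posts appears in this post; the audio and agent paths are placeholders). It relies on the fact that each entry in a FastAPI app's routes list exposes a path attribute.

```python
from typing import Iterable, Set

# Hypothetical paths; replace with the routes the app is expected to serve.
REQUIRED_ROUTES = {"/api/posts", "/audio/{slug}", "/api/agent/chat"}


def missing_routes(routes: Iterable[object], required: Set[str]) -> Set[str]:
    """Return the required paths that are not registered on the app."""
    registered = {getattr(route, "path", None) for route in routes}
    return required - registered


# In the test suite (assuming `app` is the FastAPI instance):
#     assert missing_routes(app.routes, REQUIRED_ROUTES) == set()
```

A forgotten include_router call then fails the CI gate instead of surfacing as a 404 in production.
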

9. Future Scope

Enhancement | Why
OIDC upload auth | Replace static API keys with signed identity tokens
Cloud CDN | Edge-cache static + pages for speed and cost
Hard global rate limits | Move soft per-instance limits to shared Redis/Firestore
Post update / drafts | Add PUT + draft workflow
Agent streaming (SSE) | Stream long answers instead of one blob
Dedicated agent service | Keep the blog image lean; isolate ADK deps

10. References

  1. Cloud Run: cloud.google.com/run/docs
  2. FastAPI: fastapi.tiangolo.com
  3. Terraform Google Provider: registry.terraform.io
  4. Cloud Build CI/CD: cloud.google.com/build/docs
  5. Cloud Armor: cloud.google.com/armor/docs
  6. Google ADK: google.github.io/adk-docs
  7. Vertex AI (Gemini): cloud.google.com/vertex-ai/generative-ai
  8. OWASP Top 10 for LLM Apps: owasp.org/.../top-10-for-llm
  9. MDN — Content Security Policy: developer.mozilla.org/.../CSP
  10. Secret Manager: cloud.google.com/secret-manager/docs