Demo Script — Cloudflare Developer Platform

Overview

For: AE, BDR, CSM — business conversation, no live coding

Length: 30 minutes · discovery call or in-person

Goal: Hook with chatbot + R2 wow moments first, then build the full picture, close with pricing and a pilot ask

Site: dev.pongpisit.com

One-sentence pitch:
"A Cloudflare Worker sits in front of your existing site and adds AI, cuts infrastructure costs, and localises for any market — in hours, without touching your backend or waiting on engineering."

Flow at a Glance

Time	What You're Doing	Business Point
0:00	Set the stage — the one-concept setup	Worker = thin layer in front, nothing changes behind
2:30	Routes — 60-second architecture explainer	Why none of this requires backend changes
4:00	AI Chatbot — inject without touching origin	Support deflection, 30-min deploy, 24/7
8:00	R2 — kill the egress bill	$0 egress, auditable metadata, no URL changes
12:00	Search — the silence moment	Lost revenue from zero results
16:00	Local pricing at the edge	20–40% conversion uplift, no backend sprint
19:00	Auto subtitles	48× cheaper than AWS Transcribe
22:00	Voice AI agent	24/7 voice support, no SDK needed
24:00	Collaborative Room Designer	Real-time multi-user sync — one DO replaces 5 infrastructure pieces
25:30	SEO — Rich Results	Stars + price in Google Search, zero backend changes
26:30	Pricing calculator — put a number on it	Real BOM, downloadable CSV for finance
29:00	Try it Yourself + Close	Pilot in 2 weeks, engineering can start today

Before You Start

Know their current cloud provider (AWS, GCP, Azure, or local)
Know which markets/languages they care about most
Have a 1–2 min local-language audio clip ready for subtitles
Pre-open /developer-pricing in a background tab
Pre-open /try in a background tab

0:00–2:30 — Set the Stage

Navigate to: dev.pongpisit.com/entertainment

"I want to show you something before I explain anything. This is a live streaming platform — real film data, real AI behind every button. No slides, no mocks."

"Everything I'm going to demo in the next 30 minutes was added to this site without changing a single line of its backend code. The secret is one concept I'll show you in 60 seconds."

Discovery: "What's the one feature your product team has been asking engineering for — that keeps getting pushed back?"

2:30–4:00 — The Foundation: Workers Intercept Everything

Navigate to: dev.pongpisit.com/demo/routes

"Here's the one thing you need to understand. A Cloudflare Worker is a small piece of code deployed at the edge — in front of your existing site. When a request comes in, the Worker intercepts it first."

"Your origin server never knows. It gets a completely normal request and returns its normal response. The Worker reads it, modifies it, adds to it — and returns something better to the user. That's it."

"No Kubernetes. No redeployment. One command — wrangler deploy — and it's live in 330 cities in 10 seconds."

Click the route animation — show request intercepting before origin

Key phrase to land: "Your engineering team keeps their sprint. We add a layer in front. Nothing breaks, everything gets better."

4:00–8:00 — AI Chatbot: Injected Without Touching the Server

Navigate to: dev.pongpisit.com/entertainment → scroll to Developer Guide → open AI Chat guide card → click Activate

"Watch the bottom right corner of the screen."

Click Activate — chat button appears instantly

"That button was not in the page before. The server returned exactly the same HTML it always does. A Worker intercepted the response and injected the chat widget using HTMLRewriter — a streaming HTML parser built into the Workers runtime."

"No backend deploy. No frontend PR. No waiting on engineering. 30 minutes of work."

Type into the chat: What should I watch if I loved Parasite?

"Instant AI recommendations, grounded in the catalog via AI Search with AutoRAG. The chatbot can answer questions about any film in the database — no manual FAQ writing, no stale content. It retrieves current data every time."

"No support ticket. No wait time. No human agent at 3am on a Sunday."

Discovery: "Do you have a chatbot today? What did it cost to build and what are you paying monthly to maintain it?"

Business framing: Average support ticket in media/e-commerce: $8–$15. If this handles 500 queries a month, that's $4,000–$7,500 in avoided support cost — from a feature that took 30 minutes to deploy.

8:00–12:00 — R2: Zero Egress, Auditable Cache

Navigate to: dev.pongpisit.com/demo/r2 → Run Demo

"Now let me show you something that hits your finance team immediately."

"Every image, every thumbnail your users load — AWS charges you to send that data out. They call it egress. It's typically 5–10% of cloud bills for any company with user-facing content."

"Cloudflare R2 stores the same content, serves it from the nearest Cloudflare PoP — and charges zero egress fees. Permanently. Not zero with a cap. Zero."

Click "Run Demo" — watch the cost meters flip from red to green

"But here's what makes it interesting for your engineering team — not just your finance team. When the Worker stores an image in R2, it writes custom metadata: exactly when that asset was first cached, its original file size, and where it came from."

"You can open the R2 dashboard right now and see every object with those three fields. You don't just know it's cached — you know when it was cached and from where. That's auditable infrastructure."

Discovery: "Do you have a sense of your current CDN or data transfer line item? Even a ballpark per month?"

Numbers to use: AWS S3 egress = $0.085/GB. At 50 TB/month (typical media platform) that's $4,250/month — permanently gone the moment R2 is in front. R2 storage = $0.015/GB. No URL changes for users, no migration of existing content.

12:00–16:00 — Semantic Search: The Silence Moment

Navigate to: dev.pongpisit.com/entertainment → search bar at the top

Type: mind-bending sci-fi with emotional depth — pause 3 full seconds before saying anything

"Inception is in this catalog. Interstellar is in this catalog. The user typed exactly what they want to watch tonight — and got nothing. What do they do? They leave."

Discovery: "Do you track search abandonment or zero-result rate today?"

Navigate to: dev.pongpisit.com/demo/search?ctx=entertainment → Run Semantic Search

"Now the platform understands what they mean, not just what they typed. Inception. Interstellar. WALL·E. Same query, completely different results. The user stays, finds something, watches it."

"Your search engine wasn't rewritten. A Worker sits in front, embeds the query with AI, and returns semantically matched results. Your existing database is untouched."

Business framing: If 20% of daily searches return zero results and recovering 30% of those converts to a session — at 50,000 daily searches that's 3,000 recovered sessions per day. That's not a technical metric, that's subscriber retention.

16:00–19:00 — Local Pricing: Convert Every Market

Navigate to: dev.pongpisit.com/entertainment/subscribe

"When someone in Thailand opens your subscription page and sees $9.99/month — they do the math. Is this available here? Why dollars? That friction kills conversions."

"Watch — this page shows local pricing automatically. Thai users see ฿350. Indonesian users see Rp162,000. Singaporean users see S$13. Same price point, right local context."

"No backend change. The Worker reads the visitor's country from request.cf.country — it's already there on every request, for free, from Cloudflare's network."

Discovery: "Which markets are you most focused on this year? Are you seeing different conversion rates by country?"

Business framing: Platforms that localise pricing typically see 20–40% improvement in free-to-paid conversion from non-USD markets. That's not a design change — it's a Worker and a rate table.

19:00–22:00 — Auto Subtitles: 48× Cheaper

Navigate to: dev.pongpisit.com/demo/subtitles

"If you produce video — lectures, films, training content — subtitles are a requirement in most markets. The standard approach is Amazon Transcribe at $0.024 per minute, plus an S3 bucket, IAM policies, and a batch pipeline."

Upload the local-language audio clip (MP3, 1–2 min)

"Workers AI Whisper does this at $0.0005 per minute — 48 times cheaper. One API call. No batch pipeline, no IAM policies, no S3 bucket."

"The browser splits the audio into 45-second chunks with 5-second overlap, sends 3 in parallel. Two AI passes clean the output — a Southeast Asian language model per chunk, then GPT-OSS 120B reads the full transcript and corrects domain vocabulary globally. You download a subtitle file ready for any video player."

"For 100 hours of new content per month — the difference between a $144 subtitle bill and a $3 subtitle bill."

Click a timestamp to seek the player → Download .vtt file

22:00–24:00 — Voice AI: Support That Never Sleeps

Navigate to: dev.pongpisit.com/demo/voice → click Voice

Speak: "What's a good film for family movie night?"

"Speech-to-text, AI reasoning, text-to-speech — three models, one endpoint, under 2 seconds. No third-party SDK, no per-minute billing to a call centre platform."

Click "Video Call with Agent" — show Visitor and Presenter links

"And if you want a live agent option — this opens a WebRTC video session. Visitor link for the customer, presenter link for your agent. No Zoom subscription. All routed through Cloudflare."

Discovery: "What does your current support infrastructure cost monthly? Is it 24/7 or business hours only?"

24:00–25:30 — Collaborative Room Designer: Real-Time Without the Infrastructure

Navigate to: dev.pongpisit.com/demo/room-designer

"Here's one that usually silences the room with product teams. A collaborative room planner — customers place HÖMSTYLE furniture in a floor plan, see it in 3D, design together in real time. Your product team has been asking for this for two years. Engineering scoped it at six months: WebSocket server, Redis pub/sub, Socket.io cluster, load balancer, real-time database. One Durable Object replaces all five."

Point at the 2D floor plan — KIVIK sofa, HEMNES bed, BEKANT desk already placed

"Drag the sofa. Watch the green border — you hold the lock. Nobody else can move it simultaneously. That's the DO's SQLite lock system. No Redis, no coordination service."

Open a second browser tab (incognito) to the same URL

"Two users, two tabs. Drag in one — it moves in the other instantly. Live cursors show exactly where each person is in the room. Add a piece in Tab 2 — appears in Tab 1 immediately."

"The entire WebSocket server, the SQLite database, the pub/sub broadcast, the lock manager — it's one TypeScript class. Deploy time: under two minutes."

Discovery: "What features has your product team been requesting that engineering keeps pushing back? Is real-time collaboration one of them?"

Business framing: Room planners increase furniture conversion by 20–30% — customers who visualise the fit buy with confidence and return less. The traditional build cost is 3–6 months of engineering time. With Durable Objects: one Worker class, one wrangler deploy. The DO hibernates when idle — 1,000 open connections cost $0/month. First user wakes it in under 5ms.

Land and expand: Room designer is the Durable Objects beachhead. Once DOs are in the account, the conversation opens to KV (edge caching), R2 (image storage), Workers AI (product recommendations). The room planner is the use case that gets the deal started.

25:30–26:30 — SEO: Rich Results Without Touching the Backend

Navigate to: dev.pongpisit.com/demo/seo

"One more that's directly tied to revenue — search visibility. Your product pages are a React SPA. Googlebot crawls every URL and sees the same generic title: your site name. No meta description. No structured data. Products don't appear in Google. When they do, there are no star ratings, no price — nothing to make someone click."

"A Worker intercepts the HTML, reads the product data from your existing database, and injects the right title, meta description, and JSON-LD structured data for every URL. Googlebot now sees a unique, keyword-rich title for each product. Your pages become eligible for Rich Results — stars, price, availability — directly in Google Search."

Click "Inject SEO Tags" → switch to Google Preview tab

"That's what shows up in Google when someone searches for your product. Unique title, meta description, star rating, price. Zero backend changes."

Discovery: "Do your product pages show up in Google Search today? Are you seeing organic traffic from product-specific queries?"

Business framing: Rich Results (stars + price in Google) increase CTR by 20–30% on average. For an e-commerce site with 10,000 product pages that were previously invisible — the SEO uplift compounds every day.

26:30–29:00 — Put a Real Number on It

Navigate to: dev.pongpisit.com/developer-pricing → click the closest preset to their use case

"Let me build an estimate based on what you've told me about your volumes."

Adjust Workers, R2, Workers AI, and AI Gateway sliders to approximate customer numbers

"This is your approximate monthly Cloudflare cost across everything we just showed. The R2 line is especially worth noting — this is what you stop paying AWS for egress. Download this as a spreadsheet for your finance team."

Click "Download CSV"

Discovery: "Is infrastructure cost a decision your team makes, or does it go through FinOps or procurement?"

29:00–30:00 — Try It Yourself + Close

Navigate to: dev.pongpisit.com/try

"Everything you've seen is deployable on your own domain today. Two scenarios here — AI chatbot on any existing site, and R2 image cache with zero egress. Both have copy-paste Worker code. The R2 one needs zero configuration — it works on whatever hostname it's deployed to."

"Your engineering team can have either of these running in under an hour."

"To summarise what we covered:"

— AI Chatbot injected via HTMLRewriter — 30-min deploy, no backend change
— R2 — $0 egress permanently, every cached asset auditable with timestamp + metadata
— Semantic search — understands intent, reduces abandonment
— Local pricing — auto-converts for any market via request.cf.country
— Subtitles — 48× cheaper than AWS Transcribe, dual-model pipeline (Nova-3 for English, Whisper for local languages)
— Voice AI — STT + LLM + TTS in one endpoint, 24/7
— Collaborative Room Designer — real-time multi-user sync, one DO replaces WebSocket server + Redis + load balancer + real-time DB
— SEO — unique title + meta description + JSON-LD Rich Results for every product URL
— Image resizing — 1.8 MB product photo → 180 KB WebP on mobile, just a URL prefix
— AI review assistant — pill tags + full review drafts, 3× review completion

None of this touched your backend. None required a sprint.

Ask for the meeting: "Which of these would have the highest business impact for your team in the next 90 days? I'd like to get our SE team on a 45-minute call with your engineering lead to scope a 2-week pilot. Would Tuesday or Thursday work?"

Objection Handling

"We're already on AWS / GCP and it works fine."

"Totally — those are great platforms. Cloudflare doesn't replace them. We sit in front. Your AWS infrastructure stays exactly as it is. We add AI, eliminate egress fees, and handle localisation at the edge. Your cloud bill gets smaller. Your product gets better. Nothing breaks."

"Our engineering team is really busy right now."

"That's exactly the point. What you just saw — the chatbot, the R2 cache, the local pricing — none of it required your engineers to touch their existing sprint. Workers are additive. We deploy in front. Your team stays focused on what they're already building."

"We already have a chatbot / search tool."

"What are you paying for it monthly? Most teams we talk to are paying $2,000–$10,000/month for third-party search and chat. Workers AI consolidates that into your Cloudflare bill at a fraction of the cost — and you own the model, the data, and the deployment."

"How is this different from OpenAI or Azure AI?"

"OpenAI gives you a model. Azure AI gives you a model. You still have to build the servers, the APIs, the caching, the scaling. Workers AI gives you models already integrated into the edge network — in 330 cities, with no GPU servers to manage and no egress fees when your data moves between services. You're calling one API instead of building infrastructure."

FAQ — Technical & Business Questions

Discovery Questions Bank

Use throughout — do not front-load

Infrastructure Cost

"What's the biggest line item on your cloud bill right now?"

"How much are you paying for CDN or data transfer monthly?"

AI & Support

"Do you have a chatbot today? How long did it take to build?"

"Is your support 24/7? What's your average ticket cost?"

Search & Discovery

"Do you know your zero-result search rate?"

"What do users complain about most with search on your platform?"

Market Expansion

"Which markets are you most focused on this year?"

"Are you seeing lower conversion from non-USD markets?"

Product Velocity

"How long does a feature request take from approval to production?"

"What's on your roadmap that engineering says will take the longest?"

Decision Process

"Who needs to be in the room for a pilot decision?"

"Are you in a budget cycle now or is this Q3/Q4?"

Demo Overview

URL: dev.pongpisit.com

Audience: CTO / VP Engineering / Senior Developer / Enterprise Architect

Goal: Establish Workers architecture clearly, then hit the two best wow moments (chatbot injection + R2 metadata) early while attention is high — then breadth

Timing Cheat Sheet

Time	Section	URL
0:00	Opening question + architecture setup	/entertainment
2:30	Routes — deep architecture walkthrough	/demo/routes
5:00	AI Chatbot via HTMLRewriter — injection pattern	/entertainment → guide card
8:30	R2 + X-R2-Bypass + customMetadata + cache checker	/demo/r2 → /try
13:30	AI Sentiment — 5-star trap demo	/demo/sentiment
16:00	Semantic search — the silence moment	/entertainment → /demo/search
19:00	Local pricing at the edge	/entertainment/subscribe
21:00	KV edge caching	/demo/caching
22:30	Auto subtitles — live upload	/demo/subtitles
25:30	Voice AI + WebRTC video call	/demo/voice
27:00	Collaborative Room Designer — Durable Objects	/demo/room-designer
28:30	SEO — Rich Results via HTMLRewriter + JSON-LD	/demo/seo?ctx=entertainment
29:00	APIs at the edge	/demo/api-edge?ctx=entertainment
29:30	Pricing BOM live build + Close	/developer-pricing → /try

Before You Start

Local-language audio clip ready (MP3, under 10 MB, 1–3 min)
curl -o /dev/null -w "%{http_code}" https://dev.pongpisit.com returns 200
Know customer's approximate monthly request volume + storage footprint
Pre-open /developer-pricing and /try in background tabs
Voice demo: confirm speaker set to arcas on /demo/voice
Noisy room? Use typed input for voice demo instead of mic

0:00–2:30 — Opening: The One Concept That Unlocks Everything

Navigate to: dev.pongpisit.com/entertainment

Opening question: "When your engineering team gets a new feature request — AI search, a chatbot, local pricing — what's the typical journey from approval to production? Three months? Six?"

"What you're looking at is a live demo environment. 15 use cases across AI, storage, search, caching, SEO, real-time collaboration, and more — deployed across two real websites: a furniture store and a streaming platform. Real Cloudflare APIs. Real AI models. Real data. No mocks."

"Every single feature was deployed without touching either origin server. Not a line of backend code changed. That's possible because of one concept I want to establish before anything else."

2:30–5:00 — Routes: How Workers Intercept the Request Lifecycle

Navigate to: dev.pongpisit.com/demo/routes

"A Worker is deployed against a route pattern — *yourdomain.com/* or more specific like *yourdomain.com/images/*. Every matching request hits the Worker before it reaches your origin."

"The Worker has full access to the request: headers, body, URL, method, and Cloudflare's free metadata — country, city, PoP, ASN. It can fetch the origin response, read it, transform it, and return something different. The origin has no idea any of this happened."

Click the animation — walk through the request → Worker → origin → Worker → browser flow

Deploy model: wrangler deploy — seconds to go live across 330 cities in 125+ countries. Roll back in one command. No Kubernetes, no EC2, no containers, no cold-start overhead — V8 isolate, under 1ms startup. ~50ms from 95% of the world's internet-connected population.

Execution model: Workers are stateless by default. State lives in KV (eventually consistent), D1 (SQLite at edge), R2 (object storage), or Durable Objects (strongly consistent, single-instance coordination). Each tool has a precise use case — not everything needs a database.

5:00–8:30 — AI Chatbot via HTMLRewriter: The Injection Pattern

Navigate to: dev.pongpisit.com/entertainment → Developer Guide section → AI Chat guide card → Activate

"Now I want to show you something immediately practical. Watch the bottom right corner."

Click Activate — chat widget appears on the page

"That widget was not in the HTML the server sent. The origin returned the same response it always does. The Worker piped that response through HTMLRewriter — a streaming HTML transformer built into the Workers runtime — and appended the chat widget before the closing </body> tag."

"The Worker also created a new API endpoint — POST /api/chat — that didn't exist on the origin. Worker B wraps Worker A. You can layer capabilities without merge conflicts, without touching source code."

Type: What should I watch on a rainy Sunday afternoon?

Technical depth: HTMLRewriter uses a CSS-selector API. .on('body', { element(el) { el.append(script, {html:true}) } }) — the transform is streaming, so it doesn't buffer the full HTML. Response time to first byte is unchanged.

Model: @cf/meta/llama-3.1-8b-instruct-fast — included in Workers Paid plan free tier. Context window set to last 6 messages. System prompt is site-specific — swappable per route.

Pattern applies to: any chatbot, any A/B test, any personalisation injection, any analytics tag — without a frontend deploy.
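
Sketch (illustrative, not the demo's source): a minimal version of the injection handler and the 6-message context window described above. The widget path /chat-widget.js, the ChatMessage shape and the trimHistory helper are assumptions made for this example. The element is typed with a small local interface so the handler can be exercised outside the Workers runtime.

```typescript
// Subset of the HTMLRewriter Element API that this handler relies on.
interface AppendableElement {
  append(content: string, options?: { html?: boolean }): void;
}

// Hypothetical widget loader; the demo's real asset path is not shown in this script.
const CHAT_WIDGET_SNIPPET = '<script src="/chat-widget.js" defer></script>';

// Handler for the "body" selector: appends the widget as the last child of <body>.
const chatInjector = {
  element(el: AppendableElement): void {
    el.append(CHAT_WIDGET_SNIPPET, { html: true });
  },
};

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Keep only the most recent `limit` messages as model context (6 in the demo).
function trimHistory(messages: ChatMessage[], limit = 6): ChatMessage[] {
  return limit > 0 ? messages.slice(-limit) : [];
}

// Inside a Worker the handler would be wired up roughly like this:
//   const originResponse = await fetch(request);
//   return new HTMLRewriter().on("body", chatInjector).transform(originResponse);
```

Keeping the handler as a plain object means the same injector can be reused on several routes, each with its own widget snippet.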

8:30–13:30 — R2: Zero Egress + X-R2-Bypass + customMetadata

Navigate to: dev.pongpisit.com/demo/r2 → Run Demo

"Now let's talk about infrastructure cost. Every image request your users make — your cloud provider charges egress. AWS S3: $0.085/GB. GCP: $0.12. Azure: $0.087. It's not visible day-to-day but it compounds."

"R2 stores the same objects, serves them from the nearest Cloudflare PoP, charges zero egress. Permanently."

Click "Run Demo" — watch cost meters

Worker architecture — every detail:
1. Any hostname, zero config: No ORIGIN constant. Worker intercepts /images/* on whatever domain it's deployed to — works on lazada.com, shopee.com, any custom hostname.
2. Self-loop prevention via X-R2-Bypass: On cache miss, the Worker needs to fetch from origin — but origin is behind the same route. So it re-fetches the request with an X-R2-Bypass: 1 header. The next Worker invocation sees that header and immediately returns fetch(request) — Cloudflare routes to the real origin. No infinite loop, no external config.
3. Non-blocking cache write: ctx.waitUntil() stores to R2 in the background — user gets their response immediately, doesn't wait for the write to complete.
4. customMetadata on every put: Three fields written on first cache: cached-at (ISO timestamp), size-bytes (original file size), origin-url (source URL). Visible in R2 dashboard. Returned as X-R2-Cached-At, X-R2-Size, X-R2-Origin response headers.
5. HEAD vs GET: The gallery cache checker uses env.R2.head(key) — reads all metadata without downloading the image body. 30 parallel HEAD requests verify the full gallery in under 2 seconds, zero bandwidth cost.
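
Sketch (illustrative, not the demo's source): the cache-miss flow from the list above, written against small local interfaces so it runs outside the Workers runtime. In a real Worker, ImageRequest and ImageResponse would be Request and Response, Bucket would be an R2 binding, and waitUntil would be ctx.waitUntil. The keyFromUrl helper and the lower-cased header names are assumptions for this example.

```typescript
// Minimal stand-ins for Request/Response and an R2 bucket binding.
interface ImageRequest { url: string; headers: Record<string, string>; }
interface ImageResponse { body: Uint8Array; headers: Record<string, string>; }
interface CachedObject { body: Uint8Array; customMetadata: Record<string, string>; }
interface Bucket {
  get(key: string): Promise<CachedObject | null>;
  put(key: string, body: Uint8Array, options: { customMetadata: Record<string, string> }): Promise<void>;
}
type Fetcher = (req: ImageRequest) => Promise<ImageResponse>;

const BYPASS_HEADER = "x-r2-bypass";

// Hypothetical helper: use the URL path (without leading slash or query) as the object key.
function keyFromUrl(url: string): string {
  return url.replace(/^[a-z]+:\/\/[^/]+\//i, "").split("?")[0];
}

async function handleImage(
  req: ImageRequest,
  bucket: Bucket,
  fetchNext: Fetcher,
  waitUntil: (p: Promise<unknown>) => void,
): Promise<ImageResponse> {
  // Second pass: this request was already re-issued by the Worker, so pass it straight through.
  if (req.headers[BYPASS_HEADER] === "1") return fetchNext(req);

  const key = keyFromUrl(req.url);
  const hit = await bucket.get(key);
  if (hit) {
    // Cache hit: surface the stored metadata as response headers.
    return {
      body: hit.body,
      headers: {
        "x-r2-cached-at": hit.customMetadata["cached-at"],
        "x-r2-size": hit.customMetadata["size-bytes"],
        "x-r2-origin": hit.customMetadata["origin-url"],
      },
    };
  }

  // Cache miss: re-fetch with the bypass header set.
  const origin = await fetchNext({ url: req.url, headers: { ...req.headers, [BYPASS_HEADER]: "1" } });

  // Store in the background; the response is returned without waiting for the write.
  waitUntil(
    bucket.put(key, origin.body, {
      customMetadata: {
        "cached-at": new Date().toISOString(),
        "size-bytes": String(origin.body.byteLength),
        "origin-url": req.url,
      },
    }),
  );
  return origin;
}
```

Because the bucket and the fetcher are passed in, the same function can be tested with an in-memory map before it is bound to a real R2 bucket.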

Navigate to: dev.pongpisit.com/try → R2 Image Cache scenario

"The demo ZIP here contains a built-in cache checker. Upload it to Pages as your origin, deploy the Worker, load the gallery once — then paste your Worker URL into the checker. 30 HEAD requests fire in parallel. Every card shows a green badge with the exact ISO timestamp from customMetadata, the file size, and the origin URL."

"Open the R2 dashboard right now — every object has those three metadata fields. That's not just object storage. That's a queryable audit log of when every asset entered your edge infrastructure."

Discovery: "Do you have visibility today into which specific assets are cached and when they were first stored? Or is it just a hit/miss rate?"

13:30–16:00 — AI Sentiment: The 5-Star Trap

Navigate to: dev.pongpisit.com/demo/sentiment → Run AI Sentiment Analysis

"Before I show this — quick question. Do you look at star ratings on reviews? Most e-commerce teams do. The aggregate number looks fine, maybe 4.2 stars."

"Watch what the AI finds."

Wait for results — point to Wanchai B.'s review with the red mismatch banner

"Five stars. The star average includes this. But read what it says — shelf collapsed, nearly hit a child, filing a complaint. The reviewer explicitly wrote 'I am giving 5 stars so this review gets seen.'"

"distilbert-sst-2-int8 catches it immediately — NEGATIVE, high confidence. The star score said nothing. The AI saw through it."

"The Worker wraps your existing reviews API. Fetches what you already have. Runs each review through the model in parallel via Promise.all(). Returns enriched JSON with sentiment scores appended. Your reviews endpoint is unchanged."

Pattern: Wrap existing API → add AI enrichment in-flight → return richer JSON. Zero origin changes. Works for any JSON endpoint — reviews, support tickets, form submissions, social comments.

Parallel execution: Promise.all(reviews.map(r => env.AI.run(...))) — all reviews scored simultaneously. Not sequential. Total wall-clock time stays close to that of the slowest single call rather than the sum of all calls; each inference is still billed individually.
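
Sketch (illustrative, not the demo's source): the wrap-and-enrich pattern with the model call injected as a function, so the example runs without a Workers AI binding. The Review and Sentiment shapes, and the mismatch thresholds (4 stars, 0.9 confidence), are assumptions for this example. In a Worker the classifier would wrap env.AI.run(...).

```typescript
interface Review { id: string; stars: number; text: string; }
interface Sentiment { label: "POSITIVE" | "NEGATIVE"; score: number; }
interface EnrichedReview extends Review { sentiment: Sentiment; mismatch: boolean; }

// Stand-in for the model call.
type Classifier = (text: string) => Promise<Sentiment>;

async function enrichReviews(reviews: Review[], classify: Classifier): Promise<EnrichedReview[]> {
  // Start every classification at once and wait for all of them together.
  const sentiments = await Promise.all(reviews.map((r) => classify(r.text)));
  return reviews.map((r, i) => {
    const sentiment = sentiments[i];
    // Flag the "5-star trap": high star rating, confidently negative text.
    const mismatch = r.stars >= 4 && sentiment.label === "NEGATIVE" && sentiment.score >= 0.9;
    return { ...r, sentiment, mismatch };
  });
}
```

The original review fields are spread into the result unchanged, so existing consumers of the reviews JSON keep working and only see two extra properties.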

16:00–19:00 — Semantic Search: The Silence Moment

Navigate to: dev.pongpisit.com/entertainment → search bar

Type: mind-bending sci-fi with emotional depth — pause 3 seconds, say nothing

"Inception is in this catalog. Interstellar is in this catalog. Zero results because SQL LIKE looks for the literal string."

Navigate to: dev.pongpisit.com/demo/search?ctx=entertainment → Run Semantic Search

"The Worker embeds the query using @cf/baai/bge-m3 — a multilingual embedding model that handles Thai, Indonesian, Vietnamese, and 100+ languages out of the box. Queries Vectorize for cosine similarity. Enriches from D1. Under 50ms from any PoP."

Stack: Workers AI (bge-m3 embeddings) + Vectorize (ANN index, cosine similarity) + D1 (SQLite metadata enrichment). All in one Worker deployment.

Index setup: One-time seeding — embed your catalog, insert into Vectorize. Incremental updates on new content. Query latency stays low as the catalog grows: Vectorize uses an approximate nearest neighbour (ANN) index, not a brute-force scan.

Multilingual: bge-m3 is trained on 100+ languages. Same index serves Thai, English, Indonesian queries without separate models.
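
Sketch (illustrative only): what "cosine similarity" ranks. This is a brute-force scan over a toy catalog. It is not how Vectorize is implemented and does not call Vectorize. The IndexedItem shape and the topK helper are assumptions for this example, and the vectors stand in for bge-m3 embeddings.

```typescript
interface IndexedItem { id: string; vector: number[]; }

// Cosine of the angle between two vectors: 1 = same direction, 0 = unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exhaustive top-k ranking; fine for a toy catalog, whereas a vector index avoids scanning every item.
function topK(query: number[], items: IndexedItem[], k: number): { id: string; score: number }[] {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The point for the customer conversation: results are ranked by direction in embedding space, so a query never has to share a literal word with the title it matches.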

19:00–21:00 — Local Pricing: request.cf.country, $0 Cost

Navigate to: dev.pongpisit.com/entertainment/subscribe

"request.cf.country, request.cf.city, request.cf.colo — free metadata on every single request. No geo-IP API subscription. No database lookup. No latency."

"This page auto-detects your location and shows local pricing. Worker reads the country, looks up the exchange rate, rewrites the price elements via HTMLRewriter. Origin sends the same HTML. User sees their currency."

Use the currency switcher to flip between THB / IDR / SGD / MYR / JPY / GBP — 11 currencies total — show prices updating instantly

Comparison: MaxMind GeoIP2 City = $24/month + SDK + database download. AWS Location Service = $0.50/1,000 queries. request.cf.country = $0, 0ms, already in the request object on every invocation.

Pattern extends to: content language defaulting, regulatory compliance (GDPR banners only for EU), market-specific feature flags — all without a separate geo service.
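
Sketch (illustrative, not the demo's source): the country-to-price lookup behind the pattern above. It uses a fixed price table instead of a live exchange-rate lookup, which is a simplification. The table contents reuse the figures quoted in this script. PRICE_TABLE, priceForCountry and formatPrice are hypothetical names.

```typescript
interface LocalPrice { currency: string; amount: number; }

// Hypothetical price table keyed by ISO 3166-1 alpha-2 country code.
const PRICE_TABLE: Record<string, LocalPrice> = {
  TH: { currency: "THB", amount: 350 },
  ID: { currency: "IDR", amount: 162000 },
  SG: { currency: "SGD", amount: 13 },
};

// Fallback when the country is missing or has no local price.
const DEFAULT_PRICE: LocalPrice = { currency: "USD", amount: 9.99 };

function priceForCountry(country: string | undefined): LocalPrice {
  if (!country) return DEFAULT_PRICE;
  return PRICE_TABLE[country.toUpperCase()] ?? DEFAULT_PRICE;
}

// Plain "CODE amount" formatting; a production page would likely use Intl.NumberFormat.
function formatPrice(price: LocalPrice): string {
  return `${price.currency} ${price.amount}`;
}

// In a Worker the lookup would be driven by the request itself, roughly:
//   const price = priceForCountry(request.cf?.country as string | undefined);
```

A fixed table keeps prices at round local price points; swapping in a rate lookup would change only the body of priceForCountry.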

21:00–22:30 — KV Edge Caching: 98% Database Reduction

Navigate to: dev.pongpisit.com/demo/caching → Run Demo

"At 100,000 daily active users, you're potentially making 100,000 database calls for the same catalog data that hasn't changed in hours. KV checks the edge first. Hit: under 10ms from the PoP handling the request. Miss: query D1 once, write to KV with a 60-second TTL. The next 9,999 requests in that window never touch the database."

KV characteristics: Eventually consistent globally — writes propagate to all PoPs in ~60 seconds. Reads are always local to the PoP — no cross-region roundtrip. 10M reads/month included on Workers Paid plan.

When to use KV vs D1 vs R2: KV = high-read, low-write, eventually consistent (catalog, config, feature flags). D1 = relational, ACID, SQLite API (user data, orders, sessions). R2 = objects, blobs, binary (images, files, backups).
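
Sketch (illustrative, not the demo's source): the cache-aside read path described above, with KV and the database both injected. KVLike mirrors only the get and put-with-expirationTtl calls of a KV namespace binding. The key name catalog:v1 and the loadFromDb callback are assumptions for this example.

```typescript
// Subset of the KV namespace API used here.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

async function getCatalog(
  kv: KVLike,
  loadFromDb: () => Promise<unknown>,
  key = "catalog:v1",
  ttlSeconds = 60,
): Promise<unknown> {
  const cached = await kv.get(key);
  if (cached !== null) return JSON.parse(cached); // edge hit: no database call

  const fresh = await loadFromDb(); // miss: query the database once
  await kv.put(key, JSON.stringify(fresh), { expirationTtl: ttlSeconds });
  return fresh;
}
```

In a Worker, loadFromDb would typically be a D1 query and the put could be moved into ctx.waitUntil() so the miss path does not wait for the write.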

22:30–25:30 — Auto Subtitles: Three-Model Pipeline

Navigate to: dev.pongpisit.com/demo/subtitles

"AWS Transcribe: $0.024/min + S3 + IAM + batch pipeline. Workers AI Whisper: $0.0005/min — 48× cheaper. One API call."

"The pipeline: the browser decodes the audio to 16kHz mono PCM, splits into 45-second chunks with 5-second overlap, sends 3 chunks in parallel. Model routing is automatic — Nova-3 ($0.0052/min) for English, Whisper ($0.0005/min) for Thai and other ASEAN languages. Then two AI correction passes: @cf/aisingapore/gemma-sea-lion-v4-27b-it per chunk for regional language nuance, then @cf/openai/gpt-oss-120b reads the full assembled transcript, detects the video domain (medical, tech, legal), and corrects domain vocabulary globally. Two AI passes — one for language, one for domain accuracy."

Upload the local-language audio clip

While processing: "Audio decoded to 16kHz mono PCM, split into 45-second chunks with 5s overlap, sent in batches of 3 simultaneously — two AI passes clean the output"

Result arrives → click a timestamp to seek → toggle VTT view → Download .vtt

Models used:
— @cf/openai/whisper-large-v3-turbo — STT, all languages, $0.0005/min
— @cf/aisingapore/gemma-sea-lion-v4-27b-it — per-chunk ASEAN language cleanup
— @cf/openai/gpt-oss-120b — global pass: domain vocabulary, speaker consistency

Cost at 200 hrs/month:
— English via Nova-3: $0.0052 × 12,000 min = $62 (vs AWS $288). Still 4.6× cheaper, no infrastructure.
— Thai/ASEAN via Whisper: $0.0005 × 12,000 min = $6 (vs AWS $288). 48× cheaper.

Why two models? Nova-3 has lower word error rate for conversational English — better for customer support recordings. Whisper handles 100+ languages including Thai, Indonesian, Vietnamese — Nova-3 hallucinates on ASEAN phoneme clusters.
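
Sketch (illustrative, not the demo's source): the chunking, model routing and cost arithmetic described in this section. Chunk boundaries are in seconds. The per-minute prices are the ones quoted in this script. chunkRanges, pickSttModel and monthlyCostUsd are hypothetical names, and the "language code starts with en" routing rule is an assumption about how "English" is detected.

```typescript
interface Chunk { start: number; end: number; }

// Split [0, duration) into windows of `size` seconds; consecutive windows share `overlap` seconds.
function chunkRanges(durationSec: number, size = 45, overlap = 5): Chunk[] {
  if (size <= overlap) throw new Error("size must exceed overlap");
  const chunks: Chunk[] = [];
  for (let start = 0; start < durationSec; start += size - overlap) {
    const end = Math.min(start + size, durationSec);
    chunks.push({ start, end });
    if (end === durationSec) break; // last window reached the end of the audio
  }
  return chunks;
}

// Per-minute prices as quoted in this script.
const STT_PRICE_PER_MIN = { "nova-3": 0.0052, whisper: 0.0005 } as const;
type SttModel = keyof typeof STT_PRICE_PER_MIN;

// Routing rule as described: Nova-3 for English, Whisper for everything else.
function pickSttModel(languageCode: string): SttModel {
  return languageCode.toLowerCase().startsWith("en") ? "nova-3" : "whisper";
}

function monthlyCostUsd(hours: number, model: SttModel): number {
  return hours * 60 * STT_PRICE_PER_MIN[model];
}
```

For a 100-second clip this yields three windows (0–45, 40–85, 80–100), which matches one batch of three parallel requests.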

25:30–27:00 — Voice AI: Three Models, One Endpoint

Navigate to: dev.pongpisit.com/demo/voice → click Voice

Speak: "What's a good film for family movie night?"

"@cf/deepgram/nova-3 for STT — lower word error rate than Whisper for conversational speech, critical for a support use case where users speak naturally. @cf/meta/llama-3.1-8b-instruct-fast for LLM reasoning. @cf/deepgram/aura-2-en — arcas voice — for TTS. Three models, one Worker endpoint, under 2 seconds end-to-end."

Click "Video Call with Agent" — show Visitor and Presenter links

"RealtimeKit spins up a WebRTC session via Cloudflare Calls. Visitor link for the customer, presenter link for the agent. No Zoom, no Twilio, no third-party SDK billing. The media is routed through Cloudflare — which means it works without the WebRTC latency penalty of a US-only TURN server."

27:00–28:30 — Collaborative Room Designer: Durable Objects + WebSocket Hibernation + SQLite

Navigate to: dev.pongpisit.com/demo/room-designer

"This is the Durable Objects demo. One class — WebSocket server, SQLite database, pub/sub broadcast, furniture lock manager, live cursors — all in a single TypeScript file. No Redis. No Socket.io. No load balancer."

Point at the 2D floor plan — 3 pre-seeded rooms visible in the dropdown (Vardagsrum, Sovrum, Hemmakontor)

"Three rooms, three separate DO instances. Each room's state is completely independent. env.ROOM_DESIGNER.idFromName('vardagsrum') — that's the whole routing layer."

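A rough sketch of the "whole routing layer" line, for engineers who want to see it. Only idFromName('vardagsrum') comes from the demo; the URL shape, the other two room slugs, and the helper names are assumptions, and NamespaceLike is a stand-in for the real Durable Object namespace binding so the snippet runs outside the Workers runtime.

```typescript
// Minimal structural stand-in for a Durable Object namespace binding.
interface NamespaceLike<Id, Stub> {
  idFromName(name: string): Id;
  get(id: Id): Stub;
}

// Assumed slugs for the three seeded rooms.
const ROOMS = ["vardagsrum", "sovrum", "hemmakontor"];

// Hypothetical URL shape: /demo/room-designer/ws/<room>
function roomFromPath(pathname: string): string | null {
  const prefix = "/demo/room-designer/ws/";
  if (!pathname.startsWith(prefix)) return null;
  const room = pathname.slice(prefix.length);
  return ROOMS.includes(room) ? room : null;
}

// The same name always maps to the same object ID, so every client asking for
// "vardagsrum" reaches the same instance and the same state.
function stubForRoom<Id, Stub>(ns: NamespaceLike<Id, Stub>, room: string): Stub {
  return ns.get(ns.idFromName(room));
}
```

In a Worker, the returned stub would receive the forwarded request, including the WebSocket upgrade.
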
Click the KIVIK sofa to select it — drag it across the floor

"Green border — you hold the SQLite lock. UPDATE furniture SET x=?, y=? WHERE id=? AND locked_by=? — atomic write, broadcast delta to every connected WebSocket. Release on pointer-up."

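The lock-guarded write can be sketched as follows. The UPDATE for the move is the statement quoted above; the grab statement, the function names, and the SqlLike interface (a stand-in for ctx.storage.sql, whose exec() returns a cursor exposing rowsWritten) are assumptions made so the example runs on its own.

```typescript
// Minimal stand-in for the synchronous SQL API exposed on ctx.storage.sql.
interface SqlLike {
  exec(query: string, ...bindings: unknown[]): { rowsWritten: number };
}

// Take the lock only if nobody holds it (assumed statement).
function grabFurniture(sql: SqlLike, id: string, sessionId: string): boolean {
  const cursor = sql.exec(
    "UPDATE furniture SET locked_by=? WHERE id=? AND locked_by IS NULL",
    sessionId, id,
  );
  return cursor.rowsWritten === 1;
}

// Move succeeds only while this session still holds the lock, because the
// WHERE clause filters on locked_by. Zero rows written means "not your lock".
function moveFurniture(sql: SqlLike, id: string, x: number, y: number, sessionId: string): boolean {
  const cursor = sql.exec(
    "UPDATE furniture SET x=?, y=? WHERE id=? AND locked_by=?",
    x, y, id, sessionId,
  );
  return cursor.rowsWritten === 1;
}
```

A caller would broadcast the delta only when moveFurniture returns true, so a client without the lock cannot move a piece.
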
Press R to rotate — then Del to remove — then open the bottom Add Furniture panel and click BILLY Bookcase to add it back

"Every message type — grab, move, rotate, release, add, remove, cursor, reset — is handled in webSocketMessage(). One handler, no external broker."

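One way the single-handler dispatch could look, as an illustrative sketch only. The eight message types are the ones named above; the payload fields and the returned action labels are hypothetical, and a real handler would write to SQLite and broadcast instead of returning a string.

```typescript
// Assumed wire format: JSON objects discriminated by `type`.
type ClientMessage =
  | { type: "grab"; id: string }
  | { type: "move"; id: string; x: number; y: number }
  | { type: "rotate"; id: string; rotation: number }
  | { type: "release"; id: string }
  | { type: "add"; catalogId: string }
  | { type: "remove"; id: string }
  | { type: "cursor"; x: number; y: number }
  | { type: "reset" };

const KNOWN_TYPES = new Set(["grab", "move", "rotate", "release", "add", "remove", "cursor", "reset"]);

// Reject anything that is not JSON or not a known message type.
function parseMessage(raw: string): ClientMessage | null {
  try {
    const data = JSON.parse(raw);
    if (typeof data !== "object" || data === null || !KNOWN_TYPES.has(data.type)) return null;
    return data as ClientMessage;
  } catch {
    return null;
  }
}

// One switch covers every message type; no broker sits in between.
function dispatch(msg: ClientMessage): string {
  switch (msg.type) {
    case "grab": return `lock ${msg.id}`;
    case "move": return `move ${msg.id} to ${msg.x},${msg.y}`;
    case "rotate": return `rotate ${msg.id} to ${msg.rotation}`;
    case "release": return `unlock ${msg.id}`;
    case "add": return `add ${msg.catalogId}`;
    case "remove": return `remove ${msg.id}`;
    case "cursor": return `cursor at ${msg.x},${msg.y}`;
    case "reset": return "restore seed layout";
  }
}
```
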
Click 3D button — orbit the scene — click the sofa in 3D — drag the ↕ Height slider

"Three.js renders the room with recognisable multi-part furniture shapes. Click any piece to select it — raycasting hits the bounding volume, finds the group, reads the furniture ID from the mesh key. Height slider adjusts the group's Y position client-side — useful for placing a lamp on a table."

Open a second tab (incognito) — drag furniture in Tab 1

"WebSocket Hibernation: the DO sleeps when all connections are idle. Memory cost drops to zero. When the first message arrives, it wakes in under 5ms — ctx.acceptWebSocket(server) with server.serializeAttachment(session) survives the sleep cycle. 1,000 idle connections cost nothing per month."

Point at the Live Events bar — show DO version counter incrementing

"DO v{n} — every state change increments this counter via SQLite. It's not in memory. It survives a Worker restart. The presence strip shows every connected user's name and colour — this.ctx.getWebSockets() iterates the live sessions."

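The hibernation and presence points above can be sketched as below. acceptWebSocket(), getWebSockets(), serializeAttachment() and deserializeAttachment() are the real method names; the SocketLike and StateLike interfaces, the Session shape, and the helper names are assumptions so that the snippet runs without the Workers runtime.

```typescript
// Hypothetical per-connection data shown in the presence strip.
interface Session { userId: string; displayName: string; colour: string; }

// Structural stand-ins for the server-side WebSocket and the DO state object.
interface SocketLike {
  serializeAttachment(value: unknown): void;
  deserializeAttachment(): unknown;
}
interface StateLike {
  acceptWebSocket(ws: SocketLike): void;
  getWebSockets(): SocketLike[];
}

// Hand the socket to the runtime and attach the session to the socket itself.
// Instance fields are lost when the object hibernates; the attachment is not.
function admit(ctx: StateLike, server: SocketLike, session: Session): void {
  ctx.acceptWebSocket(server);
  server.serializeAttachment(session);
}

// Rebuild the presence list from the live sockets, with no in-memory map.
function presence(ctx: StateLike): Session[] {
  return ctx.getWebSockets().map((ws) => ws.deserializeAttachment() as Session);
}
```
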
Press ↺ Reset — watch both tabs snap back to the seed layout simultaneously

Technical depth:
— DO class registered with new_sqlite_classes, not new_classes — required to enable ctx.storage.sql
— WebSocket Hibernation: ctx.acceptWebSocket() (not native ws.accept()) — sessions serialised via serializeAttachment() survive hibernation
— SQLite schema: furniture table (id, catalog_id, label, width, depth, x, y, rotation, color, locked_by) + room_meta table (key/value pairs for room dimensions and version)
— Lock system: locked_by column — grab sets it, release clears it, WebSocket close releases all locks held by that session
— Seed layout: 3 rooms seeded on first access — real HÖMSTYLE products (ek01–ek20) with real dimensions from the product catalog
— Protocol messages: join / grab / move / rotate / release / add / remove / cursor / reset / ping
— Broadcast patterns: broadcastAll() for state changes, broadcast(msg, skip) for cursors (skip sender)
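
The two broadcast patterns in the last bullet might look roughly like this. broadcastAll and broadcast(msg, skip) are named in the demo, but the signatures here are an assumption: the socket list is passed in explicitly (in the Durable Object it would come from ctx.getWebSockets()) so the sketch is self-contained.

```typescript
// Anything with a send() method, such as a server-side WebSocket.
interface Sendable { send(message: string): void; }

// Send to every socket except `skip`; returns how many sends succeeded.
function broadcast(sockets: Iterable<Sendable>, msg: unknown, skip?: Sendable): number {
  const payload = JSON.stringify(msg); // serialise once, reuse for every socket
  let delivered = 0;
  for (const ws of sockets) {
    if (ws === skip) continue; // e.g. do not echo a cursor back to its sender
    try {
      ws.send(payload);
      delivered++;
    } catch {
      // Socket already closed: ignore and keep going.
    }
  }
  return delivered;
}

// State changes go to everyone, including the sender.
function broadcastAll(sockets: Iterable<Sendable>, msg: unknown): number {
  return broadcast(sockets, msg);
}
```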

28:30–29:00 — SEO: Rich Results via HTMLRewriter + JSON-LD

Navigate to: dev.pongpisit.com/demo/seo?ctx=entertainment

"React SPA. Every film URL sends <title>STREAMVAULT</title> to Googlebot. No meta description. No structured data. The film pages don't rank individually."

"The Worker reads the film ID from the URL, fetches metadata from D1 or KV, and uses HTMLRewriter to rewrite the <title>, inject <meta name="description">, and append a <script type="application/ld+json"> Movie schema block — all in the response stream before the browser sees it. Origin sends the same HTML it always did."

Click "Inject SEO Tags" → switch to Google Preview tab → then JSON-LD tab → then HTML Head tab

Technical depth:
— HTMLRewriter is streaming — no buffering, TTFB unchanged
— JSON-LD Movie schema: aggregateRating.ratingValue, director, dateCreated, genre — all from D1
— KV caches metadata per film for 1h: D1 is queried once per film per hour, and every other request is served from KV at the edge
— Same pattern works for Product (e-commerce), Article (blogs), Course (ed-tech)
— Google Rich Results eligibility: star ratings, runtime, and release year visible directly in SERPs
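
As a rough illustration of the JSON-LD step, the sketch below builds the Movie block from film metadata. The Film shape and function names are hypothetical, not the demo's schema; only the schema.org property names listed above are taken from the script. In a Worker, the resulting string would typically be appended to the head element with HTMLRewriter, using element.append(html, { html: true }).

```typescript
// Hypothetical metadata row as it might come back from D1 or KV.
interface Film {
  title: string;
  description: string;
  director: string;
  year: number;
  genres: string[];
  rating: number;
  ratingCount: number;
}

// Map the row onto schema.org Movie properties.
function movieJsonLd(film: Film): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Movie",
    name: film.title,
    description: film.description,
    director: { "@type": "Person", name: film.director },
    dateCreated: String(film.year),
    genre: film.genres,
    aggregateRating: {
      "@type": "AggregateRating",
      ratingValue: film.rating,
      ratingCount: film.ratingCount,
    },
  };
}

// Wrap the data in a script tag. Escaping "<" keeps any "</script>" inside
// the metadata from closing the tag early.
function jsonLdScriptTag(data: Record<string, unknown>): string {
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}
```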

29:00–29:30 — APIs at the Edge: New Endpoints, Zero Origin

Navigate to: dev.pongpisit.com/demo/api-edge?ctx=entertainment

Click each endpoint: Mood Discovery → Content Check → Watch Order → Plan My List

"Four API endpoints. None existed on the origin. The Worker handles them entirely at the edge — origin never receives these requests. Each is independently deployable. If one throws an error, the others keep running. No microservices orchestration, no container cluster."

Pattern: Worker matches pathname → handles at edge → returns JSON. Origin is never called for these routes. Because Workers are route-matched, you can add endpoints to any existing domain without touching the origin's routing table.
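
A minimal sketch of that pattern, under stated assumptions: the four paths, query parameters, and response bodies are invented for illustration, and plain objects stand in for Request and Response so the snippet runs anywhere. A null result means "not an edge route", at which point a Worker would forward the request to the origin unchanged.

```typescript
type EdgeResult = { status: number; body: unknown };
type Handler = (query: Record<string, string>) => EdgeResult;

// Hypothetical paths for the four demo endpoints.
const edgeRoutes = new Map<string, Handler>([
  ["/api/mood", (q) => ({ status: 200, body: { mood: q.mood ?? "any", picks: [] } })],
  ["/api/content-check", (q) => ({ status: 200, body: { title: q.title ?? "", flags: [] } })],
  ["/api/watch-order", (q) => {
    if (!q.series) throw new Error("series is required"); // throws on purpose, to show isolation
    return { status: 200, body: { series: q.series, order: [] } };
  }],
  ["/api/plan", () => ({ status: 200, body: { plan: [] } })],
]);

function handleAtEdge(pathname: string, query: Record<string, string>): EdgeResult | null {
  const handler = edgeRoutes.get(pathname);
  if (!handler) return null; // not ours: the caller passes the request to the origin
  try {
    return handler(query);
  } catch {
    // A failure in one endpoint becomes a 500 for that route only.
    return { status: 500, body: { error: "edge handler failed" } };
  }
}
```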

29:30–30:00 — Pricing: Build the BOM Live + Close

Navigate to: dev.pongpisit.com/developer-pricing

Ask first: "Roughly how many API requests per month? What's your storage footprint — images, videos, files? How many AI calls per day across search, chatbot, voice?"

Click closest preset → adjust sliders to their actual numbers

Key numbers to walk through:

Workers: 10M requests/month included at $5/month base — overage $0.30/million
R2: $0.015/GB/month storage — $0 egress. Whatever they're paying AWS/GCP for egress disappears
AI Gateway Enterprise: 200K logs included, $8/100K overage — each AI query = 2 logs (request + response)
Vectorize: 10M stored + 50M queried dimensions included — scales to hundreds of millions
KV: 10M reads/month included, so most catalog caching use cases stay within the included allowance
Workers AI: per-neuron billing — most models free up to daily limits on Workers Paid
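
To sanity-check what the calculator shows, three of these line items can be reproduced by hand. The sketch below uses only the prices quoted above and makes two assumptions: overage is billed proportionally rather than in whole blocks, and the AI Gateway Enterprise base fee (not stated here) is left out. The type and function names are illustrative.

```typescript
interface Usage { requestsPerMonth: number; r2StorageGb: number; aiQueriesPerMonth: number; }
interface Bom { workersUsd: number; r2Usd: number; aiGatewayOverageUsd: number; totalUsd: number; }

const toCents = (usd: number): number => Math.round(usd * 100) / 100;

function estimateBom(u: Usage): Bom {
  // Workers: $5 base covers 10M requests, then $0.30 per extra million.
  const workersUsd = toCents(5 + (Math.max(0, u.requestsPerMonth - 10_000_000) / 1_000_000) * 0.30);
  // R2: storage only, egress is $0.
  const r2Usd = toCents(u.r2StorageGb * 0.015);
  // AI Gateway: each AI query writes 2 logs; 200K included, $8 per extra 100K.
  const logs = u.aiQueriesPerMonth * 2;
  const aiGatewayOverageUsd = toCents((Math.max(0, logs - 200_000) / 100_000) * 8);
  return {
    workersUsd,
    r2Usd,
    aiGatewayOverageUsd,
    totalUsd: toCents(workersUsd + r2Usd + aiGatewayOverageUsd),
  };
}

// Example: 25M requests, 2 TB in R2, 150K AI queries per month.
console.log(estimateBom({ requestsPerMonth: 25_000_000, r2StorageGb: 2000, aiQueriesPerMonth: 150_000 }));
// { workersUsd: 9.5, r2Usd: 30, aiGatewayOverageUsd: 8, totalUsd: 47.5 }
```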

Click "Copy Summary" → paste to Salesforce/CPQ · Click "Download CSV" → spreadsheet for their finance team

TCO comparison: When comparing vs AWS, remove the egress line items entirely. At enterprise content scale that alone is $5,000–$20,000/month that disappears from the bill.

29:30–30:00 — Close: Try it Yourself

Navigate to: dev.pongpisit.com/try

"Two scenarios here — AI chatbot on any existing site, and R2 image cache with zero egress. Both have copy-paste Worker code. The R2 Worker needs zero configuration — no ORIGIN to set. It adds X-R2-Bypass: 1 when fetching origin, breaking the self-loop. Deploy it to any hostname, bind an R2 bucket, done. Self-configuring."

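If engineering asks how the X-R2-Bypass self-loop guard works, the logic can be sketched as below. This is an illustration of the idea, not the copy-paste Worker from /try: BucketLike is a simplified stand-in for an R2 bucket binding (the real get() returns an object body, not raw bytes), FetchOrigin stands in for fetch(), header names are assumed to be lowercased, and all names are hypothetical.

```typescript
// Simplified stand-ins so the sketch runs outside the Workers runtime.
interface BucketLike {
  get(key: string): Promise<Uint8Array | null>;
  put(key: string, value: Uint8Array): Promise<void>;
}
type FetchOrigin = (path: string, headers: Record<string, string>) => Promise<Uint8Array>;
type Served = { from: "passthrough" | "r2" | "origin"; body: Uint8Array };

const BYPASS_HEADER = "x-r2-bypass";

async function serveImage(
  path: string,
  requestHeaders: Record<string, string>,
  bucket: BucketLike,
  fetchOrigin: FetchOrigin,
): Promise<Served> {
  // A request carrying the marker is our own origin fetch coming back
  // through the same hostname: pass it straight through, never cache it.
  if (requestHeaders[BYPASS_HEADER] === "1") {
    return { from: "passthrough", body: await fetchOrigin(path, requestHeaders) };
  }
  // Cache hit: serve from R2, origin is not contacted.
  const cached = await bucket.get(path);
  if (cached) return { from: "r2", body: cached };
  // Cache miss: fetch the origin with the marker set, then store the result.
  const body = await fetchOrigin(path, { ...requestHeaders, [BYPASS_HEADER]: "1" });
  await bucket.put(path, body);
  return { from: "origin", body };
}
```

Because the marker travels with the request, no ORIGIN hostname has to be configured: the Worker fetches the same URL it was called on.
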
"Two more if their engineering team wants to explore further: Image resizing — 1.8 MB product photo → 180 KB WebP on mobile, just a /cdn-cgi/image/ URL prefix, zero new infrastructure. And AI Content Generation at /demo/feedback — Workers AI generates contextual review pill tags and full draft reviews, 3× review completion rate."

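For the image resizing point, the URL prefix can be generated with a one-line helper. A minimal sketch, assuming the comma-separated options form of the /cdn-cgi/image/ URL (for example width and format); the function name and the sample path are hypothetical.

```typescript
// Build a resized-image URL: /cdn-cgi/image/<options>/<source path>.
function resizedImageUrl(sourcePath: string, width: number, format = "auto"): string {
  const path = sourcePath.startsWith("/") ? sourcePath : `/${sourcePath}`;
  return `/cdn-cgi/image/width=${width},format=${format}${path}`;
}

console.log(resizedImageUrl("/products/sofa.jpg", 480));
// /cdn-cgi/image/width=480,format=auto/products/sofa.jpg
```
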
Final question: "Which of these has the highest business impact for your team in the next 90 days — and what would a 2-week proof of concept look like for you?"

Stop. Let them answer.

FAQ — Technical Deep Dives

Discovery Questions to Weave In

Scatter these at natural transitions — never front-load:

"What's your current approach to caching — are you running something in front of your origin today?" (before Routes)
"Do you have a chatbot today? What's the deploy model — separate service, or injected somehow?" (before chatbot)
"Do you have visibility into which assets are cached and when they were first stored?" (during R2 customMetadata)
"Do you look at star ratings as a signal in your reviews system? How are you surfacing problems?" (before sentiment)
"What does your search look like today — SQL LIKE, Elasticsearch, something else?" (before search)
"Which markets are you most focused on — are you seeing different conversion by country?" (before local pricing)
"How many hours of video or audio content do you produce per month? What languages?" (before subtitles — note: Nova-3 for English, Whisper for local languages)
"What features has your product team been asking for that engineering keeps scoping as 3–6 month projects? Is real-time collaboration one of them?" (before Room Designer)
"Do your product pages rank individually in Google, or does every URL look the same to Googlebot?" (before SEO)
"What's the largest image file size your users download on mobile?" (before image resizing — if relevant)
"Who makes infrastructure cost decisions at your company — is FinOps involved?" (before pricing)