🛒 E-commerce / Retail

AI Chatbot + Search via HTMLRewriter & AI Search

A Worker uses HTMLRewriter to inject a live AI chatbot into any page. With AI Search (AutoRAG), it answers from your uploaded product catalog, FAQs, and docs — grounded in your real data. Zero origin changes.
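A minimal sketch of the injection step, assuming the widget ships as a single script that the Worker serves from /_chat/widget.js. The names `injectWidget`, `RewriterLike`, `ElementLike`, and `chatWidgetInjector` are hypothetical. The rewriter is passed in as an argument so the function can also run outside the Workers runtime; in a Worker you would pass `new HTMLRewriter()`.

```typescript
// Minimal structural types covering only the HTMLRewriter surface used here,
// so the sketch also type-checks outside the Workers runtime.
interface ElementLike {
  append(content: string, options?: { html?: boolean }): unknown;
}
interface ElementHandler {
  element(el: ElementLike): void;
}
interface RewriterLike {
  on(selector: string, handler: ElementHandler): RewriterLike;
  transform(response: Response): Response;
}

// Assumption: the widget is one script served by the Worker itself.
const CHAT_WIDGET_TAG = '<script src="/_chat/widget.js" defer></script>';

// Appends the widget tag as the last child of the matched element,
// i.e. just before </body>.
const chatWidgetInjector: ElementHandler = {
  element(el) {
    el.append(CHAT_WIDGET_TAG, { html: true });
  },
};

// Rewrites HTML responses and passes everything else through unchanged.
function injectWidget(originResponse: Response, rewriter: RewriterLike): Response {
  const contentType = originResponse.headers.get("content-type") ?? "";
  if (!contentType.includes("text/html")) return originResponse;
  return rewriter.on("body", chatWidgetInjector).transform(originResponse);
}

// In a Worker's fetch handler:
//   return injectWidget(await fetch(request), new HTMLRewriter());
```

The origin response is streamed through the rewriter, so the origin serves exactly the HTML it served before; only the bytes leaving the Worker differ.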

The Problem

"Our customers need instant answers, but the backend team says adding a chatbot means 6 weeks of re-architecture and a new deployment pipeline."

The Outcome

0 lines changed on your origin server. The Worker injects the chatbot and search bar, and AI Search powers them; the origin never knows.

Live demo below

The problem

Customers leave without answers. Your support inbox can't respond until Monday.

"We've calculated that we lose 15% of potential sales to unanswered product questions. Customers visit, can't get answers, and buy from a competitor."
— Head of Customer Experience, furniture retailer

Both sides run simultaneously — real API calls, no mocks

Without Cloudflare

Legacy page — no chatbot

homstyle.com/products/strandmon

🪑

STRANDMON Wing Chair

STR-4892 · In Stock

฿10,465

High-back wing chair for great neck and head support. Perfect for reading and relaxation.

Questions? Email support@homstyle.com

⚠️ Response time: 3–5 business days

With Cloudflare

💬 Chat | 🔍 Chat + AI Search

Workers AI · Llama 3.1 8B · 📄 This page only · STRANDMON Wing Chair

Hi! I'm HÖMSTYLE AI — I can answer questions about the STRANDMON Wing Chair on this page.

⚠️ Scoped to current page only — try "Compare with POÄNG" or "Best sofa for kids?" to see the limitation

The win

🔧

Zero origin code changes

HTMLRewriter injects the widget — origin untouched

🔍

1 tag

To add AI Search

<search-bar-snippet> — one web component, no build step

📂

AutoRAG

Grounded in your data

Upload catalog, FAQs → llama-3.3-70b answers from docs
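A hedged sketch of how a Worker might ask AI Search for a grounded answer through the Workers AI binding. The instance name `homstyle-catalog`, the `CatalogEnv` type, and the fallback message are assumptions for illustration; the `response` and `data` fields follow the documented shape of an `aiSearch` result, reduced to the two fields used here.

```typescript
// Structural types for the part of the AI binding this sketch uses.
interface AiSearchResult {
  response: string; // generated answer
  data: unknown[]; // retrieved chunks the answer was grounded in
}
interface CatalogEnv {
  AI: {
    autorag(instanceName: string): {
      aiSearch(input: { query: string }): Promise<AiSearchResult>;
    };
  };
}

// Hypothetical instance name; use the one created for your catalog.
const AI_SEARCH_INSTANCE = "homstyle-catalog";

// Returns the grounded answer, or a fallback when nothing was retrieved.
async function answerFromCatalog(env: CatalogEnv, question: string): Promise<string> {
  const result = await env.AI.autorag(AI_SEARCH_INSTANCE).aiSearch({ query: question });
  if (result.data.length === 0) {
    return "I couldn't find that in our catalog.";
  }
  return result.response;
}
```

Returning a fixed fallback when no chunks come back is one possible design choice: it keeps the bot from answering purely from model knowledge when the catalog has nothing relevant.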

AI Search — powered by Cloudflare AutoRAG

🔍

search-bar-snippet

Inline search with real-time results dropdown

api-url="https://dev.pongpisit.com/api/search-proxy/"


⌘

search-modal-snippet

Command palette — opens with Cmd+K / Ctrl+K

⌘K

HÖMSTYLE

Click "Search…" or press ⌘K to open

Esc to close · ↑↓ to navigate

Productionising this

What changes when you ship this for real

Rate limit per session

Add a Workers Rate Limiting binding (or a KV-backed sliding window) to cap queries per IP or client fingerprint. On Workers Paid, Llama 3.1 8B Fast has a default cap of 1M neurons/min.
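The sliding-window alternative can be sketched as a pure function, assuming the caller loads and stores each client's timestamp history (for example in KV, keyed by IP). The limits `WINDOW_MS` and `MAX_QUERIES` and the name `allowRequest` are hypothetical.

```typescript
// Hypothetical limits: at most 10 queries per client per 60 seconds.
const WINDOW_MS = 60000;
const MAX_QUERIES = 10;

interface LimitDecision {
  allowed: boolean;
  history: number[]; // timestamps to persist for this client
}

// Decides whether a request at time `now` fits in the sliding window.
// `history` holds the timestamps (ms) of this client's earlier accepted requests.
function allowRequest(history: number[], now: number): LimitDecision {
  // Keep only timestamps that are still inside the window.
  const recent = history.filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_QUERIES) {
    return { allowed: false, history: recent };
  }
  return { allowed: true, history: [...recent, now] };
}
```

A Worker would read the stored history for the client key, call `allowRequest(history, Date.now())`, write back `history`, and return a 429 when `allowed` is false. KV is eventually consistent, so a KV-backed counter is approximate under bursts.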

Secrets out of code

The AI Search instance name is fine in wrangler.toml. Anything sensitive (third-party API keys, dashboard credentials) must be stored with wrangler secret put and never committed to git.

Strict CORS

Replace Access-Control-Allow-Origin: * with an explicit allowlist (your prod domain + staging). The chatbot endpoint should not be embeddable from any origin.
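A small sketch of an origin allowlist, assuming homstyle.com is the production domain; the staging hostname and the helper name `corsHeaders` are hypothetical.

```typescript
// Assumption: homstyle.com is production; the staging host is hypothetical.
const ALLOWED_ORIGINS = new Set([
  "https://homstyle.com",
  "https://staging.homstyle.com",
]);

// Builds CORS headers for a request's Origin header value.
function corsHeaders(requestOrigin: string | null): Record<string, string> {
  // Vary: Origin keeps caches from reusing one origin's response for another.
  const headers: Record<string, string> = { Vary: "Origin" };
  if (requestOrigin !== null && ALLOWED_ORIGINS.has(requestOrigin)) {
    // Echo the specific origin instead of "*".
    headers["Access-Control-Allow-Origin"] = requestOrigin;
  }
  return headers;
}

// In the chatbot endpoint:
//   new Response(body, { headers: corsHeaders(request.headers.get("Origin")) });
```

Requests from origins outside the set receive no Access-Control-Allow-Origin header, so browsers block cross-origin reads of the response.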

Observability

Set observability = { enabled = true } in wrangler.toml. Workers Logs surfaces every console.error and request — invaluable when AI Search starts returning unexpected chunks.

Inject script externally

For strict-CSP origins, serve the chat widget JS from /_chat/widget.js on the Worker so the origin's policy can use script-src 'self' instead of 'unsafe-inline'.
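One way this route might look, as a sketch: the Worker answers /_chat/widget.js itself and lets every other path fall through to the origin. The function name `serveWidget`, the placeholder script body, and the cache lifetime are assumptions.

```typescript
const WIDGET_PATH = "/_chat/widget.js";

// Placeholder body; the real widget bundle would go here.
const WIDGET_JS = 'console.log("chat widget loaded");';

// Returns the widget script for WIDGET_PATH, or null so the caller can
// fall through to the origin for every other path.
function serveWidget(url: URL): Response | null {
  if (url.pathname !== WIDGET_PATH) return null;
  return new Response(WIDGET_JS, {
    headers: {
      "content-type": "application/javascript; charset=utf-8",
      // Hypothetical cache lifetime of five minutes.
      "cache-control": "public, max-age=300",
    },
  });
}

// In a Worker's fetch handler:
//   const widget = serveWidget(new URL(request.url));
//   if (widget) return widget;
```

Because the script is served from the same host as the page, a policy of script-src 'self' covers it, and the injected tag only needs a src attribute rather than inline code.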

AI Search content refresh

AI Search re-indexes the R2 source bucket on a schedule (default: 6h). For frequently-changing catalogs, set up an R2 event notification or trigger re-index via the API after upload.