
Add /generate slash command for FLUX-1-schnell image generation#42

Merged
jonnyparris merged 2 commits into main from feat/flux-generate-command
Apr 23, 2026

Conversation

@jonnyparris
Owner

Summary

  • Adds /generate <prompt> slash command that generates images via Workers AI FLUX-1-schnell and renders them inline
  • New POST /session/:id/generate endpoint bypasses the LLM chat path entirely
  • Images are uploaded to R2 under attachments/ and delivered via existing message_attachments SSE event

Why

A previous session (65c3019e) showed me asking an LLM (GPT-5.4) to generate a steampunk Cloudflare logo — it just offered prompt-writing advice instead. Dodo couldn't actually make images.

Workers AI has FLUX-1-schnell available as @cf/black-forest-labs/flux-1-schnell and the UI already knows how to render attachments. This PR wires the two together.

What changed

  • wrangler.jsonc — add ai: { binding: "AI" }
  • src/types.ts — add AI: Ai to the Env interface
  • src/shared-index.ts — add FLUX-1-schnell to WORKERS_AI_MODELS catalog
  • src/index.ts — add POST /session/:id/generate (rate-limited 30/hr per user)
  • src/coding-agent.ts — add handleGenerate():
    • Persists user message to Think session
    • Calls env.AI.run("@cf/black-forest-labs/flux-1-schnell", { prompt })
    • Uploads base64 JPEG to R2 via existing uploadAttachment()
    • Emits message + message_attachments SSE events
  • public/js/dodo-chat.js — detect /generate <prompt> in sendMessage() and route to new endpoint
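The R2 upload step above requires turning FLUX's base64 JPEG into raw bytes. A minimal sketch of that conversion, using the `atob` global available in Workers — the helper name is illustrative, not the repo's actual code:

```typescript
// Hypothetical helper: decode the base64 JPEG returned by FLUX-1-schnell
// into bytes suitable for an R2 put(). Uses atob(), which Workers provide.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);          // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i); // one byte per character
  }
  return bytes;
}
```

The resulting `Uint8Array` can be passed straight to `bucket.put(key, bytes)` with a `image/jpeg` content type.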

Test plan

  • npx tsc --noEmit passes
  • npx vitest run — all 382 tests pass
  • Manual: type /generate a cyberpunk cat in a Dodo session, verify image renders inline

Notes

  • FLUX-1-schnell returns base64 JPEG in response.image — we upload that to R2 directly rather than inlining the data URL
  • The user sees a minimal assistant bubble with the image; no LLM narration is involved
  • Rate limit is 30/hr per user (half of chat prompt limit) to stay conservative on Workers AI costs
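The 30/hr cap can be sketched as a fixed-window counter. This in-memory version is illustrative only — the real endpoint presumably persists counters server-side rather than in a process-local `Map`:

```typescript
// Minimal fixed-window sketch of the 30/hr per-user cap described above.
const HOURLY_LIMIT = 30;

interface RateWindow { start: number; count: number }
const windows = new Map<string, RateWindow>();

function allowGenerate(userId: string, now = Date.now()): boolean {
  const HOUR = 60 * 60 * 1000;
  const w = windows.get(userId);
  if (!w || now - w.start >= HOUR) {
    windows.set(userId, { start: now, count: 1 }); // fresh window
    return true;
  }
  if (w.count >= HOURLY_LIMIT) return false;       // over the cap
  w.count++;
  return true;
}
```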

beep-boop-🤖

…e generation

Wire up Workers AI image generation in Dodo via a new /generate slash
command. Users can type /generate <prompt> to get a FLUX-1-schnell image
rendered inline in the chat.

Changes:
- Add AI binding to wrangler.jsonc
- Add AI: Ai to Env interface
- Add @cf/black-forest-labs/flux-1-schnell to WORKERS_AI_MODELS catalog
- Add POST /session/:id/generate endpoint (rate-limited 30/hr)
- Add handleGenerate() in CodingAgent: persists user+assistant messages,
  calls env.AI.run() for FLUX, uploads base64 JPEG to R2, emits SSE
  message_attachments event for inline rendering
- Add /generate slash command detection in dodo-chat.js client
@cloudflare-workers-and-pages


cloudflare-workers-and-pages Bot commented Apr 23, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status: ✅ Deployment successful! (View logs)
Name: dodo
Latest Commit: fcb8ac1
Updated (UTC): Apr 23 2026, 09:26 PM

Apply all review comments from PR #42:

Critical fixes:
- C1: Guard against concurrent prompt collision (return 409 when another
  prompt is running, matching handleMessage's behaviour)
- C2: Fall back to inline data URL when R2 is unavailable instead of
  silently emitting a text-only stub — users always see the image
- C3: Enforce 2048-char prompt limit (FLUX-1-schnell schema cap) via a
  dedicated generateImageSchema
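The C3 prompt-length check can be sketched as a standalone validation. The PR uses a dedicated `generateImageSchema`; this plain function is an assumption-laden stand-in that mirrors the same 2048-character cap:

```typescript
// Illustrative validation for C3: reject empty prompts and those over
// FLUX-1-schnell's 2048-character schema cap. Returns an error message
// or null when valid. Names are a sketch, not the repo's actual schema.
const MAX_FLUX_PROMPT_LENGTH = 2048;

function validateGeneratePrompt(prompt: string): string | null {
  const trimmed = prompt.trim();
  if (trimmed.length === 0) return "Prompt must not be empty";
  if (trimmed.length > MAX_FLUX_PROMPT_LENGTH) {
    return `Prompt exceeds ${MAX_FLUX_PROMPT_LENGTH} characters`;
  }
  return null; // valid
}
```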

Correctness:
- Defensive FLUX response parsing (no more 'as { image: string }' type
  assertion) — throws a descriptive error if shape drifts
- Random seed on every call so repeat prompts don't collapse to the same image
- Split chat models (WORKERS_AI_MODELS) from image models
  (WORKERS_AI_IMAGE_MODELS) so FLUX never shows up in the chat model
  picker where it would brick every subsequent prompt
- Centralize FLUX constants (model id, media type, max prompt length)
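The defensive-parsing fix can be sketched as follows. The `image` field name comes from the PR notes; everything else (function name, error wording) is illustrative:

```typescript
// Sketch of the defensive FLUX response parsing: instead of an
// `as { image: string }` assertion, verify the shape at runtime and
// throw a descriptive error if it drifts.
function parseFluxImage(response: unknown): string {
  if (
    typeof response !== "object" ||
    response === null ||
    typeof (response as { image?: unknown }).image !== "string"
  ) {
    throw new Error(
      `Unexpected FLUX response shape: ${JSON.stringify(response)}`
    );
  }
  return (response as { image: string }).image; // base64 JPEG
}
```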

UX:
- Shorter assistant caption (truncate prompt to 80 chars + truncation
  marker) — long prompts no longer dominate the bubble
- Multi-line prompt support in slash command regex ([\s\S]+ instead of .+)
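The caption truncation above amounts to a one-liner; a hedged sketch, with the function name invented for illustration:

```typescript
// Sketch of the shorter assistant caption: echo the prompt, but cap it
// at 80 characters with a truncation marker so long prompts don't
// dominate the bubble.
const CAPTION_MAX = 80;

function captionForPrompt(prompt: string): string {
  return prompt.length <= CAPTION_MAX
    ? prompt
    : prompt.slice(0, CAPTION_MAX) + "…";
}
```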

Cost/abuse:
- Add daily cap (100/day) on top of hourly cap (30/hr) — bounds long-horizon
  FLUX cost exposure per user

Refactor:
- Extract persistGenerateUserMessage() and persistGeneratedImageMessage()
  helpers to keep handleGenerate readable (~70 lines instead of 140)

Tests:
- New test/generate-unit.test.ts covers slash command regex (basic,
  case-insensitive, multi-line, whitespace-only, concatenated, mid-message),
  FLUX prompt-length constant parity with the Workers AI schema, and the
  chat-vs-image model catalog separation
- All 395 tests pass (up from 382)
@jonnyparris jonnyparris marked this pull request as ready for review April 23, 2026 21:30
@jonnyparris jonnyparris merged commit 7e8a1c6 into main Apr 23, 2026
4 checks passed
jonnyparris added a commit that referenced this pull request Apr 23, 2026
#44)

Follow-up to #42. Three things:

1. Slash command autocomplete in the chat input. Type '/' to see a menu of
   available commands (currently just /generate). Arrow keys + Enter/Tab
   complete, Escape dismisses. Menu positioned above the input, styled to
   match the rest of the app.

2. Server-side /generate routing. handleMessage and handlePrompt now detect
   /generate in the content and delegate to the image-generation core
   (runImageGeneration). This means /generate works from:
     - Browser UI (client intercept + this server fallback)
     - MCP send_message / send_prompt tools
     - Any future HTTP clients

   Previously the slash command only worked in the browser because the
   client JS was the only layer that knew about it.

3. New MCP tool: generate_image. Calls /generate directly with a prompt
   string. Lets LLM agents (incl. myself) invoke FLUX from conversations.

Refactor:
- Extract runImageGeneration() core from handleGenerate — shared entry
  point for the dedicated endpoint and slash-routed messages.
- GENERATE_SLASH_REGEX + extractGeneratePrompt() in shared-index so the
  browser and server agree on what counts as a /generate request.
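A sketch of that shared helper, combining the `[\s\S]+` multi-line pattern from PR #42 with the behaviours the tests describe (null on non-slash messages, whitespace handling, case-insensitivity). The export names match the commit message; the exact regex and trimming are assumptions:

```typescript
// Shared /generate detection so browser and server agree on what counts
// as a generate request. [\s\S]+ (not .+) keeps multi-line prompts intact.
const GENERATE_SLASH_REGEX = /^\/generate\s+([\s\S]+)$/i;

function extractGeneratePrompt(content: string): string | null {
  const match = content.trim().match(GENERATE_SLASH_REGEX);
  if (!match) return null;                   // not a /generate message
  const prompt = match[1].trim();
  return prompt.length > 0 ? prompt : null;  // whitespace-only -> null
}
```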

Tests:
- 6 new unit tests for extractGeneratePrompt (happy path, null on non-/
  messages, whitespace handling, multi-line, case-insensitive)
- Updated existing slash-regex tests to use the canonical shared export
- All 401 tests pass
@jonnyparris jonnyparris deleted the feat/flux-generate-command branch April 25, 2026 19:03