Commit Graph

6 Commits

Author SHA1 Message Date
Muyue
97a25295fc feat(browser-test): persistent token + auto-reconnect + screenshot + smarter strategy (v0.7.9)
Four user-reported issues with the AI-driven browser test feature:

1. WS dies on every page reload / navigation (token was single-use,
   5 min TTL → AI loses session permanently).
2. AI can't see new pages opened by its actions (same root cause).
3. No screenshot capability — AI cannot capture page state visually.
4. AI burns ~150 tool calls for what's 5 human actions, mostly
   list_clickables loops.

Fixes:

- Token sliding TTL (60 min): ConsumeToken no longer deletes the
  token; it refreshes its expiration on each successful WS connect.
  Same token survives reload / re-paste / navigation as long as
  there's no 60-min idle gap.

- Snippet auto-reconnect: WS onclose schedules reconnect with
  500ms × attempt backoff (max ~2.5s). Handles transient drops,
  server restarts, and WS hiccups without user intervention. Full
  navigation kills the JS context and is unrecoverable from JS — but
  the user just re-pastes the snippet, same token works.

- New 'screenshot' action: snippet captures via SVG foreignObject +
  canvas → base64 PNG → sent back over the existing WS reply
  channel. Server decodes and writes to ~/.muyue/screenshots/
  <filename>.png (sanitized name, timestamp default). Filename
  characters limited to a safe charset to prevent path escape.
  Best-effort: external CSS / cross-origin images / iframes won't
  inline.

- Studio system prompt rewritten <browser_test_strategy>:
  - Explicit rule: don't list_clickables after every click
  - Action cost table (cheap vs expensive)
  - When to re-list (URL change, dialog, click-not-found only)
  - Standard final report format ✓ / ✗ / ⚠ / 📸

Also bundles v0.7.8 (cherry-picked): unsafe.Pointer(uintptr(hPC))
instead of unsafe.Pointer(&hPC) in UpdateProcThreadAttribute, so
PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE is correctly applied and the
spawned shell attaches to the embedded xterm.js instead of opening
a separate external console window (regression fix from v0.7.6).

- internal/api/browsertest.go: token sliding, screenshot save,
  param schema, snippet rewrite, helpers
- internal/agent/prompts/studio_system.md: strategy rewrite
- internal/version/version.go: 0.7.7 → 0.7.9
- CHANGELOG.md: v0.7.9 entry covering all fixes
2026-04-27 16:50:04 +02:00
Muyue
c820d55710 feat: AI-driven browser tests — Tests tab + browser_test agent tool
Some checks failed
PR Check / check (pull_request) Failing after 33s
New feature: give Studio's AI control of any browser tab to test buttons,
read the console, and report which buttons work / fail.

Backend (internal/api/browser_test.go, ~480 LOC):
- WebSocket endpoint /api/ws/browser-test, auth by single-use 5-min token
- BrowserTestStore: session map (capped at 16, LRU evict), token store
- REST: /api/test/snippet (issues token + JS snippet), /api/test/sessions,
  /api/test/console/{id}
- Agent tool 'browser_test' wired into the registry, with actions:
  list_clickables / click / eval / console / current_url / type / wait /
  summary. click returns the console_delta produced during the click.
- Embedded JS runner: opens WS, hooks console + window.onerror +
  unhandledrejection, dispatches dispatcher commands, replies with
  correlation IDs, watches for URL changes.

Frontend:
- New Tests tab (web/src/components/Tests.jsx): snippet copy + connected
  sessions list + live console viewer
- App.jsx: 5th tab + Ctrl+4 shortcut (Config moves to Ctrl+5)
- api/client.js: getTestSnippet / getTestSessions / getTestConsole

Studio prompt:
- internal/agent/prompts/studio_system.md: added browser_test entry to the
  tools table + <browser_test_strategy> section explaining the recommended
  loop (summary → list_clickables → click → check console_delta → report)

Versioning:
- v0.6.0 → v0.7.0
- CHANGELOG.md: full entry under v0.7.0
2026-04-27 11:02:05 +02:00
Muyue
6a7b4d8001 release: v0.6.0 — security audit fixes + 7 new features
All checks were successful
PR Check / check (pull_request) Successful in 57s
Audit corrections (security, concurrency, stability):
- chat_engine: bound resp.Choices[0] access, release tool slot per-iteration
- conversation_multi: synchronous save under existing lock (was racy fire-and-forget)
- workflow/engine: short-circuit on failed deps (no more infinite busy-wait); track failed/skipped status
- handlers_workflow: rune-aware truncate for plan goal (UTF-8 safe)
- server: CORS limited to localhost origins (was wildcard)
- handlers_info / terminal: mask API keys and SSH passwords as "***" in GET responses; preserve stored secret if "***" sent on update
- terminal: sshpass uses -e + SSHPASS env var (was both -p and -e)
- handlers_chat: MaxBytesReader 50 MB on /api/chat
- image_cache: 10 MB cap per image
- handlers_config: font size <= 72; profile-save unmarshal errors propagated
- handlers_info: /lsp/auto-install ProjectDir restricted to user home
- Shell.jsx: parenthesized resize-condition (operator precedence)
- orchestrator_test: CleanAIResponse capitalization (fixes failing vet)

New features:
- platform: detect OS name (Debian, Ubuntu, Windows 11, macOS X.Y) and inject in Studio system prompt next to the date
- agents: default timeout 30 min for crush_run/claude_run (cap also 30 min)
- agents: new cwd, wsl_distro, wsl_user params; on Windows hosts launch via "wsl -d <distro> -u <user> --cd <cwd> --"
- agents: new claude_run tool (mirror of crush_run for Claude Code CLI)
- terminal: list installed WSL distros individually in new-tab menu (Windows only)
- studio: system prompt rewritten around BMAD-METHOD personas + mandatory delegation template
- studio: "Réflexion avancée" toggle — inactive provider produces a preliminary report injected as [RAPPORT PRÉALABLE] context for the active provider
- studio: "Historique compressé" toggle — collapses past tool calls to last action only, with "Tout afficher" expansion
2026-04-27 10:12:11 +02:00
Augustin
12000e523c fix: token persistence, context windows, CSS tables/bullets/hr, image attachments
All checks were successful
Beta Release / beta (push) Successful in 1m1s
- Fix token count reset on app restart: persist realTokens in conversation.json
- Fix token/context window values: Studio 150K (summarize at 120K), Terminal 100K
- Fix table rendering in terminal tab: correct thead/tbody display model
- Fix copy button always top-right in Studio code blocks
- Add markdown horizontal rule (---) support in Studio and Terminal
- Fix bullet list double dot: remove CSS ::before duplicate bullet point
- Add image attachments support (VLM description, file mentions @file.ext)
- Add sudo detection with cache (sync.Once)
- Fix message content serialization (TextContent wrapper)
- Guide AI to use read_file instead of cat in studio prompt

💘 Generated with Crush

Assisted-by: GLM-5.1 via Crush <crush@charm.land>
2026-04-26 15:19:26 +02:00
Augustin
cb3d35756a feat: terminal sudo blocking, token tracking, mermaid & consumption UI
All checks were successful
Beta Release / beta (push) Successful in 1m3s
- Block sudo/doas commands when not running as root
- Add real token counting from API responses
- Track and display consumption by provider/day
- Add Mermaid diagram rendering in Shell and Studio
- Add copy-to-clipboard buttons for code blocks
- Support tables in AI message rendering
- Update system prompt with context (date, time, root status)

💘 Generated with Crush

Assisted-by: MiniMax-M2.7 via Crush <crush@charm.land>
2026-04-26 12:43:15 +02:00
Augustin
66b773ff86 feat(agent): refactor AI chat with streaming, agent registry, and tool execution
All checks were successful
Beta Release / beta (push) Successful in 47s
- Replace old tool-call regex with proper agent registry
- Add streaming chat via SSE (handleStreamChat / handleNonStreamChat)
- Add internal/agent package with tool definitions and execution
- Add orchestrator with system prompt and tool scaffolding
- Add internal/agent/ directory
- Studio.jsx: streaming chat with thinking indicator and tool result rendering
- global.css: chat bubble styles, streaming animation, thinking dots
- handlers_chat.go: full rewrite using new agent/orchestrator architecture

💘 Generated with Crush

Assisted-by: MiniMax-M2.7 via Crush <crush@charm.land>
2026-04-22 21:19:36 +02:00