Developer Tools: Automation

21 sources Updated May 24, 2026

Browser and file control are becoming precise, code-driven primitives. WebMCP controls the main Chrome instance without sandboxed Playwright; dev-browser lets agents drive browsers by writing code, which is faster than screenshot-based computer use; agent-browser reportedly uses 82% fewer tokens than Playwright MCP; and MarkItDown converts PDF, PPTX, DOCX, Excel, audio, and YouTube content to clean Markdown, on the premise that LLMs reason better on the format they were trained on. Server-side automation is unlocked (Obsidian Headless supports Publish and Sync without the desktop app), local-first ingestion runs without data leaving the machine (brain-ingest extracts 12-18 claims from a 90-minute talk), and Obsidian gets both fuzzy and precise retrieval through the smart-connections and qmd MCP servers.

Issue tracking is going real-time and agent-driven: LogRocket surfaces customer issues for a Markdown-based fix workflow without planning decks, and Symphony assigns a Codex agent to every open issue, turning trackers into always-on systems where humans shift from doing to reviewing. Productivity skills give agents eyes, self-reflection, and self-improvement (the /ss screenshot skill saving ~1 hour/week, Codex's hidden Chronicle analyzing usage patterns, claude-smart turning repeated corrections into reusable local rules), while structured /goal prompts add measurable outcomes, constraints, verification, rollback, and stop rules to agent tasks.

The pattern replicates across verticals: ml-intern automates the post-training research loop end-to-end (beating Claude Code on GPQA), finance has its own dedicated Ai Trading sub-ecosystem, and systematic diagnostics now treat before/after tests as the closure condition for mundane ops work. The supporting cast (Goose, Paperclip Desktop, Orca IDE's Notion-style editing, official Claude Code setup plugins) and a hard scaling lesson (a roughly 5GB ceiling on Git repositories pushes knowledge bases of 2.3GB+ toward SQLite) round out the agent-automation tooling layer.

Insights

Browser and File Automation

  • Enable chrome://inspect/#remote-debugging to use Google's WebMCP to control your main Chrome browser instance directly, without sandboxed Playwright (from lightning fast chrome browser control)
  • dev-browser CLI (npm i -g dev-browser) lets agents control browsers by writing code -- faster than screenshot-based computer use because code is precise, repeatable, and composable (from dev browser sawyerhood)
  • MarkItDown (Microsoft, MIT, 87K stars) converts PDF/PPTX/DOCX/Excel/images/audio/YouTube to clean Markdown -- LLMs reason better on Markdown because they were trained on it (from markitdown microsoft file converter)
  • MarkItDown also ships as an MCP server for Claude Desktop integration; on the command line a conversion is a single command: markitdown path-to-file.pdf > document.md (from markitdown microsoft file converter)
  • Obsidian Headless now supports Publish and Sync without the desktop app -- enables server-side vault automation (from obsidian headless publish sync)
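
The single-command MarkItDown conversion in the list above extends naturally to a whole folder. The following is a minimal sketch, assuming the markitdown CLI is installed and on PATH and prints Markdown to stdout (as the redirect in the command above implies). The helper names (markdown_path_for, run_markitdown, convert_tree) and the suffix list are illustrative assumptions, not part of MarkItDown.

```python
import subprocess
from pathlib import Path
from typing import Callable

# Suffixes to pick up; an illustrative subset, not MarkItDown's full list.
SUFFIXES = {".pdf", ".pptx", ".docx", ".xlsx"}


def markdown_path_for(src: Path, src_root: Path, out_root: Path) -> Path:
    """Mirror src's position under src_root into out_root, with a .md suffix."""
    return (out_root / src.relative_to(src_root)).with_suffix(".md")


def run_markitdown(src: Path) -> str:
    """Run `markitdown <file>` and return the Markdown it prints to stdout."""
    done = subprocess.run(
        ["markitdown", str(src)], capture_output=True, text=True, check=True
    )
    return done.stdout


def convert_tree(src_root: Path, out_root: Path,
                 convert: Callable[[Path], str] = run_markitdown) -> list[Path]:
    """Convert every supported file under src_root; return the files written."""
    written = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in SUFFIXES:
            continue
        dest = markdown_path_for(src, src_root, out_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(convert(src), encoding="utf-8")
        written.append(dest)
    return written
```

Calling convert_tree(Path("inbox"), Path("vault/converted")) would mirror the folder layout under the output root. The converter is injectable so the traversal can be exercised without the CLI; note that two sources sharing a stem (report.pdf and report.docx) would map to the same .md path in this sketch.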

Real-Time Issue Tracking

  • Use LogRocket to identify customer issues in near-real-time, then fix them through a Markdown-based workflow without planning decks (from real time customer issue tracking at rippling)

  • Goose's interface is limited, but its underlying technical capabilities are strong and being overlooked; Jack Dorsey calls it a "superpower" tool, and the team is actively pushing rapid improvements (from goose ai development tool jack endorsement)

  • Paperclip Desktop is now available as a free Mac app, packaging the Paperclip orchestration platform into a standalone desktop interface (from paperclip desktop mac app launch)

  • Orca IDE v1.0.80+ adds Notion-style markdown editing — described as "Notion and Obsidian had a baby" — enabling faster spec reviews and natural doc editing without fighting the editor during agentic workflows (from orca ide notion style markdown editing)

  • Git repositories run into a practical size ceiling of around 5GB (hosting-side guidance rather than a limit built into Git itself); knowledge bases that grow to 2.3GB+ end up migrating to SQLite backends, so plan for this transition early in long-running git-based knowledge systems (from garry tan openclaw git wiki gstack)

  • agent-browser (Vercel Labs, 26K+ GitHub stars) lets AI agents control a dedicated Chrome browser for web scraping — handles JavaScript-heavy sites, pages behind logins, and dynamic content; uses 82% fewer tokens than Playwright MCP (from nick spisak shared link)

  • brain-ingest CLI processes video/audio locally (no data leaves machine) and generates structured Obsidian notes with frontmatter and wikilinks — extracts 12-18 claims from a 90-minute talk vs. raw transcript noise (from nyk builderz shared link)

  • Two MCP servers for Obsidian integration: smart-connections (semantic search across vault) and qmd (structured queries and metadata operations) — together they give Claude both fuzzy and precise retrieval over a knowledge graph (from nyk builderz shared link)
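
The Git-to-SQLite transition noted in the list above can be sketched with the standard library alone. This is a hedged illustration, not the migration the source used: the notes table schema, the function names, and the choice to use the quoted 2.3GB figure as a trigger are all assumptions.

```python
import sqlite3
from pathlib import Path

# The 2.3GB figure quoted above, used here only as a rough trigger (assumption).
MIGRATE_AT_BYTES = int(2.3 * 1024**3)


def repo_size_bytes(root: Path) -> int:
    """Total size of all files under root, including the .git directory."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def migrate_notes(vault: Path, db_path: Path) -> int:
    """Copy every Markdown file under vault into SQLite; return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per note, keyed by vault-relative path.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notes ("
            " path TEXT PRIMARY KEY,"
            " body TEXT NOT NULL,"
            " mtime REAL NOT NULL)"
        )
        count = 0
        with conn:  # one transaction: commit on success, roll back on error
            for p in sorted(vault.rglob("*.md")):
                rel = p.relative_to(vault)
                if ".git" in rel.parts:
                    continue  # skip anything inside Git's own directory
                conn.execute(
                    "INSERT OR REPLACE INTO notes (path, body, mtime)"
                    " VALUES (?, ?, ?)",
                    (rel.as_posix(), p.read_text(encoding="utf-8"),
                     p.stat().st_mtime),
                )
                count += 1
        return count
    finally:
        conn.close()
```

A periodic job could compare repo_size_bytes(vault) against MIGRATE_AT_BYTES and call migrate_notes once the threshold is crossed. INSERT OR REPLACE keeps the import idempotent, so re-running it after further edits refreshes existing rows rather than duplicating them.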

Productivity Skills

  • Allie K. Miller's /ss screenshot skill is the reference implementation for visual input — lists newest files in your screenshots folder, grabs the most recent or N most recent (/ss 4), and acts on the trailing argument (huh/fix/do this); saves ~1 hour/week (from claude screenshot skill visual processing)

  • Codex's hidden Chronicle feature analyzes computer-usage patterns and offers direct productivity feedback — invoked by an explicit prompt asking what the user has been doing inefficiently (from codex app chronicle productivity analysis)

  • A strong /goal prompt should specify one measurable outcome, repo/context, constraints, priority, plan, done-when criteria, verification steps, output expectations, and stop rules — turning vague "make no mistakes" instructions into a bounded mission contract (from goal command prompt engineering structure)

  • Stop rules are an explicit anti-scope-creep mechanism: halt on high-impact ambiguity, surface ranked proposals before acting, and stop expanding once the stated goal is satisfied (from goal command prompt engineering structure)

  • The official claude-code-setup plugin scans a project for hooks, skills, MCP servers, subagents, and automations, then configures them step-by-step — setup automation is becoming part of the Claude Code product surface, not a community-only chore (from claude code setup plugin enhancement)

  • "Codex maxxing" is emerging as a named discipline around daily primitives and workflows used by Codex team members, signaling that agent productivity now has repeatable operating practices worth documenting and critiquing (from codex maxxing primitives draft)

  • claude-smart stores actionable local learnings from corrections, such as replacing a hanging npm test watch command with npm test -- --run, so repeated repo-specific mistakes become reusable automation rules (from claude smart self improving plugin)

  • Systematic network troubleshooting should start with a baseline speedtest-cli run, then check DNS resolution, MTU, packet loss, Wi-Fi interference, stale network profiles, bandwidth-heavy background processes, and mDNS, and finish with a before/after validation run to confirm the fix (from systematic network troubleshooting methodology)
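
The /ss screenshot skill described in the list above reduces to two small steps: parse an optional count plus a trailing instruction, then pick the newest files in the screenshots folder. A minimal sketch of that logic follows; it is not the skill's actual implementation, and the function names, the image suffix set, and the default count of 1 are assumptions.

```python
from pathlib import Path

# Image types treated as screenshots; an assumed set for illustration.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def parse_ss_args(args: str) -> tuple[int, str]:
    """Split /ss arguments into (count, trailing instruction).

    A leading positive integer is the count; everything else is the
    instruction. With no leading integer the count defaults to 1.
    """
    text = args.strip()
    head, _, rest = text.partition(" ")
    if head.isdecimal() and int(head) > 0:
        return int(head), rest.strip()
    return 1, text


def newest_screenshots(folder: Path, count: int = 1) -> list[Path]:
    """Return the `count` most recently modified image files, newest first."""
    images = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    ]
    images.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return images[:count]
```

Under these assumptions, "/ss 4 fix" would parse to a count of 4 and the instruction "fix", and the agent would then be handed the four newest image paths along with that instruction.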

ML Research Automation

  • ml-intern automates the post-training research loop end-to-end: arXiv reads, citation walking, HF dataset pulls, HF Jobs training, run monitoring, failure diagnosis, and retraining. It reportedly beat Claude Code on GPQA (32% vs. 22.99%, in under 10 hours). Ships as a CLI plus a phone/desktop web app, with $1k in GPU and Anthropic credits provisioned for early users (from ml intern automated research agent)

Trading and Finance Tooling

  • See Ai Trading for the full set of AI-driven quant/trading repos (Vibe-Trading, FinceptTerminal, OpenBB, daily_stock_analysis, Kronos, qlib, freqtrade) — finance has its own dedicated topic.

Symphony and Open-Source Orchestration

  • Symphony assigns a Codex agent to every open issue in a task tracker — turns issue trackers into always-on agentic systems, shifting humans from doing to reviewing and directing; open-source orchestrator for Codex (from symphony codex agent orchestrator)
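
The "one agent per open issue" idea can be pictured as a reconciliation step run on every poll of the tracker. The sketch below is purely illustrative and does not reflect Symphony's actual API or implementation: the Issue type, assign_agents, and the start_agent callback are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Issue:
    """Hypothetical minimal view of a tracker issue."""
    id: str
    title: str
    is_open: bool


def assign_agents(issues: Iterable[Issue], active: dict[str, str],
                  start_agent: Callable[[Issue], str]) -> list[str]:
    """Start one agent per open issue that has none; return the new issue ids.

    `active` maps issue id -> agent handle and is updated in place.
    Issues that are no longer open are dropped from `active`.
    """
    issues = list(issues)
    open_ids = {issue.id for issue in issues if issue.is_open}
    for stale_id in set(active) - open_ids:
        del active[stale_id]
    started = []
    for issue in issues:
        if issue.is_open and issue.id not in active:
            active[issue.id] = start_agent(issue)
            started.append(issue.id)
    return started
```

Running this on each poll keeps exactly one agent per open issue; the human role then moves to reviewing what each agent produces, as the bullet above describes.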

Voices

8 contributors