Welcome to the 289th edition of Coding Jag brought to you by TestMu AI!👐
Multiple AI agents are now working together in real-world projects - one writes the code, another runs the tests, a third flags what broke - and this is happening inside the tools teams use every day. The challenge now is not whether to adopt multi-agent AI but how to use it without losing visibility, coverage, or control.
This edition captures where the real work is happening. GitHub is parallelizing agent tasks directly from the command line. OpenAI just raised $122 billion. Google folded Firebase into AI Studio, so vibe-coded apps finally have real backends.
You will also find practical field reports, from a solo developer teaching Claude to QA a mobile app overnight, to Bas Dijkstra's experiment with Claude Code generating tests for a Spring Boot API, to LangChain's step-by-step agent evaluation checklist for teams trying to ship responsibly.
📬 Got something worth sharing? Hit reply - we'd love to hear what you're seeing out there.
News
06 min
github.blog
🤖 GitHub's /fleet command in Copilot CLI lets Copilot break prompts into parallel subagent tracks. Matt Nigh and Brian LaFlamme show how an orchestrator decomposes tasks, dispatches agents simultaneously, and resolves dependencies across files and modules.
12 min
firebase.blog
🔥 Kara Yu and Sam Phillips announce Firebase's native integration with Google AI Studio. The Antigravity coding agent now auto-provisions Firestore and Firebase Auth, turning vibe-coded prototypes into persistent, authenticated apps from a single prompt.
07 min
openai.com
💰 OpenAI closes a $122 billion round at an $852 billion valuation, anchored by Amazon, NVIDIA, and SoftBank. The company reports $2 billion in monthly revenue, 900 million weekly active ChatGPT users, and Codex users growing 5x in three months.
AI
12 min
medium.com
Pinterest cut build times by 36% - not by writing better tests, but by rebuilding infrastructure. Karthik Sivakumar argues that quality engineering teams must shift from scripting test cases to architecting scalable test execution platforms.
08 min
thegreenreport.blog
❓ Irfan Mujagic at The Green Report examines the core paradox of AI-generated tests: when AI writes both code and tests, neither can reliably catch the other's errors. Human verification remains essential, not optional.
11 min
christophermeiklejohn.com
📱 Christopher Meiklejohn built a Claude-powered QA agent for his Zabriskie app, automating screenshot analysis and bug reports. Android setup took 90 minutes. iOS took over six hours, exposing a sharp tooling gap in mobile automation.
11 min
ontestautomation.com
🧪 Bas Dijkstra tests Claude Code's ability to write acceptance tests for a Spring Boot API from scratch. Claude hit 95% line coverage and 91% mutation coverage in minutes, but missed several critical application execution paths.
08 min
developers.googleblog.com
🧠 Lavi Nigam and Shubham Saboo walk through Google ADK's SkillToolset, enabling agents to load domain expertise on demand at runtime. Progressive disclosure replaces monolithic system prompts for security checklists, compliance audits, and validators.
11 min
blog.langchain.com
✅ Victor Moreira at LangChain delivers a practical step-by-step agent evaluation checklist covering error analysis, dataset construction, grader design, and offline and online eval strategies. Core advice: start with the simplest eval that gives a real signal.
Automation
06 min
testmuai.com
📚 TestMu AI's customer story with Atypon, a leading scholarly publishing platform, shows how KaneAI and HyperExecute accelerate test coverage across complex content delivery pipelines, cutting manual overhead while keeping reliability intact at scale.
10 min
mindstudio.ai
🌐 The MindStudio team explains how Claude Code controls browsers via Playwright for scraping, form filling, and multi-step workflows. Describe the task in plain English; Claude writes and runs the script, reads the output, debugs failures, and iterates until the task is complete.
12 min
tothenew.com
⚙️ Mahesh Sahebrao Wankhede at To The New built a full Selenium POM framework using MCP. AI proves strong at scaffolding but struggles to maintain architectural consistency across a real codebase without explicit context preservation in each session.
Tools
12 min
aimultiple.com
🔍 Cem Dilmegani at AIMultiple evaluated 7 AI test agents and found that most are overhyped Selenium wrappers dressed up with marketing. A handful genuinely handle test authoring, visual testing, or maintenance, but each carries significant real-world limitations practitioners should know.
Video & Podcast
06 min
testingpodcast.com
🎙️ In this episode of This Week in Quality, Simon and Judy dive into “joy scrolling” in the MoTaverse, trade stories about time-zone chaos and meaningful moments, and highlight how community content, chapters, and AMAs help turn learning into real momentum.
10 min
youtube.com
🎥 This tutorial by Execute Automation explores how to work with AI agents in real time to create advanced, context-aware automation workflows without manually coding anything. Just describe what you want in plain language, and the AI takes over.
Events
11 min
testguild.com
🏟️ Joe Colantonio brings TestGuild IRL to San Francisco on 8th April 2026, an in-person event for automation engineers and QA professionals. Expect hands-on sessions, real-world case studies, and direct access to practitioners solving actual testing challenges.