Welcome to the 302nd edition of Coding Jag brought to you by TestMu AI!👐
This week the frontier got locked down. OpenAI previewed its GPT-5.6 family (Sol, Terra, and Luna), models so capable that, at the US government's request, early access is limited to about 20 vetted partners. Anthropic went the other way and shipped Claude Sonnet 5, a cheaper, agent-focused model that became the default for free and Pro users. And while the labs jockeyed over models and talent, a critical Oracle E-Business Suite flaw was already being exploited in the wild. Powerful, guarded, and moving fast.
For testers and engineers, the ground shifted underneath all of it. Inside: Microsoft's disclosure of prompt injection escalating into remote code execution inside agent frameworks, the 'Agents' Last Exam' benchmark that scores real economically valuable work (top systems still pass under 1% of the hardest tasks), context engineering and observability for coding agents, Playwright 1.61's passkey and storage APIs, AI across the test-automation lifecycle, fresh dev-tool power rankings, and podcasts on self-fixing mobile tests. Plus, TestMu Conf 2026 registration is open.
📬 Come across something useful or interesting? Just reply and let's exchange ideas.
News
10 min
openai.com
☁️ In an official preview post, OpenAI unveiled its GPT-5.6 family but is limiting early access to about 20 trusted partners and organizations. Highlights include three models: Sol, the flagship for software engineering, computer use, knowledge work, research, and cybersecurity; Terra, a lower-cost option with roughly GPT-5.5-level performance at half the price; and Luna, the fastest and most cost-efficient. All three are available only through the OpenAI API and Codex, with per-million-token pricing of $5/$30 for Sol, $2.50/$15 for Terra, and $1/$6 for Luna, and general availability is promised in the coming weeks.
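To put those per-million-token prices in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes each pair is input price/output price (the blurb does not spell that out), and the `Price` class, `job_cost` helper, and the 200,000-in/20,000-out workload are purely illustrative, not anything from OpenAI's post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens (assumed: first figure in the pair)
    output_per_m: float  # USD per million output tokens (assumed: second figure in the pair)


# Prices as quoted in the preview blurb above.
PRICES = {
    "Sol": Price(5.00, 30.00),
    "Terra": Price(2.50, 15.00),
    "Luna": Price(1.00, 6.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one job on the given model."""
    p = PRICES[model]
    total = input_tokens * p.input_per_m + output_tokens * p.output_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical agent run: 200k tokens read, 20k tokens written.
    for name in PRICES:
        print(f"{name}: ${job_cost(name, 200_000, 20_000):.2f}")
```

Under those assumptions the same run costs about $1.60 on Sol, $0.80 on Terra, and $0.32 on Luna, so the tier choice matters most for agents that run many long sessions.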
09 min
anthropic.com
☁️ In its official announcement, Anthropic introduced Claude Sonnet 5, 'built to be the most agentic Sonnet model yet,' and made it the default on Free and Pro plans, with availability across Max, Team, and Enterprise. Highlights include introductory pricing of $2 per million input tokens and $10 per million output tokens through August 31, 2026 (then $3 and $15), performance approaching Opus 4.8 at a lower price, gains over Sonnet 4.6 in reasoning, tool use, and coding, and an overall lower rate of undesirable behaviors than its predecessor.
08 min
thehackernews.com
🔒 Ravie Lakshmanan reports that a critical Oracle E-Business Suite Payments flaw, CVE-2026-46817 (CVSS 9.8), is already being exploited in the wild. Highlights include an improper privilege-management and authentication bug that lets an unauthenticated attacker compromise the system over HTTP, impact across versions 12.2.3 through 12.2.15 with fixes shipped in Oracle's Critical Patch Update, and honeypot observations from Defused Cyber catching an actor exploiting it over a weekend, with no public proof-of-concept out yet.
09 min
startuphub.ai
☁️ StartupHub.ai covers a Semafor interview in which Demis Hassabis defends Google's talent competitiveness after four DeepMind researchers reportedly left for Anthropic in a single week. Highlights include DeepMind's roughly 3,000 researchers making those exits a fraction of one percent of headcount, Isomorphic Labs' $2.1B Series B (total funding $2.7B) with Eli Lilly and Novartis deals worth nearly $3B in milestones, and Google Cloud's Q1 revenue of $20.0B, up 63% year over year.
AI
09 min
docs.x.com
☁️ Per X's official developer docs, X now runs a hosted MCP server that connects AI tools to the X API through the Model Context Protocol. Highlights include two servers, X MCP at api.x.com/mcp (search posts, look up users, manage bookmarks, fetch trends, and draft Articles) plus a Docs MCP for documentation retrieval; two auth paths, read-only app-only bearer tokens or full OAuth 2.0 through a lightweight local 'xurl' bridge that handles PKCE login and token refresh so the model acts with your account's scopes; and ready-made setup for Grok Build, Cursor, Claude Desktop, and VS Code.
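For readers unfamiliar with the PKCE step mentioned above, the core of it is small. The sketch below follows the S256 method from RFC 7636 using only the Python standard library; it is not xurl's actual code, and the function names `new_verifier` and `s256_challenge` are hypothetical.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def new_verifier() -> str:
    """Random code verifier: 32 random bytes become 43 URL-safe characters."""
    return _b64url(secrets.token_bytes(32))


def s256_challenge(verifier: str) -> str:
    """Code challenge = base64url(SHA-256(verifier)), the 'S256' method."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


if __name__ == "__main__":
    verifier = new_verifier()
    # The client sends the challenge with the authorization request and
    # keeps the verifier secret until the token exchange.
    print("code_challenge:", s256_challenge(verifier))
```

The point of the design is that only the hash travels in the browser redirect, so an intercepted authorization code is useless without the original verifier held by the local client.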
08 min
techcrunch.com
☁️ Julie Bort reports that 8090 Labs closed a $135M Series A led by Salesforce Ventures, with WndrCo and Craft Ventures joining, as Chamath Palihapitiya steps from board member to full-time CEO. Highlights include the company's Software Factory product, which lets corporate programming teams build production-quality software with AI plus enterprise controls like audit trails; a focus on enterprise development rather than throwaway prototypes; and Palihapitiya calling it the moment he had been waiting for to return to a full-time operating role since Facebook.
09 min
anthropic.com
☁️ In its official 'Redeploying Fable 5' announcement, Anthropic says it is bringing Claude Fable 5 back after the US lifted its export-control order on June 30. Highlights include Fable 5 returning across the Claude Platform, Claude.ai, Claude Code, and Claude Cowork; a new safety classifier that blocks the flagged jailbreak behavior in over 99% of cases and reroutes blocked requests to Claude Opus 4.8; up to 50% of weekly usage limits for paid tiers through July 7 before moving to usage credits; and cloud access on AWS, Google Cloud, and Microsoft Foundry to follow.
Automation
08 min
github.com
☁️ The latest Playwright release, v1.61, adds several headline capabilities for test automation. Highlights include a WebAuthn virtual authenticator (Credentials) that lets tests register passkeys and answer navigator.credentials ceremonies across all browsers without physical keys; a new Web Storage API exposing page.localStorage and page.sessionStorage; expanded video modes ('on-all-retries', 'retain-on-first-failure', and 'retain-on-failure-and-retries'); and updated browser support, with WebSocket requests now captured in HAR and trace recordings.
09 min
testmuai.com
☁️ Salman Khan maps how AI shows up across test automation in 2026. Highlights include the ThinkSys QA Trends Report 2026 finding 77.7% of organizations use or plan to use AI in QA, four core building blocks (machine learning, NLP, data analytics, and RPA), a KaneAI case study where an edtech firm automated 400 SAP test cases to cut manual effort 60% and lift coverage 50%, and a reminder to keep human approval gates before merging AI-generated code.
Tools
07 min
github.blog
☁️ GitHub Desktop 3.6 ships Git worktree support plus deeper Copilot integration. Highlights include working across multiple branches at once without stashing, switching, or re-cloning; Copilot-generated commit messages that respect your repo's custom instructions and metadata rules; AI-assisted merge-conflict resolution that explains conflicting changes and suggests a fix to review; and a model picker with bring-your-own-key options for third-party providers.
07 min
cursor.com
☁️ Cursor now has a native iOS app in public beta, letting developers build from anywhere. Highlights include launching cloud-based agents or remotely steering agents running on your own computer; lock-screen and push notifications when an agent finishes, needs input, or has code ready to review; and the ability to hand active work off between your local machine and cloud environments for longer-running tasks in isolated VMs.
Video & Podcast
09 min
testguild.com
🎤 On the TestGuild News Show, Joe Colantonio rounds up the week's test-automation news. Highlights include a new Playwright release that finally makes passwordless (passkey) login testable, LinkedIn's autonomous QA agent that independently surfaced more than 200 genuine bugs, a cautionary tale of an AI chatbot fabricating a company policy that never existed and triggering real legal fallout, and a look at emerging AI tooling for Selenium.
08 min
twit.tv
🎤 On This Week in Tech 1090, Leo Laporte and guests unpack the week's biggest stories. Highlights include Apple and Microsoft raising device prices amid AI-data-center-driven memory shortages, the White House abruptly pulling Anthropic's Fable model and restricting top AI models to select partners, Matter 1.6 smart-home updates alongside privacy worries over Ring and Flock cameras, and Ford rehiring engineers to fix mistakes made by automated systems.
Events
07 min
testmuai.com
🎤 TestMu Conf 2026 is TestMu AI's free virtual conference on agentic engineering and quality, running August 19 to 21, 2026. Highlights include 80-plus sessions and an expected 75,000-plus testers, builders, and engineers from 120-plus countries, a confirmed keynote from Replit CTO Luis Hector Chavez with more speakers to come, and live competitions and quizzes with prizes up to $15,000, plus on-demand access to every session after the event. Registration is free.
08 min
starwest.techwell.com
🎤 STARWEST 2026, TechWell's flagship software testing conference, runs September 20 to 25 at the Disneyland Hotel in Anaheim, California, with a virtual option. Highlights include 75-plus talks spanning keynotes, trainings, tutorials, and sessions; tracks covering test management, test techniques, performance testing, test automation, agile testing, and mobile testing; and a program centered on AI-powered testing, continuous testing, low-code/no-code automation, and the evolving role of the tester.