Anthropic introduces 'routines' in Claude Code, letting developers automate and schedule coding tasks that run without direct interaction or an active session.
A new framework evaluates the impact of skills on agent performance, showing a 20% accuracy boost and improved cost efficiency, while highlighting how hard skills are to evaluate.
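For a concrete sense of what such a comparison involves, here is a minimal sketch of a with/without-skill evaluation loop; `run_agent`, the task list, and the result fields are assumptions for illustration, not the framework's actual API.

```python
# Minimal sketch of a with/without-skill evaluation loop. `run_agent`,
# TASKS, and the result fields are hypothetical stand-ins, not any
# specific framework's API.
from statistics import mean

TASKS = [
    "Fix the failing test in utils.py",
    "Add pagination to the /items endpoint",
]

def run_agent(prompt: str, skills: list[str]) -> dict:
    # Stand-in for a real agent invocation: wire this to your runtime.
    # Here it fakes a result so the script runs end to end.
    return {"success": bool(skills), "cost_usd": 0.05 + 0.01 * len(skills)}

def evaluate(skills: list[str]) -> tuple[float, float]:
    results = [run_agent(task, skills) for task in TASKS]
    accuracy = mean(1.0 if r["success"] else 0.0 for r in results)
    cost = sum(r["cost_usd"] for r in results)
    return accuracy, cost

base_acc, base_cost = evaluate(skills=[])
skill_acc, skill_cost = evaluate(skills=["code-review"])
print(f"accuracy {base_acc:.0%} -> {skill_acc:.0%}, cost ${base_cost:.2f} -> ${skill_cost:.2f}")
```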
Vercel open-sources Open Agents, a platform for building custom AI coding agents that addresses the limitations of generic tools in large codebases.
AI Engineer Europe highlighted the infrastructure gap in agent deployment, with teams struggling to manage and evaluate skills effectively in production environments.
GitHub introduces remote control for Copilot CLI, allowing users to manage terminal sessions from the web or a mobile device, reflecting a shift in how AI coding agents are used.
GitHub pauses new Copilot Pro trials and tightens usage limits in response to increased demand, reflecting the broader capacity challenges facing AI tool providers.
An article shares lessons learned from evaluating an AI PR reviewer, highlighting the importance of risk classification and of addressing false positives.
Factory launches a desktop app for its AI 'Droids,' enhancing software development with persistent environments and expanded agent interactions on macOS and Windows.
Anthropic's Claude Managed Agents offers a hosted platform for running AI agents in production, reducing both infrastructure overhead and development time.
An AI PR reviewer achieves 97.7% accuracy by assembling evidence packs and risk classifications that help human reviewers make decisions, rather than hunting for bugs directly.
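As a rough illustration of that pattern (all names here are hypothetical), an evidence pack might pair a finding with its supporting context and a risk level, leaving the verdict to a human:

```python
# Hypothetical sketch of the "evidence pack + risk classification" idea:
# instead of asserting "this is a bug," the reviewer assembles evidence
# and a risk level so a human can decide. All names are illustrative.
from dataclasses import dataclass, field
from enum import Enum

class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@dataclass
class EvidencePack:
    finding: str               # what the reviewer noticed
    diff_hunk: str             # the relevant lines from the PR
    supporting_context: list[str] = field(default_factory=list)  # e.g. callers, tests
    risk: Risk = Risk.LOW

    def summary(self) -> str:
        return f"[{self.risk.value.upper()}] {self.finding} ({len(self.supporting_context)} supporting items)"

pack = EvidencePack(
    finding="timeout parameter removed from retry helper but still passed by two callers",
    diff_hunk="-def retry(fn, timeout=30):\n+def retry(fn):",
    supporting_context=["services/sync.py:88", "services/ingest.py:142"],
    risk=Risk.HIGH,
)
print(pack.summary())
```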
GitHub's 'Rubber Duck' feature in Copilot CLI uses a second AI model to review code, offering a fresh perspective and identifying potential issues early in development.
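The underlying pattern is straightforward to sketch: route the code to a second, independent model for critique. The `complete` function below is a hypothetical stand-in, not Copilot CLI's internals.

```python
# Generic sketch of the two-model cross-check pattern behind features like
# Rubber Duck: a second, independent model critiques code the first model
# produced. `complete` is a hypothetical model client, not Copilot CLI's
# actual internals.
def complete(model: str, prompt: str) -> str:
    # Stand-in for a chat-completion call; returns a canned review here
    # so the example runs without a real provider.
    return "retry() drops its timeout argument; callers at sync.py:88 still pass one."

def rubber_duck_review(code: str, reviewer_model: str = "reviewer-model") -> str:
    prompt = (
        "You are reviewing code written by another model. "
        "List concrete issues (bugs, edge cases, security) with line references.\n\n"
        + code
    )
    return complete(reviewer_model, prompt)

print(rubber_duck_review("def retry(fn):\n    return fn()"))
```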
Inspecting your agent sessions with tools like Tessl's behavior-audit skill can help you optimize skills, surfacing the friction points and verifiers that show up during real-world usage.
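A minimal sketch of that kind of audit, assuming a JSONL session log; the event shape here is an illustrative assumption, not Tessl's actual behavior-audit format.

```python
# Illustrative session-log audit: count failed tool calls per skill as a
# rough friction signal. The JSONL event shape is an assumption, not
# Tessl's actual behavior-audit format.
import json
from collections import Counter

def audit(session_path: str) -> Counter:
    friction = Counter()
    with open(session_path) as f:
        for line in f:
            event = json.loads(line)
            # A tool call that errored is treated as a friction signal
            # worth reviewing when tuning the skill.
            if event.get("type") == "tool_call" and event.get("status") == "error":
                friction[event.get("skill", "unknown")] += 1
    return friction

# Example: point this at an exported session transcript.
for skill, errors in audit("session.jsonl").most_common():
    print(f"{skill}: {errors} failed tool calls")
```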