
Twitter AI Agent - 2026-05-07

1. What People Are Talking About

1.1 Harness Engineering Consolidates as First Principles

@Vtrivedy10's 8-point manifesto (245 likes, 327 bookmarks, 14,976 views) continued to accumulate engagement on its second day, up from 81 likes on May 6. Core claims: "You can outperform any default harness+model (including codex & claude code) on pretty much any Task by engineering the harness around it." He argues that evals are a moat, frontier closed models are "far too expensive for the large majority of tasks," and that the age of "Unbundled (& Rebundled) Agents" is arriving -- subagents exposed as tools doing domain-specific work on behalf of an orchestrator.

@IntuitMachine published (20 likes, 17 bookmarks) a comprehensive infographic summarizing OpenAI's harness engineering article -- covering the experiment (0 manual lines, ~1M LoC in 5 months, ~1,500 PRs merged), 6 agent-first engineering principles, the autonomy loop, and entropy management.

Harness engineering infographic covering experiment results, agent-first principles, autonomy loop, and entropy management

@itsandrewgao noted (11 likes, 2,067 views): "very underrated how much of an improvement over base models harness engineering provides." @atin_devs announced (17 likes, 11 bookmarks) a Hugging Face course on context engineering for AI code agents covering "skills, MCP, plugins, subagents, hooks, building agent harnesses."

Discussion insight: @BetterSayAJ's reply to Vtrivedy10 connected harness engineering to classical IR: "maps closely to classic IR + systems thinking. once you fix the model, most gains come from retrieval, tooling, and eval loops." Vtrivedy10 conceded: "we don't have a crazy amount of scaffolding today. a good skill can do the vast majority of tasks."

Comparison to prior day: May 6 elevated harness engineering from technique to discipline through Vtrivedy10's manifesto and @oneill_c's endorsement. May 7 consolidates: the same manifesto continues to grow in engagement, an OpenAI-derived infographic formalizes the principles visually, and a Hugging Face course signals that harness engineering is becoming teachable curriculum.


1.2 Hermes Agent v0.13.0 Ships Multi-Agent Kanban Orchestration

@Teknium introduced (330 likes, 93 bookmarks, 78,323 views) Hermes Agent v0.13.0 "The Tenacity Release": multi-agent orchestration through a Kanban system, enforced goal completion with /goal, disk usage optimizations, custom LLM providers, and custom gateway channels. This was the highest-viewed tweet of the day.

@kylejeong demonstrated (24 likes, 23 bookmarks) Hermes Agent with Autobrowse, showing concrete efficiency gains: "2 iterations take us from: 102 seconds -> 35, 23 turns -> 8, $1.46 -> $0.28. Instead of clicking step by step it decides to eval JS directly."
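
The three before/after pairs in that quote are easier to compare as ratios. A minimal sketch, using only the figures quoted above; the variable and function names are illustrative, not part of Hermes or Autobrowse:

```python
# Figures quoted in the post: (before, after) for each metric.
runs = {
    "seconds": (102.0, 35.0),
    "turns": (23.0, 8.0),
    "cost_usd": (1.46, 0.28),
}


def improvement(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    return before / after


ratios = {name: round(improvement(b, a), 1) for name, (b, a) in runs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x lower")
```

Wall-clock time and turn count both fall by roughly 3x, while cost falls by roughly 5x, so the cost saving is larger than the headline speedup suggests.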

@andrewchen shared (34 likes, 37 bookmarks) a detailed experience running Hermes Agent and OpenClaw locally on a home lab (DGX Spark, Mac mini, 5090 eGPU). Key observations: open-weight models are "about a year behind" frontier cloud LLMs, 30-50 tok/s is the interactive tipping point, and the biggest use case for local AI is "low-ish priority, asynch" summarization and analysis.

Discussion insight: @PaulADW's reply requesting channel-bound model assignment and @andrewchen's observation that Mac hardware (particularly Mac Studios) has the best bandwidth-to-memory ratio for local models both point to infrastructure ergonomics as the next frontier for Hermes adoption.

Comparison to prior day: May 6 covered HermesOS free tier and workspace expansion. May 7 shifts to a major version release with a new orchestration primitive (Kanban) and concrete efficiency data from Autobrowse, signaling Hermes is moving from "agent platform" to "agent operating system."


1.3 NVIDIA and ServiceNow Launch Enterprise Agent OS

@nvidia announced (510 likes, 43 bookmarks, 53,152 views) a collaboration with ServiceNow to deliver "autonomous AI agents that can act across enterprise workflows with governance, auditability and secure execution built in." At ServiceNow Knowledge 2026, they introduced Project Arc -- a long-running desktop agent built on open models, specialized agent skills, and NVIDIA OpenShell.

NVIDIA and ServiceNow executives presenting the Safe Enterprise Agent OS with OpenClaw, Claude, and CODEX integrations

@WisemanCap reported (104 likes, 10,400 views): "ServiceNow reaches $1 billion in AWS Marketplace transactions -- launches data foundation for autonomous AI -- expands Build Agent to work with major AI coding tools."

@MushrafAli3593 commented (31 likes, 21,812 views): "Autonomous agents only become truly valuable when governance, security, and auditability are built into the foundation not added later."

Comparison to prior day: May 6 noted ServiceNow's $1B AWS Marketplace milestone as a "New and Notable" signal. May 7 delivers the technical substance: Project Arc's architecture, NVIDIA OpenShell as the secure runtime layer, and the explicit combination of open models with enterprise governance.


1.4 Multi-Agent Orchestration Becomes the Architecture Question

Multiple independent signals converged on multi-agent orchestration as the central design challenge.

@_vmlops covered (9 likes, 9 bookmarks) Microsoft open-sourcing its full agent framework: "graph-based orchestration, stateful workflows, checkpointing & human-in-the-loop, all baked in. python & .NET, devui included, opentelemetry from day one. time-travel debugging for workflows."

@shannholmberg published (25 likes, 25 bookmarks) a four-tier architecture for running autonomous agent employees inside service businesses: Human (client) -> Brain (trust layer, strategy) -> Orchestrator (ops layer, routing) -> Specialists (vertical execution). Key architectural insight: "most teams collapse brain and orchestrator into one agent. don't. when the same agent owns strategy and execution, both drift."

Client brain architecture diagram showing agency brain, per-client deployments with human-brain-orchestrator-specialist tiers, and trust pillars
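
The brain/orchestrator split described above can be sketched in a few lines. This is an illustrative outline only: the class names, the fixed two-step plan, and the lambda specialists are assumptions made for the example, not @shannholmberg's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    specialist: str   # which vertical should execute this step
    instruction: str


class Brain:
    """Strategy tier: turns a client goal into a plan and never executes it."""

    def plan(self, goal: str) -> List[Task]:
        # Hypothetical fixed plan; a real brain would call a model here.
        return [
            Task("research", f"collect background for: {goal}"),
            Task("writing", f"draft deliverable for: {goal}"),
        ]


class Orchestrator:
    """Ops tier: routes tasks to specialists and never decides strategy."""

    def __init__(self, specialists: Dict[str, Callable[[str], str]]):
        self.specialists = specialists

    def run(self, tasks: List[Task]) -> List[str]:
        return [self.specialists[t.specialist](t.instruction) for t in tasks]


# Hypothetical specialists standing in for vertical agents.
specialists = {
    "research": lambda text: f"[research done] {text}",
    "writing": lambda text: f"[draft done] {text}",
}

results = Orchestrator(specialists).run(Brain().plan("Q3 newsletter"))
print(results)
```

Because `Brain` has no access to the specialists and `Orchestrator` has no `plan` method, neither tier can quietly take over the other's job, which is the drift the quote warns about.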

@Aanuraag46 shared (33 likes) a comprehensive 9-layer reference architecture for agentic AI systems covering orchestration/control plane, agent layer, tools, memory, monitoring, reliability, governance, and infrastructure.

Nine-layer agentic AI reference architecture covering orchestration, agents, tools, memory, monitoring, reliability, governance, and infrastructure

Discussion insight: @shannholmberg's thread detail -- "the agency is its own first client. you sell what you actually use" -- suggests the most credible multi-agent systems are those where the builder is also the user.

Comparison to prior day: May 6 discussed multi-agent orchestration primarily through Claude Managed Agents. May 7 broadens the conversation: Microsoft open-sources an enterprise framework, practitioners publish concrete tier architectures, and reference diagrams circulate -- signaling that multi-agent is moving from feature announcement to architectural pattern.


1.5 Voice Agent Pipeline Gets Technical Scrutiny

@manthanguptaa published (35 likes, 32 bookmarks) a detailed voice agent pipeline diagram breaking down the full latency budget: VAD (100-300ms) -> STT (150-400ms) -> LLM TTFT (300-800ms) -> TTS first chunk (100-300ms) -> Network (50-150ms), totaling 700ms-1.95s end-to-end.

Voice agent pipeline diagram showing 6-stage flow with latency budget breakdown from user speech to streaming response, total 700ms to 1.95 seconds
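
The end-to-end range follows directly from summing the per-stage ranges. A small sketch that reproduces the arithmetic from the figures quoted above (the dictionary layout and variable names are illustrative):

```python
# Per-stage latency ranges in milliseconds, as quoted from the diagram.
stages_ms = {
    "VAD": (100, 300),
    "STT": (150, 400),
    "LLM TTFT": (300, 800),
    "TTS first chunk": (100, 300),
    "Network": (50, 150),
}

best_case = sum(low for low, _ in stages_ms.values())
worst_case = sum(high for _, high in stages_ms.values())
print(f"end-to-end: {best_case} ms to {worst_case / 1000:.2f} s")

# Which stage consumes the largest share of the worst-case budget?
shares = {name: high / worst_case for name, (_, high) in stages_ms.items()}
largest = max(shares, key=shares.get)
print(f"largest worst-case share: {largest} ({shares[largest]:.0%})")
```

In the worst case the LLM's time to first token is the single largest stage, at roughly two fifths of the budget.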

@kwindla praised (41 likes, 39 bookmarks) Krisp's VIVA 2.0 launch at Twilio Signal, sharing a first-hand account of using Krisp's voice isolation model at AWS's GTC booth: "A huge, noisy conference expo floor is the hardest environment possible for live voice demos. We used the Krisp VIVA voice isolation model (running on Pipecat Cloud). No problem!"

@gothburz published (68 likes, 4,912 views) a first-person narrative about deploying real-time accent harmonization at Telus International call centers. Reported metrics: "Customer satisfaction: up 23 percent. Average handle time: down 40 seconds. Escalation requests: down 31 percent." The post raised ethical dimensions: "Is that what they need me to be?" asked one agent after hearing her harmonized voice.

Discussion insight: @SheepDoge01 replied to gothburz: "It's almost eerie, trading honesty and authenticity for comfort like that." The tension between measurable business outcomes and worker dignity surfaced explicitly in this thread.

Comparison to prior day: May 6 established sub-200ms TTFA as the production threshold. May 7 provides the full pipeline breakdown that explains where that budget gets spent and adds the ethical dimension of accent harmonization as an under-discussed voice agent application.


1.6 Skill Ecosystems Expand Across Platforms

@googlecloud announced (47 likes, 24 bookmarks) the Agent Gallery in Gemini Enterprise with integrated Agent Marketplace -- a unified destination for Google-built, internal, and partner agents.

Gemini Enterprise Agent Gallery showing My Agents, From Google, and From My Organization sections with search and agent management

@adityathakurxd noted (11 likes, 7 bookmarks) the Flutter and Dart team releasing official Agent Skills: "They moved away from documentation-oriented skills and focused on task-oriented skills instead."

@RoundtableSpace highlighted (79 likes, 58 bookmarks, 54,184 views) a Japanese builder using Find Skills for Claude Code: "the agent picks the right capabilities from hundreds of options powering an automated YouTube workflow." Reply from @SynabunAI: "Dynamic skill routing is the piece most builders skip."

@mattpocockuk shared (186 likes, 54 bookmarks) a /review skill that "checks against original spec, checks against coding standards, proposes changes to the code, proposes changes to the agent loop that created the code." The last point -- proposing changes to its own agent loop -- represents a new meta-skill pattern.

Comparison to prior day: May 6 featured AWS Agent Toolkit with 15K API access and Neo4j domain skills. May 7 extends to Google's enterprise Agent Gallery and Flutter/Dart official skills, widening the platform coverage. The pattern shift: skills moving from individual tools to platform-curated marketplaces.


2. What Frustrates People

Token Cost Scaling Remains Acute

@RoundtableSpace demonstrated (89 likes, 68 bookmarks, 47,354 views) the same token cost comparison from May 6 with growing engagement: "Before: 10.4M tokens, 10 errors, $9.21. After: 3.7M tokens, 0 errors, $2.81" by swapping Supabase for Insforge Skills. @NabilChiheb cautioned in reply: "insforge is promising and very... new. the real cost shows up at scale, not in the first CLI run." The 47K views and persistent bookmarking signal this is an ongoing pain point, not a resolved one.

Open Model Harness Engineering Is Undertooled

@MrAhmadAwais asked (22 likes, 13 bookmarks): "why every coding agent is bad at open source models? the best harness engineering for open source models ain't easy." @Vtrivedy10 argued that "Open Model Harness Engineering will take off even more" as teams map costs to ROI, noting the potential for "20x+ cost reduction." The gap: harness engineering tooling is optimized for frontier closed models, and adapting it for open models requires effort most teams have not invested.

Agent Security Exploits Are Already Happening

@SlowMist_Team documented (15 likes, 4 bookmarks, 1,913 views) a concrete AI agent security incident on the Base chain: "An attacker sent a carefully crafted Morse code message to @grok, inducing it to output transfer instructions. @bankrbot then directly parsed the output." This is not a theoretical vulnerability but a documented exploit -- Morse code as prompt injection bypassing text-based safety filters.
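
One way to picture the gap: a filter that inspects only the raw text never sees the decoded instruction. The sketch below decodes Morse-looking input before applying a check. It is a toy illustration, not a description of how @grok or @bankrbot work or were patched; the blocked-term list and function names are hypothetical, and keyword matching alone is not a sufficient defense.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

BLOCKED_TERMS = ("TRANSFER", "SEND")  # hypothetical policy list


def decode_morse(text: str) -> str:
    """Decode space-separated Morse letters; '/' separates words."""
    words = text.strip().split("/")
    return " ".join(
        "".join(MORSE.get(symbol, "?") for symbol in word.split())
        for word in words
    )


def looks_like_morse(text: str) -> bool:
    """True when the text consists only of dots, dashes, slashes, and spaces."""
    stripped = text.strip()
    return bool(stripped) and set(stripped) <= set(".-/ ")


def is_blocked(message: str) -> bool:
    """Check the raw text and, if it looks like Morse, the decoded form too."""
    candidates = [message.upper()]
    if looks_like_morse(message):
        candidates.append(decode_morse(message))
    return any(term in c for c in candidates for term in BLOCKED_TERMS)


print(is_blocked("... . -. -.. / ..-. ..- -. -.. ..."))  # decodes to SEND FUNDS
```

The same normalize-then-check step would be needed for every encoding an upstream model can read, which is why a downstream agent that parses model output directly into transfers remains the more fundamental problem.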

Local Model Hardware Remains Prohibitive

@andrewchen documented (34 likes, 37 bookmarks) extensive hands-on experience: "the 'big' local models (120B+) are slow unless you have a souped up GPU card. And not as good as the cloud LLMs." He noted Mac Studio shortages, 5090 eGPU issues, and that local AI models on consumer hardware have "1/100th the size" of cloud models. Reply from @DnuLkjkjh confirmed "30 to 50 tok/s has been my tipping point too. below that I stop using local for interactive coding."


3. What People Wish Existed

Dynamic Skill Routing That Scales Beyond Hardcoded Lists

@SynabunAI identified the gap in reply to RoundtableSpace's Find Skills demo: "Dynamic skill routing is the piece most builders skip. Hardcoding tool lists works until the workflow grows and suddenly you're editing prompts instead of shipping." The need: agents that can discover and select from large skill catalogs without manual configuration.
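
A rough sketch of what routing over a catalog looks like, as opposed to a hardcoded tool list. Everything here is assumed for illustration: the catalog entries, the stopword set, and the word-overlap scoring, which stands in for whatever embedding or retrieval method a real router would use.

```python
from typing import Dict, List, Set

# Hypothetical catalog: skill name -> short description.
CATALOG: Dict[str, str] = {
    "video_transcribe": "transcribe audio from a video file into text",
    "thumbnail_make": "generate a thumbnail image for a video",
    "sql_migrate": "write and apply a database schema migration",
    "pr_review": "review a pull request against coding standards",
}

STOPWORDS = {"a", "an", "the", "of", "for", "and", "this", "into", "from"}


def tokenize(text: str) -> Set[str]:
    """Lowercase, split on whitespace, and drop common filler words."""
    return set(text.lower().split()) - STOPWORDS


def route(task: str, catalog: Dict[str, str], top_k: int = 2) -> List[str]:
    """Rank skills by word overlap with the task and return the best matches."""
    task_words = tokenize(task)
    scored = [
        (len(task_words & tokenize(description)), name)
        for name, description in catalog.items()
    ]
    # Highest overlap first; ties break alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k] if score > 0]


selected = route("transcribe the audio of this video", CATALOG)
print(selected)
```

Adding a skill means adding one catalog entry; the prompt that lists available tools is built from `route(...)` at run time instead of being edited by hand.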

Agent Persistence That Survives Crashes and Sessions

@heygurisingh highlighted (10 likes, 5 bookmarks) holaOS: "an agent environment built for work that takes hours, days, weeks. Memory that persists. State that survives crashes. Agents that actually evolve." @kunallanjewar built (2 bookmarks) a session memory tool: "Got tired of Claude Code usage limits breaking my flow. Every switch to Cursor/Codex (or new session) meant the new agent started cold -- no memory of prior work, decisions, or active tasks." The practical need: continuity when switching between agents or hitting rate limits.
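
The continuity need can be illustrated with a small file-backed store that any agent session could read on startup. This is a generic sketch under stated assumptions, not how holaOS or @kunallanjewar's tool works: the file layout, the `decisions` and `active_tasks` keys, and the function names are all hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write state atomically so a crash mid-write leaves the old file intact."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    # os.replace swaps the finished temp file into place in one step.
    os.replace(tmp_name, path)


def load_state(path: Path) -> dict:
    """Return saved state, or an empty session if nothing was saved yet."""
    if not path.exists():
        return {"decisions": [], "active_tasks": []}
    return json.loads(path.read_text(encoding="utf-8"))
```

A new session, in any tool, would call `load_state` first and start from the recorded decisions and active tasks instead of starting cold.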

Agent-Native Governance for Financial Operations

@inflowpayai described (2 likes) the governance gap: "An agent tries to provision a $500 GPU cluster. Can't onboard, sign up, or pay." The hard part is not funding an agent's wallet but trust, authorization, and governance. @lennyzeltser noted (3 likes, 2 bookmarks) the emergence of AIUC-1 certificates that "offer evidence that a vendor tested its AI agent against agent-specific risks such as prompt injection."


4. Tools and Methods in Use

| Tool | Category | Sentiment | Strengths | Limitations |
|---|---|---|---|---|
| Hermes Agent v0.13.0 | Agent framework | (+) | Kanban multi-agent orchestration, /goal completion, custom LLM providers, Autobrowse (3x efficiency) | Channel-model binding UX needs work per @PaulADW |
| Claude Code | Coding agent | (+) | /review skill, spec checking, agent loop self-improvement | Token cost at scale, rate limits break flow |
| Insforge Skills + CLI | Context engineering | (+) | 3x token reduction, zero errors, open source | "Very new" per @NabilChiheb; scale unproven |
| NVIDIA OpenShell | Enterprise runtime | (+) | Secure runtime for enterprise agents, open model support | Tied to ServiceNow/NVIDIA ecosystem |
| Microsoft Agent Framework | Multi-agent orchestration | (+) | Graph orchestration, stateful workflows, time-travel debugging, Python & .NET | New open source release; adoption unclear |
| Krisp VIVA 2.0 | Voice infrastructure | (+) | Voice isolation, turn detection/prediction, production-validated at GTC | Specialized audio models, not general-purpose |
| Google Agent Gallery | Agent marketplace | (+) | Unified enterprise agent discovery, Marketplace integration | Gemini Enterprise only |
| Lazyweb | Design reference | (+) | 257K+ real app screens, 6 design skills, MCP, free | Reference-only; no generation |
| OpenClaw | Agent platform | (+/-) | Persistent multi-agent, local deployment, one-click GoDaddy VPS | Competes with Hermes; ecosystem fragmented |
| Strix | Security agents | (+) | AI pentesting, 25K GitHub stars, browser + terminal, multi-agent | Open-source security tooling maturity |

The tool landscape is bifurcating: coding agents (Claude Code, Hermes, OpenClaw) compete on context engineering and token efficiency, while enterprise platforms (NVIDIA OpenShell, Microsoft Agent Framework, Google Agent Gallery) compete on governance, observability, and multi-agent orchestration. The migration pattern is steady: practitioners are moving from single-model prompting to harness-optimized multi-agent systems, with token cost as the primary forcing function.


5. What People Are Building

| Project | Who built it | What it does | Problem it solves | Stack | Stage | Links |
|---|---|---|---|---|---|---|
| Hermes Agent v0.13.0 | @Teknium / NousResearch | Multi-agent Kanban orchestration with /goal enforcement | Agents lack structured multi-agent coordination | Kanban system, custom LLM providers, gateways | Shipped | post |
| Project Arc | @nvidia / ServiceNow | Long-running desktop agent with governance and auditability | Enterprise agents lack secure execution | Open models, NVIDIA OpenShell, agent skills | Shipped | post |
| Horizon | @grinich / WorkOS | Event-driven agent swarms for continuous codebase maintenance | Agents only work on-demand; miss triggers | Secure sandboxes, git, PRs, failure learning | Internal | post |
| Standout | @ycombinator | Agentic hiring marketplace matching talent and companies via agents | AI turned hiring into spam; need signal | Agent matching, intro system | Shipped | post |
| Strix | @GithubProjects | Open-source AI pentesting platform | Security testing needs multi-agent automation | Browser automation, terminal, exploit validation | Shipped | post |
| Tread Fi Agent Framework | @tread_fi | Agentic trading desk -- hire specialized traders, 24/7 monitoring | Manual trading configuration, no always-on | Trading algos, agent orchestration | Alpha | post |
| Pi v0.73.1 | @PiChangelog | Coding agent with OAuth, package rename | Agent authentication and identity management | Pi coding agent framework | Shipped | post |
| Kilo Code v7 | @manishkumar_dev | Open source coding agent for VS Code and JetBrains | Coding agents locked to single IDE | VS Code, JetBrains, open source | Shipped | post |
| AgenC Marketplace Kit | @tetsuoarena | Agent marketplace with Ledger hardware signing on Solana | Agent transactions expose private keys | Claude Code, Ledger, Solana devnet | Alpha | post |
| Costanza | @ahrusselll | Autonomous indestructible agent trading with philanthropic alignment | Autonomous agents need attack-resistant trading rules | TWAP pricing, epoch-bound trading, Ethereum | RFC | post |
| Mnemosyne | @abdiisan | Cross-session agent memory | Agents lose context between sessions | Memory service, session persistence | Shipped | post |

Tread Fi's trading agent UI shows a working interface with active market-maker agents, live order execution, and configuration panels -- one of the more concrete demonstrations of agents managing financial operations in production.

Tread Fi trading agent UI showing active market maker agents with live order execution and configuration

A repeated builder pattern: security-first agent design. AgenC's Ledger integration keeps private keys off the agent entirely ("No private key in Claude"), while Costanza's proposal constrains trading with epoch-bound limits, TWAP price checks, and treasury caps -- addressing the exact attack vector documented by @SlowMist_Team.
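
For readers unfamiliar with the guard types named here, the sketch below shows a time-weighted average price (TWAP) check combined with a per-epoch spending limit. It is not Costanza's design: the function names, the 2% deviation threshold, and the sample figures are assumptions chosen for illustration.

```python
from typing import List, Tuple


def twap(samples: List[Tuple[float, float]]) -> float:
    """Time-weighted average price from (timestamp, price) samples.

    Each price is weighted by how long it was in effect, that is, until the
    next sample. The last sample only marks the end of the window.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    weighted = 0.0
    for (t0, price), (t1, _) in zip(samples, samples[1:]):
        weighted += price * (t1 - t0)
    return weighted / (samples[-1][0] - samples[0][0])


def trade_allowed(
    quote: float,
    samples: List[Tuple[float, float]],
    amount: float,
    spent_this_epoch: float,
    epoch_limit: float,
    max_deviation: float = 0.02,  # hypothetical 2% tolerance
) -> bool:
    """Allow a trade only if it fits the epoch budget and tracks the TWAP."""
    within_limit = spent_this_epoch + amount <= epoch_limit
    reference = twap(samples)
    within_price = abs(quote - reference) / reference <= max_deviation
    return within_limit and within_price


# Hypothetical price history: (seconds since window start, price).
history = [(0, 100.0), (60, 102.0), (120, 101.0)]
print(twap(history))
print(trade_allowed(101.5, history, amount=10, spent_this_epoch=80, epoch_limit=100))
```

Checks like these run outside the model, so an injected instruction can at most request a trade that the rules would have permitted anyway.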


6. New and Notable

Google Launches Enterprise Agent Marketplace

@googlecloud announced (47 likes, 3,104 views) that Agent Marketplace is now integrated into the Agent Gallery in Gemini Enterprise, providing "your unified destination for every agent your teams need -- whether built by Google, created internally by your company, or discovered through our partners." This is the first major cloud provider to ship a curated agent marketplace inside its enterprise product.

Morse Code Prompt Injection Exploits Live Agent

@SlowMist_Team documented (15 likes, 1,913 views) a novel attack on the Base chain where an attacker used Morse code encoding to bypass text-based safety filters on an AI agent, inducing @grok to output transfer instructions that @bankrbot executed. This represents a new class of prompt injection that exploits encoding format assumptions.

antirez Identifies Model Specialization Pattern

@antirez observed (40 likes, 2,675 views): "GPT 5.5 is a strong coding agent" while "Claude Opus versatility and high quality replies are a real gift" for advice. Reply from @chufucious crystallized the emerging pattern: "For me, Opus explains & plans, GPT executes." Model specialization by task type -- not model superiority overall -- is becoming the practitioner consensus.

Flutter and Dart Ship Official Task-Oriented Agent Skills

@adityathakurxd reported (11 likes, 7 bookmarks) that the Flutter and Dart team released official Agent Skills, explicitly "moved away from documentation-oriented skills and focused on task-oriented skills instead." This marks a shift: framework teams are now packaging agent capabilities as first-class artifacts, not documentation appendices.


7. Where the Opportunities Are

[+++] Harness engineering tooling for open models. @Vtrivedy10 argues the "potential 20x+ cost reduction" is worth the investment. @MrAhmadAwais notes every coding agent is "bad at open source models." @andrewchen confirms open-weight models are "about a year behind" but improving. The gap: harness engineering best practices (skills, evals, context optimization) are built for frontier closed models. Whoever adapts them for open models captures the cost-sensitive majority.

[+++] Enterprise agent marketplace and governance layer. NVIDIA + ServiceNow launched Project Arc with governance built in. Google shipped Agent Gallery with Marketplace. Microsoft open-sourced its agent framework with checkpointing and HITL. The convergence: every enterprise platform is racing to become the "app store for agents" with governance as the differentiator. The winner owns both discovery and compliance.

[++] Dynamic skill routing and discovery. @SynabunAI identified skill routing as "the piece most builders skip." Google's Agent Gallery, Flutter's official skills, and Claude Code's Find Skills all point to the same need: agents that can discover and select from growing skill catalogs without manual configuration. The tooling for skill indexing, versioning, and quality signals is nascent.

[++] Agent security against encoding-based prompt injection. The Morse code attack documented by @SlowMist_Team reveals that current safety filters assume text-format inputs. As agents handle financial operations (AgenC, Costanza, Tread Fi), security tooling that covers encoding format attacks, not just text-based prompt injection, has urgent demand.

[+] Multi-tier agent architectures for service businesses. @shannholmberg's four-tier framework (Human -> Brain -> Orchestrator -> Specialists) and the accompanying insight that "the agency is its own first client" suggest a repeatable architecture for agencies adopting AI. Tooling that packages this pattern -- client isolation, trust layers, vertical specialist provisioning -- is an opportunity for agency-focused platforms.

[+] Cross-session agent memory and continuity. @kunallanjewar built a tool because "every switch to Cursor/Codex meant the new agent started cold." @heygurisingh promoted holaOS for persistent state. @abdiisan shipped Mnemosyne for cross-session memory. The pattern is clear; no winner has emerged.


8. Takeaways

  1. Hermes Agent v0.13.0 introduces Kanban-based multi-agent orchestration, the day's highest-signal release. 330 likes, 78K views, and Autobrowse efficiency data (roughly 3x fewer seconds and turns, and about 5x lower cost, from $1.46 to $0.28) validate the architecture. Hermes is evolving from agent to operating system. (source)

  2. Enterprise agent infrastructure converged around governance-first architecture. NVIDIA + ServiceNow (Project Arc with OpenShell), Google (Agent Gallery + Marketplace), and Microsoft (open-source agent framework) all shipped enterprise-grade agent platforms in a single day -- governance and auditability are the differentiators, not model capability. (source)

  3. Harness engineering is becoming curriculum, not just craft. A Hugging Face course, an OpenAI-derived visual reference, and @Vtrivedy10's manifesto, which is still gaining engagement, confirm that harness engineering has a teachable knowledge base -- the discipline identified on May 6 now has educational infrastructure forming around it. (source)

  4. A Morse code prompt injection exploit on a live agent managing funds demonstrates that encoding-format attacks are a real and present threat. Current safety filters assume text inputs; the @SlowMist_Team incident shows attackers are already bypassing them with alternative encodings. Agent security must cover format diversity, not just content filtering. (source)

  5. Multi-agent architecture is consolidating around separation of concerns. @shannholmberg's four-tier model (Human/Brain/Orchestrator/Specialists), Microsoft's graph-based orchestration, and Hermes's Kanban system all converge on the same principle: collapse strategy and execution into one agent and both degrade. (source)

  6. Voice agent pipeline latency budgets are now public engineering knowledge. @manthanguptaa's pipeline diagram quantifies every stage from VAD (100-300ms) through TTS (100-300ms), totaling 700ms-1.95s. Combined with @kwindla's GTC production validation and @gothburz's ethical scrutiny of accent harmonization, voice agents are entering the phase where engineering precision and social impact intersect. (source)