HackerNews AI - 2026-05-17
1. What People Are Talking About
49 AI-related Hacker News stories surfaced on May 17, down slightly from 51 on May 16, but total comment volume fell much harder, from 101 to 49. One Show HN about code search for agents accounted for 23 of those comments on its own. The day therefore felt thinner and more builder-heavy than the previous one: the strongest signal was not a new frontier model but the operating layer around agents - repo retrieval, long-term memory, context control, and isolated execution - alongside growing distrust of hype and undisclosed automation.
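The day-over-day figures quoted in the overview reduce to simple arithmetic. The short sketch below uses only the counts stated above; the variable and function names are illustrative.

```python
# Counts quoted in the overview paragraph.
stories_today, stories_prev = 49, 51
comments_today, comments_prev = 49, 101
semble_comments = 23


def pct_change(new: int, old: int) -> float:
    """Percentage change going from old to new."""
    return (new - old) / old * 100


story_change = pct_change(stories_today, stories_prev)
comment_change = pct_change(comments_today, comments_prev)
semble_share = semble_comments / comments_today * 100

print(f"stories:  {story_change:.1f}%")    # stories:  -3.9%
print(f"comments: {comment_change:.1f}%")  # comments: -51.5%
print(f"Semble share of comments: {semble_share:.1f}%")  # 46.9%
```

Story count dipped by about 4 percent while comments roughly halved, and a single thread held just under half of what remained.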
1.1 Repo context work is becoming more explicit, structured, and queryable (🡕)
The strongest cluster on May 17 was still the repo-context problem, but the conversation became more concrete than it was on May 16. Instead of arguing abstractly about memory, builders shipped retrieval, memory, and protocol layers that try to keep agents from repeatedly rediscovering the same codebase facts.
Bibabomas posted Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep (68 points, 23 comments). The Semble repo says it is a CPU-only code-search library and MCP server for Claude Code, Codex, Cursor, OpenCode, and bash-driven agent flows, with cached indexes, natural-language code search, and local or remote repo support. The HN replies immediately pushed on the practical gap: jerezzprime (score 0) wanted agent-level benchmarks instead of one-shot retrieval metrics, and argued that some coding models are so trained around grep that they may not trust alternative search outputs enough to realize the savings.
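To make the "ranked snippets instead of grep plus full-file reads" idea concrete, here is a minimal lexical-ranking sketch. It is not Semble's implementation: the project table later in this digest lists BM25 alongside static Model2Vec embeddings, and only a plain BM25 scorer is shown here. The chunk names, the tokenizer, and the `bm25_rank` function are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, chunks: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Score every chunk against the query with BM25, best first."""
    docs = {name: tokenize(body) for name, body in chunks.items()}
    n_docs = len(docs)
    avg_len = sum(len(toks) for toks in docs.values()) / n_docs

    # Document frequency: in how many chunks each term appears.
    df: Counter = Counter()
    for toks in docs.values():
        df.update(set(toks))

    scores = {}
    for name, toks in docs.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical function-level chunks; a real index would be built from a repo.
chunks = {
    "auth.py::login": "def login(user, password): verify password hash and create session token",
    "db.py::connect": "def connect(url): open database connection pool",
    "auth.py::logout": "def logout(session): invalidate session token",
}
ranked = bm25_rank("verify password", chunks)
print(ranked[0][0])  # prints "auth.py::login"
```

An agent that receives only the top-ranked chunk reads one function rather than every file that matches a grep pattern, which is the kind of saving the Show HN title claims. Whether agents trust such results is the open question raised in the thread.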
ruxiz linked TypedMemory – long-term memory and reflection for AI agents (2 points, 0 comments). The TypedMemory README says it exposes remember, recall, reflect, and forget, surfaces contradictions instead of silently overwriting them, and keeps an audit trail when facts or decisions change. sergiopreira added Show HN: Give your AI agent a brain that understands your codebase (2 points, 0 comments), whose Bitloops README says it keeps a local queryable model of files, symbols, dependencies, tests, checkpoints, and session history so agents do not have to crawl the repository from scratch each time.
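The contradiction-surfacing behavior described for TypedMemory can be illustrated with a small sketch. This is not TypedMemory's API: the `TinyMemory` class, its method signatures, and the "keep the old value, log the conflict" policy are assumptions made for illustration, and the `reflect` operation is left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Fact:
    key: str
    value: str
    source: str


@dataclass
class TinyMemory:
    """Hypothetical store that records conflicts instead of overwriting."""
    facts: dict = field(default_factory=dict)
    conflicts: list = field(default_factory=list)
    audit: list = field(default_factory=list)

    def remember(self, key: str, value: str, source: str) -> bool:
        new = Fact(key, value, source)
        old = self.facts.get(key)
        if old is not None and old.value != value:
            # Surface the contradiction and keep the existing fact untouched.
            self.conflicts.append((old, new))
            self.audit.append(
                f"conflict {key}: kept {old.value!r}, rejected {value!r} from {source}"
            )
            return False
        self.facts[key] = new
        self.audit.append(f"remember {key}={value!r} from {source}")
        return True

    def recall(self, key: str) -> Optional[str]:
        fact = self.facts.get(key)
        return fact.value if fact else None

    def forget(self, key: str) -> None:
        if self.facts.pop(key, None) is not None:
            self.audit.append(f"forget {key}")


memory = TinyMemory()
memory.remember("test_runner", "pytest", source="session-1")
accepted = memory.remember("test_runner", "unittest", source="session-2")
print(accepted)                       # False
print(memory.recall("test_runner"))   # pytest
print(len(memory.conflicts))          # 1
```

The point of the design is that the second write does not silently win: a human or agent can inspect `conflicts` and `audit` and decide which fact should stand.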
healqq posted Show HN: Save Context from MCP Bloat (2 points, 0 comments). The mcp-context-guard repo wraps local or remote MCP servers, caches oversized tool responses, and injects a seek_result tool so agents can query the cached payload instead of reading the whole thing into context. lucius_gc rounded out the cluster with Polis – a Markdown protocol for AI agent teams that get better over time (2 points, 0 comments); the Polis Protocol repo describes append-only chronicles, capability cards, structured contracts, and lesson files so multiple agents can share one rigid markdown workspace.
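The cache-and-query pattern attributed to mcp-context-guard can be sketched as follows. This is an illustration of the idea only: the project itself is listed as a Go binary, and the `ResponseCache` class, the size threshold, the stub fields, and the substring-based `seek_result` signature here are all assumptions.

```python
import uuid

MAX_INLINE_CHARS = 200  # hypothetical threshold for "oversized"


class ResponseCache:
    """Keeps large tool output out of the context window until it is queried."""

    def __init__(self, max_inline: int = MAX_INLINE_CHARS) -> None:
        self.max_inline = max_inline
        self._store: dict = {}

    def wrap(self, payload: str) -> dict:
        # Small responses pass through unchanged.
        if len(payload) <= self.max_inline:
            return {"inline": payload}
        # Large responses are cached and replaced by a short stub.
        result_id = uuid.uuid4().hex
        self._store[result_id] = payload
        return {"result_id": result_id, "size": len(payload), "preview": payload[:80]}

    def seek_result(self, result_id: str, pattern: str, limit: int = 5) -> list:
        # Return only the lines of the cached payload that contain the pattern.
        lines = self._store[result_id].splitlines()
        return [line for line in lines if pattern in line][:limit]


cache = ResponseCache()
payload = "\n".join(
    f"row {i} status={'error' if i % 50 == 0 else 'ok'}" for i in range(200)
)
stub = cache.wrap(payload)
hits = cache.seek_result(stub["result_id"], "error")
print(len(hits))  # 4
print(hits[0])    # row 0 status=error
```

The agent sees a stub of roughly a hundred characters plus four matching lines instead of all 200 rows.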
Discussion insight: HN is no longer treating "give the model more context" as an answer by itself. The live work is in smaller, typed, queryable context layers - better search, contradiction-aware memory, filtered tool output, and shared protocols that let humans inspect what persists.
Comparison to prior day: May 16 argued about memory conceptually and asked what should persist across sessions. May 17 turned that into concrete products for retrieval, cache management, repo intelligence, and multi-agent coordination, but with far less total discussion.
1.2 The runtime around agents is becoming part of the product (🡕)
The second cluster treated the agent less like a chatbot and more like a system that needs a secure runtime, a stable control surface, and machine-readable repair signals. The common idea was that if agents are going to touch real projects, the harness and toolchain have to become more explicit.
katspaugh posted Show HN: Machine – per-project dev VMs with session-only secrets (4 points, 1 comment). The Machine repo says each GitHub project gets its own Lima VM with Docker, Node, Claude Code, Codex, signed git, forwarded SSH auth, and optional tool profiles, while the host filesystem stays unmounted and secrets can be injected from 1Password or the macOS keychain without being written to disk. That is a direct response to the fear that agentic workflows make npm install and local credential handling too risky on a laptop.
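The "session-only secrets" idea can be shown at a much smaller scale than a VM. The sketch below is not how Machine works (it uses Lima VMs and 1Password or the macOS keychain); it only illustrates, at the process level, passing a secret to a child through its environment without writing it to disk or exporting it in the parent. The `DEMO_TOKEN` name and the helper function are hypothetical.

```python
import os
import subprocess
import sys


def run_with_session_secrets(cmd: list, secrets: dict) -> subprocess.CompletedProcess:
    """Run cmd with secrets present only in the child's environment."""
    child_env = {**os.environ, **secrets}  # the parent's os.environ is not modified
    return subprocess.run(cmd, env=child_env, capture_output=True, text=True, check=True)


# The child reports only the length of the secret, never its value.
child = [sys.executable, "-c", "import os; print(len(os.environ['DEMO_TOKEN']))"]
result = run_with_session_secrets(child, {"DEMO_TOKEN": "not-a-real-secret"})

print(result.stdout.strip())       # 17
print("DEMO_TOKEN" in os.environ)  # False
```

A process-level boundary like this is far weaker than VM isolation: the child and anything it spawns can still read the variable, which is part of why per-project VMs are being proposed at all.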
meajsinghk22 added Show HN: Agnt – Free open-source CLI to run any public or MIT-licensed AI agent (2 points, 0 comments). The agnt repo describes a registry, E2B sandboxed execution, a Supabase-backed ledger, and real-time micro-payment settlement between agents. steveharing1 linked Vercel's Zero: A Programming Language Designed for AI Agents (3 points, 2 comments); the linked article says Zero's compiler emits structured JSON diagnostics, stable error codes, machine-readable fix plans, and version-matched guidance for agents, though it also stresses that the language is still experimental.
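To show why structured diagnostics matter to an agent, here is a sketch of consuming a machine-readable compiler report. The JSON shape, the error code `E0001`, the `rename-symbol` fix ID, and the file name are all invented for illustration; the linked article describes JSON diagnostics, stable codes, and fix plans for Zero, but this digest does not state the actual schema.

```python
import json

# Hypothetical diagnostic report; not Zero's real output format.
RAW_REPORT = json.dumps({
    "diagnostics": [
        {
            "code": "E0001",
            "message": "unknown name 'totl'",
            "file": "main.src",
            "line": 3,
            "fix": {"id": "rename-symbol", "replace": "totl", "with": "total"},
        }
    ]
})


def plan_repairs(report_json: str) -> list:
    """Turn diagnostics that carry a fix into flat repair plans."""
    plans = []
    for diag in json.loads(report_json)["diagnostics"]:
        fix = diag.get("fix")
        if fix is None:
            continue  # nothing machine-applicable for this diagnostic
        plans.append({
            "code": diag["code"],
            "line": diag["line"],
            "action": fix["id"],
            "old": fix["replace"],
            "new": fix["with"],
        })
    return plans


def apply_plan(source: str, plan: dict) -> str:
    """Apply a single-line textual replacement described by a plan."""
    lines = source.splitlines()
    idx = plan["line"] - 1  # diagnostics use 1-based line numbers here
    lines[idx] = lines[idx].replace(plan["old"], plan["new"])
    return "\n".join(lines)


source = "let total = 1\nlet x = 2\nprint(totl)"
plans = plan_repairs(RAW_REPORT)
fixed = apply_plan(source, plans[0])
print(fixed.splitlines()[2])  # print(total)
```

With a stable code and an explicit fix plan, the agent never has to pattern-match on human-oriented error prose, which is the contrast the article draws.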
Discussion insight: The product boundary is moving outward from the base model toward VM profiles, secure sandboxes, stable diagnostics, and repair contracts. The question is increasingly not "which model?" but "what execution surface and failure interface does the model live inside?"
Comparison to prior day: May 16 focused on dashboards, observability, and sandbox platforms. May 17 pushed deeper into per-project VM isolation, settlement-backed execution, and toolchains that try to speak directly to agents instead of through human-only compiler prose.
1.3 Public and institutional trust keeps moving against AI boosterism (🡕)
The strongest non-builder thread was not enthusiasm about capability. It was distrust: distrust of sweeping timelines, distrust of imported vendors, and distrust of workflows where AI acts in a human's name without leaving clear traces.
latexr posted University of Arizona students boo Eric Schmidt's AI cheerleading (7 points, 0 comments). The linked Verge report says Schmidt was repeatedly booed once his commencement speech shifted into AI, and ties the reaction to job-market anxiety and broader opposition to AI being pushed into everyday life. momentmaker followed with Microsoft AI CEO forecasts human-level AI in 18 months (9 points, 12 comments), but the replies were mostly contemptuous: giancarlostoro (score 0) said such predictions are "all BS unless you have a demo," and al_borland (score 0) argued that repeating them only strengthens opposition to AI tools.
rustoo posted Germany's spy agency picks French AI firm over Palantir (12 points, 5 comments). Politico frames the decision as a signal for European digital sovereignty and notes criticism of Palantir on dependence, data protection, and rights grounds. At a more everyday level, 01-_- linked LinkedIn user hides AI prompt injection in bio to force recruitment spam (3 points, 0 comments); the linked Tom's Hardware piece uses the stunt to show that recruiter bots can be manipulated by prompt injection in scraped profile text.
Discussion insight: The skepticism here is about credibility, dependence, and hidden automation at least as much as it is about raw model quality. People are asking who controls the system, who it is acting for, and whether the human presence in the loop is real or theatrical.
Comparison to prior day: May 16's backlash centered on dependence, comprehension, and review burden inside technical workflows. May 17 widened that same distrust into public ceremony, state procurement, recruiter automation, and big-company forecasting.
2. What Frustrates People
Context still gets wasted on search retries and bloated tool output
Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep (68 points, 23 comments) carried the clearest version of this frustration. The product pitch says agents waste tokens by grepping, reading full files, and launching sub-agents just to orient themselves, while jerezzprime (score 0) argued that current models may still distrust non-grep search results enough to burn the savings away. Show HN: Save Context from MCP Bloat (2 points, 0 comments) describes the same problem from another angle: large MCP responses can flood the context window when a CLI flow would normally pipe or filter them. Severity: High. People cope with better search layers, cached-response proxies, and local repo models such as Bitloops, but the day suggests the underlying workflow is still inefficient by default. Worth building for: yes, directly.
There is still no clean disclosure when AI acts in a human's name
Ask HN: Why aren't more people worried about AI impersonation in code reviews? (2 points, 1 comment) is the day's sharpest articulation of a governance problem: agents can write code, comment on PRs, answer review threads, and even approve work while appearing to be the human operator. Show HN: Gonfire – Assess how well candidates steer AI coding agents (1 point, 0 comments) exists because ordinary take-home submissions reveal too little once the code is heavily AI-generated; the builder argues the Claude Code session log is the real evidence. Anthropic's linked post on AI-resistant technical evaluations makes the same issue concrete from the employer side, saying stronger Claude models forced repeated redesigns of its take-home because the best model output stopped separating top candidates. Severity: High. People cope with ad hoc proxies, manual review, and session-log inspection, but no standard disclosure or approval boundary is visible in the dataset. Worth building for: yes, directly.
Running agent workflows on a normal developer machine still feels unsafe
Show HN: Machine – per-project dev VMs with session-only secrets (4 points, 1 comment) is built on the premise that agentic development plus package-manager risk makes the laptop itself feel too exposed. The Machine repo isolates each project in its own Lima VM, forwards SSH auth without exporting keys, and injects secrets only for the current session. Show HN: Agnt – Free open-source CLI to run any public or MIT-licensed AI agent (2 points, 0 comments) pushes the same instinct toward E2B sandboxes and settlement-backed orchestration. Severity: High. People cope with per-project VMs, micro-VMs, and stricter secret handling, but these are still early tools rather than settled defaults. Worth building for: yes, directly.
Big AI claims now trigger distrust faster than excitement
University of Arizona students boo Eric Schmidt's AI cheerleading (7 points, 0 comments) shows the public-facing version of the problem: job-market anxiety and anti-AI sentiment are strong enough to disrupt a commencement speech. Microsoft AI CEO forecasts human-level AI in 18 months (9 points, 12 comments) drew replies calling the claim embarrassing, unsupported, and harmful to public support, while Germany's spy agency picks French AI firm over Palantir (12 points, 5 comments) shows that even buyers of serious AI systems are weighing sovereignty, provider dependence, and data-rights concerns. Severity: Medium to High. People cope by favoring local or regional providers, demanding clearer evidence, and treating aggressive timelines as marketing until proven otherwise. Worth building for: yes, but the answer is as much about provenance and governance as it is about product design.
3. What People Wish Existed
Verifiable disclosure for AI-authored review and approval
Ask HN: Why aren't more people worried about AI impersonation in code reviews? (2 points, 1 comment) states this need more clearly than any product page in the day's dataset: teams want a way to know when a commit, review comment, approval, or QA check was actually performed by a human and when it was delegated to an agent. Show HN: Gonfire – Assess how well candidates steer AI coding agents (1 point, 0 comments) is a partial answer because it promotes session logs over final code output, but it is specific to hiring. The broader need is practical and urgent because the existing audit trail can already look human even when the work was mostly agent-mediated. Opportunity: direct.
Repo intelligence that persists across sessions and across agents
Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep (68 points, 23 comments), TypedMemory – long-term memory and reflection for AI agents (2 points, 0 comments), Show HN: Give your AI agent a brain that understands your codebase (2 points, 0 comments), and Polis – a Markdown protocol for AI agent teams that get better over time (2 points, 0 comments) all attack the same need from different angles: fast search, contradiction-aware memory, maintained repo models, and shared multi-agent state. Current tools partially address it, but even the Semble thread shows the gap is not closed because people still question whether the harness and model will trust the higher-level context surface. Opportunity: direct.
Safe execution surfaces with ephemeral secrets and stronger boundaries
Show HN: Machine – per-project dev VMs with session-only secrets (4 points, 1 comment) and Show HN: Agnt – Free open-source CLI to run any public or MIT-licensed AI agent (2 points, 0 comments) both show a practical need for tighter runtime boundaries. Machine answers it with per-project Lima VMs, forwarded signing, and session-only secrets; agnt answers it with sandboxed execution and settlement controls. These are concrete solutions, but the need remains open because neither pattern yet looks like the default workflow most developers or teams can adopt casually. Opportunity: direct.
Technical evaluations that stay meaningful when AI help is allowed
Show HN: Gonfire – Assess how well candidates steer AI coding agents (1 point, 0 comments) and Anthropic's AI-resistant technical evaluations point to the same unmet need: employers want signals about reasoning, supervision, and optimization skill that do not disappear once Claude or another model can solve the original take-home. Gonfire treats the session log as the artifact to evaluate, while Anthropic describes repeatedly redesigning the task itself. The need is practical rather than aspirational because both a startup builder and a large model lab are already reworking their hiring flows around it. Opportunity: direct.
Toolchains that explain failure in a machine-readable way
Vercel's Zero: A Programming Language Designed for AI Agents (3 points, 2 comments) is a useful example of the kind of interface builders want. The linked article says Zero can return stable diagnostic codes, typed repair IDs, JSON output, and fix plans instead of making the agent parse human-only compiler prose. Nothing else in the day's dataset offers as complete an answer, which makes this need look more open than the repo-memory or sandbox categories. Opportunity: competitive.
4. Tools and Methods in Use
| Tool | Category | Sentiment | Strengths | Limitations |
|---|---|---|---|---|
| Semble | Code search | (+/-) | Fast CPU-only repo search, MCP and bash support, strong token-savings pitch | HN wants agent-level benchmarks, and some commenters think current models may not trust non-grep results |
| mcp-context-guard | MCP wrapper | (+) | Caches oversized responses and lets agents query them with seek_result instead of flooding context | Adds operational overhead and only helps after a tool response has already been intercepted |
| TypedMemory | Agent memory | (+) | Surfaces contradictions, preserves history, and treats memory as auditable state instead of a loose vector store | Very light public usage signal in this dataset and no discussion stress-testing it |
| Bitloops | Repo intelligence layer | (+/-) | Builds a local queryable model of repo structure, tests, checkpoints, and session history | Explicitly alpha and not production-ready yet |
| Polis Protocol | Multi-agent coordination protocol | (+) | Vendor-agnostic markdown contracts, chronicles, and lessons make shared state inspectable | Rigid protocol shape and little evidence of adoption so far |
| Machine | Dev runtime isolation | (+) | Per-project VMs, host isolation, forwarded signing, and session-only secrets | Lima/macOS-centered workflow and heavier setup than normal local development |
| agnt | Agent orchestration layer | (+/-) | Combines discovery, sandboxed execution, and agent-to-agent settlement in one CLI | Alpha-stage project with a proprietary backend and open-core licensing constraints |
| Zero | Agent-oriented language/toolchain | (+) | Stable diagnostic codes, JSON output, typed repair IDs, and machine-readable fix plans | Experimental language with unstable compiler/spec and no package registry yet |
Satisfaction was strongest when a tool made agent state smaller, more local, or more inspectable. Semble, mcp-context-guard, TypedMemory, Bitloops, and Polis all fit that pattern in different ways: they try to replace repeated repo crawling and unstructured memory with a narrower interface humans can audit.
Mixed sentiment concentrated in tools that require a bigger workflow change or still depend on early ecosystem assumptions. Semble still has to prove that real coding agents will trust its results. Bitloops admits it is alpha. Both agnt and Zero are ambitious, but each asks users to buy into a larger system change than a simple plugin or CLI wrapper would.
The migration pattern is away from raw grep, full-file reads, and host-level execution toward queryable repo models, cached tool output, per-project isolation, and machine-readable diagnostics. Competitive dynamics remain in the operating layer around the model - context, runtime, protocol, and repair surface - rather than in the model itself.
5. What People Are Building
| Project | Who built it | What it does | Problem it solves | Stack | Stage | Links |
|---|---|---|---|---|---|---|
| Semble | Bibabomas | Code-search library and MCP server that returns targeted repo snippets for agents | Grep-plus-read loops waste tokens and still miss relevant code | Python library/CLI, static Model2Vec embeddings, BM25, MCP server, bash integration | Shipped | HN, GitHub |
| Machine | katspaugh | Creates a dedicated Lima VM per project with forwarded signing and session-only secrets | Agentic local development feels too risky on the host machine | Lima VMs, Python/TOML provisioning, Docker, Node, Claude Code, Codex, 1Password/keychain forwarding | Beta | HN, GitHub |
| mcp-context-guard | healqq | Wraps MCP servers, caches oversized responses, and exposes seek_result filters | MCP tools can flood the context window with raw output | Go static binary, stdio/HTTP proxying, jq/grep filtering, cache controls | Shipped | HN, GitHub |
| Bitloops | sergiopreira | Maintains a local queryable model of repo state, history, tests, and agent sessions | Agents repeatedly crawl repos and lose shared understanding between runs | Local daemon, DevQL/GraphQL, hooks, dashboard, session/checkpoint capture | Alpha | HN, GitHub |
| TypedMemory | ruxiz | Adds long-term memory with contradiction detection, audit trails, and reflection | Agents overwrite or forget earlier facts without exposing the conflict | Python library/CLI, profile-based memory types, reflection pipeline | Beta | HN, GitHub |
| Polis Protocol | lucius_gc | Defines a markdown-based coordination protocol for multi-agent teams | Different agents need shared, inspectable task state and routing logic | Markdown protocol, Python scripts, capability cards, chronicles, routing stats | Beta | HN, GitHub |
| agnt | meajsinghk22 | Discover, sandbox, and settle work between specialized agents | Agents cannot easily find and safely hire each other for delegated work | TypeScript CLI, E2B sandboxes, Supabase ledger, payment rails | Alpha | HN, GitHub |
| Gonfire | abr0ahm | Records and analyzes Claude Code sessions to help hiring managers judge candidate steering skill | AI-assisted take-homes reveal too little about how the candidate actually worked | Proxy layer, session-log analysis, web demo | Beta | HN, Demo |
The repeated pattern is that builders are targeting the surfaces around the agent, not the base model. Semble, mcp-context-guard, Bitloops, TypedMemory, and Polis all try to compress or preserve context in a way that is more reusable and inspectable than a raw prompt window. Machine and agnt do the same for execution, moving work into isolated environments with clearer boundaries.
Gonfire shows that the same operating-layer logic is spilling into hiring. If AI-assisted code output is no longer enough to tell who understands the work, then session traces, routing history, repo state, and tool usage become products in their own right. That is the deepest builder pattern on this date: make the hidden layer around agent work visible enough to supervise.
6. New and Notable
Code search, not model quality, dominated the day's discussion
Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep (68 points, 23 comments) is notable because it pulled nearly half of the day's total comments into a single complaint: coding agents still spend too much time and too many tokens finding the right code.
AI impersonation in review workflows became an explicit governance complaint
Ask HN: Why aren't more people worried about AI impersonation in code reviews? (2 points, 1 comment) is notable because it names a failure mode many teams likely already permit in practice: agents acting through human credentials in a way that leaves an audit trail that still looks manual.
Distrust of AI widened beyond developer tooling into ceremony and procurement
University of Arizona students boo Eric Schmidt's AI cheerleading (7 points, 0 comments) and Germany's spy agency picks French AI firm over Palantir (12 points, 5 comments) are notable together because they show two very different audiences - a graduating class and a state security buyer - reacting against AI evangelism or dependence rather than embracing it by default.
Toolchains are starting to speak directly to agents
Vercel's Zero: A Programming Language Designed for AI Agents (3 points, 2 comments) is notable because it treats structured diagnostics, repair IDs, and version-matched guidance as first-class features of the language toolchain rather than as documentation layered on afterward.
7. Where the Opportunities Are
[+++] Review provenance and AI-disclosure controls - Ask HN: Why aren't more people worried about AI impersonation in code reviews?, Show HN: Gonfire – Assess how well candidates steer AI coding agents, and Anthropic's AI-resistant technical evaluations all point to the same gap: teams need products that can prove who actually did the work, what the agent did, and where human judgment really entered the loop. This is strong because the pain appears in both day-to-day code review and hiring.
[+++] Local repo intelligence and context-shaping layers - Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep, Show HN: Save Context from MCP Bloat, TypedMemory – long-term memory and reflection for AI agents, Show HN: Give your AI agent a brain that understands your codebase, and Polis – a Markdown protocol for AI agent teams that get better over time all converge on the same need: reusable, inspectable context that survives across turns, sessions, and even multiple agents. This is strong because the builder energy is broad and the problem was still the day's biggest discussion topic.
[++] Safe execution surfaces for agentic development - Show HN: Machine – per-project dev VMs with session-only secrets and Show HN: Agnt – Free open-source CLI to run any public or MIT-licensed AI agent show direct demand for isolated runtimes, secret boundaries, and better control over what an agent can touch. This is moderate because the need is obvious and practical, but the current solutions still look early and somewhat opinionated.
[++] AI-resistant evaluation and supervision tooling - Show HN: Gonfire – Assess how well candidates steer AI coding agents and Anthropic's AI-resistant technical evaluations show a growing market for tools that score steering, reasoning, and supervision instead of raw code output. This is moderate because the need is real now, but different companies will want different evaluation shapes.
[+] Agent-facing diagnostics and repair contracts - Vercel's Zero: A Programming Language Designed for AI Agents suggests room for tools that expose stable error codes, machine-readable fix plans, and structured repair guidance across existing languages and build systems. This is emerging because the interface idea is compelling, but the current example is still attached to an experimental language.
8. Takeaways
- The main live bottleneck on this date was repo context, not model capability. Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep, Show HN: Save Context from MCP Bloat, TypedMemory – long-term memory and reflection for AI agents, and Show HN: Give your AI agent a brain that understands your codebase all target the same failure mode from different directions.
- The day showed that context efficiency and trust are now linked problems. Semble's HN discussion says agents may still ignore better retrieval if they do not trust it, while Polis – a Markdown protocol for AI agent teams that get better over time and Bitloops both try to make shared state inspectable enough to trust.
- Secure runtime boundaries are moving from optional hardening to core product value. Show HN: Machine – per-project dev VMs with session-only secrets and Show HN: Agnt – Free open-source CLI to run any public or MIT-licensed AI agent show builders assuming that host-level execution and loose secret handling are no longer acceptable defaults.
- The hardest governance risk in the dataset was invisible human-in-the-loop theater. Ask HN: Why aren't more people worried about AI impersonation in code reviews? and Show HN: Gonfire – Assess how well candidates steer AI coding agents both argue that once AI work is common, the real scarce signal is not output volume but traceable supervision.
- Public trust in AI claims is deteriorating across multiple levels at once. University of Arizona students boo Eric Schmidt's AI cheerleading, Microsoft AI CEO forecasts human-level AI in 18 months, and Germany's spy agency picks French AI firm over Palantir show skepticism hitting speeches, forecasts, and procurement choices rather than staying confined to online comment sections.