Twitter AI Coding - 2026-05-20

1. What People Are Talking About

1.1 Security controls moved from theory to incident response

The sharpest change in today's feed was that AI-coding security stopped sounding architectural and started sounding operational. Posts about a malicious VS Code extension reaching internal GitHub systems quickly turned into advice about key rotation, trusted extension sources, pre-execution scanning, and credential mediation. Compared with May 19's perimeter-and-sandbox discussion, May 20 was much more about how teams keep agent tooling from turning into a supply-chain liability.

@rayklanderman warned (24 likes, 1 reply, 7,995 views) that developers should rotate GitHub Copilot keys immediately and stick to the official Microsoft marketplace after an OpenVSX extension became the entry point. @TechieUltimatum summarized (10 likes, 4 replies, 204 views) the breach as the exfiltration of roughly 3,800 internal GitHub repositories, including systems tied to Copilot, Actions, billing, and auth. The attached screenshots made the sales-listing angle tangible rather than abstract.

Screenshot summarizing a malicious VS Code extension breach that exposed thousands of internal GitHub repositories and triggered secret rotation

Forum screenshot advertising thousands of stolen private GitHub repositories for sale after the extension-borne breach

@PurpleOps_io added (2 likes, 8,974 views) the most important corrective nuance: “internal repos” could mean secret-scanning logic, Copilot-auth flows, and signing-cert issuance, so “no customer impact” does not make the exposure trivial. On the defensive side, @Dinosn shared (6 likes, 486 views) hol-guard, which wraps agent tool calls before execution, and @devopsdotcom linked (105 views) to 1Password's Codex MCP integration, which delivers just-in-time credentials so secrets never land in prompts, logs, or terminal output.
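
To make the pre-execution idea concrete, here is a minimal Python sketch of a wrapper that checks an agent tool call against a policy before anything runs. It is not hol-guard's API or behavior; `ToolCall`, `DENY_PATTERNS`, `BlockedToolCall`, and `guarded_execute` are hypothetical names, and the deny list is illustrative rather than a recommended rule set.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical deny list: patterns a team might refuse to run unattended.
DENY_PATTERNS = [
    re.compile(r"curl\s+[^|]*\|\s*(sh|bash)"),  # piping a download into a shell
    re.compile(r"\brm\s+-rf\s+/"),              # recursive delete from an absolute path
    re.compile(r"\.ssh/|\.aws/credentials"),    # touching common credential locations
]


@dataclass
class ToolCall:
    tool: str      # e.g. "shell"
    argument: str  # the command or payload the agent wants to run


class BlockedToolCall(Exception):
    """Raised when a call matches a deny pattern and must not run."""


def check(call: ToolCall) -> None:
    # Raise before execution if any deny pattern matches the argument.
    for pattern in DENY_PATTERNS:
        if pattern.search(call.argument):
            raise BlockedToolCall(f"{call.tool}: matched {pattern.pattern!r}")


def guarded_execute(call: ToolCall, runner: Callable[[str], str]) -> str:
    # The runner is only invoked after the policy check passes.
    check(call)
    return runner(call.argument)
```

The point of the design is ordering: the policy check sits in front of the runner, so a blocked call never reaches the shell. A real guard would need far more than pattern matching, which is one reason the posts treat this as a dedicated layer.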

Discussion insight: The strongest security voices were concrete, not vague. The useful responses were about which marketplace to trust, what keys to rotate, and which layer should mediate agent access to secrets and tools.

Comparison to prior day: May 19 centered on self-hosted sandboxes and private-network access for managed agents. May 20 turned that same trust conversation into an incident-driven supply-chain discussion with immediate mitigations.

1.2 Shared skills and operating procedures became a bigger part of the story

A second theme was that AI-coding know-how is being packaged into things teams can actually reuse: skill repos, workshop curricula, and opinionated operating procedures. The feed had less “write a better prompt” energy than it did a week earlier and more evidence that people want durable artifacts the agent can load repeatedly. That makes the conversation less about prompt craft and more about institutional memory.

@iammukeshm highlighted (16 likes, 510 views) Microsoft's new dotnet/skills repo as a shared library of .NET-oriented skills for Copilot, Claude, Gemini, and Codex. The linked repo adds concrete substance beyond the tweet text: separate skill areas for ASP.NET Core, Aspire, Orleans, diagnostics, NuGet, testing, AI/agents, and upgrades, plus a public dashboard for evaluating skill accuracy and efficiency. That turns “prompt templates” into versioned project infrastructure.

Diagram of the dotnet/skills repository showing reusable AI coding skill areas across ASP.NET Core, Orleans, AI agents, testing, and architecture

@nasqret reported (31 likes, 1 reply, 869 views) that after a 300-person introductory training round, 60 researchers were already in a longer Codex workflow course building research apps in chemistry, biology, mathematics, and physics. The workshop photos matter because they show the output, not just the rhetoric: people who were barely console users are already running domain-specific dashboards and interactive scientific tools.

Workshop screenshot showing a scientific app built during Codex training running live on a participant laptop

Second workshop screenshot showing another live science-focused app produced inside the Codex research training sessions

@ziwenxu_ argued (4 likes, 3 replies, 127 views) that most people are not “vibe coding” so much as “panic prompting,” and then laid out a workflow loop of context before code, plans before diffs, CLAUDE.md before chaos, and tests before trust. A lower-volume but aligned signal came from @JeremyNguyenPhD sharing (2 likes, 159 views) a tutorial for academic researchers that explicitly teaches subagents plus Git/GitHub discipline rather than treating the model as a magic endpoint.

Tutorial card for academic Claude Code users emphasizing subagents and version control with Git and GitHub

Discussion insight: The most useful nuance today was that the winning pattern is not “let the agent do everything.” It is “encode good judgment into skills, docs, and checkpoints so the agent can reuse it safely.”

Comparison to prior day: May 19 had more attention on hosted agent runtimes and orchestration surfaces. May 20 spent more of its energy on the reusable instructions and training programs that make those runtimes dependable.

1.3 Cost pressure is now producing concrete routing and local-model workarounds

The pricing story also hardened. Instead of vague anxiety about agent costs, today's feed showed specific ways people are changing their stack: plan with a frontier model, implement with a cheaper one, proxy expensive clients through free providers, or move the whole thing onto local hardware. The overall tone was that agentic coding is useful enough to keep - but only if teams can stop treating every step like it deserves the most expensive model.

@haider1 tied (39 likes, 10 replies, 1,816 views) OpenAI's recent Codex confidence to compute access, and the attached chart made that argument legible by contrasting 75+ GW of contracted capacity with a much smaller operational footprint today. Replies added the main counterpoint: compute buys time, but habit-forming product UX still matters.

Bar chart comparing OpenAI's contracted and operational compute capacity to explain why Codex limits and model access feel different now

@DivyanshT91162 shared (2 likes, 1 reply, 110 views) free-claude-code, which routes Claude Code through NVIDIA NIM, OpenRouter, Ollama, and other providers so users can keep the interface while escaping the default subscription path. @markessien described (7 likes, 1 reply, 451 views) the same pattern in plain language: use Claude or Gemini to make the plan, then let a smaller model from z.ai or OpenCode do the bulk coding.

README screenshot for free-claude-code showing a localhost proxy that routes Claude Code requests through free or local model providers

The clearest cost shock came from @ivasuyadav posting (3 likes, 2 replies, 32 views) a CodexBar screenshot showing about $19,985 of spend for the day and roughly $1.3 million over 30 days on 603 billion tokens across around 100 continuously running Codex instances. Meanwhile @tom_doerr showed (2 likes, 157 views) a local Apple Silicon alternative that replaces paid Claude Code with on-device models, making “local fallback” look more like a normal response to subscription pressure than a hobbyist experiment.

CodexBar menubar screenshot showing daily, 7-day, and 30-day OpenAI API spend for continuously running Codex workloads

Repository screenshot for claude-code-local showing a local Apple Silicon setup that replaces paid Claude Code with on-device models
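
The reported CodexBar figures can be turned into rough unit costs. The sketch below uses only the numbers quoted above and assumes, since the post does not say so explicitly, that the 603 billion tokens cover the same 30-day window as the $1.3 million; the variable names are illustrative.

```python
# Figures as reported in the screenshot described above; all approximate.
DAY_SPEND_USD = 19_985
THIRTY_DAY_SPEND_USD = 1_300_000
THIRTY_DAY_TOKENS = 603e9  # assumption: token count covers the same 30 days
INSTANCES = 100            # "around 100" in the post

avg_daily = THIRTY_DAY_SPEND_USD / 30
usd_per_million_tokens = THIRTY_DAY_SPEND_USD / (THIRTY_DAY_TOKENS / 1e6)
per_instance_daily = avg_daily / INSTANCES
day_vs_average = DAY_SPEND_USD / avg_daily

print(f"average daily spend:        ${avg_daily:,.0f}")
print(f"blended cost per 1M tokens: ${usd_per_million_tokens:.2f}")
print(f"per instance per day:       ${per_instance_daily:,.0f}")
print(f"screenshot day vs average:  {day_vs_average:.0%}")
```

Under those assumptions the 30-day total works out to roughly $43,333 per day, about $2.16 per million tokens, and about $433 per instance per day. The $19,985 day in the screenshot is about 46% of that average, so it was either a lighter day or a partial one.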

Discussion insight: The emerging pattern is not one-model loyalty. It is workflow splitting: a smarter model for planning and review, cheaper or local models for implementation, and explicit instrumentation so costs do not stay invisible.

Comparison to prior day: May 19 already showed model routing becoming normal. May 20 added stronger evidence that people are actively routing around subscriptions, watching spend in dashboards, and trying to re-home agent workflows onto local machines.

1.4 Codex gained more operator mindshare while Copilot and Antigravity faced sharper judgment

The day's clearest preference signal was not a vendor announcement. It was a pile of user judgments about which tools feel polished, which tools handle long-running jobs, and which ones are losing credibility. Codex kept picking up momentum in polls and migration stories, while Copilot and Antigravity drew more complaints about execution and differentiation.

@mark_k posted (34 likes, 12 replies, 2,049 views) a poll where Codex took 57.8% of the vote versus 22.2% for Claude Code and 20% for Cursor. Replies made the split more precise than the topline: Codex is winning long-running agentic work, while Cursor still gets credit for IDE polish.

Poll result screenshot showing Codex at 57.8 percent, Claude Code at 22.2 percent, and Cursor at 20 percent among 45 respondents

@_heyrico said (24 likes, 7 replies, 1,110 views) they had fully migrated to Codex because it feels more polished and reliable, and in replies they gave the most practical reason: Codex makes it easier to navigate and understand what has happened so far, while Claude Code sometimes leaves them unsure what they are working on. On the benchmark-and-perception side, @daniel_mac8 argued (39 likes, 7 replies, 4,156 views) for OpenAI leadership with chart-heavy evidence rather than pure fandom, using one image to argue GPT-5.5 leads on intelligence and another to show Grok 4.3 leading intelligence-per-dollar.

Artificial Analysis chart showing GPT-5.5 at the top of a model intelligence ranking used to argue for OpenAI leadership

Scatter plot comparing model intelligence and price, highlighting Grok 4.3 as especially strong on intelligence per dollar

The negative side of the comparison was equally direct. @ravikiran_dev7 complained (45 likes, 31 replies, 1,916 views) that GitHub had every distribution advantage and still got outpaced by newer agent products; replies added nuance by saying Copilot was early and strong, then got complacent. @ZypherHQ called (62 likes, 18 replies, 5,953 views) Antigravity “basically unusable,” and @notjazii backed (10 likes, 2 replies, 42 views) the copycat critique with a side-by-side UI screenshot that made the Codex/Cursor resemblance easy to see.

Side-by-side desktop UI screenshot comparing Google Antigravity 2.0 with a Codex-style agent workspace to question how differentiated Antigravity looks

Discussion insight: Even the positive Codex commentary was measured. Users were not saying every rival was bad; they were separating long-running agent work, IDE polish, context tracking, and reliability into different jobs and then judging tools on those jobs.

Comparison to prior day: May 19 still had positive Copilot momentum from the remote-control and Gemini 3.5 Flash rollouts. May 20 shifted toward user preference, migration narratives, and harsher comparison against Codex and Cursor-style experiences.


2. What Frustrates People

Supply-chain exposure around agent tooling and extensions

@rayklanderman warned (24 likes, 1 reply, 7,995 views) that GitHub Copilot keys should be rotated immediately after the malicious OpenVSX extension story, and @PurpleOps_io added (2 likes, 8,974 views) that the phrase “internal GitHub repos” could include secret scanning, Copilot auth, and signing pipelines. The frustration here is not only that an attack happened; it is that the attack path was ordinary developer tooling, which makes every plugin, skill pack, and MCP surface feel like a potential ingress point. Severity: High.

People are coping by narrowing trust boundaries and adding guardrails. @Dinosn shared (6 likes, 486 views) hol-guard as a pre-execution wrapper for agents and their plugins, while @devopsdotcom linked (105 views) to 1Password's just-in-time credential flow for Codex. Worth building for: Yes. This is a concrete, urgent problem with obvious budget and trust implications.

Large agent-generated diffs still break outside the happy path

@zerion0 described (14 likes, 3 replies, 121 views) the most specific failure report in the dataset: Opus 4.7 produced a 31-file, 2,600-line feature that looked correct in local testing and then broke most of the app in QA. @ziwenxu_ framed (4 likes, 3 replies, 127 views) the same problem as “panic prompting” - vague asks, blind diff acceptance, and missing process discipline. Severity: High.

The coping pattern is procedural rather than model-specific: context before code, plans before diffs, Git before experiments, and tests before trust. Worth building for: Yes. The gap is not another chat UI; it is stronger review, regression isolation, and change-scope control around agent-generated edits.
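
One way to picture change-scope control is a gate that measures a diff before anyone accepts it. The sketch below parses text in the `git diff --numstat` format (added, deleted, and path separated by tabs, with `-` for binary files) and flags diffs above a size limit. The thresholds and the names `DiffStats`, `parse_numstat`, and `needs_staged_review` are hypothetical, not taken from any tool mentioned in the feed.

```python
from dataclasses import dataclass


@dataclass
class DiffStats:
    files: int
    lines: int


def parse_numstat(numstat: str) -> DiffStats:
    """Parse `git diff --numstat` text: 'added<TAB>deleted<TAB>path' per line."""
    files = 0
    lines = 0
    for row in numstat.strip().splitlines():
        added, deleted, _path = row.split("\t", 2)
        files += 1
        # Binary files report '-' for both counts; treat them as 0 lines.
        lines += int(added) if added != "-" else 0
        lines += int(deleted) if deleted != "-" else 0
    return DiffStats(files, lines)


def needs_staged_review(stats: DiffStats, max_files: int = 10, max_lines: int = 400) -> bool:
    # Hypothetical limits: anything larger is split or reviewed in stages.
    return stats.files > max_files or stats.lines > max_lines


sample = "12\t3\tsrc/app.py\n-\t-\tassets/logo.png\n"
small = parse_numstat(sample)
print(small, needs_staged_review(small))
print(needs_staged_review(DiffStats(files=31, lines=2600)))
```

With these illustrative limits, the two-file sample passes, while a change the size of the 31-file, 2,600-line report would be routed to staged review instead of being accepted in one step.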

Runaway spend is forcing people into proxies, local models, and model splitting

@ivasuyadav posted (3 likes, 2 replies, 32 views) a CodexBar screenshot showing about $19,985 in daily spend and roughly $1.3 million over 30 days, which gave the feed one of its few hard numbers for what always-on coding agents can cost at scale. @DivyanshT91162 shared (2 likes, 1 reply, 110 views) a free-claude-code proxy, and @markessien described (7 likes, 1 reply, 451 views) the new norm as “make plans with the smart models, let the smaller models code.” Severity: High.

The frustration is not merely price sensitivity. It is that useful agent workflows now feel operationally expensive enough to require routing policies, local fallbacks, or free-provider hacks just to stay practical. Worth building for: Yes. The evidence points to a real need for spend visibility, policy controls, and safer cost-down paths.

Tool polish and context tracking still decide who people trust

@ZypherHQ called (62 likes, 18 replies, 5,953 views) Antigravity buggy and interruption-prone, while @ravikiran_dev7 complained (45 likes, 31 replies, 1,916 views) that GitHub had every structural advantage and still got outrun by newer agent products. On the other side, @_heyrico said (24 likes, 7 replies, 1,110 views) they switched to Codex because it is easier to navigate and keep track of the work so far. Severity: Medium.

This frustration is less about raw model quality than about whether the product helps people supervise long-running work without getting lost. Worth building for: Yes, but mainly in workflow visibility, recovery, and context management rather than another undifferentiated coding shell.


3. What People Wish Existed

Credential-safe access to tools, repos, and secrets

The most immediate practical need in today's feed was for agent workflows that can touch real systems without turning every plugin or prompt into a secret-handling risk. @devopsdotcom linked (105 views) to 1Password's Codex MCP flow, which keeps credentials out of prompts, logs, and terminal output, while @Dinosn shared (6 likes, 486 views) hol-guard as a pre-execution layer for skills, plugins, and MCP servers. The tone was practical, not aspirational: people want a way to let agents act without expanding the blast radius of every extension. Opportunity rating: direct.
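
The general shape of just-in-time credential mediation can be sketched in a few lines: fetch the secret only at call time, expose it to the child process alone, and redact it from anything that flows back toward the agent. This is not 1Password's integration or API; `run_with_secret` and the `fetch_secret` callback are hypothetical, and the inline lambda stands in for a real broker.

```python
import os
import subprocess
import sys
from typing import Callable


def run_with_secret(argv: list[str], env_name: str, fetch_secret: Callable[[], str]) -> str:
    secret = fetch_secret()  # fetched only when the tool actually runs
    if not secret:
        raise ValueError("broker returned an empty secret")
    # The secret lives in the child's environment, not in the prompt or argv.
    env = {**os.environ, env_name: secret}
    result = subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
    # Redact before the output can reach the agent's context or logs.
    return result.stdout.replace(secret, "[REDACTED]")


out = run_with_secret(
    [sys.executable, "-c", "import os; print('token is', os.environ['DEMO_TOKEN'])"],
    "DEMO_TOKEN",
    lambda: "s3cr3t-value",  # stand-in for a real credential broker
)
print(out.strip())
```

Even when the child process prints the token, the caller only sees the redacted form, and the parent environment is never modified. A production broker would also need short-lived credentials and audit logging, which this sketch leaves out.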

Shared team memory that survives beyond one user's prompts

@iammukeshm pointed (16 likes, 510 views) to Microsoft's dotnet/skills repo as a reusable skill layer for .NET teams, and @ziwenxu_ argued (4 likes, 3 replies, 127 views) for CLAUDE.md, plans, and tests as part of the working loop rather than optional extras. @JeremyNguyenPhD added (2 likes, 159 views) that even academic users now need subagent and version-control instruction. The need is practical and urgent because the alternative is repeated context typing and inconsistent team behavior. Opportunity rating: direct.

Spend-aware routing across frontier, cheaper, and local models

@DivyanshT91162 shared (2 likes, 1 reply, 110 views) a proxy that lets Claude Code run on free or open providers, @markessien described (7 likes, 1 reply, 451 views) the planner-versus-implementer split explicitly, and @tom_doerr showed (2 likes, 157 views) a local Apple Silicon replacement path for Claude Code. That combination says the unmet need is not just “cheaper models.” It is a control plane that decides when to use premium reasoning, when to fall back, and how to keep the workflow intact. Opportunity rating: direct.
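
A minimal sketch of such a control plane, assuming three made-up tiers and placeholder per-call costs, might look like the following. `Router`, `TIER_COST_USD`, and the task kinds are hypothetical; none of this reflects real prices or any product named above.

```python
from dataclasses import dataclass

# Hypothetical tiers and per-call cost estimates; not real prices or models.
TIER_COST_USD = {"premium": 0.50, "cheap": 0.05, "local": 0.0}
TIER_ORDER = ["premium", "cheap", "local"]


@dataclass
class Router:
    daily_budget_usd: float
    spent_usd: float = 0.0

    def pick_tier(self, task_kind: str) -> str:
        # Planning and review prefer the premium tier; everything else starts cheap.
        preferred = "premium" if task_kind in {"plan", "review"} else "cheap"
        # Walk down from the preferred tier until one fits the remaining budget.
        for tier in TIER_ORDER[TIER_ORDER.index(preferred):]:
            if self.spent_usd + TIER_COST_USD[tier] <= self.daily_budget_usd:
                return tier
        return "local"  # last resort when the budget is already exceeded

    def record(self, tier: str) -> None:
        # Track estimated spend so later routing decisions see it.
        self.spent_usd += TIER_COST_USD[tier]


router = Router(daily_budget_usd=1.0)
for kind in ["plan", "implement", "review"]:
    tier = router.pick_tier(kind)
    router.record(tier)
    print(kind, "->", tier)
```

In this run the plan goes to the premium tier and the implementation to the cheap one. The final review is downgraded to the cheap tier because another premium call would exceed the budget, which is the "when to fall back" decision made explicit.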

Better away-from-desk supervision for long-running coding sessions

@GHchangelog announced (18 likes, 1,027 views) remote control for Copilot CLI sessions across GitHub Mobile, VS Code, and JetBrains, including non-GitHub repos, and @AightApp said (2 likes, 270 views) its phone app can now remotely control Codex CLI and coordinate multiple Codex instances in one group chat. People clearly want agents to keep working while they walk away, but with enough visibility to intervene from the phone. Opportunity rating: competitive.


4. Tools and Methods in Use

| Tool | Category | Sentiment | Strengths | Limitations |
| --- | --- | --- | --- | --- |
| OpenAI Codex | Coding agent / runtime | (+) | Strong for long-running agentic work, reliable enough to drive migrations, mobile and remote-control use cases are showing up in the wild | Can become very expensive at sustained scale; some users still pair it with other tools for IDE polish |
| GitHub Copilot / Copilot CLI | Coding assistant + hosted/remote workflow | (+/-) | Huge distribution, remote control across devices, reusable skills and CLI workflows, still integrated deeply into GitHub surfaces | Security breach discussion hit Copilot-adjacent trust; critics say execution has lagged newer agent products |
| Claude Code | Terminal coding agent | (+/-) | Remains a planning and reasoning reference point, especially in multi-model stacks and skills-heavy workflows | Cost pressure is pushing users toward proxies or local replacements; some users say it is easier to lose track of session state |
| Google Antigravity | Agent workspace / IDE | (-) | Still attracts curiosity around agent teams and Google model access | Strong complaint volume around bugs, interruptions, and undifferentiated UI compared with Codex/Cursor |
| dotnet/skills | Reusable skill library | (+) | Turns prompts into versioned team infrastructure, covers broad .NET workflow areas, backed by public evaluation signals | .NET-focused today; value depends on teams actually curating and maintaining skill packs |
| hol-guard | Agent security wrapper | (+) | Wraps tools before execution and scans plugins, skills, and MCP surfaces | Early-stage control layer that adds another operational dependency |
| 1Password MCP for Codex | Credential mediation | (+) | Keeps secrets out of prompts, logs, and model context while still letting agents act | Adds another identity/control hop that teams need to integrate carefully |
| free-claude-code | Cost workaround / proxy | (+/-) | Keeps Claude Code UX while routing requests through free, open, or local providers | It is still a workaround layer, and quality depends on the substitute provider behind the proxy |

The satisfaction spectrum was pragmatic rather than ideological. @mark_k showed (34 likes, 12 replies, 2,049 views) that Codex outpolled Claude Code and Cursor in a small 45-respondent poll, @_heyrico explained (24 likes, 7 replies, 1,110 views) a migration to Codex in terms of reliability and navigation, and @GHchangelog showed (18 likes, 1,027 views) Copilot still gaining real workflow surface area through remote control. At the same time, @ZypherHQ captured (62 likes, 18 replies, 5,953 views) the frustration side for Antigravity and @ravikiran_dev7 did the same (45 likes, 31 replies, 1,916 views) for Copilot.

Common workarounds were consistent across the feed: split planning from implementation, route implementation to smaller or local models, add spend visibility, and supervise sessions from the phone when possible. @DivyanshT91162 showed (2 likes, 1 reply, 110 views) the proxy path, @markessien described (7 likes, 1 reply, 451 views) the planner-versus-implementer split, @tom_doerr showed (2 likes, 157 views) a local fallback, and @AightApp pointed (2 likes, 270 views) to phone-based remote supervision. Together they suggest the winning product is increasingly the one that coordinates models, devices, and guardrails, not just the one with the strongest single model.


5. What People Are Building

| Project | Who built it | What it does | Problem it solves | Stack | Stage | Links |
| --- | --- | --- | --- | --- | --- | --- |
| dotnet/skills | @iammukeshm / Microsoft | Shared skill library for .NET coding agents | Teams keep retyping the same .NET context, conventions, and workflows into every AI session | .NET skill packs, GitHub repo, agentskills.io-compatible structure, evaluation dashboard | Shipped | GitHub / tweet |
| hol-guard | @Dinosn / Hashgraph Online | Wraps coding agents and scans plugins, skills, and MCP surfaces before execution | Agent toolchains are becoming a supply-chain and pre-execution security risk | Python package, CLI wrapper, plugin scanner, agent integrations | Beta | GitHub / tweet |
| Clawdmeter | @tom_doerr | Physical dashboard that monitors Claude Code usage in real time and adds hardware controls | Heavy Claude Code users want persistent visibility and fast physical control over sessions | ESP32-S3 touchscreen, BLE HID, Claude Code shortcuts, token meter | Alpha | GitHub / tweet |
| free-claude-code | @DivyanshT91162 / open-source community | Routes Claude Code through free, open, or local providers while preserving the client workflow | Paid frontier-model subscriptions are too expensive for some coding-agent workflows | Local proxy, NVIDIA NIM, OpenRouter, Ollama, local models | Beta | GitHub / tweet |
| Aight | @AightApp | Phone client for remotely controlling Codex CLI sessions and coordinating multiple instances | Long-running coding sessions need away-from-desk supervision and lightweight coordination | iOS app, Codex CLI remote control, group chat coordination | Shipped | download / tweet |
| claude-code-local | @tom_doerr / nicedreamzapp | Replaces paid Claude Code with local Apple Silicon model stacks | Subscription pressure is pushing users toward on-device alternatives | Apple Silicon, local models, on-device inference modes | Beta | GitHub / tweet |

The strongest builder signal was dotnet/skills because it turns a community habit - private skill files and project-specific instructions - into a first-party, reusable repo that whole teams can share. That is different from shipping yet another agent shell: it productizes the working memory around the agent instead of the chat surface itself.

The security side produced its own mini-cluster. hol-guard and 1Password's Codex MCP story both point to a growing market for agent-control layers that sit between the model and real tools, with scanning, approval, and credential mediation as the core value. The same pattern showed up in the breach discourse: the fear was not only bad model output, but bad tool access.

The hardware and local-runtime builds were equally revealing. Clawdmeter makes Claude Code usage visible on a physical screen and even maps BLE buttons to session controls, which is a sign that some users are spending enough time with these agents to justify dedicated hardware.

ESP32 touchscreen showing a live Claude Code usage meter and pixel-art status display for an always-on coding session

Repeated builder pattern: people are wrapping existing agents with control layers, training layers, spend layers, and remote-control layers. The feed offered very little evidence for a brand-new coding assistant winning mindshare today; the more common move was to extend, constrain, or reroute Codex, Claude Code, or Copilot.


6. New and Notable

Microsoft made shared .NET agent skills first-party

@iammukeshm surfaced (16 likes, 510 views) the new public dotnet/skills repo, and the repo structure plus public dashboard made this more than a feel-good open-source post. It showed a big platform vendor treating reusable skill packs as product infrastructure rather than community folklore.

1Password pushed credential mediation directly into Codex workflows

@devopsdotcom linked (105 views) to 1Password's MCP integration for Codex, which issues credentials just in time and keeps them out of prompts, logs, and context windows. That mattered because it answered the day's biggest security question with a specific product design instead of generic advice.

Copilot CLI remote control expanded beyond GitHub repos

@GHchangelog announced (18 likes, 1,027 views) remote control for Copilot CLI sessions across GitHub Mobile, VS Code, and JetBrains, and the linked changelog says it now supports non-GitHub repos and directories without repos. That is notable because it pushes Copilot further toward a background runtime you supervise from multiple surfaces.

NVIDIA aimed infrastructure work at Codex- and Claude-style agent loops

@NVIDIAAP said (2 likes, 1 reply, 72 views) Dynamo is being hardened for Codex-, OpenClaw-, and Claude Code-style multi-turn harnesses, with KV reuse, preserved interleaved tool calls, and streaming dispatch called out as the main fixes. Even at low volume, this was notable because it points to a more specialized backend layer for agentic coding stacks.

Architecture diagram for NVIDIA Dynamo showing KV reuse, interleaved reasoning and tool calls, and streaming dispatch in multi-turn agent loops


7. Where the Opportunities Are

[+++] Agent security control planes - The GitHub extension-borne breach, PurpleOps's warning about what internal repos can contain, hol-guard's pre-execution wrapper, and 1Password's Codex MCP all point to the same missing layer: policy, scanning, approval, and secret mediation for agent toolchains. This is strong because the need is already phrased in operational terms, not abstract safety language. (source)

[+++] Spend governance and model-routing infrastructure - free-claude-code, @markessien's planner/implementer split (frontier models plan, smaller models from z.ai or OpenCode implement), local Apple Silicon replacements, and the CodexBar spend screenshot all show that useful coding agents are outrunning simple subscription economics. The opportunity is strongest where a product can decide what should run on premium models, what can fall back, and how to explain the bill. (source)

[++] Shared skills and process memory for teams - dotnet/skills, CLAUDE.md discipline, and research tutorials on subagents/Git all say the same thing: teams do not just need better models, they need reusable operating context. This is a meaningful opportunity because it solves inconsistency and onboarding rather than chasing another raw-capability benchmark. (source)

[++] Regression guardrails for multi-file autonomous changes - zerion0's 31-file failure report is the clearest evidence today that “it worked locally” is not enough when an agent edits across dozens of files. A product that can constrain blast radius, isolate test coverage gaps, and stage risky diffs would address an acute and repeated pain point. (source)

[+] Remote supervision and multi-instance coordination - Copilot remote control and Aight's phone-based Codex coordination show that people increasingly want agents to keep running while they are away from the keyboard. The signal is emerging rather than settled, but it is now concrete enough to look like a category rather than a gimmick. (source)


8. Takeaways

  1. AI-coding security became incident-driven today. The feed moved from abstract trust architecture to extension marketplaces, key rotation, secret exposure, and pre-execution guardrails after the GitHub internal-repo breach discussion. (source)
  2. Reusable skills and training are becoming the real leverage layer. Microsoft's dotnet/skills repo, research workshops using Codex, and CLAUDE.md-style workflow advice all treated process memory as infrastructure, not a personal prompt trick. (source)
  3. Cost pressure is changing architecture choices, not just sentiment. The day produced a free Claude Code proxy, a planner-versus-implementer model split, local Apple Silicon fallbacks, and a screenshot of seven-figure monthly Codex spend. (source)
  4. Codex had the strongest preference momentum in the dataset. Poll results, migration stories, and reliability comments all pushed in its favor, while Copilot and Antigravity absorbed more criticism for execution and differentiation. (source)
  5. Builders are wrapping agents with control layers rather than replacing them. The notable projects today focused on skills, guardrails, dashboards, remote control, and local-runtime substitutions around existing agents. (source)