Twitter AI Coding - 2026-05-13

1. What People Are Talking About

1.1 GitHub Copilot billing shock triggers mass churn decisions

The dominant non-Antigravity story of the day was GitHub Copilot's token billing transition, which produced a wave of community posts showing extreme cost disparity between current flat-rate pricing and the usage-based model that takes effect June 1. Ed Zitron's post, anchored by a Reddit screenshot showing $39 of current billing against $5,851.77 of usage-based billing for the same April usage, became the clearest statement of the problem. His follow-on analysis was more alarming: GitHub Copilot had $1.08B ARR by end of 2025 while allowing users to burn 300–1,000% of their subscription in tokens per month, meaning hundreds of millions in monthly subsidies. GitHub responded on the same day with a new plan structure including a Max plan and flex allotments.

@edzitron reported (636 likes, 63 retweets, 26,539 views, 65 bookmarks) that a Reddit user's April 2026 usage simulation showed $39 current billing versus $5,851.77 under usage-based billing, a 150x cost increase for a single individual user. The Reddit post had 372 upvotes and 174 comments.

Reddit r/GithubCopilot billing simulation showing one user's April 2026 usage: $39.00 under current PRU billing vs $5,851.77 under usage-based AIC billing, with daily requests and AI credits usage chart

In a follow-up thread, edzitron argued that GitHub Copilot was "a MAJOR API customer of Anthropic - that business effectively dies June 1 2026" and that the vast majority of Copilot subscribers would churn. His churn prediction post (23 likes, 2,968 views) included two Reddit posts showing this in real time: one user going from $10 to $177.22 ("Given my current usage/workflow, obviously GC is out... My gut reaction is Claude Code with Claude Pro at $20/month"), and another from $47.27 to $3,962.04 ("I loved GitHub and GHCP so far, but with that price tag it is just better to sign elsewhere or go local").

Reddit r/GithubCopilot post "Copilot Alternatives" showing billing simulation: $10.00 current PRU billing vs $177.22 usage-based AICs, user asking community for Claude Code alternatives

Reddit r/GithubCopilot post "Well... I am leaving too" showing $47.27 current billing vs $3,962.04 usage-based billing (an 84x increase), with the poster concluding to switch away from GitHub Copilot
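The multiples quoted above can be checked directly from the three screenshots. The following is a minimal sketch: the dollar figures are the ones reported in the posts, while the dictionary keys and the helper name `cost_multiple` are labels invented here for illustration.

```python
# (current PRU billing, simulated usage-based AIC billing) in USD,
# as reported in the three Reddit screenshots. Keys are illustrative labels.
simulations = {
    "april_usage_simulation": (39.00, 5851.77),
    "copilot_alternatives": (10.00, 177.22),
    "i_am_leaving_too": (47.27, 3962.04),
}


def cost_multiple(current: float, usage_based: float) -> float:
    """Return how many times larger the usage-based bill is than the current one."""
    return usage_based / current


multiples = {
    name: cost_multiple(current, usage_based)
    for name, (current, usage_based) in simulations.items()
}

for name, multiple in multiples.items():
    print(f"{name}: {multiple:.1f}x")
```

The spread between the three results is the point: the same pricing change lands very differently depending on how much agentic work a user runs.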

@JamesMontemagno announced (48 likes, 12 replies, 8,036 views, 30 bookmarks) GitHub's same-day response: new flex allotments and a Max plan. Pro ($10/month) now includes $15 total usage (base $10 + flex $5); Pro+ ($39/month) includes $70 total (base $39 + flex $31); the new Max plan ($100/month) includes $200 total. The blog post confirmed that code completions are not counted against AI credits.

GitHub Copilot individual plan pricing table as of June 1, 2026: Pro $10/month ($15 total), Pro+ $39/month ($70 total), Max $100/month ($200 total) with base and flex allotment breakdown
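To see how far the new allotments go, the sketch below picks the cheapest plan whose total included usage would cover a given simulated bill. It rests on a loud assumption that the source does not confirm: that the "total usage" dollars in the pricing table offset the simulated usage-based bill one-for-one. The function name `cheapest_covering_plan` is hypothetical.

```python
# (monthly price, total included usage) in USD, from the pricing table above.
PLANS = {
    "Pro": (10, 15),
    "Pro+": (39, 70),
    "Max": (100, 200),
}


def cheapest_covering_plan(simulated_usage: float):
    """Return the cheapest plan whose included usage covers the bill, or None.

    Assumption (not confirmed by the source): included usage dollars offset
    the simulated usage-based bill one-for-one.
    """
    covering = [
        (price, name)
        for name, (price, included) in PLANS.items()
        if included >= simulated_usage
    ]
    return min(covering)[1] if covering else None


# The three simulated bills quoted in the Reddit posts.
for bill in (177.22, 3962.04, 5851.77):
    print(bill, "->", cheapest_covering_plan(bill))
```

Under that assumption, only the $177.22 user fits inside a plan (Max); the two four-figure bills exceed every allotment, which matches the replies arguing that heavy agentic users would still overshoot.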

Discussion insight: The replies to the Montemagno post split between users who found the flex allotments reassuring and those who argued that heavy agentic users would still exceed them. The edzitron thread attracted a reply noting that this likely affects Cursor, Augment Code, and other AI coding subscriptions that have similarly been subsidizing heavy usage.

Comparison to prior day: On May 12, the billing conversation centered on a single organization's 3.2x cost simulation and the removal of GPT Codex 5.3 from the model picker. On May 13, the community moved from simulation to declaration: multiple users stated in public they were cancelling Copilot and migrating to Claude Code.

1.2 Google Antigravity shutdown speculation goes viral ahead of Google I/O

The highest-engagement post of the day asked a simple question: is Google shutting down Antigravity? With 313,583 views, 229 replies, 26 quotes, and 177 bookmarks, it became the most widely seen discussion in the AI coding space on May 13.

@hiarun02 posted (1,568 likes, 229 replies, 313,583 views, 177 bookmarks) that Google appears to be shutting down Antigravity and asked whether any developers are still working on it. The image attached is Antigravity's own model picker, which shows Gemini 3.1 Pro (High), Gemini 3.1 Pro (Low) [New], Gemini 3 Flash, Claude Sonnet 4.6 (Thinking), Claude Opus 4.6 (Thinking), and GPT-OSS 120B (Medium), confirming the product is still operational with active model choices but no new updates.

Antigravity model picker showing available models: Gemini 3.1 Pro (High), Gemini 3.1 Pro (Low) [New], Gemini 3 Flash, Claude Sonnet 4.6 (Thinking), Claude Opus 4.6 (Thinking), GPT-OSS 120B (Medium)

@LexnLin asked (92 likes, 11 replies, 4,508 views) whether Google I/O (one week away) would bring an Antigravity comeback, noting the last changelog update was nearly a month prior. The screenshot confirms version 1.23.2, dated April 16, 2026, containing only bug fixes for MCP server loading and workspace-specific settings.

Google Antigravity Changelog showing the most recent entry: version 1.23.2, April 16, 2026, with bug fixes for MCP server loading and workspace-specific settings, highlighted in red as the last update

@Presidentlin listed (24 likes, 6 replies, 683 views) wants from Google: cheap Flash models, powerful Pro models, working Gemini app search, updates to Antigravity ("Apr 16, 2026 was the last update. Is it dead?"), and separate credits for Gemini Code Assist vs Antigravity.

@anshulkr713 countered (130 likes, 2,981 views) that "it's because of Sundar that Google is leading everywhere from Cloud to AI (Gemini, Veo, antigravity) to hardware (TPUs)", listing Antigravity as a current Google AI win. The counterpoint attracted 130 likes in a reply thread.

@rseroter linked (13 likes, 12 bookmarks, 601 views) a HowToGeek article arguing that Antigravity beats Claude at coding, but only if users stop approaching it like traditional programmers.

Discussion insight: Replies to hiarun02 included both pessimists ("Tried it but it was shit compared to claude code") and wait-and-see users ("Google I/O is next week, they'll probably announce something"). The community's reaction to the "shutdown" framing reveals how little information flows from Google to its developer community: no official statement contradicting the shutdown speculation appeared anywhere on the day.

Comparison to prior day: May 12 showed Antigravity as a comparison point in context-architecture posts. May 13 raised the existential question of whether the tool has a future, amplified by the pre-Google-I/O timing.

1.3 OpenCode desktop performance sprint and ecosystem growth

OpenCode attracted multiple threads on May 13 from both its core team (performance work) and the broader community (a curated follower list, a RealPython tutorial, and testing infrastructure exploration).

@LukeParkerDev reported (55 likes, 11 replies, 4,903 views) that session startup time on OpenCode desktop is now ~40ms, down from approximately 20 seconds on real data, a roughly 500x improvement. The work also reduced idle CPU and reflow time and eliminated flickering. Parker noted that Linux and macOS were already fast; the gains primarily benefit Windows users.

@MichelleBakels compiled (28 likes, 1,342 views) a curated X list of OpenCode employees, enabling users to follow the team directly or add an OpenCode-team tab to their feed. The list prompted @ThePrimeagen to reply with a joke about tracking Dax's insult count.

@realpython published (2 likes, 245 views, 3 bookmarks) a 28-minute tutorial guide, "How to Use OpenCode for AI-Assisted Python Coding," covering installation, setup with a free Gemini API key, and daily workflow. The guide describes OpenCode as supporting 75+ AI providers and an AGENTS.md configuration file for per-project context.

@jlongster explored (15 likes, 1,786 views) the TUI version of Bombadil (from @owickstrom) as a testing framework for OpenCode, describing it as "too low-level right now, but promising."

Discussion insight: The simultaneous appearance of a core-team performance post, a community follower list, and a mainstream publication tutorial suggests OpenCode is moving from early-adopter to practitioner audience. The RealPython tutorial (written for intermediate Python developers with no assumption of prior AI coding tool experience) is a leading indicator of broader adoption.

Comparison to prior day: May 12 saw OpenCode referenced in context-architecture frameworks alongside Claude Code and Cursor. May 13 showed OpenCode as a distinct subject of focused discussion and investment, including official documentation from a major Python education site.

1.4 Skills as code: EvoSkill, Hermes-Agent, and the 1,453-entry library

Multiple independent posts converged on a common insight: skills (SKILL.md files that encode agent behaviors) are becoming the primary unit of improvement for AI coding agents, rather than fine-tuning or new model versions.

@yasenka244 described (19 likes, 10 replies, 112 views) two open-source drops making the argument concrete: SentientAGI's EvoSkill reads what an agent fails at, mutates the skill folder, evaluates results, and keeps what wins, all without retraining, without a cluster, and without version-locked checkpoints. NousResearch's Hermes Agent ships 83 skills and 28 tools in one local CLI. The post described the two as composable: EvoSkill growing the folder, Hermes running it.

EvoSkill x NousResearch Hermes-Agent: "fine-tuning is the most expensive way to not change your model", showing Hermes-Agent v0.13 terminal with 83 skills and 28 tools, and EvoSkill's 0.0→0.5 baseline improvement on a letter-counting task
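The mutate, evaluate, keep-what-wins loop described in the post can be illustrated with a toy example. This is a sketch of the general idea only, not EvoSkill's actual implementation or API: the "agent" is a stand-in function, a "skill" is modeled as a set of rule names, and every name here (`run_agent`, `evaluate`, `evolve`, the rule strings) is hypothetical.

```python
# Toy stand-in for an agent: it counts a letter case-sensitively unless the
# skill contains the (hypothetical) "ignore_case" rule.
def run_agent(skill: set, word: str, letter: str) -> int:
    if "ignore_case" in skill:
        word, letter = word.lower(), letter.lower()
    return word.count(letter)


# (word, letter, expected count) evaluation tasks.
TASKS = [
    ("Strawberry", "r", 3),
    ("Banana", "b", 1),
    ("Mississippi", "s", 4),
    ("Apple", "a", 1),
]

CANDIDATE_RULES = ["be_polite", "ignore_case", "answer_in_json"]


def evaluate(skill: set) -> float:
    """Fraction of tasks the toy agent answers correctly with this skill."""
    hits = sum(run_agent(skill, w, l) == expected for w, l, expected in TASKS)
    return hits / len(TASKS)


def evolve(skill: set, rounds: int = 3):
    """Try adding one rule at a time; keep a mutation only if the score rises."""
    best, best_score = set(skill), evaluate(skill)
    for _ in range(rounds):
        for rule in CANDIDATE_RULES:
            candidate = best | {rule}      # mutate the skill
            score = evaluate(candidate)    # evaluate the result
            if score > best_score:         # keep what wins
                best, best_score = candidate, score
    return best, best_score


baseline = evaluate(set())
evolved_skill, evolved_score = evolve(set())
print(baseline, evolved_skill, evolved_score)
```

Because the loop only ever sees `TASKS`, it will happily keep any rule that raises that score, which is the reason the quality of the evaluation set matters as much as the mutation strategy.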

@thegreatest_sv shared (12 likes, 10 bookmarks, 266 views) a Google Cloud AI Director's personal agent playbook: 22 skills, 7 slash commands, a 6-stage pipeline (DEFINE→/spec, PLAN→/plan, BUILD→/build, VERIFY→/test, REVIEW→/review+/code-simplify, SHIP→/ship), and three agent personas. Compatible with Claude Code, Cursor, Antigravity, OpenCode, and Gemini CLI.

Agent Skills GitHub README showing "Production-grade engineering skills for AI coding agents" with DEFINE→PLAN→BUILD→VERIFY pipeline diagram and /spec, /plan, /build, /test commands

@aiecosystemhq documented (4 likes, 75 views, 1 bookmark) the Antigravity Awesome Skills library (the name is historical; the library is cross-platform) at 1,453+ agentic skills, installable via npm, covering Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, Kiro, OpenCode, GitHub Copilot, and others. Skills are organized into role-based bundles and workflow-driven execution tracks.

Antigravity Awesome Skills GitHub README: "1,453+ Agentic Skills for Claude Code, Gemini CLI, Cursor, Copilot & More", an installable GitHub library and npm installer for reusable SKILL.md playbooks

Discussion insight: The yasenka244 thread attracted replies that questioned the composability claim ("how do you prevent EvoSkill from optimizing for the eval, not the real task?"), which is a technically important critique pointing to Goodhart's Law in skill evaluation. No response addressed this directly.

Comparison to prior day: May 12 showed the skills ecosystem through the Antigravity Awesome Skills library entry at 1,453 entries and the Hedgineer plugin marketplace. May 13 added the self-improvement angle (EvoSkill) and the practitioner-authored playbook as a standalone artifact.

1.5 Context engineering enters the practitioner vocabulary

Two posts on May 13 pushed context engineering (the practice of deliberately controlling what tokens an AI coding agent sees) from Karpathy's conference talk into practitioner economics. The Karpathy quote from AI Ascent 2026 is the day's most actionable one-liner.

@dunik_7 posted (16 likes, 744 views) the economic case: a typical Claude Code session auto-loads 47,000 tokens of repo context, then the user asks Claude to fix 30 lines, then Claude reads the 47,000 tokens to find 30 lines, returns a 200-token fix, and repeats 50 times. Cost: $35/day. Signal sent: 30 lines. The conclusion: "You didn't pay Claude to fix the bug. You paid Claude to read your entire repo 50 times so it could find 30 lines." The attribution is Karpathy at AI Ascent 2026: "context engineering is the new vibe coding."
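The arithmetic behind the post can be worked through in a few lines. This is a rough sketch: the token counts, turn count, and $35/day figure come from the post, while the blended per-million rate is merely implied by those numbers (it is not a published price), and the 5,000-token trimmed context is a purely hypothetical comparison point.

```python
CONTEXT_TOKENS = 47_000   # repo context loaded per turn (from the post)
FIX_TOKENS = 200          # tokens in each returned fix (from the post)
TURNS_PER_DAY = 50        # repetitions per day (from the post)
DAILY_COST_USD = 35.0     # daily cost quoted in the post

input_tokens = CONTEXT_TOKENS * TURNS_PER_DAY
output_tokens = FIX_TOKENS * TURNS_PER_DAY
total_tokens = input_tokens + output_tokens

# Blended rate implied by the post's own figures; not a published price.
implied_usd_per_million = DAILY_COST_USD / (total_tokens / 1_000_000)

# Share of all tokens that is actual fix output.
output_share = output_tokens / total_tokens


def daily_cost(context_tokens: int) -> float:
    """Daily cost at the implied blended rate for a given per-turn context size."""
    tokens = (context_tokens + FIX_TOKENS) * TURNS_PER_DAY
    return tokens / 1_000_000 * implied_usd_per_million


print(f"total tokens/day: {total_tokens:,}")
print(f"implied blended rate: ${implied_usd_per_million:.2f} per million tokens")
print(f"fix output share of tokens: {output_share:.2%}")
# Hypothetical: context trimmed to 5,000 tokens per turn.
print(f"trimmed-context daily cost: ${daily_cost(5_000):.2f}")
```

The output share is the figure that makes the post's point: well under one percent of the tokens paid for are the fix itself.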

@nicdunz observed (29 likes, 4 replies, 976 views) the same phenomenon from the user experience side: "openai is doing something right because i actually try to stay away from codex and my computer because i know its doing work and its better to leave it alone." Replies reinforced: "The app gets better when you stop babysitting it."

Discussion insight: The dunik_7 post received minimal replies but 3 bookmarks, suggesting practitioners are internalizing the math without debating it publicly. The Karpathy attribution lends authority: he coined "vibe coding" and is now positioning context engineering as the next competency layer.

Comparison to prior day: May 12 introduced context architecture (layered CLAUDE.md files, skills folders, hooks) as a structural solution. May 13 added the unit economics argument: not "how do you organize context" but "how much does unmanaged context cost."

1.6 Vibe coding: contested term, contested practice

The vibe coding debate ran through May 13 with three distinct positions: authoritative critique (Uncle Bob), pragmatic defense (Emergent Labs / YC president Garry Tan), and counter-cultural resistance (the 100-day no-vibe-coding game developer).

@unclebobmartin declared (76 likes, 16 replies, 3,559 views) that "vibe coding is flying in the clouds with no instruments." Uncle Bob Martin, author of Clean Code, used an aviation metaphor: IFR (instrument-based) versus VFR (visual) flight, where vibe coding is flying in conditions that require instruments without any. Reply from @MrCollison: "We need a term for vibe coding but you actually PR the code and have standards. Or we can rename vibe coding to architectural gambling." Reply from @GeorgePaulChi: "Vibe coding works great right up until you have to maintain the vibes six months later in production."

@hamptonism framed (16 likes, 15 bookmarks, 3,636 views) the counter-argument via Emergent Labs: vibe coding tools get you from 0 to 1 but leave you stuck; Emergent uses Claude to power agents that handle planning, building, debugging, and shipping, taking products from 1 to 100. The YC + Anthropic partner credibility was cited by multiple quote-tweets.

@Gatsz01 announced (64 likes, 17 retweets, 15 bookmarks, 376 views) a 100-day challenge: build a game from scratch with no game engine and no vibe coding, writing C++ by hand. The high bookmark count (15 relative to 376 views) signals practitioner community solidarity with the craft-first position.

@tetranow quoted (6 likes, 1,023 views) Garry Tan (YC) from a podcast: "The engineers who hate vibe coding and AI the most are the people who would benefit the most from embracing it."

Discussion insight: The positions are hardening rather than converging. Uncle Bob's metaphor resonates with experienced engineers; Garry Tan's challenge resonates with those who see engineering gatekeeping as the real problem.

Comparison to prior day: May 12 focused on context architecture as a discipline that makes AI coding more reliable. May 13 surfaced the deeper question: is AI-assisted coding a legitimate engineering practice or an instrument-free gamble?

1.7 Microsoft's AI strategy bifurcates: Copilot billing vs. M&A pivot

While GitHub Copilot's billing transition triggered user churn, Reuters reports that Microsoft is quietly pivoting its AI strategy at the corporate level, exploring acquisitions of AI coding startups to reduce OpenAI dependence. The Cursor/SpaceX detail emerged as the day's most surprising competitive fact.

@wallstengine reported (107 likes, 11 retweets, 15,194 views) citing Reuters: Microsoft weighed acquiring Cursor but backed away over antitrust concerns linked to GitHub Copilot. Microsoft is now in talks with Stanford-built diffusion LLM startup Inception, with M12 (Microsoft Ventures) having joined Inception's $50M seed round. @faststocknewss added (5,220 views) that SpaceX bought Cursor shortly after Microsoft walked away, and that SpaceX has also courted Inception. Microsoft has spent over $100B on OpenAI investments and infrastructure, per court testimony.

Discussion insight: Reply from @norveclifinance flagged that "OpenAI's own CFO reportedly warned that if revenue does not grow fast enough, the company may struggle to pay for future compute contracts." This frames the Microsoft diversification not as opportunism but as risk management.

Comparison to prior day: May 12 reported Grok Build (xAI) as a new entrant in the IDE market. May 13 revealed that SpaceX now owns Cursor, a more consequential competitive development that went largely undiscussed outside the financial commentary accounts.


2. What Frustrates People

GitHub Copilot usage-based billing shock

Severity: High. The June 1 billing transition is the most discussed pain point of the day. Users who heavily used Copilot's agentic features (long agent runs, frontier models, multi-step work) are discovering that their flat-rate subscription was absorbing usage worth 3x to 150x what they paid. The frustration is not with the transition per se but with the scale of the surprise: users were never told how much they were consuming. Multiple Reddit posts (documented with screenshots) show individuals immediately deciding to cancel and migrate to Claude Code or go local. The workaround being discussed is either the new Max plan ($100/month for $200 of credits) or switching to Claude Code + Claude Pro ($20/month) for coding work.

Claude Code hard rate limits mid-session

Severity: Medium. One image in the dataset shows the Claude Code rate limit error: "You've hit your usage limit. try again at 5:03 PM." This interrupts in-progress agentic sessions. The pain is not in the message but in the timing: hitting a rate limit mid-agent-run forces work stoppage with no recovery path until the limit resets.

Gemini CLI/Antigravity responsiveness and update cadence

Severity: Medium. @bedros_p posted (13 likes, 1,100 views) that the Gemini CLI is "unusable" — taking an hour to respond, slow startup, with a GitHub issue tracking the problem closed as "not a priority." The Antigravity stagnation (no changelog since April 16) compounds this: two of Google's AI coding surfaces appear deprioritized simultaneously. The speculation that Antigravity may be renamed again (like Bard→Gemini, IDX renamed twice) reflects fatigue with Google's product naming instability.

Codex server overloads and connection failures

Severity: Medium. @9hills documented (9 likes, 2,517 views) a Codex service_unavailable_error with "code":"server_is_overloaded". The reply thread suggested this is increasingly common. One practitioner described adding three protections to all long Codex/Claude Code tasks: breakpoint recovery, intermediate artifact checkpointing, and feeding only error snippets back on failure. @android_poet documented a Codex WebSocket/HTTPS transport error with BadRecordMac and stream disconnection before completion.
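The three protections the practitioner described (breakpoint recovery, intermediate artifact checkpointing, and feeding back only error snippets) can be sketched as a small step runner. This is an illustrative sketch under stated assumptions, not the practitioner's actual setup and not any Codex or Claude Code API: `run_with_checkpoints`, `error_snippet`, and the `flaky` step are hypothetical names, and the steps are plain Python callables standing in for agent invocations.

```python
import json
import tempfile
from pathlib import Path


def error_snippet(exc: Exception, max_chars: int = 200) -> str:
    """Keep only a short tail of the error text to feed back on retry."""
    return f"{type(exc).__name__}: {exc}"[-max_chars:]


def run_with_checkpoints(steps, checkpoint: Path, max_retries: int = 2) -> dict:
    """Run (name, callable) steps in order, persisting each result.

    Each callable receives the previous error snippet (or None) and must
    return a JSON-serializable artifact.
    """
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for name, step in steps:
        if name in done:                 # breakpoint recovery: skip finished steps
            continue
        feedback = None
        for attempt in range(max_retries + 1):
            try:
                done[name] = step(feedback)
                break
            except Exception as exc:
                feedback = error_snippet(exc)   # pass back only the snippet
                if attempt == max_retries:
                    raise
        checkpoint.write_text(json.dumps(done))  # checkpoint the artifact
    return done


# Demo: a step that fails once with an overload-style error, then succeeds.
calls = {"n": 0}


def flaky(feedback):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("server_is_overloaded")
    return {"retried_after": feedback}


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "state.json"
    steps = [("plan", lambda fb: "ok"), ("build", flaky)]
    result = run_with_checkpoints(steps, ckpt)
    resumed = run_with_checkpoints(steps, ckpt)  # resumes from checkpoint; nothing reruns

print(result)
```

Writing the checkpoint after every completed step means a crash or overload loses at most the step in flight, which is the property that matters for long agent runs.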

Security research blocked by AI safety guardrails

Severity: Medium (for security researchers). @sixhobbits posted a multi-panel screenshot showing Claude Code blocking a legitimate security investigation ("please help me understand the recent tanstack and mistral-ai supply chain attack from yesterday") with an API error citing Anthropic's Usage Policy and directing the user to a "Cyber Verification Program" form. ChatGPT flagged the same session for "possible cybersecurity risk" with a link to its "Trusted Access for Cyber" program. Both tools applied friction to legitimate incident response. The gap between "authorized" security researchers and working developers investigating real incidents affecting their own codebases is not addressed by either verification program.


3. What People Wish Existed

Transparent token accounting in subscription tools

Users running agentic workflows want to know how many tokens they have consumed before they hit a rate limit or face a surprise bill, not after. The Copilot billing shock is downstream of years of opaque PRU accounting. Multiple posts express a wish for a live dashboard showing token burn rate during agentic sessions, not just a retrospective usage report. @dunik_7 articulated the need precisely: "You didn't pay Claude to fix the bug. You paid Claude to read your entire repo 50 times so it could find 30 lines." A token-efficient mode that surfaces context size before executing agent loops would address both the cost and the surprise dimensions. Opportunity: direct.

Mobile access to running agents without laptop tethering

@biraj21_ launched (24 likes, 7 retweets, 675 views) Shellular on ProductHunt to address exactly this: a mobile app (iOS + Android) that gives phone access to the full dev environment including running Claude Code, Codex, OpenCode, Copilot CLI, and Pi sessions, with terminal, localhost, and browser DevTools. The relay server is open source. The timing of the launch alongside an explicit "tired of carrying your laptop everywhere because your agents are running?" hook confirms the pain is real and monetizable. Opportunity: competitive (Lunel was also mentioned in replies as a similar product).

Google I/O AI coding announcements (Antigravity, Gemini CLI)

Multiple posts reflect a specific unmet need: a clear statement from Google on the roadmap for Antigravity and Gemini CLI. Users are not asking for features; they are asking for signal. "Is it dead?" and "will we get a comeback at I/O?" are practical questions that determine whether developers invest time in learning the tools. The Google Developer program does not appear to have active community management in this space, judged by the absence of any official replies in the threads. Opportunity: aspirational (depends on Google's actual intentions).

Self-improving agent skills without fine-tuning

@yasenka244 identified the need implicitly: a way to make an agent better at specific tasks without retraining a model, without owning a cluster, and without version-locking to a checkpoint. EvoSkill + Hermes-Agent provides a partial answer. The missing piece is evaluation design: as noted in replies, skill evolution is only as good as the benchmark it optimizes against. A benchmark-creation tool for specific codebases or task types would close this loop. Opportunity: direct, emerging.

Stable-rate API access for production AI coding workloads

@TheGeorgePu articulated (21 likes, 12 replies, 1,077 views) the need starkly: "The 'stable-rate API' era is over." His response (fine-tuning a 4B model on two A100s for $50) is not a product; it is a protest. The underlying wish is for API pricing that does not change without announcement, models that do not degrade silently, and platform relationships that support long-term investment in prompt engineering. Opportunity: direct, high urgency among production AI coding teams.


4. Tools and Methods in Use

Tool | Category | Sentiment | Strengths | Limitations
Claude Code | AI coding agent (CLI) | (+/-) | Default choice for migration from Copilot; strong for agentic runs; head of Claude Code shipped 49 features in 2 days | Hard rate limits hit mid-session; 47,000-token context loading costs $35/day without careful management
OpenAI Codex | AI coding agent (CLI) | (+/-) | Autonomous /goal mode; widely used; strong for long-running tasks | Server overloads; WebSocket connection failures; UI criticized by users considering migration
OpenCode | AI coding agent (CLI + desktop) | (+) | 500x session startup improvement; 75+ AI provider support; Real Python guide published; active development | TUI testing infrastructure still early
GitHub Copilot | AI coding assistant (IDE) | (-) | New flex allotments; Max plan value; WinUI plugin; CLI statusline | Billing shock for agentic users; Opus 3x price hike; 150x cost disparity for heavy users; GPT Codex 5.3 removed without notice
Google Antigravity | AI coding agent (IDE) | (+/-) | Multi-model support (Gemini, Claude, GPT-OSS); KDNuggets tutorial; active in some comparisons | No changelog since April 16; community uncertainty about shutdown; Gemini CLI performance issues (1-hour response times)
Cursor | AI coding IDE | (+) | Acquired by SpaceX (new investor signal); used as reference in Microsoft M&A discussions | No direct posts about features today
EvoSkill (SentientAGI) | Agent self-improvement | (+) | Mutates and evaluates skill folders without fine-tuning; supports Claude Code, OpenCode, Codex, Goose, OpenHands; Apache 2.0 | Goodhart's Law risk: optimizes for eval, not necessarily real-task performance
Hermes-Agent (NousResearch) | Local agent runtime | (+) | 83 skills, 28 tools in one local CLI; first-class skill folder support; Apache 2.0 | v0.13; still early
Shellular | Mobile dev environment | (+) | Full dev env on phone (terminal, agents, localhost, DevTools); iOS + Android; relay server open source | New; limited track record
Orca (stablyai) | Agent orchestrator | (+) | Runs Claude Code/Codex/OpenCode in parallel worktrees; tracks /goal progress; macOS/Windows/Linux | Low engagement at launch (34 likes)
WinUI Agent Plugin | Domain-specific skills | (+) | End-to-end loop for WinUI development (scaffold→build→run→test); token-efficient; XAML/MVVM aware | Windows-only domain
Chrome DevTools MCP | Browser debugging MCP | (+) | 39.3K stars; TypeScript; performance traces, network+console inspection, reliable Chrome automation | External dependency on Chrome
GPT-5.5 (various providers) | LLM | (+/-) | Strong for design work (Copilot); github-copilot-direct access is fastest (14.1s) and most token-efficient | Prices doubled; openai:gpt-5.5@high is slowest and highest-cost of three access paths; fine-tuning API removed
Ollama | Local model runtime | (+) | Used for local AI coding agent home lab setup with OpenCode | Requires own hardware; setup complexity

The overall satisfaction spectrum shows a clear split: tools with predictable, usage-matched pricing (Claude Code, OpenCode, local Ollama) are gaining favorable community sentiment while tools with recent price escalation or opaque billing (GitHub Copilot, GPT-5.5 via OpenAI direct) are losing it. Migration patterns are directional: from GitHub Copilot to Claude Code (confirmed in multiple Reddit posts shared on Twitter), with some users going to Codex (via subscription) or to local/Ollama as a protest move. The three-provider benchmark image from @xoofx shows that accessing the same model (GPT-5.5 High) via github-copilot-direct is faster and cheaper than through openai:gpt-5.5@high, a provider routing insight that has real cost implications.

GPT-5.5 High evaluated across three providers: github-copilot-direct (14.1s, 28,552 input tokens, 124 reasoning tokens; fastest), openai:gpt-5.5@high (18.2s, 34,833 input tokens, 240 reasoning tokens; most verbose), codex_subscription (18.1s, 32,428 input tokens; best quality/efficiency balance)
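The gap between access paths can be quantified from the numbers in the benchmark image. The sketch below is a single-sample comparison, so it says nothing about variance; the latency and token figures are transcribed from the caption above, and the helper name `pct_lower` is invented for illustration.

```python
# Latency (seconds) and input tokens per provider, from the benchmark image.
RUNS = {
    "github-copilot-direct": {"seconds": 14.1, "input_tokens": 28_552},
    "openai:gpt-5.5@high": {"seconds": 18.2, "input_tokens": 34_833},
    "codex_subscription": {"seconds": 18.1, "input_tokens": 32_428},
}


def pct_lower(value: float, baseline: float) -> float:
    """Percentage by which `value` is below `baseline`."""
    return (baseline - value) / baseline * 100


fastest = min(RUNS, key=lambda name: RUNS[name]["seconds"])
baseline = RUNS["openai:gpt-5.5@high"]

time_saving = pct_lower(RUNS[fastest]["seconds"], baseline["seconds"])
token_saving = pct_lower(RUNS[fastest]["input_tokens"], baseline["input_tokens"])

print(f"fastest path: {fastest}")
print(f"latency vs openai direct: {time_saving:.1f}% lower")
print(f"input tokens vs openai direct: {token_saving:.1f}% lower")
```

On these figures the fastest path takes roughly a fifth less time and sends roughly a fifth fewer input tokens than the direct OpenAI route for the same model.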


5. What People Are Building

Project | Who built it | What it does | Problem it solves | Stack | Stage | Links
Orca | @JinjingLiang (stablyai) | AI orchestrator running Claude Code, Codex, OpenCode in parallel worktrees | Tracking /goal progress across multiple concurrent agent sessions | macOS/Windows/Linux; supports any CLI agent | Shipped | github.com/stablyai/orca
Shellular | @biraj21_ | Mobile app giving full dev environment access on phone | Laptop tethering while agents run | iOS + Android; relay server open source; 5 agent support | Shipped | ProductHunt launch
WinUI Agent Plugin | @Niels9001 (Microsoft) | Agent skills for WinUI and Windows App SDK development | Generic agents failing on WinUI/XAML specifics | Skill plugins for GitHub Copilot CLI and Claude Code | Shipped | devblogs.microsoft.com
Academic Research Skills for Claude Code | @PrakashS720 (Imbad0202) | 10 specialized agents for academic research pipeline (citation hunting, fake reference detection, peer review simulation) | Full 15,000-word paper at cost of a coffee; citation hallucination in AI research | Claude Code CLI / VS Code / JetBrains; v3.7.0; CC BY-NC 4.0 | Shipped | Plugin: academic-research-skills
KimiFlare | @sinasanm | Kill switch dashboard for AI-powered services with pre-flight safety checks (ARM SYSTEM, DISABLE AUTH, CUT OFF PROXY) | Emergency shutdown for AI services without service team access | macOS native app | Shipped | x.com post
GEO/SEO Analysis Skills | @HowToAI_ | Claude Code skills for Generative Engine Optimization analysis (11 skills, 5 parallel agents, 6 schema templates, PDF reports) | AI search is eating traditional search; only 23% of marketers investing in GEO | Claude Code skills | Shipped | x.com post
Local AI Coding Agent Home Lab | @vspinmaster | Home lab setup with OpenCode + Ollama for offline AI coding | API cost and privacy concerns for local development | OpenCode + Ollama; self-hosted | Shipped | virtualizationhowto.com

Significant project notes:

Orca (stablyai) is the most technically novel project of the day. It solves parallel agent management: running Claude Code, Codex, and OpenCode simultaneously across multiple Git worktrees, each tracked in one dashboard. The /goal progress tracking feature directly addresses the observability gap that Karpathy's context engineering point highlights: you can't manage what you can't measure. Available on macOS, Windows, and Linux at launch.

Academic Research Skills for Claude Code targets academia's forthcoming "vibe coding" moment. The v3.7.0 release includes a 7-mode blocking checklist for integrity gates, style calibration from past work samples, a Devil's Advocate agent to attack the thesis, and peer review simulation. The philosophy is explicit: "AI is your copilot, not the pilot." The tool references The AI Scientist (Lu et al. 2026, Nature 651:914–919) to define what fully autonomous research fails at and why human-in-the-loop is the design premise.

Academic Research Skills for Claude Code v3.7.0 GitHub README showing install commands, philosophy ("AI is your copilot, not the pilot"), integrity gate design referencing Lu et al. 2026 in Nature, and positioning relative to The AI Scientist project

KimiFlare builds a kill switch into its AI service stack: a governance tool for production AI systems, not a development tool. The pre-flight safety check UI (COVER boxes for ARM SYSTEM, DISABLE AUTH, CUT OFF PROXY) shows that at least some builders are thinking about fail-safe mechanisms as a first-class requirement.


6. New and Notable

SpaceX now owns Cursor

Per Reuters (cited in two tweets), SpaceX acquired Cursor shortly after Microsoft abandoned the deal over antitrust concerns related to GitHub Copilot ownership. SpaceX has also courted Inception (the diffusion LLM startup). An aerospace company owning the second-most-used AI coding IDE is unexpected and currently underanalyzed. The strategic rationale (whether Cursor was acquired for internal use, for investment return, or as an AI coding platform play) is not explained in available sources.

"Context engineering is the new vibe coding": Karpathy at AI Ascent 2026

Andrej Karpathy coined "vibe coding" as a term; he now frames context engineering as the next layer of competency. The coinage has immediate practical implications: the person who gave vibe coding its name is now saying there is a skill above it that most practitioners have not yet developed. The @dunik_7 economic analysis ($35/day to fix 30 lines by loading 47,000 tokens 50 times) provides the concrete cost signal that Karpathy's observation predicts.

GitHub Copilot was subsidizing users at 300–1,000% of subscription value

Ed Zitron's analysis of the billing transition, drawing on the Reddit screenshot showing $5,851.77 of usage against a $39 subscription, implies that GitHub Copilot was running at a massive structural loss on agentic users. At $1.08B ARR and a 300–1,000% usage subsidy for some cohort, the June 1 transition is not just a pricing change; it is a retroactive repricing of years of value that users had been receiving without knowing it.

WinUI Agent Plugin: skills as domain-specific packages

Microsoft's WinUI Agent Plugin is the first major domain-specific skill package from a platform vendor rather than a third party. It signals that platform owners are beginning to invest in skills-as-packages as the supported way to extend AI coding agents for specific frameworks: not documentation, not an API, but agent-executable skill files.

AI Engineer Singapore 2026 agenda (May 15–17)

@nshawbin posted their personal agenda for AI Engineer Singapore 2026: 24 sessions across 3 days, categories including software, agents, coding agents, design, physical AI. Notable sessions: "Building a Guide, Verify, Solve Loop for your Coding Agents with Software Factory," "What We Learned From Analyzing Five Million Vibecoded PRs," "Designing the Agent-Native IDE: When Designers Ship the Code," "Toward World Models: From Language to Physical Intelligence." The conference began two days after this dataset date and represents the community's live gathering point.


7. Where the Opportunities Are

[+++] Token-efficient context management layer for AI coding agents. Multiple independent signals converge here: Karpathy's "context engineering is the new vibe coding" quote, the @dunik_7 math showing $35/day to fix 30 lines via unmanaged context, and the Copilot billing shock all point to the same gap. A tool that profiles agent sessions by token cost, recommends which context to exclude, and surfaces the cost of each agent turn before execution has both consumer (developers watching their bills) and enterprise (CTO cost governance) demand. No tool surfaced in this dataset addresses this directly. The provider routing insight from the xoofx benchmark (github-copilot-direct is roughly 20% faster and uses fewer tokens than openai:gpt-5.5@high for the same model) suggests that routing optimization is also valuable.

[+++] Migration tooling from GitHub Copilot to alternatives. The churn wave is visible and in progress. Users are manually evaluating Claude Code, Codex, OpenCode, and local Ollama setups. A tool that helps migrate Copilot settings, CLAUDE.md equivalents, and skill configurations, or that benchmarks the alternatives against a user's actual workflow, would meet the market exactly where it is. Claude Code is the primary stated migration destination; tooling that reduces that switch cost is strategically aligned with Anthropic's interests.

[++] Self-improving skill folders for specific codebases. EvoSkill + Hermes-Agent demonstrates the concept at a generic level. The untapped version is codebase-specific: an EvoSkill that learns from your repo's specific patterns, failure modes, and test suite. The missing piece is benchmark generation, that is, creating meaningful task sets from a user's own codebase to guide skill evolution. This is direct, technical, and currently only partially addressed by the open-source drops.

[++] Domain-specific agent skill packages. The WinUI plugin establishes the pattern: a platform vendor ships a skills package that makes AI agents reliable in their specific domain. The same pattern applies to every niche framework (game dev, mobile dev with React Native, embedded systems, data pipelines, scientific computing). Winning in a specific domain before the generalist tools catch up is the near-term opportunity. The Academic Research Skills plugin shows that even academic workflows can be packaged.

[+] Mobile-first agent oversight and control. Shellular launched today and addresses a real pain point (laptop tethering). The broader opportunity is not just remote access but mobile-first oversight: notifications when an agent hits a decision point, a kill switch (like KimiFlare), approval flows for high-risk operations. The agent-is-running-without-you paradigm creates a new category of mobile-native tooling that does not require porting a desktop IDE to mobile.

[+] Security researcher tier for AI coding assistants. The sixhobbits supply chain attack investigation was blocked by both Claude and ChatGPT. Both tools direct users to "Trusted Access for Cyber" and "Cyber Verification Program", but these are framed around professional security firms, not developers investigating incidents affecting their own dependencies. A lighter-weight verification tier for developers who need to do legitimate security incident response without joining a formal cyber program would reduce friction and increase trust.
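To make the first opportunity in this section more concrete, here is a minimal sketch of what "surfacing the cost of a turn before execution" and "recommending which context to exclude" could look like. Everything in it is hypothetical: the `ContextItem` type, the item names, the token counts (chosen to sum to the 47,000-token load from the @dunik_7 example), and the $15 per million token price are placeholders for illustration, not figures from any tool or provider mentioned in this report.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    name: str     # hypothetical label for a piece of loaded context
    tokens: int   # token count, assumed to be measured elsewhere


def estimate_turn_cost(items, usd_per_million_tokens):
    """Estimated input cost in USD of sending all items in one agent turn."""
    total_tokens = sum(item.tokens for item in items)
    return total_tokens * usd_per_million_tokens / 1_000_000


def rank_by_share(items):
    """Return (name, share of total tokens) pairs, largest share first."""
    total_tokens = sum(item.tokens for item in items)
    if total_tokens == 0:
        return []
    shares = [(item.name, item.tokens / total_tokens) for item in items]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical session context; counts are illustrative only.
context = [
    ContextItem("repo_map", 30_000),
    ContextItem("test_logs", 12_000),
    ContextItem("target_file", 5_000),
]

print(f"estimated turn cost: ${estimate_turn_cost(context, 15.0):.3f}")
for name, share in rank_by_share(context):
    print(f"{name}: {share:.0%} of context")
```

A real tool would need an actual tokenizer and provider price table; the point of the sketch is only that the largest context contributors are the first exclusion candidates and that the estimate can be shown before the turn runs.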


8. Takeaways

  1. GitHub Copilot's June 1 billing transition is already driving migration decisions. Multiple Reddit users shared billing simulations showing 17x to 150x cost increases, and at least two stated publicly they are switching to Claude Code. The same-day announcement of flex allotments and a Max plan is a retention response, but the damage to trust (from subsidy to surprise) is ongoing. (edzitron)

  2. Google Antigravity's 27-day update silence is the community's leading concern about Google's AI coding commitment. The most-viewed post of the day (313,583 views) was a question, not a complaint, which is more damaging for Google because it signals uncertainty rather than frustration. The Google I/O timing makes next week a forced resolution event. (hiarun02)

  3. OpenCode is accelerating on all fronts simultaneously. A 500x session startup improvement, a mainstream RealPython tutorial, and a curated community follow list appeared on the same day, suggesting the tool is entering a growth phase where practitioner adoption is feeding back into development priority. (LukeParkerDev)

  4. "Context engineering is the new vibe coding" is the practitioner quote of the day. Karpathy's framing from AI Ascent 2026 has a concrete economic interpretation: unmanaged context costs $35/day to accomplish 30 lines of work. The implication is that token efficiency is now a first-order engineering competency, not an optimization. (dunik_7)

  5. SpaceX now owns Cursor. This emerged from Reuters reporting cited in two tweets and went largely underdiscussed. It is the most structurally surprising competitive development of the day: an aerospace company owns the second-most-cited AI coding IDE in the dataset. (faststocknewss)

  6. Skills are becoming the primary unit of agent improvement, not model updates. EvoSkill (SentientAGI) + Hermes-Agent (NousResearch) prove the concept: a frozen model with an evolving skill folder outperforms on domain tasks without retraining. The 1,453-entry Antigravity Awesome Skills library quantifies where the ecosystem stands. (yasenka244)

  7. The provider you use to access a model matters as much as the model choice. The xoofx benchmark showed github-copilot-direct access to GPT-5.5 High is 22% faster and uses 18% fewer input tokens than openai:gpt-5.5@high for the same task and model. Provider routing optimization is an underexplored cost lever. (xoofx)