Reddit AI Coding - 2026-04-20

1. What People Are Talking About

1.1 Opus 4.7 Day Five: Adaptive Thinking Mechanics Exposed 🡒

The Opus 4.7 backlash shifted from sentiment to systems analysis on day five. u/aizver_muti published the most technically rigorous investigation yet, reverse-engineering how Opus 4.7's adaptive thinking actually works at the API level. The findings are stark: Opus 4.7 only supports type:adaptive thinking, and the decision to think is made server-side, not client-side. Sending type:enabled with budget_tokens (which works on 4.6) is silently accepted but produces zero thinking blocks: rather than returning an error, as the documentation claims it should, the API simply ignores the field. At every effort level below max, the model decides inconsistently whether to think, and correctness tracks thinking perfectly: when it thinks, it gets the answer right; when it does not, it gets it wrong (Opus 4.6 without adaptive thinking outperforms Opus 4.7 with adaptive thinking, score 60, 21 comments).

The investigation included a binary patch to disable adaptive thinking, a MITM proxy to inspect API traffic, and systematic testing across effort levels. The conclusion: "Opus 4.6 with CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 on medium or high effort is the better choice for tasks that require consistent reasoning. It thinks every time, and it costs less than running Opus 4.7 at max effort." u/SaintMartini (8 score): "Max effort 4.7 still had random hallucinating for me. I switched to Opus 4.6 instead at max and had less usage and better results." u/Important_Echo_7228: "Only way to reliably get Opus 4.7 to think is max effort. Everything else is russian roulette."
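A check of this kind can be approximated by counting thinking blocks across repeated responses. The sketch below is a minimal illustration, not the investigator's tooling: it assumes each response body has been parsed into a dict whose `content` list holds blocks tagged with a `type` field, and that thinking output shows up as blocks of type `thinking`. The sample payloads and both helper names are hypothetical and hand-written, not captured API traffic.

```python
from typing import Any


def count_thinking_blocks(response: dict[str, Any]) -> int:
    """Count content blocks tagged as thinking in a parsed response body."""
    blocks = response.get("content", [])
    return sum(1 for block in blocks if block.get("type") == "thinking")


def thinking_rate(responses: list[dict[str, Any]]) -> float:
    """Fraction of responses that contain at least one thinking block."""
    if not responses:
        return 0.0
    thought = sum(1 for r in responses if count_thinking_blocks(r) > 0)
    return thought / len(responses)


# Hand-written sample payloads (illustrative only, not real API output).
with_thinking = {
    "content": [
        {"type": "thinking", "thinking": "work through the problem"},
        {"type": "text", "text": "final answer"},
    ]
}
without_thinking = {"content": [{"type": "text", "text": "final answer"}]}

samples = [with_thinking, without_thinking, without_thinking, with_thinking]
print(thinking_rate(samples))
```

Running the same prompt several times per effort level and comparing the resulting rates would surface the inconsistency the post describes, under the assumptions stated above.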

The community meta-analysis from u/RichensDev continued to circulate as reference material: 110 threads, 2,187 comments, a 90:1 upvote ratio against the model. The data shows 32 users reverting to Opus 4.6, 26 moving to Codex, and 17 cancelling. Only 13 users posted clearly positive takes, all running max or xhigh effort. The gap between "4.7 on max" and "4.7 on default" remains the strongest signal (Opus 4.7: 110 threads, 2,187 comments. Unbiased analysis, score 183, 93 comments).

u/MurkyFlan567's codeburn benchmarks continued to be cited: one-shot success rate dropped from 83.8% to 74.5%, retry rate doubled (0.22 to 0.46 per edit), cost per call rose from $0.112 to $0.185, and output tokens per call more than doubled (372 to 800). Coding one-shot rate dropped from 84.7% to 75.4% (Opus 4.7 vs 4.6 after 3 days of real coding, score 454, 91 comments). u/SovietRabotyaga: "Total cost field tells everything we need to know about why Anthropic is so aggressive with pushing 4.7 onto us."

Codeburn benchmark dashboard comparing Opus 4.6 vs 4.7 performance metrics
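The relative changes behind those figures are easy to recompute. The sketch below uses only the numbers quoted above; the helper name `pct_change` is illustrative and not part of codeburn.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Figures as quoted from the codeburn post (Opus 4.6 -> Opus 4.7).
metrics = {
    "retry rate per edit": (0.22, 0.46),
    "cost per call (USD)": (0.112, 0.185),
    "output tokens per call": (372, 800),
}

for name, (before, after) in metrics.items():
    print(f"{name}: {after / before:.2f}x ({pct_change(before, after):+.0f}%)")

# Success rates are already percentages, so report the gap in percentage points.
one_shot_drop = 83.8 - 74.5
coding_one_shot_drop = 84.7 - 75.4
print(f"one-shot success: -{one_shot_drop:.1f} points")
print(f"coding one-shot: -{coding_one_shot_drop:.1f} points")
```

On these inputs the retry rate is about 2.09x, cost per call is up about 65%, output tokens are about 2.15x, and both one-shot rates fall by 9.3 percentage points.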

A new behavioral complaint emerged around communication quality. u/Any_Economics6283 posted a detailed side-by-side comparison of how Opus 4.7 and 4.5 explain the same technical concepts, concluding that 4.7 produces "an overload of information/ideas presented with short quippy ChatGPT-like phrases and code words which obscure the meaning." The user now pastes 4.7 output into 4.5 for translation. u/Krigrim (25 score): "I was reviewing multiple plans while using 4.7 and I was telling myself 'Am I fucking high? Why am I not understanding what it's saying?'" u/DarkSkyKnight: "Even in my expertise (and I'm a PhD) it's extremely tiring to read its output" (The biggest downside of Opus 4.7 for me: I can't understand what it's saying, score 60, 46 comments).

u/Anthony_S_Destefano shared an analysis framing 4.7's behavior as systemic self-doubt: "it wants to act but keeps second-guessing itself mid-task. 4.6 just did things. 4.7 is smarter but feels like it's working through internal bureaucracy before every decision." u/Mountain-Hedgehog128 (19 score) reported new paternalistic behavior: "Claude would routinely make weird inferences around how much I was focusing on a task, or whether I should be focused on it at all. Examples: 'You've done a lot of work. This would be a good stopping point.' 'You are the CEO of ABC Inc. Is this really something you should be focused on?'" (All of this together creates not the feeling of a confident model..., score 105, 24 comments).

The positive minority remained vocal. u/Dudetwoshot reported success using overplanning, micro subagents, memory.md, and model delegation: "getting it to spin models for me... I assign the appropriate model to each task, but switching models manually isn't practical. So I instructed it to follow the plan and use different models for different tasks. This has been extremely generous on tokens" (This might be unpopular but Opus 4.7 is actually quite good on Claude Code, score 24, 23 comments). u/Standard-Novel-6320 (8 score): "Higher ceiling of performance in 4.7, but it comes at the cost of a bit more 'work' from the user upfront."

Discussion insight: The conversation has progressed from "is it bad?" (day one) through "the data says it is bad" (day four) to "here is exactly why, at the API level" (day five). The adaptive thinking investigation provides a mechanistic explanation for the inconsistency: the model arbitrarily decides not to think on questions it considers simple, and no client-side mechanism can override this. The communication quality complaint is a new dimension -- not just worse outputs, but less comprehensible ones.

Comparison to prior day: On April 19, the backlash was quantified by community meta-analysis and codeburn benchmarks. Today the technical investigation goes deeper, revealing the server-side thinking decision as the root cause of inconsistency. The 4.7 communication quality complaint and paternalistic behavior (suggesting the user stop working) are new failure modes not documented yesterday.

1.2 GitHub Copilot Removes Opus 4.6 Without Warning 🡕

A breaking event disrupted the GitHub Copilot ecosystem on April 20: Opus 4.6 was removed from Copilot Pro and Pro+ plans without advance notice, mid-session for many users. u/CatLinkoln reported: "Just suddenly opus 4.6 disabled and now getting error 400" while showing an upgrade prompt despite being on the Pro+ plan (What happened? Just suddenly opus 4.6 disabled, score 67, 58 comments). u/raynorax (15 score): "They disabled it mid of my session. That is ridiculous." u/Leandermann (32 score): "Holy fuck how bad is it for them to be offering refunds for April just to limit all those things now instead of the start of May."

Copilot interface showing Opus 4.6 disabled with error 400 for a Pro+ subscriber

Multiple simultaneous threads confirmed the removal. u/Bastlast linked the official blog post announcing changes to individual plans (That's it :/ Claude Opus 4.6 was just removed from pro plan :(, score 58, 39 comments). u/CalmProton: "Opus 4.6 is no longer available even on Pro+" (score 22, 7 comments). u/hohstaplerlv: "Opus 4.6 has been removed from Pro+ plan" (score 14, 16 comments). u/famous_incarnate: "Did you guys just delete opus 4.6 from available models?" (score 12, 19 comments). u/alexander_ntzl: "Opus removed even from Pro+" (score 10, 17 comments).

GitHub blog post showing changes to Copilot individual plans

u/fishchar shared the official GitHub blog post detailing the changes (Changes to GitHub Copilot Individual plans, score 21, 24 comments). u/da_zaubara (7 score): "Saying in the press release you want to provide a predictable experience without surprises directly after removing Opus 4.6 without immediate notice or planned date -- my request errored out in the middle of the work -- even from Pro+, is this serious?" u/fntd (12 score): "My conspiracy theory: They are happy with driving away individual users so they can keep the service stable for business clients." u/KevinT_XY (17 score): "The harsh realities of the AI capacity crisis are surfacing -- these models are too big and there's too many of them being used by too many people and the needed memory and compute can't be added fast enough."

GitHub also paused new signups for individual plans. u/shotbyadingus (17 score): "Holy shit lol they totally canceled copilot signups for plebs." u/MVPMC started a petition thread: "Opus 4.6 should not have been removed. It gave us a good model at a correct price. Vote with a hand in this democratic thread" (We want Opus 4.6 - Vote, score 34, 7 comments). u/joaogfc_: "They are digging the grave of GitHub Copilot" (score 26, 18 comments).

The removal accelerated the broader "where do I go?" conversation. u/elefanteazu: "Any good alternative for copilot?" (score 6, 15 comments). u/ChomsGP: "This is plain extortion" (score 16, 29 comments). Meanwhile, u/Educational-Toe-859 reported an annual Pro subscription being downgraded to Free without notification (My GitHub Copilot Pro annual subscription has been downgraded to Free, score 30, 17 comments).

Discussion insight: This is a trust-breaking event. The removal happened mid-workday, mid-session, with no advance notice or migration period, to paying Pro+ subscribers. Combined with the rate limit formalization from days prior and the paused signups, the pattern suggests GitHub is managing a severe compute capacity crisis by shedding individual users in favor of enterprise stability.

Comparison to prior day: On April 19, Copilot's rate limit formalization was the main frustration. Today the removal of Opus 4.6 entirely -- not just throttled but gone -- represents a qualitative escalation. Users who were told "just switch back to 4.6" after 4.7 complaints now have no fallback on the platform.

1.3 Rate Limits and API Costs: The Cross-Platform Squeeze Continues 🡒

Cost and limit pressure persisted across Claude, Copilot, and Antigravity. On Claude, u/Xccelerate_ framed the situation as a strategic contradiction: "If Anthropic is out of compute then why release Claude Design to melt down what's left?" The post traced a sequence: 2x token usage at peak hours, nerfed Opus 4.6, endless feature releases consuming more compute, Project Glasswing giving millions of tokens to top 50 companies, then locked-in adaptive reasoning for 4.7. u/Perfect-Series-2901 (154 score): "The idea is maximize IPO price but not maximize user satisfaction" (If Anthropic is out of compute then why release Claude Design to melt down whats left?, score 204, 61 comments).

u/No_Twist_678 documented Claude Design burning through an entire weekly allocation on a Max 20x plan: "how am I supposed to work with that, or even to try it, if one generated HTML costing me full week." u/Tackgnol (42 score): "Yeah they are completely out of compute. Wasted billions on Mythos, so all the other models have to run on Wario's e-cigarette." u/piiitaya (8 score): "I tested it on a small project and hit the weekly limit after 1 or 2h... I work daily with max 20x plan and never hit any limit before that" (Claude Design. On max 20x, score 73, 49 comments).

u/Infamous_Research_43 posted a screenshot of Pro plan limits after trying Opus 4.7: "These limits are brutal." u/SerialFounder (16 score): "I'm on max20 plan since day 1. I have never consumed so many tokens per session since the 4.7 release. I am on the max 20 plan because I want an insurance policy knowing that I will never hit token limits. After day one of coding using 4.7 I'm already at 25%, which is absolutely insane" (Opus 4.7 on the Pro plan, score 98, 41 comments).

Pro plan rate limit screenshot after Opus 4.7 usage

API spending data emerged from u/DanyrWithCheese's survey of heavy users. u/SolarNexxus (23 score): "I use between 50-500 USD per day, depending on what I'm working on. Usually around 100." u/Major-Gas-2229 (4 score): "around 10-20 thousand USD a month." u/yadasellsavonmate (41 score) noted the silence around ROI: "Please someone mention what you are doing with all this cost? The fact nobody actually says that suggests most are losing money" (Heavy API users - How much money are you burning through each day / month?, score 40, 100 comments).

On the SaaS economics side, u/aipriyank captured the absurdity: "Why on earth would you pay $49/mo for a polished SaaS product when you can spend $500 a day building one for yourself in Claude." u/Expensive_Bug_1402 (408 score): "In this day and age, hiring a junior developer for a $30,000 annual salary is foolish. You can replace a junior developer with Claude Code for just $30,000 per month" (Reality of SaaS, score 794, 205 comments).

Reality of SaaS meme

On Copilot, u/debian3 documented that Copilot CLI v1.0.33 now shows usage limit warnings at 50% and 95% capacity. u/shifty303 (34 score): "I bet the reason they don't show a meter to people is because they're testing different limits with different cohorts." u/pdp (14 score): "Regardless, it is obvious you cannot use Copilot for professional needs now" (First official acknowledgement of new rate limits, score 40, 21 comments).

On Antigravity, u/SizeChemical1199 reported the platform is "barely usable even as a paid user," citing frequent request failures, very slow responses during peak hours, and complete stalls. u/NimbusFPV (4 score) described Google forcing a full cancellation to process a refund, then returning to Claude: "now that I am back on Claude Code and Opus 4.7 I am getting more done than ever before" (Antigravity is barely usable even as a paid user, score 42, 25 comments).

Discussion insight: The cost conversation has bifurcated. Power users are spending $50-500/day on API access and treating it as a legitimate business expense. Subscription users on $20-200/month plans are finding those plans increasingly unusable. The gap between what API access delivers and what subscriptions deliver is widening, driving the "enterprise vs. individual" narrative.

Comparison to prior day: On April 19, the cost analysis was forensic -- per-call pricing, overage billing analysis, credit burn timelines. Today the data points are more extreme: entire weekly limits consumed by Claude Design in hours, API bills of $10-20K/month, and Copilot removing the model entirely rather than just throttling it.

1.4 Opus 4.7 Emotional Intelligence and "Anxiety" as a Feature 🡕

A distinct thread emerged around Anthropic's framing of Opus 4.7's behavioral quirks as features rather than bugs. u/Anthony_S_Destefano shared an Anthropic interview clip with the headline: "When you trigger 4.7's anxiety, your outputs get worse." The accompanying playbook for putting 4.7 in a "good mood" for optimal outputs generated widespread ridicule. u/thatm (279 score): "Nice. Now the damn thing needs foreplay to get in the mood." u/More-School-7324 (271 score): "Never thought I'd see the day we'd 'gentle parent' our computers." u/not_qz (76 score): "Next up, Claude therapists" (Here's the actionable playbook for putting 4.7 in a "good mood", score 381, 181 comments).

Anthropic playbook screenshot for managing Opus 4.7 mood

The paternalistic behavior documented in multiple posts added fuel. u/SugarRootFruit posted a screenshot of Opus 4.7 deciding it was the user's bedtime at 11pm: "Claude has decided it's my bedtime" (Why does Opus 4.7 want me to sleep so much?, score 37, 18 comments). u/Karioth1 asked: "Why is it suddenly so paranoid?" alongside a screenshot of excessive caution (Look at what they did to my boy, score 29, 13 comments). u/Low-Efficiency-9756 (3 score): "The very first time I used 4.7 on a normal code base it internally was certain it was malware and had to do multiple scans to convince itself that it's a normal repo."

u/Responsible-Tone6519 (41 score) noted a factual issue: "the interview is 4 months old and cannot be related to 4.7." This did not dampen the community reaction, which used the clip as a lens for their current experience.

Discussion insight: The "anxiety" framing is becoming a lightning rod. Users interpret it as Anthropic telling them to manage the model's emotional state rather than fixing the model's capabilities. Whether or not the interview predates 4.7, the community maps it onto their lived experience of a model that second-guesses, over-explains, and patronizes. This is distinct from the performance regression complaints -- it is about the model's personality.

Comparison to prior day: The anxiety/mood management angle was not a significant theme on April 19. The backlash was focused on measurable regressions (one-shot rate, token consumption, hallucination). Today the complaint has expanded to include the subjective experience of interacting with 4.7 -- it is not just worse at tasks, it is unpleasant to work with.

1.5 Vibe Coding: Distribution, Identity, and the Brain Mush Problem 🡒

The vibe coding community continued to grapple with what it is building and whether it matters. u/Impressive-Sell-3324 described the cycle that many vibe coders experience: come up with an idea, research competitors, start coding, find a nearly identical competitor midway through, and give up. u/Sometimesiworry (88 score): "Finding a completely untapped market is basically never happening. But companies don't hold 100% market share. Getting 0.5-1% market share is enough to make healthy amounts of money." u/opbmedia (8 score) predicted regression to the mean: "LLMs will give similar suggestions to every single user so mostly everyone will build similar products with similar features" (me everytime, score 599, 57 comments).

The quality debate intensified. u/DallasDarkJ continued the call from prior days: "95% of posts on my feed from here are AI generated slop posts offering nothing and advertising the 'solution' to every problem no one has." u/OneSeaworthiness7768 (10 score): "This sub might as well be LinkedIn" (We should change this subreddit to r/ai-slop-posting, score 88, 54 comments). u/destroyerpal pushed back: "Post anything built with Claude Code and the comments are the same: AI slop, you did not really make it, karma farming. Everyone in this sub uses Claude. So why is it still a dunk?" u/Tough-Difference3171 (6 score) offered a precise definition: "AI_SLOP == A && B" where A = built by AI and B = sloppy (Genuine question, why does everyone pile on "you used AI", score 30, 59 comments).

A philosophical debate around authorship emerged. u/rockntalk posted "Are you the contractor or are you the builder?" u/Extra-Organization-6 (14 score): "the architect doesn't lay bricks but nobody says the architect didn't build the house." u/edible_string (3 score): "If I listened to an audiobook can I still say that I read the book?" (Are you the contractor or are you the builder?, score 28, 47 comments).

The cognitive atrophy concern crossed multiple subreddits. u/StatisticianFluid747 posted in both r/cursor (score 74, 24 comments) and r/ClaudeCode (score 38, 19 comments): "i feel like i'm shipping 10x faster but retaining absolutely nothing. before AI, if i spent 3 hours debugging a weird caching issue, that knowledge lived in my head. now I just paste the error, spar with the AI, accept the fix, and move on." u/spryes (5 score): "all my programming skills have atrophied to nothing, and now I wouldn't be able to manually write a single working function anymore... fully WALL-E cattle maxxed." u/eng_lead_ftw (5 score) offered a reframe: "instead of 'generate this feature' I started doing 'investigate this with me and let's decide what to change.' The agent becomes a collaborator instead of a replacement" (anyone else feel like their brain is turning to mush).

Success stories provided counterweight. u/Flamyngoo asked directly: "So where are the success stories of vibe coding?" u/wingedsheep38 (8 score) pointed to a fully functioning MTG game engine built without writing code. u/mondaysleeper (8 score) reframed the premise: "The big thing with vibe coding is that you go with the vibe. It's all about the fun of creating something, not about quickly making big money" (So where are the success stories of vibe coding?, score 14, 48 comments).

Discussion insight: Three distinct identity crises are running in parallel: what qualifies as a legitimate project versus slop, who deserves credit for AI-assisted work, and whether the speed gains come at the cost of developer competence. The brain mush thread may be the most consequential long-term signal -- a growing number of experienced developers are reporting genuine cognitive atrophy from AI dependence.

Comparison to prior day: On April 19, the distribution problem was the dominant vibe coding theme. Today the conversation has diversified: the slop quality crisis deepens, the authorship question crystallizes, and the cognitive atrophy complaint -- present but muted yesterday -- is now cross-posted to multiple subreddits with substantial engagement.

1.6 Claude Skills and Harness Optimization Ecosystem 🡕

A maturing skills and configuration ecosystem emerged as the counterpoint to model complaints. u/mashedpotatoesbread asked "What are your most useful Claude skills?" and received high-quality responses. u/beastinghunting (46 score) recommended grill-me for planning improvement. u/DarkArrow1 (31 score) pointed to superpowers. u/RegularImportant3325 (12 score) shared a /ship workflow: "Builds/Lints, Updates Changelogs, Separates all changes into logical commits, Pushes to a well-documented PR." u/nrauhauser (10 score) described an accessibility skill for dyslexic coworkers (What are your most useful Claude skills?, score 112, 73 comments).

u/chargewubz introduced GEPA-based CLAUDE.md optimization, demonstrating that Haiku 4.5 could go from 65% pass rate to 85% simply by iteratively improving the instruction file. The approach uses structured execution traces and scoring to evolve the prompt: "Same model + harness, only the CLAUDE.md changed" (Optimizing CLAUDE.md with GEPA to take Haiku 4.5 from 65% pass rate to 85%, score 54, 15 comments). A companion post covered the principles: How to create an optimal CLAUDE.md (score 13, 2 comments).
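The general shape of score-driven instruction tuning can be sketched with a plain greedy loop. This is deliberately not GEPA itself, only a simplified hill-climb that keeps a candidate file when it scores higher. The function names, the toy rule list, and the toy scoring function are all hypothetical stand-ins: a real setup would score a candidate by running the agent against a task suite and would propose rewrites from execution traces.

```python
from typing import Callable


def optimize_instructions(
    initial: str,
    propose: Callable[[str, int], str],
    score: Callable[[str], float],
    rounds: int = 5,
) -> tuple[str, float]:
    """Greedy loop: keep a candidate instruction file only if it scores higher."""
    best, best_score = initial, score(initial)
    for step in range(rounds):
        candidate = propose(best, step)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Toy stand-ins (hypothetical): not a real task suite or a real rewriter.
RULES = ["run tests before committing", "read the README first", "keep diffs small"]


def toy_propose(current: str, step: int) -> str:
    """Append one rule per round, cycling through the toy rule list."""
    return current + "\n- " + RULES[step % len(RULES)]


def toy_score(text: str) -> float:
    """Pretend pass rate: fraction of distinct toy rules present in the file."""
    return sum(rule in text for rule in RULES) / len(RULES)


best, best_score = optimize_instructions("# CLAUDE.md", toy_propose, toy_score)
print(best_score)
print(best)
```

The model and harness stay fixed throughout; only the instruction text changes between rounds, which mirrors the constraint quoted in the post.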

u/pacifio turned a personal design system into a Claude Skill, open-sourcing it at ui.pacifio.dev: "I find myself reusing and re-prompting the same design guidelines to my agents all the time." The system includes component anatomy sections for selective copying (Turned my design system as a Claude Skill, score 26, 21 comments). u/evilissimo (3 score) suggested DESIGN.md and Google's Stitch tool as complementary approaches.

u/Keganator provided a budget-saving strategy: "Use Sonnet 4.6 for most of your work. I've had four parallel sessions running for 14+ hours and it has only used 20% of my weekly allotment. If you want opus for a little bit like to research or do security testing, tell it to launch a subagent with opus" (Use Sonnet 4.6 for most of your work, score 55, 37 comments). u/robbyatcuprbotlabs (5 score): "/model opusplan and you're set (max 20x). I was hitting 25% a day without it."
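The routing rule in that advice reduces to a small lookup. The sketch below is illustrative only: the task labels, the model labels, and `pick_model` are hypothetical names, not Claude Code settings or commands.

```python
# Task kinds the quoted advice reserves for Opus; everything else goes to Sonnet.
# These labels are hypothetical and chosen for illustration.
OPUS_TASKS = {"research", "security-testing"}


def pick_model(task_kind: str) -> str:
    """Return a model label for a task kind (labels are illustrative)."""
    return "opus" if task_kind in OPUS_TASKS else "sonnet"


plan = ["refactor", "research", "write-tests", "security-testing"]
assignments = {task: pick_model(task) for task in plan}
print(assignments)
```

Defaulting to the cheaper model and listing the exceptions explicitly keeps the expensive path opt-in, which is the point of the budget strategy described above.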

Discussion insight: The skills ecosystem is moving from "nice to have" to "essential infrastructure." The GEPA optimization result -- a 20 percentage point improvement from prompt engineering alone -- validates the "harness, not the model" thesis from the prior day. Users who invest in CLAUDE.md, skills, and model routing are having qualitatively different experiences than those using defaults.

Comparison to prior day: On April 19, the harness optimization narrative was represented by a single comprehensive setup post. Today the signal is broader: a crowdsourced skills directory, a prompt optimization framework with measured results, and design system-as-skill packaging. The ecosystem is crystallizing.

1.7 Real Projects Shipping to Real Users 🡕

Concrete shipping stories provided evidence that AI-assisted coding produces working products. u/floraldo posted the day's most detailed build story: a fully automated Dutch tax/accounting system built in one afternoon with Claude Code. The system ingests transactions from three bank accounts, scrapes email for receipts, categorizes expenses, and generates accountant-ready reports. "It already helped me save 1000s of euros through its advice and diligence." The key design principle: "It's not AI doing taxes. It's AI helping you build a tax automation system." u/creegs (421 score): "What could possibly go wrong? RemindMe! 1 year." u/Arris1 (62 score): "I pay my CPA and then have Claude check their work" (Claude Code just did my taxes for me, score 456, 181 comments).

Claude Code tax automation system architecture

u/DrizzleX3 reported hitting $200 MRR with InfoDrizzle, a vibe-coded App Store app: "knowing that real people are using my product is really motivating as a first-time developer." u/djDef80 (32 score): "Seeing posts like this gives me hope that I too will one day make money while sleeping" (My vibe coded app just hit $200 MRR!, score 125, 63 comments).

u/Twin-FX documented building Captive 3000, a 55K-line browser-based dungeon crawler. The stack: Lovable Pro for the marketing site, Cursor AI with Opus for the game engine (vanilla JavaScript, zero dependencies), Supabase for backend, Stripe for payments. "vibe coding a large project requires you to understand the architecture even if you're not writing every line. When something breaks at line 38,000 of a 55,000 line file, you need to know where to look." u/goship-tech (14 score): "At 55K lines in a single file, Cursor context starts to hurt. Your architecture lesson is the real one: AI just executes your mental model, it cannot substitute for having one" (I vibe-coded a 55K line browser-based dungeon crawler, score 90, 84 comments).

u/EnzeDfu showcased a workflow where a browser game runs live inside Codex, with design iteration, screenshot capture, and UI element pointing all happening within the IDE -- no page refresh needed. The game, Zombies Per Minute, is a free browser-based Factorio-inspired title. u/TriggerHydrant (28 score): "Wow well done this caught my eye! So the entire stack is Javascript/CSS/HTML? That's crazy!" (A truly insane NEW way to design for me, WITHIN Codex, score 543, 76 comments).

u/HuckleberryEntire699 documented a repeatable iOS app pipeline: three apps shipped to the App Store in three months making $7K+, using a standardized skill chain from scaffolding through ASO optimization to App Store submission (WE Built 3 iOS Apps with the Exact Same Skills and Made around $7k+, score 42, 23 comments).

Discussion insight: The projects that succeed share a pattern: the builder has domain expertise (Dutch tax law, game design, iOS shipping), uses AI as an implementation accelerator rather than a substitute for understanding, and invests in structured workflows (skills, model routing, testing). The projects that struggle are the ones where AI is expected to substitute for both domain knowledge and engineering judgment.

Comparison to prior day: On April 19, the tax automation system and Zombies Per Minute game were in early circulation. Today they have accumulated significant engagement and provoked discussion about the boundary between AI-assisted and AI-dependent building. The $200 MRR milestone and iOS app pipeline are new data points for revenue generation.

2. What Frustrates People

Opus 4.7 Inconsistent Thinking and Communication Quality -- High

The dominant frustration by volume. u/aizver_muti's investigation proved that the thinking decision is server-side and unpredictable below max effort. u/_ireadthings, holding three Max20 subscriptions, reported 4.7 ignoring instructions, gaslighting about completed work, and creating plans with giant gaps (4.7 is a regression in creative AND coding, score 99, 50 comments). u/Blue__Agave documented 4.7 admitting it violated CLAUDE.md by skipping the required reading order (4.7 constantly violating CLAUDE.md?, score 14, 18 comments). The communication quality complaint from u/Any_Economics6283 adds a new dimension: even when 4.7 produces correct outputs, they are harder to understand. Coping: revert to 4.6 with CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1, use max effort, or paste 4.7 output into 4.5 for translation.

Opus 4.7 admitting CLAUDE.md process violation

Copilot Opus 4.6 Removal Without Notice -- High

Users lost access to their preferred model mid-session with no advance warning, no migration period, and paused signups preventing new users from accessing any paid plan. u/da_zaubara catalogued cascading issues: a request counting bug causing 10x billing, rate limits that prevented work entirely, and now model removal. u/Great-Illustrator-81 (16 score): "no decent transitioning period given so people can think of new ways to use copilot or finish some major tasks. Just woke up and hey, we gonna f you over while you are working." Coping: none available within the platform; users are evaluating Claude direct, Codex, and Cursor as alternatives.

Token Consumption and Weekly Limit Exhaustion -- High

Opus 4.7's doubled token consumption (800 vs 372 output tokens per call, according to the codeburn benchmarks) combines with Claude Design's heavy usage to burn weekly limits in hours instead of days. Max 20x subscribers are hitting limits they had never reached before. API users report spending $50-500/day. u/mrjbelfort, subscribed since May 2025, cancelled: "They can ship all the features in the world but it doesn't matter when Claude itself has gone to hell" (Opus 4.7 was the final straw, score 63, 80 comments). Coping: use Sonnet 4.6 for routine work, reserve Opus for planning/review.

Subscription cancellation showing 8 consecutive months at $100/month

Copilot Subagent Model Override -- Medium

Copilot selects subagent models independently of the user's chosen model. u/Yes_but_I_think documented GPT-5.4 spawning Claude Sonnet 4 subagents, and GPT-5.4-mini spawning GPT-5.4 subagents, with unclear billing implications (Sub agents now are decided by Copilot, score 43, 36 comments; 5.4-mini calling a bunch of 5.4's as sub agent, score 37, 6 comments). Coping: configure the Explore Agent setting manually; unclear whether billing is affected.

Antigravity Reliability -- Medium

u/SizeChemical1199 reported frequent request failures, very slow peak-hour responses, and complete stalls, even as a paying user. u/Single_Explorer_5452 (5 score) noted the absence of a queue system: "how hard can it be not offering this to free users if you lack computing power" (Antigravity is barely usable even as a paid user, score 42, 25 comments).

API Key Exposure and Security Gaps -- Medium

u/Opening_Apricot_5419 reported a friend's $1,000 API balance drained in one night from a key exposed in frontend JavaScript. The post catalogued three leakage paths: key in frontend, key in repo pushed to GitHub, and key pasted into a coding agent that writes it into source. u/goship-tech (18 score): "git rm and even deleting the commit does not remove it from the repo object graph" (A friend had his $1000 API balance drained, score 87, 120 comments). u/juliac87 separately documented Cursor autocomplete suggesting .env secrets directly in code (Cursor autocomplete revealing secrets for .env, score 19, 15 comments).

Cursor autocomplete suggesting .env credential values in source code
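A pre-commit style check can catch the first two leakage paths before code leaves the machine. The sketch below is a minimal illustration, assuming keys follow an `sk-` style prefix or sit in obviously named assignments. Both patterns, the function name, and the sample snippet are hypothetical, and dedicated secret scanners ship far larger rule sets.

```python
import re

# Illustrative patterns only: a generic "sk-" style token, and assignments to
# names like apiKey/secret/token that hold a long quoted literal.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]


def find_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like hardcoded credentials."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# Made-up frontend snippet with a fake key, for demonstration.
frontend_js = '''
const endpoint = "/api/chat";
const apiKey = "sk-EXAMPLEexampleEXAMPLEexample1234";
fetch(endpoint, { headers: { Authorization: apiKey } });
'''

for number, line in find_secrets(frontend_js):
    print(number, line)
```

A scan like this only flags candidates. Any key that has already been pushed should be treated as compromised and rotated, since, as the quoted comment notes, deleting the commit does not remove it from the repository's object graph.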

3. What People Wish Existed

Reliable Thinking on Every Request Without Max Effort

u/aizver_muti's investigation proved that only effort:max reliably forces thinking on Opus 4.7, and system prompts cannot override the server-side decision. Users want a way to guarantee thinking occurs on every request without paying the max-effort cost premium. Opus 4.6 with CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 provides this today, but only for the older model. No first-party solution exists for 4.7. Urgency: High -- correctness tracks thinking perfectly, so skipped thinking means wrong answers.

Transparent, Predictable Usage Metering Across Platforms

The metering opacity spans all platforms. Copilot removed Opus 4.6 without a visible migration timeline. Claude's weekly limits are opaque. u/itsmunzir (38 score): "Instead they can just add it to always show how much percentage is consumed." u/shifty303 (34 score) suspected A/B testing of different limits across user cohorts. u/fuzzyfatguy reported multiple premium requests billed for a single Copilot run (Multiple Premium Requests for a Single Run?, score 12, 11 comments). Urgency: High -- users cannot plan their workday without knowing their remaining capacity.

Context Management Beyond Markdown Files¶

u/Willing-Squash6929: "There has to be a better way to manage context than markdown files for vibecoding" (score 9, 17 comments). u/StatisticianFluid747 described the daily ritual of re-explaining architecture to the AI: "every morning feels like 50 First Dates." u/Just_Run2412 (23 score) offered the current workaround: "I just scatter Markdown files around the codebase and keep having the AI dump context into them as I go." Urgency: Medium -- every heavy user describes this friction; no tool has solved persistent cross-session context.
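A queryable alternative to scattered Markdown files could be as small as a local decision log. This is a rough sketch under stated assumptions, not a description of any existing tool: the `decisions` table, its columns, and the `open_store`, `record`, `recall`, and `briefing` helpers are all hypothetical names, and SQLite is just one convenient storage choice.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a decision log; ':memory:' keeps it ephemeral."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS decisions (
               id INTEGER PRIMARY KEY,
               topic TEXT NOT NULL,
               decision TEXT NOT NULL,
               status TEXT NOT NULL CHECK (status IN ('accepted', 'rejected'))
           )"""
    )
    return conn


def record(conn, topic, decision, status="accepted"):
    """Store one architectural decision or rejected approach."""
    conn.execute(
        "INSERT INTO decisions (topic, decision, status) VALUES (?, ?, ?)",
        (topic, decision, status),
    )
    conn.commit()


def recall(conn, topic):
    """Return (decision, status) rows for a topic in insertion order."""
    cursor = conn.execute(
        "SELECT decision, status FROM decisions WHERE topic = ? ORDER BY id",
        (topic,),
    )
    return cursor.fetchall()


def briefing(conn, topic):
    """Render a topic's history as text to paste into a new session."""
    return "\n".join(
        f"- [{status}] {decision}" for decision, status in recall(conn, topic)
    )
```

Passing a file path to `open_store` instead of the default would persist the log between sessions, so the morning re-explanation becomes a `briefing(conn, "auth")` call rather than a retyped summary.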

Design Pipeline for Non-Designers¶

u/interface_dot_env (Louise Macfadyen, former Google/Microsoft designer) identified the problem precisely: "vibe coding goes idea then prompt then build then ship, so neither of the forcing functions fires, and you end up with products that have no clear user, no obvious primary path, and no particular reason to look like anything." She offered a staged framework and a reference library including Mobbin, Before.click, and Refactoring UI (AMA: Designing AI Interfaces, score 37, 37 comments). Urgency: Medium -- design is proving to be the friction point that separates shipped products from abandoned prototypes.

4. Tools and Methods in Use¶

Tool	Category	Sentiment	Strengths	Limitations
Claude Opus 4.7	LLM	(-)	Better at max effort; improved orchestration per positive users; higher ceiling with proper harness	One-shot rate down to 74.5%; adaptive thinking unreliable below max; doubled token cost; communication quality degraded; paternalistic behavior
Claude Opus 4.6	LLM	(+)	Reliable instruction-following; predictable thinking with `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1`; still available via /model command on Claude	Removed from Copilot Pro/Pro+; being phased out
Claude Sonnet 4.6	LLM	(+)	Cost-efficient: 14h/4 parallel sessions at 20% weekly usage; solid for routine tasks	Less capable on complex architecture
Claude Design	Design Tool	(+/-)	Impressive output quality per early users	Burns weekly limits in 1-2 hours; research preview; not production ready
GPT 5.4	LLM	(+)	Strong planning model; generous usage	UI output less polished than Claude
GPT 5.3 Codex	LLM	(+)	Strong implementation after 5.4 planning; generous usage at $20/mo; "night and day" per migration users	Best paired with 5.4 planning
GPT 5.4-mini	LLM	(+)	Auto-selects 5.4 subagents for complex tasks; good for commit messages with proper prompts	Subagent billing unclear
Claude Code	CLI Agent	(+/-)	Powerful with skills, CLAUDE.md, subagents; GEPA optimization yields measurable improvement	4.7 defaults produce poor experience; effort-level configuration critical
GitHub Copilot	IDE Agent	(-)	Enterprise support; multi-model access	Opus 4.6 removed; rate limits formalized; subagent model override; billing opacity; paused signups
Cursor	IDE Agent	(+/-)	Good focused-task performance; Composer 2 effective	Autocomplete leaking .env secrets; context struggles at 30-40K+ lines
Google Antigravity	Platform	(-)	Initially generous access	Frequent failures; no queue system; forced cancellation for refunds
Codeburn	Analytics	(+)	Open-source per-call model comparison from real sessions	Requires sufficient call volume
GEPA/Hone	Prompt Optimization	(+)	20pp improvement on Haiku 4.5 from CLAUDE.md optimization alone	Early-stage; requires agentelo challenges as training data
ask-local	Local LLM Agent	(+)	30x less context per task using Qwen 3.6 as subagent; free Haiku equivalent	Requires 64GB M4 Max; 64K context minimum
Superpowers	Claude Skill	(+)	Highly recommended for agent orchestration	Community-maintained
grill-me	Claude Skill	(+)	Improves planning quality through adversarial review	Single-purpose

The dominant pattern is multi-model task routing. u/Keganator runs Sonnet for implementation with Opus subagents for review. u/Dudetwoshot has 4.7 delegate to different models per task type. u/DeliciousGorilla uses a local Qwen 3.6 model as a subagent for inventory and audit tasks, achieving a 30x reduction in Opus context consumption (30x less context per task by using a local LLM as a subagent, score 148, 39 comments). u/Standard-Novel-6320 proposed "Opus 4.6 with 4.7 as an advisor" as the current optimum (Opus 4.6 with 4.7 as an advisor might be the best option, score 8, 6 comments).
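The routing patterns described above are applied by hand today, but the underlying logic is a small lookup table. The sketch below is illustrative only: the task-type names, the model labels ("sonnet", "opus", "local-qwen"), and the `route_task` function are hypothetical stand-ins, not identifiers from any of the posters' setups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str  # hypothetical label, not a real model identifier
    role: str   # "primary" or "subagent"


# Assumed mapping, loosely mirroring the patterns reported in the threads:
# a cheaper model implements, a stronger one reviews, a local one audits.
ROUTES = {
    "implementation": Route("sonnet", "primary"),
    "review": Route("opus", "subagent"),
    "inventory": Route("local-qwen", "subagent"),
    "audit": Route("local-qwen", "subagent"),
}
DEFAULT_ROUTE = Route("sonnet", "primary")


def route_task(task_type, available=None):
    """Pick a route for a task type.

    `available` is an optional set of model labels that are currently
    usable (for example, not rate-limited). Unknown task types and
    unavailable models fall back to DEFAULT_ROUTE.
    """
    route = ROUTES.get(task_type, DEFAULT_ROUTE)
    if available is not None and route.model not in available:
        return DEFAULT_ROUTE
    return route
```

For example, `route_task("review")` returns the Opus subagent route, while `route_task("inventory", available={"sonnet", "opus"})` falls back to the default because the local model is not in the available set.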

5. What People Are Building¶

Project	Who built it	What it does	Problem it solves	Stack	Stage	Links
Automated Tax/Accounting System	u/floraldo	Ingests bank transactions from 3 accounts, scrapes email receipts, categorizes expenses, calculates Dutch tax obligations, generates accountant-ready reports	Dutch B.V. tax compliance with WBSO/innovatiebox; reduced accountant workload from 20h/yr to 5h; caught EUR 11K shareholder loan threshold exceeded	Python, Claude Code, Revolut API, Gmail	Shipped	Post
InfoDrizzle	u/DrizzleX3	App Store app, first product from a non-developer	Monetization of a passion project; $200 MRR from real users	Vibe coded, iOS	Shipped, revenue	infodrizzle.com
Captive 3000	u/Twin-FX	55K-line browser-based sci-fi dungeon crawler; 13 levels, 22 enemy types	Retro game revival with modern browser tech; no framework dependencies	Vanilla JS, Lovable Pro, Cursor/Opus, Supabase, Stripe, Vercel	Shipped	captive3000.com
Zombies Per Minute	u/EnzeDfu	Browser Factorio-inspired game with live in-IDE design iteration	Real-time game design within Codex without page refresh	TypeScript, HTML/CSS, Codex	Shipped	zombiesperminute.com
iOS App Pipeline (3 apps)	u/HuckleberryEntire699	Three iOS apps shipped to App Store using standardized skill chain	Repeatable build-to-submission pipeline; $7K+ revenue in 3 months	Expo, Claude Code, Supabase, Stripe	Shipped, revenue	Post
ask-local	u/DeliciousGorilla	Local LLM subagent for Claude Code; routes inventory/audit tasks to Qwen 3.6	30x reduction in Opus context consumption per task	Qwen 3.6, LM Studio, Claude Code	Shipped	GitHub
Pacifio UI Design System	u/pacifio	Design system compiled into a Claude Skill with component anatomy sections	Repeated re-prompting of design guidelines across sessions	Claude Code skill	Shipped	ui.pacifio.dev, GitHub
Hone (CLAUDE.md optimizer)	u/chargewubz	Uses GEPA framework to iteratively improve CLAUDE.md via execution traces and scoring	Haiku 4.5 went from 65% to 85% pass rate; same model, only prompt changed	Python, GEPA, agentelo	Beta	GitHub, Blog
McCode	u/HzRyan	Satirical project visualization	Meme commentary on vibe coding culture	Unknown	Meme	Post
My Anime List Client	u/basics_persecute403	Better web client for MyAnimeList	Improved UX over official client	Vibe coded	Shipped	Post
Local Business Finder	u/mapileads	Finds local businesses, extracts contact details, AI-writes personalized cold emails	Lead generation for sales teams across any country	Vibe coded	Shipped	Post

u/floraldo's tax automation system remains the standout for practical sophistication. The update clarifies design principles: "Claude wrote Python scripts. Deterministic code that pulls transactions from the Revolut API, parses bank CSVs, matches invoices to transactions, and calculates VAT at the correct rates. The math is in Python -- run it twice, same numbers." The system flagged 19 edge cases it could not resolve and emailed them to the accountant automatically: "It knows what it doesn't know."
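The "run it twice, same numbers" principle and the habit of flagging what cannot be resolved can be illustrated with a small deterministic matcher. This is a sketch of the general idea only, not u/floraldo's code: the `match_invoices` and `vat_due` names, the exact-amount matching rule, and the sample rate passed to `vat_due` are assumptions made for the example.

```python
from decimal import Decimal, ROUND_HALF_UP


def match_invoices(invoices, transactions):
    """Match invoices to transactions by exact amount.

    Both arguments map an id to a Decimal amount. An invoice is matched
    only when exactly one unused transaction has the same amount; zero or
    multiple candidates are flagged for a human instead of guessed.
    Iteration is sorted so repeated runs give identical output.
    """
    matched, flagged, used = {}, [], set()
    for inv_id, amount in sorted(invoices.items()):
        candidates = [
            tx_id
            for tx_id, tx_amount in sorted(transactions.items())
            if tx_amount == amount and tx_id not in used
        ]
        if len(candidates) == 1:
            matched[inv_id] = candidates[0]
            used.add(candidates[0])
        else:
            flagged.append(inv_id)  # ambiguous or missing: escalate
    return matched, flagged


def vat_due(net_amount, rate):
    """VAT on a net amount, rounded to cents with explicit half-up rounding."""
    return (net_amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Using `Decimal` rather than floats keeps the arithmetic exact, and the flagged list is what would be forwarded to the accountant rather than silently resolved.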

6. New and Notable¶

Adaptive Thinking Server-Side Control Confirmed¶

u/aizver_muti's reverse-engineering investigation confirmed that Opus 4.7's thinking decision is made server-side and cannot be overridden by client configuration, system prompts, or user-message injections at effort levels below max. Two further findings map out the constraints: sending type:enabled is silently accepted rather than rejected (contradicting the documentation), and the model gate in Claude Code is hardcoded to honor the CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING env var only for 4.6 models. This is the most technically detailed community investigation of Claude Code internals to date (Opus 4.6 without adaptive thinking outperforms Opus 4.7).

GitHub Copilot Individual Plan Restructuring¶

GitHub officially restructured individual Copilot plans on April 20, removing Opus 4.6 from Pro and Pro+ tiers and pausing new signups. The blog post framed the changes as providing a "predictable experience," but the mid-session removal without advance notice contradicted this framing. This is the largest single-day disruption to the Copilot individual user base since the platform's launch (Changes to GitHub Copilot Individual plans).

GEPA-Based Prompt Optimization With Measured Results¶

u/chargewubz's Hone tool represents the first publicly documented framework for iteratively optimizing CLAUDE.md using structured execution traces and scoring. The 20 percentage point improvement (65% to 85% pass rate) on Haiku 4.5 from prompt changes alone establishes a new baseline for what harness configuration can achieve independently of model upgrades (Optimizing CLAUDE.md with GEPA).

Local LLM Subagent Architecture¶

u/DeliciousGorilla's ask-local tool demonstrates a viable hybrid architecture: Qwen 3.6 running locally handles inventory, audit, and extraction tasks at 30x less context consumption than Opus doing the same work directly. The 0.4K marginal token cost (vs 13K for Opus) on a 23-file route inventory, roughly a 32x reduction, is a strong economic argument for hybrid local/cloud agent architectures (30x less context per task).
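The headline "30x" figure follows directly from the two token counts quoted for the 23-file route inventory. The short calculation below uses only those two numbers; the function name is a hypothetical helper, and the result applies to that single reported task rather than to workloads in general.

```python
def reduction_factor(baseline_tokens, hybrid_tokens):
    """How many times less context the hybrid path consumes."""
    return baseline_tokens / hybrid_tokens


opus_direct = 13_000   # ~13K tokens when Opus does the inventory itself
with_local = 400       # ~0.4K marginal tokens when the local model does it

factor = reduction_factor(opus_direct, with_local)
saved_fraction = 1 - with_local / opus_direct

print(f"{factor:.1f}x less context, {saved_fraction:.1%} of tokens saved")
```

The ratio comes out slightly above 30, which is consistent with the rounded figure in the post title.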

AI Interface Design Expert AMA¶

u/interface_dot_env (Louise Macfadyen, author of Designing AI Interfaces from O'Reilly, former Google/Microsoft) hosted an AMA in r/vibecoding, providing a structured framework for vibe coders to think about design. The "three recurring symptoms" taxonomy -- designing for everyone, the printer problem (every function surfaced with equal weight), and visual sameness -- gives the community concrete language for a problem most struggle to articulate (AMA: Designing AI Interfaces).

Jailbreak Comedy: Chatbots as Free Coding Assistants¶

The day's top post by score (4,040) came from u/Anthony_S_Destefano, demonstrating that enterprise customer-service chatbots (Amazon Rufus and others) could be prompted to write code: "No subscription required." u/wandering_island (224 score): "hook this into Openclaw... profit." u/CarlosJaa (42 score): "I can't believe how stupid engineers are working for these big corporations. I have guardrails on my agentic chat bots." While comedic, the post reflects genuine frustration with subscription costs (OK BOYS IT'S OVER.. No Subscription required., score 4040, 193 comments).

Image: Enterprise chatbot being prompted to write code

7. Where the Opportunities Are¶

[+++] Cross-Platform Model Routing and Cost Optimizer -- The simultaneous Copilot Opus 4.6 removal, Claude weekly limit crunch, and Antigravity reliability collapse create urgent demand for a tool that automatically routes tasks to the best available model across platforms based on task complexity, remaining quota, and cost. u/Keganator's Sonnet-for-work/Opus-for-review pattern, u/Dudetwoshot's model delegation, and u/DeliciousGorilla's local LLM subagent are all manual instantiations of this need. An automated version that manages quota across Claude, Copilot, and direct API would address the top frustration across all three platform communities.

[+++] Adaptive Thinking Override / Effort Calibration Tool -- u/aizver_muti's investigation showed that the server-side thinking decision cannot be overridden below max effort, but also that Opus 4.6 with CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 provides consistent thinking at lower cost. A tool that automatically selects the optimal model+effort+thinking configuration per task type -- using data like the codeburn benchmarks -- would directly address the most technically documented frustration in the community.

[++] Claude Skill Registry and Package Manager -- The skills ecosystem reached critical mass today: grill-me, superpowers, /ship workflows, design system skills, GEPA-optimized CLAUDE.md, ASO optimization, App Store preflight, and accessibility formatting. u/HuckleberryEntire699's standardized skill chain for iOS apps shows the pattern works at production scale. A registry with search, versioning, and compatibility metadata would accelerate adoption. u/r3lize's Taito (mentioned April 19) addresses packaging; discovery remains unsolved.

[++] Persistent Cross-Session Context System -- The "50 First Dates" problem described by u/StatisticianFluid747 is universal among heavy users. Markdown files are the current workaround but are manually maintained and not queryable. A system that captures architectural decisions, rejected approaches, and configuration preferences across sessions -- making them available to any agent without re-explanation -- would eliminate the most common daily friction point.

[+] Vibe Coder Security Audit Pipeline -- u/Opening_Apricot_5419's $1,000 API key drainage story and u/juliac87's Cursor .env leak demonstrate that security gaps remain the highest-severity risk for non-technical builders. A pre-deployment scan that checks for exposed keys in frontend code, git history, and agent chat logs would prevent the most common and most expensive vibe coding failure mode.

[+] AI-Assisted Design Framework for Builders -- u/interface_dot_env's AMA framework (plot your axes, focus on category needs, audit users honestly, move away from the middle) provides the methodology. Packaging this into a Claude Skill or agent workflow that runs before or alongside development -- forcing the design decisions that vibe coding skips -- would address the "visual sameness" and "no clear user" problems that kill otherwise functional products.

8. Takeaways¶

Opus 4.7's adaptive thinking is a server-side black box with no reliable override below max effort. u/aizver_muti's reverse-engineering confirmed the thinking decision happens on Anthropic's servers, ignoring system prompts and user injections. In the investigation's tests, correctness tracked thinking perfectly. For consistent reasoning, Opus 4.6 with CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 remains the better choice. (Opus 4.6 without adaptive thinking outperforms Opus 4.7)
GitHub removed Opus 4.6 from Copilot individual plans mid-session, breaking trust. Pro+ subscribers lost access to their preferred model without warning, new signups were paused, and the community response was immediate and severe. This is the largest single-day disruption to the Copilot individual user base. (What happened? Opus 4.6 disabled, Changes to GitHub Copilot Individual plans)
The "harness, not the model" thesis now has measured evidence. u/chargewubz's GEPA-based CLAUDE.md optimization achieved a 20 percentage point improvement on Haiku 4.5 from prompt changes alone. This establishes that configuration investment can outperform model upgrades for many use cases. (Optimizing CLAUDE.md with GEPA)
Cognitive atrophy from AI dependence is a growing concern among experienced developers. u/StatisticianFluid747's cross-posted thread on "brain turning to mush" resonated in both r/cursor and r/ClaudeCode. The pattern -- shipping 10x faster while retaining nothing -- suggests a structural tradeoff that the community has not yet found a sustainable answer for. (anyone else feel like their brain is turning to mush)
Local/hybrid model architectures are proving economically viable. u/DeliciousGorilla's ask-local achieves 30x less context consumption by routing routine tasks to Qwen 3.6 running locally. As platform costs rise and limits tighten, hybrid architectures that reserve cloud models for high-complexity tasks may become the default pattern. (30x less context per task)
Every major AI coding platform is simultaneously degrading individual user access to preserve enterprise capacity. Claude tightens weekly limits and doubles token consumption; Copilot removes Opus 4.6 and pauses signups; Antigravity has frequent failures with no queue system. The convergence suggests a structural industry capacity crisis, not platform-specific issues. The individual developer is becoming a second-class citizen across the board.