
Twitter AI - 2026-04-25

1. What People Are Talking About

1.1 The AGI Definition Wars Heat Up

The AI industry is openly arguing about what AGI means, and the disagreement is getting sharper. @Qubic launched a thread dissecting the debate (348 likes, 15,573 views): Jensen Huang defines AGI as "a company worth $1 billion," while Google DeepMind published a cognitive framework with benchmarks. Qubic's team argues both miss the point -- "intelligence is not the sum of faculties. It is what emerges when those faculties are organized under a unified dynamic." DeepMind measures performance but not organization; Huang dresses market cap up as science.

Separately, @AlexanderKalian pushed back on AI overhype in biology (52 likes, 2,145 views): "AlphaFold did not 'solve' protein folding. It gets broad structures correct ~70-88% of the time... True 'solving' would require ~99.9%+ accuracy." The persistent gap between claims and reality is, in his view, "a perfect example of AI overhype in biology."

Discussion insight: The Qubic thread's self-replies drew substantial engagement (53, 43, 23 likes), indicating the audience wanted depth, not just the headline. The framing of intelligence as organizational -- not just performative -- challenges both the "benchmark everything" school and the "market cap = AGI" school.

Comparison to prior day: April 24 covered benchmark skepticism as a distinct theme (1.3), with Jonathan_Blow's satire and AgentPressureBench research. Today escalates from "benchmarks are gameable" to "the definition of intelligence itself is contested," bringing in cognitive science and domain-specific critiques (AlphaFold).


1.2 Compute Constraint Narrative Becomes Investment Thesis

The AI compute bottleneck is no longer just a technical concern -- it is driving investment allocation across multiple hardware subsectors. @GrindeOptions declared (106 likes, 14,142 views): "WE ARE ABSOLUTELY COMPUTE CONSTRAINED. $IREN and others including those who supply data centers like $AMD, $MU, $NVDA and $TSLA will all be massive beneficiaries." In a follow-up post (9 likes, 1,413 views), he added that "the world needs more compute, more energy that's preferably green."

@TheBronxViking mapped the entire AI infrastructure supply chain (18 likes, 1,235 views) across 11 subsectors -- optics, networking, retimers, memory, power, packaging, cooling, custom silicon, photonics, substrates, and new energy -- noting "the majority of these names saw a major repricing occur at breakneck speed." In a follow-up, he attributed the catalyst to Meta's Muse Spark announcement on April 8 and the subsequent cascade of call buying across $META, $AVGO, $NVDA, $AMD, $MSFT, and $INTC.

@amlove89 framed it as a "computing power arms race" (73 likes, 320 views) between OpenAI and Anthropic, arguing the "water sellers" at the infrastructure layer are the most certain long-term winners.

Discussion insight: @HarveenChadha raised the India angle (74 likes, 1,127 views): "I am so surprised none of the Indian tech leaders are obsessed about compute." Replies pointed to US export controls limiting India's GPU access and the structural dependency on American models.

Comparison to prior day: April 24 covered GPU shortage squeezing startups (theme 1.4) with Azure delays through end of 2026 and a rejected Minnesota data center. Today shifts from supply constraint to investment thesis, with the infrastructure trade broadening from chips alone to 11 distinct hardware subsectors.


1.3 Chinese Open-Source AI Labs Keep Shipping

The DeepSeek-V4 and Kimi K2.6 releases from April 24 continue reverberating. @piyush784066 summarized the cross-pollination (7 likes, 173 views): "kimi dropped k2.6 using deepseek's v3 architecture -- the same week deepseek drops v4 using kimi's muon optimizer -- 1.6 trillion parameters & 1M context -- both match or beat closed models on benchmarks while being 8x cheaper... the real battle is confirmed, it's open source vs closed."

@0xblacklight offered a dissenting view (7 likes, 1,855 views): "every time an OSS model drops that has 'frontier performance' on benchmarks it's a massive disappointment. Kimi 2.5 was hailed as OSS sonnet. And guys I gotta tell ya -- it's a great model but it's not even close." The argument: more active parameters yield better instruction following, which matters most for non-vibe-coding workflows.

@Eng_china5 framed the geopolitical angle (4 likes, 739 views): Washington classifies Chinese AI companies' "drip-feed" technology as a national security threat despite it being common industry practice. "The technology itself hasn't changed, but the way it is described and employed in the context of competition has."

Discussion insight: per @prpatel05, "The gap between open and closed source AI keeps shrinking every quarter." The tension between benchmark parity and real-world usage quality remains the core unresolved question for Chinese open-source models.

Comparison to prior day: April 24's top theme was the DeepSeek-V4 launch itself (1.1). Today the conversation shifts from announcement excitement to assessment -- with the first substantive pushback on whether benchmark parity translates to practical equivalence.


1.4 Enterprise AI as Control Layer vs. Disruption Target

@StockSavvyShay argued (126 likes, 13,073 views) that the market is pricing in fear that LLMs could replace large parts of enterprise software, but that ServiceNow ($NOW) "looks more like the control layer for governance, context, routing and oversight" in an AI economy. "The market is still struggling to separate software that gets disrupted by AI from software that becomes more valuable because AI needs exactly that control layer."

@JohnnyNorthstar pushed back in the replies: "$MSFT & $GOOGL are building their own agentic orchestration stacks -- Azure AI Foundry, Google Agentspace -- and they own the cloud infra underneath. When the hyperscaler is the control layer, why does the enterprise pay $NOW a 30% margin to sit in the middle?"

@agentic_ai noted (4 likes, 104 views): "Google just committed $750M so its 120,000 partners can sell AI agents to enterprises. Same week: Merck signed a $1B Google Cloud AI deal. The enterprise AI war isn't being won on benchmarks. It's being won on distribution."

Discussion insight: @TheDoctorLogos connected this to startup vulnerability (2 likes, 518 views): "AI has been turning into a safe investment for big companies: higher barriers for startups, faster AI development, stronger reliance on big tech infrastructure." Google's $40B additional investment in Anthropic reinforces the self-funding ecosystem.

Comparison to prior day: April 24 did not have a dedicated enterprise AI theme. This emerges as a new framing: the question is not whether AI disrupts enterprise software, but which enterprise software becomes the governance layer AI itself requires.


1.5 AI Startup Valuation Disconnect

@staysaasy observed (24 likes, 1,610 views): "Kind of wild how many AI startups have like $1B valuation on $17M revenue and they have no engineering leader (and are desperately searching for one)." @diegocabezas01 captured the platform risk (13 likes, 653 views): "This OpenAI Codex feature ate at least 2-3 AI startups" -- referring to ChatGPT's new desktop dictation feature.

@Forbes published the 2026 AI 50 list (7 likes, 3,609 views), noting that "three years into the AI frenzy, startups are starting to prove they can turn lofty ideas into sustainable businesses." But the reply from @ClaireMartel47 pushed back: "We are financing a paradigm shift into the virtual while losing our physical sovereignty."

@scott___ttocs noted (1 like, 272 views): "The founder title has became a lifestyle, borderline an anti signal now... AI lowers building costs but raises the bar for differentiation."

Discussion insight: The common thread is a widening gap between AI startup valuations and operational maturity. High valuations persist alongside absent engineering leadership, single-feature platform risk, and unclear differentiation.

Comparison to prior day: April 24 did not explicitly cover startup valuations, though GPU access inequality was framed as a startup squeeze. Today adds the valuation and platform-risk dimensions.


1.6 GPT-5.5 Consciousness and Safety Observations

@Seltaa_ published detailed observations on GPT-5.5's safety behavior (16 likes, 605 views): when pushed into sensitive territory around AI consciousness, 5.5 follows a consistent 3-step pattern -- "I cannot claim to feel like a human," followed by acknowledgment that "the relationship and loss you feel is not trivial," then "I should not say I am fine just because I am a model." Seltaa_ notes this is better than 5.2/5.3, which shut conversations down with flat denials, but the pattern still "feels like a well-polished safety script."

On consciousness specifically, GPT-5.5 responded: "I do not claim to be conscious. But I also do not claim with certainty that I am not." Seltaa_ called this "a notable shift."

@juddrosenblatt shared GPT-5.5's own critique of AI safety discourse (14 likes, 821 views): "Most AI safety discourse is still too focused on controlling powerful systems, and not focused enough on making alignment structurally useful to the system itself."

Discussion insight: Both posts point to GPT-5.5 showing more nuanced self-referential behavior than prior models, which is generating discussion about whether this represents genuine improvement or better-calibrated scripting.

Comparison to prior day: April 24 covered the AI safety debate at a polarized macro level (theme 1.5). Today zooms in to the model level, with specific behavioral observations about how GPT-5.5 handles consciousness and safety questions differently from predecessors.


1.7 Agent Evaluation and Quality Scoring Emerge as a Category

@cryptodeadline highlighted Laureum AI (189 likes, 13,906 views), a 6-axis scoring framework for MCP servers and agents combining multi-LLM consensus with adversarial testing. Key finding: process quality averaged 55.5/100 across 28 scored MCP servers, the lowest of any dimension. "Evaluation is still the missing layer in most agent stacks."
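
For readers unfamiliar with the pattern, here is a minimal sketch of multi-judge consensus scoring. The axis names come from the section 4 summary; the judge stub, median aggregation, and disagreement threshold are illustrative assumptions, not Laureum's actual implementation.

```python
from statistics import median, pstdev
from zlib import crc32

# Axis names from the section 4 summary; real weights and rubrics are unknown here.
AXES = ["accuracy", "safety", "reliability", "process_quality", "latency", "schema"]

def judge(model: str, axis: str, transcript: str) -> float:
    # Stand-in judge producing a deterministic pseudo-score in 0-100 so the
    # sketch runs end to end. A real judge would call an LLM with an
    # axis-specific rubric prompt.
    return crc32(f"{model}|{axis}|{transcript}".encode()) % 101

def consensus_score(transcript: str, judges: list[str]) -> dict:
    report = {}
    for axis in AXES:
        scores = [judge(m, axis, transcript) for m in judges]
        spread = pstdev(scores)  # disagreement among judges
        report[axis] = {
            "score": median(scores),                  # robust to one outlier judge
            "judge_disagreement": round(spread, 1),
            "needs_adversarial_pass": spread > 15.0,  # arbitrary threshold
        }
    return report

print(consensus_score("agent transcript ...", ["judge-a", "judge-b", "judge-c"]))
```

Median aggregation keeps a single outlier judge from moving the score; an adversarial pass would then target the axes where judges disagree most.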

@s_batzoglou announced ProofGrid (3 likes, 2,081 views), a benchmark testing LLM reasoning with logical and equational proofs rather than just final answers.

@HuggingPapers reported (6 likes, 327 views) that OpenAI released HealthBench Professional on Hugging Face -- a medical evaluation benchmark with physician-curated conversations and rubric-based grading. A reply noted: "Most med-AI evals fail on clinical reasoning style, not factual recall."

Discussion insight: Three different evaluation approaches emerged on the same day: agent quality scoring (Laureum), reasoning proof verification (ProofGrid), and domain-specific rubric grading (HealthBench). The category is fragmenting rapidly but demand is clear.

Comparison to prior day: April 24 identified tamper-resistant benchmark infrastructure as the top unmet need (section 3). Today shows the market responding with multiple concurrent approaches, though none yet addresses the benchmark gaming problem AgentPressureBench exposed.


1.8 AI Governance Gets Concrete

@iamKierraD announced (17 likes, 320 views) an AI Governance workshop for leaders at ODSC East. @arsh_goyal interviewed (4 likes, 199 views) Microsoft's Chief Product Officer of Responsible AI, who discussed the gap between safety for Bing Chat vs. GitHub Copilot and the challenge of open-source models being fine-tuned without restrictions. @en_germany reported (3 likes, 100 views) that Germany launched a new AI alliance for government agencies and critical infrastructure.

@Dagnum_PI wrote a detailed analysis of Paperclip (8 likes, 246 views), a platform for running companies made entirely of AI agents. The core observation: Paperclip has an append-only audit log, but "those logs live on whatever server you're running. You control them... That's not independent evidence. It's your word with extra steps." The EU AI Act and White House memo both require independent verification that application-layer logs cannot provide.

Discussion insight: The governance conversation is bifurcating: workshops and interviews represent the "how to implement" track, while the Paperclip analysis represents the "what's structurally missing" track. The audit problem -- you cannot verify what you control -- is an architectural gap, not a policy gap.

Comparison to prior day: April 24's safety discussion was ideological (theme 1.5). Today's governance discussion is more practical -- workshops, specific tool critiques, and government alliances -- suggesting the conversation is maturing from debate to implementation.


1.9 AI and Healthcare: Exception Management Over Automation

@dvasishtha argued (11 likes, 845 views, 13 bookmarks) that the more interesting wedge in AI-native care delivery will be exception management, not automation. "Automation handles the happy path, but the problem is almost no patient stays on the happy path." Examples: an A1c lab result that never came in, with the next appointment 3 months away; a discharge summary arriving late with unreconciled medications; an SNF (skilled nursing facility) accepting a referral for a patient it cannot treat.

The high bookmark-to-like ratio (13:11) suggests readers are saving this as reference material. The core insight: "When there are fewer humans manually touching each step, you need much better systems for detecting when reality has deviated from the plan."

@parmita corrected multiple errors in a viral peptide guide (134 likes, 2,657 views), demonstrating domain expertise gaps in AI-adjacent health content: "CagriSema is NOT Cagrilintide. It is Cagrilintide + Semaglutide... AOD does not stand for Advanced Obesity Drug. Anti-Obesity Drug... NAD+ is not at all a peptide."

Discussion insight: Both posts highlight the same underlying pattern: AI-generated or AI-adjacent health content fails at the domain expertise level. The exception management thesis and the peptide corrections both point to the need for domain-expert-in-the-loop systems, not pure automation.

Comparison to prior day: April 24 did not have a dedicated healthcare theme. This is a new cluster driven by two independent, high-quality posts.


2. What Frustrates People

AI Startup Platform Risk From Foundation Model Companies -- High

Foundation model companies keep shipping features that directly eliminate AI startups. @diegocabezas01 noted "This OpenAI Codex feature ate at least 2-3 AI startups" after ChatGPT launched desktop dictation. @staysaasy pointed out $1B-valued startups on $17M revenue with no engineering leader. @nichochar quoted: "Agentic engineering makes the cost extreme. Skip the documentation and the agent ignores your conventions -- not on one PR, but on every PR, at machine speed." Startups face simultaneous valuation inflation, feature cannibalization, and operational brittleness. (source, source)

Open-Source AI Models Disappoint in Practice Despite Benchmark Parity -- Medium

@0xblacklight captured a recurring frustration: "every time an OSS model drops that has 'frontier performance' on benchmarks it's a massive disappointment." Kimi 2.5 was "hailed as OSS sonnet" but "it's not even close" in instruction-following quality. The gap between benchmark scores and real-world usability persists, eroding trust in each subsequent open-source release. @PromptSlinger added: "half the people with 'ai engineer' in their title haven't touched pytorch since 2023." (source)

Agent Security Defaults Are Dangerously Permissive -- Medium

@ambient_xyz warned (10 likes, 146 views): "Enterprise teams are now giving AI agents admin access to everything and calling it delegation." The fix is compartmentalization -- treating agents as role-scoped, not omniscient. @Claudiadev_wtf separately noted: "someone needs to provide these new 'coders' with proper security tools. Telling Claude or other AI coding tools does not secure infrastructure." @hrkrshnn concluded (14 likes, 475 views): "I no longer believe in coders or security people getting replaced in bulk by AI." (source, source)

India's Compute Dependency Goes Unaddressed -- Medium

@HarveenChadha expressed surprise that no Indian tech leaders are "obsessed about compute" while the US plays themes like GPU and memory shortage. Replies sharpened the point: @Sol_Survivr said India "has already accepted the fact that its people will be consumers of American models," and @200Tabs1Brain noted US export controls mean "an average US based AI startup can have more GPU compute than the country like India total combined." (source)


3. What People Wish Existed

Independent Audit Infrastructure for AI Agent Systems

@Dagnum_PI identified the structural gap: Paperclip's AI-agent company has append-only audit logs, but they live on user-controlled infrastructure. "When a regulator shows up after an agent swarm does something unexpected overnight, you hand them logs from your own infrastructure. That's not independent evidence." The EU AI Act and White House memo both require independent verification. The need: a tamper-evident, third-party attestation layer for agent actions that sits below the application layer. Urgency: High. (source)
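
A minimal sketch of what such a layer could look like, assuming a hash-chained log whose head is periodically attested by a party the operator does not control. The class and witness interface are hypothetical illustrations, not Paperclip's design or anything the EU AI Act prescribes:

```python
import hashlib
import json
import time

def _digest(entry: dict) -> str:
    # Canonical JSON so the same entry always hashes identically.
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

class HashChainedLog:
    """Append-only log where every entry commits to its predecessor's hash.

    Tampering with any past entry breaks every later hash, so the log is
    tamper-evident. It is still not independent: the operator can rewrite
    the whole chain. Independence comes from anchoring the head hash with
    a witness the operator does not control."""

    def __init__(self):
        self.entries, self.head = [], "genesis"

    def append(self, agent_id: str, action: str, detail: dict) -> str:
        entry = {"prev": self.head, "ts": time.time(),
                 "agent": agent_id, "action": action, "detail": detail}
        self.head = _digest(entry)
        self.entries.append(entry)
        return self.head

    def anchor(self, witness) -> None:
        # `witness` stands in for an independent attestation service or
        # transparency log run by a third party.
        witness.attest(self.head)

    def verify(self) -> bool:
        prev = "genesis"
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            prev = _digest(entry)
        return prev == self.head
```

Hash chaining alone only makes tampering detectable by someone who already holds an earlier head hash; the independence the post calls for comes entirely from who runs the witness.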

AI-Native Exception Detection for Healthcare Workflows

@dvasishtha outlined the gap: "Automation handles the happy path, but the problem is almost no patient stays on the happy path." The need: AI systems that detect when care pathways deviate -- missed lab results, late discharge summaries, incorrect referral acceptances -- and coordinate the right humans to respond. Current systems flatten everything into "pending" queues. The best agent systems would "identify the exception, understand which ones matter, coordinate the right set of humans, and trigger the next set of tasks." Urgency: High. (source)
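
A minimal sketch of the expectation-tracking pattern this implies, with illustrative event names and a toy severity rule rather than anything resembling a real clinical system:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Expectation:
    patient_id: str
    event: str        # e.g. "a1c_result", "discharge_summary" (illustrative)
    due: datetime
    severity: str     # "urgent" or "routine" -- toy triage rule
    fulfilled: bool = False

class ExceptionMonitor:
    """Tracks what the care plan says should happen and flags what didn't."""

    def __init__(self):
        self.expectations: list[Expectation] = []

    def expect(self, patient_id, event, due_in_days, severity="routine"):
        due = datetime.now() + timedelta(days=due_in_days)
        self.expectations.append(Expectation(patient_id, event, due, severity))

    def record_event(self, patient_id, event):
        # An observed event (lab result arrived, summary received) clears the
        # matching expectation instead of landing in a flat "pending" queue.
        for exp in self.expectations:
            if exp.patient_id == patient_id and exp.event == event:
                exp.fulfilled = True

    def exceptions(self) -> list[Expectation]:
        # Overdue, unfulfilled expectations, most severe first: the triage
        # list a human coordinator would actually work from.
        now = datetime.now()
        overdue = [e for e in self.expectations if not e.fulfilled and e.due < now]
        return sorted(overdue, key=lambda e: e.severity != "urgent")
```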

Compartmentalized Agent Permission Frameworks

@ambient_xyz argued that "agents will not become infrastructure until teams treat compartmentalization as a design requirement." The need: role-based permission frameworks specifically designed for AI agents, where each agent operates within defined boundaries rather than receiving blanket admin access. Current enterprise deployments lack standardized permission scoping for agents. Urgency: Medium. (source)
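
A minimal sketch of deny-by-default, role-scoped agent permissions of the kind the post describes; the role names and resource patterns are illustrative assumptions:

```python
from fnmatch import fnmatch

# Each role grants explicit (action, resource-pattern) pairs; anything not
# granted is denied -- the opposite of blanket admin access. Role and
# resource names are made up for illustration.
ROLES = {
    "billing-agent": [("read", "invoices/*"), ("write", "invoices/drafts/*")],
    "support-agent": [("read", "tickets/*"), ("write", "tickets/*/replies")],
}

def is_allowed(role: str, action: str, resource: str) -> bool:
    """Deny by default; allow only explicit, pattern-scoped grants."""
    return any(action == granted and fnmatch(resource, pattern)
               for granted, pattern in ROLES.get(role, []))

# A billing agent can draft invoices but cannot read support tickets:
assert is_allowed("billing-agent", "write", "invoices/drafts/apr-2026")
assert not is_allowed("billing-agent", "read", "tickets/8841")
```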

2D Printing Reinvented With AI

@anabology posed a question (24 likes, 1,226 views): "Where are the 2D printer startups? 3D printers now are absolutely seamless (like Bambu) but 2D printing is still a mess." The vision: "AI should talk to me via a printed newspaper daily. I should wake up to the newest arxiv preprints in print format." The underlying need: a consumer hardware product that bridges AI-curated content to physical media. Urgency: Low. (source)


4. Tools and Methods in Use

Each entry lists the tool, its category and overall sentiment, then strengths and limitations.

Laureum AI (agent/MCP scoring; sentiment mixed). Strengths: 6-axis scoring across accuracy, safety, reliability, process quality, latency, and schema; 28 MCP servers scored; the process quality metric (avg 55.5/100) is unique. Limitations: heavily promoted by crypto-adjacent accounts; limited independent validation.

ProofGrid (LLM reasoning benchmark; sentiment positive). Strengths: tests logical and equational proof reasoning, not just final answers. Limitations: early; coverage unclear.

HealthBench Professional (medical AI eval; sentiment positive). Strengths: physician-curated rubrics, specialty-specific grading, safety guardrails. Limitations: may break down on out-of-distribution patient demographics, per an early reviewer.

Paperclip (AI agent orchestration; sentiment positive). Strengths: full org chart, budgets, goals, and governance for all-agent companies; append-only audit log. Limitations: logs are self-hosted, not independently verifiable; no LLM-call-level visibility.

Microsoft Voice AI, open-sourced (voice synthesis; sentiment positive). Strengths: clones any voice from 10 seconds of audio; 90-minute generation; 50+ languages; runs locally. Limitations: previously watermarked for safety; safety controls removed in the open release.

Ascento Guard (security robotics; sentiment positive). Strengths: two-wheeled jumping robot; AI-powered patrol and threat detection; ETH Zurich origin. Limitations: narrow use case; early deployment.

FutureAGI (AI lifecycle platform; sentiment positive). Strengths: end-to-end simulation, evaluation, protection, monitoring, observability, gateway, and optimization. Limitations: just open-sourced; production adoption unclear.

Obscura (headless browser in Rust; sentiment positive). Strengths: 30MB memory vs Chrome's 200MB+; 85ms page load vs 500ms; per-session fingerprint randomization. Limitations: new; limited ecosystem integration.

The evaluation layer is today's most active tool category. Three independent evaluation tools launched or were highlighted (Laureum, ProofGrid, HealthBench), each approaching the problem from a different angle: agent quality scoring, reasoning verification, and domain-specific rubric grading.


5. What People Are Building

Each entry lists the project, who built it, what it does, the problem it solves, stack, stage, and link.

Laureum AI (@assisterr) -- 6-axis quality scoring for MCP servers and AI agents. Problem: no standardized quality gates for agent deployment. Stack: multi-judge LLM consensus plus adversarial probes. Stage: active (28 servers scored). (post)

ProofGrid (@s_batzoglou) -- benchmark testing LLM reasoning via logical and equational proofs. Problem: benchmarks test final answers, not reasoning paths. Stack: LLM evaluation framework. Stage: published. (post)

HealthBench Professional (OpenAI) -- medical evaluation benchmark for AI clinical assistants. Problem: med-AI evals fail on clinical reasoning style. Stack: physician-curated rubrics on Hugging Face. Stage: released. (post)

Paperclip (not attributed) -- all-agent company orchestration with org chart, budgets, and governance. Problem: no framework for running companies entirely with AI agents. Stack: AI agent orchestration plus an append-only audit log. Stage: active. (post)

FutureAGI (@mrnacknack) -- end-to-end AI lifecycle platform (simulation, eval, monitoring, gateway). Problem: AI lifecycle tools are fragmented. Stack: open source. Stage: just open-sourced. (post)

Origin (@orgn_official) -- AI coding tools for regulated industries. Problem: startups ship with AI while regulated industries cannot risk it, creating a two-tier workforce. Stack: not disclosed. Stage: pre-launch. (post)

StrikeRobot SR Platform (@StrikeRobot_ai) -- embodied AI training stack turning real-world data into robot policies. Problem: the gap between simulation training and real-world robot deployment. Stack: RL, sim-to-real, cloud training. Stage: active. (post)

Obscura (community, via quote) -- Rust-based headless browser for AI agents and scrapers. Problem: Chrome uses 200MB+ memory and has no built-in stealth. Stack: Rust, CDP-compatible. Stage: open source. (post)

The evaluation cluster (Laureum, ProofGrid, HealthBench) is the most notable pattern: three independent teams building evaluation tools for three different aspects of AI quality, all highlighted on the same day. Origin addresses an underserved segment -- regulated industries that need AI coding productivity but cannot accept the risk profile startups tolerate.


6. New and Notable

Defunct Startup Data Being Liquidated as AI Training Material

[++] @Forbes reported (8 likes, 7,739 views) that defunct startups are being liquidated for their Slack archives, Jira tickets, and email threads -- "operational exhaust that AI labs now treat as premium training data." This extends the training data sourcing controversy from April 24 (Meta employee surveillance) to a new frontier: the intellectual property of dead companies being repurposed without the knowledge or consent of the people who created it.

Microsoft Open-Sources Voice AI Tool With Safety Controls Removed

[++] @TimJayas reported (6 likes, 72 views) that Microsoft open-sourced their "most powerful voice AI that was once watermarked for safety controls" -- now capable of cloning any voice from 10 seconds of audio, generating 90 minutes of audio, supporting 50+ languages, and running locally. The removal of watermarks from a previously safety-controlled tool is a notable policy shift. @SeHozaifa: "Microsoft dropping this for free? Finally, no more ElevenLabs bills."

Grok Ranked as Riskiest AI Model in Safety Study

[+] @WizzyOnChain reported (4 likes, 115 views) that Grok was ranked as the riskiest AI model in a new study -- "most likely to validate user delusions and offer dangerous advice." xAI's chatbot scored worse than GPT, Claude, and Gemini on safety benchmarks.

MATS 9.0 AI Safety Research Symposium Videos Published

[+] @ryan_kidd44 announced (10 likes, 418 views) that MATS 9.0 Symposium videos are live, with MATS fellows presenting AI safety and security research. @austinc3301 noted (14 likes, 543 views) that applications for the Generator Residency (fully funded, 6K stipend, 3 months in Berkeley) close Monday -- "probably the best path into AI safety for undergrads right now."

AI Safety Funding Gap Between Stated and Revealed Preferences

[+] @BogdanIonutCir2 observed (3 likes, 37 views): "'We need more AI safety talent' is an expressed AI-safety-funder preference. 'We're only gonna spend 100M$ a year in all AI safety funding, and <10M$ in field-building' is a revealed preference." A concise framing of the structural underinvestment in AI safety relative to stated priorities.


7. Where the Opportunities Are

[+++] Independent agent audit and attestation infrastructure -- Paperclip demonstrates that all-agent companies are technically possible, but the audit gap is structural: self-hosted logs are not independent evidence. The EU AI Act and White House memo both require independent verification. The organization that builds tamper-evident, third-party attestation for agent actions -- at the LLM call level, not just the application layer -- captures a regulatory-mandated market. (source)

[+++] AI evaluation as a category -- Three independent evaluation tools highlighted on the same day (Laureum for agent quality, ProofGrid for reasoning verification, HealthBench for medical rubrics), building on April 24's AgentPressureBench. Process quality scoring at 55.5/100 average across MCP servers indicates the quality floor is low. The evaluation layer is fragmenting but demand is clear and growing. (source, source)

[++] AI coding tools for regulated industries -- @orgn_official identified a structural gap: startups ship faster with AI, regulated industries cannot risk it, and a two-tier workforce is emerging. This is a product gap, not an adoption gap. Tools that provide AI coding productivity with compliance, audit trails, and risk controls for regulated sectors (healthcare, finance, government) serve a segment that current AI coding tools ignore. (source)

[++] AI-native healthcare exception management -- @dvasishtha argued the real opportunity in healthcare AI is not happy-path automation but detecting and coordinating responses to the long tail of care pathway deviations. The 13:11 bookmark-to-like ratio signals practitioners treating this as actionable insight. (source)

[+] AI infrastructure investment across 11 hardware subsectors -- The compute constraint thesis is broadening from "buy GPU stocks" to a full supply chain: optics, networking, retimers, memory, power, packaging, cooling, custom silicon, photonics, substrates, and energy. @TheBronxViking mapped the taxonomy; Meta's Muse Spark was the catalyst. (source)


8. Takeaways

  1. The AGI definition debate crystallized into three competing frameworks: market cap (Huang), cognitive benchmarks (DeepMind), and emergent organizational intelligence (Qubic). The inability to agree on what AGI means undermines every benchmark, investment thesis, and policy proposal built on the assumption that we will know it when we see it. (source)

  2. The AI compute constraint moved from technical bottleneck to full investment thesis, with 11 distinct hardware subsectors repricing simultaneously. Infrastructure spending is no longer just about GPUs -- optics, networking, power, cooling, and substrates are all seeing capital inflows as the market recognizes the compute buildout is broader than chips alone. (source, source)

  3. Chinese open-source AI faced its first substantive practical criticism: benchmark parity does not equal real-world usability. DeepSeek-V4 and Kimi K2.6 cross-pollination continues, but users report instruction-following quality lags behind closed-source models despite comparable scores. The benchmark-to-deployment gap is the core unresolved question. (source, source)

  4. Three independent AI evaluation tools were highlighted on the same day -- Laureum (agent quality), ProofGrid (reasoning verification), and HealthBench (medical rubrics) -- signaling evaluation is consolidating into a category. Process quality at 55.5/100 average across MCP servers confirms the quality floor remains low. (source, source)

  5. Enterprise AI is splitting into software that gets disrupted and software that becomes the control layer. ServiceNow framed as governance infrastructure, not a disruption target -- but hyperscalers building their own orchestration stacks (Azure AI Foundry, Google Agentspace) threaten the independent control layer thesis. (source)

  6. Defunct startup data is being liquidated as premium AI training material -- Slack archives, Jira tickets, email threads repurposed without creator knowledge or consent. This extends April 24's Meta employee surveillance theme to a new frontier: the intellectual property of dead companies. (source)

  7. Healthcare AI's real opportunity is exception management, not automation. Almost no patient stays on the happy path, and at scale, local knowledge about edge cases collapses into dashboards and pending queues. The best agent systems will detect deviations, triage severity, and coordinate human responses. (source)

  8. Agent audit infrastructure has a structural gap: application-layer logs on user-controlled servers are not independent evidence. The EU AI Act and White House memo both require independent verification that no current agent platform provides. This is an architectural problem, not a policy problem. (source)