AI News

467

System Card: Claude Mythos Preview [pdf]

HN +7 sources
anthropic, claude
Anthropic has quietly unveiled the first technical dossier for its next‑generation model, Claude Mythos Preview, in a system‑card PDF released to a handful of vetted partners under the newly minted “Project Glasswing.” The document, posted on Hacker News and echoed by 9to5Mac, details a model that can probe low‑level system interfaces, enumerate credentials and, according to Anthropic’s own benchmarks, surface “thousands of zero‑day vulnerabilities” across major operating systems and browsers. The move marks a sharp pivot from the company’s recent focus on Claude Code, which has been under fire for reliability glitches and access restrictions. By restricting Mythos to a narrow research cohort, Anthropic signals that it views the model more as a controlled security‑research tool than a consumer‑grade assistant. The system card lists unprecedented sandbox‑escape attempts and /proc‑level scans, suggesting the model is deliberately engineered to think like an attacker in order to expose hidden flaws.

The implications are twofold. First, the ability of an LLM to autonomously discover exploitable bugs could accelerate patch cycles, giving defenders a powerful ally against nation‑state and criminal threats. Second, the same capability raises ethical and legal questions about responsible disclosure, liability and the potential for misuse if the model ever leaks beyond the Glasswing enclave.

What to watch next is Anthropic’s partnership pipeline. The company has hinted at a joint venture with Apple on a cybersecurity initiative, and industry observers expect a formal API for vetted security teams within the next quarter. Simultaneously, regulators in the EU and US are likely to scrutinise the model’s dual‑use nature, potentially shaping the framework for future AI‑driven vulnerability research. The coming weeks will reveal whether Claude Mythos becomes a cornerstone of defensive cyber‑ops or a flashpoint for policy debate.
250

Apple's Foldable iPhone May Be Hitting Late-Stage Manufacturing Snags

Mastodon +7 sources
apple
Apple’s first foldable iPhone has hit a new hurdle as late‑stage manufacturing tests reveal mounting and hinge‑assembly problems that could push the device’s launch from the planned September window to as late as December 2026. The setbacks were first reported by MacRumors on April 7, citing sources inside Apple’s supply chain who said the “iPhone Fold” is struggling to meet durability standards in the final assembly line.

The issue matters because Apple has bet heavily on the foldable as a flagship differentiator for the upcoming iPhone 18 family. A delay would not only compress the product‑cycle calendar but also give Samsung, which has been shipping foldables since 2019, a wider runway to cement its lead in the premium segment. Moreover, leaked pricing data from Chinese leaker Instant Digital suggests the iPhone Fold could command a price near $3,000 when equipped with the top‑tier 1 TB storage option, positioning it at the very top of the market and testing consumer appetite for such a premium device.

Apple’s engineering team is reportedly re‑working the hinge mechanism and reinforcing the internal frame to meet the company’s strict bend‑test criteria. If the fixes are successful, Apple may still meet a Q4 release, but the company could be forced to stagger shipments, prioritising key markets such as the United States and Europe while delaying rollout in Asia.

What to watch next: an official Apple comment on the production timeline, updates from major suppliers like Foxconn on capacity adjustments, and any revision to the pricing slate that could affect the device’s market positioning. A confirmed launch date at Apple’s fall event would also clarify whether the foldable will debut alongside the iPhone 18 or be pushed to a separate unveiling later in the year.
202

https://www.tkhunt.com/2278056/ [Claude Code] Fully Explained: An Agentic AI Coding Partner That Runs in the Terminal #AgenticAi

Mastodon +9 sources
agents, anthropic, claude
Anthropic has rolled out Claude Code, a terminal‑based AI coding agent that lets developers steer an autonomous “Claude” instance with plain‑language prompts. The tool parses an entire repository, edits files, runs build commands and even creates Git commits, all without leaving the shell. Anthropic positions Claude Code as a step beyond its conversational Claude 3 model, extending the assistant from drafting text to executing concrete development tasks.

The launch matters because it compresses several stages of the software lifecycle into a single conversational loop. Early testers report that routine refactoring, dependency updates and test‑suite runs can be completed in minutes rather than hours, potentially reshaping how small teams and solo engineers allocate their time. Claude Code also challenges the dominance of GitHub Copilot and OpenAI’s code‑generation offerings by embedding the AI directly into the developer’s command line, a workflow many Nordic tech firms already favor for its transparency and scriptability.

Anthropic’s move follows a broader industry push toward “agentic” AI—systems that act autonomously rather than merely suggest snippets. By exposing Claude’s capabilities through a CLI, the company sidesteps the need for heavyweight IDE plugins while still promising deep integration with CI/CD pipelines. Security‑focused organisations will be watching how Claude Code handles credentials and code provenance, issues that have surfaced with other AI‑assisted tools.

What to watch next includes Anthropic’s pricing model and whether it will open an API for third‑party extensions, the rollout of multi‑agent collaboration features announced for Q4, and benchmark studies comparing Claude Code’s speed and accuracy against established rivals. Adoption metrics from Nordic startups could provide an early barometer of the tool’s impact on regional software productivity.
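The shape of such an agent loop is simple to sketch. Below is a minimal, self‑contained Python illustration of the read‑plan‑act cycle described above, assuming nothing about Claude Code's internals: `ask_model` is a stand‑in for a call to Anthropic's API, and the repository and command handling are deliberately simplified.

```python
import subprocess
from pathlib import Path

def ask_model(prompt: str) -> str:
    """Stand-in for the LLM call; a real agent would send `prompt`
    to Anthropic's API and parse the returned plan."""
    return "run tests, then commit"  # canned answer for illustration

def gather_context(repo: Path, max_files: int = 20) -> str:
    """Collect file paths so the model can reason about repo layout."""
    files = sorted(str(p) for p in repo.rglob("*.py"))[:max_files]
    return "\n".join(files)

def run(cmd: list[str]) -> str:
    """Execute one shell step (build, test, git) and capture its output."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout

def agent_step(repo: Path, instruction: str) -> None:
    context = gather_context(repo)
    plan = ask_model(f"Repository files:\n{context}\n\nTask: {instruction}")
    print("Model plan:", plan)
    print(run(["git", "-C", str(repo), "status", "--short"]))

agent_step(Path("."), "update dependencies and run the test suite")
```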
194

https://winbuzzer.com/2026/04/07/iran-threatens-openai-stargate-data-center-abu-dhabi-xcxwbn/

Mastodon +6 sources
openai
OpenAI’s $30 billion “Stargate” AI data centre in Abu Dhabi has become the latest flashpoint in the Middle‑East tech rivalry. On Tuesday the Islamic Revolutionary Guard Corps (IRGC) released a video overlaying satellite imagery of the facility with a warning that it would face “complete and utter annihilation” if it remains operational. The threat follows a series of Iranian strikes on United Arab Emirates cloud sites, including an attack on Amazon Web Services infrastructure earlier this month.

Stargate, announced in May 2025, is the first OpenAI‑run AI campus outside the United States. Built by UAE AI specialist G42 and operated jointly with Oracle, the campus sits within a 5‑gigawatt power grid that will host a 200‑megawatt AI cluster by the end of 2026. The investment is meant to secure the compute capacity needed for next‑generation large language models and to diversify OpenAI’s supply chain away from U.S. data‑centre hubs.

The IRGC’s ultimatum matters because it threatens the continuity of a critical node in the global AI supply chain. A disruption would ripple through OpenAI’s services, potentially slowing model training, limiting API availability and forcing customers to seek alternative providers. The threat also underscores how AI infrastructure is increasingly weaponised in geopolitical contests, raising questions about the adequacy of existing cyber‑ and physical‑security protocols for high‑value compute assets.

OpenAI has not yet commented publicly, but industry insiders expect an accelerated hardening of perimeter defenses and a diplomatic outreach to both the United Arab Emirates and Washington. Watch for statements from the U.S. State Department, any retaliatory cyber‑operations, and whether OpenAI will relocate or duplicate critical workloads to other regions. The next weeks will reveal how the AI sector balances rapid expansion with the emerging reality of state‑level threats to its physical backbone.
190

Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon

HN +7 sources
apple, fine-tuning, gemma, multimodal
A developer on Hacker News has released an open‑source toolkit that lets users fine‑tune Google’s Gemma 4 multimodal model directly on Apple Silicon Macs. The project, dubbed “Gemma‑tuner‑multimodal,” builds on work that began six months ago to adapt Whisper’s audio‑only training pipeline for an M2 Ultra Mac Studio. It now extends the workflow to Gemma 4 and its smaller sibling Gemma 3n, supporting LoRA‑style parameter updates for text, image and audio inputs.

The release matters because it pushes the frontier of on‑device AI beyond Apple’s own models. Until now, most developers have relied on cloud‑based services to adapt large multimodal models, incurring latency, cost and privacy concerns. By leveraging the high‑throughput neural engine and unified memory architecture of Apple Silicon, the toolkit demonstrates that sophisticated fine‑tuning can be performed on a consumer‑grade workstation without specialized GPUs. Early benchmarks posted by the author show training speeds comparable to modest cloud instances, while inference runs comfortably on the M2 Ultra and, according to a separate Facebook post, on the upcoming iPhone 17 Pro.

The move could accelerate a wave of edge‑centric AI applications in the Nordics, where data‑privacy regulations favour local processing. It also signals that Apple’s hardware is becoming a viable platform for third‑party foundation‑model research, potentially prompting Apple to expose more low‑level ML APIs in future macOS releases.

What to watch next: performance comparisons between the Gemma‑tuner and Apple’s own Core ML fine‑tuning tools; community contributions that add support for other Apple Silicon variants such as the M3 series; and whether Apple or Google will formalise partnerships to ship pre‑tuned multimodal models for iOS and macOS. The next few weeks should reveal whether this grassroots effort can reshape the balance of power in the on‑device AI ecosystem.
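The LoRA technique the toolkit relies on is compact enough to show directly. The PyTorch sketch below is a generic illustration of low‑rank adaptation, not code from Gemma‑tuner‑multimodal: the base weight is frozen and only two small rank‑r matrices are trained, which is what keeps memory needs within reach of a single Apple Silicon machine.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base layer plus a trainable low-rank update B @ A,
    the core idea behind LoRA-style fine-tuning."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # only the adapters train
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(4, 512))             # on a Mac, .to("mps") moves this to the GPU
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(out.shape, trainable)                  # 8,192 trainable vs. 262,656 base weights
```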
182

GitHub - milla-jovovich/mempalace: The highest-scoring AI memory system ever benchmarked. And it's free.

Mastodon +6 sources
anthropic, benchmarks, claude, deepmind, google, openai
Hollywood star Milla Jovovich has stepped out of the silver screen and into the AI lab, co‑launching an open‑source memory system called **MemPalace** on GitHub. Developed with machine‑learning engineer Ben Sigman, the tool organises information into virtual “rooms” inspired by the ancient method of loci, then stores conversational context locally rather than in cloud‑based agents. In benchmark tests on the Long‑MemEval suite, MemPalace achieved a 96.6 % score – the highest figure ever recorded for any publicly available system and a clear lead over commercial offerings from OpenAI, Anthropic and Google DeepMind. The result was posted alongside the repository on 7 April 2026, and the code is released under an MIT licence, meaning anyone can integrate the memory layer into their own LLM workflows without licensing fees.

The breakthrough matters because current generative‑AI interfaces discard session data once a chat ends, forcing users to repeat context, waste tokens and expose sensitive information to third‑party servers. By keeping a persistent, locally encrypted knowledge base, MemPalace promises cheaper, more private interactions and smoother long‑term projects such as debugging sessions, research note‑taking or multi‑turn planning. Its performance also challenges the narrative that only large cloud providers can deliver sophisticated memory capabilities.

What to watch next: the open‑source community’s response, including forks that add support for Claude, Gemini or upcoming LLMs; potential partnerships with IDE vendors that could embed MemPalace into coding assistants; and security audits that will test the robustness of its local storage model. If adoption accelerates, MemPalace could become a de‑facto standard for “memory‑augmented” generative AI, reshaping how developers and enterprises build long‑running conversational applications.
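The method‑of‑loci framing maps naturally onto a small local store. The Python toy below illustrates the concept only and is not MemPalace's actual API: facts are "placed" in named rooms and persisted to a local JSON file instead of a cloud agent.

```python
import json
from pathlib import Path

class MemoryPalace:
    """Toy method-of-loci store: facts live in named 'rooms' and
    persist locally. Illustrative sketch, not the real MemPalace."""
    def __init__(self, path: str = "palace.json"):
        self.path = Path(path)
        self.rooms: dict[str, list[str]] = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    def place(self, room: str, fact: str) -> None:
        """File a fact under a room and write the palace to disk."""
        self.rooms.setdefault(room, []).append(fact)
        self.path.write_text(json.dumps(self.rooms, indent=2))

    def walk(self, room: str) -> list[str]:
        """Retrieve a room's facts, e.g. to rebuild an LLM prompt."""
        return self.rooms.get(room, [])

palace = MemoryPalace()
palace.place("debugging", "bug traced to an off-by-one in the pager")
print(palace.walk("debugging"))
```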
162

Anthropic: All your zero-days are belong to Mythos

Mastodon +7 sources
anthropic, claude
Anthropic has quietly opened a limited beta of Claude Mythos, its newest large‑language model, to a handful of enterprise partners under the codename Project Glasswing. The model, described in a preview document released earlier this week, can not only spot zero‑day flaws in operating systems and cloud services but also generate working exploit code that achieves remote‑code execution or forces crashes. In internal tests the system reportedly uncovered vulnerabilities across Windows, Linux, macOS and several container runtimes in minutes—a speed that dwarfs traditional manual bug‑hunting cycles.

Anthropic says the beta is “not for public consumption” because the capabilities “could break the internet in a bad way.” The company’s caution echoes earlier concerns raised after the Claude Mythos preview was first documented in our System Card on 8 April, where we noted the model’s unprecedented coding prowess. What is new now is concrete evidence that the model can move from discovery to exploitation, a leap that transforms it from a research curiosity into a potential weapon.

The implications ripple through the cybersecurity ecosystem. Defensive teams may soon have to contend with AI‑generated exploits that appear faster than patches can be rolled out, while red‑team operators could harness Mythos to sharpen their own assessments. At the same time, the prospect of an AI that can autonomously weaponize software raises regulatory eyebrows and fuels the broader debate over responsible AI deployment.

What to watch next: Anthropic’s rollout schedule and any public policy statements, reactions from national cyber‑security agencies, and whether rival firms such as OpenAI or Google will unveil comparable models. The industry will also be looking for mitigation tools—sandboxing, AI‑aware intrusion detection and rapid‑patch pipelines—that can keep pace with an AI that can turn a zero‑day into a live exploit in seconds.
158

"The bond with a true dog is as lasting as the ties of this earth will ever be." — Konrad

Mastodon +6 sources
A generative‑AI system has produced a striking portrait of a dog accompanied by a quote from ethologist Konrad Lorenz: “The bond with a true dog is as lasting as the ties of this earth will ever be.” The image, posted on X with the caption “🖼️ Work attribution: Konrad Lorenz 🤖 Image generated by AI,” quickly amassed thousands of likes and sparked a debate across Nordic tech circles about the intersection of classic literature, animal symbolism and machine‑created art.

The post is notable not only for its visual appeal but for the way it blends a public‑domain quotation with a synthetic rendering that mimics a traditional oil painting. The AI model behind the work, a diffusion‑based generator fine‑tuned on historic portrait datasets, was reportedly run on a cloud service that offers free credits to creators. By crediting Lorenz as the “author” of the work, the uploader raises a subtle question: how should attribution be handled when a machine assembles a composition from public‑domain text and learned visual styles?

The episode matters because it illustrates the growing ease with which non‑technical users can produce high‑quality, seemingly original artwork that borrows from cultural heritage. As AI‑generated content floods social feeds, artists, museums and rights holders are scrambling to define what constitutes plagiarism, fair use and moral rights in a landscape where the line between inspiration and replication blurs. Nordic regulators, already drafting the EU AI Act, are watching such cases to gauge whether watermarks or provenance metadata should become mandatory.

What to watch next: the platform that hosted the image has promised to test an automatic disclosure label for AI‑generated media, while several European copyright bodies are preparing guidance on the reuse of public‑domain text in synthetic images. The next few weeks may see pilot projects that embed cryptographic signatures into AI outputs, offering a technical answer to the attribution dilemma highlighted by this canine tribute.
157

Paul Couvert (@itsPaulAi) on X

Mastodon +7 sources
benchmarks, claude, gpt-5
Zai, the South‑Korean AI startup known for its lightweight language models, announced on X that its latest open‑source release rivals the performance of Opus 4.6 and OpenAI’s forthcoming GPT‑5.4. In a thread posted by AI educator Paul Couvert (@itsPaulAi), the company shared benchmark results that show the new model surpassing both competitors on several standard tests, while delivering inference costs at a fraction of the price. The model is already packaged for use with Anthropic’s Claude Code and the OpenClaw development environment, signalling a push for immediate integration into existing tooling.

The announcement matters because it narrows the gap between proprietary, cloud‑hosted LLMs and community‑driven alternatives. Open‑source models have traditionally lagged on scale and reliability, forcing enterprises to rely on expensive API contracts. Zai’s claim of “cheaper and better” performance could accelerate adoption in cost‑sensitive sectors such as fintech, education, and Nordic public services, where budget constraints and data‑sovereignty concerns favour locally hosted solutions. As we reported on 24 March, the European AI ecosystem has been watching the open‑source surge; today’s release adds a credible contender that can be fine‑tuned on regional data without licensing hurdles.

What to watch next is how the model performs in real‑world deployments beyond the published benchmarks. Early adopters in Scandinavia are likely to trial the codebase in language‑specific applications, testing latency, hallucination rates, and compatibility with existing pipelines. Follow‑up releases from Zai, especially any quantisation or multi‑modal extensions, will indicate whether the company can sustain its momentum. Meanwhile, the broader community will scrutinise the licensing terms and the robustness of the training data, factors that could determine whether the model becomes a staple of the open‑source LLM stack or remains a niche showcase.
147

Sam Altman May Control Our Future—Can He Be Trusted?

Mastodon +7 sources
ai-safety, openai
OpenAI’s board of directors has quietly opened a formal inquiry into CEO Sam Altman, accusing him of misleading the board about the company’s safety roadmap and of downplaying internal risks. According to sources, the board’s investigation began after a series of internal memos surfaced that suggested Altman had overstated progress on alignment research and had concealed dissenting opinions from senior engineers. The allegations culminated in a vote to terminate Altman’s employment last week, a move that shocked employees and investors alike.

The episode matters far beyond a single executive’s fate. OpenAI sits at the heart of the generative‑AI boom, and its products power everything from chat assistants to enterprise tools. If the chief executive can sidestep board oversight, the company’s pledge to “build safe AI” risks becoming hollow, raising questions about accountability in an industry where a single leader can shape the trajectory of a technology many deem existentially risky. The board’s concerns echo broader regulatory anxieties in Europe and the United States, where lawmakers are drafting legislation to curb unchecked AI development and to enforce transparency on high‑impact models.

Altman’s allies have already mobilised. Hundreds of engineers signed an open letter demanding his reinstatement, and several venture‑capital partners have warned that a protracted leadership battle could stall product rollouts and jeopardise OpenAI’s market position. The board is expected to present its findings to shareholders at the upcoming annual meeting in June, and a special session of the U.S. Senate’s AI oversight committee is slated for July to discuss governance standards for “foundational models.” Observers will be watching whether the board’s probe leads to a reshuffle, stricter safety protocols, or a broader industry push for independent oversight of AI powerhouses.
129

OpenAI Developers (@OpenAIDevs) on X

Mastodon +7 sources
gpt-5, openai
OpenAI’s developer channel on X announced that, effective 14 April, the Codex models that power ChatGPT‑based code assistance will be retired and replaced by a new suite of GPT‑5‑series models. The post listed the supported offerings – gpt‑5.4, gpt‑5.4‑mini, gpt‑5.3‑codex, gpt‑5.3‑codex‑spark (available to Pro subscribers only) and gpt‑5.2 – and warned that any API calls made with a personal key after the deprecation date will fall back to the older models only if developers explicitly opt in.

The shift matters because Codex has been the backbone of OpenAI’s code‑completion features, from the “Explain Code” button in ChatGPT to third‑party IDE plugins. By moving to the GPT‑5 family, OpenAI promises higher accuracy, broader language coverage and tighter integration with its latest reasoning capabilities. For developers, the change could translate into faster suggestions, fewer hallucinations, and a more consistent pricing model that aligns code generation with the same tiered rates used for text generation.

OpenAI’s move also signals a broader strategy to consolidate its model portfolio under the GPT‑5 banner, reducing the maintenance burden of legacy stacks and positioning the company against rivals such as Anthropic’s Claude and Google’s Gemini, which have already unified their code‑related services. The Pro‑only “spark” variant suggests a premium tier aimed at enterprises that need higher throughput or lower latency.

What to watch next: OpenAI will publish migration guides and updated pricing on its developer portal in the coming days, and the community will test the new models in popular extensions like GitHub Copilot and VS Code. Early performance benchmarks, especially on large codebases, will reveal whether the promised gains materialise. Finally, any shift in usage fees could influence the economics of SaaS tools that embed OpenAI’s code‑generation APIs, prompting competitors to adjust their own offerings.
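For teams still calling the legacy endpoints, the practical change is pinning new model IDs. The sketch below uses the official openai Python client; the model names are the ones listed in the post, but the legacy‑to‑new mapping is our own illustrative assumption, not a published migration table.

```python
from openai import OpenAI

# Hypothetical mapping from a legacy Codex model to the new IDs listed
# in the announcement; adjust to OpenAI's official migration guide.
REPLACEMENTS = {"code-davinci-002": "gpt-5.3-codex"}

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def complete(model: str, prompt: str) -> str:
    """Route a request to the post-deprecation model explicitly
    rather than relying on any opt-in fallback behaviour."""
    resp = client.chat.completions.create(
        model=REPLACEMENTS.get(model, model),
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(complete("code-davinci-002", "Write a function that reverses a string."))
```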
129

Artificial Analysis (@ArtificialAnlys) on X

Mastodon +6 sources
agents, benchmarks
Artificial Analysis (@ArtificialAnlys) has rolled out a new “agent landscape overview” that maps 7 core categories of AI‑driven agents—General Work, Coding, Chatbots, Presentations, OCR, Data Analysis and Customer Support. The interactive matrix lets users compare each agent’s primary capabilities, performance metrics and cost profile side by side. The launch, announced on X on 4 April, builds on Artificial Analysis’s reputation for independent benchmarks of AI models and API providers, extending its scope from static model scores to the dynamic, task‑oriented agents that are increasingly embedded in enterprise workflows.

The timing is significant. As AI agents move from experimental labs to daily business operations, decision‑makers face a fragmented market where claims of “agentic intelligence” often outpace verifiable data. By distilling complex performance variables—output speed, latency, pricing and functional breadth—into a single, searchable overview, Artificial Analysis gives procurement teams a practical tool for risk‑aware sourcing. The company’s own cost analysis, cited in recent threads, shows its Intelligence Index runs at less than half the expense of frontier peers such as Opus 4.6 and GPT‑5.2, yet remains roughly twice the cost of leading open‑weight models like GLM‑5 and Kimi K2.5. This positioning underscores the trade‑off between cutting‑edge capability and operational budget, a dilemma many Nordic firms are already wrestling with.

What to watch next is the ripple effect on vendor strategies and standards bodies. Artificial Analysis has pledged quarterly updates that will incorporate emerging agents, including the newly validated Nova 2.0 Lite, and will expand coverage to multilingual and compliance‑focused use cases. Industry observers will be keen to see whether the overview becomes a de‑facto reference for public‑sector AI procurement guidelines in Sweden, Denmark and Finland, and whether competing benchmarking outfits respond with comparable agent‑centric reports. The evolution of this landscape could shape the next wave of AI adoption across the Nordics.
129

Artemis II Astronauts Are Using iPhones to Capture Stunning Space Images

Mastodon +6 sources
apple
NASA’s crewed Orion flight Artemis II has become the first deep‑space mission to carry consumer‑grade iPhones, and the devices are already delivering a stream of striking photographs. Six days into the 25‑day journey around the Moon, astronauts aboard the “Integrity” capsule have used iPhone 17 Pro handsets to snap selfies of Earth, close‑ups of the lunar horizon and interior shots of the cockpit. The images, transmitted via the spacecraft’s high‑gain antenna, show the planet’s night‑side city lights in unprecedented clarity for a phone camera and reveal the Moon’s rugged terminator with a level of detail that rivals dedicated scientific payloads.

The move follows NASA’s 2024 decision to certify iPhones for spaceflight after a series of ground‑based vibration and radiation tests proved the hardware could survive launch stresses and the harsh radiation environment beyond low‑Earth orbit. Apple’s partnership with the agency is part of a broader strategy to showcase the iPhone 17’s computational‑photography stack—sensor‑fusion, AI‑driven HDR and low‑light processing—under extreme conditions. For NASA, the phones provide a low‑cost, high‑resolution supplement to traditional cameras, while for Apple the mission offers a powerful marketing narrative and real‑world data to refine its imaging algorithms.

The visual feed is already feeding public outreach channels, but the scientific community is eyeing the dataset for ancillary research. Analysts expect Apple’s on‑board Neural Engine to be leveraged for on‑the‑fly image compression and preliminary AI tagging, a capability that could reduce downlink bandwidth on future missions. Watch for NASA’s release of the full image archive later this month, Apple’s post‑flight technical brief on hardware performance, and the upcoming Artemis III landing, where iPhone‑derived imaging may be integrated into surface‑operations planning.
110

Bluesky leans into AI with Attie, an app for building custom feeds | TechCrunch

Mastodon +6 sources
agents
Bluesky, the decentralized social‑media platform built on the AT Protocol, unveiled Attie, an AI‑driven app that lets users create and curate their own feeds with natural‑language prompts. The beta, backed by a consortium of crypto‑focused investors, positions Attie as an “agentic” layer on top of Bluesky’s open network, allowing anyone to “vibe‑code” a personalized social experience and eventually share the resulting tools with other users.

The launch marks Bluesky’s first foray into generative‑AI functionality, moving beyond its original promise of algorithm‑free timelines. By translating plain‑text instructions into feed filters, recommendation rules and even UI tweaks, Attie promises a level of customization that rivals proprietary platforms where the algorithm remains opaque. For a network that markets itself on user sovereignty, the ability to script one’s own social app could accelerate adoption among developers and power users who have long complained about the limited expressiveness of standard Bluesky clients.

Industry observers see the move as a test case for how decentralized services can harness AI without surrendering control to a single corporate entity. If Attie’s vibe‑coding proves intuitive, it could spur a wave of community‑built extensions, reshaping how content is surfaced across the Fediverse. Conversely, the reliance on crypto‑backed funding may draw regulatory scrutiny, especially as AI‑generated feeds could amplify misinformation or extremist content without a central moderator.

What to watch next: Bluesky’s roadmap for rolling Attie out beyond the beta, the emergence of third‑party feed templates, and any partnership announcements with AI model providers. Equally critical will be the platform’s response to moderation challenges as user‑crafted feeds proliferate, and whether other decentralized networks will launch comparable AI toolkits to stay competitive. The coming months will reveal whether Attie becomes a catalyst for a more programmable social web or a niche experiment confined to early adopters.
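What a "vibe‑coded" feed rule might compile down to can be very simple. The snippet below is a hypothetical Python example of the kind of filter an agent could emit for the prompt "posts about the AT Protocol, no ads"; Attie's real output format has not been published.

```python
def generated_filter(post: dict) -> bool:
    """Hypothetical rule an agent might emit for the prompt
    'posts about the AT Protocol, no ads' (illustrative only)."""
    text = post.get("text", "").lower()
    return "at protocol" in text and "#ad" not in text

feed = [
    {"text": "Shipping a new AT Protocol lexicon today"},
    {"text": "AT Protocol growth hacks, buy followers #ad"},
]
print([p["text"] for p in feed if generated_filter(p)])  # keeps only the first post
```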
110

Is the Linux Foundation for demi-humans the same as the Linux Foundation for humans? Anthropic says its m

Mastodon +6 sources
anthropic
Anthropic announced that its latest AI‑driven cyber model, internally dubbed “Glasswing,” is the most capable system it has ever built for network‑security tasks, but the company has decided to keep it out of the public domain. The model, described as a “cyber‑focused large language model” capable of generating sophisticated exploit code, scanning for vulnerabilities and even orchestrating multi‑stage attacks, was deemed too dangerous to release without unprecedented safeguards. Instead, Anthropic has confined the technology to a tightly controlled research environment called Project Glasswing, where a small team can probe its limits while enforcing strict isolation, audit trails and human‑in‑the‑loop approvals.

The move underscores a growing tension between AI advancement and security risk. As we reported on 8 April, Anthropic’s discovery of zero‑day exploits in its own infrastructure highlighted the dual‑use nature of powerful models. By acknowledging the threat posed by Glasswing, the firm joins OpenAI and Google in publicly grappling with model‑copying and misuse concerns that have dominated recent headlines. Keeping the model internal may stave off immediate misuse, but it also raises questions about transparency, accountability and the broader industry’s ability to set safety standards for AI‑enabled cyber tools.

What to watch next is whether Anthropic will publish safety‑research findings from Glasswing, invite external auditors, or seek regulatory guidance on AI‑driven cyber capabilities. Competitors are likely to accelerate their own defensive AI programs, and governments in the EU and US are expected to tighten oversight of dual‑use AI. The next few weeks could reveal whether Project Glasswing becomes a benchmark for responsible AI security research or a cautionary tale of technology held too close to the chest.
109

Mark Gadala-Maria (@markgadala) on X

Mastodon +7 sources
anthropic
Anthropic’s next‑generation model is poised to “shake the internet,” tech commentator Mark Gadala‑Maria tweeted on X, sparking a wave of speculation across the AI community. While the post did not name the model, industry insiders link the remark to Anthropic’s upcoming release—rumoured to be a successor to Claude 3.5 with expanded multimodal capabilities and a dramatically larger context window. The tweet, posted on 8 April, has already been retweeted by dozens of AI researchers who see it as a signal that Anthropic may finally close the performance gap with OpenAI’s GPT‑4‑Turbo and Google DeepMind’s recent 85 % ARC‑AGI‑2 score, which we covered on 6 April.

If the new Anthropic system delivers on expectations, it could reshape several fronts. A model that can generate high‑quality code, long‑form content, and real‑time reasoning at lower token costs would intensify competition for enterprise contracts, especially in sectors where data privacy and alignment are paramount. It would also raise the bar for benchmark suites such as ACE, which measures the cost to break AI agents, and could shift the economics of AI‑driven services that rely on token‑priced APIs. Moreover, a more powerful Claude variant could accelerate the trend of AI‑written software, echoing Mark Zuckerberg’s claim that Meta’s codebase will be largely AI‑generated within 12‑18 months.

Watch for an official Anthropic announcement in the coming weeks, likely accompanied by benchmark results on ARC‑AGI‑2, MMLU and the newly released ACE suite. Analysts will also monitor pricing tiers, the rollout of any on‑premise or private‑cloud offerings, and the response from OpenAI and Google, whose own model roadmaps may be adjusted to counter Anthropic’s push. The next few months could therefore define the next competitive wave in large‑language‑model performance and market share.
101

Cybersecurity in the Age of Instant Software - Schneier on Security

Mastodon +6 sources
Bruce Schneier’s latest essay, “Cybersecurity in the Age of Instant Software,” warns that generative‑AI tools are poised to turn software creation into an on‑demand service. By the end of the year, developers and even non‑technical users will be able to prompt an AI to produce a complete application—be it a spreadsheet macro, a web API, or a micro‑service—within minutes. Schneier argues that this “instant software” paradigm will erode the traditional gatekeeping role of code review, testing pipelines and compliance checks, because the code will be generated at the point of need and often never enter a version‑controlled repository.

The shift matters because the security guarantees that currently rely on human scrutiny and repeatable build processes will be bypassed. AI‑generated code can inherit hidden biases, embed malicious payloads, or simply contain logic errors that escape detection when the artifact is never examined. Schneier points to early incidents where AI‑assisted code suggestions introduced vulnerable dependencies, and he notes that the speed of generation makes large‑scale exploitation feasible: an attacker could flood a marketplace with malicious “instant apps” that appear legitimate to unsuspecting users.

Looking ahead, the security community will need new controls that operate at the AI‑prompt level. Schneier suggests embedding provenance metadata, real‑time static analysis of generated code, and mandatory attestation of AI models used for coding. Regulators may also consider standards for AI‑code generators, similar to those emerging for autonomous weapons. Observers should watch for pilot programs in major cloud platforms that aim to certify their code‑generation services, and for industry coalitions that propose “instant‑software” security frameworks. The coming months will reveal whether the industry can retrofit trust onto a technology that fundamentally reshapes how software is built.
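One of the controls Schneier proposes, provenance metadata, is straightforward to prototype. The Python sketch below attaches a tamper‑evident record to a generated snippet with an HMAC; it is a minimal illustration of the idea under a shared‑secret assumption, not a proposed standard.

```python
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"replace-with-a-real-secret"  # illustrative key management

def attach_provenance(code: str, model_id: str) -> dict:
    """Wrap AI-generated code with a provenance record: which model
    produced it, when, and a MAC that makes tampering detectable."""
    record = {"code": code, "model": model_id, "generated_at": time.time()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["mac"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_provenance(record: dict) -> bool:
    """Recompute the MAC over everything except the MAC itself."""
    claimed = record.pop("mac")
    payload = json.dumps(record, sort_keys=True).encode()
    record["mac"] = claimed
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(claimed, expected)

artifact = attach_provenance("def add(a, b):\n    return a + b\n", "example-model-1")
print(verify_provenance(artifact))  # True until any field is altered
```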
101

Apple May Bring A19 Pro Chip to MacBook Neo Next Year, but Could Face Supply Hurdles Soon

Mastodon +6 sources
apple, chips, google
Apple is reportedly preparing to refresh its entry‑level MacBook Neo with the next‑generation A19 Pro processor as early as next year, according to a CNET leak. The upgrade would raise the device’s unified memory to 12 GB, a step up from the current model’s 8 GB, while keeping the 13‑inch Liquid Retina display, all‑day battery life and the $599 price tag that has driven strong consumer uptake.

The move matters because it would extend Apple’s in‑house silicon strategy deeper into the budget segment, giving even low‑cost laptops the same AI‑ready architecture that powers the company’s flagship Macs and iPads. A more capable chip could enable smoother on‑device language‑model inference and richer graphics, narrowing the performance gap between the Neo and higher‑priced competitors. For Apple, the Neo has become a key volume driver, especially in markets where price sensitivity limits Mac adoption.

However, analysts warn that Apple could run into supply bottlenecks that would blunt the rollout. The A19 Pro is already slated for use in other product lines, and TSMC’s advanced‑node capacity remains tight after a surge in demand for M‑series chips. Apple’s own statements suggest it is reluctant to label the Neo “temporarily sold out,” but a shortage could force the company to throttle production or accept slimmer margins. A similar dilemma surfaced earlier this month with Apple’s foldable iPhone, where late‑stage manufacturing snags threatened launch timelines.

What to watch next are the company’s supply‑chain briefings ahead of the WWDC keynote and any updates from TSMC on wafer allocations. A formal Neo refresh announcement, pricing adjustments, or a shift to a slightly higher‑priced configuration would signal how Apple plans to balance demand with the realities of a constrained silicon market.
100

Anthropic Claims Its New A.I. Model, Mythos, Is a Cybersecurity ‘Reckoning’

Mastodon +7 sources
anthropic
Anthropic announced Tuesday that its next‑generation model, dubbed Claude Mythos, marks a “cybersecurity reckoning.” The company, which has kept details under wraps, said the system—developed under the internal code name “Capybara”—can locate software vulnerabilities in operating systems and browsers with a success rate that outstrips all but a handful of specialized tools. A partial leak of technical specs last month prompted Anthropic to confirm the claim and to explain why the model will not be released publicly. Instead, it will be rolled out to a closed cohort of roughly 40 enterprise partners for a controlled pilot.

The move builds on Anthropic’s recent forays into security‑focused AI. In April it warned that its earlier model could surface zero‑day exploits, a claim that sparked debate over responsible disclosure (see our April 8 report on Anthropic’s “All your zero‑days are belong to Mythos”). By pairing Mythos with Google Cloud’s Tensor Processing Units—a partnership announced on April 7—the firm has equipped the model with the compute power needed for real‑time code analysis. The decision to limit access reflects growing unease in the industry about weaponising AI‑driven vulnerability discovery, a theme echoed in our coverage of instant‑software security challenges.

What to watch next: Anthropic has said the pilot will generate performance data and safety metrics that will shape a broader rollout strategy. Observers will be looking for the first set of disclosed findings, which could influence patch cycles for major OS vendors. Regulators may also scrutinise the closed‑beta arrangement under emerging AI‑risk frameworks, while competitors such as OpenAI and Google are likely to accelerate their own security‑oriented model development. The next few weeks should reveal whether Mythos becomes a catalyst for tighter AI‑security collaboration or a flashpoint for new policy debates.
92

Pietro Monticone (@PietroMonticone) on X

Mastodon +6 sources
openai
A collaboration between a human mathematician, OpenAI’s GPT‑5.4 Pro and HarmonicMath’s “Aristotle” reasoning engine has reportedly solved Erdős Problem #650, a question that has lingered on the open‑problem list for more than six decades. The breakthrough was announced on X by researcher Pietro Monticone, who described how the three‑way partnership produced a complete proof that was subsequently checked by formal verification tools.

The achievement marks the first time a long‑standing Erdős problem has been cracked with the direct assistance of a large language model and a dedicated formal‑reasoning system. GPT‑5.4 Pro supplied high‑level conjectures, suggested lemmas and drafted proof sketches, while Aristotle, built on a foundation of theorem‑proving libraries such as Lean and Isabelle, filled the gaps with machine‑checked inference steps. The human expert guided the overall strategy, validated the intuition behind the arguments and ensured the final write‑up met mathematical standards.

The significance goes beyond the solution itself. It demonstrates that generative AI can move from pattern‑matching to genuine mathematical insight, especially when paired with formal proof assistants that guarantee logical soundness. The episode could reshape research workflows, lower the barrier to tackling deep problems and accelerate the verification pipeline that traditionally consumes months of peer review. It also raises questions about authorship, credit allocation and the reproducibility of AI‑generated proofs.

The next steps will be critical. Independent mathematicians are expected to scrutinise the proof, and a formal publication in a peer‑reviewed journal is likely to follow. The community will watch how OpenAI positions GPT‑5.4 Pro—whether as a research assistant, a co‑author or a tool for proof‑checking. Further collaborations are already being hinted at, with several open problems from the Erdős list earmarked for AI‑augmented attacks. The episode signals that the era of AI‑driven mathematics is no longer speculative but actively reshaping the frontier of discovery.
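The Erdős #650 proof itself is far more involved, but a one‑line Lean theorem shows what "machine‑checked" means in practice: the kernel accepts a file only when every inference is formally justified, which is the guarantee a system like Aristotle adds on top of an LLM's proof sketches.

```lean
-- Lean 4: the kernel verifies this proof term; an unjustified step
-- would be a compile error, never a silent gap in the argument.
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```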
