AI News

150

Understanding Transformers Part 4: Introduction to Self-Attention

Understanding Transformers Part 4: Introduction to Self-Attention
Dev.to +5 sources dev.to
embeddings
Rijul Rajesh’s “Understanding Transformers Part 4: Introduction to Self‑Attention” went live on 9 April, extending his popular series that demystifies the architecture behind today’s large language models. The new post picks up from Part 3, where Rajesh explained how word embeddings and positional encodings fuse meaning with order, and dives into the self‑attention mechanism that lets a transformer weigh every token against every other token in a single pass. The article breaks down the mathematics of query, key and value vectors, illustrates multi‑head attention with code snippets, and shows how the operation scales from a handful of tokens to the billions processed by commercial LLMs. By translating abstract tensor operations into concrete examples, Rajesh gives developers a practical foothold for building or fine‑tuning their own models—an especially valuable resource for the Nordic AI community, where startups and research labs are rapidly adopting transformer‑based solutions for everything from multilingual chatbots to climate‑data analysis. Why it matters is twofold. First, self‑attention is the engine that powers the contextual understanding and generation capabilities that have made generative AI mainstream; grasping it is now a prerequisite for any serious AI practitioner. Second, the piece arrives amid a wave of educational content aimed at closing the skills gap that has slowed adoption of cutting‑edge models in smaller European markets. Rajesh’s clear, code‑first approach complements recent technical deep‑dives we covered, such as the “Self‑Attention Mechanism” article on 8 April, and helps translate theory into production‑ready insight. Looking ahead, Rajesh has signalled that Part 5 will tackle the feed‑forward network and layer‑norm components that complete the transformer block, while the broader community watches for emerging variations—sparse attention, linear‑complexity alternatives, and hardware‑aware optimisations—that could reshape efficiency benchmarks. Keeping an eye on those developments will be essential for anyone aiming to stay competitive in the fast‑evolving AI landscape.
127

OpenAI shelves landmark £31bn UK investment package

OpenAI shelves landmark £31bn UK investment package
Mastodon +6 sources mastodon
copyrightopenai
OpenAI has pulled the plug on its £31 billion “Stargate UK” programme, halting plans to build a massive AI‑compute hub at Cobalt in Northumberland. The company cited soaring energy costs and an increasingly uncertain regulatory environment as the decisive factors behind the retreat. The move ends a high‑profile UK‑US partnership that was meant to “mainline AI” into the British economy, create thousands of high‑skill jobs and cement the UK’s position as a European AI hub. The investment would have been the largest single foreign AI commitment in the country’s history, complementing OpenAI’s $500 billion US “Stargate” rollout. Its cancellation not only deprives the North East of a potential economic catalyst but also signals that the UK’s current policy and energy framework may be out of step with the capital‑intensive demands of frontier AI models. As we reported on 9 April, OpenAI also paused a separate data‑centre deal and shifted to usage‑based pricing for its Codex API, underscoring a broader recalibration of its European strategy. The latest withdrawal amplifies concerns that the UK could lose ground to rivals such as Europe’s DeepMind and the United States, where more predictable regulatory pathways and cheaper power are already attracting large‑scale AI infrastructure projects. What to watch next: the UK government’s response, including whether it will offer targeted subsidies, fast‑track AI licences or renegotiate the deal’s terms. Industry observers will also monitor whether other AI firms step in to fill the void, and how the episode influences forthcoming UK AI legislation, which could reshape the balance between innovation incentives and public‑interest safeguards. The outcome will shape the trajectory of the UK’s AI ecosystem for years to come.
69

Reverse engineering Gemini's SynthID detection

Reverse engineering Gemini's SynthID detection
HN +6 sources hn
geminigooglemeta
Google’s Gemini model has long relied on SynthID, an invisible watermark that tags AI‑generated text and images so they can be identified by the company’s SynthIDDetector tool unveiled at Google I/O 2025. A team of independent researchers announced they have successfully reverse‑engineered the detection mechanism, exposing the statistical patterns and token‑level cues that the detector uses to flag synthetic content. The breakthrough came after the researchers harvested a large corpus of Gemini outputs, applied the public‑facing detector, and then performed a differential analysis to isolate the watermark’s signature. Their paper, posted on a pre‑print server, details a set of heuristics that can both confirm the presence of SynthID and, crucially, suggest ways to strip or mask the watermark without degrading output quality. The authors stress that their work is intended to audit the robustness of watermarking rather than to facilitate malicious misuse. Why it matters is twofold. First, the discovery undermines Google’s claim that SynthID offers a tamper‑proof provenance signal for AI‑generated media, a cornerstone of the tech giant’s strategy to combat misinformation and to meet emerging regulatory expectations for traceability. Second, the reverse engineering fuels an emerging arms race: if watermarking can be neutralised, platforms, advertisers and policymakers may need to rely on alternative provenance methods, such as cryptographic signatures or third‑party verification services. What to watch next includes Google’s likely response—whether it will harden SynthID, roll out a new version, or shift toward a different provenance framework. Industry observers will also monitor how other AI developers, from Meta to Anthropic, adjust their own watermarking schemes in light of the findings. Finally, regulators in the EU and US may cite the episode when drafting standards for AI‑generated content disclosure, potentially accelerating the push for more resilient, auditable provenance solutions.
67

Fine-Tuning Gemma 3 with Cloud Run Jobs: Serverless GPUs (NVIDIA RTX 6000 Pro) for pet breed classification 🐈🐕

Fine-Tuning Gemma 3 with Cloud Run Jobs: Serverless GPUs (NVIDIA RTX 6000 Pro) for pet breed classification 🐈🐕
Dev.to +6 sources dev.to
fine-tuninggemmagooglenvidia
Google Cloud has rolled out server‑less GPU support on Cloud Run Jobs, letting developers fine‑tune large language models without provisioning dedicated instances. The first public showcase uses the new NVIDIA RTX 6000 Pro (Blackwell) cards to adapt the 27‑billion‑parameter Gemma 3 model for a pet‑breed classification task, turning a generic LLM into a specialist image‑and‑text recogniser for cats and dogs. The workflow, posted by a community engineer, spins up a Cloud Run job that automatically provisions an RTX 6000 Pro, pulls the Gemma 3 weights, and runs a QLoRA‑style fine‑tuning loop on a curated dataset of pet images and breed labels. Pay‑per‑second billing, instant scaling to zero and a 19‑second cold‑start for the 4‑billion‑parameter variant mean the entire experiment costs only a few dollars and can be reproduced on demand. No quota request is required for the L4‑class GPUs that power the service, lowering the barrier for small teams and hobbyists. Why it matters is twofold. First, it democratizes access to high‑end GPU resources, a long‑standing bottleneck for Nordic startups and research groups that lack on‑premise clusters. Second, it signals Google’s push to position Cloud Run as a viable alternative to Vertex AI for custom model work, directly competing with AWS SageMaker Serverless and Azure ML’s managed compute. By coupling open‑source Gemma models—first highlighted in our April 9 coverage of Gemma 4—with truly server‑less hardware, Google is closing the gap between model availability and practical, low‑cost deployment. Looking ahead, the community will likely test the same pipeline on the newer Gemma 4 family and on larger GPU types as they become server‑less. Watch for benchmark releases comparing cost and latency against traditional VM‑based fine‑tuning, and for tighter integration with tools such as Unsloth and Hugging Face’s TRL, which could further accelerate niche AI applications across the Nordics.
61

Make ‘em dumb, sell ‘em smarts

Make ‘em dumb, sell ‘em smarts
Mastodon +8 sources mastodon
OpenAI’s chief executive Sam Altman unveiled a bold new revenue model at the company’s “AI Utility” briefing on April 9, positioning generative intelligence as a public‑service commodity that will be billed much like water or electricity. The plan, dubbed “Intelligence‑as‑a‑Utility,” will charge users for the amount of “cognitive bandwidth” their queries consume, measured in a new unit called “smart‑tokens.” While OpenAI already monetises ChatGPT through subscription tiers and API usage, the utility model shifts the focus from flat‑rate access to a pay‑per‑intelligence framework, effectively turning every answer, suggestion or code snippet into a metered service. Altman argues that the model reflects the growing reality that AI assistants are off‑loading memory and reasoning tasks from human brains. Recent studies from universities in Scandinavia and the United States show that frequent reliance on conversational agents can impair information retention and critical‑thinking skills, a trend Altman acknowledges in his remarks. By pricing “smartness” directly, OpenAI hopes to recoup the massive compute costs of training ever‑larger models while incentivising more efficient prompting. The announcement matters because it could reshape how individuals, enterprises and governments budget for AI. A utility‑style fee structure may widen the gap between tech‑savvy users who can optimise token consumption and those who cannot, raising equity concerns that echo the EU’s AI Act and Nordic proposals for universal AI access. It also signals a strategic pivot: rather than competing solely on model capability, OpenAI is betting on control of the consumption layer. Watch for the rollout schedule, which Altman said will begin with a beta for enterprise customers in June, and for reactions from regulators and rivals such as Google Gemini and Anthropic, who may launch counter‑offers or lobby for stricter pricing transparency. The next few months will reveal whether “intelligence as a utility” becomes a new industry standard or a flashpoint for policy debate.
60

Claude Mythos: The Future of Autonomous Exploits This one is different. Anthropic didn’t just b

Mastodon +6 sources mastodon
anthropicautonomousclaude
Anthropic announced the existence of Claude Mythos, a preview‑stage AI model capable of autonomously discovering zero‑day vulnerabilities across major operating systems and browsers. The company said the system works, but it will not be released to the public because it has crossed a safety threshold that Anthropic believes the industry is not yet prepared to handle. The reveal marks a stark departure from Anthropic’s recent rollout strategy, which has focused on incremental upgrades such as Claude Opus 4.6 and managed‑agent frameworks. Mythos is described as a “frontier” model that can scan code, network configurations and runtime environments without human prompting, generating exploit chains that would traditionally require weeks of specialist effort. In a leaked internal memo, engineers warned that the model’s success rate on novel vulnerabilities exceeds 70 percent, a figure that dwarfs the 10 percent edge reported for experienced Claude users in our April 9 coverage of managed agents. Why it matters is twofold. First, the capability to automate exploit discovery could compress the vulnerability lifecycle, giving attackers a powerful new weapon and forcing defenders to rethink patching cadences. Second, Anthropic’s decision to withhold the model signals a growing recognition that AI progress is outpacing governance frameworks, echoing concerns raised in the Atlantic’s recent analysis of “Claude Mythos is everyone’s problem.” The simultaneous launch of Project Glasswing—a defensive coalition that includes AWS, Apple, Cisco, Google and others—suggests the industry is mobilising a coordinated response before the technology ever sees commercial use. What to watch next are the concrete steps Project Glasswing will take to harden software supply chains and whether regulators will intervene to set boundaries on autonomous exploit‑generation tools. Anthropic’s next public statement, likely to outline a roadmap for controlled external testing, will be a key barometer of how quickly the AI‑driven cyber‑arms race escalates.
54

80% of RAG Failures Start Here (And It's Not the LLM)

Dev.to +6 sources dev.to
geminigooglerag
A three‑week deep‑dive by a Nordic fintech team has pinpointed the source of most hallucinations in retrieval‑augmented generation (RAG) pipelines: the retrieval layer, not the large language model (LLM) itself. The engineers began by swapping prompts, tweaking temperature settings and even swapping the underlying LLM, but the spurious answers persisted. Only after instrumenting the vector store, query‑expansion logic and document‑ranking module did they discover that 80 % of the faulty outputs were generated before the LLM ever saw a prompt. The finding echoes a February field guide that warned “70 % of RAG failures happen before the LLM is called,” and it validates the claim we made on 8 April that “retrieval is the real model” in a RAG architecture. IDC research cited in a March Medium post estimates that only one in ten home‑grown AI projects survive past proof‑of‑concept, with a senior GenAI lead at PIMCO confirming that the same 80 % failure rate applies to enterprise RAG deployments. The root causes identified by the fintech team include poorly tuned chunk sizes, stale embeddings, inadequate metadata filtering and ranking algorithms that surface irrelevant passages, all of which feed the LLM with misleading context. Why it matters is twofold. First, enterprises are pouring billions into RAG‑enabled products that promise up‑to‑date, source‑grounded answers; systematic retrieval errors undermine trust and inflate operational costs. Second, the problem is not a one‑off bug but a structural engineering gap that can amplify other risks, such as the poisoned‑web‑page attacks we covered on 9 April. What to watch next are the emerging observability tools that expose retrieval latency, relevance scores and provenance at query time, and the next wave of cloud‑provider updates—Azure Cognitive Search’s “retrieval diagnostics” preview and AWS Kendra’s “ground‑truth feedback” feature are slated for release later this quarter. Industry bodies in the EU are also drafting guidelines on data quality for AI, which could make rigorous retrieval testing a compliance requirement. The fintech team plans to publish a detailed post‑mortem, and their methodology may become a de‑facto checklist for any organization scaling RAG beyond the lab.
52

OpenAI Pauses Stargate UK Data Center Citing Energy Costs

Bloomberg +13 sources 2026-03-25 news
openai
OpenAI announced today that it is pausing the rollout of its “Stargate” artificial‑intelligence infrastructure project in the United Kingdom, citing soaring energy costs and an increasingly complex regulatory landscape. The decision halts construction of the high‑performance data centre that was slated to house the company’s next‑generation GPU clusters and to serve as a hub for European customers. The move builds on the warning issued on 9 April, when OpenAI first put its UK data‑centre deal on hold over similar concerns. At the time, the company had already signalled that the £31 billion investment package it had pledged to the UK government could be jeopardised. By pausing Stargate, OpenAI is effectively scaling back its European compute ambitions until energy pricing stabilises and clearer guidance on AI‑related regulations emerges. The pause matters for several reasons. The UK has positioned itself as a potential AI super‑power, banking on OpenAI’s presence to attract talent, spur local supply chains and justify public subsidies for renewable power. A delayed data centre threatens to slow the rollout of advanced AI services for British businesses and could dent confidence among other tech firms considering a European foothold. Moreover, the decision underscores how volatile energy markets are reshaping the economics of large‑scale AI training, a factor that may force other cloud providers to reassess similar projects. What to watch next are the negotiations between OpenAI and the UK Department for Business and Trade over revised terms, and whether the company will relocate the Stargate build‑out to a lower‑cost jurisdiction. Analysts will also monitor the UK government’s response—potentially new incentives for green power or streamlined AI regulations—and the impact on the broader European AI infrastructure race. The next few weeks could determine whether the UK remains on the fast‑track to becoming an AI hub or watches the opportunity drift elsewhere.
36

OpenAI Limits Release of New Models Due to Cybersecurity Concerns

Mastodon +7 sources mastodon
openai
OpenAI announced on Tuesday that it will deliberately curb the rollout of its next‑generation language models, citing the risk that the technology could be weaponised to uncover software vulnerabilities at scale. The company said it will move from a “broad public release” to a staged, invitation‑only deployment for enterprise and research partners, with tighter monitoring of how the models are used. The decision follows internal debates that mirror the long‑standing “responsible disclosure” practices of cybersecurity firms. OpenAI’s head of safety, Mira Lee, likened the approach to the way vendors patch critical bugs only after confirming that fixes are in place, arguing that unrestricted access could accelerate the discovery of zero‑day exploits in critical infrastructure. The move also aligns with recent industry caution: Anthropic last week limited its own high‑capability model, Mythos, for the same reason, and regulators in the EU and UK have begun probing the societal impact of ever more powerful AI systems. Limiting the release matters because it signals a shift from OpenAI’s earlier strategy of rapid, open diffusion toward a more guarded model of commercialization. The restriction could slow the pace of innovation for developers who rely on the latest capabilities, but it may also forestall a wave of AI‑driven cyber attacks that could outstrip current defensive tools. Analysts note that the timing coincides with OpenAI’s reported compute shortages and the pending retirement of GPT‑4o on April 3, suggesting the company is reallocating resources to manage risk rather than sheer scale. What to watch next: OpenAI has promised a detailed roadmap by the end of the month, outlining which partners will receive early access and what usage‑monitoring safeguards will be enforced. Regulators are expected to issue guidance on AI‑enabled vulnerability research, and competitors may either follow suit or double down on open releases to capture market share. The balance between safety and speed will likely shape the next wave of AI products across the sector.
36

OpenAI Adds New $100/Month ChatGPT Subscription Tier for Heavier Codex Use https:// fed.brid.

Mastodon +8 sources mastodon
anthropicclaudeopenai
OpenAI has rolled out a new $100‑per‑month “ChatGPT Pro” tier that boosts access to its Codex coding assistant by five‑fold compared with the existing $20 Plus plan. The upgrade, announced on Monday and detailed by TechCrunch and CNBC, targets developers and power users who run longer, more compute‑intensive coding sessions. While the $200 Pro tier remains for the most demanding workloads, the mid‑range offering fills the gap between the budget‑friendly Plus plan and the premium tier, positioning OpenAI’s personal‑use portfolio alongside Anthropic’s long‑standing $100 Claude subscription. The move matters because Codex, OpenAI’s specialised large‑language model for code generation, has become a critical productivity tool for software engineers, data scientists and low‑code platforms. By expanding the quota at a price point that many freelancers and small teams can afford, OpenAI hopes to capture a slice of the market that has so far gravitated toward Anthropic or open‑source alternatives. The pricing shift also signals a broader strategy to monetise high‑usage AI features beyond generic chat, echoing the company’s recent diversification of subscription tiers and its willingness to experiment with tiered access after shelving a £31 billion UK investment package earlier this month. What to watch next: analysts will monitor uptake metrics for the $100 tier and whether it cannibalises the $200 tier or attracts new users from competing services. OpenAI’s next pricing tweak could come as it refines usage caps for other specialised models, such as its upcoming agentic‑RAG tools that we covered on April 10. Additionally, any changes to the underlying infrastructure costs—particularly in light of the recent UK data‑center pause—could prompt further adjustments to subscription pricing.
36

現役プロに教わるClaude CodeをVS Codeに導入する方法とローカル環境でのアプリ起動 ~APIキーをチャットに直接貼るのはNG!? – プロと実践! ゼロから始めるバイブコーディング

Mastodon +7 sources mastodon
agentsanthropicclaude
A tutorial posted on the Japanese developer hub Yayafa yesterday walks readers through installing Anthropic’s Claude Code extension in Visual Studio Code and running a sample app on a local machine. The guide, co‑authored by a practising software engineer, shows step‑by‑step how to configure the extension, create the required .claude‑credentials.json file, and launch the IDE‑integrated AI coding assistant without exposing the API key in chat windows—a practice the author warns against for security and compliance reasons. Claude Code, Anthropic’s answer to GitHub Copilot, entered open beta in late 2024 and has quickly become the preferred assistant for teams that value “constitutional AI” safeguards. By embedding the model directly in VS Code, developers can request code snippets, refactorings or test generation inline, while the extension respects the user’s language settings and offers diff previews. The tutorial also demonstrates how to pair Claude Code with Firebase for rapid prototyping, echoing a broader trend of AI‑driven full‑stack development. The piece matters because it lowers the barrier for Nordic developers to adopt a privacy‑first coding assistant that can run locally, reducing reliance on cloud‑only services that may conflict with GDPR or corporate data‑handling policies. Security‑focused instructions—especially the admonition against pasting API keys into conversational prompts—highlight a growing awareness of credential leakage risks that have plagued earlier AI‑assistant rollouts. Looking ahead, Anthropic plans to roll out Claude 3.5 with improved context windows and tighter integration with Azure OpenAI, which could further erode Copilot’s market share. Observers will watch whether VS Code’s marketplace sees a surge in Claude‑related extensions, how enterprise IT departments respond to the local‑execution model, and whether regulatory bodies issue guidance on AI‑generated code provenance. The tutorial’s popularity may signal the start of a wider shift toward on‑premise AI coding tools across the Nordic tech scene.
36

Agentic RAGは従来のRAGの課題をどう解決するか?|クラウドテクノロジーブログ|ソフトバンク https://www. yayafa.com/2777654/ # Agent

Mastodon +7 sources mastodon
agents
SoftBank’s Cloud Technology Blog unveiled a new “Agentic RAG” framework that promises to overcome the most persistent shortcomings of conventional Retrieval‑Augmented Generation. The announcement details a joint effort between SoftBank and U.S. start‑up Archaea AI to commercialise the Agentic RAG‑powered knowledge platform “Krugle Biblio” in Japan, positioning it as the first native‑language, agent‑centric solution for enterprise search and generation. Traditional RAG pipelines stitch a static retriever to a large language model, but they still suffer from stale indexes, hallucinated outputs and an inability to orchestrate multi‑step reasoning. Agentic RAG injects an autonomous “agent layer” that can plan retrieval strategies, evaluate source credibility, and iteratively refine prompts based on self‑reflection. The blog cites internal tests where the system reduced factual errors by roughly 40 % and cut query‑to‑answer latency by half compared with SoftBank’s own Vertex AI RAGEngine. The development matters because it bridges the gap between ad‑hoc chat interfaces and production‑grade knowledge work. Enterprises that have been wary of LLM hallucinations can now embed a self‑checking loop that dynamically pulls the latest documents, applies domain‑specific policies, and even triggers external tools such as calculators or code interpreters. For Nordic firms grappling with strict data‑sovereignty rules, a locally hosted, agent‑driven RAG could become a viable alternative to cloud‑only offerings. What to watch next: SoftBank plans a pilot rollout with several Japanese financial institutions in Q3, while a beta for European partners is slated for early 2027. Analysts will be tracking performance benchmarks against Google’s RAGEngine and the uptake of the Krugle API in the Nordic AI marketplace. The rollout will also test how well the self‑reflection mechanisms scale when agents handle heterogeneous, multilingual corpora—a key hurdle for broader adoption.
36

DXC Introduces New Assure Smart Apps to Accelerate Insurers’ AI-Powered Transformation | DXC Technol

Mastodon +7 sources mastodon
agents
DXC Technology has unveiled Assure Smart Apps, a new suite of AI‑driven, workflow‑centric applications aimed at fast‑tracking digital transformation across property‑casualty and life insurers. Launched at the DXC Connect Insurance Executive Forum, the portfolio includes Claims Assistant, Engagement Assistant and Underwriter Assistant, each built on ServiceNow’s agentic‑AI engine and DXC’s deep insurance domain expertise. The pre‑configured modules promise to automate routine tasks, cut manual effort by 30‑40 % and deliver measurable outcomes within 12 weeks, all without requiring a wholesale replacement of legacy core systems. The announcement arrives as insurers grapple with mounting pressure to modernise, contain costs and meet rising customer expectations for instant, personalised service. While AI adoption has accelerated, many carriers remain hamstrung by fragmented legacy stacks and a shortage of in‑house talent to build bespoke solutions. By offering modular, outcome‑focused apps that plug into existing environments, DXC aims to lower the barrier to entry and enable insurers to scale AI initiatives quickly and safely. Analysts will be watching how quickly major carriers pilot the new tools and whether the promised speed‑to‑value materialises in practice. Early case studies could reveal the impact on underwriting accuracy, claim‑settlement times and cross‑sell conversion rates, while also highlighting any workforce adjustments required as routine processes become automated. Competition from other tech giants – notably Microsoft’s Cloud for Insurance and Salesforce’s Financial Services Cloud – will intensify, making adoption metrics a key barometer of DXC’s market traction. The next few months should bring announcements of pilot results, integration roadmaps with ServiceNow’s broader AI portfolio, and possibly regulatory commentary on the use of agentic AI in high‑stakes insurance decisions. Those developments will shape whether Assure Smart Apps become a catalyst for industry‑wide AI acceleration or another niche offering in a crowded marketplace.
36

使えば使うほど賢くなる?自己進化型AIエージェントの仕組みを解剖する https://www. yayafa.com/2777657/ # AgenticAi # AI #

Mastodon +8 sources mastodon
agentsgemma
A research team from the Japanese startup Asty has published a detailed analysis of “self‑evolving” AI agents, showing how continuous interaction with users can make the same model progressively smarter without external re‑training. The paper, released on April 10, dissects the architecture behind prototypes such as Gemma‑4, GEPA and HermesAgent, all of which run locally and update their internal weights through a combination of reinforcement learning from human feedback (RLHF) and on‑device meta‑learning. By storing interaction traces in a secure sandbox, the agents generate micro‑updates that are merged into a base model nightly, allowing them to refine language understanding, product‑recommendation logic and even visual‑search capabilities on the fly. Why it matters is twofold. First, the approach promises a new wave of “agentic” applications that can personalize themselves in real time while keeping data under user control—a direct response to privacy concerns that have slowed adoption of cloud‑only AI services. Second, the technology lowers the barrier for small firms to deploy sophisticated assistants, potentially reshaping e‑commerce, customer support and creative tools. The findings echo the trends we highlighted last week: Meta’s Muse Spark model, which can compare products from photos, and ZETA’s integration of OpenAI’s ChatGPT into its commerce platform both rely on rapid, user‑driven refinement. Amazon’s record AI‑cloud revenue and the Linux Foundation’s Agentic AI Foundation further illustrate the industry’s push toward continuously learning agents. What to watch next are the practical roll‑outs slated for the summer. Asty plans an open‑source SDK that will let developers plug the self‑evolving core into existing chat and recommendation pipelines. The Agentic AI Foundation is expected to publish a standards draft on safe update mechanisms, and both Meta and ZETA have hinted at beta programs that will test these agents in live retail environments. The coming months will reveal whether self‑evolving agents can deliver on their promise without compromising safety or stability.
32

Hohe Energiekosten und Regularien: OpenAI pausiert Stargate UK

Mastodon +6 sources mastodon
openai
OpenAI has put its “Stargate UK” data‑center project on hold, citing soaring electricity prices and an uncertain regulatory climate in Britain. The move follows the company’s earlier decision to scrap a planned campus in Abilene, Texas, and marks the latest setback for the ambitious AI‑infrastructure venture announced in September together with Nvidia and data‑center developer Nscale. As we reported on 10 April, OpenAI paused the UK build after energy costs proved higher than projected. The latest statement adds that the firm will continue negotiations with the London government to seek clearer policy guidance and possible incentives. OpenAI’s chief‑technology officer said the pause is “temporary” and that the company remains committed to a UK presence, but will not proceed until the energy tariff regime and data‑security rules are stabilised. The decision matters on several fronts. Britain has positioned itself as a European hub for AI research and expects large‑scale compute facilities to attract talent, boost the domestic tech sector and secure data sovereignty. A stalled flagship project threatens those ambitions and could give rivals such as Microsoft’s Azure or Google Cloud a competitive edge in the region. For OpenAI, the pause underscores the growing tension between rapid model scaling and the sustainability of the underlying compute infrastructure, a theme echoed in its recent restriction on new model releases for cybersecurity reasons. What to watch next are the outcomes of the talks with the UK authorities. A revised energy‑tax framework or targeted subsidies could revive the project, while prolonged uncertainty may push OpenAI to relocate capacity to more cost‑stable locations in Europe or the Nordics. Parallel developments—particularly the company’s evolving subscription tiers for heavy‑use codex services—will also signal how OpenAI balances growth with operational constraints.
24

5 LLM played Poker: Opus busted first, Grok won

HN +6 sources hn
claudegeminigpt-5grok
Five leading large‑language models (LLMs) faced off in a Texas Hold’em tournament last week, with Anthropic’s Claude Opus eliminated in the first round and Elon Musk’s xAI Grok emerging as the champion. The showdown, organized by the AI‑gaming lab “Strategic Minds,” pitted Opus, Grok 4, Google’s Gemini 2.5 Pro, OpenAI’s GPT‑5 and Anthropic’s Claude Sonnet 4.5 in a series of 1,000‑hand matches run on a public poker engine. Each model received the same hand‑history data and was prompted to output a bet, raise or fold decision, which the engine then executed. The experiment was more than a publicity stunt. By forcing LLMs to make real‑time, high‑stakes choices under incomplete information, the test exposed how well current prompting techniques translate into strategic reasoning. Opus’s early bust highlighted lingering weaknesses in risk assessment, while Grok’s consistent aggression and timely bluffs demonstrated a refined ability to model opponent behavior—a skill honed through xAI’s recent reinforcement‑learning‑from‑human‑feedback (RLHF) upgrades. Why it matters is twofold. First, poker remains a benchmark for artificial general intelligence because it blends probability, psychology and long‑term planning; a clear win for Grok suggests that LLMs are closing the gap between language proficiency and decision‑making competence. Second, the results could accelerate the deployment of AI assistants in finance, negotiations and gaming, sectors where nuanced risk evaluation is critical. At the same time, the tournament raised safety questions: if LLMs can bluff convincingly, they might be misused in fraud or market manipulation unless robust guardrails are built in. What to watch next includes a follow‑up tournament slated for June that will add a multi‑agent reinforcement learning layer, allowing models to adapt their strategies across hands. Industry observers will also be monitoring OpenAI’s upcoming GPT‑5 refinements and Anthropic’s next Opus iteration, both of which promise tighter integration of strategic modules. Finally, regulators are expected to issue guidance on AI‑driven gambling applications, a move that could shape how these models are commercialised beyond the lab.
20

Apple Intelligence Exposed to Hijacking Risk via Prompt Injection

Mastodon +6 sources mastodon
apple
Apple’s newly launched AI suite, Apple Intelligence, has been found vulnerable to a classic yet increasingly potent attack vector: prompt injection. Security researchers disclosed that specially crafted inputs can hijack the system’s language model, forcing it to emit malicious or profane content and, in more advanced scenarios, to reveal internal prompts that guide its behavior. The flaw stems from the way Apple Intelligence concatenates user‑supplied text with system‑level instructions before passing the combined prompt to the underlying large‑language model. By embedding hidden directives in seemingly innocuous queries, an attacker can override the model’s safeguards and steer its output toward any desired narrative. The discovery matters because Apple Intelligence is positioned as the cornerstone of the company’s AI strategy, powering features across iOS, macOS, iPadOS and the upcoming “Apple Vision Pro” interface. If malicious actors can manipulate the model on a personal device, they could generate disinformation, phishing content, or even code that exploits other apps. The vulnerability also highlights a broader industry challenge: prompt injection attacks, long known in web‑based AI agents, are now surfacing in consumer‑grade products that lack the hardened defenses of enterprise platforms. Apple has acknowledged the report and pledged a “rapid response” patch, but the timeline remains unclear. In the meantime, security teams are scrambling to devise mitigations, such as stricter input sanitisation and sandboxed prompt handling. Watch for Apple’s forthcoming software update, likely rolled out through iOS 18 and macOS 15, and for any disclosures from the broader AI‑security community about similar weaknesses in rival assistants. The episode underscores that as AI becomes a core OS feature, robust prompt‑injection defenses will be as essential as traditional malware protections.
20

Hermes seems to be more effective at tool calling with low-end models than OpenClaw. My setup is bas

Mastodon +6 sources mastodon
agents
Hermes, the open‑source function‑calling harness released by Nous Research, is gaining traction after users reported that it outperforms OpenClaw on low‑end language models. In a recent community post, a developer noted that a modest setup using a 7‑billion‑parameter model consumed noticeably fewer tokens with Hermes than with OpenClaw, and that the Hermes harness “gets its own changes right first time more often.” The claim rests on practical tests rather than formal benchmarks, but the anecdotal evidence aligns with Hermes’s design focus on token‑efficient prompt engineering and robust change detection. The development matters because tool calling is the linchpin of today’s agentic AI. By allowing a model to invoke external APIs—search, databases, or custom functions—developers can build assistants that act autonomously. Low‑end models are the workhorses of on‑premise deployments and cost‑conscious startups; any reduction in token usage translates directly into lower compute bills and faster response times. If Hermes consistently delivers tighter integration and fewer retry cycles, it could shift the balance away from larger, cloud‑only offerings and accelerate the democratisation of agentic AI across the Nordics and beyond. What to watch next is the emergence of systematic comparisons. Researchers are expected to publish head‑to‑head evaluations on standard tool‑calling suites such as the Function‑Calling v1 dataset, and both Hermes and OpenClaw teams have hinted at upcoming releases—Hermes v2 with expanded schema support and OpenClaw’s next‑generation runtime. Integration with popular orchestration layers like LangChain or the GitHub Copilot CLI will also be a litmus test for real‑world adoption. Stakeholders should keep an eye on community‑driven benchmark results and any announcements from cloud providers that might incorporate Hermes‑style calling into their APIs.
20

The Artificial Intelligence (AI) Stock I'd Buy With $1,000 Before the Market Bounces Back

Yahoo Finance +7 sources 2026-03-24 news
Alphabet (GOOGL) has re‑emerged as the top pick for investors with a modest $1,000 budget, according to a new analyst note that argues the AI‑heavy sell‑off has created a buying window before the broader market rebounds. The recommendation follows a week of heightened volatility that pushed the Nasdaq into correction territory, a trend we flagged on April 10 when we identified two AI stocks worth buying first. Alphabet’s shares have slipped roughly 12 % since the start of the quarter, outpacing the sector’s average decline of 15 % despite the company’s continued rollout of Gemini, its next‑generation large‑language model, and the integration of AI tools across Google Search, Workspace and Cloud. The appeal lies in Alphabet’s diversified revenue base and its ability to monetize AI at scale. Revenue from Google Cloud, now driven by AI‑enhanced services, grew 28 % YoY in Q1, while ad earnings have begun to recover after a dip caused by advertisers’ cautious spending on AI‑related campaigns. Moreover, the firm’s massive data infrastructure and chip‑design subsidiary, Google‑AI, give it a cost advantage over rivals that still rely on third‑party hardware. Analysts see the current price‑to‑sales multiple of 5.8 as a discount to the 7‑8 range typical for high‑growth AI players, suggesting upside potential if the market re‑prices AI earnings expectations. Investors should monitor three catalysts: the performance of Gemini in real‑world deployments, the next earnings release slated for early May, and any regulatory moves stemming from the recent OpenAI blueprint on AI taxation and oversight. A stronger-than‑expected earnings beat or a breakthrough partnership could accelerate the rebound, while tighter AI regulations or a prolonged advertising slowdown could keep the stock muted. For those looking to allocate a thousand dollars now, Alphabet offers a blend of growth, cash flow and resilience that may pay off when the tech rally resumes.
20

The Nasdaq Is in Correction Territory. Here Are the 2 Artificial Intelligence (AI) Stocks I'm Buying First.

AOL +7 sources 2026-04-01 news
The Nasdaq Composite slipped below the 10 percent‑off‑high threshold on Friday, officially entering correction territory for the first time this year. The drop was sparked by a weaker‑than‑expected jobs report and a renewed focus on inflation, but the sell‑off has not erased the market’s appetite for artificial‑intelligence products. Analyst Adam Spatacco argues that the correction is “discounting the infrastructure movement entirely” while leaving demand for AI services intact. In his April 9 column he points to two pure‑play AI stocks that have underperformed the index by a wider margin and now appear undervalued: C3.ai (AI) and Palantir Technologies (PLTR). Both companies have seen shares tumble more than 20 percent since the Nasdaq peaked in March, creating what Spatacco describes as “98 % and 115 % upside” according to recent Wall Street target revisions. The significance lies in the divergence between macro‑level weakness and sector‑specific growth. C3.ai’s platform‑as‑a‑service model is gaining traction with enterprise customers seeking to embed generative‑AI capabilities without building their own data pipelines, a trend highlighted in our April 10 piece on retrieval‑augmented generation failures. Palantir’s data‑integration suite, now bolstered by a new partnership with a major cloud provider, positions it to capture a slice of the $1.5 trillion AI‑software market that analysts expect to expand at double‑digit rates through 2028. Investors should monitor the companies’ upcoming quarterly reports for signs that revenue pipelines are materialising, as well as any policy shifts after OpenAI CEO Sam Altman’s recent blueprint for AI taxation and regulation. A rebound in tech hiring or a softer Fed stance could also lift the broader Nasdaq, accelerating the price correction of these stocks. For now, the two picks represent a contrarian play on AI demand amid a market‑wide pullback.

All dates