
The overthinking problem in AI
Reasoning models can generate seven to 10 times as many tokens as necessary on simple tasks, creating unsustainable costs at scale. Amazon's vision for metacognitive AI could fundamentally shift how models allocate computational resources.

How Amazon uses AI agents to anticipate and counter cyber threats
Amazon's competitive-agent architecture creates a continuous improvement cycle that develops security protections at machine speed, reducing what typically takes weeks down to hours.

A new view of supply chain emissions
A new approach to reducing carbon emissions reveals previously hidden emission “hotspots” within value chains, helping organizations make more detailed and dynamic decisions about their future carbon footprints.

Demystifying AI agents
How agentic systems work under the hood, and how AWS’s new AgentCore framework implements their essential components.
Customer-obsessed science


Research areas
- November 20, 2025 (4 min read): A new evaluation pipeline called FiSCo uncovers hidden biases and offers an assessment framework that evolves alongside language models.
- October 20, 2025 (4 min read)
- October 2, 2025 (3 min read)
Featured news
Initiative will fund over 100 doctoral students researching machine learning, computer vision, and natural-language processing at nine universities.
The collaboration will advance research in generative AI, robotics, natural language processing and cloud computing while fostering innovation in foundational and emerging technologies.
University teams battle to harden and hack AI coding assistants in head-to-head tournament
Led by David Luan and Pieter Abbeel, the lab will focus on developing new foundational capabilities for enabling useful AI agents.
The company's new state-of-the-art foundation models deliver frontier intelligence and industry-leading price performance.
- Despite rapid progress in LLM agents, performance on long-horizon, tool-using tasks remains fragile. To better understand this fragility, we ask a simple question: do all actions contribute equally to failure? Analyzing execution traces on τ-Bench (Airline/Retail) and SWE-Bench Verified, we decompose trajectories into mutating (environment-changing) vs. non-mutating steps and formalize decisive deviations: earliest
- NeurIPS 2025 Workshop on Evaluating the Evolving LLM Lifecycle, 2025. Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge in debugging and improving collaborative AI systems. Current approaches to pinpointing agent- and step-level failures in multi-agent interaction traces (whether using all-at-once evaluation, step-by-step analysis, or binary search) fall short when analyzing complex patterns, struggling with both accuracy and
- Ethan Baron, Boris Oreshkin, Ruijun Ma, Hanyu Zhang, Kari Torkkola, Michael Mahoney, Andrew Gordon Wilson, Tatiana Konstantinova. NeurIPS 2025 Workshop on Recent Advances in Time Series Foundation Models, 2025. Many time series applications require access to multi-step forecast trajectories in the form of sample paths. Recently, time series foundation models have leveraged multi-step lookahead predictions to improve the quality and efficiency of multi-step forecasts. However, these models only predict independent marginal distributions for each time step, rather than a full joint predictive distribution. To generate
- 2025. Music recommendation systems face the dual challenge of capturing both immediate context and long-term preferences in users' listening patterns. We adapt a generalized sequential model architecture for music recommendation, introducing modifications that acknowledge how music preferences combine temporal patterns and stable tastes. By removing causal masking constraints typically used in sequential models
- Sachin Kumar Giroh, Pushpendu Ghosh, Aryan Jain, Harshal Paunikar, Anish Nediyanchath, Aditi Rastogi, Promod Yenigalla. 2025. This paper introduces a three-stage multi-agent LLM framework designed to transform unstructured and ambiguous Standard Operating Procedures (SOPs) into a structured plan and an executable code template. Unstructured SOPs, common across industries such as finance, retail, and logistics, frequently suffer from ambiguity, missing information, and inconsistency, all of which hinder automation. We address this
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.
The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.
A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and large language model coding security.
We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.
We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.