Sam Altman and AWS CEO Matt Garman on AI Agents, Harnesses, and the Next Cloud Battle

Just three days before Microsoft and OpenAI announced a landmark amendment to their years-long exclusivity deal—ending Azure’s status as the sole cloud provider for OpenAI’s models—Stratechery founder Ben Thompson sat down with OpenAI CEO Sam Altman and AWS CEO Matt Garman. The timing was fortuitous. The tensions behind the partnership were already plain: why were the leaders of these two companies coming together at all?

The reasoning was straightforward. Microsoft initially locked in a massive competitive advantage by making Azure the exclusive home for OpenAI’s models, but that very exclusivity also constrained OpenAI’s growth. Vast troves of enterprise data already resided on AWS, and customers were reluctant to migrate just to access a new model. Meanwhile, Anthropic was surging by capitalizing on a simple premise: customers want the best model wherever their data already lives. For Microsoft, clinging to exclusivity was beginning to harm its most critical investment. Loosening the grip was painful—Azure lost a key differentiator—but keeping it was worse. If OpenAI’s growth was stifled by the deal, Microsoft’s losses as a major shareholder would dwarf any gains for Azure.

This realization set the stage for the joint announcement of Bedrock Managed Agents, an environment powered by OpenAI. Think of it as Codex reimagined for the cloud—a fully managed runtime for intelligent agents that includes identity management, permissions, logging, governance, and deployment capabilities. Customer data stays within a secure AWS boundary; OpenAI never sees the raw data. The goal is to let enterprises already on AWS adopt cutting-edge AI without leaving their data environment.

Below are the core insights from the interview.

1. The “AWS Moment” for AI: Making Agents Production-Ready

Sam Altman: Whenever I watch users interact with our models, I’m thrilled they find it magical—but I’m also crushed by the unnecessary friction. They’re copying and pasting between tools, crafting convoluted prompts, and endlessly trial-and-error testing. I see all that pain.

Matt Garman: Before this joint product, customers had to stitch everything together themselves—model invocation, identity systems, database credentials, internal integrations, understanding their own data. Every single customer was reinventing the wheel. All that integration work fell entirely on them.

Matt Garman: The security frameworks we’ve spent 20 years building for global banks, hospitals, and government agencies—VPCs, role-based permissions, gateways—are perfectly suited to help. Customers worry, “I love this tech, but how do I prevent a single mistake from triggering a company-ending event?” These problems are solvable. The key is giving customers a controlled sandbox environment.

Sam Altman: The model and the orchestration layer—the harness—are becoming inseparable. Many things we used to laboriously coach through system prompts, the model now just handles as it gets smarter. Take tool use: initially we didn’t think it needed to be baked into training, but the more integrated it is, the better it works. The boundary between the harness and the model will keep blurring, and eventually even pre-training and post-training will merge more tightly. But the whole industry is still in its “Homebrew Computer Club” era—the very early days where no one knows the final shape of things.

2. Local vs. Cloud Execution: Two Paths That Will Converge

Sam Altman: Codex shifted from the cloud to local because local is simpler—your files and configurations are right there, and you don’t have to think about where the data lives. But that’s not the endgame. The ultimate form is a cloud agent—something that keeps working when you close your laptop, parallelizes intensive tasks at a scale a single machine can’t match, and scales out on demand.

Matt Garman: No computing environment has ever truly eliminated the client. iPhone apps have local components; running locally naturally offers low latency and simplicity. But once you enter enterprise scenarios—sharing between people, permission boundaries, security perimeters—local hits its limits. The endpoint will be a fusion of local and cloud.

Sam Altman: When agents enter the workforce as “virtual colleagues,” all our mental models about software and permissions will need rewriting. Should you, an employee, have one account and let your agent use it? Or should your agent get its own account so the server can distinguish who’s acting? What if you have ten agents? I imagine a not-yet-invented primitive: when Ben’s agent logs in, it uses Ben’s account, but the system flags that it is an agent, not the human Ben. We haven’t figured any of this out, but as agents grow more autonomous and join the workforce, these questions will surface fast.

3. The “Intelligence Factory” and a Pricing Revolution

Sam Altman: We’re fundamentally a token factory—no, actually, an intelligence factory. Customers don’t care what chip you use or how many tokens a model consumes. They care about one thing: getting the best unit of intelligence at the lowest possible price, in as much volume as they need. Our new 5.5 model costs far more per token than its predecessor, but it requires drastically fewer tokens to complete the same task, making the total cost lower. You shouldn’t care about token count; you should care about whether you paid a fair price and the job got done. Per-token pricing will look outdated in the long run. It will eventually evolve into paying per completed job.
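Altman’s per-token versus per-job point can be made concrete with back-of-the-envelope arithmetic. The prices and token counts below are invented purely for illustration—they are not OpenAI’s actual figures:

```python
# Hypothetical numbers: a model that is 4x pricier per token
# can still be cheaper per completed job if it needs fewer tokens.
old_price_per_1k = 0.002     # $ per 1K tokens (illustrative)
new_price_per_1k = 0.008     # 4x more expensive per token (illustrative)

old_tokens_per_job = 50_000  # older model burns many tokens per task
new_tokens_per_job = 8_000   # smarter model finishes in far fewer

old_cost = old_tokens_per_job / 1000 * old_price_per_1k  # $0.100 per job
new_cost = new_tokens_per_job / 1000 * new_price_per_1k  # $0.064 per job

print(f"old: ${old_cost:.3f}/job, new: ${new_cost:.3f}/job")
```

The per-job cost, not the per-token rate, is the number the customer actually pays—which is why Altman expects pricing to migrate toward completed work.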

Sam Altman: Utilities have elastic demand ceilings—water gets cheaper, but you won’t take two extra showers a day. Intelligence might be different. I’ve never seen another utility where I instinctively want to say, “If the price drops enough, I’ll consume it indefinitely.” Right now, more customers are begging me for more compute regardless of price, not haggling for discounts. But I’m confident we’ll continue to slash the cost of intelligence dramatically.

Matt Garman: This matches the historical trajectory of compute exactly. A compute cycle today is order-of-magnitude cheaper than 30 years ago, yet we sell far more compute than ever. AI is still incredibly early. Everyone’s chasing frontier models because they’re the only ones doing truly useful work. The future will bring a mix of model architectures: small, fast ones for specialized tasks, and massive frontier models to tackle problems like curing cancer.

4. What’s the Endgame for Agents?

Ben Thompson: Enterprises may need a two-layer agent structure. A lower layer constantly drills into databases, SaaS apps, and file systems to retrieve, organize, and correlate information—a “data integration agent.” An upper layer handles human interaction, presents insights, and executes decisions—a “user interface agent.”
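Thompson’s two-layer split can be sketched minimally. Both classes, their methods, and the sample data are hypothetical illustrations of the division of labor he describes—not any real product’s API:

```python
# Hypothetical sketch of a two-layer agent architecture:
# a lower "data integration agent" and an upper "user interface agent".

class DataIntegrationAgent:
    """Lower layer: retrieves and correlates records from backing systems."""

    def __init__(self, sources: dict[str, list[dict]]):
        self.sources = sources  # e.g. {"crm": [...], "tickets": [...]}

    def correlate(self, key: str) -> list[dict]:
        # Pull matching records from every source and tag their origin.
        hits = []
        for name, records in self.sources.items():
            for record in records:
                if record.get("customer") == key:
                    hits.append({"source": name, **record})
        return hits


class UserInterfaceAgent:
    """Upper layer: turns correlated data into something a human acts on."""

    def __init__(self, lower: DataIntegrationAgent):
        self.lower = lower

    def summarize(self, customer: str) -> str:
        hits = self.lower.correlate(customer)
        systems = {h["source"] for h in hits}
        return f"{customer}: {len(hits)} related records across {len(systems)} systems"


lower = DataIntegrationAgent({
    "crm": [{"customer": "acme", "stage": "renewal"}],
    "tickets": [{"customer": "acme", "open": 3}],
})
ui = UserInterfaceAgent(lower)
print(ui.summarize("acme"))  # → "acme: 2 related records across 2 systems"
```

The point of the split is that the lower layer can churn continuously against messy backing systems, while the upper layer stays a thin, human-facing surface—exactly the separation Altman then questions in the next exchange.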

Sam Altman: Lately, large enterprise customers have been asking for a remarkably consistent set of tools: an agent runtime environment, a management layer to connect data and control token consumption, and a workspace for employees. The requirements converge more and more, but the product doesn’t fully exist yet. At some point, though, we might discover this multi-tier architecture is just us holding onto the old world. Once models are powerful enough, the whole thing might need to be reimagined from scratch.

Matt Garman: We still don’t know the final form. That’s part of the joy of doing this—getting customers onto the platform, learning from their real-world use, and feeding that back to make the product faster and better.

Ben Thompson: At Google Next, the story was full-stack vertical integration—from TPU chips to Gemini models to applications. Yet you and Sam—one without a frontier model, one not a cloud vendor—are sitting here announcing a partnership. Is AWS lagging without its own frontier model, or is this an intentional open-platform bet?

Matt Garman: From day one, a core AWS strategy has been embracing partners. Our measure of success isn’t “do I own everything?” It’s “are our partners successful?” If they succeed, we succeed. This is a different philosophy from “I must own the full stack,” but both have believers. We believe customers should have the right to choose the best tool. If the best tool is something we built, great. If it’s a partner’s tool running on our infrastructure, that’s equally a win for us. No single company will own all the best applications.

Sam Altman: I genuinely think developers can now build an entirely new class of product. Model capabilities will improve along a very steep curve over the next year, and we’re building this platform together at exactly the right moment. I hope people look back in a year and the conversation isn’t “you can finally use OpenAI on AWS,” but rather, “we completely underestimated how important this new product was.”

Ben Thompson: The last time we did a product interview, it was with Microsoft’s Kevin Scott about the new Bing, and you were deeply confident about challenging Google. Looking back now?

Sam Altman: ChatGPT exceeded expectations—it’s probably the first truly massive new consumer product since Facebook. The API and Codex have also performed well. But Google remains an underrated company in many respects—their breadth and depth are formidable. This collaboration with AWS isn’t just a win commercially; it’s a new starting point technically. When model capabilities and orchestration tools finally converge, what developers can build will be entirely different.
