Rowan Trollope
This is my profession. Software engineering is completely different than it ever was before. I mean we are in a whole new world now. Large language models hallucinate, they make things up. I think they said this and this is what I think I had for breakfast three weeks ago.
The problem to be solved now with agents, one of the interesting problems is short-term memory, working set memory. My top engineer who uses AI as sort of primary method of coding is spending as much on tokens as his entire salary, so we are burning these data centers up.
Now that I have this technology on my hands I can just tell it to go do it, I’ll show up tonight and it’ll be done.
Are we in an AI bubble? I wouldn’t bet on any apps at this time, but infrastructure is much safer.
Saket Saurabh
Hello and welcome to another episode of Data Innovators and Builders. Today I am with Rowan Trollope, CEO of Redis. Rowan, very nice to have you here.
Rowan Trollope
Yeah, thanks Saket, great to be here.
Saket Saurabh
Awesome. So Rowan, maybe give us a little bit of your background coming into your career through Symantec and Cisco and Five9 and now at Redis.
Rowan Trollope
Absolutely. Yeah, I’ve been building software professionally, and paid, since I was 18, though even before then I was doing fun projects as a teenager. At 18 I got hired at a startup and have worked in the industry ever since. I spent about 20 years first in consumer software and then in the beginnings of the cybersecurity industry, leading up to the dot-com boom and bust.
Then I worked as head of products and engineering at Cisco, running their software assets, primarily in the collaboration technology group that managed all of their communications infrastructure and technologies. That was exciting. Then I was CEO of a company called Five9 for about five years, and now I’ve been at Redis, back to my roots as a developer, for three years.
Saket Saurabh
Awesome, that’s an incredible experience you have there. And I have personally been a huge fan of Redis. I initially knew it as a memory store but you have come in and set a very strategic direction for the company as a memory layer for AI and a much broader platform. So tell us a little bit about this evolution of Redis.
Rowan Trollope
Yeah, well, about three or four months after I joined, ChatGPT came out, and I had been a big proponent of AI for the five years prior to that. So it’s now been a total of about eight-plus years since the transformer paper came out. And when I took the job at Redis, it struck me that we were going to enter a new era of AI, with generative AI applications becoming the main thing, whereas the last 10 years had been cloud and mobile.
If you look at the last 10 years where Redis really got its start, Redis stands for Remote Dictionary Server, and it became a core part of the application infrastructure on the internet, helping almost every business you could possibly imagine build out their infrastructure and scale it to internet scale. Redis is a core part of that infrastructure and we’ve been very lucky to be a very successful open source project adopted by lots of developers.
That was a 10-year cycle. And I think we’re now heading into a new cycle around building agents, and those agents are going to need to have a data store both to serve data to them and also for places where they can store information. So we’ve been very focused on that opportunity, just what does it mean to be a data store for agents and how does that work in the enterprise context.
And what are the new problems to be solved, like memory. You mentioned memory as a new emerging problem, since there’s essentially a tremendous amount of exhaust coming off of these agents in the form of unstructured data, and it has to be turned into sensible, useful information later. So, solving agent memory.
The big aha moment for me that set us down this path was when really looking at an LLM, by way of analogy, an LLM is very similar to a human’s long-term memory. In the sense that it’s read a lot of things, been fed a lot of information, images and text, and it vaguely recalls much of that information and then can formulate it into text answers back to you about your memory. Which is very similar to the way our own brains work in our long-term memory when we ask questions, like gosh did I ever see that movie, and you start to get some ideas.
But the other part of our brain that’s not part of a large language model is the short-term memory and the working set memory. It’s what have I currently loaded and what am I currently operating on. That’s the kind of stuff that before this interview, you maybe had a little conversation, you took some notes, that’s all in your working set memory. It’s fast and it’s highly accurate.
And very similar to a human’s long-term memory, large language models hallucinate. They make things up: I think they said this, and this is what I think I had for breakfast three weeks ago. So that’s kind of an interesting analogy. And the problem to be solved now with agents, one of the interesting problems, is short-term memory, working set memory, semantic memories, really the different ways of remembering things, and then also how you load things from different data sources, not just the large language model.
The other thing about language models is they’re trained on the open internet. They don’t have your enterprise data. So how do you get your enterprise data into the short-term working set memory, which in the case of an LLM is the context window? How do you get all that stuff pre-loaded and do it in the most efficient and effective way?
Saket Saurabh
Yeah, that’s a problem that I know very well. And the way I think about this to some extent is that if you were to go to ChatGPT and ask a question and it doesn’t remember who you are and what you last talked about, your answer wouldn’t be as interesting, right? Isn’t that what we are basically talking about?
Rowan Trollope
Yeah, and clearly Anthropic and OpenAI have done some work on context window compression and pulling in the last 24 hours of chat logs and so on, but that’s not really enough to know a user. We’re starting to see products emerge, like Claude’s memory features, that are natively imbued with the ability to store memories and record them over time.
But we’re also seeing a tremendous amount of different approaches and opinions about how these memories should be extracted, stored, and then retrieved later on down the road. So that’s one interesting problem set. We at Redis have taken an opinion, we’ve built out an open source project called Agent Memory Server, that’s at github.com/redis/agent-memory-server, and that’s our version of what a memory system should be.
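The extract-store-retrieve loop described here can be sketched in a few lines. To be clear, this is an illustrative toy, not the Agent Memory Server API: a plain dict stands in for Redis, and keyword overlap stands in for the vector search a real memory system would use.

```python
from collections import defaultdict

class MemoryStore:
    """Toy sketch of an agent memory layer. A dict stands in for Redis;
    a real deployment would use Redis lists and vector search."""

    def __init__(self):
        self.sessions = defaultdict(list)  # short-term: per-session message log
        self.long_term = []                # long-term: extracted facts

    def add_message(self, session_id, role, text):
        # Working-set memory: append to the session's conversation log.
        self.sessions[session_id].append((role, text))

    def extract_fact(self, fact):
        # "Extraction": persist a distilled fact for later retrieval.
        self.long_term.append(fact)

    def recall(self, query):
        # Naive keyword retrieval; a real system would use embeddings.
        words = set(query.lower().split())
        return [f for f in self.long_term if words & set(f.lower().split())]

store = MemoryStore()
store.add_message("s1", "user", "Book me a flight to Paris")
store.extract_fact("User prefers window seats")
print(store.recall("seat preferences for this user"))
# -> ['User prefers window seats']
```

The split between a fast per-session log and a slower extracted-facts store mirrors the short-term versus long-term distinction from the analogy above.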
A separate problem is, well, you go to talk to the agent and it needs to get data that’s not in the large language model, maybe it’s your company’s travel policy or what grade are you or what’s the weather in Paris. All these other kinds of information that maybe comes from an internal dataset, you need to be able to pull those things up in real time. So that’s the other problem that we’re solving.
Saket Saurabh
Yeah, absolutely. I think getting that data in real time, getting the meaning with that data, having good quality of the data, I think all of that comes into picture. I think one specific aspect you talked about is like travel policy, these are things in the form of documents. And I think one of the capabilities that Redis supports is that of a vector database.
I’m seeing the pace of evolution be so fast, so I want your perspective on this. Two years back RAG was the hardest thing and vector databases were very hard to build; now there are so many providers of vector database solutions, and yet it’s still a very critical piece of technology. How do you see some of these technologies evolving in the coming years?
Rowan Trollope
Yeah, the pace is pretty fast. I think RAG, you’re absolutely right. For the first couple of years after ChatGPT launched, most enterprise developers or business developers went off and tried to build chatbots, because that was the obvious thing to do. Like, well, we have an IT help desk, let’s create a chatbot that employees can talk to and that’ll make the experience better. There have been companies that have sprung up around that category.
In that world, the chatbot modality needs to be imbued with enterprise knowledge. If I’m asking how many days of PTO do I have left, it has to get that information from somewhere. And so RAG was a convenient way of doing that. Okay, let’s store the chunked PDF of the PTO policy and now when you ask a question to the chatbot we inject the necessary context in that RAG workflow. That worked fairly well to get enterprise data into the context window and so Redis was being used for that.
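The workflow just described, chunk the policy document, retrieve the relevant chunk, inject it into the prompt, can be sketched as follows. This is a minimal sketch under stated assumptions: the policy text is invented for illustration, and lexical word overlap stands in for the embedding similarity a production pipeline would compute against a vector store such as Redis.

```python
def chunk(text, size=80):
    """Split a document into fixed-size chunks. Real pipelines split on
    sentences or sections and store an embedding per chunk."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(chunk_text, question):
    # Naive lexical overlap; production RAG would use vector similarity.
    q = set(question.lower().split())
    return len(q & set(chunk_text.lower().split()))

def build_prompt(question, chunks, top_k=1):
    # Retrieve the best-matching chunk(s) and inject them as context.
    best = sorted(chunks, key=lambda c: score(c, question), reverse=True)[:top_k]
    return f"Context:\n" + "\n".join(best) + f"\n\nQuestion: {question}"

policy = "Employees accrue 20 PTO days per year. Unused PTO days roll over up to 5 days."
chunks = chunk(policy, size=45)
prompt = build_prompt("How many PTO days do I accrue per year?", chunks)
```

The resulting `prompt` carries the relevant slice of the policy into the context window, which is exactly the injection step of the RAG workflow.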
What we’ve seen over the last year or so is a shift towards agentic workflows. I get asked the question a lot just to define it, so maybe start at the high level. What is an agentic workflow, what is an agentic application? It’s not very complicated. It’s essentially a process that runs that has a large language model as part of it and which has the ability to run in an independent and automated way for a long period of time doing some sort of work.
It’s a process that’s running that uses a language model for reasoning and can be given tasks that it can then go execute. From one angle it kind of looks like the way a human being would take on a job. You give them some information, provide them with some company data and training, and then you say this is your job. You give them a job description. Then you say here’s a task for you to go solve, please go solve it. And with the information and training you’ve provided that entity, it can go do the work. That’s how a human being works. That junior employee, an intern, that’s how these agents are going to work. And they’re starting to be built out now and there’s a new set of problems that come along with those agentic applications.
Saket Saurabh
I think we are seeing that. I would say last year, 2025, was a lot about getting these from a prototype to a production stage and I think we are seeing a lot of improvement on that front. So maybe if you can share a little bit about where those challenges are and what it takes to take those agents to be something you can actually delegate to, give them a job description and let them work on it.
Rowan Trollope
Well, I think in the enterprise, agents are very much a matter of being in POCs at this time. Now the world’s changing quickly. We just recently saw the launch of Claude Code, which has been a great example of an agentic application that developers have now, and it’s gone kind of viral in the development community to demonstrate that agents are really feasible at this time.
I think before that, if you looked at the landscape, you’d say there were a couple of examples, maybe a handful, of agentic applications at scale. Those would be the coding products, Codex from OpenAI, Anthropic Claude Code, Cursor perhaps in their agent mode, Augment coding tool. They all built these agentic workflows and then added on MCP to get data in and out. That has been the area where there’s been actually real progress.
And now what we’re starting to see is, okay, could we actually build agentic applications outside of the coding workflows to automate other types of jobs? And I think what has so many pundits and observers fairly excited, excited and nervous to be honest, is well, we’ve seen what it’s going to do to the world of developers. This is my profession. Software engineering is completely different than it ever was before. We are in a whole new world now and that happened very quickly.
I envision you’re going to see Claude Code-like agents emerge in individual verticals or specializations that are going to completely transform many of the jobs that we do today. And that sounds somewhat scary, and it certainly is, but I’ll tell you as an engineer using these tools to write code now, it’s extremely exciting. It’s much more fun to write code when you’re given one of these agentic coding tools.
The interesting part of software development, I’ve been doing this for my whole life, was always you have an idea and you can materialize and see your idea come to life. Now the bunch of stuff you had to do involved esoteric non-English languages and programming and C and compiler errors and syntax problems and all this other stuff, but none of that was particularly interesting. What was interesting was you’ve got an idea and you can create some outcome.
And now what’s happened is all that stuff in the middle just went from taking six months to taking six hours. It is a remarkable transformation, and will be a remarkable transformation of this job. Now, that hasn’t rolled out to all engineers in the world yet, but if you look at how engineers who are starting off with AI from the beginning are using these tools, it’s a dramatic shift in the workflow and in how much you can get done.
Saket Saurabh
Yeah, I mean, as an engineer myself I 100% agree with you. When I was learning to code many many years ago I think I was building a game, and much of it was just to see how it works. And yes, there’s a lot of time that goes in before it’s ready to try, and every idea you come up with again takes many hours or days to get there.
And I was listening to the interview from the creator of Claude Code, who was saying how addictive it is because it’s like one more prompt, I want to try one more prompt and see if it gets me where I’m looking to get to. And it’s certainly like that.
Yes, software engineering has completely changed in the past year or so. But I think it’s also become a very developer-centric world, because a person with an idea, maybe a product manager or a developer, can overnight or over a weekend build out stuff that just changes what the baseline is. So how do you see that? Is this a phase we’re in while the technology is early, or is this the way of things going forward?
Rowan Trollope
Well, I think it’s the way of things going forward. I was just recently in Israel last week visiting our engineering team. We have a big engineering team, a lot of really smart people, some of the smartest people in the world working on Redis, on the core of Redis. And we were talking about how AI has changed things. Our company is really leaning in with AI coding tools and expanding to the rest of the org. And I was talking to some of the engineers about just how much has changed.
A few things really stood out to me. One, and this is kind of from my own usage of the tools as well, it really brings together some different parts of the organization that have been sort of separate specialties. So product management and engineering and UI design, just take those three. Kind of the three jobs you would need to build a product. You need product management, what problem are we solving and how are we going to solve it. Developer, how do I actually go write the code. UI and UX, how do we display this to the customer.
What I see happening is those three specializations really blurring and coming together. I heard Satya Nadella talk about this last week and I think he said at LinkedIn they’ve moved in this direction towards creating product builders. A new role that’s not a developer or a product manager or a UX designer but it’s a product builder, a more generalist kind of role.
One great example is Salvatore Sanfilippo, the author of Redis. Salvatore is not a specialist coder, he is an incredible coder, one of the best coders I’ve probably ever worked with, but he’s also a product manager. He defines the requirements, he works with the community, he works with his users, in this case developers, to understand what would help them do their job. Salvatore is already one of these product builders. He understands his customer, he understands how to write code, and he also does UX design. In our case, because we’re an SDK, the UX is the docs, and Salvatore has always written his own documentation.
He’s a good example of someone who has always been a product builder and was for a long time considered one of the most successful and prolific engineers in our industry. That’s a model for I think how things are going to play out.
If you look at the rest of the teams, the classic way big software companies and startups have been built, we tended to go further into specialization. And this used to really get to me, when I would hear people talk about engineering as if it were a factory: somebody writes specs, hands them to a bunch of developers who are kind of the factory workers, they build the thing and kick it out the other end, and then somebody tests it and sells it to the customer. That always struck me as the wrong analogy for our industry.
My model of the world has always been more of the craftsman world where a craftsman is a product builder. If you take a craftsman who’s going to make a chair and they’re a woodworker, they don’t have somebody else designing the chair. A craftsman designs the product, they understand the use case, understand the user, they’re artistic, they’re creative, they come up with it, then they build it. And so I think what AI enables is a move back towards craftsmanship in this industry, but it’s super-powered craftsmanship because now you’ve got a complete studio of automation tools that can do everything for you from idea to implementation.
I’ll give you one example. This happened this morning. I was sitting with my CTO Benjamin and we were talking about how to make it easier for Claude Code to store its memories and how do we actually implement it in Redis. And I had an idea over the weekend, which was what if we had a CLI that translated POSIX file commands into Redis commands, and then you can just throw that CLI with a skill file at Claude Code and say hey I want you to store your stuff in Redis, not in the local file system.
We talked about it and in the middle of the conversation I said, wait, why don’t we just go spec this out while we’re talking, which we did. Then we created a prompt, I then used Claude Code to create the PRD, looked over the PRD, looked like a good idea in principle, tweaked it a little bit, and then basically said Claude Code, go implement this PRD. Now it’s still working on it, but it’ll be done soon. And that means that tonight I will have that product that Benjamin and I batted around in terms of an idea early this morning in my hands to go test with my agent.
That is a completely new world. That project may not have been very complicated, but the barrier, like, is it important enough that I would actually go do it, maybe, maybe not. You know, it’s one of those fleeting ideas that you have as a CEO or CTO. But now that I have this technology on my hands I can just tell it to go do it. I’ll show up tonight and it’ll be done. My agent’s running in a loop, taking step by step by step, and the thing will be done. I’ll be able to try it, if it makes sense maybe I’ll throw it on GitHub. What a dramatic change.
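The CLI idea sketched above, translating POSIX-style file commands into Redis commands, might look something like this in miniature. Everything here is hypothetical: the command names, the mapping, and the dict standing in for a Redis server are illustration only, not the tool the Redis team actually built.

```python
# Toy sketch: a shim that translates POSIX-style file commands into
# Redis-style commands, so an agent told to "store your files in Redis"
# can keep its file habits. A dict stands in for the real Redis server.
fake_redis = {}

def translate(cmdline):
    """Map a file-style command line to the Redis command it would issue."""
    parts = cmdline.split(maxsplit=2)
    op = parts[0]
    if op == "cat":            # cat <path>          -> GET <path>
        return ("GET", parts[1])
    if op == "write":          # write <path> <data> -> SET <path> <data>
        return ("SET", parts[1], parts[2])
    if op == "rm":             # rm <path>           -> DEL <path>
        return ("DEL", parts[1])
    raise ValueError(f"unsupported command: {op}")

def run(cmdline):
    """Execute the translated command against the stand-in store."""
    cmd = translate(cmdline)
    if cmd[0] == "SET":
        fake_redis[cmd[1]] = cmd[2]
        return "OK"
    if cmd[0] == "GET":
        return fake_redis.get(cmd[1])
    if cmd[0] == "DEL":
        return fake_redis.pop(cmd[1], None) is not None

run("write memories/plan.md draft-the-PRD-then-implement")
print(run("cat memories/plan.md"))  # -> draft-the-PRD-then-implement
```

Handed to a coding agent alongside a skill file, a shim like this would let it keep writing "files" while the data lands in Redis instead of the local file system.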
Now you can just imagine what this means for every other job function when this technology comes to those functions. I’d say one last thing on that front, which is the next logical question is where’s it likely to go next. And if you think about what made this so successful from an engineering perspective, it’s that the results are provable. We can run a lint checker or a type checker or a compiler against some code. We know if it compiles. So there is a provable nature to the world of development.
So if we look at all the other functions where the outcome is provable, those are likely the next pins to fall because that’s where you’ll be able to really train these models in an effective way.
Saket Saurabh
Yeah, and I think that’s a really good framework because what is provable as an outcome is kind of protecting you from some of the hallucinations of an agent. Or let’s say it’s a business process like managing a procurement or making a payment, you may not have as much provability there and that’s where the challenges would be.
But I really love what you’re saying in the sense that it’s all about having the idea and knowing what you want and knowing what is actually good, and then AI can do the work for you. That happened initially with writing content, like you can just generate a lot of content but what matters is is it good, can you judge it, can you write the right prompt.
And here we are saying that if you know what you want the product to do, you can get it built. Previously it was, give me a detailed spec and I’ll build it for you. Now you give the spec to Claude Code and let it build it for you. And I’ll have a lot more things to throw at the market, or at the users, to see what really makes sense. So things become easy to build, and maybe easy to discard too?
Rowan Trollope
Yeah, exactly. And I think Bezos once talked about Amazon’s superpower, and the superpower of being a scaled enterprise was that they would move quickly and their ability to move quickly allowed them to try more things. By trying more things they would find the few things that worked really really well faster, and then they would go build them.
Well you’re now in a world where I think even a startup has the ability to try way more things. You don’t have to be Amazon with hundreds of thousands of engineers to be able to say let’s throw 10 people at that for six months and just try it. Anybody can do that now.
And what I would add in this journey is that ultimately, if you eliminate these bottlenecks in the process, you get down to the real limiting factors of a business. In some cases the limiting factor is capacity, could you build something in enough time. But mostly, once that bottleneck is gone, the limiting factor becomes ideas: what are the things that make sense to build. Number two is distribution: knowing the few things that would make a really big difference to growing your business, and your ability to put those in front of enough people to get traction and adoption.
And I think those two problems, I don’t know that they are changed dramatically by AI just yet. But when you take the bottleneck out of writing the code or building the thing, it goes back to what should we build in the first place. In our company we’ve got a lot of ideas but none of this technology has made us better at knowing what the right things to build are. That’s something where we can certainly get more feedback and faster results by just trying a lot of things, but it does let you spend more time thinking about what are the right things that are going to change your business, and then those are the ones to go do.
Saket Saurabh
Yeah, and it takes a lot of context to actually distill that down, like what makes sense. And it’s a fast-moving context world. I mean, you thought about this idea of moving that markdown into Redis instead of on a file because last week Claude Code came out. And new things are coming out every week.
It’s an exciting time to be in technology. But at the same time I want to ask a little bit about the platform side. We talked about the chair analogy, we’re saying that we used to think about features, now we need to think about product, not just a feature as an engineer or product manager or designer. And in the same way, we have often thought about tools and platforms, but now customers are really thinking about solutions. They’re not saying I want a database or a model, I want something that solves this particular problem for me. How do you see the role of platforms as this solution becomes top of mind?
Rowan Trollope
Well, let me unpack the question, because I want to make sure I answer the different parts of it. We’re clearly in a world where, take Palantir and their forward deployed engineers, that model has proven itself out in their case, which as far as I understand involves a lot of government contracts. We have actually adopted a forward deployed engineer model at Redis in our industry.
I think their observation as I understand it is that a lot of these companies or agencies don’t have the talent to go and implement these solutions and that’s why you need to include engineers, FDEs, into that picture. For us, as we have engaged with our customers particularly around agentic, we do have forward deployed engineers who work with those customers because it’s hard to build these applications right now.
The infrastructure layer is not solidified yet enough. We’re not yet in that turn-the-crank implementation phase. We understand very well how to build a consumer app that can get to hundreds of millions of users, you build a Swift app, you build an Android app, we have all the tools for testing it and building it, the middleware and the back end, mobile backend as a service and identity. You can find thousands of engineers in San Francisco who could do that end-to-end process for you.
In the case of agents, that’s not the case. It’s going to be very hard to find somebody who can tell you I’ve got a pattern, I know how to do it, I’m going to bang this out, because we’re just not there yet. We’re still early in that process. So our forward deployed engineers are helping enterprise customers make some of those early choices. Especially in highly regulated verticals there’s a lot of work to do where the progress is going to be furthest out in the sense that you’ve got extremely sensitive customer data, everything has to be on premises, there’s all kinds of hurdles.
We’re still pretty far from having production apps on that front. Hence, putting forward deployed engineers into these companies helps them adopt these technologies faster.
We’ve also seen a lot of progress by integrating with the leading ecosystems. Redis has been integrated into LangChain, for example. I would say maybe 50% of the deployments that we see in enterprise of agents are using LangChain or LangGraph as an underlying foundational infrastructure.
Now all of the model providers are starting to go there. There was an announcement from OpenAI on this front, Microsoft shipped their agent framework, and Google has their agent framework. Redis is integrated as the default memory storage platform for Microsoft’s agent framework, and we’re doing similar integrations with OpenAI.
And the second thing that’s driving it, because there was a survey that came out recently that showed Redis as number one by market share for agent data, well, how did we do that? One reason is that our technology is so easy to use. There’s a CLI, the commands read like plain English, and even the protocol is essentially written in text. It’s easy to debug. It’s kind of perfectly built for an agent. Agents love using Redis, and agentic coding tools are very, very good at using Redis.
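The point about the protocol being essentially written in text refers to RESP, Redis’s wire protocol, in which a command is framed as an array of length-prefixed bulk strings. A small encoder shows why a captured request stays human-readable:

```python
def encode_resp(*args):
    """Encode a Redis command in RESP, the text-based wire protocol.
    A command is sent as an array (*N) of length-prefixed bulk strings
    ($len), so a captured request is readable with the naked eye."""
    out = f"*{len(args)}\r\n"
    for arg in args:
        out += f"${len(arg)}\r\n{arg}\r\n"
    return out

print(repr(encode_resp("SET", "user:42:name", "Rowan")))
# -> '*3\r\n$3\r\nSET\r\n$12\r\nuser:42:name\r\n$5\r\nRowan\r\n'
```

Every token in the frame is visible as plain text, which is part of why both humans and coding agents find Redis traffic easy to inspect and debug.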
That’s another aspect of what drives success here. In our industry we used to build infrastructure tools for developers. Now we are increasingly building tools for agents who are developers. Agents are the developers. And so that’s a different skill set. You’re not building flashy websites or going to open source conferences. You’re putting markdown files on the internet that are written as the simplest possible explanation of how to use your tool. I actually anticipate that any of these legacy tools and APIs and platforms that are hard to use are going to really struggle from an agentic perspective, because if the LLM is smart it’s going to prefer simpler. Simpler is always better.
Saket Saurabh
Yeah. I think one of the things happening right now, back to that FDE question, is that building these things is hard. AI agents have suddenly created a market where everybody is in the market at the same time. I’ve talked to companies in manufacturing sectors outside the US, and given the level of technology talent they can get, working with agents looks easy on the surface, but there’s a lot of tech stack that goes beneath it.
I agree with you that it is hard to build. Making it possible, whether by integrating into existing solutions or otherwise, becomes critical. I also feel like many of the companies I talk to are saying, hey, what I want is to make this part of my process. What it means in terms of agents and models and context, all of that stuff, is not something they are really thinking about. They need somebody to help.
Rowan Trollope
Yeah, and they need to see, okay, has anyone done this before and what tools did they use? We’re in that front-end part where everyone in the tech world now says they’re an AI company. Any new interesting problem, there’s probably 10 startups going after solving it. So it could be very confusing for a developer to figure out what’s the best practice here. There is no best practice yet. There’s a bunch of people trying things. We’re in that early phase.
This infrastructure stack for agents needs to settle down a little bit and get some more maturity before we really start to see the volume of applications being created. But I think that’s happening much faster than in any previous cycle. And that’s the important thing: don’t assume the timeline will match history. If you use history as your guide, how long did it take for the cloud mobile stack to solidify to where it became easy, quote unquote, to build a scaled application on the internet? That maybe took three to five years.
I think this time it’s going to be much faster. That stack is probably going to solidify very quickly. And frankly, if you can become one of the apps in that stack, one of the standards, like Redis is the standard in the cloud mobile stack for fast data storage and operational data platform, there’s a tremendous amount of value in being one of those winners.
Saket Saurabh
Absolutely. And you talked about every company calling themselves an AI company. If you break it down, there are the model companies, of course, with the foundational models, just a handful of those. And then there are all these technology companies in the enterprise space, which have varied context around them. So how do you think about context-centric companies versus model-centric companies and the opportunities? Because I see the model companies also getting into everything right now.
Rowan Trollope
Well, I’ll take the first part of that one. I think the model vendors are certainly going to try to make it as easy as possible to build applications using their platforms. The CSPs, Microsoft, Amazon, Google, I think they clearly want to have the front-end experience around how do you create an application on my platform. Obviously Microsoft has partnered with OpenAI, Amazon has partnered with Anthropic. I think we’re going to probably end up in a world where the hyperscalers do have a significant foothold in terms of enterprises or businesses that want to build applications going to their platform and getting an end-to-end stack.
It’s not clear to me where the line will be between the model companies and the hyperscalers. If I had to guess, I think the real expertise in the model companies is the models. The expertise in building enterprise software in general is going to be with the hyperscalers and also the distribution. So I would not take Google, Amazon, or Microsoft out of the running for owning that front-end experience.
We saw Cursor really take off, we saw Claude Code take off. I’m certain that you’re going to see the model vendors try to own more of that stack. How much they’re going to own beyond the model is not totally clear to me yet. What they have going for them right now is they’re small, they’re scrappy, they’re moving very very quickly. The OpenAI agents framework is fantastic. Anthropic has got an amazing platform in Claude Code which they just upgraded to a native application. So you’re seeing that space move super fast. Not clear where the line will be between what the hyperscalers do and what the model companies do, that’ll remain open.
I don’t think anyone has really moved seriously in the direction of going further down the stack, specifically to data platforms. There are obviously the two pre-IPO companies in the data world, Databricks and Redis. In terms of having the biggest scale and being ready to go out and IPO, we’re one of those companies, and we’re seeing a lot of our growth driven by AI workloads.
But I think you’re going to have a lot of data companies that are not finding a place for themselves in this new stack. And if you look in the public markets from a data infrastructure perspective, you’ve got the hyperscalers with their own solutions, but beyond that it’s a small list of companies. There are obviously Snowflake and MongoDB, with clearly multi-billion dollar revenue streams, who want to make sure they’re part of that next-generation AI stack. Snowflake has obviously been doing a lot with agentic AI.
So those are the players as I see it who are circling around this. There’s a long list of ankle biters and new startups coming out to solve some of these new problems. But I don’t see enterprises looking to adopt new data platforms yet. There could be some specialized use cases, perhaps a super-low-cost, reasonable-speed vector database, maybe one. We’ve seen Google out there with their vector store that’s scaled up, and some startups coming at that space also. There are some edge cases and interesting use cases, but for generic data platforms it feels like the big guys in the category have really responded, built out the additional capabilities, and are figuring out how to integrate with these ecosystems effectively.
Saket Saurabh
Yeah, and I think memory has a really important role to play. The way I see this is that the Nvidia acquisition of Groq meant paying a premium for on-die memory, especially for inferencing workloads, where the next level of memory performance is the RAM or the DDR that comes with the chip. And if I draw the same parallel, I would say Redis is the fast memory compared to, say, a data warehouse or a database. There is a premium when you can access that information or that context faster at the time of inferencing and stitch it together.
Rowan Trollope
That’s right. Of the emerging problems and interesting things to go do and build businesses around, that is the one that I think is most interesting to us: inference-time workloads. When you are running your inferencing, there are a bunch of things to do.
The first interesting opportunity that we saw emerge, and we predicted this, was that for many applications the cost of intelligence, the token costs, was going to be too high to really put something into production. And by the way, those costs are plummeting, dropping orders of magnitude per year; the cost of intelligence is probably going to asymptote toward zero. But even with that, a tremendous amount of infrastructure is necessary to make that happen, to drive those costs down. Caching, a classic computer science pattern, is one way to do it.
So that’s where Redis has been used, and one of the killer apps on our platform has been caching. We built a product called LangCache. LangCache is a web service, it’s at redis.io/langcache, and what you can do is take a prompt and just feed it into LangCache. If it has seen a similar prompt before, it can just feed you the answer that it already has. If it doesn’t have the answer, nothing similar to that, it can go hit the LLM and then populate the cache. So it’s a cache that sits at the prompt layer.
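The pattern behind a prompt-layer cache like this can be sketched in a few lines of Python. This is an illustrative toy, not the LangCache API: the embed function stands in for a real embedding model, the llm callable stands in for whatever model client you would otherwise hit, and a production system would back the entries with a Redis vector index rather than a linear scan.

```python
import math

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: a tiny bag-of-words vector.
    vocab = ["flight", "seat", "enroll", "plan", "cost", "paris"]
    words = text.lower().split()
    return [float(sum(w.startswith(v) for w in words)) for v in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Prompt-layer cache: serve answers for semantically similar prompts."""

    def __init__(self, llm, threshold: float = 0.9):
        self.llm = llm              # fallback when nothing similar is cached
        self.threshold = threshold
        self.entries = []           # list of (embedding, answer) pairs

    def query(self, prompt: str) -> str:
        vec = embed(prompt)
        for cached_vec, answer in self.entries:
            if cosine(vec, cached_vec) >= self.threshold:
                return answer       # cache hit: the LLM is never called
        answer = self.llm(prompt)   # cache miss: call the model, populate cache
        self.entries.append((vec, answer))
        return answer
```

The key design point is that similarity, not exact string equality, decides a hit, which is why reworded consumer questions like the open-enrollment ones can still be served from cache.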
We saw one company, and this is very good for certain kinds of use cases, take open enrollment in the healthcare industry. In that world, in January and February, or whenever open enrollment is, tons of people go to the website and ask questions. We ran a POC for one of the largest healthcare providers, and they’re working on an implementation now, where we saw a cache hit rate on questions from consumers of 95%. That’s very significant savings for them on the large language model.
We have another Fortune 50 company who just bought our product and implemented it in all their TVs. So when you go into their voice search, it turns out a lot of it is not a multi-turn conversation. It’s like find me this, what’s good in this genre, play the Super Bowl, that kind of stuff. Those are all very, very cacheable, and they saw a 70% reduction in their LLM charges by using LangCache.
The second area, and there’s a bunch of areas we can talk about, is agent memory, and another is data access, enterprise data access and managing context. We can go in either of those directions.
Saket Saurabh
Yeah, I think what we do at NextConnect is connect to the different enterprise systems and bring the data, because that’s the core of the technology. And I can totally see that caching that information, because you don’t want to hit that SAP system too many times. Rate limits and other things to deal with, it’s a real problem.
Rowan Trollope
Yeah, yeah, that’s exactly what we’ve been focused on. So the problem as I would describe it, just to restate what you just said in my words, is how do you get the right context, the right data into the context window at the right time, and how do you do that quickly and efficiently and securely.
The first pass at this, if you go back in time to maybe three months ago, how would you build an agent? You would as a developer write some middleware. In your agent run loop you would say, okay, take the prompt, go get my static context that I care about. Maybe use an LLM to figure out in the prompt if there’s some useful context that I need to bring in, go do your database query, pull your thing out of memory or whatever, and then stash it into the query. So if it’s a travel agent bot, we’re going to need the travel policy for the company, probably always need that, so just stuff that into the context window.
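That hand-rolled middleware pattern looks roughly like this in Python. All the names and the policy text here are hypothetical; the lookup function stands in for the LLM- or database-driven context fetch described above.

```python
# Static context the developer decides the agent always needs.
STATIC_CONTEXT = {"travel_policy": "Economy class for flights under 6 hours."}

def fetch_dynamic_context(prompt: str) -> list[str]:
    # Stand-in for an LLM- or keyword-driven lookup against a database:
    # decide whether the prompt calls for extra context, then go get it.
    snippets = []
    if "flight" in prompt.lower():
        snippets.append("Preferred airlines are listed on the corporate portal.")
    return snippets

def build_prompt(user_prompt: str) -> str:
    # Hand-rolled middleware: push static and looked-up context
    # into the context window ahead of the user's message.
    parts = [f"[policy] {STATIC_CONTEXT['travel_policy']}"]
    parts += [f"[context] {s}" for s in fetch_dynamic_context(user_prompt)]
    parts.append(f"[user] {user_prompt}")
    return "\n".join(parts)
```

This is the "push" model: the developer decides up front what goes into the window, which is exactly what MCP's "pull" model replaces.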
But then along comes MCP, a different way of pulling context instead of pushing it. So instead of the developer hand-rolling the context, now we have the agent asking for context. We tell the agent here are the MCP tools you have access to, here’s what they can do, describe them, and then let it go pull the context. Perhaps you have an API, get travel policy. Well that’s pretty simple. Anytime it needs the travel policy, it can just call that API, get the information, bring it into the context window.
The problem with that, it works well for something very simple like a travel policy, but when you start talking about complex back-end structured data sources you’re not going to want to point your agent directly at your database. All of those database companies immediately came out with MCP servers. You might think the best way to solve this is take the MCP server, stick it in my agent, and let it party on my data. Bad idea. The relationships in your underlying structured data stores are completely opaque. You don’t know what they are, you don’t know what the underlying data is in these tables. And oh by the way, you definitely don’t want your agent accessing everything in your back-end data store. It can get confused. You can overload the context window with information.
We have a different approach. We did an acquisition of a company that brought us a very popular open source project called Enrich MCP, based on the idea that what is needed to solve this problem is a semantic layer that sits on top of the data and allows you to define an object model essentially for your data. Take your structured data, denormalize everything, and then expose it in a way that is very very easy for an agent to understand.
We’ve now been working on our new product which is called Redis Context Engine. What the Context Engine does is take structured and unstructured data and put it into a materialized view and an object model view with descriptions and relationships that are all described in English, and then publish through an MCP server to the agent.
Part one is how do you get the data into Redis in the first place. We have a new product called Redis Data Integration. You can set up a CDC pipeline, pick your tables, pick your columns, pick your queries, pick your basic transforms. I want to take my Workday employee table, do a query, join it with another table, get the grade level and title of the employee, just take employee ID, grade level, and title, and stick that into a new table, denormalize that, and store it all in Redis in a series of keys. Make it really fast. Now we have a pipeline that’s set up to read your enterprise data, just the stuff you want, and push it into Redis.
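That denormalization step can be sketched in Python. This is a toy illustration, not Redis Data Integration itself: the source tables are made up, and a plain dict stands in for Redis so the sketch runs without a server (with the redis-py client you would write each record with hset instead).

```python
# Toy source tables, standing in for the Workday employee data.
employees = [{"id": "e1", "name": "Ada"}, {"id": "e2", "name": "Lin"}]
grades = {"e1": {"grade": "L5", "title": "Staff Engineer"},
          "e2": {"grade": "L3", "title": "Engineer"}}

store = {}  # stand-in for Redis keyed storage

def sync_employee(emp: dict) -> None:
    # Join the two tables, keep only the fields the agent needs,
    # and write one denormalized record per employee under its own key.
    g = grades[emp["id"]]
    store[f"employee:{emp['id']}"] = {
        "employee_id": emp["id"],
        "grade_level": g["grade"],
        "title": g["title"],
    }

# A CDC pipeline would invoke this on every change event; here we
# just run the initial sync over the whole table.
for emp in employees:
    sync_employee(emp)
```

The point of doing the join and projection at pipeline time is that reads become single-key lookups, which is what keeps the agent-facing side fast.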
The northbound side is how we expose that to our agent. With the Context Engine, what you do is create Pydantic models around the underlying data in Redis, describe what’s sitting there as a set of objects, and semantically describe the fields, the keys, and the relationships between the data and the objects in your model. Then with that small amount of work on the developer’s or data engineer’s part, you expose an MCP tool that surfaces exactly what the agent needs, already pre-processed, already secured, already done in the right way, making the effectiveness of those agents super high.
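The semantic-layer idea can be sketched as follows. This is a hypothetical illustration, not the Context Engine API; to keep it dependency-free it uses plain dataclasses with field metadata, where the product described above uses Pydantic models.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Employee:
    """An employee record, denormalized from the HR system."""
    employee_id: str = field(metadata={"desc": "Stable HR identifier"})
    grade_level: str = field(metadata={"desc": "Internal level, e.g. L5"})
    title: str = field(metadata={"desc": "Human-readable job title"})

def describe_model(model) -> str:
    # Render the object model as English the agent can read, the way a
    # tool description would be published over MCP.
    lines = [f"{model.__name__}: {model.__doc__.strip()}"]
    for f in fields(model):
        lines.append(f"- {f.name}: {f.metadata['desc']}")
    return "\n".join(lines)
```

The idea is that the agent never sees raw table schemas; it sees a small, plainly described object model, which is what keeps it from getting confused or flooding its context window.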
It’s got an MCP northbound interface, a data gathering and transformation pipeline in the middle, and you come in at the bottom end and in the middle to define the objects for your agent. That’s solving problem number one, enterprise context and how you get that.
Now why Redis? We’re really fast. Redis is extremely low latency, designed as an operational data store for applications. This exact pattern I just described is how people were building cloud mobile apps before. They were just doing a lot of that stuff manually with middleware. All we’re doing is saying, we kind of know this is a problem that’s coming at us, let’s just go ahead and write a basic product that can do all this, which will make the lives of these developers much easier. Instead of the engineer and the data engineer writing the queries, it’s going to be the agents writing the queries, and you want to do a lot of that work for them.
Saket Saurabh
I want to bring this back to something that I think you have said once before, which is that AI models don’t fail because they’re slow but they forget. I think you wrote something to that effect. And I feel like you’re solving both the problems, you are solving the slow problem and you’re also solving the forgetting problem.
Rowan Trollope
Well, the forgetting problem is an interesting one because it’s slightly different. The Redis Context Engine is about how do we put the context of our enterprise data into the model. That’s a good way of solving it. The second piece is agent memory, and how do you take all the data coming off of an agent and make sense of it.
So first and foremost an agent needs to sometimes store data. What we often see those agents doing today is using the local file system and text files because language models are very text-friendly. But memory is an emerging and very challenging problem.
We have an unstructured conversation between a user and a back-end agent. We don’t know what it’s about but we know that there are logs and transcripts. Take the travel agent example. In the conversation you may express certain preferences that should be remembered. You might say something like, I need to go to my meeting in Paris next week, can you help me book this flight? And the travel agent does some work and comes back and says here are the available options. And maybe one of the things you say is, well, I don’t ever like flying in window seats, I always like the aisle.
That’s a useful piece of information that needs to be remembered. What you don’t want is for that preference to be forgotten. But it’s packed into a bunch of other text about Paris trips and this and that. So that’s one problem, extracting semantic memories that are worthwhile to be remembered, user preferences, experiences, etc., and defining how much to remember.
And then there are even more second-order problems, like conflicting memories. Maybe you said you don’t like window seats today, but two weeks ago you actually took a window seat and it wasn’t so bad, so now it’s fine to book you in window seats. Now we have an overlapping and conflicting memory that has to be resolved.
To put it into context, how did we used to do this in the cloud mobile era? We would build a SaaS app and then build a preferences page with a CRUD-style interface, like which seat I prefer, window or aisle, and some PM would write a requirement and you’d go save a field in your database that says user prefers aisle seats. Well, we’re not going to go build that kind of an app anymore. We don’t need to define a priori all the user preferences. The user can just express them in their normal native language, the LLM can figure it out.
We use an LLM under the covers. We’ve built a product called Agent Memory Server. You can just throw all of the chat transcripts at it. We use a large language model to extract relevant memories and store those in Redis as memories. We vectorize them and store the raw text as well. We have another side of the API which is to retrieve something, and you can pass in the context.
If we start a brand new conversation and you say I want to plan a trip to Brazil with my family, the travel agent will take that context, use its MCP tool to search the Redis memory server, and say got any travel preferences for Saket? Oh yes, through a vector search we find he doesn’t like window seats. And so we can include that, the agent can be intelligent and go, Saket, I remember two weeks ago you told me you didn’t like window seats, so I’m going to make sure that you are booked in an aisle seat. That’s a really delightful experience to have an agent that remembers you and your preferences.
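The store-then-recall loop described here can be sketched end to end. Everything below is a stand-in: a keyword filter plays the role of the LLM extractor, a bag-of-words vector plays the role of a real embedding, and a linear scan plays the role of Redis vector search.

```python
import math

def embed(text: str) -> dict:
    # Stand-in for a real embedding model: sparse bag-of-words counts.
    vec = {}
    for w in text.lower().split():
        w = w.strip(".,?!")
        vec[w] = vec.get(w, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryServer:
    """Toy agent-memory store: extract, vectorize, recall."""

    def __init__(self):
        self.memories = []  # list of (embedding, raw_text) pairs

    def extract_and_store(self, transcript: str) -> None:
        # Stand-in for LLM-based extraction: keep lines that look like
        # preferences worth remembering, drop the rest of the chatter.
        for line in transcript.splitlines():
            if "prefer" in line.lower() or "don't like" in line.lower():
                self.memories.append((embed(line), line.strip()))

    def recall(self, query: str, top_k: int = 1) -> list[str]:
        # Stand-in for vector search over the stored memories.
        qv = embed(query)
        ranked = sorted(self.memories,
                        key=lambda m: cosine(qv, m[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]
```

The asymmetry is the point: storage happens on the whole noisy transcript, but recall returns only the distilled preference, so a brand-new conversation can pick it up.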
Now, some of the model vendors have started to solve this in their apps, not at the API layer but in the apps. ChatGPT, if you tell it explicitly to remember something, it will. Some of the new agents being built, like those using text files to store memory, take an interesting approach, but I think we’re going to see lots of different things.
We’ve been working on this Agent Memory Server. It goes well beyond just simple extraction and recall of memories. It also deals with more thorny problems like what I call contextual grounding. Contextual grounding is essentially this problem: if you were to say something like, I need to go to Paris this Friday, and we were to store that, and later today or tomorrow you came back into a chat with us, it’d be good for us to remember, hey, are you still planning that trip this Friday? Well, this Friday is a temporal reference that needs to be grounded to an absolute time. It’s a relative temporal reference: on this day, Monday, you said this Friday, and that’s actually February 8th or whatever. So contextual grounding, in this case temporal grounding, means replacing relative time references with actual times, and replacing relative locations with actual locations.
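Temporal grounding of the this-Friday kind can be sketched with the standard library alone. This is purely illustrative: it handles only phrases of the form "this <weekday>", while a real system would resolve far messier relative references.

```python
import datetime

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday",
            "saturday", "sunday"]

def ground_relative_day(phrase: str, said_on: datetime.date) -> datetime.date:
    # Resolve a relative reference like "this Friday" to an absolute date,
    # anchored to the day the sentence was actually spoken.
    day = phrase.lower().replace("this ", "").strip()
    target = WEEKDAYS.index(day)
    delta = (target - said_on.weekday()) % 7
    return said_on + datetime.timedelta(days=delta)
```

Storing the resolved date instead of the raw phrase is what lets a later conversation ask "are you still planning that trip?" without the reference having gone stale.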
And then conflict resolution. You say you’re a vegetarian now, but then maybe in a month you decide you’re not a vegetarian anymore. How do I resolve the fact that you’ve said two things that conflict?
All these areas are solved in a way that’s inspired very much by the human brain. If you look at our long-term memory and some of the processes that run during REM sleep, there is a process that goes through and takes the collected memories in the short-term working set memory of the brain and compresses it, extracts the facts to be remembered, and stores them in the long-term memory banks. It also does conflict resolution. We’re inspired by that in terms of how this product works.
By the way, this is similar to what other memory companies are doing, we’ve got our own unique takes. So that’s agent memory server. You’ve got context on the one side bringing enterprise and business data or data about the user into the context, and then you’ve got remembering things, storing memories, and then recalling them later to make a much better experience.
Saket Saurabh
Okay, thank you. I mean it’s such a fascinating conversation, Rowan. And I can see that we’ve gone from basically a fast memory cache, which is how Redis started, to an entire memory platform. And extremely relevant to AI and the outcomes we can get from that. Really appreciate you going deep into that.
Just to quickly wrap it up, you said something about the history aspect of it and you’ve been through the dot-com rise and fall. Quickly, are we in an AI bubble, or is this not one?
Rowan Trollope
I don’t know about that. I would say probably there are some bubbly aspects to what’s happening now. On the surface it looks like there is. But if the bubble last time was kind of defined by, hey we’re running fiber that nobody ever used, well all of our data centers are running hot right now. My top engineer who uses AI as sort of primary method of coding is spending as much on tokens as his entire salary. So we are burning these data centers up. So I don’t think it’s a demand problem for this compute, I think that demand is there.
But to the degree that a bubble exists in some way, which sure it does, the way I think about it is this: I went through the dot-com crash, and what washed out were the famous examples like Webvan and Pets.com, those apps. It was not clear that the apps were quite ready at that time. I think we’re in a similar state now, where I wouldn’t bet on any apps at this time. It’d be hard; you’d likely be wrong if you made those bets. But infrastructure is much safer. I worked at a cybersecurity company when the dot-com crash happened, and our sales just kept going. So I think infrastructure is a relatively safe bet, and it’s one of the reasons I chose to come to Redis and help build out this next generation of software.
Saket Saurabh
Awesome, this has been a great conversation, Rowan, very enlightening. Thank you so much for taking the time and would love to see how the memory platform takes off.
Rowan Trollope
Thank you for having me. Thank you.