Orchestrating agents, APIs, and MCP servers

Jerod:

Welcome to Practical AI, the podcast that makes artificial intelligence practical, productive, and accessible to all. If you like this show, you will love The Changelog: news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find us by searching for "the changelog" wherever you get your podcasts. Thanks to our partners at fly.io.

Jerod:

Launch your AI apps in five minutes or less. Learn how at fly.io.

Daniel:

Welcome to another episode of the Practical AI podcast. This is Daniel Whitenack. I'm CEO of Prediction Guard, and I'm really excited today to dig a little more into GenAI orchestration, agents, coding assistants, all of those things with my guest, Pavel Veller, who is Chief Technologist at EPAM Systems. Welcome, Pavel. Great to have you here.

Pavel:

Thank you. Hello. Hello.

Daniel:

Yeah, yeah. Well, I mean, there are a lot of topics. Even before we kicked off the show, we were chatting in the background about some really interesting things. I'm wondering if you could level set us, since people may or may not have heard of EPAM. One of the things that I saw you all were working on was this GenAI orchestration platform, Dial.

Daniel:

Maybe before we get into some of the specifics about that and other things that you're interested in, maybe just give us a background of what EPAM is. I know you mentioned even in our discussions that some of what you're doing right now maybe wouldn't have even been possible a couple of years ago, and so things are developing rapidly. Just level set the kind of background of this area where you're working.

Pavel:

Sure. Yeah, yeah. EPAM is a professional services organization. We're global. We're in 50 something countries.

Pavel:

50,000 people globally work with clients, and we've been doing it for, I think, thirty-two years to date. We do a lot of different things, as you can imagine. What I was mentioning about doing things that would not have been possible is doing things with GenAI today. We do a lot of work for our own clients.

Pavel:

We also do work for ourselves, applying the same technology, because EPAM historically has been running on software that we ourselves built. The philosophy has always been that things that do not differentiate you, like accounting software or a CRM, you go and buy off the shelf. Things that differentiate you, how we actually work, how we operate, how we execute projects, how we hire people, how we create teams, how we deploy teams: all of that software has always been our own, since as early as the late '90s, and we keep iterating on that software for ourselves. That software today is very much AI first, and a lot of what we do, we do with AI, and really can only do because AI in its current form exists.

Daniel:

Interesting. Yeah. And I guess, when we initially were prompted to reach out to you, part of it was around this orchestration platform. So talk a little bit, maybe not necessarily about the platform per se, although we'll get into that, but just about GenAI orchestration generally. You talked about some of these things that are becoming possible.

Daniel:

Where does orchestration fit in that, and what do you mean by orchestration?

Pavel:

You probably think of Dial. You can Google it. We do a lot of applied innovation in general as a company, and this is one of the good examples of applied innovation to AI. The best way to think of Dial would be you guys all know ChatGPT. Right?

Pavel:

ChatGPT isn't an LLM. It's an application that connects to an LLM and gives you certain functionalities. It can be as simple as just chatting and asking questions. It can be a little more complex, uploading documents and speaking to them, like Talk to my documents. It can be even more complex when you start connecting your own tools to it.

Pavel:

We see our clients not only do this, but also want something like this for their own business processes. This orchestration engine becomes: how do I make it so that I don't have 20 different teams doing the same similar things over and over again in their own silos? How do I connect my teams and their AIs and their thoughts and results into a consolidated ecosystem? Likely because of GenAI and what we can do with conversation and text, it becomes sort of conversation first. You can think of conversation-first application mashups, almost.

Pavel:

Right? Like you talk, express a problem, and what comes back is not just the answer. Maybe what comes back is UI elements, buttons you can click, forms you can fill out, things you can do, as well as things that are done for you by agents automatically. Dial, in that sense, is... well, by the way, it is open source. You can also go look, download, and play with it.

Pavel:

But it is a ChatGPT-like conversational application that has many capabilities that go beyond. We have Dial apps. They predate MCP, but the idea is that Dial itself has a contract, an API that you implement. You basically come back with a streaming API that can receive a user prompt. Whatever you do, you do, and you come back to Dial with not just text.

Pavel:

It's a much more powerful payload, with UI elements, interactive elements, things that Dial will display for me, the user, to continue my interaction. And Dial becomes this sort of center of mass for how your company can build, implement, and integrate AI into a single point of entry.
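To make "more than text" concrete, here is a hypothetical sketch of the kind of rich chunk such an app might stream back. The field names are illustrative only, not Dial's actual documented contract:

```python
# Hypothetical shape of a rich, streamed response chunk from a Dial-style
# app; the field names below are made up for illustration.
response_chunk = {
    "choices": [{
        "delta": {
            "content": "Three services are failing health checks.",
            # Beyond plain text: attachments the chat UI can render
            # and that the user can interact with.
            "attachments": [
                {"type": "table", "title": "Failing services", "rows": [["auth", "down"]]},
                {"type": "button", "label": "Restart all", "action": "restart_services"},
            ],
        }
    }]
}
print(response_chunk["choices"][0]["delta"]["content"])
```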

Pavel:

And then, well, from day one, Dial was a load-balancing, model-agnostic proxy. Right? Every model, every deployment has limits: tokens per minute, tokens per day, requests per minute. If you're a large organization with many different workflows, your AI appetite will likely go well beyond a single model deployment. You'd like to load balance across multiple, and then you'd like to try different models, ideally with the same API for you, the consumer. So Dial started like that. It started as a load-balancing, model-agnostic proxy, a single point of entry. We can log everything that is prompted in the organization.

Pavel:

We can do analysis on that separately, because it's very helpful to know what kinds of problems your teams are trying to solve. Then it evolved into this application-hosting ecosystem. Now it's evolving toward what MCP can bring, because you can connect a lot more things to it through MCP. I think it's running at 20-something clients by now.
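A minimal sketch of that load-balancing idea, assuming OpenAI-compatible deployments; the URLs and the retry policy below are hypothetical, not Dial's implementation:

```python
import itertools
import requests  # third-party HTTP client: pip install requests

# Hypothetical deployments of the same model behind an OpenAI-compatible API.
DEPLOYMENTS = [
    "https://east.example.com/v1/chat/completions",
    "https://west.example.com/v1/chat/completions",
]
_next_deployment = itertools.cycle(DEPLOYMENTS)

def chat(payload: dict) -> dict:
    """Round-robin one request across deployments, skipping rate-limited ones."""
    # A real proxy would also log the prompt here for later analysis.
    for _ in DEPLOYMENTS:
        url = next(_next_deployment)
        resp = requests.post(url, json=payload, timeout=60)
        if resp.status_code == 429:  # per-minute token/request limit hit
            continue  # fall through to the next deployment
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("All deployments are currently rate limited")
```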

Daniel:

So just a couple of follow-up questions. It's been in the news a lot, but just so people understand, if maybe they haven't seen it: what are you referring to with MCP, and how does that relate to some of this API interface that you're enabling?

Pavel:

Well, the easiest is to Google it. You'll find it; it comes from Anthropic, the makers of Claude. But let me tell you how I think about it, rather than what it actually is.

Daniel:

That's helpful.

Pavel:

Yeah. Yeah.

Pavel:

I think about it in very simple terms. MCP allows you to connect the existing software world to LLMs. I don't want to hype it too much, because it's not yet the global standard or anything. It's very, very early days. It's been months, right?

Pavel:

But think of HTML and browsers and HTTP: they enabled us to connect people to software all over the world. MCP does that, but for LLMs. If today I want my application that sits in front of an LLM to do things with additional tools when prompted... let's say I want it to search the file system based on what I prompted, find a file, and find something in that file. My application needs to be able to do that.

Pavel:

My option is what? I can write that function. I can then tell my LLM, Hey, here's this function you can call if you want to. Call it. I'm going to call it for you.

Pavel:

Great, that's one function. What if I need to do something else? I want to go talk to my CRM system and get something out of there. I'm going to write that function. If I'm going to write all the functions I can think of, it's going to take me years, probably hundreds of years.

Pavel:

Instead, what I can do today is say, Hey, my LLM application, can you talk a protocol? Because there's a protocol called MCP. I'm going to bring you MCP servers that other people have built: for my CRM system, for my file system, for my CLI. There are MCP servers for everything. IntelliJ exposes itself as an MCP server to do things that the IDE can do.

Pavel:

Now you can orchestrate those things through the LLM. You connect all those MCP servers through an MCP client, this application in front of the LLM, and expose the tools to the LLM. The LLM can now ask the client to call a tool; through the MCP protocol, the client calls the server, the server runs the function that was written into it, and boom, the LLM gets results. It's this connective tissue that did not exist three months ago. Three months ago, everybody was writing their own.

Pavel:

And right now, everybody, as far as I can tell, is writing MCP servers, and those who talk to LLMs consume MCP servers.
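A minimal MCP server along the lines of Pavel's file-search example, assuming the official Python MCP SDK and its FastMCP helper (the exact API may differ across versions):

```python
# pip install "mcp[cli]"  -- the official Python MCP SDK (assumed here)
from pathlib import Path
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("file-search")

@mcp.tool()
def find_files(root: str, pattern: str) -> list[str]:
    """Recursively find files under `root` whose names match a glob pattern."""
    return [str(p) for p in Path(root).rglob(pattern)]

if __name__ == "__main__":
    # Serves the tool over stdio; any MCP client (the app in front of the
    # LLM) can now list it, expose it to the model, and call it on request.
    mcp.run()
```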

Daniel:

Yeah. And maybe just to expand on that... I like the example that you gave of searching file systems. Just to broaden people's understanding of some of the possibilities, what are some of the things that you've seen implemented in Dial as things that are being orchestrated? In general terms, what are some of these things?

Pavel:

Let me give you a higher-level and much more fruitful example, okay? Yeah. We have our own agentic developer. It's called AIRun Codeme; AIRun has multiple different agentic systems, and Codeme is specifically coding oriented.

Pavel:

We have others oriented at other parts of the SDLC workflow. By the way, you can go to SWE-bench and look at the Verified list. I believe Codeme as of now takes fifth; it's number five on the list of all the agents who compete at solving open source defects and such. Codeme as an agentic system has many different assistants in it. Dial, as a generic front door, as a ChatGPT, would like to be able to run those assistants for you as you talk to Dial.

Pavel:

Until MCP, it really couldn't, other than saying, Hey, Codeme, implement an API for all of your assistants, and let me learn to call all of your APIs. Now the story is, Hey, Codeme, give me an MCP server for you, which is what they have done.

Pavel:

Dial, as an MCP client, can now connect to all Codeme features, all the assistants, expose them as tools to an LLM, and orchestrate them for me. So I come into the chat and ask for something, and that something includes reading a code base and making architecture sketches or proposals or an evaluation, right? The LLM will ask Codeme assistants to go and read that code base, because there is a feature in Codeme that does it, and Dial only needs to orchestrate; it doesn't need to rebuild anything from scratch. That's the idea, and this is the example.

Daniel:

Yeah. Could you talk a little bit... I'm asking selfish questions, because sometimes I get asked these myself and I'm always curious how people answer. One of the questions that I get asked a lot with respect to this topic is: okay, I have tool or function or assistant one, and then I have assistant two, and then I have a few. Right? And it's fairly easy to route between them because they're very distinct, right?

Daniel:

But now imagine, okay, now I could call one of a thousand assistants or functions, or later on 10,000, right? How does the scaling and routing actually work, and how is it affected as you expand the space of things that you can do?

Pavel:

So that, I think... and again, I can't know, and I don't know, but I think that is still the secret sauce. In a way, that is still why there are all of these coding agents on SWE-bench. All of them work with, let's say, Claude Sonnet 3.5 or Claude Sonnet 3.7 or GPT-4o. The LLM is the same, and yet the results are clearly different. Some score 10 points higher than others.

Pavel:

You go to Cursor, the IDE, you ask it something, it does something. You switch the mode to Max, which they introduced very recently. Cursor on Sonnet 3.7, and now on Gemini 2.0, I think, has a Max mode, which is pay-per-use versus their normal monthly plans, because Max will do more iterations, spend more tokens, be more expensive, and likely run through more complex orchestrations of prompts and tools and whatnot to give you better results. How you build the pyramid of choices for your LLM, how you... yeah, you will not give the LLM a thousand tools.

Pavel:

If you as a human look at a thousand options, you lose yourself; in a hundred options too. Again, I don't know, but I expect an LLM to have the same sort of overwhelmed effect. You don't want to give it a thousand tools. You want to give it groups. You want to say, Hey, pick a group.

Pavel:

So you want to do this basically like a pyramid, like a tree. But how you build it and how you prompt it and how you do this, that's still on you. This is the application that connects the MCP servers, the tools that it itself has, the prompt that the user has given, the system instructions, and whatever chain of thought the LLM can build.
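A minimal sketch of that pick-a-group-first routing; `llm.ask` and all the group and tool names here are hypothetical stand-ins, not a real API:

```python
# Two-stage tool routing: the model first sees only a handful of groups,
# then only the tools inside the chosen group. All names are hypothetical.
TOOL_GROUPS = {
    "crm":        ["lookup_customer", "update_deal", "list_contacts"],
    "filesystem": ["find_files", "read_file"],
    "analytics":  ["run_query", "summarize_table"],
}

def route(llm, user_prompt: str) -> str:
    # Stage 1: a short list the model can actually scan without drowning.
    group = llm.ask(
        f"Pick the one tool group that fits this request {list(TOOL_GROUPS)}: "
        f"{user_prompt}"
    ).strip()
    # Stage 2: expose only that group's tools for real function calling.
    return llm.ask(user_prompt, tools=TOOL_GROUPS[group])
```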

Pavel:

And this is going to be a very interesting balance: how much of this sequencing of steps will be in your hands versus how much you delegate to the LLM and ask it to come up with the sequence of steps. From what I've seen over the last year, you're better off delegating more to LLMs, because they keep getting better at it. The more you control the sequence yourself, the more inflexible it becomes. You're better off delegating to an LLM, but you shouldn't expect it to just figure everything out from one prompt. Daniel, I can give you that example I mentioned in the beginning, if you want, about the failure.

Daniel:

Yeah, go for it.

Pavel:

I use AI. I build with AI, but I also use AI as a developer. I'm on Cursor as my primary IDE these days. I use the AIRun Codeme that I mentioned.

Pavel:

I play around with other things as they come up, like Claude Code, but I also record what I do: little snippets, five-to-ten-minute videos for my engineering audience at EPAM, for folks to just see what it is that I'm doing, learn from how I do it, try to think the same way, try to replicate it, and get on board with using AI. So I set out to do a task. I wanted to, on record, get a productivity increase with a timer. My plan was: I'm going to estimate how long it will take me, announce it, let's say two hours, and do it with an agent. I always pause my video when the agent is thinking, because that's a boring step.

Pavel:

But the timer keeps ticking. And at the end, I'm going to arrive at, let's say, an hour, maybe forty minutes out of two, and boom, those are the productivity gains. And thirty minutes in, I completely failed. I had to scrap everything that the LLM and agents wrote for me and start from scratch. My problem was I overprompted.

Pavel:

I thought I knew what I wanted an agent to do. There were three steps: copy this, write this, refactor this, and you're done. It did it. It iterated for ten minutes. It was the Codeme agentic developer that we have.

Pavel:

When I scrapped it and started doing it myself, I did half of it, stopped, and realized that the other half was not needed. It was stupid of me to ask. The correct approach would have been to iterate: do the first half, stop, rethink, and then decide what to do next. But the agent was given the instruction to go all the way, so it went all the way. This is the other thing with a thousand instructions, right?

Pavel:

You don't want an agent to be asked to do something that you think you know, but that you'll only really know as you iterate through it.

Daniel:

In these cases as well... I find your experience interesting: the balancing of how you prompt it, how far the agent goes. All of this is intuition that you're learning as you go. One of the things that was interesting: we just had Kyle, the COO of GitHub, on, and we were talking about agents and coding assistants. One of his thoughts was also around the orchestration after you have generated some code. It's one thing to create a project, create something new, but most of software development happens past that point.

Daniel:

And I'm curious, as someone who is really trialing these tools day in and day out as your daily driver: I think what's on people's minds is, Oh, cool, I can go into this tool and generate a new project; you always see the demo of creating a new video game or whatever the thing is. But ultimately, I have a code base that is very massive, right? I'm maintaining it over time. Most of the work is more on that operational side.

Daniel:

So in your experience with this set of tooling, what has been your learning? Any insights there? Any thoughts on where that side of things is heading? Especially since you're dealing with, I'm sure, real-world use cases with your customers who have large code bases, right?

Pavel:

Well, that's great. I'm so glad that you asked, because what I do is actually that latter aspect. I have a monorepo with 20 different things in it that could have been separate repos of their own. I have a large code base that I work with. And I actually saw our own developer agent occasionally choke, because it attempts to read too much, and it just chokes on tokens and limits and things it can do per minute or per hour.

Pavel:

So that's one thing. But what I find myself doing with Cursor, for example, is I pinpoint it very actively, very often, because I want it to work with specific files. I'll just point it at the files and prompt it in the context of those three or four files, and that limits how far out it's going to go. But really, back to your question: to me, it's not about code bases that much. Well, maybe if I do something greenfield and fun, I'm going to write it, I'm going to run it, and if it works, that's all I need.

Pavel:

It's correct, it works, great. But today, and it's still a mental shift, it's still early, I'm still thinking of the code base that I write with my agents as a code base that will be supported by other people, likely also with agents, but people still. So correct by itself is not good enough. I want it to be aesthetically the same, I want it to follow the same patterns, I want it to make sense for the other developers who will come in after me. I want it to be as if it's code that I have written, or at least more or less.

Pavel:

That slows me down a little bit, clearly, I'm sure. But the other thing is: I am the bottleneck. An agent will take minutes, single-digit minutes, if not less, to spit out whatever it spits out. Oftentimes in code bases, it's not a single file. It's edits in multiple places.

Pavel:

Then I have to come in and read it. Here's the difference. When I write myself, my brain has a timeline. I was thinking as I was typing, so I know how I arrived at what I arrived at. I may decide that it's bull, you know, scrap it, let's start over.

Pavel:

That happens; we're all developers. But I know how I arrived at where I am. When I look at what the agent produced for me, I have no idea how it arrived there. I need to reverse engineer it: why, what did it do? It takes time.

Pavel:

I tried recording that, and I can't, because I can't speak and think at the same time. This is literally the bottleneck. The other thing is, when I was doing that video with a timer, I expected certain outcomes, and I knew that if it worked, I was going to say this at the end. I was going to say, Guys, look, it took me twenty, let's say thirty minutes out of an hour, so it's 2x, right?

Pavel:

Literally a 2x productivity improvement. Amazing, isn't it? But here's the thing. Within the thirty minutes that I spent, the percentage of time I spent critically thinking was much higher than normal. The percentage of time I spent doing boilerplate was much lower, because the agents did it. I really critically thought about what to ask, how to prompt, and then analyzed what it did and thought about what to do next.

Pavel:

Do I edit? Do I reprompt? Can I sustain that same higher percentage of critical thinking for a full day to get to 2x in a day? Probably I can't. So what's probably going to happen is I'll get to 2x, but I'll use the time in between, while the agents work, to do something else.

Pavel:

My day will likely get broken down into more smaller sections. My overall daily productivity is likely to increase. I'm likely to do more things in parallel. Maybe I'll do some research. Maybe I'll answer more emails, right?

Pavel:

But it's going to be more chaotic, and also likely more taxing. I don't think we've had enough experience yet. I don't think many people talk about this yet. People talk about, Oh my god, look what I've built with agents. I wonder how they'll talk about having worked for six months with agents: how six months with agents compares to six months without, and how they feel at the end of the day.

Pavel:

And think about being in the zone. We all, I hope, as engineers, like to, you know, disconnect all the emails, get the music on, IDE in front of you, and you're in it for, like, two hours. With agents, you just can't. You prompt an agent, and it goes off doing something. What do you do?

Pavel:

Do you pull up your phone, and then your productivity increases one way while your screen time increases the other way? It's not a good idea. What can you do? What do you do in this minute and a half, or three? And you don't know how long it will be.

Pavel:

Well, you can see the outcomes coming up, but the agent is still spinning, still spinning. I'm sorry, it's a long answer to your question, but that's what I'm thinking about constantly, and that's what I don't yet have answers for. But I really hope to eventually, through experiments and recording and thinking, arrive at least at what it means for me, because I cannot even tell you what it means for me yet.

Daniel:

Yeah. I mean, I experienced this yesterday too, because I'm preparing various things for investors, updating some competitive analysis and that sort of thing. And when you have, whatever it was, I think 116 companies, I'm like, oh, I'm going to update all of these things for all of these companies. Obviously, I'm going to use an AI agent to do this. This is not something I want to do manually, putting in all of these things and searching websites.

Daniel:

I did that, but to your point, I could figure out how to do a piece of that and get it running. And then I see it running and I realize that this will take however long it is, right? Ten minutes or whatever the timeframe is. And then you context switch out of that to something else, which for me I think was email or whatever. I'm like, Oh, this is going to run.

Daniel:

I'm going to go answer some emails or something like that, which in one way was productive, but then I had to context switch back. And I'm like, Oh, why did it output all these things? Or it happened that I wasn't watching the output. In one case when I ran it, I realized, Oh, well, I really should have had it output this column or this field, but I didn't think of that beforehand. And I wasn't looking, because I had turned away from the agent back to my email.

Daniel:

Yeah, I think this is a really interesting set of problems. It's a new way of working that hasn't been parsed out yet. Right?

Pavel:

And I tried not to do it. Like, I tried, but then you sit idle. You'll literally sit idle, and it doesn't feel good. It feels like, oh my god.

Pavel:

Why am I not doing anything?

Daniel:

Yeah. It's an interesting dynamic, that's for sure. And I've definitely seen people show multiple agents working on different projects at the same time. When I see someone with two screens and things popping up all over the place, there's no way I could, in my brain, monitor all of that.

Pavel:

It must be very taxing, first, and second, half of those merge requests, pull requests from the agents will be, let's say, subpar. Frustration will rise in you. You will think, Man, I would have already done it better myself. What is this? It is a very different way of working, emotionally.

Daniel:

Yes.

Pavel:

I keep thinking about it. I advise people to think not just about productivity gains, not just about delegating to agents and enjoying the results. Think about how it changes the dynamic of your day and how you think about it afterwards, right?

Daniel:

Yeah. Yeah. That's interesting. So I know we're circling way back; it was an interesting discussion, but I do want to make sure people can find some of what you're doing with Dial.

Daniel:

You mentioned the open source piece of this. What's needed from the user perspective to spin this up and start testing it? And for those out there who are interested in trying some things with the project, what would you tell them as a starting point, and what is the process like to get a system like this up and running?

Pavel:

Actually, I'm not sure I can say, for Dial specifically. Nobody is running local Dials; it's not something you run locally. It's something that you run centrally in an organization, whose size can vary, and you expose it to your people through a URL that they can all go to, and they use AI through Dial and do things through Dial.

Daniel:

Interesting.

Pavel:

One of the apps we built as an example, earlier, it was last year, was Talk to Your Data. If you look at analytics platforms, the Snowflakes of the world, they all have something like this today: a semantic layer that you work on. Then, through the semantic layer, through prompting and through some query conversions and connectors to data warehouses and data lakes, you get yourself a chat with your data: analytical reports, graphs, tables. So we built that. That was built into Dial.

Pavel:

So you go to Dial, and again, imagine ChatGPT, but one that allows you to choose which model you talk to, right? Not just OpenAI models, but all of the other models that exist, as well as applications. So you go to this ChatGPT, which in our case is Dial, you select this Data Heart AI, as we call it, which is our Talk to Your Data app, and you start talking to it. This is still your Dial experience, but you're really talking to an app that then talks to the semantic layer, builds queries based on your questions, runs them, gets data back, and visualizes it, because Dial has all these visualization capabilities, which explains how it's not just text coming back. It builds your charts, and you can interact with them. But again, you don't run Dial locally.

Pavel:

If you want to explore what it is, I hope, I expect that if you go to, I think it's railepam.com.

Daniel:

EPAM rail. Yeah.

Pavel:

Oh, epam-rail.com. Thank you. You'll read about what it is, and you'll find all the links to, hopefully, documentation and how-tos. But also, most companies we work with want more than just, Hey, how do we install it? They want to build with it. And that's where we come in with professional services.

Pavel:

We can build things for their Dial so that they can do the AI that matters to them, in their context, with their data, with their workflows, with their restrictions on things they can and cannot do, and yada yada yada.

Daniel:

Yeah. And thinking about this zoo of underlying applications or assistants, I'm wondering, because you've obviously been working in this area for some time: do you have any insights or learnings around easy wins for underlying functions or agents that can be tied into this orchestration layer, or maybe more challenging ones? You mentioned the workflow questions that aren't yet figured out, but more on the orchestration layer and the function calling: what are some areas of challenge, or things that might not be figured out yet, that you think are interesting to explore in the future?

Pavel:

Let me think. So you're asking about connecting tools and functions to an LLM, and which of the functions, or what type of connectivity, is easier?

Daniel:

Yeah, yeah. Is there anything that's out of scope or more of a challenge currently, or is it fair game for whatever you can build into that function or assistant? What limitations or challenges are there in that mode of development, of developing these underlying functions or tools?

Pavel:

I see. It's kind of a twofold answer. If you take the technicality aspect, like how do I build a tool that does X, the complexity is really in X. If you want to go and query a database, how hard is that? Well, not hard.

Pavel:

Right? I mean, you have connectivity to the database; if you have a query, you run it, you get results back. So the technicality of querying a database is not hard. Making it useful, making the result useful in the context of the user's prompt and conversation, is a lot more challenging. I had this happen recently. I'm running a service.

Pavel:

It actually has a public webpage, api.epam.com. You will not really get past the front page, but you'll understand what it is. It's a collection of APIs that my team has built that exposes a lot of data. Remember I said EPAM runs on internal software? All of those applications stream their data and their events out into a global data hub.

Pavel:

Think a big, big Kafka cluster. But that's Kafka, so you can read data out of it as a Kafka consumer; if you want something more modern, an API with search, lookup, this and that, we have an API service over all of that data. Somebody came to me today and said, Hey, have you heard of MCP? I'm like, Yes, of course I have. Why don't you guys build MCP for api.epam.com?

Pavel:

My answer is: it is easy to build. api.epam.com speaks RSQL. I can build a server that takes your question, creates RSQL (an LLM will be able to do that easily), runs it, and gives back the data. But I said it's not going to be useful, because these are single-dataset APIs.
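The "easy to build" part might look something like this minimal sketch; the endpoint, the field list, and the `llm.ask` helper are all hypothetical:

```python
import requests  # pip install requests

def query_people(llm, question: str) -> dict:
    # Have the LLM translate a natural-language question into an RSQL
    # filter, e.g. 'country==US;title==*Engineer*'. Fields are made up.
    rsql = llm.ask(
        "Translate this question into an RSQL filter over the fields "
        f"(name, country, title): {question}"
    ).strip()
    # Forward the filter to the (hypothetical) single-dataset API.
    resp = requests.get(
        "https://api.example.com/people", params={"filter": rsql}, timeout=30
    )
    resp.raise_for_status()
    return resp.json()
```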

Pavel:

Your questions are likely analytical. You likely want to ask something that expects me to do a summary by month, this, this, and this, and give you a... which is a very different question. You asked me about MCP for an API: easy to do. Making it useful for your actual use case: much harder to do. I likely need to do a lot more than just connect a tool to an LLM.

Pavel:

I need to understand what you're asking and figure out the orchestration that is required, maybe custom apps, maybe something else. Then you start hitting authentication, legacy apps, all the other roadblocks. And in a way, Talk to Your Data is an amazing prototype that we built, and I have a video about it. But we sort of stopped, because we clearly sensed how steep the curve is to get it to actual use. Because what we wanted to do, what we envisioned we could do, was analytics democratized. You don't have to go to an analytics team, ask them to build you a new Power BI report, and have them spend a week doing so.

Pavel:

You can just come into Dial and say, Hey, show me this, this, this, and this. And yes, we technically can do it. But to be able to do this for all kinds of questions you can ask about our data, that's a much harder thing to do.

Daniel:

Yeah. To your point, underlying systems might have limitations. I think in analytics-related use cases that we've encountered with our customers, often I'll just ask the question: Hey, if you gave this database schema, or whatever it is, to a reasonably educated college intern and asked which columns would be relevant to query based on this natural language query, you can pretty easily tease it out. Well, looking at all these columns, I have field157 and customnewfield; there's no way for someone off the street to know anything about that. So it's not really a limitation of what's possible in terms of the technicality, like you said.

Daniel:

It's more of you're not always set up for success in terms of utility, like you mentioned.

Pavel:

And for data, that's where the semantic layer comes in. If you have descriptions of your columns and your tables with business meaning, then connecting that semantic layer, with some data samples, to the LLM will allow it to write the query that you thought was impossible to write, because it is impossible without it. The semantic layer can explain the data that you have in business terms, in the language in which questions will be asked of your assistant. That's what allows us to do this talk-to-your-data analytics.
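A minimal sketch of that grounding step, reusing the opaque column names from Daniel's example; the descriptions, table name, and prompt format are illustrative assumptions:

```python
# Map opaque physical columns to business meaning (the semantic layer).
SEMANTIC_LAYER = {
    "field157": "net revenue per engagement, in USD",
    "customnewfield": "client industry vertical, e.g. 'banking'",
    "eng_start_dt": "engagement start date (YYYY-MM-DD)",
}

def build_sql_prompt(question: str) -> str:
    """Ground a text-to-SQL prompt in column descriptions the model can use."""
    described = "\n".join(f"- {col}: {meaning}"
                          for col, meaning in SEMANTIC_LAYER.items())
    return (
        "Write a SQL query over the table `engagements` for the question "
        f"below.\nColumns and their business meaning:\n{described}\n"
        f"Question: {question}"
    )

print(build_sql_prompt("Total revenue by industry for engagements since 2024"))
```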

Daniel:

Yeah. Well, I know that we've talked about a lot of things. You are probably seeing a good number of use cases across your clients at EPAM, and also your own experiments with Dial and other things. I'm wondering, as you lie in bed at night thinking about the future of AI, or maybe it's all the time, or maybe it's not at night at all... bringing it all the way back to the beginning, you see what is possible to do now, which even six months or a year ago was not possible.

Daniel:

What is most exciting or most interesting for you to see play out in the next six to twelve months? What is constantly on your mind about where things are going? It sounds like how we work with these tools is one of those things; we already talked about that a little bit, but what else is exciting or encouraging for you in terms of how you see these things developing?

Pavel:

My answer may surprise you. When I think about it, I don't, you know, anticipate any new greatness to come. I actually mostly worry. And I worry because I know that my thinking is linear. Like most of us, even though looking back we know that technology has been evolving rather exponentially, our ability to project into the future and think about what's coming next is linear.

Pavel:

So I am unlikely to properly anticipate, get ready for, and then expect and wait for what's to come. I am sure to be surprised, and I guess, like everybody else, I'll be doing my best to hold on and not fall off. And I worry seeing how the entry barriers rise; it's harder for more junior people to get in today. When I'm asked about skills I recommend people focus on to be better prepared for the future, I always answer with the same things.

Pavel:

I always say fundamentals, and then critical, systems thinking. And fundamentals you can read about a lot, but you really master them when you work with them yourself, not when someone else works with them for you. Not having them is likely going to keep you from being able to properly curate and orchestrate all these powerful AI agents. And when they get so powerful that they don't need you to curate and orchestrate them, then what does that do to you as an engineer? Maybe that's not the right way to think, but this is what I think about at night, like you asked, when I think about AI and what's coming.

Pavel:

I am excited as an engineer. I like using all of this. I just don't know how it's gonna reshape the industry and how it's gonna change my work, you know, in years to come.

Daniel:

Yeah. Well, I think it's something... even in talking through with you some of the work that you and I have been doing with agents, it really has triggered a lot of questions in our own minds about what the proper way of working around this is. And I think that is going to be a widespread issue that people are going to have to navigate. So yeah, I think it's very valid. We will be interested to see how it develops, and we'd love to have you back on the show to hear your learnings again in six or twelve months, on how it's shaking out for you.

Daniel:

Really appreciate you joining. It's been a great conversation.

Pavel:

Thank you very much. It's been a pleasure.

Jerod:

Alright. That is our show for this week. If you haven't checked out our Changelog newsletter, head to changelog.com/news. There you'll find 29 reasons. Yes.

Jerod:

29 reasons why you should subscribe. I'll tell you reason number 17. You might actually start looking forward to Mondays.

Pavel:

Sounds like somebody's got a case of the Mondays.

Jerod:

28 more reasons are waiting for you at changelog.com/news. Thanks again to our partners at fly.io, to Breakmaster Cylinder for the beats, and to you for listening. That is all for now, but we'll talk to you again next time.
