Federated learning in production (part 1)
Welcome to Practical AI, the podcast that makes artificial intelligence practical, productive, and accessible to all. If you like this show, you will love the changelog. It's news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find us by searching for the changelog wherever you get your podcasts. Thanks to our partners at fly.io.
Jerod:Launch your AI apps in five minutes or less. Learn how at fly.io.
Daniel:Welcome to another episode of the Practical AI Podcast. This is Daniel Whitenack. I am CEO at Prediction Guard and joined as always by my cohost, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?
Chris:Doing great today, Daniel. How's it going?
Daniel:It's going pretty good. I would say my mind is a little bit scattered today, maybe distributed over various topics, jumping from peer to peer between different meetings. Thankfully, we're just going to continue that theme today into a little bit of a discussion on federated learning, because I'm really happy to have Patrick Foley here with us, who is a lead AI architect focused on federated learning at Intel. How are you doing, Patrick?
Patrick:Doing great. Thanks for having me on the show.
Daniel:Yeah. Of course. I was saying, one of our engineers at Prediction Guard, Ashwarya, shout out to her, spoke at the Flower Conference over in London not too long ago, where I bumped into you. So it's good to get that lead.
Daniel:But it's been maybe a little while since we talked about federated learning, which we have talked about in previous episodes. I'm wondering, just for the audience at large, who's maybe been hearing a lot about LLMs and only LLMs or GenAI for however long now, just circling back to that topic, could you set the stage for us and give us kind of the explainer on federated learning generally and what that means?
Patrick:Yeah, absolutely. So the main training paradigm for machine learning has been having your data centralized and then training your model on that local data. There's a lot of cases where you can't centralize your data due to privacy concerns or maybe even the size of the data is an issue. And so there's a different technique where instead of sending your data to a central place, you send your model to where the data is and you train it there. So it's closely related to distributed training, as you could probably tell from the description there.
Patrick:But there's a much higher focus on privacy concerns, and so on how you can verify that the model is not encapsulating something about the data, and on who the threats are, because it's not just a single person that is controlling all of the infrastructure, but multiple parties who might not trust each other. That's where a lot of the variance in how we need to focus on those concerns comes from.
Daniel:And just to dig in maybe a small bit deeper there. So if you're bringing the model to this distributed data, maybe just walk us through kind of a flow, I guess, of training. So you send the model to these places that have the data; what kind of happens in that training process, or how does it iterate in a different way than maybe what people are used to hearing about?
Patrick:Yeah, absolutely. So there's a number of both closed source and open source federated learning frameworks that are out there. I lead the Open Federated Learning (OpenFL) open source project. And there's a number of people that do this in the same way. But really what it involves is first having a shared notion of what that model is.
Patrick:And then there might be a distribution phase for the workspace or the code ahead of time, so that everyone has a record of what the code is that's going to be running on their infrastructure. And so at the time that the experiment starts up, there's a server, or what we call an aggregator, that's the central point where everyone is communicating with that server for what tasks they should be doing or what the latest model weights are that they should be training on. And then the client side is what we term the collaborator. So everyone has a view of what that code is, and we have this concept of a federated learning plan, which includes everything outside of the code itself. So this might be hyperparameters for the model, some of the network details that you might want to know, whether there's TLS being used, mutual TLS, and a lot of other things that you might care about if you're a hospital that wants to be running this software on your infrastructure and you don't want to be exposing your data because of HIPAA or GDPR considerations.
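To make that idea a bit more concrete, here is a rough sketch of the kind of information such a plan might capture, written as a Python dict purely for illustration; the field names are hypothetical and this is not OpenFL's actual plan schema:

```python
# Hypothetical sketch of a federated learning "plan": the non-code settings
# every participant can vet before agreeing to run the experiment.
# Field names are illustrative, not OpenFL's actual schema.
federation_plan = {
    "aggregator": {
        "address": "aggregator.example.org",  # central server everyone connects to
        "port": 50051,
        "rounds_to_train": 10,
    },
    "network": {
        "use_tls": True,
        "require_client_auth": True,  # mutual TLS: collaborators present certificates too
    },
    "task": {
        "model": "3d_unet_segmentation",
        "hyperparameters": {"learning_rate": 1e-4, "batch_size": 4, "local_epochs": 1},
    },
    "collaborators": ["hospital_a", "hospital_b", "hospital_c"],
}
```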
Patrick:So there's this vetting process that's really important to happen ahead of time. And then once this vetting has happened, there's an opportunity to actually launch the experiment. And what this means for the aggregator, or the server, is launching that application that starts a gRPC server or some kind of REST server. And then for the collaborators, they are just starting their local process and making the connections to that central server. So that flow is really all of the setup for the experiment actually taking place.
Patrick:But the aggregator has the initial model weights for what everyone is going to be training on for that first round of the experiment. And so then everyone receives those model weights, and it's not the entirety of the model object that gets sent. The way that we divide things into this provisioning phase and then the runtime phase is so that we can limit what actually gets sent across the network. We don't need to be sending Python objects, which are much higher risk in terms of being able to send code that could then exfiltrate your data and that's not necessarily vetted ahead of time.
Patrick:So there's a very small window of information, and we limit that communication path to NumPy bytes. And the great thing about doing things in that way is that if you're just dealing with model weights, then that means you can train across a bunch of these different deep learning frameworks. So we can work with PyTorch models, TensorFlow models, etcetera. And you can send those model weights across the network, populate your Python code that's already been shipped to you ahead of time, do your local training, and then, based on the updates that you have from your local data, you send your updated model weights back to the aggregator and they get combined in some way.
Patrick:In the simplest case, this can be something like a weighted average based on the size of the dataset that each of those collaborators has locally. And that is really what constitutes a single round of federated learning training. And what we've seen is that just by using these simple methodologies, you can get to a point where you have somewhere in the realm of 99% of the accuracy of a model that's been trained on centralized data alone.
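To make that weighted average concrete, here is a minimal sketch in plain NumPy (not OpenFL's API) of one FedAvg-style aggregation step, where each collaborator's weights are weighted by its local dataset size:

```python
import numpy as np

def fedavg(client_updates):
    """Weighted average of model weights from one round of local training.

    client_updates: list of (weights, num_local_samples), where weights is a
    list of NumPy arrays (one per layer) returned by each collaborator.
    Simplified sketch of the FedAvg idea, not OpenFL's actual implementation.
    """
    total_samples = sum(n for _, n in client_updates)
    num_layers = len(client_updates[0][0])
    aggregated = []
    for layer in range(num_layers):
        layer_avg = sum(w[layer] * (n / total_samples) for w, n in client_updates)
        aggregated.append(layer_avg)
    return aggregated

# Toy example: two collaborators sharing a one-layer "model".
update_a = ([np.array([1.0, 2.0])], 100)   # 100 local samples
update_b = ([np.array([3.0, 4.0])], 300)   # 300 local samples
print(fedavg([update_a, update_b]))        # -> [array([2.5, 3.5])]
```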
Chris:I'm curious, just as you were talking about the aggregation of each of those updates back to the main server, and you talked a little bit about different ways of aggregating and stuff. I'm just curious, are there a lot of different approaches algorithmically to that aggregation, or does that tend to follow the same mechanism most of the time? And do people tend to choose different ways of aggregating data? I'm just wondering how much variability is typically found in there among practitioners.
Patrick:Yeah, that's a great question. So we've seen that FedAvg works pretty well in a lot of cases. FedAvg is the original aggregation algorithm for federated learning that was introduced by Google back in 2017, and they actually coined the term federated learning at that time. But there's others out there that deal much better with data heterogeneity between the different client sites that might have different data distributions. And so when that's the case, you might need to ignore some of the outliers or incorporate their local updates in a different way that allows you to capture that information, or converge faster to a global model that would perform well on all of these different data distributions.
Patrick:So there's a number that do try to capture some of this information. FedOpt is one of those that incorporates the loss terms of the different collaborators that are out there. This is really a hot research area, but it really varies is what we found. By applying some of these top methods, though, you can generally get to a pretty good point in convergence versus centralized data alone.
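As a rough illustration of that family of alternatives, here is a simplified sketch of a server-side adaptive update in the spirit of the FedOpt/FedAdam line of work, where the averaged client delta is treated as a pseudo-gradient; the function and parameter names are illustrative, not any framework's actual API:

```python
import numpy as np

def fedadam_step(global_w, client_ws, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """One server-side adaptive update in the spirit of the FedOpt/FedAdam family.

    global_w: current global weights (a single NumPy array, flattened for simplicity)
    client_ws: list of locally trained weight arrays from this round's collaborators
    m, v: server-side optimizer state, same shape as global_w
    Simplified sketch: the averaged client delta is treated as a pseudo-gradient
    and fed through an Adam-style server optimizer instead of plain averaging.
    """
    pseudo_grad = np.mean(client_ws, axis=0) - global_w   # average change the clients made
    m = b1 * m + (1 - b1) * pseudo_grad
    v = b2 * v + (1 - b2) * pseudo_grad ** 2
    new_global_w = global_w + lr * m / (np.sqrt(v) + eps)
    return new_global_w, m, v
```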
Daniel:So Patrick, I'm curious whether we could just talk through maybe a couple of example use cases, kind of pointing out the actors in the process. So we've talked about kind of the central aggregation. We've talked about these clients, or collaborators, I believe you called them. So this distributed set of collaborators who have the model and are doing updates to the model, which are then aggregated back together. If you could just maybe highlight, hey, here's an example use case in this industry with this type of model.
Daniel:Here's who the party would be that would be the aggregator party and where that infrastructure would run, and here's the parties that would be the collaborators where the model would be distributed. That would be very helpful.
Patrick:Yeah, absolutely. So I'll take one of really the first real world deployments of federated learning that my team took part in. Back in about 2018 or so, Intel started collaborating with the University of Pennsylvania on trying to deploy federated learning in hospitals for the purpose of brain tumor segmentation. This was very shortly after Google released their seminal paper on federated learning, showing that this had high success for text prediction on Android phones. And this was the healthcare application of federated learning.
Patrick:And so this progressed to a point where we were able to demonstrate that we could achieve 99% of the accuracy of a centrally trained model. And then this really expanded out to a much larger real world federation, where we were able to train across roughly 70 different hospitals across the world. And so each of those hospitals represents the collaborators in the architecture that I was speaking to earlier. And then the University of Pennsylvania served as that central point, or the aggregator, where the initial model was populated from. And it was a 3D convolutional neural network, a segmentation model.
Patrick:So coming in with DICOM data and then trying to get an estimate of where a glioblastoma brain tumor was based on that image. And so there's the collaborators and the aggregator, and that's really the high level of what this looks like. But then there's a lot of other details that had to be dealt with beyond just this more, I would say, vanilla federated learning architecture. And really where that came from was that there are a lot of issues with figuring out how to identify mislabeled data when you have privacy that's at stake.
Patrick:And so this really requires experts in data science, or someone who has a background in federated learning, to go and dive into how you're identifying these convergence issues that might pop up. And so UPenn was taking on a lot of that responsibility. There were Intel engineers who were very involved with a lot of those calls as well, trying to get on the phone and have these Zoom calls with the different IT admins and data owners at each of the hospitals, just trying to figure out where there might be a mislabeled dataset or that type of thing. But it really exposed that there were gaps for the total set of participants, and we needed to have more of this kind of shared platform for how you can exchange this information and get access to that data in a secure way. And that's one of the things that we've been working on ever since this study came out.
Sponsor:Well, friends, NordLayer is the toggle-ready network security platform that's built for modern businesses. It combines all the good stuff, VPN, access control, threat protection, and it's all in one easy-to-use platform. No hardware, no complex setup, just secure connections and full control in less than ten minutes. No matter if you're the business owner, the IT admin, or someone on the cybersecurity team, NordLayer has what you need. Here's a few use cases.
Sponsor:Business VPN. How often are you traveling? And you need to have secure connections from one endpoint to another, accessing resources, preventing online threats, preventing IP leaks. This happens all the time. What about threat protection being in a place where you can prevent malware, where maybe there's a high risk, you're at a coffee shop, malware, ransomware, phishing.
Sponsor:These things happen every single day and users who are not protected are the ones who get owned. And what about threat intelligence? What if you could spot threats way before they escalate? You can identify, analyze, prevent internal and external risks. This is like dark web stuff all day.
Sponsor:Data breaches, breach management, serious stuff. Well, of course, our listeners get a super awesome deal: up to 22% off NordLayer yearly plans, plus an additional 10% off the top with the coupon code practically dash 10. Yes, that's the word practical, then l y, dash 10. So practically dash 10. And the first step is to go to nordlayer.com/practicalai.
Sponsor:Use the code practically dash 10 to get a bonus 10% off. Once again, that's nordlayer.com/practicalai.
Daniel:Well, Patrick, I'm wondering, you gave a really good example there in terms of the healthcare use case, the distributed collaborators being these hospitals, the aggregator being the university. Certainly there's other details that are relevant in that, and I'm sure there were a lot of difficult things to work out and research. One of the things that I'm wondering, and this might be something that's on people's mind just in terms of the climate that we're in around AI and machine learning, is what are the types of models that are relevant to federated learning? It might be somewhat of a shock to people just coming into the AI world that, hey, there are still a lot of non-GenAI models. Actually, the majority of AI models, quote unquote, or machine learning models out there are not GenAI models.
Daniel:It may come as a shock to them that there's still a lot of that going on. I assume, based on what you said before, that those types of non-GenAI models are relevant to the federated learning procedure or framework. But could you give us a little bit of a sense of the kinds of models that are relevant, and maybe tie that into some of the real world constraints of managing one of these federated learning experiments, in terms of the compute that's available or the network overhead or whatever that is, and what that dictates in terms of the types of models that are currently feasible to be trained in this way?
Patrick:Yeah, absolutely. So I would say most of the real world deployments of federated learning have focused on non-GenAI models up to this point. The example that I had was this 3D segmentation type of use case. There's been a lot of other deployments of these classification models. Really where federated learning has focused, from the framework support perspective, has been around neural networks.
Patrick:And a lot of the reason for that is not just because of all of the advances that have, of course, happened for neural nets over the past ten to fifteen years, but because you have a shared weight representation for all of those models across each of the sites where they're going to be distributed. And really what I mean by this, just as a comparison point, is that with, say, support vector machines or random forests, you're going to have something that is based fundamentally on the data distribution that you have locally at one of those sites. So with neural networks, using that for federated learning allows us to have much clearer methods for how those weights ultimately get combined for the purpose of aggregation, without knowing quite as much about the data distribution ahead of time. I will say that there are some methods for how you perform federated learning in these other types of scenarios. So federated XGBoost is something we recently added support for in OpenFL.
Patrick:There's other types of methods out there that have actually performed pretty well. And getting back to the GenAI piece of this, that is, of course, a big area of interest for federated learning too. We have a number of customers who have been asking about how they can incorporate these large foundation models, generative AI models, for the purpose of federated learning and this training in a privacy preserving way. And to get to your point, or the question around the size constraints that we run into, size is, of course, an issue for these large GenAI models. We're very lucky to have techniques like PEFT and quantization that can be applied so that you don't necessarily need to be training on the entirety of 70 billion weights at a time and distributing those across the network, because as you scale the federation, there's, of course, a lot of network traffic that can result from that.
Patrick:So by shrinking that in any way that you can, we can still support those types of models, but I would say we're having to use these additional methods instead of just base training, because size and the time that it takes to actually train them is, of course, always a concern.
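As a rough sketch of why parameter-efficient methods help here, the toy PyTorch LoRA-style layer below freezes the base weights and trains only two small low-rank matrices, so only a fraction of a percent of the parameters would ever need to cross the network for aggregation; this illustrates the idea under those assumptions and is not OpenFL's or any library's actual workflow:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a small trainable low-rank adapter (LoRA idea).

    Minimal illustration of parameter-efficient fine-tuning: only lora_A and
    lora_B are trained locally and would be shipped for aggregation.
    """
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # freeze the big base weights
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(4096, 4096))
trainable = {n: p for n, p in layer.named_parameters() if p.requires_grad}
full = sum(p.numel() for p in layer.parameters())
small = sum(p.numel() for p in trainable.values())
print(f"shipping {small} of {full} parameters ({100 * small / full:.2f}%)")
```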
Daniel:Yeah. And just for listeners that are maybe more or less familiar with certain terminology, PEFT is parameter-efficient fine-tuning: methods where maybe only some of the parameters of a model are updated during the training process, which creates some efficiencies there. Quantization being methods to limit the size of the total parameter set by reducing the precision of those parameters. I'm wondering, we've kind of naturally gotten into it, Patrick, but you started talking about, of course, requests to add features and that sort of thing. Obviously in your context, I think we're mostly talking about OpenFL.
Daniel:I'm wondering if you could just give us a little bit of an introduction. Now we've talked about federated learning more broadly, what it is, kind of some use cases, that sort of thing. Obviously there needs to be frameworks to support this process and OpenFL being one of those. Could you just give us a little bit of an introduction to the project at a higher level?
Patrick:Yeah, so OpenFL, Open Federated Learning is what that stands for. It's been around since about 2018, and it came out of this research collaboration that we had with the University of Pennsylvania. What other federated learning frameworks have done is they've really started from research and then expanded into real world and production deployment. We kind of took this in the opposite direction. We had to deal with the real world issues that come from deployment of this framework into hospitals and the challenges that can really result from that.
Patrick:And when I say we, I mean this is a collaboration between my team at Intel, which is more focused on the productization side of how you take these technologies and bring them into products, the University of Pennsylvania, but then also Intel's Security and Privacy Research Lab. They're, of course, very focused on research as well and have been thinking about security and privacy and confidential computing for quite a long time. So this was really a natural collaboration to bring together that research with the experts in these healthcare and brain tumor segmentation types of deployments, to bring the right features into this framework that started off as largely a research project at Intel, but has since become a much larger framework focused on how you can actually perform this across companies, or across very large types of deployments that involve academia, as well as just how you bring different parties together.
Daniel:Yeah. And obviously it's called OpenFL. I'm assuming that people can find it somewhere in the open source community. And also I see there's kind of an association with the Linux Foundation, if I'm understanding correctly. Could you talk a little bit about those things and just sort of, I guess, the ecosystem where people can find things, but also a little bit about who is involved and some of how that's developed?
Patrick:Yeah, absolutely. So OpenFL started first as an Intel closed source project, and then we open sourced it around 2020. We've since donated it to the Linux Foundation, to the AI and Data subgroup of that. And the reason is that Open is in the name. We wanted this to be really a community driven and owned project.
Patrick:And that's the way that we saw this gaining the most traction and success over time. We didn't want Intel to be in the driver's seat, having complete control over what the direction of this was going to be. In order to be truly successful as an open source project, you need to be thinking about the community, addressing their concerns, and letting them take the wheel and steer this in many cases. So Intel still has a large representation on the development and roadmap for OpenFL, but we have a technical steering committee that's governed under the Linux Foundation. I'm the chairman of that steering committee, but we also have Flower Labs, who supports the Flower federated learning framework, as a participant on that technical steering committee.
Patrick:We have representatives from FATE, which is actually another competitor slash collaborator of ours, Leidos, and then the University of Pennsylvania as well. The faculty we worked with there have actually since moved over to Indiana University, but they still represent the original collaboration that we had. And they're longtime collaborators of ours who continue to have a strong vision of where federated learning is most applicable for research purposes.
Daniel:And I guess in terms of usage, sometimes that's a hard thing to gauge with an open source project, but could you talk a little bit about that? You were just at the Flower Conference, and I'm sure you're engaging the community in other ways, at other events and online. Could you maybe talk a little bit about what you've seen over the past however many years in terms of actual real world usage of federated learning and engagement in the OpenFL project, what that momentum has looked like, how you've seen that shift in certain ways over time, and how you see that developing moving forward?
Patrick:Yeah, absolutely. So I think that it's really picked up since about 2020. We had the world's largest healthcare federation at that time, and we published a paper in Nature Communications demonstrating the work that we had done. But it's really become evident that there's a lot of real world federated learning that other frameworks are starting to get into as well. So, about my involvement at the Flower Summit: my team at Intel and OpenFL, we've actually been collaborating with Flower Labs for the last three years or so.
Patrick:And we're jointly very interested in interoperability and standards for federated learning. I think one of the things that we both recognized early on is that federated learning is pretty new compared to deep learning as a field of study. And we've kind of seen that things are heading the same direction that they did with the early deep learning frameworks, where you have a proliferation of them at the very beginning, and then over time there's more consolidation across those frameworks as one ecosystem becomes more mature or they specialize in really different ways. So we've been working closely with Flower and other groups on how we can build this interoperability between our frameworks and try to get to a point where we have a defined standard for some of those lower level components, because ultimately we're solving the same problems over and over again between our different implementations, and there's not really a need to do that. If you've done it once and you've done it the right way, then you should be able to leverage that core piece of functionality and just import it into whatever library you want to.
Patrick:That's really the open source ethos. It's building on the shoulders of giants. So that's the direction that we're hoping to head. And at the Flower Summit, we've gotten to the point now where we can actually run Flower workloads. Again, this is a competitor slash collaborator of ours, but we can run their workloads on top of OpenFL infrastructure.
Patrick:And getting into the pieces where we specialize and do have differentiation: Flower has done a great job building a large federated learning community. They've done wonders, I think, for the scaling of federated learning and the visibility that's on it. And they have a very close research tie as well, so they're seeing, I think, the gamut of different things that people want to do for privacy preserving AI. OpenFL, because of our history in security and privacy, confidential computing, and how you really think deeply about preventing threats for federated learning and these distributed multi party workloads, that's an area that we've been thinking through for quite a while too.
Patrick:And we have the benefit, being from Intel, of actually having invented a lot of the technologies for confidential computing, like Software Guard Extensions. So you can run OpenFL entirely within these secure enclaves, which means that even local root users do not have visibility into what is actually happening in the application. And if you engage other services on top of that, like Intel Trust Authority, that allows you to remotely verify that someone else is running the workload that they're supposed to. So part of the vision here, and why we're so excited to be working with Flower, is that now, as part of the Flower community, this very large community, you can run these workloads inside of confidential compute environments on Intel hardware using OpenFL. So there's kind of a chain of how all of these things flow, but that's one of the directions that we're really excited to be undertaking with the wider federated learning community that's out there.
Chris:So Patrick, this has been really interesting for me. I'm learning a lot. And you got me thinking, I'm kind of starting to think about OpenFL in my own life, in my own world. I'm really kind of focused on agentic use cases out on the edge, with kind of physical AI devices that are doing that. And you really got me thinking about all the ways that we could apply federated learning in those environments.
Chris:I'm kind of wondering, that is obviously a big wave of activity we're especially seeing in the last year or so. What is kind of the story around doing federated learning not just within different data centers and stuff like that, but across edge devices, where you're storing a ton of data in those devices and you're running agentic operations, and you're wanting to apply federated learning to that environment? What's the thinking about where that's at now and where it might be going forward?
Patrick:Yeah. So I mean, it's going to be a big area, and we're fully anticipating that this is something that we want to go out and support. For agentic use cases, the neural network is one of the components, and then you have the tools that are actually performing operations based on whatever information is coming from that neural network. So at a fundamental level, we can absolutely support these agentic use cases by training that neural network and doing this in a privacy preserving way.
Patrick:I think one of the areas that's not necessarily that well studied yet, and there's more and more focus on this, is how LLMs can memorize data in a way that certain other neural networks cannot. That's really a hot research area. But it depends, I think, on how you train these models and then ultimately how they're deployed. So if you're using privacy enhancing technologies on top of this architecture, where you're already training at the edge where the data is, then you're going to get a lot more confidence that your information is not somehow exposed wherever the model ultimately ends up going.
Daniel:Yeah. And in terms of memorization, what you're talking about here would be like, hey, I'm training on a device, let's say it's just a bunch of people's client devices, and there are communications on those clients that have personal information. In theory, an LLM could be trained in a distributed way, but leak that data through the centrally aggregated model. Am I understanding that right?
Patrick:That's exactly right. And we have customers come to us all the time and ask, how can we get assurance that my data is not leaking into the model? And for the best things that we have to deal with this, there's different types of technologies that are out there. You have differential privacy, which can apply noise in such a way that you're trying not to expose anything fundamental about your data when you share those model weights. You have other techniques like homomorphic encryption, where you're encrypting those models ahead of time, before they're even sent for the purpose of aggregation.
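As a crude illustration of the differential privacy idea, a collaborator might clip and noise its update before anything leaves the site; the sketch below is illustrative only, and a real deployment would carefully account for the cumulative privacy budget across rounds:

```python
import numpy as np

def privatize_update(local_w, global_w, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip a collaborator's weight update and add Gaussian noise before sharing it.

    Simplified sketch of DP-style noising for federated updates; a real system
    would track the cumulative privacy budget (epsilon/delta) over all rounds.
    """
    rng = rng or np.random.default_rng()
    delta = local_w - global_w                               # what local training changed
    norm = np.linalg.norm(delta)
    delta = delta * min(1.0, clip_norm / (norm + 1e-12))     # bound any one update's influence
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=delta.shape)
    return global_w + delta + noise                          # noisy update sent to the aggregator
```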
Patrick:But really, none of them is completely foolproof. There's no free lunch, as we say. And then confidential computing has the benefit that you can actually train in these completely constrained environments where not even the root user has access to this little protected, encrypted memory enclave. But that ultimately requires that you have the hardware at the edge to be able to perform that type of thing. So that's really where the challenge lies.
Patrick:And there's other statistical measures of how you can estimate data leakage into the model. We have support in OpenFL for a tool called Privacy Meter that actually lets you train a shadow model based on the local training that you've done, and then get some kind of graph around what the percent risk is, based on the local data distribution that you have and the exact model topology that you've trained on. So there's, I think, increased visibility on how you can try to quantify that amount of data leakage. But there are some costs; in the case of some of these technologies, the cost is accuracy for the model overall. So it's really on a per experiment, per model, and per data distribution basis that you have to tune these things.
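As a very rough illustration of how that kind of leakage estimate works, the sketch below runs a simplified loss-threshold membership-inference check; it conveys only the intuition behind shadow-model auditing and is not the Privacy Meter tool's actual API:

```python
import numpy as np

def membership_risk(model_loss_fn, member_data, non_member_data):
    """Crude membership-inference signal.

    If the model's per-example loss is noticeably lower on training members
    than on held-out non-members, an attacker could guess who was in the
    training set. Simplified illustration of the idea behind shadow-model
    auditing tools, not a real privacy audit.
    """
    member_losses = np.array([model_loss_fn(x, y) for x, y in member_data])
    other_losses = np.array([model_loss_fn(x, y) for x, y in non_member_data])
    threshold = np.median(np.concatenate([member_losses, other_losses]))
    # Fraction of correct "member vs non-member" guesses with a simple loss threshold;
    # 0.5 means no measurable leakage, values near 1.0 mean high risk.
    hits = (member_losses < threshold).sum() + (other_losses >= threshold).sum()
    return hits / (len(member_losses) + len(other_losses))
```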
Patrick:That's where there's a bit of work and recommendations that need to be made from people who have experience in this domain.
Daniel:Maybe this is sort of a strange question, so humor me on this one. While you were talking, I was kind of reflecting on the fact that maybe the landscape is shifting a little bit around privacy in general and AI, in the sense that, for whatever reason, people seem to want to send a ton of their data to third party AI providers now. And I think gradually people are becoming more sophisticated in that and sort of understanding the implications of sending your data to third parties, in the sense of using third party AI model providers from model builders and not running that in their own infrastructure. But this has opened up the topic of privacy to a much wider audience. Before, there was sort of this discussion around federated learning amongst data scientists, researchers, those that are trying to train models to be better and better. It seems like now there's this wider discussion about privacy and AI providers, and a lot of people talking about this.
Daniel:And certainly, we've seen people that we're engaging with, of course, to build out private AI systems of their own. But I'm wondering from your perspective, you're kind of in the weeds or in the trenches, I guess, is the best word in terms of helping people with their actual privacy concerns. Have you seen the landscape or perception change in one way or another around kind of AI plus privacy post the kind of ChatGPT era, if you will?
Patrick:Yeah, absolutely. So OpenFL is the open source project that my team directly supports, but there's another division of where my responsibility lies, and that's building on top of OpenFL to really address a lot of these customer concerns. My team is actually building a service on top of OpenFL called Intel's Hyper Secure Federated AI that makes it a lot easier for corporate customers to go and deploy secure federated learning. And a lot of the people that we're talking to are really concerned about, I mean, they have these foundation models that perform really well on their local datasets, but they ultimately don't have access to the data that's being generated at the edge or by some of the sub-customers that they're working with. They're not necessarily experts in federated learning ahead of time.
Patrick:And so we've heard from many different parties that if there was a service that could provide a lot of the infrastructure and recommendations for them ahead of time to go and deploy this easily, then that would make it a lot easier for them to actually run these experiments and vet whether this is something that's going to work for them over the long term. I talked about the use of confidential computing earlier and how that can be successful for this type of thing. That's an area that we've been trying to really specialize in and make easier for a lot of our customer base. So if you have technologies like Intel SGX that are available across the extent of the parties that are participating in this federated learning experiment, then that gives you some really nice properties. Not only can you remove these untrusted administrators from the threat boundary, but you can also verify that your model IP, so the model weights, but even the model topology itself, is not something that is divulged to anyone that shouldn't have access to it.
Patrick:So how do you protect your intellectual property? That being, of course, the data, and not revealing that to prying eyes is really one of the main focuses of federated learning, but the model itself too. I think for a lot of our healthcare customers, they'll spend millions of dollars going through FDA approval, and so having that divulged to someone represents a risk to all of the work that they've done prior to that point. So we've been hearing this from a number of customers for years, but as you've mentioned, I think there's more visibility on it because of generative AI and the doors that it unlocks in terms of the benefit of actually deploying these models in the real world.
Chris:I'm curious, as I've learned a lot through this conversation. We've had previous federated learning conversations in the past with folks, and I think I came into this one still kinda stuck a little bit on distributed data being the driver of federated learning. And you mentioned earlier that, you know, it was that, but more than that. It seems to me in this conversation that it's these concerns around privacy, which can take many different forms, you know, from protecting individual personal data to IP protection, to regulation, to whatever.
Chris:Would it be fair to say that these might be the primary drivers of federated learning? Because it seems like that's really where this conversation has gone over time, rather than what I was expecting, which was more just distributed data, you know, and I brought up the edge thing a little while ago. I'm just wondering, do you think am I getting that? Am I on the right track in terms of getting what the drivers are these days?
Patrick:Absolutely the right track. And when I talked earlier about the different participants and the architecture for OpenFL, where I mentioned the collaborators and the aggregator, that's really sufficient for a single experiment when everyone inherently trusts each other or there's some central body. So the parallel here with the University of Pennsylvania and the Federated Tumor Segmentation Initiative, which was this world's largest healthcare federation, everyone trusted the University of Pennsylvania that was ultimately deploying these workloads. As you scale federated learning and you have people that you don't necessarily know that you're welcoming into the mix, you need to have some other way of establishing that trust. And so governance is really the piece that's missing from OpenFL, and that's where we built on top of this with the service that we've established.
Patrick:So how you can vet the models ahead of time, how you have a central platform for actually recording that different parties have agreed to the workload that is going to run on their infrastructure, and having this unmodifiable way of establishing what the datasets are that you're going to be training on and who the different identities are that are actually participating in the experiment. Governance is a huge concern for a lot of the customers that we've been talking to. And if you want to have cross competitive types of federations, where you might have two different pharma customers who have a lot of data they've generated internally, they have mutual benefit in working together, training either one of their models on their competition's data, and they might have some kind of agreement that's set up for what ultimate model is generated, whether they have a revenue sharing agreement or that type of thing. Having a platform for being able to establish that type of collaboration in a competitive environment is really where we see federated learning going over the long term. And we're trying to figure out a way to get there.
Daniel:And yeah, you already were kind of going to maybe a good place to end our conversation here, which is really looking towards the future. You've been working on OpenFL and these other efforts for some time now and been engaged with the community. As you look forward, what's most exciting for you in the coming years?
Patrick:Yeah, what I think is really exciting is the collaboration between the different parties that are out there. Right now that's really motivating for me personally, because there's this spirit where everything is new and exciting for people who are deep into this field, and people want to figure out how to just push everything forward. And I think generative AI has really been a catalyst for that, in terms of figuring out how we can get access to this siloed data that's out there and how we can do it in a way that actually enables industry to take up these things. Because we don't want federated learning to sit in the research world forever. We want to actually take this forward and make it one of the main methods of how you do machine learning at scale when you have these privacy concerns that are, of course, extremely common today. They're common for companies, they're common for individuals.
Patrick:So opening up those silos is really one of the things that I think there's going to be a lot of benefit in doing. And that benefit is going to come in the form of, we expect, much more accurate models over the long term, and much more capable models, because of just the increased access to data.
Daniel:Awesome. Well, this is very exciting. I hope to have you back on the show very soon, maybe next year, when we see some of that playing out. Appreciate your work and, you know, the team's work, the wider community's work on what you're doing. And, yeah, keep up the good work.
Daniel:Thanks for taking time.
Patrick:Thank you for having me on the show, Daniel and Chris. Really appreciate it.
Jerod:All right. That is our show for this week. If you haven't checked out our Changelog newsletter, head to changelog.com/news. There you'll find 29 reasons. Yes.
Jerod:29 reasons why you should subscribe. I'll tell you reason number 17. You might actually start looking forward to Mondays. Sounds like somebody's got a case of the Mondays. 28 more reasons are waiting for you at changelog.com/news.
Jerod:Thanks again to our partners at fly.io, to Breakmaster Cylinder for the beats, and to you for listening. That is all for now, but we'll talk to you again next time.