Addressing High Cost of AI, Frontier Fine Tuning, Edge Computing, Microsoft and NVIDIA

[00:00:00] Good morning, everybody. In advance of Irresistible, I have some geeky stuff I want to walk through with you this morning that came out of Microsoft and Nvidia this week. I'm going to give you two links to some fairly long videos that those of you who are techies might want to listen to. And I want to cover three big topics. The first is this issue of token maxing and the high cost of AI. The second is fine tuning. And the third is edge computing. So, token maxing. There has been a belief and strategy in many companies to get everybody to use as much AI as possible so they will learn how it works and find applications. And that's been great. We've all had Copilot or Claude or OpenAI on our desktops. [00:00:45] The company's been picking up the costs of this stuff and paying for it, and we've been playing around learning all sorts of new things. Well, we now find out that it's becoming very expensive. And the reason it's becoming very expensive, as I wrote about, is that the cost of building these data centers is extraordinarily high. There aren't enough cooling towers and power plants and real estate to do this. So the investment community, which has now invested more than a trillion dollars in this stuff, is essentially telling our suppliers, you've got to charge more for this. To say nothing of all the companies in the middle, like Workday, SAP, Oracle, all of them; they're going to charge for it too. So the cost of playing around with AI and generating code is going up. And stories are coming out about companies spending tens to hundreds of thousands of dollars, up to $500,000, on tokens that they're not getting any value out of. And that gets to this bigger issue: playing around and building software tools is not for amateurs, because we need to generate a return on investment on these things. And if we're not building something that scales, it may not have been worth the effort.
Now, I'll talk about edge computing in a minute, but this essentially changes the economics of AI. I've seen this coming, and I'm going to talk about it a lot at our conference. You need to invest your AI energies in high return use cases that have business relevance to your company, not tinker toy tools that people think are fun to play with, or making games. The analogy I like to use goes back to my early days in computing in the 1980s. We used to sell mainframes, and a mainframe had five MIPS, 5 million instructions per second. By the way, you can now get trillions of instructions per second on your PC. But anyway, those mainframes would cost millions to tens of millions of dollars, which of course would probably be 10 times that price in today's dollars. And so people didn't put stuff on the mainframe that they didn't need. We had big groups of people in IT who optimized what went on there and used very, very highly refined software tools, oftentimes by IBM or others, to make sure that we got the best out of those computers. We're back to that model again. And of course the PC revolutionized that, which I'll talk about in a minute. [00:03:07] So that's number one: this economic shift, and essentially a use case change, where we have to look for high return applications in HR before we just build things. That may mean that if you're growing really fast or you have high turnover, you focus on talent acquisition; it may be that you focus on training, et cetera, because the cost of delivering these things is not trivial. And so two massive announcements came out this week. Jensen Huang in Taiwan introduced a set of chips called Spark, which essentially run the LLM at a very high speed with huge amounts of memory on your PC. By the way, it's the same name as Gemini's new agent, but let's ignore that for now.
And then Satya Nadella announced a new breed of Microsoft computers, called Surface computers, that run this chip with trillions of instructions per second at your desk side. And what that means is that the personalization of AI is going to pick up speed. In our architecture for HR 2030, we define the personal AI that interacts with the corporate super agents. And the personal AI is going to be a big deal, because you're going to have this personal AI on your phone, in your glasses, on your belt, in your computer, in your laptop, or on whatever device you're carrying around or consider to be your personal device. And it will know everything it needs to know about you and do things on your behalf. So there are a lot of things to read about and hear about there, and new computers to buy. And then of course, for developers, you're going to have a computer next to your desk that's a thousand times faster than the largest mainframe we ever sold in the 1980s. So you won't have to buy all these tokens from some cloud vendor running in some data center in Kansas somewhere to get your work done. Okay, those are worth looking at. Second big topic: fine tuning. For those of you that have been following what we do, we have been working with Microsoft for more than a year on embedding Galileo into their HR function and their HR applications. And what you find in a sophisticated company like Microsoft is that they have lots and lots of agents already built and running, and they do many, many things. There's employee self-service, and there are agents for crisis management, recruiting, policy management and so forth. And so they've evolved in their architecture to a quite sophisticated view of this. Again, I'll be talking about this next week. And Galileo has 30 years of comprehensive research and best practices, hundreds and hundreds of case studies with data in them, benchmarks and so forth.
They found, as we predicted, that when they laid Galileo on top of these employee facing applications, it was significantly smarter and more useful at answering employee or manager or HR questions. Which doesn't surprise us, because the native Galileo does that today. But how do you get it in there in a way that everybody can use it? Well, there's a capability that Microsoft has now launched called fine tuning. It's now called Frontier tuning; the word Frontier is now the brand for all of Microsoft's new stuff. And what fine tuning does is you essentially take the intellectual property in your company, your policies, your procedures, your workflows, whatever you want to give it, and you readjust the model itself to understand that information. [00:06:38] It changes the weights and the actual characteristics of the model, and it becomes your model. I talked about what the words model and weights mean in my last podcast, if you don't understand those concepts. So you essentially have a Copilot that is your company's Copilot. And this is exactly what we've been talking about now for some time: AI is a personalization technology. It's not like a typical ERP where you buy it and you customize it and you just use it. It becomes you, it becomes your company. And now Galileo is integrated into the Copilot, and the early availability of Frontier fine tuning is now. We're working with some companies on this already, and you can get Galileo embedded into your Copilot for all of the employees in your company. And that means that all of a sudden the Microsoft Copilot becomes the smartest HR business partner you've ever hired. It may not know everything about your company yet, but you can put that in and you can tune it using this fine tuning system to deliver, either on employees' desktops or on a portal, an HR or employee experience that is extremely consultative, refined, personalized, and relevant to what your company is doing.
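For the techies, the idea that fine tuning "changes the weights" can be shown with a toy sketch. This is not Microsoft's Frontier tuning system or how Galileo is integrated; it is a minimal illustration, assuming a one-feature linear model and plain gradient descent, and every name and number in it is hypothetical.

```python
# Toy illustration only: a tiny linear "model" whose weights start in a
# generic, pre-trained state and are then adjusted on a few extra examples.
# All names and numbers are hypothetical.

def predict(weights, x):
    """Linear model: y = w * x + b."""
    w, b = weights
    return w * x + b

def mean_squared_error(weights, examples):
    """Average squared gap between predictions and the expected answers."""
    return sum((predict(weights, x) - y) ** 2 for x, y in examples) / len(examples)

def fine_tune(weights, examples, learning_rate=0.05, steps=2000):
    """Adjust the weights with plain gradient descent on mean squared error."""
    w, b = weights
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b)

# Generic starting weights (stand-in for a pre-trained model).
pretrained = (1.0, 0.0)

# Stand-in for company-specific examples; they follow y = 2x + 1.
company_examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

tuned = fine_tune(pretrained, company_examples)

print("error before:", mean_squared_error(pretrained, company_examples))
print("error after: ", mean_squared_error(tuned, company_examples))
print("tuned weights:", tuned)
```

The point of the sketch is only that the same model structure ends up with different numbers inside it after seeing your examples, which is what makes it "your model." Real language models have billions of weights and use far more elaborate training procedures.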
So we're working with Microsoft and clients on this. And if this interests you, I'll put the link into the notes and you can register for access to this early availability system from Microsoft and then from us. Okay. Microsoft also announced a whole bunch of other new models like this, designed to be used by developers at their desktops or in other specialized applications. One of the companies that spoke at the Microsoft Build conference was the Mayo Clinic. And although we're not the Mayo Clinic, I feel like we're analogous to the Mayo Clinic. They have years and years and years of knowledge about heart disease and medical diagnostics and drugs that they're fine tuning into their models to be used by researchers and clinicians all over the Mayo Clinic. So, you know, they're essentially doing this. And this is where HR 2030 is going: a personalized AI network inside your company acting on your behalf. And I don't mean autonomously, by the way. I'll talk about that more at the conference. Okay, so there's this issue of cost and edge computing, and then there's this issue of fine tuning models and making the models more relevant and more useful and more personalized. Then there's the actual edge computing itself. So let's talk about that. You know, we all use computers, Macs, laptops, PCs, whatever you want to call them, and they go through waves. The original IBM PC in the 80s was not super high powered, but it was very, very exciting. And it did displace a lot of capacity on mainframes. IBM at the time was very nervous about their mainframe revenue, but decided they had no choice. So they went ahead and launched this new platform. And then we went through a period of time in the 80s and 90s where basically Microsoft and Apple refined the graphical user interface, through Windows 3.1 and then Windows 95 and on into Windows today, to try to make it easier and easier to use.
And once these graphical interfaces picked up speed, including the Mac, of course, we got a whole bunch of new applications on our computers for graphics and PowerPoint and word processing and QuickBooks and thousands of other things. And then people started to play games. Games were always exciting because we never had access to games other than pinball machines. So we tried to use our computers for games, and they were really slow. And the computing industry loved that, because then you had to upgrade your computer and buy a bigger and bigger one and always buy the next Intel chip and the next computer and more memory and a bigger screen. And along came the graphics card. The graphics card had 3D graphics in the chipset on the card. And guess what? That was Nvidia. That's what Nvidia did. And Jensen was there. I don't know what his job was at the time, but Nvidia was building the accelerator chips for games. I had many, many, many Nvidia cards on my computers as a young man, always upgrading to the next one. They were, you know, $300, $400, $500 each. And you would plug them into the back of your computer and reboot it, and hopefully the drivers would work. And then, whoa, all of a sudden the game worked better and it was faster and you had better graphics. What they were doing was building a library of software and hardware that could manipulate three dimensional motion and many, many bits of color (by the way, we used to have 8 bit color, now we have 24 bit color), making things more interesting and more realistic. You could move things around on the screen, which was very hard to do in the early days. And all of that knowledge of how to take the physical world, represented in 2D, and manipulate it on a screen was building up and building up inside of Nvidia. There were other vendors too, but Nvidia became the biggest.
And we were buying these accelerator graphics chips, and they were mostly positioned for games, but people used them for all sorts of other things. And then, you know, all that happened and Nvidia was a successful company, went public and so forth. And then 2022 arrives and we end up with ChatGPT. [00:12:06] And it turns out, when you look at the models and what they do and the way they vectorize data (again, go back to the last podcast if you don't know what that means), you realize that what the AI is doing is very similar to what these graphical game processors were doing. You're manipulating very long strings of numbers and doing matrix calculations very, very fast. When you move a 3D object around on your screen quickly, it has to shade, it has to look at coloring. By the way, one of the companies that was very hot at that time was a company called Silicon Graphics, which I loved. I almost went to work for them. They bought Cray at one point. And I don't know what happened after that. Somebody bought them. Anyway, this 3D stuff turned out to be extremely useful for the types of vector manipulations that generative AI does. And Jensen, being a very, very shrewd and fast moving guy, managed to convince the rest of that company to pivot itself towards AI computing. And by the way, there's a story here about org design and organizational culture. Here you have a company that's very old, really, it's probably roughly as old as Cisco, that transformed itself multiple times: from a chip company to a gaming board company, to a software company, to a platform company, to a model company, to a pioneering R&D company. Same company. And for those of you that work in consumer goods or banking or insurance or any of the other places where you guys work, your company has to evolve just like this. And so they, through Jensen's leadership, and I think there was another founder, re-educated themselves and refocused themselves on this market.
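For the techies, the parallel between game graphics and AI models can be made concrete with a small sketch. This is only an illustration, assuming plain Python lists rather than a GPU library, and the helper names and numbers are hypothetical. It shows that rotating a 3D point and computing one layer of a neural network are the same kind of operation: a matrix multiplied by a vector.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def rotation_about_z(angle_radians):
    """3x3 matrix that rotates a 3D point around the z axis."""
    c, s = math.cos(angle_radians), math.sin(angle_radians)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

# Graphics: rotate the point (1, 0, 0) by 90 degrees around the z axis.
point = [[1.0], [0.0], [0.0]]
rotated = matmul(rotation_about_z(math.pi / 2), point)

# AI: one (hypothetical) neural network layer turning 3 input numbers
# into 2 output numbers. The weights here are made up for illustration.
layer_weights = [[0.5, -1.0, 2.0],
                 [1.5,  0.0, -0.5]]
features = [[1.0], [2.0], [3.0]]
layer_output = matmul(layer_weights, features)

print("rotated point:", rotated)
print("layer output: ", layer_output)
```

Both results come out of the same `matmul` function. A graphics chip is built to do huge numbers of these multiply-and-add operations in parallel, which is why hardware designed for games turned out to fit AI workloads.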
And Jensen's speech in Taiwan, which just happened last weekend, is worth listening to, if only to admire the evolution of that company. You'll see how fast they've moved. So what's going on right now is that edge computing is getting very, very big. And this Spark chipset, developed by Nvidia and now used by Microsoft (and by the way, Microsoft is partnering not only with Nvidia but with AMD, Intel, and other chip makers), is moving the compute demand, or the compute cost, into your personal computer. Now, I don't think you're going to run your corporate applications on a network of PCs. We talked about that for many, many years. I don't think that's going to happen. Although Sun Microsystems did that in the 2000s, and it worked really well for a lot of companies. So that's absolutely possible. But it does mean that software developers, hackers, and end users are going to get very local AI processing, and that will to some degree offset this massive cost and price for these computers in the cloud. So that's really the third big trend that's going to make our lives more interesting. And of course, that technology gets shrunk down onto your eyeglasses, your phone and many other devices, including the device in your house. Jensen actually talks about this. If you look at his video, you'll see a little picture of a little box that he believes is going to be sitting in your house. It'll be running all these agents for your house. Of course, my experience with home agents is I can barely get my sprinkler system to work. So I'm a little bit worried about hooking all this stuff up to my home devices, unless you have an IT department much more sophisticated than me. But, you know, that's likely to happen. There are companies like Sonos and Apple and others that'll make that easier and easier. So that's kind of where we are and what it means to you in HR.
And the reason this is kind of geeky is that this area of technology is worth your time. We'll do everything we can to demystify more and more of these things and help you with them. And your IT department is learning as fast as they can, too. I'm sure they're going through the same learning curve we all are, but the progress is incredibly fast. And I don't think this is autonomy. This is intelligent autonomy. It's precision business. It's making decisions in a more integrated, complete way. It's dynamic enablement of people, as I'll talk about next week. But it's coming fast, and I just wanted to highlight what's been going on. And stay tuned for our announcements on the Microsoft Copilot. Our Galileo for Copilot will be generally available later in June and July, so we're going to briefly talk about that at the conference. Not a lot, but it's coming. So for all of you that are Microsoft fans, you know, stay tuned. And by the way, Microsoft's very neutral on models. They have lots of models in the harness of the Copilot. So I think it's going to be a very big part of the market, and perhaps the biggest vendor of all over time. We'll see how that goes. Okay, that's it for now. That's my 20 minutes. Have a great week and I'll see all you guys in LA next week.
