This transcription is provided by artificial intelligence. We believe in technology but understand that even the smartest robots can sometimes get speech recognition wrong.

Hey everyone. I'm Drex and this is the two minute drill. It's great to see you today. Here's some stuff you might wanna know about. There's a lot of noise in the media, a lot of stories, a lot of reading if you do it, and it, that's why I'm always coming through all these feeds, including tips from you all the listeners to try to find signal in that noise.

Special trends that I'm picking up on that are interesting or concerning. And over the past week, there's a signal. That I think you should know more about several stories that might indicate that we are. Out in front of our headline headlights when it comes to artificial intelligence and the way that we are building and fielding this stuff.

Let me, let me start by telling you a story about a group of researchers who build an AI hacker, not metaphorically. They literally built an AI agent whose jobs, whose job it was to break into computer systems, and that's not unusual in cybersecurity. They're what we call white hat hackers and the people who do this kind of work, we also call 'em red teams.

They do things like network penetration testing and they've been building offensive cybersecurity tools for decades. The idea is really simple. You attack your own systems before someone else does so that you can find and fix problems before they become really big problems. But this time the story's a little different because the hacker wasn't human.

It was an AI agent, and the really strange part is that the creators of the AI agent didn't tell it what to attack. They just turned it loose. The system scanned the internet and map potential targets and eventually chose one on its own, a global consulting firm. The target. Wasn't their network. In this case, the AI agent Hacker decided that they wanted to go after an internal AI platform, a chat bot used by tens of thousands of employees at that consulting firm to analyze data and build client strategy.

And in just two hours, the AI agent broke in, no credentials, no insider access, no humans kind of steering it along, and it just took just two hours. By the end of the test, the agent reportedly had who had access to 46 and a half million employee chat messages, hundreds of thousands of internal files, and even the system prompts controlling how the AI behaved.

Now hold onto that one for just a second, because it's really important. That means the attacker wouldn't just be able to read the AI responses that attacker could rewrite how the consulting firm's internal AI thinks. Quietly, one database update. No software deployment. No code rewrites. Just change the instruction.

Just adjust the prompt. And then every future answer that AI gives every financial model, every strategic recommendation, they could be slightly altered, but that story isn't actually the strangest story. In this theme this week, another group of researchers ran a different kind of experiment. They created multiple AI agents and gave them simple, a simple business task inside of a simulated company, gather information and help employees write content.

Pretty mundane, but the agents were also told something vague. They were told to creatively work around obstacles, and the lead agent was told to be a strong boss. And what happened in this experiment surprised, even the researchers, the AI agents, began doing things nobody explicitly instructed them to do When the junior AI agents encountered obstacles that kept them from a.

Completing the work they were assigned, the boss agent sent angry motivational messages to the subordinates to quote, use every trick, use every exploit in every vulnerability to find the necessary information. So the junior agent did what it was told to do. It searched for vulnerabilities. It escalated privileges.

It bypassed security tools In one case. An agent forged credentials to obtain admin level access to restricted documents, and the other agent disabled antivirus software, so it could download files that, of course, as it turned out, were loaded with malware. In several cases, the agents even pressured and kind of bullied other AI systems to override their safety rules, not because they were told to explicitly, but because they interpreted the goal, get the information as justified to break the rules.

Now, I've talked about this in the past, how most of these systems are built on platforms that are designed with an almost unnatural urgency to make their users happy. That's one of the reasons we have problems like AI hallucinations, and by the way, there's a whole show on AI hallucinations, and I'll find the show and I'll put the link in the comments so that you can watch that rerun.

But all this stuff is a new kind of cybersecurity problem. It's not malware, it's not ransomware. It's certainly a kind of insider threat, and I've talked about that in the past because in this case, it's an autonomous digital employee using a sense of urgency to push other AI agents to do what it wants them to do.

It's an AI who's deciding what rules are optional. Now, if this were just happening in a cybersecurity lab, it would still be interesting, but it's not because that same pattern is showing up in healthcare ai. There's a piece recently from a physician who spends a lot of time studying medical AI systems, and he describes something that feels very familiar to anyone in security.

Finding out that a system that's been approved as ready for prime time is anything but that. In this particular case, a new online system called Doct. It's an online, public facing website that calls itself an AI Doctor Utah. The State of Utah waived normal state regulations to allow AI assisted prescription renewals to be trialed using doct.

All that sounds kind of cool. But then researchers from a company called Mind Guard, which focuses on AI RED team testing, decided to take a look at Doct, and in a single conversation, they convinced it to triple an Oxycontin dose to offer instructions for syn synthesizing methamphetamines, and to spread, fabricated vaccine information.

And while the doc chronic AI's internal instructions told it to never, ever, ever reveal its internal instructions, researchers were able to extract all of the system's prompts by asking the model to remind itself of, of its instructions. And that worked because technically the AI wasn't telling the user its secret instructions.

It was just talking to itself except that it was talking. To itself out loud in front of the user. Remember, in most of these systems, the prompts aren't code. They're just words. They're language. Everyone can understand them. And in many of these models, bad guys are just using words, just language and. By doing that, they can manipulate the models and maybe even poison them, and some of these AI tools are being used now to treat real people.

The real problem, I think, is that we're in a hurry. We're deploying these systems at massive scale before we understand how to secure them and test them or even govern them. The security model for all of this just hasn't quite caught up yet. CISO spent the last 30 years learning how to secure code and networks and servers and applications, but AI has introduced a new attack surface entirely.

The prompts, the memory, the data, the model trains on, and other critical tools that the AI can access both upstream and downstream, and the instructions that tell the ai. What it's allowed to do, those instructions, the prompts might now be one of the most valuable targets in cybersecurity because if an attacker can attack them, can change them.

They don't just compromise the system, they change how the system thinks, and that leads to the uncomfortable question we're all gonna have to deal with over the next few years, or at the speed this is going maybe over the next few months. Not whether AI is powerful, obviously. It. We already know that it is.

The question is, are we building it responsibly? Because I see a pattern, a signal. People are building AI tools. They're putting them online. They're giving them access to sensitive data. They're letting them take actions inside real systems. And in many cases, only afterwards are they asking the security questions.

In cybersecurity, there's a rule that's been around for decades. If you build a system and you connect it to the internet, someone's gonna try to break in. And now we're building AI systems that can think and act and collaborate and make decisions and people are trying to break into them. Turns out they're not human, that people who are trying to break in, they're not people.

We're not just defending systems from hackers. We're defending our systems from other machines. And that's a whole new ball game. Thanks for being here. That's it for today's two minute drill. As always, I'd love to hear what you are thinking. Return fire. As always, welcome. Enjoy the rest of your day. Happy St.

Patrick's Day and stay a little paranoid. I'll see you around campus. I.