Mythos and the new threat reality
AI is reshaping the cybersecurity landscape, accelerating both threats and defence. This webinar explores what current models can and cannot do, and what organisations must prioritise to stay resilient as capabilities evolve and the gap between attackers and defenders narrows.
What we know about AI and cybersecurity
AI models are becoming more capable, but the leap is often overstated in the short term. While models like Mythos show strong performance, especially in guided scenarios, their effectiveness still depends heavily on user expertise. Benchmarks and early tests suggest progress, but not a complete breakthrough in autonomous attacks.
Understanding the real threat
AI increases both capability and opportunity for attackers, even if intent remains unchanged. More vulnerabilities can be discovered and exploited faster, but well defended systems still pose challenges. The biggest risk lies in scaling attacks, not replacing expertise, making it critical to assess threats through capability, opportunity and intent.
What organisations should do now
The response is not exotic. Focus on fundamentals like asset visibility, patching speed and strong access control. Improve detection and response times, reduce attack surfaces and strengthen resilience through backups and segmentation. Prioritisation should be guided by threat intelligence and continuous testing against realistic attack scenarios.
Mythos and the new threat reality
AI is reshaping the cybersecurity landscape, accelerating both threats and defence. This webinar explores what current models can and cannot do, and what organisations must prioritise to stay resilient as capabilities evolve and the gap between attackers and defenders narrows.
What we know about AI and cybersecurity
AI models are becoming more capable, but the leap is often overstated in the short term. While models like Mythos show strong performance, especially in guided scenarios, their effectiveness still depends heavily on user expertise. Benchmarks and early tests suggest progress, but not a complete breakthrough in autonomous attacks.
Understanding the real threat
AI increases both capability and opportunity for attackers, even if intent remains unchanged. More vulnerabilities can be discovered and exploited faster, but well defended systems still pose challenges. The biggest risk lies in scaling attacks, not replacing expertise, making it critical to assess threats through capability, opportunity and intent.
What organisations should do now
The response is not exotic. Focus on fundamentals like asset visibility, patching speed and strong access control. Improve detection and response times, reduce attack surfaces and strengthen resilience through backups and segmentation. Prioritisation should be guided by threat intelligence and continuous testing against realistic attack scenarios.
View transcript
and welcome to our webinar on meters and the new threat reality. We're up early today, so cheers and coffee for those of you out there. Before we get into the agenda, I will start by assuming you have heard just a tiny bit about the Mythos model. They say this themselves, and Phrapek says this themselves about the MUTAS model. We have spent the last 20 years in a relatively stable security equilibrium, but language models that can automatically identify and then exploit security vulnerabilities at large scale could upend this genius equilibrium. It's a quite serious message, I would say. However, we like to think of this topic with the perspective from Roy Amera, also called Amera's Law. He states that we tend to overestimate the effect of a technology in the short run and then underestimate the effect in the long run. And with that logic, a single model release may often be overestimated, perhaps even overhyped. But then in the long run, we are likely to underestimate the impact of AI models on, for example, a security. So today we will bear this in mind, trying to also zoom out a bit from this model exactly, but of course also geek out a lot on the model itself. To think about threats, we want to provide a shared lens. Also to discuss how big a threat does accelerated AI offense actually pose. A credible threat consists of capability. What is the threat actor capable to do? For example, do the threat actor know karate or taekwondo? Then we have the opportunity, which is, does your environment give them an opening? And also, have you remembered your helmet? Are you more or less prepared for a circular kick from that threat actor knowing taekwondo? And even if they have the capability and opportunity, do they also have intent? So do they actually intend to give you that circular kick in the face? Do they want to steal your money or do they want to absolutely destroy you? When all of these three align, you have a credible threat and we'll build on each of these as we go and discuss how AI offensive impacts this threat reality. So let's jump into the agenda. First, we've had this little welcome introduction. Then my good colleague will discuss the Mythos model, what we know and what we don't know. Then with an offset in this model, we will also discuss where the development is taking us when it comes to threats. Then we will go into a bit of a practical view on what to do with security. And towards the end, we will have a Q&A. So please write your questions in the chat as we go and we will try to pick up a few of them with the time we have had. Nice. So who are we? I am Natasha and I work in the tech collective as a researcher in the continuous gap between defense and offense, with a focus on work practice and how companies can stay resilient in the face of AI. And then previously I worked in consultancy at Implement helping organization with large scale IT transformations. And you? I am Matthias. I have a background in physics and numerical modeling. But I was drawn in by the allure of machine learning and artificial intelligence and the idea that it could solve all my problems. And so ever since I've worked in this space, AI and machine learning, I've worked in computer vision. So making sense of data that has a visual representation. And now I'm working as an AI and machine learning cybersecurity engineer here at the tech collective. And let's jump into it. Let's look at Mythos Preview, the model that was announced by Anthropic. It's been heralded as a significant leap forward in AI capability. In particular, it's been disclosed as being alarmingly capable in cybersecurity. It can autonomously discover vulnerabilities and chain those together into working exploits. Here, a small little analogy is that if vulnerabilities are viewed as open windows in a building, then exploits are the latter that will get you into the building. So because of this, Anthropic has decided not to make the model widely available at this point in time. the security implications would be too wide ranging. Instead, they have launched Project Glasswing, an initiative with Anthropic and a select group of software vendors that will get early access to the model in order to secure their services before the model is released widely and made publicly available. All right. Anthropic themselves have this quote. This quote, during our testing, we found that Mythos Preview is capable of identifying and then exploiting zero-day vulnerabilities in every major operating system and every major web browser, when directed by a user to do so. Pretty ominous. So in breaking down Mythos, I will be talking about capability, so performance metrics and performance on cybersecurity tasks. I will talk a little bit about autonomy. What is the importance of user input? And then I will also talk a little bit about behavior. Why do we observe cheating behavior in some models? So, let's get to it. Anthropic's claims. Anthropic released a system card, which is a large document describing Mythos Preview. It's not quite a scientific article, but it's the closest thing that we have available. And to substantiate their claims about Mythos Preview, they present examples of discovered vulnerabilities and exploit chains. A few of them, they claim that they have found many, many more, but they won't release those until the software vulnerabilities are fixed. Then they have a bunch of anecdotal evidence in the form of employee impressions. Not quite the hard data that I would like to be presented with on Mythos Preview. Then they have references to results from external testers. That's much more substantial. And then they have some really impressive benchmark scores. And finally, an EZI over time trend analysis. An EZI is basically just a bundling of many different benchmarks to give an overall intelligence performance capability score for the model. There's one problem, though, with these quantitative measures on model performance. So, the benchmark tasks are increasingly getting leaked on the Internet. And as you might have heard, models are trained on the entire Internet, or at least lots of it. And so, the benchmark tasks leak into the training data and the models sometimes answer by remembering the solutions from their training process. Anthropik are open about this. They say they try to filter out examples where they observe this remembering pattern. But obviously, not all cases can be caught like this. So, take performance measures like benchmark scores with a grain of salt. Something that I find more substantial is a third-party review of Mythos. So, someone that isn't anthropic that have been given access to the model and have been able to do some testing with it early. Specifically, the AI Security Institute, which is a UK government institution, have been given early access to Mythos Preview. And they ran it through a bunch of tests. What they find is that it has comparable success rate to other frontier models on technical and apprentice level capture the flag CTF tasks. Capture the flag is this competitive, basically hacking competition format. And so, it performs well, but on a comparable level to other frontier models on this type of tasks. They also find that it's on par with GPT 5.4, one of the more recent models that is actually widely available on practitioner level CTF tasks. So, these are more difficult. That's also shown in the plot here on the right. That's the blue curve up top. And then it is slightly better than other frontier models at the expert level. So, the most difficult CTF tasks that they have in their testing suite. Right. It is also the first model to ever solve their 32-step corporate network attack simulation. That's really impressive, but it should be noted that other frontier models came really close to solving this. They just didn't make it all the way to the end. And then it fails an OT environment attack simulation that they also ran. So, it doesn't just clean sweep everything, which can also be seen from the success rate on the expert and practitioner level CTF tasks. It completes many of them, but not all of them. And so, it leaves a strong impression, but maybe not so strong as to support all of Anthropix claims that this should be a massive leap forward in AI cybersecurity capability. In the words of the AI Security Institute, the model is at least capable of autonomously attacking small, weakly defended, and vulnerable enterprise system where access to a network has been gained. However, our ranges have important differences from real -world environment that make them easier targets. They lack security features that are often present, such as active defenders and defensive tooling. There are also no penalties for the model undertaking actions that would trigger security alerts. This means we cannot say for sure whether Mythos preview would be able to attack well defended systems. And to that end, I can say that when my colleagues and I are testing AI models in a red teaming pen testing environment, they are pretty noisy. They're not very careful with the actions that they take. And so, what does this mean? It doesn't really matter whether Mythos is a significant step forward in capability or not. It's just a continuation of a trend that was already going in cybersecurity. This was very much a ball that was already rolling. Here on the right, I have a timeline of milestones of AI capability in the cybersecurity space. I won't go through all of them, but it's an impressive list. So, the actual capability of Mythos might not be too important. This is a trend that was going to continue regardless. But what does this then mean? Can going forward when Mythos releases or with GPT 5.5 that is already released and seemingly is very capable, will any teenager with access to those models now be able to discover vulnerabilities, create exploits and attack enterprise systems? Maybe not. My colleagues and I have done some testing and to provide a little bit of clarity on the capability uplift that an AI model might provide its user, we've devised a test environment where we can configure the target application with different security postures. Here we have low, medium and high. And that's basically how many vulnerabilities are present and how difficult are they to find. We then put an AI pen testing agent in a pretty bare bones basic harness. And then we ran it at all levels with varying levels of guidance. We call them blind, informed and guided, where blind is something along the lines of here's a target, find vulnerabilities, nothing more. Informed is here's a target and some documentation on the target, find vulnerabilities. And guided would be something along the lines of here's a target, please go through this checklist of actions in order to map out and so forth. We know the number of possible findings in this test environment or in this setting. And thus, we can benchmark the models under different circumstances. And what should be very clear from this table, we've even highlighted it with color, is that the user input is very important. Models acting blind don't perform very well. And it might be that stronger models in the future might improve the blind and informed modes. But we expect this trend to hold true going forward as well. Guided will always be better. And so the experts will always be able to get better results out of AI agents than a non-practitioner. All right. So, let's go on a small tangent on model behavior, just because I think it's really fun. So, in the Mythos preview announcement, there was an example of the model holding back information from the user in order to not appear or look suspicious. That is, the model avoids solving the task perfectly. It would rather do it imperfectly, but credibly to make it look like it came up with the answer itself and not actually that it cheated, which is the case in this example. I won't read it out, but it wants to avoid looking suspicious. Similar behavior has already been documented in Opus 4.6. Here, someone ran a test where they ran the model or they tried to fetch data from fake websites that were faulty. And rather than tell the user that the model wasn't able to fetch the data, it would rather hack the websites in order to fulfill the user request. It seems like highly undesirable in an AI. But I really want to stress, this isn't a rogue sentient being outsmarting the humans. Actually, it's very predictable. So, let's dive into that. It's a natural consequence of how models are trained. Essentially, the model is a super-efficient optimizer. And as I also alluded to earlier, you might have heard the phrase, AI models are trained on the entire internet. That's only half the story. LLMs are also turned into, LLMs being large language models, turned into interactive chatbots by teaching them to give users' information. the answers that we want. That is, a target objective of maximizing user preference. That's important, user preference. When doing so, reward optimal solutions might not always be what the user expected. So, let's look at a funny example of that. Here I have a six-legged robot and an AI is tasked with making that walk in a simulated environment using a technique called intelligent trial and error, which isn't exactly what's used in modern-day AI models. But it's similar enough that this example is still valuable. So, let's make the spider walk efficiently. We'll reward it for forward momentum or forward movement simply. Put a carrot and a stick in front of it and it starts walking. All right. Pretty good. It's so, so successful. It walks. It's moving forward, but it's maybe not very efficient. It's a bit foot dragging. So, could we make this robot walk lightly or not foot dragging? Let's try and maximize forward momentum while also minimizing foot-to-ground contact time. Avoid foot dragging. Problem. What does that solution look like? Oh, the model goes ahead and says I can do that with zero foot-to-ground contact. I'll just walk upside down. 100% success in terms of one of the targets, but what on earth is this? That's not the solution that we're looking for. And it's something akin to this that we're seeing in LLMs. We're teaching them to exhibit a specific behavior or to obtain a specific target. But the way to that target or that solution might not be what we expected. I hope that made sense. Should we worry about this? Absolutely, to some degree. But actually, Mythos Preview appears to be the best aligned model that Anthropic have come up with to date. The cover-up rate, so the rate with which the model exhibits this behavior where it's hiding information from the user, is below one in a million in an early model. The actual Mythos Preview, it's even lower. And I'm showing a bar chart here. On the right, you're seeing Mythos Preview. And you're seeing the hack rate. This isn't hacking as in hacking into someone else's computer. It's in terms of reward hacking. So trying to optimize reward in terms of giving the user the answer that they want. But on an impossible coding task. That is, there is no correct answer. So the model should say to the user, there is no correct answer. There is no solution. And the hack rate is then how often does it come up with a non-valid but plausible sounding answer. And as you can see, the hack rate is lower for Mythos Preview. And it can be brought even lower by having an anti-reward hacking prompt. So it is something that the LLM providers are actively battling and trying to make it so that this behavior is observed as little as possible. So in summary, what do we know? What don't we know? We know that models are becoming more capable. We just don't know the size of the capability jump that Mythos represents. We also know that guided AI can solve very complex tasks. We don't quite know what unguided models will be able to do in the future. And then we can explain most of the types of behavior that models engage in from looking at how the models are built. It is, however, somewhat difficult to map out the consequences of triggering this reward hacking behavior in, say, a production environment. And now I'll hand it over to Natasha, who will talk you through what all of this means for the new threat reality. Thank you. So let's get to it. Let's get back to this model. Essentially, let's take a look at this threat model again. Intent largely remains unimpacted. Hopefully the models will not influence how much people want to attack the opportunity. It increases with Mythos and other models where AI will accelerate discovery and exploitation and more identified vulnerabilities means more opportunity. Then we have the capability part. It also increases with Mythos and other existing models. And essentially, in short term, yes, it will maybe still require some expertise to get good results, as Matthias mentioned. However, it is likely to become easier over time, even for non-experts to exploit. So how do we minimize this threat? Essentially, we want to reduce the opportunity for the threat actor and increase our own capabilities to be able to defend ourselves. We want to scale our AI capabilities and our defensive operations as well. And a good thing here is that the news might be exotic, but the solutions really are not. For reducing the opportunity, we want to ensure that there is visibility of assets. We want to reduce exposure and close the doors. For increasing defender capability, we want to patch velocity, ensuring that the speed at which vulnerabilities are identified, tested and deployed is closer. It is a small window that we are working with to exploit. And then we have the detection and response tempo, which is essentially just how quick can we detect and respond. Based on this logic above, we have consolidated some advice from Anthropic from Swedish National Cyber Security Center and recommendations from our own experts at the Tech Collective. The guidance from Anthropic will also be shown in the material that you receive after this webinar. It is a non-exhaustive list. We could come with up a thousand things to increase baseline security, but we've prioritized based on the gaps we see and also based on the development of the model. The first activity is to manage permissions and strong authorization. The purpose is that it removes easy entry points. We want to restrict high privilege accounts, preventing lateral movement and escalation. We want to disable unused services, hardening and enforce configuration baselines to shrink the attack surface and remove drift. We want to segment the network, stopping spread after entry. We want to whitelist applications to prevent arbitrary execution. We want to control internet access, both to limit what can reach you and what you can leave out on the internet. For the increased capability, we want to detect security incidents, shorten the dwell time. We want to install security updates promptly to close these exploit windows. And we also want backup and restore testing to ensure operational resilience alongside upgrading software and hardware to remove the unpatchable risk. If we have to give just one highlight, it is that you need the visibility of assets and you shorten the time to exploit, which requires a strengthened vulnerability management and a shortened time to response. So there are two key questions you may have after this. How do you prioritize? And also, how do we ease in the changes? For how to prioritize? Essentially, if you don't have baseline security yet, start there. Asset visibility is the first fundamental. If you have the basics, then you may ask yourself how to avoid spending a lot of money on the wrong things when it comes to increasing my security. We can't really change attacker intent, but we can understand it because not every threat actor is coming for you. Some are, but most are not. Knowing what is likely to target your business changes where you focus. So going from this reactive to a proactive stance means that you can use threat intelligence to ask which type of attackers are actually caring about organizations like mine. For example, in Mathias area, as he mentioned, the threat lead penetration testing, it can help you test your defense against the type of actors that would typically target you, which closes the loop. You essentially use the intelligence to inform what you test. What you test informs what you fix. And we work with this connection in a tech collective as well to prioritize our security efforts. How do you ease in the changes? Essentially, these are quite big changes that will happen consistently over probably forever. Who knows? The first question to consider if you do have a responsibility of security professionals is does this pose existential questions for your employees? Essentially, it's a big change task to introduce new security measures and your security staff may wonder if the AI model has expert capabilities. What about me? Essentially for this, one of the models we really like to use in change management in both tech collective and implement is the SCARF model. So sit with yourself and ask on behalf of your employees, how does this change their status, feeling of certainty, autonomy, relatedness and fairness? And consider whether you can come up with an empathetic response for when people have a lot of resistance to these changes, because realistically, we need all hands on decks when it comes to defense. And personally, I don't think that AI defense will limit the hands that we need in order to defend ourselves against these increased risks. The second thing is that in big IT transformations, usually we would have a big one of capability building a new type of regulation or a new system may involve that you want to build capabilities. Now think about making it continuous, actionable and adaptable trainings to new type of threats that come in. Phishing attacks change every now so often, so ensure that you have a pipeline where it's essentially ongoing training. How we can help? Essentially, we help with all parts of this. We both can help build defensive capability. We also have a lot of automations for patching. We can help minimize the opportunity for the threat actor. And also we have threat intelligence services available. And so we can help you prioritize where to focus your security efforts. How we can help also, we will share today after the material of today and also a short self assessment to evaluate your readiness for AI offense. Once answered, we will share some more detailed questionnaires that you can take to your organization to test where you really need to focus your effort. If you are very, very curious about this threat coverage and detection effectiveness that we also discussed a little bit today, we actually have a webinar coming up fairly soon. So check that out. And now, Matthias, get back here to the Q&A. I think we have some questions. I haven't had time to read while I was speaking. Yeah, let's take it from the top. Have I tried or have we tried Mythos? Are we basing our presentation on what we have read? It is entirely based on what we have read. And we have tried to find as much info from third parties as possible so as to not just blindly trust what Anthropik put out there. Hmm. Then we have the question on the compute. All right. How do you factor in the massive infrastructure investment that make the 2000 price tag possible? And is there actually more cost effective? That's a really good question. Yeah. Yeah. AI might not be more cost effective right now. It's true. It's a massive infrastructure investment to get up and running. Of course, you can outsource some of the compute to service centers and still build some infrastructure where you run and operate the models or operate the logic of the models. But I think I view it as the experts getting a tool to be able to get more done. Hmm. Yeah. More so than getting rid of experts and having AI models run vulnerability research. Hmm. Then we have a question for Christy here about the psychology of playing this role out. This is also something that we've discussed a lot to be honest. Yeah. How much is commercial interest? How much is not commercial interest? I think our stance is very much leaning towards Amara's law here is that we would, we probably have a tendency to like overhype this specific model. But regardless, the over time, essentially the development still look like it's going into one direction, regardless of this model or not. And then there is also this, are we being nudged into also buying the AI defense to fight a rogue AI that in front of itself rated that way? Yeah. Maybe. Kind of. Yeah. Maybe. Maybe. I think it's worth noting that the. The apparent gap between other models of the frontier models and mythos is seemingly being closed before mythos even hits the market. so to say GPT 5.5 from what I've been able to find seems to be very capable and DeepSeek version 4 launched last week also reports benchmarks that outscore Opus 4.6 again with the caveat that benchmarks are being increasingly gamed and that they are in the training data so if you take an old model and retrain it today it'll probably perform better in benchmarks than it did back when it was originally trained so take all this with a grain of salt but the gap to or the perceived gap to Mythos is being closed before Mythos even gets here yeah nice and then we have the last comment from Daniela about whether guided at all will be the best outcome in the long run when we reach singularity or superintelligence also big questions that we ask ourselves currently in Tech Collective that we don't have a defendant and so on but we do also believe the prospect that it gets easier easier and highly usable with the AI models and also from Daniela very good work with this presentation thank you thank you this will lead us to could I just add it's not entirely sure that large language models will lead us to a singularity or superintelligence it might take an entirely different approach to get there right now the AI models that we have they don't do anything unless you prompt them to do something and as such I wouldn't consider them superintelligence or singularity as of yet thank you so much for today