Security Transcript

Chaired By: Brian Nisbet, Tobias Knecht, Markus de Brün
Session: Security
Date: Tuesday, 21 October
Time: 16:00 ‐ 17:30 (UTC +0300)
Room: Side Room

4pm.

Side room.

21 October 2025.

SECURITY WORKING GROUP.

BRIAN NISBET: Hello, hello and welcome to the RIPE 91 edition of the Security Working Group, in person, physical, 3D, whatever way you wish to describe the meeting.

I am Brian Nisbet, with my co‑chairs: Tobias Knecht is here, and Markus is online, I believe, as well. Markus de Brün unfortunately couldn't be present with us in lovely Bucharest this week.

So, yes, so again welcome to you all. And thanks to, and welcome to indeed, our scribe and chat monitors and wonderful steno people. Obviously all of this is being streamed live over a variety of platforms globally, including Meetecho, and if you are joining us on Meetecho, please do interact as well, either via the chat Q&A function or indeed put your hand up, show us your lovely face, unless it's like four o'clock in the morning wherever you are or you have terrible bed hair, I have no frame of reference myself. All the interactions we are making in the meeting are covered by the RIPE Code of Conduct. There's a lot more detail in it and it's a lot more complex than "be excellent to each other", but all the interactions of this working group are covered by the Code of Conduct. We are using a new system for everything at RIPE meetings, the talks and all the rest, and we are all getting used to this, so thanks to the folks at the NCC for guiding us through it. You can rate the talks in the working group via that system if you so wish, and we would encourage you to do so because it gives us information on what's important to the working group and what you would like to hear more of, what was useful, etc.

So yes, so, we sent some minutes out, nobody said anything, it all seems good, if nobody else says anything now, we'll take them as approved. We'll take them as approved, yeah.

Does anyone have anything they would desperately like to add to the agenda at this point in time?

No? OK, cool, we'll proceed with the agenda that we have.

So we can move on to the next slide deck please.

Yes, excellent, wonderful, there's only one item on that. I am about to ask Dave to get up on stage unexpectedly. So, recent list discussion: there hasn't been any. We would love there to be some discussion on the mailing list. There's a whole shiny mailing list there, almost like fresh fallen snow, which would be wonderful to have digital footprints all over, and lots of discussion on important relevant topics. So I don't have anything to talk about here, but the mailing list is there and is open to everyone. So it would be wonderful if people wished to talk on it.

On the online agenda, we mentioned policies. We don't have any policies either, we only had once or twice, but again as always, if there are relevant policies in relation to security, come and talk to us. The policy process can look intimidating, but the whole or one of the whole points of co‑chairs is to help anyone who wishes to submit a policy, to talk to them about that, to try and work with them, to bring that through the process. So never look at it and go, I have an idea for a policy but I am all alone and scared. Email the co‑chairs, talk to us about it. And importantly, if we sit there and we go, actually, you know what, a different working group should handle that policy, then we'll tell you. We'll introduce you to the lovely co‑chairs of the other working group and again joyously as a community we'll hopefully bring a policy together.

So, interaction with law enforcement agencies and associated government entities, departments of justice, all of this kind of thing is really important, and I am on multiple record of having said this. It's really important that we as a community engage and show to them that they are part of the community, and also show to them the tools, pieces, the policies, all the things that are there to help everybody, including them, in navigating the internet.

So it's in our own self‑interest to have them involved in the community, very much so.

So it was very nice of Europol to invite me to their annual cyber crime conference, which took place a couple of weeks ago in The Hague, and as an addendum to that the NCC were running a round table and some training which they invited me to participate in as well. I thought I would take a moment and talk about that and give you my reflections on that event. Unfortunately, due to inclement weather brought about by the terrible things we are doing to the planet, I did have to leave early. But I think it was a suitable time to kind of get the feel for the event.

So, I mean, Emanuele can tell you a lot more as he is one of the organisers from Europol. It's an annual event, it's a mix of law enforcement, so many cops, regulators, data protection people, cyber security companies, experts from them, civil society, obviously RIRs, etc, etc, a whole bunch of people related to the area, and I found a lot of them super interesting to both listen to and talk to. On the training day, I think we had about twenty‑something people in the room from a mix of different European law enforcement agencies, and it was really useful.

Now the downside is I have been at a number of these before and they are always great, and then unfortunately, quite regularly, as we have all experienced, the law enforcement people change and get sent somewhere else and go to a different group. The documentation they leave behind them is not great, and so the person who is now taking on ‑‑ who has been transferred from traffic or from wherever else ‑‑ one Irish girl I was talking to had been transferred from crime scene investigation into cyber crime and has to learn anew all of these things. That's the biggest challenge: the changes of personnel and the losses in information. And the conversations we were having in The Hague very much reminded me of the last time I was at one of these training sessions, which was ten years ago, with a similar reaction to the great tools and the great information, and the concern on my part: will these people still be in that role in six months, will they still remember? I mean, there was a guy there from Iceland who couldn't wait to get home to use RIPEstat, and who amongst us hasn't been unable to wait to get home to use RIPEstat.

But it was really positive and it is also really useful and I would say this about myself but I think it's really useful to have an operator too in the room as well. Because the NCC folks, they know their stuff and they can talk really well about it, it's really great to be able to say as an operator of a network, here's an anecdote that refers exactly to the thing that's happening. I think that's a really good mix. And I would ‑‑ it doesn't always have to be me, but I would encourage again when the NCC are doing it to bring those people along.

And I think there is all of that conversation, some of the law enforcement people desperately want to arrest a server somewhere, they want to know exactly where in the world something is, that's not possible, it's about talking about what is there, what is possible, what's in the database and how ‑‑ and in the registry and how you can navigate through that. Again, a really important topic and one that some people definitely get and again, my experience of this, this is all my reflections, this is not stating god given fact, and some people kind of never will, you know, they will always want to be able to press a button or pick up a phone and find out exactly where the bad actors are operating from and that's something that we as a community have to navigate, we talk to people and explain.

So the conference itself. I thought it was really, really useful. I also felt, in conversation with a few of the people from the technical side of the house, that we were almost the bad kids down the back of the room in the auditorium. Yeah. Being a bit, I am going to say this thing, it's going to be about encryption, obviously. There was at one point someone on stage ‑‑ and I am not going to mention any names or details because that's the nature of the conference. If you took your phone out to look like you were going to take a photograph, you got a polite tap on the shoulder. And these were not the kind of people you wanted a polite tap on the shoulder from. But there was a comment that started off with "privacy is a fundamental right". And I had already whispered the word "but" to the person beside me before the person on stage said the word "but".

And that is a thing that we need to be honest about. That's one of the biggest, I think, clashes between technical society, between our community and ‑‑ sorry, between parts of our community and other parts of the community, which is law enforcement.

And that's the big conversation. That's the big difficulty.

There was a comment made about compromise being needed and I understand compromise and I think the point I felt wasn't being raised or understood is you can't compromise with mathematics.

You can't change, you can't do something to encryption in a compromised fashion, it's a binary state, it's either there or it's not there and again I think that's one of the really complicated things and the things that it was really very clear in the conference that that was the fundamental divide.

And it's really important to keep talking about that. And keep trying to explain and keep trying to discuss. But we are not hiding a secret way of breaking encryption, it's not that we don't want to tell people, it's that it doesn't exist. And I certainly don't have a solution to this in any way, shape or form.

Importantly I really believe we all want the same thing.

We all want the internet to be safer for its users, we all want it not to be enabling crime, if at all possible. But that doesn't change the fact that there are different ways of approaching that.

The conference also shared a number of operational reports, which really were quite eye‑opening in relation to the work that law enforcement are doing. Again, without breaking encryption. And I wouldn't do that job. I will stand here and I will be very honest, I would not do that job. Certainly, looking at some of the pieces which were shared around things like AI‑generated CSAM is scary. I mean beyond scary, I think scary is far too mild a word. There's lots of stuff there and it's really important that it's shown and seen, and I think it's important that the technical side of the community sees more of that as well. And it was great to see the companies that are interacting in special operation groups or whatever else with Europol and other groups, to show that interaction.

So lots of very positive things out of that.

But I don't want to ‑‑ I am going to be very blunt, as I frequently am, about some of the problems I see. And of course, like any conference, like we are all doing here, networking is super important and the coffee breaks, as always, were, you know, worth the metaphorical price of entry. Even the queue. Because there are metal detectors and things, the queue was 20 minutes long to get into the building, and that was really useful. I met a bunch of people in that queue who I now know, hey cool, these are important people I want to talk to. The networking was useful: I met a bunch of Irish guards, the Garda Síochána, the Irish police force, who I am now like, great, you are the Superintendent in the cyber crime division. As an operator, I had never spoken to them, so I met them at this, and we are going to have meetings with them, it's going to be a positive thing. The networking is useful, not just with law enforcement but with other members of the technical teams as well, a really great opportunity. And I think the law enforcement people, the regulators, they felt comfortable in a space in a way they don't always feel comfortable in this space, and that's why that exchange is so important, because we all feel different levels of comfort in different spaces and being in different places enables that dialogue, which is super useful.

So that's kind of it. As I said, this was reflections, it was general, it was a very positive event. There were definitely things that concerned me still there and conversations we still need to have. But having that kind of event, in the same way we are welcoming law enforcement here, seeing law enforcement welcome the technical community into their house is vitally important as well, and I was very glad I went, even if I had the anxiety of "will my plane take off" because of Storm Amy and things like that. And yeah ‑‑ but no, very, very good, and obviously that dialogue needs to keep going.

So I don't know if anyone has any questions or comments. This was me reflecting as me. These views do not represent necessarily even security working group, definitely don't represent the NCC, I would not put words in their mouth. But I wanted to share and I hope it's useful.

So, if, no, cool, yeah

RUEDIGER VOLK: Kind of.

BRIAN NISBET: You still need to say who

RUEDIGER VOLK: Ruediger Volk, still retired and still around. I would like to ask whether that has been essentially the only political European‑level thing that you observed since the last RIPE meeting?

BRIAN NISBET: It's the only event I was at since the last RIPE meeting, it's not the only political level thing I have observed since the last RIPE meeting but it's the only event I was at. I mean...

RUEDIGER VOLK: Well, the question that comes to my mind: at the last RIPE meeting, in the connect working group session ‑‑ not Connect, the Co‑Operation Working Group ‑‑ I think something was presented and asked for that might be relevant for the Security Working Group to follow. We did have as guest speaker Martin Rudiger from the EU Commission, who was talking about the NIS2, let me...

BRIAN NISBET: Last week was the first anniversary, lots of countries not transposing it into law.

RUEDIGER VOLK: Well, the thing is, he was announcing that they were starting to create a multi‑stakeholder forum on internet standards deployment for the purpose of giving flesh to the NIS2 implementing act, and specifically it mentions cyber security risk management measures and then, well OK, more specifically DNS and routing security and e‑mail. And my question would be: did you observe anything happening there, and is following that activity something that the working group should be involved in?

BRIAN NISBET: So I feel you might have some follow up to that. From my point of view, I have not been involved, I don't think Tobias ‑‑ I don't know if you have or not. Yes, we probably should be, can we enable Tobias's microphone? Or, yes. We probably should be, yeah. But like again the NCC may well be doing some stuff there.

TOBIAS KNECHT: So there are several things going on in the European Commission at the moment, and there are other types of organisations starting to evolve, especially with what happened at ICANN with the regulations over the new contracts about abuse management and abuse handling. There's obviously a lot of movement at the moment. At least from my perspective, in my professional life and also a little bit with the hat of the chair of the Security Working Group, there is a lot of conversation at the moment, also at EU level, on how to tackle the whole issue of online scams, online security and so on and so forth. I don't know, it doesn't feel very coordinated at the moment because there are a lot of things popping up and they are also going away again, but there are a few that stick. So it feels like this, you know, super early ‑‑ although that's a little bit jokingly meant ‑‑ super early way of: we now need to think about security and we need to think about online scams and so on and so forth. So it feels like there's something moving, but I don't know, you know, it's a little bit early to say what's surviving at the end and what's coming out of it. Thank you.

HISHAM IBRAHIM: Hisham Ibrahim, RIPE NCC. We have been talking with the European Commission since the beginning of the year actually, and maybe even last year, about what they have been looking at when it comes to security under NIS2, right.

The talk that Ruediger is referring to was the first talk that the European Commission came and if you ask me I think they are taking this the right way, they are bringing it to the communities trying to learn from them what is there, have a better understanding of how to cooperate before they put something forward. They do have something under NIS2 that says something about encouraging best practices when it comes to security and up to date technology or something like that. I don't know the text off the top of my head.

So they came to the RIPE meeting first. This was the first place they presented it. I know they went to the IETF after that and we are still working with them for them to understand what scope do they have there. To Ruediger's question about does the security working group have a role there; I would say absolutely yes. I think there is a lot that could be done between the working group here and co‑ordination whether with the NCC or other bodies.

We had presentations today from the IETF. Mirjam organised a sit‑down for me with people from the IAB as well, to also talk about standards and what we can do in terms of co‑ordination, and I think there is room for us to do more there, to demonstrate that yes, this is something important and that is high in our minds.

BRIAN NISBET: Yes. And I think we can talk about this and I am not saying not, but we have a limited amount of time, this is almost like what a mailing list is for as well, to raise things, to discuss things, to see who the right people we need to get in the room are.

So do you want to join the queue for the mic.

SPEAKER: Alex de Joode from the Amsterdam Internet Exchange. I think it's very important to realise that NIS2 is targeting organisations, so it's not about protocols but about organisations. Basically, if you have an ISO 27001, you add liability for your director and you specify the supply chain a little bit better, you have NIS2. So this discussion, or the conversation we have with RIPE, is not only about how are you managing the EU connection. A lot of members here are members or customers of RIPE, so they also have expectations about how I can secure my supply chain, and for IP addresses and RPKI, RIPE is an essential part, so I do expect some NIS2 compliance sufficient coming out of RIPE NCC. And if we as a working group can help with that, that's fine. But there are some expectations also from customers, at least about the security stance and the posture of RIPE.

BRIAN NISBET: Yes, absolutely.

RUEDIGER VOLK: OK, I would like to respond to your statement that NIS2 is only about organisations. Actually, this activity definitely is saying, well OK, we want to talk about protocols, standards and best practices, and they will come up with, well OK, documents that will be used to check whether an organisation actually deploys acceptable stuff, and if you don't, beyond just the organisational certification that you have, you may get into big trouble. That seems to be the plan. And the plan, by the way, as presented in Lisbon and in Madrid at IETF a few weeks later, is calling for essentially ‑‑ actually the activity is very well structured, would we expect anything else out of Brussels? And the timeframe looks like the set‑up of the activity should be already almost done, so this is pretty serious stuff.

BRIAN NISBET: OK. We do need to move on, I think it's a... yeah, please.

EMANUELE IOVINI: Emanuele Iovini, Europol. I was the organiser of the workshop at Europol, I actually invited RIPE NCC because I want to train our law enforcement... how we can get information because I think that the key words here are co‑operation and understanding, the main reason... is already the third event I attend is because we want to understand each other. We want to understand what we do and why we do and I think is when we find a good way to communicate, we have almost the same goals actually, to have a clean internet, so that's why also coming to our cyber crime conference is to let you understand what we do for the greater good. For example... is a big problem at the moment combined with AI.

So what I want to show is that we are here, we want to be part of the community, we don't want to enforce anything. For example, the reason why we couldn't allow photography was to protect our investigators, because they have families. They are people with lanyards that don't want pictures, like we do here, nothing different from here. We respect the people, we try to balance. Not the term compromise, it's more balance, because sometimes I would like to arrest a cyber criminal or someone who promotes child abuse or something instead of his own privacy, so sometimes it is this balance of rights. But this is just to make you understand that we work towards the same objectives and towards the same goal, so this is a start. I think we are on the right way to cooperate with each other. We have to train ourselves, our investigators, because we have to make sure that most of the investigators know the best practices. It's really hard because every country has thousands of investigators working in different fields. But I think we are on the right way. We can do it. So just to thank you for this moment, to show that you also care about what is the law enforcement perspective on the internet governance aspects. Thank you.

BRIAN NISBET: Cool. Thank you.

SPEAKER: RIPE NCC, thank you. Just to add, because we are monitoring this NIS2 forum that the Commission is about to launch: the policy officer that presented in Lisbon is called Ruediger as well, and he just informed me that the next step is now coming up, so they are going to launch the call for interest in the coming weeks. They are going through a lot of bureaucracy, so they got some delays, but it's coming. And they will have a launch event in Brussels around December, if that timeline is still correct. So we are watching and we understand the concerns on both sides, but this is something we need to tackle also within this working group as well.

BRIAN NISBET: Yes, something to do with NIS2 has delays, I am shocked ‑‑ yes, thank you for that, it was a good event, and I am glad I got to speak about it here.

OK, let us move on then. We have two presentations, and we have Yury, very, excellent. So Yury is going to talk about a thing that I am going to remember when the slide comes up and I can read the title! There we go, yes. So: leaked secrets in cloud buckets and responsible disclosure. Yury, please.

YURY ZHAUNIAROVICH: Thank you very much.

(APPLAUSE.)

So, thank you very much for coming. My name is Yury Zhauniarovich, I am representing TU Delft, and this is joint work between TU Delft, a company in the Netherlands that performs internet scanning, and Leiden University. I am an assistant professor, so I am not doing heavy lifting any more; the majority of the heavy lifting is done by Soufian El Yadmani.

This work was partially funded by two NWO projects. Today I want to tell you about our small step in how we tried to make the internet a little bit more safe. So you know that modern applications are not built from scratch; basically a lot of external services are used. And these services are protected with passwords, user names and password keys, API keys, tokens, etc.

And it has always happened that developers fail to protect the secrets properly. For instance, a pretty recent case was that a Toyota customer database leaked because the secret key for the database was on GitHub in the code, hard coded, so the details of 300,000 Toyota customers were leaked.

So also there is, it shows a million times, 50% of the infrastructure‑as‑code scripts, they had it hard coded for more than 20 months. And other research shows that 74% of the leaked secrets are never revoked, so this is, I would say, kind of worse signs for us. And there are different ways the secrets can be leaked. Obviously there is GitHub code, which in the past was full of secrets, and GitHub introduced very nice features to detect the secrets and notify the owners of the code about this. But there are other ways, like containers, Docker containers. Last week there was the ACM CCS conference, and one of the distinguished paper awards was given to the presentation about leaky applications, where many Android and iOS applications were detected that contain secrets inside.

So what is the research question we tried to answer? It was well known that cloud storage buckets contain private information. But we wanted to discover whether within this cloud storage there are API secrets that could give us the possibility to move forward. So what kind of secrets can we detect there, and what are the infrastructure and third‑party services we can use with these keys?

So the methodology was pretty straightforward. For this work we used a platform called GrayHat Warfare. What it does: it collects from all over the internet the names of the buckets on AWS, GCP and Azure, and then it tries to list the files inside the buckets. So if the buckets are not correctly protected, like the permissions are not well configured, you can list the files that are there, and this platform collects the meta information like the full path to the file, the last update time, the created time and the file stats. So we used this platform to search for particular extensions which, according to our assumption, may contain different types of secrets.

We downloaded these files locally because we wanted to analyse; we removed the duplicates and we tried to search for the secrets and then we validated the secrets and of course because we are academics, we did the responsible disclosure and monitored the keys revoked and so on.
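The deduplication step mentioned here can be sketched as follows. This is a minimal illustration, not the authors' tooling: grouping by a SHA‑256 content hash is an assumption, and the function names and the sample paths are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_by_content(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Group file paths by the SHA-256 digest of their content."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path, content in files.items():
        groups[hashlib.sha256(content).hexdigest()].append(path)
    return dict(groups)


def duplicate_ratio(files: dict[str, bytes]) -> float:
    """Fraction of files that are redundant copies of another file."""
    if not files:
        return 0.0
    unique = len(group_by_content(files))
    return (len(files) - unique) / len(files)


# Hypothetical sample: three paths, two of which hold identical bytes.
sample = {
    "bucket-a/deploy.ps1": b"Write-Host 'hello'",
    "bucket-b/deploy.ps1": b"Write-Host 'hello'",
    "bucket-c/config.env": b"REGION=eu-west-1",
}
```

Only one representative per hash group then needs to be scanned for secrets, while the grouped paths record every bucket in which the same content appeared.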

So we were searching like number of ‑‑ some of them are quite well known so in the community some of them, we developed ourselves.

So we had just several prerequisites for which secrets we could use. First of all, I don't know if some of you are aware of TruffleHog: it detects different kinds of secrets, and you can use this tool as well for detecting if you are leaking some secrets. But in our case we wanted to have secrets which we were 100% sure were correct and valid. And the second prerequisite was that we wanted to collect only the secrets that we could responsibly verify, so basically using something like an API endpoint which we can just test for an HTTP 200 without downloading additional, possibly private, information. So these were the secret patterns that we selected for our work.
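A pattern‑based search of this kind might look like the sketch below. It is an illustration under stated assumptions, not the patterns used in the study: the regular expression assumes the widely documented shape of long‑term AWS access key IDs (20 characters starting with `AKIA`), the key in the sample is the placeholder from AWS documentation, and `find_candidates` and `is_live` are hypothetical names. The network probe itself is deliberately left out; only the interpretation of its status code is shown.

```python
import re

# Assumed pattern: long-term AWS access key IDs, "AKIA" plus 16 characters.
AWS_ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_candidates(text: str) -> list[str]:
    """Return unique candidate key IDs in order of first appearance."""
    seen: dict[str, None] = {}
    for match in AWS_ACCESS_KEY_ID.finditer(text):
        seen.setdefault(match.group(1), None)
    return list(seen)


def is_live(status_code: int) -> bool:
    """Count a candidate as valid only if the probe endpoint answered 200."""
    return status_code == 200


# Hypothetical file content containing the AWS documentation example key.
sample_text = (
    "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
    "# copy: AKIAIOSFODNN7EXAMPLE\n"
    "not_a_key = AKIA123\n"
)
```

Keeping the match step separate from the liveness check mirrors the two prerequisites: a pattern narrows the candidates, and only a harmless status probe decides whether a candidate counts as a valid secret.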

And for responsible disclosure we co‑operated with a Dutch not‑for‑profit company, CSIRT global. Once we discovered that we had found too many secrets, it became very difficult to do the responsible disclosure ourselves. CSIRT global is already connected with the companies, and they have the expertise in how to perform open‑source intelligence to find who is the owner of a particular asset. And they have the resources to do this en masse: notify and monitor the results.

So and we did this responsible disclosure like for ‑‑ and collected and monitored the results for five months, between April 15th and September 15th in 2024.

What were the results.

So basically, just in the beginning: GrayHat Warfare scans the internet around every couple of months and publishes the latest data set. For instance, the data set we used, from 19th January 2024, contained information about more than 450 open buckets. So this is kind of, I don't know, mind‑blowing for me. And the numbers can go up and down and they had some others recently, so this amount of unprotected buckets is kind of shocking.

And we were searching for different extensions within these buckets, and we found more than 9 million files which we were interested in analysing. But of course it's also interesting to see the duplicates, and for us it was very interesting to see that if you just download randomly 400 different buckets, you can figure out that around 30% of the files inside these buckets are duplicates. So you have different customers, but the files are sometimes very similar.

And with the extensions we were interested in, with the files which we downloaded, the percentage of duplicates is even higher. For instance, for PS1 scripts you can have up to 80% duplicates. So pretty interesting, yeah.

And we detected 215 leaks. Here on this slide you can see what are different types of the leaks that we found.

So obviously as Amazon is the most used service, we found more than 100 exactly AWS access tokens and some of these cases were pretty terrible so for instance with one token, we got the access to 300 more buckets like protected with this token so you can download a lot of private information there.

The most shocking example was the CrowdStrike Falcon admin key. CrowdStrike Falcon is a kind of EDR system, and with this key you can completely take over the organisation's network. So with the EDR, you can upload malware, you can delete logs, you can be completely undetected there. And, I don't know, with some other keys, like AWS SES SMTP, you can send emails on behalf of the organisation, which is also pretty bad for me.

And obviously one part is how to notify the responsible people about these issues, and of course with two hundred keys it's not a very easy task, so that's why we collaborated with... global to do this thing, and it took us a while to do this thing. And we didn't manage to identify 160 owners of the keys, and we then managed to identify 50, and five were dropped because the keys were removed, so in total you have 215. Where were these companies? These companies were located in 37 different countries. So it's also, you know, very difficult to notify people in many, many countries, and obviously the main one is the USA, because the USA probably has the largest adoption of different services.

And sectors obviously, computer and information technology but also retail stores, finance and insurance, education and research. So all these issues were reported to companies in these sectors.

So what was interesting so, we made it because we collaborated with...global so ‑‑ and the response was pretty fast so majority of the issues were removed within the first couple of days.

And on this graph you can see the number of cases which were removed, fixed, and reaction. Reaction means a response to the email that we sent, saying: OK, you have this leak in this bucket, in this file, and this key. So sometimes we got a reply, and this is a reaction to us, but some of the organisations just fixed it without replying to us. Of course we, as researchers, prefer when people reply to us and we can discuss with them more, but at least they fixed it ‑‑ and this is for us also a good thing.

So regarding what security measures companies took: in 19 cases they did everything right. They restricted access to the bucket, they restricted access to the file, and they revoked the corresponding token or secret. But in several cases, 25, they either restricted access to the bucket or to the file, but they didn't revoke the token. We were checking constantly during this five‑month period, and in these cases, I don't know what happened: in 25 cases the token is still there, so the files are not accessible, but we know these tokens.
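The distinction drawn here, between hiding the file and actually revoking the secret, can be written down as a small classification. This is an illustrative sketch only: the `Observation` fields, the category labels and the `classify` function are hypothetical and are not the coding scheme used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """What a follow-up check sees for one reported leak (hypothetical)."""
    bucket_restricted: bool
    file_restricted: bool
    secret_revoked: bool


def classify(obs: Observation) -> str:
    """Map an observation to a remediation outcome label."""
    exposure_closed = obs.bucket_restricted or obs.file_restricted
    if exposure_closed and obs.secret_revoked:
        return "fully remediated"
    if exposure_closed:
        # The file is hidden, but anyone who already copied the secret can use it.
        return "access restricted, secret still valid"
    if obs.secret_revoked:
        return "secret revoked, file still exposed"
    return "unfixed"
```

Under this labelling, the 25 cases described above would fall into "access restricted, secret still valid", which is why revocation matters even after the bucket is locked down.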

So one particular example was very interesting ‑‑ we notified the company, the next day we observed that they had removed access to the token, and in a couple of days this token appeared once again. So this was like, OK, what's going on? So we notified them once again that, OK, you fixed it and now it is back. Some systems were depending on this token, and it took them a while after that to remove these things.

On the properties of the files with leaks, the most interesting case, I think, was a secret that we found and that was fixed, and it dated back to 2014, so for more than ten years the key was available there. And yeah, it was fixed only after our responsible disclosure.

So yeah, what are the conclusions? Cloud storage is a major source of secret leaks. Obviously in our case we just touched the surface, because we analysed only ten different extensions and only the keys that we can verify in a responsible way, and even with these restrictions we managed to identify 215 valid secrets. As I said just a while back, some leaks are ten years old. And definitely, employing an organisation that has experience in doing responsible disclosure, that already has a kind of connection with the companies, helped us to fix 95 out of 160, but there are still 65 cases which are not fixed, even with our responsible disclosure.

So thank you very much. If you would like to read more, this paper is published at the SP conference.

(APPLAUSE.)

BRIAN NISBET: Thank you very much. Any questions?

SPEAKER: APNIC, Hello, have you had any negative or hostile reactions when you did the responsible disclosure thing?

YURY ZHAUNIAROVICH: So I would say not nasty, but they said like, OK, and what? One particular case was with an AWS SMTP token; they said, OK, you have this token, what can you do? Show us what you can do. So after verifying that it indeed goes back to the company's token, we sent them an email from their security account. So... please fix this.

BRIAN NISBET: Did they at that point?

YURY ZHAUNIAROVICH: Yes. Sometimes you need to persuade people.

BRIAN NISBET: Because I think we have seen lots of instances where people are like you have hacked me, or things like that!

YURY ZHAUNIAROVICH: But they were aware. So...

TOBIAS KNECHT: Very cool project. When you informed people, did you inform the companies that owned the keys? So did you inform AWS and they more or less went to their customers and said hey, this is going to happen? Or did you go directly to the companies that used the keys?

YURY ZHAUNIAROVICH: Exactly, so, thank you for the question. Yes, we identified the exact owners, either through bucket names, since sometimes it becomes pretty obvious what the company is from the bucket name, or by looking into the bucket, into the files within this bucket, to find who can be the owner of the bucket. So we didn't inform AWS. Now Google has a programme where you notify Google and then Google will try to protect this thing. I don't remember if this is the case for AWS, correct me if I am wrong, if somebody has this information. I definitely remember Google started this programme, so we can notify Google and Google will try to restrict the access to the bucket.

So I don't know if it is the case for AWS.

TOBIAS KNECHT: Right, I think you could always send it to security at AWS. I know for example SES, we work with the SES people and they would be very interested in that data, they know that people spam through those channels, so thank you.

LEO VEGODA: Isn't this the kind of thing you would expect the providers to want to proactively check to make sure their customers aren't leaking the tokens that allow someone else to use their service and therefore diminish the reputation of the provider overall? Or am I being unreasonable in expecting that sort of thing?

YURY ZHAUNIAROVICH: Thank you. Very good question. So as the provider you mean AWS or ‑‑ so I think, as I said, Google started doing this. If they detect that their key is used, that some secret is leaked, they try to protect this bucket, to restrict access to the bucket.

I think we expect too much from these companies. You cannot check all the sources where this information can leak, so maybe yes, maybe not, I don't have the definite answer. Sorry.

BRIAN NISBET: Anecdotally I am very sure their support people find these on a regular basis. I would agree, I think that if you can do some of this, they have easier access than you did. So hopefully as you do this, maybe they will; if Google are doing some of it, maybe AWS, which I think is the biggest dog in this particular area, will hopefully follow suit.

YURY ZHAUNIAROVICH: I hope so, awareness matters. Giving these presentations might make more people aware of this problem and eventually I hope we will come to a better solution. So definitely there is already more and more effort in protecting the buckets, the cloud storage.

>> Thank you very much.

BRIAN NISBET: Any other questions? If not, thank you very much Yury.

(APPLAUSE.)

So Thomas please, Thomas is going to talk about ‑‑ those poor advertiser hoardings are going to be in a bad state after this meeting ‑‑ anyway, talk to us about reg check.

THOMAS DANIELS: Hi everybody, I am Thomas, I am a researcher at... Today I will be presenting reg check, which is an approach to proactively identifying malicious domain name registrations. This has been developed in collaboration between DNS Belgium and SIDN, the registries of .be and .nl respectively, and adopted by .be as part of their registration process. So domain name abuse includes phishing and other attempts at scams, and phishing still makes a lot of victims every year. These victims can suffer serious losses. The Belgian federation for the banking sector reports that Belgians lose about 40 million euro to phishing every year.

Which is quite a high figure.

And a part of this is facilitated using abuse of... domain names, and those domain names are what you want to prevent. Let me show what the former timeline of a phishing website used to be. The phisher would register a domain name and make use of it by putting content on it that aims to deceive visitors, and then they would spread links to this website. Eventually this website would be detected and mitigated such that it is no longer accessible to visitors. But usually at least several hours have passed between the registration and the mitigation of the domain name, which is more than enough time to make victims, and that is what we aim to prevent.

So instead of using a reactive mitigation like this, we want to use a more proactive strategy such that a suspicious domain name gets detected within milliseconds of its registration rather than hours. And the way we do it is, right after registration, we perform an automated risk scoring.

And this risk scoring can have two outcomes: either everything looks OK and the domain is immediately activated, or the risk scoring flags that there is something suspicious about the domain and the registrant should undergo an identity check; they basically have to prove that they are who they have claimed to be.

And if they do this, then the domain gets activated same way but if they don't, then the domain remains inactive until they do provide this proof. The domain is still there, it's still registered on their name but it just cannot be used.

And this acts as a barrier against malicious actors.

Now this approach in itself is not new at all. It has been used since the end of 2020, but at the time the risk scoring was done using a manually implemented rule‑based system. And this has quite some limitations. First, because the rules are manually implemented, it is difficult and time consuming to maintain and to keep up with new strategies that could come up.

And secondly it's also a very coarse‑grained system. What I mean by that is that each rule in the system would have a weight of one, two or three, and the total risk score is the sum of the weights of the matching rules. There's only a limited capacity for identity verifications; you need to have a threshold somewhere, like what goes through and what gets held back.

And the rule‑based system does not allow for a very fine‑grained threshold. You could say, OK, we set the threshold at one, but it might be too strict and select more than there is capacity for. Or you could say the threshold is two, and it might not be strict enough and some suspicious domains get through.
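To make the coarseness concrete, here is a minimal sketch of such a weighted rule system. The three rules, their weights and the field names are invented for illustration; the actual rules used at the registry are not given in the talk.

```python
from typing import Callable

# (name, weight, predicate). All three rules are hypothetical examples.
Rule = tuple[str, int, Callable[[dict], bool]]

RULES: list[Rule] = [
    ("free_email_provider", 1,
     lambda reg: reg["email_domain"] in {"freemail.example"}),
    ("registered_right_after_release", 2,
     lambda reg: reg["hours_since_release"] is not None
     and reg["hours_since_release"] < 1),
    ("phone_address_country_mismatch", 3,
     lambda reg: reg["phone_country"] != reg["address_country"]),
]


def risk_score(reg: dict) -> int:
    """Total risk score: the sum of the weights of all matching rules."""
    return sum(weight for _, weight, matches in RULES if matches(reg))


def needs_verification(reg: dict, threshold: int) -> bool:
    """Select the registration when its integer score reaches the threshold."""
    return risk_score(reg) >= threshold


reg = {"email_domain": "freemail.example", "hours_since_release": None,
       "phone_country": "BE", "address_country": "BE"}
print(risk_score(reg), needs_verification(reg, 1), needs_verification(reg, 2))
```

Because scores are small integers, moving the threshold from one to two changes the selected volume in one large step, with nothing in between.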

So the machine learning approach needs to solve some challenges. First, we need to assign labels to past registrations. Second, we still have the capacity constraints to respect when the... And thirdly, we also need to make the decision in real time, right after registration. Another challenge has been that existing machine learning solutions did not really work in our case, because they largely relied on the assumption that malicious domain names are registered in bulk or as campaigns, which is a behaviour that we do not really observe at .be. So our approach really needed to be tailored to individual malicious registrations. I will now dive into our machine learning approach.

First we need to assign labels to past registrations, using which we train a... model. This is a category of machine learning models that use a collection of decision trees that together make a prediction, and this is a category of models that performs very well on the sort of data that we have.

Then we need to set a threshold for what we select and finally we can deploy the model as part of our registration process.

So first the data and the labels. We used 1.4 million new registrations since 2018 to train the model, and we need to label those registrations as malicious or benign. This is to teach the machine learning model what should count as good and what should count as bad, and this is a non‑trivial problem because there are multiple interpretations that you could give to what is malicious or what is benign.

So if a domain name has ended up on a trusted list or been revoked by DNS Belgium, we know that it's malicious, then we give it the malicious label.

We can also use the identity verification as a label source; because of the rule‑based system, we have historical data for this. What we do is, if a registrant was required to go through verification and they completed the process, we assume their registration is benign. If they do not complete the process, we wait 120 days to give them enough time. And if they still don't, then we also label it as malicious.

Now, this clearly is a weaker label than the previous one, because if you do not complete verification, that does not prove in any way that you are a malicious actor. You could have plenty of other reasons for not completing verification, the simplest being that you are not in any hurry to use the domain name.

But we still want to include those in the positive labels because a part of them would still be malicious ones that were stopped by the verification and we absolutely want to include those.

In our context, a false positive also has a lower cost than a false negative, because a false positive just means an extra verification step for the registrant, while a false negative means an actual malicious website that can exist, and that means real harm to victims.

So we primarily want to minimise the false negatives.

So with this labelling strategy, we have still only managed to label a very small minority of all the registrations. As for the rest, there are plenty of active domain names that have existed for a long time without having any abuse reported on them.

So in those cases we assume that these are benign. This will not be a completely perfect assignment because there will have been malicious cases that were missed in the past but this works well in practice.
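The labelling rules described above can be summarised in a short sketch. The field names and the one-year "clean" period are assumptions made for this example; only the 120-day waiting period comes from the talk.

```python
from datetime import date, timedelta
from typing import Optional

VERIFICATION_GRACE = timedelta(days=120)  # waiting period mentioned in the talk
MIN_CLEAN_AGE = timedelta(days=365)       # assumption: what counts as "a long time"


def assign_label(reg: dict, today: date) -> Optional[int]:
    """Return 1 (malicious), 0 (benign) or None (leave unlabelled for now)."""
    # Strong positive: listed or revoked by the registry.
    if reg["listed"] or reg["revoked"]:
        return 1
    # Weaker signal: outcome of the identity verification.
    if reg["verification_required"]:
        if reg["verification_completed"]:
            return 0
        if today - reg["registered_on"] >= VERIFICATION_GRACE:
            return 1
        return None  # still inside the grace period
    # Remaining registrations: active for a long time with no abuse reported.
    if reg["active"] and today - reg["registered_on"] >= MIN_CLEAN_AGE:
        return 0
    return None


example = {"listed": False, "revoked": False, "verification_required": True,
           "verification_completed": False, "active": False,
           "registered_on": date(2024, 6, 1)}
print(assign_label(example, date(2025, 1, 1)))
```

The order of the checks matters: a revoked domain is labelled malicious even if its registrant once completed verification.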

We also compared this label scheme to other approaches and this one seemed to work out the best.

Next we also need to extract features from our data. Features are numerical, Boolean or categorical characteristics of a registration that get fed into the machine learning model.

But first let me show you what data we actually have as part of the registration, what is the data we can extract features from. At registration time, what we have is the domain name itself of course. We also have the contact details of the registrants, being their name, their address, their phone number and their email address; these are all freely input by the registrants. We also know which registrar they have chosen to make their registration through. We know at what time they have registered the domain, and if the domain has existed before but has expired, we have the timestamp when it was previously released.

So from this data we extract several categories of features.

We compute lexicographic features that identify character‑level patterns in the data. We have domain lifecycle features: for example, if a domain is drop‑caught, meaning it was registered very shortly after it was released, then that's a risk factor.

We compute reputation features for the country codes, the email provider and the registrar. And finally we also compute geographical features that identify inconsistencies in the address details. Now we have our labels, we have our features and that means we can train the model. And then when the model is trained, we can make it make predictions on new registrations.
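As a rough illustration of these feature categories, the sketch below derives a few character-level, lifecycle and consistency features from one registration. The specific features, the 24-hour drop-catch window and all names are assumptions for this example; reputation features are left out because they require historical statistics per registrar, country or email provider.

```python
import math
from collections import Counter
from datetime import datetime
from typing import Optional


def shannon_entropy(text: str) -> float:
    """Character-level entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


def extract_features(domain: str, registered_at: datetime,
                     released_at: Optional[datetime], email: str,
                     address_country: str, phone_country: str) -> dict:
    label = domain.split(".")[0]
    hours = (None if released_at is None
             else (registered_at - released_at).total_seconds() / 3600)
    return {
        # Character-level features of the domain label
        "label_length": len(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / len(label) if label else 0.0,
        "label_entropy": shannon_entropy(label),
        # Lifecycle: re-registered shortly after release (window is an assumption)
        "hours_since_release": hours,
        "drop_catch": hours is not None and hours < 24,
        # Categorical input that a reputation feature could be computed for
        "email_provider": email.rsplit("@", 1)[-1].lower(),
        # Simple consistency check on the contact details
        "country_mismatch": address_country != phone_country,
    }


features = extract_features("shop123.be", datetime(2024, 3, 1, 12),
                            datetime(2024, 3, 1, 10), "a@Example.org", "BE", "NL")
print(features)
```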

And this prediction takes the form of a number between zero and one, where a higher number means a more suspicious registration.

But this alone does not suffice because we need to make a decision either we select the registration for verification or we don't and we just let it through.

So there are a couple of considerations there, meaning that requiring verification may induce extra workload on the customer support team of DNS Belgium, because they might need to review documents or be asked questions about the verification procedure, and so this means that we need to put a limit on how many registrations we can select every day.

We also need to make a decision immediately after the registration. So we cannot just wait until the end of the day, then sort the predictions and select the top ones; that's something we cannot do. We need to make the decision immediately.

So our approach is that during the training process, we compute the optimal threshold between zero and one and everything above that threshold will be selected.

So to make that more specific, we have about 550 new registrations every day.

Let's assume that we have a capacity of 150 verifications per day.

Then we want to set a threshold such that we have about 150 positives per day. And we observed that no matter what threshold we set, there was a lot of variability in the number of selections per day, because some days just happen to have more suspicious registrations than others; that's just how it is.

So instead of looking at the daily numbers, we compute a rolling average over windows of seven days, which still has some variability, but a lot less, and then we set the threshold such that this rolling average never exceeds 150.

And that's the value we use as our decision threshold.

Individual days may still exceed 150 selections, but that's actually not a problem in practice as long as the average makes it flatten out.
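The threshold selection just described can be sketched as follows: take the model scores on a validation period, grouped per day, and pick the lowest threshold whose seven-day rolling average of daily selections stays within capacity. The function names and the simple linear scan over candidate thresholds are assumptions for this example, not the published implementation.

```python
def daily_selections(scores_per_day: list[list[float]], threshold: float) -> list[int]:
    """Number of registrations per day with a score above the threshold."""
    return [sum(score > threshold for score in day) for day in scores_per_day]


def max_rolling_mean(counts: list[int], window: int = 7) -> float:
    """Largest mean over any run of `window` consecutive days."""
    if not counts:
        raise ValueError("need at least one day of data")
    if len(counts) < window:
        return sum(counts) / len(counts)
    return max(sum(counts[i:i + window]) / window
               for i in range(len(counts) - window + 1))


def pick_threshold(scores_per_day: list[list[float]], capacity: float,
                   window: int = 7) -> float:
    """Lowest threshold whose rolling average never exceeds the capacity."""
    candidates = sorted({0.0, *(s for day in scores_per_day for s in day)})
    for threshold in candidates:
        counts = daily_selections(scores_per_day, threshold)
        if max_rolling_mean(counts, window) <= capacity:
            return threshold
    return candidates[-1]


days = [[0.1, 0.5, 0.9], [0.2, 0.8, 0.95], [0.3, 0.4, 0.6]]
threshold = pick_threshold(days, capacity=1, window=3)
print(threshold, daily_selections(days, threshold))
```

In this toy input the second day selects two registrations although the capacity is one, which mirrors the point that individual days may exceed the capacity while the average does not.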

So we do that process of calculating the threshold at the end of the model training, on the validation set of 90 days, and we did not only compute the threshold for a capacity of 150, but we computed it for multiple capacities, such that our customer support team can choose the capacity at any time; if they have less capacity for verifications, they can lower the number.

It's also important to note that this capacity configuration only determines the threshold; it does not put an actual hard limit on the number of verifications. The reason is that if there were such a hard limit, the easy way to circumvent the system would be to do a lot of suspicious registrations so that the capacity is exhausted, and then your next registration just gets through, and we should prevent that obviously. So that's why the capacity only determines the threshold.
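A small sketch of why the capacity only selects a threshold: the selector below looks up a precomputed threshold for the chosen capacity and keeps no daily counter, so a burst of suspicious registrations cannot exhaust it. The table values and the `Selector` class are hypothetical.

```python
# Hypothetical thresholds, precomputed on validation data for each capacity.
THRESHOLDS_BY_CAPACITY = {50: 0.92, 100: 0.85, 150: 0.78}


class Selector:
    """Decides per registration, immediately, using only the threshold."""

    def __init__(self, capacity: int) -> None:
        self.threshold = THRESHOLDS_BY_CAPACITY[capacity]

    def select(self, score: float) -> bool:
        # No counter is kept: the decision never depends on how many
        # registrations were already selected today.
        return score > self.threshold


selector = Selector(capacity=150)
burst = [selector.select(0.99) for _ in range(500)]
print(sum(burst))
```

With a hard cap of 150, the 151st suspicious registration in such a burst would have been let through; here all 500 are still selected.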

So we deployed our model in March of 2024, and in the months after, we have taken a look at how well it performed in practice.

So first we tried to see if the amount of abuse has actually reduced.

And we do this by measuring the number of revoked registrations: if DNS Belgium has to revoke a lower number of registrations, that means there has been a lower amount of abuse. And indeed, in the months after the deployment of the model, we revoked 30% fewer registrations than in the same period the year before.

So that's already a good sign.

And another observation is that if the prediction of the model is higher, then this correlates with a lower likelihood of completing the identity verification procedure which means that the model is quite successfully able to identify more suspicious registrations and give these a higher score.
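One simple way to check this kind of relationship is to bin registrations by model score and compute the verification completion rate per bin. The sketch below does this on made-up records; the function name, the bin count and the data are assumptions, and no real results are reproduced here.

```python
from typing import Optional


def completion_rate_by_bin(records: list[tuple[float, bool]],
                           n_bins: int = 5) -> list[Optional[float]]:
    """records: (model score in [0, 1], verification completed?) pairs."""
    totals = [0] * n_bins
    completed = [0] * n_bins
    for score, done in records:
        b = min(int(score * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        totals[b] += 1
        completed[b] += int(done)
    # None marks bins without any registrations
    return [c / t if t else None for c, t in zip(completed, totals)]


toy = [(0.1, True), (0.15, True), (0.5, True),
       (0.55, False), (0.9, False), (0.95, False)]
print(completion_rate_by_bin(toy))
```

If the model ranks registrations sensibly, the rates should fall as the score bins rise.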

So to conclude this presentation, let me end with some takeaways.

First, a proactive approach is possible for detecting malicious domain name registrations. Second, a machine learning pipeline is an improvement over hand‑crafted rules, and that's been shown both by our experiments and by our results after the deployment.

And thirdly, this approach is also applicable to other domain name registries. They use the same type of data, we are and have been in touch with some others so perhaps at some point others may also integrate this, but we will see.

So thanks for your attention, this work has also been published at KDD this summer, you can read the full paper if you want. And I am happy to take any questions.

(APPLAUSE.)

Thank you very much. Are there any questions?

EMANUELE OVINI: I speak for myself. I like this, it's a great, great tool. I think it's important to start preventing phishing, because we know that phishing many times happens in the first minutes after the registration of a domain. I just want to give maybe a comment: did you try to use the registrant as a label, to look at other domains registered by the same person and give a different score? For example, I have a domain, I find that the registrant is malicious, and then I am able to see all the other domains registered by this person, which may also be malicious, and maybe give those a higher score.

THOMAS DANIELS: Yes, well, if it's about past registrations of the same person then no, if they are malicious and they are revoked, well then they also get a positive label and they will be included in ‑‑ when the model is retrained because we train the model on new data every day. So yeah, that would be taken into account in that manner.

EMANUELE OVINI: That's great, just an additional question and then I leave. What are the actions you put in place after that? Because I see that it's a prediction about the maliciousness of the domain; you are not sure about it but you give a score. What are the actions based on the score? If the score is really high, what do you do, and if it's really low, what do you do? Is it always the same?

THOMAS DANIELS: The action is always the same: if it's above our threshold, then you need to provide a proof of your identity, and once that's provided, you can use the domain name. No matter how high the score is, it does not depend on that.

EMANUELE OVINI: You can also speak with EURid (.eu) about their risk approach.

THOMAS DANIELS: Yes, they have a quite similar approach and we also compared to that, their approach is more tailored towards campaigns but they also do the same concept of requiring identity verification for things that look suspicious, so yeah, we are aware of their work and have compared.

EMANUELE OVINI: Good job, keep up the good work.

TOBIAS KNECHT: Speaking for myself. So, very interesting. One of the questions I have: this is only new registrations; did you also look at, did you have any access to, data about existing domains that have become abusive at one point in time, or was that completely out of scope?

THOMAS DANIELS: That was in scope in the sense that we used those to label: we use those domains as training data for the model. We know they have been abusive, so we can label them as malicious, and when we train the model we feed that data into the model so it learns that, OK, these domains were malicious, they had these characteristics, and that can be used for new predictions. But we did not use the model to predict the old domains, because they are in the training set; we know what they are supposed to be, so it isn't really useful at that stage any more.

TOBIAS KNECHT: So one of the things that I think would be really interesting: there's a lot of discussion at the moment about those types of systems, ICANN changes and so on and so forth, about whether malicious domains are malicious at the beginning of the registration process or whether it is domains that later get compromised. So there's interest in the community, at least from my side and from a lot of other people, to see how many of the bad domains are actually registered for bad purposes and how many of the abuse cases are based on compromised domains. I don't know if this is something you guys have in that area, but I am just interested if you have.

THOMAS DANIELS: Yes, this approach does not help prevent anything... compromised domains, we have participated in a study to compare like the behaviour of compromised and maliciously registered domains which was actually presented at RIPE 89 two years, no a year ago. So we have also done some work on that aspect, yes.

TOBIAS KNECHT: Cool, thank you.

ANDREW CAMPLING: Hi, Andrew Campling, speaking for myself. A great presentation, very interesting and excellent research. Just for context, I saw recently research that predicts the total cost, if you like, of online scams and fraud globally this year is estimated at a trillion dollars, so this is an important thing, it's part of tackling that. The other thought, and I have got no connection with them: if you need further funding to continue this excellent work, I know Nominet announced their DNS fund a few weeks ago and that closes, I think, in about a week's time, and they are offering grants of up to, I think it's 100,000 pounds, something like that.

So if you need funding to take this further forward, that might be worth looking at.

THOMAS DANIELS: Thank you, that's interesting to keep in mind at least. Yes. Thank you.

BRIAN NISBET: Nothing online? No? OK. Cool, thank you very much.

(APPLAUSE.) OK. So that is the end of the ‑‑ slides are black, it suddenly feels very dark behind me.

Any other business? That anyone wishes to raise or talk about or otherwise? No? Cool. In which case I will remind people that obviously you can ‑‑ there is a mailing list, there is apparently a relevant security working group presentation that was given in the MAT Working Group around the disruption to the internet caused by some of the laws around copyright and things like that which I think the proposer is going to send to the mailing list as well, the link, it's also there in the MAT Working Group, it might be of interest to people here, it's often the case that there are items which are of interest to multiple different working groups and I think we were very aware of the fact that the security group when we changed the charter would have both things of interest here that were across others and vice versa.

But yes, so the agenda for RIPE 91, never be afraid to send us things. And also we are working through this, this was the first time with the new system, I think we are a lot more familiar with it now and we'll have the draft agenda out a lot earlier than we did on this occasion.

Just before I finish, a couple of announcements, the voting has opened for the Programme Committee elections. You can vote up until Thursday at 17:00 local time.

If you are registered to vote in the NRO NC election, you should have received an email at this point in time giving you the details, and again, that opened at 17:00. And that runs until 9 o'clock, the NRO NC election. And finally yes the bus is to Cong and I am assuming that will show up, it will leave from the hotel entrance at 20.15 and 20.30 local time and will come back later.

So thank you all very much and we hope to see you in Edinburgh at RIPE 92, so sorry, yeah, whatever the next RIPE meeting is!

Cool, have a lovely evening and I hope to see lots of you at the social. Thank you.

(APPLAUSE.)

End.