
Plenary Transcript

Chaired By:
Clara Wade, Osama Al-Dosary
Session:
Plenary
Date:
Time:
(UTC +0300)
Room:
Main Room

RIPE 91

21 October 2025

OSAMA AL DOSARY: Good morning everyone, thank you for coming in so early. I hope you had a wonderful first day and evening. We'd like to start our first session for today. Luuk, could you come up? He is going to talk about Rotonda. Thank you.

LUUK HENDRIKS: Good morning everyone. Thank you for being here so early. My name is Luuk. We make open source software, and one of the things that we make is Routinator, our RPKI validator. I am sure some of you in this room know about Routinator.

When people started using validators a couple of years ago, we got some messages from worried operators. They already had enough trouble making sense of their BGP routing and what it did in their networks, and now they had to add route origin validation, which hooks into your decision process, so things were becoming more scary and less tangible. Well, that's fair, right? We understood those operators.

We didn't have a solution to help them out back then. At the same time, BMP started to become a thing, it was gaining traction, and we thought maybe we can do something: maybe we can make something that takes in routing information, takes in RPKI information and helps out these operators. That something became Rotonda, which I want to show you today.

Before we go there: when I say BMP, who is familiar with it? Can I get a show of hands for BMP, the BGP Monitoring Protocol? OK, that's great. Who is actually using this in production on a somewhat day-to-day basis? OK. I guess we can improve on that, but that's good.

I will do a very short introduction to BMP so we are all on the same page. Mind you, this is oversimplified, but basically it goes like this. We are AS 1 and we hook up our router to a BMP station, the BMP collector. Now if AS 2, thank you, sends us routing information in the form of a BGP update, our router will encapsulate this into a BMP message and send it to our collector, our BMP station. It really looks a lot like the original BGP update. There are a lot of details I am omitting here, but the main message is that we get basically all of the details that you could get from the BGP update all the way to our BMP station in the form of this BMP message, and you can see that there's an extra header smacked on there so we can distinguish which peer this information came from.
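
As a rough illustration of that encapsulation, the Python sketch below unpacks a BMP Route Monitoring message into the "extra header" and the original BGP update. The field layout follows RFC 7854 (a 6-byte common header followed by a 42-byte per-peer header); the function names and the hand-built sample message are made up for this illustration, and this is not how Rotonda itself parses BMP.

```python
import ipaddress
import struct

# Layouts per RFC 7854: 6-byte common header, 42-byte per-peer header.
COMMON_HEADER = struct.Struct("!BIB")            # version, total length, message type
PER_PEER_HEADER = struct.Struct("!BB8s16sIIII")  # type, flags, distinguisher, address,
                                                 # AS, BGP ID, timestamp (s), timestamp (us)
ROUTE_MONITORING = 0  # BMP message type that carries a BGP UPDATE
V_FLAG = 0x80         # set in the peer flags when the peer address is IPv6


def parse_route_monitoring(msg: bytes) -> dict:
    """Split one BMP Route Monitoring message into peer info and the BGP UPDATE."""
    version, length, msg_type = COMMON_HEADER.unpack_from(msg, 0)
    if version != 3 or msg_type != ROUTE_MONITORING:
        raise ValueError("not a BMPv3 Route Monitoring message")
    peer_type, flags, _dist, addr, peer_as, _bgp_id, ts_sec, _ts_usec = (
        PER_PEER_HEADER.unpack_from(msg, COMMON_HEADER.size)
    )
    # IPv4 peers use only the last 4 bytes of the 16-byte address field.
    peer_ip = ipaddress.ip_address(addr if flags & V_FLAG else addr[12:])
    start = COMMON_HEADER.size + PER_PEER_HEADER.size
    return {
        "peer_type": peer_type,
        "peer_flags": flags,
        "peer_ip": str(peer_ip),
        "peer_as": peer_as,
        "timestamp": ts_sec,
        "bgp_update": msg[start:length],  # the encapsulated BGP message, untouched
    }


def build_example() -> bytes:
    """Hand-built sample: peer AS 2 (192.0.2.2) sent us an empty BGP UPDATE."""
    # 16-byte marker, length 23, type 2 (UPDATE), no withdrawals, no attributes.
    bgp_update = b"\xff" * 16 + struct.pack("!HBHH", 23, 2, 0, 0)
    peer_addr = b"\x00" * 12 + ipaddress.IPv4Address("192.0.2.2").packed
    per_peer = PER_PEER_HEADER.pack(
        0, 0, b"\x00" * 8, peer_addr, 2, int(ipaddress.IPv4Address("192.0.2.2")), 0, 0
    )
    total = COMMON_HEADER.size + len(per_peer) + len(bgp_update)
    return COMMON_HEADER.pack(3, total, ROUTE_MONITORING) + per_peer + bgp_update


if __name__ == "__main__":
    print(parse_route_monitoring(build_example()))
```

The per-peer header is what lets a single collector tell apart updates from every session on every monitored router.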

So if we then zoom into our router itself, the big one in the middle, there's actually a lot more going on. If we open it up, I am sure many of you know we have this logical division into multiple RIBs in there. If we get information from the peer, it goes via some policy into our Adj-RIB-In, flows on to our Loc-RIB via the decision process, and goes out via some local policy via the Adj-RIB-Out. With BMP you can look at all those different stages, so you get a lot of information about this one single route: you can actually get five different views of this one single route. And it's not just for one peer, it's for all the sessions that you have on the monitored router, so this is a lot of information and this can be very useful.
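
To make the five views concrete, here is a small Python sketch that derives the view from the peer type and peer flags in the BMP per-peer header. The bit values are taken from RFC 7854 (L flag), RFC 8671 (O flag, Adj-RIB-Out) and RFC 9069 (peer type 3, Loc-RIB); the function name and the returned labels are illustrative assumptions only.

```python
LOC_RIB_PEER_TYPE = 3  # "Loc-RIB Instance Peer" (RFC 9069)
L_FLAG = 0x40          # post-policy when set (RFC 7854)
O_FLAG = 0x10          # Adj-RIB-Out when set (RFC 8671)


def rib_view(peer_type: int, peer_flags: int) -> str:
    """Name the RIB view that a BMP per-peer header refers to."""
    if peer_type == LOC_RIB_PEER_TYPE:
        return "Loc-RIB"
    side = "Adj-RIB-Out" if peer_flags & O_FLAG else "Adj-RIB-In"
    stage = "post-policy" if peer_flags & L_FLAG else "pre-policy"
    return f"{side} {stage}"


if __name__ == "__main__":
    for peer_type, flags in [(0, 0x00), (0, 0x40), (3, 0x00), (0, 0x10), (0, 0x50)]:
        print(peer_type, hex(flags), rib_view(peer_type, flags))
```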

Now if we zoom out a bit: if you have your BMP station configured and you have your router exporting the information to the monitor, why not hook up your other routers? Maybe that other router that's on its side and the other one that's mounted upside down, I don't know, I had space on the slide. So monitor all the stuff, right?

That's the main message here. There is a lot of information that we can get via BMP and it all ends up at the one single point, your BMP station.

So what do you do with a lot of information? You have to make sure you can process it: you have to be able to ingress it at a certain speed, be able to process it, and ideally you want to have some flexibility there, and this is what Rotonda aims to be. So what does Rotonda look like? Everything in Rotonda revolves around our store. The store is an in-memory database that we built from scratch in house, tailored to store routing data, so think prefix-based stuff. How do we get data in? Via BMP: we can get a lot of stuff from a lot of routers and we can store all the different views that I showed you, the RIB-in, the RIB-out, pre- and post-policy, etc.

We also have a very passive BGP speaker which you can use as poor man's BMP, it will never announce any routes, it will only take in information and put it in this store in our database.

Perhaps less useful for operators, but for the researchers in the room: we have an MRT ingress component, so you can feed in archived data, for example from RIPE RIS or Route Views. We also use this for testing.

How do we get data out? There is an HTTP API giving you gigabytes and gigabytes of JSON, and there's other machinery around it to send out events over an MQTT bus or write to a file. We are missing one important part here to glue things together and give us flexibility, and I will get to that. Before we continue, there are some things that I want to emphasise.

We do keep state. People often think, when you talk about Rotonda, that it's a BMP parser and you spit the data out again to some other system, a database or whatever. We keep state ourselves: we ingress the stuff and keep the state, and if you had to explain Rotonda in one sentence, you could say it's actually a specialised database for routing information. It's also a very simple single binary written in Rust: everything in the dark rectangle that you saw in the overview is one single binary, and we don't rely on a third-party database or whatever. The goals here are doing all the stuff at high volume, with high performance, and with a certain amount of flexibility that is useful for everyone. And that's the glue I want to talk about.

The flexibility we get from Roto. This is how all of the stuff is tied together and how we do our filtering on the ingress components. Roto is our own programming language. Now there's always a red flag coming up with a new programming language; we realise this and we want to make it as easy as possible for people to adopt. It has some really powerful features. First of all, this is what it looks like. I hope you can see it has a very high-level feel: if you ever did something like Python or whatever, you can make sense of this, I hope.

I mean, this example might be a bit silly, but it's just to show you at what level you can reason in your filters with Roto.

We have syntax highlighting. It doesn't make or break your programming language, but it's one of the things we can do to make adoption as easy as possible for people. Let's make it a nice experience.

What's even more useful on this topic is that it's a fully strongly typed, type-checked language. So if you see the changes I made to the definition of my AS, where I now accidentally put in a prefix: this won't blow up at run time, because it will actually be compiled before that and the compiler will warn you in a very user-friendly way: hey, it seems like you made a mistake, I am expecting an ASN in this method but you gave me a prefix. So no surprises at run time, because this is all type checked and compiled before we use it. We are leveraging this thing called Cranelift: it allows us to compile Roto into an intermediate representation that is then compiled to machine code. This gives us high flexibility but also high performance, right? There is very little overhead this way compared to implementing the same logic in Rust itself.

So, where do we use Roto in Rotonda? Everywhere you see the small cogs. Let's see what it looks like in the ingress part, where we feed in the routing data. This is an example Roto filter for your BMP connector. You get the BMP message and additional information on where the message came from. If you want to act upon, for example, the peer address or the peer ASN, you can reject this message and it won't go through to the RIB. You could also add some commands to log or send events out to an MQTT bus. But you can also look into the message itself, so you can actually access the path attributes and say: hey, if there's a specific ASN in the AS path, I want to accept it or reject it or do something about it. Then if the message is accepted and it goes through, it ends up at the store; it will be stored in our in-memory database.

And just before that, there's another Roto filter that you can hook into.

The message is exploded into all the individual announcements. The message can contain multiple announcements, but here you have access to an individual route, an individual announcement. Now you can look at the prefix for that thing: maybe there's a certain list of prefixes you want to keep track of, and this allows you to do that.

Then we have more ways to ingress data other than routing data, namely RPKI data. This is where this idea came from, right? We have an RTR connector that you can connect to your favourite RP software to feed it into Rotonda, and now we can do more funky stuff with Roto: we have routing data and RPKI data. So if you go back to the filter that we used just before we store routing entries, we can now actually perform route origin validation on that very route. This will mark all the routes that pass through this filter as being ROV valid, invalid or not found; it's your normal RPKI ROV. The main difference is that if something is invalid, we only mark it as invalid, so later you can still see the route and use it in your analysis.
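
For readers who have not implemented it, the marking step can be sketched in a few lines of Python. This follows the classification rules of RFC 6811 (valid, invalid, not found) using only the standard library; the `Vrp` type, the function name and the sample ASNs and prefixes are assumptions made for this illustration and say nothing about how Rotonda implements ROV internally.

```python
import ipaddress
from typing import Iterable, NamedTuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class Vrp(NamedTuple):
    """Validated ROA payload: a prefix, a maximum length and an origin ASN."""
    prefix: Network
    max_length: int
    asn: int


def rov_status(prefix: Network, origin_asn: int, vrps: Iterable[Vrp]) -> str:
    """Classify a route as 'valid', 'invalid' or 'not-found' (RFC 6811)."""
    covered = False
    for vrp in vrps:
        if vrp.prefix.version != prefix.version:
            continue  # subnet_of() raises TypeError on mixed address families
        if not prefix.subnet_of(vrp.prefix):
            continue  # this VRP does not cover the route
        covered = True
        if vrp.asn == origin_asn and prefix.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by at least one VRP but matched by none: invalid.
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    vrps = [Vrp(ipaddress.ip_network("192.0.2.0/24"), 24, 64496)]
    route = ipaddress.ip_network("192.0.2.0/24")
    print(rov_status(route, 64496, vrps))
    print(rov_status(route, 64497, vrps))
```

Note that the function only labels the route; as in the talk, what to do with an invalid route (keep, log, drop) is a separate decision.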

So we have full-blown ROV, but there's more stuff we can hook into, right? We can also hook into whenever you receive an update from the RPKI. So maybe you don't even have BMP streams or BGP sessions: you could also just use this to monitor the RPKI. Perhaps there are prefixes you are interested in and you can keep an eye on them there. If you are not interested, you just leave out the filter and you don't do anything with it. That's the flexibility that this offers.

And then the most interesting part, I think, is when we sort of combine these two. Assume we have RIBs, sorry, routes in our store, and there is an update coming in via the RPKI. We have a hook here that lets you act upon all the changes that this RPKI update effects on your stored routes. We have a specific type that contains the route that was just altered by the new RPKI information, which contains the previous ROV status and also the new ROV status. We can check: has the status changed, and is it now actually invalid while it wasn't before? Then send out something over MQTT, or perhaps log something to a file or to syslog. This is an example, and I hope this is readable on all the beamers, where a new VRP came in via RTR which all of a sudden was covering a certain prefix which wasn't covered before. So before, the ROV status would be not found, but because the route is now covered and has a different origin AS than the one on the ROA, we actually see it is ROV invalid. And not once: we actually had seven versions of this route coming from seven different peers, so for us there were seven changes in our store for this.
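
The "seven changes" case can be reproduced with a toy model. In the Python sketch below, the store is a plain dictionary keyed by (prefix, peer) and a VRP is a (prefix, max length, ASN) tuple; both are simplifying assumptions for illustration, as are the function names, and the real store and hook in Rotonda are certainly different.

```python
import ipaddress


def rov(prefix, origin, vrps):
    """Compact RFC 6811 check over (prefix, max_length, asn) tuples."""
    covering = [
        v for v in vrps
        if v[0].version == prefix.version and prefix.subnet_of(v[0])
    ]
    if not covering:
        return "not-found"
    matched = any(
        asn == origin and prefix.prefixlen <= max_len
        for _, max_len, asn in covering
    )
    return "valid" if matched else "invalid"


def apply_vrp_update(store, vrps, new_vrp):
    """Return (prefix, peer, old_status, new_status) for every stored route
    whose ROV status changes once new_vrp is added to the VRP set."""
    updated = list(vrps) + [new_vrp]
    changes = []
    for (prefix, peer), origin in store.items():
        old, new = rov(prefix, origin, vrps), rov(prefix, origin, updated)
        if old != new:
            changes.append((prefix, peer, old, new))
    return changes


if __name__ == "__main__":
    prefix = ipaddress.ip_network("203.0.113.0/24")
    # The same route, originated by AS 64497, learned from seven peers.
    store = {(prefix, f"peer-{i}"): 64497 for i in range(7)}
    # A new VRP arrives that covers the prefix but names another origin AS.
    for change in apply_vrp_update(store, [], (prefix, 24, 64496)):
        print(change)
```

Each emitted tuple corresponds to one event that could be logged or published, for example when the new status is "invalid" and the old one was not.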

So what's next? Getting stuff out via the HTTP API. I am sure most of you have seen JSON APIs at this point; there's nothing fancy or exotic here. You can get the information about the sessions and of course you can get out the routing data per address family. We'll see some example output. This is a shout-out to Route Views: we are collaborating with them and we get some of their internal BMP feeds feeding into a Rotonda, which is great for us while developing; this is really, really helpful. So the data you see here actually comes from them. If you query for a certain prefix, you will get something that looks like JSON because it is JSON, nothing new there. One thing I want to highlight here is that you ask for one prefix, but as we have seen before you might have like ten versions of it from different peers in different RIB views. So you get back the prefix that you asked for, and you also get a list of all the routes that we have for it. Everything that comes in via BMP or BGP is stored in this one big in-memory database, the store, so if you ask for one prefix you get all the information that we have, whether it came from BMP or BGP; you get the complete view. That's why you get a list of routes, and you see here that you can still see where each came from and what type of RIB, you can see the RPKI status, and you get all the path attributes. We don't throw anything away, we are storing everything.

So that means you have to do some filtering, right? Because serialising gigabytes of JSON is not fun on the Rotonda side, and it might be even worse when you try to pipe it through jq or do anything with it. So we need some filtering. With the filtering in Rotonda you can filter on several fields and features, and you can also combine those with Boolean ANDs. But you want more flexibility, right? We realise that and we want to offer the flexibility, but we feel that trying to cram all of it into the HTTP API would be very hard, ambiguous and cumbersome, so we leverage Roto again. You can actually write your own Roto functions that you can then use as a query parameter and call.

An example of that, and this is an example where you can see that we can still improve on Roto because this is a bit verbose, but I hope it makes sense to you: what we do here is filter for routes that contain an OTC path attribute, the Only to Customer path attribute, and also contain an AS path attribute, but where the OTC ASN is not in the path. That's what this logic does, right? If it's not in the path, we want to see the route in our HTTP response. Now, I didn't really think this through and I wasn't sure whether we would see stuff, but actually we do see stuff. What you see here are the route servers, which are transparent, so they are in the path but you don't see them in the AS path, but they do set the OTC attribute, so now they pop up here. This is an example of a filter which is hard to describe in an HTTP query, but we have all this flexibility in Roto so we can leverage it that way.
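
The logic of that filter is short once written down. The sketch below expresses it in Python rather than Roto, over a hypothetical `Route` record; the field names and sample values are invented for the example, and only the predicate itself (OTC present, AS path present, OTC ASN absent from the path) comes from the talk. The OTC attribute is defined in RFC 9234.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    """Hypothetical, simplified view of a stored route."""
    prefix: str
    as_path: Optional[List[int]]  # None if the attribute is absent
    otc: Optional[int]            # ASN carried in the Only to Customer attribute


def otc_asn_not_in_path(route: Route) -> bool:
    """True when the route has OTC and an AS path, but the OTC ASN is not on it."""
    return (
        route.otc is not None
        and route.as_path is not None
        and route.otc not in route.as_path
    )


if __name__ == "__main__":
    routes = [
        Route("192.0.2.0/24", [64496, 64497], 64497),      # OTC ASN is on the path
        Route("198.51.100.0/24", [64496, 64497], 64511),   # e.g. a transparent route server
        Route("203.0.113.0/24", [64496], None),            # no OTC at all
    ]
    print([r.prefix for r in routes if otc_asn_not_in_path(r)])
```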

We have metrics. We have built-in metrics, but of course you can also use Roto for your metrics, right? We don't know what you want to measure, we don't know what you want to alert on, so you just type it out yourself. This is an example inspired by the problem we had two years ago where there was, was it a deprecated path attribute, that escaped and blew up routers of a certain vendor. Anyway, whatever the details were, you might want to keep track of whoever is sending you nasty path attributes. So here we have a Roto filter for your BMP ingress component that checks whether a certain attribute is in there, and if it is, it will increase this counter, which eventually ends up in your /metrics endpoint. This is not built into Rotonda; this is something you can define at run time yourself.
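
As a language-neutral sketch of the same idea, the Python below counts watched path attribute type codes per peer and renders the result in the Prometheus text exposition format. The metric name, the label names and the watched type code (255, used here purely as a placeholder) are assumptions for illustration; in Rotonda the equivalent logic would live in a Roto filter.

```python
from collections import Counter
from typing import Iterable

# Placeholder type code for illustration; pick whatever you want to watch.
WATCHED_ATTR_TYPES = {255}


def observe(counter: Counter, peer_asn: int, attr_types: Iterable[int]) -> None:
    """Count every watched path attribute type seen in one update from a peer."""
    for attr_type in attr_types:
        if attr_type in WATCHED_ATTR_TYPES:
            counter[(peer_asn, attr_type)] += 1


def render_metrics(counter: Counter) -> str:
    """Render the counter in the Prometheus text exposition format."""
    lines = ["# TYPE watched_path_attributes_total counter"]
    for (peer_asn, attr_type), count in sorted(counter.items()):
        lines.append(
            f'watched_path_attributes_total{{peer_asn="{peer_asn}",'
            f'attr_type="{attr_type}"}} {count}'
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    seen = Counter()
    observe(seen, 64497, [1, 2, 3, 255])
    observe(seen, 64497, [1, 2, 255])
    observe(seen, 64498, [1, 2, 3])
    print(render_metrics(seen), end="")
```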

So, well, I guess most of you know the Prometheus and Grafana tool chain. Probably most of you can plot this in a better way than I can in Grafana, and you can configure alerts. This is not a thing to name and shame: this is just another obsolete type code that I found in the IANA registry, not the problematic one, it's just an example. What else is there? There's a big gap: we want to do high-volume export of data as well, so we are working on, amongst other things, a Kafka egress component, but that's something for another day. So what Rotonda can do for you today is handle large, large, large amounts of unicast routing data.

What we are currently working on is doing non-unicast stuff. People ask us about EVPN and about BGP-LS, and many asked us about FlowSpec, and we are working on that. The first parts will land very soon and we'll go on with specifically FlowSpec; we have a lot of cool things to do there, also with the support of FastNetMon, a shout-out to you guys, it will be a lot of fun. We already store ASPA stuff; we want to expose that in Roto, also to facilitate the standardisation of that and help people get confident with it. Then there is persistent storage instead of only in-memory stuff, interaction via a CLI, other ways to do configuration, and we have a lot of other cool and crazy ideas.

So I invite you to actually try this. It's all open source software: you can install it yourself and run it in your own network. It's not a hosted thing; we want your routing data, but not in that way. There's documentation, GitHub, you know how it works. And please, if you can, help us out, because we are not operators and we don't have a network. Getting unicast data is doable; getting non-unicast data is really hard for us. But it's crucial to get our parsers to the place where they need to be and to make sure we can give you the tools to get the data out in a way that makes sense for you. So, if you are doing anything with BMP, no matter what platform, please reach out to us, also because the BMP implementations have their own peculiarities. It's really good for the robustness of Rotonda if we have diverse input that way, and also because the protocol is still evolving.

On that note, if you want to talk about IETF stuff, I am a co-author on these two drafts. I am not sure Colin is in the room, but talk to him as well; he is also a co-author on these two drafts, where we try to figure out ways to store BMP and make it more usable in ways other than the on-the-wire stuff that I showed you today.

So, please come find us and talk to us. We are here all week: I am here all week, Jasper is here all week. We hope you want to give this a shot and try to break it, and we invite you to get very creative with all the Roto stuff and make some cool things. Thank you.

(APPLAUSE.)

OSAMA AL DOSARY: Do we have any questions?

AUDIENCE SPEAKER: Hi, Mick O'Donovan, HEAnet. It's not really a question, more a request. I like it and I am very willing to play with it. Can you stick a Docker Compose file into the repo as well? I see the Dockerfile; maybe a Docker Compose file might help as well?

LUUK HENDRIKS: Yes, I think we can, I think we can. Yeah, we need to figure that out, let's talk about that, I am sure we can make it work, yes, sure thing.

OSAMA AL DOSARY: Any other questions? Rudy, go ahead.

AUDIENCE SPEAKER: Rudiger Volk. Look, I wonder: the Roto language, as I see it explained right now, is essentially, well OK, a filtering thing. It did not seem to actually include elements that allow it to express policy. Going from filtering to policy, which actually means you manipulate, you change the routes, probably is not a huge step.

LUUK HENDRIKS: Correct, yeah. It's not included in the examples and not all the features have landed yet. We will have things that let you change path attributes, that kind of stuff. It's not in there yet, but it will be soon.

RUDIGER VOLK: OK. The other thing that I wonder about is the in-store data structures. I didn't see anything that indicated a time stamp for when a certain update was observed. When you look into how permanent storage of BMP traces is handled, of course you need that. Is that just not there, and do you just have data structures that keep the current state of the tables, so that monitoring and expressing things that are related to changes over time would not be supported?

LUUK HENDRIKS: Currently it's like you say: you have this in-memory view of what is the latest state, so to speak, the current state. One of the bullet points mentioned persistent storage. What we have actually already implemented, but what is not exposed yet, is that on receiving a route that will overwrite an entry that's already in the database, it will write it to disk and create a historical record that you can later query on time windows to see differences over time. So I guess that answers that part of the question.

And when it comes to time stamps in BMP, I guess that's a whole can of worms on its own, with inaccuracies and stuff like that. That is on our radar and obviously I guess we can improve there. We have a concept of time in our store, but if you want to get into the filthy details, that is your guy; he is here all week and he is happy to talk to you.

RUDIGER VOLK: OK, time is a strange thing, we know at least since Einstein!

OSAMA AL DOSARY: Any other questions? All right, thank you, Luuk. Thank you.

(APPLAUSE.)

So our next presenter is Radu, he is going to be talking about route collectors.

RADU ANGHEL: I am guessing this is the... pointer? Yeah. So, hi, good morning, I am Radu. I do internet measurements, which means I often look at route collectors, so I am going to share some of the things that I found there. I will start with a short intro, even though this is the RIPE meeting so probably everybody knows what route collectors are. They are the memory of the internet. They keep historical information for everything that they see, but they don't see everything, because they see things only from the point of view of the limited number of peers that feed them information.

Besides this, we have the problem of what the internet is. This ISP has this number of routes, that ISP has a different number of routes: is this the same internet? Excluding when Cogent is fighting everybody, it should be the same, or at least similar to a high degree.

This data is used for a lot of things. Geoff would like to plot some IPv4 graphs, depletion, table size and stuff. You can use it to detect hijacks and you can alert on certain events. People use it to infer relationships between ASes, which is more or less successful.

Again, what is the internet? The common thing in the internet is BGP: routers have to speak BGP to understand each other. But BGP is evolving. We keep adding things to it: we keep adding RPKI, soon ASPA I hope. So this is constantly changing, but we still want to analyse it. This is also something not for the RIPE community: the global routing table, default routes and everything.

Now we have the path vector protocol problem, because each router can only tell you how to get to a certain destination from its point of view. The router doesn't know how somebody else's router can get to that destination, or even whether that other router knows this destination. This differs from OSPF, where the router knows everything about everything. But people still want to analyse the internet, so can we do it with what we find in route collectors? To a certain degree, we can. In my work I often use route collector data, so I will analyse the major route collectors, the public ones at least, because there are more and some are private. This will be a lot of numbers.

I will start with Packet Clearing House. They have, in theory, the coverage of the internet: they have over 300 route collectors placed at internet exchanges, which in theory gives them the largest number of peers. However, they don't provide any RIB dumps: they only provide updates every minute and a daily "show ip bgp" text file. That text file is missing perhaps important information such as BGP communities and other attributes that could be interesting, and the fact that they publish updates every minute makes the data quite unusable if you want to process all of it.

Then we have Route Views. Route Views is a very nice project from the University of Oregon. They have 46 route collectors; actually six of them are retired, and one of them has a different naming scheme when you try to download the RIBs, which is not very fun until you understand it. They have a lot of peers, but there are duplicates: they have the same IP and same peering at different route collectors, meaning they get the same data in multiple locations. There was also a problem with processing their updates, because they use Free Range Routing (FRR) with BGP extended messages enabled, and most tools were not able to process this, so these numbers are lower than the actual numbers they have.

Then we are going to RIS. RIS is from RIPE, as everybody knows. They have more peers and more ASNs providing feeds. Unlike Route Views, all route collectors from RIS have at least one peer providing a full table, and they also have IPv6 tables from everybody. Route Views was missing some of this information.

So, just comparing them: there is an overlap of almost three hundred ASNs between the two projects, which basically means that most of the peers of Route Views are also peering with RIS. Then we come to what information you can find in these. You can of course find private address space and default routes, which in some cases are useful and in some cases are not. I think it would be nice to have this documented somewhere.

There is a squirrel to distract you from the private space. Yes. As probably everybody expects, the RFC 1918 space is the most popular that appears there, which in my opinion is a bigger problem than the second one, the CGNAT IPs, because of the age of the RFCs: RFC 1918 is old and the CGNAT IPs are young. So I would expect the older private IPs to be better filtered; however, they are not. Then we can see the more specific routes. I have put them in different orders for IPv4 and IPv6; this is how I am estimating the peers there.
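
Spotting these two categories in a RIB dump is a simple containment check. The Python sketch below labels an IPv4 prefix as RFC 1918 private space or RFC 6598 shared address space (the CGNAT range, 100.64.0.0/10); the function name and labels are made up for the example, and it deliberately ignores other special-purpose ranges and IPv6.

```python
import ipaddress
from typing import Optional

SPECIAL_V4 = {
    "RFC 1918 private": ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"],
    "RFC 6598 shared (CGNAT)": ["100.64.0.0/10"],
}
_BLOCKS = [
    (label, ipaddress.ip_network(block))
    for label, blocks in SPECIAL_V4.items()
    for block in blocks
]


def special_space(prefix: str) -> Optional[str]:
    """Return a label if the IPv4 prefix lies inside private or CGNAT space."""
    net = ipaddress.ip_network(prefix)
    if net.version != 4:
        return None  # only the IPv4 ranges above are checked in this sketch
    for label, block in _BLOCKS:
        if net.subnet_of(block):
            return label
    return None


if __name__ == "__main__":
    for p in ["10.1.0.0/16", "100.64.12.0/24", "193.0.0.0/21"]:
        print(p, special_space(p))
```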

This could be useful information, for example for remote black hole routes: you might find, hey, this IP is under attack, and draw conclusions from the route collectors. However, the company announcing it, the ISP, might consider this private information: we don't want the internet to know our customer is under attack.

So I am not sure if these should or should not be there. The point‑to‑point links and internal routes could provide external people some information about what ISPs do inside their networks. Yeah.

However, these are limited to a low number of peers so they are not globally visible.

Then we have the multi-origin AS prefixes, which for the general public are not OK according to my currently favourite RFC. However, when you operate root servers you can push for an RFC that says you're critical infrastructure and the rules don't apply to you, so Verisign has around 60 or so ASNs. Besides this, with IPv6 we have funny things, funny words in hex, which people consider funny to announce as multi-origin prefixes; last time I checked, the most was eight ASNs announcing such a prefix. There are a lot of Chinese education ASNs that announce IPv6 prefixes as multi-origin, and here we have a small comparison that shows something that is weird to me. For IPv4 there are a lot more prefixes that appear as multi-origin, announced by a lower number of ASNs, while for IPv6 we have fewer prefixes but way more ASNs announcing them. The number in brackets is the ones that professionally do dual-stack multi-origin.
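
Finding multi-origin prefixes in collector data boils down to grouping routes by prefix and collecting the last ASN of each AS path. A minimal Python sketch, assuming the routes have already been extracted as (prefix, AS path) pairs with the path given as a flat list of ASNs (so AS_SETs and other subtleties are ignored); the function name and sample data are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_moas(routes: Iterable[Tuple[str, List[int]]]) -> Dict[str, Set[int]]:
    """Map each prefix seen with more than one origin AS to its set of origins."""
    origins: Dict[str, Set[int]] = defaultdict(set)
    for prefix, as_path in routes:
        if as_path:  # skip routes without an AS path (e.g. locally originated)
            origins[prefix].add(as_path[-1])  # the origin is the last ASN
    return {prefix: asns for prefix, asns in origins.items() if len(asns) > 1}


if __name__ == "__main__":
    routes = [
        ("192.0.2.0/24", [64496, 64500]),
        ("192.0.2.0/24", [64497, 64501]),   # same prefix, different origin
        ("198.51.100.0/24", [64496, 64502]),
        ("198.51.100.0/24", [64497, 64502]),  # different path, same origin
    ]
    print(find_moas(routes))
```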

Then we also have the problem of misreading the information provided by these route collectors. These days you get internet news that some route leak happened, but it was actually visible to only one or two peers. So maybe, before something becomes internet news, there should be some threshold. Another funny problem is what happens when you connect to one internet exchange and then on all these monitoring websites you appear to have 20 or 30 more peers than are connected to the IXP. This happens because of the new trend of remote peering: basically a larger company connects to a lot of internet exchanges and then resells this to remote peers. However, they seem to find it useful to act like a route server from the IXP and they remove their AS from the path. In my opinion this is faking the data, because it tricks people into thinking there is a shorter AS path. I am not sure how useful this is. Maybe ASPA can fix it, or maybe the internet exchanges could enforce the first hop AS; that would also be nice.

Then I think we should be aware of certain features and limitations of these. For example, there was this ticket for pyasn, a tool that transforms RIB dumps in MRT format to plain text files. This person was opening a ticket because, hey, I found in this Route Views snapshot private IPs with a public ASN assigned to them, why? They considered it a bug in pyasn when actually it was correct, because that specific dump contained those routes.

So there is a lack of documentation as to what is included in these, and this confuses people. Then, remember earlier when I talked about Route Views and not being able to process their updates due to BGP extended messages: this was fixed like three days ago when BGP launched a new version that is able to process these large packets.

There are other developments surrounding these route collectors. From only collecting RIB dumps and updates, we now have near-real-time information. There is also an attempt to increase the data available from network operators through a thing they called MANRS VIP, which suggests that in order to have a trusted badge you have to peer with a route collector, and somehow this is marketed as decentralisation when it could be marketed as centralisation. Then we have attempts to duplicate the data.

Now this is my favourite slide, about the licensing issue. Route Views has a very nice and open licence, CC BY, while RIS for commercial use wants you to include their logo, a link and a very long legal paragraph. However, I have never seen this on any commercial site that uses RIS data. So if you can't enforce this, maybe this rule doesn't help anybody.

Also, for stability reasons, perhaps using it commercially is not such a good idea, even though RIS has a large number of peers. So, my wish list would be, as stated before, better documentation of what to expect to see in these. If some of these are considered issues by the route collector operators, they have the power to contact their peers and ask them to fix this, like: stop announcing a default route to me because I don't need it. I don't remember the exact number of peers announcing default routes, but it was around one hundred, I think. There is also this RFC where you can ask for marked routes, so you can distinguish between a peer, a customer and an upstream; this would help people when they try to infer relationships between ASNs. Also better peer selection: Route Views published a new peering policy in January, and RIS has been doing it for, I think, two years. And these are my conclusions, the main one being that RIS needs a new logo. A logo, not a new one, because it doesn't have one. Thank you. Any questions?

(APPLAUSE.)

OSAMA AL DOSARY: Do we have any questions?

RANDY BUSH: Randy Bush, RGnet, IIJ, Arrcus. Some of the differences between them are also in what time period is available. As a researcher, I sometimes like to go back as early as possible, and it also matters when and how richly they collected IPv6 data as opposed to just IPv4. It would be nice to see, you know, some comparison of that too, if you have the time.

RADU ANGHEL: Yeah, I didn't have the time. I also had another slide where I was showing that Route Views collected data in the local time zone before 2006, and if you want to process that, you have to actually guess what time zone the collector was in, because it's not documented anywhere. So yeah, it would be interesting to know the IPv6 thing too.

RANDY BUSH: Route Views was late to start collecting IPv6, RIS was a little earlier. Also how well they record outages and anomalies in the data, so that you don't have to analyse the data to realise something is bad in a given period. The recording of these kinds of metadata has been weak and is slowly improving. They are working on it, but as you go back in history, it gets really ugly.

RADU ANGHEL: Yeah.

OSAMA AL DOSARY: Any other questions? Thank you, Radu. (APPLAUSE.).

So up next is Christoff Visser to talk about Apple wireless direct link.

CHRISTOFF VISSER: Thanks. Hello again. Still Christoff Visser from IIJ. And I guess I am going to continue a bit on my theme from yesterday about whether you know what your network is actually doing. In particular I am going to be talking about Apple Wireless Direct Link, and just a quick show of hands before seeing this presentation. Has anybody heard of this concept or AWDL before? One person! OK, awesome.

Cool. Before I started on this, I didn't know it either. So, I am one of those very annoying people, I have a computer at home and I tend to play things on my iPad or my phone, but I noticed that I was running into this issue, and I hope the audio works, and you might realise that there's strange stuttering going on, and I am just going to skip more to it, nobody needs to hear that (Audio played). While I was playing, it wasn't all the time but it was something I was encountering a lot, and it was another rabbit hole I decided to run into because it's a very strange rhythmic stuttering that's going on. I am trying to figure out: is this because of the application I am using, is this my device, and it was a new iPad at the time so fingers crossed it wasn't that, that just sounds expensive, or was it my network? Common sense says this is a wi‑fi network, wi‑fi is usually the issue, but it's rhythmic, it's very precise and that's not congestion, it doesn't feel like that. So that brings me to the question, is it something else?

Let's start with the application. I use Moonlight, a free OpenSource project, which means that, awesome, it has a GitHub, and I quickly realised that I am not the only one that is running into this exact issue. So there's a lot of other people with this micro stuttering, rhythmic pattern stuttering, and if I look at the application statistics, I can see here that we are seeing stuttering, which would imply that there's frames being dropped or packets dropped, but there's zero packets dropped, so what is going on here?

But what it does say here, and this is on a local connection, is I have got a 20 milliseconds network latency but a variance of 25, which seems oddly high for a local connection. And I swapped over to a wired connection and the issue is resolved. So this feels like, again, a networking issue, but at the same time this doesn't look like congestion, I know what congestion feels like and it's not rhythmic like this. So I started to dig a little bit deeper. On iPads there's not a lot of network debugging, so I tried another streaming app, and this is what we use for Steam, and here we can see very clearly this rhythmic increase in latency that comes through. That tells me it's not just the application, I am not going crazy, this does exist. And I also wanted to see if this is happening from my computer, where I have a little bit more control, and lo and behold, it does. So this is just a graphical ping running, hopefully, yes, so hopefully it's kind of clear, but you can see this rhythmic pattern that goes up and comes down and then it starts to go up again, and this is going all the way from 3 milliseconds latency to 92 milliseconds. That's massive for, again, a local connection: my computer is physically wired in, it's only my MacBook that is wireless.
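As an editorial illustration of the measurement described here, the sketch below summarises a series of ping round-trip times and looks for a repeating pattern using autocorrelation. It is not the tool used in the talk; the function names and the synthetic sample data are assumptions made for the example, with the 3 and 92 millisecond values borrowed from the figures quoted above.

```python
from statistics import mean, pstdev


def summarize(rtts_ms):
    """Basic latency statistics for a list of round-trip times in milliseconds."""
    return {
        "min": min(rtts_ms),
        "max": max(rtts_ms),
        "mean": mean(rtts_ms),
        "jitter": pstdev(rtts_ms),  # population standard deviation as a jitter proxy
    }


def dominant_period(rtts_ms, max_lag=None):
    """Return the lag (in samples) at which the series best repeats, or None.

    Heuristic: compute the autocorrelation for each lag, skip the initial run
    of non-negative values (short-lag smoothness), then take the highest peak.
    """
    n = len(rtts_ms)
    mu = mean(rtts_ms)
    centered = [x - mu for x in rtts_ms]
    energy = sum(c * c for c in centered)
    if energy == 0:
        return None  # perfectly flat series, nothing periodic to find
    max_lag = max_lag or n // 2
    scores = [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(1, max_lag + 1)
    ]
    start = next((i for i, s in enumerate(scores) if s < 0), None)
    if start is None:
        return None
    best = max(range(start, len(scores)), key=lambda i: scores[i])
    return best + 1 if scores[best] > 0 else None


# Synthetic example (made-up data): a 92 ms spike every 10th ping, 3 ms otherwise.
samples = [92 if i % 10 == 0 else 3 for i in range(100)]
print(summarize(samples))
print("period in samples:", dominant_period(samples))
```

Random congestion would not be expected to produce a clear repeating lag, whereas a precise, rhythmic disturbance like the one described should. Multiplying the lag by the ping interval gives the period in seconds.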

So I got more into digging, like what is ‑‑ and again it's very rhythmic, pattern that happens. And I have done enough software development that this feels like a software issue of some description.

So I was trying to figure out what is causing this. Clearly it's pointing to the network, but network doesn't feel right. So I tried a less congested network, I am very much alone in my network connection here, I even tried 2.4 gigahertz, I was desperate, but the issue remains. And, frankly, I had this issue on and off for three, four years, but nothing ever really showed up until I tried my normal Google search, and because we live in the modern day, I need to add site:reddit.com to get anything useful, and lo and behold, somebody was talking about exactly this issue I am having.

But they are talking about it in Vision Pro, so these were originally made last year, so Vision Pro was very new, the latest greatest thing has the same issue. But they are describing this rhythmic stuttering issue and they are telling me to change my wi‑fi channel to 149, and that feels strange, why would a wi‑fi channel fix stuttering? It doesn't add up.

But what is nice is they referred to this AWDL which up until this point I had never heard but it points me to something I can look into now, something to work with.

So I follow the link and yeah, the summary of what AWDL is, this is the source of our link, this is from TU Darmstadt, they have done a lot of great work reverse engineering this. As the name implies, it's Apple Wireless Direct Link, so this is to make Handoff work between your phone and MacBook or different MacBooks, and your AirDrop and Auto Unlock of your machine. All of the nice things that make the Apple ecosystem what it is and keep people there rely on this Apple Wireless Direct Link, and it's using normal wi‑fi for the devices to communicate with each other.

So what it really does, digging into it a bit deeper: it doesn't matter where in the world you bought your device, for them to talk to each other they need to be able to communicate with each other, so the way this is done is they have these three social wi‑fi channels, which are channel 6 for 2.4 gigahertz, 44 and 149 for five gigahertz. I don't have 6 gigahertz at home. As part of this whole process, AWDL needs to essentially listen if anybody wants to share stuff with it, and it also needs to be able to advertise to others, like, hey, I want to AirDrop something to someone else, and ironically enough, they also negotiate which wi‑fi channel to send data over, so they have a concept of wi‑fi channels to some extent. But I am an annoying network engineer, I want to optimise my networks so I want to use channels that are empty, but that causes a ‑‑ it's a big mistake. What ends up happening is that if you are not on one of these social channels, you get this periodic wi‑fi channel swapping where it goes to the social channel, listens if anybody wants to talk to it and swaps back, creating the very rhythmic stuttering that we just saw.
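The channel swapping described here can be pictured with a deliberately simplified toy model. The sketch below is an editorial illustration, not Apple's algorithm: the hop period and dwell time are hypothetical parameters chosen for the example, not measured AWDL timings, and the model assumes a client on a social channel never has to leave its access point's channel. Packets that arrive while the radio is away are held in a queue and delivered when it returns, so latency rises but nothing is dropped.

```python
# Social channels as named in the talk.
SOCIAL_CHANNELS = {6, 44, 149}


def added_delay_ms(send_time_ms, period_ms, dwell_ms):
    """Extra wait for a packet sent at send_time_ms.

    Toy assumption: the radio is away on a social channel for the first
    dwell_ms of every period_ms, and packets wait until it comes back.
    """
    offset = send_time_ms % period_ms
    return dwell_ms - offset if offset < dwell_ms else 0


def simulate(ap_channel, base_rtt_ms, n_pings, interval_ms, period_ms, dwell_ms):
    """Return one RTT per ping; every ping is answered, some are just late."""
    rtts = []
    for i in range(n_pings):
        t = i * interval_ms
        if ap_channel in SOCIAL_CHANNELS:
            extra = 0  # already on a social channel, no hop needed in this model
        else:
            extra = added_delay_ms(t, period_ms, dwell_ms)
        rtts.append(base_rtt_ms + extra)
    return rtts


# Hypothetical numbers: ping every 100 ms, hop every 500 ms for 90 ms.
print(simulate(120, 3, 10, 100, 500, 90))  # rhythmic spikes on channel 120
print(simulate(44, 3, 10, 100, 500, 90))   # flat on a social channel
```

Even this crude model reproduces the two symptoms from the talk: a precise, repeating latency spike on a non-social channel, and zero packet loss because every ping still gets a reply.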

So I wanted to test this theory, does this actually work? I set channel 6 for my 2.4 gigahertz, 44 for 5 gigahertz. 149 is not available in Japan, so I can't test that, but from what I have seen, that one is highly recommended because there's even less congestion on that.

And it works. And I hate that it works. I really hate that this works.

As you can see, the line is flat, it's flat, and the stable speed has also increased from 50 to 105, so it works, and everything inside me just hates that this is the case, because the whole point of wi‑fi channels is to share the network and stuff, why do I have to use specific ones?

And the same thing for when I run the pings from my Mac Book. It looks exaggerated but the major thing is this is a max of 9 milliseconds now instead of 92, so a lot more what you would expect with wi‑fi on a local network but significantly more stable than what we were looking at.

So what is actually happening in this whole process? So what I did, I set my wi‑fi to 120 as an example, no congestion and it wasn't one of the ones advertised, and I had to do sniffing on a different laptop because these packets are completely hidden on the host machine that's using it. You might be able to do some jailbreaking and get into the low kernel level, but with the new Apple silicon ones I didn't want to gamble with that. I ran pings from one machine over wi‑fi and the other one was just sniffing on this channel 44, and all of these red ones are where the AWDL packets are detected and blue is the latency, and we can see that there's these jumps and increases in RTT whenever it kind of lines up with the red. As you might notice, there's a couple of these red clusters, so the first one is when I went and unlocked my iPad, it starts to do the negotiation, etc, and slowly, as they have kind of communicated a bit and say OK, we don't actually have to send stuff to each other, it starts to slow down and the space between the messages increases. And then I unlock my iPhone and there's a lot longer one, there's a sense of hierarchy, who is going to be the device that's in charge, and while they are doing all of this negotiation, we start to see more and more of the latency increases with that. So there's definitely some correlation going on there.
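The comparison described here, lining up AWDL frames from the sniffer with ping latency from the other machine, can be sketched roughly as follows. This is an editorial illustration under stated assumptions: both captures are taken to share a common clock, the 0.1 second matching window is arbitrary, and the sample timestamps are made up. How the AWDL frame timestamps are extracted from a capture is outside the sketch.

```python
from bisect import bisect_left
from statistics import mean


def near_any(t, sorted_times, window_s):
    """True if any timestamp in sorted_times lies within window_s of t."""
    i = bisect_left(sorted_times, t - window_s)
    return i < len(sorted_times) and sorted_times[i] <= t + window_s


def split_by_awdl(pings, awdl_times, window_s=0.1):
    """Split ping RTTs into those sent near an AWDL frame and the rest.

    pings is a list of (send_time_s, rtt_ms); awdl_times is a list of
    frame timestamps in seconds on the same clock.
    """
    awdl = sorted(awdl_times)
    near, far = [], []
    for t, rtt in pings:
        (near if near_any(t, awdl, window_s) else far).append(rtt)
    return near, far


# Made-up sample data: two pings coincide with AWDL frames, two do not.
pings = [(0.0, 3), (1.0, 90), (2.0, 4), (3.0, 85)]
awdl_frames = [1.02, 2.95]
near, far = split_by_awdl(pings, awdl_frames)
print("mean RTT near AWDL frames:", mean(near))
print("mean RTT elsewhere:", mean(far))
```

A clearly higher mean in the first group would support the correlation seen in the graph, though on its own it does not prove that the AWDL frames cause the delay.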

And I then reached out to Apple on this thing. What I have discovered, especially talking to the Darmstadt guys, is they don't have any good communication channels, so this is from the Feedback Assistant, and normally if you have got a beta of iOS, you put something there, it's about the only place you have. If you encounter these issues yourself, please use my reference because apparently that's the only way to get some sort of a response. I reported this at the start of last year and I have included my talk of this all before and I have had no responses since, and it still exists in the latest iOS and macOS updates, all versions.

So what are our options, what can we actually do? The first and easy one is don't use AWDL. That means there's no more AirDrop, no more Continuity Camera. When I am travelling I use the iPad Sidecar as a second monitor, so it works really nicely. These are a lot of the nice things that incentivise people to stay in the Apple ecosystem, but that fixes the problem for me. There's approximately over 1.5 billion other iPhone users in the world, and are you really going to tell the users in your network just don't use the features on these Apple devices that they got them for? So it's not really a solution.

The other option is to do, well, the Apple way of networking, so for the best experience you use the same wi‑fi channels as everybody else or you will suffer from jitter at some point. And the problem is, as the name implies, it's Apple Wireless Direct Link, and a newer, faster AP is not going to fix this because the issue is not on the AP side, it's directly between the devices. And I put a question mark here, I have found social channels for 6 gigahertz, this issue will only keep persisting moving forward. One of the big things that Apple is trying to push is this RFC 9330; again, this is between your device and your router etc, it's not going to fix this direct link between things. And the big thing with this is, again, because it's a direct link, somebody doesn't even need to be connected to your network, they can be walking past while you are on a conference call, unlock their device and there's going to be some jitter introduced. Now most things like Zoom and stuff have ways to counteract this a little bit, but at the end of the day the question really becomes: should Apple or any vendors really decide how we do our networking, should they decide that we only use these wi‑fi channels to give the best experience?

So that's kind of, like, again one of my talks where I don't really have a good solution, mainly because I don't work for Apple, and unfortunately on the list I don't see any Apple attendees here. But it brings us to the question of the Apple ecosystem, it's a double edged sword. There's a lot of convenience, as I described, that they build with this and they are pushing this more, you can mirror your iPhone on your device, they are pushing this more, and the question is really: is this convenience worth the disruption? For most things sure, it doesn't matter too much, but this is the big one here. Cloud gaming and remote gaming is growing bigger and bigger and they are trying to push high fidelity, bigger bit rates. If you are trying to do 4K HDR at 120 FPS, yes, you are going to start to feel these delays and packet loss more and more.

And it really feels, it makes me uncomfortable because it really promotes bad network practices to like not use the best channels to actually improve your end user experience.

Which doesn't really work in, like, a conference room or in a big office setting, but for your home use, if you have Apple devices and stuff, I would recommend using wi‑fi channels 6, 44 and 149. And the other thing as well, and it's part of the magic that they have with their implementation, is there's actually no packet loss that's really happening. While it's doing the swapping, it's just putting things in a queue and delaying things and pushing them back on, so really, is this a problem or is this just me being an annoying gamer person because my video and stuff isn't looking as good? And the thing is the connection will stabilise. Slowly, when there's not much communication going on, it will stabilise, nothing going on, until another Apple device appears.

So yeah, at the end of the day I don't really have a proper solution and this is more, I wanted to do this especially for people from ISPs, etc, maybe share some of this with your help desk, because if your people are using one of these services, the people they are going to complain to is going to be the services first, then their ISP, maybe their router box, but not the last bit, not the, like, final metres etc. So it might help with some of the headaches that you guys experience on the help desk team and to work with that.

But yeah. That's it for that, I will take questions as best I can.

(APPLAUSE.)

AUDIENCE SPEAKER: Hello. I was wondering what are the security implications of this. So OK, we have some stuff which is systematically on the same channels and it's supposed to be used to communicate between Apple devices, I don't know, random Apple devices. How about, for example, weaponising it: let's say you start intentionally, in a big conference, pushing junk, junk, junk, and what can you gain as a malicious attacker, because this is even worse?

CHRISTOFF VISSER: So I will actually ‑‑ where is it? So I would highly recommend, OK, follow these guys. They are more of a wireless security team. What you are describing they did find and disclose. It was worse: from my understanding, one of the first things they found with it was in plain text, all of the Apple IDs of people on these communications. Now it's kind of encrypted, so for instance AirDrop, it won't share your stuff if you put in contacts only, so there are a lot of security things that they have overcome. I don't know if, at this stage, it could potentially be disrupted at a large enough scale, but apart from Steve Jobs first introducing wi‑fi on the iPhone, where there was that disruption, at these Apple events I haven't seen anything like that, so... I think these guys have tackled it, but I was more looking at the network latency aspect of that.

AUDIENCE SPEAKER: Thank you very much for your interesting presentation, it feels really good to hear that I am not the only one who went through almost this exact same trouble shooting process.

CHRISTOFF VISSER: I heard that a lot when I presented, a lot of people have that light bulb moment, ah, that's what caused it.

AUDIENCE SPEAKER: Out of curiosity, you haven't received any official answer from Apple on this, but have you perhaps gotten any unofficial confirmation or information from somebody who works at Apple who you might have met at conferences and such?

CHRISTOFF VISSER: Unfortunately not, I would love to, but as you see on the attendance list, they don't come to events like this so it's hard. The thing is, going back to the GitHub page on things, there's people complaining about these things since 2019, they see this issue but everybody is blaming something else, it was only when I did a deep dive that all the stars aligned. I would love to be able to talk to them. For instance, one of the things with how Google does their direct ones is that, instead of the social channels, the negotiation happens on Bluetooth Low Energy. The thing is, AWDL is older than Bluetooth Low Energy, so transitioning to that might be a bit harder.

AUDIENCE SPEAKER: Thanks again, if I run into somebody from Apple, I now have a conference point to talk to as your problem.

CHRISTOFF VISSER: Yes, yes.

AUDIENCE SPEAKER: Hello. I actually heard that before because I debugged a VPN issue, where deactivating the VPN would stop this.

CHRISTOFF VISSER: Yes. I am not really sure how exactly, I have heard the same thing but I am not sure exactly how.

AUDIENCE SPEAKER: I found that inside there's IPv6 link-local, I believe, or something like that. So if the firewall gets to see that, which it should not according to Apple, it could block it, but my question was something else. Did you see any difference between generations of NICs, the wi‑fi cards? Because this seems to me like it could be taking more time depending on how fast your wi‑fi device can switch channels, so it might have less impact with a modern card, it might be why Apple wants to do their own chips? I don't know.

CHRISTOFF VISSER: I haven't tried it on any, my latest machine is an M3 and the first machine that I did the testing on was the M1, so the same issue existed on both of them, but I haven't tried it on one of the new ones where they are doing their own systems stack. But I will say between the two devices, my experience was the same. And same thing on, like, an M1 iPad, same issue. Strangely enough, and this is something to do with just how they put the devices to sleep, I experienced this more frequently on iPads because they tend to go into power save more quickly compared to iPhones, which will still do a lot of background tasks before they fully finish things off, so yeah. It would be nice to see if they have done that but yeah, I'd have to buy something new for that, maybe put it in my budget for next year.

AUDIENCE SPEAKER: Thank you.

AUDIENCE SPEAKER: Hi, Apple people come to IETFs, they are quite nice and they talk to me so seek me out and if you have a feedback idea or something, I can just like try pushing to it.

CHRISTOFF VISSER: Absolutely, I will come see you.

AUDIENCE SPEAKER: It takes them time because I wanted to remove a single IPv4 for month page or degree and it took like a year and, a top tier, you know, Stuart had to get involved into removing a word from the main page, so it takes them time but things happen, I also like, there was like a kernel crash that I can cause with BIND and it's only got recently fixed, but it's happening so get in touch with me and I can connect you with the people who come to IETFs.

CHRISTOFF VISSER: That would be awesome, thank you very much.

OSAMA AL DOSARY: Any other questions? Thank you, Christoff, do we have questions online? Thank you Christoff.

CHRISTOFF VISSER: Thank you very much.

(APPLAUSE.).

OSAMA AL DOSARY: I'd like to remind you to rate the talks online, the ratings are very useful for us to do a better job, so it's an opportunity to give constructive feedback, so please rate them online. There are a few more announcements that we have: the PC elections, the deadline to nominate yourself is today, 3.30, and also the deadline to register for the NRO NC elections is at 2pm, and also the social tonight, the shuttle buses will be leaving at 8.15 and 8.30, and we should be back here at 11 after the coffee break, we have an extended coffee break. Anything else? Clara? Right. Thank you everyone.

(APPLAUSE.)

Coffee break.