This morning a number of customers were unable to connect to the Clockwork API. This was due to a major outage with an Internet backbone provider.
The Internet is made up of lots of separate network providers such as BT, Virgin Media and AT&T. Given the sheer number of providers, and the fact that they’re spread all over the world, it’s not practical for every network to connect directly to every other one, so each tends to connect to a few neighbouring networks.
Because each network only knows about its immediate neighbours, it needs some way to learn about networks on the other side of the world. The solution, devised back in 1989, is the Border Gateway Protocol (BGP). Essentially, each network tells its neighbours which other networks it can reach. These messages are passed from neighbour to neighbour until eventually the whole world knows which way to send traffic for each network.
For people who know all about networking, I’m aware this isn’t quite correct but it’s near enough.
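To make the neighbour-to-neighbour idea concrete, here is a minimal sketch in Python. It is a toy model, not real BGP: the network names (`A` to `D`), the `links` list and the `propagate` function are all made up for illustration, and the only rule it follows is that the first announcement a network hears is the one it keeps.

```python
from collections import deque


def propagate(links, origin):
    """Spread an announcement for `origin` from neighbour to neighbour.

    Returns a dict mapping each network to the path it would use to
    reach `origin`. Hypothetical toy model: the first announcement a
    network hears wins, which is far simpler than real BGP.
    """
    # Build a map of who is directly connected to whom.
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    routes = {origin: [origin]}  # the origin trivially knows itself
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        # Tell every neighbour about the route we have just learnt.
        for neighbour in sorted(neighbours.get(current, ())):
            if neighbour not in routes:
                routes[neighbour] = [neighbour] + routes[current]
                queue.append(neighbour)
    return routes


# Four made-up networks connected in a chain: A - B - C - D
links = [("A", "B"), ("B", "C"), ("C", "D")]
routes = propagate(links, "D")
print(routes["A"])  # the path A would use to reach D
```

Network `A` never talks to `D` directly. It only learns the route because `B` passed on what `C` said, which is the same chain of trust the real protocol relies on.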
Unfortunately, back in 1989 nobody was too worried about the Internet being used maliciously or about people making mistakes, so these messages are simply passed on with the assumption that they’re true.
A network provider in Malaysia started making announcements to its neighbours that it was the best route for traffic to everywhere in the world. One of the connected networks (Level 3) believed these announcements and so sent traffic to this Malaysian network rather than to where it was supposed to go. Unfortunately, because Level 3 is a major backbone provider, this affected a significant number of people.
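As a rough illustration of why a false announcement gets believed, here is a small Python sketch. Everything in it is an assumption made for the example: the network names, the `best_route` function, and the rule that the shortest advertised path wins. Real routers weigh up several more factors, but the key point is the same: nothing checks whether the claim is genuine.

```python
def best_route(announcements):
    """Pick the announcement with the shortest advertised path.

    Hypothetical, simplified selection rule. Note that there is no
    step that verifies the announcement is true.
    """
    return min(announcements, key=lambda a: len(a["path"]))


# Two made-up announcements for the same destination.
announcements = [
    # The genuine route, three hops away.
    {"from": "genuine-neighbour", "path": ["X", "Y", "destination"]},
    # A mistaken claim to be right next to the destination.
    {"from": "leaking-network", "path": ["destination"]},
]

chosen = best_route(announcements)
print(chosen["from"])  # the neighbour that traffic will be sent to
```

Because the mistaken claim looks like the better route, it wins, and traffic heads the wrong way.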
Who was affected?
The UK seems to have been one of the most impacted countries. As a result, customers with servers outside the UK were generally unable to connect to Clockwork. This is mostly down to us being an island, so all traffic to and from the rest of the world has to go through a small number of backbone network providers. Customers based within the UK are unlikely to have been affected.
What wasn’t affected
If you didn’t experience any connection issues, your messages were not affected. We have a number of connections out to mobile networks and SMS suppliers, and we re-routed traffic to avoid any troublesome links.
If you have any questions, get in touch.