The Three Protocols That Could Bring Down The Internet (And Life as We Know It): DNS, PKI and BGP
With the shutdown of Heathrow for a day, many people are talking about resilience in our critical national infrastructure, and one of the most important parts of this is the Internet. While in the past we used leased lines to add resilience to our communications infrastructure, these days we all use the Internet to provide a reliable and inexpensive way to communicate. Without it, there would be no planes taking off anywhere in the world, and our modern life would grind to a halt much more than it did with COVID-19. In fact, if it wasn’t for the Internet, the effects of COVID-19 on the world would have been far greater. So imagine if the whole of the Internet just shut down in an instant!
For something that was envisioned to be a totally distributed network of interconnected machines that was robust against attacks, the Internet is a long way from being this. Basically, it was built on “old” protocols, many of which date from the 1980s, a time when mainframes and minicomputers ruled the world of computing. The protocols IP (RFC 791), TCP (RFC 793), HTTP, DNS, and so on have changed little since their initial drafts in RFCs (Request for Comments). In fact, HTTP has struggled to move past Version 1.0.
Thus, cybersecurity professionals know there are three fundamental weaknesses that could bring down the whole of the infrastructure in an instant. These are DNS (Domain Name System), PKI (Public Key Infrastructure), and BGP (Border Gateway Protocol).
DNS
And so we have flaws in our Internet protocols that need to be fixed. But is “just enough” security enough for something that provides the core service that runs the Internet: DNS (Domain Name System)? One of the greatest threats is DNS cache poisoning, where a malicious host can seed an incorrect address for a domain name for the rest of the network. DNSSEC overcomes this by having a protected zone in which all the responses are digitally signed. DNS resolvers can then check that the DNS information has been signed by one of the trusted hosts.
Bring down the Internet
A massive distributed denial of service (DDoS) against the core DNS infrastructure would cripple most of the Internet, as we are so dependent on DNS servers to resolve domain names to IP addresses. If the core fails to work, the rest of the infrastructure will collapse. And, at any time, our DNS infrastructure could be taken over for malicious purposes.
DNS is a terrible protocol. It relies on one DNS server telling others what the IP resolution of the domain name is. But who can you trust to seed this? Well, the main authority of the domain. We don’t have to interrogate the main authority, as the information on the domains will be propagated through the Internet. So what happens if a malicious entity seeds the wrong IP address? Let’s say a nation-state, MegaTropolis, wants to take over a little state, MiniTropolis. Well, their first focus might be poisoning DNS caches around the world, so that all the MiniTropolis domains point to MegaTropolis sites.
Domain Name System Security Extensions (DNSSEC)
The problem with DNS is that it has virtually no real security built into it, and a fake DNS server can easily be set up to redirect users to incorrect sites. But there is a solution … Domain Name System Security Extensions (DNSSEC). It provides origin authentication of DNS data, along with data integrity. It does not stop someone from viewing the data in the request and reply. The issues related to DNS have been known for a while [here]:
The threats include: Packet interception; ID Guessing and Query Prediction; Name Chaining; Betrayal By Trusted Servers; Denial of Service; Authenticated Denial of Domain Names; and Wildcards.
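To make the difference between an unprotected and a validating resolver more concrete, here is a minimal toy sketch. It is not real DNS or DNSSEC: the Record and Cache types, the helper names, the example domain and the use of Ed25519 signatures are all assumptions made for illustration. It only shows the principle that a cache which checks signatures against a trusted key will reject a forged answer that an unprotected cache would happily store.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// Record is a toy DNS answer: a name, an address and an optional signature.
type Record struct {
	Name, Addr string
	Sig        []byte
}

// Cache is a toy resolver cache. When ZoneKey is set, only answers signed
// by that key are stored; when it is nil, any answer is accepted.
type Cache struct {
	ZoneKey ed25519.PublicKey
	entries map[string]string
}

// payload is the byte string that is signed for a record (illustrative format).
func payload(name, addr string) []byte { return []byte(name + "|" + addr) }

// newZoneKey creates the key pair of the (hypothetical) trusted zone.
func newZoneKey() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(nil) // nil selects crypto/rand
}

// signRecord builds a record signed with the zone's private key.
func signRecord(priv ed25519.PrivateKey, name, addr string) Record {
	return Record{Name: name, Addr: addr, Sig: ed25519.Sign(priv, payload(name, addr))}
}

// Store caches the record and reports whether it was accepted.
func (c *Cache) Store(r Record) bool {
	if c.ZoneKey != nil && !ed25519.Verify(c.ZoneKey, payload(r.Name, r.Addr), r.Sig) {
		return false // unsigned or wrongly signed answers are dropped
	}
	if c.entries == nil {
		c.entries = map[string]string{}
	}
	c.entries[r.Name] = r.Addr
	return true
}

// Lookup returns the cached address for a name, or "" if there is none.
func (c *Cache) Lookup(name string) string { return c.entries[name] }

func main() {
	pub, priv, err := newZoneKey()
	if err != nil {
		fmt.Println("key generation failed:", err)
		return
	}
	good := signRecord(priv, "www.minitropolis.example.", "192.0.2.10")
	forged := Record{Name: good.Name, Addr: "203.0.113.66"} // attacker has no zone key

	plain := &Cache{} // no validation: the last answer wins
	plain.Store(good)
	plain.Store(forged)
	fmt.Println("unprotected cache resolves to:", plain.Lookup(good.Name))

	validating := &Cache{ZoneKey: pub}
	validating.Store(good)
	fmt.Println("forged answer accepted:", validating.Store(forged))
	fmt.Println("validating cache resolves to:", validating.Lookup(good.Name))
}
```

The unprotected cache ends up pointing at the attacker's address, while the validating cache keeps the signed one.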
DNS works by creating a domain record which defines an SOA (Start of Authority). This then defines the serial number, the refresh time, and so on:
Within this, we can define an NS (Name Server) and MX (Mail Exchanger), along with the IP addresses of defined hosts within the domain. We can use nslookup to interrogate the entry:
It should be noted that DNSSEC does not provide confidentiality of the data, nor does it protect against a denial-of-service attack.
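As a rough illustration of the SOA fields described above, the following sketch models the record as a plain Go struct and prints it in zone-file form. The SOA type and the Newer helper are hypothetical names, not part of any DNS library, and the values are the ones used for asecuritysite.com later in this article. The serial comparison follows the wrap-around rule of RFC 1982 serial number arithmetic.

```go
package main

import "fmt"

// SOA holds the Start of Authority fields (illustrative type, not a library API).
type SOA struct {
	Name    string // zone name, fully qualified
	TTL     uint32 // time to live of the record, in seconds
	Ns      string // primary name server
	Mbox    string // responsible mailbox, written as a domain name
	Serial  uint32 // version of the zone data
	Refresh uint32 // how often secondaries check for changes, in seconds
	Retry   uint32 // wait before retrying a failed refresh, in seconds
	Expire  uint32 // when secondaries stop answering without a refresh
	MinTTL  uint32 // minimum/negative-caching TTL
}

// String renders the record in zone-file presentation format.
func (s SOA) String() string {
	return fmt.Sprintf("%s %d IN SOA %s %s %d %d %d %d %d",
		s.Name, s.TTL, s.Ns, s.Mbox, s.Serial, s.Refresh, s.Retry, s.Expire, s.MinTTL)
}

// Newer reports whether serial a is more recent than serial b, allowing for
// 32-bit wrap-around (RFC 1982 serial number arithmetic).
func Newer(a, b uint32) bool {
	return int32(a-b) > 0
}

func main() {
	soa := SOA{
		Name: "asecuritysite.com.", TTL: 14400,
		Ns: "ns.asecuritysite.com.", Mbox: "mail.asecuritysite.com.",
		Serial: 1293945905, Refresh: 14400, Retry: 3600, Expire: 604800, MinTTL: 86400,
	}
	fmt.Println(soa)
	fmt.Println("serial+1 is newer:", Newer(soa.Serial+1, soa.Serial))
}
```

A secondary server holding an older serial would treat the zone as changed and fetch the new data.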
Coding
Some coding in Go is [here]. With this, we will create a 2,048-bit RSA key pair (a public key and a private key), and then sign the SOA with the private key of a trusted domain. Others can then check the signature using the public key of the trusted domain. The SOA entry is created as in DNS, with a resource record header that holds the domain name and the Internet class (dns.ClassINET):
The serial number is important on the SOA, as it defines the most up-to-date version of the entry. In this case, the signature is an RSA signature over a SHA-256 hash. We also include a date for the expiry time for the signature. The signature is created by the entity defined in the key (key.Hdr.Name) [here]:
package main

import (
	"crypto"
	"fmt"
	"os"

	"github.com/miekg/dns"
)

func main() {
	// Domain to sign for; the first command-line argument overrides the default.
	domain := "asecuritysite.com"
	if len(os.Args) > 1 {
		domain = os.Args[1]
	}
	domain = dns.Fqdn(domain) // records use the fully qualified name (trailing dot)

	// DNSKEY record that carries the public key of the signing entity.
	key := new(dns.DNSKEY)
	key.Hdr.Name = domain
	key.Hdr.Rrtype = dns.TypeDNSKEY
	key.Hdr.Class = dns.ClassINET
	key.Hdr.Ttl = 3600
	key.Flags = 256  // zone key
	key.Protocol = 3 // fixed value for DNSSEC
	key.Algorithm = dns.RSASHA256

	// Generate the 2,048-bit RSA key pair; the public half is stored in key.
	priv, err := key.Generate(2048)
	if err != nil {
		fmt.Printf("Failed to generate key: %v\n", err)
		return
	}
	// The generated RSA private key can be used directly as a crypto.Signer.
	signer, ok := priv.(crypto.Signer)
	if !ok {
		fmt.Printf("Unsupported key type %T\n", priv)
		return
	}

	// SOA record that will be signed.
	soa := new(dns.SOA)
	soa.Hdr = dns.RR_Header{Name: domain, Rrtype: dns.TypeSOA, Class: dns.ClassINET, Ttl: 14400}
	soa.Ns = "ns." + domain
	soa.Mbox = "mail." + domain
	soa.Serial = 1293945905
	soa.Refresh = 14400
	soa.Retry = 3600
	soa.Expire = 604800
	soa.Minttl = 86400

	// RRSIG record that holds the signature over the SOA.
	sig := new(dns.RRSIG)
	sig.Hdr = dns.RR_Header{Name: domain, Rrtype: dns.TypeRRSIG, Class: dns.ClassINET, Ttl: 14400}
	sig.TypeCovered = dns.TypeSOA
	sig.Algorithm = dns.RSASHA256
	sig.Labels = uint8(dns.CountLabel(domain)) // 2 for asecuritysite.com.
	// Fixed demo timestamps (10 July 2019). Inception and expiration are
	// identical here; a real signature would cover a validity window.
	sig.Expiration = 1562761057
	sig.Inception = 1562761057
	sig.OrigTtl = soa.Hdr.Ttl
	sig.KeyTag = key.KeyTag()
	sig.SignerName = key.Hdr.Name

	// Sign the SOA with the private key, then verify with the public key.
	if err := sig.Sign(signer, []dns.RR{soa}); err != nil {
		fmt.Printf("Failed to sign: %v\n", err)
		return
	}
	if err := sig.Verify(key, []dns.RR{soa}); err != nil {
		fmt.Printf("Failed to verify: %v\n", err)
	} else {
		fmt.Printf("Signature okay\n\n")
	}

	fmt.Printf("SOA: %s\n\n", soa)
	fmt.Printf("Sig: %s\n\n", sig)
	fmt.Printf("Key: %s\n\n", key)
}
A sample run for the domain “asecuritysite.com.” is [here]:
Signature okay
SOA: asecuritysite.com. 14400 IN SOA ns.asecuritysite.com. mail.asecuritysite.com. 1293945905 14400 3600 604800 86400
Sig: asecuritysite.com. 14400 IN RRSIG SOA 8 2 14400 20190710121737 20190710121737 42450 asecuritysite.com. NqFhFsl
EHFtETdO1cWFKhMyWydiTDpGkWKQggzgbzVGa9COBQDrFS+NRsVEQEpIee3EMJ/hY6RXpmo75ZxWO7OO4FfIBbl2qgZctewmutFy+HT4GUFA3dJp9rzfr2Jn
0lnCCOkLIW33zHgXgSXmKJobWXPsHTPQoUogjdxmPzbzWFd6S1XXPep4klyi1hbcM9uvnABtFGw5tb/rd7hs6B/hS9spoO5MHZqDczmAKEoW1XKht12G97Qc
Qz3nyyGlRScntbHNXk3xaD3Xzevu9SWhZ4Ro2xFvIWxdFkLWCv2wZkcwHIG9q7zFE5HglZhip+q8EKzfZV1PQn8AOVf16dA==
Key: asecuritysite.com. 3600 IN DNSKEY 256 3 8 AwEAAdcn7cvHVXBvDQVsh6ge+JHZWTCn4WarBbWzQ1TGKMpk/pT9L386Q2ZU1fvynYdp
eqQi4PKbRpavycjqMFBtJvg9qhHpXq25iSpdx1+aNHL8zyvx/eFFTAWHA8qN3uQuVKc5mm+KQ498poWd1dnYBNHRcNGgZA8epNsq+WSoLzRISIxgiFDs6j+k
ryO4ivj7n8dLOqqcv9C/tQl/7YhU4y3lHSek9FqFOCpYK4DzQb+jJuLKNWjAPobWF19JkrvcN0KeDZ2TZEeApz3UGtjsRMowH4AJ48yKyaT2vnmE52MwIiC1
/yHLtQJK77CMgow3BejXO2T9uytp+rTQyZk8Ens=
PKI (Public Key Infrastructure)
As if DNS was not bad enough, a single tweak of PKI would bring virtually every system down. With this, we identify an entity with its public key, and it proves its identity by signing a hash of some data with its associated private key. But we have to make sure that we trust the public key that we have received, and so we encapsulate the public key into a digital certificate, which is then signed with the private key of a trusted authority. If we trust the authority, then we will trust the public key in the digital certificate.
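The idea of proving an identity by signing a hash can be sketched with a simple challenge and response. This is a minimal illustration, assuming ECDSA over P-256 and SHA-256 (PKI deployments may equally use RSA or other algorithms), and the function names newKey, prove and check are made up for this example. Alice sends a random challenge, Bob signs its hash with his private key, and Alice checks the signature with Bob's public key.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// newKey generates a key pair for one entity.
func newKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

// prove signs a SHA-256 hash of the challenge with the private key.
func prove(priv *ecdsa.PrivateKey, challenge []byte) ([]byte, error) {
	h := sha256.Sum256(challenge)
	return ecdsa.SignASN1(rand.Reader, priv, h[:])
}

// check verifies the signature over the challenge with the public key.
func check(pub *ecdsa.PublicKey, challenge, sig []byte) bool {
	h := sha256.Sum256(challenge)
	return ecdsa.VerifyASN1(pub, h[:], sig)
}

func main() {
	bob, err := newKey()
	if err != nil {
		fmt.Println("key generation failed:", err)
		return
	}
	eve, err := newKey()
	if err != nil {
		fmt.Println("key generation failed:", err)
		return
	}

	// Alice picks a fresh random challenge so that old signatures cannot be replayed.
	challenge := make([]byte, 32)
	if _, err := rand.Read(challenge); err != nil {
		fmt.Println("challenge generation failed:", err)
		return
	}

	bobSig, err := prove(bob, challenge)
	if err != nil {
		fmt.Println("signing failed:", err)
		return
	}
	eveSig, err := prove(eve, challenge)
	if err != nil {
		fmt.Println("signing failed:", err)
		return
	}

	// Alice only holds Bob's public key.
	fmt.Println("Bob proves he is Bob:", check(&bob.PublicKey, challenge, bobSig))
	fmt.Println("Eve proves she is Bob:", check(&bob.PublicKey, challenge, eveSig))
}
```

The check only tells Alice that the signer holds the private key matching the public key she has; whether that public key really belongs to Bob is the question the certificate answers.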
For this, we have Root CAs (Certificate Authorities) and Intermediate CAs. A Root CA has the trust to allow Intermediate CAs to sign certificates within given areas, such as for TLS connections, hardware devices and encrypted disks. The public keys for these authorities are then stored on computers, where they are needed to check the validity of a certificate. In this case, Trent is the Root CA and Trent Jr is an Intermediate CA. Bob gets Trent Jr to sign for his public key, and then Bob passes the signed certificate with his public key to Alice. She then checks this with Trent Jr’s public key, and if it is verified, she can trust Bob’s public key. He can then digitally sign for his identity:
Overall, the Root CA is fundamentally required here, as Trent Jr’s certificate would not be valid if Trent Jr was not trusted by Trent. Thus, the whole house of cards of the Internet could fall, with either a major hack on the private keys of a Root CA or an adversary sending out a revocation request on Root CAs. Machines around the world would then not trust any of the certificates, and would fail to connect to services as they could not trust them.
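The dependence on the Root CA can be sketched with Go's standard crypto/x509 package. This is a minimal, hedged example: the helper names issue and trusts are hypothetical, the certificates are short-lived throwaway ones with ECDSA P-256 keys, and real CAs add many more constraints. It builds the chain Trent (root), Trent Jr (intermediate) and Bob, and then shows that Bob's certificate stops verifying as soon as Trent is no longer in the trusted set.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issue creates a certificate for name, signed with the parent's key.
// When parent is nil the certificate is self-signed, as a root would be.
func issue(name string, serial int64, isCA bool, parent *x509.Certificate,
	parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		BasicConstraintsValid: true,
		IsCA:                  isCA,
	}
	if isCA {
		tmpl.KeyUsage |= x509.KeyUsageCertSign // CAs may sign other certificates
	}
	if parent == nil {
		parent, parentKey = tmpl, key // self-signed
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// trusts reports whether leaf chains up to root through the intermediates.
// A nil root means that no root is trusted at all.
func trusts(root, leaf *x509.Certificate, intermediates ...*x509.Certificate) bool {
	roots := x509.NewCertPool()
	if root != nil {
		roots.AddCert(root)
	}
	inters := x509.NewCertPool()
	for _, c := range intermediates {
		inters.AddCert(c)
	}
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:         roots,
		Intermediates: inters,
		KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err == nil
}

func main() {
	trent, trentKey, err := issue("Trent (Root CA)", 1, true, nil, nil)
	if err != nil {
		fmt.Println("root:", err)
		return
	}
	trentJr, trentJrKey, err := issue("Trent Jr (Intermediate CA)", 2, true, trent, trentKey)
	if err != nil {
		fmt.Println("intermediate:", err)
		return
	}
	bob, _, err := issue("Bob", 3, false, trentJr, trentJrKey)
	if err != nil {
		fmt.Println("leaf:", err)
		return
	}
	fmt.Println("Bob trusted with Trent as root:", trusts(trent, bob, trentJr))
	fmt.Println("Bob trusted once Trent is removed:", trusts(nil, bob, trentJr))
}
```

Nothing about Bob's or Trent Jr's certificate changes between the two checks; only the trust in the root does.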
BGP (Border Gateway Protocol)
Now, don’t get me started on the disaster that is BGP. It has a long track record of bringing down whole domains for extended periods, such as taking Facebook offline for around six hours in October 2021. But we have been lucky: rather than taking a single domain offline for hours, a BGP failure could take the whole of the Internet offline for an extended period. No transport, no energy provision, no food supplies, and so on.
So the Internet isn’t the large-scale distributed network that DARPA tried to create, and which could withstand a nuclear strike on any part of it. At its core is a centralised infrastructure of routing devices and of centralised Internet services. The protocols it uses are basically just the ones that were drafted when we connected to mainframe computers from dumb terminals. Overall, though, a single glitch in its core infrastructure can bring the whole thing crashing to the floor. And then, if you can’t get connected to the network, you will often struggle to fix it. A bit like trying to fix your car when you have locked yourself out and don’t have the key to get in.
As BGP still provides a good part of the core of the Internet, any problems with it can cause large-scale outages. Recently, Facebook took itself off the Internet due to BGP configuration errors, and there have been multiple times when Internet traffic has been “tricked” to take routes through countries which do not have a good track record for privacy.
BGP does the core of routing on the Internet and works by defining autonomous systems (ASs). Each AS is identified with an ASN (Autonomous System Number) and keeps routing tables, which allow the ASs to pass data packets between themselves and thus route between them. Each AS then advertises the network ranges that it can reach. Thus the Facebook AS can advertise to other ASs that it exists and that packets can be routed to it. When the Facebook outage happened, the Facebook AS failed to advertise its presence. Facebook’s ASN is AS32934 and covers around 270,000 IP address ranges [here].
What is BGP?
The two main interdomain routing protocols in recent history are EGP (Exterior Gateway Protocol) and BGP (Border Gateway Protocol). EGP suffers from several limitations, and its principal one is that it treats the Internet as a tree-like structure, as illustrated in Figure 1. This assumes that the structure of the Internet is made up of parents and children with a single backbone. A more typical topology for the Internet is illustrated in Figure 2. BGP is now one of the most widely accepted exterior routing protocols and has largely replaced EGP.
BGP is an improvement on EGP (the fourth version of BGP is known as BGP-4), and is defined in RFC 1771 (since replaced by RFC 4271). Unfortunately, it is more complex than EGP, but not as complex as OSPF. BGP assumes that the Internet is made up of an arbitrarily interconnected set of nodes. It then assumes the Internet connects to a number of AANs (autonomously attached networks), as illustrated in Figure 3, which create boundaries around organisations, Internet service providers, and so on. It then assumes that, once they are in the AAN, the packets will be properly routed.
Most routing algorithms try to find the quickest way through the network, whereas BGP tries to find any path through the network. Thus, the main goal is reachability instead of the number of hops to the destination. So, finding a path which is nearly optimal is a good achievement. The AAN administrator selects at least one node to be a BGP speaker and also one or more border gateways. These gateways simply route the packet into and out of the AAN. The border gateways are the routers through which packets reach the AAN.
The speaker on the AAN broadcasts its reachability information to all the networks within its AAN. This information states only whether a destination AAN can be reached; it does not describe any other metrics. An important point is that BGP is neither a distance-vector nor a link-state protocol: it is a path-vector protocol, in which each advertised route carries the complete path of systems it passes through rather than just a distance.
The BGP update packet also contains information on routes which cannot be reached (withdrawn routes), and the content of the BGP-4 update packet is:
- Unfeasible routes length (2 bytes).
- Withdrawn routes (variable length).
- Total path attribute length (2 bytes).
- Path attributes (variable length).
- Network layer reachability information (variable length). This can contain extra information, such as ‘use AAN 1 in preference to AAN 2’.
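The field layout listed above can be sketched as a small parser. This is an illustrative example rather than a full BGP implementation: the Update type and the parseUpdate and prefixes helpers are hypothetical names, path attributes are left as raw bytes, and IPv4 prefixes are assumed to use the BGP-4 encoding of one length byte (in bits) followed by only as many address bytes as that length needs.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Update holds the three variable-length sections of an UPDATE body.
type Update struct {
	Withdrawn []byte // withdrawn routes
	PathAttrs []byte // path attributes, left undecoded in this sketch
	NLRI      []byte // network layer reachability information
}

// parseUpdate splits an UPDATE body (the bytes after the fixed header)
// using the two 2-byte length fields.
func parseUpdate(body []byte) (Update, error) {
	if len(body) < 2 {
		return Update{}, errors.New("too short for unfeasible routes length")
	}
	wLen := int(binary.BigEndian.Uint16(body[:2]))
	rest := body[2:]
	if len(rest) < wLen+2 {
		return Update{}, errors.New("truncated in withdrawn routes")
	}
	withdrawn := rest[:wLen]
	rest = rest[wLen:]
	aLen := int(binary.BigEndian.Uint16(rest[:2]))
	rest = rest[2:]
	if len(rest) < aLen {
		return Update{}, errors.New("truncated in path attributes")
	}
	return Update{Withdrawn: withdrawn, PathAttrs: rest[:aLen], NLRI: rest[aLen:]}, nil
}

// prefixes decodes a list of IPv4 prefixes in (length, prefix) form.
func prefixes(b []byte) ([]string, error) {
	var out []string
	for len(b) > 0 {
		bits := int(b[0])
		n := (bits + 7) / 8 // number of address bytes actually present
		if bits > 32 || len(b) < 1+n {
			return nil, errors.New("malformed prefix")
		}
		var ip [4]byte
		copy(ip[:], b[1:1+n])
		out = append(out, fmt.Sprintf("%d.%d.%d.%d/%d", ip[0], ip[1], ip[2], ip[3], bits))
		b = b[1+n:]
	}
	return out, nil
}

func main() {
	// An UPDATE that only withdraws 192.0.2.0/24: 4 bytes of withdrawn
	// routes, no path attributes and no new reachability information.
	body := []byte{0, 4, 24, 192, 0, 2, 0, 0}
	u, err := parseUpdate(body)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	gone, err := prefixes(u.Withdrawn)
	if err != nil {
		fmt.Println("prefix error:", err)
		return
	}
	fmt.Println("withdrawn:", gone)
	fmt.Println("path attribute bytes:", len(u.PathAttrs), "NLRI bytes:", len(u.NLRI))
}
```

A mass withdrawal of this kind, covering all of an AS's prefixes, is what makes that AS disappear from everyone else's routing tables.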
Routers within ASs share similar routing policies and thus operate as a single administrative unit. All the routers outside the AS treat the AS as a single unit. The AS identification number is assigned by the Internet Assigned Numbers Authority (IANA) in the range of 1 to 65,535, where 64,512 to 65,535 are reserved for private use. The private numbers are only used within the private domain and must be translated to registered numbers when leaving the domain.
BGP and routing loops
BGP uses TCP segments on port 179 to send routing information (whereas RIP uses UDP port 520). BGP overcomes routing loops by constructing a graph of autonomous systems, based on the information exchanged between neighbours. It can thus build up a wider picture of the entire set of interconnected ASs. A keep-alive message is sent between neighbours, which allows the graph to be kept up-to-date.
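In BGP-4, one concrete way this loop avoidance shows up is the AS path carried with each route: every AS prepends its own number before passing a route on, and rejects any route whose path already contains its own number. The sketch below illustrates that check under simplifying assumptions; the function names announce and accept are hypothetical, and the AS numbers come from the range reserved for documentation.

```go
package main

import "fmt"

// announce returns the AS path that a router in localAS sends to its
// neighbours: its own AS number prepended to the path it received.
func announce(localAS uint16, path []uint16) []uint16 {
	return append([]uint16{localAS}, path...)
}

// accept reports whether a router in localAS should accept a route with
// this AS path. Seeing its own number means the route has looped.
func accept(localAS uint16, path []uint16) bool {
	for _, as := range path {
		if as == localAS {
			return false
		}
	}
	return true
}

func main() {
	// Three ASs connected in a ring: 64496 -> 64497 -> 64498 -> 64496.
	path := announce(64496, nil) // 64496 originates the route
	fmt.Println("64497 accepts:", accept(64497, path))

	path = announce(64497, path)
	fmt.Println("64498 accepts:", accept(64498, path))

	path = announce(64498, path)
	fmt.Println("path offered back to 64496:", path)
	fmt.Println("64496 accepts:", accept(64496, path))
}
```

The route is dropped when it arrives back at its origin, so it never circulates around the ring.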
Single-homed systems
ASs which have only one exit point are defined as single-homed systems, and are often referred to as stub networks. These stubs can use a default route to handle all the network traffic destined for non-local networks.
There are three methods that an AS can use so that the outside world can learn the addresses within the AS:
- Static configuration. For this, an Internet access provider could list the customer’s networks as static entries within its own router. These would then be advertised to other routers connected to its Internet core. This approach could also be used with a CIDR approach which aggregates the routes.
- Use an Interior Gateway Protocol (IGP) on the link. For this, an Internet access provider could run an IGP on the single connection, which can then be used to advertise the connected networks. This method allows for a more dynamic approach than static configuration. A typical IGP is OSPF.
- Use an Exterior Gateway Protocol (EGP) on the link. An EGP can be used to advertise the networks. If the connected AS does not have a registered AS number, the Internet access provider can assign it one from a private pool of AS numbers (64,512 to 65,535), and then strip off the private numbers when advertising the AS to the core of the Internet.
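The stripping of private AS numbers in the last method can be sketched as a simple filter over an AS path. This is an illustrative example with hypothetical function names, and it uses the private range exactly as quoted above; note that RFC 6996 reserves 65,535 itself, so a production check may want to stop at 65,534.

```go
package main

import "fmt"

// Private AS number range as quoted in the text above.
const (
	privateLow  = 64512
	privateHigh = 65535
)

// isPrivate reports whether an AS number falls in the private pool.
func isPrivate(as uint32) bool {
	return as >= privateLow && as <= privateHigh
}

// stripPrivate returns the AS path with all private AS numbers removed,
// as a provider would do before advertising routes to the Internet core.
func stripPrivate(path []uint32) []uint32 {
	out := make([]uint32, 0, len(path))
	for _, as := range path {
		if !isPrivate(as) {
			out = append(out, as)
		}
	}
	return out
}

func main() {
	// Hypothetical provider AS 64496 with a customer using private AS 64512.
	learned := []uint32{64496, 64512}
	fmt.Println("path inside the provider:", learned)
	fmt.Println("path advertised to the core:", stripPrivate(learned))
}
```

From the outside, the customer's networks then appear to originate from the provider's registered AS.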
Multi-homed systems
A multi-homed system has more than one exit point from the AS. As it has more than one exit point, it could support the routing of data across the exit points. A system which does not support the routing of traffic through the AS is named a non-transit AS. Non-transit ASs, thus, will only advertise their own routes to the Internet access providers, as they do not want any routing through them. One Internet provider could still force traffic through the AS if it knows that routing through the AS is possible. To overcome this, the AS would set up filtering to stop any of this routed traffic.
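The advertising policy of a non-transit AS can be sketched as an export filter. This is a simplified illustration: the Route type and the export function are hypothetical, and a route is treated as "our own" when its AS path is empty, meaning it was originated inside this AS rather than learnt from a neighbour.

```go
package main

import "fmt"

// Route is a prefix together with the AS path it was learnt over.
// An empty path marks a route originated inside this AS.
type Route struct {
	Prefix string
	ASPath []uint32
}

// export selects the routes to advertise to a provider. A transit AS
// advertises everything it knows; a non-transit AS only its own routes.
func export(routes []Route, transit bool) []Route {
	var out []Route
	for _, r := range routes {
		if transit || len(r.ASPath) == 0 {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	table := []Route{
		{Prefix: "192.0.2.0/24"},                             // our own network
		{Prefix: "198.51.100.0/24", ASPath: []uint32{64496}}, // learnt from provider A
	}
	for _, r := range export(table, false) {
		fmt.Println("non-transit AS advertises:", r.Prefix)
	}
	fmt.Println("transit AS would advertise", len(export(table, true)), "routes")
}
```

Because provider B never hears about the route learnt from provider A, it has no reason to send traffic for that prefix through this AS.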
Multi-homed transit systems have more than one connection to an Internet access provider and also allow traffic to be routed through them. They route this traffic by running BGP internally so that multiple border routers in the same AS can share BGP information. Along with this, routers can forward BGP information from one border router to another. BGP running inside the AS is named Internal BGP (IBGP), while it is known as External BGP (EBGP) if it is running between ASs. The routers which define the boundary between the AS and the Internet access provider are known as border routers, while routers running internal BGP are known as transit routers.
BGP specification
Border Gateway Protocol (BGP) is an inter-Autonomous System routing protocol (exterior routing protocol), which builds on EGP. The main function of a BGP-based system is to communicate network reachability information with other BGP systems. Initially, two systems exchange messages to open and confirm the connection parameters and then transmit the entire BGP routing table. After this, incremental updates are sent as the routing tables change.
Each message has a fixed-size header and may or may not be followed by a data portion. The fields are:
- Marker. Contains a value that the receiver of the message can predict. It can be used to detect a loss of synchronization between a pair of BGP peers, and to authenticate incoming BGP messages. 16 bytes.
- Length. Indicates the total length, in bytes, of the message, including the header. It must always be greater than 18 and no greater than 4096. 2 bytes.
- Type. Indicates the type of message: 1 (OPEN), 2 (UPDATE), 3 (NOTIFICATION) or 4 (KEEPALIVE). 1 byte.
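The three header fields above add up to 19 bytes, and a small encoder and parser makes the layout concrete. This is a sketch under stated assumptions: the names Header, encodeHeader and parseHeader are hypothetical, and the Marker is assumed to be all ones, which is the predictable value used when no authentication is in place.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

const headerLen = 19 // 16-byte Marker + 2-byte Length + 1-byte Type

var typeNames = map[uint8]string{1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

// Header holds the decoded Length and Type fields.
type Header struct {
	Length uint16 // total message length, including the header
	Type   uint8
}

// encodeHeader builds a header for a message with bodyLen bytes of data.
func encodeHeader(msgType uint8, bodyLen int) []byte {
	buf := bytes.Repeat([]byte{0xFF}, headerLen) // Marker: all ones
	binary.BigEndian.PutUint16(buf[16:18], uint16(headerLen+bodyLen))
	buf[18] = msgType
	return buf
}

// parseHeader validates the Marker, Length and Type fields.
func parseHeader(b []byte) (Header, error) {
	if len(b) < headerLen {
		return Header{}, errors.New("header too short")
	}
	if !bytes.Equal(b[:16], bytes.Repeat([]byte{0xFF}, 16)) {
		return Header{}, errors.New("unexpected marker: peers out of sync")
	}
	h := Header{Length: binary.BigEndian.Uint16(b[16:18]), Type: b[18]}
	if h.Length < headerLen || h.Length > 4096 {
		return Header{}, fmt.Errorf("invalid length %d", h.Length)
	}
	if _, ok := typeNames[h.Type]; !ok {
		return Header{}, fmt.Errorf("unknown message type %d", h.Type)
	}
	return h, nil
}

func main() {
	// A KEEPALIVE has no data portion, so it is just the 19-byte header.
	raw := encodeHeader(4, 0)
	h, err := parseHeader(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s message, %d bytes\n", typeNames[h.Type], h.Length)
}
```

Running this prints "KEEPALIVE message, 19 bytes".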
OPEN message
The OPEN message is the first message sent after a connection has been made. A KEEPALIVE message is sent back confirming the OPEN message. After this the UPDATE, KEEPALIVE, and NOTIFICATION messages can be exchanged.
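The message ordering described above can be sketched as a tiny state machine, seen from the side that has just sent its OPEN. This is a deliberate simplification with hypothetical names: the full BGP state machine has more states and timers, and here a NOTIFICATION is simply treated as ending the session.

```go
package main

import "fmt"

// Message type codes from the fixed-size header.
const (
	msgOpen         = 1
	msgUpdate       = 2
	msgNotification = 3
	msgKeepalive    = 4
)

// State is the simplified session state of one peer.
type State string

const (
	OpenSent    State = "OpenSent"    // OPEN sent, waiting for confirmation
	Established State = "Established" // routing information may be exchanged
	Closed      State = "Closed"      // session ended
)

// next returns the state after receiving a message of the given type,
// or an error if that message is not expected in the current state.
func next(s State, msg uint8) (State, error) {
	switch {
	case msg == msgNotification && s != Closed:
		return Closed, nil // an error report ends the session
	case s == OpenSent && msg == msgKeepalive:
		return Established, nil // KEEPALIVE confirms the OPEN
	case s == Established && (msg == msgUpdate || msg == msgKeepalive):
		return Established, nil
	}
	return s, fmt.Errorf("unexpected message type %d in state %s", msg, s)
}

func main() {
	s := OpenSent
	for _, m := range []uint8{msgKeepalive, msgUpdate, msgKeepalive, msgNotification} {
		var err error
		s, err = next(s, m)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("received type %d -> %s\n", m, s)
	}
}
```

An UPDATE that arrives before the OPEN has been confirmed is rejected by this sketch, which mirrors the ordering in the text.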
Figure 4 shows the extra information added to the fixed-size BGP header. It has the following fields:
- Version. Indicates the protocol version number of the message. Typical values are 2, 3 or 4. 1 byte.
- My Autonomous System. Identifies the sender’s Autonomous System number. 2 bytes.
- Hold Time. Indicates the maximum number of seconds that can elapse between the receipt of successive KEEPALIVE and/or UPDATE and/or NOTIFICATION messages. 2 bytes.
- Authentication Code. Indicates the authentication mechanism being used. This should define the form and meaning of the Authentication Data and the algorithm for computing values of Marker fields.
- Authentication Data. A variable-length field whose form and meaning depend on the Authentication Code.
Figure 4: BGP message header and BGP OPEN message data
BGP configuration
BGP configuration commands are similar to those used for RIP (Routing Information Protocol). To configure the router to support BGP, the following commands are used:
RouterA# config t
RouterA(config)# router bgp AS-number
With IGPs, such as RIP, the network command defines the networks on which routing table updates are sent. For BGP a different approach is used to define the relationship between networks. This is [here]:
RouterA# config t
RouterA(config)# router bgp AS-number
RouterA(config-router)# network network-number [mask network-mask]
where the network command defines which locally learnt networks to advertise. These networks could have been learnt from other protocols, such as RIP. An optional mask can be used with the network command to specify individual subnets. With BGP, neighbours must establish a relationship, and for this the following is used:
RouterA# config t
RouterA(config)# router bgp AS-number
RouterA(config-router)# network network-number [mask network-mask]
RouterA(config-router)# neighbor ip-address remote-as AS-number
which defines the IP address of a connected BGP-based router, along with its AS number.
Conclusions
At its core, the Internet is not a decentralised infrastructure. It is fragile and open to human error and adversarial attacks. Too much of our time is spent on making our services work and very little on making them robust. We need to spend more time looking at scenarios and how to mitigate them. Previously, it was Facebook taking itself offline; the next time, it could be a nation-state bringing down a whole country … and it is likely to have a devastating effect.
Now … I have set up more Cisco challenges for BGP for you, so go and learn more about BGP configuration here:
Or learn about PKI here: