The Truth About Routers and Switches
Check out this excerpt from Ben Piper's "Learn Cisco Network Administration in a Month of Lunches" and learn more about routers and switches.
"What do routers and switches actually do?"
"Why do devices have both MAC and IP addresses?"
These seemingly simple questions don’t have a straightforward answer. I’ve seen a lot of attempts to answer these questions in a few sentences, and all such attempts invariably cause more confusion than they clear up.
The truth is that routers and switches were born out of necessity rather than practicality. In principle, neither device is particularly elegant or clever, although Cisco has done some clever things to make them perform better. Like most technologies, routers and switches came about because of questionable decisions that were made decades ago.
Later technology is usually built on earlier technology. For instance, e-books borrow concepts such as “pages” and “bookmarks” from traditional printed books. Imagine explaining the “page” concept to someone who is used to reading scrolls but has never seen a traditional printed book. How would you do it? Before you can explain what a “page” is, you have to explain why pages exist in the first place.
Similarly, before I can explain what a router or a switch is, I have to briefly explain what problems each was designed to solve.
A long time ago, some folks decided that all network devices would uniquely identify each other using something called a media access control (MAC) address. A MAC address is 48 bits long and is represented as a string of hexadecimal numbers, like this: 0800.2700.EC26. You’ve probably seen a few of these.
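To make the "48 bits written as hexadecimal" idea concrete, here is a minimal Python sketch. The function names parse_mac and format_mac are my own, chosen for illustration; they are not part of any Cisco tool or standard library.

```python
def parse_mac(text: str) -> int:
    """Convert a MAC address string such as '0800.2700.EC26' to an integer."""
    # Strip the common separators so dotted, colon, and hyphen styles all work.
    digits = text.replace(".", "").replace(":", "").replace("-", "")
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {len(digits)}")
    return int(digits, 16)


def format_mac(value: int) -> str:
    """Render a 48-bit integer in the dotted style used in this article."""
    digits = f"{value:012X}"  # 12 hex digits x 4 bits each = 48 bits
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


mac = parse_mac("0800.2700.EC26")
print(mac.bit_length() <= 48)  # True
print(format_mac(mac))         # 0800.2700.EC26
```

The dots are only a display convention. The same 48-bit value is often written with colons or hyphens instead, which is why the parser accepts all three.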
Here’s the interesting part: the manufacturer of each network device assigns it a unique MAC address at the time of manufacture. The rationale behind this is to make it possible to simply plug a device into a network and have it communicate with other devices without having to manually configure anything. That sounds noble, but there is a rub: because the manufacturer assigns the MAC address, it has no relationship to where the device will physically end up. In that sense, it’s not really an address because it can’t help you locate the device.
A MAC address works like a person’s full name. It’s assigned at birth and makes it easy to identify someone, get their attention in a crowd of people, and even send them a message by calling out their name. If we’re in a large crowd of people, and you need to communicate a message to me but have no idea where I am, you could get on a bullhorn and yell, “Ben Piper, where are you?” If I’m in that crowd, I’ll receive your message.
Network devices communicate with each other in a similar fashion, but instead of using full names, they use MAC addresses. Suppose that my computer has a MAC address of 0800.2700.EC26, and it needs to print to a network printer named Monoprint with the MAC address 0020.3500.CE26. My computer and the printer have a physical connection to a device called a switch as illustrated in Figure 1. Specifically, my computer and the printer are physically connected to individual Ethernet ports on the switch. In this sense, a switch is like a gathering place for network devices. Just like you and I might gather together with others in a crowded outdoor marketplace, network devices gather together on a switch. This collection of connected devices is called a local area network (LAN).
Figure 1. Computer and two printers connected to a switch
But here’s the problem: my computer doesn’t know where Monoprint is, or if it’s even a part of the LAN — the “crowd” of devices connecting to the switch. MAC addresses, like full names, make for good identifiers, but they’re lousy at telling you exactly where a device is. Because of this, my computer has to get on its “bullhorn” and call out to Monoprint using its MAC address.
The Ethernet Frame: A Big Envelope
My computer creates an Ethernet frame containing its own MAC address as the source, and the printer’s MAC address as the destination. Think of the Ethernet frame as the big envelope in Figure 2 with a return address and a destination address.
Figure 2. An Ethernet frame contains a source and destination MAC address.
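If it helps to see the envelope as a data structure, here is a simplified Python sketch. The class and field names are hypothetical, and a real Ethernet frame carries additional fields (such as a type field and a checksum) that I have left out to keep the focus on the two addresses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EthernetFrame:
    """A simplified frame: just the two addresses and the data being carried."""
    src_mac: str    # the "return address" on the envelope
    dst_mac: str    # the "destination address" on the envelope
    payload: bytes  # the contents, for example a print job


frame = EthernetFrame(
    src_mac="0800.2700.EC26",  # my computer
    dst_mac="0020.3500.CE26",  # Monoprint
    payload=b"print job",
)
print(frame.dst_mac)  # 0020.3500.CE26
```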
My computer places the data it wants to send — in this case, a print job — inside the “big envelope” and sends it to the switch. The switch receives the frame and looks at the destination printer’s MAC address. Initially, the switch has no idea which port the printer is connected to, so it sends the frame to every other device plugged into the switch in order to get it to the one device that needs it — the printer. This is called flooding.
Figure 3. Ethernet frame flooding. In step 1, my computer sends an Ethernet frame addressed to Monoprint’s MAC address (0020.3500.ce26). In step 2, the switch floods this frame to all other connected devices.
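Flooding itself is a very simple rule, and a few lines of Python can sketch it. The port names below are made up for illustration; the only thing taken from the text is that the frame goes to every other device plugged into the switch.

```python
def flood(ports: list[str], ingress_port: str) -> list[str]:
    """Return the ports a frame is copied to when the switch floods it."""
    # A flooded frame goes out every port except the one it arrived on.
    return [port for port in ports if port != ingress_port]


# Hypothetical port names for the three devices in Figure 1.
ports = ["Fa0/1", "Fa0/2", "Fa0/3"]
print(flood(ports, ingress_port="Fa0/1"))  # ['Fa0/2', 'Fa0/3']
```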
When Everybody Talks, Nobody Listens
Flooding has an effect similar to that of blasting a bullhorn into a large crowd. Everyone hears you, but for that moment, people in the crowd cannot hear each other. You effectively stop their communication, at least momentarily. Even after you stop bellowing into your bullhorn, it takes a bit of time for the people in the crowd to process your message and realize that they were not the intended recipient. The same thing happens when a switch floods or sends a message to all devices. Those devices won’t be able to “hear” any other devices until the flood is over. And even then, they must process the message to determine whether they need to do anything with it. This phenomenon is called an interrupt.
Although a few flooded frames and interrupts here and there might seem negligible, consider what would happen in a crowd of, say, 1,000 people who each have a bullhorn. Just as you’re ready to get on your bullhorn and send a message to me, someone right next to you gets on his bullhorn and yells out to someone else. After your ears stop ringing, you raise your bullhorn again, only to be once again interrupted by someone else. Eventually, you might get enough of a break to get your message out. But that’s the problem. You’re competing with others for the use of a shared medium — the air. This “one-to-many” communication method makes it difficult to relay a message to a specific person in a timely manner. And the larger the crowd, the worse the problem becomes.
On a LAN with a few devices, flooding is not a problem. On a LAN with hundreds or thousands of devices, it is. But that raises another problem. A network that can’t connect thousands of devices is virtually useless.
Suppose you add another switch named Switch2 to the network topology and connect a database server to it, as shown in Figure 4. When my computer sends a frame to the server’s MAC address, Switch1 will flood (and interrupt) every device connected to its ports — including Switch2! Switch2 in turn floods the frame to every other device. In this case, the database server is the only other device connected to Switch2.
Figure 4. Second switch extending the broadcast domain. In step 1, my computer sends a frame addressed to the database server’s MAC address (00db.dbdb.5010). In step 2, Switch1 floods this frame to all of its connected devices. Finally, in step 3, Switch2 floods the frame to the database server.
All of these devices that receive the frame are members of the same broadcast domain. A broadcast domain isn’t a thing or a directly configurable setting, but rather an emergent property of a network. To better understand this, consider the following analogy.
When you stand alone in the middle of a street, you are not a crowd. But as a few people gather around you, you become part of a small crowd. As more people gather around you, you become part of a larger crowd. You don’t change, but you become part of a crowd by virtue of how many others gather around you. Similarly, a device becomes part of a broadcast domain by virtue of which devices it receives flooded frames from.
Closing the Floodgates: The MAC Address Table
Flooding is an inevitable side effect of using MAC addresses. Fortunately, switches use a neat little trick to mitigate unnecessary flooding. Whenever a switch receives a frame, it looks at the source MAC address and the switch port it came in on. It uses this information to build a MAC address table.
When Switch1 receives a frame from my computer, it takes note of the source MAC address – 0800.2700.ec26 – as well as the switch port the frame came in on – FastEthernet0/1. It adds this information to its MAC address table as shown in Table 1.
Table 1. Switch1’s MAC address table
Now suppose the database server sends a frame addressed to my computer’s MAC address. The frame reaches Switch2 which in turn forwards it on to Switch1. But this time, instead of blindly flooding the frame to all other devices, Switch1 checks its MAC address table.
It sees that the destination MAC address – 0800.2700.ec26 – is on FastEthernet0/1, so it sends that frame only out of that specific port. This works similarly to an old telephone switchboard, which is where the term switch comes from.
Figure 5. How the MAC address table mitigates flooding. In step 1, the database server sends a frame to my computer’s MAC address (0800.2700.ec26). In step 2, Switch2 floods the frame to Switch1. In step 3, Switch1 consults its MAC address table and finds a corresponding entry for the destination MAC. In step 4, Switch1 sends the frame only to my computer instead of flooding it to all devices.
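The learn-then-forward behavior described above can be sketched as a toy Python class. This is an illustration under assumptions, not how Cisco implements it: the class name is my own, only FastEthernet0/1 comes from the text, and I am assuming Switch2 hangs off FastEthernet0/3.

```python
class Switch:
    """A toy switch that learns source MACs and floods unknown destinations."""

    def __init__(self, ports: list[str]) -> None:
        self.ports = ports
        self.mac_table: dict[str, str] = {}  # MAC address -> port

    def receive(self, src_mac: str, dst_mac: str, ingress_port: str) -> list[str]:
        """Process one frame and return the ports it is sent out of."""
        # Learn: the source MAC is reachable through the ingress port.
        self.mac_table[src_mac] = ingress_port

        known_port = self.mac_table.get(dst_mac)
        if known_port == ingress_port:
            return []  # destination is on the port the frame came from
        if known_port is not None:
            return [known_port]  # send out of that one port only

        # Unknown destination: flood out every port except the ingress port.
        return [p for p in self.ports if p != ingress_port]


# FastEthernet0/2 and FastEthernet0/3 are hypothetical port assignments.
switch1 = Switch(["FastEthernet0/1", "FastEthernet0/2", "FastEthernet0/3"])
PC = "0800.2700.ec26"
SERVER = "00db.dbdb.5010"

# My computer sends to the server: destination unknown, so the frame floods.
print(switch1.receive(PC, SERVER, "FastEthernet0/1"))
# ['FastEthernet0/2', 'FastEthernet0/3']

# The server's frame arrives via Switch2: the table already has my computer.
print(switch1.receive(SERVER, PC, "FastEthernet0/3"))
# ['FastEthernet0/1']
```

Notice that the switch never asks anyone where a device is. It only pays attention to where frames come from, and the table fills in as a side effect of normal traffic.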
Breaking Up the Broadcast Domain
As the size of the broadcast domain grows, communication becomes more difficult. Consequently, a broadcast domain containing hundreds of devices performs poorly. But modern organizations require network connectivity among thousands of devices. And just having connectivity isn’t good enough. The network still has to be fast and reliable.
The solution is to limit the size of the broadcast domain. This means breaking it into multiple, small broadcast domains that, somehow, can still communicate with each other.
Going back to our example, the simplest way to split the broadcast domain is to remove the Ethernet cable connecting Switch1 and Switch2, as shown in Figure 6. Note that the switches are not connected in any way. That’s the easy part. Here’s the hard part: My computer and the database server reside on two separate broadcast domains. There is no way my computer and the server can communicate. What do you do? You can’t just plug the switches back together because that would re-create the original, single broadcast domain.
Figure 6. Two broadcast domains
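Because a broadcast domain is an emergent property rather than a setting, one way to picture it is as "everything a flooded frame can reach by following cables." The Python sketch below computes that for the example topology, with and without the cable between the switches. The function names and node labels are my own, and the sketch assumes that only switches have more than one cable plugged in.

```python
from collections import deque


def cable(a: str, b: str) -> frozenset[str]:
    """An undirected cable between two nodes."""
    return frozenset((a, b))


def broadcast_domain(links: set[frozenset[str]], start: str) -> set[str]:
    """Return every node a flooded frame from `start` could reach."""
    # Build an adjacency map from the cable list.
    neighbors: dict[str, set[str]] = {}
    for link in links:
        a, b = tuple(link)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first walk: follow cables until no new node turns up.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbors.get(node, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


links = {
    cable("my-computer", "Switch1"),
    cable("Monoprint", "Switch1"),
    cable("Switch1", "Switch2"),
    cable("database-server", "Switch2"),
}

# One broadcast domain while the switches are cabled together.
print("database-server" in broadcast_domain(links, "my-computer"))  # True

# Unplug the inter-switch cable and the domain splits in two.
split = links - {cable("Switch1", "Switch2")}
print("database-server" in broadcast_domain(split, "my-computer"))  # False
```

Nothing about my computer changed between the two calls. Only the cabling did, which is exactly why the broadcast domain isn't something you configure directly.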
Joining Broadcast Domains
In order to join two broadcast domains together without encountering that nasty flooding problem, two things must happen:
First, since both broadcast domains are physically disconnected, you need a special device to physically connect them in such a way that flooded frames cannot cross the broadcast domain boundary. Since frames contain source and destination MAC addresses, this device will effectively “hide” the MAC addresses in one broadcast domain from the MAC addresses in the other.
Second, since MAC addresses in one broadcast domain will be hidden from another, we need a different scheme to address devices across multiple broadcast domains. This new addressing scheme, unlike MAC addresses, must not only be able to uniquely identify devices across broadcast domains, it must also provide some clues as to which broadcast domain each device resides in.
Let’s start with the latter.
Addressing Devices Across Broadcast Domains
The addressing scheme has to meet some requirements: First, the addresses have to be unique across broadcast domains. A device in one broadcast domain can’t have the same address as another device. Second, the address has to tell us all by itself which broadcast domain it is a part of. The address should not only be able to uniquely identify a device, but also tell other devices which broadcast domain it resides in. This is to avoid that ugly flooding problem. Third, the addresses cannot be “assigned at birth” like MAC addresses. They have to be configurable by you, the network administrator.
Fortunately, you don’t have to look very far. Such an addressing scheme already exists, and you’re already using it.
Internet Protocol (IP) Addresses
You already know what an IP address looks like. One of the most common IP addresses is 192.168.1.1. It’s written as a series of four octets separated by dots, and each octet can range from 0 to 255.
You’ve probably seen these 192.168.x.x addresses pop up in a variety of places. That’s because 192.168.x.x addresses are reserved for use on private networks like your home or business. Because many networks reuse them, they’re not globally unique and they’re not reachable via the public Internet. But you can still use them to address devices on your own internal networks.
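If you want to poke at these properties yourself, Python's standard ipaddress module is a convenient way to do it. This is just a quick sketch for exploration; the variable name is arbitrary.

```python
import ipaddress

addr = ipaddress.ip_address("192.168.1.1")

# The four octets are the four bytes of a 32-bit number.
print(list(addr.packed))  # [192, 168, 1, 1]

# 192.168.x.x falls inside a range reserved for private networks.
print(addr.is_private)    # True

# An octet outside 0-255 is rejected.
try:
    ipaddress.ip_address("192.168.1.256")
except ValueError as exc:
    print("invalid:", exc)
```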
Unlike a MAC address, you can assign any IP address to whatever device you like. You can create your own addressing scheme based on where devices are, not just what they are. Let’s look at an example.
Where Are You?
Devices connected to Switch1 are in broadcast domain 1 and devices connected to Switch2 are in broadcast domain 2. You could, then, assign a 192.168.1.x address to all devices connected to Switch1, and a 192.168.2.x address to devices connected to Switch2. Even without looking at Figure 7, just knowing the IP addresses makes it painfully obvious which broadcast domain each device resides in.
Figure 7. Each device has an IP address that corresponds to its broadcast domain.
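Here is a small Python sketch of the "address tells you the broadcast domain" idea. I am assuming that 192.168.1.x and 192.168.2.x can be modeled as the networks 192.168.1.0/24 and 192.168.2.0/24. The function name, the labels, and the server address 192.168.2.50 are hypothetical.

```python
import ipaddress

# Assumption: "192.168.1.x" and "192.168.2.x" are modeled as /24 networks.
DOMAINS = {
    "broadcast domain 1 (Switch1)": ipaddress.ip_network("192.168.1.0/24"),
    "broadcast domain 2 (Switch2)": ipaddress.ip_network("192.168.2.0/24"),
}


def domain_of(ip: str) -> str:
    """Name the broadcast domain an address belongs to, using the address alone."""
    address = ipaddress.ip_address(ip)
    for name, network in DOMAINS.items():
        if address in network:
            return name
    return "unknown"


print(domain_of("192.168.1.10"))  # broadcast domain 1 (Switch1)
print(domain_of("192.168.2.50"))  # broadcast domain 2 (Switch2)
```

Compare that with a MAC address, where no amount of staring at the digits will tell you which switch the device is plugged into.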
At this point, there’s still no connectivity between the two broadcast domains, so devices can communicate only within their own broadcast domain. That raises a question: now that each device has both an IP address and a MAC address, which one will it use to communicate within its broadcast domain?
The IP vs. MAC Dilemma
“Why don’t we just use IP addresses instead of MAC addresses?” is a common refrain among IT professionals trying to learn networking. It’s a good question.
After all, MAC addresses aren’t very friendly. They’re hard to remember, mostly meaningless, and difficult (if not impossible) to change. IP addresses, on the other hand, are easy to remember, easy to change, and can be very meaningful with regard to location and function. The winner here is obvious.
So why don’t we just use IP and get rid of MAC addresses altogether? The answer is simple, but a bit disturbing.
Network devices within a broadcast domain still have to communicate using MAC addresses. This is a requirement of the Ethernet standard that has been around for decades. Assigning IP addresses doesn’t change that. Sure, someone could come along and create a new standard that makes MAC addresses unnecessary, but that would require replacing every single device on your network.
In short, MAC addresses are here to stay. That’s the bad news. The good news is that we don’t have to think about them, at least not very often.
Address Resolution Protocol (ARP)
Remembering both MAC and IP addresses is inefficient and wasteful. That’s why almost all networked applications use the IP address and completely ignore the MAC address. Thanks to the address resolution protocol (ARP), you can adopt this same approach.
ARP provides a clever way to map or resolve IP addresses to MAC addresses. The advantage of ARP is that it lets you use human-friendly IP addresses without even having to think about MAC addresses. All network devices made since the mid-1980s use ARP by default, so you don’t need to manually configure it.
Suppose that my computer needs to send another print job to Monoprint. Both devices are in the same broadcast domain, so they’re still going to talk using their MAC addresses. But you, as a de facto network administrator, don’t want to even think about MAC addresses. So you configure my computer to print to Monoprint’s IP address — 192.168.1.20.
Figure 8 illustrates how ARP works. My computer sends what’s called an ARP request to figure out Monoprint’s MAC address. The request says, “This is 192.168.1.10 and my MAC is 0800.2700.EC26. Who has 192.168.1.20?” My computer stuffs this ARP request inside an Ethernet frame and sends it to a special broadcast MAC address, FFFF.FFFF.FFFF.
Remember that all network devices must use MAC addresses to communicate. In order for my computer to get the ARP request to all devices on the network, it has to address the Ethernet frame to some MAC address. It can’t just send it to a blank address. So it sends the ARP request to the broadcast MAC address. Each device “listens” for the broadcast address in addition to listening for its own MAC address. This ensures that every device on the network pays attention to every ARP request.
The switch floods this frame out all ports, including the port Monoprint is connected to. Monoprint receives the frame, peeks inside, and sees the ARP request. Monoprint sees the query, “Who has 192.168.1.20?” and thinks, “Oh, that’s my IP address!” Monoprint then sends an ARP reply back to my computer: “This is 192.168.1.20. My MAC address is 0020.3500.CE26.” Bingo. My computer now knows Monoprint’s MAC address, and will communicate with it using that.
Figure 8. Address resolution protocol (ARP) request and reply. In step 1, my computer sends an ARP request to the broadcast MAC address (FFFF.FFFF.FFFF). In step 2, Monoprint sends back an ARP reply containing its IP address (192.168.1.20). Finally, in step 3, my computer sends the print job to Monoprint’s MAC address.
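The request-and-reply exchange can be sketched as a toy simulation in Python. This is a deliberately simplified model, not a real ARP implementation: the class and function names are my own, the "LAN" is just a list of hosts, and the second printer's addresses are invented for the example.

```python
from typing import Optional

BROADCAST_MAC = "FFFF.FFFF.FFFF"


class Host:
    """A toy network device with one IP address and one MAC address."""

    def __init__(self, name: str, ip: str, mac: str) -> None:
        self.name = name
        self.ip = ip
        self.mac = mac
        self.arp_cache: dict[str, str] = {}  # IP address -> MAC address

    def accepts(self, dst_mac: str) -> bool:
        """A host listens for its own MAC and for the broadcast MAC."""
        return dst_mac in (self.mac, BROADCAST_MAC)

    def handle_arp_request(
        self, sender_ip: str, sender_mac: str, target_ip: str
    ) -> Optional[tuple[str, str]]:
        """Reply with (ip, mac) if the request asks for this host's IP."""
        if target_ip != self.ip:
            return None  # "Not my IP address", so stay quiet.
        # Remember who asked, so the reply can go straight back to them.
        self.arp_cache[sender_ip] = sender_mac
        return (self.ip, self.mac)


def arp_resolve(sender: Host, target_ip: str, lan: list[Host]) -> Optional[str]:
    """Broadcast an ARP request on the LAN and return the resolved MAC, if any."""
    for host in lan:
        # The request is addressed to the broadcast MAC, so every other
        # host on the LAN accepts the frame and looks inside.
        if host is sender or not host.accepts(BROADCAST_MAC):
            continue
        reply = host.handle_arp_request(sender.ip, sender.mac, target_ip)
        if reply is not None:
            reply_ip, reply_mac = reply
            sender.arp_cache[reply_ip] = reply_mac
            return reply_mac
    return None


pc = Host("my computer", "192.168.1.10", "0800.2700.EC26")
monoprint = Host("Monoprint", "192.168.1.20", "0020.3500.CE26")
other = Host("second printer", "192.168.1.21", "0020.3500.CE27")  # hypothetical

print(arp_resolve(pc, "192.168.1.20", [pc, monoprint, other]))  # 0020.3500.CE26
```

After the call, my computer's cache maps 192.168.1.20 to Monoprint's MAC address, so the next print job doesn't need another broadcast.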
ARP is the secret sauce that rescues us IT professionals from having to think about MAC addresses very much. It frees us up to think in terms of friendly, meaningful IP addresses the vast majority of the time.
This article was excerpted from Learn Cisco Network Administration in a Month of Lunches by Ben Piper.
Published at DZone with permission of Ben Piper. See the original article here.