In the previous posts about the SCTP protocol, I promised a separate article about multi-homing. I think we have covered most of the basic topics and now it is time to review this killer feature. The behaviour of a multi-homed SCTP node is scattered around RFC 4960 and in this post I will present the most important aspects.
I think Section 6.4 has the best definition for multi-homing:
An SCTP endpoint is considered multi-homed if there are more than one transport address that can be used as a destination address to reach that endpoint.
What does this mean in practice? In fig. 1, for example, host A and host B have two routes between each other. With multi-homing you can add both paths to the association, and if one of them fails all the traffic is transparently redirected to the other.
figure 1: Multi-homing example
During association initialisation each endpoint may announce a list of additional IP addresses that can be used for communication. Remember that the port number is the same for all addresses in this list: an endpoint can't use more than one port number within an association. One of the paths between the endpoints is considered PRIMARY. The others are used either when the primary goes down or when the upper layer explicitly requests that a message be sent to another IP address. In the following sections we will review each phase of the association's life and describe what happens when multi-homing is used.
Obtaining peer's IP addresses
If you need a refresher about SCTP association initialisation check this post. Section 5.1.2 explains how an SCTP endpoint obtains the list of IP addresses of its peer. There are three possibilities, depending on the parameters included in the INIT/INIT ACK chunk:
No Host Name Address, IPv4 Address or IPv6 Address parameters present
There is no multi-homing in this scenario. The SCTP stack saves only the source IP address and port of the sender of the INIT/INIT ACK chunk.
Host Name Address parameter present
In this case there should be only one Host Name Address parameter. More than one is considered an error, and any additional IPv4 Address or IPv6 Address parameters should be ignored. This constraint is defined in multiple places - Section 5.1.2 (subclause B), NOTE 3 in Section 3.3.2 and again NOTE 3 in Section 3.3.3.
Subclause B includes recommendations about when the hostname in the Host Name Address parameter should be resolved. The receiver of the INIT chunk should postpone this until it receives the COOKIE ECHO chunk, because a slow lookup (a DNS query, for example) would otherwise expose it to a resource attack. This means that it should send the INIT ACK chunk to the IP address and the port from which the INIT was received. After the hostname is resolved, only the resolved IP addresses should be used for data transfer. The receiver of an INIT ACK chunk with a Host Name Address parameter should resolve the hostname immediately and send the COOKIE ECHO chunk to the resolved IP address.
In both cases, if the hostname can't be resolved, the association initialisation should be terminated immediately with an ABORT chunk. Check subclause B if the security considerations and error handling for this case are important for you.
IPv4 Address or IPv6 Address parameters are present
Unlike the Host Name Address, these parameters can occur more than once in the INIT/INIT ACK chunk. The receiver should record all listed IPv4/IPv6 addresses AND the IP address from which the INIT/INIT ACK was received. The receiver should use only these IP addresses for any further communication with its peer. However, the INIT ACK chunk should always be sent to the IP address and port from which the INIT was received.
The sender of the INIT may additionally include the Supported Address Types parameter to specify which IP address types it supports. If the receiver can't satisfy this requirement, it should abort the association initialisation immediately.
For more information about Host Name Address, IPv4 Address and IPv6 Address you can also check Section 3.3.2.1.
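The three cases above can be sketched in a few lines of Python. This is only an illustration, not code from a real SCTP stack: the function names (encode_param, iter_params, derive_peer_addresses) are made up for this post, the input is assumed to be just the variable-length parameter part of an INIT/INIT ACK chunk, and hostname resolution is left out. The parameter type numbers (5, 6 and 11) are the ones assigned in RFC 4960.

```python
import ipaddress
import struct

# Parameter types as assigned in RFC 4960.
IPV4_ADDRESS = 5
IPV6_ADDRESS = 6
HOST_NAME_ADDRESS = 11


def encode_param(ptype, value):
    """Encode one type-length-value parameter, padded to 4 bytes."""
    length = 4 + len(value)  # the length field does not count the padding
    return struct.pack("!HH", ptype, length) + value + b"\x00" * (-length % 4)


def iter_params(blob):
    """Yield (type, value) for every parameter found in the blob."""
    offset = 0
    while offset + 4 <= len(blob):
        ptype, length = struct.unpack_from("!HH", blob, offset)
        if length < 4 or offset + length > len(blob):
            raise ValueError("malformed parameter")
        yield ptype, blob[offset + 4:offset + length]
        offset += length + (-length % 4)  # skip the padding as well


def derive_peer_addresses(blob, source_ip):
    """Return ("hostname", name) or ("addresses", [ip, ...])."""
    hostnames, addresses = [], []
    for ptype, value in iter_params(blob):
        if ptype == HOST_NAME_ADDRESS:
            hostnames.append(value.rstrip(b"\x00").decode("ascii"))
        elif ptype == IPV4_ADDRESS:
            addresses.append(str(ipaddress.IPv4Address(value)))
        elif ptype == IPV6_ADDRESS:
            addresses.append(str(ipaddress.IPv6Address(value)))
    if len(hostnames) > 1:
        raise ValueError("more than one Host Name Address parameter")
    if hostnames:
        # Any IPv4/IPv6 Address parameters are ignored in this case.
        return ("hostname", hostnames[0])
    # No hostname: the packet's source address plus every listed address.
    result = [source_ip]
    for address in addresses:
        if address not in result:
            result.append(address)
    return ("addresses", result)


blob = (encode_param(IPV4_ADDRESS, ipaddress.IPv4Address("192.168.35.11").packed)
        + encode_param(IPV4_ADDRESS, ipaddress.IPv4Address("192.168.45.11").packed))
print(derive_peer_addresses(blob, "192.168.35.11"))
# → ('addresses', ['192.168.35.11', '192.168.45.11'])
```

With an empty parameter list the function returns only the source address, which is the "no multi-homing" scenario.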
Path verification
After the association is established, each endpoint knows the IP addresses of its peer. However, a misbehaving (or malicious) endpoint may report incorrect IP addresses. To handle this, SCTP has to confirm each address before sending any data to it. This is accomplished with the path verification procedure, described in Section 5.4. Initially each endpoint has a set of confirmed addresses:
- For the client (the sender of the INIT) these are the addresses passed from the upper layer.
- For the server (the receiver of COOKIE ECHO) this is the address from which the INIT was received.
All other IP addresses are considered unconfirmed. Each of them is verified with a HEARTBEAT sent to it. When the matching HEARTBEAT ACK is received, the address is considered confirmed and can be used for data transfer.
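The bookkeeping behind this procedure can be modelled roughly as follows. It is a simplified sketch under my own assumptions: the PeerAddresses class and its method names are invented for illustration, timers and retransmission of the HEARTBEAT are left out, and the 8-byte random nonce stands in for whatever the stack puts in the Heartbeat Info field to recognise its own probe.

```python
import os


class PeerAddresses:
    """Tracks which peer addresses are confirmed (hypothetical helper)."""

    def __init__(self, confirmed, announced):
        self.confirmed = set(confirmed)
        # address -> nonce of the outstanding HEARTBEAT (None if not probed yet)
        self.pending = {a: None for a in announced if a not in self.confirmed}

    def build_heartbeat(self, address):
        """Return the nonce to place in the HEARTBEAT sent to this address."""
        nonce = os.urandom(8)
        self.pending[address] = nonce
        return nonce

    def on_heartbeat_ack(self, address, nonce):
        """Confirm the address only if the ACK echoes the nonce we sent."""
        expected = self.pending.get(address)
        if expected is None or expected != nonce:
            return False
        del self.pending[address]
        self.confirmed.add(address)
        return True

    def can_send_data(self, address):
        return address in self.confirmed


# The client's view of the server in the example network:
peer = PeerAddresses(confirmed=["192.168.35.10"],
                     announced=["192.168.35.10", "192.168.45.10"])
nonce = peer.build_heartbeat("192.168.45.10")
peer.on_heartbeat_ack("192.168.45.10", nonce)
print(peer.can_send_data("192.168.45.10"))  # → True
```

A HEARTBEAT ACK that carries the wrong nonce, or that arrives for an address that was never probed, leaves the address unconfirmed.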
After all IP addresses are derived, one of them is selected as PRIMARY and it will be the default destination for any further messages. Another address is used when the upper layer changes the primary, when the primary becomes unreachable, or when the upper layer explicitly requests that a message be sent to a specific IP address. Usually the primary path is the one which was used to send/receive the INIT chunk (as described earlier).
Each SCTP endpoint should transmit reply chunks (like HEARTBEAT ACK, SACK, etc.) to the address from which the corresponding HEARTBEAT/DATA chunk was received, when this is possible. An exception can be made when, for example, the stack sends a single SACK for multiple DATA chunks received from different addresses. In this case the reply can be sent to any one of the addresses from which the acknowledged chunks were received. Chunks which have timed out should also be retransmitted to a different active IP address, if one is available. More examples of alternative path usage can be found in Section 6.4.
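The selection rules for outgoing chunks can be summarised with two small functions. Treat this as a sketch of the logic only: the function names are invented, `active` is assumed to be an ordered list of confirmed and reachable peer addresses, and a real stack keeps far more state per path.

```python
def choose_destination(primary, active, requested=None):
    """Pick the destination for a new DATA chunk."""
    if not active:
        raise LookupError("no active destination address")
    if requested in active:   # the upper layer asked for a specific address
        return requested
    if primary in active:     # default case
        return primary
    return active[0]          # primary is down, fall back to another path


def choose_retransmit_destination(previous, active):
    """Prefer an active address other than the one that timed out."""
    for address in active:
        if address != previous:
            return address
    if previous in active:    # single-homed peer: nothing else to try
        return previous
    raise LookupError("no active destination address")


paths = ["192.168.35.11", "192.168.45.11"]
print(choose_destination("192.168.35.11", paths))             # → 192.168.35.11
print(choose_retransmit_destination("192.168.35.11", paths))  # → 192.168.45.11
```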
Each SCTP endpoint should monitor its peer's addresses via HEARTBEATs. Once an address is considered unreachable (its error counter exceeds the Path.Max.Retrans parameter), it should be marked as inactive and a notification should be sent to the upper layer. For more details about remote address monitoring check Section 8.2.
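Here is a minimal model of the per-address error counter. The PathMonitor class and the notify callback are my own invention for illustration; the default of 5 for Path.Max.Retrans is the value suggested in RFC 4960, and the timers that actually drive the timeouts are not modelled.

```python
class PathMonitor:
    """Error counting for one peer address (hypothetical helper)."""

    def __init__(self, address, path_max_retrans=5, notify=print):
        self.address = address
        self.path_max_retrans = path_max_retrans
        self.notify = notify      # stands in for the upper layer notification
        self.error_count = 0
        self.active = True

    def on_timeout(self):
        """A DATA retransmission timer or an unanswered HEARTBEAT expired."""
        self.error_count += 1
        if self.active and self.error_count > self.path_max_retrans:
            self.active = False
            self.notify(self.address, "inactive")

    def on_heartbeat_ack(self):
        """The peer answered: clear the counter and revive the path."""
        self.error_count = 0
        if not self.active:
            self.active = True
            self.notify(self.address, "active")
```

With the default settings the address stays active after five consecutive timeouts and is marked inactive on the sixth. A single HEARTBEAT ACK is enough to bring it back.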
Association termination has no specifics related to multi-homing.
Now let's see how multi-homing works in action. We will recreate the sample network setup from fig. 1 and review two cases - normal SCTP operation (when both paths remain available during the association lifetime) and primary path switching (when the link used for the primary path goes down). I use VirtualBox and Vagrant for the simulation. You can get the Vagrantfile I use to recreate the network from fig. 1. The only difference is that hosts alpha and beta are directly connected to each other (without routers), which is irrelevant for our case.
You can find the whole PCAP file here. Below I will use screenshots to show the important things. First let's see the whole communication on fig. 2.
figure 2: Normal operation
Packets 1-4 are the association initialisation, 5-25 are data transfer and heartbeats, and finally 26-28 are the association teardown. Pay attention to the IP addresses of the DATA and SACK chunks. They are always the same - 192.168.35.10 and 192.168.35.11. The reason is that 192.168.35.10 <-> 192.168.35.11 is selected as the primary path. The link remains up during the association lifetime, so there is no need to use alternative paths for data transfer. Nevertheless there are HEARTBEAT chunks transferred over the second path (192.168.45.10 <-> 192.168.45.11) to make sure it is up.
Now let's have a look at the INIT chunk on fig. 3 and the INIT ACK chunk on fig. 4. They are very similar, so I will review them together. The IP address related parameters are unfolded. The client (the sender of the INIT) announces two IP addresses to the server - 192.168.35.11 and 192.168.45.11. However, as we discussed in Path verification, only 192.168.35.11 is considered confirmed by the server, because this is the source IP address of the INIT chunk (see the IP protocol summary line on fig. 3). The server announces 192.168.35.10 and 192.168.45.10 as its IP addresses in the INIT ACK chunk. The message is sent from 192.168.35.10, so it is the only confirmed address for the client.
figure 3: INIT chunk
figure 4: INIT ACK chunk
The trace confirms the rules described in Path verification. The primary path is 192.168.35.10 <-> 192.168.35.11 and it is also the default choice for data transfer. It is not monitored with HEARTBEATs, because there are acknowledged DATA chunks transferred over it. On the second path (192.168.45.10 <-> 192.168.45.11), on the other hand, we can see occasional heartbeats, which assure the endpoint that the path is still available.
Primary path switching
This time I will simulate a failure on the link used for the primary path, which will force the SCTP stack to select another path. I simulate the failure by rejecting all incoming SCTP traffic on the Ethernet interface used by the primary path, on both the client and the server. The REJECT target answers every rejected packet with an ICMP error, which is why ICMP Destination unreachable messages appear in the trace. In my case the interface is eth1 and I block the traffic with iptables:
iptables -I INPUT -p sctp -i eth1 -j REJECT
On fig. 5 you can see the recorded PCAP file. I have added a few new columns - TSN (for DATA chunks), Cumulative TSN ACK (for SACK) and the payload itself (for DATA chunks). This will help us spot the retransmissions.
figure 5: Primary path switching
I block the SCTP traffic somewhere between packets 11 and 12. Packet 13 (ICMP Destination unreachable) is the first indication that there is something wrong with the link. Immediately a HEARTBEAT (packet 14) is sent on the other link and a HEARTBEAT ACK (packet 15) is received. This means that the second link is operational, so the server resends the lost DATA chunk over it (packet 17). Notice that the TSNs of packets 12 and 17 are the same, which indicates a retransmission. The client confirms the reception of the DATA chunk with a SACK (packet 18).
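If you export the DATA chunks from a capture, spotting such retransmissions is easy to automate. The function below is a small sketch that assumes each row is a (packet number, TSN, destination address) tuple. The sample rows are hypothetical: the packet numbers follow the description above, but the TSN values are invented and not read from the PCAP file.

```python
from collections import defaultdict


def find_retransmissions(data_packets):
    """Group DATA chunks by TSN and keep the TSNs seen more than once."""
    seen = defaultdict(list)
    for number, tsn, destination in data_packets:
        seen[tsn].append((number, destination))
    return {tsn: hits for tsn, hits in seen.items() if len(hits) > 1}


# Hypothetical rows: (packet number, TSN, destination address)
sample = [
    (10, 1001, "192.168.35.11"),
    (12, 1002, "192.168.35.11"),  # lost on the blocked link
    (17, 1002, "192.168.45.11"),  # same TSN, sent over the second link
]
print(find_retransmissions(sample))
# → {1002: [(12, '192.168.35.11'), (17, '192.168.45.11')]}
```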
Meanwhile the server continues to monitor the first link with HEARTBEATs (packets 19 and 21). The response is still ICMP Destination unreachable, so the communication continues over the second link (packets 23-28). After a while I unblock the link and we can see some acknowledged HEARTBEATs (packets 29-32). This is an indication to the SCTP stack that the PRIMARY link is up again, and we can see that the rest of the chunks are transferred over it (packets 33-37).