Exercising NAT (network address translation)

Setting up this experiment

Each experiment comes with its own set of scripts that create and configure the virtual machines. Please see the "Virtual Machine Usage" section of this document.
A summary of the scripts' use and intended execution order follows.

"vmconfigure-populate" runs first; it creates the machines.
"vmconfigure-construct-network" runs next, if it exists.
"vmconfigure-guestOS-internal-settings" runs next, if it exists. It powers the machines on and then makes their internal settings, so when this script exists "vmconfigure-poweron" is not needed.
"vmconfigure-poweron" runs next, but only if there is no "vmconfigure-guestOS-internal-settings".

Now you can use your VM(s). When you are finished:

"vmconfigure-poweroff"
"vmconfigure-destroy" if you want to delete the machines, but not if you plan to come back to them later

If you do come back to them later:

"vmconfigure-guestOS-internal-settings" should be run again, if it is present
"vmconfigure-poweron" should be run instead, if "vmconfigure-guestOS-internal-settings" is not present

Note - on taking screenshots

Below there are several junctures where you are asked to take screenshots (notations appear there in bracketed red italics). To do that, it's recommended to use a screenshot program that you may have on your host machine. Take the screenshot with the VM window open showing the activity of interest.

Alternatively, the guest virtual machine has a screenshot program. You can take a screenshot within the VM by pressing the PrintScreen key. The result is that a file is deposited on the disk. The name of the file will be something like "Screenshot from 2020-09-14 21-29-57.png" and it will be found in the "Pictures" subdirectory of the user's home directory. In the exercise, performed as student, that will be /home/student/Pictures/.

The host option is recommended in order to avoid the need to transfer the file out of the guest VM, since the VMs as provided are not equipped with interfaces that give them internet/external connectivity. It can be done and is documented on our class website but is best avoided.

1. Set up

This exercise is performed in this internetwork:

It contains two constituent networks. One of the machines belongs to both and plays the router role.

Create the machines and set up the exercise using the scripts provided for that purpose. Run, in order:

vmconfigure-populate.bat (or .sh)

vmconfigure-construct-network.bat

vmconfigure-guestOS-internal-settings.bat
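
The ordering rules summarized under "Setting up this experiment" can be sketched as a small dry-run helper. This is only an illustration, not one of the provided scripts: the function name plan_startup is hypothetical, and it assumes the .sh variants sit in the directory you pass in. It prints which scripts apply, in order, without running any of them.

```shell
# Print the start-up scripts that apply, in the order they should be run.
# $1 is the directory holding the scripts (defaults to the current directory).
plan_startup() {
  dir=${1:-.}
  if [ -e "$dir/vmconfigure-populate.sh" ]; then
    echo "vmconfigure-populate.sh"
  fi
  if [ -e "$dir/vmconfigure-construct-network.sh" ]; then
    echo "vmconfigure-construct-network.sh"
  fi
  # guestOS-internal-settings powers the machines on itself,
  # so poweron is listed only when that script is absent.
  if [ -e "$dir/vmconfigure-guestOS-internal-settings.sh" ]; then
    echo "vmconfigure-guestOS-internal-settings.sh"
  elif [ -e "$dir/vmconfigure-poweron.sh" ]; then
    echo "vmconfigure-poweron.sh"
  fi
  return 0
}

plan_startup .
```

For this exercise, where all three scripts listed above exist, the helper would list them in the same order and leave out vmconfigure-poweron.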

You might arrange your desktop layout to parallel the above diagram, like this:

Put the router and the remote machines (nodeNrouter and nodeNremote) in graphical mode (run "startx" after logging in as student). Then obtain two stacked terminal windows: open a single terminal window from the terminal icon under Activities, become root in it ( sudo su - ), and run terminator ( terminator -f & ). In terminator, press ctrl-shift-O to split the window horizontally into two stacked windows as shown. The other two machines, nodeN1 and nodeN2, can remain in character mode. Log in to them as root.

2. Network activity without NAT

The idea of NAT is that something will happen to packets that pass through the router. Namely, it will "edit" them. That isn't in effect yet. Let's do a before-and-after exploration. We will conduct two types of left-to-right interaction: ping and echo. (These are not the same thing even though both throw back something thrown to them; the echo protocol operates as a service using port 7, either tcp or udp, while ping does not. ping uses the icmp protocol to do its work.)

As a prerequisite, turn on the echo protocol in nodeNremote for both udp (by editing /etc/xinetd.d/echo-dgram) and tcp (by editing /etc/xinetd.d/echo-stream). Each file contains a line that reads:

disable = yes

in which "yes" needs to be replaced with "no". The stream editor sed can do it:

sed -i '/disable/s/yes/no/' /etc/xinetd.d/echo-dgram
sed -i '/disable/s/yes/no/' /etc/xinetd.d/echo-stream

( s/yes/no/ means search for yes and replace it with no; /disable/ restricts that to lines that contain the word disable; -i means "in-place", that is, write the change back into the file so that it sticks ). After making those changes, put them into effect:

systemctl restart xinetd
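
Before touching the real files, the sed expression can be tried on a scratch file. The sketch below is an illustration only: the file contents are a made-up stand-in for an xinetd service file, not a copy of the one on nodeNremote, and the -i option is assumed to behave as in GNU sed.

```shell
# Build a scratch file that imitates the relevant part of an xinetd service file.
scratch=$(mktemp)
printf '%s\n' 'service echo' '{' '    disable = yes' '    wait    = yes' '}' > "$scratch"

# Same expression as above: only lines containing "disable" are edited.
sed -i '/disable/s/yes/no/' "$scratch"

# Show the result: the disable line now says no, the wait line still says yes.
cat "$scratch"
```

The untouched "wait" line shows why the /disable/ address matters: without it, every "yes" in the file would be a candidate for replacement.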

In one of the terminal windows of nodeNremote:

tcpdump -nnti enp0s3 not arp

Then, in nodeN1:

ping -c1 200.0.0.1

Note the source and destination IP addresses shown by tcpdump on nodeNremote. Are they those of the source and destination machines?

Now run these commands in nodeN1, one after the other:

nc -t 200.0.0.1 7

At nc's (visually null) prompt type "hello from N1" then enter, followed by control-C.

nc -u 200.0.0.1 7

At nc's (visually null) prompt type "hello from N1" then enter, followed by control-C.

In each case check whether tcpdump's source and destination addresses match those of the source and destination machines.
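
When reading the tcp and udp lines, it helps to know that tcpdump writes each endpoint as the IP address with the port number appended as a fifth dotted field. The following sketch splits such a line into its parts. The function name split_endpoints is hypothetical, and the sample line is a hand-written illustration of the format (with an arbitrary source port), not captured output.

```shell
# Split "IP a.b.c.d.port > e.f.g.h.port: ..." lines into addresses and ports.
split_endpoints() {
  awk '$1 == "IP" && $3 == ">" {
    src = $2; dst = $4
    sub(/:$/, "", dst)                      # drop the colon that ends the destination
    # Only tcp/udp lines carry a port, giving five dot-separated fields.
    if (split(src, p, "[.]") != 5 || split(dst, q, "[.]") != 5) next
    sport = src; sub(/.*[.]/, "", sport)    # text after the last dot is the port
    dport = dst; sub(/.*[.]/, "", dport)
    sub(/[.][0-9]+$/, "", src)              # what remains is the IP address
    sub(/[.][0-9]+$/, "", dst)
    print "src=" src, "sport=" sport, "dst=" dst, "dport=" dport
  }'
}

# Hand-written sample in the style of "tcpdump -nnt" output.
sample='IP 10.1.1.1.54321 > 200.0.0.1.7: UDP, length 14'
printf '%s\n' "$sample" | split_endpoints
```

Lines without ports, such as those for ping, are skipped by this helper and are best read directly.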

3. Network activity with NAT

Now let's tell nodeNrouter to start performing NAT. There is a table in the router containing the "what am I supposed to translate?" rules. On nodeNrouter, examine it:

iptables -t nat --list --numeric (or equivalently, iptables -t nat -nL )

The table should be empty. Now on nodeNrouter execute:

iptables -t nat -F
iptables -t nat -A POSTROUTING -p tcp -s 10.1.1.0/24 -d 200.0.0.0/24 -j SNAT --to 200.0.0.99
iptables -t nat -A POSTROUTING -p udp -s 10.1.1.0/24 -d 200.0.0.0/24 -j SNAT --to 200.0.0.99
iptables -t nat -nL

The first command flushes existing rules if any. The next two put mechanisms in place to take out the source address and put 200.0.0.99 in its place whenever tcp or udp packets pass through, left-to-right. The last command displays the just created rules.
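
The two rules differ only in the protocol named after -p. A small dry-run helper makes that explicit by printing the rule text for a given protocol instead of executing it. This is a sketch: the function name snat_rule is hypothetical, and the addresses are the ones used in this exercise.

```shell
# Print (do not execute) the SNAT rule for one protocol.
snat_rule() {
  # $1 is the protocol name to match, for example tcp or udp.
  echo "iptables -t nat -A POSTROUTING -p $1 -s 10.1.1.0/24 -d 200.0.0.0/24 -j SNAT --to 200.0.0.99"
}

for proto in tcp udp; do
  snat_rule "$proto"
done
```

Printing the rules first is a cheap way to proofread them; a printed line only takes effect when it is actually run as root on nodeNrouter.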

Repeat your explorations with ping and echo/nc, carefully examining the IP addresses that tcpdump shows you. Are its source and destination addresses those of the source and destination machines for ping? for echo/nc with tcp? for echo/nc with udp?

You can see that address translation is being performed for tcp and udp traffic. What if that traffic comes from mixed source machines, not just a single one? On nodeNremote in the lower terminal window:

watch "netstat -4pant | grep xinetd"

The watch command prints the output of a command given as an argument, every two seconds. Here the argument is the netstat command, filtered to show only the ports of interest. Then over at both nodeN1 and nodeN2 execute:

nc -t --source-port 11111 200.0.0.1 7 (on nodeN1)
nc -t --source-port 22222 200.0.0.1 7 (on nodeN2)

When you do that, connections show up in netstat on nodeNremote. nc is waiting for the user to type input from the keyboard. We want to make sure that the NAT mechanism on nodeNrouter returns the right replies to the right requestors and does not get them commingled. Does that work OK? Go back and forth between the two nodes, typing something in each. I suggest "alpha" in nodeN1 then "omega" in nodeN2, followed by "apple" in nodeN1 then "zebra" in nodeN2. Make sure everything that gets echoed back goes to the machine it originated from and doesn't show up on the other one. Terminate nc on both nodes with the ctrl-C keystroke (note the concomitant disappearance of netstat records on nodeNremote).

Usually IP traffic gets to a particular machine in accord with its destination address. Here packets sent by nodeNremote, whether for nodeN1 or nodeN2, are addressed identically and indistinguishably to the same address (namely, the router's 200.0.0.99). So what differentiates them so that they can go back to the correct originator without getting mixed up?

Author Jim Kurose describes NAT in a lecture on his "Authors' website" which shows this explanatory diagram:

It is largely the same as our diagram, though with right-to-left instead of left-to-right orientation. The NAT translation table adapted to our case would look like this:

NAT translation table in nodeNrouter

local/left (LAN) side enp0s3    | remote/right (WAN) side enp0s8
source IP       source port     | source IP       source port
?               ?               | ?               ?
?               ?               | ?               ?

This table belongs to the router, so let's scrutinize traffic there. In the router run copies of tcpdump in separate root terminal windows, one watching local-side interface enp0s3 and the other remote-side interface enp0s8:

tcpdump -nntvi enp0s3 not arp (on nodeNrouter in one terminal window)
tcpdump -nntvi enp0s8 not arp (on nodeNrouter in another terminal window)

Now we want to give the router something to handle, so move to nodes N1 and N2 and generate some traffic to the remote node. Let's use udp instead of tcp because it produces fewer frames, hence more clarity.

nc -u --source-port 11111 200.0.0.1 7 (on nodeN1)
nc -u --source-port 22222 200.0.0.1 7 (on nodeN2)

On each, type something and hit enter to let a packet fly. Notice that both nodes correctly get back the right thing: a copy of what they themselves sent out. How does the router sort this out, when the replies to both arrive mixed together from nodeNremote, devoid of addresses 10.1.1.1 or 10.1.1.2? How does it tell the difference? Look at the incoming frames in the router's tcpdump output on remote-side interface enp0s8 (the ones with source IP 200.0.0.1 and destination IP 200.0.0.99). They are the ones carrying the mixed replies to the mixed sources. What is in each reply packet that identifies the source to which it is replying? What is different between those lines?

Now that you've figured out NAT's theory of operation, it's clear to you how the mechanism works. Replies to traffic from any of various sources get back to the right sources. All traffic or some traffic? Let's try ping. At the two local nodes:

ping -c1 200.0.0.1 (on nodeN1)
ping -c1 200.0.0.1 (on nodeN2)

Study the tcpdump output at nodeNrouter and nodeNremote. What is NAT's effect on this traffic? Does your theory hold? Now let's augment our instructions to the router:

iptables -t nat -A POSTROUTING -p icmp -s 10.1.1.0/24 -d 200.0.0.0/24 -j SNAT --to 200.0.0.99
iptables -t nat -nL

To run these you may first need to interrupt one of the running tcpdumps (ctrl-C), run the iptables commands, and then restart the tcpdump (up arrow to recall the command, then the enter key to relaunch it).

The rule we added here differs from the earlier ones. Those had "-p tcp" and "-p udp" to specify which protocols NAT was supposed to touch. This time the specification is "-p icmp" instead. Ping again:

ping -c1 200.0.0.1 (on nodeN1)
ping -c1 200.0.0.1 (on nodeN2)

What's happening now with the addressing on the two router interfaces? Study them carefully, especially the remote/right enp0s8 side. While both source nodes are getting ping replies back, it's hard to tell whether each is getting its own rather than the other guy's. To find out, change the pings on the two source machines so that each sends different content, and change the tcpdumps on the two interfaces of the router to expose full packet contents. First interrupt the current tcpdumps on the router (ctrl-C, then ctrl-L to clean up the screen) and start new ones:

tcpdump -nntXi enp0s3 not arp (on nodeNrouter in upper terminal window)
tcpdump -nntXi enp0s8 not arp (on nodeNrouter in lower terminal window)

(we added the X option). Then on the two source machines, re-ping:

ping -c1 -p31 200.0.0.1 (on nodeN1)
ping -c1 -p32 200.0.0.1 (on nodeN2)

(we added the -p "pattern" option, telling ping to fill packets with 0x31 or 0x32 characters, which are 1's and 2's). Study the tcpdump on router's local/left enp0s3 interface, specifically the packets outgoing from there, and note who they are going to. 1's came from nodeN1/10.1.1.1; 2's came from nodeN2/10.1.1.2. To whom is router dispatching the replies that contain 1's, and to whom those that contain 2's? Now it's working for ping, like for udp/tcp before. But not by your earlier theory. Ping can't use what udp/tcp used for this job. Something else is being used. You need a new theory.
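
The statement that pad bytes 0x31 and 0x32 are the characters 1 and 2 can be checked with a short conversion, which is handy if you want to pick other patterns. This is a sketch; the function name hex_to_char is hypothetical.

```shell
# Print the character whose byte value is the given two-digit hex number.
hex_to_char() {
  # Convert the hex value to three octal digits, then let printf expand the octal escape.
  printf "\\$(printf '%03o' "0x$1")"
}

for byte in 31 32; do
  printf '0x%s is the character %s\n' "$byte" "$(hex_to_char "$byte")"
done
```

These are the characters to look for in the text column at the right of tcpdump's -X output.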

So in ping, what does NAT have to work with? Please research the basis on which NAT is able to handle ping. I had trouble finding out. Examine the fields of ping-type icmp messages (echo/echo-reply). Read about their definition and usage in the Forouzan textbook, page 254. There might be a little help in RFC 3022, "Traditional IP Network Address Translator (Traditional NAT)", and RFC 5508, "NAT Behavioral Requirements for ICMP." Supplement your research with some experimentation: perform some pings from nodeN1 and nodeN2 to nodeNremote, watching the interesting fields' values with tcpdump on the middle-man router (use tcpdump's -vvv option to make it verbose). How do the values evolve as you emit ping after ping after ping? Try to get pings coming from both nodeN1 and nodeN2 to have the same value (they evolve with a pattern, giving you limited control). Then keep emitting them from both source nodes and see if you can cause one of the pings to have an unexpectedly random value that deviates from the pattern. This is due to the NAT algorithm's operation.

4. What to turn in

On paper or electronically, edit a copy of this page. Fill in the NAT translation table with the 8 entries it requires to depict the rules it used for the NATted udp and tcp traffic we ran above, in the same form as Kurose showed in his table. (The values for both the udp and tcp operations are the same, so even though you conducted two operations you can fill in the table just once.) Then, see the questions under the table. These are short-answer questions. Write in their answers (no essays please, nor even sentences). Provide your filled-in sheet for me to see in a file named nat-operation.jpg (or .png).