Chapter 4: Network Address Translation – IP Addressing and Subnetting INC IPV6

Chapter 4

Network Address Translation

Solutions in this chapter:

• Learning what NAT is and how it works

• Seeing examples of how to implement NAT

• Learning how NAT interacts with security solutions

• Learning when NAT is appropriate to use

Introduction

This chapter covers Network Address Translation (NAT). In its simplest form, NAT changes network layer (layer 3) addresses as they pass through some device, such as a router or firewall. In theory, other layer 3 protocols can be translated, such as AppleTalk or IPX, as well as other layers (such as layer 2). In practice, it’s usually done only with IP addresses at layer 3. Because this is a TCP/IP book, this chapter will focus exclusively on IP.

We will demonstrate, however, that simply changing the layer 3 address is insufficient, and that transport layer (layer 4), and often higher layer, information must also be affected. Therefore, our discussion will also include TCP and UDP, as well as application layer (layer 7) protocols. We will discuss not only what NAT is and how it works, but also what the problems and shortcomings are.

This chapter is not about network security; however, the issues surrounding NAT often intersect with those of security applications. In some cases, particular types of NAT make the most sense in the context of a security application. Many of the commercial NAT implementations are part of a security package. Given that, we will be covering some security information as it relates to NAT, though NAT by itself is not necessarily security technology.

Hiding Behind the Router/Firewall

The ideas behind NAT became popularized in early firewall solutions. These early firewalls were mostly proxy-based. A good example is the FireWall ToolKit (FWTK). A proxy (in the firewall context) is a piece of software that fetches some information on behalf of a client, such as a Web page. The client computer asks the proxy for a particular Web page (it gives it the URL) and awaits reply. The proxy will then fetch the Web page, and return it to the client.

What’s the point of that? First, the administrator of the proxy can often program a list of things the client isn’t allowed to do. For example, if it’s a Web proxy at a company, the proxy administrator may choose to block access to www.playboy.com. Second, the proxy might be able to perform some caching or other optimization. If 50 people visit www.syngress.com every day, the proxy could keep a copy of the Web page, and when a client asks for it, all the proxy has to do is check if there have been any changes. If not, it passes along the copy it has stored, and the client typically gets to see the page more quickly.

Usually in this type of proxy configuration, the clients have been blocked from retrieving Web pages from the Internet directly, so they are forced to use the proxy if they want to view Web pages. This is often done with packet filtering on the router. Simply stated, the router is configured only to allow the proxy to pull Web pages from the Internet, and no other machine.

The result of this type of design is that inside clients now talk only to the proxy, and no longer talk directly to other hosts on the Internet. The proxy only needs to accept requests from the “inside” and fulfill them. This means that other machines on the Internet no longer need to speak to inside clients directly, even for replies. Therefore, the firewall administrator can configure their router or firewall to block all communications between the inside and outside machines. This forces all communications through the proxy. Now, the only machine the outside can talk to (if all is configured correctly) is the proxy. This dramatically reduces the number of machines that outsiders can attack directly. The proxy administrator takes particular care to make sure the proxy machine is as secure as possible, of course. Figure 4.1 is a diagram of what it looks like, logically.

Figure 4.1 Retrieving a Web page through a proxy.

This process has been highly simplified for purposes of discussion, but the principles are there: a clear division of inside and outside, and a point between them. This point between the two is sometimes called a choke point. In our diagram, the choke point is the proxy and filtering router together.

This is a simplified firewall architecture. Issues outside of the scope of this chapter come into play when designing a real firewall, such as:

• Is proxy software available for all needed protocols?

• How is the packet filtering configured on the router?

• How does the Web browser software on the client know to talk to the proxy?

• How does the proxy know which machines are on the inside, and which are outside?

The point of the discussion in this chapter is not what a proxy firewall architecture looks like, but rather, a side effect of it. We already know that all traffic on the Internet from this network originates from the proxy. This means that the Internet only “sees” the IP address of the proxy server. We also know that the Internet can’t reach the client machines on the inside.

As far as the Internet is concerned, this means that this site needs only one IP address, which is that of the proxy.

Recall from Chapter 3 that address space is considered scarce at present, and that certain IP address ranges, referred to as the private IP address ranges, have been set aside. These ranges are currently listed in the document RFC 1918, available at

as well as at a number of other Web sites.

If you happen to read through the RFC, you’ll see that it renders RFCs 1627 and 1597 (an older version of RFC 1918) obsolete. RFC 1627 attempts to make a case against private IP address ranges. Apparently, RFC 1627 lost because it has been declared obsolete by one that explicitly allows private address ranges. The other RFCs can be reached at the previous URL (there are links at the top of that Web page).

Following is a quote from RFC 1918, which defines the private address spaces, and when they should be used:

“For security reasons, many enterprises use application layer gateways to connect their internal network to the Internet. The internal network usually does not have direct access to the Internet, thus only one or more gateways are visible from the Internet. In this case, the internal network can use non-unique IP network numbers.”

As part of the reason for having private addresses, the RFC recognizes that many companies already have application layer gateways (proxies) in place. Therefore, it would be useful to have a set of addresses that can be reused internally, as long as none of those machines needs to talk to other machines directly.

The RFC also recommends that companies who wish to employ such a proxy obtain address space from Internet Service Providers (ISPs). In recent years, most of the address space has been allocated to ISPs, rather than directly to companies, as it used to be. A big part of the reason for this is to keep routing tables on Internet core routers as small as possible. If a block of addresses is given to an ISP, then the other ISPs can hold a route to that single block, rather than having an entry for each of the separate network ranges in the block, as would be the case if those address ranges were given to various companies. By today’s rules, you pretty much have to be an ISP to get address space allocated to you permanently. For more information about how ISPs obtain and assign addresses, please see Chapter 6.

If you run a proxy architecture, it will be fairly easy to get some addresses from your ISP, and you will need relatively few. With this architecture, you are free to use the RFC 1918 addresses inside your network, and still have Internet access for your internal client machines.

This type of architecture is in very common use today. Many companies, especially large ones, have some sort of firewall or proxy device that does the direct communication on the Internet. Even companies that have been on the Internet long enough to have their own address space frequently use this type of architecture, though mostly for security reasons.

Now that we have some idea what proxies are, how exactly does that relate to NAT? Well, actually not much—proxies aren’t NAT. Towards the end of the chapter, we explain why. However, the discussion is important, because proxies form part of the history of why NAT exists.

What Is NAT?

The idea behind NAT is similar to one of the benefits of proxies: hiding your internal addresses. The usual reason for wanting to hide addresses is the one we mentioned—Internet access for inside client machines. At a high level, the end result is the same. The Internet sees a valid Internet address (a public address), probably assigned by your ISP, and your inside machines are using private addresses.

There is at least one other reason you might want to use NAT if you’re using the RFC 1918 addresses: What if your company merges with another one? Usually, the two companies will want to link internal networks to facilitate business communications. However, if both companies had previously been using the same RFC 1918 address ranges, a conflict arises. Ultimately, a renumbering of some sort will probably have to be done, but as a short-term measure, it’s possible to use a type of NAT to translate addresses between the two companies to resolve conflicts. We’ll return to this example later.

To understand how NAT differs from proxying, we have to take a detailed look at how NAT works.

How Does NAT Work?

NAT works by modifying individual packets. It modifies (at least) the layer 3 headers to have a new address for the source address, destination address, or both. We’ll also see an example where layer 4 headers are modified, as well as the data portion (layer 7).

As we’ll see, a few small variations in how the addresses are translated can result in a fairly wide range of behavior and features. We’ll also see that for some protocols, it will take a lot more than simply changing the layer 3 addresses for them to function with NAT. There are even protocols that can’t function with NAT in place.

The NAT function is usually performed by a router or firewall. It is theoretically possible for a bridge (layer 2) device to do layer 3 address translation, and at least one firewall product on the market functions that way. However, the vast majority of NAT devices, and of software that includes a NAT function, depend on plain IP routing to deliver packets to them. Most NAT devices have an underlying IP routing function.

Network Address Translation (Static)

We’ll start with the simplest form of NAT, which is called static, or 1-to-1 translation. This is the most intuitive kind: Simply stated, in static NAT, a particular IP address is changed to another going one way, and changed back going the other way. The change usually is done to the source address for outgoing packets. Figure 4.2 will help clarify this. In the figure, the arrows indicate direction of packet flow (where it’s being routed), S indicates source address, and D indicates destination address.

Figure 4.2 Static NAT during the first two packets of the TCP handshake.

How Does Static NAT Work?

Let’s assume for the moment that this is a really simple-minded NAT; that is, all it does is modify the source or destination address when appropriate. What kind of work does the NAT router have to do? First, it has to have some idea of which direction the packet is traveling relative to the NAT configuration. Notice in the example that the router translates the source in one direction, and the destination in the other. It can decide which to do based on particular interfaces being marked as “to” or “from” interfaces. A configuration example, next, will make things more clear. The router also has to decrement the TTL and redo any checksums needed, but routers do that anyway.

The example is also stateless, meaning that the router doesn’t have to know what went on with previous packets, if anything, in order to modify the current one. All the information it needs to modify the packet is available in the current packet, and in its configuration. Also note that this type of NAT has no security features—all traffic passes regardless, with just an address change in the process. The idea of state information is very important for later NAT examples, and also for firewalls. Keep this in mind for later discussion.
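The simple-minded, stateless translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product is built: the Packet type, the "inside" and "outside" interface roles, and the addresses (192.168.0.2 mapped to 203.0.113.2) are all hypothetical. Notice that the function needs nothing but the current packet and the configured mapping.

```python
from dataclasses import dataclass, replace

# Hypothetical static (1-to-1) mapping: inside address -> outside address.
STATIC_MAP = {"192.168.0.2": "203.0.113.2"}
REVERSE_MAP = {outside: inside for inside, outside in STATIC_MAP.items()}


@dataclass(frozen=True)
class Packet:
    """Only the two layer 3 fields that this sketch cares about."""
    src: str
    dst: str


def translate(packet: Packet, arriving_on: str) -> Packet:
    """Rewrite one packet using only the packet itself and the configuration."""
    if arriving_on == "inside":
        # Outbound traffic: translate the source address if it is mapped.
        return replace(packet, src=STATIC_MAP.get(packet.src, packet.src))
    if arriving_on == "outside":
        # Inbound traffic: translate the destination back to the inside address.
        return replace(packet, dst=REVERSE_MAP.get(packet.dst, packet.dst))
    raise ValueError(f"unknown interface role: {arriving_on}")


outbound = translate(Packet("192.168.0.2", "198.51.100.7"), "inside")
inbound = translate(Packet("198.51.100.7", "203.0.113.2"), "outside")
print(outbound)  # source is now 203.0.113.2
print(inbound)   # destination is now 192.168.0.2
```

A real router would also decrement the TTL and recompute checksums, which this sketch leaves out.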

This type of NAT is fairly simple to understand, but it isn’t as useful as it might be. Consider our goal of trying to have a few IP addresses represent a group of inside machines. Our example is 1-to-1: each inside IP address has to have a matching outside address, so there is no savings of IP addresses. Does this mean that it is useless? No, there are a number of scenarios where we can use a 1-to-1 mapping of IP addresses.

One scenario is that you’ve got an internal machine with an internal IP address, and you want to make it reachable by the Internet for some reason. One way to do it without having to change anything on the inside machine is to define a static translation for it, like we did in our example. If that’s done, you simply have to publish the translated IP address (perhaps by assigning a DNS name to it).

Let’s consider another example, which matches the one in Figure 4.2, except that the destination address is changed on the first packet instead of the source address. When would it be useful to change the destination address instead of the source address? There is at least one type of server you generally have to refer to by IP address: DNS servers. Imagine a situation where a DNS server has failed, probably only temporarily, and you would like to have your inside client machines make DNS requests of a new one without having to reconfigure them all, and then put them back when the original DNS server is back up.

Double NAT

The last static NAT example we want to look at is often called “double NAT.” Simply put, this is changing both the source and destination addresses of a packet. Many products that support NAT don’t support this type of configuration, unless you’ve got two of them.

Under what circumstances would you want to use double NAT? One possibility is a combination of the previous two examples: You’ve got inside machines using private IP addresses, and you need to have them connect to a different DNS server without reconfiguring them. That example is a bit contrived, though, and there’s a better one.

Recall that one of the problems with using private IP addresses is the possibility of conflict when you connect to another network that is using the same addresses. Double NAT can help in this situation, though again, you’ll probably want to use this only as a temporary measure.

Here’s a scenario: You need to connect your network to that of another company, and you just found out that you both are using class C 192.168.1. You have to find a way to enable the two networks to communicate until a renumbering can be completed. This situation is far from impossible, as several firewall/NAT products use this address range by default.

It turns out you’ve both got routers capable of doing NAT—the same routers you are using to connect to each other. For our example we’ll focus on two machines, one on each net, that have the same IP address (see Figure 4.3).

Figure 4.3 Two networks with conflicting RFC 1918 addresses.

The IP addresses used on the link between the two routers aren’t particularly important for this example, as long as they don’t create additional conflicts.

The trick is to make each machine believe that the other one is at a different IP address. We accomplish this by making the machine on the left think that the machine on the right is IP address 192.168.2.2, while the machine on the right thinks that the machine on the left is 192.168.3.2.

This is still static NAT: each machine has a 1-to-1 mapping to another IP address. However, in this example, since we’re going through two NAT routers, we’re going to translate twice. The first router will change the source address on the packet, and the second router will change the destination address on the packet. Double NAT.

Let’s walk through an example of the machine on the left sending a packet to the machine on the right (see Figure 4.4).

Figure 4.4 Source address is 192.168.1.2, destination address is 192.168.2.2.

Since the machine on the left assumes it’s simply communicating with another machine at 192.168.2.2, it sends its packet to the local router for forwarding, as it normally would. At this point, router A is going to change the source address on the packet, to hide the fact that it came from a 192.168.1 net (see Figure 4.5).

Figure 4.5 Source address is now 192.168.3.2, destination address is still 192.168.2.2.

The destination address remains 192.168.2.2 at this point, and router A uses its normal routing tables to determine where the 192.168.2 network is, and forwards the packet. In this case, it forwards the packet to router B. Router B is going to perform its translation next, and it changes the destination address from 192.168.2.2 to 192.168.1.2 (see Figure 4.6).

Figure 4.6 Source address is 192.168.3.2, destination address is now 192.168.1.2.

Now the machine on the right receives the packet, and that machine believes it has received a packet from 192.168.3.2. Packets traveling from the machine on the right to the machine on the left will go through a similar, but reversed process.

In this manner, the two machines with the same address, which would normally never be able to communicate with each other, are able to do so. Naturally, to make this type of scenario usable in real life, it will probably require some clever DNS setup as well. The DNS server for the machine on the left would be configured so that the names of the machines on the right resolve to 192.168.2 addresses, the DNS server for the machine on the right would resolve the names of the machines on the left to 192.168.3 addresses, and so on.
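The left-to-right walk-through in Figures 4.4 through 4.6 can be traced with a short Python sketch. The addresses come from the example; the Packet type and the function names are hypothetical, and a whole-prefix swap is assumed so that every host on each network gets its own 1-to-1 mapping.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def swap_prefix(address: str, old: str, new: str) -> str:
    """Replace the first three octets if they match, keeping the host octet."""
    if address.startswith(old + "."):
        return new + address[len(old):]
    return address


def router_a_outbound(packet: Packet) -> Packet:
    # Router A hides the left-hand 192.168.1 net behind 192.168.3 (source NAT).
    return replace(packet, src=swap_prefix(packet.src, "192.168.1", "192.168.3"))


def router_b_inbound(packet: Packet) -> Packet:
    # Router B maps the advertised 192.168.2 net onto its real 192.168.1 net
    # (destination NAT).
    return replace(packet, dst=swap_prefix(packet.dst, "192.168.2", "192.168.1"))


sent = Packet(src="192.168.1.2", dst="192.168.2.2")  # Figure 4.4
after_a = router_a_outbound(sent)                    # Figure 4.5
after_b = router_b_inbound(after_a)                  # Figure 4.6
print(sent, after_a, after_b, sep="\n")
```

Reply packets would pass through the mirror image of these two functions: router B rewrites the source from 192.168.1.2 to 192.168.2.2, and router A rewrites the destination from 192.168.3.2 back to 192.168.1.2.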

Problems with Static NAT

So far, we’ve ignored the problems with NAT, and they are significant. The basic problem is that not all network address information is in the network address headers (IP layer). A fair number of protocols, for various reasons, include address information in the data portion of the packets. We’ll look at a few examples.

One of the most problematic protocols for NAT is the File Transfer Protocol (FTP). However, because FTP is so common, most NATs deal with it properly.

What’s difficult about FTP? First of all, it passes IP addresses in the data stream, in ASCII. Second, it passes these addresses to inform the other machine on which IP address and port it will be listening for reverse connections. In the default mode, when an FTP client wants to receive a file, it listens on a port number assigned by the operating system, and informs the server of that port number and its IP address. The server then contacts the client and delivers the file. This problem gets worse when security or other types of NAT are considered, which we’ll look at later.

This means that the NAT software has to be able to spot the IP addresses when they are being sent, and be able to modify them. FTP also introduces the problem of state. Unfortunately for the NAT software designer, the IP address information may be split across more than one packet. This means that the NAT software also has to keep track of what it was doing on the last packet as well as the current one. This is known as maintaining state information; most NAT devices use state tables to maintain this type of information.

Figure 4.7 contains a packet capture of the problem in action.

Figure 4.7 Packet containing the FTP PORT command.

Figure 4.7 is a packet from the middle of an FTP session, containing the PORT command. Behind the scenes, FTP is basically a text protocol, with binary transfers added onto it. The command you see at the bottom of the figure, PORT 208,25,87,11,17,234, is the client informing the server what port it will be listening on for receiving data. I had just connected to the server and my client sent an address and port number to which the server could connect in order to send its welcome banner.

Let’s take a look at the command. The PORT part is fairly evident: it is telling the server what port it can connect to. The first four numbers, 208,25,87,11, are simply the client’s IP address—if you look at the top of the figure (source address), it is 208.25.87.11. The next two numbers are the port number, split into two bytes. Notice that the current source port is 4585. The client in this case is a Windows 98 machine, and like most operating systems, Windows allocates ports sequentially. To convert 17,234 into a single number, follow this conversion routine: Multiply the first number (on the left) by 256, and then add the second number—in this case, 17*256+234=4586. So, our client is telling the server to connect to 208.25.87.11 at port 4586.
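The conversion routine is easy to express in code. The following Python sketch decodes a PORT command into an address and a port number; the function name parse_port_command is our own, not part of any FTP library.

```python
def parse_port_command(line: str) -> tuple[str, int]:
    """Decode 'PORT h1,h2,h3,h4,p1,p2' into (ip_address, port)."""
    command, _, argument = line.strip().partition(" ")
    if command.upper() != "PORT":
        raise ValueError("not a PORT command")
    fields = [int(field) for field in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= field <= 255 for field in fields):
        raise ValueError("PORT needs six numbers in the range 0-255")
    address = ".".join(str(field) for field in fields[:4])
    # The port is sent as two bytes: high byte first, then low byte.
    port = fields[4] * 256 + fields[5]
    return address, port


print(parse_port_command("PORT 208,25,87,11,17,234"))  # ('208.25.87.11', 4586)
```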

Everything worked as expected, and the banner was properly displayed on the FTP client. But had NAT been in use, the NAT software would have to recognize the PORT command, and modify the number for the IP address inside the packet. In this example, all fields were contained in the same packet (as they often are). However, they may be split across more than one packet, so the NAT software must be prepared to handle that possibility.

If the NAT software is able to modify the PORT command correctly, all still works well. The headers are changed, and the PORT command(s) are changed to match, accordingly. Now FTP can work properly across static NAT.
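As a rough illustration of the rewriting step, the Python sketch below replaces the address portion of a PORT command with a translated address. It assumes the whole command arrives in one piece (no splitting across packets) and without its line terminator; the function name and the addresses 192.168.0.2 and 130.214.99.250 are used only for illustration.

```python
def rewrite_port_command(line: str, translated_address: str) -> str:
    """Return the PORT command with its address replaced; pass others through."""
    command, _, argument = line.strip().partition(" ")
    fields = argument.split(",")
    if command.upper() != "PORT" or len(fields) != 6:
        return line  # not a well-formed PORT command, so leave it untouched
    # Keep the two port bytes, swap in the four octets of the new address.
    new_fields = translated_address.split(".") + fields[4:]
    return "PORT " + ",".join(new_fields)


original = "PORT 192,168,0,2,17,234"
rewritten = rewrite_port_command(original, "130.214.99.250")
print(rewritten)                      # PORT 130,214,99,250,17,234
print(len(original), len(rewritten))  # 23 26
```

Note that the rewritten command is longer than the original. Because the address is carried as ASCII text, the size of the TCP payload can change, so a real implementation also has to adjust TCP sequence and acknowledgment numbers for the rest of the connection.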

That’s only one protocol handled as a special case—there are lots more. Real-world NAT implementations must deal with these in order to be useful to consumers. It’s fairly common for NAT vendors to provide a list of protocols for which they do or do not work correctly. The basic problem lies with protocols that pass address and port information as part of the data portion of the packets. When the IP headers are changed, the data portion must also be changed to match. If this is not done, then the protocol most likely will not work properly.

There is at least one other category of protocols that have problems, even with static NAT. Certain protocols exist that can detect when the IP headers have been changed, and will refuse to work when a change is detected. Usually, these are cryptographic protocols. A prime example is the IPSec Authentication Header (AH) protocol. Without going into too much IPSec detail, the idea behind this protocol is that it is sometimes useful to know for sure that the IP address with which you are communicating is who it claims to be. The two IP addresses communicating using IPSec AH have shared cryptographic keys with which to verify certain types of information. When one of these devices puts together a packet, it includes a large number with it, which is a function of nearly all the information in the packet, as well as the cryptographic key. When the device at the other end sees the packet, it can go through a similar process, and determine if the packet has been tampered with. If it detects any tampering, it discards the packet as invalid.

IPSec AH will see NAT as tampering (unauthorized modification to the headers) and drop the packets as being invalid. Here is a protocol that cannot work with NAT, because of its design. There are not a large number of protocols like this, and they are usually complex enough that network and firewall administrators are often involved in their configuration, so they should be aware of the issues, and be able to work around them. Be aware, though, that some ISPs employ NAT on their networks. Also, some Virtual Private Network (VPN) products use IPSec, and these products often will not work over an ISP that does NAT or any type of firewalling.
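The effect can be demonstrated with a keyed hash. The Python sketch below is a conceptual illustration only: it is not the real AH packet format or algorithm negotiation, and the key, addresses, and function names are hypothetical. It simply shows that when the source address is covered by the keyed value, changing that address in transit makes verification fail.

```python
import hashlib
import hmac

SHARED_KEY = b"example-shared-key"  # hypothetical key known to both endpoints


def integrity_value(src: str, dst: str, payload: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Keyed hash over the addresses and the data (a stand-in for the AH value)."""
    message = src.encode() + b"|" + dst.encode() + b"|" + payload
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(src: str, dst: str, payload: bytes, value: bytes, key: bytes = SHARED_KEY) -> bool:
    """Recompute the keyed hash at the receiver and compare in constant time."""
    return hmac.compare_digest(integrity_value(src, dst, payload, key), value)


# The sender computes the value over its real (inside) address.
value = integrity_value("192.168.0.2", "198.51.100.7", b"hello")

untouched = verify("192.168.0.2", "198.51.100.7", b"hello", value)
after_nat = verify("130.214.99.250", "198.51.100.7", b"hello", value)
print(untouched)  # True
print(after_nat)  # False: the translated source address no longer matches
```

The NAT device cannot repair the value itself, because it does not hold the shared key.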

Configuration Examples

In this chapter, our configuration examples will be using Cisco’s IOS, Windows NT 2000, and Linux. Specifically, we’ll be using Cisco IOS 11.3 or higher (on the main Cisco router line), and Red Hat Linux 6.0. Note that some other Cisco devices, such as the 77x ISDN routers, support NAT as well, but they use a different numbering scheme for their software. We use Windows NT 2000 because this is the first version of Windows NT to include built-in NAT capabilities. At the time of this writing, NT2000 is still beta. This feature is expected to be present in the final version, but there is always a possibility it won’t be or that it will be slightly changed. The software package we’ll be using on Linux is called IP Masquerade, which comes with the most recent versions of all the Linux distributions. The “References and Resources” section at the end of the chapter provides URLs for documents containing information about NAT, including information about which exact versions of the Cisco IOS include NAT features, and where to obtain IP Masquerade if it isn’t already included with your distribution. This chapter assumes that the appropriate software is already installed, and that you have a basic familiarity with the operating system.

Windows NT 2000

Windows NT 2000 includes a feature called Internet Connection Sharing (ICS). (ICS is also included in Windows 98 Second Edition.) ICS is intended to allow dial-up users to provide Internet access to other machines attached via a LAN. It does that well, but it’s pretty single-minded, so it’s not very flexible. The outside interface must be a dial-up connection; that is, if your Internet access method is via a LAN connection (such as a cable modem or most DSL setups) you can’t use ICS with it. To accommodate inside machines on the LAN, the NT 2000 box configures its LAN interface to be 192.168.0.1, and turns itself into a DHCP server and DNS proxy. The configuration of the LAN interface might very well cause conflicts if those services already exist, so be careful. We’ll assume that NT 2000 is already installed properly, that the LAN interface is functioning properly, and that there is a correctly defined Internet dial-up connection. We’ll start with the network control panel, shown in Figure 4.8.

Figure 4.8 Windows 2000 Network connections window.

In Figure 4.8, we can see the LAN connection and the Internet dial-up connection. The Internet connection is grayed-out to indicate that it’s not up at the moment.

To configure ICS, right-click on the Internet dial-up connection and select Properties. When the Properties window comes up, click on the Internet Connection Sharing tab, shown in Figure 4.9.

Figure 4.9 Dial-up properties window, ICS tab.

Checking on the Enable Internet Connection Sharing box enables ICS. Optionally, you can configure the NT 2000 machine to dial the Internet automatically when an inside machine tries to access the Internet. Checking on this option also enables the DHCP server, so again be sure there isn’t already a DHCP server before you check this on.

The short version of this configuration example is that inside machines will now actually be able to access the Internet (after you dial-up, of course). However, since we’re discussing static NAT, we’ll dig a little deeper into what ICS can do. Strictly speaking, ICS doesn’t do static NAT (we’ll discuss that later in the chapter), but it can perform some of the same behavior.

Notice that there is a Settings button at the bottom of the screen: If you click on that, and then select the Services tab, you will see something like the screen shown in Figure 4.10. In our example, there is already a service defined, called “telnet.” By default, this list is empty. If we click on edit, we will see the screen shown in Figure 4.11.

Figure 4.10 ICS Services tab, Telnet service selected.

Figure 4.11 Definition of Telnet service.

In the Service port number field, we’ve got 23 (which is the default port for a Telnet server). The protocol is TCP, and the Name field is portabeast, which is just the name of a machine on our example inside network.

Since ICS doesn’t do real static NAT, inside machines can get out, but outside machines can’t get in. The Services feature lets you explicitly allow certain services to be reachable from the outside. In our case, we’ve made it possible for the outside to Telnet to portabeast. ICS automatically handles FTP properly.

Cisco IOS

Of the three operating systems we’re covering, Cisco’s IOS has the most flexible NAT software. Using it, we’re able to do a true static NAT configuration. This example was done on a 2621 router, which has two Fast Ethernet ports. Here’s what the relevant portion of the configuration looks like before we start:

Interface FastEthernet 0/0 is our inside interface, which uses the 192.168.0 net. 130.214.99 is our outside net, representing the path to the Internet for this example.

There is an inside machine at 192.168.0.2 that we want to be able to get out, so we’re going to assign it an outside address:

The first step is to mark the inside and outside interfaces, which is done with the ip nat inside and ip nat outside commands. Next, we tell the router to do an IP mapping. The command (global this time, rather than an interface command) is again ip nat. We’re mapping an inside address and translating the source address (destination address translation is also possible with IOS). It’s a static mapping, and we’re translating 192.168.0.2 to 130.214.99.250.

This is a true static mapping, and only the one inside machine is fully reachable from the outside at the 130.214.99.250 address.

As mentioned, the IOS supports destination address mapping as well. It can also do double NAT with just one physical router, if you need it.

Linux IP Masquerade

Our Linux box (Red Hat 6.0) also has two LAN interfaces. IP Masquerade comes standard with Red Hat 6.0, and can be used with other versions and distributions of Linux, although you may have to install it yourself. Instructions are available on how to do so; check the “References and Resources” section at the end of this chapter. Our example begins with the LAN interfaces already configured and working properly. Here is the output from the ifconfig command:

The addressing setup is very close to that of the router. Interface eth1 is our inside network, again 192.168.0, and interface eth0 is our outside interface. With IP Masquerade, the address to which the inside is translated is determined by which direction traffic is routed. It will use the IP address of the outside interface. Here’s the route table (output from the netstat -rn command):

Since the default route (0.0.0.0) is towards 130.214.99.1, which is reachable via the eth0 interface, all traffic will exit via that interface (unless it’s destined for the 192.168.0 net). Therefore, the IP address for the eth0 interface (130.214.99.253) will be used as the translated source address.
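The route-based choice of translated address can be sketched in Python. The interface names and the 130.214.99.253 address come from the example; the /24 netmasks and the route list itself are assumptions made for illustration, and the function names are hypothetical.

```python
import ipaddress

# Assumed route table: two connected networks plus the default route.
ROUTES = [
    (ipaddress.ip_network("192.168.0.0/24"), "eth1"),   # inside net
    (ipaddress.ip_network("130.214.99.0/24"), "eth0"),  # outside net
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),        # default, via 130.214.99.1
]
OUTSIDE_INTERFACE = "eth0"
OUTSIDE_ADDRESS = "130.214.99.253"


def exit_interface(destination: str) -> str:
    """Pick the interface of the most specific (longest prefix) matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, ifc) for net, ifc in ROUTES if dest in net]
    _, interface = max(matches, key=lambda match: match[0].prefixlen)
    return interface


def masqueraded_source(source: str, destination: str) -> str:
    """Source address the packet carries once it leaves the box."""
    if exit_interface(destination) == OUTSIDE_INTERFACE:
        return OUTSIDE_ADDRESS  # leaving via eth0, so hide behind its address
    return source               # staying on the inside net, so no translation


print(masqueraded_source("192.168.0.2", "198.51.100.7"))  # 130.214.99.253
print(masqueraded_source("192.168.0.2", "192.168.0.9"))   # 192.168.0.2
```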

IP Masquerade relies on the OS doing routing, so routing must be enabled (it’s disabled by default). To turn routing on, issue this command:

This will turn forwarding on, but only until the next reboot (or if it’s turned back off manually in a similar manner). To turn it on permanently in Red Hat, you’ll want to edit the /etc/sysconfig/network file, and change the line that reads:

That takes care of the forwarding (routing). The next step is to install a masquerade policy that will translate traffic the way we want. IP Masquerade handles FTP properly; in fact, there is a special loadable module that needs to be installed for FTP. Issue this command:

From its name, it’s pretty obvious what this module is for. There are several modules like this for IP Masquerade, and we’ll take a look at more later in the chapter. Next, we’ll set some timeout values:

The first number (3600) specifies how many seconds idle TCP connections will stick around (in this case, an hour). The second number indicates how long after the FIN exchange the connection is tracked, and the last number indicates how long UDP connections will be kept around without any traffic.

Finally, we put in the actual IP Masquerade rules:

(192.168.0.2 is still our inside machine for the example.)

At this point, our inside machine will be able to get to the Internet. You won’t want to type these commands in every time you reboot, so typically you’ll want to put them in a shell script in /etc/rc.d so that they run on startup.

Network Address Translation (Dynamic)

Static NAT is 1-to-1 NAT. Dynamic NAT is many-to-many NAT. Note that 1-to-many NAT is a special case of many-to-many NAT (a subset), and won’t really be discussed as a separate issue here. If you can do many-to-many NAT, you can also do 1-to-many NAT.

We’ve seen how 1-to-1 NAT works, and we’ve also shown that it doesn’t reduce the required number of IP addresses. This is where dynamic NAT comes in. Dynamic NAT works by translating a number of internal IP addresses to a number (usually a smaller number) of external IP addresses. It does so by dynamically creating 1-to-1 NAT mappings on the fly, as needed. Then, through traffic monitoring and timers, it destroys the mappings as needed, and frees up outside IP addresses for new inside clients. You may have already spotted a problem, but hold that thought for the section on PAT, later in the chapter.

Here’s our example scenario: You’ve got an internal network, 10.0.0.x, with about 50 machines on it. You get an Internet connection, but your ISP can give you only 16 addresses, 192.138.149.0 through 192.138.149.15. Because of standard subnetting issues, 0 and 15 can’t be used, 1 is used by the ISP’s router, and 2 is your router, leaving only 3 through 14, or 12 addresses. Naturally, you want to provide Internet access for all your inside machines; that’s what you got the Internet connection for.
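The address arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes the 16 addresses form a single /28 block, which the range 0 through 15 implies; the function name usable_pool is made up for the illustration.

```python
import ipaddress


def usable_pool(block, reserved):
    """Addresses left for NAT after removing network, broadcast, and reserved hosts."""
    network = ipaddress.ip_network(block)
    taken = {ipaddress.ip_address(a) for a in reserved}
    # hosts() already skips the network (.0) and broadcast (.15) addresses.
    return [str(h) for h in network.hosts() if h not in taken]


# .1 is the ISP's router and .2 is our router, as in the scenario above.
pool = usable_pool("192.138.149.0/28", ["192.138.149.1", "192.138.149.2"])
print(len(pool), pool[0], pool[-1])
```

The result is the 12 addresses from 192.138.149.3 through 192.138.149.14.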

The setup looks like that shown in Figure 4.12. We know from previous discussion that we could do it with only 1 IP address and a proxy server. For this example, to avoid the extra theoretical expense of a new dedicated server, we’re going to make use of dynamic NAT.

Figure 4.12 Connecting to the Internet through ISP, 16 addresses assigned.

We’ve already identified the range of available IP addresses, 192.138.149.3 through 192.138.149.14. Our router will be programmed with those addresses as an outside pool and 10.0.0.x as an inside pool. The word “pool” in this context simply refers to a range of IP addresses. To know how to do the dynamic NAT, the router will need to know for which IP addresses it is responsible.

This is more intuitive for the outside IP addresses, because the router needs to be informed of how many of the IP addresses it can use for NAT. The inside pool is a little less intuitive. Why not just NAT any address from the inside? There are a couple of reasons: First, you might want to designate a certain portion of your inside net to go to one outside pool, and another to go to a different outside pool. Second, you might need to do static NAT for certain machines, say a mail server, and you don’t want that particular machine being dynamically translated.

How Does Dynamic NAT Work?

What does a router have to do to implement dynamic NAT? We’ve already discussed briefly all the elements a router needs in order to implement dynamic NAT. It needs a state table, it needs to have an idea of when a connection starts and stops, and it needs to have a timer.

We’ve already seen how static NAT works. For the dynamic NAT discussion, we’ll assume that a working static NAT with state tables and protocol specifics is in place, and expand on that. The first major change is that the static NAT mapping will no longer be hard-coded (i.e., manually configured by an administrator), but will be part of another table that the router can change as needed. When we start, the table will be empty, and there will be no 1-to-1 mappings. The table will remain this way until an inside machine tries to connect to the Internet.

Let’s take a moment to point out that this is a slight security improvement over static NAT. With static NAT, any machine on the Internet can attempt to connect to the outside IP address in a static NAT mapping at any time, and they will be allowed through to the inside. With dynamic NAT, the default for the outside IP addresses is no mapping. Thus, when the mapping table is empty, any attempts to the outside IP addresses should be futile, as they map to no inside machines at the time. This is not yet sufficient for security purposes, but it is an improvement.

When an inside machine attempts to connect to the Internet, the router will consult its table, and pick an available unused outside IP address. In our example, since the table is currently empty, it will likely pick the first one. It will then create an entry in the mapping table, and create a (temporary) static mapping from the inside machine’s IP address to the outside IP address it has chosen. Note that the router’s idea of a connection attempt to the Internet may be very simplistic: as soon as it gets any packet from the inside destined for the outside, it may create a mapping. The router will also start a timer at this point.

As long as the inside machine is sending traffic out, or something on the Internet is sending traffic in (via that outside IP address) the mapping will remain. Every time a packet is passed that is part of that mapping, the timer is reset.

There are two ways the mapping will be removed. The first is that the connection is stopped normally. For example, the FTP session is done, and the client has quit. For this to work, the router has to have an idea of what the end of a connection looks like. For TCP connections, this is relatively easy, as there are particular flags that indicate the end of a connection. Of course, for the router to watch for the end of a connection, it would have had to watch for one to start. We’ll talk more about how this works in the section on PAT. The second way a mapping is destroyed is that no traffic is sent for the duration of the timer. When the timer runs out, the assumption is that any communications must be finished, and the mapping is removed.

Naturally, while this one inside machine is communicating on the Internet, other inside machines may begin to as well, and they would get their own individual mappings.
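The simplistic model described in this section can be sketched as a small Python class. It is a toy under stated assumptions, not any vendor's implementation: the class and method names are invented, the two-minute idle timeout is an arbitrary choice, and time is passed in explicitly as a number of seconds so the behavior is easy to follow.

```python
class DynamicNat:
    """Toy many-to-many NAT: hands out whole outside addresses on demand."""

    def __init__(self, outside_pool, idle_timeout=120):
        self.free = list(outside_pool)   # unused outside addresses, in order
        self.mapping = {}                # inside address -> outside address
        self.last_seen = {}              # inside address -> time of last packet
        self.idle_timeout = idle_timeout # assumed value, in seconds

    def expire(self, now):
        """Destroy mappings that have been idle longer than the timeout."""
        idle = [i for i, t in self.last_seen.items()
                if now - t > self.idle_timeout]
        for inside in idle:
            self.free.append(self.mapping.pop(inside))
            del self.last_seen[inside]

    def outbound(self, inside, now):
        """Translate a packet from the inside; None means the pool is exhausted."""
        self.expire(now)
        if inside not in self.mapping:
            if not self.free:
                return None
            self.mapping[inside] = self.free.pop(0)
        self.last_seen[inside] = now     # any packet resets the timer
        return self.mapping[inside]

    def inbound(self, outside, now):
        """Find the inside host for a packet arriving at an outside address."""
        self.expire(now)
        for inside, mapped in self.mapping.items():
            if mapped == outside:
                self.last_seen[inside] = now
                return inside
        return None                      # no mapping: the packet goes nowhere
```

With the pool 192.138.149.3 through 192.138.149.14, the first inside machine to send a packet is given 192.138.149.3, and a packet arriving at an unmapped outside address is simply not delivered.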

Problems with Dynamic NAT

By now, the problems with dynamic NAT may be evident. If we assume the simplistic model, where the router creates a mapping as soon as any packet goes from an inside machine to the Internet and releases it only when a timer expires, mappings are going to tend to stick around. If we’ve got 50 inside machines and only 12 outside addresses, there are going to be problems at certain times of the day, like mornings and lunchtime when everyone wants to access the Web.

How can this problem be solved? One way to help alleviate it is to provide more outside IP addresses. In our example, this isn’t practical since we got just so many from the ISP. Besides, it seems clear that there is a possibility that all 50 inside machines might want to access the Internet at the same time someday, and we would need 50 outside addresses. At that point, we might as well be back at static NAT, and there would still be no address savings.

Another possibility is to try to reduce the amount of time that a mapping sticks around. This will give inside machines a better chance at getting out at peak times. We could reduce the timer, but that would increase the chances that it might expire while an inside machine is awaiting a response from a slow server on the Internet. The connection would then be effectively broken, and if the outside address had been reassigned in the meantime, the late reply could reach the wrong internal client.

The other way to reduce the amount of time is to improve the router’s recognition of when connections are complete. However, this adds a fair amount of complexity. Often, a client will have multiple connections open to the Internet at a given time. This is especially true for Web surfing, for example. Each different element on a Web page is retrieved as a separate connection, at least under HTTP 1.0. If you connect to a Web page with 10 pictures, that will result in at least 11 connections—1 for the HTML page, and 10 for the pictures. So, a router can’t simply watch for the end of any connection, it has to watch for the end of every connection. The router has to know how many connections are taking place, which means it has to watch for the beginnings of connections in order to count them.

This is all handled in yet another table. Each time a connection starts, an entry is created in the table. Each of these table entries may have its own timer, rather than using one global timer for the whole inside machine. This works pretty well for connection-oriented protocols like TCP, where there is a clear beginning and end to connections, but it doesn’t work quite as well for connectionless protocols like UDP and ICMP, so for those we’re back to timers.
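The counting idea can be sketched as follows. This is a hypothetical illustration for TCP only: the class name and its methods are invented, and per-entry timers are left out to keep it short.

```python
from collections import defaultdict


class ConnectionCounter:
    """Tracks open TCP connections per inside host (illustrative only)."""

    def __init__(self):
        self.open = defaultdict(set)   # inside address -> set of connection keys

    def on_syn(self, inside, src_port, dst, dst_port):
        """A connection is starting: remember it."""
        self.open[inside].add((src_port, dst, dst_port))

    def on_close(self, inside, src_port, dst, dst_port):
        """A connection finished (FIN exchange or RST). Returns True when the
        host has no connections left, so its address mapping can be freed."""
        self.open[inside].discard((src_port, dst, dst_port))
        if not self.open[inside]:
            del self.open[inside]
            return True
        return False
```

For the Web page with 10 pictures, on_close returns False for the first 10 connections that end and True only for the eleventh, which is the point at which the outside address could safely be released.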

All in all, dynamic NAT (as stated here) isn’t very workable. It seems clear in our example that if 12 people on the inside are actively using the Internet at a given moment, no additional inside people will get to use the Internet.

Clearly, something that can guarantee fair access for an arbitrary number of inside machines simultaneously is needed. That’s why dynamic NAT doesn’t work exactly the way we said; this is covered in detail in the PAT section.

Configuration Examples

Unfortunately, configuration examples for many-to-many dynamic NAT will be pretty sparse. In fact, out of our three examples, only Cisco IOS supports many-to-many NAT.

Cisco IOS

We’re going to look at a many-to-many example using IOS. For this example, we’re back to the first config we looked at (no NAT config yet). Here are the commands:

The first five lines are the same as before. The next line defines a pool, named dynpool, which is a range of IP addresses from 130.214.99.200 through 130.214.99.250. When the router uses them, it will use them as if they had a subnet mask of 255.255.255.0.

Next is the NAT command, which starts with ip nat inside source, like the other. In this case, though, we’re going to match against an access list to pick up our source addresses. The translated addresses will be from a pool named dynpool. The overload keyword means that potentially there will be more inside addresses than there are addresses in the pool, and the router is to handle that situation in a particular way (see the next section on PAT). Finally, we define list 1, which we referenced in the previous command. List 1 is simply the inside IP address range.

With this configuration, when an inside machine wants to get out, the router will assign it an IP address from the pool dynamically. When this configuration was tested, IP address .200 was assigned.

Port Address Translation (PAT)

There is a way to address the problems with static and dynamic NAT, to allow more than one inside machine to share one outside IP address. It’s called Port Address Translation, or PAT. Some folks may also think of PAT as being dynamic NAT since, as we’ll see, PAT is really necessary for dynamic NAT to function properly. In other cases, vendors will refer to PAT simply as “NAT” and you’ll have to look at the listed capabilities of the product to determine exactly what type it is. In Firewall-1, which is a very popular firewall product from Checkpoint, PAT is referred to as “hide NAT,” making reference to the fact that many inside IP addresses can “hide” behind one IP address.

The reason for the naming confusion is twofold: First, NAT is defined for a given product by the marketing departments of that vendor, so there is bound to be some confusion. Second, PAT is really the dominant form of NAT in use today (though static NAT is sometimes a necessary part of the security architecture). So, many vendors of PAT-capable products oversimplify, and just call the whole collection of features NAT. As with any product evaluation, if you’re considering purchasing a product, take a look at the technical documentation to see exactly what the capabilities are.

So what’s the problem with two inside machines sharing the same outside IP address anyway? Collisions—not collisions in the Ethernet sense, if you’ve studied Ethernet at all, but rather colliding port numbers and IP addresses. Let’s look at the naïve version of sharing an outside address. Two machines on the inside transmit requests to the Internet. When the replies come back, they both come back to the outside IP address. How can the router decide which of the two IP addresses on the inside the packets should be sent to?

Let’s look at a more involved version of a NAT router that is trying to use one outside IP address for more than one inside machine. In the section on dynamic NAT, we discussed a router that is capable of tracking individual connections as they pass through and are translated. Adding this capability would seem to correct the problem of the router not knowing which IP address to send the packet back to. It can simply scan through the table and look for a connection that the current packet seems to match. When the router finds the match, it looks up the inside IP address that connection belongs to, and forwards it to that machine, after proper translation, of course.

Does this work? Not quite yet. Back to the issue of collisions: Imagine that two inside machines, which share the same outside IP address, want to make a query of the ISP’s DNS server. Since the DNS server is maintained by the ISP, it’s “on the Internet” from the client’s point of view. At least, it’s on the far side of the NAT router from the client, so there will be a translation on the way out. Let’s take a look at what kind of information might be in the connection table we’ve been talking about. Certainly, there are IP addresses: Internet IP address (the server), inside IP address (real inside machine address), and outside IP address (the address the inside machine is translated to). Another obvious thing to track is the TCP and UDP port numbers for those types of connections, both source and destination ports. For our example, let’s assume all of this is tracked.

Back to the clients talking to the DNS server: They will be sending packets to the same server IP address, and the same port number (UDP port 53 for client DNS queries). We already know they share the same outside IP address, so in the connection table for these two separate “connections” (in quotes because UDP is connectionless), the Internet IP address is the same, the outside IP address is the same, and the destination port number is the same. The inside IP addresses are different, and the source port numbers are probably different. The requests go out with no problem.

The problem is, two requests from two separate inside machines look very similar, and probably only differ on the source port and data portion of the packet.

When a reply comes back to the outside IP address, the only differentiating factor at that time (since the router doesn’t know which inside IP address to send to; that’s what it’s trying to figure out) is the source port. More specifically, it looks at what is now the destination port (source and destination port get reversed on replies), decides which of the two inside machines was using that as a source port, and sends it to that one.

There’s where the possibility for collision comes in. Most operating systems will start allocating source ports at 1025, and work their way up sequentially. There’s a very good chance that at some point, the two inside machines will happen to be using the same source port at the same moment, trying to talk to the same IP address on the Internet, at the same destination port. Everything matches except for the inside IP address, which is not good since that’s the unknown piece of information when the packet arrives at the outside IP address on the router.

The problem lies in the fact that the headers in the two requests are the same, but the data portion differs. The NAT device has to determine which packet goes to which inside machine.
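The collision can be made concrete with a short Python sketch. It assumes a router that translates addresses only and leaves source ports alone. The function name is invented, and the DNS server address 198.51.100.53 is a placeholder from a documentation range, not an address from this chapter.

```python
def find_collisions(requests, outside_ip, server_ip, server_port, protocol="udp"):
    """Return pairs of inside hosts whose replies would look identical
    when only addresses are translated and source ports are left alone."""
    seen = {}          # reply key -> inside address that sent the request
    collisions = []
    for inside_ip, src_port in requests:
        # Everything the router can read from a reply at the outside address.
        key = (protocol, server_ip, server_port, outside_ip, src_port)
        if key in seen and seen[key] != inside_ip:
            collisions.append((seen[key], inside_ip))
        seen.setdefault(key, inside_ip)
    return collisions


# Two inside machines that both happen to use source port 1025.
clash = find_collisions([("10.0.0.2", 1025), ("10.0.0.3", 1025)],
                        "192.138.149.3", "198.51.100.53", 53)
```

Here clash holds the pair 10.0.0.2 and 10.0.0.3, because nothing in the reply headers distinguishes them. If the second machine had used port 1026 instead, the list would be empty.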

How Does PAT Work?

Statistically, we’ve got a smaller chance of having a conflict than we did with straight dynamic NAT. Still, we’d like to make the chance of conflict negligible. This is where PAT comes in. If you hadn’t already guessed from the name, PAT works by translating port numbers along with IP addresses. Specifically, when it translates the source address on the way out, it also translates the source port.

If the router is careful not to create conflicts when it chooses new source ports, this solution works well and eliminates conflicts, at least for TCP and UDP. Some extra tricks are sometimes needed for ICMP, which has no port numbers per se.

Now, the router has a unique port number to reference when all the other information matches another connection. PAT enables a very large number of inside machines to share even just one outside IP address. How many exactly? It’s difficult to give an exact number, since it depends on usage patterns, so let’s make some assumptions. Assume that the limiting factor will be many inside machines communicating with a single Internet IP address at one time. The worst case will probably be UDP, since we’re stuck using timers to emulate connections (to know when they’re done). Let’s say the timer is set for two minutes. That is, after two minutes of no packets from either side, the connection is declared over. The possible range of port numbers is 0 to 65,535, so the theoretical limit is 65,536 simultaneous connections. This assumes that they are all happening at the same time, either because they all start at the same time and have to wait two minutes, or because the connections are active longer than that, and it builds up to that level. This is for one outside IP address. If a flavor of dynamic NAT is being used, multiply that number by the number of IP addresses being used for dynamic NAT with PAT.

Remember, that applies only if all the clients want to talk to the same machine on the Internet. If you consider all the machines on the Internet, the chances for conflict drop to nearly zero. Chances are good that in the real world, you’ll exhaust the memory of your NAT device before you start reaching any theoretical limits.
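As a quick check of the arithmetic, the theoretical ceiling can be written out in Python. The function name is made up for the sketch, and the figure is the upper bound discussed above, not a number any real device is likely to reach.

```python
PORTS_PER_ADDRESS = 65536   # port numbers 0 through 65,535


def max_connections_to_one_server(outside_addresses):
    """Theoretical ceiling on simultaneous connections to a single Internet
    address, assuming each one needs its own translated source port."""
    return PORTS_PER_ADDRESS * outside_addresses


one_address = max_connections_to_one_server(1)
twelve_addresses = max_connections_to_one_server(12)
```

One outside address gives 65,536, and the 12 usable addresses from the dynamic NAT example would give 786,432.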

What is the security situation with PAT? It’s starting to look a lot better. An outside IP address no longer corresponds to a single inside IP address; it now depends on the connection. This means that if a new connection attempt is made to the outside address, it will not match anything in the connection table, and will therefore not have an internal IP address to connect to. At least, that’s the most common behavior when an Internet machine tries to connect to an outside address. It’s theoretically possible to design the PAT so that a particular outside IP address maps to a particular inside address (combined static NAT and PAT). For a security application, you would not want that behavior. Another “gotcha” is to check whether the outside IP address used for PAT is the NAT device’s own IP address for that interface. With some routers it’s possible to use the router’s own outside IP address for PAT. In that case, connection attempts to the outside IP address will connect to the router, which may not be desirable.

Many PAT implementations only allow a particular inside pool to map to a single outside IP address. Presumably, this is because just about any size inside network can map to a single outside IP address.

Let’s take a look at what these connection tables we’ve been discussing might look like. They include inside source IP address, outside source IP address, destination Internet IP address, original source port, translated source port, destination port, transport protocol, FIN flags, and timer. FIN flags would be a couple of simple flags to indicate that a FIN exchange has been done for one of the two directions. TCP connections, if closed properly, close each direction separately, so we need to track each direction. When both flags are set, the whole connection is done. If a RST occurs instead, the flags aren’t needed, and the connection is done immediately.
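One way to picture such an entry is as a small Python record. This is a sketch under assumptions: the field names follow the list above but are otherwise invented, the timer is left out, and TCP flags are passed in as a set of strings for simplicity.

```python
from dataclasses import dataclass


@dataclass
class TableEntry:
    inside_ip: str
    outside_ip: str
    internet_ip: str
    original_port: int
    translated_port: int
    destination_port: int
    protocol: str = "tcp"
    fin_from_inside: bool = False
    fin_from_outside: bool = False
    closed: bool = False

    def observe(self, flags, from_inside):
        """Update the entry for one TCP segment; returns True once it is done."""
        if "RST" in flags:
            self.closed = True            # a reset ends the connection at once
        elif "FIN" in flags:
            if from_inside:
                self.fin_from_inside = True
            else:
                self.fin_from_outside = True
            if self.fin_from_inside and self.fin_from_outside:
                self.closed = True        # both directions have been closed
        return self.closed
```

A FIN from only one side leaves the entry open; it is considered finished after the second FIN arrives from the other direction, or immediately on a RST.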

Figure 4.13 contains a diagram of a possible connection, which we can use as an example. In the diagram, the inside machine is 10.0.0.2, the router’s outside IP address is 192.138.149.1, and the server we’re contacting on the Internet is 207.244.115.178. The line between the Web server and the router represents the Internet between the two.

Figure 4.13 Simple PAT arrangement, using a router’s own outside IP address.

The inside machine sends a SYN packet to port 80 on the Web server, using a source port of 1030. Here’s what the table entry might look like:

All of the labels that indicate direction are from the point of view of the first packet, the SYN packet, going from the inside to the outside. Many of the items will be reversed for packets going the other way, but the router will keep track of that by noting into which interface the packet arrived.

Here’s a rough block diagram of the SYN packet headers just leaving the inside machine:

Here is the same packet after it passes through the router:

Notice that the source address and source port have both been translated. Here’s the reply packet from the Web server:

Source and destination have been reversed, and the flag is now SYN-ACK. This is the packet that will arrive at the outside of the router. The router has to make its decision with these main fields. All the router has to do is match the four leftmost fields to the connection table. If there is a match, it routes the packet and restores the original source address and source port (now destination address and port):

The address and port the router needs to translate the packet back are simply looked up in the connection table. The connection table entry will remain until one of three conditions is met:

 Both sets of FIN packets are received

 A RST packet is sent by either end

 The timer runs out

The timer is checked periodically to see if time has run out. In addition, each time a packet is routed for this connection, the timer is reset to two minutes, or whatever other value is used.

UDP works much the same, except there are no FIN or RST packets to indicate the end of a connection, so only a timer is relied on to end UDP connections.
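The translation and lookup steps above can be sketched as a toy PAT router in Python. Everything about it is an assumption made for illustration: the class and method names, the starting translated port of 6000, and the simplification that ports are handed out sequentially and never reused. Removal on FIN or RST is omitted, and entries that have timed out are ignored rather than deleted.

```python
class PatRouter:
    """Toy PAT: many inside hosts share one outside address (illustrative)."""

    def __init__(self, outside_ip, first_port=6000, idle_timeout=120):
        self.outside_ip = outside_ip
        self.next_port = first_port       # assumed start for translated ports
        self.idle_timeout = idle_timeout  # two minutes, as in the text
        # (proto, inside_ip, src_port, dst_ip, dst_port) -> translated port
        self.by_inside = {}
        # (proto, dst_ip, dst_port, translated port) -> [inside_ip, src_port, last_seen]
        self.by_outside = {}

    def outbound(self, proto, src_ip, src_port, dst_ip, dst_port, now):
        """Translate source address and port, creating a table entry if needed."""
        key = (proto, src_ip, src_port, dst_ip, dst_port)
        if key not in self.by_inside:
            port = self.next_port
            self.next_port += 1           # sequential, so no two entries clash
            self.by_inside[key] = port
            self.by_outside[(proto, dst_ip, dst_port, port)] = [src_ip, src_port, now]
        port = self.by_inside[key]
        self.by_outside[(proto, dst_ip, dst_port, port)][2] = now
        return (self.outside_ip, port, dst_ip, dst_port)

    def inbound(self, proto, src_ip, src_port, dst_port, now):
        """Match a reply against the table; returns the inside (ip, port) or None."""
        entry = self.by_outside.get((proto, src_ip, src_port, dst_port))
        if entry is None or now - entry[2] > self.idle_timeout:
            return None                   # unknown or timed-out connection
        entry[2] = now                    # each routed packet resets the timer
        return (entry[0], entry[1])
```

Two inside machines that both use source port 1030 toward the same Web server receive different translated ports, so the reply's destination port alone is enough to tell them apart.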

Problems with PAT

What kind of problems exist with PAT? PAT has all of the problems of static NAT (i.e., having to translate addresses that appear in the data portion of packets), plus a couple of new ones. Our discussion of PAT was based around the idea of a fully functioning static NAT. So any protocols that pass IP addresses in the data portion of packets, like FTP, should be handled. Well, not quite. The sharing of an outside IP address that gives us the almost-firewall effect of not allowing machines on the Internet to connect inside works against us here.

Again, FTP serves as a good example of the problem. We’ll assume the data portion of the packets (the FTP PORT command) is getting modified properly. So what happens when the FTP server tries to connect to the outside IP address at the port supplied? There is no entry in the connection table to permit it, and it will fail.

The solution is obvious. While the NAT software modifies the PORT command (and now it has to change the port passed in the same manner as it does for other connections), it also creates an entry in the connection table.

For this example, refer back to Figure 4.9. This time, the protocol will be FTP instead of HTTP. After the initial connection has been made, the connection table looks like this:

At some point during the connection, the FTP client will issue a PORT command. For our example, we’ll use PORT 10,0,0,2,4,19. The port number section 4,19 translates to 1043 in decimal, which is what port the OS will hand out next. The router will have to translate this PORT command. If we assume the next translated port the router makes available is 6177, the PORT command becomes PORT 192,138,149,1,24,33. (The PORT command works in bytes: 24*256+33 = 6177.) In addition, the router must add this new port to the connection table. Now the table looks like this:

Now, with this addition, PAT properly handles FTP. The data connection will be handled as a separate connection, and will be removed under the same circumstances as any other TCP connection. We have finally achieved our goal of IP address savings, which is the driving factor for wanting to use NAT in the first place.
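The PORT rewriting step can be sketched in Python using the numbers from the example. The function names are invented for the illustration, and the sketch covers only the text of the command; a real NAT device also has to fit the changed command back into the packet.

```python
def parse_port_command(line):
    """Split 'PORT h1,h2,h3,h4,p1,p2' into (ip, port)."""
    fields = [int(f) for f in line.split(None, 1)[1].split(",")]
    if len(fields) != 6 or any(not 0 <= f <= 255 for f in fields):
        raise ValueError("malformed PORT command: %r" % line)
    ip = ".".join(str(f) for f in fields[:4])
    return ip, fields[4] * 256 + fields[5]


def build_port_command(ip, port):
    """Encode (ip, port) back into the PORT argument format."""
    high, low = divmod(port, 256)
    return "PORT %s,%d,%d" % (ip.replace(".", ","), high, low)


def rewrite_port_command(line, outside_ip, translated_port):
    """Replace the inside address and port with their translated values.
    Returns the new command and the (ip, port) pair it replaced, so the
    caller can add the expected data connection to its table."""
    inside = parse_port_command(line)
    return build_port_command(outside_ip, translated_port), inside


new_command, inside_endpoint = rewrite_port_command(
    "PORT 10,0,0,2,4,19", "192.138.149.1", 6177)
```

Here new_command is PORT 192,138,149,1,24,33 and inside_endpoint is the pair 10.0.0.2 and 1043, matching the values worked out by hand above.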

NOTE

The FTP server will use a source port of 20 when connecting back to clients to deliver data.

With this type of setup, PAT works well. There is one small “gotcha” that comes up on occasion. There really isn’t any good reason to do so, but some servers on the Internet will pay special attention to the source port that is used when they are being connected to. This comes up most often with DNS. Traditionally, when two DNS servers communicate using UDP, they will use port 53 as a destination port, as well as their source port. This is a matter of convention rather than a hard and fast rule. If we’re translating the source address, though, there could be a problem. There are a few sites on the Internet that have configured their DNS servers to accept connections only from port 53.

This has come up in the past with both apple.com and intel.com, but they aren’t the only ones. It can be difficult to get others to change to suit you, so if you find yourself having trouble with a particular DNS server, you may have to change the translation for your internal DNS server to static so that the source port of 53 isn’t changed on the way out. This applies only if you run your own inside DNS servers. If you use your ISP’s DNS servers (which would be outside), then most likely you won’t have a problem.
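The workaround amounts to a small exception in how the translated source port is chosen. The following Python sketch shows the idea only; the function, its parameters, and the inside DNS server address 10.0.0.53 used below are hypothetical.

```python
def choose_source_port(inside_ip, src_port, static_hosts, next_free_port):
    """Return the source port to put on the outbound packet.

    static_hosts is a hypothetical set of inside addresses configured for
    static translation; their source ports pass through unchanged."""
    if inside_ip in static_hosts:
        return src_port
    return next_free_port


STATIC_HOSTS = {"10.0.0.53"}   # hypothetical inside DNS server
dns_port = choose_source_port("10.0.0.53", 53, STATIC_HOSTS, 6000)
client_port = choose_source_port("10.0.0.2", 1030, STATIC_HOSTS, 6000)
```

The DNS server keeps source port 53 on the way out, while an ordinary client is given the next translated port.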

Configuration Examples

In a way, almost all the configuration examples (minus the Cisco static NAT example) have been PAT examples. At their cores, ICS and IP Masquerade are PAT products, even if you’re only translating one address to another. IOS can do it or not, depending on how you configure it. Even so, we’ll take an opportunity to go into a little more depth, and look at a few more examples.

The reason for the ruse so far is that, practically speaking, NAT (without PAT) doesn’t actually work. All of the problems we’ve discussed so far make plain NAT unusable.

Windows NT 2000

There really isn’t a lot more to say about ICS from the first example. It’s a PAT product, and all the inside IP addresses are forced to 192.168.0, and are port-translated out using the single dial-up address. There is, however, another option we haven’t looked at yet. There was another tab on the window brought up by the Settings button, as shown in Figure 4.14.

Figure 4.14 ICS reverse connection setup.

Much like the Services screen, special application handling can be defined here. This is intended to cover behavior like FTP exhibits, where a reverse connection needs to be made. Unlike the FTP handlers we’ve seen though, this is a little less flexible. With the FTP handlers, just the one port needed is opened long enough for the connection to be made. In this case, we’re being invited to leave a range of ports open back to the inside for as long as the service is in use. This also tends to invite more conflicts, since having a port on the outside open gives us all the problems of many-to-one NAT. Even so, using this may make it possible to get an application working that otherwise wouldn’t. It’s better to have the option than not.

Since the product is still beta, documentation is scarce. I know passive FTP works with no special configuration because I tried it. It’s likely that other protocols are handled in a special way, too, but Microsoft hasn’t told us which ones yet.

Probably the biggest issue with ICS is that it works only with dial-up, and that it forces DHCP on you. This means it won’t work with cable modems, DSL, or any technology that wants to connect via a LAN interface. Microsoft sells a much higher end product called Microsoft Proxy Server (MSP). It’s much more flexible, but it retails for $1000 US.

There are other commercial solutions that fill in the price gaps between free and $1000. To find a list of commercial NAT products for NT, consult the “References and Resources” section, later. I’ve personally had very good luck with Sygate, of which the most expensive version (unlimited inside users) costs only about $300 US.

Linux IP Masquerade

IP Masquerade is also doing PAT, even when working on just one inside IP address. Changing our static NAT to many-to-1 PAT is very simple. Change the line:

to:

which will take care of the whole inside subnet.

There is a good set of documents on how to use IP Masquerade; links to them can be found in the “References and Resources” section. If you plan to deploy IP Masquerade in production, you owe it to yourself to read them. You will also need to read the IP Chains documentation (notice the ipchains command we’re using to configure IP Masquerade). IP Chains is the built-in firewall for Linux kernel 2.2.x. IP Masquerade is not sufficient to keep your system secure.

Let’s take a look at some other aspects of IP Masquerade. We know there’s a module that specifically handles FTP. What other modules are there? If you recall, the command that installed the FTP handler was modprobe. The command modprobe -l will list all modules available for install. In that list, these stick out:

Our FTP module is in the list, and judging by the names, they are obviously IP Masquerade modules. Several of those are immediately recognizable, and are known to cause difficulty when used with firewalls or NAT. These include FTP, Real Audio, Quake, IRC (specifically, DCC send), CUSeeMe, and VDOLive.

There is a place where IP Masquerade handlers can be obtained, and ones that don’t exist can even be requested. Please take a look at the “References and Resources” section of this chapter for details.

Cisco IOS

We’ve already seen the Cisco PAT, too—that’s what the “overload” configuration was. This variation gets all inside machines to go out using the router’s own IP address:

This tells the router to use access list 1 (match all 192.168.0 addresses) and to translate using the router’s own IP address for fastethernet 0/1 as the source address.

Here’s a full working config for this:

Naturally, if you want to use this config, you’ll have to correct IP addresses and interface names. Also, the passwords have been crossed out, so put those in manually. It’s always a good idea to sanitize your router configuration files before you let anyone else see them.

This type of configuration (having all inside machines translate to 1 outside IP) is often useful when connecting to an ISP.

The Cisco has another interesting feature that we haven’t looked at yet. The IOS lets you examine the connection tables! We looked at some theoretical examples before, and now we can look at some real ones.

Here’s an example from the static NAT configuration on IOS:

Cisco doesn’t expose the FIN flag or timers. Also, notice that there are four address:port pairs. That’s because the IOS can do double NAT inside one box.

In this case, inside machine 192.168.0.2 had Telnetted (port 23) to 130.214.250.9. The source address was translated to 130.214.99.250. On the left, you can see that the transport protocol is TCP.

Here’s an example from the dynamic NAT config (using a pool of outside addresses):

The address pool starts at 130.214.99.200, and that address was picked for the same machine for all connections. Here, we see more Telnet connections, and a few DNS connections (UDP port 53).

Here’s the state table during our PAT example, when all inside machines are going out as the router’s IP address:

Here, we’ve got TCP, UDP, and ICMP. Notice that the ICMP connections have what appears to be a port number next to them. Some NAT devices will impose state information on ICMP in order to be able to distinguish it. It’s unclear if that’s what’s happening here, but it’s possible that the router has replaced part of the ping stream with 256, or some representation of it, and this is how it’s tracking that.

Here is what the table looks like during an FTP session, using the PAT config:

The first listing is just after an ls command was issued in the FTP client. We can see our connection out to port 21, and the reverse connection back from port 20. The second list is after another ls command. Notice the previous reverse-connection entry is gone. Finally, if you need to, it’s possible to empty the translation table manually:

What Are the Advantages?

If you’ve read the previous sections in this chapter, you probably already have a pretty good idea of the advantages of using NAT. Primarily, it allows you to use a relatively small number of public IP addresses to connect a large number of inside machines to the Internet. It also buys you some flexibility in how you connect to other networks.

For Managers

How Many IP Addresses Do You Really Need?

There are many more Internet connectivity options available today than there were just a short while ago. These include modems, ISDN, traditional leased-line, DSL, cable, wireless, and more. Prices, performance, reliability, and availability vary widely, but they all have one feature in common: The more IP addresses you want, the more expensive it will be. From a financial perspective, it makes sense to get by with as few as possible. NAT can go a long way towards reducing the number of IP addresses needed. PAT can be used in most cases to let all your internal machines access the Internet as if they were one IP address. This can be crucial if the access technology only allows for one IP address, such as dial-up access (modem, ISDN). If you plan to host any publicly accessible services on your premises, such as a Web server or DNS server, you’ll need a few more IP addresses. This usually isn’t too much of a problem, since dial-up access isn’t appropriate for hosting public servers anyway. You can still use PAT to keep the inside Internet access down to one IP address, and get enough other addresses to cover however many servers you want to run. If you do have public servers, however, don’t fool yourself into thinking that NAT is a complete security solution. It’s not. You must still implement a full security solution, probably including a firewall.

The goal of using a small number of IP addresses on your NAT device for many inside machines is usually the motivating factor behind wanting to use NAT. This goal is achieved in the real world through a particular type of NAT, called PAT. PAT allows many inside machines to use a small number of IP addresses (often as few as one) to connect to the Internet.

NAT also gives you some flexibility in how you handle changes or outages. Sometimes a machine goes down or moves, and rather than reconfigure many client machines, you’d like to translate addresses on the router to point to the new server, or to an existing server at a new address. This can also be useful for temporarily dealing with address conflicts.

What Are the Performance Issues?

What is the cost in performance for all of these NAT features? Not many hard numbers are available. For NT’s ICS, performance is probably a moot point, since it has to involve a dial-up interface. Certainly ICS will function fast enough to max out a dial-up connection. IP Masquerade could have some meaningful testing done to it, but I’m not aware of any performance testing that has been done. In addition, Linux is very much a moving target. Changes come quickly, and they may include performance enhancements. Linux also runs on a wide variety of platforms, so if you run into a performance bottleneck while using IP Masquerade, you’ll probably be able to scale it up with better hardware. Cisco has provided some rough numbers here:

Cisco gives numbers for three of their router platforms: 4500, 4700, and 7500. The 4500 is able to run at about 7.5–8.0 Mbps on 10Mb Ethernet for all packet sizes. The 4700 is able to run at 10 Mbps on 10Mb Ethernet for all packet sizes. The 7500 throughput ranges from 24 Mbps for 64-byte packets, to 96 Mbps for 1500-byte packets on Fast Ethernet.
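To put those figures in per-packet terms, the short calculation below converts the quoted 7500 throughput into packets per second. It is only a rough sketch: it assumes the quoted packet sizes are the full packet sizes, takes 1 Mbps as 1,000,000 bits per second, and ignores Ethernet framing overhead. The function name is made up for illustration.

```python
def packets_per_second(throughput_mbps: float, packet_bytes: int) -> float:
    """Convert a throughput in Mbps into packets per second.

    Assumes 1 Mbps = 1,000,000 bits per second and ignores framing overhead.
    """
    bits_per_packet = packet_bytes * 8
    return throughput_mbps * 1_000_000 / bits_per_packet


# Figures quoted in the text for the 7500 on Fast Ethernet
small = packets_per_second(24, 64)      # 64-byte packets at 24 Mbps
large = packets_per_second(96, 1500)    # 1500-byte packets at 96 Mbps

print(f"64-byte packets:   {small:,.0f} packets/s")
print(f"1500-byte packets: {large:,.0f} packets/s")
```

Under these assumptions the router handles many more packets per second when the packets are small, even though it moves fewer bits. That suggests the cost of NAT is paid mostly per packet rather than per byte.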

Of course, for all three NAT packages we’ve been looking at, this depends on what else these platforms are doing. If the NT ICS server is running a CPU-intensive game at the time, performance may dip. If the Cisco router is also performing an encryption on the traffic, performance will drop there, too.

It’s not surprising that there should be some delay when performing NAT versus just plain routing. At a high level, the routing function is relatively simple:

1. Receive the packet.

2. Verify checksums.

3. Consult the routing table.

4. Decrement the TTL field.

5. Recalculate the checksums.

6. Transmit.

Compare this with the functions needed for NAT:

1. Receive the packet.

2. Verify checksums.

3. If it entered on the outside interface, check if there is a matching connection table entry.

4. Consult the routing table.

5. Check if the outbound interface is marked for NAT.

6. Determine portions of the packet to be modified.

7. If it is the first packet in a new connection, create a table entry.

8. If it is a PORT command or similar, rewrite the data portion and create a new table entry.

9. If it is a FIN packet, remove the table entry.

10. Modify the packet as needed.

11. Recalculate checksums.

12. Transmit.
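The table-handling steps in the NAT list can be sketched in a few lines of Python. This is a toy model, not how any of the products discussed here is implemented: the class and method names are invented for illustration, routing and checksum work are left out, a single outbound FIN removes the entry immediately (real devices also rely on timers), and the port allocator never wraps or reuses ports.

```python
from typing import Dict, Optional, Tuple

# proto, inside ip, inside port, destination ip, destination port
FlowKey = Tuple[str, str, int, str, int]
# proto, translated (outside) port, remote ip, remote port
ReturnKey = Tuple[str, int, str, int]
Addresses = Tuple[str, int, str, int]   # src ip, src port, dst ip, dst port


class PatTable:
    """Toy connection table for a PAT device with one outside address."""

    def __init__(self, outside_ip: str, first_port: int = 1024) -> None:
        self.outside_ip = outside_ip
        self.next_port = first_port
        self.outbound: Dict[FlowKey, int] = {}
        self.inbound: Dict[ReturnKey, Tuple[str, int]] = {}

    def translate_outbound(self, proto: str, src_ip: str, src_port: int,
                           dst_ip: str, dst_port: int,
                           fin: bool = False) -> Addresses:
        key = (proto, src_ip, src_port, dst_ip, dst_port)
        if key not in self.outbound:
            # Step 7: first packet of a new connection, create a table entry
            port = self.next_port
            self.next_port += 1
            self.outbound[key] = port
            self.inbound[(proto, port, dst_ip, dst_port)] = (src_ip, src_port)
        port = self.outbound[key]
        if fin:
            # Step 9 (simplified): a FIN removes the table entry
            del self.outbound[key]
            del self.inbound[(proto, port, dst_ip, dst_port)]
        # Step 10: the source address and port are rewritten
        return (self.outside_ip, port, dst_ip, dst_port)

    def translate_inbound(self, proto: str, src_ip: str, src_port: int,
                          dst_port: int) -> Optional[Addresses]:
        # Step 3: a packet from outside must match an existing entry
        entry = self.inbound.get((proto, dst_port, src_ip, src_port))
        if entry is None:
            return None                  # no matching connection: not translated
        inside_ip, inside_port = entry
        return (src_ip, src_port, inside_ip, inside_port)


table = PatTable("130.214.99.250")
print(table.translate_outbound("tcp", "192.168.0.2", 1025, "130.214.250.9", 23))
print(table.translate_inbound("tcp", "130.214.250.9", 23, 1024))
```

Every packet in either direction costs at least one dictionary lookup, and new or closing connections also cost writes, which is where the extra latency over plain routing comes from.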

Even if there is enough CPU speed, there will still be a small latency increase, as these steps require numerous memory lookups and writes. The good news is that under most circumstances, performance won’t be an issue. Usually NAT will be a problem only when routers are already under a heavy load.

For IT Professionals

Which Product Are You Going to Pick?

Chances are it’s going to depend on which operating system you know best, and possibly what equipment you already have. If you are comfortable with UNIX, then IP Masquerade or something similar would probably be your preference. If you’re an NT person, then you’ll want something on NT. To be realistic, it probably won’t be ICS. ICS is really only good enough for a home LAN, which isn’t too surprising, since it was designed for that. In some cases, it isn’t even suitable for that, since it might be a cable modem you want to share. If you’re a network person, or maybe if you just already have a Cisco router in place, you may want to implement your NAT there. Cisco routers aren’t the only ones that do NAT, either, in case you have a different brand. It’s doubly important to pick a solution that runs on your platform of choice, because chances are that it’s not just a NAT architecture, but also a security architecture. Like it or not, as soon as you hook up to the Internet, you’ve got a security problem to worry about. You’ll have to configure whatever platform you want to run on to be as secure as possible, so it should be whatever operating system you know best.

Proxies and Firewall Capabilities

Now that we’ve covered in depth what NAT is and how it works, let’s discuss security. So far, we’ve only covered firewalls indirectly, mentioning them here and there while discussing NAT. Let’s begin with some basic definitions, and later get to how firewalls are similar to, and different from, NAT packages.

What is a firewall? That’s a bit of a religious issue, as firewall means different things to different people. The original meaning of firewall was a barrier, often in a structure, designed to take a certain amount of time to burn through during a fire. For example, a building may have some walls or portions of walls that are firewalls, designed to compartmentalize a fire for a certain amount of time, to limit damage. Some people liken firewalls in the electronic security sense to these barriers, saying they are designed to deter intruders for a period of time, and to compartmentalize parts of the network. So, if there is a breach in one portion of a network, the others aren’t instantly affected, too.

Other folks will argue that a firewall is features X, Y, and Z, with X, Y, and Z being whatever features they desire in a firewall. Some say that the firewall is the portion of a security architecture that stops traffic. Others say it includes the pieces that allow certain types of traffic.

The folks who participate in these discussions are the philosophers of firewalls. These discussions often take place on mailing lists dedicated to firewalls. What’s a little disturbing is that these folks, some of whom invented firewalls, can’t agree on terminology.

Realistically, firewalls are defined by companies who sell products called firewalls. It turns out that the situation isn’t as bad as it might seem, because nearly all of these products have a number of features in common. We’ll be taking that road, so we’ll be discussing features.

Packet Filters

Networks, by their nature, are designed to pass as much as possible, as quickly as possible. The original routers had no need of intentionally blocking things, except perhaps for corrupt packets. That is, corrupt in the sense that the appropriate checksums don’t match. Supposedly, in the early days of the Internet, security wasn’t much of a concern.

I’ve heard at least a few stories that indicate that people wanted to start filtering certain kinds of traffic due to errors. Someone, somewhere, made a configuration error, and traffic starts flying that causes someone somewhere else some trouble. Thus were born packet filters.

Packet filters are what they sound like—devices that filter packets. Very commonly they are routers, but they can also be general-purpose hosts, such as Windows NT or Linux. The earliest packet filters would have been able to block packets based on the IP addresses contained within. Later, they would be able to block packets based on port numbers. Modern packet filters can filter on a variety of criteria. These include IP addresses, port numbers, transport type, certain flags in TCP headers, and more.

These packet filters have long been used as part of a traditional proxy/screening router firewall architecture (see the “Proxies” section, next). Typically, they will be used to block types of traffic that aren’t allowed by policy. They can also be used reactively to block attacks after they have been detected (i.e., block all traffic from a particular address range).

Traditional packet filters (PF) have the characteristic that they don’t change packets, and they don’t have state. In other words, a PF can only pass or not pass a packet, and it can only make that decision based on information in the current packet. In addition, PFs are statically configured, meaning that they can’t change the filter rules based on traffic.

Many packet filters have a way to filter on “established,” which would seem to indicate that they are able to track conversations in progress. In fact, to a PF, “established” simply means that the ACK bit is set in the TCP header.

PFs have some serious limitations as firewalls. Let’s go back to the problem of how to handle FTP. Say you have an inside machine that you want to allow FTP access out. The control channel connection is easy. The filter rule says inside IP can go to any IP outside at port 21. Next, you can turn on the allowed established rule to allow established packets from any outside IP to the inside IP. At this point, the control connection will work, and you’re relatively protected. The problem becomes how to handle the reverse connections. The first packet back has only the SYN bit on, not the ACK bit, so the established rule will not help there. You don’t know what port the inside IP will be waiting on, only that it’s probably above 1023.

With a PF, though, all you can do is add a rule that says to allow packets from any IP, TCP port 20, to any IP at TCP port >1023. This opens up a massive security hole, as many operating systems run services at ports above 1023. Many of these services have known security holes. Anyone who figures out that you allow access to all inside IP addresses at all ports above 1023, if the source port happens to be 20, can attack you. For the clever attacker, the firewall might as well not be there.
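The three rules just described can be modeled as a function that looks at one packet at a time and nothing else. This is a minimal sketch, not the syntax of any real packet filter: the `Packet` type, the `permits` function, and the example addresses are all assumptions made for illustration. Port 6000 is used as the attack target because X11 conventionally listens there.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    syn: bool = False
    ack: bool = False


INSIDE_IP = "192.168.0.2"   # example inside host, assumed for illustration


def permits(pkt: Packet, allow_ftp_data: bool) -> bool:
    """Stateless decision based only on fields of the current packet."""
    # Rule 1: the inside host may connect to port 21 on any outside host
    if pkt.src_ip == INSIDE_IP and pkt.dst_port == 21:
        return True
    # Rule 2: "established" just means the ACK bit is set
    if pkt.dst_ip == INSIDE_IP and pkt.ack:
        return True
    # Rule 3: the broad rule needed for FTP reverse connections
    if allow_ftp_data and pkt.src_port == 20 and pkt.dst_port > 1023:
        return True
    return False


# First packet of a legitimate reverse connection: SYN set, ACK clear
ftp_data_syn = Packet("130.214.250.9", 20, INSIDE_IP, 1025, syn=True)
# An attacker who simply chooses source port 20 and targets X11
attack_syn = Packet("203.0.113.7", 20, INSIDE_IP, 6000, syn=True)

for allow in (False, True):
    print(allow, permits(ftp_data_syn, allow), permits(attack_syn, allow))
```

Without rule 3 the legitimate data connection is blocked. With it, the filter cannot tell the FTP server's connection from the attacker's, because both look identical when only the current packet is examined.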

FTP is simply a familiar example. If you take a look at the handlers that are available for IP Masquerade, you’ll see many more examples of protocols that would have to be handled in the same way.

However, if you had a special machine that didn’t have any vulnerable services running above 1023, and had otherwise been specially secured and locked down, it would probably be acceptable to configure the PF to allow traffic only to it in this manner, depending on the local security policy. Such a machine is often called a bastion host. The problem is, these machines tend to be less useful to everyday users, so they really can’t be put on everyone’s desk to act as their main productivity machine. So, what can the machine be used for? It can act as a proxy.

Proxies

Proxies were discussed somewhat at the beginning of this chapter. A proxy is a machine, often a bastion host, that is configured to fulfill requests on behalf of other machines, usually inside machines. We’ll get into the details of how the proxy actually works in a moment.

Imagine now that we’ve configured our PF to allow traffic only from the Internet to the proxy. Since we’ve configured it well, the fact that the Internet can get to ports above 1023 is not a major concern. Additionally, another PF between the proxy and the inside would be useful to help keep malicious inside users from attacking the proxy. Our architecture looks like that shown in Figure 4.15.

Figure 4.15 Protected proxy server.

It’s important to note that Figure 4.15 is more a logical diagram than a physical one. Although we could implement all of the pieces shown to achieve the desired effect, it may not be necessary. For example, the diagram would seem to indicate that the proxy has two interfaces—it could, but usually doesn’t. Traffic may enter and leave the same interface without causing difficulty if addresses are managed properly on the filtering routers. Also, with a flexible enough router acting as PF, this design can be done with one 3-interface router rather than two 2-interface routers. However, this diagram makes it much easier to visualize data flow.

The inside PF has another function besides protecting the proxy from inside users. Should the proxy be compromised in some way, it may help protect the inside against the proxy itself. This concept is important, and it’s called a DMZ (Demilitarized Zone). The term DMZ has a couple of different meanings to the firewall philosophers as well. Some purists call it the network just outside the outside interface of a firewall (or in our case, outside the outside PF). The definition we’ll be using is “a network segment that trusts neither the inside nor the outside, and is not trusted by the inside.” The word trust in this case implies unfettered network access. For example, the Internet at large trusts everyone, as everyone gets access. The inside network trusts no one, and no one gets direct access to the inside. Practically speaking, most folks consider a DMZ to be a third interface on the firewall (the first and second interfaces being the inside and outside).

So how exactly does a proxy work? We’ll start with traditional proxies. Basically, the proxy acts as a server to inside machines, and as a client to the Internet. Inside machines have to use either modified software, or a procedural change to make use of the proxy. Traditional proxies are not routers, and in fact the routing code should be turned off or compiled out of a bastion host that is to be a traditional proxy. If you send a packet towards a proxy, and its destination IP address isn’t the proxy’s address, the proxy will just throw the packet away. In all of our NAT examples, the destination address of packets always remained (except for the double NAT examples) that of its ultimate destination, some host on the Internet. Proxies work differently, and clients have to change their behavior accordingly.

The first requirement is that the destination IP address must be that of the proxy server, not the server the user actually wants on the Internet. Let’s look at a simple (contrived) example: Telnet.

With a NAT-type solution, you would simply Telnet to the name or address you wanted. Let’s design an imaginary proxy to handle Telnet. First, we write our program to listen for network connections, and pick a port on the proxy on which to run it. The port could be 23, replacing the regular Telnet mechanism (if any) on the proxy machine, or we could run it on its own port. For our example, we’ll pick port 2000. Our program will accept TCP connections, and then prompt for a name or IP address. Once it gets the name, it attempts to connect to that name at port 23. Once the connection is made, output from port 23 on the outside machine is sent to the inside machine, and any subsequent output from the inside machine (i.e., the user typing) is sent to the outside machine.

So, an inside user who wants to Telnet out must now Telnet to the proxy at port 2000, and enter the name of the machine to which they really want to Telnet. If it connects, they will see the output from it, and will be able to type input for it.

Of course in the real world, the Telnet protocol isn’t that simple, and our example isn’t quite sufficient. However, it illustrates the basic idea: have the inside client inform the proxy of what it wants. The proxy makes the connection on behalf of the client, retrieves some data, and passes it back to the client. It also passes any input from the client to the server.
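The imaginary proxy described above could look roughly like the following Python sketch. It is deliberately naive and makes several assumptions: the function names are invented, the prompt text is arbitrary, there is no Telnet option negotiation, no access control, and no logging. It simply prompts for a host, connects to port 23 there, and copies bytes in both directions.

```python
import socket
import threading

PROXY_PORT = 2000    # the port picked in the text
TELNET_PORT = 23


def parse_target(line: bytes) -> str:
    """Extract the host name or address the user typed at the prompt."""
    return line.decode("ascii", errors="replace").strip()


def read_line(sock: socket.socket, limit: int = 256) -> bytes:
    """Read one byte at a time until newline, end of stream, or the limit."""
    buf = b""
    while not buf.endswith(b"\n") and len(buf) < limit:
        chunk = sock.recv(1)
        if not chunk:
            break
        buf += chunk
    return buf


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src closes."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)   # pass end-of-stream along
        except OSError:
            pass


def handle_client(client: socket.socket) -> None:
    with client:
        client.sendall(b"Host: ")
        target = parse_target(read_line(client))
        try:
            server = socket.create_connection((target, TELNET_PORT), timeout=10)
        except OSError as exc:
            client.sendall(f"connect failed: {exc}\r\n".encode())
            return
        with server:
            server.settimeout(None)
            # One thread copies server output to the client...
            back = threading.Thread(target=pump, args=(server, client), daemon=True)
            back.start()
            # ...while this thread copies the user's typing to the server.
            pump(client, server)
            back.join()


def serve(bind_addr: str = "0.0.0.0") -> None:
    """Accept inside clients on PROXY_PORT; call this to start the proxy."""
    with socket.create_server((bind_addr, PROXY_PORT)) as listener:
        while True:
            client, _ = listener.accept()
            threading.Thread(target=handle_client, args=(client,), daemon=True).start()
```

Note that the proxy never forwards a packet. It terminates one TCP connection, opens a second one of its own, and moves only the data between them.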

How is FTP looking? The problem remains the same: the reverse connections. The proxy does the same trick as a PAT device, but in a slightly different manner. The control channel connection (to port 21) works more or less like the Telnet proxy example just given, until the PORT command. Upon identifying the PORT command in the data stream, it changes it in the same manner that a PAT device would, and substitutes its own address. The proxy also asks the OS for an available port, begins listening on that port, and sends that port number. It has to keep a copy of the original PORT command for later reference. When the outside server connects back to the proxy, the proxy opens a connection to the inside machine in the original PORT command and sends the data.
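The PORT handling can be sketched as follows. The argument of a PORT command is six comma-separated decimal numbers: the four octets of the IP address, then the high and low bytes of the port. The function names and the `pending` dictionary are assumptions made for illustration. A real proxy would obtain `proxy_port` by asking the OS for a free port (for example by binding a listening socket to port 0) before rewriting the command.

```python
from typing import Dict, Tuple


def parse_port(line: str) -> Tuple[str, int]:
    """Decode 'PORT h1,h2,h3,h4,p1,p2' into (address, port)."""
    verb, _, args = line.strip().partition(" ")
    if verb.upper() != "PORT":
        raise ValueError("not a PORT command")
    fields = [int(f) for f in args.split(",")]
    if len(fields) != 6 or not all(0 <= f <= 255 for f in fields):
        raise ValueError("malformed PORT argument")
    host = ".".join(str(f) for f in fields[:4])
    port = fields[4] * 256 + fields[5]
    return host, port


def format_port(host: str, port: int) -> str:
    """Encode (address, port) back into a PORT command line."""
    return "PORT {},{},{}\r\n".format(host.replace(".", ","), port // 256, port % 256)


def rewrite_port(line: str, proxy_ip: str, proxy_port: int,
                 pending: Dict[int, Tuple[str, int]]) -> str:
    """Substitute the proxy's address and port, remembering the original."""
    # Keep the inside machine's address and port for when the server calls back
    pending[proxy_port] = parse_port(line)
    return format_port(proxy_ip, proxy_port)


pending: Dict[int, Tuple[str, int]] = {}
print(repr(rewrite_port("PORT 192,168,0,2,4,1\r\n", "130.214.99.250", 2000, pending)))
print(pending)
```

When the outside server later connects to the proxy on port 2000, the proxy looks up `pending[2000]` to find which inside address and port should receive the data.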

So what does a user on the inside who wants to use FTP have to do differently? That presents a problem. With our Telnet example, it’s pretty easy to see how to get extra input from the user. The problem with FTP is that there are many, many different types of FTP client programs. These range from command-line text clients where users have lots of opportunity to enter input, to fully GUI FTP clients, where nearly everything is point-and-click.

One strategy is to have the inside user put in a special username. For example, instead of entering anonymous, they would enter anonymous@ftp.example.com. This would instruct the proxy to use the username anonymous, and connect to the FTP server ftp.example.com. The password would be supplied unchanged.
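On the proxy side, this convention amounts to splitting the login string at the last @ sign. A minimal sketch, with a made-up function name:

```python
from typing import Tuple


def split_proxy_login(login: str) -> Tuple[str, str]:
    """Split 'user@host' into the real username and the FTP server to contact."""
    # Split at the LAST '@' so that usernames containing '@' still work
    user, sep, host = login.rpartition("@")
    if not sep or not user or not host:
        raise ValueError("expected user@host")
    return user, host


print(split_proxy_login("anonymous@ftp.example.com"))
```

The proxy would then connect to the returned host and log in there with the returned username and the password exactly as the user typed it.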

This works for any FTP client where the user is prompted for a username and password. Problem is, when Web browsers follow an FTP link, they automatically use anonymous and whatever e-mail address you’ve got programmed into your browser. They don’t stop to prompt.

Web browsers are a problem in general. How is the user using a browser supposed to get the browser to connect to the proxy, and how are they to supply the URL of the real site to the proxy? There are tricks that can be tried, such as putting in special URLs and treating the proxy as a Web server. These work theoretically, though with problems, but these mechanisms aren’t very practical. Users will tire of them quickly and complain.

There is a separate tactic that can be used for proxy access: special client software. Basically, this means that the client software is modified to access a proxy server, so that the user does the same thing as they might if they were directly connected to the Internet, and the software takes care of using the proxy. So, when the user runs the special Telnet program, it handles contacting the proxy and informing the proxy about which server is desired, transparently. All the user has to do is Telnet to the server they want, using the special Telnet client. Theoretically, this can be done to any client program, so that users don’t have to be bothered with the details.

The problem is, there are many, many client programs, most of which don’t have publicly available source code for modification. Also, there are potentially many, many proxy protocols, if each site created their own proxy software. Obviously, some standards would be useful.

The currently used proxy protocol standards are SOCKS and CERN proxy, but we won’t get into the details. The CERN proxy protocol grew out of a proxy feature of the CERN HTTP server, and as you might guess it’s an HTTP proxy. It was important because there was support for the protocol starting with the early Web browsers.

SOCKS enjoyed similar early browser support, with the advantage that it can proxy arbitrary port numbers. Of course, your SOCKS proxy server must still be able to handle the protocol that matches the port number.

SOCKS also came with a few rewritten client programs, like rtelnet and rftp. These were “SOCKSified” versions of the Telnet and FTP programs. They are UNIX source code, so chances are you could compile versions for most UNIX platforms. Later, third-party Windows applications started appearing with SOCKS support. Nowadays, if a client program supports the use of a proxy, it usually has SOCKS support. More information about SOCKS can be found in the section “References and Resources.”

The idea of SOCKS being able to support arbitrary ports raises the question: Is there such a thing as a generic proxy? Indeed, there is. It’s possible to proxy a stream of data assuming that there are no reverse connections, and so forth. That is, assume it looks rather like a Telnet connection.

Such a proxy is often called a circuit-level proxy, or a plug gateway (after the plug-gw feature in Gauntlet, a popular proxy-based commercial firewall). SOCKS proxies typically can support such an arrangement, if desired.

Yet another way to handle getting the client request to the proxy is to modify the IP stack on the client to do so. This software is typically called a shim. The Microsoft Proxy Server works this way; it supplies a shim for Microsoft Windows clients. MSP also supports the SOCKS protocol, for non-Windows clients. In this manner, MSP is able to support arbitrary client programs on the Windows platform, as long as the protocols are simple, or a handler has already been designed.

Finally, before we leave the topic of proxies, some proxies now have a transparency option, which changes the model of how proxies used to work. As discussed, traditional proxies require the clients to behave differently. Transparent proxies can act as routers and proxy connections automatically, much like a PAT device. These proxies have the important advantage that they do not require any special software or configuration of the client. So what’s the difference between PAT and a transparent proxy? This is discussed in detail later in this chapter, in “Why a Proxy Server Is Really Not a NAT.”

Stateful Packet Filters

During the time that proxies were evolving, so were PFs. The ability to keep simple state information was added to PFs, and thus were born Stateful Packet Filters (SPFs). An SPF could, for example, watch a PORT command go by, and only allow back the port that was mentioned, rather than having to let through everything above 1023. Rather than just let in every TCP packet that had the ACK bit set, it could let in just the ones that corresponded to outgoing packets. The small addition of being able to track what went on before adds an amazing amount of power to the simple PF.
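The difference that state makes can be sketched with a small class that remembers outgoing connections and any PORT command it sees. This is an illustrative model only: the class, its method names, and the addresses are assumptions, it ignores TCP flags and timeouts, and it assumes a PORT command arrives whole in a single packet.

```python
from typing import NamedTuple, Set, Tuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes = b""


class StatefulFilter:
    """Toy stateful packet filter: inbound traffic must match remembered state."""

    def __init__(self) -> None:
        # (inside ip, inside port, outside ip, outside port)
        self.connections: Set[Tuple[str, int, str, int]] = set()
        # (ftp server ip, inside ip, inside port) announced by a PORT command
        self.expected_data: Set[Tuple[str, str, int]] = set()

    def outbound(self, pkt: Packet) -> bool:
        """Inside-to-outside packets are allowed and recorded."""
        self.connections.add((pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port))
        if pkt.dst_port == 21 and pkt.payload.upper().startswith(b"PORT "):
            args = pkt.payload[5:].decode("ascii").strip()
            fields = [int(f) for f in args.split(",")]
            host = ".".join(str(f) for f in fields[:4])
            port = fields[4] * 256 + fields[5]
            self.expected_data.add((pkt.dst_ip, host, port))
        return True

    def inbound(self, pkt: Packet) -> bool:
        """Outside-to-inside packets pass only if they match known state."""
        flow = (pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port)
        if flow in self.connections:
            return True                    # reply on a connection we saw go out
        expected = (pkt.src_ip, pkt.dst_ip, pkt.dst_port)
        if pkt.src_port == 20 and expected in self.expected_data:
            # The reverse connection announced earlier: allow it exactly once
            self.expected_data.discard(expected)
            self.connections.add(flow)
            return True
        return False


spf = StatefulFilter()
spf.outbound(Packet("192.168.0.2", 1030, "130.214.250.9", 21, b"PORT 192,168,0,2,4,1\r\n"))
print(spf.inbound(Packet("130.214.250.9", 20, "192.168.0.2", 1025)))   # announced port
print(spf.inbound(Packet("130.214.250.9", 20, "192.168.0.2", 6000)))   # any other port
```

Only the single port named in the PORT command is opened, and only for the server that received the command, instead of every port above 1023 for every outside host.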

Very few plain SPFs actually exist. That’s because they almost all add yet another ability, which will be discussed shortly. An example of an SPF as described is a Cisco router using reflexive access lists. These access lists have the ability to modify themselves somewhat, based on other access list lines being matched.

Stateful Packet Filter with Rewrite

The preceding definition of SPF is not a widely accepted one. Despite its use of the word filter in the middle, when most people discuss SPFs, they mean a device that can also modify packets as they pass through. Adding this capability theoretically gives the SPF complete control over packets.

The packet rewrite also puts one feature into the SPF engine in particular—NAT. Recall that the requirements for NAT are: the ability to rewrite packets, and the ability to track that information. As it turns out, the connection tables needed to do PAT are basically the same as those needed to do SPF. So, if you can do SPF, it’s pretty easy to add PAT, and vice versa.

There are many commercial examples of SPF-based firewalls, even if they use a different term for the underlying technology. The market-share leader, Checkpoint’s Firewall-1, is based on SPF, which they call Stateful Multi-Layer Inspection (SMLI). Another popular example is Cisco’s PIX firewall.

We won’t go into a lot of detail about how SPF works—if you understand the details behind PAT, you understand SPF. The tables that need to be maintained to perform an SPF function are the same as those needed to do PAT. An SPF firewall needs to do at least the same amount of work as a PAT device, and should ideally add on a fair amount more, to allow for better data validation and content filtering.

Why a Proxy Server Is Really Not a NAT

At this point, it’s appropriate to discuss the differences between proxies and NAT. For purposes of this discussion, all flavors of NAT, PFs, and SPFs are equivalent. Transparent proxies are a little bit of a special case, but they will be treated as traditional proxies for this discussion.

At a very high level, proxies and NAT appear to be the same; they both let you hide many machines behind an IP address. They both modify the data stream as it goes by to account for the change of address. They both keep state about more complicated protocols, in order to handle them correctly.

It turns out that the ends might be the same, but the means are very different. At a low level, the internals of the device (a proxy or NAT device) handle the packet in completely different ways. The basic difference boils down to this: For NAT, the basic unit being worked on is the packet; for a proxy, all of its work is done on a data stream. Let’s discuss what that means, starting with the proxy.

When a packet is received by a server, the server first determines if the packet is intended for it (i.e., if the destination address is one of its addresses). In the case of a traditional proxy, it will be. The packet then undergoes a process of being passed up the IP stack of the server. If the packet belongs to an existing connection, the data portion of the packet is extracted, and placed into a buffer for the proxy program to read. If it’s a new connection, a new buffer is created and the proxy program is notified that there is a new connection to service, but the process is otherwise the same.

When the proxy needs to send something, a reverse process happens. The information the proxy needs to send is placed into an output buffer. The TCP/IP software on the server will then pull information out of the buffer, put it into packets, and send it.

Under the IP protocol, packets can be a wide range of sizes. Large packets may be split up into fragments in order to cross networks that can handle only frames up to a particular size. For example, a 2000-byte packet would have to be split into at least two parts to cross an Ethernet segment with an MTU of 1500 bytes. On a proxy server, the IP stack will put the fragments together before it places the data into the buffer. Ideally, fragments won’t happen. When possible, hosts will not transmit packets that will have to be fragmented. A host doesn’t always have a way to determine whether or not a fragment will need to be made along the way across a network, so often the best the host can do is to not transmit packets bigger than its local network.
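The 2000-byte example can be worked through with a short calculation. The sketch below assumes that 2000 bytes is the total size of the IP packet, that the IP header is 20 bytes with no options, and it uses the IPv4 rule that every fragment except the last must carry a multiple of 8 bytes of data. The function name is invented for illustration.

```python
from typing import List


def fragment_sizes(total_len: int, mtu: int, header_len: int = 20) -> List[int]:
    """Return the total size of each fragment, header included."""
    payload = total_len - header_len
    # Data per fragment: what fits in the MTU, rounded down to a multiple of 8
    max_data = (mtu - header_len) // 8 * 8
    sizes = []
    offset = 0
    while offset < payload:
        chunk = min(max_data, payload - offset)
        sizes.append(header_len + chunk)   # every fragment gets its own header
        offset += chunk
    return sizes


print(fragment_sizes(2000, 1500))   # [1500, 520]
```

The 1980 bytes of data are split into 1480 and 500 bytes, and each piece carries its own 20-byte header, so the two fragments total 2020 bytes on the wire.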

The goal of the fragment discussion is to illustrate a point: The number of packets that enter a proxy server doesn’t necessarily equal the number of packets that come out. For an overly simplified example, a proxy server may receive a single packet that contains “Hello World!” However, when it transmits it back out, it may be as two packets: “Hello” and “World!” The reverse may happen as well. In fact, the proxy inputs only the string of characters, and outputs them to a different buffer, possibly making changes to them. It doesn’t concern itself with how the packet gets divided. When it sees an FTP PORT command, it reads it in, decides on how it should be changed, and outputs the changed version. It doesn’t need to do anything special if the command ends up being longer or shorter.
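A stream-oriented proxy can be sketched as a function that ignores where the incoming data happened to be split. In this hypothetical example the helper names are invented, and the rewrite step only swaps the address (a real proxy would substitute its own port as well). Three incoming chunks become two outgoing lines, and the longer rewritten command needs no special handling.

```python
from typing import Callable, Iterable, List, Tuple


def relay_lines(segments: Iterable[bytes],
                rewrite: Callable[[bytes], bytes]) -> Tuple[List[bytes], bytes]:
    """Reassemble a byte stream into lines and rewrite each complete line."""
    buffer = b""
    output: List[bytes] = []
    for segment in segments:
        buffer += segment                      # how the data was split is irrelevant
        while b"\r\n" in buffer:
            line, buffer = buffer.split(b"\r\n", 1)
            output.append(rewrite(line) + b"\r\n")
    return output, buffer                      # buffer holds any incomplete line


def hide_inside_address(line: bytes) -> bytes:
    """Replace the inside address with the proxy's (port left alone here)."""
    return line.replace(b"192,168,0,2", b"130,214,99,250")


incoming = [b"PORT 192,168,", b"0,2,4,1\r\nLI", b"ST\r\n"]
outgoing, leftover = relay_lines(incoming, hide_inside_address)
print(outgoing)
```

Because the proxy writes the result into a fresh output buffer, the TCP/IP stack on the proxy decides how the outgoing data is packetized, independently of how it arrived.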

Contrast this with a NAT device. When the IP stack of a NAT device gets a packet that is not addressed to it, which is normally what will happen, it will try to route the packet. During the routing process is when the NAT device has an opportunity to operate on the packet. Except for fragments and a couple of special cases, NAT is one packet in, one packet out. The packet will be basically the same size as well. When a PORT command starts through, the NAT device has to keep the packets as intact as possible. This means there may have to be a special piece of code to expand or shrink a packet to accommodate a longer or shorter address. When fragments arrive, the NAT device typically will have to perform reassembly as well. Although fragments are packets in their own right, they are also pieces of a larger packet. Taken as a whole piece, the packets in and packets out count still holds.

What are the security implications for the two methods? There are pros and cons for each. There exist types of attacks that rely on the exact structure of a packet. With a proxy, since the packets are torn apart, there is little chance of this type of attack succeeding against inside hosts. However, since the proxy has to process the packet itself, it may fall prey to the attack rather than the inside host. A NAT device would likely not fall victim to the same type of attack, but it might pass it along to the inside host, and have it succeed there. Fortunately, these types of attacks are almost always Denial of Service (DoS) attacks, which means they tend to crash things, but don’t result in a violation of information integrity. In one case, the firewall crashes. In another case, the firewall stays functional, but the inside host goes down. Neither is an absolutely better choice, and it depends on the preference of the firewall administrator. No one wants their firewall going down, but on the other hand, its job is to protect the inside.

The other big difference between NAT and proxy is data validation and modification. There are a number of proxy packages out there that take a more conservative security stance. That is, they contain proxies for protocols that the designers were reasonably sure they could validate well. They had an idea of what allowable values are, and designed their proxy to watch for those, and make corrections if needed. In some cases, if it looks like a protocol has some inherent problems, they would not produce a proxy for it, thereby discouraging customers from using that protocol.

Many NAT packages seem to take a different tack. They will do the bare minimum necessary to get a protocol to pass, and they often try to pass as many protocols as possible. They also tend to be more open by default; that is, if a connection attempt is made from the inside and the protocol is unknown, it will try to pass it anyway.

Now, this isn’t a fair comparison. I’ve compared the best proxies against the worst NAT implementations. Naturally, there are products from both camps that meet in the middle, and a good firewall administrator can make a NAT/SPF secure, and a bad one can misconfigure a good proxy. Still, the tendencies are there: NAT devices typically only go above layer 4 when it’s needed to make the protocol work (like the FTP example). Proxies always work above layer 4, and even the simplest ones (circuit-level proxies) operate at layer 5. The assumption is, of course, that the higher up the stack they go, the more secure.

All of this is a religious argument though, because you don’t buy a conceptual firewall, you buy an actual product. Each product must be evaluated on its own merit.

The other point that makes a lot of the arguing pointless is that the lines between SPF and proxy are blurring. The latest versions of most of the commercial firewalls include features both from the proxy world and the SPF world, regardless of which background they came from. For example, in Firewall-1, there are a number of “security servers” included that optionally can be activated in place of NAT-style packet passing. These typically include extra capabilities, such as extra authentication, stripping out of undesirable content (such as Java or ActiveX) and blocking of particular sites by name or URL. Many of the proxy firewalls have gone transparent. In order for a proxy to be transparent, it has to change its behavior somewhat. The short explanation is that it has to perform an SPF-type function to deliver the packets to the proxy software, when they weren’t addressed to the proxy in the first place.

Shortcomings of SPF

There are plenty of SPF shortcomings to discuss, but only in a security context. In terms of functionality, all the products work well. There are performance differences and administration differences, but if the product claims to pass a particular protocol, it usually does.

Proxies are generally slower, simply because they do more to the information as it goes through. They strip headers off, put them on, allocate sockets, and do a lot of buffering and copying of data. SPFs skip a lot of this work. For protocols without a lot of work to do, this is an advantage. For protocols that should be handled carefully, this is bad. There seems to be a consensus that handling complicated protocols is easier with proxy-style software than with NAT-style software. Being able to set aside the packet as the unit of work, and operate on the data stream instead, seems to make the process easier, at least for TCP protocols. Being able to pick between the two in a single package is one advantage to having the lines between SPF and proxy blur. The firewall designer can pick the best tool for the protocol.

The transparency option for proxies is a very good feature. Not having to change the software on all the inside machines, and not having to support those changes, can be a huge advantage. A subtle bit of information is lost with this option, though.

With traditional proxies, especially with architectures where there is a separate program for each protocol, there is never any question about which protocol was desired. For example, if the user contacted the Telnet proxy, you could be sure they wanted to use the Telnet protocol. If they contacted the HTTP proxy, clearly they want HTTP. If you’ve spent any time surfing the Web, you’ve probably noticed that some URLs specify a different port number. For example, instead of:

In this case, rather than contacting the Web server at port 80 (which is the default for HTTP), we’ve explicitly told it to contact a Web server via port 8080. For a traditional proxy, this is not a problem. It knows you want HTTP, and it knows you want to do it over port 8080.

This works because the proxy is forcing the client to specify both protocol and port explicitly. Let’s look at the same situation with a transparent proxy. The client isn’t configured in any special way. The user might not even realize there is a firewall there. Now, the client isn’t specifying the protocol to the proxy, because it doesn’t know a proxy is there. So, when it contacts port 80, the proxy has to make an assumption—it will assume (almost always correctly) that this is HTTP, and will handle it as such. What about when the browser specifies port 8080? The proxy has to make another assumption. Port 8080 is pretty commonly used as a nonstandard HTTP port, but not always. The proxy must either pick HTTP or a circuit-level proxy. This is commonly configurable.

What happens in this situation?

Some joker on the Internet has run his Web server on port 21, and your user has clicked on a link to it. The proxy has to make an assumption—it’s going to assume this is the FTP protocol. Chances are, this connection won’t work very well.

So, we’ve lost some information by going transparent. We’ve forced transparent proxies to make assumptions about protocols based on port numbers. This will work well most of the time, but there is always a weird exception somewhere. SPFs suffer from the same problem as well.
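The port-based guessing described above can be sketched as a simple lookup table. This is only an illustration: the table contents, the handler names, and the function pick_handler are hypothetical, and a real product would make the table configurable.

```python
# Hypothetical table a transparent proxy might use to guess the protocol
# from nothing but the destination port of an intercepted connection.
PORT_HANDLERS = {
    21: "ftp",
    23: "telnet",
    80: "http",
    8080: "http",  # a common nonstandard HTTP port, but only an assumption
}


def pick_handler(dest_port, default="circuit"):
    """Guess which proxy should handle a connection to dest_port.

    Unknown ports fall back to a generic circuit-level proxy.
    """
    return PORT_HANDLERS.get(dest_port, default)
```

Note that pick_handler(21) answers "ftp" even when the server on the far end is really a Web server, which is exactly the weird exception the text warns about.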

Some folks have argued that an SPF-type architecture makes it too easy for a firewall administrator to do something risky. In other words, SPFs may be more flexible in the range of things they can be made to allow, and that may be too tempting. This is largely untrue now anyway, since most proxies include similar capabilities.

Most firewalls come with some sort of GUI for configuring the rules of what is allowed and what isn’t. These can be very convenient for maintaining large rule sets. Some more-experienced firewall administrators have complained that this can be a hindrance to understanding exactly what the firewall is up to. They complain that by putting simplicity on top of a complex product, it gives the illusion to the novice administrator that they comprehend everything that is going on. In other words, it provides a false sense of security.

Summary

Network Address Translation (NAT) changes a packet’s layer 3 address as it passes through a NAT device. Other protocols like IPX could also be translated, but the vast majority of the commercial NAT implementations perform NAT on IP addresses. Often, simply changing the layer 3 address is insufficient, and higher layer information must be modified as well. NAT and security are often used together.

The ideas behind NAT probably came from early proxy-based firewall solutions. Proxy servers allow administrators to filter traffic for content, and to make it appear to outside networks that everything is coming from one IP address.

The proxy administrator usually configures a filtering router (i.e., a packet filter) to block direct access from inside-out, and outside-in. The configuration allows only inside machines to communicate directly with the proxy. This forces inside clients to use the proxy if they want access to the outside net. This single point in the network where all traffic is forced to pass through (on the way to the Internet, at least) is called a choke point. Care is taken to configure the proxy server to be as secure as possible.

A side-effect of a proxy firewall is that the outside needs to see only one IP address. This can reduce the needed publicly routable IP addresses to one. RFC 1918 recognizes this, and makes a number of IP address ranges available for private use, behind proxy servers or NAT firewalls. A NAT device usually acts as a router.
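The private ranges that RFC 1918 sets aside are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. As a quick illustration, the following sketch uses Python's standard ipaddress module to test whether an address falls inside one of them. The helper name is_rfc1918 is our own, not part of any library.

```python
import ipaddress

# The three private blocks reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the dotted-quad address lies in an RFC 1918 block."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)
```

An address for which is_rfc1918 returns True is not routable on the public Internet, so a machine using it needs a proxy or NAT device in order to reach the outside.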

There are several types of NAT. The first type is static NAT, a 1-to-1 mapping between two IP addresses. In one direction, either the source or destination address is translated; in the other direction, the reverse happens. Typically, the source address is the one that is translated, but there are uses for translating the destination address as well. One possible use for translating the destination address is redirecting client machines to a different server without having to reconfigure them.

A NAT router has to differentiate between interfaces, typically by marking one “inside” and the other “outside” in order to know when to translate, and whether to translate source or destination addresses. Because of the 1-to-1 mapping, static NAT saves no address space.
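The inside/outside distinction and the 1-to-1 mapping can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the addresses are examples, the function translate is hypothetical, and only the common case is shown (source translated on the way out, destination translated on the way back).

```python
# One fixed inside-to-outside pair; example addresses only.
STATIC_MAP = {"10.0.0.5": "203.0.113.5"}
REVERSE_MAP = {outside: inside for inside, outside in STATIC_MAP.items()}


def translate(src, dst, arriving_on):
    """Return the (src, dst) pair after static NAT.

    arriving_on names the interface the packet came in on:
    "inside" or "outside".
    """
    if arriving_on == "inside":
        # Outbound packet: rewrite the source address.
        return STATIC_MAP.get(src, src), dst
    # Inbound packet: rewrite the destination address back.
    return src, REVERSE_MAP.get(dst, dst)
```

Because every inside address in the table needs its own outside address, the table makes it plain why static NAT saves no address space.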

Another interesting variation of static NAT is called double NAT, which changes both the source and destination addresses at the same time. This can be useful for connecting two networks that use the same addresses.

A static NAT implementation that simply translates layer 3 addresses and doesn’t change the data stream at all may have problems with certain protocols. A classic example of a protocol that passes IP addresses in the data stream is the FTP protocol. In order for a static NAT (or any NAT for that matter) implementation to work with FTP, it must modify the FTP PORT command as it goes by. This must also work if the PORT command is split across more than one packet.

Another flavor of NAT is dynamic NAT. Dynamic NAT is similar to static NAT, except that it is many-to-many, or many-to-1, and the mappings are made on the fly out of a pool of addresses. Address contention may arise, however, if there are more inside addresses than outside addresses. To help with this problem, the NAT device will attempt to detect when a mapping is no longer needed. Strategies for this may include timers, and watching for packets that indicate the end of connections.

To do this, dynamic NAT must maintain a connection table that tracks IP addresses, port numbers, FIN bits, and timers. Even with these mechanisms, dynamic NAT can still easily result in resource contention, and in inside machines not being able to get out. A further refinement is needed.
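A minimal sketch of the pool-and-timer idea follows. It assumes an idle timer is the only way a mapping is released. The class name DynamicNat and its methods are hypothetical, and FIN tracking is left out to keep the example short.

```python
class DynamicNat:
    """Hand out outside addresses from a pool, reclaiming idle ones."""

    def __init__(self, pool, idle_timeout):
        self.free = list(pool)            # outside addresses not in use
        self.idle_timeout = idle_timeout  # seconds of silence before release
        self.mappings = {}                # inside -> (outside, last_seen)

    def expire(self, now):
        """Return idle outside addresses to the pool."""
        for inside, (outside, last_seen) in list(self.mappings.items()):
            if now - last_seen > self.idle_timeout:
                del self.mappings[inside]
                self.free.append(outside)

    def translate(self, inside, now):
        """Return the outside address for inside, or None if the pool is empty."""
        self.expire(now)
        if inside in self.mappings:
            outside = self.mappings[inside][0]
        elif self.free:
            outside = self.free.pop(0)
        else:
            return None  # contention: this inside machine cannot get out
        self.mappings[inside] = (outside, now)  # refresh the idle timer
        return outside
```

With a pool of one address, a second inside machine gets None until the first machine has been idle for longer than the timeout, which is the contention problem in miniature.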

Port Address Translation is a type of NAT that allows more than one inside machine to share a single outside IP address simultaneously. This is accomplished by translating ports as well as IP addresses. When an inside machine makes a connection out, its source port and address may be translated. The NAT router will track which source ports are in use, and will avoid conflicts when picking new source ports. PAT finally achieves the address savings desired, and also achieves some level of security.

PAT keeps a connection table similar to that of dynamic NAT. In addition, PAT has to dynamically open ports as needed to handle protocols with reverse connections, like FTP. Most existing NAT solutions are PAT-based, and have the ability to do static NAT as needed.
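The port-tracking side of PAT can be sketched as a pair of lookup tables. This is a deliberately simplified model: the class name Pat and its methods are hypothetical, ports are handed out sequentially and never reclaimed, and a real implementation would also key its table on the protocol and the remote endpoint.

```python
class Pat:
    """Share one outside IP address by translating source ports."""

    def __init__(self, outside_ip, first_port=1024, last_port=65535):
        self.outside_ip = outside_ip
        self.next_port = first_port
        self.last_port = last_port
        self.out = {}   # (inside_ip, inside_port) -> outside_port
        self.back = {}  # outside_port -> (inside_ip, inside_port)

    def outbound(self, inside_ip, inside_port):
        """Translate the source of an outgoing connection."""
        key = (inside_ip, inside_port)
        if key not in self.out:
            if self.next_port > self.last_port:
                raise RuntimeError("no free outside ports")
            # Pick a fresh outside port so two inside machines never collide.
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.outside_ip, self.out[key]

    def inbound(self, outside_port):
        """Find the inside endpoint for a reply, or None if there is no mapping."""
        return self.back.get(outside_port)
```

Two inside machines that both happen to use source port 5000 come out with different outside ports, so their replies can be told apart. A reply to a port with no mapping has nowhere to go, which is where the security benefit comes from.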

NAT’s major feature is address savings. In addition, it can be used to temporarily work around certain types of network problems. NAT typically carries some small performance cost, but it’s usually negligible except under the heaviest network loads.

Proxies and firewalls have a somewhat different mission than NAT, though they often are used together. Firewalls are about security—security in this context means controlling network connections. Historically, there are several types of firewalls: Proxies, Packet Filters (PFs), and Stateful Packet Filters (SPFs).

Proxies work by having clients connect to them instead of to the final intended server. The proxy will then retrieve the desired content, and return it to the inside client. Like NAT, proxies must understand some of the protocols they pass in order to handle them properly.

PFs are often used in conjunction with proxies to achieve the protection needed, and to create the choke point to force all traffic through the proxy. Packet filters don’t maintain state, and must often leave large port ranges open to accommodate protocols like FTP. Like NAT, PFs are usually routers.

SPFs are PFs with state. In addition, almost all SPFs can rewrite packets as needed. If an SPF is able to rewrite packets, it can theoretically do anything to packets as needed, including implementing NAT if desired. The connection tables needed for PAT are about the same as those needed for SPF.
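To show what "PFs with state" means in practice, here is a minimal sketch of a filter that remembers outbound connections and admits only matching replies. The class name StatefulFilter is hypothetical, and the sketch ignores timeouts, TCP flags, and protocol numbers, all of which a real SPF would track.

```python
class StatefulFilter:
    """Allow inbound packets only when they answer a recorded outbound connection."""

    def __init__(self):
        # Each entry is (inside_ip, inside_port, outside_ip, outside_port).
        self.connections = set()

    def outbound(self, src, sport, dst, dport):
        """Record an outgoing connection and let it pass."""
        self.connections.add((src, sport, dst, dport))
        return True

    def inbound(self, src, sport, dst, dport):
        """Pass a packet from outside only if it reverses a known connection."""
        return (dst, dport, src, sport) in self.connections
```

A plain packet filter has no such table, which is why it must leave port ranges open in advance. The table here holds much the same information as a PAT table would.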

NAT (and SPF) differs substantially from a proxy in terms of how it implements its features. For NAT, the basic unit worked on is a whole packet. For a proxy, it’s a data stream. The major practical difference is that a proxy will tear a packet all the way apart, and may reassemble it as more or fewer packets. A NAT device will always keep the same number of packets in and out.

Most firewalls on the market at present are a hybrid of proxy and SPF technology. The main advantage to this is that they are able to be transparent, requiring no special software or configuration on the inside client machines.

FAQs

Q: Why doesn’t program X work?

A: If you find yourself administering a firewall, NAT device, proxy, or something similar, invariably you will get questions like this: “I just downloaded a beta of a new streaming media protocol called “foo” and it doesn’t seem to work. Can you fix it?” For whatever reason, streaming media protocols have a strong tendency to use reverse connections. Here’s what you can do to try to get it to work:

 Visit the Web site for the vendor that produces the program. Often, they maintain a FAQ about how to get their protocol to cooperate with firewalls. In some cases, it may be a simple option that is settable on the client program. In other cases, there will be instructions on how to configure your firewall to make it work.

 Check with your firewall vendor to see if there is an update to handle the protocol. Most firewall vendors maintain a Web site with searchable content, and you can search for the protocol in question.

 Check the firewall logs to see if there are reverse connections that are coming back and being denied. Possibly, you might have to use a protocol analyzer to try to determine what the protocol is up to.

 Don’t forget to consider that you may not want to pass this protocol. If you’re very security-conscious, you may realize that there may be bugs in the new program that may pose a serious threat to your network. Client-side holes have become very common recently.

Q: Why can’t I connect to anything?

A: This relates to when you are first setting up your NAT/firewall/proxy. There can be a large number of reasons why you can’t connect, any one of which will break things. Here are some special things to pay attention to:

 Make sure all of your routing is correct. If possible, you might turn off any NAT or security features temporarily in order to see if packets seem to flow. If not, you may have a routing issue. If they do, then you probably need to check your security configuration.

 Make sure you’ve allowed the traffic you’re trying to send. This sounds obvious, but it happens often enough. Probably the easiest place to see this is the logs. If you show up as having been dropped, then you haven’t allowed the traffic you’re trying to send.

 Make sure any ARP settings needed are in place. For some solutions that require virtual IP addresses, you may have to publish ARP addresses manually. A quick way to check if this is working or not is to look at the ARP table on the router.

 Make sure your client configuration is correct. This applies especially if you’re using proxies. Make sure the client program is set to use the proxy, and look for typos or other misconfigurations that might be easy to miss.

 When all else fails, you may have to use a protocol analyzer to see what’s actually happening on the wire. Unfortunately, you may have to use it in several places to get the full picture (inside the firewall, outside, etc.).

Q: How do I verify that my address is being translated properly?

A: This one is usually pretty easy. The simplest way is to connect to something that tells you what your IP address is. If it’s your network, the router immediately outside your NAT device may tell you. For example, if you log on to a Cisco router and issue the “show users” command, it will tell you the DNS name or IP address from which you’re connecting.
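If no router is handy, "something that tells you what your IP address is" can be as small as the following Python sketch, run on a machine outside the NAT device. The function names report_peer and serve_forever are hypothetical, and the port to listen on is whatever you choose.

```python
import socket


def report_peer(listener):
    """Accept one connection and send back the address it appears to come from."""
    conn, peer = listener.accept()
    peer_ip = peer[0]
    with conn:
        conn.sendall((peer_ip + "\n").encode("ascii"))
    return peer_ip


def serve_forever(port):
    """Answer every connection on the given TCP port until interrupted."""
    with socket.create_server(("", port)) as listener:
        while True:
            print("connection from", report_peer(listener))
```

Call serve_forever on the outside machine, then connect to that port from an inside client with Telnet or a similar tool. If the address that comes back differs from the one configured on the client, the connection is being translated.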

    If you’re an end-user, and you suspect you’re being translated and want to find out, it may be slightly harder. If you’ve got an account on a router or UNIX box somewhere on the Internet, you can usually find out that way. Another choice is a Web page that informs you what IP address you’re coming from. An example of such a page is:

Q: What does a good firewall architecture look like?

A: This is a religious question (i.e., you’ll get many people preaching their favorite gospel). However, there are a few generally accepted best practices. We’ll use a medium-sized company as our example. They have a full-time Internet link, and have their own Web and e-mail servers on the premises. Let’s assume they didn’t previously have a firewall, and now they want to install one.

The Web server and e-mail server have to be reachable by the Internet—that’s their purpose. They also must be reachable by the inside. They want them to be protected from the Internet as much as possible. The typical setup is a firewall with a DMZ, sometimes called a three-legged or 3-interface firewall. The diagram looks like Figure 4.16.

Figure 4.16 Transparent firewall with DMZ.

In this example, the firewall does routing. It can be an SPF firewall, or transparent proxy—it doesn’t really matter. When an inside user wants to get out, they must traverse the firewall. When either the inside or outside wants to get to the public servers, they must traverse the firewall. It’s not depicted on the diagram, but the rules on this type of firewall would prevent the Internet from connecting to the inside. Typically, the inside would be using RFC 1918 addresses, and the firewall would be doing PAT for the inside machines.

Most likely, the rules on the firewall are set up for at least a few inside machines to have a somewhat higher level of access to the public servers, for administration purposes.

An important result of this type of architecture is that the inside doesn’t fully trust the DMZ. That is, the DMZ machines can’t get back inside, at least not for all ports. This means that if the DMZ machines are compromised, the inside may still be protected.
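The rules for this three-interface layout can be sketched as a small zone-to-zone policy table. Everything specific here is an assumption made for illustration: the zone names, the function permits, and the choice of ports 25 and 80 for the e-mail and Web servers. Any zone pair that is not listed is denied.

```python
ANY = None  # marker meaning "every port is allowed"

# Hypothetical policy: (source zone, destination zone) -> allowed ports.
POLICY = {
    ("inside", "outside"): ANY,   # inside users may go out
    ("inside", "dmz"): {25, 80},  # inside may reach the public servers
    ("outside", "dmz"): {25, 80}, # so may the Internet
}


def permits(src_zone, dst_zone, dst_port):
    """Return True if the policy allows a connection between the zones."""
    if (src_zone, dst_zone) not in POLICY:
        return False  # default deny, e.g. outside-to-inside or dmz-to-inside
    ports = POLICY[(src_zone, dst_zone)]
    return ports is ANY or dst_port in ports
```

Because there is no entry for DMZ-to-inside traffic, a compromised public server cannot open connections to the inside under this policy. A production rule set would add narrow exceptions, such as administrative access from a few inside machines.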

References & Resources

It’s impossible to cover every detail of NAT, proxies, and firewalls in a chapter, so we have provided a number of resources to which you may refer. Some of them are general, like the RFCs, and some are very specific, such as Web pages at Cisco about their NAT software. Most likely, you will want to scan through the list and look for topics that are of interest to you. In addition, if you are planning to implement one of the technologies mentioned here, you will need to read the relevant documentation, also referenced here.

RFCs

RFC 1918 is the current RFC covering private address space and NAT. The official documentation for the private address ranges (10.x.x.x, 172.16.x.x-172.31.x.x, 192.168.x.x) is located here. In addition, the top of the document contains links to related and obsolete RFCs.

A related RFC that isn’t referenced in RFC 1918,

is aimed at NAT developers and implementers.

IP Masquerade/Linux

This is the main place to start looking for IP Masquerade documentation. On this page, you’ll find a changelog, a link to the HOWTO:

which links to pages on how to join an IP Masquerade mailing list, and links to locations to get IP Masquerade handlers. The links on the page at the time of this writing were broken (they may be fixed by the time you check the main page), but this one works:

This is information about IPChains, which is needed to work with IP Masquerade.

Cisco

Cisco has several documents regarding their NAT implementation in their routers. If you plan to use this, you owe it to yourself to at least familiarize yourself with them.

This is the Cisco NAT FAQ.

This is Cisco’s NAT technical tips, where Cisco documents what protocols are and are not covered, among other things.

This is a Cisco NAT white paper. It’s at a similar technical level to this chapter, but obviously with a Cisco focus. They cover configuration examples, and a few features that weren’t touched on here, like TCP load balancing.

Windows

Here is an excellent list of Windows-based NAT products. In fact, there are several sections on NAT that are worth checking out there.

My favorite low-cost Windows NAT product is SyGate from Sybergen Networks. It’s inexpensive, and easy to set up. You can even get a trial version to evaluate. Look for it here:

Microsoft Proxy Server was mentioned a couple of times; information about it can be found here:

If you’re thinking about running it, you have to check out the MSProxy FAQ:

NAT Whitepapers

Here are a couple of independent NAT white papers/resources:

This one has a focus on peer-to-peer networking and NAT.

This one reiterates some of the driving issues behind RFC 1918.

@Work maintains a NAT FAQ. It’s geared towards their users, but contains some useful information and definitions:

Firewalls

There are several firewall FAQs:

This one is particularly good. It’s very complete, and covers the basics well.

This is a good collection of firewall information, in the form of a comparison sheet.

This contains the firewalls’ mailing list and archive.

There are a few Firewall-1 FAQs:

The Phoneboy FAQ; Dameon knows Firewall-1 well.

Here’s a second FW-1 FAQ. You might like the organization better.