Raiders of the Lost ARP
When moving from an old load balancer to a new one, it’s usually best to do so gradually. Move a VIP here, a VIP there, gently over time. But hey, who has time these days?
Sometimes you just gotta yank that sucker out.
Sometimes it’s a rip-and-replace, and you yank the old equipment out, throw the new equipment in, and hope no one notices. This is what I have coined the “Indiana Jones Upgrade Method”. In the opening sequence of the classic movie Raiders of the Lost Ark, Indiana Jones tries to swap out a gold idol with a bag of sand. He tries to do it so that the mechanism doesn’t notice the swap, which is the goal of every network admin.
A Typical Maintenance Window
Except, his careful planning (and eyeball judgement) doesn’t quite work out the way he hoped. The mechanism is triggered, and he gets chased by a giant boulder. We’ve all been there. Chased through the data center by a giant boulder. Sometimes the boulder is metaphorical, and more than once, it’s been a literal boulder.
One of the villains in our quest to do the seamless replacement is the nefarious ARP cache. Dun dun dunnn….
ARP is the protocol that joins Layer 2 (Ethernet) and Layer 3 (IP). Picture this: An IP packet is destined for your local subnet in your data center. It’s a perilous journey, full of over-saturated links, drop-happy queues, and rate-shaping nightmares. But now your local router has the IP packet, safe and sound. It needs to deliver it to your device. It’s going to send the IP packet to you, but it can’t just send it pure Layer 3. Think of Layer 3 as a river, and your IP packet can’t swim. So we’re going to put your IP packet in a Layer 2 boat.
This Layer 2 boat needs a Layer 2 destination, known as a MAC address, and that’s where the ARP protocol comes in. The router shouts to the Layer 2 domain: “Who has 192.168.1.200?” Your device shouts back “192.168.1.200 is-at 00:11:22:33:44:55″. The router writes this down in its ARP cache.
Many routers don’t forget this mapping for 4 hours.
That’s good, because if the router had to ARP for every IP packet it got, there could be more ARP traffic than IP traffic. But the problem comes up when you switch out hardware.
He should have cleared his ARP Cache
So you’ve got a new device, and it’s going to replace your current device and assume its IP address of 192.168.1.200. You pull out the old hardware, put in the new, and bring up the IP address. You put the domain into your browser and…. nothing.
Why? Every network interface has a unique MAC address. Your new device has a different MAC address than your old device. But the router doesn’t know that. It still has the old MAC address, and it’s trying to forward those IP packets to the old MAC address. It’s not going to refresh its ARP cache for hours. The only effective solution is to log into the router and manually clear the ARP cache. And it’s not just routers; firewalls will do this as well. Any device responsible for delivering traffic to your local network will cache ARP, so you’ll need to know how to clear that ARP cache.
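To make the stale-entry problem concrete, here is a minimal Python sketch of an ARP cache with a 4-hour lifetime. It is a toy model, not how any particular router implements its cache: the `ArpCache` class, the fake clock, and the new device's MAC address (`00:aa:bb:cc:dd:ee`) are all made up for illustration.

```python
import time


class ArpCache:
    """Toy IP-to-MAC cache with a fixed entry lifetime (hypothetical model)."""

    def __init__(self, ttl_seconds=4 * 60 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # ip -> (mac, time_learned)

    def learn(self, ip, mac):
        """Record the answer to an ARP request."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if missing or expired (time to ARP again)."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned = entry
        if self.clock() - learned >= self.ttl:
            del self.entries[ip]
            return None
        return mac

    def clear(self, ip=None):
        """Manually flush one entry, or the whole cache when no IP is given."""
        if ip is None:
            self.entries.clear()
        else:
            self.entries.pop(ip, None)


OLD_MAC = "00:11:22:33:44:55"  # the device being replaced
NEW_MAC = "00:aa:bb:cc:dd:ee"  # made-up MAC for the replacement device

now = [0.0]  # fake clock so the example does not have to wait for hours
cache = ArpCache(clock=lambda: now[0])
cache.learn("192.168.1.200", OLD_MAC)

now[0] += 600  # ten minutes later, the hardware has been swapped
stale = cache.lookup("192.168.1.200")  # the router still believes the old MAC

cache.clear("192.168.1.200")  # the manual fix
after_clear = cache.lookup("192.168.1.200")  # nothing cached, so the router must ARP
cache.learn("192.168.1.200", NEW_MAC)
fresh = cache.lookup("192.168.1.200")
```

Ten minutes after the swap, `stale` still holds the old MAC, which is exactly why the browser shows nothing. After the manual clear, the next lookup misses and the router has to ask the Layer 2 domain again.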
Moral of the story? If you’re going to pull an Indiana Jones, make sure you’ve got access to the upstream device (router, firewall), or at least access to someone who can clear it out for you.
Monday Fun With HTTP Headers
Back in November I told you all about how I got HTTP header rick-rolled. Here’s some more fun with HTTP headers: Slashdot’s Bender quotes. Apparently I’m the last to know about this (I’ve read Slashdot for years, and have been, myself, slashdotted repeatedly), but in their HTTP headers Slashdot includes a rotating set of Bender (from Futurama) quotes.
Words of Wisdom
The screen shot is from the wonderful free Firefox extension HTTPFox. I highly recommend it.
Download HTTPFox Now!
Gmail Goes All SSL, and So Should You
Among the changes Google has made after disclosing their new approach to China, in response to the sophisticated hacking of Chinese human rights activists’ Gmail accounts, is that Gmail will now be all HTTPS, all the time, by default.
In many web mail implementations, only the username and password are sent through HTTPS. Then your email moves to regular unencrypted HTTP. The rest of your interactions, such as authentication cookies (which are easy to steal), payroll spreadsheets, intimate details, and other private communications, go over the Internet in the clear. From early on, Gmail supported HTTPS for all of their site, but you had to manually specify “https://gmail.com” in your browser.
If there ever was a place where you don’t want your private communications going around unencrypted, it’d be the Internet
Back in July of 2008, they put in the option to force HTTPS for everything. If you simply put “gmail.com” into the address bar of your browser, your browser would default to HTTP and you’d be unencrypted. With the force option, you’d be automatically redirected to HTTPS. This was in response to the threat of someone stealing cookies and assuming others’ identities at Wi-Fi hotspots, which I mentioned back in 2008.
But you had to select this as an option in the Gmail preferences. Now, Gmail defaults to all HTTPS, all the time, and you’d have to explicitly set HTTP if you wanted to go unencrypted (which you’d almost never want to). This is a very good thing(tm).
HTTPS: It’s not just for passwords
While most webmail and other web applications do HTTPS for when a username and password is supplied, most do not use HTTPS for the rest of the interaction.
For webmail especially, this is critical. Cookies are used as authentication tokens (so that username and passwords do not need to be re-supplied every time you ask for a web page), and if they’re intercepted, someone could potentially pretend to be you.
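One related safeguard, if you control the application, is to mark the authentication cookie with the Secure attribute, which tells the browser to send it only over HTTPS. Here is a small sketch using Python’s standard `http.cookies` module. The cookie name `session`, the token value, and the helper `build_session_cookie` are hypothetical; your webmail package will have its own names and its own way of setting this.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token):
    """Build a Set-Cookie value for a session token (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["secure"] = True    # browser sends it only over HTTPS
    cookie["session"]["httponly"] = True  # not readable from page scripts
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header_value = build_session_cookie("example-token")
```

The Secure flag only helps if the whole site is actually served over HTTPS, which is the point of this post.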
For Webmail, Think Like Cookie Monster: Hands off my cookies!
If you run your own webmail implementation, follow Google’s lead and do all HTTPS, all the time. If you’re concerned about performance, front-end your webmail installation with a load balancer equipped with an SSL accelerator card. A device like a KEMP LoadMaster or CoyotePoint Equalizer can handle such tasks for less than $10,000, sometimes even far less.
But most sites that do webmail such as OWA (Outlook Web Access) can handle the SSL fine on a decent-spec’d system without SSL acceleration.
And this needn’t be just webmail. All web applications could benefit from the added security of all HTTPS, all the time. It increases both privacy and trust.
All SSL, all the time.
Programmer Expresses Murderous Rage And My Own Diatribe
Once in a while someone writes an angry, expletive filled rant and posts it on the Internet. Wait a minute, that happens all the time. How about this: once in a while someone writes an angry, expletive filled rant that makes sense and connects with a wide audience in a way that arguments about “Han shot first” and “which Star Trek uniform was the best” (the movie reds, of course) fall short.
I direct your attention to “Programmers Need To Learn Statistics or I Will Murder Them All“. Come on Zed, don’t be afraid to tell us how you really feel.
“Hi Bob, I’d like to schedule some time with you to discuss statistics.”
Oftentimes the load balancer administrator gets pulled into a performance issue, and the blamestorming game begins. You know the drill. Developers blame networking. Database blames the server admins. Server admins blame the network. And round and round we go.
One thing that has always frustrated me about that scenario is the disparity in metrics that the various groups have. The server admins have lots of metrics at their disposal. Whether it’s Windows or Linux/UNIX, metrics such as CPU, TCP/IP, disk I/O, and others are readily available and fairly easy to trend over time. I know, I used to be a server admin. The database admins have a variety of metrics available too: SELECT, UPDATE, INSERT, and DELETE stats, memory consumption, extents; all depending on the database platform.
And don’t get me started on us networking folks. We’ve got MRTG, HP OpenView, NetFlow, and half a dozen other tools to keep our eyes on the prize. Sometimes we’re so obsessed with metrics, we completely forget that we’ve got a network to run. We’d trend the number of boogers in our nose if someone would just write a damn tool for it.
The Movie War Games: Inspiring NOC setups since 1983
But it seems to me there really isn’t anything like that in the application world. When an application goes bonkers, visibility into the inner workings of the application seems to be the exception rather than the rule.
I worked at a place that had a Tomcat application. Every now and then, it simply stopped working. The application would freeze, and we couldn’t figure out why. The web server processes and database processes were running fine, but something got hung up in the app layer.
And we had no idea why. Nor did we have any good way to find out, other than looking for a cause in the networking realm, OS realm, database realm, etc. There was just no visibility into the inner workings of the application available to the developer team.
I try to not fan the flames of inter-departmental aggression. I think there are too many Layer 8 problems and blamestorming, and I think it’s a bad habit. But one of the ways I encourage others to fight this destructive tendency is to have good visibility with diagnostic tools into your own realm.

All I am saying is give Layer 8 peace a chance
I don’t think it’s because application developers are lazy, far from it. Of all the pieces of the modern infrastructure, I think they suffer deadlines the most. My own unscientific and barely qualified sense is that there are two primary reasons for the lack of visibility into the performance of web applications.
For one, it’s time. Developers have deadlines. Probably a more apt way of saying it is that deadlines have them. There’s some serious business pressure to get the applications out the door.
Secondly, it’s just not part of the workflow currently. There’s no emphasis on building in metrics or hooks into the application so that reporting and diagnostics can be done. And I think that relates partly back to the timecrunch issue.
I’m not sure what the solution is, other than to encourage developers to build in metrics collection in their applications. If you’re a developer, I’d love to hear your thoughts on this issue.
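For what it’s worth, here is roughly what I mean by building in metrics. This is a minimal Python sketch, not a recommendation of any particular tool: the `timed` decorator, the `METRICS` table, and the `render_inbox` function are all invented for illustration. It counts calls, errors, and time spent per named operation, so that when the app hangs there is at least something to look at.

```python
import time
from collections import defaultdict
from functools import wraps

# name -> counters; a real application would export these somewhere trendable
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def timed(name):
    """Decorator that records call count, error count, and elapsed time."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                METRICS[name]["errors"] += 1
                raise
            finally:
                stats = METRICS[name]
                stats["calls"] += 1
                stats["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@timed("render_inbox")
def render_inbox(user):
    """Stand-in for a real application operation (hypothetical)."""
    return f"inbox for {user}"


render_inbox("bob")
```

A few lines like this per important operation are cheap to add, and they give the developers the same kind of trend data the server and network folks already have.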
Three Things You Need To Know About HTTP
Whether someone is a network administrator, server administrator, or application developer, the prevailing knowledge level of HTTP is usually the following:
Jack and squat. And possibly a little bit of bubkis.
This needs to change.
In terms of training, it’s a subject matter that gets skipped over by all three fields. Network administrators interact with routes and packets, server administrators interact with CPU and I/O operations, and application developers interact with memory constructs and data typing. They typically don’t interact with HTTP; they just don’t have any need to.
Until something goes wrong, that is.
HTTP is absolutely something everyone who is in some way involved with keeping the Internet working should understand on at least a basic level. And here are the three things you absolutely need to know, no matter what your primary skill set is.
1: HTTP messages
Ethernet has Ethernet frames, IP has IP packets, and HTTP has the HTTP message. There are two types of HTTP messages: Requests and responses. They both always have headers, and they both can have payload, although not always.
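To see the shape of a message, here is a small Python sketch that splits a raw HTTP/1.1 message into its start line, headers, and payload. The function `parse_http_message` and the sample messages (including the host `www.example.com`) are made up for illustration, and the parser ignores real-world details like repeated headers and chunked encoding.

```python
def parse_http_message(raw):
    """Split a raw HTTP/1.1 message into (start_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # a blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# A request with headers but no payload
request = b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"

# A response with headers and a payload
response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)

req_line, req_headers, req_body = parse_http_message(request)
resp_line, resp_headers, resp_body = parse_http_message(response)
```

The request here has headers and an empty payload, while the response carries both, which matches the "always headers, sometimes payload" rule.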
2: One object means one request and one response
There is a 1:1:1 ratio between an object, a request generated to get that object, and the response containing that object. Every JPG, every HTML file, every PDF requires an individual request, and will generate an individual response. Got a page with 11 objects (one HTML file and 10 images)? Then you’ll need 11 individual HTTP requests (each with its own headers), and you’ll get 11 individual responses. If you ask for an object the server doesn’t have, the server still responds with a message (such as 404 Not Found).
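The 11-object page works out like this in a quick Python sketch. The host name, the object paths, and the helper `requests_for_page` are invented for the example; the point is only that the list of requests is exactly as long as the list of objects.

```python
def requests_for_page(host, paths):
    """Build one minimal HTTP/1.1 GET request per object on the page."""
    return [f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n" for path in paths]


# One HTML file plus ten images (hypothetical paths)
page_objects = ["/index.html"] + [f"/images/photo{i}.jpg" for i in range(1, 11)]

messages = requests_for_page("www.example.com", page_objects)
request_count = len(messages)  # one request per object
```

Each of those requests carries its own headers, and each one gets its own response back from the server.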
3: Stop seeing the world in terms of web pages
Developers especially, but also network and server administrators, tend to see problems in terms of web pages. That’s a mistake. Web servers, load balancers, proxies, and other devices don’t see the world in terms of web pages. They see two things: HTTP and Layer 3/4 (TCP/IP).
If you want to learn more about HTTP, I highly recommend the Firefox add-on called HTTPfox. It will show you each object and the headers for the request and responses they generate.
I Got HTTP Header Rickrolled

I just got HTTP header Rickrolled. I’m in Oslo, Norway right now, and I’m talking to a group about HTTP, and specifically how some web pages can require quite a lot of HTTP requests. I asked the group if they knew a site that had lots of objects, and they suggested http://vg.no, one of Norway’s most popular news sites. It’s very image-intensive, and was perfect for this demonstration with over 150 objects for the first page. I was using HTTPfox to browse the various objects, and I saw this header:
(Status-Line) HTTP/1.1 200 OK
X-VG-WebServer: flash1
Expires: Mon, 23 Nov 2009 17:01:27 GMT
Cache-Control: max-age=21600
Content-Type: application/x-shockwave-flash
Last-Modified: Thu, 19 Nov 2009 09:17:46 GMT
Server: lighttpd/1.4.20
Content-Length: 27635
X-VG-WebCache: fritz
X-Rick-Would-Never: Give you up
X-VG-Varnish-IP: 10.84.201.11
Date: Mon, 23 Nov 2009 12:53:58 GMT
X-Varnish: 1899982491 1867855938
Age: 6828
Via: 1.1 varnish
Connection: keep-alive
X-Cache: HIT
X-Cache-Hits: 32843
I just got Rickrolled.
Cavium Buys MontaVista Linux
Cavium is a company that makes network processor chips, and is probably best known in the load balancing world as the company that makes the SSL ASICs that power a lot of the products out there.
Recently Cavium purchased MontaVista for $50 million. More at moblinzone.
KEMP Releases LoadMaster 5.0 Firmware
KEMP Technologies released the 5.0 LoadMaster firmware for LoadMaster 2000 and above models on Tuesday. It brings VLAN trunking (802.1Q) as well as Etherchannel to the LoadMaster series.
Not as prominent in the press release, but what I personally think is the neatest feature, is dynamic transparency. Transparency is when the source IP address of the client is maintained, which is the default method for most load balancers. The LoadMaster’s non-transparency is probably known more commonly in the industry as Source NAT, or SNAT. This is when the client’s IP address is replaced by an IP on the load balancer.
When preserving the true source IP address of your clients, you cannot have clients on the same network as your servers. This is sometimes referred to as “the same subnet problem”. The cause of this is that the return traffic needs to pass through the load balancer on the way out. If the client is on the same subnet as the servers, the servers reply directly to the client, rather than through the load balancer.
The solution for the same subnet problem is usually to enable SNAT/non-transparency, but you lose the true source IP address of your clients, so the web server logs will show everyone coming from one address.
The higher-end load balancers have the ability to do selective SNAT, and now KEMP has the ability to do selective SNAT automatically. I’ve yet to see it in action, so I can’t attest to how well it works, but it’s potentially a very nice feature.
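The decision behind selective SNAT can be sketched in a few lines of Python. To be clear, this is my own illustration of the idea and not KEMP’s implementation (which, as I said, I haven’t seen in action): the function `needs_snat` and the addresses are made up.

```python
import ipaddress


def needs_snat(client_ip, server_subnet):
    """Return True when the client sits on the servers' subnet.

    In that case the servers would reply directly to the client, so the
    load balancer has to replace the source address (SNAT) to stay in the
    return path. Otherwise the true client IP can be preserved.
    """
    client = ipaddress.ip_address(client_ip)
    subnet = ipaddress.ip_network(server_subnet)
    return client in subnet


same_subnet_client = needs_snat("192.168.1.50", "192.168.1.0/24")
outside_client = needs_snat("203.0.113.9", "192.168.1.0/24")
```

With a check like this, only the clients on the servers’ own subnet lose their true source IP in the web server logs; everyone else shows up with their real address.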
Guess Who’s Back? Hint: Alteon
When Radware purchased the remnants of the once great Alteon line, many thought they were simply buying the customer list and were going to abandon the platform entirely. Radware insisted at the time this was not the case, but there was the usual (and understandable) skepticism. After all, Alteon languished in the arms of the deteriorating Nortel. (Although the old Alteon hardware — such as the AD3 and 180E — thrived as a used ecosystem on eBay, despite the lack of vendor support.)
But it seems Radware has made good on their promise to keep up the Alteon line, although in slightly different form. They’ve released the Alteon 5412, a 20-Gigabit Layer 7 device (web acceleration and SSL promised for mid-2010). It’s the Alteon software running on top of their OnDemand Switch 3 switching platform. Since the acquisition, they have also released two maintenance releases of the Alteon OS for the previous platforms. So it seems they’re making good on their promise.
Check out the press releases and promo site (bringing back Alteon marketing materials even).











