Mega Proxy Not So Mega, Akshually
Apologies for the LOLcatspeak. I’m incapable of helping myself.
The driving force behind Layer 7 persistence (keeping an individual user tied to a specific server in a server group based on HTTP headers instead of IP address) was the dreaded AOL Megaproxy issue. AOL had the nasty little tendency of routing all web traffic through a handful of mega proxies located throughout the US and Canada.
This caused a problem with the previous method of persistence, which was to base it on source IP address. Typically, one IP address equaled a single user. However, with AOL, you could have 20,000 users coming from a single IP address. The load balancer would think it’s a single user, and if you had 300 servers ready to take orders, all 20,000 users would go to one. That situation has happened a few times, and it’s hilarious, so long as you aren’t the company with the 300 servers.
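To make the skew concrete, here is a minimal sketch, assuming the load balancer picks a server by hashing the source IP. Real boxes usually keep a persistence table instead, so the hash is only a stand-in, and the function names and IP addresses are made up for illustration.

```python
import hashlib
from collections import Counter

def pick_server_by_source_ip(source_ip, server_count):
    """Stand-in for source-IP persistence: same IP, same server, every time."""
    digest = hashlib.md5(source_ip.encode()).digest()
    return int.from_bytes(digest[:4], "big") % server_count

def simulate(client_ips, server_count):
    """Count how many users land on each server."""
    return Counter(pick_server_by_source_ip(ip, server_count) for ip in client_ips)

SERVER_COUNT = 300

# 20,000 users hidden behind one mega proxy address (illustrative IP).
proxy_users = ["198.51.100.7"] * 20000

# 20,000 users who each arrive from their own address (illustrative range).
direct_users = [
    f"10.{i // 65536}.{(i // 256) % 256}.{i % 256}" for i in range(20000)
]

proxy_load = simulate(proxy_users, SERVER_COUNT)
direct_load = simulate(direct_users, SERVER_COUNT)

print("servers used behind the proxy:", len(proxy_load))
print("busiest server behind the proxy:", max(proxy_load.values()))
print("servers used with direct users:", len(direct_load))
print("busiest server with direct users:", max(direct_load.values()))
```

Behind the proxy, every user hashes to the same server because the load balancer only ever sees one address. The direct users spread out across the farm.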
I still teach that mega proxy problem, mostly out of muscle memory. But when I stopped to think about it, I wondered: do we really have a problem with megaproxies anymore? Does AOL even still do this, and even if it does, does AOL represent a significant amount of traffic?
The answer to the latter question is almost certainly no. AOL has seen a dramatic drop in subscribers, and most people connect directly to the Internet through their cable modem or DSL provider. And I don’t know of any major Internet provider that utilizes proxies for their users’ Internet requests.
Layer 7 persistence is still applicable to situations where you may have multiple users coming from a single IP address (such as a small client base coming from a handful of offices, with each office using one public IP address), but I wonder what doing Layer 4 persistence would do to a major site these days. I’m thinking, not much.
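For contrast, here is a rough sketch of cookie-based Layer 7 persistence, assuming a simple round-robin pick for new users. The cookie name `lb_server` and the class are hypothetical, not taken from any real load balancer.

```python
import itertools
from collections import Counter

COOKIE_NAME = "lb_server"  # hypothetical cookie name, not from any real product

class CookiePersistenceBalancer:
    """Toy Layer 7 persistence: the cookie, not the source IP, picks the server."""

    def __init__(self, server_count):
        self.server_count = server_count
        self._round_robin = itertools.cycle(range(server_count))

    def route(self, cookies):
        """Return (server_index, cookies_to_set) for one request."""
        value = cookies.get(COOKIE_NAME, "")
        if value.isdigit() and int(value) < self.server_count:
            # Returning user: honor the server named in the cookie.
            return int(value), {}
        # New user: take the next server in rotation and hand back a cookie.
        server = next(self._round_robin)
        return server, {COOKIE_NAME: str(server)}

balancer = CookiePersistenceBalancer(server_count=300)

# 20,000 browsers, all behind the same public IP; each dict is one cookie jar.
browsers = [{} for _ in range(20000)]

first_visit = []
for jar in browsers:
    server, to_set = balancer.route(jar)
    jar.update(to_set)
    first_visit.append(server)

# On the second request each browser presents its cookie and sticks.
second_visit = [balancer.route(jar)[0] for jar in browsers]

load = Counter(first_visit)
print(len(load), min(load.values()), max(load.values()))
```

The source IP never enters the decision, so it doesn’t matter whether those 20,000 browsers share one address or 20,000.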
What do you think?



While it’s true that mega-proxies at carriers have died out (since the difficulty of running them was deemed to be more than the cost of raw bandwidth - a rousing management triumph of stupid over clever), we are seeing the rise of mega-proxies at large companies. These mega-proxies are used for scanning, security and logging of all Internet access (compliance etc). Having recently deployed a mega-proxy cluster for 60,000 users with peak simultaneous access for 20,000 users, I can say that it’s ‘not dead yet’.
On the other hand, the impact of these users on your loadbal rig is likely to be much less. 20,000 corporate users are unlikely to flood most sites and cause unequal load balancing, unless, of course, you are providing services to large companies, in which case we know very well that this problem still exists.
Note also that the rise of Blue Coat, Cisco WAAS and other web accelerators is an unknown quantity, since they are also used to proxy HTTP.
There is too much uncertainty for me to say, as a general observation, that L4 persistence is viable.
(And the vendors are loving that because that sells more and bigger boxen)
September 15th, 2008 at 3:44 pm
The mega proxy situation is still very much alive in the hosted services industry serving the enterprise market. Some load balancers are better than others at handling it. Same goes for applications. Probably the best help in getting around the problem has been virtualization. Being able to throw 20 servers at a farm in a short period of time makes seasonal deployments of servers practical.
September 16th, 2008 at 7:05 am
Mega-proxies are alive and well. If you have an AT&T wireless internet connection, by default your internet access is proxied (and your graphics are compressed as well).
February 15th, 2009 at 12:05 am
As a large managed hosting shop, we are frequently asked why one server gets hammered even though the customer is using load balancing (with sticky). The culprit is almost always an office full of users behind NAT who use some web application hosted on the servers. Cookie insertion is the next thing to do, but this gets more involved when SSL offloading is required to do the inserts for SSL traffic.
Ideally, we shouldn’t have to do this. The application should keep its own shared state and not care which web/application server the request lands on. I don’t know if this is short-sightedness of the application developer who never intended for their app to be load balanced, or if it’s an intentional push to move this to the load balancer.
If it’s the latter case, I don’t see how it makes economic sense as the solution tries to scale up. As traffic increases, it requires bigger and faster load balancers to handle the extra persistence. These things are certainly not cheap when compared to server hardware+admin host.
March 9th, 2009 at 2:26 pm
Correct, no economic sense if you look at traditional “legacy hardware appliances”.
However, wouldn’t it be great to combine this functionality with standard commodity server hardware?
Which you can do. Take a look at the Zeus eXtensible Traffic Manager (ZXTM), everything (and more) that you need in a load balancer. But the form factor is software, the perfect combination…
Nick
March 10th, 2009 at 9:39 am
Crazy how many people still use web proxies, and how many others profit from them.
May 11th, 2009 at 5:58 pm