Hardware Config and Multiple Array Design



Quee -> Hardware Config and Multiple Array Design (18.Mar.2008 4:41:46 PM)

Hello,

I'm faced with a hardware dilemma in setting up ISA Server (specifically the Configuration Storage Servers) in an Enterprise environment.  We have thousands of Exchange users in each site, and we're setting up ISA to publish Exchange resources.  I have a test site set up, with a Configuration Storage Server and an array of several ISA servers to proxy Exchange.

On the hardware side, we have really beefy machines (AMD-based) with anywhere from 8 to 32 GB of RAM.  Because the Config Storage Server will be taking such a big hit and I can only have one (local workgroup setup), I want to beef up the Config Storage Servers (processors specifically) so that they can handle the load.

Two questions I have are:
  1. If I leave the topology at one array per site (i.e. one Config Storage Server per site), does it make sense to try to max out the hardware config to quad-core processors?  The normal hardware rule of thumb is to scale memory based on the number of processors you have.  But if I do that I'll end up with a minimum of 8 GB of RAM (32 GB if the hardware folks go with their recommendations).  With ISA only using 3 GB and only capable of handling 4 GB, what problems could I run into?  We use W2K3 R2 SP1 as the base, so the MS hotfix for memory addressing is already applied (it's included in SP1).
  2. Has anyone ever implemented a multi-array topology in one site?  How does this get managed - i.e. one URL per array, so for example OWA traffic to one URL, POP3 traffic to another, ActiveSync to a third?
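
To illustrate what I mean by one URL per array, the layout I'm picturing is roughly this (the host names and addresses below are just placeholders, not our real ones - each published name would resolve to the virtual IP of a different array):

    owa.example.com         ->  VIP of array 1  (OWA publishing rules)
    mail.example.com        ->  VIP of array 2  (POP3/IMAP publishing rules)
    activesync.example.com  ->  VIP of array 3  (ActiveSync publishing rules)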

Thank you for your help!  At this point I'm tempted to do both the HW upgrade and the multi-array topology.

Quee




gbarnas -> RE: Hardware Config and Multiple Array Design (18.Mar.2008 9:42:23 PM)

Hmm... Beefy CSS, eh?

My organization has four distinct ISA arrays served by a pair of CSSs. The ISA boxes are quad-core (two dual-core CPUs) systems with 4 GB RAM, and are used for various things within our organization. One array is for Dev/QA and consists of a single server used for EDI-type communications to and from our vendors. The second is an NLB array used for production of the same EDI services. We just migrated our user community from an ISA 2000 array to the third NLB cluster today - all Internet access from 2 main offices (800+ users) and 300+ locations (2,500+ users). While monitoring the new array today, I saw about 1,250 active connections. Memory utilization was 2.4 GB of the 4 GB and flat, and CPU utilization averaged about 0.33% with occasional peaks to 12%. We do a LOT of Internet access during the day; our staff is constantly scanning both vendor and competitor web sites to close sales. Our fourth array is under development/testing, and will support specialized "reverse" publishing of vendor non-HTTP-based applications.

Our web and mail services are handled outside of ISA, except for OWA.

Oh, our CSS servers? A pair of VM (ESX 3.x) systems with single CPU and 1G RAM, one in each of the two data centers. They barely register any load at all, since the ISA servers only check them periodically for configuration changes. The VM images are backed up and can be moved to another VM host for DR. I'm not sure I'd really put much horsepower behind the CSS boxes.

The EDI servers have 100 URLs defined for various vendor communications over HTTP. We use alternate ports, one port per application, all bound to a single IP address. A vendor might communicate with 1-3 applications, each on a distinct URL/port. These EDI gateways are used to proxy access between our application servers and vendor sites, and they can respond asynchronously by connecting to the URI. This is a reverse-proxy / ISA publishing configuration. We use a separate server array, and separate URIs per application, to isolate the financial transactions from other traffic.
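
To give a rough picture of how that hangs together (the names and ports below are made up for illustration, not our real ones):

    http://edi.example.com:8001  ->  publishing rule for vendor application A
    http://edi.example.com:8002  ->  publishing rule for vendor application B
    http://edi.example.com:8003  ->  publishing rule for vendor application C

All of the listeners sit on the same external IP; each rule simply forwards its port to the internal application server that owns that URI.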

These arrays are managed quite easily from the CSS, or even from the MMC installed on my workstation. My only gripe is that dynamic log monitoring requires you to log onto the array itself.

Glenn




Quee -> RE: Hardware Config and Multiple Array Design (18.Mar.2008 10:53:20 PM)

Thank you for the post - this certainly helps.  I'm wondering, though: when I have 40k+ users per site, and ISA is the reverse proxy solution authenticating users against a local ADAM store, how well will this scale?  Your numbers tell me that I definitely have to scale up the hardware to the max for the CSS server, although on the surface I think it can be done.  Peak traffic time is obviously in the morning, so the CSS will take a hit basically once a day in each site.

Even talking to MS, they don't have the numbers from perf testing or from dogfooding ISA internally to prove this out either way.  One of the other things in the mix is that we don't use NLB - we use hardware load balancing devices in front of and behind the ISA servers.  That makes it a bit difficult to compare with others, as many use NLB to or from the ISA servers.

Maybe I'm over-analyzing the impact the user traffic will have - we're only publishing OWA and some POP3 and IMAP connections, so perhaps it won't be as much of an impact as I'm anticipating.  No VPN, no caching, no SharePoint, no OCS, etc.  But I can't help but worry given how many users we have and the fact that one lonely CSS server will be out there doing front-line authentication for each site and its 40,000+ users.

We're looking at quad-core servers with more RAM than the OS can handle (where IS ISA 2008 when we need it!).  We'll see how that goes hardware-wise.

I wish I had your scenario Glenn, it sounds so much more . . . manageable, and a little less risky than what I'm working with here.

Thanks!




gbarnas -> RE: Hardware Config and Multiple Array Design (19.Mar.2008 11:31:20 AM)

I don't understand why you're worried about the CSS at all. It does nothing but provide the storage for the ISA configuration. It does nothing related to users, authentication, or anything else. If you make an ISA configuration change, it will allow ISA to obtain the update from a central location. ISA checks in every 15 seconds or so, depending on whether you change the defaults or not. The check-in is a quick transaction for a timestamp, doing a download only when there is a config change for that array.
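
In rough pseudo-code terms, the check-in loop on each array member amounts to something like this (a conceptual sketch only - not ISA's actual code or API, and the function names are invented):

import time

POLL_INTERVAL_SECONDS = 15   # roughly ISA's default check-in interval; it is configurable per array

def get_config_version():
    """Stand-in for the cheap 'has anything changed?' timestamp query against the CSS."""
    return "2008-03-19T11:00:00"   # dummy value for the sketch

def download_full_configuration():
    """Stand-in for the expensive full configuration download from the CSS."""
    print("pulling updated array configuration")

last_seen = None
while True:
    current = get_config_version()        # lightweight check - this is most of what the CSS ever serves
    if current != last_seen:
        download_full_configuration()     # only when the stored configuration actually changed
        last_seen = current
    time.sleep(POLL_INTERVAL_SECONDS)

The point is that the expensive path only runs when you actually publish a change, which is why the CSS stays nearly idle.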

One CSS can handle many arrays with many servers per array, and a second CSS provides fault tolerance. With your user numbers, you need to focus more on larger arrays than on larger server hardware. A system with 2 quad-core CPUs is much less expensive than one with 4, so two dual-quad systems will cost less to buy, cost about the same to operate, and provide much better scalability in the enterprise.

FYI - We compared our F5 load balancer with Integrated NLB during our development phase and went with NLB. Lower cost, tighter integration, balanced caching, and same or better performance.

Glenn




Jason Jones -> RE: Hardware Config and Multiple Array Design (19.Mar.2008 12:03:11 PM)

quote:

ORIGINAL: gbarnas

One CSS can handle many arrays with many servers per array, and a second CSS provides fault tolerance. With your user numbers, you need to focus more on larger arrays than on larger server hardware. A system with 2 quad-core CPUs is much less expensive than one with 4, so two dual-quad systems will cost less to buy, cost about the same to operate, and provide much better scalability in the enterprise.

FYI - We compared our F5 load balancer with Integrated NLB during our development phase and went with NLB. Lower cost, tighter integration, balanced caching, and same or better performance.

Glenn



Scaled out is always better (as per my reply to another one of Quee's posts).

Interesting feedback regarding a hardware load balancer vs. NLB... I get a lot of resistance from customers who often see NLB as a crappy solution because it is software-based and from Microsoft (and free!). I think that if you configure NLB properly and ensure the networking environment is set up to support it, it provides a very good solution that is difficult to beat given the list of advantages you have mentioned. The only other argument I often encounter is that NLB and NIC teaming are mutually exclusive, and people like the idea of NIC teaming and often don't want to remove it...

I have personally used NLB for lots of customers with good success; an 8-node array is the largest single array I have used...

Microsoft have recently released an update that allows the use of multicast NLB, which removes the only potential issue or limitation that I had come across with NLB on ISA.

Personally, I would need a very good reason NOT to use NLB for my Enterprise designs...especially as multicast is now supported too.

Cheers

JJ




gbarnas -> RE: Hardware Config and Multiple Array Design (19.Mar.2008 12:26:42 PM)

Funny - we built the servers with NIC teaming at first, and I disabled it for NLB. Got a lot of flak from the network and systems teams here initially. We now have 7 ISA 2006 servers in 4 arrays without teaming. Each has duplicate NIC ports that we can switch to in the event of a problem.

Glenn




Jason Jones -> RE: Hardware Config and Multiple Array Design (19.Mar.2008 1:11:07 PM)

Not just me then! [:D]




Quee -> RE: Hardware Config and Multiple Array Design (19.Mar.2008 2:05:39 PM)

I wish there were case studies showing that Windows NLB can really scale in Enterprise scenarios.  I've yet to work in an Enterprise of over 150k users (and I've worked in a few now) where Windows NLB was chosen as a solution over hardware appliances.  The client's response has always been "it isn't proven to scale", and I haven't seen any metrics to the contrary.  I don't have a leg to stand on when they present me with their large (and proven) Nortel, Cisco, BigIron, etc. devices that are time-tested.

Even if I did have metrics to prove that out, though, making that case wouldn't fall within the scope of my ISA effort.  Oh well . . .

Thanks for all the responses - I appreciate the feedback and guidance.

Quee



