I have a DNS server behind an ISA firewall. I noticed that DNS publishing fails almost every day and I have to restart the firewall service or restart the machine every time this happens. Nslookup against the internal IP of the NS server works fine. I remember reading on this message board that Tom is also encountering this problem. Is this an ISA bug? Can anybody recommend a fix?
I used to have that problem a lot, until I revamped the DNS infrastructure for those sites publishing their own DNS.
The most important thing is to configure a separate DNS server for your external domains. These servers should answer questions only for the domains you host, and they should not perform recursion. Since these DNS servers don't perform recursion, you can't use them to resolve Internet host names for your internal network clients.
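For illustration, an authoritative-only setup looks roughly like this in BIND-style configuration (the zone name and file are placeholders, not from the thread; on Windows DNS the rough equivalent is enabling "Disable recursion" in the server's Advanced properties):

```
// named.conf fragment -- illustrative only; zone name is a placeholder
options {
    recursion no;               // answer only for zones we host
    allow-transfer { none; };   // lock down zone transfers as well
};

zone "example.com" {
    type master;
    file "example.com.zone";
};
```

With `recursion no`, the server answers authoritatively for its own zones and refuses to chase down anything else, which is exactly why internal clients can't use it as their resolver.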
Once I made the change, I don't have the problem nearly as often, although it still shows up every week or two. I still haven't figured out how to get rid of the problem completely.
As you might remember I posted about the same kind of problem about a month ago.
Well, things went from bad to worse so we decided to open a call to Microsoft... after the initial "we've never heard about that before" and "have you tried rebooting" and such, we finally got to second-level support.
After more than a month of them going through the config, machine, setup, logs, and performance counters (we even gave them VPN and TS access to our machines!) they are basically stumped!
What we have found is that with increasing load the DNS publishing rule works for shorter and shorter times; I've actually seen it fail after as little as 5 minutes.
We have even seen that when the load gets really high, DNS queries from the inside also fail!
It seems that the ISA server creates some kind of "session" for each UDP connection (and yes, I KNOW that UDP is not session/connection oriented) to map replies to requests, and with a high load it soon runs out of the session pool, which is initially set to 40 sessions per client.
After increasing this to 600 we could run for longer, but it still fails when the load gets too high.
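The mechanism described above can be sketched as a toy model: a fixed pool of per-client "session" slots that simply stops admitting new UDP request/reply mappings once it is full. The class and numbers below are illustrative, not ISA internals.

```python
# Toy model of a fixed per-client UDP "session" pool, as described in
# the thread. Names and the 40-slot default are from the post; the
# class itself is a hypothetical sketch, not ISA code.

class UdpSessionPool:
    def __init__(self, max_sessions=40):   # ISA's reported default
        self.max_sessions = max_sessions
        self.active = set()

    def open_session(self, request_id):
        """Try to claim a slot that maps a reply back to its request."""
        if len(self.active) >= self.max_sessions:
            return False          # pool exhausted: request must wait
        self.active.add(request_id)
        return True

    def close_session(self, request_id):
        """Free the slot once the reply has been delivered."""
        self.active.discard(request_id)

pool = UdpSessionPool(max_sessions=40)
accepted = sum(pool.open_session(i) for i in range(100))
print(accepted)   # → 40: only 40 of 100 concurrent requests get a slot
```

Raising the limit (to 600, as in the post) just moves the cliff; once requests arrive faster than slots are freed, the pool still fills.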
Which in this case is a whopping total of:
- about 250 clients using the web proxy
- 2 internal FTP servers that are published
- 2 internal HTTP/HTTPS servers that are published
- 2 internal DNS servers that are published - these also act as DNS forwarders for all internal servers
- 1 internal SMTP/POP server that is published
- traffic on the external interface below 2 Mb/s
Well, the contact at Microsoft says that this is not in the white papers and that we really should have done a full-load test of it before implementing!
Just a small question, who in the ***** has the resources to do that kind of test before implementation?
So at the moment we are looking at ripping out that piece of pre-release, bug-infested software (sorry, can't call it anything else) and instead installing something that is of a higher version than 1.0!
I'll keep you posted if we get anything from them that takes care of the problem.
** NEWS FLASH ** Microsoft has managed to reproduce our problem in their lab AND they have managed to find the cause... no cure yet, though :-(
IF you have UDP-based publishing (any UDP, not just DNS) AND a Site and Content filter containing an FQDN, the following happens:
Incoming UDP requests are checked against the Site and Content rules by attempting a reverse lookup on the incoming IP (to find an FQDN to match against the IP).
If this for some reason fails (like the requesting IP not being in a reverse zone), then ISA tries to make an NBTSTAT query against the remote IP to find the FQDN.
Once that lookup has succeeded, failed, or timed out, it will then process the request.
This can take some time (on my side, at least, we just drop incoming NetBIOS, so those will have to time out), and during that time ISA is gobbling up UDP connections.
With heavy traffic this will at times fill the pool of available UDP mappings, so that incoming requests first have to wait for another request to make it through the S&S rules before they can start down the same path!
So if you have a remote client that is not in a reverse zone and cannot be resolved by NBTSTAT, and it re-requests the DNS information after, say, 5 seconds, then you can easily end up in a situation where the requests are held pending, waiting for a processing slot, while ISA is trying to resolve the FQDN of a previous request FROM THE SAME MACHINE!
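The queueing effect described above can be sketched with a tiny simulation: each incoming packet occupies a pool slot for the full duration of its reverse lookup, so repeat packets from an unresolvable IP end up waiting on the timeouts of their own earlier packets. The function, timings, and pool size here are all hypothetical, chosen only to make the effect visible.

```python
# Toy queueing sketch of the failure mode: a pool slot is held while
# the PTR lookup either succeeds quickly or times out slowly. All
# names and numbers are illustrative assumptions, not ISA internals.

def request_waits(requests, pool_size, ptr_records,
                  lookup_ok=0.1, lookup_fail=10.0):
    """requests: list of (arrival_time, source_ip); returns wait per request."""
    busy_until = [0.0] * pool_size   # time at which each slot frees up
    waits = []
    for arrival, ip in requests:
        slot = min(range(pool_size), key=lambda s: busy_until[s])
        start = max(arrival, busy_until[slot])   # may have to queue
        waits.append(start - arrival)
        # slot is held for the fast hit or the slow timeout
        busy_until[slot] = start + (lookup_ok if ip in ptr_records else lookup_fail)
    return waits

# Four packets from one IP with no reverse record, pool of 2 slots:
# the 3rd and 4th packets sit waiting on the timeouts of the 1st and 2nd.
print(request_waits([(0, "x"), (0, "x"), (0, "x"), (1, "x")],
                    pool_size=2, ptr_records={}))
```

With a PTR record present, every lookup finishes in a fraction of a second and nothing queues, which matches the observation that only unresolvable clients trigger the pile-up.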
So the good news is that they know why, and the bad news is that it sounds like a fundamental change is needed!
Possible workaround: no S&S rules! Not sure I want to go that way....
Was this clear? If not, drop me a line and I'll try again....
What do site & content rules have to do with UDP publishing?
I thought that site & content rules are only checked for *outbound* connections and not for inbound connections! So, for normal web/server publishing with only a primary inbound connection, no site & content rule is checked, as far as I know. However, when server publishing an internal FTP server with active-mode FTP in use, the secondary connection (FTP data) is an outbound connection, and if there is no site & content rule allowing this request, the secondary connection will be denied by ISA Server.
Now, when you server publish a UDP service, ISA simulates an inbound connection. So you would think no site & content rules are applied. Hmm... another ISA Server mystery?
A few answers to the comments on my latest post....
Stefaan: A few months ago I actually posted a question asking what S&S rules had to do with publishing, since we saw that server publishing was affected by the S&S rules. It seems that every part of the ISA "mess" is involved in every request!
AxleMunshine: If you increase it to a ridiculously high value it will work longer; however, it will not fix the problem!
Tom: Correct! That NetBIOS garbage is what got the MS support personnel started on the path to finding the problem!
Jorgen, thanks for your reply. I set it up higher. The DNS behind ISA has been working since then and I have about 35 domains served by my ISA box.
Anyway, I used brute force to solve my problems for good: one backup DNS at my ISP, the one behind my ISA box, and a resurrected PC acting as a dedicated DNS server directly on an external address. I installed the Sygate personal firewall on it. Works great, since it runs as a service.
So, if the DNS behind ISA fails, I have two others that answer for each domain.
It's a pain, but I can't complain. ISA works great for everything else in my case.
Last week we received an engineering patch from MS that was supposed to fix this problem... well, it sort of takes care of the problem.
What it does is this: for each incoming UDP connection it still does that reverse DNS lookup to match against the S&S rules. However, if it doesn't get a reply (which can and will happen in the real world), it will cache the non-reply so that the next time it will not attempt a reverse lookup on the same IP!
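The patched behavior, as described, amounts to negative caching of failed reverse lookups. A minimal sketch of that idea, assuming a pluggable `resolve_ptr` function and a fixed negative TTL (both assumptions of mine, not details of the actual patch):

```python
# Sketch of negative caching for reverse DNS lookups, as the engineering
# patch is described above. resolve_ptr, the TTL, and the cache policy
# are hypothetical; this is not the actual ISA fix.
import time

class NegativeCachingResolver:
    def __init__(self, resolve_ptr, negative_ttl=300):
        self.resolve_ptr = resolve_ptr      # real lookup; returns FQDN or None
        self.negative_ttl = negative_ttl    # seconds to remember a failure
        self.cache = {}                     # ip -> (fqdn_or_None, expiry_time)

    def lookup(self, ip, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(ip)
        if hit and hit[1] > now:
            return hit[0]                   # cached result, even if it's None
        fqdn = self.resolve_ptr(ip)         # slow path: may time out
        self.cache[ip] = (fqdn, now + self.negative_ttl)
        return fqdn

# Usage: the second lookup for an unresolvable IP is answered from cache.
calls = []
def slow_ptr(ip):
    calls.append(ip)
    return None   # simulate "no reverse zone / timed out"

r = NegativeCachingResolver(slow_ptr)
r.lookup("192.0.2.1", now=0)
r.lookup("192.0.2.1", now=1)
print(len(calls))   # → 1: only the first lookup hit the slow path
```

This matches the observed behavior: repeats of a known-bad IP are fast, but the first packet from each *new* unresolvable IP still pays the full timeout.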
Well, it sort of works... we're no longer seeing the ISA server get completely bogged down with outstanding requests, but it can still be frustratingly slow when it attempts to resolve a new IP.
So, it looks like the S&S rules are deeply involved and can't be removed from publishing... I guess it is always good to be able to block Penthouse and Playboy via S&S rules so that they can't surf to your servers!