Hello TMG experts - I'm hoping someone will have seen this issue before and can help! My example uses anonymised URLs, but the issue is real!
I have an internal site, https://website.internal.local, that I want to publish to external users using TMG 2010, so I set up a reverse proxy rule on TMG pointing “https://reversed.mydomain.com” at “https://website.internal.local”.
TMG blocks access to any URL that contains a UTF-16 encoded element (the %uXXXX form, such as %u00e9 for “é”). A log entry is below:
Denied Connection MY-TMG-SERVER 19/10/2011 09:15:35 Log type: Web Proxy (Reverse) Status: 12302 The server denied the specified Uniform Resource Locator (URL). Contact the server administrator. Source: 220.127.116.11:27365 Destination: 18.104.22.168:443 Request: GET https://reversed.mydomain.com/index.html?test=%u00e9 Filter information: Req ID: 19f644d1; Compression: client=No, server=No, compress rate=0% decompress rate=0% ; FBA cookie: exists=no, valid=no, updated=no, logged off=no, client type=unknown, user activity=yes Protocol: https User: anonymous
Comparing this log with a different request, where the “é” character is not encoded, gives some interesting insights:
Denied Connection MY-TMG-SERVER 19/10/2011 09:04:00 Log type: Web Proxy (Reverse) Status: 12217 The request was rejected by the HTTP filter. Contact your Forefront TMG administrator. Rule: PUBLISH_INTERNAL_SITE Source: External (22.214.171.124:11611) Destination: Local Host (126.96.36.199:443) Request: GET https://website.internal.local:443/index.html?test=é Filter information: Req ID: 19f64248; Compression: client=No, server=No, compress rate=0% decompress rate=0% ; FBA cookie: exists=yes, valid=yes, updated=no, logged off=no, client type=public, user activity=yes ; Blocked by the HTTP Security filter: URL contains high-bit cha Protocol: https User: (SecurID)bloggsj
STATUS: The error message differs between the two examples. For the UTF-16 encoded request, TMG itself blocks the URL at some earlier, top-level stage (status 12302). When the “é” character is not encoded, it is the HTTP filter that blocks the request (status 12217), because “é” is a high-bit character.
REQUEST: The GET request is also quite different. In the UTF-16 example, the request is blocked before any translation to the internal address has occurred (the log still shows https://reversed.mydomain.com). In the non-encoded version, the request has progressed through to the HTTP filter stage and is blocked by the “Block High Bit Characters” option within the rule’s HTTP properties. (If I uncheck “Block High Bit Characters”, the URL is accepted and it works!)
USER: Finally, the UTF-16 encoded request is blocked before any identity has been established (User: anonymous), whereas in the non-encoded example the user has already been identified.
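To make the difference between the two requests concrete, here is a small Python sketch of the three ways “é” can show up in a query string: raw (high-bit), standard percent-encoded UTF-8, and the non-standard %uXXXX form from my denied request. It only illustrates the encodings; it is not how TMG parses URLs, and the helper names `decode_percent_u` and `classify` are made up for this example.

```python
import re
from urllib.parse import quote, unquote

# Non-standard "%uXXXX" escape: "%u" followed by four hex digits.
PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def decode_percent_u(text: str) -> str:
    """Replace each %uXXXX escape with the character it names (no surrogate-pair handling)."""
    return PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), text)


def classify(query: str) -> str:
    """Hypothetical helper: label how non-ASCII data appears in a query string."""
    if PERCENT_U.search(query):
        return "percent-u"
    if any(ord(ch) > 127 for ch in query):
        return "high-bit"
    return "ascii-or-standard-escape"


raw = "test=é"                    # un-encoded, contains a high-bit character
standard = "test=" + quote("é")   # UTF-8 bytes, percent-encoded
percent_u = "test=%u00e9"         # form seen in the denied request

for label, value in (("raw", raw), ("standard", standard), ("percent-u", percent_u)):
    # unquote() only understands %XX escapes, so it leaves %u00e9 untouched.
    print(label, value, classify(value), unquote(value))
```

A standards-based decoder such as `urllib.parse.unquote` does not recognise `%u00e9` at all, which fits my impression that the problem is the encoding itself rather than the high-bit character.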
Summary and Question:
So – it looks like TMG 2010 has decided that a URL containing UTF-16 encoded characters is not valid and the URL is rejected. It’s not just a high bit characters issue; it seems to be about the encoding.
As I have no control over the internal site and can’t stop UTF-16 encoded characters from appearing in the content, is there any way to ask TMG server not to worry about these encoded characters in the URLs?
From: Amazon, Brazil
Check whether the Verify normalization option is disabled.
When Verify normalization is enabled, Forefront TMG decodes URL-encoded HTTP requests to determine that the decoded request is valid. (URL-encoded requests contain a percent sign (%) followed by a particular number in place of certain characters. For example, %20 corresponds to a space.) Normalization helps prevent attacks that rely on double-encoded requests. Web services such as Outlook Web Access may use double encoding for particular requests, but these requests are filtered by Forefront TMG by default. To allow these requests, you need to disable Verify normalization for the Web publishing rule. To modify an HTTP filter setting, see Modify HTTP Filtering for Web Traffic.
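As a rough illustration of what a double-encoding check looks like, the Python sketch below decodes a URL twice and flags it if the second pass still changes something. This is only an assumption about the general idea; it is not TMG's actual normalization logic, and `is_double_encoded` is a made-up name.

```python
from urllib.parse import unquote


def is_double_encoded(url: str) -> bool:
    """Return True if decoding the URL a second time still changes it."""
    once = unquote(url)    # first pass: %2520 -> %20
    twice = unquote(once)  # second pass: %20 -> space
    return once != twice


# "%2520" is "%20" with its percent sign encoded again as "%25".
print(is_double_encoded("/owa/?q=a%2520b"))      # double-encoded
print(is_double_encoded("/index.html?x=a%20b"))  # encoded once
```

The first URL decodes to `a%20b` and then to `a b`, so it is flagged; the second is stable after one pass.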
I did find some similar posts on other boards from people who have the same issue - but the suggested fix is to alter the internal website rather than configure TMG 2010.
For instance - http://mcaf.ee/udc48 - in this case the internal site was SharePoint and the issue was resolved by changing the language configuration of the internal source website (installing language packs to eliminate the long encoded %u**** characters).
Also the post here - http://mcaf.ee/8bvk4 - seems to suggest that the fix is best done on the internal website.
I really hope there is a way of fixing this issue on the TMG 2010 side, as the site I'm trying to publish is owned and operated by a third party and I haven't a hope of getting them to change it!