Last week (30 JAN) Attrition.org (@SecurityErrata) tweeted that the SANS GIAC site was susceptible to cross-site scripting (XSS) via the search field.
XSS is, without question, a vulnerability that almost every web application has suffered or will suffer at some point. One need only read Attrition's OSVDB, Secunia Advisories, or WhiteHat's website statistics reports to get a feel for how prevalent the issue is. There are many reasons why the vulnerability is #2 on OWASP's Top 10, and SANS, like so many others, is no stranger to the issue, as Johannes Ullrich (Dr. J) points out in his ISC Diary entry from 12 JUN 2012.
The particular vector reported last week regarding SANS/GIAC was quite interesting, as it was reported as a "simple" XSS. Upon closer inspection, however, it turned out to be far from simple to exploit. Initial reports indicated that the search boxes were vulnerable to basic attacks such as " onmouseover=alert(document.cookie) x=", but the actual cause was a chain of several conditions. Operating from a place of complete transparency, Brian C from the SANS Web team provided us with explicit details regarding this vulnerability so as to help readers protect themselves from similar issues.
According to Brian, when the web team received the report of the issue they tried all the "basic" XSS attacks and couldn't immediately reproduce it. After further research and communication with the reporter, Ryan F figured out how to duplicate the problem, and once it was verified, the team immediately shut down the search page while they worked on the investigation. It turns out that, while the attack string used by the original reporter triggered an alert, it was not the alert the reporter was trying to trigger. The attacker's string was actually handled properly by the code; the alert that was returned was a stored sample from a paper on the GIAC site, delivered back to the attacker via the search results. Furthermore, the issue was caused by a compounded series of events, none of which alone would have produced the problem; thus, the issue was not the "simple" attack that the original reporter thought it was.
Brian stated that the root cause was attributed to PHP's strip_tags() function. This function is used in PHP to remove HTML tags from strings in order to render them "safe" for display in the browser. When using this function in certain areas of the application, the team selectively allows "safe" tags through, an example being the tags used to indicate a paragraph of text (<p> and </p>). The team discovered that while the function does strip tags as it should, it does not check the attributes of any safe tags that are let through. As a result, every area of the application where strip_tags() is in use was reviewed. The good news was that in most of the places the function was in use, no tags were allowed through. In the other parts of the application the function was not used in conjunction with user input, so again the risk was quite low. In addition to reviewing the use of strip_tags(), the team focused on expanding their core validation libraries. These enable whitelisting of attributes, which will be used in areas where safe tags are allowed through in order to prevent such issues in the future.
To summarize actions taken, the team:
Reviewed all uses of the strip_tags() function
Expanded validation library to check tag attributes when needed
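To illustrate the kind of attribute check described above, here is a minimal sketch in Python rather than PHP. It is not the SANS team's validation library: the ALLOWED policy, the AllowlistSanitizer class, and the sanitize() helper are all hypothetical names chosen for this example. The idea is simply that allowing a tag through is not enough; each attribute on that tag must also be on an explicit allowlist.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy: allowed tag -> set of attributes permitted on it.
ALLOWED = {"p": set(), "a": {"href"}}


class AllowlistSanitizer(HTMLParser):
    """Rebuilds markup, keeping only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text content is still emitted
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag]:
                continue  # e.g. onmouseover, onclick, style
            value = value or ""
            # Assumption for this sketch: only http(s) links are acceptable.
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always re-escaped before it is written back out.
        self.out.append(escape(data, quote=False))


def sanitize(fragment):
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onmouseover="alert(document.cookie)">hi</p>'))
```

In this sketch the paragraph tag survives but its event handler does not, which is the behavior a tag-only filter fails to provide.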
The unfortunate series of events that caused the issue included:
The underlying strip_tags() issue above
A paper on the site with an example of a simple XSS
Google indexing the site and converting papers to text for indexing
Search string used by the original reporter
Section of the paper returned for the search "preview"
Dr. J followed up with the SANS Web team's Ryan C for further technical exploration of the issue. This was indeed a compound problem, not caused by the lack of attribute filtering alone. Because the application was NOT double-encoding the input, data sent by Google, which Google had escaped for literal display as part of the search results, was decoded when it should not have been.
As an example:
The above is what was returned by Google and, after running through HTML sanitization, should have been returned to the browser.
According to Ryan C, under the model-view-controller (MVC) architecture pattern, the controller should encode the output before making it available to the view. The view in turn uses an HTML sanitizer to decode and display it as actual HTML. However, because double-encoding was disabled in the controller, '&lt;' for instance remained as such, rather than being double-escaped to '&amp;lt;'. The HTML sanitizer reconstituting the HTML in the view then decoded the '&lt;' to '<' when it should have been converting '&amp;lt;' to '&lt;'.
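The encoding chain can be traced with a small sketch, again in Python rather than the site's PHP. The encode() and view_decode() functions are hypothetical stand-ins for the controller and view steps, and the double_encode flag is modeled loosely on options such as the one PHP's htmlspecialchars() offers; the upstream string is an invented sample, not the actual data Google returned.

```python
import html
import re

# Matches an "&" that does not already begin a character reference.
_BARE_AMP = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)")


def encode(text, double_encode=True):
    """Hypothetical controller-side encoder."""
    if double_encode:
        # Every "&" is escaped, so "&lt;" becomes "&amp;lt;".
        return html.escape(text, quote=True)
    # Double-encoding disabled: existing entities are left untouched.
    text = _BARE_AMP.sub("&amp;", text)
    return (text.replace("<", "&lt;").replace(">", "&gt;")
                .replace('"', "&quot;").replace("'", "&#x27;"))


def view_decode(text):
    """Hypothetical view-side step that decodes one level of entities."""
    return html.unescape(text)


# Invented sample of text that an upstream source already escaped for display.
upstream = "&lt;div onmouseover=alert(1)&gt;"

unsafe = view_decode(encode(upstream, double_encode=False))
safe = view_decode(encode(upstream, double_encode=True))

print(unsafe)  # <div onmouseover=alert(1)>
print(safe)    # &lt;div onmouseover=alert(1)&gt;
```

With double-encoding off, the single decode in the view turns escaped text into live markup; with it on, the same decode yields text that the browser still displays literally.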
In summary, as ISC Handler Swa pointed out, the difficulty lay in the fact that snippets of GIAC Gold papers were being sent back, and that trying to maintain their formatting by preserving some of the HTML created the issue. While the whitelist allowed certain HTML tags, attributes on those tags, such as the onmouseover in <div onmouseover=…>, were not removed.
Now that we're fully up to speed on the issue, what are some solutions?
Swa reminded us that HTML Purifier is a decent standard library with which to accomplish our mitigation goals above. HTML Purifier is reasonably configurable: it first cleans up the HTML to make sure it is standards compliant to avoid issues with browsers that try to interpret broken HTML, then it removes disallowed tags/attributes.
Dr. J mentioned OWASP ESAPI, which is current for Java but is beginning to fall behind in maintenance for the likes of PHP.
Without question, refer to the Top 10 2010-A2-Cross-Site Scripting (XSS) overview which includes the OWASP XSS Prevention Cheat Sheet. Jim Manico's discussion on the Future of XSS Defense is also a great read.
I've long been an advocate for utilizing web application firewall options where applicable. During my years of heavy web application vulnerability research, when developers struggled to repair code in a timely or effective manner, I was always quick to mention the likes of ModSecurity as a short-term mitigation that can remain in place after the code fix to allow for defense-in-depth. While a WAF may not have been fully effective in mitigating this oddly chained issue, it can go a long way in blocking the majority of attacks with the likes of the OWASP Core Rule Set (CRS). Note: ModSecurity for IIS was just voted the 2012 Toolsmith Tool of the Year.
As always, we're interested in your tactics and preventative measures, and look forward to hearing from you.
Russ McRee | @holisticinfosec