## Saturday, 20 October 2012

### Beating Trivial Server Side Filters With WebKit

I've just started reading an awesome book and thought I'd share some of my findings with you. I'll give the title of the book at the end of the post, and I must say it's a must-read for anyone trying to master XSS attacks.

That being said, let's get down to business.

## Browser Languages

### Lost in translation

Quick question for the language theorists and computer scientists out there: what languages does your browser parse, or "recognize"? Did you answer HTML, HTML+, HTML 2.0, XHTML, JavaScript, VBScript, etc.? Well then, you are supposed to be right, but in strict terms this is not entirely true! If B, the set commonly understood as the browser language, contains only HTML, XHTML, JavaScript, etc., then B is not the language your browser actually recognizes. Your browser recognizes a lot more. In WebKit browsers (especially Chrome, which is what I based my research on here), HTML* (all HTML versions taken together) is bigger by at least 10 elements per standard element. By this I mean that for every `<a></a>` element there are at least 10 non-standard spellings of `<a></a>` that the browser treats as equivalent. I'll show you why in a bit.

How does it happen that browser languages are much, much bigger than the standardized browser-based languages? The answer is lack of standards! We know that for languages like C, Python or Java, rules exist for handling incorrect syntax, and compilers enforce them: if you are missing a semicolon, raise a syntax error or what have you. In browsers nothing of the like exists. No one has defined a universal set of rules to handle syntactically incorrect HTML*, and so it's left to every layout engine engineer to decide what happens when weird HTML* is presented.

Another reason is UX (user experience). Every layout engine (Trident, WebKit, Gecko and Opera's Presto) makes its own effort at preserving UX when presented with erroneous HTML* forms.

And now you're probably asking yourself, so what? I think it's awesome! And in many cases, for many people, it probably is! If browsers didn't do this you might not even be able to enjoy some of the web pages you visit every day.

But there is a negative effect! XSS filters are, more often than not, modelled on standard HTML. A lot of people who write XSS filters build them to recognize standard HTML forms. This is done either to ensure that the filters work across all browsers or simply because of the assumption that browser languages are completely standard. Either way, this leaves these XSS filters exploitable. Why? Because they recognize a smaller language than they should!

## NofuzzhablafuzzHTMLfuzz

### abusing XSS Filters' language barriers

Before I can show you the effect non-standardized HTML has on the classic notion of an XSS filter, I'm going to discuss the simple process of finding the non-standard HTML elements in WebKit using the magical (but statistically brilliant) process of fuzzing. You can do this with any browser; I started with WebKit and, most recently, Firefox. All you need is:

1. A PHP web server
2. A WebKit web browser
3. A basic background in PHP

The process is quite simple, and there are a lot of awesome ways to automate it, but for people just starting out with web browser fuzzing (like me :)) I think it's better to do everything by hand, or rather in a not fully automated way.

To start we can set up some simple test cases. Let's try to fuzz everyone's favourite HTML element, the anchor tag. Fire up your favourite text editor and whip up code that looks a little something like this.

```php
<!DOCTYPE HTML>
<html>
<body>
<h1>Test 1: fuzz after opening angle</h1>
<?php
for ($i = 0; $i <= 255; $i++) {
    $character = chr($i);
    // fuzz character goes right after the opening <
    echo '<div><' . $character . 'a href="http://www.google.com/">' . $i . '</a></div>';
}
?>
</body>
</html>
```

Now whip out your browser and fire up this script, woo hoo! lols. You should see a lot of strings (which are "dead" anchor tags) and one valid anchor tag. Not too impressive, right? Try this one:

```php
<!DOCTYPE HTML>
<html>
<body>
<h1>Test 7: fuzz after closing angle</h1>
<?php
for ($i = 0; $i <= 255; $i++) {
    $character = chr($i);
    // fuzz character goes right after the < of the closing tag
    echo '<div><a href="http://www.google.com/">' . $i . '<' . $character . '/a></div>';
}
?>
</body>
</html>
```
You'll notice that it says "Test 7". This is because I've been doing a bunch of tests, each in a different position (that's what she said!) of the anchor tag element.

So after running this script, what do you see? Probably not what you expected, right? 255 anchor tags! Check out the source of the page and you'll start seeing what WebKit does to rectify the situation. By the way, there may be very interesting things you can do with this behaviour; I haven't quite reverse engineered it yet, but it seems a profitable endeavour. So you see that for every anchor element there are effectively 255 other elements that actually work properly! This means that if any XSS filter is trying to match standardized HTML* anchors, we can beat it with WebKit's special anchor tags!
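Since these tests differ only in where the fuzz character sits, one way to avoid copy-pasting the whole script for each position is a small template helper. This is only a sketch of the idea: `fuzz_cases()` and the `{F}` placeholder are names I made up for illustration, not something from the book or from the tests above.

```php
<?php
// Hypothetical helper: expand a template into one test case per byte value.
// "{F}" marks where the fuzz character goes; both the function name and the
// marker are illustrative assumptions.
function fuzz_cases(string $template, int $from = 0, int $to = 255): array
{
    $cases = [];
    for ($i = $from; $i <= $to; $i++) {
        // chr() turns the byte value into a one-character string
        $cases[$i] = str_replace('{F}', chr($i), $template);
    }
    return $cases;
}

// Same position as Test 7: between "<" and "/a" in the closing tag.
$template = '<a href="http://www.google.com/">link<{F}/a>';
foreach (fuzz_cases($template) as $i => $markup) {
    echo '<div>' . $i . ': ' . $markup . "</div>\n";
}
```

Moving `{F}` to another spot in the template gives you a new test position without touching the loop.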

Another cool example is:

```php
<!DOCTYPE HTML>
<html>
<body>
<h1>Test 8: quest for the angle bracket</h1>
<?php
for ($i = 0; $i <= 255; $i++) {
    $char = chr($i);
    // fuzz character goes after the tag name, after the attribute,
    // and on both sides of the closing tag name
    echo '<div><a' . $char . 'href="http://www.google.com/"' . $char . '>' . $i . '<' . $char . ' /a' . $char . '></div>';
}
?>
</body>
</html>
```

which is a little more obfuscated but still yields some usable results.
The challenge is now to try these weird anchor tags against some of your favourite XSS filters and see how well they perform. I'm going to test this against a regex designed for matching anchors; it's easy enough to extend this to any other tag ;)

Try firing up this script and hammering it with standard anchor tag forms:

```php
<!DOCTYPE HTML>
<html>
<body>
<textarea size="20"><a[\s]+[^>]*?href[\s]?=[\s\"\']*(.*?)[\"\']*.*?>([^<]+|.*?)?<\/a></textarea>
<div>
<?php
// fall back to an empty string when no 'param' is supplied
$param = isset($_GET['param']) ? $_GET['param'] : '';
// preg_match() returns 1 when the anchor pattern matches the input
if (preg_match("/<a[\s]+[^>]*?href[\s]?=[\s\"\']*(.*?)[\"\']*.*?>([^<]+|.*?)?<\/a>/i", $param) === 1) {
    echo "<h2 style='color:green'>These are not the droids you are looking for</h2>";
} else {
    echo "<h2 style='color:red'>No XSS Found</h2>";
    // echoed unescaped on purpose, so anything the filter misses gets rendered
    echo $param;
}
?>
</div>
</body>
</html>
```
What I'm doing here is simply grabbing the GET 'param' and running it through a preg_match. So let's try some regular anchors.

And let's try using one of the anchors we found in the fuzz.

And that's a WIN! This is a classic example where regexes written for standard HTML* do not work.
All of the 255 examples found in the fuzz test actually work too!
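
If you'd rather check the regex side without a browser, something like the following can be run from the command line. It is a rough sketch: the pattern is the one from the filter script, but `is_blocked()` and the sample payloads are my own picks (one Test 7 style input with byte 1, and one input with a slash where the regex expects whitespace). It only tells you what the regex does; whether a given payload still renders as a working link is something to confirm in the browser itself.

```php
<?php
// The same pattern the filter script passes to preg_match().
$pattern = "/<a[\s]+[^>]*?href[\s]?=[\s\"\']*(.*?)[\"\']*.*?>([^<]+|.*?)?<\/a>/i";

// Hypothetical sample inputs; the labels and the choice of bytes are assumptions.
$payloads = [
    'standard anchor'        => '<a href="http://www.google.com/">click</a>',
    'fuzzed closing tag'     => '<a href="http://www.google.com/">click<' . chr(1) . '/a>',
    'slash instead of space' => '<a/href="http://www.google.com/">click</a>',
];

// Hypothetical helper: true when the filter's regex matches the input.
function is_blocked(string $pattern, string $input): bool
{
    return preg_match($pattern, $input) === 1;
}

foreach ($payloads as $label => $payload) {
    $verdict = is_blocked($pattern, $payload) ? 'blocked' : 'passed through';
    echo str_pad($label, 26) . $verdict . "\n";
}
```

The standard anchor is the only one of the three that the pattern matches; the other two lack either the whitespace after `<a` or a literal `</a>`, so the regex never fires on them.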

And that's all I have to say ;) lols
Oh yes, the book title is:
Web Application Obfuscation, by the awesome as always Syngress publishers