Tuesday, 20 September 2016

[Idea] False Biased Extraction for SQLi using Prime numbers

I'm getting into this new style of blogging where I blog my unadulterated raw thoughts. Open to the ridicule of the entire world. Its liberating :) Its also very interesting to see how I make mistakes and what I learn from them, instead of just the monotonous, lifeless style of blogging what you got right or won at. So anyway here's todays thought...

A couple years ago, some dude from Immunity talked about how in "True"Blind SQLi it might be effective to favor responses that produce no-wait responses (here's the link: http://infiltratecon.com/miguelturner.html ). That is to say, if you are exploiting a Blind SQL injection vulnerability in which only time based responses are observable, it makes sense that responses that don't trigger sleeps or waits would get you information faster. Brilliant observation! This is quite clever because it allows you to throw the traditional computer science idea of efficiency out of the window (or rather the conventional idea computer scientists have grown into) and focus purely on extraction time. When musing on these ideas I tend to get trapped in thinking of computational cost in terms of strict computer science i.e. operations, network requests, cpu cycles etc. But here we are focusing on whats important, getting information fast.

Digression: The labels for efficiency you are "indoctrinated" into with computer science are merely the natural way people found computer science to be useful. We needed to make algorithms more efficient according to instructions because it allows us vertically (more bits per op instead of more information per bits) scale our computational power without scaling our solutions. But you should think of "instructions" and "memory" as a label you can generalize.  Why not try to find an efficient algorithm for picking up chicks as measured by the efficiency of how many drinks you need to buy hehe I literally just found an example of this line of thought... 

WARNING: This video is probably not totally safe for work!

So anyway the question we then need to answer becomes:

How can you construct a line of questioning (series of questions) that will almost always return no-wait responses, and terminate with enough information to accurately infer the number you are extracting?

My guess is using Prime numbers! So literally probing the database to indicate whether a number has a certain prime numbers as a factor. Why is this cool?

  • Prime number factors are unique; each number has a unique equation involving only prime numbers that multiply to it i.e. unique prime factorization
  • Distribution of prime numbers can be used to favor no-wait responses (we can order our line of inquiry in a such a way that we prioritize the prime numbers in terms of information extraction goodness and miss rate)
  • Many probes can be deployed in parallel
  • We can probe for multiple prime numbers at once i.e. password[0]%prime[0] ^ password[0]%prime[1] ^ ... if this number is 0 then we just extracted tons of information about the target letter.
  • Could be applied to extending,  or made useful in other paradigms of extraction i.e. hash based extraction where the alphabet is very small can be morphed to using the hash table to extract configurations of the prime numbers instead of literally values like lets say 2,5,7,11 could indicate bit positions and an extraction of 1,0,1,0 would mean 2 is a factor, 5 is not a factor, 7 is a factor and 11 is not a factor. You would probably not need such a big alphabet/hash table to effect an entire byte extraction. 
All that is left to get this idea off the ground is to construct an ordering that is chosen optimally. For that I could employ the shortest path algorithm, and assign the "goodness" and then search for the optimal ordering. But before that I need to find a costing function for the ordering! Will blog about that in the next one prob.

Monday, 19 September 2016

Abusing WebVTT and CORS for fun and profit

WebVTT is a way html5 developers can display and cue text as subtitles for video formats. The grammar for WebVTT is pretty simple and as we know browsers are always willing to forgive any "weird" looking grammar in an effort to provide best effort experience for users. This post looks at ways to take advantage of WebVTT in some attack contexts in order to extract information or perform general DOM abuse.


Video tags can make use of subtitle files, as follows:

WebVTT (subtitle) files need to follow this format:

The file merely describes cues, allows you to number them and associate a duration and display time for them. Display timestamps specify hours (hh), minutes (mm) , seconds (s) and milliseconds (ttt). According to my basically inspection of the grammar, most browser require you to respect the placeholders (significant figures) if you specify the magnitude. For instance, if you want to indicate hours you need to use both place holders, same goes for others. You must specify the "-->" to indicate the end timestamp.

Grammar Quirks

The grammar is not extremely strict, you can get away with alot. Here's a quick summary of some of the things I've noticed for Safari, Chrome and Firefox:

  • The file must start with "WEBVTT"
  • It doesn't matter what comes after the first few bytes (described above)
  • Numbering cues is optional, the browser usually just fills an array with these the cues.
  • Just about anything after the timestamp is considered part of the cue until another one is specified.
  • It doesn't matter what the content type of the subtitle file is 

Extraction Attacks

So now that we know what we can and can't do with the grammar lets see how we can abuse this. Here's the attack contet:

  1. A server hosts some authenticated content only a logged in user can access, with the metioned users access rights. 
  2. An attacker wants to extract this, but needs to do this under the mentioned user's access rights.
  3. The user is free to visit any arbitrary page and the attacker is on a remote network.
  4. The host web server has a cross origin headers like this: "Access-Control-Allow-Origin: *"
  5. The web server hosts a page that suffers from an injection vulnerability allowing the attacker to specify the first few bytes of the page

This is for all intents and purposes the same attack context as a CSRF vulenrability, and abuses the same mechanism in the browser that a CSRF attack does. Point number 4 is the kicker, should any JavaScript, JSON, or any arbitrary content harbor secrets in such a away that the attacker can inject into the first few bytes, there is a way to extract these secrets using WebVTT.

Straight Extraction Attack

Here's the basic attack idea, the attacker hosts the injectable page in such a way that it is interpreted as a WebVTT page, here's a demo
The trigger (as seen above) looks like this:


If this is inserted into the first few bytes of the page it will force it to be rendered as follows:

Which then forces the browser to interpret it as a valid WebVTT file, and the HTML content would be the subtitle text. Here's where the magic comes in; most browser have a javascript API for accessing the actual text in the cues, here's a quick demo of extracting the first cue's text:

And here's a screen shot of me using this trick to demonstrate that it could work against HTTP headers (random arbitrary content) should I be able to inject the WEBVTT preamble as discussed above (the screenshot shows Safari and Firefox):

Further Reading and References