Even Faster Blind SQL injection methods

A method presented at DerbyCon and BlackHat involves extracting not the bits of the character but the bits of a characters position in a look up table which contains a number of character ascii values---more on this later. This post discusses the conceptual advantages and fundamental drawbacks of the bin2pos method and introduces a new variant I've developed which provides better stability and only requires a maximum of 4 requests per character extraction but imposes some configurational requirements to the target web server.





Why am I studying Blind SQL injection techniques? 

Well honestly because its quite an interesting optimization problem and is rooted at the heart of many problem solving disciplines and heuristics in Computer Science, essentially it requires you to work out ways to extract as much data from a Blind SQL injection vulnerability purely by interpreting the behaviour of a web server under attack. Ideas like Hoffman encoding and binary search and various prioritization and sorting algorithms come to mind and can often provide much needed speed ups and space/time efficiency improvements.

The Bin2Pos method and some variations

Bin2Pos (Binary to position) method works by extracting a character, comparing it to a list of characters and then doing queries to extract the characters position. Since the attacker knows the contents of the list, he only needs to know its position in the actual array. The reason this method can be faster in many cases is because the attacker potentially has a lot less bits to extract.

Here's the method as documented in Roberto Salgado's Black Hat US 2013 paper:

IF((@a:=MID(
BIN(POSITION(MID((SELECT password FROM users WHERE id=2 LIMIT1),1,1)IN(CHAR(
48,49,50,
51,52,53,
54,55,56,
57,65,66,
67,68,69,70
))),1,1))!=space(0)
,2-@a,0/0)


Here's a simple modification I've made to the method using the MySQL field method.

select substring(bin(field([extraction target],"a","b","c","d",...)),[bit position],1)

The field function in MySQL does the following:
Returns the index (position) of str in the str1, str2, str3, ... list. Returns 0 if str is not found.
If all arguments to FIELD() are strings, all arguments are compared as strings. If all arguments are numbers, they are compared as numbers. Otherwise, the arguments are compared as double.
If str is NULL, the return value is 0 because NULL fails equality comparison with any value. FIELD() is the complement of ELT()

http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_field

so typically if you where to use this against an actual target the payloads would look something like this:




www.mytarget.com/injectme.php?script=" or select substring(bin(field([target char],"a","b","c",...)),1,1)=1


www.mytarget.com/injectme.php?script=" or select substring(bin(field([target char],char(48),char(49),char(50),...)),1,1)=1

Advantages

  • Provides potential for 1 character per request extraction
  • Its pretty multi-threadable and allows automated tools to extract multiple characters in parallel
  • Only requires a maximum of 6 requests.

Drawbacks

The requirement of this method is that the character you are extracting actually appears in the list, the drawback then is that if you what to guarantee that it appears in this list you would need to make a set of the entire ascii table and risk sending a request that may be too long. MySQL also typically responds to look up like functions (such as field, instr, and the IN operator) with a NULL should the element not be found, which doesn't turn out too great if you're comparing the returned values with TRUE or FALSE or anything in fact because MySQL returns a NULL if you compare anything to NULL which is why Salgado's method returns a "0" should the element not be found. You also have the added tax of making a last request to make sure you've extracted all the bits of the position value, though by making a call to the bit_length function you could make this a lot simpler.

Another drawback is that probabilistically if where are analysing this from the perspective of extracting any kind of data from a database (password hashes, usernames, page content,etc.) on average you would be extracting values that appear near middle of the list--assuming you have no reason to know how to order your list so this doesn't happen. What this means is the longer you make your list the better guarantee you have of looking up the character but the more bits you need to extract in order to know its position and you might as well be extracting the bits of the actual ascii value.

You could make slight improvements in the ordering of the character list; for instance if you're extracting characters from english sentences the ordering of the list could be done according to the most commonly used letters i.e frequency analysis.

Bin2Pos Improvements 

The first major modification is to remove the ascii values in the lookup table and put all possible 2 bit values in the table instead; namely the massive list containing the following values:

"00","01","10","11"

that the field function looks like this:

field([comparator value],"00","01","10","11")

what this allows us to do then is extract two bits at a time, and look up their value in the table above, this removes the need to know the bit length of the index value--since it will always be 2. We can extract two bit values at a time by concatenating them into a string as follows:

concat(substring(bin([char]),1,1),substring(bin([char])),2,1)

which would mean we can stick this into the field function as follows:

field(concat(substring(bin([char]),1,1),substring(bin([char])),2,1)
,"00","01","10","11")
 
The last step is to make sure we can extract the position in one request, and this is where the configuration requirement comes in, for this method to work the web server needs to display a unique indication that a database level error has occurred, though this is not an unrealistic requirement since many servers are configured to either display a generic error message--of course if the cause of the error is displayed you may not need to use Blind injection--or return a HTTP 500 status message see here and here.

The way we make sure we always know what position the 2 bit value is in the table is by mapping these values onto four distinct behaviours for the web server namely:

  • FALSE responses triggered by returning a false value
  • TRUE responses triggered by a returning a true values
  • ERROR states triggered by group key duplicates i.e. select row(1,1)>(select count(*),floor(rand(0)*2)a from (select 1 union select 2 union select 3) group by a limit 1)
  • Delayed responses triggered by using the SLEEP() or BENCHMARK() function as in Time-based SQL injection data extraction techniques
combining these triggers into the payload we get the following:
"

select case field(concat(substring(bin([char]),1,1),substring(bin([char])),2,1)
,"00","01","10","11") 
when 1 then TRUE 
when 2 then FALSE 
when 3 then select row(1,1)>(select count(*),floor(rand(0)*2)a from (select 1 union select 2 union select 3)x group by a limit 1) 
when 4 then benchmark(100000,encode("asdfasdfdf","dsfdf")) end;

"
Here's a screen shot of me testing this against an "a" on my mysql-server installation.


Using this variation you will always only require 4 requests.

Interpreting the screenshot about you should see that:

  • The first request takes 2.08 seconds to return indicating that the BENCHMARK function was triggered meaning the bit pattern was "11"
  • The second request returns a TRUE meaning the bit pattern was "00"
  • The third request also returns a TRUE once again meaning the next bit pattern is "00"
  • and lastly because we've extracted 6 bits and there's on left I request bit 7 and bit 6, doing this the server responds with an error indicating that the bit pattern is "10" meaning the 7th bit is "1" and the 6th is "0" as extracted in the previous request.
Putting this together we get "11" + "00" + "00" + "1" = 1100001 = bin(ascii("a"))
And we've just successfully extracted a single character in just 4 requests!! :)

And if you're really desperate you could improve the amount of data you extract in a time base SQL injection attack by using this as follows:


select case field(concat(substring(bin([char]),1,1),substring(bin([char])),2,1)
,"00","01","10","11") 
when 1 then TRUE
when 2 then BENCHMARK(2*10000,encode("asdfasdfsfad",asdf"))
when 3 then BENCHMARK(30*10000,encode("asdfasdfsadfdf","asdf"))
when 4 then benchmark(40*10000,encode("asdfasdfdf","dsfdf")) end;

Here we are inferring which position the bit pattern is in by making it trigger a longer wait period based on the index of the bit pattern in the table.

Update: I was informed of another fast Blind SQL injection method, and since this post is about Fast blind SQL injection methods I felt it wouldn't be fair not to include a discussion on it.

 

The Pre-computation attack

The methods above work great for server side scripts that only allow TRUE/FALSE/ERROR/WAIT responses for instance SQL injection vulnerabilities in login forms and may be quite portable--applicable in most contexts, seeing that they only don't impose much on how the injected query is used--there is an even faster method to extract data from a vulnerable web page using what's called a pre-computation attack.

Pre-computation attacks work by using the page data returned by a query and associating it to a character value; for instance if the script accepts numerical values to associate to page content like page_id=1 or cat_id=50 and there's enough of these ids to go around--255, though there are ways of getting around this situation if there aren't--attackers can pre-compute hash values of the page content associate it to a numerical value and use this to extract information by comparing data they are extracting to the page contents.

So typically what you would need to do to pull this off is crawl the page and pre-compute hash values of the returned responses from the server

visit page_id=1 extract contents ---> compute a hash of page contents ---> associate this hash to numeric id 1

then

visit page_id=2 extract contents ---> compute a hash of page contents ---> associate this hash to numeric id 2

and so on and so forth.

Once you've done enough pre-computation, you can then begin extracting information from the database. Basically you would construct queries that look something like:

www.vulnerable-page.com/page_id=" or (select ascii(substring((select password from login.passwords where username="k3170"),1,1)))=id or "0

what happens then is the query returns whatever data is associated to the page corresponding to the ascii value of the first letter of "password" allowing an attacker to look this up in the pre-computed table and extract the associated numerical ascii value.

Associating entire character values to the hash value of page means that attackers can extract entire bytes in one go.

There is one big requirement for this attack, and that is that an attacker can associate the returned data reliably to a hash value, should a given page change between requests or have a single bit value that's dynamic and/or random per request this attack would not work. So any server side script that renders content or response data not associated to attacker controlled or predictable input would disable this attack. Though depending on how bad this situation is all it would mean is that one numeric value would be associated to multiple hash values, but things like randomly uniquely generated CSRF tokens could easily spoil the fun.

For more on this attack method see this article and here's a video of the attack.




 

Comments