Avoiding SQL Injection Attacks

Randal L. Schwartz

In recent months, the entertainment news was abuzz with how socialite Paris Hilton had had her Sidekick phone "hacked", revealing phone numbers for many famous people, interesting notes about possible profit participation in the "stolen" sex tape, and lurid pictures, all of which were broadcast widely on the Internet. But the quiet word amongst computer security professionals was that this wasn't a matter of hacking into her phone at all, not that the entertainment news people would know or care. Instead, certain individuals have claimed responsibility through a well-known security flaw on T-Mobile's Web site, with which the phone is synchronized. Thus, it wasn't the phone that was the source of information -- it was the Web site.

While the precise steps by which the intruder obtained access may never be known, one of the most common intrusion paths is the SQL Injection Attack. An intruder guesses either by trial and error or determines by examining the source code of the application some field in a form is being used to construct an SQL query. Unless the application carefully verifies the content of this field, or quotes the unusual characters properly, it might be possible for the field's value to escape beyond its intended use and create additional SQL constructs.

Let's look at a specific example. Suppose a bank Web application is trying to show all the bank account balances that belong to me. The SQL might look like:

SELECT account_id, balance
FROM account_data
WHERE account_owner = '[ME]';

Here, the [ME] comes from some other part of the calculation, which isn't relevant. The effect is that I see only my data. Now, let's say that the Web programmers at the bank upgrade the application so that I can distinguish my personal and business accounts via a pop-up form element, which returns back a "1" for personal and "2" for business. So, they change this SQL to be:

SELECT account_id, balance
FROM account_data
WHERE account_owner = '[ME]'
AND account_type = $account_type;

Of course, they presume that the values "1" and "2" will drop in quite nicely there, and the application tests out just fine, so the programmers move on to the next item in their long to-do list.

What many Web programmers fail to realize is that just because a pop-up form specifies a list of values or a text field provides a maximum length (even enforced by JavaScript), this doesn't mean that a user can't construct an arbitrary value for any returned form parameter. Of course, changing this value to "3" won't do me much good. I won't get anything at all because there aren't any accounts with a type of 3.

But suppose I jimmy up a return value so that $account_type comes back as: 1 OR 99 = 99. Let's fill that in:

SELECT account_id, balance
FROM account_data
WHERE account_owner = '[ME]'
AND account_type = 1 OR 99 = 99;

Oops! What just happened? Since the OR has a lower precedence than the AND, we're saying essentially "true" for every row of the table. Presuming we have the patience to page through the result, we now have everyone's account numbers and balances. Oops.

There are many other variants on SQL injection attacks. You might be rather shocked when you begin a Web search for further information. But let's cut to the right way to do this -- with DBI placeholders. The DBI interface permits a chunk of SQL to contain question marks wherever data is being used. For example, the SQL above could be written as:

my $sth = $dbh->prepare(q{
  SELECT account_id, balance
  FROM account_data
  WHERE account_owner = ?
  AND account_type = ?
});

This takes an existing database handle (here, $dbh), and creates a statement handle ($sth) that is safely protected from SQL injection attacks. We use this handle by calling execute, providing the values to be inserted:

$sth->execute('ABC 123', 2);

The DBI system goes through the data, figuring out how to quote the values properly for the corresponding database. For example, the statement being executed will likely end up being something like:

SELECT account_id, balance
FROM account_data
WHERE account_owner = 'ABC 123'
AND account_type = 2

Note that the string was enclosed in single quotes, but the number remained unquoted. (This is important for databases that respect data types.) Now, let's see what happens if we try to pass that complex string as the account type number:

$sth->execute('ABC 123', '1 OR 99 = 99');

This results in the effective SQL of:

SELECT account_id, balance
FROM account_data
WHERE account_owner = 'ABC 123'
AND account_type = '1 OR 99 = 99'

Ahh, look! The quotes enclose the now-apparent string value. While the query is likely to fail (no account type will match that), we didn't get to see everyone's account balances either. Success.

The execute method should get one value for every placeholder in the query string. The undef value is properly mapped as NULL, while numbers and strings pass through properly quoted for the active database. Certain types may require hinting as to the actual data type; see the DBI documentation for further information.

One common problem when using placeholders is that we must maintain a correlation between chunks of SQL and the actual values we want to plug in. Good discipline helps here, but there's a nice module in the CPAN called SQL::Abstract that helps us say what we intend with a bit more clarity.

One feature of SQL::Abstract is that a particular WHERE clause can be represented by relatively straightforward complex data structure. For example, AND-ing two conditions is represented by a hashref:

{ account_owner => $owner, account_type => $account_type }

Here, $owner contains our ABC 123 and $account_type is 1 or 2.

Using SQL::Abstract, we can convert this to the SQL to feed to prepare, and the bind parameters to feed to execute with one simple step:

use SQL::Abstract;
my $sqa = SQL::Abstract->new;

my ($owner, $account_type) = ('ABC 123', 2); # from inputs

my ($sql, @bind) = $sqa->select(
  'account_data', # table
  [qw(account_id balance)], # fields
  {
   account_owner => $owner,
   account_type => $account_type
  }, # "where"
);

my $sth = $dbh->prepare_cached($sql); # reuse SQL if we can
$sth->execute(@bind);                 # execute it for this query

The result is a good query, with good bind values in the right order. And we're still safe from SQL injection attacks. If we also wanted to add account_status as not in suspended or deleted, we could change the query to:

{
  account_owner => $owner,
  account_type => $account_type,
  account_status => { -not_in => ['suspended', 'deleted'] },
}

In this case, the corresponding SQL looks like:

SELECT account_id, balance
FROM account_data WHERE
( account_owner = ?
  AND account_status NOT IN ( ?, ? )
  AND account_type = ? )

with bind values of:

'ABC 123', 'suspended', 'deleted', 2

Note that the ordering of the hash elements affects the output, but the bind values did indeed line up with the corresponding placeholders.

I won't repeat the entire manpage for SQL::Abstract here, but it looks sufficient for most of the queries I've ever wanted to create in SQL.

I can even use SQL::Abstract with my Class::DBI-wrapped databases using Class::DBI::AbstractSearch. Mixing in this class to my Class::DBI-derived base class gives me the search_where clause, accepting that same data structure. For example:

my @results = My::DB::AccountData->search_where({
  account_owner => $owner,
  account_type => $account_type,
  account_status => { -not_in => ['suspended', 'deleted'] },
});

And there, no SQL to write at all, again, and I'm safe from SQL Injection Attacks.

I hope you find SQL::Abstract useful, and never have your name appear in a headline regarding a bad security break-in. Until next time, enjoy!

Randal L. Schwartz is a two-decade veteran of the software industry -- skilled in software design, system administration, security, technical writing, and training. He has coauthored the "must-have" standards: Programming Perl, Learning Perl, Learning Perl for Win32 Systems, and Effective Perl Programming. He's also a frequent contributor to the Perl newsgroups, and has moderated comp.lang.perl.announce since its inception. Since 1985, Randal has owned and operated Stonehenge Consulting Services, Inc.