Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various multibyte bugs #22

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

Various multibyte bugs #22

wants to merge 11 commits into from

Conversation

eggyal
Copy link

@eggyal eggyal commented Apr 24, 2014

There are a number of multibyte bugs in this library.

  1. str_replace() is not multi-byte safe, yet is being used to escape quote characters within SQL object identifiers. This can lead to SQL injection in certain edge cases. For example:

    // using GBK
    $db = new SafeMySQL(['charset' => 'gbk']);
    
    // malicious table name
    $table = "\x8c` WHERE user='admin'; -- ";
    
    // other variables
    $mod = 'foo';
    $limit = 1;
    
    // example from documentation
    $data = $db->getAll('SELECT * FROM ?n WHERE mod=?s LIMIT ?i',$table,$mod,$limit);

    The query that is sent to the server:

    SELECT * FROM `` WHERE user='admin'; -- ` WHERE mod='foo' LIMIT 1

    Fortunately this vulnerability is somewhat mitigated by the fact that the statement will fail if the resulting identifier (the table name of in the above example) is not the name of a known object—and an attacker is limited in the identifiers that he can use, since the final byte must be 0x60 (an ASCII backtick); furthermore, since the query is dispatched with mysqli::query(), the server will ignore attempts to introduce a second statement. So the exploit risk is low.

    The problem can however be fixed by using mb_ereg_replace() instead of str_replace() and this is introduced by my first commit (together with a few other OCD changes to string literals).

    My second commit then moves the mb_regex_encoding() call (necessary to set the encoding for the above operation) into the escapeIdent() function in order to avoid problems if its setting is changed between object construction and statement preparation.

  2. In searching for parameters using preg_split(), prepareQuery() implicitly assumes that the raw statement is in ASCII—yet the user is allowed to choose alternative connection character sets. This may not be an unrealistic assumption, since every connection character encoding that MySQL currently supports is a compatible superset of ASCII. However, there's no sensible reason not to perform this operation in the connection character set, which is what my third commit achieves.

    As an aside, it may be worth noting that performing this split operation using regular expressions is somewhat naive. For example, the raw query might contain a parameter code within a string literal:

    $db->getAll("SELECT * FROM foo WHERE bar='hello?silly'");

    Here the library will attempt to replace the ?s within the string literal. Of course, in this example the query will fail due to a mismatch in the number of parameters and provided values—but it would be better to properly parse the SQL to identify the context in which placeholder strings are encountered: PHP SQL Parser might be useful to that end. In the meantime, one might at least assert that the pattern should end on a word boundary.

  3. On the same basis (i.e. that the SQL might theoretically be in an ASCII-incompatible encoding), placeholder expansion must ensure that all values and inserted SQL code are encoded using that connection character set. This is implemented in my fourth commit.

@eggyal eggyal changed the title Protect from multibyte exploits in identifiers Protect from multibyte exploits Apr 24, 2014
@eggyal eggyal closed this Apr 24, 2014
@eggyal eggyal reopened this Apr 24, 2014
@colshrapnel colshrapnel self-assigned this Apr 24, 2014
@eggyal eggyal changed the title Protect from multibyte exploits Fix various multibyte bugs Apr 24, 2014
@eggyal eggyal changed the title Fix various multibyte bugs Various multibyte bugs Apr 24, 2014
@eggyal eggyal closed this Apr 24, 2014
@eggyal eggyal reopened this Apr 24, 2014
@colshrapnel
Copy link
Owner

Ugh. It seems I know how I'll spend the upcoming holiday.
I need to explore it through.

But before I start, can you please take a look at http://stackoverflow.com/questions/22266311/to-parse-sql-with-regex-excluding-quoted-literals
or even at http://programmers.stackexchange.com/questions/232791/slight-extension-for-sql-prepared-statements-syntax-need-advice

I am going to change the parser completely, to allow named placeholders. How do you think - may be it would be better to get rid of regexp and create an old good char-by-char lexer?

@colshrapnel
Copy link
Owner

I can't seem to be able to reproduce your code. It does raise an error, which is still a bad thing, but it doesn't allow injection either. Here it goes:

include 'safemysql.class.php';
$db = new SafeMySQL(['charset' => 'gbk']);

// malicious table name
$table = "\x8c` WHERE user='admin';/*";

// example from documentation
$data = $db->getAll('SELECT 1 as ?n', $table);

var_dump($data);

what I am doing wrong?

@ExplodingCabbage
Copy link

what I am doing wrong?

I'm going to have a go at reproducing this now and see if I can find out. As a possible starting point: on my machine, I can't use MySQLi to send a query that contains an unclosed /* comment; doing so results in a syntax error. So for injections of this kind, I would need to use a -- comment instead; something like

$table = "\x8c` WHERE user='admin'; -- ";

instead of

$table = "\x8c` WHERE user='admin';/*";

I'm not sure whether this behaviour is as a result of the version of MySQL I'm running or the version of MySQLi I have installed, but perhaps it accounts for your failure to reproduce as well?

Once I've reproduced this, I'll let you know of any further obstacles I come across.

@ExplodingCabbage
Copy link

I can reproduce this just fine using eggyal's example query (as long as I create a table called ); the comment syntax issue was my only difficulty. What error are you getting?

@eggyal
Copy link
Author

eggyal commented May 1, 2014

Actually, the problem is (slightly) more serious than I first thought.

Some commands, especially DDL ones, can accept object identifiers that don't exist. For example:

$_GET['colname'] = "\x8c` INT, DROP COLUMN id, RENAME TO xyz -- ";
$db = new SafeMySQL(['charset' => 'gbk']);
$db->query('ALTER TABLE users ADD COLUMN ?n DATE', $_GET['colname']);

This will send to MySQL:

ALTER TABLE users ADD COLUMN `` INT, DROP COLUMN id, RENAME TO xyz -- ` DATE

So, using a vulnerable connection encoding and parameterising identifiers that don't exist (such as will be the case in many DDL commands) could lead to pretty devastating injection. And having a dynamic schema is pretty much the only justification I can see for parameterising identifiers in the first place, so the second criterion is not all that unlikely...

@matteing
Copy link

This should be merged.

@colshrapnel
Copy link
Owner

@Alphapixels unfortunately, there are too much garbage among these changes. One should never mix critical changes and just code prettifying in the same commit. Also, some changes can be questioned.

As a solution I would rather forbid using encodings other than utf-8 and single-byte ones.

@Arkemlar
Copy link

Arkemlar commented Jun 9, 2016

I agree with idea to support only utf-8 and single-bytes encodings (btw it needs to be checked that the used encoding is consistent with that limitations).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants