Become a Patron!

My Amazon wishlist can be found here.

Life Line

Questions from the Field: Should I Escape My Input, And If So, How?

At last weekend's PHP Benelux I gave a tutorial titled "From SQL to NoSQL". Large parts of the tutorial covered using MongoDB—how to use it from PHP, schema design, etc. I ran a little short of time, and since then I've been getting some questions. One of them being: "Should I escape my input, and if so, how?". Instead of trying to cram my answer in 140 characters on Twitter, I thought it'd be wise to reply with this blog post.

The short answer is: yes, you do need to escape.

The longer answer is a bit more complicated.

Unlike with SQL, inserting, updating and deleting data, as well as querying data, does not require the creation of strings in MongoDB. All data is always used as a variable or a constant. Take for example:

<?php
$c = (new MongoClient())->demo->col;
$c->insert( [ 'name' => $_GET['name'] ] );
?>

Because we don't need to create a string with the full insert statement, there is no need to escape with ' to prevent issues like SQL injections. The context in which variables are used is immediately clear.

But be aware that PHP's request parameters (GET, POST, COOKIE, and others) allow you to send not only scalar values, but also arrays. If we take the example code from above in mind, and request the URL http://localhost/script.php?name[first]=Derick&name[last]=Rethans, we end up inserting the following document into the collection:

[ 'name' => [
        'first' => 'Derick',
        'last' => 'Rethans'
] ]

And this is probably not what you had in mind.

The same trick is possible when doing queries. Look at this code:

<?php
$c = (new MongoClient())->demo->col;

$r = $c->findOne( [
        'user_id' => $_GET['uid'],
        'password' => $_GET['password']
] );
?>

If we now would request the URL http://localhost/script.php?uid=3&password[$neq]=foo we end up doing the following query:

<?php
$c = (new MongoClient())->demo->col;

$r = $c->findOne( [
        'user_id' => '3',
        'password' => [ '$neq' => 'foo' ]
] );
?>

The password clause in that query, will likely always match. Of course, if you are not storing passwords as a hash, you have other problems too! This is just a simple example to illustrate the problem.

This same example highlights the second issue - that is that all request parameters are always represented by strings in PHP. Hence my use of '3' instead of 3 in the above example. MongoDB treats '3' and 3 differently while matching, and searching for 'user_id' => '3' will not find documents where 3 is stored as a number. I wrote more extensively about that before.

So although MongoDB's query language does not require you to build strings, and hence "escape" input, it is required that you either make sure that the data is of the correct data type. For example you can do:

<?php
$c = (new MongoClient())->demo->col;

$r = $c->findOne( [
        'user_id' => (int) $_GET['uid'],
        'password' => (string) $_GET['password']
] );
?>

For scalar values, often a cast like I've done above, is the easiest, but you might end up converting an array to the string 'Array' or the number 1.

In most cases, it means that if you want to do things right, you will need to check the data types of GET/POST/COOKIE parameters, and cast, convert, or bail out as appropriate.

Shortlink

This article has a short URL available: https://drck.me/escinput-bm4

Comments

While relatively obvious to most Mongo users it's still a good summary. Everyone should burn this into his/her mind :)

Add Comment

Name:
Email:

Will not be posted. Please leave empty instead of filling in garbage though!
Comment:

Please follow the reStructured Text format. Do not use the comment form to report issues in software, use the relevant issue tracker. I will not answer them here.


All comments are moderated