New MongoDB Drivers for PHP and HHVM: History
We recently released a new version of the MongoDB driver for PHP. This release is the result of nearly a year and a half work to re-engineer and rewrite the MongoDB driver. In this blog post, I will cover the back story of the how and why we undertook this effort.
The original driver was created by Kristina and released in early 2009. The extension had a very simple architecture and was written mostly in PHP. A handful of core functions were implemented in C, such as creating a resource for the connection, standard CRUD operations, query execution, and iterating over a result set (sans an Iterator interface). All of the convenient user facing APIs were implemented in PHP code. There were also a few classes to encapsulate certain MongoDB data types that PHP otherwise could not represent, such as its ISODate, ObjectID and Regex types.
A small example to create a connection, insert a document, and retrieve a result set looked like:
<?php
include "Mongo.php";
$m = new Mongo();
$c = $m->selectCollection("phpt", "find");
$c->drop();
$c->insert( array(
"foo" => "bar",
"a" => "b",
"b" => "c")
);
$cursor = $c->find(array("foo"=>"bar"), array("a"=>1,"b"=>1));
while ($cursor->hasNext()) {
var_dump($cursor->getNext());
}
?>
Support for PHP 5.3 was soon added. In this release, much of the syntactic sugar in the form of PHP classes and code had been replaced by implementations in C. Where most (if not all) PHP extensions use phpt tests, the MongoDB driver instead implemented its test suite with PHPUnit tests. The main difference between test methods, is that phpt tests are all run in a separate PHP process in isolation, whereas PHPUnit tests re-use the same process by default. Which means that if one test case causes a crash in the driver, none of the others run either, and no result is visible.
In late 2009, the MongoDB driver for PHP reached a stable state with version 1.0.0. It added a feature to placate some PHP developers that refused to use single quotes around query operators. Query operators in MongoDB have the form of $ + operation, such as $gte for Greater Than or Equal or $set for setting fields during updates. As they are array keys, extra care needs to be taken. For example, the following does not do what you expect it to do:
$r = $c->find( array( "fieldname" => array( "$gte" => 42 ) ) );
The $gte in the double quotes of course causes a variable substitution, usually evaluating to an empty string. Instead of educating users to use single quotes (as in '$gte'), the driver acquired a way to replace the $ symbol to signal a query operator with a different symbol:
ini_set('mongo.cmd', '@');
$r = $c->find( array( "fieldname" => array( "@gte" => 42 ) ) );
The driver would replace these symbols before sending them on to the server. A little later, a warning was added in case an empty array key was found.
The 1.0.0 release also settled issues with serialising PHP variables to database types. MongoDB uses Binary JSON (BSON) as its internal format, and the PHP driver needs to serialize to and from this data type. BSON also supports compound data types, arrays (packed numerical keys), and documents (basically hash maps, or something akin to JSON objects). Because in PHP there is really only one array type, the decision was taken that both BSON documents and BSON arrays would be deserialized into PHP variables as arrays. The following example illustrates this:
<?php
// Make a connection, select the 'test' collection in the 'demo' database,
// and clean out the collection.
$m = new Mongo;
$c = $m->selectCollection( 'demo', 'test' );
$c->drop();
// Construct two documents
// 1. A stdClass object
$docObj = new stdClass;
$docObj->foo = 'bar';
// 2. An array
$docArr = [ 'foo' => 'baz' ];
// Insert documents into the collection:
$c->insert( $docObj );
$c->insert( $docArr );
// Query and show results
$r = $c->find();
foreach ( $r as $record )
{
print_r( $record );
}
?>
The result of this script is (after formatting):
Array
(
[_id] => MongoId Object ( [$id] => 565888e844670acd368b4567 )
[foo] => bar
)
Array
(
[_id] => MongoId Object ( [$id] => 565888e844670acd368b4568 )
[foo] => baz
)
As you see, both come back as arrays, as arrays are easier to work with than objects of stdClass.
At the start of February 2012, I took over the role as maintainer of the PHP driver. Connection pools were new in the 1.2 series, and ended up causing lots of issues. Because PHP is single threaded (from a request handling point of view), a connection pool makes little sense as PHP can't use multiple connections at the same time anyway. Ultimately, connection pools were ripped out in the 1.3 series to improve reliability (and sanity while debugging). At the same time, I also added lots of logging!
With 1.3 out the door, Hannes started helping out, and because we were now two, we settled on a new Git workflow. Jeremy joined our little team around at the same time, and started contributing to the driver in earnest somewhere in 2013.
Over the next few years, Hannes, Jeremy and I worked on the driver, keeping it up to date with new server features. While doing so, we ran into quite a few earlier design issues. A few I have mentioned so far, but also included are:
-
Two inconsistent ways of passing in options to methods. The driver uses both positional arguments and arguments passed in as an array. Sometimes both are allowed for the same method due to backwards compatibility reasons.
-
There are static methods for setting query and cursor objects, which makes it difficult to find out which option value was being used for each query.
-
We had no clear strategy as to when to add command helpers. Command helpers are convenience methods for tasks such as creating users, indexes, and collections — things that could also be done through the generic command execution method.
-
Maintenance of some complex functionality such as GridFS was hard, and was both difficult and cumbersome, as it was all implemented in C code.
-
Because everything was implemented in the PHP extension, we couldn't easily share code in case we wanted to support something like HHVM.
Lastly, because each language driver was implementing its own version of the protocol to talk to MongoDB, we were doing a lot of duplicated work.
With this in mind, we set some goals for a new version of the driver:
-
A bare bones driver just implementing the essential API for accessing all features of MongoDB.
-
No syntactic sugar, or command helpers (they are better suited to a PHP library)
-
It should be fast to write and easy to maintain
-
Support for other PHP engines like HHVM
-
No reinvention of the wheel by implementing the protocol to talk to MongoDB again.
-
The addition of a supporting PHP library, to provide convenience methods, command helpers and in general, a nice user-facing API.
With all these requirement in place, we came up with a new architecture, that I will describe in an upcoming blog post.
Life Line
Spring Colours Along the Thames
It was aovely day for a walk. The spring colours are always so fresh.
Updated 2 benches
I walked 8.6km in 2h36m33s
Updated a bus_stop and a restaurant
I walked 2.0km in 20m06s
I've finished reading This Way Up. It's about maps, that went wrong.
It's a good read, but htyerr were several chapters that were written in a novel way (as a video transcript, a series of letters), and I found distracting from the a tail content. It'll have worked better in a produced video.
No mention of @openstreetmap though :-(
I walked 3.0km in 34m18s
I walked 1.1km in 10m21s
Updated a bench
I walked 4.9km in 56m05s
Created a tree; Updated 3 humps and a waste_basket
I walked 3.0km in 43m29s
The Early Cormorant Catches the Eel
Sorry, not the best photo! But I caught this Cormorant catching this large eel when looking for Bank Swallows, right next to Eel Pie Island in the Thames.
#Birds #BirdPhotography #BirdsOfMastodon #Photography #London
Updated an estate_agent office
I walked 2.2km in 37m51s
I went to my nieces' birthday party yesterday.
The theme was pink, and that included all the food, mostly died with beet root.
Shock and horror this morning when doing number two. Not only was my turd dark red, it was also glittering at me. Apparently the carrot cake had edible glitter...
So now I know what's worse than glitter.
😂 ✨ 💩 🟣Long-Tailed Tit on a Branch with Lichen
I've been spending some time in random London local nature reserves.
Sitting and listening, and in fifteen minutes you spot countless species.
This one was in Ham Lands Local Nature Reserve near Teddington.
#london #BirdPhotogaphy #BirdsOfMastodon #Birds #LichenSubscribe
I walked 2.5km in 29m36s
A Colourful Mandarin
In The Long Water in Kensington Palace Gardens, London.
Created 7 benches
Created 2 benches
Created a bench
I walked 7.3km in 2h28m39s
Added a note about a duplicate Papersmiths






Shortlink
This article has a short URL available: https://drck.me/newmongo1-byc