Downstream — Trip 1

As you might have picked up, I am currently walking the whole length of the Thames Path: all 184 miles from the source to the Thames Barrier.

I started making videos about each day of walking, which I will be posting here, but the videos don't tell the whole story, so I have decided to write about it as well. This is the story of my first trip, which I did over 3 days at the end of June. I had taken the Monday off so that I could do the first 3 stages in one go. That's mostly necessary because public transport along this first third of the Thames Path is either non-existent or really crappy. But both the source and Oxford are reasonably easy to reach by train.

Day 1 — Source to Cricklade

Downstream — Day 1

The first day started with the train to Kemble, which is about a mile from the source in the Cotswolds. After an hour's delay at Paddington, the change at Swindon wasn't great either: the smaller train into Kemble was very full because the previous London to Swindon train had been cancelled. Getting from Kemble station to the source was a little tricky, as it wasn't particularly well signposted. And once you get into the fields near the source, you basically need a compass to find your way, as there is no hint of the river yet at all.

When I made it to the source, I rested for a bit before starting my walk along "the river" towards Cricklade. The first bit was really just walking in a field, to a little past where I got onto the path from Kemble, where there was suddenly a little stream, with really clear water. From there on, the river gradually became wider and wider. Most of this first day was easy to walk, with mostly paths, and even though I thought I'd see a lot of mud, there was basically none. It having been so warm in the last few weeks really helped.

I was also quite excited when I spotted the first wildlife. At first I thought it was a snake, but it turned out to be a fake snake, a slow-worm. We have them in the Netherlands too. Not long after that, I also spotted a single pheasant.

Walking through the Cotswolds Water Park with its many lakes was quite variable. Some were quiet and had waterbirds, others were loud with people attempting to waterski.

Just before Cricklade I walked through the North Meadow, with its abundance of wild plants. Apparently it's one of the very few real meadows left in the UK.

Once I made it to Cricklade, I checked into the White Hart Hotel, had a shower, and promptly mapped the whole town on OpenStreetMap. Then it was time for a pint, and dinner. While checking in my pint on Untappd I noticed that there was a nearby venue, The Red Lion, which also features its own brews. I spent most of the rest of the evening there, trying out their brews in the nicely cool garden while reading a book (and Twitter!). It was a good end to the day after a 22km walk.

Day 2 — Cricklade to Newbridge

Downstream — Day 2

I got up really early (7:30am!!) to have breakfast and to head out as early as I could. I had to cover a lot of ground today to make it to Newbridge, where I had booked a night in an inn. I knew it was a long way, and unfortunately I had developed a blister on my heel. Luckily I had packed blister plasters, so this didn't end up being too big a deal. Sunday morning in Cricklade was really quiet, with only a few people walking their dogs. It is not a big town, and soon I found myself in the countryside.

Just before Lechlade, my Thames Path in the Country guide said that there was an annoying section along the A361, quite a bit away from the river. But since the guide was written, the Thames Path had been redirected through a path much closer to the river after having secured access. This new section had lots of brand new gates, and would surely have been a big improvement if there weren't a few fields of stinging nettles to get through. Walking in shorts was not a clever plan there.

I came upon the first swans, boats, and of course, the first lock (St. John's Lock). I spent a little time here, as locks have always fascinated me. But as I had a long walk today, I didn't linger too long.

I had lunch at Ye Olde Swan. I thought their web site had said they brewed their own beer, but that turned out not to be true. I still had lunch with a lovely pint on their outdoor terrace overlooking the Thames.

After lunch many of the "paths" turned into fields of plants that I had to battle through for a while, and many had nettles. Where the path was a bit easier, there were lots of butterflies and dragonflies. I probably brought a little bit too little water, but found out that most locks will have a water point. That was particularly useful on this hot section later in the day.

At about 19:00, I made it to the end of the section at Newbridge. I would be staying at The Rose Revived.

Day 3 — Newbridge to Oxford

Downstream — Day 3

After a good night's rest I left rather early again. Not quite because I had a long way to go (only 22km!), but mostly so that I could take it very slowly. The previous day's walk had definitely taken its toll.

The Thames Path went straight through the pub's garden, so I didn't have far to go to reach the path. Once I got out of the garden, there were a great many rabbits hopping around, only to be disturbed by me, one fellow hiker, and a very adventurous farmer who found it necessary to mow the hay at 08:30 in the morning, on a Sunday.

After having to go around a caravan park bordering the river, and through fields of sheep, I made good progress. With the river a bit wider, there was quite a lot more boat traffic on the Thames, and even some people swimming in it. I guess the Thames is quite a bit cleaner before cities like Oxford, Reading, and London dump their waste in it.

About halfway, I ran into a sign saying that one of the bridges over one of the tributaries was closed and that I should take a detour instead. That detour was very poorly signed, but with some help from the well-mapped paths on OpenStreetMap, I found myself a new route. About a third of the way through the detour, on the top of a hill, I ran into another hiker who informed me that the bridge wasn't actually closed. Which stinks, as this meant that I didn't actually walk the whole Thames Path, even though it was about the same distance.

After passing the Godstow Abbey Ruins, near Oxford, the riverside became busy with people (and geese!) enjoying the river on this hot day. I was happy to be done with the walking around 15:00, and to refresh myself with a pint and lunch before heading home by train to London.

Photos from my Adventure on the Thames Path are available on Flickr, and all videos on Vimeo.



Downstream — Day 1

I have started walking the whole length of the Thames Path. 184 miles from the source to the Thames Barrier. While walking it, I'm making a little documentary, "Downstream", of which the first episode is here:

Downstream — Day 1

It's my first ever attempt at doing something like this, and hence very rough. It covers the first day of a three-day walking trip, and I will be adding further episodes of this segment, and then further walking trips, in the future.





Last week I listened to an episode of The Sceptics' Guide to the Universe where the word of the week was "analemma". An analemma is a diagram showing the position of the Sun in the sky over the course of a year, as viewed at a fixed time of day from the same location on Earth. When I was still living in Norway, I once tried to make such a diagram from a series of photos, but the weather wasn't consistent enough to make that work.

But as I am currently starting to update the Guide to Date and Time Programming for a second edition, I was wondering whether I could create an analemma from existing PHP functions. Unfortunately, PHP only provides functionality to calculate when the Sun is at its highest point, through date_sun_info():

$sunInfo = date_sun_info(
        (new DateTimeImmutable())->getTimestamp(), // Unix timestamp
        51.53,                                     // latitude
        -0.19                                      // longitude
);

// 'transit' is the Unix timestamp at which the Sun is at its highest point
$zenith = new DateTimeImmutable( "@{$sunInfo['transit']}" );
echo $zenith->format( DateTimeImmutable::ISO8601 ), "\n";

On February 26th, that was at 2018-02-26T12:13:38+0000 in London.

Then I remembered that a few years ago I wrote Where is the Sun?. There I featured a new hobby library, "astro", that I was working on. This library implements a few astronomical calculations. I wrote a little PHP extension around it too: php-solarsystem. Neither the library nor the extension has really been released.

The php-solarsystem extension implements just one function: earth_sunpos(), which fortunately does exactly what I needed for drawing an analemma: it gives you the position of the Sun in the sky for a specific location on Earth at a specific time.

With this function, all I had to do is calculate the position of the Sun in the sky at the same time-of-day for a whole year. With the DatePeriod class in PHP, I can easily create an iterator that does just that:

date_default_timezone_set( "UTC" );

$dateStart = new DateTimeImmutable( "2018-01-01 09:00" );
$dateEnd   = $dateStart->modify( "+1 year 1 day" );
$dateInterval = new DateInterval( "P1D" );

foreach ( new DatePeriod( $dateStart, $dateInterval, $dateEnd ) as $date )
{
        // $date is a DateTimeImmutable at 09:00 UTC, one for each day
}

We don't want Daylight Saving Time to get in the way, so we set the time zone to UTC. That works fine for London, the location for which we'll draw the analemma.

We start at the start of the year (2018-01-01 09:00) and iterate for a year and a day (+1 year 1 day) so we can create a closed loop. Each iteration increases the returned DateTimeImmutable by exactly one day (P1D).
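To make the "year and a day" reasoning concrete, here is a minimal Python sketch of the same iteration. It is not part of the PHP script; the helper name daily_samples is made up for this illustration, and it assumes, as PHP's DatePeriod does by default, that the end date itself is excluded.

```python
from datetime import datetime, timedelta, timezone

def daily_samples(start, end, step=timedelta(days=1)):
    """Yield start, start + step, ... for as long as the value is before end."""
    current = start
    while current < end:  # the end date itself is excluded
        yield current
        current += step

start = datetime(2018, 1, 1, 9, 0, tzinfo=timezone.utc)
# "+1 year 1 day": one calendar year later, plus one extra day
end = start.replace(year=start.year + 1) + timedelta(days=1)

samples = list(daily_samples(start, end))
print(len(samples), samples[0].isoformat(), samples[-1].isoformat())
```

The last sample falls on 2019-01-01 at 09:00, one year after the first, which is what lets the plotted curve end close to where it started.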

After defining the latitude and longitude of London, all we need to do is to use the earth_sunpos() function to calculate the azimuth and altitude inside the loop. Azimuth is the direction of where the Sun is, with 180° being due South. And altitude is the height of the Sun above the horizon.

$lat = 51.53;
$lon = -0.09;

foreach ( new DatePeriod( $dateStart, $dateInterval, $dateEnd ) as $date )
{
        $ts = $date->format( 'U' );
        $position = earth_sunpos( $ts, $lat, $lon );
        // one CSV row per day: timestamp,azimuth,altitude
        echo $ts, ",";
        echo $position['azimuth'], ",";
        echo $position['altitude'], "\n";
}
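For readers without the php-solarsystem extension, the following Python sketch shows roughly what such a calculation involves. It is not the algorithm used by the "astro" library: it uses a low-precision Fourier-series approximation for the equation of time and the solar declination (coefficients as I recall them from NOAA's general solar position notes, so treat them as an assumption), it ignores atmospheric refraction, and the function name approx_sunpos is made up for this example.

```python
import math
from datetime import datetime, timezone

def approx_sunpos(when, lat_deg, lon_deg):
    """Return (azimuth, altitude) in degrees for a timezone-aware datetime.

    Azimuth is measured clockwise from North, so 180 is due South.
    Low-precision approximation without atmospheric refraction.
    """
    utc = when.astimezone(timezone.utc)
    day_of_year = utc.timetuple().tm_yday
    hours = utc.hour + utc.minute / 60 + utc.second / 3600

    # Fractional year, in radians
    gamma = 2 * math.pi / 365 * (day_of_year - 1 + (hours - 12) / 24)

    # Equation of time, in minutes (assumed coefficients, see lead-in)
    eqtime = 229.18 * (
        0.000075
        + 0.001868 * math.cos(gamma) - 0.032077 * math.sin(gamma)
        - 0.014615 * math.cos(2 * gamma) - 0.040849 * math.sin(2 * gamma)
    )

    # Solar declination, in radians (assumed coefficients, see lead-in)
    decl = (
        0.006918
        - 0.399912 * math.cos(gamma) + 0.070257 * math.sin(gamma)
        - 0.006758 * math.cos(2 * gamma) + 0.000907 * math.sin(2 * gamma)
        - 0.002697 * math.cos(3 * gamma) + 0.00148 * math.sin(3 * gamma)
    )

    # True solar time in minutes; longitude is positive to the East
    solar_minutes = hours * 60 + eqtime + 4 * lon_deg
    hour_angle = math.radians(solar_minutes / 4 - 180)

    lat = math.radians(lat_deg)
    sin_alt = (
        math.sin(lat) * math.sin(decl)
        + math.cos(lat) * math.cos(decl) * math.cos(hour_angle)
    )
    altitude = math.degrees(math.asin(sin_alt))

    # atan2 gives the angle measured from South; add 180 to measure from North
    from_south = math.atan2(
        math.sin(hour_angle),
        math.cos(hour_angle) * math.sin(lat) - math.tan(decl) * math.cos(lat),
    )
    azimuth = (math.degrees(from_south) + 180) % 360
    return azimuth, altitude

az, alt = approx_sunpos(datetime(2018, 6, 21, 9, 0, tzinfo=timezone.utc), 51.53, -0.09)
print(round(az, 1), round(alt, 1))
```

The results will differ slightly from a proper ephemeris, but they are close enough to see why the azimuth at 9 am stays to the East of due South all year.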

The script outputs the calculation as a "CSV", which we should redirect to a file:

php tests/analemma.php > /tmp/analemma.csv

To plot we use the following gnuplot script:

set style line 1 lt 1 lw 2 pt 0 ps 0 linecolor rgb "orange"
set style line 2 lt 1 lw 1 pt 0 ps 0 linecolor rgb "grey"

set datafile separator comma
set xrange [100:150]
set yrange [0:50]

set grid linestyle 2
set terminal png size 640,640 enhanced font "Helvetica,12"
set output '/tmp/analemma.png'

plot "/tmp/analemma.csv" using 2:3 title "London @ 9 am" with linespoints linestyle 1

With this script, we can then draw the analemma:

gnuplot /tmp/analemma.plot

The result:


Analemma (Plot) — Derick Rethans



Pretty Printing BSON

In Wireshark and MongoDB 3.6, I explained that Wireshark is amazing for debugging actual network communications. But sometimes it is necessary to debug things before they get sent out onto the wire. The majority of the driver's communication with the server is through BSON documents with minimal overhead of wire protocol messages. BSON documents are represented in the C Driver by bson_t data structures. The bson_t structure wraps all of the different data types from the BSON Specification. It is analogous to PHP's zval structure, although its implementation is a little more complicated.

A bson_t structure can be allocated on the stack or heap, just like a zval structure. A zval structure represents a single data type and single value. A bson_t structure represents a buffer of bytes constituting one or more values in the form of a BSON document. This buffer is exactly what the MongoDB server expects to be transmitted over a network connection. As many BSON documents are small, the bson_t structure can function in two modes, determined by a flag: inline, or allocated. In inline mode it only has space for 120 bytes of BSON data, but no memory has to be allocated on the heap. This mode can significantly speed up its creation, especially if it is allocated on the stack (by using bson_t value, instead of bson_t *value = bson_new()). It makes sense to have this mode, as many common interactions with the server fall under this 120-byte limit.

For PHP's zval, the PHP developers have developed a helper function, printzv, that can be loaded into the GDB debugger. This helper function unpacks all the intricacies of the zval structure (e.g. arrays, objects) and displays them on the GDB console. When working on some code for the MongoDB Driver for PHP, I was looking for something similar for the bson_t structure, only to find that no such thing existed yet. Because the bson_t structure is more complicated (two modes, with the data stored as a binary stream), such a helper would be at least as useful as PHP's printzv. You can guess already that, of course, I felt the need to just write one myself.

GDB supports extensions written in Python, but that functionality is sometimes disabled. It also has its own scripting language that you can use on its command line, or by loading your own files with the source command. You can define functions in the language, but the functions can't return values. There are also no classes or scoping, which means all variables are global. With the data stored in the bson_t struct as a stream of binary data, I ended up writing a GDB implementation of a streamed BSON decoder, with a lot of handicaps.

The new printbson function accepts a bson_t * value, and then determines whether its mode is inline or allocated. Depending on the allocation type, printbson then delegates to a "private" __printbson function with the right parameters describing where the binary stream is stored.

__printbson prints the length of the top-level BSON document and then calls the __printelements function. This function reads data from the stream until all key/value pairs have been consumed, advancing its internal read pointer as it goes. It can detect that all elements have been read, as each BSON document ends with a null byte character (\0).
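To show what that byte stream looks like, here is a small Python sketch that builds a BSON document by hand. It is an illustration only and not part of the GDB script; the helper name encode_int32_document is made up, and it handles nothing but 32-bit integer fields.

```python
import struct

def encode_int32_document(fields):
    """Encode a dict of str -> int as a BSON document of int32 elements."""
    body = b""
    for name, value in fields.items():
        body += b"\x10"                           # element type 0x10: 32-bit integer
        body += name.encode("utf-8") + b"\x00"    # field name as a C string
        body += struct.pack("<i", value)          # little-endian int32 value
    # The 4-byte length prefix counts itself, the elements, and the trailing null
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"

doc = encode_int32_document({"a": 1})
print(len(doc), doc.hex())
```

A document like {"a": 1} takes only 12 bytes, which gives an idea of why so many documents fit in the 120 bytes of inline mode.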

If a value contains a nested BSON document, such as the document or array types, it recursively calls __printelements, and also does some housekeeping to make sure the following output is nicely indented.

Each element begins with a single byte indicating the field type, followed by the field name as a null-terminated string, and then a value. After the type and name are consumed, __printelements defers to a specialised print function for each type. As an example, for an ObjectID field, it has:

if $type == 0x07
    __printObjectID $data
end
The __printObjectID function is then responsible for reading and displaying the value of the ObjectID. In this case, the value is 12 bytes, which we'd like to display as a hexadecimal string:

define __printObjectID
    set $value = ((uint8_t*) $arg0)
    set $i = 0
    printf "ObjectID(\""
    while $i < 12
        printf "%02X", $value[$i]
        set $i = $i + 1
    end
    printf "\")"
    set $data = $data + 12
end

It first assigns a value of a correctly cast type (uint8_t*) to the $value variable, and initialises the loop variable $i. It then uses a while loop to iterate over the 12 bytes; GDB does not have a for construct. At the end of each display function, the $data pointer is advanced by the number of bytes that the value reader consumed.

For types that use a null-terminated C-string, an additional loop advances $data until a \0 character is found. For example, the Regex data type is represented by two C-strings:

define __printRegex
    printf "Regex(\"%s\", \"", (char*) $data

    # skip through C String
    while $data[0] != '\0'
        set $data = $data + 1
    end
    set $data = $data + 1

    printf "%s\")", (char*) $data

    # skip through C String
    while $data[0] != '\0'
        set $data = $data + 1
    end
    set $data = $data + 1
end

We start by printing the type name prefix and first string (pattern) using printf and then advance our data pointer with a while loop. Then, the second string (modifiers) is printed with printf and we advance again, leaving the $data pointer at the next key/value pair (or our document's trailing null byte if the regex type was the last element).
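The same streaming approach is easier to follow in a language with return values and local variables. The Python sketch below mirrors the idea for a handful of types only (ObjectID, Regex, int32, and nested documents or arrays). It is an illustration, not a translation of the actual GDB script, and the function names read_cstring and read_document are made up for this example.

```python
import struct

def read_cstring(buf, pos):
    """Return (text, next_pos) for the null-terminated string starting at pos."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1

def read_document(buf, pos=0):
    """Decode one BSON document at pos; return (list of (name, value), next_pos)."""
    (length,) = struct.unpack_from("<i", buf, pos)
    start = pos
    pos += 4
    elements = []
    while buf[pos] != 0x00:              # a null byte ends the document
        elem_type = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if elem_type == 0x07:            # ObjectID: 12 raw bytes
            value = 'ObjectID("%s")' % buf[pos:pos + 12].hex().upper()
            pos += 12
        elif elem_type == 0x0B:          # Regex: pattern and modifiers as C strings
            pattern, pos = read_cstring(buf, pos)
            modifiers, pos = read_cstring(buf, pos)
            value = 'Regex("%s", "%s")' % (pattern, modifiers)
        elif elem_type == 0x10:          # 32-bit integer, little-endian
            (number,) = struct.unpack_from("<i", buf, pos)
            value = 'NumberInt("%d")' % number
            pos += 4
        elif elem_type in (0x03, 0x04):  # embedded document or array: recurse
            value, pos = read_document(buf, pos)
        else:
            raise ValueError("type 0x%02X is not handled in this sketch" % elem_type)
        elements.append((name, value))
    pos += 1                             # step over the trailing null byte
    if pos - start != length:
        raise ValueError("length prefix does not match the bytes consumed")
    return elements, pos

# A hand-built sample document with a regex, an ObjectID and an int32
body = (
    b"\x0Br\x00" + b"[a-z]+\x00" + b"im\x00"
    + b"\x07id\x00" + bytes.fromhex("5A1442F3122D331C3C6757E1")
    + b"\x10n\x00" + struct.pack("<i", 42)
)
sample = struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"
decoded, consumed = read_document(sample)
for name, value in decoded:
    print("'%s' : %s" % (name, value))
```

Because each reader returns the new position, there is no need for the global $data pointer that the GDB version has to advance by hand.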

After implementing all the different data types, I made a PR against the MongoDB C driver, where the BSON library resides. It has now been merged. In order to make use of the .gdbinit file, you can include it in your GDB session with source /path/to/.gdbinit.

With the file loaded, and bson_doc being a bson_t * variable in the local scope, you can run printbson bson_doc, and receive something like the following semi-JSON formatted output:

(gdb) printbson bson_doc
ALLOC [0x555556cd7310 + 0] (len=475)
{
    'bool' : true,
    'int32' : NumberInt("42"),
    'int64' : NumberLong("3000000042"),
    'string' : "Stŕìñg",
    'objectId' : ObjectID("5A1442F3122D331C3C6757E1"),
    'utcDateTime' : UTCDateTime(1511277299031),
    'arrayOfInts' : [
        '0' : NumberInt("1"),
        '1' : NumberInt("2"),
        '2' : NumberInt("3"),
        '3' : NumberInt("5"),
        '4' : NumberInt("8"),
        '5' : NumberInt("13"),
        '6' : NumberInt("21"),
        '7' : NumberInt("34")
    ],
    'embeddedDocument' : {
        'arrayOfStrings' : [
            '0' : "one",
            '1' : "two",
            '2' : "three"
        ],
        'double' : 2.718280,
        'notherDoc' : {
            'true' : NumberInt("1"),
            'false' : false
        }
    },
    'binary' : Binary("02", "3031343532333637"),
    'regex' : Regex("@[a-z]+@", "im"),
    'null' : null,
    'js' : JavaScript("print foo"),
    'jsws' : JavaScript("print foo") with scope: {
        'f' : NumberInt("42"),
        'a' : [
            '0' : 3.141593,
            '1' : 2.718282
        ]
    },
    'timestamp' : Timestamp(4294967295, 4294967295),
    'double' : 3.141593
}

In the future, I might add information about the length of strings, or convert the predefined subtypes of the Binary data type to their common names. Happy hacking!



Wireshark and SSL

This is a follow up post to Wireshark and MongoDB 3.6, in which I explained how I added support for MongoDB's OP_MSG and OP_COMPRESSED message formats to Wireshark.

In the conclusion of that first article, I alluded to the complications with inspecting SSL traffic in Wireshark, which I hope to cover in this post. It is common to enable SSL when talking to MongoDB, especially if the server communicates over a public network. When a connection is encrypted with SSL, it is impossible to dissect the MongoDB Wire Protocol data that is exchanged between client and server—unless a trick is employed to first decrypt that data.

Fortunately, Wireshark allows dissection and analysis of encrypted connections in two different ways. Firstly, you can configure Wireshark with the private keys used to encrypt the connection, and secondly, you can provide Wireshark with pre-master keys obtained from a client process that uses OpenSSL.

The first option, providing Wireshark with the private keys, is by far the easiest. You can go to Edit → Preferences → Protocols → SSL and add the private key to the RSA keys list.

When you start using Wireshark with SSL encryption, it is also wise to configure an SSL debug file in the same screen. I have set it here to /tmp/ssl-debug.txt.

Months ago, I had added my private key to the RSA keys list, but when I tried it again for this post, Wireshark failed to decrypt my SSL traffic to MongoDB. I was a little confused, as it had worked in the past. Since I had my SSL debug file, I at least had some chance of figuring out why this no longer worked. After a quick look I noticed the following in the debug file:

   session uses Diffie-Hellman key exchange
   (cipher suite 0xC030 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384)
   and cannot be decrypted using a RSA private key file.

After some searching, I found out that if the session uses Diffie-Hellman for key exchange, Wireshark cannot use the RSA private key, and needs different information. On an earlier run, I must have used a different version of either the encryption library (OpenSSL) or MongoDB, which did not use Diffie-Hellman.
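Whether a session falls into this category can be read from the cipher suite name that appears in the debug file. The Python sketch below is a rough heuristic based on IANA-style suite names, not something Wireshark itself exposes; the function name uses_ephemeral_dh is made up, and TLS 1.3 suite names (which have no "_WITH_" part) are deliberately not covered.

```python
def uses_ephemeral_dh(cipher_suite):
    """Return True if the suite name indicates a DHE or ECDHE key exchange.

    With an ephemeral Diffie-Hellman exchange the session secret is never
    sent encrypted under the server's RSA key, so the private key alone is
    not enough to decrypt a capture.
    """
    # The key exchange and authentication parts come before "_WITH_"
    key_exchange = cipher_suite.split("_WITH_")[0].split("_")
    return "DHE" in key_exchange or "ECDHE" in key_exchange

for suite in (
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",  # the suite from the debug file
    "TLS_RSA_WITH_AES_128_CBC_SHA",           # plain RSA key exchange
):
    print(suite, uses_ephemeral_dh(suite))
```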

This brings me to the second way of providing Wireshark with the information it needs to decrypt SSL encrypted connections: the pre-master key. This key is created during the connection set-up, and therefore you need to read data structures from within the OpenSSL library. You can do that manually with GDB, but it is also possible to inject a special library that hooks into OpenSSL symbols to read the data for you, and store them in a file with a format that Wireshark understands. You can find the source code for the library here.
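As an illustration of what such a file contains, here is a Python sketch that parses key log lines. It assumes the NSS key log format that Wireshark reads, in which each line holds a label, a hex-encoded identifier, and a hex-encoded secret; which labels the injected library actually writes is an assumption on my part, and both the function name parse_keylog and the sample values are made up.

```python
def parse_keylog(text):
    """Parse NSS-style key log text into (label, identifier, secret) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        label, identifier, secret = line.split()
        entries.append((label, bytes.fromhex(identifier), bytes.fromhex(secret)))
    return entries

# Made-up sample: a 32-byte client random and a 48-byte master secret
sample_log = "# hypothetical sample\nCLIENT_RANDOM " + "ab" * 32 + " " + "cd" * 48 + "\n"

for label, identifier, secret in parse_keylog(sample_log):
    print(label, len(identifier), len(secret))
```

Wireshark matches the identifier against the handshake it sees in the capture, which is how it knows which secret belongs to which connection.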

Once you've obtained the source code, you can compile it with:

cc sslkeylog.c -shared -o -fPIC -ldl

The compiled key logging library can be loaded in the process to override the existing OpenSSL symbols with:

SSLKEYLOGFILE=/tmp/premaster.txt LD_PRELOAD=./ \
    ./mongo --ssl \
    --sslPEMKeyFile=/tmp/ssl/ssl/client.pem --sslCAFile=/tmp/ssl/ssl/ca.pem

The OpenSSL LD_PRELOAD trick should also work with the PHP driver for MongoDB as long as it uses OpenSSL. You can verify which SSL library the PHP driver uses by looking at phpinfo() output. For Java programs, there is an agent you can use instead.

With the key logging library and its generated file with pre-master keys in place, and Wireshark configured to read the keys from this file through the (Pre)-Master-Secret log filename setting, we can now decrypt SSL-encrypted connections between MongoDB client and server:

There was one caveat: a small patch to Wireshark is needed for it to realise that MongoDB's connections can be SSL encrypted on the default port (27017). I created a patch with the following one-liner:

      dissector_add_uint_with_preference("tcp.port", TCP_PORT_MONGO, mongo_handle);
+     ssl_dissector_add(TCP_PORT_MONGO, mongo_handle);

This patch, and the two patches mentioned in the previous post, have been merged into Wireshark's master branch and will be included in the upcoming 2.6 release. Until that is released, you will have to compile Wireshark yourself, or use a nightly build.

