The Xdebug Experience

Xdebug development is currently in its 17th year. I started working on it when PHP 4.2.0 had just been released. The first CVS commit added the (in)famous "maximum nesting level" functionality, to prevent PHP from crashing when your script would infinitely recurse. I had forgotten this, but apparently Xdebug has some origins in VL-SRM, a naive attempt to create an application server running PHP bananas (think of: Java beans). The VL prefix still lives on as part of my PHP internals debugging extension VLD.

Where Are We Now?

As with any legacy project, Xdebug has technical debt. Some of it I managed to address during the years by dropping support for older PHP versions. There is a fair amount of sub-optimal code, algorithmic, and design issues, which impact ease of configuration, usability, and performance.

Configuration Confusion

On the configuration side, take for example the 65 configuration settings. There are a few pointless ones (xdebug.extended_info, xdebug.remote_handler), a few that duplicate functionality (xdebug.trace_output_dir and xdebug.profiler_output_dir), and a few that should never be used together (xdebug.remote_enable and xdebug.profiler_enable).

Unproductive Usability

If we look at usability, there are situations where breakpoints don't break. Some of these I can likely address, whereas others I can not due to the way how PHP works internally (See also: The Mystery of the Missing Breakpoints). Having OPcache in the mix does not help either as it can create another set of problems where visible code has been optimised out.

To solve a third group of issues with breakpoints, the DBGp "breakpoint resolved" notification needs to be implemented in both Xdebug and IDEs.

Another issue is that both Xdebug and PHP-FPM use port 9000 as their default, although Xdebug's use of port 9000 precedes PHP-FPM's by about five years (2004 vs 2009).

And lastly, (potential) users often cite that it is "hard" to set Xdebug up. Although after talking to some of these users, it often becomes clear that the problem is not so much on the Xdebug side of things, but rather other tools: their IDE, Docker, SELinux, firewalls, etc. Improving this is a matter of education (and better error messages).

Poor Performance

On the performance side, there are some places with major issues, and some others with minor issues. Code coverage isn't particularly fast, and that needs an in-depth investigation. I have several ideas on where improvements in code coverage performance can be made—nonetheless, I would continue to put correctness over speed.

Even with Xdebug just loaded and enabled, there is too much of a performance impact. This is because Xdebug usually has every feature ready to go, even though you don't necessarily want to use all of these different features. For example, it is ludicrous to use single step debugging with the profiler turned on.

Although it's often possible to reduce the impact of feature sets by setting configuration options correctly, a better way to put Xdebug in a specific mode would be welcome.

What Needs to Be Done?

In order to address many of these problems, Xdebug must undergo a massive refactoring. As part of that effort, the refactoring must primarily focus on solving the above mentioned three categories, although improvements in code layout are also desirable.

As part of the refactoring process, the following primary tasks need to be completed, in preferably the listed order:

  • Finalize PHP 7.3 support (there is a nasty bug where Xdebug and a specific OPcache optimisation setting conflict).

  • Code needs to be reorganised, as the current location of source files and functions is detrimental to improvements.

  • The number of configuration options needs to be reduced.

  • Modes needs to be introduced, so that it is easier to turn off and on (internal) code to support different features. This will very likely improve general performance already.

  • Specific code paths and features need to be optimised. This includes primarily the basic features (stack traces, var_dump() improvements, etc.), as well as Code Coverage.

What is Needed to Get This Done?

I would love to be able to continue to work on Xdebug; to keep it in sync with changes in PHP itself, to implement the ideas from this article to improve the Xdebug Experience, and to produce educational material such as improved documentation, tutorials, and videos. (Dare I say a book?)

Therefore I am currently looking for ways that I can be funded for doing all the required work. It will take a lot of time to get Xdebug 3 out in a tip-top shape. Possibly, if not more, than 6 months. Although I have plenty of time now that I've left MongoDB, the bills and mortgage do need to get paid too. I can think of a few options:

  • Get more users to sign up to my Patreon account. There is a group of loyal users and companies that contribute towards the upkeep of the server. But in order to make this sustainable, the patron count need to increase about 30 fold. I struggle with setting up rewards to nudge people to support me.

  • A fundraiser (through a crowd funding site) for specific tasks and/or features from the plan.

  • Some functionality that would only be available under a commercial license. One of the ideas here is to add a recording feature that records the full execution of a script, which then later can be replayed back through an interactive single step debugging session in an IDE.

  • Work with IDE manufacturers to implement some of their requested features (such as the before mentioned "resolved breakpoint" notification), and to come up with new features to make the debugging experience better.

I am interested to hear whether you have a specific preference, or perhaps some additional suggestions for me to consider. I would absolutely want to give Xdebug all the love it still deserves after 17 years. Let me know what you think! Either in a comment, or by email.

cheers, Derick

Shortlink

This article has a short URL available: https://drck.me/xdebug-experience-eoy

Comments

I don't do Patreon, but I do want to contribute. Are there other ways? Paypal?

I love Xdebug as it is. Because it's so useful, not because it's no easy. Everybody should use it. Making it easier and faster would be a great start!

Hi Derick. I'd be interested to hear why each new version of PHP requires specific support in Xdebug, rather than "just working". Maybe you could do a post on that? :)

Im in! Will support this undertaking.

Moving On From MongoDB

The last seven years, I have worked with, and for, MongoDB. I have seen it grow from a small start-up, then called 10gen, with fewer than a hundred people, into a NASDAQ listed company with over a thousand employees. But now, after all these years, it's time for me to move on.

I've learned a lot, and I have worked with very good people. In these seven years I rewrote the PHP driver (twice), wrote an HHVM driver (remember HHVM?), and implemented time zone support in the MongoDB server. I will be leaving the PHP driver for MongoDB in Jeremy's and other colleagues' capable hands.

I have come to greatly admire both the engineers at MongoDB, as well as the technology that we all created. I am a big fan of MongoDB, and will continue to speak and write about it.

What's next for me? That I'll explain in my next post.

Shortlink

This article has a short URL available: https://drck.me/bye-mongo-eop

Comments

All the best with your future endeavours!

If you are interested in a more aquatic experience/lifestyle than London affords, you are welcome to join me in mastering the sailor's trade. Who knows, we might end up organizing a 'coding and sailing' academy? :-) :-) :-)

The Confused C-Driver

Over the past few months I have been adding the MongoDB driver for PHP to Evergreen, MongoDB's CI system. Although we already test on Travis and AppVeyor, adding the driver to Evergreen also allows us to test on more esoteric platforms, such as ARM64, Power8, and zSeries.

While adding the zSeries architecture to our test matrix, we noticed that one specific group of tests was failing. In these tests we would do an insert with a write concern of {w: 0} followed by an insert with {w: 1}.

A write concern is used to enforce specific guarantees on how many MongoDB servers a write needs to be replicated to, before the primary responds to the client that the write has been acknowledged. A write concern of {w: 0} means that the client does not care about the result of the write operation, and henceforth does not need a reply from the server. These are also referred to as unacknowledged writes.

The test fails in such a way that the result for the second insert (with {w: 1}) seemed empty.

Figuring out what was going wrong was not particularly easy in this case. Single-stepping through a lot of connection and socket handling code with a fairly complicated protocol takes time. After several hours it was still unclear what was going wrong.

I tried to find out whether the data was sent correctly over the wire. As we only have a single shared zSeries development server without a GUI, I could not use Wireshark. Moreover, I could not use tcpdump either, as I didn't have any sudo rights:

$ tcpdump -i lo -nnXSs 0 'port 27017'
tcpdump: lo: You don't have permission to capture on that device
(socket: Operation not permitted)

After some time I figured out that you can also use strace for finding out what goes over the wire, albeit in a slightly annoying format. I used the following invocation:

strace -f -e trace=network -s 10000 -p 22506

The -f makes strace follow forked processes, the -e trace=network shows the system calls of all network related operations, the -s 10000 makes sure we see 10 000 bytes of data in strings, and the 22506 value for the -p argument is the process ID.

On the zSeries platform, the strace dump looks like the following. The first group is the insert with {w: 0}, and the second group the insert with {w: 1}}:

recvmsg(36, {msg_name(0)=NULL, msg_iov(1)=[{"\252\0\0\0\2\0\0\0\0\0\0\0\335\7\0\0", 16}], msg_controllen=0, msg_flags=0}, 0) = 16
recvmsg(36, {msg_name(0)=NULL, msg_iov(1)=[{"\0\0\0\2\0h\0\0\0\2insert\0#\0\0\0server_server_executeBulkWrite_002\0\10ordered\0\1\2$db\0\7\0\0\0phongo\0\3writeConcern\0\f\0\0\0\20w\0\0\0\0\0\0\0\1,\0\0\0documents\0\36\0\0\0\20wc\0\0\0\0\0\7_id\0\\\31\32\27\r\0200G%yF\322\0", 154}], msg_controllen=0, msg_flags=0}, 0) = 154
sendmsg(36, {msg_name(0)=NULL, msg_iov(1)=[{"&\0\0\0v\0\0\0\2\0\0\0\335\7\0\0\0\0\0\0\0\21\0\0\0\1ok\0\0\0\0\0\0\0\360?\0", 38}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 38

recvmsg(36, {msg_name(0)=NULL, msg_iov(1)=[{"\316\0\0\0\4\0\0\0\0\0\0\0\335\7\0\0", 16}], msg_controllen=0, msg_flags=0}, 0) = 16
recvmsg(36, {msg_name(0)=NULL, msg_iov(1)=[{"\0\0\0\0\0\214\0\0\0\2insert\0#\0\0\0server_server_executeBulkWrite_002\0\10ordered\0\1\2$db\0\7\0\0\0phongo\0\3lsid\0\36\0\0\0\5id\0\20\0\0\0\4\337{QQi\312A\245\213\270\210\376&+\247\260\0\3writeConcern\0\f\0\0\0\20w\0\1\0\0\0\0\0\1,\0\0\0documents\0\36\0\0\0\20wc\0\1\0\0\0\7_id\0\\\31\32\27\r\0200G%yF\323\0", 190}], msg_controllen=0, msg_flags=0}, 0) = 190
sendmsg(36, {msg_name(0)=NULL, msg_iov(1)=[{"-\0\0\0x\0\0\0\4\0\0\0\335\7\0\0\0\0\0\0\0\30\0\0\0\20n\0\1\0\0\0\1ok\0\0\0\0\0\0\0\360?\0", 45}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 45

The first line is the header, with the last four bytes (\335\7\0\0) being the opcode 2013 (OP_MSG). After the opcode, on the second line follow the flagBits: \0\0\0\2. The flagBits field is \0\0\0\0 for the second insert.

On my local Linux machine, the strace dump looks a little more complicated as the data is split into multiple IOV packets, but similar data is present. The trace is also made with the client and server reversed, so sendmsg and recvmsg are also swapped:

sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\252\0\0\0", iov_len=4}, {iov_base="\2\0\0\0", iov_len=4}, {iov_base="\0\0\0\0", iov_len=4}, {iov_base="\335\7\0\0", iov_len=4},
           {iov_base="\2\0\0\0", iov_len=4}, {iov_base="\0", iov_len=1}, {iov_base="h\0\0\0\2insert\0#\0\0\0server_server_executeBulkWrite_002\0\10ordered\0\1\2$db\0\7\0\0\0phongo\0\3writeConcern\0\f\0\0\0\20w\0\0\0\0\0\0\0", iov_len=104}, {iov_base="\1", iov_len=1}, {iov_base=",\0\0\0", iov_len=4}, {iov_base="documents\0", iov_len=10}, {iov_base="\36\0\0\0\20wc\0\0\0\0\0\7_id\0\\\31\32=\343\231,\25\301\\\203r\0", iov_len=30}], msg_iovlen=11, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 170

sendmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\316\0\0\0", iov_len=4}, {iov_base="\4\0\0\0", iov_len=4}, {iov_base="\0\0\0\0", iov_len=4}, {iov_base="\335\7\0\0", iov_len=4},
           {iov_base="\0\0\0\0", iov_len=4}, {iov_base="\0", iov_len=1}, {iov_base="\214\0\0\0\2insert\0#\0\0\0server_server_executeBulkWrite_002\0\10ordered\0\1\2$db\0\7\0\0\0phongo\0\3lsid\0\36\0\0\0\5id\0\20\0\0\0\4 \314rM\25\362L\7\203\36O\2157\344\201V\0\3writeConcern\0\f\0\0\0\20w\0\1\0\0\0\0\0", iov_len=140}, {iov_base="\1", iov_len=1}, {iov_base=",\0\0\0", iov_len=4}, {iov_base="documents\0", iov_len=10}, {iov_base="\36\0\0\0\20wc\0\1\0\0\0\7_id\0\\\31\32=\343\231,\25\301\\\203s\0", iov_len=30}], msg_iovlen=11, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 206
recvfrom(4, "-\0\0\0", 4, 0, NULL, NULL) = 4
recvfrom(4, "i\0\0\0\4\0\0\0\335\7\0\0\0\0\0\0\0\30\0\0\0\20n\0\1\0\0\0\1ok\0\0\0\0\0\0\0\360?\0", 41, 0, NULL, NULL) = 41

The first big difference is that locally, there is no recvfrom for the first sendmsg, but there is one for the second insert. The other difference is that the flagBits (the second line in each sendmsg) is \2\0\0\0 for the first insert. This is different in the zSeries trace, where the value is \0\0\0\2.

The latest version of the MongoDB wire protocol uses OP_MSG for all operations, which is a typical request/response API. This means that generally every request is expected to generate a reply from the server. For unacknowledged writes, the client does not need a reply from the server, and the way to tell the server that is by setting the moreToCome flag in the OP_MSG packet. It's bit 1 in the 32 bit wide bit field.

The wire protocol requires the use of the little-endian byte order for numbers. libmongoc, which implements the connection aspects of the PHP driver, accomplishes this by running a "swap" in case the driver is run on a big-endian system such as zSeries.

It turned out that although this swap happened for all normal integers (such as the OP_MSG opcode 2013), it did not happen for the flagBits. This meant that instead of setting the moreToCome flag (bit 1), we set bit 25, which does absolutely nothing. Because of that, the server helpfully sent a reply (the first sendmsg) in the zSeries dump. Because the driver did not expect such a packet, it did not read it from the socket either.

This meant that when the driver read information from the socket in response to the second insert (the one with a write concern of {w: 1}) it read the response of the first insert. And then it got confused.

In the end, the fix was as easy as making sure that the flagBits field was also correctly swapped between big- and little-endian.

Confusion solved!

Shortlink

This article has a short URL available: https://drck.me/c-endian-eab

Comments

No comments yet

The Mystery of the Missing Breakpoints

Occasionally I see people mentioned that Xdebug does not stop at certain breakpoints. This tends to relate to multi-line if conditions, or if/else conditions without braces ({ and }). Take for example the following artificial code sample:

 1 <?php
 2 $a = true;
 3 $b = true;
 4 $c = true;
 5 $d = false;
 6
 7 if (
 8   $a
 9   && $b
10   && ( $c || $d )
11   )
12 {
13   echo "It's true!";
14 }

If you set a breakpoint at either line 7, 11, or 12, you'll find that these are ignored. But why is that?

In order to investigate we employ vld, a tool that I wrote that can show PHP's internal bytecode (oparrays containing opcodes). They read very much like assembly code. For the above snippet, the vld dump looks as follows (after redacting useless information):

function name:  (null)
number of ops:  19
compiled vars:  !0 = $a, !1 = $b, !2 = $c, !3 = $d
line     #* E I O op             return  operands
---------------------------------------------------
   2     0  E >   EXT_STMT
         1        ASSIGN                 !0, <true>
   3     2        EXT_STMT
         3        ASSIGN                 !1, <true>
   4     4        EXT_STMT
         5        ASSIGN                 !2, <true>
   5     6        EXT_STMT
         7        ASSIGN                 !3, <false>
   8     8        EXT_STMT
         9      > JMPZ_EX        ~8      !0, ->11
   9    10    >   BOOL           ~8      !1
        11    > > JMPZ_EX        ~8      ~8, ->15
  10    12    > > JMPNZ_EX       ~9      !2, ->14
        13    >   BOOL           ~9      !3
        14    >   BOOL           ~8      ~9
        15    > > JMPZ                   ~8, ->18
  13    16    >   EXT_STMT
        17        ECHO                   'It%27s+true%21'
  15    18    > > RETURN                 1

The first column is the line number that PHP associates with each opcode, and you can see that there is no opcode for lines 7, 11, and 12. Additionally, Xdebug can only break on the EXT_STMT opcode, which you can see is only present for lines 8 and 13 in the logic section of the script. If a breakpoint is set on a line without an EXT_STMT opcode, Xdebug will not be able to interrupt the script on that line.

Sometimes it can be confusing where PHP thinks there is a line of code, especially since OPcache starts optimising more and more things out. In our example, it could really only leave line 13 around, as the rest is static. It doesn't quite do that yet however.

It is certainly frustrating that Xdebug cannot always stop where you want it to, but IDEs do have a possibility to interrogate where Xdebug could stop through a private DBGp commandxcmd_get_executable_lines. This can only be done when the script, or rather, a specific function is already running.

The DBGp protocol has provisions for signalling to IDEs whether it has resolved a breakpoint when a function gets entered into. Xdebug does currently not implement this functionality yet, but a ticket for it is scheduled for implementation (likely for Xdebug 3.0).

Shortlink

This article has a short URL available: https://drck.me/brk-ea9

Comments

No comments yet

Downstream — Trip 8

On the previous trip I walked from Teddington Lock to the Golden Jubilee Bridges. On this walk, I am finishing the series by going to the Thames Barrier. Getting to the start this time was easy peasy, with just a Tube ride down to Embankment.

Day 10 — Golden Jubilee Bridges to Thames Barrier

Downstream — Day 10

When I exited Embankment station, the rain had just stopped, and it felt very cold. The weather forecast indicated that it would get sunny, but I was very sceptical. I crossed the Golden Jubilee Bridges slowly, taking my time taking some photos and video of the London Eye. I considered taking a timelapse, but realised I had done so earlier for London in Fives, when the weather was much better.

The moment I made it to the south bank of the Thames, the Sun came out, and I was happy that I had brought my sun glasses. Not only was the Sun low, it also reflected a lot on the puddles on the ground.

This part of the Thames and Thames Part are right in the middle of London, and hence usually very busy with both commuters and tourists. Once the Sun came out, it therefore quickly became slow going, with lots of people going around the puddles in zigzaggy fashion.

The tide was out, which meant that there were quite a few people out on the river bed as well, to seek fortunes. There are still many bridges here, and definitely many landmarks and tourist attractions.

The best known is probably the Tower of London and Tower Bridge. The Victorian Tower Bridge (1894) is significantly younger than the Norman Tower (1078), which was built by the Normans after they invaded in 1066. The contrast behind the older Tower and Tower Bridge and the tall buildings of The City and City Hall is much starker still.

Beyond Tower Bridge, and the big yacht Kismet, the Thames Path becomes its quiet self again, with people going about their weekend morning business. The walk was now a lot more pleasant as I didn't have to dodge around people all the time. With the tide so low, many of the former docks now showed their muddy bottom, which made for some great photos.

After coming around the corner at Rotherhithe, the Thames Path loops nearly 270° degrees around The City, where dark clouds and rainbows loomed over on the North bank of the Thames, all the way to Greenwich. At Greenwich, the path goes along the Cutty Sark, a 19th century tea clipper which is now a museum. After crossing the square at the former Royal Naval College, the scenery turns quickly from "lovely residential area" into "industrial", which persists nearly all the way to the Thames Barrier.

Some interesting landmarks are the Millennium Dome. You can now walk across the roof, which is precisely what some people were doing when I came past. The Emirates Air Line flew high overhead between the two banks of the Thames as I was making my way further downstream.

After just over 300 kilometres of walking in 56 hours, while burning 25 000 calories, I then made it to the Thames Barrier, London's flood defence barrier. The barrier marks the end of the official Thames Path, although it is possible to extend it to the river Darent, where it then connects to the London LOOP that I walked a few years ago.

With the Thames Path done, it's time to look for the next adventure—I've my eyes on Hadrians Wall ;-).


Photos from my Adventure on the Thames Path are available on Flickr, and all videos on Vimeo. You can also see all the photos on a map.

Shortlink

This article has a short URL available: https://drck.me/downstream08-e8u

Comments

No comments yet

Life Line