More source analysis with VLD
VLD is a tool that I started working on years ago to visualise the opcode arrays in PHP. Opcode arrays are what PHP's compiler generates from your source code, and they can be compared to the assembler code that a C compiler generates. Instead of being executed directly by the CPU, they are executed by PHP's interpreter.
Over the years I've been adding some functionality, also aided by Ilia and some others, to show more information. For example, Ilia has added a more verbose dumping format for opcodes (through the vld.verbosity setting), whereas I have added routines to find out which ops in oparrays can never be reached. A very simple example of the latter is shown here:
<?php
function test()
{
    echo "Hello!\n";
    return true;

    echo "This will not be executed.\n";
}
?>
If we run the above through VLD with php -dvld.active=1 test.php, you'll see the following output (I removed the part about the script body itself):
Function test:
filename:       /tmp/test1.php
function name:  test
number of ops:  9
compiled vars:  none
line    # *    op           fetch ext return operands
------------------------------------------------------------------
   2    0  >   EXT_NOP
   4    1      EXT_STMT
        2      ECHO                          'Hello%21%0A'
   5    3      EXT_STMT
        4    > RETURN                        true
   7    5*     EXT_STMT
        6*     ECHO                          'This+will+not+be+executed.%0A'
   8    7*     EXT_STMT
        8*   > RETURN                        null

End of function test.
Every opcode that has a * after the number (like in 5*) is code that cannot be reached, and can possibly be eliminated from the oparrays in an optimiser.
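The idea behind this reachability check can be sketched in PHP itself. This is a simplified illustration and not VLD's actual C implementation: the `$ops` array format and the `findReachable()` name are assumptions made up for this example. It walks the ops of test() from op 0, follows fall-through and jump targets, and reports whatever was never visited.

```php
<?php
// Hypothetical, simplified oparray for test(): [opcode name, optional jump targets].
$ops = [
    0 => ['EXT_NOP'],
    1 => ['EXT_STMT'],
    2 => ['ECHO'],
    3 => ['EXT_STMT'],
    4 => ['RETURN'],
    5 => ['EXT_STMT'],
    6 => ['ECHO'],
    7 => ['EXT_STMT'],
    8 => ['RETURN'],
];

// Returns the sorted list of op numbers that can be reached from op 0.
function findReachable(array $ops): array
{
    $reachable = [];
    $todo = [0];
    while ($todo) {
        $nr = array_pop($todo);
        if (!isset($ops[$nr]) || isset($reachable[$nr])) {
            continue; // outside the oparray, or already visited
        }
        $reachable[$nr] = true;
        $name = $ops[$nr][0];
        foreach ($ops[$nr][1] ?? [] as $target) {
            $todo[] = $target; // explicit jump targets
        }
        // In this sketch only RETURN and an unconditional JMP stop the fall-through.
        if ($name !== 'RETURN' && $name !== 'JMP') {
            $todo[] = $nr + 1;
        }
    }
    ksort($reachable);
    return array_keys($reachable);
}

$reachable = findReachable($ops);
$dead = array_values(array_diff(array_keys($ops), $reachable));
echo implode(', ', $dead), "\n"; // prints: 5, 6, 7, 8
```

The ops reported as dead are exactly the ones marked with a * in the dump above.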
The dead code analysis routines have also made their way into Xdebug which uses them for the code coverage functionality to highlight dead code. This mostly makes sense if you are running your code coverage together with unit tests such as you can do with PHPUnit.
Recently I've been working on some new functionality to visualise all the code paths that make up each function. These new routines sit on top of the routines that do dead code analysis. Every branch instruction (such as if, but also for and foreach) is analysed and a list of branches is created. Each branch contains information about the line on which the branch starts, the starting and ending opcode numbers that belong to the branch, and the other branches that this branch can jump to. There can be either no linked branches (for example when a return or throw statement is found), one linked branch (for an unconditional jump) or two linked branches (on a branch instruction). However, you need to be aware that internally, PHP's opcodes don't always reflect the source code exactly.
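To make the three cases concrete, here is how such branch information could be modelled in PHP. This is only an illustrative sketch: the array keys (`line`, `sop`, `eop`, `out`) mirror the fields described above, but the array layout, the example values and the `describeBranch()` helper are assumptions for this example, not VLD's internal C structures.

```php
<?php
// Hypothetical branch records, keyed by the number of their first opcode.
$branches = [
    0  => ['line' => 2,  'sop' => 0,  'eop' => 2,  'out' => [3]],     // one linked branch
    3  => ['line' => 4,  'sop' => 3,  'eop' => 5,  'out' => [18, 9]], // two linked branches
    18 => ['line' => 15, 'sop' => 18, 'eop' => 21, 'out' => []],      // no linked branches
];

// Classifies a branch by the number of branches it links to.
function describeBranch(array $branch): string
{
    switch (count($branch['out'])) {
        case 0:
            return 'no linked branches (return or throw)';
        case 1:
            return 'one linked branch';
        default:
            return 'two linked branches (branch instruction)';
    }
}

foreach ($branches as $nr => $branch) {
    printf("branch #%d (ops %d-%d): %s\n", $nr, $branch['sop'], $branch['eop'], describeBranch($branch));
}
```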
Once all the branches and their links are found, another algorithm runs to figure out which paths can be created out of all the branches. It is best to illustrate this with an example. So let us look at the following script:
<?php
function test()
{
    for( $i = 0; $i < 10; $i++ )
    {
        if ( $i < 5 )
        {
            echo "-";
        }
        else
        {
            echo "+";
        }
    }
    echo "\n";
}
?>
In this script we have a for-loop with a nested if construct. When we run this script through VLD (with php -dvld.verbosity=0 -dvld.dump_paths=1 -dvld.active=1 test2.php) we get the following output (again, only the test() function and with some white space modifications):
Function test:
filename:       /tmp/test2.php
function name:  test
number of ops:  22
compiled vars:  !0 = $i
line    # *    op           fetch ext return operands
------------------------------------------------------------------
   2    0  >   EXT_NOP
   4    1      EXT_STMT
        2      ASSIGN                        !0, 0
        3  >   IS_SMALLER             ~1     !0, 10
        4      EXT_STMT
        5    > JMPZNZ             9          ~1, ->18
        6  >   POST_INC               ~2     !0
        7      FREE                          ~2
        8    > JMP                           ->3
   6    9  >   EXT_STMT
       10      IS_SMALLER             ~3     !0, 5
   7   11    > JMPZ                          ~3, ->15
   8   12  >   EXT_STMT
       13      ECHO                          '-'
   9   14    > JMP                           ->17
  12   15  >   EXT_STMT
       16      ECHO                          '%2B'
  14   17  > > JMP                           ->6
  15   18  >   EXT_STMT
       19      ECHO                          '%0A'
  16   20      EXT_STMT
       21    > RETURN                        null

branch: #  0; line:  2- 4; sop:  0; eop:  2; out1:   3
branch: #  3; line:  4- 4; sop:  3; eop:  5; out1:  18; out2:   9
branch: #  6; line:  4- 4; sop:  6; eop:  8; out1:   3
branch: #  9; line:  6- 7; sop:  9; eop: 11; out1:  12; out2:  15
branch: # 12; line:  8- 9; sop: 12; eop: 14; out1:  17
branch: # 15; line: 12-14; sop: 15; eop: 16; out1:  17
branch: # 17; line: 14-14; sop: 17; eop: 17; out1:   6
branch: # 18; line: 15-16; sop: 18; eop: 21
path #1: 0, 3, 18,
path #2: 0, 3, 9, 12, 17, 6, 3, 18,
path #3: 0, 3, 9, 15, 17, 6, 3, 18,

End of function test.
This dump consists of a few different parts. First of all we can see some basic information containing the name, the number of ops (22) and the compiled variables. The second part is a dump of all the opcodes that make up this function. The last part contains information about all the branches and the possible paths. This information is a bit hard to visualise in its textual form, so I've also added some code that dumps this information to a file format that the GraphViz tool "dot" can use to create a pretty graph. For this we re-run the previous PHP invocation as php -dvld.dump_paths=1 -dvld.verbosity=0 -dvld.save_paths=1 -dvld.active=1 test2.php. This creates the file /tmp/paths.dot that "dot" can use. If we run dot -Tpng /tmp/paths.dot > /tmp/paths.png we end up with the following picture:

If we put this graph next to the code, we can explain how this works. Every branch is named by the number of the first opcode in that branch:
- op #0 contains the assignment of $i in line 4.
- op #3 is the loop test in line 4. If the condition doesn't match, we jump to op #18 on line 15 that echoes the newline.
- op #9 is the if condition on line 6.
- op #12 is when the if condition returns true and
- op #15 is when the if condition returns false.
- op #17 sits behind both op #12 and op #15 and makes sure there is a jump to the counting expression in op #6.
- op #6 is the post increment operation on line 4 which will then again be followed by op #3 to check whether the end of the loop has been reached.
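The three paths in the dump can be reproduced from the branch list with a small depth-first walk. The sketch below is my own illustration in PHP, not VLD's C code: the `$out` array is copied from the "branch:" lines of the dump, while the `findPaths()` name and the rule that a path never follows the same link twice are assumptions. That rule is what lets a path go around the loop once and then leave it.

```php
<?php
// Out-links per branch, copied from the "branch:" lines in the dump.
$out = [
    0  => [3],
    3  => [18, 9],
    6  => [3],
    9  => [12, 15],
    12 => [17],
    15 => [17],
    17 => [6],
    18 => [],
];

// Depth-first walk over the branches.
// Assumed rule: within one path, a link (from > to) is followed at most once.
function findPaths(array $out, int $nr, array $path = [], array $used = []): array
{
    $path[] = $nr;
    if (!$out[$nr]) {
        return [$path]; // no out-links: the path ends here
    }
    $paths = [];
    foreach ($out[$nr] as $next) {
        $link = "$nr>$next";
        if (isset($used[$link])) {
            continue; // already taken in this path, so don't loop again
        }
        $paths = array_merge($paths, findPaths($out, $next, $path, $used + [$link => true]));
    }
    return $paths;
}

$paths = findPaths($out, 0);
foreach ($paths as $i => $path) {
    printf("path #%d: %s\n", $i + 1, implode(', ', $path));
}
// prints:
// path #1: 0, 3, 18
// path #2: 0, 3, 9, 12, 17, 6, 3, 18
// path #3: 0, 3, 9, 15, 17, 6, 3, 18
```

Paths #2 and #3 each pass through op #3 twice: once on the way into the loop body and once more after op #6, at which point only the link to op #18 is still unused.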
This is of course a very simple example, but it also works for (multiple) classes and functions in a file. You just need to make sure to tell VLD that you don't want the code executed as the output could be very large. You can use the vld.execute=0 php.ini setting for that.
I hope this new functionality can shed some light on how loops etc. work in PHP. In order to play with the code, you need to check out VLD from my SVN with svn co svn://svn.xdebug.org/svn/php/vld/trunk vld. You can also view the code on-line at http://svn.xdebug.org/cgi-bin/viewvc.cgi/vld/trunk/?root=php. Look out for a new release coming soon!