More source analysis with VLD
VLD is a tool that I started working on years ago to visualise the opcode arrays in PHP. Opcode arrays are what PHP's compiler generates from your source code, and they can be compared to the assembler code that a C compiler generates. Instead of being executed directly by the CPU, they are executed by PHP's interpreter.
Over the years I've been adding some functionality, also aided by Ilia and some others, to show more information. For example Ilia has added a more verbose dumping format for opcodes (through the vld.verbosity setting) whereas I have added routines to find out which ops in oparrays can never be reached. A very simple example of the latter is shown here:
<?php
function test()
{
    echo "Hello!\n";
    return true;
    echo "This will not be executed.\n";
}
?>
If we run the above through VLD with php -dvld.active=1 test1.php, we see the following output (I removed the part about the script body itself):
Function test:
filename: /tmp/test1.php
function name: test
number of ops: 9
compiled vars: none
line  # *     op         fetch ext return operands
--------------------------------------------------
   2   0  >   EXT_NOP
   4   1      EXT_STMT
       2      ECHO                        'Hello%21%0A'
   5   3      EXT_STMT
       4    > RETURN                      true
   7   5*     EXT_STMT
       6*     ECHO                        'This+will+not+be+executed.%0A'
   8   7*     EXT_STMT
       8*   > RETURN                      null
End of function test.
Every opcode that has a * after the number (like in 5*) is code that cannot be reached, and could be eliminated from the oparray by an optimiser.
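To make the idea concrete, here is a minimal sketch of how such a reachability check could work. It is not VLD's actual C code: the array layout and the function name findDeadOps are invented for illustration, and the op list simply mirrors the dump above.

```php
<?php
// Simplified stand-in for an oparray (hypothetical layout): each op has a
// name and, for jump instructions, a list of 'targets' (op numbers).
$ops = [
    0 => ['op' => 'EXT_NOP'],
    1 => ['op' => 'EXT_STMT'],
    2 => ['op' => 'ECHO'],
    3 => ['op' => 'EXT_STMT'],
    4 => ['op' => 'RETURN'],
    5 => ['op' => 'EXT_STMT'],
    6 => ['op' => 'ECHO'],
    7 => ['op' => 'EXT_STMT'],
    8 => ['op' => 'RETURN'],
];

// Returns the numbers of all ops that cannot be reached from op #0.
function findDeadOps(array $ops): array
{
    $reached = [];
    $todo = [0];                    // execution starts at op #0
    while ($todo) {
        $nr = array_pop($todo);
        if (!isset($ops[$nr]) || isset($reached[$nr])) {
            continue;               // outside the oparray, or already visited
        }
        $reached[$nr] = true;
        $op = $ops[$nr];
        if ($op['op'] === 'RETURN') {
            continue;               // nothing follows a return
        }
        foreach ($op['targets'] ?? [] as $target) {
            $todo[] = $target;      // follow explicit jump targets
        }
        if ($op['op'] !== 'JMP') {
            $todo[] = $nr + 1;      // everything but an unconditional jump falls through
        }
    }
    return array_values(array_diff(array_keys($ops), array_keys($reached)));
}

echo implode(', ', findDeadOps($ops)), "\n";
```

For this op list the function reports ops 5, 6, 7 and 8, the same ones that VLD marks with a * in the dump.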
The dead code analysis routines have also made their way into Xdebug, which uses them for the code coverage functionality to highlight dead code. This mostly makes sense if you are running your code coverage together with unit tests, as you can do with PHPUnit.
Recently I've been working on some new functionality to visualise all the code paths that make up each function. These new routines sit on top of the routines that do dead code analysis. Every branch instruction (such as if, but also for and foreach) is analysed and a list of branches is created. Each branch contains information about the line on which the branch starts, the starting and ending opcode numbers that belong to the branch, and the other branches that this branch can jump to. There can be either no linked branches (when for example a return or throw statement is found), one linked branch (for an unconditional jump) or two linked branches (for a conditional branch instruction). However, you need to be aware that internally, PHP's opcodes don't always reflect the source code exactly.
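The following sketch shows one way such a branch list could be built from a flat list of ops. It is an illustration under assumptions, not VLD's implementation: the op list is a made-up if/else, the 'targets' key and the function name splitIntoBranches are hypothetical, and only the sop/eop/out field names are borrowed from VLD's branch dump.

```php
<?php
// Hypothetical op list for a small if/else. 'targets' holds explicit jump targets.
$ops = [
    0 => ['op' => 'IS_SMALLER'],
    1 => ['op' => 'JMPZ', 'targets' => [4]],  // conditional: may also fall through
    2 => ['op' => 'ECHO'],
    3 => ['op' => 'JMP', 'targets' => [5]],   // unconditional
    4 => ['op' => 'ECHO'],
    5 => ['op' => 'RETURN'],
];

// Assumes ops are numbered 0..n-1 without gaps.
function splitIntoBranches(array $ops): array
{
    $last = count($ops) - 1;
    $starts = [0 => true];
    foreach ($ops as $nr => $op) {
        foreach ($op['targets'] ?? [] as $target) {
            $starts[$target] = true;      // a jump target starts a branch
        }
        $endsBranch = isset($op['targets']) || $op['op'] === 'RETURN';
        if ($endsBranch && $nr < $last) {
            $starts[$nr + 1] = true;      // so does the op after a jump or return
        }
    }
    ksort($starts);
    $sops = array_keys($starts);

    $branches = [];
    foreach ($sops as $i => $sop) {
        $eop = isset($sops[$i + 1]) ? $sops[$i + 1] - 1 : $last;
        $end = $ops[$eop];
        $out = [];
        if ($end['op'] !== 'RETURN' && $end['op'] !== 'JMP' && $eop < $last) {
            $out[] = $eop + 1;            // fall-through link to the next branch
        }
        foreach ($end['targets'] ?? [] as $target) {
            $out[] = $target;             // explicit jump link
        }
        $branches[$sop] = ['sop' => $sop, 'eop' => $eop, 'out' => $out];
    }
    return $branches;
}

$branches = splitIntoBranches($ops);
foreach ($branches as $b) {
    printf("branch #%d: sop %d, eop %d, out: %s\n", $b['sop'], $b['eop'],
        $b['out'] ? implode(', ', $b['out']) : 'none');
}
```

In this small example the branch ending in JMPZ gets two links, the branches ending in JMP or in a plain ECHO get one, and the branch ending in RETURN gets none, which are the three cases described above.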
Once all the branches and their links are found, another algorithm runs to figure out which paths can be created out of all the branches. It is best to illustrate this with an example. So let us look at the following script:
<?php
function test()
{
    for( $i = 0; $i < 10; $i++ )
    {
        if ( $i < 5 )
        {
            echo "-";
        }
        else
        {
            echo "+";
        }
    }
    echo "\n";
}
?>
In this script we have a for-loop with a nested if construct. When we run this script through VLD (with php -dvld.verbosity=0 -dvld.dump_paths=1 -dvld.active=1 test2.php) we get the following output (again, only the test() function and with some white space modifications):
Function test:
filename: /tmp/test2.php
function name: test
number of ops: 22
compiled vars: !0 = $i
line  # *     op         fetch ext return operands
--------------------------------------------------
   2   0  >   EXT_NOP
   4   1      EXT_STMT
       2      ASSIGN                      !0, 0
       3  >   IS_SMALLER           ~1     !0, 10
       4      EXT_STMT
       5    > JMPZNZ           9          ~1, ->18
       6  >   POST_INC             ~2     !0
       7      FREE                        ~2
       8    > JMP                         ->3
   6   9  >   EXT_STMT
      10      IS_SMALLER           ~3     !0, 5
   7  11    > JMPZ                        ~3, ->15
   8  12  >   EXT_STMT
      13      ECHO                        '-'
   9  14    > JMP                         ->17
  12  15  >   EXT_STMT
      16      ECHO                        '%2B'
  14  17  > > JMP                         ->6
  15  18  >   EXT_STMT
      19      ECHO                        '%0A'
  16  20      EXT_STMT
      21    > RETURN                      null
branch: # 0; line: 2- 4; sop: 0; eop: 2; out1: 3
branch: # 3; line: 4- 4; sop: 3; eop: 5; out1: 18; out2: 9
branch: # 6; line: 4- 4; sop: 6; eop: 8; out1: 3
branch: # 9; line: 6- 7; sop: 9; eop: 11; out1: 12; out2: 15
branch: # 12; line: 8- 9; sop: 12; eop: 14; out1: 17
branch: # 15; line: 12-14; sop: 15; eop: 16; out1: 17
branch: # 17; line: 14-14; sop: 17; eop: 17; out1: 6
branch: # 18; line: 15-16; sop: 18; eop: 21
path #1: 0, 3, 18,
path #2: 0, 3, 9, 12, 17, 6, 3, 18,
path #3: 0, 3, 9, 15, 17, 6, 3, 18,
End of function test.
This dump consists of a few different parts. First of all we can see some basic information containing the name, the number of ops (22) and the compiled variables. The second part is a dump of all the opcodes that make up this function. The last part contains information about all the branches and the possible paths. This information is a bit hard to visualise in its textual form, so I've also added some code that dumps this information to a file format that the GraphViz tool "dot" can use to create a pretty graph. For this we re-run the previous PHP invocation as php -dvld.dump_paths=1 -dvld.verbosity=0 -dvld.save_paths=1 -dvld.active=1 test2.php. This creates the file /tmp/paths.dot that "dot" can use. If we run dot -Tpng /tmp/paths.dot > /tmp/paths.png we end up with the following picture:
If we put this graph next to the code, we can explain how this works. Every branch is named by the number of the first opcode in that branch:
- op #0 is the start of the function and includes the assignment of $i in line 4.
- op #3 is the loop test in line 4. If the condition isn't met, we jump to op #18 on line 15 that echoes the newline.
- op #9 is the if condition on line 6.
- op #12 is when the if condition returns true and
- op #15 is when the if condition returns false.
- op #17 sits behind both op #12 and op #15 and makes sure there is a jump to the counting expression in op #6.
- op #6 is the post increment operation on line 4 which will then again be followed by op #3 to check whether the end of the loop has been reached.
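The three paths in the dump can be reproduced with a small recursive walk over the branch links. This is only a sketch under one assumption, namely that a path never follows the same link (from one branch to another) twice; VLD's real routine may differ in its details. The branch links are copied from the out1/out2 values in the dump, while the function names hasLink and findPaths are made up.

```php
<?php
// Branch number => linked branches, taken from out1/out2 in the dump above.
$branches = [
    0  => [3],
    3  => [18, 9],
    6  => [3],
    9  => [12, 15],
    12 => [17],
    15 => [17],
    17 => [6],
    18 => [],
];

// Does $path already contain the step $from -> $to?
function hasLink(array $path, int $from, int $to): bool
{
    for ($i = 0; $i < count($path) - 1; $i++) {
        if ($path[$i] === $from && $path[$i + 1] === $to) {
            return true;
        }
    }
    return false;
}

// Extends $path with branch $nr and recurses into every link not yet used.
function findPaths(array $branches, int $nr, array $path, array &$paths): void
{
    $path[] = $nr;
    $extended = false;
    foreach ($branches[$nr] as $out) {
        if (!hasLink($path, $nr, $out)) {
            findPaths($branches, $out, $path, $paths);
            $extended = true;
        }
    }
    if (!$extended) {
        $paths[] = $path;   // nowhere left to go: this path is complete
    }
}

$paths = [];
findPaths($branches, 0, [], $paths);
foreach ($paths as $i => $path) {
    printf("path #%d: %s\n", $i + 1, implode(', ', $path));
}
```

Because the link from branch 3 to branch 9 may only be taken once per path, the loop body shows up exactly once in paths #2 and #3, and the walk ends in the same three paths that VLD prints.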
This is of course a very simple example, but it also works for (multiple) classes and functions in a file. You just need to make sure to tell VLD that you don't want the code executed as the output could be very large. You can use the vld.execute=0 php.ini setting for that.
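For readers who have not worked with GraphViz before, the sketch below shows how branch links like the ones from the test() example could be written out as input for "dot". The exact contents of the /tmp/paths.dot file that VLD writes are not shown in this article, so the graph name, the node labels and the function name branchesToDot are assumptions.

```php
<?php
// Branch number => linked branches, taken from out1/out2 in the earlier dump.
$branches = [
    0  => [3],
    3  => [18, 9],
    6  => [3],
    9  => [12, 15],
    12 => [17],
    15 => [17],
    17 => [6],
    18 => [],
];

// Builds a directed graph in the DOT language, one edge per branch link.
function branchesToDot(array $branches): string
{
    $lines = ['digraph paths {'];
    foreach ($branches as $from => $outs) {
        foreach ($outs as $to) {
            $lines[] = sprintf('    "op #%d" -> "op #%d";', $from, $to);
        }
    }
    $lines[] = '}';
    return implode("\n", $lines) . "\n";
}

$dot = branchesToDot($branches);
echo $dot;
```

Saving this output to a file and running it through dot -Tpng gives a graph with one node per branch and one arrow per link.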
I hope this new functionality can shed some light on how loops etc. work in PHP. In order to play with the code, you need to check out VLD from my SVN with svn co svn://svn.xdebug.org/svn/php/vld/trunk vld. You can also view the code online at http://svn.xdebug.org/cgi-bin/viewvc.cgi/vld/trunk/?root=php. Look out for a new release coming soon!
Comments
Nice nice nice nice nice Derick ! I'm gonna use that ASAP.
When will the code be pushed to pecl to use the pecl command to update vld ?
Seems like a cool idea : )
I won't use it in current projects, but I can see how that can come in handy! : )
thanks
art





Shortlink
This article has a short URL available: https://drck.me/msaw-vld-7rv