PHP Internals News: Episode 68: Observer API

In this episode of "PHP Internals News" I chat with Levi Morrison (Twitter, GitHub) and Sammy Kaye Powers (Twitter, GitHub, Website) about the new Observer API.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:15

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 68. Today I'm talking with Levi Morrison and Sammy Powers about something called the Observer API, which is new in PHP 8.0. Now, we've already passed feature freeze, of course, but this snuck in at the last possible moment. What is this Observer API going to solve?

Levi Morrison 0:44

The Observer API is primarily aimed at recording function calls in some way. It can also handle include, require, and eval, and potentially other things in the future. This is important because it allows you to write tools that automatically observe, hence the name, when a function begins or ends, or both.

Derick Rethans 1:12

What would you use that for?

Levi Morrison 1:13

So as an example, Xdebug can use this to know when functions are entered or left, and other tools, such as application performance monitoring (APM) tools like Datadog, New Relic, Tideways, Instana, and so on, can use these hooks too.

Derick Rethans 1:38

From what I understand, that is the point you're coming in from, because we haven't actually done a proper introduction, which I forgot about. I've been out of doing this for a while. So both you and Sammy work for Datadog and work on their APM tool, which made you start doing this, I suppose.

Sammy Kaye Powers 1:54

Yeah, absolutely. One of the pain points of tying into the engine to monitor things is that the hooks are insufficient in a number of different ways. The primary way that you would do function call interception is with a little hook called zend_execute_ex, and this will hook all userland function calls. The problem is that it has an inherent stack bomb in it: depending on your stack size settings, you're going to blow up your stack at some point if you have a very deep call stack in PHP. PHP technically has a virtually unlimited call stack, but when you use zend_execute_ex, it actually limits your stack size to whatever your ulimit stack size setting is. One of the issues that this solves is that stack overflow issue that you can run into when intercepting userland calls. The other thing that it solves is the potential JIT issues that are coming with PHP 8, where the optimizations that it does could potentially optimize out a call to zend_execute_ex, so that a profiling or APM tracing kind of extension would not be able to intercept that call because of the JIT. The Observer API makes it possible to solve multiple issues here. Not only that, there's more. There are more features to this thing, because zend_execute_ex by default will intercept all userland function calls, and you have no choice but to intercept every single call, whereas this API is designed to also allow you to choose which function calls specifically you want to intercept. So there is an initialisation step on the very first call of a function, and it'll basically send in the zend function. This is a bit of a point we've been going back and forth on: what we actually send in on the initialisation. But at the moment it is a zend function, so you can look at that and say: okay, do I want to monitor this function or observe it?
If your extension returns handlers and says "this is my begin handler, this is my end handler", those handlers will fire at the beginning and end of every call of that function. So it gives you a little bit of fine-grained resolution on what you want to observe. The other really baked-in part of the design is that we wanted it to play well with neighbours, because with pretty much all of the hooks at the moment, aside from typical extension hooks, whenever you tie into the engine it's very easy to be a noisy neighbour. It's very easy not to forward a hook along properly, and it's very easy to break something for another extension. This has neighbour considerations baked right in. So when you actually request a begin and end hook, it will manage on its side how to actually call those, and so you don't have to forward the hook along to other extensions, which makes it a little bit more stable in that sense.

Derick Rethans 4:52

From working on Xdebug, this forwarding is definitely a problem, with both zend_execute_ex and also zend_execute_internal, which is something I also override, of course. And I think there are similar issues with the error display as well, and PHP 8 will also have a new API for that. That is also coming out of a different performance monitoring tool, which makes it interesting to see how all these things work. You mentioned the zend function thing, and I'm not sure how well versed the audience is in all these internal things. What is this zend function thing?

Levi Morrison 5:24

A zend function in the engine is what represents a function: not the scope that it's called in, but the scope that it's defined in. It represents both method calls and function calls. It's triggered whenever a userland function is in play. So it has the function name and the name of the class that it's associated with, it tells you how many parameters you have, and things like this. It does not tell you the final object that it's called with, and this is partly why we are debating what exactly should get passed in here, because some people may care: oh, I only want to observe this with particular inheritors, or other things of that nature. So there's perhaps still a little bit of fine tuning in the design, but the basic idea is that you'll know the name of the function and what class it's in, and it's bound late enough in the engine that you would also have access to whatever parents that class has, etc.

Derick Rethans 6:33

Does it contain the arguments as well, that are being sent, or just a definition of the arguments?

Levi Morrison 6:38

The zend function only contains the definition of the arguments. The hook is kind of split into three sections: there's initialisation, and then begin and end. Initialisation only gives you the zend function, but begin and end give you access to what's called the zend execute data, which has more information, including the actual arguments being passed.

Derick Rethans 7:03

Okay, so the idea of the initialisation is just to find out whether you actually want to intercept the function. If you do, the engine remembers that, and if not, it wouldn't ever bother trying to intercept that specific zend function again.

Sammy Kaye Powers 7:17

Actually, what we pass into that initialization function has been sort of up for debate. The original implementations, that is plural: we've had many different implementations of this thing over the year. Derick, you did mention that this got squeezed in at the last minute. It has been a work in progress for a very long time, and it actually is fulfilling JIT work: there's a specific mention in the JIT RFC of an API that is going to be required to intercept some function calls that are optimized out, so that's why we were able to sneak in a little bit past feature freeze on the actual merge, I think. But what we actually send into this initialization function has been up for debate based on how we've actually implemented it. One of the early implementations called this initialization function during the runtime cache initialization, which is basically a cache that gets initialized before the execute data is actually created. We didn't have the option of sending in the execute data at that time, but we did have the zend function, so we were sending that in. Later on this implementation got refactored in a number of different ways. We have the option now to send in the execute data if we wanted to, but it might be sufficient to send in the op array instead of the zend function. The op array is the full set of opcodes that basically is a function definition from the perspective of internals, but it also covers things like includes and evals. Having additional information at initialisation might actually be handy. I think we're still thinking about that potentially changing. I don't know, what do you think, Levi?

Levi Morrison 8:59

Yeah, you can get the oparray from the function, so it's a little pedantic which one you pass in, I guess. But yeah, the idea is that we don't want to intentionally restrict it. It's just that the implementations have changed over the year, so we're not sure exactly what to pass in at the moment. I think a zend function is pretty safe. Passing in a zend oparray is perhaps a better signal of what it's actually for, because it can measure function calls, but also include, require, and eval. And the oparray technically does contain more information. Again, if you have a zend function, you can check to see if it is an oparray and get the oparray from the zend function. So it's a little pedantic, but maybe a little better at conveying the purpose and what exactly it targets.

Derick Rethans 9:56

And you can also get the oparray from zend_execute_data.

Levi Morrison 10:00

Yeah.

Derick Rethans 10:01

If I want to make use of this Observer API, how would I do that? I guess, as you said, only from extensions and not from userland.

Sammy Kaye Powers 10:08

Exactly. At the moment you would, as an extension, register your initialization handler during MINIT or startup, basically very early on while the actual PHP process is starting up. Under the hood, the whole course of history is changed for PHP at that point, because there is a specific observer path that is taken when an observer extension registers at MINIT or startup. From then on, the initialization function will get called on the very first call of each function. I know that sounds confusing, but if you have one function and it's called 100 times, that initialization will run one time. At that point you can return a begin handler, an end handler, or both. If you return null handlers, it'll never bother you again for that particular function, but it will continue, as mentioned earlier, for every new function that it encounters, every new function call that it encounters, I should say.
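The registration flow described above can be sketched in userland PHP. This is only a model of the protocol, not the real extension API, which is written in C: the `ObserverRegistry` class, its `register()` and `call()` methods, and the handler shapes are all hypothetical names made up for illustration. It shows that the initialisation callback runs once per function, that returning null means the observer is never consulted again for that function, and that begin and end handlers fire around every call.

```php
<?php
// Userland model of the observer protocol. NOT the real C API:
// ObserverRegistry and its methods are hypothetical.
final class ObserverRegistry
{
    /** @var list<callable> initialisation callbacks, one per "extension" */
    private array $inits = [];

    /** @var array<string, list<array>> cached [begin, end] pairs per function */
    private array $cache = [];

    public function register(callable $init): void
    {
        $this->inits[] = $init;
    }

    public function call(string $function, mixed ...$args): mixed
    {
        // Initialisation only happens on the first call of each function.
        if (!array_key_exists($function, $this->cache)) {
            $this->cache[$function] = [];
            foreach ($this->inits as $init) {
                $handlers = $init($function);
                if ($handlers !== null) {
                    $this->cache[$function][] = $handlers;
                }
            }
        }

        foreach ($this->cache[$function] as [$begin, $end]) {
            if ($begin !== null) {
                $begin($function, $args);
            }
        }
        $result = $function(...$args);
        foreach ($this->cache[$function] as [$begin, $end]) {
            if ($end !== null) {
                $end($function, $result);
            }
        }

        return $result;
    }
}

$log = [];
$initCalls = 0;

$registry = new ObserverRegistry();
$registry->register(function (string $function) use (&$log, &$initCalls): ?array {
    $initCalls++;
    if ($function !== 'strtoupper') {
        return null; // not interested, so never asked again for this function
    }
    return [
        function (string $fn, array $args) use (&$log): void { $log[] = "begin $fn"; },
        function (string $fn, mixed $result) use (&$log): void { $log[] = "end $fn"; },
    ];
});

$registry->call('strtoupper', 'a');
$registry->call('strtoupper', 'b');
$registry->call('strlen', 'abc');
$registry->call('strlen', 'abcd');
```

After these four calls the initialisation callback has run twice, once for each distinct function, and only the two `strtoupper` calls have produced begin and end entries in `$log`.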

Derick Rethans 11:12

There is not much overhead then, because the whole idea is that you want to add as little overhead as possible, I suppose.

Levi Morrison 11:19

Exactly. In the current design, pre PHP 8.0, you can hook into all function calls using zend_execute_ex, but it has performance overhead just for doing the call. So let's imagine we're in a scenario where we have two extensions, say Xdebug and one APM product. Even if neither of them is actually going to do anything on this particular call, the engine will still call those hooks, which has overhead. With the Observer API, if nobody is interested in a function, the engine can very quickly determine this and avoid the overhead of doing nothing. This way we only pay significant costs if there's something to be done.

Derick Rethans 12:09

You're talking about not providing much overhead at all. Just having the observer API in place, was there any performance hits with that?

Sammy Kaye Powers 12:17

That was actually one of the biggest hurdles that we had to overcome, specifically with Dmitry, in getting this thing merged in, because it does touch the VM. Whenever you touch the VM, any tiny little null check that you have in any of the handlers is probably going to have some sort of impact, at least enough for Dmitry, who understandably cares about very, very small overheads at the VM level, because these are happening for every function call. This is not something that's just happening one time during the request; it is happening a lot. In order to appease Dmitry and get this thing merged in, it basically had to have zero overhead on the non-observed path in production; it had to basically reach zero on those benchmarks. That was quite a task to try to achieve. We went through about four or five different ways of tying into the engine, and we got it down to about two new checks for every function call. That still was not sufficient, so, based on Dmitry's excellent suggestion, we ended up going with opcode specialization to implement observers. At compile time we can look and see if there's an observer extension present, and if there is, it will divert the function call related opcodes to specific handlers that are designed for observing. That way, once you get past that point, the observer path is already determined at compile time and all the observer handlers fire. In a non-observed environment, all of the regular handlers will fire without any observability checks in them.
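The idea of deciding the path once, instead of checking on every call, can be illustrated with a small userland sketch. It is only an analogy for opcode specialization, assuming a hypothetical `compileCall()` helper: the real mechanism lives in the generated C handlers of the VM. The point it shows is that the plain handler contains no observer check at all, while the observed handler is a separate code path chosen up front.

```php
<?php
// Analogy only: compileCall() is a hypothetical helper, not engine code.
// The handler is chosen once, so the plain path carries no observer check.
function compileCall(callable $fn, ?callable $observer): callable
{
    if ($observer === null) {
        // Regular handler: nothing observer-related remains in the hot path.
        return fn (mixed ...$args): mixed => $fn(...$args);
    }

    // Observed handler: a separate path that always notifies the observer.
    return function (mixed ...$args) use ($fn, $observer): mixed {
        $observer('begin');
        $result = $fn(...$args);
        $observer('end');
        return $result;
    };
}

$events = [];
$plain = compileCall('strlen', null);
$observed = compileCall('strlen', function (string $event) use (&$events): void {
    $events[] = $event;
});

$plainResult = $plain('abc');
$observedResult = $observed('abcd');
```

Both handlers return the same results as calling `strlen` directly; only the observed one records `begin` and `end` events.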

Derick Rethans 14:03

In the end, did you get the overhead down to zero or not?

Levi Morrison 14:07

It is zero for certain things. Of course, there are other places besides the VM that you have to touch things here and there for, you know, keeping code tidy and code sane but it's effectively zero, for all intents and purposes. Goal achieved. I will say zero percent.

Derick Rethans 14:30

I think the last version of the patch that I saw didn't have the opcode specialization in it yet. So I'm going to have to have a look at it myself again.

Levi Morrison 14:39

Yeah, the previous version had very low overhead, so low that you couldn't really observe it through any time-based measurements. But if you measured instructions retired or similar things from the CPU, then it was about a 0.4 to 1% reduction, and personally I would have said that's good enough, because all of those checks would be correctly branch predicted: you either have handlers in a request, or you don't, and so they would predict perfectly every time. But still, those are extra instructions technically, so that's why Dmitry pushed for specialization, and those are no longer there.

Derick Rethans 15:27

Does that mean there are new opcodes specifically for this, or is it just the specialization of the opcodes that is different?

Sammy Kaye Powers 15:33

It's just the specialization. During the process of figuring out what exactly Dmitry needed for this to be mergeable, we actually proposed an implementation that added an observer version of every kind of function call related opcode, like do_fcall_observed, or observed_return, or something like that. Opcode specialization reduces the amount of code that you have to write at the VM level. It doesn't change the amount of code that's generated, though, because with opcode specialization the definition file gets expanded by the PHP script that actually generates the C code. When you add a specialization to a handler that already has specializations on it, it will expand quite considerably. The PR at one point ended up being like 10,000 lines or something like that, so we had to do some serious reduction of the number of handlers that were automatically generated. Long story short: there are no new opcodes, but there are new opcode handlers to handle this specific path.

Derick Rethans 16:40

I'm not sure what more, if anything, to ask about the Observer API. Do you have anything to add yourselves?

Levi Morrison 16:45

I think it's worth repeating the merits of the Observer API and where we're coming from. The key benefits, in my opinion, are that it allows you to target per-function interception for observing, and that it allows you to do it in a way that plays nicely with other people in the ecosystem, and increasingly that's becoming more important. We've always had debuggers, and some people occasionally need to turn debuggers on in production, and other things like this. But increasingly there are other products in this space; the number of APM products is growing over time. There are new security products that are also using these kinds of hooks. And I expect that over time we will see more and more of these kinds of tools, so being able to play nicely is a very large benefit. At Datadog, where Sammy and I both work, we've hit product incompatibilities a lot of times, and some people are better to work with than others. I know that Xdebug has done some work to be compatible with our product specifically, but, you know, competitors aren't so interested in that. We care a lot about the community, right? We want the whole community to have good tools, and I don't think we actually mentioned yet that we did collaborate with some other people and competitors in this space. That hopefully proves that those are not just words of mine: we actually met with the competitors who were willing to, and discussed API design and use cases, making sure that we could all work together and compete on giving PHP good products, rather than hoarding technical expertise, running over each other, and causing incompatibilities with each other. So I think those are really important things. And then lastly, it does not have the stack overflow potential that the previous hooks did.

Derick Rethans 18:54

Yeah, which is still an issue for Xdebug, but I fixed that in a different way, by setting an arbitrary limit on the number of nested levels you can call, right.

Levi Morrison 19:02

Yeah, and in practice that tends to work pretty well, because most people don't have intentionally deep code. But some people do. And as an APM product, for instance, we can't say: sorry, your code is just not good code, we can't observe it, or we'll crash your thing. We can't make that decision. And then the biggest con at the moment is that it doesn't work with the JIT, but I want to specifically mention that that's not a technical limitation; it's just that not enough time has been put into that area yet, because this was crunched to the last second trying to get it in, and so some things didn't get as much focus yet. Hopefully by the time 8.0 gets released it will be compatible with the JIT, or at least the restriction will be per function, so maybe a function that gets observed can't be JIT compiled, but all the other function calls that aren't observed would be able to. We'll see. Obviously there's still work to do there, but that's our hope.

Derick Rethans 20:10

What happens now if you use the Observer API and the JIT engine is active? Does it just disable the JIT engine?

Sammy Kaye Powers 20:16

Yep. It just won't enable the JIT at all. In fact, it just goes ahead and disables it if an observer extension is enabled, and there is a little debug message that's emitted in the OPcache-specific logs that will say specifically why the JIT isn't enabled, just in case you're sitting there trying to turn the JIT on and you're like: why isn't it enabled? It'll say there's an observer extension present, so we can't enable the JIT. Hopefully they'll be able to work together a little bit, and maybe it will just change an optimization level or something in the future. I'd like to give a shout-out to Benjamin Eberlei, who has been with us since the very beginning on this whole thing, has been vetting the API on his products, has gotten xhprof working on not only the original implementation but also the newest implementation, and has just been a huge help in actually getting this thing pushed in. He said some of the magic words that actually got this thing merged in when it was looking like it wasn't gonna land for eight dot O, and got it landed for eight dot one, so Benjamin gets a huge thumbs up. Also Nikita Popov, Bob Weinand, and Joe Watkins, really early on. These are awesome people from internals who have spent some time to help us vet the API, but also to help us with specific implementation details. It's been just a huge team effort from a lot of people, and it was just really great to work across the board with everybody.

Derick Rethans 21:35

Yeah, and the only thing right now of course is all the extensions that do observe things need to be compatible with this.

Sammy Kaye Powers 21:43

Exactly.

Derick Rethans 21:44

Which also means there's work for me, potentially.

Sammy Kaye Powers 21:47

Absolutely.

Levi Morrison 21:49

I guess one minor point there is that if an extension does move to the new API, it is a little bit insulated from those that haven't moved to the new API. So, to some degree, it still benefits the people who haven't moved yet, because the people who have moved have one less competitor in the same hook. So it's just highlighting the fact that it plays nicely with other people.

Derick Rethans 22:14

Is opcache itself actually going to use it or not?

Levi Morrison 22:17

So this is focused only on userland functions; in past iterations that was not the case. Dmitry kind of pushed back on having this API for internals, and so that got dropped. I don't think at this stage there's any value in opcache using it specifically, but there are some other built-in things like DTrace. I don't know how many people actually use DTrace; I actually have used it once or twice, but DTrace could use this hook in the future instead of having a custom code path and things like that.

Derick Rethans 22:49

For Xdebug I still need to support PHP 7.2 and up, so I'm not sure how much work it is to do it right now, but it's definitely something to look into and move to in the future, I suppose. Well, thank you very much for having a chat with me this morning. I can see that for Sammy the sun has now come up and I can see his face. Thanks for talking to me this morning.

Sammy Kaye Powers 23:10

Thanks so much, Derick. I really appreciate all the hard work you put into this, because I know from firsthand experience how much work podcasts are, so I really appreciate the determination to continue putting out episodes. It's a huge amount of work, so thanks for being consistent.

Levi Morrison 23:26

Yeah, thank you so much for having us Derick.

Derick Rethans 23:30

Thanks for listening to this installment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patroen. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.

Show Notes

Shortlink

This article has a short URL available: https://drck.me/pin068-fq8

Xdebug Update: August 2020

Another monthly update where I explain what happened with Xdebug development in this past month. These will be published on the first Tuesday after the 5th of each month. Patreon and GitHub supporters will get it earlier, on the first of each month. You can become a patron to support my work on Xdebug. If you are leading a team or company, then it is also possible to support Xdebug through a subscription.

In August, I worked on Xdebug for about 80 hours, with funding being around 70 hours. I worked mostly on the following things:

Xdebug 3

This month I focussed on making it easier to spot what Xdebug is doing, and how and why it is trying to connect to a debugging client, creating files, etc. For this I reworked Xdebug's logging mechanism. Instead of it just being used for the step debugger, the logging mechanism is now used for everything, and as part of that I've renamed xdebug.remote_log to xdebug.log. The log will contain all messages with levels that match the xdebug.log_level setting, but Xdebug will now also send errors through PHP's standard error mechanism. Errors include not being able to open a log file, to create a profile or trace file, or issues with making debugging connections.

The biggest new addition is the xdebug_info() function, which acts as an Xdebug-specific phpinfo() and contains configuration information about Xdebug, but more importantly lists warnings and errors. Where possible, a link to the documentation is included as well, where I set up a new page that includes all the warning and error messages that Xdebug can generate. I have tried to add information about how a specific message is created, and what you can do to resolve the issue if possible.

Modes and Docker

Xdebug 3 introduces modes that can be configured with the xdebug.mode setting. This setting can only be changed in php.ini, and not in an .htaccess file or equivalent, nor in the script with ini_set. Because Docker and containers are a more common development set-up now, a method was needed to be able to change Xdebug's mode without having to rebuild the containers. Because of that, I added support for configuring the mode through a new environment variable, XDEBUG_MODE, too. This can be used when starting your Docker container, and/or in Compose.

PHP 8

Now that PHP 8 is getting closer, I have started to pay a little bit more attention to adding support for features, beyond making it compile. PHP 8 introduces named parameters, and also has a project for adding (and updating) names for internal functions. Xdebug's tracing and logging already showed variable names for user-defined functions, and I have now added support for doing that for built-in functions and methods as well.

Showing argument names in Xdebug 2 is controlled through setting xdebug.collect_params to 4. In Xdebug 3 I have removed this setting altogether, and argument names are now always visible if possible.

My focus in September is to add support for the rest of PHP 8's features and capabilities as well.

Xdebug Cloud

The only time I spent in August on Xdebug Cloud was on some further conversations with JetBrains. I am going to work on Xdebug Cloud when these conversations move along a little further.

If you want to be kept up to date with Xdebug Cloud, please sign up to the mailing list. I'll let you know as soon as something can be tried out.

Business Supporter Scheme and Funding

In August, no new supporters signed up.

If you, or your company, would also like to support Xdebug, head over to the support page!

Besides business support, I also maintain a Patreon page and a profile on GitHub sponsors.

Shortlink

This article has a short URL available: https://drck.me/xdebug-20aug-fpw

PHP Internals News: Episode 67: Match Expression

In this episode of "PHP Internals News" I chat with Derick Rethans (Twitter, GitHub, Website) about the new Match Expression in PHP 8.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:15

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 67. Today we're going to talk about the match expression. I have asked the author of the match expression RFC, Ilija Tovilo, whether he wanted to come and speak to me about the match expression, but he declined. As I think it's important that we talk in some depth about all the new features in PHP 8, I decided to interview myself. This is probably going to sound absolutely stupid, but I thought I'd give it a go regardless. So here we go.

Derick Rethans 0:53

Hi Derick, would you please introduce yourself?

Derick Rethans 0:56

Hello, I'm Derick, and I'm the author of Xdebug. I'm also PHP 7.4's release manager. I'm also the host of this podcast. I'm also you.

Derick Rethans 1:07

What a coincidence!

Derick Rethans 1:10

So what is the problem that this RFC is trying to solve?

Derick Rethans 1:13

Well, before we talk about the match expression, we really need to talk about switch. Switch is a language construct in PHP that, as you probably know, allows you to jump to different cases depending on a value. So you have the switch statement: switch, parenthesis open, variable name, parenthesis close. And then for each of the things that you want to match against you use: case, condition, and that condition can be either a static value or an expression. But switch has a bunch of different issues that are not always great. The first thing is that it matches with the equals operator, the two equals signs. This operator, as you probably know, will ignore types, sometimes causing interesting issues when you're matching variables that contain strings against cases that contain numbers, or a combination of numbers and strings. So, if you do a switch on the string foo, and one of the cases is case zero, then it will still be matched, because foo, after type juggling, is equal to zero, and that is of course not particularly useful. At the end of every case statement you need to use break, otherwise it falls through to the case that follows. Now sometimes that is something that you want, but in many other cases it isn't, and you always need to use break. If you forget, then some weird things will happen sometimes. A common way to use switch is that you switch on a variable and then, depending on which case is being matched, assign a value to a variable. The way you need to do that now is: case zero, result equals string one, break; and then you have case two, where you set result equals string two; and so on and so on. That isn't always a very nice way of doing it, because you keep repeating the assignment all the time. Another, but minor, issue with switch is that it is okay not to cover every value with a condition.
So it's totally okay to have case statements that don't cover a specific value, and switch doesn't require you to add a default at the end either, so you can actually have a value that would never match any case, and you have no idea that that would happen.
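The three switch pitfalls described above (loose comparison, fall-through, and silently unmatched values) can be seen in a short sketch. The function name `describeWithSwitch` and the strings are made up for illustration. It uses the numeric string "1" against the integer 1 rather than the "foo" against zero example from the discussion, because, as far as I know, PHP 8 changed how non-numeric strings compare with numbers, while "1" and 1 compare as equal on both PHP 7 and PHP 8.

```php
<?php
// Illustrative only: describeWithSwitch is a hypothetical example function.
function describeWithSwitch($value): string
{
    $result = 'unassigned';

    switch ($value) {
        case 1:
            $result = 'integer one';
            break;
        case 2:
            $result = 'integer two';
            // Missing break: execution falls through into case 3.
        case 3:
            $result = 'integer three';
            break;
        // No default: unmatched values are silently ignored.
    }

    return $result;
}

$loose = describeWithSwitch('1');     // string "1" loosely equals integer 1
$fallThrough = describeWithSwitch(2); // overwritten by the case 3 assignment
$unmatched = describeWithSwitch(99);  // no case matches and no error is raised
```

Here `$loose` ends up as 'integer one', `$fallThrough` as 'integer three', and `$unmatched` keeps the initial 'unassigned' value, with the repeated `$result = ...` assignments showing the duplication mentioned above.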

Derick Rethans 3:34

How's the match expression going to solve all of this stuff?

Derick Rethans 3:37

The match expression is a new language keyword that also allows you to switch depending on a condition matching a variable. You match this variable against a set of expressions, just like you would with switch, but there are a few major differences with switch here. Unlike switch, match returns a value, meaning that you can do: return value equals match, then the variable that you're matching, and the value that gets assigned is the result of the expression on the right hand side of the matching condition. So the way this works is that you do: result equals match, parenthesis open, your variable name, parenthesis close, curly braces, and then you have your condition, which can just be zero, or one, or a string, or a floating point number, or it can be an expression, for example, a larger than 42. This condition is followed by an arrow, so an equals sign and a greater than sign, and then another expression. Whatever that expression evaluates to becomes the return value of the match keyword. That means that you don't have to write return value equals value, or return value equals expression, in every case. So that is one thing, but there is a whole bunch of other changes. Matching with the match keyword is strict on type. Instead of using the equal operator, the two equals signs, match uses the identical operator, which is the three equals signs, so that is a strict comparison. The normal rules for type coercion are suspended, no matter whether you have strict types defined in your script, meaning that the match keyword always takes the type into account. There will not be any type juggling in there that can create confusing results. I've already mentioned that you can have an expression on both the left and right hand sides. Switch also allows you to have an expression on the left hand side, although that isn't used very much, I guess.
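As a minimal sketch of the two points above, match as an expression that returns a value and matching with the identical operator, consider the following. The function name `describeWithMatch` and the strings are made up for illustration, and the code needs PHP 8.0 or later.

```php
<?php
// Illustrative only: describeWithMatch is a hypothetical example function.
function describeWithMatch(mixed $value): string
{
    // match is an expression, so its result can be returned directly.
    return match ($value) {
        1 => 'integer one',
        '1' => 'string one',
        default => 'something else',
    };
}

$fromInt = describeWithMatch(1);
$fromString = describeWithMatch('1');
$fromFloat = describeWithMatch(1.0); // 1.0 === 1 is false, so default applies
```

Because the comparison is strict, the integer 1, the string "1", and the float 1.0 each take a different arm, and no assignment has to be repeated inside the arms.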
Another difference between switch and match is that, in the case of switch, on your right hand side you can have multiple statements, so you can have case seven, colon, and then a bunch of statements, and the statements are ended by the break, pretty much. With match you can only have one expression. It's possible that is going to change in the future, but it is at the moment analogous with what you can do with the short arrow functions, the arrow functions with the fn keyword. Those also only allow one expression on the right hand side. Match also doesn't do automatic fall through to the next condition. If you have a single expression that doesn't make a lot of sense anyway, but this is one of the other changes here. You can add multiple conditions, separated by commas, on the left hand side. So you can do zero comma one arrow, and then your expression again. With switch, it is possible to not cover all the cases. Say there are four possible values for a specific variable; take for example addition, subtraction, multiplication and division. If none of the conditions that you have set up with match matches, then you'll get an UnhandledMatchError exception. It is still possible to have a default condition, just like you have with switch. But if the value isn't covered by any of the conditions or a default, then you will get an exception. Is there anything that is actually still the same between switch and the new match? Well yes, in the implementation each condition is still evaluated in order, meaning that if the first condition doesn't match, it starts evaluating the second condition, or the third condition, and so on, and so on. Also, just like switch again, if all the conditions are either numeric values or strings, PHP's engine will construct a jump table. That was introduced somewhere in PHP seven three I think, and it automatically jumps to the right case.
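The comma separated conditions, the missing fall through, and the exception for an uncovered value could look roughly like this. It is a sketch for PHP 8.0 or later; the calculate() helper and its operators are hypothetical and only serve the example.

```php
<?php
// Hypothetical helper: the function name and operators are illustrative.
function calculate(string $operator, int $a, int $b): int
{
    return match ($operator) {
        '+', 'plus' => $a + $b, // several conditions share one arm
        '-' => $a - $b,         // no break needed: arms never fall through
        '*' => $a * $b,
        // '/' is deliberately left out, and there is no default arm
    };
}

$sum = calculate('plus', 2, 3);

// An uncovered value throws instead of silently doing nothing.
$error = null;
try {
    calculate('/', 6, 3);
} catch (\UnhandledMatchError $e) {
    $error = get_class($e);
}

// A default arm covers everything that no other condition matches.
$label = match (7) {
    0, 1 => 'small',
    default => 'other',
};
```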
With switch, all the conditions have to be either all integer numbers, like 0, 1, 2, 3, 4, or all strings, like foo, like bar, like baz, and so on. The match expression is actually a bit more flexible here, because it can construct one jump table if all the conditions are numbers or strings. It doesn't matter whether they are all numbers or all strings. If they are all integer numbers or strings, then it can construct this jump table. That doesn't work with switch, because switch does type coercion and match doesn't, and the internal implementation already supports this with a hashmap, which is pretty much an array that supports integer array keys as well as associative array keys. Because match compares strictly on type, it actually ends up working, so it works a little bit better, which is great.
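The jump table itself is an engine detail that cannot be observed from a script, but the behaviour it relies on can: integer and string conditions may be mixed in one match, and each is matched by type as well as value. A small sketch for PHP 8.0 or later, with a hypothetical describe() function and made-up labels:

```php
<?php
// Hypothetical function; the conditions and labels are illustrative.
function describe(int|string $key): string
{
    return match ($key) {
        0 => 'integer zero',
        '0' => 'string zero', // distinct from the integer 0 above
        1 => 'integer one',
        'one' => 'the word one',
        default => 'something else',
    };
}
```

Whether the engine actually builds a jump table for a given match is up to the implementation; the code above only shows that mixing the two condition types is valid and unambiguous.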

Derick Rethans 8:54

Were there any other additions that were considered for the new match keyword?

Derick Rethans 8:58

Well, there were a few things. There was a bit of discussion about blocks, meaning multiple statements to run for each condition. In the end it didn't become part of the RFC, perhaps because it made it a lot more complicated, or perhaps because it was really important to think that functionality through and, at the same time, think about what that does for the short arrow functions which, just like match, only support one expression at the moment. There were some thoughts about adding pattern matching to the condition, just like Perl does a little bit, like with regular expressions for example, but that is also a really difficult subject and lots of considerations have to be taken into account, so that was also dropped from this RFC. The last one was a quick syntax tweak, which would allow you to omit the variable name for a match expression. If you end up matching only against expressions, like a larger than 42, or b smaller than 12, then it doesn't necessarily matter what you have behind match; the variable name there doesn't matter. So the trick that people already use with switch is to use switch (true). And with match you can also use match (true), and the addition that was suggested here was to be able to not have the true there at all. So, the match would only work on the conditions and not try to match these against a variable, but that also didn't become part of this current RFC.
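The match (true) trick mentioned above can be sketched like this for PHP 8.0 or later. The classify() function and its thresholds are hypothetical.

```php
<?php
// Hypothetical function; the thresholds and labels are illustrative.
function classify(int $a): string
{
    // Each condition is evaluated in order and compared strictly against true.
    return match (true) {
        $a > 42 => 'large',
        $a > 12 => 'medium',
        default => 'small',
    };
}
```

Because the comparison is strict, each condition has to evaluate to the boolean true itself; a merely truthy value such as 1 would not match.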

Derick Rethans 10:31

Are there any backward compatibility breaks?

Derick Rethans 10:34

Well, beyond match being a new keyword, there are none. But because match is a new keyword that we're introducing, it means it can't be used as a full namespace name, a class name, a function name, or a global constant. It shouldn't really be much of a surprise that PHP can just introduce these keywords, and that this ends up breaking some code. I haven't looked at any analysis about how much code is actually going to break, but it is possible that it does break some. But PHP also owns the top level namespace, so if you had a class called match, you should have put it in your own namespace. And in PHP eight, with Nikita's namespaced names as token RFC, as long as match isn't on its own as your namespace name, you'd still be able to use it as part of a namespace name, which wouldn't have been possible with PHP seven four.

Derick Rethans 11:33

Okay. What was the reception of this RFC?

Derick Rethans 11:37

Initially, there was quite a bit of going back and forth, especially about the pattern matching and a few other things. But in the end, with the slightly reduced scope of the RFC, it actually ended up passing very well, with 43 votes for and two votes against, which means it's now part of PHP eight. In the last week or so we did find a few bugs. Some crashes. But Ilija, the author of both the RFC and the implementation, is working on getting those fixed, so I'm pretty sure we'll be all ready for PHP eight with the new match expression.

Derick Rethans 12:12

Thanks Derick for explaining this new match expression. It was a bit weird to interview myself, but I hope it turned out to be fun enough and not too weird.

Derick Rethans 12:21

Thanks for having me Derick.

Derick Rethans 12:23

This is going to be the last episode for a while, as PHP 8's feature freeze is now in effect, and no new RFCs are currently being proposed. Although I'm pretty sure they are being worked on. There's one exception to the feature freeze period, which is the short attribute syntax change RFC, which I'm collaborating on with Benjamin Eberlei. Whether that will turn into yet another episode about attributes, we'll have to see. For the PHP eight celebrations, I'm hoping to make two compilation episodes again, as I did last season with Episodes 36 and 37. For the PHP eight celebration episodes, I am again looking for a few audio snippets from you, the audience. I'm looking for a short introduction, with no commercial messages, please. After your introduction, state which new PHP eight feature you're looking forward to most, or perhaps another short anecdote about a new PHP eight feature. Please keep it under five minutes. With your audio snippet, feel free to email me links to your Twitter, blog, etc. The email address is in the closing section of each of the episodes. Here's an example of what I'm looking for.

Derick Rethans 13:35

Hi, I'm Derick, and I host PHP internals news. I am the author of Xdebug and I'm currently working on Xdebug cloud. My favourite new features in PHP eight are the additions to the type system: both union types and the mixed type continue to strengthen PHP's typing system. We've grown up from simple type hints to real proper types, following the addition of property types and covariance and contravariance in PHP seven four. I'm looking forward to PHP's type system becoming even stronger, perhaps with generics in the future.

Derick Rethans 14:07

Please record this as a lossless compressed file, preferably FLAC, recorded at 44,100 hertz. And if you can, save it with a bit depth of 24 bits. If you want to make a WAV file, that's fine too. Please make it available for me to download on a website somewhere, and email me once you have made one.

Derick Rethans 14:33

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.

Show Notes

Shortlink

This article has a short URL available: https://drck.me/pin067-fp4
