Become a Patron!

My Amazon wishlist can be found here.

Life Line

PHP Internals News: Episode 55: Dealing with Bugs

In this episode of "PHP Internals News" I chat with Ignace Nyamagana Butera (Twitter, GitHub, Blog) about how the PHP project handles bugs and bug reports.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 55. Today I'm talking with Ignace Nyamagana Butera after he'd asked me on Twitter, how PHP deals with bugs. A few episodes ago, I did a Q&A session about the RFC process. And this time again, we'll have Ignace Nyamagana Butera asking the questions. Would you please introduce yourself?

Ignace Nyamagana Butera 0:46

Hello, everyone. Hello, Derick. My name is Ignace Nyamagana Butera, but you can call me Nyamsprod. I've been a PHP developer for around 15 years now. Currently, I'm working as a software developer, and technical lead in the internet content provider agency. When I have free time, I'm doing some open source, I have a couple of projects that you may have heard of, like, league CSV and league URI. I created them and I am currently maintaining them.

Derick Rethans 1:23

Yeah, as I said, it is not me asking the questions as you this time. So I think we should jump straight in actually.

Ignace Nyamagana Butera 1:30

So my first question will be somehow really simple, because we are talking about bugs. And I was wondering if we had some statistics about bugs in PHP.

Derick Rethans 1:44

Though there are some statistics. I mean, it's not really easy to get that information out of our bug system. But just having had a look, it's about on average, maybe one bug a day gets reported at the moment or is nearly 80,000 bugs in the bug system of course, not all of these are closed, some of them are open, but the majority of them are closed.

Ignace Nyamagana Butera 2:07

Do bugs from the EOL PHP still being taken into account or we just say: okay, these bugs for instance, are for PHP five, will no longer look at them.

Derick Rethans 2:18

If it's a bug, unless it's a security bug fix, we won't look at them for unsupported PHP versions. So at the moment, PHP, seven three, and seven four are still supported. So those bugs will of course look at, if it's a security bug, we only will go back to PHP seven two. If it's reported to any older version and seven two for example, seven one or seven zero, or even PHP four or five, which does happen occasionally, we'll tell them to upgrade first because we won't spend time doing that.

Ignace Nyamagana Butera 2:47

Because I manage and maintain open source project. I know that PHP as a language is used everywhere and you can have multiple reports. First thing first, what is a bug? Because there are multiple definition of it.

Derick Rethans 3:03

And I'm sure if you asked 12 people, you get 13 definitions. I think it is unexpected behavior of something that is documented. So if something is documented do this, and it does something else, or it does something really wrong like crash your program, then that will be a bug.

Ignace Nyamagana Butera 3:21

What is the source of truth? Is it the PHP documentation? Is it the PHP specification language, what is the source of truth? Nothing. Okay. This is expected behavior because it is documented, or how does it work?

Derick Rethans 3:38

For most of the syntax, it's what the source does. And of course, you always find edge case. And I don't have a good example right now. For anything that the syntax, I mean, documentation and behavior should absolutely always work the same. If it doesn't, it's likely going to be a bug in the documentation. If you for example, look at other functionality like in an extension, there is almost as likely that the documentation is sometimes wrong than it is that the code's behavior is wrong. In that case, we need to have a good look at what what the expected behavior should have been. Now, with all the new features that have been put in, since we have the RFC process, pretty much anything that the RFC describes how it should work, is how the feature should work. And if it doesn't, that pretty much means there's a bug. Having said that, not everybody writes on all the expected behavior for all the functionality that an RFC has been put up for. And in those cases, you just need to see what makes the most sense whether it's about core feature.

Ignace Nyamagana Butera 4:40

What is the best way to report a bug? Okay, you have to go to bugs.php.net, I suppose. Yes. But apart from that, what is the best way to report a bug?

Derick Rethans 4:51

As you said, PHP is issue tracker is bugs.php.net. It tells you to fill in your problem, your expected behavior and what you actually get out, what is always really important for people to be able to fix an issue and to find out whether there is an issue to begin with, because that's not always the case either of course, is always to have a short reproducible script that reproduces your problem. And by short, that means it the short you can get it. 10 lines at most for most syntax features who probably do the job. In some cases, if it's a bug for a database related system, then of course, there's going to be some database setup necessary for it. But if it's just syntax, then a short script that reproduces the problem that shows what goes wrong, is really important. And of course, it's also important to say what it did, and what you expected it to do. Also, don't lie about your PHP version, because in some cases, people try to report a bug with a higher PHP version than they're actually using, which is kind of frustrating at times.

Ignace Nyamagana Butera 5:52

I guess that yeah, if we report something that didn't work in PHP five, but it was fixed in PHP 7.2 or PHP 7.3 everybody loses a little bit of time.

Derick Rethans 6:02

And in some cases people find a bug report for, say, PHP 7.4.1. Right, and we're currently at 7.4.6. We will always ask them first to upgrade if they can, because upgrading PHP should take a lot less time than trying to reproduce and fix a problem that has already been fixed.

Ignace Nyamagana Butera 6:20

And what is the strategy between the release of each version of PHP and the bug fix? Does PHP wait for all the bug fixes to be done and then a release is made. Or if for instance, I report a bug like today before a release is scheduled, then this bug will be skipped from the next release and will be tackled after

Derick Rethans 6:46

Every minor version of PHP, be at seven two, seven three, or seven four a moment, has a release every four weeks. Two weeks and two days before a release gets made, we make our release candidates. Everything that has made it in the release candidate will make it into the release. If in between the release candidate gets created and the final release, if bugs get fixed, unless they are really critical, they will make it into that release. But we'll have to wait until the next cycle. So we don't necessarily wait for all the bugs to be fixed before we make a release. Now, there is an exception here, and that is for security bugs. If you find security bugs, they don't end up in a normal PHP seven four branch. They get committed to a security repository that very few people have access to. And these security bug fixes. They get merged into the release branches two days before the release comes out. They don't end up in a release candidate builds because we don't want people 16 days to be able to exploit security bugs if they are remote exploitable, for example.

Ignace Nyamagana Butera 7:53

And can security bugs, or critical bugs push a release?

Derick Rethans 7:59

Technically, yes. If somebody ends up finding, like a remote exploitable bug in PHP, then there will be an emergency release for them. But I can't remember the last time we had to do that.

Ignace Nyamagana Butera 8:10

I remember, like one or two years ago, there was a bug that was going from the bugtrack to the internal mailing list and coming back again to the bugtrack, because there was some kind of indecision to know if it is a bug, or if it should be a feature. How is this possible?

Derick Rethans 8:32

We don't really have a set method for doing this. But our bug tracker isn't the most advanced system in the world. And sometimes it just makes sense to trash out a discussion over email on our PHP internals mailing lists, or sometimes these discussions happen on other chat channels as well I'm sure, just to go through to see what's the case. And sometimes if it is hard to take a decision while there's a bug, then it is always a good idea that more PHP core developers have a look at it and see what's going on there. So sometimes it makes it easier if that's discussed on the mailing list, then in the bug tracker.

Ignace Nyamagana Butera 9:04

Is it possible that for instance, someone submit an RFC. And then during the course of discussion of this RFC, it becomes clear that this is not an RFC, but more of a bug fix.

Derick Rethans 9:16

I don't think I can think of an example here actually.

Ignace Nyamagana Butera 9:19

I remember one example.

Derick Rethans 9:21

Okay.

Ignace Nyamagana Butera 9:23

Because I think it was yeah two years ago about the behavior of the CSV escape character. And I remember at some point, it was suggested to be an RFC. And because of the amount of background compatibility breaks, it was better to treat it like a bug. But I remember when between the bug tracker and the note sufficient there was a whole discussion to exactly being able to say: Okay, this is a bug. And this is an RFC and it was really not, it was a call at the end saying, okay, we will treat it like an RFC, and we will change the way the escape corrector works today. But it won't be as impacting as if it was an RFC that introduced a completely new behavior

Derick Rethans 10:12

CSV is a very difficult format, because everybody slightly implements a standard in a different way. And the way how it originally got implemented in PHP for reading CSV files was done in a very different way than for example, what Microsoft products would create. I mean, it has to do with escaping, if I remember correctly. And I mean, what do you decide, right? I mean, since then Microsoft have made a specification for this. And of course, what we then want to do in PHP is to make sure that we support a specification, but by doing so, we will then break previous behavior, and that is always a really difficult decision to do, right. If it is very clear that it is a bug, then we don't mind changing PHP, even though that could technically break people's code. But if it's unsure or whether it's based on a subjective decision, then that makes it a lot harder to write because we can't definitively say that, yeah, we have a bug here. But if we look at other codebase out there, so many people rely on this. So is the old behavior bug, or is it a feature in PHP? I mean, these things, you have to take one by one, and it's very hard to decide on what is what is a feature, and what is the bug in this case.

Ignace Nyamagana Butera 11:22

I think another subject that comes with bugs is people should be able to fix them. But I suppose that every one of us has a work and who can fix those bugs?

Derick Rethans 11:33

Technically, everybody who has time and know C code could fix a bug. PHP is an open source projects. Our repositories are available on GitHub, or on git.php.net, which is our source of truth, although most people submitted bug fixes against the GitHub repository because it makes it easier to review them and comment on pull requests, for example. But it's open for everybody. It's the same thing about triaging bugs. Trying to find out if the bugs that are actually reported are actual bugs and the bugs.php.net website has in the top right hand corner, it has a random link. And if you click that you get a random bug that hasn't been resolved yet. If somebody, if any of the listeners, or maybe you, are interested in looking at these bugs or wanting to attempt to fix them, click random and see what happens. Maybe you get something interesting, maybe because something really complicated, but in any case, it's possible for everybody to fix a bug. They will get reviewed. For a good enough bug fix it will get merged.

Ignace Nyamagana Butera 12:31

People are usually thinking when they think about open source nowadays they think about semver and people may think that if they look at the versioning of PHP, then they have an idea of it is a patch release, it is a bug release, it is a feature release. How is this related to bugs and how is it versioning of PHP working?

Derick Rethans 12:53

PHP's versions number consists out of three numbers. At the moment, we are the latest version is 7.4.6. The six is your bug fix release. In bug fix releases, there will not be any new functionality. Unless there are very minor, small contained parts in extensions. We tend not to want to have these. And unless you can make a good case for it, it's unlikely to happen. But it isn't unheard of. An example I think I can remember is that open SSL, added a bunch of new API's in there, and other technically new function functions in PHP, they sort of had to be supported, because as part of making sure that you could run the latest version of open SSL or something like that, but that being an exception there. Now, the middle number, traditionally, in semver, is there for features, right, you've bump the middle number, the middle digit, if you have new features, and that is the same in PHP. What we don't really have is a major number that indicates that we are going to break things. The major number in PHP is mostly a marketing number. So at the moment, we have PHP seven four out there. We don't have PHP eight zero next. But that is pretty much a PHP seven five, but with additional functionality that we find important enough to bump the major version from seven to eight for. Having said that, we do have a rule that we don't remove functionality, unless we bump the major number. For example, from five to seven, or from seven to eight. So there will be in the course of time, we might deprecate functionality, we don't tend to remove that until we bump the major number. And you also see that if the major number gets increased, that there is potentially more effort in removing or deprecating more functionality that would otherwise do say for example, it changed from 7.3.0 to 7.4.0. But it doesn't mean that we don't bump major numbers so that we can break all the things for example. So I think the PHP protect tries to, we don't always succeed of course, try to never break people's code. Unless it's a bug fix

Ignace Nyamagana Butera 15:03

That was it for my questions.

Derick Rethans 15:06

Maybe I have some questions for you now. I think it is good to talk about these issues. What are you most surprised with in the way how the PHP process handles bugs and bug reports?

Ignace Nyamagana Butera 15:15

The first thing is, like I say, I've been coding in PHP for more than 15 years, but I only started really to report bugs once I start doing some open source project. Because before I think, and I think it's the majority of people, it's like, yes, there is a bug, oh it's something for PHP, or for any kind of language. I'm not the maintainer. So it's a bug, someone else will report it not to me. Since I've changed because I'm doing myself some open sourcing. I'm like, hey, if I found a bug, I think the best way to resolve that bug is first, to report it and to report it correctly, to the project, to the language or to whatever has that bug. And once you've made this change of how you think about the language, then you start to ask yourself, okay, how can I do it the most efficient way so that the bug get reported? And then the bug can get tackled by the people who can.

Derick Rethans 16:19

Yeah, and the start of that, as you say's, always send us a bug report or sent your favorite open source project a bug report.

Ignace Nyamagana Butera 16:26

Exactly.

Derick Rethans 16:27

I can sort of see where you're coming from. Because I can understand that if you're just in an agency, for example, and the only thing, the only thing you have to do is to make sure that your project is done on time. You can't necessarily wait for the bug to be fixed in PHP anyway, because the product needs to be done by tomorrow or yesterday. And you're going to have to find a workaround you issue in that case anyway. And then you spending time reporting the bug will just takes you time and you don't have time for that, for example. But of course, if you do that, then everybody else that runs into this bug will have to come up with a workaround, and that means you're all end up wasting lots of time.

Ignace Nyamagana Butera 17:04

I remember I had a small story. In one of my previous jobs, someone came to me and we're talking about something and he said: Oh, but there is no constant on the DateTimeImmutable. That's very sad. And I said: no, there is because I remember I submitted the bug, and it was tackled. And now the constants are on the interface. So DateTimeImmutable has the constant and was like: Oh, yeah, but I didn't know. And I was; it was reported and someone use it. And if you don't report it, then maybe in two years, you will ask yourself the same question. Indeed, it takes time. Between the moment it is reported the moment it is tacked, because people need to have time to resolve the issue. But if you don't do the first step, which is reporting it correctly, then it will never be solved.

Derick Rethans 17:53

And by correctly that also means doing in the PHP bug tracker and not complaining on Twitter.

Ignace Nyamagana Butera 17:58

Exactly. Exactly.

Derick Rethans 18:02

Of which I see quite a bit of for Xdebug for example. Thank you very much for taking the time to talk to me, or I should say thank you very much for taking the time to interview me to talk about bugs today. I hope you enjoyed this.

Ignace Nyamagana Butera

Thank you for having me. And hopefully we'll meet again.

Derick Rethans

I'm looking forward to that. Thanks very much.

Ignace Nyamagana Butera 18:21

Thank you.

Derick Rethans 18:23

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.

Shortlink

This article has a short URL available: https://drck.me/pin055-fls

Comments

No comments yet

PHP Internals News: Episode 54: Magic Method Signatures

In this episode of "PHP Internals News" I chat with Gabriel Caruso (Twitter, GitHub, LinkedIn) about the "Ensure correct signatures of magic methods" RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick, and this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 54. Today I'm talking with Gabriel Caruso about his ensure correct signatures of magic methods RFC. Hello Gabriel, would you please introduce yourself?

Gabriel Caruso 0:37

Hello Derick and hello to everyone as well. My name is Gabriel. I'm from Brazil, but I'm currently in the Netherlands. I'm working in a company called Usabila, which is basically a feedback company. Yeah, let's talk about this new RFC for PHP eight.

Derick Rethans 0:52

Yes, well, starting off at PHP eight. Somebody told me that you also have some other roles to play with PHP eight.

Gabriel Caruso 0:59

Yeah, I think last week I received the news that I'm going to be the new release manager together with Sara. We're going to basically take care of PHP eight, ensuring that we have new versions, every month that we have stable versions every month free of bugs, we know that it's not going to happen.

Derick Rethans 1:17

That's why there's a release cycle with alphas and betas.

Gabriel Caruso 1:20

Yeah.

Derick Rethans 1:21

I've been through this exactly a year early, of course, because I'm doing a seven four releases.

Gabriel Caruso 1:25

Oh, nice. Yeah. So I'm gonna ask a lot of questions for you.

Derick Rethans 1:29

Oh, that's, that's fine. It's also the role of the current latest release manager to actually kickstart the process of getting the PHP, in this case, PHP eight release managers elected. Previously, there were only very few people that wanted to do it. So in for the seven four releases it was Peter and me. But in your case, there were four people that wanted to do it, which meant that for the first time I can ever remember we actually had to hold some form of election process for it. That didn't go as planned because we ended up having a tie twice, which was interesting. So we had to run a run off election for the second person between you and Ben Ramsey, that's going to go continuing for you for the next three and a half years likely.

Gabriel Caruso 2:11

Yep.

Derick Rethans 2:12

So good luck with that.

Gabriel Caruso 2:13

Thank you. Thank you very much.

Derick Rethans 2:15

In any case, let's get back to the RFC that we actually wanted to talk about today, which is the ensure correct signatures of magic methods RFC. What are these magic methods?

Gabriel Caruso 2:24

So PHP, let's say out of the box, gives the user some magic methods that every single class have it. We can use that those methods for anything, but basically, what magic methods are are just methods that are called by PHP when a given action happens to the class. So for example, if a class is being constructed, then the construct magic method is going to be called. If I'm calling serialize function, then the magic method serialize as per PHP seven four or PHP eight. I don't remember, so this is basically what magic methods are, are methods that PHP hook into the classes and then once a certain action happened with the class, then PHP is going to call those magic methods in something magic, so to speak is going to happen.

Derick Rethans 3:13

And other options are like underscore underscore get, and underscore underscore set.

Gabriel Caruso 3:17

We have, we have a lot.

Derick Rethans 3:19

Exactly, what do people tend to use these magic methods for?

Gabriel Caruso 3:22

So that's something interesting. As the magic method is called by a number of actions we can use, for example, for let's let's get the example of ORM for example, Doctrine or Eloquent or whatever one. Let's say I'm a maintainer of that library. I don't know what fields do you have in your database. So when I'm porting, when I'm doing the translation, what it can do is map in a property, all those columns and values that I have in the database. And then when you instantiate your entity and you try to access a variable that is does not exist, then we're going to go to a magic method in this case is get, as I said, and I'm going to say okay, is not set in the class, but is mapped in the entity that I have. So this is one case, we also have the case for testing your you have, for example, the famous PHP Unit test framework, every time that a test case is called with all those methods is starting in with test, the call magic method is invoked. And then you can perform whatever action you have. You also have middlewares and the examples go go even further

Derick Rethans 4:32

In the title of RFC you have the word signature, what is the signature?

Gabriel Caruso 4:37

All the attributes that our method can have. So for example, the name of a method is its signature, what does it return? What parameters does it take? And also what modifiers so for example, is it static or not? Is it public, private or protected? So all this information together in usually is one line in PHP. So for example, private static MyMethod, that receives a string and returns a Boolean. There you go. This is the signature of my method

Derick Rethans 5:06

Because some of these magic methods have been in PHP for a long long time. Back in the time where we didn't have argument types or return types or perhaps not even static. All the way back from the past PHP hasn't really done anything with signatures because they've simply didn't exist. At the moment which signature checks this PHP already do?

Gabriel Caruso 5:26

I don't remember a by the RFC but I think was introduced together with the scalar type RFC. But only constructors and destructors until PHP seven four, those two only magic methods were being checked. If they have none return type, not even void, just no return type. But in PHP eight, we're gonna have the new stringable interface and then every single toString magic method. If it is typed, this is very important if it is typed it needs to be a string and these are the only from the 17 that we have only three in PHP 8 are being checked.

Derick Rethans 6:01

PHP seven four.

Gabriel Caruso 6:02

Yeah, in PHP seven four only two and then PHP eight, we have the new toString.

Derick Rethans 6:07

But this RFC suggesting to change that of course.

Gabriel Caruso 6:10

yeah.

Derick Rethans 6:11

What's the reason why you want to extend these checks to the other magic methods?

Gabriel Caruso 6:14

That brings me back how I figured out that. I was looking at some bugs, because we have the https://bugs.php.net, where we centralized all the bugs of PHP. Then there is a bug report explaining in complaining exactly about that. Like, I can't hide my magic method. Back in the days I can say, for example, that my tostring method is going to return an integer or a Boolean. That makes no sense. And then I was like, yeah, makes makes no sense. We need to fix that out and then I start to search how do we type that? How what types do we have and then I was like, we can't in PHP eight, because this is going to be a new major version. So we are allowed to at least vote for do that. We can check if someone is using types, we can check those types. We are not going to force, we are not going to require, we're not going to evaluate even run static analysis. Nope, we're going to simply check. Okay. Are you saying that this get magic method is going to return anything? Okay, that's okay. Oh, but I want to my guess is that you specifically return a string. That's also okay. As to how to pronounce that liskov mistook principle, right?

Derick Rethans 6:36

The liskov substitution principle.

Gabriel Caruso 7:26

Yeah. And so this is what we're going to basically do with this RFC, there's going to be voted. We're going to simply check if you're using the right types, because, in my opinion, magic methods are a foundation in PHP. As we have theses methods across different code bases across different projects from different behaviours, at least when I'm looking at that code. Okay, I'm looking at this magic method. I know what parameters does it take. I know what return does it have. This is worth less tab to the bug are trying to understand what is happening. Because today maybe I'm debugging a toString method there is return an integer. And I'm like, okay, this is the bug, it's supposed to return a string. But once you ensure those all those signatures, is one less bug that we're gonna have in production.

Derick Rethans 8:17

When are these signatures being ensured?

Gabriel Caruso 8:19

It's not at compile time because he does not have a compile time. But he's when the Zend machine is compiling the code, we have a very specific method that is checking all the modifiers. So for example, the signature that we mentioned before so all the magic methods needs to be public. This has been checked, for example, they callStatic magic method needs to be static. So this has also been checked. And then I'm extending how do we check for signatures for param types and also for return types. So during compilation of the Zend VM.

Derick Rethans 8:52

Taking as example callStatic in the RFC, I see that the name has to be a string and the arguments has to be an array. What happens if you use a different type there?

Gabriel Caruso 9:01

So nowadays if you use a different type that's allowed. So if you say there, you're going to receive an integer, and you're going to receive a string. This is allowed today. And this is what I mentioned about when you are debugging or analyze different code bases, you're going to be like why in the documentation says that we need to receive a string and an array, and there's this specific code base is receiving a string and an integer. So this is what kinds of mismatch I want to avoid. Of course, when using types, because we also know that PHP in some projects does not use types. And that's perfectly fine. If you're not using types, I'm not going to ask you, hey, you need to type those magic methods. Well, what I'm going to do is okay, you're using types and I need to make sure they're using right otherwise this is going to be a mess.

Derick Rethans 9:47

If you type it; say use an integer for the name of underscore underscore get, will give you a warning or a compile error, or parse error? What what kind of feedback which you get back from that?

Gabriel Caruso 9:59

While you are running your code, as soon as that class get referenced, we're going to check. Is not when is initiated, when is not when is called, as soon as I think the autoload detects that class is gonna parse, is going to identify, and then is going to compile and during the compile time that we mentioned. We're going to identify that. So it's going to be early in the stages. Perhaps as soon as you run something or you would upset me, you're going to have that feedback saying: hey, this is not compatible with what we are expecting.

Derick Rethans 10:32

Is that a warning or type error?

Gabriel Caruso 10:34

It's going to be a fatal error, because this is what we are constantly returning with the destructors and constructors.

Derick Rethans 10:41

Yeah, we alluded to mixed already a little bit and the RFC mentioned mixed a few times, of course mixes in the type and PHP yet. So what do you want to do about that?

Gabriel Caruso 10:51

Today we are 11th of May of 2020. Right now we have an RFC voting in PHP to introduce the mixed type. I'm not going to say if I agree or disagree, it's being voted. If that RFC gets accepted then I have already talked with the authors of the that RFC, I'm going to wait until they merge into master. I'm going to rebase and readapt to my RFC, to have those mixed types. And there we go PHP eight probably can have mixed, and probably can already have the usage of mixed in the magic methods. So either No, I'm gonna need to wait for the end of their RFC. If it's approved, there go I need to rebase my PR. In the other case, we are going to keep as comments because we can't ensure that in the compile time with the VM.

Derick Rethans 11:41

At the moment, it looks like that vote will and in May 21. The current votes are 35 to six for passing. So it looks like that will go through

Unknown Speaker 11:50

And then I need to rush because we have the upcoming feature freeze of PHP eight. So I need to make sure that I start to vote and implement my RFC before that time.

Derick Rethans 12:00

Feature freeze should be by the end of July. So I think you have plenty of ime pfor that. And of course you have a release manager, you can make an exception. That's how that works. Usually adding extra checks will have impact to existing code. Is there much impact to existing code here as well?

Gabriel Caruso 12:18

That was the interest question that I made myself. Okay, I'm going to touch the magic methods of PHP. I'm going to break some code in an issue identified those breaking changes in an each map in the RFC. How do I map across many projects, many libraries, many PHP codes out there? How do I do that? I remember that Nikita back in his RFC about the parenthesis origin, like how do we present this ordering and yada yada yada. He made a script, where he went through I think was the top thousand or top 10,000 packages. On packagist, that is the official composer package provider and he identified everything, and ask myself how he did that. And actually was very easy. He just cloned other repositories. He instantiate a new PHP parser instance that is his magic parser. That is behind PHP Stan, is behind psalm, is behind a lot of infection, a lot of big projects, where you analyze the code. So you have a code base where you can analyze and say: Do I have magic methods wrong? And then I run this script, identify, I think six or seven types that were not perfect. Three of them. I have already submitted a request because we're in PHP Unit and I said to Sebastian: hey, this actually is not right. Because I'm proposing this RFC, he was like: Okay, perfect, let's merge it. And the other cases are the cases that I mentioned. For example, with get. Get, you need to return mixed but by the LSP, you can nail down to an integer or a string. So there you go, at least in the top 10,000 packages of composer is not going to be a breaking change. But of course, it's going to be breaking change for people that I can't map. So this is why it's mentioned the RFC that if you're using types with magic methods wrong, we're going to warn you.

Derick Rethans 14:13

But at least it's an easy thing to check for. Because even running all your files through PHP minus L should catch it.

Gabriel Caruso 14:20

Yeah, there you go.

Derick Rethans 14:22

So it's a very easy to check for something. You provided a link to Nikita's script where he checks for those ternairies, do you have a version of your own script available as well?

Gabriel Caruso 14:33

That's interesting. I thought the RFC was updated. So I'm going to update the RFC, because I do have the script locally.

Derick Rethans 14:39

Then I can link to it for the podcast as well.

Gabriel Caruso 14:41

Okay, perfect.

Derick Rethans 14:42

In the future, are you thinking of extending checks to a few more things?

Gabriel Caruso 14:46

So this is something that I fought about this RFC, like how much you want to break and explode people's code. And I think starting with checking types in the signature is the first step. The next step is to actually check the return type. We do that with toString. So for example, although you have type right for maybe, some logic or something is wrong, you're returning an integer. There is a check before the actual type saying you're supposed to return a string you're return an integer. And actually, there is a check in the magic method saying this magic method was supposed to return a string. I think is gonna break even more code because then it's something that I can't measure. So I was like: Okay, let's first start with types and then we can give it next step that is: okay, inside this method, what is being returned, okay, is something different from the signature: explode. You're returning something that I was not supposed to return. But this is not a fight that I'm going to pick. So I leave it up for the next major version of PHP or whatever.

Derick Rethans 15:49

Wouldn't PHP's strict versus weak type mechanism already catch these things. So from debugInfo, if you would type that as returning an array, and then you end up returning an object, which is not necessarily wrong, just not what you expected. PHP's return type checking mechanism should already catch that for you.

Gabriel Caruso 16:13

If you have a magic method typed. If it's not typed, so we can say that some efforts do have that check. And then we're going to expand when we don't have types in the signature.

Derick Rethans 16:24

That's clear now. Do you have anything else to add?

Gabriel Caruso 16:27

The only thing that I want to add that is, I have created another RFC, and this is something that I always tell everyone that is easy to do; is not impossible. Anyone can go there, identify a bug or catch a bug report and then try to fix it. And this is what I'm doing. Like I'll do them to release many of PHP eight. I'm also fixing bugs, improving documentation and everything else. This is something that I try to do and share with everyone. So everyone can also be the next one contributor to the to PHP and it's evolution.

Derick Rethans 16:57

This RFC isn't out for voting yet. You set you want to sort of wait until mixed gets passed or not. What's the reception been so far?

Gabriel Caruso 17:05

So I asked a couple of key members of the PHP community, both internal and external people. They agree, they said that the right approach is to first check for the signature, because if someone is already using types, that project is type friendly, so we can at least play with that. But if someone is not typing, then this is a bigger fight. And then we're going to talk about that in the future.

Derick Rethans 17:29

Thank you, Gabriel for taking the time this morning to talk to me. I've learned a few more things about this RFC, so that's always good to know. And again, congratulations of being the PHP eight release manager together with Sara.

Gabriel Caruso 17:41

Thank you very much. Also thank you for inviting me for this new podcast is amazing. Always listen to all these famous people of PHP that talked with you. And I'm like, Whoa, Derick has invited me this is going to be so much fun. Thank you very much.

Derick Rethans 17:55

Thanks for listening to this installment of PHP internals news, the weekly podcast dedicated to demystify the development of the PHP language, I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to Dderick@phpinternals.news. Thank you for listening, and I'll see you next week.

Shortlink

This article has a short URL available: https://drck.me/pin054-fli

Comments

No comments yet

PHP Internals News: Episode 53: Constructor Property Promotion

In this episode of "PHP Internals News" I chat with Nikita Popov (Twitter, GitHub, Website) about the Constructor Property Promotion RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:16

Hi, I'm Derick. And this is PHP internals news, a weekly podcast dedicated to demystifying the development of the PHP language. This is Episode 53. Today I'm talking with Nikita Popov about a few RFCs that he's made in the last few weeks. Let's start with the constructor property promotion RFC.

Nikita Popov 0:36

Hello Nikita, would you please introduce yourself? Hi, Derick. I am Nikita and I am doing PHP internals work at JetBrains and the constructor promotion, constructor property promotion RFC is the result of some discussion about how we can improve object ergonomics in PHP.

Derick Rethans 0:56

Object economics. It's something that I spoke with Larry Garfield about two episodes ago, where we discuss Larry's proposal or overview of what can be improved with object ergonomics in PHP. And I think we mentioned that you just landed this RFC that we're now talking about. What is the part of the object ergonomics proposal that this RFC is trying to solve?

Nikita Popov 1:20

I mean, the basic problem we have right now is that it's a bit more inconvenient than it really should be to use simple value objects in PHP. And there is two sides to that problem. One is on the side of writing the class declaration, and the other part is on the side of instantiating the object. This RFC tries to make the class declaration simpler, and shorter, and less redundant.

Derick Rethans 1:50

At the moment, how would a typical class instantiation constructor look like?

Nikita Popov 1:55

Right now, if we take simple examples from the RFC, we have a class Point, which has three properties, x, y, and Zed. And each of those has a float type. And that's really all the class is. Ideally, this is all we would have to write. But of course, to make this object actually usable, we also have to provide a constructor. And the constructor is going to repeat that. Yes, we want to accept three floating point numbers x, y, and Zed as parameters. And then in the body, we have to again repeat that, okay, each of those parameters needs to be assigned to a property. So we have to write this x equals x, this y equals y, this z equals z. I think for the Point class this is still not a particularly large burden. Because we have like only three properties. The names are nice and short. The types are really short. We don't have to write a lot of code, but if you have larger classes with more properties, with more constructor arguments, with larger and more descriptive names, and also larger and more descriptive type names, then this makes up for quite a bit of boilerplate code.

Derick Rethans 3:16

Because you're pretty much having the properties' names in there three times.

Nikita Popov 3:20

Four times even. One is the property name and the declaration, one in the parameter, and then you have to the assignment has to repeat it twice.

Derick Rethans 3:30

You're repeating the property names four times, and the types twice.

Nikita Popov 3:34

Right.

Derick Rethans 3:36

What is the syntax that you're proposing to improve this?

Nikita Popov 3:39

The syntax is to merge the constructor and the property declarations. So you only declare the constructor and you add an extra visibility keyword in front of the normal parameter name. So instead of accepting float x in the constructor, you accept public float x. And what this shorthand syntax does is to also generate the corresponding property. So you're declaring a property public float x. And to also implicitly perform this assignment in the constructor body. So to assign this x equals x, and this is really all it does. So it's just syntax sugar. It's a simple syntactic transformation that we're doing. But that reduces the amount of boilerplate code you have to write for value objects in particular, because for those commonly, you don't really need much more than your properties and the constructor.

Derick Rethans 4:40

Besides public, I suppose you can also use protected and private there as well.

Nikita Popov 4:45

That's right. So you can use all the visibility modifiers. Well, public protected private, static does not really make sense. But if we add other modifiers in the future, then those could be used there as well for example, if we add support for read only properties, then of course, you could also write public readonly float x or something.

Derick Rethans 5:09

The RFC talks about desugaring. How's this implemented? Is this transformation on in the AST, or in another way?

Nikita Popov 5:17

This is not an AST transform, but I would say close enough. So we just generate the corresponding property declarations and assignments in the compiler. If you inspect the AST with an extension like PHP AST, you will see the code as written. So with the public in front of the parameter name, but if you inspect the code in reflection, then it will look as if you declared the property explicitly.

Derick Rethans 5:48

So the RFC talks about a few constraints and what you can and cannot do with those promoted properties. One of the things it talks about is nullability.

Nikita Popov 5:58

Well, we have two different nullability semantics in PHP for historical reasons. One is in parameters, where we say, if you use a type that is not explicitly nullable, but you have a null default value, then we make the type implicitly nullable. While for property types, which are newer, we no longer have this implicit behaviour. So if you want to have a nullable property, you do need to explicitly mark it as nullable. Just using a null default value on will result in an error. And the handling is the same here. So if you want to have a nullable promoted property, you have to mark it as nullable

Derick Rethans 6:43

And you cannot just rely on setting the default to null?

Nikita Popov 6:46

Exactly, but I think it's like detail. And really this could go either way. I just prefer the explicit nullability because this seems like the direction we are going to in the future. I don't know if we will ever remove this implicit behaviour. Maybe not. But I think nowadays explicit one is preferred.

Derick Rethans 7:10

Less magic is better.

Nikita Popov 7:11

Less magic, exactly.

Derick Rethans 7:13

The RFC also has like constraints in there. You can also define a constructor in traits and abstract classes. Can you also use a constructor property promotion there as well?.

Nikita Popov 7:23

In traits? Yes, I mean in traits, using it will be a little bit weird. But there is no reason why it can't work. After all traits can have a constructor that will be used in the using class. And traits can also have properties that get imported. So the same mechanism works there as well. It does not work for abstract constructors or constructors in interfaces. The syntax also implies that you have some assignments inside the body of the constructor, and if we have an abstract constructor, then we could not emit these assignments anywhere. We could support it as a special case, like saying that it only declares the properties but skips those assignments. But I know how often you've used abstract constructors, I probably used them like maybe once or twice in all my time working with PHP. So either they really need extra support in that area.

Derick Rethans 8:25

It would also then introduce an inconsistency were promoted properties in abstract classes or abstract class constructors if that's the thing, would be different from normal class constructor property promotion. How does the inheritance work? Is the working in the same way or is there no specific difference in it?

Nikita Popov 8:44

Based on like discussion feedback, I think inheritance is the largest point of confusion with this syntax. The thing is that does not really have any special interaction with inheritance. So you can just follow this like syntactical transformation it does, which does not have any impact on inheritance. But the thing is, if you just look at the code, and you see you have the parent class defining the constructor, and the child class defining the constructor, and then you're wondering, well, is there some kind of connection between the parameters? The promoted parameters declared in one constructor and the other one? And the answer is simply: No, there isn't. Those have nothing to do with each other. And even more generally, constructors are a bit of a special case where inheritance is concerned. So usually, we say that methods always have to be compatible with the parent method. So the signature has to be compatible, the return type has to be well not match, but be contravariant. And similar for the argument types, but this rule does not apply for the constructor. So the constructor really belongs to a single class, and constructors between parent and child class do not have to be compatible in any way.

Derick Rethans 10:09

Are there any types that you can't use for constructor property promotion?

Nikita Popov 10:14

Just callable. Because callable is not a valid property type. Well, there is one more thing that you can't use a variadic argument. Well, if you write a variadic argument, you write something like int, dot, dot, dot, whatever. But the type you're actually writing is int, because that's the type of each individual argument. But all of that gets collected into an array. So the type of the corresponding property would have to be array. So we would have to do an extra transform that's maybe not super obvious. And so I've left this part out.

Derick Rethans 10:50

And also PHP's type system doesn't support defining an array of integers. It only supports describing an array. At a time we're talking about is, at the end of April, this hasn't gone up for a vote yet. When do you think this will happen?

Nikita Popov 11:05

The RFC will need one small adjustment because the attributes RFC is currently in voting and it very much looks like it's going to be accepted. We will need to also consider support for attributes on the promoted properties. I think the only small question there is, what does the attributes apply to? Because this could apply to the parameter or to the property, or both.

Derick Rethans 11:34

How would you actually set these attributes because from what I understand docblocks, you can only use in front of a method name or a property declaration. How would you define a different attribute for each of the promoted properties?

Nikita Popov 11:48

I believe that the attributes RFC already supports attributes on parameters, so that shouldn't be a problem.

Derick Rethans 11:55

So it allows for setting a specific attribute for each of the arguments coming into the constructor. But that didn't quite answer the question. When do you think we'll be voting on this?

Nikita Popov 12:05

Maybe in a week or so.

Derick Rethans 12:06

By the time this podcast comes out?

Nikita Popov 12:09

Well, we have had a lot of activity recently in PHP internals. So I guess we are one of the few places that benefit from the Coronavirus, because people now have time to work on PHP.

Derick Rethans 12:24

Yeah, I mean, I'm looking at so much extra code now. Interestingly, when going to the RFC, and as a side note, it mentioned somewhere that when defining more properties, the line length goes too long, because you now have this extra keyword in there. And that could benefit from then separating the constructor arguments over multiple lines. And that that raises the point is that you can use a trailing comma in arrays when you call functions, but not in argument lists. And I saw that you've also made another RFC for adding the trailing commas in the parameter lists.

Nikita Popov 12:58

So there's like a super simple RFC, just allow that extra comma. This has actually already been discussed a couple of times in the past, and has not, has been declined that point.

Derick Rethans 13:13

I'm just having a quick look at it. Because this RFC is already voting to see what the current votes are, and it's 58 for and one against.

Nikita Popov 13:21

I think like the main counter argument people have against this kind of trailing comma stuff is, well, doesn't that mean that it encourages writing methods with a lot of parameters, which is a bad style. I don't think it does. And I think that even if you don't have a lot of parameters, it's fairly easy to run into line length limitations, because nowadays like to use expressive long parameter names, and expressive long type names, so even without adding an extra protected in front of all of that, you can really easily get signatures that split across multiple lines. In which case having the trailing comma is nice, mainly because we already write it everywhere else.

Derick Rethans 14:12

Except for in arguments to methods, because you can't.

Nikita Popov 14:17

Well, there are also a couple of other places where you can't. For example, like if you have a class implements, and then implements many interfaces, then you can't put a trailing comma after the last interface. And this is something we could also allow. But I think the relevant distinction there is that this is kind of a freestanding list. Um, it's not wrapped inside brackets, or parentheses. So it kind of looks a little bit weird if you have a trailing comma there, which is possibly also why previous RFC on that simply allowed trailing comma everywhere did not pass.

Derick Rethans 14:58

As I said, it looks likely that will pass.

Nikita Popov 15:01

Yes, I think it's unlikely that we're going to get 13 new no votes.

Derick Rethans 15:07

What I also find interesting is that an RFC that you've mentioned earlier in the episode is that attributes are going to pass as well. At the moment, there's only one no votes there as well, which surprised me because the last time attributes was discussed was very much not going to pass whatsoever.

Nikita Popov 15:27

Yeah, this is an interesting effect. It's hard to say why it happens. Probably, well, part of the reason is just that issues that were raised on previous proposals have been addressed. For example, the last one by Dmitri had the very controversial aspects where it's exposed the AST. The abstract syntax tree representation of the attributes, which has gone from this one, and thus removes one of the contentious issues. But I think another part is just that sometimes it takes multiple proposals to really get an idea through internals. We have this situation pretty commonly that though the first RFC fails, second RFC fails, and then the third one does pass.

Derick Rethans 16:18

It's also it's taken five years or so. And people's opinions might just change about these things.

Nikita Popov 16:23

Exactly. The previous proposals might just have been before their time.

Derick Rethans 16:29

I saw you had made one other tiny RFC, which is the stricter type checks for arithmetic slash bitwise operators. What is that about?

Nikita Popov 16:40

Very simple. So if you're write, well, x minus y, and x is an array. And y is a resource, like what do you expect the outcome to be? There is really no reasonable way that can work. So this RFC proposes to make the arithmetic and the bitwise operators, when working on arrays, when working on objects, and working on resources, simply throw an exception. And the motivation for that was the operator overloading RFC, which has in the meantime been declined. But still, this was a concern raised there that while you can overload operators for objects, but you still get pretty weird behaviour if an overloaded operator is missing, because we currently handle that with just a otice and assuming that the object is equal to one, which is usually not a useful or desired behaviour.

Derick Rethans 17:39

There is of course, one exception where you can still use an arithmetic operator, which is the plus between arrays.

Nikita Popov 17:46

That's right, yeah. So array plus array is similar to an array merge operation. And that one is of course, well defined and remains supported

Derick Rethans 17:55

Whereas things like true divided by 17, although not sensible, it'll continue to work.

Nikita Popov 18:00

Right, that also. Yeah, so because this is simply a much more contentious issue whether, like implicitly treating true as one is a good idea or not. Personally, I know I have written code where I, for example, add up booleans. Just as a count of how often something is true. This is like maybe maybe, style wise it would be better to write an explicit integer cast. But the code is also not really wrong. This may be as a discussion for another time.

Derick Rethans 18:33

As we've said before, the smaller the RFCs, the easier it is to get them passed as well. Alright, Nikita, thanks for taking the time this morning to talk to me about constructor property promotion RFC, and a few others. We'll see whether they get passed for PHP eight.

Nikita Popov 18:48

Thanks for having me Derick, once again.

Derick Rethans 18:52

Thanks for listening to this instalment of PHP internals news, the weekly podcast dedicated to demystifying the development of the PHP language. I maintain a Patreon account for supporters of this podcast, as well as the Xdebug debugging tool. You can sign up for Patreon at https://drck.me/patreon. If you have comments or suggestions, feel free to email them to derick@phpinternals.news. Thank you for listening, and I'll see you next week.

Shortlink

This article has a short URL available: https://drck.me/pin053-fl8

Comments

No comments yet