Parsing Mail with PHP
Many PHP applications need to parse e-mail messages, for example bug trackers and ticket systems that want to allow input by e-mail. For sending e-mail there are already decent implementations, some of which even support multi-part and mixed text/html messages with attachments.
Parsing e-mail is a whole different story, and definitely not an easy task. Because we consider this task important, we decided to add e-mail parsing functionality to the Mail component. We just released an alpha release of this component as a PEAR package, which you should be able to install with the command below. (I am not sure whether you have to add the components channel first; for information on how to do that, see the "PEAR Installer" section in this article.)
pear install components.ez.no/mail-beta
The mail parsing part of the component can currently read e-mail messages from a POP3 server or from a file. A small example that parses e-mail from a POP3 server looks like this (assuming you used the PEAR installer to install the component):
<?php
require_once "ezc/Base/base.php";
function __autoload( $className )
{
    ezcBase::autoload( $className );
}

$pop3 = new ezcMailPop3Transport( "pop3.example.com" );
$pop3->authenticate( "user", "password" );
$set = $pop3->fetchAll();

$parser = new ezcMailParser();
$mails = $parser->parseMail( $set );

foreach ( $mails as $mail )
{
    echo "From: {$mail->from->email}\n";
    echo "To: ";
    foreach ( $mail->to as $to )
    {
        echo "{$to->name} ({$to->email}) ";
    }
    echo "\n";
    echo "Subject: {$mail->subject}\n";
    switch ( get_class( $mail->body ) )
    {
        case 'ezcMailText':
            echo "Text part, ".
                "type={$mail->body->subType}\n--\n";
            echo $mail->body->text;
            echo "\n--\n";
            break;
        case 'ezcMailMultipartMixed':
            echo "Multipart mail\n";
            break;
    }
    echo "\n";
}
?>
The $mails variable now holds an array of ezcMail objects. For more information on how to access the information in the mail classes, please refer to the documentation. Not all of the ezcMailPart descendants document their available properties yet, but this will of course be addressed before the first beta.
In the near future we want to expand the component with an IMAP transport, more authentication mechanisms and add methods that allow you to "reply" to a parsed e-mail message or "forward" one. Those methods then set the correct headers in the e-mail object, including the correct handling of "References" and "In-Reply-To" headers.
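To illustrate what "correct handling" of those two headers involves, here is a minimal sketch in plain PHP. It is not the component's API: the function name buildReplyHeaders() and the array-of-headers input are assumptions made for this example. It follows the convention from RFC 5322: the reply's "In-Reply-To" is the original's "Message-ID", and its "References" is the original's "References" (or, failing that, an "In-Reply-To" holding a single ID) followed by the original's "Message-ID".

```php
<?php
/**
 * Hypothetical helper, not part of the Mail component.
 * Derives the threading headers of a reply from the original's headers,
 * given as an array such as array( 'Message-ID' => '<id@host>', ... ).
 */
function buildReplyHeaders( array $original )
{
    $messageId = isset( $original['Message-ID'] ) ? trim( $original['Message-ID'] ) : '';
    $chain = '';

    if ( !empty( $original['References'] ) )
    {
        // Continue the existing thread.
        $chain = $original['References'];
    }
    elseif ( !empty( $original['In-Reply-To'] )
        && preg_match_all( '/<[^>]+>/', $original['In-Reply-To'], $ids ) === 1 )
    {
        // Fallback: only used when In-Reply-To holds exactly one message ID.
        $chain = $original['In-Reply-To'];
    }

    $reply = array();
    if ( $messageId !== '' )
    {
        $reply['In-Reply-To'] = $messageId;
    }

    // Collapse folding whitespace so the IDs are separated by single spaces.
    $references = trim( preg_replace( '/\s+/', ' ', $chain . ' ' . $messageId ) );
    if ( $references !== '' )
    {
        $reply['References'] = $references;
    }

    return $reply;
}
```

Given an original with Message-ID `<b@example.com>` and References `<a@example.com>`, the sketch yields an In-Reply-To of `<b@example.com>` and a References value listing both IDs in order. A real implementation would also have to deal with the subject prefix and the recipient list, which are left out here.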
Comments
Very cool component :-)
However, you're saying that instead of reading a mail from a POP3 account, one could also use a file... reading the documentation (haven't looked at the actual code yet) does not give me any hint of how to do so - can you perhaps give an example for that?
Currently we only have one for a single file but we want to expand this to allow reading of Unix mbox files too. This is not ready yet though and thus we didn't put the single file reader in the component yet. In http://svn.ez.no/svn/ezcomponents/trunk/Mail/tests/parser/parser_test.php you find the SingleFileSet class which you can use (but change the __construct() for your path). This class will end up as ezcMailMboxTransport later on with support for multiple files in it.
This looks good.
Are there any obvious differences to the MimeDecode class also in PEAR? I currently use MimeDecode so I am wondering if there are any immediate advantages to this.
Also, will this work by supplying a variable, e.g. command line input in a shell script?
@Chris: I believe we support more formats regarding multi parts and return a more structured result. The result of the parseMail() method is the same as if you would create a mail message yourself.
All mail that comes from a transport is also parsed on the fly while reading so that no unnecessary memory is used.
The currently available classes do not yet work on a string only because there is no StringTransport available yet. This is something we could add though.
That sounds great Derick, the memory problem is a major issue with mimeDecode. I have also been looking at the PECL package mailparse, but this isn't a portable option in most cases.
One further thing, have you done much testing with the class using other character sets, particularly when other character sets are used in the subject of an email?
@Chris: We tested some with iso-8859-1 and UTF-8. We use the iconv function currently and I think that should handle most encodings. We tested this with all header fields, including the Subject header.
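As a rough illustration of this kind of header decoding in plain PHP, the sketch below uses iconv_mime_decode() from PHP's iconv extension to turn a MIME encoded-word header value into UTF-8. This shows the general technique only and is not the component's own code; the helper name decodeHeaderValue() is made up for this example.

```php
<?php
/**
 * Hypothetical helper, not part of the Mail component.
 * Decodes MIME encoded-words (such as "=?ISO-8859-1?Q?...?=") in a header
 * value and converts the result to the requested character set.
 */
function decodeHeaderValue( $raw, $charset = 'UTF-8' )
{
    // CONTINUE_ON_ERROR makes iconv skip over malformed encoded-words
    // instead of giving up on the whole header.
    $decoded = iconv_mime_decode( $raw, ICONV_MIME_DECODE_CONTINUE_ON_ERROR, $charset );

    // Fall back to the raw value if decoding failed entirely.
    return $decoded === false ? $raw : $decoded;
}
```

Passing a subject such as `=?UTF-8?B?SGVsbG8=?=` should give back `Hello`, and values without encoded-words are returned unchanged.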
I'm currently using PEAR::Mail_IMAPv2. Are there arguments to use this class instead? What can you say about this? Thanks.
@Florent: One of the things is that we don't rely on any PHP extension for our component. It also seems that the parsed message is a bit more structured, and we support multiple transports (POP3 and single mail file for now; IMAP, mailbox and string are following). Our license is also slightly friendlier as we don't have the advertising clause which exists in the old BSD license (and also in the PHP license that this package is under).
OK Derick, I'm convinced ;-) I've planned to use a string instead of POP3 in my application, so I'm glad to see that there will be a StringTransport. Do you have an idea of the release date?
We should have a beta out just after easter.
This looks really great! I've been desperately seeking something like this.
What's the current status? Has it made it to beta yet?
It has been released as beta, and as release candidate already. We will release the final version on Monday.
An excellent library ez :) thumbs up for the work.
I just want to know how I can change the POP server port number. I want to fetch e-mail from a Gmail account, which operates on pop.gmail.com port 995. How can I change that?
Shortlink
This article has a short URL available: https://drck.me/pmw-php-4qm