Using PHP IMAP functions to download email
This post looks at how to connect to a POP or IMAP mailbox using PHP's IMAP mail
functions and retrieve the number of messages in the mailbox, the message headers and
body, and then to delete the message.

IMAP functions not installed?
If you get the error message "Fatal error: Call to undefined function imap_open()" when
trying to connect to the mailbox then the IMAP extension is not installed in your PHP
installation. I detailed how to install it on CentOS in an earlier post today. Other Linux
distributions will have similar methods for installing the IMAP extension.

Connecting to the mailbox
To connect to a POP3 mail server, do the following, where $server contains the server to
connect to, $login and $password the login name and password used to access the
mailbox:

$connection = imap_open('{' . $server . ':110/pop3}', $login, $password);

If it's an IMAP server you can leave out the port number and slash part:

$connection = imap_open('{' . $server . '}', $login, $password);

The value returned will either be an IMAP resource or boolean false in the event of an
error.

Note that if you're getting "Warning: imap_open(): Couldn't open stream" type of errors
when connecting you may need to append the /notls flag after the server e.g.:

$connection = imap_open('{' . $server . '/notls}', $login, $password);

I found that I needed to do this when moving a script from CentOS with PHP 5.1.6 to
Debian with PHP 5.2.6. On Debian it would give the error unless I had the "notls" flag.
Obviously if you need tls or ssl then you'd add those flags instead of notls.
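
The server strings above all follow the same "{host:port/flags}" pattern, so they can be built by a small helper. The following is only a sketch: build_mailbox_string() is a hypothetical function name of my own and is not part of the IMAP extension.

```php
<?php
// Hypothetical helper (not part of the IMAP extension): builds the
// "{host:port/flags}mailbox" string that imap_open() expects.
function build_mailbox_string($host, $port = null, array $flags = array(), $mailbox = '')
{
    $spec = $host;
    if ($port !== null) {
        $spec .= ':' . (int) $port;
    }
    foreach ($flags as $flag) {
        $spec .= '/' . $flag;
    }
    return '{' . $spec . '}' . $mailbox;
}

// POP3 on port 110, as in the first example above.
echo build_mailbox_string('mail.example.com', 110, array('pop3')), "\n";

// IMAP on the default port with the notls flag.
echo build_mailbox_string('mail.example.com', null, array('notls')), "\n";
```

The returned string is passed as the first argument to imap_open(). Building it by concatenation avoids relying on variable interpolation, which does not happen inside single quoted PHP strings.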

Get the number of messages
To get the number of messages in the mailbox use the imap_num_msg() function like so:

$count = imap_num_msg($connection);

Loop through and retrieve the messages
Loop from 1 to the number of messages, calling imap_headerinfo() to get the message
headers as an object and imap_body() to get the raw body.

for($i = 1; $i <= $count; $i++) {
    $header = imap_headerinfo($connection, $i);
    $raw_body = imap_body($connection, $i);
}

Note that the imap_body() function returns the raw body of the email message, which
also contains all the MIME parts and attachments (if present), not just the plain text
message. I will deal with looking at the parts of an email message in next Monday's PHP
post.

Header object
The value returned from the imap_headerinfo() function is an object containing a number
of properties. To show the values available, an example of the output from print_r is as
follows:

stdClass Object
(
    [date] => Wed, 4 Feb 2009 22:37:42 +1300
    [Date] => Wed, 4 Feb 2009 22:37:42 +1300
    [subject] => Fwd: another test
    [Subject] => Fwd: another test
    [in_reply_to] => <59f89e00902040137vb73ed1ep5f870dafe02f26cf@mail.example.com>
    [message_id] => <59f89e00902040137s16c317b6oa4658a4d2cc64c3c@mail.example.com>
    [references] => <59f89e00902040137vb73ed1ep5f870dafe02f26cf@mail.example.com>
    [toaddress] => john@example.com
    [to] => Array
        (
            [0] => stdClass Object
                (
                    [mailbox] => john
                    [host] => example.com
                )

        )

    [fromaddress] => Chris Hope
    [from] => Array
        (
            [0] => stdClass Object
                (
                    [personal] => Chris Hope
                    [mailbox] => chris
                    [host] => example.com
                )

        )

    [reply_toaddress] => Chris Hope
    [reply_to] => Array
        (
            [0] => stdClass Object
                (
                    [personal] => Chris Hope
                    [mailbox] => chris
                    [host] => example.com
                )

        )

    [senderaddress] => Chris Hope
    [sender] => Array
        (
            [0] => stdClass Object
                (
                    [personal] => Chris Hope
                    [mailbox] => chris
                    [host] => example.com
                )

        )

    [Recent] => N
    [Unseen] =>
    [Flagged] =>
    [Answered] =>
    [Deleted] =>
    [Draft] =>
    [Msgno] =>    20
    [MailDate] => 4-Feb-2009 22:37:42 +1300
    [Size] => 3111
    [udate] => 1233740262
)
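
In the output above the fromaddress property holds the sender's name, while the mailbox and host are in the from array. If you need a plain email address it can be put together from those two properties. This is a minimal sketch; header_entry_address() is a hypothetical helper name, and the $entry object is a hand-built stand-in for one element of $header->from.

```php
<?php
// Hypothetical helper: turns one entry of the "from", "to", "reply_to" or
// "sender" arrays into a lower case "mailbox@host" address.
function header_entry_address($entry)
{
    return strtolower($entry->mailbox . '@' . $entry->host);
}

// Stand-in for one element of $header->from, shaped like the output above.
$entry = new stdClass();
$entry->personal = 'Chris Hope';
$entry->mailbox = 'chris';
$entry->host = 'example.com';

echo header_entry_address($entry), "\n";
```

With a real header object you would call it as header_entry_address($header->from[0]).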


Using PHP IMAP functions to download
email from Gmail
A couple of weeks back I posted how to use the PHP IMAP functions to download email
from an IMAP server. This post looks at how to do the same but downloading IMAP mail
from Gmail, which you need to connect to on a different port and use SSL. Earlier today I
posted how to enable IMAP mail access in Gmail so you'll need to ensure that's done
first.

How to connect to a normal IMAP server
As a quick recap, this is how you connect to an IMAP mail server using PHP, where
$server is the server string (the IMAP mail server name in {curly brackets}, including any
:port and /flags), $login is your login name and $password is your password:

$connection = imap_open($server, $login, $password);

How to connect to Gmail's IMAP server
Gmail requires the IMAP connection to be made on port 993 and it must be an SSL
connection. The hostname is imap.gmail.com. Therefore, to connect to Gmail IMAP with
the PHP IMAP functions you would do this:

$server = '{imap.gmail.com:993/ssl}';
$connection = imap_open($server, $login, $password);

Note that your login name for Gmail includes the domain name as well.

If you have an @gmail.com email address then your login name would be e.g.
'my-email-address@gmail.com'

If you use Google Apps and have your domain mail hosted with Gmail then it will be
your account email address e.g. 'chris@example.com'

IMAP supports the concept of multiple mailboxes. In Gmail your labels become
mailboxes in your IMAP connection. When you first connect using the above you'll be
looking at the INBOX. It is possible to then change to one of your other mailboxes and
read mail from there.

How to download the email using IMAP from Gmail

At this stage it's the same as downloading mail from any IMAP server, so refer to my
other post which shows how to use the PHP IMAP functions to download email from an
IMAP server.




Update March 12th 2009 - novalidate-cert flag
I recently moved from CentOS 5 with PHP 5.1.6 to Debian 5 with PHP 5.2.6 in my
development environment (and will do the same in production next week). The same test
script would no longer connect to Gmail, instead failing with:

Warning: imap_open(): Couldn't open stream {imap.gmail.com:993/ssl}

After testing various flags I discovered I needed to add the 'novalidate-cert' flag, which
tells the IMAP library not to validate the server's certificate, so the server string looks
like this:

$server = '{imap.gmail.com:993/ssl/novalidate-cert}';

The next post in this series will look at how to open other mailboxes with PHP IMAP,
using Gmail's mailboxes and labels as an example.

Update March 13th 2009 - firewall issues

I received an email from Jesse Fisher who was having issues connecting to Gmail with
the PHP IMAP functions and was getting the error message "Can't connect to
gmail-imap.l.google.com,993: Connection refused".

It turned out that port 993 was blocked by the firewall on the server that the PHP script
was running on. Once the hosting provider opened that port the issue went away and
Jesse was able to connect.

If you are getting an error like this when connecting to a mail server using the PHP IMAP
functions, check the outbound firewall settings on the server the script is running on.
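
One way to tell a blocked port apart from a login or certificate problem is to try a plain TCP connection first. The sketch below uses PHP's fsockopen() for this; can_reach() is a hypothetical helper name, and it only checks that the port accepts connections, not that IMAP login will work.

```php
<?php
// Hypothetical helper: returns true if a TCP connection to $host:$port can be
// opened within $timeout seconds. It only tests reachability, not IMAP login.
function can_reach($host, $port, $timeout = 5)
{
    $errno = 0;
    $errstr = '';
    // The @ suppresses the warning fsockopen() raises when it cannot connect.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Example call, commented out so that including this file makes no network request:
// if (!can_reach('imap.gmail.com', 993)) {
//     echo "Could not reach port 993; check outbound firewall rules\n";
// }
```

If this returns false from the web server but the same check works from another machine, the outbound firewall is a likely cause.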


Deleting messages with PHP IMAP and
Gmail
My last PHP post looked at how to delete email messages using PHP's IMAP functions.
This is straightforward on a regular POP3 or IMAP mail server, but deleting a message
when connected to Gmail just removes it from the inbox and it's still available in "All
Mail" rather than being moved into the trash. This post looks at how to move an email
into Gmail's trash with PHP IMAP.

Using imap_delete will simply remove the message from the inbox but it will still be
available in "All Mail" and won't be in the trash:

imap_delete($connection, $msgno);

So instead the email message must be moved into the trash using the imap_mail_move
function. The second parameter for this function takes a range/list and not a single
message number, so to move $msgno you need to set the range as "$msgno:$msgno" like
so:

imap_mail_move($connection, "$msgno:$msgno", '[Google Mail]/Trash');

The actual name of the trash folder will vary depending on the language settings. For
example I have Gmail set to use "English (UK)" so the trash folder is actually called
"Bin", and moving messages into it requires this instead:

imap_mail_move($connection, "$msgno:$msgno", '[Google Mail]/Bin');

I've also seen people refer to the bracketed part as [Gmail] instead of [Google Mail], so it
would pay to use the imap_list() function to work out exactly what the trash folder is
called and assign it to a variable. An example of doing this can be found in my "Open a
mailbox other than the INBOX with PHP IMAP" post.
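
As a rough illustration of that idea, the sketch below picks the trash folder out of a list of mailbox names. find_trash_mailbox() is a hypothetical helper, the candidate names "Trash" and "Bin" are only the two mentioned in this post, and the sample list is hand-written in the shape I would expect imap_list() to return (server part in curly brackets followed by the mailbox name, with "/" as the separator).

```php
<?php
// Hypothetical helper: given the names returned by imap_list(), return the
// first mailbox whose last path segment is a known trash name, or null.
// The candidate names are assumptions; extend the list for other languages.
function find_trash_mailbox(array $mailboxes, array $candidates = array('Trash', 'Bin'))
{
    foreach ($mailboxes as $name) {
        // Strip the leading "{server...}" part, then take the last segment.
        $closing = strpos($name, '}');
        $path = ($closing === false) ? $name : substr($name, $closing + 1);
        $segments = explode('/', $path);
        $last = end($segments);
        if (in_array($last, $candidates, true)) {
            return $path;
        }
    }
    return null;
}

// Sample data only; a real list comes from the mail server.
$mailboxes = array(
    '{imap.gmail.com:993/ssl}INBOX',
    '{imap.gmail.com:993/ssl}[Google Mail]/All Mail',
    '{imap.gmail.com:993/ssl}[Google Mail]/Bin',
);

$trash = find_trash_mailbox($mailboxes);
echo $trash, "\n";
```

The value in $trash can then be used as the third argument to imap_mail_move() in place of the hard coded folder name.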


How to enable IMAP access for a Gmail
account
Gmail has a web based interface but it's also possible to access your Gmail mailbox using
IMAP or POP and use a different (offline) application, such as Microsoft Outlook or
Mozilla Thunderbird to access your Gmail. This post looks at how to enable IMAP
access for Gmail. The same settings page also allows you to enable POP access but
you're better off using IMAP.

This post is included as part of a series I have been writing about getting reports from
Google Analytics and downloading email and attachments using PHP. This particular
post has been written as a supplement to another post that will appear later today about
how to download mail using PHP's IMAP mail functions from Gmail.

After logging into Gmail click the "Settings" link in the top right set of navigation
followed by the "Forwarding and POP/IMAP" tab. These are shown in the screenshot
below highlighted with the red arrows.

By default IMAP is not enabled. I have highlighted the "IMAP Access" section in the
above screenshot with a red box. Simply select the "Enable IMAP" option and then click
the "Save Changes" button. You will now be able to access Gmail via IMAP.

Note that you can allow POP access to your Gmail using the "POP Download" settings
and selecting one of the "Enable POP..." options. However it is better to use IMAP.

The settings you need to connect to your IMAP mailbox are as follows:

IMAP Server: imap.gmail.com
Use SSL: yes
Port: 993
Login name: Your full email address e.g. my-email-address@gmail.com

The next post will look at how to connect to Gmail using IMAP with PHP.

Extracting attachments from an email
message using PHP IMAP functions
This post is part of an ongoing series which aims to show how to extract data from
Google Analytics using its scheduled email reports system. I have already looked at how
to send Google Analytics data by email, and how to use the PHP IMAP functions to
download email. I will also look at using other PHP libraries to download email and
attachments, but for now this post looks at how to extract email attachments using the
PHP IMAP functions.

There are other ways to do this - the method presented here is just one of several ways of
getting the attachments. I will look at other ways to get attachments from email messages
in later posts.

Getting the message structure
After logging in to the IMAP or POP mail server (detailed in my last post in this series
about how to use the PHP IMAP functions to download email) use the
imap_fetchstructure() function to get the structure of the message.

The following code snippet example uses the connection stored in $connection to
download the message structure for the $message_number message in the mailbox (this is
a 1 based index of the messages in the mailbox):

$structure = imap_fetchstructure($connection, $message_number);

Assuming all went well we now have an object containing a lot of information about the
message. The following is output from print_r for a message sent from Google Analytics
containing tab-separated data in an attachment:

stdClass Object
(
    [type] => 1
    [encoding] => 0
    [ifsubtype] => 1
    [subtype] => MIXED
    [ifdescription] => 0
    [ifid] => 0
    [ifdisposition] => 0
    [ifdparameters] => 0
    [ifparameters] => 1
    [parameters] => Array
        (
            [0] => stdClass Object
                (
                    [attribute] => boundary
                    [value] => 00221532c8aca27cf00462632bb7
                )
        )
    [parts] => Array
        (
            [0] => stdClass Object
                (
                    [type] => 0
                    [encoding] => 0
                    [ifsubtype] => 1
                    [subtype] => PLAIN
                    [ifdescription] => 0
                    [ifid] => 0
                    [lines] => 11
                    [bytes] => 737
                    [ifdisposition] => 0
                    [ifdparameters] => 0
                    [ifparameters] => 1
                    [parameters] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [attribute] => charset
                                    [value] => ISO-8859-1
                                )

                            [1] => stdClass Object
                                (
                                    [attribute] => format
                                    [value] => flowed
                                )

                            [2] => stdClass Object
                                (
                                    [attribute] => delsp
                                    [value] => yes
                                )
                        )
                )
            [1] => stdClass Object
                (
                    [type] => 0
                    [encoding] => 3
                    [ifsubtype] => 1
                    [subtype] => TAB-SEPARATED-VALUES
                    [ifdescription] => 0
                    [ifid] => 0
                    [lines] => 111
                    [bytes] => 8674
                    [ifdisposition] => 1
                    [disposition] => attachment
                    [ifdparameters] => 1
                    [dparameters] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [attribute] => filename
                                    [value] => Analytics_www.electrictoolbox.com_20090108-20090207.tsv
                                )
                        )
                    [ifparameters] => 1
                    [parameters] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [attribute] => charset
                                    [value] => US-ASCII
                                )

                            [1] => stdClass Object
                                (
                                    [attribute] => name
                                    [value] => Analytics_www.electrictoolbox.com_20090108-20090207.tsv
                                )
                        )
                )
        )
)

Working out and getting the attachments

The structure above isn't the easiest thing to extract the information we need from. We
need to loop through [parts] and then through each part's [parameters] and [dparameters]
to get the name and filename. If a part has either of them it is treated as an attachment
and downloaded using imap_fetchbody(); if it has neither then it's not an attachment.

This is achieved with the following code, which assigns the information to an array called
$attachments. The reason 1 is added to $i in the call to imap_fetchbody() is that the
[parts] array is zero-based but part numbers in the IMAP functions are one-based.

$attachments = array();
if(isset($structure->parts) && count($structure->parts)) {

    for($i = 0; $i < count($structure->parts); $i++) {

        $attachments[$i] = array(
            'is_attachment' => false,
            'filename' => '',
            'name' => '',
            'attachment' => ''
        );

        if($structure->parts[$i]->ifdparameters) {
            foreach($structure->parts[$i]->dparameters as $object) {
                if(strtolower($object->attribute) == 'filename') {
                    $attachments[$i]['is_attachment'] = true;
                    $attachments[$i]['filename'] = $object->value;
                }
            }
        }

        if($structure->parts[$i]->ifparameters) {
            foreach($structure->parts[$i]->parameters as $object) {
                if(strtolower($object->attribute) == 'name') {
                    $attachments[$i]['is_attachment'] = true;
                    $attachments[$i]['name'] = $object->value;
                }
            }
        }

        if($attachments[$i]['is_attachment']) {
            $attachments[$i]['attachment'] = imap_fetchbody($connection, $message_number, $i+1);
            if($structure->parts[$i]->encoding == 3) { // 3 = BASE64
                $attachments[$i]['attachment'] = base64_decode($attachments[$i]['attachment']);
            }
            elseif($structure->parts[$i]->encoding == 4) { // 4 = QUOTED-PRINTABLE
                $attachments[$i]['attachment'] = quoted_printable_decode($attachments[$i]['attachment']);
            }
        }
    }
}

The end result of the above code on our example email is the following, with the data
truncated for the actual attachment:

Array
(
    [0] => Array
        (
            [is_attachment] =>
            [filename] =>
            [name] =>
            [attachment] =>
        )

    [1] => Array
        (
            [is_attachment] => 1
            [filename] => Analytics_www.electrictoolbox.com_20090108-20090207.tsv
            [name] => Analytics_www.electrictoolbox.com_20090108-20090207.tsv
            [attachment] => ...
        )
)

You can now loop through the $attachments array looking for the appropriate filename to
do whatever processing you need to it.
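
The code above only looks at the top level of [parts]. Some messages nest parts inside other parts, in which case the nested ones have section numbers like "1.2". The following is a sketch of a recursive version that only works out the section number and filename of each attachment and leaves the downloading to you. list_attachment_parts() is a hypothetical helper, the sample structure is hand-built with json_decode() to mimic the shape of the imap_fetchstructure() output, and I am assuming plain nested multipart messages (embedded forwarded messages may be numbered differently).

```php
<?php
// Hypothetical helper: walks a structure object shaped like the one returned
// by imap_fetchstructure() and returns a list of section number / filename
// pairs for every part that has a "filename" or "name" parameter.
function list_attachment_parts($structure, $prefix = '')
{
    $found = array();
    if (empty($structure->parts)) {
        return $found;
    }
    foreach ($structure->parts as $index => $part) {
        $section = $prefix . ($index + 1); // IMAP sections are one-based
        $filename = '';
        if (!empty($part->ifdparameters)) {
            foreach ($part->dparameters as $param) {
                if (strtolower($param->attribute) == 'filename') {
                    $filename = $param->value;
                }
            }
        }
        if ($filename === '' && !empty($part->ifparameters)) {
            foreach ($part->parameters as $param) {
                if (strtolower($param->attribute) == 'name') {
                    $filename = $param->value;
                }
            }
        }
        if ($filename !== '') {
            $found[] = array('section' => $section, 'filename' => $filename);
        }
        // Recurse into nested parts, e.g. section "1.2".
        $found = array_merge($found, list_attachment_parts($part, $section . '.'));
    }
    return $found;
}

// Hand-built stand-in for a message with a text part and one attachment.
$structure = json_decode('{"parts":[
    {"ifdparameters":0,"ifparameters":1,
     "parameters":[{"attribute":"charset","value":"ISO-8859-1"}]},
    {"ifdparameters":1,"ifparameters":0,
     "dparameters":[{"attribute":"filename","value":"report.tsv"}]}
]}');

$parts = list_attachment_parts($structure);
print_r($parts);
```

Each section value can be passed as the third argument to imap_fetchbody(), followed by the same base64 or quoted-printable decoding as in the code above.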


PHP IMAP: Looping through messages to
find a specific subject
This post is part of a series about downloading email using the PHP IMAP functions. The
ultimate goal of the series is to show how to export Google Analytics data by email, and
then use PHP to connect to the mail server, find and download the appropriate message
and then load the data into a database. This post looks at how to look for a specific email
message using the PHP IMAP functions.

Looping through the messages
The most obvious solution is to loop through the messages, do a string comparison on
the sender, subject etc. and then process the message if it matches the criteria.

Here's how you would do this:

$connection = imap_open($server, $login, $password);
$count = imap_num_msg($connection);
for($msgno = 1; $msgno <= $count; $msgno++) {

     $headers = imap_headerinfo($connection, $msgno);
     if( ... matching criteria here ... ) {
         call_a_function_to_do_something( ... parameters ... );
     }

}

The first line connects to the mail server; the second works out how many messages there
are in the mailbox. Then the script loops through each message, grabbing a copy of the
headers and then doing something if the search criteria are matched.

For example, if you were looking for a message from "chris@example.com" with a
subject "this is a test", you could make the if() statement like this:

$from = $headers->from[0]->mailbox . '@' . $headers->from[0]->host;
if(strtolower($headers->subject) == 'this is a test'
    && strtolower($from) == 'chris@example.com') {

The sender address is built from the mailbox and host properties because fromaddress
can also contain the sender's name. The string comparisons are done using strtolower()
and a lowercase literal to ensure there are no case sensitivity issues.

You could also use regular expressions. The ultimate aim of these posts is to download
the Google Analytics attachments and an example subject is something along these lines:
"Analytics www.electrictoolbox.com 20090202-20090304 (Top Content)"

To extract the domain, from date, to date and the customisable subject you can do this:

preg_match('/Analytics (.*) (\d{8})\-(\d{8}) \((.*)\)/i',
    $headers->subject, $matches);

$matches is populated with the matches from the regular expression, and in the case of
the example above would contain this:

Array
(
    [0] =>    Analytics www.electrictoolbox.com 20090202-20090304 (Top Content)
    [1] =>    www.electrictoolbox.com
    [2] =>    20090202
    [3] =>    20090304
    [4] =>    Top Content
)
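
If you need this in more than one place, the regular expression can be wrapped in a small function that returns named values. This is only a sketch: parse_analytics_subject() is a hypothetical helper, and the pattern is my own slightly stricter variation (anchored at both ends, with no spaces allowed in the domain) rather than the exact one above.

```php
<?php
// Hypothetical helper: splits a Google Analytics report subject into its
// parts, or returns null when the subject does not match the pattern.
function parse_analytics_subject($subject)
{
    $pattern = '/^Analytics (\S+) (\d{8})-(\d{8}) \((.*)\)$/i';
    if (!preg_match($pattern, $subject, $matches)) {
        return null;
    }
    return array(
        'domain' => $matches[1],
        'from'   => $matches[2],
        'to'     => $matches[3],
        'report' => $matches[4],
    );
}

$info = parse_analytics_subject(
    'Analytics www.electrictoolbox.com 20090202-20090304 (Top Content)'
);
print_r($info);
```

Returning null for subjects that don't match makes it easy to skip unrelated messages inside the loop.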

Using the imap_search function
imap_search is a powerful searching function which saves you from having to loop
through the messages like in the above examples. Pass it the connection and then some
criteria like so to match the above email:

$result = imap_search($connection,
    'SUBJECT "Analytics www.electrictoolbox.com" FROM "chris@example.com"');

The way this works is that the criteria are keywords such as SUBJECT and FROM, and the
value to match goes in a double quoted string after the keyword. The value must be in
double quotes (not single quotes) otherwise it will not work, so it is easiest to put the
whole criteria string in single quotes in PHP. A full list of criteria can be found on the
PHP imap_search manual page.

Matches are done in a case-insensitive fashion, and will match anywhere in the string. So
in the above example 'SUBJECT "www.electrictoolbox.com"', 'SUBJECT "Analytics
www.electrictoolbox.com"' etc would match the full subject "Analytics
www.electrictoolbox.com 20090202-20090304 (Top Content)". You may then need to do
further validation on the subject etc before processing the message.

The function returns an array of message numbers, or false if nothing matched, which can
then be looped through and the messages processed as in the following example:

if($result !== false) {
    foreach($result as $msgno) {
        $headers = imap_headerinfo($connection, $msgno);
        if( ... any optional additional matching criteria here ... ) {
            call_a_function_to_do_something( ... parameters ... );
        }
    }
}
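
Because the quoting of the criteria string is easy to get wrong, it can help to build it from an array. The sketch below does that; build_search_criteria() is a hypothetical helper, and dropping any double quotes found inside a value is a simplifying assumption of mine rather than anything the IMAP functions require.

```php
<?php
// Hypothetical helper: builds an imap_search() criteria string from
// keyword => value pairs, wrapping every value in double quotes.
function build_search_criteria(array $criteria)
{
    $parts = array();
    foreach ($criteria as $keyword => $value) {
        // Assumption: drop embedded double quotes rather than escape them.
        $value = str_replace('"', '', $value);
        $parts[] = strtoupper($keyword) . ' "' . $value . '"';
    }
    return implode(' ', $parts);
}

echo build_search_criteria(array(
    'subject' => 'Analytics www.electrictoolbox.com',
    'from'    => 'chris@example.com',
)), "\n";
```

The resulting string is passed as the second argument to imap_search().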


Sending email with Zend_Mail
The Zend Framework is a PHP framework with many individual components that do not
require you to use the whole framework. In this post I will look at how to send an email
using the Zend Framework's Zend_Mail component.

Before using any component from the Zend Framework you need to add the path of the
framework to your include path. This is because the various components require other
library files and assume the "Zend" directory is already in the include path.

Basic usage

Having done this, the example below shows the basic usage of sending an email using
Zend_Mail:

require_once('Zend/Mail.php');

$mail = new Zend_Mail();
$mail->setBodyText('This is an example message body');
$mail->setFrom('chris@example.com', 'Chris Hope');
$mail->addTo('john@example.com', 'John Smith');
$mail->setSubject('This is an example subject');
$mail->send();

Setting the HTML body text
If you want to add an HTML version of the message body you can do the following.

$mail->setBodyHtml($html);

Note that you don't have to have both an HTML body and a plain text body but you must
specify at least one of them. If you don't you'll get a nasty error message along these
lines:

Fatal error: Uncaught exception 'Zend_Mail_Transport_Exception' with message 'No body specified' in /path/to/Zend/Mail/Transport/Abstract.php:284
Stack trace:
#0 /path/to/Zend/Mail/Transport/Abstract.php(313): Zend_Mail_Transport_Abstract->_buildBody()
#1 /path/to/Zend/Mail.php(720): Zend_Mail_Transport_Abstract->send(Object(Zend_Mail))
#2 /path/to/test_script.php(15): Zend_Mail->send()
#3 {main} thrown in /path/to/Zend/Mail/Transport/Abstract.php on line 284

To, CC and BCC recipients
You can also specify multiple "to" recipients by calling the addTo() method multiple
times:

$mail->addTo('john@example.com', 'John Smith');
$mail->addTo('jane@example.com', 'Jane Doe');

And also CC and BCC recipients like so:

$mail->addCc('john@example.com', 'John Smith');
$mail->addBcc('jane@example.com');

The addTo, setFrom and addCc methods all take an optional second parameter in which you
can specify the person's name, the first parameter being their email address. The
addBcc method only has the email address parameter because the name is not set in the
actual email for Bcc recipients.

Chaining it all together
It's possible to chain the whole call together and send an email as follows instead.
Whether to use this style comes down to personal preference:

$mail = new Zend_Mail();
$mail->setBodyText('this is a test')
    ->setFrom('chris@example.com', 'Chris Hope')
    ->addTo('chris@example.com', 'Chris Hope')
    ->setSubject('This is a test subject')
    ->send();

								