Three Easy Steps
Step 1. Use a bounce box...
The first step in cleaning your list is to trap bounced
messages in a central location. We suggest that you create a
"bounce box". A bounce box is a dedicated e-mail account
that is set up to trap returned messages, e.g. bounce@yourdomain.com.
To be sure that returned messages find their way to your bounce box,
you must understand how these messages are routed by SMTP servers.
When a message is submitted to an SMTP server it is tagged with
a reverse-path. The reverse-path is specified by the sending
application with the MAIL FROM: command as outlined in RFC 821, the
SMTP specification. The reverse-path is the path the server should
use to communicate with the original sender of the message, and
therefore the reverse-path is typically the e-mail address of the
sender (the from address).
The SMTP server stores the reverse-path internally, not in the
actual message, and forwards it with the message through any relay
servers as necessary until the message encounters an error or
reaches its destination. Since the reverse-path is not recorded in
the actual message, it is typical to add a From: header to the
e-mail message which contains the address of the sender and an
optional friendly name, e.g. "Joe Sender" <joe.sender@domain.com>.
Mail readers use the From: header to display who a message is
from.
It is very important to understand that the reverse-path and
the address in the From: header need not be the same.
Therefore it is possible to send a message which will be displayed
by mail readers as coming from joe.sender@domain.com, but has a
reverse-path of some_other_address@domain.com.
Once you understand the difference between the reverse-path and
the From: header, and the roles they play, you are on your way to
building messages that will be displayed in a friendly manner if
delivered, or will be returned to your centralized bounce box if
there is a failure.
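To make the distinction concrete, here is a minimal sketch in Python using the standard smtplib and email modules. It is an illustration only, not the article's downloadable sample; the function names build_message and send, and the default host and port, are assumptions made for this example.

```python
import smtplib
from email.message import EmailMessage
from email.utils import formataddr


def build_message(from_name, from_addr, to_addr, subject, body):
    """Build a message whose From: header carries the friendly sender."""
    msg = EmailMessage()
    # The From: header is what mail readers display.
    msg["From"] = formataddr((from_name, from_addr))
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, reverse_path, to_addr, host="localhost", port=25):
    """Send msg, passing reverse_path to the server as MAIL FROM.

    host and port are placeholders; point them at your own SMTP server.
    """
    with smtplib.SMTP(host, port) as server:
        # from_addr is used for the MAIL FROM command (the reverse-path).
        # It is not copied into the message headers.
        server.send_message(msg, from_addr=reverse_path, to_addrs=[to_addr])
```

Calling send(msg, "some_other_address@domain.com", "jane.recipient@domain.com") on a message built with a From: of joe.sender@domain.com would display Joe Sender in the mail reader while directing any bounce to the other address.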
Step 2. Add custom data to bounced messages...
This step requires that your mail server is capable of being
configured to use a wildcard address. In other words, it
needs to be able to route all mail to bounce*@yourdomain.com to one
specific account such as bounce@yourdomain.com. If your mail
server does not support wildcard addresses, you can accomplish the
same thing by using a "catch-all" box and a dedicated domain.
You can then append custom data to the end of the account name
portion of the reverse-path and it will still be delivered to the
bounce@yourdomain.com account. For example, suppose each e-mail
address in your database is identified by a unique numerical id.
You can then encode this id into your bounce address. If the
recipient address is jane.recipient@domain.com and the id of this
address in your database is 1063, you could build an address such as
bounce_1063@yourdomain.com.
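As a small sketch of that encoding step, the following Python helper builds the bounce address from a database id. The function name bounce_address and its default arguments are assumptions for illustration; only the bounce_<id>@yourdomain.com format comes from the text above.

```python
def bounce_address(address_id, domain="yourdomain.com", prefix="bounce_"):
    """Return the reverse-path that encodes a database id.

    address_id is assumed to be a non-negative integer key.
    """
    if not isinstance(address_id, int) or address_id < 0:
        raise ValueError("address_id must be a non-negative integer")
    return f"{prefix}{address_id}@{domain}"
```

For Jane's record, bounce_address(1063) gives bounce_1063@yourdomain.com.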
You can then send a message to jane.recipient@domain.com and
specify bounce_1063@yourdomain.com as the reverse-path by passing that
address to the SMTP server with the MAIL FROM command, e.g.
MAIL FROM:<bounce_1063@yourdomain.com>. To provide a friendly
"from" name or address for Jane's mail reader to display, you can
add a From: header to the message, e.g.
From: "Joe Sender" <joe.sender@domain.com>.
The sample at the end of this article shows how easily this can be done.
If the message is delivered successfully, Jane's mail reader
will display it as coming from Joe Sender. If for some
reason the message is undeliverable, an "undeliverable mail"
notification message will be sent to bounce_1063@yourdomain.com.
Since your mail server has been instructed to deliver all messages
for bounce*@yourdomain.com to bounce@yourdomain.com, these returned
messages should now land in your bounce box.
Additionally, since returned messages are sent to the
address specified by their reverse-path, each of these messages
should have your custom bounce address in the To: header. In
other words, each of the messages in the bounce box will be
addressed to bounce_<id>@yourdomain.com, where <id> represents the
id of the e-mail address in your database which is related to the
bounce. Our testing has indicated, however, that some mail
servers use the From: address of the original message as the To:
address of the resulting bounce. This is not what the RFC
calls for, but we have a fix for that too.
If the To: header address does not begin with bounce_, you can
scan the message's "Received" headers and find your bounce address
there. The sample code shows you how this is done.
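The lookup described above can be sketched in Python as follows. This is a hedged illustration rather than the downloadable sample: the function name find_address_id and the regular expression are assumptions, and the pattern is tied to the bounce_<id>@yourdomain.com format used in this article.

```python
import re
from email import message_from_string

# Matches bounce_<id>@yourdomain.com and captures the numeric id.
BOUNCE_RE = re.compile(r"bounce_(\d+)@yourdomain\.com", re.IGNORECASE)


def find_address_id(raw_message):
    """Return the database id encoded in a bounce, or None if not found.

    The To: header is checked first; if it does not hold a bounce
    address, each Received header is scanned as a fallback.
    """
    msg = message_from_string(raw_message)
    candidates = [msg.get("To", "")] + msg.get_all("Received", [])
    for value in candidates:
        match = BOUNCE_RE.search(str(value))
        if match:
            return int(match.group(1))
    return None
```

A message with no recognizable bounce address in either place returns None, so the caller can leave it in the bounce box for manual review.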
Following these rules, you can now easily match bounced
messages up to your database, as you will see...
Step 3. Retrieve the bounced messages and update your
database...
At this point, assuming you have sent mail as prescribed above,
and some of those messages were returned, you will have one or
more messages in your bounce box. Each of these messages
will be addressed to bounce_<id>@yourdomain.com, where <id> represents
the id of the e-mail address in your database which is related to
the bounce.
Now it is important to understand that there are two types of
bounces: hard and soft. Permanent failures, such as a
nonexistent account or domain, are considered hard bounces.
Other failures, such as a full mailbox or blocked domain, are
considered soft bounces. Instead of flagging your addresses
as good or bad, your database can keep a running count of
hard and soft bounces for each address. That way, your mailing
application can be more intelligent about determining
which addresses to exclude from future mailings. For example,
you might only want to send mail to addresses with fewer than eight
soft bounces and fewer than two hard bounces. I usually do
not like to exclude someone from future mailings unless they have
more than one hard bounce. Just to be sure that the address
is really invalid, I look for at least two hard bounces.
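The example policy above is simple enough to write down directly. In this Python sketch the thresholds come from the text (eight soft, two hard); the constant and function names are assumptions for illustration, and you would tune the limits to your own list.

```python
# Thresholds from the example policy; adjust to suit your list.
MAX_SOFT_BOUNCES = 8
MAX_HARD_BOUNCES = 2


def should_send(bounce_soft, bounce_hard):
    """Return True if an address is still eligible for future mailings."""
    return bounce_soft < MAX_SOFT_BOUNCES and bounce_hard < MAX_HARD_BOUNCES
```

Under this rule an address with a single hard bounce still receives mail, and it is excluded once a second hard bounce is recorded.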
Your application will have to scan the text of the bounced
messages, looking for phrases that indicate the reason for the
bounce, such as "delivery failure", "box full", etc. (The
downloadable sample code includes a database of
the phrases we have discovered in typical bounced messages.)
Your app will determine if each bounce is hard or soft based on the
phrase it finds in the message.
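A phrase scanner of this kind might look like the following Python sketch. The phrase lists here are placeholders chosen for illustration and are not the database shipped with the sample code; only "box full" is taken from the text, and it is treated as soft because a full mailbox is a soft bounce. The function name classify_bounce is also an assumption.

```python
# Illustrative phrase lists only; a real list would be much longer.
HARD_PHRASES = ("user unknown", "no such user", "host not found")
SOFT_PHRASES = ("box full", "quota exceeded")


def classify_bounce(text):
    """Return "hard", "soft", or None when no known phrase is found."""
    lowered = text.lower()
    # Hard phrases are checked first so a permanent failure wins
    # if a message happens to contain phrases of both kinds.
    if any(phrase in lowered for phrase in HARD_PHRASES):
        return "hard"
    if any(phrase in lowered for phrase in SOFT_PHRASES):
        return "soft"
    return None
```

Returning None for unrecognized text keeps the decision explicit, so unknown bounces are never counted against an address by accident.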
Once your app determines whether the bounce is hard or soft, it
can increment the bounce_hard or bounce_soft field in the database
accordingly. It can then
delete the message from the bounce box. If your app cannot
determine whether the message is a hard or soft bounce, the message
can be left in the bounce box. Periodically, the messages remaining in
the bounce box can be analyzed by a human who can visually
determine why they were not identified by the phrase scanner
algorithm. The algorithm can then be updated to catch this
type of message. Once your app is run again, it should
handle this message properly and clear it from the bounce box.
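The processing pass described above can be sketched in Python with the standard sqlite3 module. This is an assumption-laden outline, not the sample code: the table name addresses, the function name process_bounce_box, and the two callables passed in (one that extracts the id, one that classifies the bounce) are all hypothetical. Only the bounce_hard and bounce_soft field names come from the text. Messages are modeled as a plain list of raw strings instead of a live mailbox.

```python
import sqlite3


def process_bounce_box(messages, db, find_id, classify):
    """Update bounce counts and return the messages left for review.

    messages: list of raw bounce texts fetched from the bounce box
    db:       an open sqlite3 connection with an "addresses" table
    find_id:  callable returning the address id, or None
    classify: callable returning "hard", "soft", or None
    """
    remaining = []
    for raw in messages:
        address_id = find_id(raw)
        kind = classify(raw)
        if address_id is None or kind is None:
            # Could not be identified: leave it for a human to inspect.
            remaining.append(raw)
            continue
        # The column name is chosen from two fixed values, never from
        # message content, so building the SQL this way is safe.
        column = "bounce_hard" if kind == "hard" else "bounce_soft"
        db.execute(
            f"UPDATE addresses SET {column} = {column} + 1 WHERE id = ?",
            (address_id,),
        )
    db.commit()
    return remaining
```

Whatever the function returns is what stays in the bounce box; everything else has been counted and can be deleted from the mailbox.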
As time goes on, your phrase scanning algorithm should improve
more and more. If you start with the phrases included with the
downloadable sample code, your app should immediately identify just
about every bounced message.
On to the samples...