
BLU Discuss list archive



fault tolerant email?



On Tue, Mar 14, 2006 at 11:06:25AM -0500, Larry Underhill wrote:
> Hi folks, 
> 
> My group has recently set up our own email infrastructure for a project
> we are running. Mail delivery and access via mutt is fine, but we are
> now looking at sane strategies for fault tolerance. Anybody have any
> pointers to resources or hard-won experience they would like to share?
> 
> Some details:
> 
> * Postfix + Courier IMAP on Debian
> * We have two servers in different datacenters. Server "A" is the one
> running mail services now. Server "B" is the one slated to provide
> redundancy/fail-over capabilities. 
> * We have regular backups of Server "A" to tape.
> * Updating MX records in DNS is slow since we don't admin the
> authoritative DNS server. 
> 
> We're most interested in solving the problem of continuing to receive
> mail in the face of (a) serious hardware failure on the primary mail
> server (b) serious network failure to the datacenter the primary mail
> server is in. 

Let's start with some more questions.

You do know that modern mail servers all try to send mail repeatedly
before giving up, right? So if Server A is down or unreachable for any
reason, simply bringing it back up within a day or so means that you
won't be losing any mail -- it will just be delivered late. (Once I
had a mail server down for three days; when it came back up, the load
average went to the mid 60s for a few hours as everyone who had tried
to send mail tried to resend it. AFAICT, no mail was lost in that outage.)
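
As a rough illustration of why a multi-day outage need not lose mail, here
is a minimal sketch of a sender's retry queue. The backoff values and the
5-day queue lifetime are assumptions chosen for the example, not the
settings of any particular mail server, and the doubling schedule is a
simplification of what real MTAs do.

```python
from datetime import timedelta


def retry_schedule(min_backoff, max_backoff, queue_lifetime):
    """Return the elapsed times at which a sender retries delivery.

    Simplified model: the delay doubles after each failure, is capped at
    max_backoff, and the message is given up on once queue_lifetime passes.
    """
    attempts = []
    elapsed = timedelta(0)
    delay = min_backoff
    while True:
        elapsed += delay
        if elapsed > queue_lifetime:
            break
        attempts.append(elapsed)
        delay = min(delay * 2, max_backoff)
    return attempts


def delivered_after_outage(outage, schedule):
    """Return the first retry at or after the outage ends, or None if the
    sender gave up (bounced the message) before the server came back."""
    for attempt in schedule:
        if attempt >= outage:
            return attempt
    return None


# Hypothetical sender settings, for illustration only.
SCHEDULE = retry_schedule(
    min_backoff=timedelta(minutes=5),
    max_backoff=timedelta(minutes=60),
    queue_lifetime=timedelta(days=5),
)

if __name__ == "__main__":
    for days in (1, 3, 6):
        print(days, "day outage ->", delivered_after_outage(timedelta(days=days), SCHEDULE))
```

Under these assumed settings a three-day outage still ends in delivery,
just late; only an outage longer than the sender's queue lifetime produces
a bounce.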

Second, simply adding an MX record for Server B with a higher preference
number (that is, a lower priority) than Server A's means that all modern
mail servers, after failing to get through to Server A, will automatically
try Server B. You don't need to touch DNS in the event of a Server A outage.
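
The fallback a sending server performs can be sketched roughly as follows.
The hostnames and preference values are made up for the example, and
`is_reachable` is a hypothetical stand-in for an actual SMTP connection
attempt.

```python
def delivery_order(mx_records):
    """Sort (preference, host) pairs so the lowest preference number,
    i.e. the most preferred server, is tried first.

    Real senders pick randomly among hosts with equal preference; this
    sketch just sorts, which is enough for a primary/backup pair.
    """
    return [host for _pref, host in sorted(mx_records)]


def deliver(mx_records, is_reachable):
    """Return the first host that accepts the connection, or None if every
    MX is down (the sender then queues the message and retries later)."""
    for host in delivery_order(mx_records):
        if is_reachable(host):
            return host
    return None


# Hypothetical MX set: Server A is primary, Server B is the backup.
MX_RECORDS = [
    (20, "server-b.example.org"),
    (10, "server-a.example.org"),
]

if __name__ == "__main__":
    a_is_down = lambda host: host != "server-a.example.org"
    print(deliver(MX_RECORDS, a_is_down))  # falls through to the backup
```

Nothing in DNS changes during the outage: both records are published ahead
of time and the sender chooses between them on each delivery attempt.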

So you're not going to lose mail.

The next question is how far you want to go in the ability of end users
to read mail in the event of a Server A outage. If all, or almost all,
of your users are at Location D (an office not directly connected to the
datacenter of either Server A or Server B), then you may want to install
Server C at Location D. Server C receives mail from Server
A and Server B and provides POP3/IMAP/webmail/shell/whatever access to
your users. A loss of connectivity at Location D means that mail will
queue for Server C at both Server A and Server B, and since your users
are all stuck anyway, not getting new email will be the least of their
problems. They'll also be able to send email internally all they want.

Two more notes: one, a backup MX server should apply all the same spam
and virus filtering as the primary; two, there is no reason for a mail
server not to run on RAID1 or RAID5 or better these days.

-dsr-




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org