
BLU Discuss list archive


babyl (RMAIL) to Maildir converter?



There is a program called "b2m" included with GNU Emacs which reads a
Babyl file on stdin and converts it to an mbox file on stdout.
Unfortunately, it has at least five different problems.

1) It doesn't look for "Return-Path", "From" or "Sender" lines in the
   headers of the messages to find meaningful addresses to put in the
   "From " lines of its output.

2) The fake address it *does* put into the "From " lines is "Babyl to
   mail by b2m", including both the quotation marks and the spaces.
   Most programs which read mbox files will not tolerate "From " lines
   whose addresses contain spaces, so many programs can't read the
   mbox file this program produces.  D'oh!

3) The header it puts for each message in the mbox file is the pruned
   header rather than the original full header.  I think the latter
   makes much more sense -- let whatever program is displaying the
   mbox file do appropriate header filtering rather than forcing it to
   use Emacs's.

4) It puts the current date in the "From " line of every message
   rather than figuring out from the message's "Date:" header what to
   put in its "From " line.

5) It doesn't quote "From " lines in message bodies.
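
For reference, the "From " line conventions behind points (2), (4) and
(5) can be sketched in a few lines of Perl.  This is only an
illustration, not code taken from b2m: the helper names
mbox_from_line and quote_body are made up for the example, and it
uses gmtime rather than localtime so that the output doesn't depend
on the local time zone.

```perl
use strict;
use warnings;

# Hypothetical helper: build an mbox "From " separator line from an
# envelope address and an epoch time.  The address must not contain
# whitespace, which is what trips up readers of b2m's output.
sub mbox_from_line {
    my($addr, $time) = @_;
    die "address contains whitespace: $addr\n" if ($addr =~ /\s/);
    # scalar(gmtime) yields an asctime-style date string
    return "From $addr " . scalar(gmtime($time)) . "\n";
}

# Hypothetical helper: quote body lines that start with "From " so
# that mbox readers don't mistake them for message separators.
sub quote_body {
    my($body) = @_;
    $body =~ s/^From />From /mg;
    return $body;
}

print mbox_from_line('jik@example.com', 0);
print quote_body("Hi,\nFrom now on, use Maildir.\n");
```

The second print shows the quoting: the body line starting with
"From " comes out as ">From now on, use Maildir."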

There's also an Emacs command, M-x unrmail, which does the same job.
It partially fixes (1): it looks for "From", "Really-from" or
"Sender", but not for "Return-Path".  It fixes (2), (3) and (5),
although for (3) it doesn't strip out the Emacs-specific
"X-Coding-System" header.  It doesn't fix (4).

A third option is to use the attached b2m.pl script, which I just
wrote and intend to submit to the FSF to be included with Emacs
instead of, or in addition to, b2m.c.  It fixes all of the problems
listed above.

Once you've got an mbox file, you can convert it to Maildir format
using the mbox2maildir utility that Chris Tresco mentioned
(<URL:http://www.qmail.org/mbox2maildir>).
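
As a rough idea of what that conversion amounts to, here is a sketch
of delivering one message into a Maildir, assuming the usual
tmp/new/cur layout.  It is not how mbox2maildir is implemented: the
function name deliver_to_maildir and the file-name scheme are
assumptions made up for the example, and it leaves out splitting the
mbox file on "From " lines and undoing the ">From " quoting.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);
use Sys::Hostname;

my $delivered = 0;

# Hypothetical helper: write one message into a Maildir.  The message
# is written under tmp/ first and then renamed into new/, so a mail
# reader never sees a partially written file.
sub deliver_to_maildir {
    my($maildir, $message) = @_;
    make_path(map { "$maildir/$_" } qw(tmp new cur));

    # Assumed naming scheme: time, PID plus a counter, and host name
    my $name = join('.', time(), $$ . '_' . $delivered++, hostname());

    open(my $fh, '>', "$maildir/tmp/$name")
	or die "can't create $maildir/tmp/$name: $!\n";
    print {$fh} $message or die "can't write $maildir/tmp/$name: $!\n";
    close($fh) or die "can't close $maildir/tmp/$name: $!\n";
    rename("$maildir/tmp/$name", "$maildir/new/$name")
	or die "can't move $name into $maildir/new: $!\n";
    return "$maildir/new/$name";
}

# Deliver into a scratch directory so the example touches no real mail
my $dir = tempdir(CLEANUP => 1);
my $file = deliver_to_maildir($dir, "Subject: test\n\nHello.\n");
print "delivered to $file\n";
```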

Party on.

  jik
-------------- next part --------------
#!/usr/bin/perl

# b2m.pl - Script to convert a Babyl file to an mbox file

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
# USA.

# Maintained by Jonathan Kamens <jik at kamens.brookline.ma.us>.

# Requires CPAN modules: MailTools (for Mail::Address), TimeDate (for
# Date::Parse).

use warnings;
use strict;
use File::Basename;
use Getopt::Long;
use Mail::Address;
use Date::Parse;

my($whoami) = basename $0;
my($version) = '$Revision: 1.3 $';
my($usage) = "Usage: $whoami [--help] [--version] [--[no]full-headers] [Babyl-file]
\tBy default, full headers are printed.\n";

my($opt_help, $opt_version);
my($opt_full_headers) = 1;

die $usage if (! GetOptions(
			    'help' => \$opt_help,
			    'version' => \$opt_version,
			    'full-headers!' => \$opt_full_headers,
			    ));

if ($opt_help) {
    print $usage;
    exit;
}
elsif ($opt_version) {
    print "$whoami version: $version\n";
    exit;
}

die $usage if (@ARGV > 1);

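# Read the input one Babyl section at a time; each section ends with
# a newline followed by ^_ (octal 037)
$/ = "\n\037";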

if (<> !~ /^BABYL OPTIONS:/) {
    die "$whoami: $ARGV is not a Babyl file\n$usage";
}

while (<>) {
    my($msg_num) = $. - 1;
    my($labels, $full_header, $header);
    my($from_addr);
    my($time);

    # This will strip the initial form feed, any whitespace that may
    # be following it, and then a newline
    s/^\s+//;
    # This will strip the ^_ off of the end of the message
    s/\037$//;

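    # Don't use goto to jump into the body of an "if" here; that is
    # deprecated in newer versions of Perl
    if (! s/(.*)\n//) {
	warn "$whoami: message $msg_num in $ARGV is malformatted\n";
	next;
    }
    $labels = $1;

    if (! s/(?:((?:.+\n)+)\n+)?\*\*\* EOOH \*\*\*\n+//) {
	warn "$whoami: message $msg_num in $ARGV is malformatted\n";
	next;
    }
    $full_header = $1;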

    if (s/((?:.+\n)+)\n+//) {
	$header = $1;
    }
    else {
	# Message has no body
	$header = $_;
	$_ = '';
    }

    if (! $full_header) {
	$full_header = $header;
    }

    # End message with a single newline
    s/\s+$/\n/;

    # Quote "^From "
    s/(^|\n)From /$1>From /g;

    # Strip the integer indicating whether the header is pruned
    $labels =~ s/^\d+[,\s]*//; 
    # Strip extra commas and whitespace from the end
    $labels =~ s/[,\s]+$//;
    # Now collapse extra commas and whitespace in the remaining label string
    $labels =~ s/[,\s]+/, /g;
    
    # The qw() list needs its own parentheses in newer versions of Perl
    foreach my $rmail_header (qw(summary-line x-coding-system)) {
	$full_header =~ s/(^|\n)$rmail_header:.*\n/$1/i;
    }

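    # The qw() list needs its own parentheses in newer versions of Perl
    foreach my $addr_header (qw(return-path from really-from sender)) {
	# Capture the header's first line plus any continuation lines
	# (which start with whitespace), but not the headers after it
	if ($full_header =~ /(?:^|\n)$addr_header:[ \t]*(.*\n(?:[ \t].*\n)*)/i) {
	    my($addr) = Mail::Address->parse($1);
	    # If this header has no usable address, try the next one
	    if ($addr && $addr->address) {
		$from_addr = $addr->address;
		last;
	    }
	}
    }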

    if (! $from_addr) {
	$from_addr = "Babyl_to_mail_by_$whoami\@localhost";
    }

    if ($full_header =~ /(?:^|\n)date:\s*(\S.*\S)/i) {
	$time = str2time($1);
    }

    if (! $time) {
	# No Date header or we failed to parse it
	$time = time;
    }

    print("From ", $from_addr, " ", scalar(localtime($time)), "\n",
	  ($opt_full_headers ? $full_header : $header),
	  ($labels ? "X-Babyl-Labels: $labels\n" : ""), "\n",
	  $_) || die "$whoami: error writing to stdout: $!\n";
}

close(STDOUT) || die "$whoami: Error closing stdout: $!\n";


