Friday 11 October 2013

How to Archive and Back Up Emails with Postfix and Dovecot Subfolders on Ubuntu Servers

I've been thinking recently that, as well as an off-site compressed backup of all of the emails within my system, I would also like a (relatively) easy way to recover an email that one of my users has accidentally deleted from their trash folder (i.e. it's gone forever).

So how do we archive and back up emails with Dovecot? It's actually quite simple and, if I say so myself, clever :-)

Step One:

Blind Carbon Copy, always_bcc

Ok, so first off, create a new user for your main domain. I used "archive@domain.com" for simplicity. You then need to configure Postfix to always blind carbon copy every email sent and received to that email address.

Copy and paste these lines into your terminal (substitute your own domain):

sudo -i
cp /etc/postfix/main.cf /etc/postfix/main.cf.bak
echo "always_bcc = archive@domain.com" >> /etc/postfix/main.cf
postfix reload

What are we doing here? Well, first off we are becoming the super user and backing up our main.cf configuration file for safety. Then we are echoing the always_bcc setting into /etc/postfix/main.cf and reloading Postfix so the change takes effect straight away. Because we are already root after sudo -i, the echo does not need its own sudo (and from a normal user's shell, sudo echo ... >> file would fail anyway, as the redirection is done by your own shell rather than by sudo). Notice the use of TWO pointy brackets: this appends the line to the file. If you use one, you will replace the contents, so make sure you don't do that here :)
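If you want to see the difference between one and two pointy brackets without going anywhere near your real main.cf, here is a minimal sketch that plays it out on a throwaway temporary file. The two starting lines and the variable names are made up purely for the demonstration.

```shell
#!/bin/sh
# Demonstrates > versus >> on a temporary file, not on the real main.cf.
demo=$(mktemp)

# Two made-up lines stand in for an existing configuration.
printf 'myhostname = mail.domain.com\nmydestination = localhost\n' > "$demo"

# >> appends: the two original lines survive and a third is added.
echo "always_bcc = archive@domain.com" >> "$demo"
appended_lines=$(wc -l < "$demo" | tr -d ' ')

# > truncates first: only the new line is left afterwards.
echo "always_bcc = archive@domain.com" > "$demo"
overwritten_lines=$(wc -l < "$demo" | tr -d ' ')

echo "after >>: $appended_lines lines, after >: $overwritten_lines line"
rm -f "$demo"
```

This is also why the backup copy in main.cf.bak is worth making before you touch the file.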

Ok, so with that done, if you log into Roundcube or use your email client to check the archive@domain.com account, you will now start receiving a copy of every email that goes through your system, incoming and outgoing.

Step Two:

Adding Dovecot Maildir Subfolders from Command Line

That's a great start, but a year down the line, this is going to be very unorganised. Yes, we could sort by date, but if you have a large system or lots of active users, that isn't going to be particularly realistic. What we want to do is create Dovecot subfolders for each domain.

Now this is where we need to think a little bit. We cannot simply take the address in the "To" field and sort by that, because what happens if there is more than one recipient? No, we need to use the sender's email address, taken from the Return-Path header, as there can only ever be one of those.

We will grab the sender's domain and check to see if we have a Dovecot subfolder for that domain. If we do, we move the email into that subfolder; if not, we first create the folder and then move the email.
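To make the "grab the sender's domain" step concrete, here is a small sketch that pulls the domain out of the Return-Path header of a sample message. The helper name sender_domain and the sample message are my own inventions for illustration; it assumes a grep that supports -m, which the GNU grep on Ubuntu does.

```shell
#!/bin/sh
# sender_domain is a hypothetical helper: it prints the domain from the
# first Return-Path header of the message file given as $1.
sender_domain() {
  grep -m 1 -i '^Return-Path:' "$1" \
    | cut -s -d '<' -f 2 \
    | cut -d '>' -f 1 \
    | cut -s -d '@' -f 2
}

# Build a tiny sample message with two recipients but one sender.
msg=$(mktemp)
printf 'Return-Path: <twitter@bounce.twitter.com>\nTo: a@domain.com, b@domain.com\nSubject: test\n\nbody\n' > "$msg"

domain=$(sender_domain "$msg")
echo "sender domain: $domain"
rm -f "$msg"
```

A bounce message with an empty Return-Path (<>) makes the helper print nothing, so whatever calls it should decide what to do with an empty result.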

There are two things to consider: we need not only to create the correct subfolder, but also to subscribe archive@domain.com to it automatically, so that the new subfolders show up by themselves when you check the emails.

So, here is the code that will do all this for you. It can be run as often as you want (I run it once daily with cron; leaving it any longer could mean it uses a lot of resources whilst everything is sorted).
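The subscription part is just a line in a plain text file called subscriptions in the root of the Maildir. As a rough sketch, this is one way to add a folder name only when it is not already listed, tried out on a temporary file rather than a live mailbox. The helper name subscribe_folder is made up for this example.

```shell
#!/bin/sh
# subscribe_folder is a hypothetical helper: it adds folder name $2 to the
# subscriptions file $1 unless that exact line is already present.
subscribe_folder() {
  touch "$1"
  grep -qxF -- "$2" "$1" || echo "$2" >> "$1"
}

# Try it on a throwaway file instead of the real subscriptions file.
subs=$(mktemp)
subscribe_folder "$subs" "INBOX.gmail_com"
subscribe_folder "$subs" "INBOX.gmail_com"   # second call changes nothing
subscribe_folder "$subs" "INBOX.bounce_twitter_com"

sub_count=$(wc -l < "$subs" | tr -d ' ')
echo "subscriptions file has $sub_count entries"
rm -f "$subs"
```

The -x and -F options make grep match the whole line literally, so the periods in the folder name are not treated as wildcards.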


#!/bin/bash
# Sorts the archive mailbox into one Maildir subfolder per sender domain.
# Change KEEP to the Maildir path of your own archive account.
KEEP=/var/vmail/vmail1/domain.com/a/r/c/archive-2013.10.11.16.04.29/Maildir
STORE=$KEEP/cur

find "$STORE" -type f | while IFS= read -r x
do
 echo "--==~~==--"
 # Use only the first Return-Path header line in the message
 RSLT=$(grep -m 1 -i "^Return-Path:" "$x")
 PERSON=$(echo "$RSLT" | cut -f 2 -d "<" | cut -f 1 -d ">")
 echo "..get email $PERSON"
 # Keep the part after the @, then swap periods for underscores
 NAME=$(echo "$PERSON" | cut -s -f 2 -d "@" | tr '.' '_')
 # Bounces have an empty Return-Path (<>), so give them their own folder
 [ -z "$NAME" ] && NAME=unknown
 echo "..senders domain is $NAME"
 if [ -d "$KEEP/.INBOX.$NAME" ]
 then
  echo "..archive subfolder already exists.."
 else
  echo "..archive folder does not exist .. we will create it.."
  # A Maildir folder needs cur, new and tmp
  mkdir -p "$KEEP/.INBOX.$NAME"/{cur,new,tmp}
  chown -R vmail:vmail "$KEEP/.INBOX.$NAME"
  chmod 0700 "$KEEP/.INBOX.$NAME"
  echo "..adding .INBOX.$NAME to subscriptions"
  echo "INBOX.$NAME" >> "$KEEP/subscriptions"
 fi
 echo "..Moving email.."
 mv -uv "$x" "$KEEP/.INBOX.$NAME/cur"
done


Notes:

  • You will first need to look through your vmail directory to find the correct path for your archive mailbox, as the path contains a date, a time and categorisation directories (a/r/c above), so yours will be different.
  • Notice that the actual subfolder your emails are stored in is .INBOX.NAME/cur.
  • For the sake of tidiness, we are replacing any periods (.) in the domains with underscores (_), such as gmail.com -> gmail_com. This is because of the way Maildir folders work: rather than directories inside directories, subfolders are denoted by periods. So with some email addresses, you would end up with two or three subdirectories before you get to the actual emails. For example, Twitter's email is twitter@bounce.twitter.com, so the folder tree would be bounce/twitter/com/, which is annoying and untidy to navigate in email clients. Our way, we just have one folder for each domain.
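As a quick illustration of the naming in the last note, this sketch turns a sender domain into the folder name used above and counts how many nested levels the same domain would produce if the periods were left alone. The helper name folder_for_domain is hypothetical.

```shell
#!/bin/sh
# folder_for_domain is a hypothetical helper: it turns a sender domain into
# the Maildir directory name used in this post.
folder_for_domain() {
  printf '.INBOX.%s\n' "$(printf '%s' "$1" | tr '.' '_')"
}

flat=$(folder_for_domain "bounce.twitter.com")
echo "with underscores: $flat"

# Without the replacement, every period would start another nesting level.
nested=".INBOX.bounce.twitter.com"
levels=$(echo "${nested#.INBOX.}" | awk -F '.' '{ print NF }')
echo "without underscores: $levels nested levels under INBOX"
```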

That's it! Now, when someone rings you up worried that they have deleted an important email, all you need to know is the domain it was sent from, and to make your life easier, the approximate date. Log in to your archive email account and find the domain folder.
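If you prefer the command line to a mail client, the same lookup can be sketched with grep. This example builds a throwaway archive with two fake messages and searches one domain folder by subject; the helper name find_in_archive, the folder and the messages are all invented for the demonstration, and in real use you would point it at your own archive Maildir.

```shell
#!/bin/sh
# find_in_archive is a hypothetical helper: it lists message files in one
# domain folder ($1) that have a Subject line containing the text in $2.
find_in_archive() {
  grep -l -i -- "^Subject:.*$2" "$1"/cur/* 2>/dev/null
}

# Throwaway archive with two fake messages from the same domain.
archive=$(mktemp -d)
mkdir -p "$archive/.INBOX.gmail_com/cur"
printf 'Return-Path: <bob@gmail.com>\nSubject: Invoice for October\n\nhello\n' \
  > "$archive/.INBOX.gmail_com/cur/msg1"
printf 'Return-Path: <bob@gmail.com>\nSubject: Lunch?\n\nhello\n' \
  > "$archive/.INBOX.gmail_com/cur/msg2"

matches=$(find_in_archive "$archive/.INBOX.gmail_com" "invoice")
echo "$matches"
match_count=$(printf '%s\n' "$matches" | grep -c .)
rm -rf "$archive"
```

Note that this simple pattern would also match a line starting with "Subject:" inside a message body, so treat the result as a shortlist rather than a definitive answer.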

That's it for another entry, please remember to click an advert if I have helped you :)
