2008-03-16

Backup Gmail via IMAP using getmail

I have not blogged for QUITE some time, ever since I started my internship back in December. After eight hours of work every day, I got lazy and had hardly any motivation to write.

Today I was tipped off by a very informative introduction on how to back up Gmail using getmail. I improved on it a little and finally found my own way of backing up my Gmail messages, something I have wanted to do for a long time.

For the details of the HOWTO, please refer to the author’s blog. What I want to add is this: a single .mbox file grows too big, while the Maildir format produces a huge number of small files (hard to manage and even harder to move across disks), so it may be a better idea to back up all the mail into several medium-sized .mbox files.

Using Gmail’s filters and labels, mail can easily be grouped in many different ways. I used a quarter as the unit, so I only need to type a search such as “after:2007/1/1 before:2007/4/1” into the search field (before: excludes the date given, so this covers January through March) and apply a new label (e.g. FY07Q1) to every message in the results. Then I changed the getmail configuration to:

[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
username = username
password = password
mailboxes = ("FY07Q1",)

[destination]
type = Mboxrd
path = ~/.getmail/gmail-backup-FY07Q1.mbox
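
The quarterly labels and search strings follow a regular pattern, so they can be generated instead of typed by hand. Below is a minimal Python sketch, assuming that Gmail's before: operator excludes the date it names (which is what the correction in the comments implies) and that the label is simply "FY" plus the two-digit calendar year and the quarter, as in FY07Q1. The function name quarter_query is made up for this example.

```python
from datetime import date


def quarter_query(year: int, quarter: int) -> tuple[str, str]:
    """Return (label, Gmail search string) for one calendar quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    # First day of the quarter.
    start = date(year, 3 * (quarter - 1) + 1, 1)
    # First day of the NEXT quarter; before: is assumed to exclude this day.
    end = date(year + 1, 1, 1) if quarter == 4 else date(year, 3 * quarter + 1, 1)
    label = f"FY{year % 100:02d}Q{quarter}"
    query = (
        f"after:{start.year}/{start.month}/{start.day} "
        f"before:{end.year}/{end.month}/{end.day}"
    )
    return label, query
```

Calling quarter_query(2007, 1) gives the label FY07Q1 together with the search string to paste into Gmail's search field; the label itself still has to be applied by hand in the Gmail web interface.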

Note that, according to the getmail documentation, the .mbox file specified in the configuration file

must already exist and appear to be a valid mboxrd file before getmail will try to deliver to it — getmail will not create the file if it does not exist. If you want to create a new mboxrd file for getmail to use, simply create a completely empty (0-byte) file.
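
Creating that empty file takes one command in a shell, but it can also be done from a script. Here is a small Python sketch using only the standard library; the helper name ensure_mbox is my own and not part of getmail.

```python
from pathlib import Path


def ensure_mbox(path: str) -> Path:
    """Create an empty mbox file (and its directory) unless it already exists."""
    mbox = Path(path).expanduser()  # expands a leading "~"
    mbox.parent.mkdir(parents=True, exist_ok=True)
    # touch() creates a 0-byte file; an existing file keeps its contents.
    mbox.touch(exist_ok=True)
    return mbox
```

For the configuration above, the call would be ensure_mbox("~/.getmail/gmail-backup-FY07Q1.mbox"). Running it again later is harmless, because mail that has already been delivered is left in place.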

Once a batch of mail has finished downloading, just modify the configuration (or create another one) and move on to the next batch.
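
Instead of editing the configuration for each batch, one file per label can be written up front. The following is only a sketch under my own assumptions: the file naming scheme getmailrc-<label>, the function name write_batch_config, and the idea of keeping everything in one directory are all invented for this example. The configuration keys are the same ones shown above.

```python
from pathlib import Path

# Same settings as the configuration above, with placeholders per batch.
TEMPLATE = """\
[retriever]
type = SimpleIMAPSSLRetriever
server = imap.gmail.com
username = {username}
password = {password}
mailboxes = ("{label}",)

[destination]
type = Mboxrd
path = {mbox}
"""


def write_batch_config(getmail_dir, label, username, password):
    """Write getmailrc-<label> plus an empty mbox for one label; return the rc path."""
    base = Path(getmail_dir).expanduser()
    base.mkdir(parents=True, exist_ok=True)
    mbox = base / f"gmail-backup-{label}.mbox"
    mbox.touch(exist_ok=True)  # getmail will not create the mbox itself
    rc = base / f"getmailrc-{label}"
    rc.write_text(
        TEMPLATE.format(username=username, password=password, label=label, mbox=mbox)
    )
    rc.chmod(0o600)  # the file contains a password
    return rc
```

Each batch can then be fetched by pointing getmail at the matching file, for example getmail --rcfile getmailrc-FY07Q1, which getmail looks up in its ~/.getmail directory by default.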

That’s it. I guess being so excited about finally being able to back up my Gmail easily is the very reason I wrote this post after four months of silence. Thanks again to Matt Cutts for such a useful HOWTO, which prompted me to write something down.

P.S. Since I use the IMAP retriever, I didn’t run into the limit on the number of messages per download mentioned in the original HOWTO, which arises when getmail is used with POP3.

9 comments:

  1. Bill, Dec 15, 2009 08:45 AM
    I came across your post when looking for a way to backup GMail. Do you still use this method?

    I was interested in how you backup messages from the current quarter. Every time you run getmail it pulls back all messages from the server. It would make more sense to only pull back messages that have appeared since the last time getmail was run. Do you know if this is possible?

    Many thanks for your writeup.
  2. Conrad, Jan 24, 2010 05:50 AM
    Bill,

    I think you might have the answer already, but no:
    getmail only retrieves new mail after the first batch. It keeps a record of the messages it has fetched.
  3. Pento, Mar 28, 2010 06:32 AM
    Thanks for instructions for IMAP and Google.
    It's also useful to backup "[Gmail]/All Mail" label.
  4. edin1, Sep 21, 2010 07:23 AM
    Hi and thanks for this useful post.

    There's an error in the text. The search string should be: "after:2007/1/1 before:2007/4/1"
  5. Ryan Tate, Oct 16, 2010 12:02 PM
    What you DON'T mention is that this method will mark ALL MAIL as read! Awful. I do searches with "is:read" all the time, now that data is lost forever, with no warning. The getmail author is unrepentant and provides no workaround or option to control this. Ugh.
  6. Ryan Tate, Oct 16, 2010 12:17 PM
    I should add actually that it's not the getmail author's fault. It's just how Google implements IMAP - with POP you get options to NOT mark messages as read, but IMAP offers no such option. Anyway, you should mention this in your post. It was a huge data loss for me. Irrecoverable.
  7. Windows 7 Key, Nov 19, 2010 07:22 PM
    Thanks for your posting, i am just a newbie in the internet business, need to learn a lot from the gurus
    windows 7 professional
  8. אראל סגל, Dec 8, 2010 01:44 AM
    Do you know how to RESTORE the mail from the mbox to a new gmail account? (in case the old account was lost after I did a backup)
  9. Mashiara, Jan 20, 2012 02:30 PM
    To avoid marking messages read you can patch your getmail to use PEEK instead of FETCH, patch available at http://lists.freebsd.org/pipermail/freebsd-ports-bugs/2011-March/208082.html