by on February 19, 2015, in

Extracting all Email Addresses from a GMail Label

As with any other product, when working with Gmail / Google Apps you sometimes get to a point where you want something the product simply does not provide. When that happens, a scripting interface can be a lifesaver. Let’s see how we can put Google Apps Script to use.

List all Email Addresses from a Label’s Messages

GMail’s most useful feature is the labeling system. The simple fact that you can tag a message with more than one label makes it far superior to message filing systems based on folders.

Let’s say you tag all personal communications with your friends and family with the label friends. If you are throwing a big party you might want to send invitation emails to all of them. If there only was a function to extract a list of all email addresses from all the emails tagged with the label friends.

When I needed this functionality recently I quickly stumbled upon a solution on labnol.org. However, that turned out to be slow and did not even work reliably. So I created my own.

Features

The script extracts email addresses from all messages in all threads tagged with a given label. It processes even large numbers of emails quickly. In addition to the sender’s email address, it lists the sender’s name and the date of first communication.

Usage

The email extraction script is part of a Google Sheet to provide an easy user interface for data input and output. To use the script copy the sheet to your Google Drive by clicking this link.

Side note: this nifty copy feature is triggered by appending /copy to the document’s URL.

You will be asked if you want to make the copy:

01-Copy-document

Next, your copy of the spreadsheet opens in Google Sheets. Fill in the label from which you want to extract email addresses. In this example I am using the label Spam. then click Extract email addresses in the HK Scripts menu:

02-Spreadsheet

You will be asked for authorization:

03-Authorization

04-Request-for-permission

Now the script runs. It should not take long. Its output, the extracted email addresses, are written to a new sheet:

05-Script-finished

As you can see above, it not only lists the email addresses, but also the person’s name and the time of first communication.

The Script

You do not need this in order to use the email extraction script, but for the those interested here is the source code. You can also view it directly in the sheet by navigating to Tools > Script editor.

//
// Searches one label for the first time someone sent you an email
// Returns name, e-mail address (extracted from the "From" field) and message time
//
function GetAddresses ()
{
  // Get the active spreadsheet
  var ss = SpreadsheetApp.getActiveSpreadsheet();  

  // Label to search
  var userInputSheet = ss.getSheets()[0];
  
  var labelName = userInputSheet.getRange("B2").getValue();
  
  // Create / empty the target sheet
  var sheetName = "Label: " + labelName;
  var sheet = ss.getSheetByName (sheetName) || ss.insertSheet (sheetName, ss.getSheets().length);
  sheet.clear();
  
  // Get all messages in a nested array (threads -> messages)
  var addressesOnly = [];
  var messageData = [];

  var startIndex = 0;
  var pageSize = 100;
  while (1)
  {
    // Search in pages of 100
    var threads = GmailApp.search ("label:" + labelName, startIndex, pageSize);
    if (threads.length == 0)
      break;
    else
      startIndex += pageSize;
       
    // Get all messages for the current batch of threads
    var messages = GmailApp.getMessagesForThreads (threads);
    
    // Loop over all messages
    for (var i = 0; i < messages.length ; i++)
    {
      // Loop over all messages in this thread
      for (var j = 0; j < messages[i].length; j++)
      {
        var mailFrom = messages[i][j].getFrom ();
        var mailDate = messages[i][j].getDate ();
        
        // mailFrom format may be either one of these:
        // name@domain.com
        // any text <name@domain.com>
        // "any text" <name@domain.com>
        
        var name = "";
        var email = "";
        var matches = mailFrom.match (/\s*"?([^"]*)"?\s+<(.+)>/);
        if (matches)
        {
          name = matches[1];
          email = matches[2];
        }
        else
        {
          email = mailFrom;
        }
        // Check if (and where) we have this already
        var index = addressesOnly.indexOf (mailFrom);
        if (index > -1)
        {
          // We already have this address -> remove it (so that the result is ordered by data from new to old)
          addressesOnly.splice(index, 1);
          messageData.splice(index, 1);
        }
        
        // Add the data
        addressesOnly.push (mailFrom);
        messageData.push ([name, email, mailDate]);
      }
    }
  }
  
  // Add data to corresponding sheet
  sheet.getRange (1, 1, messageData.length, 3).setValues (messageData);
}


//
// Adds a menu to easily call the script
//
function onOpen ()
{
  var sheet = SpreadsheetApp.getActiveSpreadsheet ();
  
  var menu = [ 
    {name: "Extract email addresses",functionName: "GetAddresses"}
  ];  
  
  sheet.addMenu ("HK Scripts", menu);    
}
Previous Article VMware vExpert Award
Next Article Manual Folder Redirection with Symbolic Links