Extracting all Email Addresses from a Gmail Label

  • Scripting
  • Published Feb 19, 2015 Updated Jul 19, 2020

As with any other product, when working with Gmail you sometimes get to a point where you want something the product simply does not provide. When that happens, a scripting interface can be a lifesaver. Let’s see how we can put Google Apps Script to use.

List all Email Addresses from a Label’s Messages

Gmail’s most useful feature is the labeling system. The simple fact that you can tag a message with more than one label makes it far superior to message filing systems based on folders.

Let’s say you tag all personal communications with your friends and family with the label friends. If you are throwing a big party you might want to send invitation emails to all of them. If there only was a function to extract a list of all email addresses from all the emails tagged with the label friends.

When I needed this functionality recently I quickly stumbled upon a solution on labnol.org. However, that turned out to be slow and did not even work reliably. So I created my own.

Features

The script extracts email addresses from all messages in all threads tagged with a given label. It processes even large numbers of emails quickly. In addition to the sender’s email address, it lists the sender’s name and the date of the first communication.

Usage

The email extraction script is part of a Google Sheet to provide an easy user interface for data input and output. To use the script copy the sheet to your Google Drive by clicking this link.

Side note: this nifty copy feature is triggered by appending /copy to the document’s URL.

You will be asked if you want to make the copy:

01-Copy-document

Next, your copy of the spreadsheet opens in Google Sheets. Fill in the label from which you want to extract email addresses. In this example I am using the label Spam. then click Extract email addresses in the HK Scripts menu:

02-Spreadsheet

You will be asked for authorization:

03-Authorization
04-Request-for-permission

Now the script runs. It should not take long. Its output, the extracted email addresses, are written to a new sheet:

05-Script-finished

As you can see above, it not only lists the email addresses, but also the person’s name and the time of first communication.

The Script

You do not need this in order to use the email extraction script, but for those interested here is the source code. You can also view it directly in the sheet by navigating to Tools > Script editor.

//
// Searches one label for the first time someone sent you an email
// Returns name, e-mail address (extracted from the "From" field) and message time
//
function GetAddresses ()
{
  // Get the active spreadsheet
  var ss = SpreadsheetApp.getActiveSpreadsheet();  

  // Label to search
  var userInputSheet = ss.getSheets()[0];
  
  var labelName = userInputSheet.getRange("B2").getValue();
  
  // Create / empty the target sheet
  var sheetName = "Label: " + labelName;
  var sheet = ss.getSheetByName (sheetName) || ss.insertSheet (sheetName, ss.getSheets().length);
  sheet.clear();
  
  // Get all messages in a nested array (threads -> messages)
  var addressesOnly = [];
  var messageData = [];

  var startIndex = 0;
  var pageSize = 100;
  while (1)
  {
    // Search in pages of 100
    var threads = GmailApp.search ("label:" + labelName, startIndex, pageSize);
    if (threads.length == 0)
      break;
    else
      startIndex += pageSize;
       
    // Get all messages for the current batch of threads
    var messages = GmailApp.getMessagesForThreads (threads);
    
    // Loop over all messages
    for (var i = 0; i < messages.length ; i++)
    {
      // Loop over all messages in this thread
      for (var j = 0; j < messages[i].length; j++)
      {
        var mailFrom = messages[i][j].getFrom ();
        var mailDate = messages[i][j].getDate ();
        
        // mailFrom format may be either one of these:
        // name@domain.com
        // any text <name@domain.com>
        // "any text" <name@domain.com>
        
        var name = "";
        var email = "";
        var matches = mailFrom.match (/\s*"?([^"]*)"?\s+<(.+)>/);
        if (matches)
        {
          name = matches[1];
          email = matches[2];
        }
        else
        {
          email = mailFrom;
        }
        // Check if (and where) we have this already
        var index = addressesOnly.indexOf (mailFrom);
        if (index > -1)
        {
          // We already have this address -> remove it (so that the result is ordered by data from new to old)
          addressesOnly.splice(index, 1);
          messageData.splice(index, 1);
        }
        
        // Add the data
        addressesOnly.push (mailFrom);
        messageData.push ([name, email, mailDate]);
      }
    }
  }
  
  // Add data to corresponding sheet
  sheet.getRange (1, 1, messageData.length, 3).setValues (messageData);
}


//
// Adds a menu to easily call the script
//
function onOpen ()
{
  var sheet = SpreadsheetApp.getActiveSpreadsheet ();
  
  var menu = [ 
    {name: "Extract email addresses",functionName: "GetAddresses"}
  ];  
  
  sheet.addMenu ("HK Scripts", menu);    
}

Comments

Related Posts

Configuring Citrix ShareFile Sync from PowerShell

Configuring Citrix ShareFile Sync from PowerShell
When you have a cloud-based file sharing service it makes a lot of sense to synchronize part or all of the data with your desktop computer. Citrix ShareFile offers the Sync for Windows tool for that purpose. However, once you open its configuration screen you notice that has a severe restriction: it can only synchronize to a single local folder. In many cases it would make much more sense to synchronize different cloud folders to different locations on your hard disk. When I complained to the product manager Peter Schulz about this I learned about a hidden gem: the single folder restriction is only present in the UI; the underlying sync engine is much more flexible. And the best thing is: the sync engine can be configured from PowerShell. Here is how.
Citrix/Terminal Services/Remote Desktop Services

PowerShell Script: Test Chrome, Firefox & IE Browser Performance

PowerShell Script: Test Chrome, Firefox & IE Browser Performance
There is more than one way to test the performance of web browsers like Chrome, Firefox, or IE, but regardless of how you do it, you need a consistent workload that makes the browsers comparable. Unless you are testing with synthetic benchmarks (which come with a plethora of problems of their own) you need a way to automate browsers opening tabs and loading URLs. This article presents a simple solution to do just that.
Scripting

Latest Posts

Fast & Silent 5 Watt PC: Minimizing Idle Power Usage

Fast & Silent 5 Watt PC: Minimizing Idle Power Usage
This micro-series explains how to turn the Lenovo ThinkCentre M90t Gen 6 into a smart workstation that consumes only 5 Watts when idle but reaches top Cinebench scores while staying almost imperceptibly silent. In the first post, I showed how to silence the machine by replacing and adding to Lenovo’s CPU cooler. In this second post, I’m listing the exact configuration that achieves the lofty goal of combining minimal idle power consumption with top Cinebench scores.
Hardware

Fast & Silent 5 Watt PC: Lenovo ThinkCentre M90t Modding

Fast & Silent 5 Watt PC: Lenovo ThinkCentre M90t Modding
This micro-series explains how to turn the Lenovo ThinkCentre M90t Gen 6 into a smart workstation that consumes only 5 Watts when idle but reaches top Cinebench scores while staying almost imperceptibly silent. In this first post, I’m showing how to silence the machine by replacing and adding to Lenovo’s CPU cooler. In a second post, I’m listing the exact configuration that achieves the lofty goal of combining minimal idle power consumption with top Cinebench scores.
Hardware