
How to collect Email data from Microsoft Graph API with Python

  • 5.15 Technologies
  • May 19, 2023
  • 4 min read

Updated: Apr 14

Email is one of the most critical sources of operational data within an organization, yet much of it remains unstructured and underutilized.


By leveraging the Microsoft Graph API, teams can automate the collection and analysis of email data, transforming it into actionable insight that improves visibility, decision-making, and operational efficiency.


This enables organizations to move beyond reactive communication and begin leveraging email data as part of a broader, data-driven operational strategy.


To accomplish this, we will walk through the key steps required to connect to the Microsoft Graph API using Python and build a repeatable data collection workflow. The process includes:


  • Setup

  • Connect

  • Authenticate

  • Collect

  • Process


How to Connect to the Microsoft Graph API

To connect to the Microsoft Graph API in Python we need to define variables and import a library called MSAL (Microsoft Authentication Library).


We will need five variables to connect to Microsoft Graph API:

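  • Client ID: The public identifier of the application (app registration)

  • Client Secret: A confidential passcode the application uses to authenticate

  • Tenant ID: The unique identifier of the Azure Active Directory (now Microsoft Entra ID) instance

  • Authority: The login URL, which includes your Tenant ID

  • Scope: The permissions the token will carry. For the app-only flow used here, this is https://graph.microsoft.com/.default, which requests the application permissions already granted to the app registration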

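import msal

# Values from the app registration (replace the placeholders)
client_id = '**************'
client_secret = '**************'
tenant_id = '**************'

# Login endpoint for the tenant
authority = f'https://login.microsoftonline.com/{tenant_id}'

# App-only (client credentials) tokens use the resource's .default scope
scope = ['https://graph.microsoft.com/.default']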
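
Hard-coding a client secret in a script is easy to leak through version control. As a minimal sketch, the same five values could be read from environment variables instead. The variable names GRAPH_CLIENT_ID, GRAPH_CLIENT_SECRET, and GRAPH_TENANT_ID and the helper load_graph_config are assumptions made for this illustration, not part of MSAL or the Graph API.

```python
import os

# Hypothetical environment variable names chosen for this sketch
REQUIRED_VARS = ("GRAPH_CLIENT_ID", "GRAPH_CLIENT_SECRET", "GRAPH_TENANT_ID")


def load_graph_config(env=os.environ):
    """Build the five connection values from a mapping of environment variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    tenant_id = env["GRAPH_TENANT_ID"]
    return {
        "client_id": env["GRAPH_CLIENT_ID"],
        "client_secret": env["GRAPH_CLIENT_SECRET"],
        "tenant_id": tenant_id,
        # The authority is derived from the tenant ID, as in the snippet above
        "authority": f"https://login.microsoftonline.com/{tenant_id}",
        # App-only tokens request the .default scope of the resource
        "scope": ["https://graph.microsoft.com/.default"],
    }
```

Calling load_graph_config() with no arguments reads from the real environment; passing a plain dictionary makes the function easy to exercise without touching real credentials.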

Authentication with the Microsoft Graph API

Now that we have defined all the necessary variables, we can generate an access token for the Microsoft Graph API. We will do this by creating an instance of the MSAL class ConfidentialClientApplication. Its constructor takes three inputs:

  • Client id

  • Authority

  • Client credential (client secret)

Once the application object is created, we can call one of its methods to obtain the API token. This method is called "acquire_token_for_client", and we will pass in the scope that we defined earlier.


See a code example below:

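def get_token(c_id, c_secret, auth, scopes):
    """Return an app-only access token, or "Authentication Failed" on any error."""
    try:
        # Confidential client: authenticates with the client secret
        app = msal.ConfidentialClientApplication(
            c_id, authority=auth,
            client_credential=c_secret,
        )
        # Client credentials flow; no user name or password is involved
        result = app.acquire_token_for_client(scopes=scopes)
        # On failure MSAL returns a dict without 'access_token' (KeyError)
        return result['access_token']
    except Exception:
        return "Authentication Failed"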
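
When a token request fails, MSAL does not raise an exception; it returns a dictionary that carries "error" and "error_description" keys instead of "access_token". The sketch below shows one way to surface that detail rather than a generic failure string. The helper name extract_access_token is an assumption for this example; it only inspects the dictionary and makes no MSAL calls itself.

```python
def extract_access_token(result):
    """Return the token from an MSAL result dict, or raise with MSAL's error text."""
    if result and "access_token" in result:
        return result["access_token"]
    # Failed requests carry 'error' and 'error_description' instead of a token
    result = result or {}
    error = result.get("error", "unknown_error")
    description = result.get("error_description", "no description returned")
    raise RuntimeError(f"Token request failed ({error}): {description}")
```

It would be applied to the value returned by acquire_token_for_client, so that a misconfigured secret or missing permission shows up with its actual error code.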

Now that we have our access token, let's get started collecting data.


Gathering Email Data

In this example, we are going to collect email data. We will need three more Python libraries to accomplish this: Requests, json, and pandas. The imports are listed below:

import requests
import json
import pandas as pd

The first step to collecting email data is to define our endpoint URL. Most Microsoft Graph API URLs start with the same base URL, which is https://graph.microsoft.com/v1.0/. Then we add the endpoint for email, which you'll see below.

'''
GET /users/{id | userPrincipalName}/mailFolders/{id}/messages
'''

user_id = '***************'
url = f'https://graph.microsoft.com/v1.0/users/{user_id}/mailFolders/inbox/Messages?$top=999&$select=sender,subject,toRecipients'

We also define two query parameters, "$top" and "$select". By default, this endpoint returns only 10 emails per request, so for this example we set $top to 999, meaning that the request will return up to 999 emails. $select tells the API which fields we want returned. For this request, each email will come back with the sender, subject, and recipients rather than the full message.
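
A mailbox can hold more messages than a single request returns. When more results are available, Microsoft Graph includes an "@odata.nextLink" URL in the response, and requesting that URL returns the next page. The following is a minimal sketch of following those links. The function collect_all_messages, the fetch_page callable, and the max_pages safety limit are assumptions for this example; fetch_page stands in for whatever code performs the HTTP request and returns the parsed JSON.

```python
def collect_all_messages(first_url, fetch_page, max_pages=50):
    """Follow @odata.nextLink until the last page, returning every item in 'value'.

    fetch_page is any callable that takes a URL and returns the parsed JSON dict.
    max_pages is a safety limit so a bad response cannot loop forever.
    """
    messages = []
    url = first_url
    pages = 0
    while url and pages < max_pages:
        page = fetch_page(url)
        messages.extend(page.get("value", []))
        # Absent on the last page, which ends the loop
        url = page.get("@odata.nextLink")
        pages += 1
    return messages
```

Keeping the HTTP call behind fetch_page means the paging logic can be tried against canned dictionaries before it is pointed at a real mailbox.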


Now we'll retrieve a token using the code we wrote earlier, define headers for the request, and make the request using the Python Requests library. We will load the response as JSON. This will make it easier to parse through in the next step.

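token = get_token(client_id, client_secret, authority, scope)
if token == "Authentication Failed":
    # Stop here; the request below cannot succeed without a token
    raise SystemExit('Invalid Token. Abort Process')
headers = {"Authorization": f"Bearer {token}"}
emails = requests.get(url, headers=headers)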
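
A request can also fail after authentication succeeds, for example when the application lacks permission to read the mailbox. As a hedged sketch, the request and a status check could be wrapped together. The helper get_graph_json and its http parameter are assumptions for this example; http is any object with a requests-style get method, so the requests module itself can be passed in.

```python
def get_graph_json(url, token, http):
    """GET a Graph URL with a bearer token and return the parsed JSON body.

    http is any object exposing a requests-style get(url, headers=..., timeout=...).
    """
    headers = {"Authorization": f"Bearer {token}"}
    response = http.get(url, headers=headers, timeout=30)
    if response.status_code != 200:
        # Include the start of the body; Graph error responses describe the cause
        raise RuntimeError(
            f"Graph request failed with HTTP {response.status_code}: {response.text[:200]}"
        )
    return response.json()
```

With the Requests library imported, the call would look like get_graph_json(url, token, requests).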

Processing the Data

Now that we have requested and received data, the next step is to process the data. In this example, we will perform basic processing and output the result in a CSV file. All the email data we are after is within the key “value” of the “email_json” variable. In the code segment below, you can see we have a “for loop” that will iterate through each email in the “value” key.

# Parse the response body into a Python dictionary
email_json = json.loads(emails.text)
email_data = []
# Each entry under 'value' is one email message
for email in email_json['value']:
    subject = email['subject']
    sender_name = email['sender']['emailAddress']['name']
    sender_email = email['sender']['emailAddress']['address']
    recipients = []
    # Collect every "To" recipient as a [name, address] pair
    for recipient in email['toRecipients']:
        recip_name = recipient['emailAddress']['name']
        recip_email = recipient['emailAddress']['address']
        recipients.append([recip_name, recip_email])
    email_data.append([subject, sender_name, sender_email, recipients])

# One row per email
df_emails = pd.DataFrame(data=email_data, columns=['Subject', 'SenderName', 'SenderEmail', 'Recipients'])

df_emails.to_csv("Email_Data.csv", index=False)

For each email, we capture the Subject, Sender Name, Sender Email, and the Recipients. As you can see, we also have a nested "for loop" that iterates through the Recipients and adds them to a recipients list variable. Once we have all the data stored in Python variables, we add it to the email data list. The last step is to take the list of processed emails, convert it to a Pandas DataFrame, and save that DataFrame as a CSV file.
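
Because Recipients is a list of lists, it is written to the CSV as the text of a Python list, which is awkward to read back later. One possible alternative, sketched here under the assumption that only the recipient addresses are needed, is to flatten each message into a flat dictionary with the addresses joined by "; ". The helper flatten_message is hypothetical, and it uses .get so that a message with a missing sender does not stop the run.

```python
def flatten_message(message):
    """Turn one Graph message dict into a flat row suitable for a CSV file."""
    # 'sender' may be absent or null, so fall back to empty dictionaries
    sender = (message.get("sender") or {}).get("emailAddress", {})
    recipients = [
        r.get("emailAddress", {}).get("address", "")
        for r in message.get("toRecipients", [])
    ]
    return {
        "Subject": message.get("subject", ""),
        "SenderName": sender.get("name", ""),
        "SenderEmail": sender.get("address", ""),
        # A single delimited string instead of a nested list
        "Recipients": "; ".join(recipients),
    }
```

A list of these rows can be passed straight to pd.DataFrame, for example pd.DataFrame([flatten_message(m) for m in email_json['value']]).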


Full Code Snippet:

import json

import msal
import pandas as pd
import requests

client_secret = '**************'
client_id = '**************'
tenant_id = '**************'

authority = f'https://login.microsoftonline.com/{tenant_id}'
# App-only (client credentials) tokens use the resource's .default scope
scope = ['https://graph.microsoft.com/.default']

def get_token(c_id, c_secret, auth, scopes):
    """Return an app-only access token, or "Authentication Failed" on any error."""
    try:
        app = msal.ConfidentialClientApplication(
            c_id, authority=auth,
            client_credential=c_secret,
        )
        result = app.acquire_token_for_client(scopes=scopes)
        # On failure MSAL returns a dict without 'access_token' (KeyError)
        return result['access_token']
    except Exception:
        return "Authentication Failed"

'''
GET /users/{id | userPrincipalName}/mailFolders/{id}/messages
'''

user_id = '***************'
url = f'https://graph.microsoft.com/v1.0/users/{user_id}/mailFolders/inbox/messages?$top=999&$select=sender,subject,toRecipients'

token = get_token(client_id, client_secret, authority, scope)
if token == "Authentication Failed":
    # Stop here; the request below cannot succeed without a token
    raise SystemExit('Invalid Token. Abort Process')
headers = {"Authorization": "Bearer " + token}
emails = requests.get(url, headers=headers)

email_json = json.loads(emails.text)
email_data = []
# Each entry under 'value' is one email message
for email in email_json['value']:
    subject = email['subject']
    sender_name = email['sender']['emailAddress']['name']
    sender_email = email['sender']['emailAddress']['address']
    recipients = []
    for recipient in email['toRecipients']:
        recip_name = recipient['emailAddress']['name']
        recip_email = recipient['emailAddress']['address']
        recipients.append([recip_name, recip_email])
    email_data.append([subject, sender_name, sender_email, recipients])

df_emails = pd.DataFrame(data=email_data, columns=['Subject', 'SenderName', 'SenderEmail', 'Recipients'])
df_emails.to_csv("Email_Data.csv", index=False)

Conclusion

The Microsoft Graph API provides a powerful foundation for collecting and analyzing organizational data, but its true value comes from how that data is integrated into automated workflows and decision-making processes.


By leveraging APIs and automation, organizations can move beyond manual data collection and begin building scalable, data-driven operations that improve visibility and efficiency across the business.


If you are looking to operationalize data collection, integrate Microsoft services, or build automation workflows at scale, 5.15 Technologies can design and implement a strategy tailored to your environment.



Explore how automation can transform your data workflows and operational visibility.



©2026 by 5.15 Technologies, LLC
