Connecting to SharePoint Using Python and Microsoft Graph API
Introduction
Microsoft SharePoint is a powerful platform for managing files and documents within an organization. However, accessing SharePoint programmatically can be challenging without the right tools. In this guide, we’ll walk through how to connect to SharePoint using Python and the Microsoft Graph API to list all files and folders within a site.
We’ll use the requests library to interact with the Graph API and retrieve SharePoint data. This tutorial includes setting up authentication, retrieving the site ID, obtaining the document drive ID, and listing folder contents.
Prerequisites
Before proceeding, ensure you have the following:
- A registered Azure AD (now Microsoft Entra ID) application with Microsoft Graph application permissions that cover reading sites, such as Sites.Read.All, and admin consent granted
- The requests library installed in Python (pip install requests)
- Client credentials (Tenant ID, Client ID, and Client Secret) for authentication
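Since the client credentials are secrets, it can help to keep them out of the script itself. The following is a minimal sketch of reading them from environment variables; the function name load_credentials and the variable names SP_TENANT_ID, SP_CLIENT_ID, and SP_CLIENT_SECRET are illustrative assumptions, not names required by Azure or Microsoft Graph.

```python
import os


def load_credentials(env=os.environ):
    """Read Tenant ID, Client ID, and Client Secret from environment variables.

    The variable names are illustrative assumptions, not a convention
    required by Azure or Microsoft Graph.
    """
    names = ("SP_TENANT_ID", "SP_CLIENT_ID", "SP_CLIENT_SECRET")
    # Collect every missing or empty variable so the error names all of them.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

The returned tuple could be unpacked as tenant_id, client_id, client_secret = load_credentials() and passed to the client class defined in the next section.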
Setting Up Authentication
The first step is to authenticate our Python script with Microsoft Graph using OAuth2 client credentials.
import requests
import os

class SharePointClient:
    def __init__(self, tenant_id, client_id, client_secret):
        self.tenant_id = tenant_id
        self.client_id = client_id
        self.client_secret = client_secret
        self.token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
        self.graph_api_url = "https://graph.microsoft.com/v1.0"
        self.access_token = self.get_access_token()

    def get_access_token(self):
        body = {
            'grant_type': 'client_credentials',
            'client_id': self.client_id,
            'client_secret': self.client_secret,
            'scope': "https://graph.microsoft.com/.default"
        }
        response = requests.post(self.token_url, data=body, headers={'Content-Type': 'application/x-www-form-urlencoded'})
        if response.status_code == 200:
            return response.json().get('access_token')
        else:
            raise Exception(f"Failed to get access token: {response.status_code} - {response.text}")

This function retrieves an OAuth2 access token, which is required for API requests.
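Note that the class requests a token once, in the constructor, so a long-running script would start receiving authorization errors after the token expires. The token response also carries an expires_in field (in seconds), which makes a small cache possible. Below is a minimal sketch under stated assumptions: TokenCache, fetch_token, clock, and margin are hypothetical names, and fetch_token is assumed to be a callable that returns the parsed token response as a dictionary.

```python
import time


class TokenCache:
    """Cache an access token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, margin=60):
        # fetch_token: callable returning the parsed token response, a dict
        # with 'access_token' and 'expires_in' (seconds). Hypothetical hook.
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin  # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Fetch on first use, or when the token is within `margin` of expiring.
        if self._token is None or now >= self._expires_at - self._margin:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            self._expires_at = now + float(payload.get("expires_in", 0))
        return self._token
```

One way to wire this in would be to have fetch_token post to the token URL and return response.json(), then call cache.get() when building the Authorization header instead of reading a stored attribute.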
Retrieving the SharePoint Site ID
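To access files in SharePoint, we need the site ID. Graph looks a site up by an identifier of the form {hostname}:/{server-relative-path} (for example, contoso.sharepoint.com:/sites/YourSite) rather than by the full browser URL. The function below passes that identifier as site_url and returns the id field of the response: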
    def get_site_id(self, site_url):
        full_url = f"{self.graph_api_url}/sites/{site_url}"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('id')
        else:
            raise Exception(f"Error retrieving site ID: {response.status_code} - {response.text}")

Retrieving Drive Information
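A SharePoint site may contain multiple document libraries, which Graph exposes as drives. We need a drive ID before accessing files, so get_drive_id requests the site's drives. Note that, despite its name, it returns the full list of drive objects rather than a single ID.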
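With several libraries on a site, selecting one by name is usually safer than relying on its position in the response. The following is a minimal sketch, assuming the drive objects are the dictionaries found under value in the /drives response, each with id and name fields; find_drive_id is a hypothetical helper and the sample data is made up for illustration.

```python
def find_drive_id(drives, name):
    """Return the id of the drive whose name matches, ignoring case."""
    for drive in drives:
        if drive.get("name", "").lower() == name.lower():
            return drive["id"]
    # List what is available so a typo in the name is easy to spot.
    available = ", ".join(d.get("name", "?") for d in drives)
    raise LookupError(f"No drive named {name!r}; available: {available}")


# Illustrative data shaped like the 'value' list of a /drives response.
sample_drives = [
    {"id": "drive-1", "name": "Documents"},
    {"id": "drive-2", "name": "Archive"},
]
print(find_drive_id(sample_drives, "archive"))  # drive-2
```

The sample list stands in for a real response; the method that requests the drive list from Graph is: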
    def get_drive_id(self, site_id):
        full_url = f"{self.graph_api_url}/sites/{site_id}/drives"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('value')
        else:
            raise Exception(f"Error retrieving drive ID: {response.status_code} - {response.text}")

Listing Folder Contents
After obtaining the site_id and drive_id, we can list all files and folders in the root directory.
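One caveat on "all": Graph returns large collections in pages and includes an @odata.nextLink URL in the response when more items remain, so a single request may not return every item in a large folder. The following is a minimal sketch of following those links; iter_pages and its fetch_json parameter are hypothetical names, and the page data is made up so the example runs without network access.

```python
def iter_pages(first_url, fetch_json):
    """Yield every item from a paged Graph collection.

    fetch_json is a caller-supplied callable (an assumption of this sketch)
    that takes a URL and returns the parsed JSON response as a dict.
    """
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page.get("value", [])
        # Graph includes '@odata.nextLink' only when more pages remain.
        url = page.get("@odata.nextLink")


# Illustrative stand-in for two pages of a children listing.
fake_pages = {
    "page1": {"value": [{"name": "a.txt"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"name": "b.txt"}]},
}
names = [item["name"] for item in iter_pages("page1", fake_pages.__getitem__)]
print(names)  # ['a.txt', 'b.txt']
```

With requests, fetch_json could be something like lambda url: requests.get(url, headers=headers).json(), where headers carries the bearer token. The single-request method for the root folder is: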
    def get_folder_content(self, site_id, drive_id):
        full_url = f"{self.graph_api_url}/sites/{site_id}/drives/{drive_id}/root/children"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('value')
        else:
            raise Exception(f"Error retrieving folder content: {response.status_code} - {response.text}")

Displaying All Folder Contents
If we need to retrieve the contents of a specific folder, we can modify the API request as follows:
    def list_folder_contents(self, site_id, drive_id, folder_id):
        full_url = f"{self.graph_api_url}/sites/{site_id}/drives/{drive_id}/items/{folder_id}/children"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('value')
        else:
            raise Exception(f"Error listing folder contents: {response.status_code} - {response.text}")

Running the Script
Below is an example of how to use the SharePointClient class to connect to SharePoint, retrieve site and drive information, and list folder contents:
# Define your credentials
tenant_id = 'your-tenant-id'
client_id = 'your-client-id'
client_secret = 'your-client-secret'
# Graph expects "{hostname}:/{server-relative-path}",
# for example "contoso.sharepoint.com:/sites/YourSite"
site_url = "your-sharepoint-site-url"

# Initialize the client
client = SharePointClient(tenant_id, client_id, client_secret)

# Get site ID
site_id = client.get_site_id(site_url)
print("Site ID:", site_id)

# Get drive information (a list of drive objects)
drive_info = client.get_drive_id(site_id)
print("Drives:", drive_info)
drive_id = drive_info[0]['id']  # Assume the first drive is the main document library

# Get root folder content
folder_content = client.get_folder_content(site_id, drive_id)
print("Root Content:", folder_content)

# Retrieve contents of a specific folder (here, the first folder in the root directory)
folders = [item for item in folder_content if 'folder' in item]
if folders:
    contents = client.list_folder_contents(site_id, drive_id, folders[0]['id'])
    for content in contents:
        item_type = 'Folder' if 'folder' in content else 'File'
        mime_type = content.get('file', {}).get('mimeType', 'N/A')
        print(f"Name: {content['name']}, Type: {item_type}, MimeType: {mime_type}")
else:
    print("No folders found in the root directory.")

Conclusion
With this Python script, you can easily connect to SharePoint, navigate through document libraries, and retrieve file and folder contents using the Microsoft Graph API. This approach can be extended to upload, download, or manipulate files within SharePoint programmatically.
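As one example of such an extension, a file can be downloaded by requesting the content of a drive item at /drives/{drive_id}/items/{item_id}/content. The following is a minimal sketch, not part of the original class: download_item and its parameters are hypothetical names, and http is assumed to be any object with a requests-style get() method, such as the requests module itself or a requests.Session.

```python
def download_item(http, access_token, drive_id, item_id, dest_path,
                  graph_api_url="https://graph.microsoft.com/v1.0"):
    """Save a drive item's bytes to dest_path and return the byte count.

    http is any object with a requests-style get() method, such as the
    requests module or a requests.Session (an assumption of this sketch).
    """
    url = f"{graph_api_url}/drives/{drive_id}/items/{item_id}/content"
    headers = {"Authorization": f"Bearer {access_token}"}
    # stream=True avoids loading the whole file into memory at once.
    response = http.get(url, headers=headers, stream=True)
    if response.status_code != 200:
        raise Exception(f"Error downloading item: {response.status_code} - {response.text}")
    written = 0
    with open(dest_path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=8192):
            fh.write(chunk)
            written += len(chunk)
    return written
```

With the client from this guide, a call might look like download_item(requests, client.access_token, drive_id, item['id'], item['name']), where item is one of the file entries returned by the listing methods.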
By leveraging the Graph API, you gain more flexibility and automation capabilities for managing SharePoint content within your organization.