MatMaq

Connecting to SharePoint Using Python and Microsoft Graph API

Introduction

Microsoft SharePoint is a powerful platform for managing files and documents within an organization. However, accessing SharePoint programmatically can be challenging without the right tools. In this guide, we’ll walk through how to connect to SharePoint using Python and the Microsoft Graph API to list all files and folders within a site.

We’ll use the requests library to call the Graph API and retrieve SharePoint data. This tutorial covers setting up authentication, retrieving the site ID, obtaining the ID of a document library (drive), and listing folder contents.

Prerequisites

Before proceeding, ensure you have the following:

  • A registered Azure AD application with Microsoft Graph application permissions that allow reading SharePoint sites (for example, Sites.Read.All), with admin consent granted
  • The requests library installed in Python (pip install requests)
  • Client credentials (Tenant ID, Client ID, and Client Secret) for authentication
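
Because the client credentials above are secrets, one option is to read them from environment variables instead of writing them into the script. The sketch below is only an illustration: the variable names SP_TENANT_ID, SP_CLIENT_ID, and SP_CLIENT_SECRET and the helper load_credentials are hypothetical and not part of any library.

```python
import os

# Hypothetical variable names; use whatever fits your deployment.
REQUIRED_VARS = ("SP_TENANT_ID", "SP_CLIENT_ID", "SP_CLIENT_SECRET")


def load_credentials(env=os.environ):
    """Return (tenant_id, client_id, client_secret) read from the environment."""
    # Collect every missing or empty variable so the error names all of them at once
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return tuple(env[name] for name in REQUIRED_VARS)
```

The returned tuple can be unpacked straight into the three values the client class in the next section expects.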

Setting Up Authentication

The first step is to authenticate our Python script with Microsoft Graph using OAuth2 client credentials.

import requests


class SharePointClient:
    def __init__(self, tenant_id, client_id, client_secret):
        self.tenant_id = tenant_id
        self.client_id = client_id
        self.client_secret = client_secret
        self.token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
        self.graph_api_url = "https://graph.microsoft.com/v1.0"
        # Request a token up front so every later call can reuse it
        self.access_token = self.get_access_token()

    def get_access_token(self):
        # Client credentials grant: the app authenticates as itself, with no user involved
        body = {
            'grant_type': 'client_credentials',
            'client_id': self.client_id,
            'client_secret': self.client_secret,
            'scope': "https://graph.microsoft.com/.default"
        }
        response = requests.post(self.token_url, data=body, headers={'Content-Type': 'application/x-www-form-urlencoded'})

        if response.status_code == 200:
            return response.json().get('access_token')
        else:
            raise Exception(f"Failed to get access token: {response.status_code} - {response.text}")

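The get_access_token method requests an OAuth2 access token, which is sent as a Bearer token with every Graph API request. The constructor calls it once and stores the result in self.access_token.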
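
Access tokens expire, and the token endpoint's JSON response reports the lifetime in seconds as expires_in. For a long-running script, a small cache that refreshes the token shortly before expiry may help. The following is a minimal sketch, not part of the original client: TokenCache, fetch_token, and margin_seconds are hypothetical names, and fetch_token is assumed to be any callable that returns the token response as a dict.

```python
import time


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, margin_seconds=60):
        # fetch_token: callable returning the token endpoint's JSON as a dict,
        # e.g. {'access_token': '...', 'expires_in': 3599}
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Refresh when there is no token yet or it is within the safety margin of expiry
        if self._token is None or now >= self._expires_at - self._margin:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            self._expires_at = now + float(payload.get("expires_in", 0))
        return self._token
```

To use it with the class above, get_access_token would need to return the whole response.json() dict rather than only the access_token value.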

Retrieving the SharePoint Site ID

To access files in SharePoint, we need the site ID. The function below fetches it from the site's Graph path, which has the form hostname:/server-relative-path (for example, contoso.sharepoint.com:/sites/Marketing, where contoso and Marketing are placeholders):

    # Method of SharePointClient
    def get_site_id(self, site_url):
        full_url = f"{self.graph_api_url}/sites/{site_url}"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('id')
        else:
            raise Exception(f"Error retrieving site ID: {response.status_code} - {response.text}")
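
The value passed as site_url is not the address copied from the browser. Graph addresses a site as hostname:/server-relative-path, or as the bare hostname for the root site. A small conversion helper could look like the sketch below; to_graph_site_path is a hypothetical name, and the contoso URL in the usage note is a placeholder.

```python
from urllib.parse import urlparse


def to_graph_site_path(browser_url):
    """Convert 'https://host/sites/Name' into 'host:/sites/Name'."""
    parsed = urlparse(browser_url)
    if not parsed.netloc:
        raise ValueError(f"Not an absolute URL: {browser_url!r}")
    path = parsed.path.rstrip("/")
    # A bare hostname (no path) addresses the root site of that host
    return f"{parsed.netloc}:{path}" if path else parsed.netloc
```

For example, to_graph_site_path("https://contoso.sharepoint.com/sites/Marketing") produces a string that can be passed to get_site_id.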

Retrieving Drive Information

A SharePoint site may contain multiple document libraries, which the Graph API exposes as drives. We need a drive ID before accessing files. Despite its name, get_drive_id returns the full list of drives for the site; the caller picks one and reads its id.

    # Method of SharePointClient
    def get_drive_id(self, site_id):
        full_url = f"{self.graph_api_url}/sites/{site_id}/drives"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            # 'value' holds a list with one entry per document library
            return response.json().get('value')
        else:
            raise Exception(f"Error retrieving drive ID: {response.status_code} - {response.text}")
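
When a site has several libraries, taking the first entry of the list may not give the one we want. One alternative is to select a drive by its display name. The sketch below assumes each drive dict carries the name and id fields that Graph returns; find_drive_id is a hypothetical helper, and the default name "Documents" is an assumption that may not match your tenant or language.

```python
def find_drive_id(drives, name="Documents"):
    """Return the id of the drive whose 'name' matches, ignoring case."""
    for drive in drives:
        if drive.get("name", "").lower() == name.lower():
            return drive["id"]
    # Report the names that do exist to make a typo easy to spot
    available = [d.get("name") for d in drives]
    raise LookupError(f"No drive named {name!r}; available: {available}")
```

The drives argument is the list returned by get_drive_id.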

Listing Folder Contents

After obtaining the site_id and drive_id, we can list all files and folders in the root directory.

    # Method of SharePointClient
    def get_folder_content(self, site_id, drive_id):
        full_url = f"{self.graph_api_url}/sites/{site_id}/drives/{drive_id}/root/children"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('value')
        else:
            raise Exception(f"Error retrieving folder content: {response.status_code} - {response.text}")
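
Graph returns large listings in pages: when more items remain, the response includes an @odata.nextLink URL for the next page, so reading only value can silently miss items in big folders. A minimal sketch of following those links is shown below. iter_all_items and fetch_json are hypothetical names; fetch_json is assumed to be a callable that performs the authenticated GET and returns the parsed JSON.

```python
def iter_all_items(first_url, fetch_json):
    """Yield every item across pages by following '@odata.nextLink'."""
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page.get("value", [])
        # The key is absent on the last page, which ends the loop
        url = page.get("@odata.nextLink")
```

With the client above, fetch_json could be a small function that calls requests.get with the same Authorization header and returns response.json().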

Listing the Contents of a Specific Folder

To retrieve the contents of a specific folder, we address the folder by its item ID instead of using root:

    # Method of SharePointClient
    def list_folder_contents(self, site_id, drive_id, folder_id):
        full_url = f"{self.graph_api_url}/sites/{site_id}/drives/{drive_id}/items/{folder_id}/children"
        headers = {'Authorization': f'Bearer {self.access_token}'}
        response = requests.get(full_url, headers=headers)
        if response.status_code == 200:
            return response.json().get('value')
        else:
            raise Exception(f"Error listing folder contents: {response.status_code} - {response.text}")
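
Since each call lists only one level, listing an entire tree means descending into every item that carries a folder facet. The recursive sketch below shows one way to do that. walk and list_children are hypothetical names; list_children is assumed to be a callable that takes a folder ID and returns that folder's items.

```python
def walk(list_children, folder_id, prefix=""):
    """Yield (path, item) for every file and folder below folder_id."""
    for item in list_children(folder_id):
        path = f"{prefix}/{item['name']}"
        yield path, item
        # Folders carry a 'folder' facet; files do not
        if "folder" in item:
            yield from walk(list_children, item["id"], path)
```

With the client above, list_children could be lambda fid: client.list_folder_contents(site_id, drive_id, fid). Each folder costs one request, so deep trees can take a while.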

Running the Script

Below is an example of how to use the SharePointClient class to connect to SharePoint, retrieve site and drive information, and list folder contents:

# Define your credentials
tenant_id = 'your-tenant-id'
client_id = 'your-client-id'
client_secret = 'your-client-secret'
# Graph site path in the form hostname:/server-relative-path
site_url = "your-sharepoint-site-url"

# Initialize the client (this also requests the access token)
client = SharePointClient(tenant_id, client_id, client_secret)

# Get site ID
site_id = client.get_site_id(site_url)
print("Site ID:", site_id)

# Get drive information (a list with one entry per document library)
drive_info = client.get_drive_id(site_id)
print("Drives:", drive_info)
drive_id = drive_info[0]['id']  # Assume the first drive is the main document library

# Get root folder content
folder_content = client.get_folder_content(site_id, drive_id)
print("Root Content:", folder_content)

# Retrieve contents of a specific folder (here, the first folder in the root directory)
folders = [item for item in folder_content if 'folder' in item]
if folders:
    contents = client.list_folder_contents(site_id, drive_id, folders[0]['id'])
    for content in contents:
        kind = 'Folder' if 'folder' in content else 'File'
        mime_type = content.get('file', {}).get('mimeType', 'N/A')
        print(f"Name: {content['name']}, Type: {kind}, MimeType: {mime_type}")
else:
    print("No folders found in the root directory")

Conclusion

With this Python script, you can easily connect to SharePoint, navigate through document libraries, and retrieve file and folder contents using the Microsoft Graph API. This approach can be extended to upload, download, or manipulate files within SharePoint programmatically.

By leveraging the Graph API, you gain more flexibility and automation capabilities for managing SharePoint content within your organization.
