DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • DuckDB for Python Developers
  • Stop Writing Slow Pandas Code: Vectorization and Modern Alternatives Explained
  • Python Packages for Validating Database Migration Projects
  • dovpanda: Unlock Pandas Efficiency With Automated Insights

Trending

  • LLM Agents and Getting Started with Them
  • The Hidden Cost of Overprivileged Tokens: Designing Messaging Platforms That Assume Compromise
  • Designing API-First EMR Architectures in .NET: Enabling Modular Growth in Compliance-Driven Systems
  • S3 Vectors: How to Build a RAG Without a Vector Database
  1. DZone
  2. Coding
  3. Languages
  4. Automating Excel Workflows in Box Using Python, Box SDK, and OpenPyXL

Automating Excel Workflows in Box Using Python, Box SDK, and OpenPyXL

Automate Excel workflows on Box using Python, Box SDK, and openpyxl to sync data, eliminate manual updates, and streamline reporting.

By 
Sweetty P Devassy user avatar
Sweetty P Devassy
·
Abhinav K user avatar
Abhinav K
·
Veena Sudhakaran user avatar
Veena Sudhakaran
·
Oct. 22, 25 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
2.3K Views

Join the DZone community and get the full member experience.

Join For Free

In many organizations, MS Excel remains the go-to tool for storing and sharing structured data, whether it’s tracking project progress, managing audit logs, or maintaining employee or resource details. Yet, a surprisingly common challenge persists: data is still being copied and updated manually.

Teams across different functions, especially management and DevOps, often find themselves entering or syncing data from one source into Excel spreadsheets manually and repeatedly. This not only consumes time but also introduces room for errors and inconsistencies.

For example:

  • A manager who regularly fetches data from a project board to Excel to track progress.
  • A DevOps engineer who tracks resource utilization across environments.
  • An auditor who needs to sync logs from internal tools into an Excel sheet stored in Box for compliance review.

These tasks are ripe for automation.

Box is a popular cloud storage provider used by several organizations to store files related to projects and employees. With the help of modern tools like the Box Python SDK, openpyxl, and pandas, it's possible to read and write Excel files directly in your code, no downloading or manual editing required.

In this blog post, we’ll explore how to:

  • Connect to Box using Python
  • Read Excel files stored in Box
  • Update and append new data to sheets
  • Upload the updated file back to Box

This workflow is especially useful for DevOps, SREs, and team leads who need to keep operational data synced across systems, or for managers and analysts looking to automate routine Excel updates.

Setting Up Your Box App

Step 1: Create a Box Developer App

  1. Go to the Box Developer Console.
  2. Click Create New App.
  3. Choose "Custom App" as the app type.
  4. Give your app a name and description.
  5. Select "OAuth 2.0 with JWT (Server Authentication)".

This will generate a new application that can interact with your Box account using server-side authentication.

Step 2: Configure App Permissions

Once your app is created:

  1. Go to your app’s Configuration tab.
  2. Under Application Scope, enable:
    • Read and write all files and folders stored in the Box.

Important: Once configured, you'll need to submit the app for authorisation by your Box Admin (if your account is part of a Box Enterprise). Until it’s approved, API calls will be limited or denied.

Step 3: Generate and Download Configuration File

Back in the Configuration tab:

  1. Scroll down to the App Credentials section.
  2. Click Generate a Public/Private Keypair.
  3. A config JSON file will be downloaded. This contains:
    • Client ID and Secret
    • JWT Public Key ID
    • Private Key (for signing the JWT)
    • Passphrase for the private key
    • Enterprise ID

Step 4: Share the Box Folder/File With the Box App Automation User

  1. Go to your app's General settings. 
  2. Under the Service account ID, copy the automation user ID.
  3. Share the box folder/file with this automation user ID. This will give read and write privileges to the box app. 

Keep this file safe and private — it grants full API access.

Installing Required Libraries

Before diving into code, let's install the required Python libraries. These will enable interaction with Box and manipulation of Excel files.

pip install boxsdk openpyxl pandas

Here’s a quick overview of what each library does:

  • boxsdk: The official Python SDK for Box’s APIs. It allows you to authenticate, access files, upload/download content, and manage folders using the Box Developer API.
  • openpyxl: A powerful library for reading and writing Excel .xlsx files in Python. It lets you work with individual cells, formulas, styles, and sheets.
  • pandas: A data analysis library in Python. Useful when you want to process or filter Excel data in a tabular format using DataFrames.

Authenticating With Box in Python

To interact with Box via the SDK, you'll first need to authenticate your application.

Python
 
def init_box_client():
    auth = JWTAuth.from_settings_file(‘<path/to/config.json>’)
    client = Client(auth)
    return client


Once authenticated, the client can now access any file, folder, or metadata your app has permissions for (Yes, you need to add the app you created as an editor to the file you want to automate).

Reading Excel Files from Box

Once authenticated, you can access Excel files in Box either by their file ID or by searching for their name.

1. Accessing a File by ID

ID is the last part of the URL of a box file.

Python
 
box_file = client.file(file_id)


2. Downloading to Memory Using BytesIO

You don’t need to write the file to disk. You can load it directly into memory:

Python
 
import io
from openpyxl import load_workbook

file_stream = io.BytesIO()
box_file.download_to(file_stream)
file_stream.seek(0)  # Rewind to beginning
workbook = load_workbook(file_stream)


3. Accessing a Specific Sheet

Python
 
sheet = workbook["Sheet1"]


Or dynamically get the first sheet:

Python
 
sheet = workbook.active


4. Reading into a Pandas DataFrame (Optional)

If you'd prefer to use pandas for data analysis:

Python
 
import pandas as pd

file_stream.seek(0)  # Ensure pointer is at start
df = pd.read_excel(file_stream, sheet_name="Sheet1")


Overall, the read Excel function would look like this:

Python
 
def load_excelsheet_from_box(client, id, sheet_name):
    box_file_ref = client.file(id)
    box_file_metadata = box_file_ref.get()
    file_stream = io.BytesIO()
    box_file_ref.download_to(file_stream)
    file_stream.seek(0)
    workbook = load_workbook(file_stream)
    sheet = workbook[sheet_name]
    return [box_file_ref, box_file_metadata, workbook, sheet]


Working With Excel Data

With your Excel sheet loaded, you can now read and manipulate rows using openpyxl or pandas.

1. Iterating Over Rows With openpyxl

Python
 
for row in sheet.iter_rows(min_row=2, values_only=True):
    print(row)


This will print data from the second row onward (assuming row 1 is headers).

2. Accessing Headers and Filtering

You can extract headers using:

Python
 
headers = [cell.value for cell in sheet[1]]


You might use this to filter or map column positions.

3. Using pandas for Heavy Processing

If you want to filter, pivot, or merge data, load it into a DataFrame:

Python
 
df = pd.read_excel(file_stream)
filtered_df = df[df["Status"] == "Active"]


This approach is powerful when working with large Excel sheets or when doing analysis.

Saving Changes Back to Box

After modifying the Excel workbook, the next step is to save the updated file and upload it back to Box.

1. Saving to Memory

We use BytesIO to avoid writing to disk:

Python
 
updated_stream = io.BytesIO()
workbook.save(updated_stream)
updated_stream.seek(0)


2. Uploading to Box

Box allows you to replace the contents of a file using its ID:

Python
 
client.file(file_id).update_contents_with_stream(updated_stream)


This creates a new version of the file in Box, preserving version history and allowing collaboration without risk of losing older data.

End-to-End Example: Sync Missing Names

Here’s a complete example that demonstrates reading from Box, checking for missing names, appending new entries, and uploading the updated file back to Box.

Use Case

Let’s say you want to track details of all employees who are onboarding to your platform.

Python
 
from boxsdk import Client, JWTAuth
from openpyxl import load_workbook
import io

# Step 1: Authenticate
auth = JWTAuth.from_settings_file("box_config.json")
client = Client(auth)

# Step 2: Download the Excel file from Box
file_id = "1234567890"
box_file = client.file(file_id)
file_stream = io.BytesIO()
box_file.download_to(file_stream)
file_stream.seek(0)

# Step 3: Load and read the Excel sheet
workbook = load_workbook(file_stream)
sheet = workbook["Sheet1"]
existing_names = {
    (row[0] or "").strip().lower()
    for row in sheet.iter_rows(min_row=2, values_only=True)
    if row[0]
}

# Sample new data. This should come from your platforms’ api.
new_rows = [
    {"name": "Suresh", "id": "A001", "owner": "Team A"},
    {"name": "Ramesh", "id": "B002", "owner": "Team B"},
]
# Step 4: Append new entries
for entry in new_rows:
    name = (entry.get("name") or "").strip().lower()
    if name and name not in existing_names:
        sheet.append([entry["name"], entry.get("id", ""), entry.get("owner", "")])

# Step 5: Save and re-upload
updated_stream = io.BytesIO()
workbook.save(updated_stream)
updated_stream.seek(0)
box_file.update_contents_with_stream(updated_stream)


Conclusion

Automating Excel tasks with Box can save hours of manual effort each week. Whether you're a management team syncing reports or a DevOps team tracking infrastructure resources, this workflow helps:

  • Reduce human error
  • Improve efficiency
  • Ensure data consistency

By combining Box SDK, openpyxl, and pandas, you unlock a powerful set of tools to manipulate Excel files in the cloud, without even opening a UI. From daily reports to audit tracking, once set up, these automations are a game-changer for your productivity.

You can refer to the Box documentation for detailed information about available Box APIs:

  • https://developer.box.com/reference/
  • https://github.com/box/box-python-sdk
Software development kit Pandas Python (language)

Opinions expressed by DZone contributors are their own.

Related

  • DuckDB for Python Developers
  • Stop Writing Slow Pandas Code: Vectorization and Modern Alternatives Explained
  • Python Packages for Validating Database Migration Projects
  • dovpanda: Unlock Pandas Efficiency With Automated Insights

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook