Docx Templating With docx4j: Tips and Tricks

Looking to make fancy templates for docx Word documents? See how you can make it happen with docx4j and some nasty pitfalls to avoid during your work.

By Marcin Kulawinek · Oct. 13, 17 · Tutorial

The Problem

Say we need to create a Word document (based on a template) filled with data from a system with 'unlimited' rows and a possibly 'unlimited' number of columns.

Just like in the awesome, pro, fancy picture below.
[Image]

We have a few requirements:

  • We have to use our super/hyper docx template from our legal/report/whatever department.

  • We have many rows, so they do not all fit on one page.

  • We need to repeat the header on every page.

  • We have many columns, so they do not all fit on one page.

  • We need to generate another table starting from a new page if all the columns do not fit on one page.

Solution

My solution to this problem was to:

  1. Read the docx file with docx4j.

  2. Find the template table.

  3. Clear the template content (or remove the template table).

  4. Split the data into parts that fit on one page.

  5. Fill every table with as much data as fits (e.g. 6 drivers on one page).

  6. Start a new table on a new page if there is more data to write.

  7. At the end, remove any columns that were left blank.

I've shared my complete solution on GitHub.
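
As a rough sketch of steps 4 to 6 (this is not the code from the repository; the class name ColumnPager and the limit of 6 columns per table are assumptions for illustration), splitting the columns into per-table chunks could look like this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits the columns into groups that fit one table each
final class ColumnPager {
    private ColumnPager() {}

    /** Splits items into consecutive chunks of at most maxPerTable elements. */
    static <T> List<List<T>> partition(List<T> items, int maxPerTable) {
        if (maxPerTable <= 0) {
            throw new IllegalArgumentException("maxPerTable must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerTable) {
            int end = Math.min(start + maxPerTable, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> drivers =
                Arrays.asList("d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8");
        List<List<String>> tables = partition(drivers, 6);
        for (int i = 0; i < tables.size(); i++) {
            // Every table after the first one would start on a new page
            System.out.println("table " + (i + 1) + ": " + tables.get(i));
        }
    }
}
```

The size of the last chunk also tells you how many template columns stay unused, which is what step 7 has to clean up.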

Tips and Tricks

Set the repeat header row in the template docx: When you prepare the template document, take care to set the "repeat header row" option in the table settings. This way, you will avoid writing your own pagination, which was a nightmare in my case. I facepalmed after realizing this is just a single setting in Word/LibreOffice.
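
If you want to double-check a template before wiring it into code: WordprocessingML marks a repeated header row with a w:tblHeader element inside the row's w:trPr. The sketch below uses only the JDK's DOM parser and assumes you have already read word/document.xml out of the docx (which is a zip archive) into a String. The class and method names are made up for illustration, and it only looks at the first row it finds in the document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of docx4j
final class RepeatHeaderCheck {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    private RepeatHeaderCheck() {}

    /** True if the first w:tr in the document is flagged with w:tblHeader. */
    static boolean firstRowRepeats(String documentXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(documentXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("document.xml could not be parsed", e);
        }
        NodeList rows = doc.getElementsByTagNameNS(W_NS, "tr");
        if (rows.getLength() == 0) {
            return false;
        }
        Element rowProperties = child((Element) rows.item(0), "trPr");
        Element header = (rowProperties == null) ? null : child(rowProperties, "tblHeader");
        if (header == null) {
            return false;
        }
        // A missing w:val means "on"; an explicit 0/false/off switches it off
        String val = header.getAttributeNS(W_NS, "val");
        return !(val.equals("0") || val.equals("false") || val.equals("off"));
    }

    /** Returns the first direct child element with the given local name, or null. */
    private static Element child(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element
                    && W_NS.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

If the check returns false for your template, the setting most likely got lost when the document was saved.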

Do NOT use Word Online to create the template document: I'm a Linux user, so preparing a nice-looking Word document is not an easy task. I decided to use the online version of Word, which got me into a lot of trouble. The biggest problem was that it has no option for repeating the header row in tables (Why? ¯\_(ツ)_/¯ ). You have to use the full desktop Word or try to handle this with LibreOffice.

Do not use VariablePrepare.prepare() when generating multiple copies from one template: This method is suggested on a few sites, including GitHub and Stack Overflow. But if you invoke it multiple times, it does not work nicely. In my repo, I've created a version of prepare that works on individual objects, so you do not have to run prepare on the whole document.

Avoid using documentPart.variableReplace(Map<String, String>) when creating multiple copies: This method got me into trouble as well. I was trying to create a copy of my table and then run variableReplace multiple times. The result was so strange that even now I'm not sure what was happening there.

Use plain Strings to handle templates: If you look inside docx4j, you will see a lot of marshalling and unmarshalling of data and operations on plain Strings. You can do the same.

Doing just...

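// StrSubstitutor comes from Apache Commons (commons-lang3 here; commons-text also ships it)
import org.apache.commons.lang3.text.StrSubstitutor;
import org.docx4j.XmlUtils;

// Work on a copy so the original template object stays untouched
Object template = XmlUtils.deepCopy(object);
String templateAsString = XmlUtils.marshaltoString(template);

// mappings is a Map<String, String> of docx variables to their values:
// in the docx you write ${var}, in mappings the key is just var
StrSubstitutor strSubstitutor = new StrSubstitutor(mappings);
// unmarshalString throws JAXBException if the replaced text is not valid XML
Object result = XmlUtils.unmarshalString(strSubstitutor.replace(templateAsString));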


...is more than enough to replace variables with actual data. 
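
One thing to watch with plain String replacement: the values end up inside XML, so a value containing & or < will break the unmarshalling step. The sketch below is a JDK-only stand-in for the StrSubstitutor call (the class name XmlSafeSubstitution and its methods are hypothetical, not part of docx4j or Commons) that escapes values before inserting them:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: ${var} replacement that keeps the result valid XML
final class XmlSafeSubstitution {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    private XmlSafeSubstitution() {}

    /** Escapes the characters that must not appear verbatim in XML text. */
    static String escapeXml(String value) {
        return value.replace("&", "&amp;")   // must run first
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
    }

    /** Replaces every ${name} that has a mapping; unknown names stay as they are. */
    static String fill(String templateXml, Map<String, String> mappings) {
        Matcher matcher = VARIABLE.matcher(templateXml);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = mappings.get(matcher.group(1));
            String replacement = (value == null) ? matcher.group() : escapeXml(value);
            // quoteReplacement keeps $ and \ in the value from being treated as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Because fill only reads the template String, you can call it once per copy with a different map and each result is independent of the others. If you stay with StrSubstitutor, escaping the map values up front gives the same protection.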

Good Template

The template was quite a challenge. First of all, docx has an annoying tendency to split your text into separate XML parts. So instead of ${var}, you end up with something like ${var</tag><tag>}, which displays OK, but during processing you will find that the variables are not filled with the proper data.
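
A quick way to spot such broken variables is to scan the marshalled XML for placeholders that contain tags. This is only a sketch with made-up names (SplitPlaceholderFinder, findSplit), and it assumes the break happens after the opening ${, so a split between $ and { would go unnoticed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports ${...} placeholders that are interrupted by XML tags
final class SplitPlaceholderFinder {
    // "${", then everything up to the first "}", tags included
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private SplitPlaceholderFinder() {}

    /** Returns the names of all placeholders that have tags inside them. */
    static List<String> findSplit(String xml) {
        List<String> split = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(xml);
        while (matcher.find()) {
            String inner = matcher.group(1);
            if (TAG.matcher(inner).find()) {
                // Strip the tags to recover the variable name the author typed
                split.add(TAG.matcher(inner).replaceAll(""));
            }
        }
        return split;
    }
}
```

Any name it reports is a variable you should retype in one go in the template, so that it lands in a single run.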

Summary

After all that, I think docx4j is quite a handy tool. Still, I found myself a little disappointed that something so common, like generating a document from a template, can be such a hard thing to do.

@EDIT

Thanks to Tom Hombergs for the link to his project, which wraps docx4j.

So if you have a Spring-based project, try it: https://github.com/thombergs/docx-stamper
