Automating Time-Expensive Tasks

My coworker in HR does a lot of interesting tasks that require both people skills and mental sharpness. Payroll is not one of those. It’s a boring, soul-crushing job of reading a spreadsheet row by row and copying and pasting each value into a payroll template, followed by a long time at the printer.

On top of that, several people work on that spreadsheet, which increases the chances of missing a revision or using an outdated copy. As the friendly neighborhood IT staffer, I offered to build a better workflow for this process. It could be done locally with some heavy-duty scripting, but a few people need direct access to the sheet. I also didn’t want to sink too much time into this side task. Not every data-processing tool needs to be a 6-week development job with a list of dependencies as long as your arm.

I also want them to be able to change the template structure and style directly in the workbook without having to touch the code, so they will need a nice worksheet editor.

Typical Desktop Data and Template Task

Call me old-fashioned, but I have a soft spot in my heart for the old “read-from-the-database-push-to-template-field-by-field” workflow. With data and presentation well separated, you can add, remove, edit, and read the content while working in parallel on the nice-looking template that will receive the values.

The Right Tool for the Right Task

I’m using Keikai to turn this Excel file into a web application. I just have to import the workbook into Keikai and use the Java Client API to control it.

Simply put, I want to read column headers and automatically create a copy of the display template for each row. In each copy, I want to fill in the value of the cell matching the column name. Easy as pie.





Well-Formatted Data

Of course, the data must be readable. I’ve retrieved the original workbook and just changed the formatting to follow my simple ruleset:

All the data will be in a table named PayrollTable. It's easy to target from the Java API, plus adding a line to the table will automatically include it in my dataset.

The first line of the table will contain headers. No data here, just the labels for the template fields.

Each row in the table will be output to one copy of the template.

In-Browser Template Editing

Here’s the clever part. While the parsing and replacing are done in Java, the application can be piloted directly by the user simply by editing the template. Moving cells around, changing style, etc. — all operations can be performed directly on the spreadsheet file or through the Keikai web UI.

Read Data From Excel, Use it However You Like

Let’s get to it. The API here is simple. Since I’ve named my target range, I can retrieve it with a single line:

Range payrollRange = spreadsheet.getRangeByName(sheetName, rangeName);





There’s more to the API here if you’d like a detailed read. The range object can then be queried line by line to extract data. The first step is extracting the headers into their own list.

List<String> headersList = new ArrayList<String>();
for (int i = 0; i < columnCount; i++) {
    headersList.add(header.getCell(0, i).getValue());
}





I can then use the header list as a reference to store each row as a map from header to value. The resulting data structure is close to JSON and easy to reuse, which suits a smaller dataset well. For a larger dataset, a two-dimensional array could be a more memory-efficient format.
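To make the header-to-value mapping concrete, here is a minimal sketch of that step. It is not the project's actual code: the class name RowMapper and the method toDataset are made up for illustration, and plain Java lists stand in for the values that would be read from the Keikai range.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: the names here are illustrative, not part of Keikai.
class RowMapper {

    /**
     * Pairs each row's values with the header at the same column index.
     * Plain lists stand in for the values read from the spreadsheet range.
     */
    static List<Map<String, Object>> toDataset(List<String> headers, List<List<Object>> rows) {
        List<Map<String, Object>> dataset = new ArrayList<>();
        for (List<Object> row : rows) {
            // LinkedHashMap keeps the columns in their original order
            Map<String, Object> rowData = new LinkedHashMap<>();
            // Stop at the shorter of the two lists so a short row cannot overflow
            for (int col = 0; col < headers.size() && col < row.size(); col++) {
                rowData.put(headers.get(col), row.get(col));
            }
            dataset.add(rowData);
        }
        return dataset;
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList("Name", "Salary");
        List<List<Object>> rows = Arrays.asList(
                Arrays.<Object>asList("Alice", 3000),
                Arrays.<Object>asList("Bob", 2800));
        // One map per data row, keyed by column header
        System.out.println(toDataset(headers, rows));
    }
}
```

Each map in the resulting list is what later gets pushed into one copy of the template, so renaming a column in the table only changes the keys, not the code.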

Use Named Fields to Fill Values in a Sheet

Almost done. I just need to make a copy of that template for each row of data and push the values to the relevant cells. The main question is how to identify the target cells. The simplest answer is, once again, named ranges. By assigning named ranges to the target cells, I can easily find every target cell and apply the matching value from the Map representing the row.

private void generateAllTemplates(List<Map<String, Object>> dataset) {
    for (Map<String, Object> rowData : dataset) {
        /* clone the template sheet to the end of the sheet list */
        Worksheet cloneTemplate = slipTemplateSheet.copyToEnd(slipTemplateSheet.getWorkbook());
        […]
        /* write every key/value pair to the corresponding named cell in the target template */
        for (Map.Entry<String, Object> entry : rowData.entrySet()) {
            writeEntryToNamedFields(cloneTemplate, entry.getKey(), entry.getValue());
        }
    }
}
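The body of writeEntryToNamedFields is not shown above, so here is a rough sketch of the idea only. Everything in it is an assumption: TemplateSheet is a hypothetical in-memory stand-in for a cloned worksheet (a map from range name to cell value), not a Keikai class, and skipping columns that have no matching named field is my guess at a sensible behavior rather than what the project necessarily does.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: TemplateSheet is a hypothetical stand-in, not the Keikai API.
class NamedFieldWriter {

    /** In-memory stand-in for a cloned template sheet: named range -> cell value. */
    static class TemplateSheet {
        private final Map<String, Object> namedCells = new HashMap<>();

        TemplateSheet(String... fieldNames) {
            for (String name : fieldNames) {
                namedCells.put(name, null); // empty cell waiting for a value
            }
        }

        boolean hasNamedField(String name) { return namedCells.containsKey(name); }
        void setNamedField(String name, Object value) { namedCells.put(name, value); }
        Object getNamedField(String name) { return namedCells.get(name); }
    }

    /** Writes one header/value pair if the template has a field with that name. */
    static boolean writeEntryToNamedFields(TemplateSheet sheet, String header, Object value) {
        if (!sheet.hasNamedField(header)) {
            return false; // assumed behavior: a column without a matching field is skipped
        }
        sheet.setNamedField(header, value);
        return true;
    }

    /** Fills one template copy from one row; returns the number of fields written. */
    static int fill(TemplateSheet sheet, Map<String, Object> rowData) {
        int written = 0;
        for (Map.Entry<String, Object> entry : rowData.entrySet()) {
            if (writeEntryToNamedFields(sheet, entry.getKey(), entry.getValue())) {
                written++;
            }
        }
        return written;
    }
}
```

With a lookup by name, a user who removes a field from the template simply stops receiving that column, and moving a named cell elsewhere on the sheet needs no code change.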





Conclusion

I was able to turn a boring task into a click-and-done workflow using only a few lines of Java and the Keikai library to control an Excel spreadsheet. Designing with named ranges makes it easy to read and write specific parts of the document without hardcoding cell addresses, which, in turn, makes the document extensible (new columns, new fields, moving fields, or renaming them) and puts both the data and the formatting under the user’s control. It also makes the table data available to a different service entirely, such as writing to a database or feeding a business layer.

Source Code

I hope this was as interesting to read as it was to make. The complete source code of this project is available on GitHub.

Just run the project and visit http://localhost:8080/dev-ref/case/payroll.