[xep-support] Next Installment of Cool Tools - Inserting Microsoft Excel spreadsheet into XSL FO documents

From: Kevin Brown <kevin@renderx.com>
Date: Mon Aug 03 2009 - 18:59:26 PDT

Our internal invoicing system used a method of converting Microsoft Excel to
XML data and passing that data to RenderX XEP to format. While this was
already cool, we decided to bump it up a little and put together an XSLT to
convert Microsoft Excel's XML format to XSL FO. Since Microsoft Office 2003,
Excel has had a function to save an Excel spreadsheet in XML format. This
XML file is a self-contained file. It not only has all the data but all the
styling in the structured content.

We had to make a few design decisions in the process in order to build a
solution to extract relevant section of data. A spreadsheet could have
ranges of data that you may wish to format. We decided that we would focus
on the range "Print Area" so that is the only information that is extracted
from a spread sheet. This also allows us to use Excel's "Print Titles" to
create XSL FO table headers.

Styles in Excel can have parent-child like relationships, with child
elements possibly overriding parent elements for certain style elements. We
just implemented a recursion from the child outward from the actual element
to walk all the style information from the templates.

Excel also allows templates for displaying the actual data. For example
using Currency or Percent as a number formatting template. We did not try to
implement a solution to read and interpret the Excel rules to apply to a
format-number() function, we only look at the key names and apply built-in
templates for this.

Excel actually stores the "value" associated with a cell along with any
equation that could be there. We aren't doing any calculations so we just
skip the formulas and use the values to create our table.

The solution we implemented handles many of the common styles, likely enough
to make this perfect in 80% of all cases:

table layout including column widths and row heights
individual borders (top, bottom, left, right), border colors, and border
widths
cell background color
cell text color
font-family and styling like italics and bolding
font-size
text-decoration (underline)
cell and text alignments
table header and body area
row and column spanning
rotations of 90 and -90
Excel's special quick formatting styles like "Good", "Bad", etc.
hyperlinks
decimal, currency, accounting and percent number formats
extraction of named range (Print_Area) as the table with (Print_Titles)
range as table header area. This could be expanded to support multiple named
ranges in a document for multiple tables except one would need also to pass
in the number of rows for table headers.

Lastly, we implemented it such that you can just reference the XML Excel
document and use the document() function to process this whole thing to an
XSL FO table.

It ain't perfect. It was a collaborative effort to kick out something and if
there is something you don't understand or doesn't quite work for you, just
drop us a note and we would be happy to discuss or fix it.

As usual:

Sample output: http://www.xportability.com/CoolTools/testexcel.pdf

Download the package: http://www.xportability.com/CoolTools/Excel2FO.zip

Kevin Brown
RenderX

-------------------
(*) To unsubscribe, send a message with words 'unsubscribe xep-support'
in the body of the message to majordomo@renderx.com from the address
you are subscribed from.
(*) By using the Service, you expressly agree to these Terms of Service http://www.renderx.com/terms-of-service.html
Received on Mon Aug 3 19:57:04 2009

This archive was generated by hypermail 2.1.8 : Mon Aug 03 2009 - 19:57:11 PDT