[xep-support] Selectable/searchable phrases in PDF in table column content spanning lines

From: David Clunie <dclunie@dclunie.com>
Date: Fri Feb 21 2014 - 05:32:48 PST

Hi

In PDF table cells from the DocBook FO stylesheets rendered with xep,
if the words in a table cell wrap across several lines, then they
are not kept together from the perspective of selection or searching
in PDF viewers.

I have not been able to improve this behavior using any of the FO
"keep-together" options (e.g., applied in "table.cell.block.properties"
template customizations). I use "keep-together.within-column"
successfully to prevent cells breaking across pages, but that does
not affect the described problem.

Using "keep-together.within-line" doesn't solve the problem either
(phrases still get split up); it also causes some tables to overflow
the page margins, so it is unusable for this purpose.
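
For reference, the within-line attempt described above amounts to something like the following sketch. The exact form is an assumption, since only the property name and the template name are given here; the import path is a placeholder for wherever the stock DocBook FO stylesheet lives.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a customization layer for the attempted within-line variant. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumption: placeholder path to the stock DocBook FO stylesheet. -->
  <xsl:import href="docbook-xsl/fo/docbook.xsl"/>

  <!-- Overrides the named template that sets properties on the
       fo:block wrapping each table cell's content. -->
  <xsl:template name="table.cell.block.properties">
    <!-- Asks the formatter not to break the cell content across lines.
         As reported above, this makes some tables overflow the page
         margins and does not keep the phrases together in the PDF. -->
    <xsl:attribute name="keep-together.within-line">always</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

Note that overriding the template this way replaces its stock body, so any header or row-header formatting would have to be repeated inside the override.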

In short, I need some way of having the content wrap to fit within
the page (obviously) but remain together from the PDF encoding
perspective, so that phrases are searchable. This is critical for
our use case: we have a standard with many thousands of pages and
need to be able to search for phrases that are (long) names of data
elements that are present in tables and wrapped to fit on a page
(e.g., as in the screen shots, "Shared Functional Groups Sequence").

Since Word can do it, I know PDF can be encoded this way; the question
is how to get xep to do it.

It isn't split in the FO (the text is contained in a single <fo:block>).

The attached screen shots show selection of a phrase spanning lines
wrapped within a cell highlighted when displayed using Acrobat, with
"good" output from Word and "bad" output from XEP.

The DocBook fragment for this row is:

<tr valign="top">
  <td align="left" colspan="1" rowspan="1">
   <para>Shared Functional Groups Sequence</para>
  </td>
  <td align="center" colspan="1" rowspan="1">
   <para>(5200,9229)</para>
  </td>
  <td align="center" colspan="1" rowspan="1">
   <para>1</para>
  </td>
  <td align="left" colspan="1" rowspan="1">
   <para>Sequence that contains the Functional Group Macros that are
shared for all frames in this SOP Instance and Concatenation.</para>
   <note>
    <para>The contents of this sequence are the same in all SOP
Instances that comprise a Concatenation.</para>
   </note>
   <para>Only a single Item shall be included in this sequence.</para>
   <para>See <xref linkend="sect_C.7.6.16.1.1" xrefstyle="select: label"/>
for further explanation.</para>
  </td>
</tr>

The customization used is:

<xsl:template name="table.cell.block.properties">
   <xsl:attribute name="keep-together.within-column">always</xsl:attribute>
   <xsl:choose>
     <xsl:when test="ancestor::d:thead or ancestor::d:tfoot">
       <xsl:attribute name="font-weight">bold</xsl:attribute>
     </xsl:when>
     <!-- Make row headers bold too -->
     <xsl:when test="ancestor::d:tbody and
                     (ancestor::d:table[@rowheader = 'firstcol'] or
                     ancestor::d:informaltable[@rowheader = 'firstcol']) and
                     ancestor-or-self::d:entry[1][count(preceding-sibling::d:entry) = 0]">
       <xsl:attribute name="font-weight">bold</xsl:attribute>
     </xsl:when>
   </xsl:choose>
</xsl:template>

and the extract of the FO produced by the DocBook FO stylesheets is:

<fo:table-row>
                 <fo:table-cell padding-start="2pt" padding-end="2pt"
padding-top="2pt" padding-bottom="2pt" text-align="left"
display-align="before" border-start-style="none" border-top-style="none"
border-bottom-style="solid" border-bottom-width="0.5pt"
border-bottom-color="black" border-end-style="solid"
border-end-width="0.5pt" border-end-color="black"><fo:block
keep-together.within-column="always">
                   <fo:block space-before.optimum="1em"
space-before.minimum="0.8em" space-before.maximum="1.2em">Shared
Functional Groups Sequence</fo:block>
                 </fo:block></fo:table-cell>
                 <fo:table-cell padding-start="2pt" padding-end="2pt"
padding-top="2pt" padding-bottom="2pt" text-align="center"
display-align="before" border-start-style="none" border-top-style="none"
border-bottom-style="solid" border-bottom-width="0.5pt"
border-bottom-color="black" border-end-style="solid"
border-end-width="0.5pt" border-end-color="black"><fo:block
keep-together.within-column="always">
                   <fo:block space-before.optimum="1em"
space-before.minimum="0.8em"
space-before.maximum="1.2em">(5200,9229)</fo:block>
                 </fo:block></fo:table-cell>
                 <fo:table-cell padding-start="2pt" padding-end="2pt"
padding-top="2pt" padding-bottom="2pt" text-align="center"
display-align="before" border-start-style="none" border-top-style="none"
border-bottom-style="solid" border-bottom-width="0.5pt"
border-bottom-color="black" border-end-style="solid"
border-end-width="0.5pt" border-end-color="black"><fo:block
keep-together.within-column="always">
                   <fo:block space-before.optimum="1em"
space-before.minimum="0.8em" space-before.maximum="1.2em">1</fo:block>
                 </fo:block></fo:table-cell>
                 <fo:table-cell padding-start="2pt" padding-end="2pt"
padding-top="2pt" padding-bottom="2pt" text-align="left"
display-align="before" border-start-style="none" border-top-style="none"
border-bottom-style="solid" border-bottom-width="0.5pt"
border-bottom-color="black"><fo:block keep-together.within-column="always">
                   <fo:block space-before.optimum="1em"
space-before.minimum="0.8em" space-before.maximum="1.2em">Sequence that
contains the Functional Group Macros that are shared for all frames in
this SOP Instance and Concatenation.</fo:block>
                   <fo:block id="idp140215214853168"
space-before.minimum="0.8em" space-before.optimum="1em"
space-before.maximum="1.2em" margin-left="0.25in"
margin-right="0.25in"><fo:block keep-with-next.within-column="always"
font-size="9pt" font-weight="bold"
hyphenate="false">Note</fo:block><fo:block><fo:block
space-before.optimum="1em" space-before.minimum="0.8em"
space-before.maximum="1.2em">The contents of this sequence are the same
in all SOP Instances that comprise a
Concatenation.</fo:block></fo:block></fo:block>
                   <fo:block space-before.optimum="1em"
space-before.minimum="0.8em" space-before.maximum="1.2em">Only a single
Item shall be included in this sequence.</fo:block>
                   <fo:block space-before.optimum="1em"
space-before.minimum="0.8em" space-before.maximum="1.2em">See
<fo:basic-link
internal-destination="sect_C.7.6.16.1.1"><fo:inline>Section C.7.6.16.1.1</fo:inline></fo:basic-link>
for further explanation.</fo:block>
                 </fo:block></fo:table-cell>
               </fo:table-row>

Thanks ... David
