When text is copied from a PDF generated by XEP, for example by an
annotation tool like Hypothes.is (or when we simply want to copy text
from the PDF into another document), spaces between words on adjacent
lines or between words in different font styles (e.g. italics/normal)
get lost:
word1 word2word3 word4
*word1* word2 word3
Is this problem related to the way spaces are encoded within PDFs, and
is there a way to generate PDFs with XEP that avoids it? I found a blog
post at an Adobe forum saying that the problem is related to the way the
rendering engine generates strings:
> The problem is that the PDF may or may not have the apparent spaces
> encoded as space characters, particularly at line ends, but also
> between words and perhaps even between characters. The rendering
> engine (Ps or PDF driver) may have chosen to break "/word1 word2/"
> into two strings with two starting coordinates and no U+0020 space
> character (or alternative space characters) at all.
Is this correct - are the spaces lost because they are not encoded at
all and the words are separated just by different starting coordinates?
If so, can this be avoided by encoding spaces when the PDF is generated?
My impression is, however, that the PDF viewer might be a source of the
problem (and not the PDF).
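
To make the question concrete, here is a minimal Python sketch of the mechanism the quote describes. It is only an illustration under stated assumptions: the Fragment class, the coordinates, and the thresholds are all hypothetical, and it does not reflect how XEP or any particular PDF viewer actually works. It models text as positioned strings, shows how joining them naively loses the space at a line end, and how a viewer-side heuristic based on coordinates could restore it.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A positioned string, loosely modelled on one text-showing operation.

    All fields are hypothetical illustration values, not real PDF data.
    """
    text: str
    x: float      # start coordinate of the string
    y: float      # baseline of the line the string sits on
    width: float  # horizontal extent of the string


def naive_extract(fragments):
    # Concatenate strings exactly as encoded; a space that exists only
    # visually (as a jump in coordinates) is lost.
    return "".join(f.text for f in fragments)


def extract_with_gap_heuristic(fragments, min_gap=1.25, line_tolerance=1.0):
    # Assumed heuristic: insert a space when the next string starts on a
    # different baseline, or when the horizontal gap is at least min_gap.
    out = []
    prev = None
    for f in fragments:
        if prev is not None:
            new_line = abs(f.y - prev.y) > line_tolerance
            gap = f.x - (prev.x + prev.width)
            has_space = prev.text.endswith(" ") or f.text.startswith(" ")
            if not has_space and (new_line or gap >= min_gap):
                out.append(" ")
        out.append(f.text)
        prev = f
    return "".join(out)


# Two lines of text; the space inside each line is encoded, but the
# line break between "word2" and "word3" is only a change of coordinates.
fragments = [
    Fragment("word1 word2", x=0.0, y=100.0, width=63.0),
    Fragment("word3 word4", x=0.0, y=88.0, width=63.0),
]

print(naive_extract(fragments))
print(extract_with_gap_heuristic(fragments))
```

If this picture is right, the fix could sit on either side: the generator could emit real space characters, or the viewer could infer them from the coordinates as sketched here.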
-- 
Dr. Armin Günther
Information Technology
Leibniz Institute for Psychology Information (ZPID)
54286 Trier, Germany
Fon: +49(0)651-201-2055
Fax: +49(0)651-201-2604
E-Mail: firstname.lastname@example.org
www.zpid.de/en
Received on Tue May 3 08:28:27 2016