[xep-support] Re: Mixed languages in a single PDF

From: Kevin Brown <kevin@renderx.com>
Date: Mon Dec 03 2012 - 19:17:56 PST

You could do something like this:


font-family="Helvetica, Arial, Arial Unicode"


Make sure all of these are mapped in your xep.xml. It's a font selection
strategy that picks the font for each character based on whether that
character is present in a font in the list. Fonts are tried in order:
Helvetica first (no impact), Arial second (would be embedded and, hopefully,
subsetted) and Arial Unicode last (you of course have to have the font and a
license to use it; it is huge and has many, many
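
As a rough sketch of what "mapped in your xep.xml" can look like, the fragment below registers two of the TrueType families inside the fonts section of the configuration. The directory, the file names (arial.ttf, ARIALUNI.TTF) and the label are assumptions for a typical Windows install, and the element and attribute names should be checked against the xep.xml shipped with your XEP version. The family name must match what you write in font-family.

```xml
<!-- Hypothetical fragment for the <fonts> section of xep.xml. -->
<!-- Paths and file names are assumptions; adjust to your system. -->
<font-group xml:base="file:/C:/Windows/Fonts/" label="Windows TrueType"
            embed="true" subset="true">
  <!-- Second choice in the list: embedded and subsetted -->
  <font-family name="Arial">
    <font><font-data ttf="arial.ttf"/></font>
  </font-family>
  <!-- Last resort: very large font, license required -->
  <font-family name="Arial Unicode">
    <font><font-data ttf="ARIALUNI.TTF"/></font>
  </font-family>
</font-group>
```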


Now, I would give this as a solution if you are formatting a few documents,
occasionally. If you are looking for high performance formatting then no way
would I do this - Arial Unicode is a 14MB-27MB font (depends on which
version you have) and the prolog for the font itself which must be embedded
in the PDF is 512KB (we honor the wishes of copyright holders and embed
their copyrights into documents when their fonts are used, unlike other
formatting engines, which ignore them). If you are looking for performance
then you need to plan your system accordingly and select fonts or a list of
fonts that have the glyphs you need.
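
For instance, if the only non-Latin input you expect in an English report is Chinese, a shorter list covers it without pulling in Arial Unicode. This is only a sketch: it assumes both families are mapped in xep.xml under exactly these names, and the sample text is made up.

```xml
<!-- Latin characters resolve to Arial; characters Arial lacks fall through to SimSun. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          font-family="Arial, SimSun">
  Reviewer comment: 质量良好
</fo:block>
```

The same idea extends to other scripts: append one font per script you expect (Greek, Arabic and so on) rather than a single catch-all font.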


Kevin Brown





From: xep-support-bounces@renderx.com
[mailto:xep-support-bounces@renderx.com] On Behalf Of Darren Munt
Sent: Monday, December 03, 2012 6:14 PM
To: 'xep-support@renderx.com'
Subject: [xep-support] Mixed languages in a single PDF


We have a particular problem with some of our documents, which combine
system-generated text with user input. We produce the same report in many
different languages and sometimes we have an issue whereby user input is in
a different character set to the main document language. For example we
produce a report in English but some of the user text has been entered in
Chinese. There is no way of telling what language the user text might be in,
apart from either asking them when they enter it or doing some sort of
language-detection (which is how I understand the web browser does it for


When the report is generated in English, we use the Arial font to display
text and it has no glyphs for the Chinese characters, so they do not
appear. When we generate the report in Chinese, we use the SimSun font,
which does display Chinese, but any English text appears in the same font
and it's not a great looking font for Latin characters.


Using SimSun by default is not an option because of the appearance of the
Latin text and also we support many other languages, including Greek and
Arabic, so we really need to be able to specify the font based on the
language selected for the report. It's quite possible in our system for a
report to contain any number of different languages in a multi-national or
multicultural scenario.


I've told the client I don't think there is a way of being able to support
ad hoc language changes within the document this way, but I thought I would
throw it out there in case there's something XEP can do that I don't know
about. Any suggestions gratefully accepted.


Received on Mon Dec 3 19:05:43 2012