[xep-support] Re: Fonts and all

From: Kevin Brown <kevin@renderx.com>
Date: Wed Mar 04 2015 - 12:47:29 PST

More on this to come for sure.
I did some additional testing and found the reason why the resulting PDFs
are too large using the Noto fonts: the resulting PDFs have the entire font
embedded rather than a subset.

We are now checking some things out ... it could be a bug in OTF font
embedding or it could be that (some of) the fonts have an improper flag.
Once I know more I will post back.

The directions are good; only the Noto fonts I used seem to be an issue.

Kevin Brown

-----Original Message-----
From: Xep-support [mailto:xep-support-bounces@renderx.com] On Behalf Of
Kevin Brown
Sent: Tuesday, March 03, 2015 2:52 PM
To: 'RenderX Community Support List'
Subject: [xep-support] Fonts and all

A few recent questions have been asked on Stack Overflow and also in our own
support groups about fonts. I thought a nice sample here would be

1) First, get access to a Unicode font that contains the characters you

Well, there are many places and we are not a font repository. However, one
great place to look is Google and specifically Google’s fonts. They are
trying to organize a complete set of fonts for almost all of the world’s
languages. There are certainly many more places and the group is welcome to
chime in. I happen to have a nice setup with many of the “Noto” fonts from
Google. See https://www.google.com/get/noto/#/

2) Next, add those fonts to your configuration.

Again, there are many font types and many places to put them. Here I am
only going to describe how I do it to stay organized. I prefer to organize
all my fonts under the RenderX installation. Upgrades will preserve them and
it makes it easier to deploy to other machines. You can certainly use your
own methods and organize them into your own areas.
The "xep.xml" file contains all the information about fonts. There are many,
many different settings and you can find them all in our documentation. I
will just cover adding a complete family of the Noto fonts here for
Simplified Chinese.

First I downloaded the ZIP archive of Simplified Chinese Noto fonts from
here: https://www.google.com/get/noto/#/family/noto-sans-hans
In the RenderX installation directory, there is already a directory called
That directory contains the Base 14 fonts that every installation dealing
with PDF should come with (the 4 variations each of Helvetica, Times and
Courier plus Symbol and ZapfDingbats).
I created a subdirectory under this directory called "Noto/" to handle all
my Google Noto fonts.
I unzipped the Simplified Chinese fonts into this directory.
It contains 7 fonts, all starting with "NotoSansCJKsc" with various weights
(Thin through Black).
So we need to add these all (if we need them all) to "xep.xml".
Here's a snippet from my "xep.xml" showing how they are added:

  <fonts xmlns="http://www.renderx.com/XEP/config" xml:base="fonts/"

    <!-- Google Noto Fonts -->
    <font-group xml:base="Noto/" label="Noto" embed="true" subset="true"
      <font-family name="NotoSansCJKsc">
        <font weight="100"><font-data otf="NotoSansCJKsc-Thin.otf"/></font>
        <font weight="200"><font-data otf="NotoSansCJKsc-Light.otf"/></font>
        <font weight="300"><font-data
        <font><font-data otf="NotoSansCJKsc-Regular.otf"/></font>
        <font weight="500"><font-data
        <font weight="bold"><font-data otf="NotoSansCJKsc-Bold.otf"/></font>
        <font weight="800"><font-data otf="NotoSansCJKsc-Black.otf"/></font>

Now, a few points of interest here.
The <fonts> section contains the overall set of fonts available to
RenderX. It has an attribute "xml:base" that instructs RenderX to look
inside the "fonts/" directory for all fonts *where explicit paths are not
I created a "Noto/" directory under that directory so I add an "xml:base" to
my <font-group> that points to this directory.
Now, all the files in that directory can be accessed by relative paths.
In other words ... I have a file under the RenderX installation as

Also note the font weights.
Noto fonts (for ideographic languages) are provided in several weights
like Thin, Light, DemiLight, etc.
So, in XSL FO/RenderX terms we have:

Thin = 100
Light = 200
DemiLight = 300
Normal or Regular = 400 (or no weight specified)
Medium = 500
Bold = 700 (or "bold" specified for font-weight in XSL FO)
Black = 800

So the above "xep.xml" fragment maps all of the various possible weights
that the Google fonts provide. Of course, you do not have to use (or allow
people to use) all of these. They are designed correctly as individual font
files, so each has to be handled separately and included separately. More on
this at the end.
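
For reference, here is a rough sketch of what a complete, well-formed
version of such a configuration could look like. Treat it as an illustration
only: the file names for the DemiLight and Medium weights are assumptions
based on the naming pattern of the other five files, and the closing tags
are filled in so the fragment parses. Your real <fonts> element will likely
carry other attributes and font groups that should be kept.

```xml
<fonts xmlns="http://www.renderx.com/XEP/config" xml:base="fonts/">

  <!-- Google Noto Fonts: xml:base makes the otf paths relative to fonts/Noto/ -->
  <font-group xml:base="Noto/" label="Noto" embed="true" subset="true">
    <font-family name="NotoSansCJKsc">
      <font weight="100"><font-data otf="NotoSansCJKsc-Thin.otf"/></font>
      <font weight="200"><font-data otf="NotoSansCJKsc-Light.otf"/></font>
      <!-- Assumed file name; check it against your download -->
      <font weight="300"><font-data otf="NotoSansCJKsc-DemiLight.otf"/></font>
      <!-- No weight attribute: this is the Normal/Regular (400) face -->
      <font><font-data otf="NotoSansCJKsc-Regular.otf"/></font>
      <!-- Assumed file name; check it against your download -->
      <font weight="500"><font-data otf="NotoSansCJKsc-Medium.otf"/></font>
      <font weight="bold"><font-data otf="NotoSansCJKsc-Bold.otf"/></font>
      <font weight="800"><font-data otf="NotoSansCJKsc-Black.otf"/></font>
    </font-family>
  </font-group>

</fonts>
```

To offer fewer weights, delete the <font> entries you do not want. Keeping
only the Regular and Bold lines, for example, leaves a two-weight family.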

3) Format documents referencing the fonts you specify.

You specify the font-family (at least) and in this case also the font-weight
to select the font. You would use the <font-family> @name from "xep.xml" to
reference it.
So for example, here's a bit of FO that would cause all of the above fonts
to be used:

                        <fo:block-container font-family="NotoSansCJKsc">
                                <fo:block>Chinese Sample using
                                <fo:block space-before="6pt"
                                <fo:block space-before="6pt"
                                <fo:block space-before="6pt"
                                <fo:block space-before="6pt">大橋全長二千六
                                <fo:block space-before="6pt"
                                <fo:block space-before="6pt"
                                <fo:block space-before="6pt"

That's all there is to it. Get a font, store it somewhere and add it to the
configuration. Then add text and make sure you tell XEP (through XSL FO) to
use that particular font.
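
As a concrete illustration of step 3, here is a minimal sketch of an FO
fragment that requests each configured weight. It assumes each block simply
sets a font-weight matching one of the "xep.xml" entries; the block text is
placeholder text, not the original sample, and the fragment is meant to sit
inside an fo:flow of a complete document.

```xml
<fo:block-container xmlns:fo="http://www.w3.org/1999/XSL/Format"
                    font-family="NotoSansCJKsc">
  <!-- No font-weight given, so the entry without a weight (Regular) applies -->
  <fo:block>Regular</fo:block>
  <!-- Each value below matches a weight attribute in the xep.xml family -->
  <fo:block space-before="6pt" font-weight="100">Thin</fo:block>
  <fo:block space-before="6pt" font-weight="200">Light</fo:block>
  <fo:block space-before="6pt" font-weight="300">DemiLight</fo:block>
  <fo:block space-before="6pt" font-weight="500">Medium</fo:block>
  <fo:block space-before="6pt" font-weight="bold">Bold</fo:block>
  <fo:block space-before="6pt" font-weight="800">Black</fo:block>
</fo:block-container>
```

Replace the placeholder text with your own Chinese text to see the
different weights side by side.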

Now, some specific hints on fonts.

I love (and hate) the Noto fonts above. Why both?
Love: Well, they provide great coverage of all the characters. If you are
producing technical documentation then great, this is a great use of these
Hate: They are HUGE (15 MB per file) and very large even when subsetted and
embedded. I would not necessarily use them to print 100,000 invoices, but I
would use them to do a technical manual.

The little text above with fonts embedded and subsetted is a CRAZY 8 MB.
Insane! The same exact document with the SimSun font (which would only have
one weight) is only 32 KB.
And if I format that simple little document above, which accesses the 7
fonts listed (15 MB each) for the simple characters in it, it takes 20
seconds to
We had another example where a customer wanted to use the "Arial Unicode"
font. This font file is 37 MB and is copyrighted. They were creating letters
and were using "Arial Unicode" to select a specific bullet character they
liked. The "Arial Unicode" copyright notice that is required to be inserted
when embedding the font is about 250 KB alone, so a simple letter with one
bullet is 300 KB per file. The processing time for each document was about
3 seconds.
I selected a similar bullet from "ZapfDingbats". The results were a 12 KB
file size and 12 documents processed per second. Yes, 3 seconds per document
went to 12 documents per second. A "slight" improvement.
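
A minimal sketch of that kind of substitution in XSL FO follows. The
specific bullet (U+25CF BLACK CIRCLE) and the list measurements are
hypothetical choices for illustration, not the ones used for that customer,
and it assumes the glyph is reachable by that code point in your setup. Only
the label switches to "ZapfDingbats"; the body text keeps its normal font.

```xml
<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
               provisional-distance-between-starts="12pt"
               provisional-label-separation="4pt">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <!-- Hypothetical bullet choice: U+25CF BLACK CIRCLE from ZapfDingbats -->
      <fo:block font-family="ZapfDingbats">&#x25CF;</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <!-- Body text inherits whatever font the surrounding content uses -->
      <fo:block>Item text in the normal body font</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
```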

So, try to use a tight, controlled font -- only what you need for a
particular application. The smallest font file possible that has all the
glyphs you need.

Kevin Brown

Received on Wed Mar 4 12:44:29 2015

This archive was generated by hypermail 2.1.8 : Wed Mar 04 2015 - 12:44:29 PST