Bug 5378 - Problem with many Japanese fonts
Summary: Problem with many Japanese fonts
Status: RESOLVED FIXED
Alias: None
Product: fontconfig
Classification: Unclassified
Component: library
Version: 2.1
Hardware: x86 (IA32) Windows (All)
Importance: high normal
Assignee: Keith Packard
Blocks: 8100
Reported: 2005-12-19 03:07 UTC by Guillaume Proux
Modified: 2006-09-01 21:31 UTC (History)

Attachments
Font cache -- M Gushee (16.07 KB, application/octet-stream)
2006-08-12 12:29 UTC, Matt Gushee

Description Guillaume Proux 2005-12-19 03:07:52 UTC
I hesitated to file this bug against Gimp, but since it is a regression from
the Gimp 1.x days I am opening it here instead.
In the Gimp 1.x days, all the fonts available in Windows were displayed correctly
with their proper Japanese names.

One of them is a nice informal font, Reiryureisho (麗流隷書), that is very widely
used in Japan. (I don't know how this Bugzilla is set up; you might need to
change the page encoding to SJIS to see the Chinese characters above.)

Instead of that name you see "ףˆן∂ֹyֹליצ" which is ... not so nice. Filename for
this font is usually BGREIRR.TTF

fc-list.exe shows the following output for this font.
 ףˆן∂ֹyֹליצ:style=Light

This is not the only font that shows this problem, though.
I am not sure of the copyright status of that font, so I can't attach it to
this bug report. Sorry.

Also not sure of the exact version of fontconfig as
Comment 1 Matt Gushee 2006-08-12 12:29:29 UTC
Created attachment 6534
Font cache -- M Gushee
Comment 2 Matt Gushee 2006-08-12 12:49:16 UTC
I have similar symptoms with fontconfig 2.3.2 on Linux. I believe fc-cache is
extracting the font names incorrectly, as illustrated by the font cache file I
have attached. The fonts indexed here are all from a collection published by
Dynalab, with a copyright date of 1997. You can see that for a few faces there
is an English name followed by a Japanese name, and familylang=en,ja. But for
most of the faces, familylang=en,ja,en, and *three* names are shown: the first
of these consists of garbage characters and is the name displayed in most font
menus.

I can't rule out the possibility that the fonts themselves are defective, but
other tools, including Fontforge and the Freetype demo tools, show only legible
English and Japanese names. For example:

    $ ftdump DFHsg7.ttc
    There are 3 faces in this file.
    
    ----- Face number: 0 -----
    
    font name entries
       family:     DFHSGothic-W7
       style:      Regular
       postscript: DFHSGothic-W7-WIN-RKSJ-H
    
    font type entries
       FreeType driver: truetype
       sfnt wrapped:    yes
       type:            scalable
       direction:       horizontal, vertical
       fixed width:     yes
       glyph names:     no
       EM size:         1024
       global BBox:     (0,-144):(1023,880)
       ascent:          880
       descent:         -144
       text height:     1024
       glyph count: 8829
    
    charmaps
       0: platform 1, encoding 0
       1: platform 3, encoding 1 (active)
    
    ----- Face number: 1 -----

... etc. And:

    $ ftlint 24 DFHsg7.ttc 
    DFHsg7.ttc: OK.

Comment 3 Keith Packard 2006-09-01 11:18:35 UTC
Can you send the output of 

$ FC_DEBUG=384 fc-cache -f <directory containing broken font>

That will show how fontconfig generates the names that you're seeing.
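
For readers unfamiliar with FC_DEBUG: the value is a bitmask. The short Python sketch below decodes it using the flag values listed in the fontconfig user documentation (SCAN = 128, SCANV = 256, and so on). The table and the `decode_fc_debug` helper are illustrative only and are not part of fontconfig; check the values against the documentation for the version you run.

```python
# Flag values as listed in the fontconfig user documentation (assumed
# to apply to the version under discussion). Illustration only.
FC_DEBUG_FLAGS = {
    1: "MATCH",
    2: "MATCHV",
    4: "EDIT",
    8: "FONTSET",
    16: "CACHE",
    32: "CACHEV",
    64: "PARSE",
    128: "SCAN",
    256: "SCANV",
}


def decode_fc_debug(value: int) -> list[str]:
    """Return the names of the debug flags set in an FC_DEBUG value."""
    return [name for bit, name in sorted(FC_DEBUG_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    # 384 = 128 + 256, i.e. font scanning plus verbose scanning output.
    print(decode_fc_debug(384))
```

Under that table, 384 selects the two scanning flags, which is why the command prints how each name is read while the cache is rebuilt.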
Comment 4 Keith Packard 2006-09-01 21:31:21 UTC
These fonts are broken; they include a name field which purports to be encoded
in the standard Roman Macintosh encoding (similar to Latin 1) but which is
really encoded in SJIS.
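
The effect of such a mislabelled name record can be reproduced with a few lines of Python. This is only a sketch of the mechanism, not fontconfig code: it takes the font name quoted in the description, encodes it as SJIS, and then decodes those bytes as Mac Roman, which is what a reader trusting the declared encoding would do. The exact garbage characters depend on the code page used by whatever displays the name, so the output need not match the string quoted in the description.

```python
# Illustration only: this is not fontconfig code.
name = "麗流隷書"

# What the broken font actually stores in the name record: SJIS bytes.
raw = name.encode("shift_jis")

# What a reader sees if it believes the declared Mac Roman encoding.
misread = raw.decode("mac_roman")

# Reinterpreting the same bytes as SJIS recovers the intended name.
recovered = misread.encode("mac_roman").decode("shift_jis")

print(len(raw), "bytes stored in the name record")
print("misread as Mac Roman:", repr(misread))
print("reread as SJIS:", recovered)
```

No information is lost by the wrong label; the bytes are intact and only the interpretation is wrong, which is why a heuristic can recover the name.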

I've added a heuristic for names in this encoding. If the name includes a large
(>1/3) fraction of bytes with the high-bit set, fontconfig will assume the name
is actually in SJIS encoding and report it as such, along with setting the
associated language tag to Japanese.
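
A minimal Python sketch of the heuristic as described above, for illustration. It is not the actual fontconfig implementation (which is written in C); the function name `decode_mac_roman_name`, the fallback on a failed SJIS decode, and the use of "en" as the default language tag are all assumptions made for this example. Only the 1/3 threshold and the SJIS/Japanese outcome come from the comment.

```python
def decode_mac_roman_name(raw: bytes) -> tuple[str, str]:
    """Decode a name record that claims to be Mac Roman.

    Returns (name, language tag). If more than one third of the bytes
    have the high bit set, the bytes are treated as SJIS and tagged
    as Japanese. Hypothetical helper, not fontconfig's API.
    """
    high = sum(1 for b in raw if b & 0x80)
    if high * 3 > len(raw):
        try:
            return raw.decode("shift_jis"), "ja"
        except UnicodeDecodeError:
            # Assumption: fall back to the declared encoding if the
            # bytes are not valid SJIS after all.
            pass
    return raw.decode("mac_roman"), "en"


if __name__ == "__main__":
    print(decode_mac_roman_name("麗流隷書".encode("shift_jis")))
    print(decode_mac_roman_name(b"DFHSGothic-W7"))
    # A genuine Mac Roman name with one accented letter stays Roman.
    print(decode_mac_roman_name("Café".encode("mac_roman")))
```

The threshold matters because legitimate Mac Roman names can contain the occasional accented letter, while an SJIS name has a high-bit lead byte in every double-byte character.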

