Bug 62189

Summary: Provide compose sequences for common secondary group of ISO 9995-3 (German T3)
Product: xorg
Component: Lib/Xlib (data)
Status: RESOLVED MOVED
Severity: normal
Priority: medium
Version: git
Hardware: Other
OS: All
Reporter: Andreas Wettstein <wettstae>
Assignee: Xorg Project Team <xorg-team>
QA Contact: Xorg Project Team <xorg-team>
CC: bensberg, cloos
Attachments:
  adds the missing dead keysyms for T3 (flags: none)
  patch cancelled (flags: none)
  proposed T3 compose sequences (flags: none)
  proposed key changes in T3 layout (flags: none)
  replaces combiners with dead keys in German T3 (flags: none)
  adds compose sequences for T3 (flags: none)
  adds one more key symbol for German T3 (flags: none)

Description Andreas Wettstein 2013-03-11 18:50:28 UTC
xkeyboard-config contains a German T3 layout, see bug #60991. This layout implements the common secondary group of ISO 9995-3:2010.  To be fully functional, T3 and every other layout that supports the common secondary group require compose sequences, not all of which are available in libX11 yet.  A table of the characters that must be supported, along with a few compose sequences, is given in http://www.open-std.org/Jtc1/sc35/WG1/docs/info1-9995-3.pdf
Comment 1 Benno Schulenberg 2013-08-25 19:57:44 UTC
Okay, appendices E and F of that document show some compose sequences.  But they are strange: they want _combining_ characters to not simply combine with the succeeding character (showing the desired thing in appearance), but to actually _compose_ with it to form a new single character (code point).  For example,
"combining short stroke" with "r" should give "r with stroke", U+024D:

    ◌̵ U+0335 combining short stroke overlay
    r U+0072 latin small letter r
    ɍ U+024D latin small letter r with stroke

In the current Compose tables, however, there is not a single compose sequence involving a combining character.  Have you tried, Andreas, whether this is at all possible, whether it works?
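
The difference between "combining" and "composing" described above can be checked with a short Python sketch. It relies only on the standard `unicodedata` module; the variable names are illustrative and not taken from any patch in this bug.

```python
import unicodedata

base = "r"
combining_stroke = "\u0335"  # COMBINING SHORT STROKE OVERLAY
precomposed = "\u024d"       # LATIN SMALL LETTER R WITH STROKE

# "Combining": two code points that are merely rendered together.
combined = base + combining_stroke
print(len(combined), [f"U+{ord(c):04X}" for c in combined])

# "Composing": a single code point. Normalization does not bridge the two,
# because U+024D has no canonical decomposition in the Unicode database.
print(unicodedata.decomposition(precomposed) == "")
print(unicodedata.normalize("NFC", combined) == precomposed)

# For contrast, e + COMBINING ACUTE ACCENT does normalize to U+00E9.
print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")
```

So getting U+024D from these two keystrokes needs an explicit mapping, such as a Compose entry; Unicode normalization will not produce it.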
Comment 2 Andreas Wettstein 2013-08-26 18:35:13 UTC
Each keysym that starts a compose sequence becomes de facto a dead key, so

  <U0335> <r> : "ɍ" U024D

will work.  I do not know if there is a policy; in principle, one could add new keysyms for dead keys, rather than converting the combining characters into dead keys. This would be cleaner, in a sense.

Unfortunately, it is messier than that. There is already a keysym dead_stroke, which is used for some, but not all, of the characters for which the ISO standard uses U0335 (for example, ɍ).  Currently, in the de(T3) layout, I put dead_stroke where the standard puts U0335.  But dead_stroke is also used for characters (such as ł) for which ISO uses U0338. So the existing dead keys do not match the ISO usage of the combining characters. As we have to keep the existing compose sequences, if we want to introduce dedicated keysyms, we need two new ones, and have to replicate the existing dead_stroke sequences with them.

Another conflict: ISO uses sequences of the same combining character typed twice in creative ways, whereas in Xlib, pressing the same dead key twice yields the spacing version of the diacritic.
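
As a rough aid for checking entries like the one above, here is a minimal Python sketch that splits a Compose-style line into its key sequence, result string and keysym, and verifies that a `Uxxxx` keysym names the same character as the result string. The function names are hypothetical, and the regular expression is a simplification that handles neither escaped quotes nor comments after the keysym; it is not the parser libX11 uses.

```python
import re

# Simplified pattern: one or more <keysym> items, a colon, a quoted string,
# and an optional keysym name.
LINE_RE = re.compile(r'^\s*((?:<\w+>\s*)+):\s*"([^"]*)"\s*(\w+)?')

def parse_compose_line(line):
    """Return (keys, result, keysym), or None if the line is not an entry."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    keys = re.findall(r"<(\w+)>", m.group(1))
    return keys, m.group(2), m.group(3)

def result_matches_keysym(result, keysym):
    """True/False for Uxxxx keysyms; None when the keysym cannot be checked."""
    if keysym is None or not re.fullmatch(r"U[0-9A-Fa-f]{4,6}", keysym):
        return None  # named keysyms are out of scope for this sketch
    return result == chr(int(keysym[1:], 16))

keys, result, keysym = parse_compose_line('<U0335> <r> : "\u024d" U024D')
print(keys, keysym, result_matches_keysym(result, keysym))
```

A result string made of U+0335 followed by `r` would fail this check against `U024D`, since those are two code points rather than the single precomposed letter.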
Comment 3 Benno Schulenberg 2013-08-27 19:07:40 UTC
(In reply to comment #2)
> Each keysym that starts a compose sequence becomes de facto a dead key,

Ehm... are you sure?  If one would put for example <a> <x> in the Compose
tables (without Multi_key), the A key would automatically become a dead key?
This seems unlikely to me.  I would think that any key level produces either
a dead thing, always, or a combining thing, always.

>   so    <U0335> <r> : "ɍ" U024D    will work.

Have you tried it?  I did, but can't get it to work -- nor any other compose
sequence that I add to the central Compose file, so that doesn't mean much.

> I do not know if there is a policy;

Well, trying to get my test compose sequences to work, I came across this:

  http://cgit.freedesktop.org/xorg/lib/libX11/commit/?id=79f47e6dff2f0a0b673bbfecc47528edca814baa

So it seems unlikely you will get any combining keysymbols included in the
compose sequences.

> in principle, one could add
> new keysyms for dead keys, rather than converting the combining characters
> into dead keys. This would be more clean, in a sense.

But which levels of which keys should then produce the dead keysymbols?  The
third level is already taken by the combining characters.  The fourth level?
Would the T3 standard allow this?
Comment 4 James Cloos 2013-08-27 22:30:22 UTC
>> Each keysym that starts a compose sequence becomes de facto a dead key,

> Ehm... are you sure?  If one would put for example <a> <x> in the Compose
> tables (without Multi_key), the A key would automatically become a dead key?

Remember that the system processes one key at a time; it cannot send
the <a> on (in your example) until it knows what key is pressed next.

The individual events are filtered until a full sequence is seen.

A quick test with <Multi_key> <q>, which is not in any sequences in
en_US.UTF-8/Compose, shows that neither key is seen as an unfiltered
event by the application.  (Try it out.  Look at it in xev(1), too.)
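
The filtering behaviour described above can be modelled with a small Python sketch. This is a toy model under stated assumptions, not the libX11 implementation: the table contents, the class name `ComposeFilter` and the choice to return keysym names as "text" are all made up for illustration. A broken sequence is dropped entirely, matching the `<Multi_key> <q>` observation.

```python
# Hypothetical compose table: tuples of keysym names -> resulting string.
COMPOSE = {
    ("U0335", "r"): "\u024d",  # r with stroke
    ("a", "x"): "\u2020",      # arbitrary result, purely for illustration
}

def build_prefixes(table):
    """Collect every proper prefix of every sequence in the table."""
    prefixes = set()
    for seq in table:
        for i in range(1, len(seq)):
            prefixes.add(seq[:i])
    return prefixes

class ComposeFilter:
    def __init__(self, table):
        self.table = table
        self.prefixes = build_prefixes(table)
        self.pending = ()

    def feed(self, keysym):
        """Return what the application sees, or None if the key is filtered."""
        candidate = self.pending + (keysym,)
        if candidate in self.table:      # full sequence seen: emit the result
            self.pending = ()
            return self.table[candidate]
        if candidate in self.prefixes:   # could still become a sequence: hold it
            self.pending = candidate
            return None
        if self.pending:                 # sequence broken: drop all of it
            self.pending = ()
            return None
        return keysym                    # ordinary key: pass it through

f = ComposeFilter(COMPOSE)
print([f.feed(k) for k in ["a", "x", "b", "a", "q", "A"]])
```

Because `a` is the first element of a sequence, it has to be held back until the next key arrives, which is what makes it behave like a dead key. The keysym `A` is a different keysym and passes straight through.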
Comment 5 Andreas Wettstein 2013-08-28 18:27:30 UTC
> Ehm... are you sure?  If one would put for example <a> <x> in the Compose
> tables (without Multi_key), the A key would automatically become a dead key?

To reply a bit more simply than James: Indeed, a (but not A) would be dead.  I actually used q as a dead character for a while.

> >   so    <U0335> <r> : "ɍ" U024D    will work.
> 
> Have you tried it?  I did, but can't get it to work -- nor any other compose
> sequence that I add to the central Compose file, so that doesn't mean much.

I put this in my ~/.XCompose, and it worked.  There are several possible reasons why it might fail: First, applications read the compose table at startup and ignore any changes you make to it afterwards. Second, GTK and Qt applications by default use their own compose mechanism and do not care about the libX11 table.  So after you have changed the table, launch a fresh xterm or xev to test it.

> http://cgit.freedesktop.org/xorg/lib/libX11/commit/
> ?id=79f47e6dff2f0a0b673bbfecc47528edca814baa
> 
> So it seems unlikely you will get any combining keysymbols included in the
> compose sequences.

James might comment more, but in this patch he basically removed compose sequences with invalid keysyms that would have been ignored anyway.

> But which levels of which keys should then produce the dead keysymbols?  The
> third level is already taken by the combining characters.  The fourth level?
> Would the T3 standard allow this?

My understanding is that where the standard lists a combining key, that combining key should be really a dead key.  See also page 2 in

  http://www.pentzlin.com/ErweiterungDeutscheTastatur2.pdf

"Sämtliche diakritische Zeichen (Akzente usw.) werden weiterhin als »Tottasten« (d.h. vor dem Grundzeichen) eingegeben." (All diacritical marks (accents etc.) are still entered as dead keys, that is, before the base character).

Which means that ISO-9995-3:2010 might need a few more new sequences, on top of what the "List of peculiar characters" shows.
Comment 6 Benno Schulenberg 2013-08-29 20:41:26 UTC
(In reply to comment #5)
> Second, GTK and Qt applications by default use their own compose mechanism
> and do not care about the libX11 table.  So after you have changed the
> table, launch a fresh xterm or xev to test it.

Okay, in xterm my self-defined compose sequences work.  And indeed, any key
goes dead as soon as it's the first in a compose sequence.  Interesting.

> My understanding is that where the standard lists a combining key, that
> combining key should be really a dead key.

Well, is the standard going to get revised?  Because that is really confusing,
to say "combining" when they mean "dead".

> (All diacritical marks (accents etc.) are still entered as dead keys,
> that is, before the base character).

But that would mean that for *all* possible combinations of diacritics and letters,
compose sequences will have to exist *before* people can input them.
That's annoying.  The whole point of having *combining* diacritics on the keys
is that you can input them separately and anywhere you like.  The following comment argues this very well:

   https://bugs.freedesktop.org/show_bug.cgi?id=5107#c22

Relevant quote: "in Unicode, the combining characters (accents, diacritics &c.) are always, always [ALWAYS] _after_ the base character"

(As an advantage it mentions: "It avoids hidden states; each key you press
changes something on the screen."  In my opinion, this is superb -- no dead
keys, always feedback.)

So, the question remains: what does the ISO-9995-3 standard want?  Does it want functional, combining diacritics on the keys?  Or does it want dead diacritics on those same keys?
Comment 7 Benno Schulenberg 2013-08-31 15:29:55 UTC
Created attachment 84973
adds the missing dead keysyms for T3

Okay, I've looked at the T3 layout in the symbols/de file in xkeyboard-config, and most diacritics are implemented as dead keys.  Except for the following four: U030D (combining vertical line above) on <AD01>, U0329 (combining vertical line below) on <AC01>, U0332 (combining low line) on <AC06>, and U0338 (combining long solidus overlay) on <AC09>.  To make the layout consistent, these would need to be replaced with dead keysymbols.  Attached patch adds the needed definitions of these dead symbols to the proto file.

James, Peter -- any chance that such a patch would be accepted?

Andreas -- anything else that the T3 layout needs to become complete (apart from a whole bunch of compose sequences)?
Comment 8 Andreas Wettstein 2013-08-31 16:13:04 UTC
> Andreas -- anything else that the T3 layout needs to become complete (apart
> from a whole bunch of compose sequences)?

Apart from compose sequences and of possibly using the new keysyms you propose with your patch, T3 is complete.
Comment 9 Andreas Wettstein 2013-08-31 16:17:47 UTC
> So, the question remains: what does the ISO-9995-3 standard want?  Does it
> want functional, combining diacritics on the keys?  Or does it want dead
> diacritics on those same keys?

Wikipedia confirms that ISO-9995-3 uses dead keys: 

  http://en.wikipedia.org/wiki/ISO/IEC_9995

It seems that Karl Pentzlin is one of the main authors of the article, so I assume that this information is reliable.
Comment 10 Benno Schulenberg 2013-08-31 19:58:41 UTC
Created attachment 84978
patch cancelled

Hm!  There are no precomposed Unicode characters (yet) that include a low line, a long solidus, or a vertical line above or below.  So there is no point in having dead key symbols for those elements.  I retract the patch.

(Which does make me wonder: what scripts are those combining diacritics supposed to serve?)
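
One way to check a claim like this is to scan the Unicode character database for characters whose canonical decomposition contains a given combining mark. The following Python sketch does that with the standard `unicodedata` module; the helper name `precomposed_with` is made up here, and the results depend on the Unicode version shipped with the Python build.

```python
import sys
import unicodedata

# The four combining marks discussed for the T3 layout.
MARKS = {
    0x030D: "COMBINING VERTICAL LINE ABOVE",
    0x0329: "COMBINING VERTICAL LINE BELOW",
    0x0332: "COMBINING LOW LINE",
    0x0338: "COMBINING LONG SOLIDUS OVERLAY",
}

def precomposed_with(mark):
    """Characters whose canonical decomposition contains the given mark."""
    wanted = f"{mark:04X}"
    found = []
    for cp in range(sys.maxunicode + 1):
        decomp = unicodedata.decomposition(chr(cp))
        # Skip characters without a decomposition and compatibility
        # decompositions, which are tagged like "<compat> ...".
        if not decomp or decomp.startswith("<"):
            continue
        if wanted in decomp.split():
            found.append(chr(cp))
    return found

for mark, name in MARKS.items():
    hits = precomposed_with(mark)
    print(f"U+{mark:04X} {name}: {len(hits)} precomposed character(s)")
```

For U+0338 the scan does turn up mathematical symbols such as ≠ (U+2260), which decomposes to `=` plus the long solidus overlay. Letters such as ł are not among the hits because they have no decomposition at all, so the point made above appears to hold for letters.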
Comment 11 James Cloos 2013-08-31 20:07:57 UTC
> James, Peter -- any chance that such a patch would be accepted?

Certainly.

The set of dead keys should cover all keyboards in xkeyboard-config.

Pushed to x11proto as 6d4acb0e3a.
Comment 12 James Cloos 2013-09-01 22:10:20 UTC
> patch cancelled

That must have come in while I was applying and pushing the patch…

The UCS and Unicode will not add any characters with pre-composed
low lines, vert lines above or vert lines below, per their policies.

But they might add ones with a long solidus overlay.

And the targets of Compose sequences are strings, so they can be
(a few already are) combining character sequences.
Comment 13 Benno Schulenberg 2013-09-02 12:01:53 UTC
(In reply to comment #12)
> And the targets of Compose sequences are strings, so they can be
> (a few already are) combining character sequences.

Yes, I realized that too when I saw the Cyrillic sequences with combining
diacritics, and the two stressed jays.  Will send a slight tweak later on,
but first some things for Andreas to comment on.

Oh, James, BTW, there was this message to xorg-devel:
  http://lists.x.org/archives/xorg-devel/2013-August/037335.html
Comment 14 Benno Schulenberg 2013-09-02 12:17:26 UTC
Created attachment 85056
proposed T3 compose sequences

The compose sequences involving <space>, listed in the first part of Appendix E,
from acute to tilde, already exist.  The attached patch adds the seven sequences
listed in the second half of the table.

The patch also adds all of the sequences listed in Appendix F which do not
result in a combining thing.  It does not add the ones that do, for two reasons.
One, most of the combinations are sadly already taken by existing sequences.
And two, the intention of each of those combinations is not to result in a
combining thing, but in a dead thing -- and this will simply not work in the
Compose tables, as far as I can tell.

Andreas, you mentioned that some more sequences would be needed than those
mentioned in those two appendices.  Which ones did you have in mind?  Or is
the proposed patch complete (after some tweaks to x11proto)?
Comment 15 Benno Schulenberg 2013-09-02 12:27:29 UTC
Created attachment 85058
proposed key changes in T3 layout

The patch puts the new dead keysymbols in place in the German T3 layout.
The <dead_shortstroke> and <dead_longsolidus> can produce quite a few
characters (when the preceding patch is applied).  But <dead_aboveverticalline>
and <dead_belowverticalline> each can only produce one modifier letter.  This
does not seem very useful.  Whereas a combining diacritic is generally usable.
But, leaving those two keys to produce U030D and U0329 instead would "break"
the consistency of the layout.  What do you think, Andreas?
Comment 16 Andreas Wettstein 2013-09-06 18:31:47 UTC
Comment on attachment 85058
proposed key changes in T3 layout

Review of attachment 85058:
-----------------------------------------------------------------

The changes that are there are ok, but one is missing: the dead_stroke on <AC11> should be replaced by dead_shortstroke.
Comment 17 Andreas Wettstein 2013-09-06 18:47:03 UTC
> Andreas, you mentioned that some more sequences would be needed than those
> mentioned in those two appendices.  Which ones did you have in mind?  Or is
> the proposed patch complete (after some tweaks to x11proto)?

I did not have any particular ones in mind.  The main reason I created this bug is that I was too lazy to do this research...

But generally, in case of doubt, I believe it is better not to add compose sequences.  Missing sequences are easily added if someone needs them, but useless or "incorrect" ones are difficult to remove.

> But <dead_aboveverticalline>
> and <dead_belowverticalline> each can only produce one modifier letter.  This
> does not seem very useful.  Whereas a combining diacritic is generally usable.
> But, leaving those two keys to produce U030D and U0329 instead would "break"
> the consistency of the layout.  What do you think, Andreas?

Regarding composing characters, how about extending an existing convention: a dead key, followed by a non-breaking space, creates the composing character that corresponds to the dead key.  That way, we can use dead keys as far as ISO forces it, but do not lose the ability to enter the composing character.
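
A minimal Python sketch of what such entries could look like, generated from a table that pairs each dead keysym with its combining character. The keysym names are the ones proposed in this bug and the code points are the ones named in the discussion, but the pairing table itself and the helper `nbsp_sequences` are assumptions for illustration. They are not copied from the attached patch.

```python
import unicodedata

# Assumed pairing of proposed dead keysyms with combining code points.
DEAD_TO_COMBINING = {
    "dead_shortstroke": 0x0335,
    "dead_longsolidus": 0x0338,
    "dead_aboveverticalline": 0x030D,
    "dead_belowverticalline": 0x0329,
}

def nbsp_sequences(mapping):
    """Build Compose-style lines: dead key + nobreakspace -> combining mark."""
    lines = []
    for dead, cp in mapping.items():
        name = unicodedata.name(chr(cp))
        lines.append(f'<{dead}> <nobreakspace> : "{chr(cp)}" U{cp:04X} # {name}')
    return lines

for line in nbsp_sequences(DEAD_TO_COMBINING):
    print(line)
```

Each generated line keeps the dead key for precomposed letters while still giving a way to type the bare combining mark.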
Comment 18 Benno Schulenberg 2013-09-06 19:26:24 UTC
Created attachment 85370
replaces combiners with dead keys in German T3

(In reply to comment #16)
> The changes that are there are ok, but one is missing: the dead_stroke on
> <AC11> should be replaced by dead_shortstroke.

Okay.  (I had left it in place on purpose, so that people could have it both ways.  But now I see no other German layout has a <dead_stroke> (apart from Neo), so people are not used to it -- so it is better to have just one kind of dead stroke thing, the more consistent kind that always makes a horizontal stroke.)

Patch has been updated.  When do you think this patch should be fed to the maintainer of xkeyboard-config?  A new release of that package is due by the end of this month, but that would be too soon, as the key symbols are not available yet, no?
Comment 19 Benno Schulenberg 2013-09-06 19:59:31 UTC
Created attachment 85373
adds compose sequences for T3

(In reply to comment #17)
> Regarding composing characters, how about extending an existing convention:
> a dead key, followed by a non-breaking space, creates the composing
> character that corresponds to the dead key.  That way, we can use dead keys
> as far as ISO forces it, but do not lose the ability to enter the composing
> character.

Ah, that is a good idea -- I wasn't aware of the non-breaking space convention.
But now I see Bépo does that too.  (I understand that with "composing character" you mean "combining diacritic".)  Okay, patch has been updated with five more sequences.  Can you verify that I got the character codes in there right?
Comment 20 Benno Schulenberg 2013-09-06 20:03:53 UTC
Created attachment 85374
adds one more key symbol for German T3

Adds the dead_shortstroke, and shortens the unneeded long dead_longsolidusoverlay name to dead_longsolidus.  (And renumbers it -- that should still be doable now, no?)
Comment 21 James Cloos 2013-09-07 11:24:02 UTC
We already have dead_stroke, which is already used for things like
LATIN SMALL LETTER A WITH STROKE and LATIN SMALL LETTER B WITH STROKE.

Adding dead_shortstroke and dead_longsolidus, too, seems unnecessary.

But perhaps I am wrong?

Please send a note to xorg-devel about this idea of splitting the
dead_stroke keysym into separate dead_shortstroke and dead_longsolidus
keysyms.  Let’s see whether there is any consensus for such a change.
Comment 22 Benno Schulenberg 2013-09-07 16:05:48 UTC
(In reply to comment #21)
> We already have dead_stroke, which is already used for things like
> LATIN SMALL LETTER A WITH STROKE and LATIN SMALL LETTER B WITH STROKE.
> 
> Adding dead_shortstroke and dead_longsolidus, too, seems unnecessary.

It would be unnecessary if the following three letters did not exist (mentioning just the lowercase forms, to keep it short):

  "ƚ"	U019A # LATIN SMALL LETTER L WITH BAR
  "ɵ"	U0275 # LATIN SMALL LETTER BARRED O
  "ⱦ"	U2C66 # LATIN SMALL LETTER T WITH DIAGONAL STROKE

How to produce those letters with a dead key when <dead_stroke> <l> gives ł (L WITH STROKE), <dead_stroke> <o> gives ø (O WITH STROKE), and <dead_stroke> <t> gives ŧ (T WITH STROKE)?  If the "T with diagonal stroke" did not exist, it would have been enough to just add a dead_shortstroke (or maybe better called "dead_bar"), to reliably put a short bar on a letter.  But in order to be able to compose Ⱦ and ⱦ, also a "dead_diagonal" will be needed, to reliably put a diagonal stroke on a letter.

> Please send a note to xorg-devel about this idea of splitting the
> dead_stroke keysym into separate dead_shortstroke and dead_longsolidus
> keysyms.

It is not a matter of splitting an existing keysym into two new ones.  The dead_stroke key symbol has to keep on existing and has to keep on doing what it does now (sometimes produce a diagonal stroke, sometimes a horizontal one), because people have come to depend on this.  The new, additional dead key symbols, however, behave in a predictable way: dead_shortstroke always produces a short horizontal stroke, and dead_longsolidus always produces a diagonal stroke.

To argue this differently: the COMBINING SHORT STROKE OVERLAY always adds a short horizontal bar to the preceding letter, never something diagonal; and the COMBINING LONG SOLIDUS OVERLAY always adds a diagonal, never a horizontal.  To be able to fully implement the T3 layout, two dead-key symbols are needed that mimic this consistent and dependable behaviour.
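
The collision argued above can be made concrete with a small Python sketch. The three `dead_stroke` entries are the ones cited in this comment; the `PROPOSED` entries are only an assumption about how the new keysyms might be used, and the helper `conflicts` is hypothetical.

```python
import unicodedata

# Existing behaviour, as cited above.
EXISTING = {
    ("dead_stroke", "l"): "\u0142",  # L WITH STROKE
    ("dead_stroke", "o"): "\u00f8",  # O WITH STROKE
    ("dead_stroke", "t"): "\u0167",  # T WITH STROKE
}

# Letters that currently cannot be reached with a dead key.
WANTED = [("l", "\u019a"), ("o", "\u0275"), ("t", "\u2c66")]

def conflicts(table, dead, wanted):
    """List (base, current result, wanted result) where the slot is taken."""
    return [
        (base, table[(dead, base)], target)
        for base, target in wanted
        if (dead, base) in table and table[(dead, base)] != target
    ]

# Assumed resolution: keep dead_stroke untouched and add separate keysyms.
PROPOSED = dict(EXISTING)
PROPOSED.update({
    ("dead_shortstroke", "l"): "\u019a",
    ("dead_shortstroke", "o"): "\u0275",
    ("dead_longsolidus", "t"): "\u2c66",
})

for base, current, target in conflicts(EXISTING, "dead_stroke", WANTED):
    print(f"dead_stroke + {base}: taken by {unicodedata.name(current)}, "
          f"wanted {unicodedata.name(target)}")
```

All three wanted letters collide with an existing `dead_stroke` sequence, while the extended table holds six entries without changing any of the existing ones.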
Comment 23 Andreas Wettstein 2013-09-07 18:24:02 UTC
Comment on attachment 85373
adds compose sequences for T3

Review of attachment 85373:
-----------------------------------------------------------------

The code points, character names, and character strings are consistent.
Comment 24 Andreas Wettstein 2013-09-07 18:29:27 UTC
> When do you think this patch should be fed to the maintainer of xkeyboard-
> config?  A new release of that package is due by the end of this month, but 
> that would be too soon, as the key symbols are not available yet, no?

Right, I believe we should wait until a version of xproto with the required keysyms has been released.

Theoretically, there is the possibility that users define their own keysyms in an XKeysymDB file, which would allow them to define the new keysyms temporarily while they still use a system based on an older xproto version.  But as this is a little-known feature, we cannot expect anyone to do that.
Comment 25 Benno Schulenberg 2014-05-29 15:57:04 UTC
James, what's your opinion on this?  Do you still think I should post comment #22 to xorg-devel, or are you convinced by now?
Comment 26 GitLab Migration User 2018-08-10 20:12:09 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/lib/libx11/issues/56.
