Summary: | allow Unicode non-characters as per Corrigendum 9 | ||
---|---|---|---|
Product: | dbus | Reporter: | Simon McVittie <smcv> |
Component: | core | Assignee: | Simon McVittie <smcv> |
Status: | RESOLVED FIXED | QA Contact: | Havoc Pennington <hp> |
Severity: | normal | ||
Priority: | medium | CC: | desrt, lennart, smcv, thiago, walters |
Version: | unspecified | Keywords: | patch |
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
Attachments: |
[1.6, master] Accept non-characters when validating Unicode
[master] Specification: explicitly allow the Unicode noncharacters
[1.6, master] [v2] Accept non-characters when validating Unicode |
Description
Simon McVittie
2013-04-03 10:33:00 UTC
Should this change also be made in D-Bus 1.6? Answers on a postcard.

For: if an application using new-GDBus sends a message containing Corrigendum 9 UTF-8, making this change in D-Bus 1.6 means it won't get rejected.

Against: an application expecting a message in "GLib 2.34 UTF-8" could receive an unexpected message in "Corrigendum 9 UTF-8" via a stable-branch dbus-daemon, and crash.

If we're going to make this change at all then my inclination would be to say "yes, also change D-Bus 1.6". The number of applications that depend on not receiving non-characters via D-Bus must be vanishingly small.

Created attachment 78331
[1.6, master] Accept non-characters when validating Unicode

Unicode Corrigendum #9 clarifies that the non-characters U+nFFFE (for n in the range 0 to 0x10), U+nFFFF (for n in the same range), and U+FDD0..U+FDEF are valid for interchange, and their presence does not make a string ill-formed.

GLib 2.36 made the corresponding change in its definition of UTF-8 as used by g_utf8_validate() and similar functions.

Created attachment 78332
[master] Specification: explicitly allow the Unicode noncharacters

This follows Unicode Corrigendum #9.

Created attachment 78333
[1.6, master] [v2] Accept non-characters when validating Unicode

Unicode Corrigendum #9 clarifies that the non-characters U+nFFFE (for n in the range 0 to 0x10), U+nFFFF (for n in the same range), and U+FDD0..U+FDEF are valid for interchange, and their presence does not make a string ill-formed.

GLib 2.36 made the corresponding change in its definition of UTF-8 as used by g_utf8_validate() and similar functions.

---
v2: also fix the comment above UNICODE_VALID().

Comment on attachment 78331
[1.6, master] Accept non-characters when validating Unicode

Review of attachment 78331:
-----------------------------------------------------------------

Ship it!

Comment on attachment 78332
[master] Specification: explicitly allow the Unicode noncharacters

Review of attachment 78332:
-----------------------------------------------------------------

Ship it!

Comment on attachment 78333
[1.6, master] [v2] Accept non-characters when validating Unicode

Review of attachment 78333:
-----------------------------------------------------------------

Ship it!

Fixed in git for 1.7.2, 1.6.10.

Any chance you could review Bug #63166, which breaks the build on recent Linux systems, including mine? I think that's the only release blocker at the moment. |
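
For readers who want to see the code-point rule from the patches above in concrete form, here is a minimal C sketch. It is not the dbus source: the real change was made to the UNICODE_VALID() macro, whose definition is not shown in this report, and every function name below (`is_noncharacter`, `valid_strict`, `valid_relaxed`, `check_examples`) is hypothetical. It assumes the pre-change rule rejected the 66 noncharacters in addition to surrogates and out-of-range values, and that the Corrigendum 9 rule accepts any Unicode scalar value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: hypothetical names, not the dbus UNICODE_VALID() macro. */

/* True for the 66 noncharacters: U+FDD0..U+FDEF, and U+nFFFE / U+nFFFF
 * for n in the range 0 to 0x10. */
bool is_noncharacter(uint32_t c)
{
    if (c > 0x10FFFF)
        return false;
    return (c >= 0xFDD0 && c <= 0xFDEF) || (c & 0xFFFE) == 0xFFFE;
}

/* True for Unicode scalar values: in range and not a UTF-16 surrogate. */
bool is_scalar_value(uint32_t c)
{
    return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

/* Assumed older rule: scalar values minus the noncharacters. */
bool valid_strict(uint32_t c)
{
    return is_scalar_value(c) && !is_noncharacter(c);
}

/* Assumed Corrigendum 9 rule: noncharacters are valid for interchange. */
bool valid_relaxed(uint32_t c)
{
    return is_scalar_value(c);
}

/* A few spot checks contrasting the two rules. */
void check_examples(void)
{
    assert(valid_strict(0x41) && valid_relaxed(0x41));      /* 'A' */
    assert(!valid_strict(0xFFFE) && valid_relaxed(0xFFFE)); /* U+FFFE */
    assert(!valid_strict(0xFDD0) && valid_relaxed(0xFDD0)); /* U+FDD0 */
    assert(!valid_relaxed(0xD800));                         /* surrogate */
    assert(!valid_relaxed(0x110000));                       /* out of range */
}
```

Under these assumptions the two rules differ only on the noncharacters. That is the compatibility question raised in the description: a peer still applying the strict rule could be handed a string that a relaxed dbus-daemon has accepted.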