Unicode Consortium - Explore the Science & Experts

The experts below are selected from a list of 339 experts worldwide, ranked by the ideXlab platform.

Michel Suignard - One of the best experts on this subject based on the ideXlab platform.

Latest Proposed Update

2015

Co-Authors: Michel Suignard

Abstract:

Client software, such as browsers and emailers, faces a difficult transition from the version of international domain names approved in 2003 (IDNA2003), to the revision approved in 2010 (IDNA2008). The specification in this document provides a mechanism that minimizes the impact of this transition for client software, allowing client software to access domains that are valid under either system. The specification provides two main features: One is a comprehensive mapping to support current user expectations for casing and other variants of domain names. Such a mapping is allowed by IDNA2008. The second is a compatibility mechanism that supports the existing domain names that were allowed under IDNA2003. This second feature is intended to improve client behavior during the transitional period. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress
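The casing behavior described in this abstract can be illustrated with a minimal sketch. It assumes Python's standard-library "idna" codec, which follows IDNA2003 (RFC 3490); it is not an implementation of the mapping specified in this document, and the helper name to_ascii_idna2003 and the example domain are hypothetical.

```python
def to_ascii_idna2003(domain: str) -> str:
    """Convert a domain name to its ASCII (Punycode) form.

    Uses Python's built-in "idna" codec, which follows IDNA2003.
    This is only an illustration of case mapping during lookup,
    not the compatibility processing defined by the specification.
    """
    return domain.encode("idna").decode("ascii")


# Upper- and lower-case spellings of a non-ASCII label map to the same name.
mixed = to_ascii_idna2003("B\u00fccher.example")
lower = to_ascii_idna2003("b\u00fccher.example")

print(mixed)
print(mixed == lower)
```

Under these assumptions, both spellings resolve to the same ASCII form, which is the kind of user expectation for casing that the mapping feature is meant to preserve.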

Revision 2 Summary

2015

Co-Authors: Mark Davis, Michel Suignard

Abstract:

This document specifies processing that provides compatibility between older and newer versions of internationalized domain names (IDN) for lookup in client software. It allows applications such as browsers and emailers to handle both the original version of internationalized domain names (IDNA2003) and the newer version (IDNA2008) compatibly, avoiding possible interoperability and security problems. [Review Note: At this point, IDNA2008 is still in development, so this draft may change as IDNA2008 changes. The following is a substantial reorganization of version 2, draft 3 of this UTS; the changes previous to that version are not tracked with yellow highlighting.] Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress.

Latest Proposed Update

2015

Co-Authors: Michel Suignard

Abstract:

Client software, such as browsers and emailers, faces a difficult transition from the version of international domain names approved in 2003 (IDNA2003), to the revision approved in 2010 (IDNA2008). The specification in this document provides a mechanism that minimizes the impact of this transition for client software, allowing client software to access domains that are valid under either system. The specification provides two main features: One is a comprehensive mapping to support current user expectations for casing and other variants of domain names. Such a mapping is allowed by IDNA2008. The second is a compatibility mechanism that supports the existing domain names that were allowed under IDNA2003. This second feature is intended to improve client behavior during the transitional period. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS

Latest Proposed Update

2015

Co-Authors: Michel Suignard

Abstract:

Client software, such as browsers and emailers, faces a difficult transition from the version of international domain names approved in 2003 (IDNA2003), to the revision approved in 2010 (IDNA2008). The specification in this document provides a mechanism that minimizes the impact of this transition for client software, allowing client software to access domains that are valid under either system. The specification provides two main features: One is a comprehensive mapping to support current user expectations for casing and other variants of domain names. Such a mapping is allowed by IDNA2008. The second is a compatibility mechanism that supports the existing domain names that were allowed under IDNA2003. This second feature is intended to improve client behavior during the transitional period. Status This document has been reviewed by Unicode members and other interested parties, and has been approved for publication by the Unicode Consortium. This is a stable document and may be used as reference material or cited as a normative reference by other specifications. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS. (UTS #46: Unicode IDNA Compatibility Processing)

Revision 2 Summary

2015

Co-Authors: Mark Davis, Michel Suignard

Abstract:

This document specifies processing that provides compatibility between older and newer versions of internationalized domain names (IDN) for lookup in client software. It allows applications such as browsers and emailers to handle both the original version of internationalized domain names (IDNA2003) and the newer version (IDNA2008) compatibly, avoiding possible interoperability and security problems. [Review Note: At this point, IDNA2008 is still in development, so this draft may change as IDNA2008 changes. The following is a substantial reorganization of the earlier proposed draft of this UTS; the changes from that version are not tracked with yellow highlighting. The text is rough as yet (not yet wordsmithed or copyedited).] Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite thi


Mark Davis - One of the best experts on this subject based on the ideXlab platform.

Current Draft

2015

Co-Authors: Mark Davis, Andy Heninger

Abstract:

This document describes guidelines for how to adapt regular expression engines to use Unicode. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS. Please submit corrigenda and other comments with the online reporting form [Feedback]. Related information that is useful in understanding this document is found in [References]. For the latest version of the Unicode Standard see [Unicode]. For a list of current Unicode Technical Reports see [Reports]. For more information about versions of the Unicode Standard, see [Versions]
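As a rough illustration of what it means for a regular expression engine to use Unicode, the sketch below compares Unicode-aware and ASCII-only matching of the word-character class. It assumes Python's standard-library re module purely as an example engine; re is not claimed to conform to the guidelines in this document, and the sample text is hypothetical.

```python
import re

# Sample text containing non-ASCII letters (U+00EF and U+00E9).
text = "na\u00efve caf\u00e9"

# For str patterns, Python's re treats \w as Unicode-aware by default.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w only matches [a-zA-Z0-9_], so words break at accents.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```

The difference between the two results shows why an engine that only understands ASCII character classes gives surprising answers on ordinary international text.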

Revision 2 Summary

2015

Co-Authors: Mark Davis, Michel Suignard

Abstract:

This document specifies processing that provides compatibility between older and newer versions of internationalized domain names (IDN) for lookup in client software. It allows applications such as browsers and emailers to handle both the original version of internationalized domain names (IDNA2003) and the newer version (IDNA2008) compatibly, avoiding possible interoperability and security problems. [Review Note: At this point, IDNA2008 is still in development, so this draft may change as IDNA2008 changes. The following is a substantial reorganization of version 2, draft 3 of this UTS; the changes previous to that version are not tracked with yellow highlighting.] Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress.

Revision 2 Summary

2015

Co-Authors: Mark Davis, Michel Suignard

Abstract:

This document specifies processing that provides compatibility between older and newer versions of internationalized domain names (IDN) for lookup in client software. It allows applications such as browsers and emailers to handle both the original version of internationalized domain names (IDNA2003) and the newer version (IDNA2008) compatibly, avoiding possible interoperability and security problems. [Review Note: At this point, IDNA2008 is still in development, so this draft may change as IDNA2008 changes. The following is a substantial reorganization of the earlier proposed draft of this UTS; the changes from that version are not tracked with yellow highlighting. The text is rough as yet (not yet wordsmithed or copyedited).] Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite thi

Revision 1 Summary

2014

Co-Authors: Mark Davis, Michel Suignard

Abstract:

This document provides a specification for an internationalized domain name preprocessing step that is intended for use with IDNAbis, the projected update for Internationalized Domain Names. The proposed specification maintains compatibility with IDNA2003 (the current version of Internationalized Domain Names), and consistently extends that mechanism for characters introduced in any later Unicode version. At this point, IDNAbis is still in development, so this draft is based on the current draft of IDNAbis, and may change substantially as that draft changes. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to an

Summary

2014

Co-Authors: Mark Davis, Michel Suignard

Abstract:

Client software, such as browsers and emailers, faces a difficult transition from the version of international domain names approved in 2003 (IDNA2003), to the revision approved in 2010 (IDNA2008). The specification in this document provides a mechanism that minimizes the impact of this transition for client software, allowing client software to access domains that are valid under either system. The specification provides two main features: One is a comprehensive mapping to support current user expectations for casing and other variants of domain names. Such a mapping is allowed by IDNA2008. The second is a compatibility mechanism that supports the existing domain names that were allowed under IDNA2003. This second feature is intended to improve client behavior during the transitional period. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS


Eric Muller - One of the best experts on this subject based on the ideXlab platform.

Technical Reports

2015

Co-Authors: Eric Muller

Abstract:

The UTC has approved work to prepare an update to UTS #37. This update will clarify how Ideographic Variation Sequences (IVSes) can be shared across IVD collections, that no IVD collection has special status, and that implementers can support any subset of the registered IVSes. It will further amend the specification to allow duplicate sequence identifiers within an IVD collection under particular circumstances, and to allow registrants to supply multiple representative glyphs for IVSes in their IVD collections. Finally, the registration procedures will be strengthened to require registrants to supply representative glyphs for registered IVSes and to supply a data file as part of a submission. Feedback on this proposed update can be submitted to the Unicode Consortium online a

Revision 4 Summary

2015

Co-Authors: Hideki Hiura, Eric Muller

Abstract:

This document describes the organization of the Ideographic Variation Database, and the procedure to add sequences to that database. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS. Please submit corrigenda and other comments with the online reporting form [Feedback]. Related information that is useful in understanding this

Latest Proposed Update Database Revision 6 Summary

2014

Co-Authors: Eric Muller, Ken Lunde 小林劍

Abstract:

This document describes the organization of the Ideographic Variation Database, and the procedure to add sequences to that database. Status This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress. A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS. Please submit corrigenda and other comments with the online reporting form [Feedback]. Related information that is useful in understanding this document is found in References. For the latest version of the Unicode Standard see [Unicode]. For a list of current Unicode Technical Reports see [Reports]. For more information about versions of the Unicode Standard, see [Versions]


Source Van Anderson - One of the best experts on this subject based on the ideXlab platform.

Discussion list: Chinook in the UCS

2015

Co-Authors: Document N, Source Van Anderson

Abstract:

The Duployan shorthands and Chinook script are used as a secondary shorthand for writing French, English, German, Spanish, and Romanian, and as an alternate primary script for several First Nations' languages of interior British Columbia, including the Chinook Jargon, Okanagan, Lilooet, Shushwap, and North Thompson. The original Duployan shorthand was invented by Emile Duployé and published in 1860 as a stenographic shorthand for French. It was one of the two most commonly used French shorthands, being more popular in the south of France and adjacent French-speaking areas of other countries. Adapted Duployan shorthands were also developed for English, German, Spanish, and Romanian. The basic inventory of consonant and vowel signs (all in the first two columns of the allocation) has been augmented over the years to provide more efficient shorthands for these languages and to adapt it to the phonologies of these languages and the languages using Chinook writing. There currently exists no encoding, PUA or otherwise, for the representation of the Duployan or Chinook. Indeed, the submission of the Duployan Shorthands and Chinook script to the Unicode Consortium has necessitated the creation, from scratch, of the first Duployan/Chinook font, and the allocation is based solely on the internal logic of the script and affinity of usage among characters. The Chinook script was an adaptation and augmentation of the Duployan shorthand by Fr. Jean Marie Raphael LeJeune, used for writing the Chinook Jargon and other languages of 19th-century interior British Columbia. Its original use and greatest surviving attestation is from the run of the Kamloops Wawa, a (mostly) Chinook Jargon newsletter of the Catholic diocese of Kamloops, British Columbia, published 1891-1923. At the time, the Chinook Jargon pidgin was widely spoken from southeastern Alaska to northern California, from the Pacific to the Rockies, and sporadicall

Historical Overview of the Duployéan and adaptations

2014

Co-Authors: Source Van Anderson

Abstract:

The Duployéan shorthands and Chinook script are used as a secondary shorthand for writing French, English, German, Spanish, and Rumanian, and as an alternate primary script for several First Nations' languages of interior British Columbia, including the Chinook Jargon, Okanagan, Lilooet, Shushwap, and North Thompson. The original Duployéan shorthand was invented by Emile Duployé and published in 1860 as a stenographic shorthand for French. It was one of the two most commonly used shorthand systems in France, being more popular in southern France and adjacent French-speaking areas of other countries. Adaptations of Duployéan were developed for the representation of English, German, Spanish, and Rumanian. The basic inventory of consonant and vowel signs (all in the first two columns of the allocation) has been augmented over the years to provide more efficient shorthands for these languages and to adapt it to the phonologies of these languages and the languages using Chinook writing. There currently exists no formal encoding, in any context, for the representation of the Duployéan or Chinook. Indeed, the submission of the Duployéan Shorthands and Chinook script to the Unicode Consortium has necessitated the creation, from scratch, of the first Duployéan/Chinook font, and the allocation is based solely on the internal logic and historical usage of the script. The Chinook script was an adaptation and augmentation of the Duployéan shorthand by Fr. Jean Marie Raphael LeJeune, used for writing the Chinook Jargon and other languages of 19th-century interior British Columbia. Its original use and greatest surviving attestation is from the run of the Kamloops Wawa, a (mostly) Chinook Jargon newsletter of the Catholic diocese of Kamloops, British Columbia, published 1891-1923. At the time, the Chinook Jargon pidgin was widely spoken from southeastern Alaska to northern California, from the Pacific to the Rockies, and sporadically outside this area. Although the Chinook Jargon was th

Discussion list: Chinook in the UCS

2010

Co-Authors: Source Van Anderson

Abstract:

The Duployan shorthands and Chinook script are used as a secondary shorthand for writing French, English, German, Spanish, and Romanian, and as an alternate primary script for several First Nations' languages of interior British Columbia, including the Chinook Jargon, Okanagan, Lilooet, Shushwap, and North Thompson. The original Duployan shorthand was invented by Emile Duployé and published in 1860 as a stenographic shorthand for French. It was one of the two most commonly used French shorthands, being more popular in the south of France and adjacent French-speaking areas of other countries. Adapted Duployan shorthands were also developed for English, German, Spanish, and Romanian. The basic inventory of consonant and vowel signs (all in the first two columns of the allocation) has been augmented over the years to provide more efficient shorthands for these languages and to adapt it to the phonologies of these languages and the languages using Chinook writing. There currently exists no encoding, PUA or otherwise, for the representation of the Duployan or Chinook. Indeed, the submission of the Duployan Shorthands and Chinook script to the Unicode Consortium has necessitated the creation, from scratch, of the first Duployan/Chinook font, and the allocation is based solely on the internal logic of the script and affinity of usage among characters. The Chinook script was an adaptation and augmentation of the Duployan shorthand by Fr. Jean Marie Raphael LeJeune, used for writing the Chinook Jargon and other languages of 19th-century interior British Columbia. Its original use and greatest surviving attestation is from the run of the Kamloops Wawa, a (mostly) Chinook Jargon newsletter of the Catholic diocese of Kamloops, British Columbia, published 1891-1923. At the time, the Chinook Jargon pidgin was widely spoken from southeastern Alaska to northern California, from the Pacific to the Rockies, and sporadically outside this area. Although the Chinook Jargon was the lingua franca in many communities of the Pacific Northwest, it was generally a spoken

Discussion list: Chinook in the UCS Historical Overview of the Duployan and adaptations

2010

Co-Authors: Source Van Anderson

Abstract:

The Duployan shorthands and Chinook script are used as a secondary shorthand for writing French, English, German, Spanish, and Romanian, and as an alternate primary script for several First Nations' languages of interior British Columbia, including the Chinook Jargon, Okanagan, Lilooet, Shushwap, and North Thompson. The original Duployan shorthand was invented by Emile Duployé and published in 1860 as a stenographic shorthand for French. It was one of the two most commonly used French shorthands, being more popular in the south of France and adjacent French-speaking areas of other countries. Adapted Duployan shorthands were also developed for English, German, Spanish, and Romanian. The basic inventory of consonant and vowel signs (all in the first two columns of the allocation) has been augmented over the years to provide more efficient shorthands for these languages and to adapt it to the phonologies of these languages and the languages using Chinook writing. There currently exists no encoding, PUA or otherwise, for the representation of the Duployan or Chinook. Indeed, the submission of the Duployan Shorthands and Chinook script to the Unicode Consortium has necessitated the creation, from scratch, of the first Duployan/Chinook font, and the allocation is based solely on the internal logic of the script and affinity of usage among characters. The Chinook script was an adaptation and augmentation of the Duployan shorthand by Fr. Jean Marie Raphael LeJeune, used for writing the Chinook Jargon and other languages of 19th-century interior British Columbia. Its original use and greatest surviving attestation is from the run of the Kamloops Wawa, a (mostly) Chinook Jargon newsletter of the Catholic diocese of Kamloops, British Columbia, published 1891-1923. At the time, the Chinook Jargon pidgin was widely spoken from southeastern Alaska to northern California, from the Pacific to the Rockies, and sporadicall


Andrew West - One of the best experts on this subject based on the ideXlab platform.

Final proposal to encode the Khitan Small Script in the SMP of the UCS

2016

Co-Authors: Sun Bojun, Wu Yingzhe, Yongshi Jing, Jiruhe, Viacheslav Zaytsev, Andrew West, Michael Everson

Abstract:

This 2016 working document includes the proposed repertoire for the Khitan Small Script for its eventual encoding in the Unicode Standard. Subsequent modifications to the repertoire have appeared and can be found in the Unicode Consortium document register. The script was officially published in Unicode 13.0 in 2020.

Summary

2015

Co-Authors: Aaron Bell, Greg Eck, Andrew Glass, Andrew West

Abstract:

The Mongolian block starts with U+1800 MONGOLIAN BIRGA, which is a kind of ornament that usually marks the beginning of a text or folio. Like Tibetan, which has a related character (U+0F04), there are multiple different types of the birga symbol. Five types of birga have been identified in publications that pioneered the Mongolian encoding (Erdenechimeg et al. 1999 and Quejingzhabu 2000). These publications include guidelines that encode the birga variants using sequences based on the standard MONGOLIAN BIRGA U+1800 with one of the MONGOLIAN FREE VARIATION SELECTORS (U+180B‒180D). Because there are just three MONGOLIAN FREE VARIATION SELECTORS, ZWJ is used as the fourth variation marker (U+1800 U+200D). These sequences for the birga variants have not been accepted by the Unicode Consortium and are not included in the current version of StandardizedVariants.txt (The Unicode Consortium 2013c). All other variation sequences specified in Erdenechimeg et al. 1999 and Quejingzhabu 2000 are included in StandardizedVariants.txt. The absence of these sequences from StandardizedVariants.txt, and of any recommendation on how to access them, has caused confusion among users and implementers of the standard. The authors of this document would like to discuss options for the correct encoding of these characters prior to updating the Mongolian code charts (see doc L2/14-031) so that the work on the code charts can include or exclude these variation sequences as appropriate. Three options are presented in this document and a recommendation is made for the separate encoding of these characters.
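The sequences described in this abstract can be sketched as follows. This is only an illustration, assuming Python's standard-library unicodedata module; the names BIRGA, MARKERS, and describe are hypothetical, and the sequences are the unaccepted ones from the cited publications, not entries in StandardizedVariants.txt.

```python
import unicodedata

# Base character: U+1800 MONGOLIAN BIRGA.
BIRGA = "\u1800"

# The three Mongolian free variation selectors, plus ZWJ used as a
# fourth variation marker in the guidelines the abstract refers to.
MARKERS = ["\u180b", "\u180c", "\u180d", "\u200d"]


def describe(sequence: str) -> str:
    """Return a readable 'U+XXXX NAME + ...' description of a sequence."""
    return " + ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence
    )


# One two-character sequence per marker; with the bare base character
# this gives five forms in total.
sequences = [BIRGA + marker for marker in MARKERS]

for seq in sequences:
    print(describe(seq))
```

Listing the code points this way makes the irregularity visible: three sequences use dedicated variation selectors, while the fourth relies on a general-purpose joiner.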

Summary

2014

Co-Authors: Aaron Bell, Greg Eck, Andrew Glass, Andrew West

Abstract:

The Mongolian block starts with U+1800 MONGOLIAN BIRGA, which is a kind of ornament that usually marks the beginning of a text or folio. Like Tibetan, which has a related character (U+0F04), there are multiple different types of the birga symbol. Five types of birga have been identified in publications that pioneered the Mongolian encoding (Erdenechimeg et al. 1999 and Quejingzhabu 2000). These publications include guidelines that encode the birga variants using sequences based on the standard MONGOLIAN BIRGA U+1800 with one of the MONGOLIAN FREE VARIATION SELECTORS (U+180B‒180D). Because there are just three MONGOLIAN FREE VARIATION SELECTORS, ZWJ is used as the fourth variation marker (U+1800 U+200D). These sequences for the birga variants have not been accepted by the Unicode Consortium and are not included in the current version of StandardizedVariants.txt (The Unicode Consortium 2013c). All other variation sequences specified in Erdenechimeg et al. 1999 and Quejingzhabu 2000 are included in StandardizedVariants.txt. The absence of these sequences from StandardizedVariants.txt, and of any recommendation on how to access them, has caused confusion among users and implementers of the standard. The exclusion of the sequences also means that their use is not conformant with Unicode, since StandardizedVariants.txt is a normative contributory data file. The authors of this document believe that the Mongolian birgas should be encoded as atomic code points, and ask that the UTC consider this proposal.

