Miscellaneous Binary Properties

This page serves as a grab bag for Unicode character properties that did not warrant a chart of their own but are still interesting enough to be presented in the usual format. They are all binary properties; for characters not explicitly listed here the respective property value is False.

Sources: PropList.txt, emoji-data.txt

Bidi Controls

The Bidi_Control property enumerates formatting characters that serve special functions in the Unicode Bidirectional Algorithm. They include implicit controls, which are invisible standalone marks that carry the same bidi properties as regular letters and provide lightweight formatting, and explicit controls, which are stateful and affect the bidi handling of entire text runs.
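Because these characters are invisible, it can help to locate or remove them programmatically. The following is a minimal sketch that hard-codes the code points from the table in this section; the function names are illustrative and not part of any library.

```python
# Code points copied from the Bidi Controls table on this page.
IMPLICIT_BIDI_CONTROLS = {0x061C, 0x200E, 0x200F}
EXPLICIT_BIDI_CONTROLS = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A))
BIDI_CONTROLS = IMPLICIT_BIDI_CONTROLS | EXPLICIT_BIDI_CONTROLS


def find_bidi_controls(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') pairs for every Bidi_Control character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) in BIDI_CONTROLS
    ]


def strip_bidi_controls(text: str) -> str:
    """Remove all Bidi_Control characters (hypothetical helper).

    Note that this can change how mixed-direction text is displayed.
    """
    return "".join(ch for ch in text if ord(ch) not in BIDI_CONTROLS)
```

Stripping is only appropriate where the visual order of the text does not matter, for example when comparing identifiers.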

Code Char Name
Implicit Controls
061C ؜ Arabic Letter Mark
200E Left-to-Right Mark
200F Right-to-Left Mark
Explicit Controls
202A Left-to-Right Embedding
202B Right-to-Left Embedding
202C Pop Directional Formatting
202D Left-to-Right Override
202E Right-to-Left Override
2066 Left-to-Right Isolate
2067 Right-to-Left Isolate
2068 First Strong Isolate
2069 Pop Directional Isolate

Dashes

The Dash property includes all characters with the General_Category property value Dash_Punctuation, plus a few other symbols of a similar nature.
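The difference between the Dash property and the Dash_Punctuation category can be checked with Python's standard `unicodedata` module, which exposes General_Category but not the Dash property itself. This sketch looks up the category of a few code points taken from the table in this section; the helper name is illustrative.

```python
import unicodedata

# A few code points from the Dashes table on this page.
SAMPLE_DASHES = [0x002D, 0x2014, 0x207B, 0x2212]


def dash_categories(code_points):
    """Map each code point to its General_Category abbreviation."""
    return {f"U+{cp:04X}": unicodedata.category(chr(cp)) for cp in code_points}


categories = dash_categories(SAMPLE_DASHES)
# "Pd" is Dash_Punctuation; "Sm" is Math_Symbol.
not_dash_punctuation = [cp for cp, cat in categories.items() if cat != "Pd"]
```

Here `not_dash_punctuation` collects the sampled characters that have the Dash property without being Dash_Punctuation, such as the minus sign.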

Code Char Name
002D - Hyphen-Minus
058A ֊ Armenian Hyphen
05BE ־ Hebrew Punctuation Maqaf
1400 Canadian Syllabics Hyphen
1806 Mongolian Todo Soft Hyphen
2010 Hyphen
2011 Non-Breaking Hyphen
2012 Figure Dash
2013 En Dash
2014 Em Dash
2015 Horizontal Bar
2053 Swung Dash
207B Superscript Minus
208B Subscript Minus
2212 Minus Sign
2E17 Double Oblique Hyphen
2E1A Hyphen with Diaeresis
2E3A Two-Em Dash
2E3B Three-Em Dash
2E40 Double Hyphen
2E5D Oblique Hyphen
301C Wave Dash
3030 Wavy Dash
30A0 Katakana-Hiragana Double Hyphen
FE31 Presentation Form for Vertical Em Dash
FE32 Presentation Form for Vertical En Dash
FE58 Small Em Dash
FE63 Small Hyphen-Minus
FF0D Fullwidth Hyphen-Minus
10D6E 𐵮 Garay Hyphen
10EAD 𐺭 Yezidi Hyphenation Mark

Emoji Modifiers

Emoji modifiers are used in conjunction with emoji modifier bases to produce emoji modifier sequences, which are variants of human‐form emoji with specific skin tones. They employ the Fitzpatrick scale, a six‐point scale that classifies human skin based on its susceptibility to sunburn. Fitzpatrick types I and II have been combined into a single emoji modifier.

See Unicode Technical Standard #51: Unicode Emoji, section 2.4: Diversity for more information.
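The mapping from the six Fitzpatrick types to the five modifiers can be written as a small lookup. This is a sketch using the code points from the table in this section; the dictionary and function names are illustrative.

```python
# Emoji modifier characters from the table on this page, keyed by Fitzpatrick type.
# Types 1 and 2 share a single modifier.
MODIFIER_BY_FITZPATRICK_TYPE = {
    1: "\U0001F3FB",
    2: "\U0001F3FB",
    3: "\U0001F3FC",
    4: "\U0001F3FD",
    5: "\U0001F3FE",
    6: "\U0001F3FF",
}


def emoji_modifier(fitzpatrick_type: int) -> str:
    """Return the emoji modifier character for a Fitzpatrick type from 1 to 6."""
    try:
        return MODIFIER_BY_FITZPATRICK_TYPE[fitzpatrick_type]
    except KeyError:
        raise ValueError(
            f"Fitzpatrick type must be 1 to 6, got {fitzpatrick_type!r}"
        ) from None
```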

Code Char Name
1F3FB 🏻 Emoji Modifier Fitzpatrick Type-1-2
1F3FC 🏼 Emoji Modifier Fitzpatrick Type-3
1F3FD 🏽 Emoji Modifier Fitzpatrick Type-4
1F3FE 🏾 Emoji Modifier Fitzpatrick Type-5
1F3FF 🏿 Emoji Modifier Fitzpatrick Type-6

Emoji Modifier Bases

Emoji modifier bases are all characters that an emoji modifier can be applied to. An emoji modifier sequence always consists of a character with Emoji_Modifier_Base followed immediately by a character with Emoji_Modifier. The “Example” column shows all such combinations for each character. Emoji modifiers automatically request emoji‐style display for their bases, so Variation Selector-16 is not required even for text‐default characters.

Not all well‐formed emoji modifier sequences are recommended for general interchange (RGI). In particular, applications are not expected to support skin tone variants of U+1F46A 👪 Family.
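Since a modifier sequence is just a base followed immediately by a modifier, building and undoing such sequences takes only a few lines. The sketch below assumes a small sample of bases taken from the table in this section; a real implementation would load the full Emoji_Modifier_Base set from emoji-data.txt. All names are illustrative.

```python
# The five emoji modifiers, U+1F3FB..U+1F3FF.
EMOJI_MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]

# A small sample of modifier bases from the table on this page (not the full set).
SAMPLE_BASES = {"\u261D", "\U0001F44B", "\U0001F44D"}


def modifier_sequences(base: str) -> list[str]:
    """Return the five emoji modifier sequences for a single base character."""
    return [base + modifier for modifier in EMOJI_MODIFIERS]


def strip_modifiers(text: str, bases=SAMPLE_BASES) -> str:
    """Drop each emoji modifier that immediately follows a modifier base."""
    out = []
    prev = ""
    for ch in text:
        if not (ch in EMOJI_MODIFIERS and prev in bases):
            out.append(ch)
        prev = ch
    return "".join(out)
```

A modifier that does not directly follow a base is left in place, since it is not part of an emoji modifier sequence.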

Source: emoji-sequences.txt

Code Char Name Example
261D White Up Pointing Index ☝🏻☝🏼☝🏽☝🏾☝🏿
26F9 Person with Ball ⛹🏻⛹🏼⛹🏽⛹🏾⛹🏿
270A Raised Fist ✊🏻✊🏼✊🏽✊🏾✊🏿
270B Raised Hand ✋🏻✋🏼✋🏽✋🏾✋🏿
270C Victory Hand ✌🏻✌🏼✌🏽✌🏾✌🏿
270D Writing Hand ✍🏻✍🏼✍🏽✍🏾✍🏿
1F385 🎅 Father Christmas 🎅🏻🎅🏼🎅🏽🎅🏾🎅🏿
1F3C2 🏂 Snowboarder 🏂🏻🏂🏼🏂🏽🏂🏾🏂🏿
1F3C3 🏃 Runner 🏃🏻🏃🏼🏃🏽🏃🏾🏃🏿
1F3C4 🏄 Surfer 🏄🏻🏄🏼🏄🏽🏄🏾🏄🏿
1F3C7 🏇 Horse Racing 🏇🏻🏇🏼🏇🏽🏇🏾🏇🏿
1F3CA 🏊 Swimmer 🏊🏻🏊🏼🏊🏽🏊🏾🏊🏿
1F3CB 🏋 Weight Lifter 🏋🏻🏋🏼🏋🏽🏋🏾🏋🏿
1F3CC 🏌 Golfer 🏌🏻🏌🏼🏌🏽🏌🏾🏌🏿
1F442 👂 Ear 👂🏻👂🏼👂🏽👂🏾👂🏿
1F443 👃 Nose 👃🏻👃🏼👃🏽👃🏾👃🏿
1F446 👆 White Up Pointing Backhand Index 👆🏻👆🏼👆🏽👆🏾👆🏿
1F447 👇 White Down Pointing Backhand Index 👇🏻👇🏼👇🏽👇🏾👇🏿
1F448 👈 White Left Pointing Backhand Index 👈🏻👈🏼👈🏽👈🏾👈🏿
1F449 👉 White Right Pointing Backhand Index 👉🏻👉🏼👉🏽👉🏾👉🏿
1F44A 👊 Fisted Hand Sign 👊🏻👊🏼👊🏽👊🏾👊🏿
1F44B 👋 Waving Hand Sign 👋🏻👋🏼👋🏽👋🏾👋🏿
1F44C 👌 OK Hand Sign 👌🏻👌🏼👌🏽👌🏾👌🏿
1F44D 👍 Thumbs Up Sign 👍🏻👍🏼👍🏽👍🏾👍🏿
1F44E 👎 Thumbs Down Sign 👎🏻👎🏼👎🏽👎🏾👎🏿
1F44F 👏 Clapping Hands Sign 👏🏻👏🏼👏🏽👏🏾👏🏿
1F450 👐 Open Hands Sign 👐🏻👐🏼👐🏽👐🏾👐🏿
1F466 👦 Boy 👦🏻👦🏼👦🏽👦🏾👦🏿
1F467 👧 Girl 👧🏻👧🏼👧🏽👧🏾👧🏿
1F468 👨 Man 👨🏻👨🏼👨🏽👨🏾👨🏿
1F469 👩 Woman 👩🏻👩🏼👩🏽👩🏾👩🏿
1F46A 👪 Family 👪🏻👪🏼👪🏽👪🏾👪🏿
1F46B 👫 Man and Woman Holding Hands 👫🏻👫🏼👫🏽👫🏾👫🏿
1F46C 👬 Two Men Holding Hands 👬🏻👬🏼👬🏽👬🏾👬🏿
1F46D 👭 Two Women Holding Hands 👭🏻👭🏼👭🏽👭🏾👭🏿
1F46E 👮 Police Officer 👮🏻👮🏼👮🏽👮🏾👮🏿
1F46F 👯 Woman with Bunny Ears 👯🏻👯🏼👯🏽👯🏾👯🏿
1F470 👰 Bride with Veil 👰🏻👰🏼👰🏽👰🏾👰🏿
1F471 👱 Person with Blond Hair 👱🏻👱🏼👱🏽👱🏾👱🏿
1F472 👲 Man with Gua Pi Mao 👲🏻👲🏼👲🏽👲🏾👲🏿
1F473 👳 Man with Turban 👳🏻👳🏼👳🏽👳🏾👳🏿
1F474 👴 Older Man 👴🏻👴🏼👴🏽👴🏾👴🏿
1F475 👵 Older Woman 👵🏻👵🏼👵🏽👵🏾👵🏿
1F476 👶 Baby 👶🏻👶🏼👶🏽👶🏾👶🏿
1F477 👷 Construction Worker 👷🏻👷🏼👷🏽👷🏾👷🏿
1F478 👸 Princess 👸🏻👸🏼👸🏽👸🏾👸🏿
1F47C 👼 Baby Angel 👼🏻👼🏼👼🏽👼🏾👼🏿
1F481 💁 Information Desk Person 💁🏻💁🏼💁🏽💁🏾💁🏿
1F482 💂 Guardsman 💂🏻💂🏼💂🏽💂🏾💂🏿
1F483 💃 Dancer 💃🏻💃🏼💃🏽💃🏾💃🏿
1F485 💅 Nail Polish 💅🏻💅🏼💅🏽💅🏾💅🏿
1F486 💆 Face Massage 💆🏻💆🏼💆🏽💆🏾💆🏿
1F487 💇 Haircut 💇🏻💇🏼💇🏽💇🏾💇🏿
1F48F 💏 Kiss 💏🏻💏🏼💏🏽💏🏾💏🏿
1F491 💑 Couple with Heart 💑🏻💑🏼💑🏽💑🏾💑🏿
1F4AA 💪 Flexed Biceps 💪🏻💪🏼💪🏽💪🏾💪🏿
1F574 🕴 Man in Business Suit Levitating 🕴🏻🕴🏼🕴🏽🕴🏾🕴🏿
1F575 🕵 Sleuth or Spy 🕵🏻🕵🏼🕵🏽🕵🏾🕵🏿
1F57A 🕺 Man Dancing 🕺🏻🕺🏼🕺🏽🕺🏾🕺🏿
1F590 🖐 Raised Hand with Fingers Splayed 🖐🏻🖐🏼🖐🏽🖐🏾🖐🏿
1F595 🖕 Reversed Hand with Middle Finger Extended 🖕🏻🖕🏼🖕🏽🖕🏾🖕🏿
1F596 🖖 Raised Hand with Part Between Middle and Ring Fingers 🖖🏻🖖🏼🖖🏽🖖🏾🖖🏿
1F645 🙅 Face with No Good Gesture 🙅🏻🙅🏼🙅🏽🙅🏾🙅🏿
1F646 🙆 Face with OK Gesture 🙆🏻🙆🏼🙆🏽🙆🏾🙆🏿
1F647 🙇 Person Bowing Deeply 🙇🏻🙇🏼🙇🏽🙇🏾🙇🏿
1F64B 🙋 Happy Person Raising One Hand 🙋🏻🙋🏼🙋🏽🙋🏾🙋🏿
1F64C 🙌 Person Raising Both Hands in Celebration 🙌🏻🙌🏼🙌🏽🙌🏾🙌🏿
1F64D 🙍 Person Frowning 🙍🏻🙍🏼🙍🏽🙍🏾🙍🏿
1F64E 🙎 Person with Pouting Face 🙎🏻🙎🏼🙎🏽🙎🏾🙎🏿
1F64F 🙏 Person with Folded Hands 🙏🏻🙏🏼🙏🏽🙏🏾🙏🏿
1F6A3 🚣 Rowboat 🚣🏻🚣🏼🚣🏽🚣🏾🚣🏿
1F6B4 🚴 Bicyclist 🚴🏻🚴🏼🚴🏽🚴🏾🚴🏿
1F6B5 🚵 Mountain Bicyclist 🚵🏻🚵🏼🚵🏽🚵🏾🚵🏿
1F6B6 🚶 Pedestrian 🚶🏻🚶🏼🚶🏽🚶🏾🚶🏿
1F6C0 🛀 Bath 🛀🏻🛀🏼🛀🏽🛀🏾🛀🏿
1F6CC 🛌 Sleeping Accommodation 🛌🏻🛌🏼🛌🏽🛌🏾🛌🏿
1F90C 🤌 Pinched Fingers 🤌🏻🤌🏼🤌🏽🤌🏾🤌🏿
1F90F 🤏 Pinching Hand 🤏🏻🤏🏼🤏🏽🤏🏾🤏🏿
1F918 🤘 Sign of the Horns 🤘🏻🤘🏼🤘🏽🤘🏾🤘🏿
1F919 🤙 Call Me Hand 🤙🏻🤙🏼🤙🏽🤙🏾🤙🏿
1F91A 🤚 Raised Back of Hand 🤚🏻🤚🏼🤚🏽🤚🏾🤚🏿
1F91B 🤛 Left-Facing Fist 🤛🏻🤛🏼🤛🏽🤛🏾🤛🏿
1F91C 🤜 Right-Facing Fist 🤜🏻🤜🏼🤜🏽🤜🏾🤜🏿
1F91D 🤝 Handshake 🤝🏻🤝🏼🤝🏽🤝🏾🤝🏿
1F91E 🤞 Hand with Index and Middle Fingers Crossed 🤞🏻🤞🏼🤞🏽🤞🏾🤞🏿
1F91F 🤟 I Love You Hand Sign 🤟🏻🤟🏼🤟🏽🤟🏾🤟🏿
1F926 🤦 Face Palm 🤦🏻🤦🏼🤦🏽🤦🏾🤦🏿
1F930 🤰 Pregnant Woman 🤰🏻🤰🏼🤰🏽🤰🏾🤰🏿
1F931 🤱 Breast-Feeding 🤱🏻🤱🏼🤱🏽🤱🏾🤱🏿
1F932 🤲 Palms Up Together 🤲🏻🤲🏼🤲🏽🤲🏾🤲🏿
1F933 🤳 Selfie 🤳🏻🤳🏼🤳🏽🤳🏾🤳🏿
1F934 🤴 Prince 🤴🏻🤴🏼🤴🏽🤴🏾🤴🏿
1F935 🤵 Man in Tuxedo 🤵🏻🤵🏼🤵🏽🤵🏾🤵🏿
1F936 🤶 Mother Christmas 🤶🏻🤶🏼🤶🏽🤶🏾🤶🏿
1F937 🤷 Shrug 🤷🏻🤷🏼🤷🏽🤷🏾🤷🏿
1F938 🤸 Person Doing Cartwheel 🤸🏻🤸🏼🤸🏽🤸🏾🤸🏿
1F939 🤹 Juggling 🤹🏻🤹🏼🤹🏽🤹🏾🤹🏿
1F93C 🤼 Wrestlers 🤼🏻🤼🏼🤼🏽🤼🏾🤼🏿
1F93D 🤽 Water Polo 🤽🏻🤽🏼🤽🏽🤽🏾🤽🏿
1F93E 🤾 Handball 🤾🏻🤾🏼🤾🏽🤾🏾🤾🏿
1F977 🥷 Ninja 🥷🏻🥷🏼🥷🏽🥷🏾🥷🏿
1F9B5 🦵 Leg 🦵🏻🦵🏼🦵🏽🦵🏾🦵🏿
1F9B6 🦶 Foot 🦶🏻🦶🏼🦶🏽🦶🏾🦶🏿
1F9B8 🦸 Superhero 🦸🏻🦸🏼🦸🏽🦸🏾🦸🏿
1F9B9 🦹 Supervillain 🦹🏻🦹🏼🦹🏽🦹🏾🦹🏿
1F9BB 🦻 Ear with Hearing Aid 🦻🏻🦻🏼🦻🏽🦻🏾🦻🏿
1F9CD 🧍 Standing Person 🧍🏻🧍🏼🧍🏽🧍🏾🧍🏿
1F9CE 🧎 Kneeling Person 🧎🏻🧎🏼🧎🏽🧎🏾🧎🏿
1F9CF 🧏 Deaf Person 🧏🏻🧏🏼🧏🏽🧏🏾🧏🏿
1F9D1 🧑 Adult 🧑🏻🧑🏼🧑🏽🧑🏾🧑🏿
1F9D2 🧒 Child 🧒🏻🧒🏼🧒🏽🧒🏾🧒🏿
1F9D3 🧓 Older Adult 🧓🏻🧓🏼🧓🏽🧓🏾🧓🏿
1F9D4 🧔 Bearded Person 🧔🏻🧔🏼🧔🏽🧔🏾🧔🏿
1F9D5 🧕 Person with Headscarf 🧕🏻🧕🏼🧕🏽🧕🏾🧕🏿
1F9D6 🧖 Person in Steamy Room 🧖🏻🧖🏼🧖🏽🧖🏾🧖🏿
1F9D7 🧗 Person Climbing 🧗🏻🧗🏼🧗🏽🧗🏾🧗🏿
1F9D8 🧘 Person in Lotus Position 🧘🏻🧘🏼🧘🏽🧘🏾🧘🏿
1F9D9 🧙 Mage 🧙🏻🧙🏼🧙🏽🧙🏾🧙🏿
1F9DA 🧚 Fairy 🧚🏻🧚🏼🧚🏽🧚🏾🧚🏿
1F9DB 🧛 Vampire 🧛🏻🧛🏼🧛🏽🧛🏾🧛🏿
1F9DC 🧜 Merperson 🧜🏻🧜🏼🧜🏽🧜🏾🧜🏿
1F9DD 🧝 Elf 🧝🏻🧝🏼🧝🏽🧝🏾🧝🏿
1FAC3 🫃 Pregnant Man 🫃🏻🫃🏼🫃🏽🫃🏾🫃🏿
1FAC4 🫄 Pregnant Person 🫄🏻🫄🏼🫄🏽🫄🏾🫄🏿
1FAC5 🫅 Person with Crown 🫅🏻🫅🏼🫅🏽🫅🏾🫅🏿
1FAF0 🫰 Hand with Index Finger and Thumb Crossed 🫰🏻🫰🏼🫰🏽🫰🏾🫰🏿
1FAF1 🫱 Rightwards Hand 🫱🏻🫱🏼🫱🏽🫱🏾🫱🏿
1FAF2 🫲 Leftwards Hand 🫲🏻🫲🏼🫲🏽🫲🏾🫲🏿
1FAF3 🫳 Palm Down Hand 🫳🏻🫳🏼🫳🏽🫳🏾🫳🏿
1FAF4 🫴 Palm Up Hand 🫴🏻🫴🏼🫴🏽🫴🏾🫴🏿
1FAF5 🫵 Index Pointing at the Viewer 🫵🏻🫵🏼🫵🏽🫵🏾🫵🏿
1FAF6 🫶 Heart Hands 🫶🏻🫶🏼🫶🏽🫶🏾🫶🏿
1FAF7 🫷 Leftwards Pushing Hand 🫷🏻🫷🏼🫷🏽🫷🏾🫷🏿
1FAF8 🫸 Rightwards Pushing Hand 🫸🏻🫸🏼🫸🏽🫸🏾🫸🏿

Extenders

The set of characters with the Extender property unifies several disparate concepts that are of relevance to the Unicode Collation Algorithm. It includes characters that graphically extend or modify the shape of surrounding characters, as well as extenders in a linguistic sense, such as vowel or consonant lengtheners and repetition marks.

Code Char Name
00B7 · Middle Dot
02D0 ː Modifier Letter Triangular Colon
02D1 ˑ Modifier Letter Half Triangular Colon
0640 ـ Arabic Tatweel
07FA ߺ NKo Lajanyalan
0A71 Gurmukhi Addak
0AFB Gujarati Sign Shadda
0B55 Oriya Sign Overline
0E46 Thai Character Maiyamok
0EC6 Lao Ko La
180A Mongolian Nirugu
1843 Mongolian Letter Todo Long Vowel Sign
1AA7 Tai Tham Sign Mai Yamok
1C36 Lepcha Sign Ran
1C7B Ol Chiki Relaa
3005 Ideographic Iteration Mark
3031 Vertical Kana Repeat Mark
3032 Vertical Kana Repeat with Voiced Sound Mark
3033 Vertical Kana Repeat Mark Upper Half
3034 Vertical Kana Repeat with Voiced Sound Mark Upper Half
3035 Vertical Kana Repeat Mark Lower Half
309D Hiragana Iteration Mark
309E Hiragana Voiced Iteration Mark
30FC Katakana-Hiragana Prolonged Sound Mark
30FD Katakana Iteration Mark
30FE Katakana Voiced Iteration Mark
A015 Yi Syllable Iteration Mark
A60C Vai Syllable Lengthener
A9CF Javanese Pangrangkep
A9E6 Myanmar Modifier Letter Shan Reduplication
AA70 Myanmar Modifier Letter Khamti Reduplication
AADD Tai Viet Symbol Sam
AAF3 Meetei Mayek Syllable Repetition Mark
AAF4 Meetei Mayek Word Repetition Mark
FF70 Halfwidth Katakana-Hiragana Prolonged Sound Mark
10781 𐞁 Modifier Letter Superscript Triangular Colon
10782 𐞂 Modifier Letter Superscript Half Triangular Colon
10D4E 𐵎 Garay Vowel Length Mark
10D6A 𐵪 Garay Consonant Gemination Mark
10D6F 𐵯 Garay Reduplication Mark
11237 𑈷 Khojki Sign Shadda
1135D 𑍝 Grantha Sign Pluta
113D2 𑏒 Tulu-Tigalari Gemination Mark
113D3 𑏓 Tulu-Tigalari Sign Pluta
115C6 𑗆 Siddham Repetition Mark-1
115C7 𑗇 Siddham Repetition Mark-2
115C8 𑗈 Siddham Repetition Mark-3
11A98 𑪘 Soyombo Gemination Mark
11DD9 𑷙 Tolong Siki Sign Sela
16B42 𖭂 Pahawh Hmong Sign Vos Nrua
16B43 𖭃 Pahawh Hmong Sign Ib Yam
16FE0 𖿠 Tangut Iteration Mark
16FE1 𖿡 Nushu Iteration Mark
16FE3 𖿣 Old Chinese Iteration Mark
16FF2 𖿲 Chinese Small Simplified Er
16FF3 𖿳 Chinese Small Traditional Er
1E13C 𞄼 Nyiakeng Puachue Hmong Sign Xw Xw
1E13D 𞄽 Nyiakeng Puachue Hmong Syllable Lengthener
1E5EF 𞗯 Ol Onal Sign Ikir
1E944 𞥄 Adlam Alif Lengthener
1E945 𞥅 Adlam Vowel Lengthener
1E946 𞥆 Adlam Gemination Mark

IDS Operators

Ideographic Description Sequences (IDS) are used to describe the shapes of Han, Tangut, Khitan, and Nüshu characters in terms of their components. So‐called ideographic characters are often modular in nature and can be recursively deconstructed into basic radicals and strokes. This is useful both for indexing and categorising these large sets of characters, and for systematically representing ideographs that are not yet encoded.

Ideographic description characters represent simple spatial arrangements of ideographic components. They possess one of three properties based on how many operands they take: IDS_Unary_Operator, IDS_Binary_Operator or IDS_Trinary_Operator. Such an operand may in turn be another IDS. The “Example” column shows one Han character separated into its top‐level components for each operator.

See The Unicode Standard, section 18.2: Ideographic Description Characters for more information.
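Because each operator comes first and takes a fixed number of operands, an IDS can be read with a short recursive parser. The following sketch takes the operator sets from the table in this section and returns nested tuples; it does not enforce any further well-formedness rules from the standard, and the function names are illustrative.

```python
# Operator arity, taken from the IDS Operators table on this page.
UNARY = {"\u2FFE", "\u2FFF"}
BINARY = {chr(cp) for cp in (0x2FF0, 0x2FF1, *range(0x2FF4, 0x2FFE), 0x31EF)}
TRINARY = {"\u2FF2", "\u2FF3"}


def arity(ch: str) -> int:
    """Number of operands an IDS operator takes, or 0 for a plain component."""
    if ch in UNARY:
        return 1
    if ch in BINARY:
        return 2
    if ch in TRINARY:
        return 3
    return 0


def parse_ids(ids: str):
    """Parse a prefix-notation IDS into nested (operator, operands) tuples."""

    def parse(pos: int):
        if pos >= len(ids):
            raise ValueError("IDS ended before all operands were read")
        ch = ids[pos]
        pos += 1
        n = arity(ch)
        if n == 0:
            return ch, pos  # a component, not an operator
        operands = []
        for _ in range(n):
            operand, pos = parse(pos)  # an operand may itself be an IDS
            operands.append(operand)
        return (ch, tuple(operands)), pos

    tree, end = parse(0)
    if end != len(ids):
        raise ValueError("trailing characters after a complete IDS")
    return tree
```

For instance, `parse_ids("⿰車侖")` yields the operator together with its two operands, and a nested input such as `"⿱山⿰石石"` (a made-up sequence used only for illustration) yields a tuple nested one level deeper.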

Code Char Name Example
Unary Operators
2FFE Ideographic Description Character Horizontal Reflection ⿾正𣥄
2FFF ⿿ Ideographic Description Character Rotation ⿿予𠄔
Binary Operators
2FF0 Ideographic Description Character Left to Right ⿰車侖
2FF1 Ideographic Description Character Above to Below ⿱山石
2FF4 Ideographic Description Character Full Surround ⿴囗寸
2FF5 Ideographic Description Character Surround from Above ⿵門人
2FF6 Ideographic Description Character Surround from Below ⿶凵水
2FF7 Ideographic Description Character Surround from Left ⿷匚斤
2FF8 Ideographic Description Character Surround from Upper Left ⿸尸毛
2FF9 Ideographic Description Character Surround from Upper Right ⿹气米
2FFA Ideographic Description Character Surround from Lower Left ⿺走戉
2FFB Ideographic Description Character Overlaid ⿻木日
2FFC Ideographic Description Character Surround from Right ⿼叉丶
2FFD Ideographic Description Character Surround from Lower Right ⿽水丶
31EF Ideographic Description Character Subtraction ㇯有二
Trinary Operators
2FF2 Ideographic Description Character Left to Middle and Right ⿲彳氵亍
2FF3 Ideographic Description Character Above to Middle and Below ⿳艹世木

Join Controls

Only two characters possess the Join_Control property: U+200D causes its neighboring characters to form a ligature or assume contextual glyphs as if they were cursively joined to each other even if they otherwise wouldn’t; U+200C prevents ligation or cursive joining of its neighbors.
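In code, the two join controls are ordinary characters that are placed between their neighbors. The helpers below are a minimal sketch with illustrative names; whether a ligature or joined form actually appears or disappears depends on the font and rendering engine, not on the string alone.

```python
ZWNJ = "\u200C"  # Zero Width Non-Joiner
ZWJ = "\u200D"   # Zero Width Joiner
JOIN_CONTROLS = {ZWNJ, ZWJ}


def prevent_joining(left: str, right: str) -> str:
    """Place a ZWNJ between two strings to ask the renderer not to join them."""
    return left + ZWNJ + right


def request_joining(left: str, right: str) -> str:
    """Place a ZWJ between two strings to ask the renderer to join them."""
    return left + ZWJ + right


def without_join_controls(text: str) -> str:
    """Remove both join controls, e.g. before a loose string comparison."""
    return "".join(ch for ch in text if ch not in JOIN_CONTROLS)
```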

Code Char Name
200C Zero Width Non-Joiner
200D Zero Width Joiner

Logical Order Exception

Unicode text is generally encoded in logical order, also called phonetic order. This means that user‐perceived characters are constructed in such a way that their component code points are placed roughly in the same order as they would be pronounced, even if their visual appearance would suggest a different order. For example, the Devanagari syllable ki (कि) is produced by putting the consonant ka (क) first and the vowel i (◌ि) second in the character stream, even though visually the vowel sign comes before the base consonant with regard to the writing direction.

For various historical reasons, the Thai, Lao, New Tai Lue, and Tai Viet scripts are an exception to this rule. In those scripts, the vowel signs that visually sit to the left of their base consonants actually precede them in the character stream as well. For example, the Thai syllable ke (เก) is written as e (เ) followed by ka (ก).

This poses unique challenges to search and collation algorithms as they need to internally swap such sequences around to process them correctly, which is why all affected characters have been collected in the Logical_Order_Exception property set.
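The swap described above can be sketched as a preprocessing step. This simplified version exchanges each Logical_Order_Exception character with the single code point that follows it, using the set from the table in this section. Real collation implementations are more careful about what counts as the following consonant, so treat this as an illustration rather than a conformant algorithm; the function name is hypothetical.

```python
# Code points copied from the Logical Order Exception table on this page.
LOGICAL_ORDER_EXCEPTIONS = {
    chr(cp)
    for cp in (
        *range(0x0E40, 0x0E45),  # Thai
        *range(0x0EC0, 0x0EC5),  # Lao
        0x19B5, 0x19B6, 0x19B7, 0x19BA,  # New Tai Lue
        0xAAB5, 0xAAB6, 0xAAB9, 0xAABB, 0xAABC,  # Tai Viet
    )
}


def to_logical_order(text: str) -> str:
    """Swap each prefixed vowel with the code point that follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in LOGICAL_ORDER_EXCEPTIONS:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)
```

Applied to the Thai example above, the stored order of vowel then consonant becomes consonant then vowel, which matches the order used by other scripts.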

Code Char Name
0E40 Thai Character Sara E
0E41 Thai Character Sara Ae
0E42 Thai Character Sara O
0E43 Thai Character Sara Ai Maimuan
0E44 Thai Character Sara Ai Maimalai
0EC0 Lao Vowel Sign E
0EC1 Lao Vowel Sign Ei
0EC2 Lao Vowel Sign O
0EC3 Lao Vowel Sign Ay
0EC4 Lao Vowel Sign Ai
19B5 New Tai Lue Vowel Sign E
19B6 New Tai Lue Vowel Sign Ae
19B7 New Tai Lue Vowel Sign O
19BA New Tai Lue Vowel Sign Ay
AAB5 Tai Viet Vowel E
AAB6 Tai Viet Vowel O
AAB9 Tai Viet Vowel Uea
AABB Tai Viet Vowel Aue
AABC Tai Viet Vowel Ay

Modifier Combining Marks

Unicode Technical Report #53, Unicode Arabic Mark Rendering, uses the Modifier_Combining_Mark property to identify combining characters in the Arabic script that need to be rendered close to their base character regardless of the placement and canonical order of other marks within the same grapheme cluster.

Code Char Name
0654 ٔ Arabic Hamza Above
0655 ٕ Arabic Hamza Below
0658 ٘ Arabic Mark Noon Ghunna
06DC ۜ Arabic Small High Seen
06E3 ۣ Arabic Small Low Seen
06E7 ۧ Arabic Small High Yeh
06E8 ۨ Arabic Small High Noon
08CA Arabic Small High Farsi Yeh
08CB Arabic Small High Yeh Barree with Two Dots Below
08CD Arabic Small High Zah
08CE Arabic Large Round Dot Above
08CF Arabic Large Round Dot Below
08D3 Arabic Small Low Waw
08F3 Arabic Small High Waw

Prepended Concatenation Marks

The Prepended_Concatenation_Mark property identifies formatting characters that, in a sense, are the opposite of combining marks. Whereas combining marks are placed after the base character they modify, prepended concatenation marks are placed before a sequence of characters (usually numerals) they apply to. Most prepended marks require complex rendering, as their glyphs are expected to stretch over or under a run of characters of arbitrary length.

The “Example” column shows each prepended mark applied to a simple sequence of characters.
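To illustrate the "mark first, content after" layout, the sketch below pairs each prepended mark with the run of decimal digits that directly follows it. Limiting the run to decimal digits is a simplifying assumption made here for illustration; some marks, such as the Syriac abbreviation mark, apply to letters instead. The set is copied from the table in this section and the function name is hypothetical.

```python
# Code points copied from the Prepended Concatenation Marks table on this page.
PREPENDED_CONCATENATION_MARKS = {
    chr(cp)
    for cp in (
        *range(0x0600, 0x0606),
        0x06DD, 0x070F, 0x0890, 0x0891, 0x08E2, 0x110BD, 0x110CD,
    )
}


def marked_digit_runs(text: str) -> list[tuple[str, str]]:
    """Return (mark, digits) pairs for each prepended mark in text.

    Assumption: the mark's scope is the run of decimal digits right after it.
    """
    runs = []
    i = 0
    while i < len(text):
        if text[i] in PREPENDED_CONCATENATION_MARKS:
            j = i + 1
            while j < len(text) and text[j].isdecimal():
                j += 1
            runs.append((text[i], text[i + 1:j]))
            i = j
        else:
            i += 1
    return runs
```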

Code Char Name Example
0600 ؀ Arabic Number Sign ؀١٢٣
0601 ؁ Arabic Sign Sanah ؁١٢٣٤
0602 ؂ Arabic Footnote Mark ؂١٢
0603 ؃ Arabic Sign Safha ؃١٢٣
0604 ؄ Arabic Sign Samvat ؄١٢٣٤
0605 ؅ Arabic Number Mark Above ؅𐋡𐋠𐋴𐋬𐋤
06DD ۝ Arabic End of Ayah ۝١٢٣
070F ܏ Syriac Abbreviation Mark ܐ܏ܒܓܕ
0890 Arabic Pound Mark Above ࢐١٢٣
0891 Arabic Piastre Mark Above ࢑١٢٣
08E2 Arabic Disputed End of Ayah ࣢١٢٣
110BD 𑂽 Kaithi Number Sign 𑂽१२३४
110CD 𑃍 Kaithi Number Sign Above 𑃍१२३४

Quotation Marks

The Quotation_Mark property should be self‐explanatory. It’s for quotation marks.

Code Char Name
0022 " Quotation Mark
0027 ' Apostrophe
00AB « Left-Pointing Double Angle Quotation Mark
00BB » Right-Pointing Double Angle Quotation Mark
2018 Left Single Quotation Mark
2019 Right Single Quotation Mark
201A Single Low-9 Quotation Mark
201B Single High-Reversed-9 Quotation Mark
201C Left Double Quotation Mark
201D Right Double Quotation Mark
201E Double Low-9 Quotation Mark
201F Double High-Reversed-9 Quotation Mark
2039 Single Left-Pointing Angle Quotation Mark
203A Single Right-Pointing Angle Quotation Mark
2E42 Double Low-Reversed-9 Quotation Mark
300C Left Corner Bracket
300D Right Corner Bracket
300E Left White Corner Bracket
300F Right White Corner Bracket
301D Reversed Double Prime Quotation Mark
301E Double Prime Quotation Mark
301F Low Double Prime Quotation Mark
FE41 Presentation Form for Vertical Left Corner Bracket
FE42 Presentation Form for Vertical Right Corner Bracket
FE43 Presentation Form for Vertical Left White Corner Bracket
FE44 Presentation Form for Vertical Right White Corner Bracket
FF02 Fullwidth Quotation Mark
FF07 Fullwidth Apostrophe
FF62 Halfwidth Left Corner Bracket
FF63 Halfwidth Right Corner Bracket

Soft Dotted

Letters with the Soft_Dotted property include a tittle in their glyph that disappears when a diacritical mark placed above is added. The “Example” column shows each soft‐dotted character with U+0301 ◌́ Combining Acute Accent. Notice how the acute accent completely replaces the tittle from the original base glyph.
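The sequences in the "Example" column are simply the base letter followed by U+0301. This sketch builds such a sequence with Python's standard `unicodedata` module and shows what NFC normalization does with it: some combinations have a precomposed form, others stay as two code points. The helper name is illustrative.

```python
import unicodedata

ACUTE = "\u0301"  # Combining Acute Accent


def acute_form_names(base: str) -> list[str]:
    """Return the character names of base + acute after NFC normalization."""
    composed = unicodedata.normalize("NFC", base + ACUTE)
    return [unicodedata.name(ch) for ch in composed]


i_names = acute_form_names("i")  # a precomposed letter exists
j_names = acute_form_names("j")  # no precomposed letter; stays a sequence
```

Whether the tittle is actually removed in the second case is up to the font and renderer, since the string still contains the ordinary dotted letter.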

Code Char Name Example
0069 i Latin Small Letter I í
006A j Latin Small Letter J j́
012F į Latin Small Letter I with Ogonek į́
0249 ɉ Latin Small Letter J with Stroke ɉ́
0268 ɨ Latin Small Letter I with Stroke ɨ́
029D ʝ Latin Small Letter J with Crossed-Tail ʝ́
02B2 ʲ Modifier Letter Small J ʲ́
03F3 ϳ Greek Letter Yot ϳ́
0456 і Cyrillic Small Letter Byelorussian-Ukrainian I і́
0458 ј Cyrillic Small Letter Je ј́
1D62 Latin Subscript Small Letter I ᵢ́
1D96 Latin Small Letter I with Retroflex Hook ᶖ́
1DA4 Modifier Letter Small I with Stroke ᶤ́
1DA8 Modifier Letter Small J with Crossed-Tail ᶨ́
1E2D Latin Small Letter I with Tilde Below ḭ́
1ECB Latin Small Letter I with Dot Below ị́
2071 Superscript Latin Small Letter I ⁱ́
2148 Double-Struck Italic Small I ⅈ́
2149 Double-Struck Italic Small J ⅉ́
2C7C Latin Subscript Small Letter J ⱼ́
1D422 𝐢 Mathematical Bold Small I 𝐢́
1D423 𝐣 Mathematical Bold Small J 𝐣́
1D456 𝑖 Mathematical Italic Small I 𝑖́
1D457 𝑗 Mathematical Italic Small J 𝑗́
1D48A 𝒊 Mathematical Bold Italic Small I 𝒊́
1D48B 𝒋 Mathematical Bold Italic Small J 𝒋́
1D4BE 𝒾 Mathematical Script Small I 𝒾́
1D4BF 𝒿 Mathematical Script Small J 𝒿́
1D4F2 𝓲 Mathematical Bold Script Small I 𝓲́
1D4F3 𝓳 Mathematical Bold Script Small J 𝓳́
1D526 𝔦 Mathematical Fraktur Small I 𝔦́
1D527 𝔧 Mathematical Fraktur Small J 𝔧́
1D55A 𝕚 Mathematical Double-Struck Small I 𝕚́
1D55B 𝕛 Mathematical Double-Struck Small J 𝕛́
1D58E 𝖎 Mathematical Bold Fraktur Small I 𝖎́
1D58F 𝖏 Mathematical Bold Fraktur Small J 𝖏́
1D5C2 𝗂 Mathematical Sans-Serif Small I 𝗂́
1D5C3 𝗃 Mathematical Sans-Serif Small J 𝗃́
1D5F6 𝗶 Mathematical Sans-Serif Bold Small I 𝗶́
1D5F7 𝗷 Mathematical Sans-Serif Bold Small J 𝗷́
1D62A 𝘪 Mathematical Sans-Serif Italic Small I 𝘪́
1D62B 𝘫 Mathematical Sans-Serif Italic Small J 𝘫́
1D65E 𝙞 Mathematical Sans-Serif Bold Italic Small I 𝙞́
1D65F 𝙟 Mathematical Sans-Serif Bold Italic Small J 𝙟́
1D692 𝚒 Mathematical Monospace Small I 𝚒́
1D693 𝚓 Mathematical Monospace Small J 𝚓́
1DF1A 𝼚 Latin Small Letter I with Stroke and Retroflex Hook 𝼚́
1E04C 𞁌 Modifier Letter Cyrillic Small Byelorussian-Ukrainian I 𞁌́
1E04D 𞁍 Modifier Letter Cyrillic Small Je 𞁍́
1E068 𞁨 Cyrillic Subscript Small Letter Byelorussian-Ukrainian I 𞁨́

White Space

White space characters are often used to separate text elements and generally have no graphic appearance of their own, except for an advance width. The White_Space property includes everything with a General_Category value of Separator, plus a few control characters that have space‐like properties or induce line breaks.
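Language runtimes often have their own notion of white space that does not match this property exactly. The sketch below hard-codes the set from the table in this section and compares it with Python's `str.isspace()`, which in CPython additionally accepts the information separators U+001C to U+001F. The function names are illustrative.

```python
# Code points copied from the White Space table on this page.
WHITE_SPACE = {
    chr(cp)
    for cp in (
        *range(0x0009, 0x000E),  # U+0009..U+000D
        0x0020, 0x0085, 0x00A0, 0x1680,
        *range(0x2000, 0x200B),  # U+2000..U+200A
        0x2028, 0x2029, 0x202F, 0x205F, 0x3000,
    )
}


def is_white_space(ch: str) -> bool:
    """True if ch is listed in the White Space table on this page."""
    return ch in WHITE_SPACE


def isspace_disagreements(candidates) -> list[str]:
    """List candidates on which str.isspace() and the table disagree."""
    return [
        f"U+{ord(ch):04X}"
        for ch in candidates
        if ch.isspace() != is_white_space(ch)
    ]
```

When exact White_Space semantics matter, for example when implementing a specification that refers to the property, testing against an explicit set like this is safer than relying on a runtime's built-in check.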

Code Char Name
0009 Character Tabulation
000A Line Feed
000B Line Tabulation
000C Form Feed
000D Carriage Return
0020 Space
0085 Next Line
00A0   No-Break Space
1680 Ogham Space Mark
2000   En Quad
2001 Em Quad
2002 En Space
2003 Em Space
2004 Three-per-Em Space
2005 Four-per-Em Space
2006 Six-per-Em Space
2007 Figure Space
2008 Punctuation Space
2009 Thin Space
200A Hair Space
2028 Line Separator
2029 Paragraph Separator
202F Narrow No-Break Space
205F Medium Mathematical Space
3000   Ideographic Space