This page serves as a grab bag for Unicode character properties that did not warrant a chart of their own but are still interesting enough to be presented in the usual format. They are all binary properties; for characters not explicitly listed here the respective property value is False.
Sources: Prop
Bidi Controls
The Bidi_Control property enumerates formatting characters that serve special functions in the Unicode Bidirectional Algorithm. They include implicit controls, which are simply invisible, standalone marks with the same bidi properties as regular letters and are meant for lightweight formatting, and explicit controls, which are stateful and affect the bidi handling of entire text runs.
| Code | Char | Name |
|---|---|---|
| Implicit Controls | | |
| 061C | | Arabic Letter Mark |
| 200E | | Left-to-Right Mark |
| 200F | | Right-to-Left Mark |
| Explicit Controls | | |
| 202A | | Left-to-Right Embedding |
| 202B | | Right-to-Left Embedding |
| 202C | | Pop Directional Formatting |
| 202D | | Left-to-Right Override |
| 202E | | Right-to-Left Override |
| 2066 | | Left-to-Right Isolate |
| 2067 | | Right-to-Left Isolate |
| 2068 | | First Strong Isolate |
| 2069 | | Pop Directional Isolate |
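As a rough illustration, a tool that audits text for hidden directional formatting could hard-code the code points from the table above. The names `BIDI_CONTROLS` and `find_bidi_controls` are made up for this sketch and do not come from any library.

```python
# Code points copied from the table above.
IMPLICIT_CONTROLS = {0x061C, 0x200E, 0x200F}
EXPLICIT_CONTROLS = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A))
BIDI_CONTROLS = IMPLICIT_CONTROLS | EXPLICIT_CONTROLS


def find_bidi_controls(text):
    """Return (index, code point, kind) for every bidi control in text."""
    hits = []
    for index, char in enumerate(text):
        cp = ord(char)
        if cp in BIDI_CONTROLS:
            kind = "implicit" if cp in IMPLICIT_CONTROLS else "explicit"
            hits.append((index, f"U+{cp:04X}", kind))
    return hits


# A Right-to-Left Override hidden between two letters.
print(find_bidi_controls("a\u202Eb"))
```

Separating the two groups is a design choice of this sketch: explicit controls are stateful, so a checker may want to treat them more strictly than the implicit marks.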
Dashes
The Dash property includes all characters with the General_Category property value Dash_Punctuation, plus a few other symbols of a similar nature.
| Code | Char | Name |
|---|---|---|
| 002D | - | Hyphen-Minus |
| 058A | ֊ | Armenian Hyphen |
| 05BE | ־ | Hebrew Punctuation Maqaf |
| 1400 | ᐀ | Canadian Syllabics Hyphen |
| 1806 | ᠆ | Mongolian Todo Soft Hyphen |
| 2010 | ‐ | Hyphen |
| 2011 | ‑ | Non-Breaking Hyphen |
| 2012 | ‒ | Figure Dash |
| 2013 | – | En Dash |
| 2014 | — | Em Dash |
| 2015 | ― | Horizontal Bar |
| 2053 | ⁓ | Swung Dash |
| 207B | ⁻ | Superscript Minus |
| 208B | ₋ | Subscript Minus |
| 2212 | − | Minus Sign |
| 2E17 | ⸗ | Double Oblique Hyphen |
| 2E1A | ⸚ | Hyphen with Diaeresis |
| 2E3A | ⸺ | Two-Em Dash |
| 2E3B | ⸻ | Three-Em Dash |
| 2E40 | ⹀ | Double Hyphen |
| 2E5D | ⹝ | Oblique Hyphen |
| 301C | 〜 | Wave Dash |
| 3030 | 〰 | Wavy Dash |
| 30A0 | ゠ | Katakana-Hiragana Double Hyphen |
| FE31 | ︱ | Presentation Form for Vertical Em Dash |
| FE32 | ︲ | Presentation Form for Vertical En Dash |
| FE58 | ﹘ | Small Em Dash |
| FE63 | ﹣ | Small Hyphen-Minus |
| FF0D | － | Fullwidth Hyphen-Minus |
| 10D6E | | Garay Hyphen |
| 10EAD | 𐺭 | Yezidi Hyphenation Mark |
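The relationship between Dash and the general category can be checked with Python's standard `unicodedata` module, which reports Dash_Punctuation as `Pd`. The sketch below uses only a hand-picked subset of the table (`DASH_SAMPLE` is a name invented here), and its results depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# A subset of the Dash code points listed in the table above.
DASH_SAMPLE = [0x002D, 0x2010, 0x2014, 0x2053, 0x207B, 0x2212, 0x301C]


def split_by_category(code_points):
    """Split code points into those with General_Category Pd and the rest."""
    dash_punctuation, other = [], []
    for cp in code_points:
        if unicodedata.category(chr(cp)) == "Pd":
            dash_punctuation.append(cp)
        else:
            other.append(cp)
    return dash_punctuation, other


dash_punctuation, other = split_by_category(DASH_SAMPLE)
print("Pd:   ", [f"U+{cp:04X}" for cp in dash_punctuation])
print("other:", [f"U+{cp:04X}" for cp in other])
```

Characters such as U+2212 Minus Sign end up in the second list: they carry the Dash property without being dash punctuation.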
Emoji Modifiers
Emoji modifiers are used in conjunction with emoji modifier bases to produce emoji modifier sequences, which are variants of human‐form emoji with specific skin tones. They employ the Fitzpatrick scale, a six‐point scale that classifies human skin based on its susceptibility to sunburn. Fitzpatrick types I and II have been combined into a single emoji modifier.
See Unicode Technical Standard #51: Unicode Emoji, section 2.4: Diversity for more information.
| Code | Char | Name |
|---|---|---|
| 1F3FB | 🏻 | Emoji Modifier Fitzpatrick Type-1-2 |
| 1F3FC | 🏼 | Emoji Modifier Fitzpatrick Type-3 |
| 1F3FD | 🏽 | Emoji Modifier Fitzpatrick Type-4 |
| 1F3FE | 🏾 | Emoji Modifier Fitzpatrick Type-5 |
| 1F3FF | 🏿 | Emoji Modifier Fitzpatrick Type-6 |
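The mapping from the six Fitzpatrick types to the five modifiers can be sketched as a small lookup table. `FITZPATRICK_MODIFIERS` and `with_skin_tone` are hypothetical names used only for this example.

```python
# Fitzpatrick type (1-6) -> emoji modifier; types 1 and 2 share U+1F3FB.
FITZPATRICK_MODIFIERS = {
    1: "\U0001F3FB",
    2: "\U0001F3FB",
    3: "\U0001F3FC",
    4: "\U0001F3FD",
    5: "\U0001F3FE",
    6: "\U0001F3FF",
}


def with_skin_tone(base, fitzpatrick_type):
    """Append the modifier for the given Fitzpatrick type to a base emoji.

    This sketch does not verify that `base` is actually an emoji
    modifier base; the caller has to ensure that.
    """
    return base + FITZPATRICK_MODIFIERS[fitzpatrick_type]


# Thumbs Up Sign with the type-4 modifier.
print(with_skin_tone("\U0001F44D", 4))
```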
Emoji Modifier Bases
Emoji modifier bases are all characters that an emoji modifier can be applied to. An emoji modifier sequence always consists of a character with Emoji_Modifier_Base followed immediately by a character with Emoji_Modifier. The “Example” column shows all such combinations for each character. Emoji modifiers automatically request emoji‐style display for their bases, so Variation Selector-16 is not required even for text‐default characters.
Not all well‐formed emoji modifier sequences are recommended for general interchange (RGI). In particular, applications are not expected to support skin tone variants of U+1F46A 👪 Family.
Source: emoji-sequences.txt
| Code | Char | Name | Example |
|---|---|---|---|
| 261D | ☝ | White Up Pointing Index | ☝🏻☝🏼☝🏽☝🏾☝🏿 |
| 26F9 | ⛹ | Person with Ball | ⛹🏻⛹🏼⛹🏽⛹🏾⛹🏿 |
| 270A | ✊ | Raised Fist | ✊🏻✊🏼✊🏽✊🏾✊🏿 |
| 270B | ✋ | Raised Hand | ✋🏻✋🏼✋🏽✋🏾✋🏿 |
| 270C | ✌ | Victory Hand | ✌🏻✌🏼✌🏽✌🏾✌🏿 |
| 270D | ✍ | Writing Hand | ✍🏻✍🏼✍🏽✍🏾✍🏿 |
| 1F385 | 🎅 | Father Christmas | 🎅🏻🎅🏼🎅🏽🎅🏾🎅🏿 |
| 1F3C2 | 🏂 | Snowboarder | 🏂🏻🏂🏼🏂🏽🏂🏾🏂🏿 |
| 1F3C3 | 🏃 | Runner | 🏃🏻🏃🏼🏃🏽🏃🏾🏃🏿 |
| 1F3C4 | 🏄 | Surfer | 🏄🏻🏄🏼🏄🏽🏄🏾🏄🏿 |
| 1F3C7 | 🏇 | Horse Racing | 🏇🏻🏇🏼🏇🏽🏇🏾🏇🏿 |
| 1F3CA | 🏊 | Swimmer | 🏊🏻🏊🏼🏊🏽🏊🏾🏊🏿 |
| 1F3CB | 🏋 | Weight Lifter | 🏋🏻🏋🏼🏋🏽🏋🏾🏋🏿 |
| 1F3CC | 🏌 | Golfer | 🏌🏻🏌🏼🏌🏽🏌🏾🏌🏿 |
| 1F442 | 👂 | Ear | 👂🏻👂🏼👂🏽👂🏾👂🏿 |
| 1F443 | 👃 | Nose | 👃🏻👃🏼👃🏽👃🏾👃🏿 |
| 1F446 | 👆 | White Up Pointing Backhand Index | 👆🏻👆🏼👆🏽👆🏾👆🏿 |
| 1F447 | 👇 | White Down Pointing Backhand Index | 👇🏻👇🏼👇🏽👇🏾👇🏿 |
| 1F448 | 👈 | White Left Pointing Backhand Index | 👈🏻👈🏼👈🏽👈🏾👈🏿 |
| 1F449 | 👉 | White Right Pointing Backhand Index | 👉🏻👉🏼👉🏽👉🏾👉🏿 |
| 1F44A | 👊 | Fisted Hand Sign | 👊🏻👊🏼👊🏽👊🏾👊🏿 |
| 1F44B | 👋 | Waving Hand Sign | 👋🏻👋🏼👋🏽👋🏾👋🏿 |
| 1F44C | 👌 | OK Hand Sign | 👌🏻👌🏼👌🏽👌🏾👌🏿 |
| 1F44D | 👍 | Thumbs Up Sign | 👍🏻👍🏼👍🏽👍🏾👍🏿 |
| 1F44E | 👎 | Thumbs Down Sign | 👎🏻👎🏼👎🏽👎🏾👎🏿 |
| 1F44F | 👏 | Clapping Hands Sign | 👏🏻👏🏼👏🏽👏🏾👏🏿 |
| 1F450 | 👐 | Open Hands Sign | 👐🏻👐🏼👐🏽👐🏾👐🏿 |
| 1F466 | 👦 | Boy | 👦🏻👦🏼👦🏽👦🏾👦🏿 |
| 1F467 | 👧 | Girl | 👧🏻👧🏼👧🏽👧🏾👧🏿 |
| 1F468 | 👨 | Man | 👨🏻👨🏼👨🏽👨🏾👨🏿 |
| 1F469 | 👩 | Woman | 👩🏻👩🏼👩🏽👩🏾👩🏿 |
| 1F46A | 👪 | Family | 👪🏻👪🏼👪🏽👪🏾👪🏿 |
| 1F46B | 👫 | Man and Woman Holding Hands | 👫🏻👫🏼👫🏽👫🏾👫🏿 |
| 1F46C | 👬 | Two Men Holding Hands | 👬🏻👬🏼👬🏽👬🏾👬🏿 |
| 1F46D | 👭 | Two Women Holding Hands | 👭🏻👭🏼👭🏽👭🏾👭🏿 |
| 1F46E | 👮 | Police Officer | 👮🏻👮🏼👮🏽👮🏾👮🏿 |
| 1F46F | 👯 | Woman with Bunny Ears | 👯🏻👯🏼👯🏽👯🏾👯🏿 |
| 1F470 | 👰 | Bride with Veil | 👰🏻👰🏼👰🏽👰🏾👰🏿 |
| 1F471 | 👱 | Person with Blond Hair | 👱🏻👱🏼👱🏽👱🏾👱🏿 |
| 1F472 | 👲 | Man with Gua Pi Mao | 👲🏻👲🏼👲🏽👲🏾👲🏿 |
| 1F473 | 👳 | Man with Turban | 👳🏻👳🏼👳🏽👳🏾👳🏿 |
| 1F474 | 👴 | Older Man | 👴🏻👴🏼👴🏽👴🏾👴🏿 |
| 1F475 | 👵 | Older Woman | 👵🏻👵🏼👵🏽👵🏾👵🏿 |
| 1F476 | 👶 | Baby | 👶🏻👶🏼👶🏽👶🏾👶🏿 |
| 1F477 | 👷 | Construction Worker | 👷🏻👷🏼👷🏽👷🏾👷🏿 |
| 1F478 | 👸 | Princess | 👸🏻👸🏼👸🏽👸🏾👸🏿 |
| 1F47C | 👼 | Baby Angel | 👼🏻👼🏼👼🏽👼🏾👼🏿 |
| 1F481 | 💁 | Information Desk Person | 💁🏻💁🏼💁🏽💁🏾💁🏿 |
| 1F482 | 💂 | Guardsman | 💂🏻💂🏼💂🏽💂🏾💂🏿 |
| 1F483 | 💃 | Dancer | 💃🏻💃🏼💃🏽💃🏾💃🏿 |
| 1F485 | 💅 | Nail Polish | 💅🏻💅🏼💅🏽💅🏾💅🏿 |
| 1F486 | 💆 | Face Massage | 💆🏻💆🏼💆🏽💆🏾💆🏿 |
| 1F487 | 💇 | Haircut | 💇🏻💇🏼💇🏽💇🏾💇🏿 |
| 1F48F | 💏 | Kiss | 💏🏻💏🏼💏🏽💏🏾💏🏿 |
| 1F491 | 💑 | Couple with Heart | 💑🏻💑🏼💑🏽💑🏾💑🏿 |
| 1F4AA | 💪 | Flexed Biceps | 💪🏻💪🏼💪🏽💪🏾💪🏿 |
| 1F574 | 🕴 | Man in Business Suit Levitating | 🕴🏻🕴🏼🕴🏽🕴🏾🕴🏿 |
| 1F575 | 🕵 | Sleuth or Spy | 🕵🏻🕵🏼🕵🏽🕵🏾🕵🏿 |
| 1F57A | 🕺 | Man Dancing | 🕺🏻🕺🏼🕺🏽🕺🏾🕺🏿 |
| 1F590 | 🖐 | Raised Hand with Fingers Splayed | 🖐🏻🖐🏼🖐🏽🖐🏾🖐🏿 |
| 1F595 | 🖕 | Reversed Hand with Middle Finger Extended | 🖕🏻🖕🏼🖕🏽🖕🏾🖕🏿 |
| 1F596 | 🖖 | Raised Hand with Part Between Middle and Ring Fingers | 🖖🏻🖖🏼🖖🏽🖖🏾🖖🏿 |
| 1F645 | 🙅 | Face with No Good Gesture | 🙅🏻🙅🏼🙅🏽🙅🏾🙅🏿 |
| 1F646 | 🙆 | Face with OK Gesture | 🙆🏻🙆🏼🙆🏽🙆🏾🙆🏿 |
| 1F647 | 🙇 | Person Bowing Deeply | 🙇🏻🙇🏼🙇🏽🙇🏾🙇🏿 |
| 1F64B | 🙋 | Happy Person Raising One Hand | 🙋🏻🙋🏼🙋🏽🙋🏾🙋🏿 |
| 1F64C | 🙌 | Person Raising Both Hands in Celebration | 🙌🏻🙌🏼🙌🏽🙌🏾🙌🏿 |
| 1F64D | 🙍 | Person Frowning | 🙍🏻🙍🏼🙍🏽🙍🏾🙍🏿 |
| 1F64E | 🙎 | Person with Pouting Face | 🙎🏻🙎🏼🙎🏽🙎🏾🙎🏿 |
| 1F64F | 🙏 | Person with Folded Hands | 🙏🏻🙏🏼🙏🏽🙏🏾🙏🏿 |
| 1F6A3 | 🚣 | Rowboat | 🚣🏻🚣🏼🚣🏽🚣🏾🚣🏿 |
| 1F6B4 | 🚴 | Bicyclist | 🚴🏻🚴🏼🚴🏽🚴🏾🚴🏿 |
| 1F6B5 | 🚵 | Mountain Bicyclist | 🚵🏻🚵🏼🚵🏽🚵🏾🚵🏿 |
| 1F6B6 | 🚶 | Pedestrian | 🚶🏻🚶🏼🚶🏽🚶🏾🚶🏿 |
| 1F6C0 | 🛀 | Bath | 🛀🏻🛀🏼🛀🏽🛀🏾🛀🏿 |
| 1F6CC | 🛌 | Sleeping Accommodation | 🛌🏻🛌🏼🛌🏽🛌🏾🛌🏿 |
| 1F90C | 🤌 | Pinched Fingers | 🤌🏻🤌🏼🤌🏽🤌🏾🤌🏿 |
| 1F90F | 🤏 | Pinching Hand | 🤏🏻🤏🏼🤏🏽🤏🏾🤏🏿 |
| 1F918 | 🤘 | Sign of the Horns | 🤘🏻🤘🏼🤘🏽🤘🏾🤘🏿 |
| 1F919 | 🤙 | Call Me Hand | 🤙🏻🤙🏼🤙🏽🤙🏾🤙🏿 |
| 1F91A | 🤚 | Raised Back of Hand | 🤚🏻🤚🏼🤚🏽🤚🏾🤚🏿 |
| 1F91B | 🤛 | Left-Facing Fist | 🤛🏻🤛🏼🤛🏽🤛🏾🤛🏿 |
| 1F91C | 🤜 | Right-Facing Fist | 🤜🏻🤜🏼🤜🏽🤜🏾🤜🏿 |
| 1F91D | 🤝 | Handshake | 🤝🏻🤝🏼🤝🏽🤝🏾🤝🏿 |
| 1F91E | 🤞 | Hand with Index and Middle Fingers Crossed | 🤞🏻🤞🏼🤞🏽🤞🏾🤞🏿 |
| 1F91F | 🤟 | I Love You Hand Sign | 🤟🏻🤟🏼🤟🏽🤟🏾🤟🏿 |
| 1F926 | 🤦 | Face Palm | 🤦🏻🤦🏼🤦🏽🤦🏾🤦🏿 |
| 1F930 | 🤰 | Pregnant Woman | 🤰🏻🤰🏼🤰🏽🤰🏾🤰🏿 |
| 1F931 | 🤱 | Breast-Feeding | 🤱🏻🤱🏼🤱🏽🤱🏾🤱🏿 |
| 1F932 | 🤲 | Palms Up Together | 🤲🏻🤲🏼🤲🏽🤲🏾🤲🏿 |
| 1F933 | 🤳 | Selfie | 🤳🏻🤳🏼🤳🏽🤳🏾🤳🏿 |
| 1F934 | 🤴 | Prince | 🤴🏻🤴🏼🤴🏽🤴🏾🤴🏿 |
| 1F935 | 🤵 | Man in Tuxedo | 🤵🏻🤵🏼🤵🏽🤵🏾🤵🏿 |
| 1F936 | 🤶 | Mother Christmas | 🤶🏻🤶🏼🤶🏽🤶🏾🤶🏿 |
| 1F937 | 🤷 | Shrug | 🤷🏻🤷🏼🤷🏽🤷🏾🤷🏿 |
| 1F938 | 🤸 | Person Doing Cartwheel | 🤸🏻🤸🏼🤸🏽🤸🏾🤸🏿 |
| 1F939 | 🤹 | Juggling | 🤹🏻🤹🏼🤹🏽🤹🏾🤹🏿 |
| 1F93C | 🤼 | Wrestlers | 🤼🏻🤼🏼🤼🏽🤼🏾🤼🏿 |
| 1F93D | 🤽 | Water Polo | 🤽🏻🤽🏼🤽🏽🤽🏾🤽🏿 |
| 1F93E | 🤾 | Handball | 🤾🏻🤾🏼🤾🏽🤾🏾🤾🏿 |
| 1F977 | 🥷 | Ninja | 🥷🏻🥷🏼🥷🏽🥷🏾🥷🏿 |
| 1F9B5 | 🦵 | Leg | 🦵🏻🦵🏼🦵🏽🦵🏾🦵🏿 |
| 1F9B6 | 🦶 | Foot | 🦶🏻🦶🏼🦶🏽🦶🏾🦶🏿 |
| 1F9B8 | 🦸 | Superhero | 🦸🏻🦸🏼🦸🏽🦸🏾🦸🏿 |
| 1F9B9 | 🦹 | Supervillain | 🦹🏻🦹🏼🦹🏽🦹🏾🦹🏿 |
| 1F9BB | 🦻 | Ear with Hearing Aid | 🦻🏻🦻🏼🦻🏽🦻🏾🦻🏿 |
| 1F9CD | 🧍 | Standing Person | 🧍🏻🧍🏼🧍🏽🧍🏾🧍🏿 |
| 1F9CE | 🧎 | Kneeling Person | 🧎🏻🧎🏼🧎🏽🧎🏾🧎🏿 |
| 1F9CF | 🧏 | Deaf Person | 🧏🏻🧏🏼🧏🏽🧏🏾🧏🏿 |
| 1F9D1 | 🧑 | Adult | 🧑🏻🧑🏼🧑🏽🧑🏾🧑🏿 |
| 1F9D2 | 🧒 | Child | 🧒🏻🧒🏼🧒🏽🧒🏾🧒🏿 |
| 1F9D3 | 🧓 | Older Adult | 🧓🏻🧓🏼🧓🏽🧓🏾🧓🏿 |
| 1F9D4 | 🧔 | Bearded Person | 🧔🏻🧔🏼🧔🏽🧔🏾🧔🏿 |
| 1F9D5 | 🧕 | Person with Headscarf | 🧕🏻🧕🏼🧕🏽🧕🏾🧕🏿 |
| 1F9D6 | 🧖 | Person in Steamy Room | 🧖🏻🧖🏼🧖🏽🧖🏾🧖🏿 |
| 1F9D7 | 🧗 | Person Climbing | 🧗🏻🧗🏼🧗🏽🧗🏾🧗🏿 |
| 1F9D8 | 🧘 | Person in Lotus Position | 🧘🏻🧘🏼🧘🏽🧘🏾🧘🏿 |
| 1F9D9 | 🧙 | Mage | 🧙🏻🧙🏼🧙🏽🧙🏾🧙🏿 |
| 1F9DA | 🧚 | Fairy | 🧚🏻🧚🏼🧚🏽🧚🏾🧚🏿 |
| 1F9DB | 🧛 | Vampire | 🧛🏻🧛🏼🧛🏽🧛🏾🧛🏿 |
| 1F9DC | 🧜 | Merperson | 🧜🏻🧜🏼🧜🏽🧜🏾🧜🏿 |
| 1F9DD | 🧝 | Elf | 🧝🏻🧝🏼🧝🏽🧝🏾🧝🏿 |
| 1FAC3 | 🫃 | Pregnant Man | 🫃🏻🫃🏼🫃🏽🫃🏾🫃🏿 |
| 1FAC4 | 🫄 | Pregnant Person | 🫄🏻🫄🏼🫄🏽🫄🏾🫄🏿 |
| 1FAC5 | 🫅 | Person with Crown | 🫅🏻🫅🏼🫅🏽🫅🏾🫅🏿 |
| 1FAF0 | 🫰 | Hand with Index Finger and Thumb Crossed | 🫰🏻🫰🏼🫰🏽🫰🏾🫰🏿 |
| 1FAF1 | 🫱 | Rightwards Hand | 🫱🏻🫱🏼🫱🏽🫱🏾🫱🏿 |
| 1FAF2 | 🫲 | Leftwards Hand | 🫲🏻🫲🏼🫲🏽🫲🏾🫲🏿 |
| 1FAF3 | 🫳 | Palm Down Hand | 🫳🏻🫳🏼🫳🏽🫳🏾🫳🏿 |
| 1FAF4 | 🫴 | Palm Up Hand | 🫴🏻🫴🏼🫴🏽🫴🏾🫴🏿 |
| 1FAF5 | 🫵 | Index Pointing at the Viewer | 🫵🏻🫵🏼🫵🏽🫵🏾🫵🏿 |
| 1FAF6 | 🫶 | Heart Hands | 🫶🏻🫶🏼🫶🏽🫶🏾🫶🏿 |
| 1FAF7 | 🫷 | Leftwards Pushing Hand | 🫷🏻🫷🏼🫷🏽🫷🏾🫷🏿 |
| 1FAF8 | 🫸 | Rightwards Pushing Hand | 🫸🏻🫸🏼🫸🏽🫸🏾🫸🏿 |
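Because a modifier sequence is simply a base followed directly by a modifier, such sequences can be located with a two-character pattern. The following sketch covers only a small subset of the bases in the table above (`BASES_SUBSET` is an assumption of this example); a real implementation would take the full set from the Unicode data files.

```python
import re

# Emoji modifiers U+1F3FB..U+1F3FF.
MODIFIERS = "\U0001F3FB-\U0001F3FF"
# An illustrative subset of the modifier bases in the table above:
# U+261D, U+270A..U+270D, U+1F44A..U+1F450, U+1F466..U+1F469.
BASES_SUBSET = "\u261D\u270A-\u270D\U0001F44A-\U0001F450\U0001F466-\U0001F469"

SEQUENCE = re.compile(f"[{BASES_SUBSET}][{MODIFIERS}]")


def find_modifier_sequences(text):
    """Return every base + modifier pair found in text."""
    return SEQUENCE.findall(text)


sample = "ok \U0001F44D\U0001F3FD and \U0001F44D"
print(find_modifier_sequences(sample))
```

A bare base without a modifier, or a modifier on its own, does not match the pattern.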
Extenders
The set of characters with the Extender property unifies several disparate concepts that are of relevance to the Unicode Collation Algorithm. It includes characters that graphically extend or modify the shape of surrounding characters, as well as extenders in a linguistic sense, such as vowel or consonant lengtheners and repetition marks.
| Code | Char | Name |
|---|---|---|
| 00B7 | · | Middle Dot |
| 02D0 | ː | Modifier Letter Triangular Colon |
| 02D1 | ˑ | Modifier Letter Half Triangular Colon |
| 0640 | ـ | Arabic Tatweel |
| 07FA | ߺ | NKo Lajanyalan |
| 0A71 | ੱ | Gurmukhi Addak |
| 0AFB | ૻ | Gujarati Sign Shadda |
| 0B55 | ୕ | Oriya Sign Overline |
| 0E46 | ๆ | Thai Character Maiyamok |
| 0EC6 | ໆ | Lao Ko La |
| 180A | ᠊ | Mongolian Nirugu |
| 1843 | ᡃ | Mongolian Letter Todo Long Vowel Sign |
| 1AA7 | ᪧ | Tai Tham Sign Mai Yamok |
| 1C36 | ᰶ | Lepcha Sign Ran |
| 1C7B | ᱻ | Ol Chiki Relaa |
| 3005 | 々 | Ideographic Iteration Mark |
| 3031 | 〱 | Vertical Kana Repeat Mark |
| 3032 | 〲 | Vertical Kana Repeat with Voiced Sound Mark |
| 3033 | 〳 | Vertical Kana Repeat Mark Upper Half |
| 3034 | 〴 | Vertical Kana Repeat with Voiced Sound Mark Upper Half |
| 3035 | 〵 | Vertical Kana Repeat Mark Lower Half |
| 309D | ゝ | Hiragana Iteration Mark |
| 309E | ゞ | Hiragana Voiced Iteration Mark |
| 30FC | ー | Katakana-Hiragana Prolonged Sound Mark |
| 30FD | ヽ | Katakana Iteration Mark |
| 30FE | ヾ | Katakana Voiced Iteration Mark |
| A015 | ꀕ | Yi Syllable Iteration Mark |
| A60C | ꘌ | Vai Syllable Lengthener |
| A9CF | ꧏ | Javanese Pangrangkep |
| A9E6 | ꧦ | Myanmar Modifier Letter Shan Reduplication |
| AA70 | ꩰ | Myanmar Modifier Letter Khamti Reduplication |
| AADD | ꫝ | Tai Viet Symbol Sam |
| AAF3 | ꫳ | Meetei Mayek Syllable Repetition Mark |
| AAF4 | ꫴ | Meetei Mayek Word Repetition Mark |
| FF70 | ｰ | Halfwidth Katakana-Hiragana Prolonged Sound Mark |
| 10781 | 𐞁 | Modifier Letter Superscript Triangular Colon |
| 10782 | 𐞂 | Modifier Letter Superscript Half Triangular Colon |
| 10D4E | | Garay Vowel Length Mark |
| 10D6A | | Garay Consonant Gemination Mark |
| 10D6F | | Garay Reduplication Mark |
| 11237 | 𑈷 | Khojki Sign Shadda |
| 1135D | 𑍝 | Grantha Sign Pluta |
| 113D2 | | Tulu-Tigalari Gemination Mark |
| 113D3 | | Tulu-Tigalari Sign Pluta |
| 115C6 | 𑗆 | Siddham Repetition Mark-1 |
| 115C7 | 𑗇 | Siddham Repetition Mark-2 |
| 115C8 | 𑗈 | Siddham Repetition Mark-3 |
| 11A98 | 𑪘 | Soyombo Gemination Mark |
| 11DD9 | | Tolong Siki Sign Sela |
| 16B42 | 𖭂 | Pahawh Hmong Sign Vos Nrua |
| 16B43 | 𖭃 | Pahawh Hmong Sign Ib Yam |
| 16FE0 | 𖿠 | Tangut Iteration Mark |
| 16FE1 | 𖿡 | Nushu Iteration Mark |
| 16FE3 | 𖿣 | Old Chinese Iteration Mark |
| 16FF2 | | Chinese Small Simplified Er |
| 16FF3 | | Chinese Small Traditional Er |
| 1E13C | 𞄼 | Nyiakeng Puachue Hmong Sign Xw Xw |
| 1E13D | 𞄽 | Nyiakeng Puachue Hmong Syllable Lengthener |
| 1E5EF | | Ol Onal Sign Ikir |
| 1E944 | 𞥄 | Adlam Alif Lengthener |
| 1E945 | 𞥅 | Adlam Vowel Lengthener |
| 1E946 | 𞥆 | Adlam Gemination Mark |
IDS Operators
Ideographic Description Sequences (IDS) are used to describe the shapes of Han, Tangut, Khitan, and Nüshu characters in terms of their components. So‐called ideographic characters are often modular in nature and can be recursively deconstructed into basic radicals and strokes. This is useful both for indexing and categorising these large sets of characters, and for systematically substituting as‐of‐yet unencoded ideographs.
Ideographic description characters represent simple spatial arrangements of ideographic components. They possess one of three properties based on how many operands they take: IDS_Unary_Operator, IDS_Binary_Operator or IDS_Trinary_Operator. Such an operand may in turn be another IDS. The “Example” column shows one Han character separated into its top‐level components for each operator.
See The Unicode Standard, section 18.2: Ideographic Description Characters for more information.
| Code | Char | Name | Example |
|---|---|---|---|
| Unary Operators | | | |
| 2FFE | ⿾ | Ideographic Description Character Horizontal Reflection | ⿾正 → 𣥄 |
| 2FFF | ⿿ | Ideographic Description Character Rotation | ⿿予 → 𠄔 |
| Binary Operators | | | |
| 2FF0 | ⿰ | Ideographic Description Character Left to Right | ⿰車侖 → 輪 |
| 2FF1 | ⿱ | Ideographic Description Character Above to Below | ⿱山石 → 岩 |
| 2FF4 | ⿴ | Ideographic Description Character Full Surround | ⿴囗寸 → 団 |
| 2FF5 | ⿵ | Ideographic Description Character Surround from Above | ⿵門人 → 閃 |
| 2FF6 | ⿶ | Ideographic Description Character Surround from Below | ⿶凵水 → 凼 |
| 2FF7 | ⿷ | Ideographic Description Character Surround from Left | ⿷匚斤 → 匠 |
| 2FF8 | ⿸ | Ideographic Description Character Surround from Upper Left | ⿸尸毛 → 尾 |
| 2FF9 | ⿹ | Ideographic Description Character Surround from Upper Right | ⿹气米 → 氣 |
| 2FFA | ⿺ | Ideographic Description Character Surround from Lower Left | ⿺走戉 → 越 |
| 2FFB | ⿻ | Ideographic Description Character Overlaid | ⿻木日 → 東 |
| 2FFC | ⿼ | Ideographic Description Character Surround from Right | ⿼叉丶 → 㕚 |
| 2FFD | ⿽ | Ideographic Description Character Surround from Lower Right | ⿽水丶 → 氷 |
| 31EF | ㇯ | Ideographic Description Character Subtraction | ㇯有二 → 冇 |
| Trinary Operators | | | |
| 2FF2 | ⿲ | Ideographic Description Character Left to Middle and Right | ⿲彳氵亍 → 衍 |
| 2FF3 | ⿳ | Ideographic Description Character Above to Middle and Below | ⿳艹世木 → 葉 |
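Since every operator has a fixed number of operands and operands may themselves be sequences, an IDS can be read with a short recursive parser. The sketch below takes the arities from the table above; `ARITY` and `parse_ids` are names invented for this example, and no checks beyond operand counting are attempted.

```python
# Arity of each ideographic description character, per the table above.
ARITY = {}
ARITY.update({chr(cp): 1 for cp in (0x2FFE, 0x2FFF)})
ARITY.update(
    {chr(cp): 2 for cp in (0x2FF0, 0x2FF1, *range(0x2FF4, 0x2FFE), 0x31EF)}
)
ARITY.update({chr(cp): 3 for cp in (0x2FF2, 0x2FF3)})


def parse_ids(sequence, start=0):
    """Parse one IDS beginning at index `start`.

    Returns (tree, next_index). A tree is either a single component
    character or a tuple (operator, [operand trees]).
    """
    if start >= len(sequence):
        raise ValueError("unexpected end of sequence")
    char = sequence[start]
    position = start + 1
    if char not in ARITY:
        return char, position  # a plain component
    operands = []
    for _ in range(ARITY[char]):
        operand, position = parse_ids(sequence, position)
        operands.append(operand)
    return (char, operands), position


# U+2FF0 (left to right) applied to two components.
tree, end = parse_ids("\u2FF0車侖")
print(tree, end)
```

A sequence that ends before an operator has received all of its operands raises `ValueError`.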
Join Controls
Only two characters possess the Join_Control property: U+200D causes its neighboring characters to form a ligature or assume contextual glyphs as if they were cursively joined to each other even if they otherwise wouldn’t; U+200C prevents ligation or cursive joining of its neighbors.
| Code | Char | Name |
|---|---|---|
| 200C | | Zero Width Non-Joiner |
| 200D | | Zero Width Joiner |
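Both join controls are invisible, so they can make two strings that look alike compare as different. One possible way to handle this in a loose search is to drop them before comparing, as in the sketch below (`strip_join_controls` is a hypothetical helper). Note that removing them changes how the text is rendered, so this is only suitable for matching and not for storing or displaying text.

```python
ZWNJ = "\u200C"  # Zero Width Non-Joiner
ZWJ = "\u200D"   # Zero Width Joiner
JOIN_CONTROLS = {ZWNJ, ZWJ}


def strip_join_controls(text):
    """Remove both join controls, e.g. before a loose comparison."""
    return "".join(char for char in text if char not in JOIN_CONTROLS)


# A Persian word written with a ZWNJ between its two parts.
word = "\u0645\u06CC\u200C\u062E\u0648\u0627\u0647\u0645"
print(len(word), len(strip_join_controls(word)))
```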
Logical Order Exception
Unicode text is generally encoded in logical order, also called phonetic order. This means that user‐perceived characters are constructed in such a way that their component code points are placed roughly in the same order as they would be pronounced, even if their visual appearance would suggest a different order. For example, the Devanagari syllable ki (कि) is produced by putting the consonant ka (क) first and the vowel i (◌ि) second in the character stream, even though visually the vowel sign comes before the base consonant with regard to the writing direction.
For various historical reasons, the Thai, Lao, New Tai Lue, and Tai Viet scripts are an exception to this rule. In those scripts, the vowel signs that visually sit to the left of their base consonants actually precede them in the character stream as well. For example, the Thai syllable ke (เก) is written as e (เ) followed by ka (ก).
This poses unique challenges to search and collation algorithms, as they need to internally swap such sequences around to process them correctly, which is why all affected characters have been collected in the Logical_Order_Exception property set.
| Code | Char | Name |
|---|---|---|
| 0E40 | เ | Thai Character Sara E |
| 0E41 | แ | Thai Character Sara Ae |
| 0E42 | โ | Thai Character Sara O |
| 0E43 | ใ | Thai Character Sara Ai Maimuan |
| 0E44 | ไ | Thai Character Sara Ai Maimalai |
| 0EC0 | ເ | Lao Vowel Sign E |
| 0EC1 | ແ | Lao Vowel Sign Ei |
| 0EC2 | ໂ | Lao Vowel Sign O |
| 0EC3 | ໃ | Lao Vowel Sign Ay |
| 0EC4 | ໄ | Lao Vowel Sign Ai |
| 19B5 | ᦵ | New Tai Lue Vowel Sign E |
| 19B6 | ᦶ | New Tai Lue Vowel Sign Ae |
| 19B7 | ᦷ | New Tai Lue Vowel Sign O |
| 19BA | ᦺ | New Tai Lue Vowel Sign Ay |
| AAB5 | ꪵ | Tai Viet Vowel E |
| AAB6 | ꪶ | Tai Viet Vowel O |
| AAB9 | ꪹ | Tai Viet Vowel Uea |
| AABB | ꪻ | Tai Viet Vowel Aue |
| AABC | ꪼ | Tai Viet Vowel Ay |
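The internal swap mentioned above can be sketched as follows. This is a deliberately simplified version that exchanges each listed vowel with the single character after it; actual collation implementations are more involved. `LOGICAL_ORDER_EXCEPTIONS` and `to_logical_order` are names made up for this example.

```python
# Code points from the table above.
LOGICAL_ORDER_EXCEPTIONS = {
    *range(0x0E40, 0x0E45),                   # Thai
    *range(0x0EC0, 0x0EC5),                   # Lao
    0x19B5, 0x19B6, 0x19B7, 0x19BA,           # New Tai Lue
    0xAAB5, 0xAAB6, 0xAAB9, 0xAABB, 0xAABC,   # Tai Viet
}


def to_logical_order(text):
    """Swap each prefixed vowel with the character that follows it."""
    chars = list(text)
    index = 0
    while index < len(chars) - 1:
        if ord(chars[index]) in LOGICAL_ORDER_EXCEPTIONS:
            chars[index], chars[index + 1] = chars[index + 1], chars[index]
            index += 2  # skip the pair that was just swapped
        else:
            index += 1
    return "".join(chars)


# Thai "ke": stored as Sara E + Ko Kai, processed as Ko Kai + Sara E.
print([f"U+{ord(c):04X}" for c in to_logical_order("\u0E40\u0E01")])
```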
Modifier Combining Marks
Unicode Technical Report #53, Unicode Arabic Mark Rendering, uses the Modifier_Combining_Mark property to identify combining characters in the Arabic script that need to be rendered close to their base character regardless of the placement and canonical order of other marks within the same grapheme cluster.
| Code | Char | Name |
|---|---|---|
| 0654 | ٔ | Arabic Hamza Above |
| 0655 | ٕ | Arabic Hamza Below |
| 0658 | ٘ | Arabic Mark Noon Ghunna |
| 06DC | ۜ | Arabic Small High Seen |
| 06E3 | ۣ | Arabic Small Low Seen |
| 06E7 | ۧ | Arabic Small High Yeh |
| 06E8 | ۨ | Arabic Small High Noon |
| 08CA | ࣊ | Arabic Small High Farsi Yeh |
| 08CB | ࣋ | Arabic Small High Yeh Barree with Two Dots Below |
| 08CD | ࣍ | Arabic Small High Zah |
| 08CE | ࣎ | Arabic Large Round Dot Above |
| 08CF | ࣏ | Arabic Large Round Dot Below |
| 08D3 | ࣓ | Arabic Small Low Waw |
| 08F3 | ࣳ | Arabic Small High Waw |
Prepended Concatenation Marks
The Prepended_Concatenation_Mark property identifies formatting characters that, in a sense, are the opposite of combining marks. Whereas combining marks are placed after the base character they modify, prepended concatenation marks are placed before the sequence of characters (usually numerals) they apply to. Most prepended marks require complex rendering, as their glyphs are expected to stretch over or under a run of characters of arbitrary length.
The “Example” column shows each prepended mark applied to a simple sequence of characters.
| Code | Char | Name | Example |
|---|---|---|---|
| 0600 | | Arabic Number Sign | ١٢٣ |
| 0601 | | Arabic Sign Sanah | ١٢٣٤ |
| 0602 | | Arabic Footnote Mark | ١٢ |
| 0603 | | Arabic Sign Safha | ١٢٣ |
| 0604 | | Arabic Sign Samvat | ١٢٣٤ |
| 0605 | | Arabic Number Mark Above | 𐋡𐋠𐋴𐋬𐋤 |
| 06DD | | Arabic End of Ayah | ١٢٣ |
| 070F | | Syriac Abbreviation Mark | ܐܒܓܕ |
| 0890 | | Arabic Pound Mark Above | ١٢٣ |
| 0891 | | Arabic Piastre Mark Above | ١٢٣ |
| 08E2 | | Arabic Disputed End of Ayah | ١٢٣ |
| 110BD | | Kaithi Number Sign | १२३४ |
| 110CD | | Kaithi Number Sign Above | १२३४ |
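The difference in ordering compared to combining marks can be seen by building both kinds of sequence in code. The sketch below uses Python's standard `unicodedata` module to print what each code point is; the variable names are illustrative only.

```python
import unicodedata

ARABIC_NUMBER_SIGN = "\u0600"
ARABIC_INDIC_123 = "\u0661\u0662\u0663"

# The prepended mark comes first, followed by the digits it spans.
marked_number = ARABIC_NUMBER_SIGN + ARABIC_INDIC_123

# By contrast, a combining mark follows its base character.
accented_letter = "e" + "\u0301"

for char in marked_number + accented_letter:
    print(f"U+{ord(char):04X}", unicodedata.category(char), unicodedata.name(char))
```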
Quotation Marks
The Quotation_Mark property should be self‐explanatory: it identifies characters that function as quotation marks.
| Code | Char | Name |
|---|---|---|
| 0022 | " | Quotation Mark |
| 0027 | ' | Apostrophe |
| 00AB | « | Left-Pointing Double Angle Quotation Mark |
| 00BB | » | Right-Pointing Double Angle Quotation Mark |
| 2018 | ‘ | Left Single Quotation Mark |
| 2019 | ’ | Right Single Quotation Mark |
| 201A | ‚ | Single Low-9 Quotation Mark |
| 201B | ‛ | Single High-Reversed-9 Quotation Mark |
| 201C | “ | Left Double Quotation Mark |
| 201D | ” | Right Double Quotation Mark |
| 201E | „ | Double Low-9 Quotation Mark |
| 201F | ‟ | Double High-Reversed-9 Quotation Mark |
| 2039 | ‹ | Single Left-Pointing Angle Quotation Mark |
| 203A | › | Single Right-Pointing Angle Quotation Mark |
| 2E42 | ⹂ | Double Low-Reversed-9 Quotation Mark |
| 300C | 「 | Left Corner Bracket |
| 300D | 」 | Right Corner Bracket |
| 300E | 『 | Left White Corner Bracket |
| 300F | 』 | Right White Corner Bracket |
| 301D | 〝 | Reversed Double Prime Quotation Mark |
| 301E | 〞 | Double Prime Quotation Mark |
| 301F | 〟 | Low Double Prime Quotation Mark |
| FE41 | ﹁ | Presentation Form for Vertical Left Corner Bracket |
| FE42 | ﹂ | Presentation Form for Vertical Right Corner Bracket |
| FE43 | ﹃ | Presentation Form for Vertical Left White Corner Bracket |
| FE44 | ﹄ | Presentation Form for Vertical Right White Corner Bracket |
| FF02 | ＂ | Fullwidth Quotation Mark |
| FF07 | ＇ | Fullwidth Apostrophe |
| FF62 | ｢ | Halfwidth Left Corner Bracket |
| FF63 | ｣ | Halfwidth Right Corner Bracket |
Soft Dotted
Letters with the Soft_Dotted property include a tittle in their glyph that disappears when a diacritical mark is placed above them. The “Example” column shows each soft‐dotted character with U+0301 ◌́ Combining Acute Accent. Notice how the acute accent completely replaces the tittle from the original base glyph.
| Code | Char | Name | Example |
|---|---|---|---|
| 0069 | i | Latin Small Letter I | í |
| 006A | j | Latin Small Letter J | j́ |
| 012F | į | Latin Small Letter I with Ogonek | į́ |
| 0249 | ɉ | Latin Small Letter J with Stroke | ɉ́ |
| 0268 | ɨ | Latin Small Letter I with Stroke | ɨ́ |
| 029D | ʝ | Latin Small Letter J with Crossed-Tail | ʝ́ |
| 02B2 | ʲ | Modifier Letter Small J | ʲ́ |
| 03F3 | ϳ | Greek Letter Yot | ϳ́ |
| 0456 | і | Cyrillic Small Letter Byelorussian-Ukrainian I | і́ |
| 0458 | ј | Cyrillic Small Letter Je | ј́ |
| 1D62 | ᵢ | Latin Subscript Small Letter I | ᵢ́ |
| 1D96 | ᶖ | Latin Small Letter I with Retroflex Hook | ᶖ́ |
| 1DA4 | ᶤ | Modifier Letter Small I with Stroke | ᶤ́ |
| 1DA8 | ᶨ | Modifier Letter Small J with Crossed-Tail | ᶨ́ |
| 1E2D | ḭ | Latin Small Letter I with Tilde Below | ḭ́ |
| 1ECB | ị | Latin Small Letter I with Dot Below | ị́ |
| 2071 | ⁱ | Superscript Latin Small Letter I | ⁱ́ |
| 2148 | ⅈ | Double-Struck Italic Small I | ⅈ́ |
| 2149 | ⅉ | Double-Struck Italic Small J | ⅉ́ |
| 2C7C | ⱼ | Latin Subscript Small Letter J | ⱼ́ |
| 1D422 | 𝐢 | Mathematical Bold Small I | 𝐢́ |
| 1D423 | 𝐣 | Mathematical Bold Small J | 𝐣́ |
| 1D456 | 𝑖 | Mathematical Italic Small I | 𝑖́ |
| 1D457 | 𝑗 | Mathematical Italic Small J | 𝑗́ |
| 1D48A | 𝒊 | Mathematical Bold Italic Small I | 𝒊́ |
| 1D48B | 𝒋 | Mathematical Bold Italic Small J | 𝒋́ |
| 1D4BE | 𝒾 | Mathematical Script Small I | 𝒾́ |
| 1D4BF | 𝒿 | Mathematical Script Small J | 𝒿́ |
| 1D4F2 | 𝓲 | Mathematical Bold Script Small I | 𝓲́ |
| 1D4F3 | 𝓳 | Mathematical Bold Script Small J | 𝓳́ |
| 1D526 | 𝔦 | Mathematical Fraktur Small I | 𝔦́ |
| 1D527 | 𝔧 | Mathematical Fraktur Small J | 𝔧́ |
| 1D55A | 𝕚 | Mathematical Double-Struck Small I | 𝕚́ |
| 1D55B | 𝕛 | Mathematical Double-Struck Small J | 𝕛́ |
| 1D58E | 𝖎 | Mathematical Bold Fraktur Small I | 𝖎́ |
| 1D58F | 𝖏 | Mathematical Bold Fraktur Small J | 𝖏́ |
| 1D5C2 | 𝗂 | Mathematical Sans-Serif Small I | 𝗂́ |
| 1D5C3 | 𝗃 | Mathematical Sans-Serif Small J | 𝗃́ |
| 1D5F6 | 𝗶 | Mathematical Sans-Serif Bold Small I | 𝗶́ |
| 1D5F7 | 𝗷 | Mathematical Sans-Serif Bold Small J | 𝗷́ |
| 1D62A | 𝘪 | Mathematical Sans-Serif Italic Small I | 𝘪́ |
| 1D62B | 𝘫 | Mathematical Sans-Serif Italic Small J | 𝘫́ |
| 1D65E | 𝙞 | Mathematical Sans-Serif Bold Italic Small I | 𝙞́ |
| 1D65F | 𝙟 | Mathematical Sans-Serif Bold Italic Small J | 𝙟́ |
| 1D692 | 𝚒 | Mathematical Monospace Small I | 𝚒́ |
| 1D693 | 𝚓 | Mathematical Monospace Small J | 𝚓́ |
| 1DF1A | 𝼚 | Latin Small Letter I with Stroke and Retroflex Hook | 𝼚́ |
| 1E04C | 𞁌 | Modifier Letter Cyrillic Small Byelorussian-Ukrainian I | 𞁌́ |
| 1E04D | 𞁍 | Modifier Letter Cyrillic Small Je | 𞁍́ |
| 1E068 | 𞁨 | Cyrillic Subscript Small Letter Byelorussian-Ukrainian I | 𞁨́ |
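At the encoding level, the dot is never stored separately: the base letter is simply followed by the accent, and normalization may merge the two. The sketch below shows this with Python's standard `unicodedata` module. The second half, which encodes an explicit U+0307 Combining Dot Above to keep a visible dot under the accent (as is done for Lithuanian), is included as an assumption about what a caller might want and not as something this page prescribes.

```python
import unicodedata

ACUTE = "\u0301"      # Combining Acute Accent
DOT_ABOVE = "\u0307"  # Combining Dot Above

# NFC composes i + acute into a single precomposed letter without a tittle.
composed = unicodedata.normalize("NFC", "i" + ACUTE)
print(f"U+{ord(composed):04X}", unicodedata.name(composed))

# An explicit dot above placed before the accent stays a separate mark.
dotted = "i" + DOT_ABOVE + ACUTE
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", dotted)])
```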
White Space
White space characters are often used to separate text elements and generally have no graphic appearance of their own, except for an advance width. The White_Space property includes everything with a General_Category value of Separator, plus a few control characters that have space‐like properties or induce line breaks.
| Code | Char | Name |
|---|---|---|
| 0009 | | Character Tabulation |
| 000A | | Line Feed |
| 000B | | Line Tabulation |
| 000C | | Form Feed |
| 000D | | Carriage Return |
| 0020 | | Space |
| 0085 | | Next Line |
| 00A0 | | No-Break Space |
| 1680 | | Ogham Space Mark |
| 2000 | | En Quad |
| 2001 | | Em Quad |
| 2002 | | En Space |
| 2003 | | Em Space |
| 2004 | | Three-per-Em Space |
| 2005 | | Four-per-Em Space |
| 2006 | | Six-per-Em Space |
| 2007 | | Figure Space |
| 2008 | | Punctuation Space |
| 2009 | | Thin Space |
| 200A | | Hair Space |
| 2028 | | Line Separator |
| 2029 | | Paragraph Separator |
| 202F | | Narrow No-Break Space |
| 205F | | Medium Mathematical Space |
| 3000 | | Ideographic Space |
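A membership test for this property can be sketched by hard-coding the 25 code points from the table above. `WHITE_SPACE` and `is_white_space` are names invented for this example. Python's built-in `str.isspace()` follows its own definition, so the loop at the end lists any code points on which the two disagree for the interpreter's Unicode version.

```python
# Code points from the table above.
WHITE_SPACE = {
    *range(0x0009, 0x000E), 0x0020, 0x0085, 0x00A0, 0x1680,
    *range(0x2000, 0x200B), 0x2028, 0x2029, 0x202F, 0x205F, 0x3000,
}


def is_white_space(char):
    """Return True if the single character is listed in the table above."""
    return ord(char) in WHITE_SPACE


# Collect every code point where str.isspace() gives a different answer.
differences = [
    f"U+{cp:04X}"
    for cp in range(0x110000)
    if chr(cp).isspace() != is_white_space(chr(cp))
]
print(differences)
```

U+200B Zero Width Space, despite its name, is not in the table and is therefore reported as `False` by `is_white_space`.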