linux/Documentation/unicode.txt
                 Last update: 2005-01-17, version 1.4

This file is maintained by H. Peter Anvin <unicode@lanana.org> as part
of the Linux Assigned Names And Numbers Authority (LANANA) project.
The current version can be found at:

            http://www.lanana.org/docs/unicode/unicode.txt

                       ------------------------

The Linux kernel code has been rewritten to use Unicode to map
characters to fonts.  By downloading a single Unicode-to-font table,
both the eight-bit character sets and UTF-8 mode are changed to use
the font as indicated.

This changes the semantics of the eight-bit character tables subtly.
The four character tables are now:

Map symbol      Map name                        Escape code (G0)

LAT1_MAP        Latin-1 (ISO 8859-1)            ESC ( B
GRAF_MAP        DEC VT100 pseudographics        ESC ( 0
IBMPC_MAP       IBM code page 437               ESC ( U
USER_MAP        User defined                    ESC ( K

In particular, ESC ( U is no longer "straight to font", since the font
might be completely different from the IBM character set.  This
permits, for example, the use of block graphics even with a Latin-1
font loaded.

Note that although these codes are similar to ISO 2022, neither the
codes nor their uses match ISO 2022; Linux has two 8-bit codes (G0 and
G1), whereas ISO 2022 has four 7-bit codes (G0-G3).

In accordance with the Unicode standard/ISO 10646 the range U+F000 to
U+F8FF has been reserved for OS-wide allocation (the Unicode Standard
refers to this as a "Corporate Zone"; since this is inaccurate for
Linux we call it the "Linux Zone").  U+F000 was picked as the starting
point since it lets the direct-mapping area start on a large power of
two (in case 1024- or 2048-character fonts ever become necessary).
This leaves U+E000 to U+EFFF as End User Zone.

[v1.2]: The Unicode range from U+F000 up to U+F7FF has been
hard-coded to map directly to the loaded font, bypassing the
translation table.  The user-defined map now defaults to U+F000 to
U+F0FF, emulating the previous behaviour.  In practice, this range
might be shorter; for example, vgacon can only handle 256-character
(U+F000..U+F0FF) or 512-character (U+F000..U+F1FF) fonts.


Actual characters assigned in the Linux Zone
--------------------------------------------

In addition, the following characters not present in Unicode 1.1.4
have been defined; these are used by the DEC VT graphics map.  [v1.2]
THIS USE IS OBSOLETE AND SHOULD NO LONGER BE USED; PLEASE SEE BELOW.

U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1
U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9

The DEC VT220 uses a 6x10 character matrix, and these characters form
a smooth progression in the DEC VT graphics character set.  I have
omitted the scan 5 line, since it is also used as a block-graphics
character, and hence has been coded as U+2500 FORMS LIGHT HORIZONTAL.

[v1.3]: These characters have been officially added to Unicode 3.2.0;
they are added at U+23BA, U+23BB, U+23BC, U+23BD.  Linux now uses the
new values.

[v1.2]: The following characters have been added to represent common
keyboard symbols that are unlikely to ever be added to Unicode proper
since they are horribly vendor-specific.  This, of course, is an
excellent example of horrible design.

U+F810 KEYBOARD SYMBOL FLYING FLAG
U+F811 KEYBOARD SYMBOL PULLDOWN MENU
U+F812 KEYBOARD SYMBOL OPEN APPLE
U+F813 KEYBOARD SYMBOL SOLID APPLE

Klingon language support
------------------------

In 1996, Linux was the first operating system in the world to add
support for the artificial language Klingon, created by Marc Okrand
for the "Star Trek" television series.  This encoding was later
adopted by the ConScript Unicode Registry and proposed (but ultimately
rejected) for inclusion in Unicode Plane 1.  Thus, it remains as a
Linux/CSUR private assignment in the Linux Zone.

This encoding has been endorsed by the Klingon Language Institute.
For more information, contact them at:

        http://www.kli.org/

Since the characters in the beginning of the Linux CZ have been more
of the dingbats/symbols/forms type and this is a language, I have
located it at the end, on a 16-cell boundary in keeping with standard
Unicode practice.

NOTE: This range is now officially managed by the ConScript Unicode
Registry.  The normative reference is at:

        http://www.evertype.com/standards/csur/klingon.html

Klingon has an alphabet of 26 characters, a positional numeric writing
system with 10 digits, and is written left-to-right, top-to-bottom.

Several glyph forms for the Klingon alphabet have been proposed.
However, since the set of symbols appears to be consistent throughout,
with only the actual shapes being different, in keeping with standard
Unicode practice these differences are considered font variants.

U+F8D0  KLINGON LETTER A
U+F8D1  KLINGON LETTER B
U+F8D2  KLINGON LETTER CH
U+F8D3  KLINGON LETTER D
U+F8D4  KLINGON LETTER E
U+F8D5  KLINGON LETTER GH
U+F8D6  KLINGON LETTER H
U+F8D7  KLINGON LETTER I
U+F8D8  KLINGON LETTER J
U+F8D9  KLINGON LETTER L
U+F8DA  KLINGON LETTER M
U+F8DB  KLINGON LETTER N
U+F8DC  KLINGON LETTER NG
U+F8DD  KLINGON LETTER O
U+F8DE  KLINGON LETTER P
U+F8DF  KLINGON LETTER Q
        - Written <q> in standard Okrand Latin transliteration
U+F8E0  KLINGON LETTER QH
        - Written <Q> in standard Okrand Latin transliteration
U+F8E1  KLINGON LETTER R
U+F8E2  KLINGON LETTER S
U+F8E3  KLINGON LETTER T
U+F8E4  KLINGON LETTER TLH
U+F8E5  KLINGON LETTER U
U+F8E6  KLINGON LETTER V
U+F8E7  KLINGON LETTER W
U+F8E8  KLINGON LETTER Y
U+F8E9  KLINGON LETTER GLOTTAL STOP

U+F8F0  KLINGON DIGIT ZERO
U+F8F1  KLINGON DIGIT ONE
U+F8F2  KLINGON DIGIT TWO
U+F8F3  KLINGON DIGIT THREE
U+F8F4  KLINGON DIGIT FOUR
U+F8F5  KLINGON DIGIT FIVE
U+F8F6  KLINGON DIGIT SIX
U+F8F7  KLINGON DIGIT SEVEN
U+F8F8  KLINGON DIGIT EIGHT
U+F8F9  KLINGON DIGIT NINE

U+F8FD  KLINGON COMMA
U+F8FE  KLINGON FULL STOP
U+F8FF  KLINGON SYMBOL FOR EMPIRE

Other Fictional and Artificial Scripts
--------------------------------------

Since the assignment of the Klingon Linux Unicode block, a registry of
fictional and artificial scripts has been established by John Cowan
<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>.
The ConScript Unicode Registry is accessible at:

          http://www.evertype.com/standards/csur/

The ranges used fall at the low end of the End User Zone and can hence
not be normatively assigned, but it is recommended that people who
wish to encode fictional scripts use these codes, in the interest of
interoperability.  For Klingon, CSUR has adopted the Linux encoding.
The CSUR people are driving adding Tengwar and Cirth into Unicode
Plane 1; the addition of Klingon to Unicode Plane 1 has been rejected
and so the above encoding remains official.
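
The zone layout described earlier in this file can be summarised in a
short sketch.  The Python below is illustrative only and is not kernel
code: the names classify, direct_font_index and modernize_dec_vt are
hypothetical.  It assumes that a direct-to-font code point selects the
glyph at offset (code point - U+F000), and that the four obsolete DEC
VT code points pair with U+23BA..U+23BD in the order both lists are
given.

```python
# Illustrative sketch only; names are hypothetical, not kernel APIs.

END_USER_ZONE = range(0xE000, 0xF000)   # U+E000..U+EFFF
LINUX_ZONE = range(0xF000, 0xF900)      # U+F000..U+F8FF
DIRECT_TO_FONT = range(0xF000, 0xF800)  # U+F000..U+F7FF [v1.2]

# Obsolete Linux Zone scan-line characters and their Unicode 3.2
# replacements (assumed to pair up in listing order).
DEC_VT_REPLACEMENTS = {
    0xF800: 0x23BA,  # scan 1
    0xF801: 0x23BB,  # scan 3
    0xF803: 0x23BC,  # scan 7
    0xF804: 0x23BD,  # scan 9
}


def classify(cp):
    """Name the zone a code point falls in, per the ranges in this file."""
    if cp in DIRECT_TO_FONT:
        return "linux-zone/direct-to-font"
    if cp in LINUX_ZONE:
        return "linux-zone"
    if cp in END_USER_ZONE:
        return "end-user-zone"
    return "other"


def direct_font_index(cp, font_size=256):
    """Glyph index for a direct-to-font code point, or None.

    font_size is the number of glyphs in the loaded font (for example
    256 or 512); code points past the end of the font give None.
    """
    if cp not in DIRECT_TO_FONT:
        return None
    index = cp - 0xF000
    return index if index < font_size else None


def modernize_dec_vt(text):
    """Replace the obsolete scan-line code points with U+23BA..U+23BD."""
    return text.translate(DEC_VT_REPLACEMENTS)
```

With a 256-glyph font, direct_font_index(0xF041) gives glyph 0x41,
while U+F100 is only reachable once a 512-glyph font is assumed.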
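
As a rough illustration of the Klingon block listed above, the sketch
below converts Latin transliteration into the U+F8D0..U+F8E9 letters
and U+F8F0..U+F8F9 digits.  It is not part of the kernel.  The Latin
spellings are an assumption based on the usual Okrand conventions
(this file itself only states <q> and <Q>), the function name encode
is hypothetical, and greedy longest-match tokenising is a
simplification that does not resolve ambiguous sequences such as
"n" followed by "gh".

```python
# Illustrative sketch only; not part of the kernel.
# Assumed Okrand Latin spellings, in the same order as the letters
# U+F8D0..U+F8E9 (only <q> and <Q> are stated in this file).
LETTERS = ["a", "b", "ch", "D", "e", "gh", "H", "I", "j", "l", "m", "n",
           "ng", "o", "p", "q", "Q", "r", "S", "t", "tlh", "u", "v", "w",
           "y", "'"]
LETTER_BASE = 0xF8D0  # KLINGON LETTER A
DIGIT_BASE = 0xF8F0   # KLINGON DIGIT ZERO

TO_PUA = {latin: chr(LETTER_BASE + i) for i, latin in enumerate(LETTERS)}
TO_PUA.update({str(d): chr(DIGIT_BASE + d) for d in range(10)})

# Try longer spellings first so that "tlh" wins over "t".
_BY_LENGTH = sorted(TO_PUA, key=len, reverse=True)


def encode(text):
    """Map transliterated text to Linux Zone code points."""
    out = []
    i = 0
    while i < len(text):
        for token in _BY_LENGTH:
            if text.startswith(token, i):
                out.append(TO_PUA[token])
                i += len(token)
                break
        else:
            # Spaces and anything unrecognised pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

For example, encode("tlhIngan") yields the five code points U+F8E4,
U+F8D7, U+F8DC, U+F8D0 and U+F8DB.  Rendering them still requires a
font that supplies glyphs at these private-use positions.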