LGR for Languages spoken in Pakistan

This document is mechanically formatted from the XML file for the LGR. It provides additional summary data and explanatory text. The XML file remains the sole normative specification of the LGR.

LGR Version 1
Date 2017-03-20
Unicode Version 6.3.0
Language Languages spoken in Pakistan
Scope domain: پاکستان.

Description

Label Generation Rules for Languages spoken in Pakistan

Overview

This document specifies a set of Label Generation Rules for languages spoken in Pakistan, including Balochi, Pashto, Punjabi, Saraiki, Sindhi and Urdu, using a limited repertoire appropriate for a second-level domain.

This document also provides a description of the IDN (Internationalized Domain Names) Language Table to be used by the Pakistan IDN ccTLD registry for the registration of labels in languages spoken in Pakistan. The table is based on the recommendation [401] of the پاکستان. IDN ccTLD Language Table Sub-Committee formed by the Ministry of IT, Government of Pakistan. Therefore, references for individual code points are not given.

The repertoire and variant sets are adjusted based on the more recent work by the Arabic script community published as the Root Zone Label Generation Ruleset for the Arabic Script by ICANN. The code point U+200C (‌) ZERO WIDTH NON-JOINER has been excluded due to security reasons. The variant sets have been added following the recommendations of the Root Zone LGR. However, U+0641 (ف) and U+0642 (ق) are kept distinct as these will not be used as variants for languages spoken in Pakistan.

Repertoire

Summary

Number of elements in Repertoire 165
Number of extended elements 0
Number of excluded elements 0
Total entries in table 165
Number of code point sequences 0

Repertoire by Code Point

The following table lists the repertoire by code point (or code point sequence). The data in the Script and Name columns are extracted from the Unicode Character Database. Where the comment in the original LGR is identical to the character name, it has been suppressed.

For any code point or sequence for which a variant is defined, the Variants column provides a link to the associated variant set or, if the code point or sequence maps only to itself, the variant type of that mapping.

# Code Point Glyph Script Name Tags Required Context Part of Repertoire Variants Comment References
1 002D - Common HYPHEN-MINUS   not: hyphen-minus-disallowed    
2 0030 0 Common DIGIT ZERO ascii-digits not: leading-digit set 1    
3 0031 1 Common DIGIT ONE ascii-digits not: leading-digit set 2    
4 0032 2 Common DIGIT TWO ascii-digits not: leading-digit set 3    
5 0033 3 Common DIGIT THREE ascii-digits not: leading-digit set 4    
6 0034 4 Common DIGIT FOUR ascii-digits not: leading-digit set 5    
7 0035 5 Common DIGIT FIVE ascii-digits not: leading-digit set 6    
8 0036 6 Common DIGIT SIX ascii-digits not: leading-digit set 7    
9 0037 7 Common DIGIT SEVEN ascii-digits not: leading-digit set 8    
10 0038 8 Common DIGIT EIGHT ascii-digits not: leading-digit set 9    
11 0039 9 Common DIGIT NINE ascii-digits not: leading-digit set 10    
12 0061 a Latin LATIN SMALL LETTER A          
13 0062 b Latin LATIN SMALL LETTER B          
14 0063 c Latin LATIN SMALL LETTER C          
15 0064 d Latin LATIN SMALL LETTER D          
16 0065 e Latin LATIN SMALL LETTER E          
17 0066 f Latin LATIN SMALL LETTER F          
18 0067 g Latin LATIN SMALL LETTER G          
19 0068 h Latin LATIN SMALL LETTER H          
20 0069 i Latin LATIN SMALL LETTER I          
21 006A j Latin LATIN SMALL LETTER J          
22 006B k Latin LATIN SMALL LETTER K          
23 006C l Latin LATIN SMALL LETTER L          
24 006D m Latin LATIN SMALL LETTER M          
25 006E n Latin LATIN SMALL LETTER N          
26 006F o Latin LATIN SMALL LETTER O          
27 0070 p Latin LATIN SMALL LETTER P          
28 0071 q Latin LATIN SMALL LETTER Q          
29 0072 r Latin LATIN SMALL LETTER R          
30 0073 s Latin LATIN SMALL LETTER S          
31 0074 t Latin LATIN SMALL LETTER T          
32 0075 u Latin LATIN SMALL LETTER U          
33 0076 v Latin LATIN SMALL LETTER V          
34 0077 w Latin LATIN SMALL LETTER W          
35 0078 x Latin LATIN SMALL LETTER X          
36 0079 y Latin LATIN SMALL LETTER Y          
37 007A z Latin LATIN SMALL LETTER Z          
38 0621 ء Arabic ARABIC LETTER HAMZA        
39 0622 آ Arabic ARABIC LETTER ALEF WITH MADDA ABOVE     set 11    
40 0623 أ Arabic ARABIC LETTER ALEF WITH HAMZA ABOVE     set 11    
41 0624 ؤ Arabic ARABIC LETTER WAW WITH HAMZA ABOVE     set 12    
42 0625 إ Arabic ARABIC LETTER ALEF WITH HAMZA BELOW     set 11    
43 0626 ئ Arabic ARABIC LETTER YEH WITH HAMZA ABOVE   precedes-right-joining  set 13    
44 0627 ا Arabic ARABIC LETTER ALEF     set 11    
45 0628 ب Arabic ARABIC LETTER BEH        
46 0629 ة Arabic ARABIC LETTER TEH MARBUTA     set 14    
47 062A ت Arabic ARABIC LETTER TEH     set 15    
48 062B ث Arabic ARABIC LETTER THEH     set 16    
49 062C ج Arabic ARABIC LETTER JEEM        
50 062D ح Arabic ARABIC LETTER HAH        
51 062E خ Arabic ARABIC LETTER KHAH        
52 062F د Arabic ARABIC LETTER DAL        
53 0630 ذ Arabic ARABIC LETTER THAL        
54 0631 ر Arabic ARABIC LETTER REH        
55 0632 ز Arabic ARABIC LETTER ZAIN        
56 0633 س Arabic ARABIC LETTER SEEN        
57 0634 ش Arabic ARABIC LETTER SHEEN        
58 0635 ص Arabic ARABIC LETTER SAD        
59 0636 ض Arabic ARABIC LETTER DAD        
60 0637 ط Arabic ARABIC LETTER TAH        
61 0638 ظ Arabic ARABIC LETTER ZAH        
62 0639 ع Arabic ARABIC LETTER AIN        
63 063A غ Arabic ARABIC LETTER GHAIN        
64 0641 ف Arabic ARABIC LETTER FEH        
65 0642 ق Arabic ARABIC LETTER QAF        
66 0643 ك Arabic ARABIC LETTER KAF     set 17    
67 0644 ل Arabic ARABIC LETTER LAM        
68 0645 م Arabic ARABIC LETTER MEEM        
69 0646 ن Arabic ARABIC LETTER NOON     set 18    
70 0647 ه Arabic ARABIC LETTER HEH     set 14    
71 0648 و Arabic ARABIC LETTER WAW     set 12    
72 0649 ى Arabic ARABIC LETTER ALEF MAKSURA     set 13    
73 064A ي Arabic ARABIC LETTER YEH     set 13    
74 0660 ٠ Arabic ARABIC-INDIC DIGIT ZERO arabic-indic-digits not: leading-digit set 1    
75 0661 ١ Arabic ARABIC-INDIC DIGIT ONE arabic-indic-digits not: leading-digit set 2    
76 0662 ٢ Arabic ARABIC-INDIC DIGIT TWO arabic-indic-digits not: leading-digit set 3    
77 0663 ٣ Arabic ARABIC-INDIC DIGIT THREE arabic-indic-digits not: leading-digit set 4    
78 0664 ٤ Arabic ARABIC-INDIC DIGIT FOUR arabic-indic-digits not: leading-digit set 5    
79 0665 ٥ Arabic ARABIC-INDIC DIGIT FIVE arabic-indic-digits not: leading-digit set 6    
80 0666 ٦ Arabic ARABIC-INDIC DIGIT SIX arabic-indic-digits not: leading-digit set 7    
81 0667 ٧ Arabic ARABIC-INDIC DIGIT SEVEN arabic-indic-digits not: leading-digit set 8    
82 0668 ٨ Arabic ARABIC-INDIC DIGIT EIGHT arabic-indic-digits not: leading-digit set 9    
83 0669 ٩ Arabic ARABIC-INDIC DIGIT NINE arabic-indic-digits not: leading-digit set 10    
84 0679 ٹ Arabic ARABIC LETTER TTEH     set 19    
85 067A ٺ Arabic ARABIC LETTER TTEHEH     set 15    
86 067B ٻ Arabic ARABIC LETTER BEEH     set 13    
87 067C ټ Arabic ARABIC LETTER TEH WITH RING        
88 067D ٽ Arabic ARABIC LETTER TEH WITH THREE DOTS ABOVE DOWNWARDS     set 16    
89 067E پ Arabic ARABIC LETTER PEH        
90 067F ٿ Arabic ARABIC LETTER TEHEH        
91 0680 ڀ Arabic ARABIC LETTER BEHEH        
92 0681 ځ Arabic ARABIC LETTER HAH WITH HAMZA ABOVE        
93 0683 ڃ Arabic ARABIC LETTER NYEH     set 20    
94 0684 ڄ Arabic ARABIC LETTER DYEH     set 20    
95 0685 څ Arabic ARABIC LETTER HAH WITH THREE DOTS ABOVE        
96 0686 چ Arabic ARABIC LETTER TCHEH        
97 0687 ڇ Arabic ARABIC LETTER TCHEHEH        
98 0688 ڈ Arabic ARABIC LETTER DDAL        
99 0689 ډ Arabic ARABIC LETTER DAL WITH RING        
100 068A ڊ Arabic ARABIC LETTER DAL WITH DOT BELOW        
101 068B ڋ Arabic ARABIC LETTER DAL WITH DOT BELOW AND SMALL TAH        
102 068C ڌ Arabic ARABIC LETTER DAHAL        
103 068D ڍ Arabic ARABIC LETTER DDAHAL        
104 068F ڏ Arabic ARABIC LETTER DAL WITH THREE DOTS ABOVE DOWNWARDS        
105 0691 ڑ Arabic ARABIC LETTER RREH        
106 0693 ړ Arabic ARABIC LETTER REH WITH RING        
107 0696 ږ Arabic ARABIC LETTER REH WITH DOT BELOW AND DOT ABOVE        
108 0698 ژ Arabic ARABIC LETTER JEH          
109 0699 ڙ Arabic ARABIC LETTER REH WITH FOUR DOTS ABOVE          
110 069A ښ Arabic ARABIC LETTER SEEN WITH DOT BELOW AND DOT ABOVE          
111 06A6 ڦ Arabic ARABIC LETTER PEHEH        
112 06A9 ک Arabic ARABIC LETTER KEHEH     set 17    
113 06AA ڪ Arabic ARABIC LETTER SWASH KAF     set 17    
114 06AB ګ Arabic ARABIC LETTER KAF WITH RING     set 21    
115 06AF گ Arabic ARABIC LETTER GAF     set 21    
116 06B1 ڱ Arabic ARABIC LETTER NGOEH        
117 06B3 ڳ Arabic ARABIC LETTER GUEH        
118 06B7 ڷ Arabic ARABIC LETTER LAM WITH THREE DOTS ABOVE        
119 06BA ں Arabic ARABIC LETTER NOON GHUNNA     set 18    
120 06BB ڻ Arabic ARABIC LETTER RNOON     set 19    
121 06BC ڼ Arabic ARABIC LETTER NOON WITH RING        
122 06BE ھ Arabic ARABIC LETTER HEH DOACHASHMEE     set 14    
123 06C1 ہ Arabic ARABIC LETTER HEH GOAL     set 14    
124 06C2 ۂ Arabic ARABIC LETTER HEH GOAL WITH HAMZA ABOVE     set 14    
125 06C3 ۃ Arabic ARABIC LETTER TEH MARBUTA GOAL     set 14    
126 06CC ی Arabic ARABIC LETTER FARSI YEH     set 13    
127 06CD ۍ Arabic ARABIC LETTER YEH WITH TAIL     set 13    
128 06D0 ې Arabic ARABIC LETTER E     set 13    
129 06D2 ے Arabic ARABIC LETTER YEH BARREE     set 13    
130 06D3 ۓ Arabic ARABIC LETTER YEH BARREE WITH HAMZA ABOVE     set 13    
131 06F0 ۰ Arabic EXTENDED ARABIC-INDIC DIGIT ZERO extended-arabic-indic-digits not: leading-digit set 1    
132 06F1 ۱ Arabic EXTENDED ARABIC-INDIC DIGIT ONE extended-arabic-indic-digits not: leading-digit set 2    
133 06F2 ۲ Arabic EXTENDED ARABIC-INDIC DIGIT TWO extended-arabic-indic-digits not: leading-digit set 3    
134 06F3 ۳ Arabic EXTENDED ARABIC-INDIC DIGIT THREE extended-arabic-indic-digits not: leading-digit set 4    
135 06F4 ۴ Arabic EXTENDED ARABIC-INDIC DIGIT FOUR extended-arabic-indic-digits not: leading-digit set 5    
136 06F5 ۵ Arabic EXTENDED ARABIC-INDIC DIGIT FIVE extended-arabic-indic-digits not: leading-digit set 6    
137 06F6 ۶ Arabic EXTENDED ARABIC-INDIC DIGIT SIX extended-arabic-indic-digits not: leading-digit set 7    
138 06F7 ۷ Arabic EXTENDED ARABIC-INDIC DIGIT SEVEN extended-arabic-indic-digits not: leading-digit set 8    
139 06F8 ۸ Arabic EXTENDED ARABIC-INDIC DIGIT EIGHT extended-arabic-indic-digits not: leading-digit set 9    
140 06F9 ۹ Arabic EXTENDED ARABIC-INDIC DIGIT NINE extended-arabic-indic-digits not: leading-digit set 10    
141 06FB ۻ Arabic ARABIC LETTER DAD WITH DOT BELOW        
142 06FD ۽ Arabic ARABIC SIGN SINDHI AMPERSAND        
143 06FE ۾ Arabic ARABIC SIGN SINDHI POSTPOSITION MEN        
144 075C ݜ Arabic ARABIC LETTER SEEN WITH FOUR DOTS ABOVE        
145 0763 ݣ Arabic ARABIC LETTER KEHEH WITH THREE DOTS ABOVE     set 21    
146 0768 ݨ Arabic ARABIC LETTER NOON WITH SMALL TAH        
147 076A ݪ Arabic ARABIC LETTER LAM WITH BAR        
148 076B ݫ Arabic ARABIC LETTER REH WITH TWO DOTS VERTICALLY ABOVE        
149 076D ݭ Arabic ARABIC LETTER SEEN WITH TWO DOTS VERTICALLY ABOVE        
150 076E ݮ Arabic ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH BELOW        
151 076F ݯ Arabic ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH AND TWO DOTS        
152 0770 ݰ Arabic ARABIC LETTER SEEN WITH SMALL ARABIC LETTER TAH AND TWO DOTS        
153 0771 ݱ Arabic ARABIC LETTER REH WITH SMALL ARABIC LETTER TAH AND TWO DOTS        
154 0772 ݲ Arabic ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH ABOVE        
155 0773 ݳ Arabic ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE     set 11    
156 0774 ݴ Arabic ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE     set 11    
157 0775 ݵ Arabic ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE     set 13    
158 0776 ݶ Arabic ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE        
159 0777 ݷ Arabic ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW     set 13    
160 0778 ݸ Arabic ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE     set 12    
161 0779 ݹ Arabic ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE     set 12    
162 077A ݺ Arabic ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE     set 13    
163 077B ݻ Arabic ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE     set 13    
164 077C ݼ Arabic ARABIC LETTER HAH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW        
165 077D ݽ Arabic ARABIC LETTER SEEN WITH EXTENDED ARABIC-INDIC DIGIT FOUR ABOVE        

Legend

Code Point
A code point or code point sequence.
Name
Shows the character or sequence name from the Unicode Character Database.
Glyph
The shape displayed depends on the fonts available to your browser.
Script
Shows the script property value from the Unicode Character Database. Combining marks may have the value Inherited and code points used with more than one script may have the value Common.
References
Links to the references associated with the code point or sequence, if any.
Tags
LGR-defined tag values. Any tags matching the Unicode script property are suppressed in this view.
Required Context
Link to the rule defining the required context a code point or sequence must satisfy. If prefixed by "not:", identifies a context that must not occur.
Variants
A link to the variant set the code point or sequence is a member of, except where a code point or sequence maps only to itself, in which case the type of that mapping is listed.
Comment
If the comment in this row consists only of the code point or sequence name it is suppressed in this view.
✔ - core repertoire
A check mark in the Part-of-Repertoire column indicates a code point is part of the core repertoire.
◯ - extended repertoire
An open circle indicates a code point is part of an optional extended repertoire, which is normally disabled but could be supported by deleting the relevant context restriction.
✗ - excluded from repertoire
A code point shown with ✗ is considered excluded from the repertoire. It is shown only for review purposes.
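
As a minimal sketch of how the repertoire could be applied, the Python fragment below transcribes the code points of the table above as inclusive ranges and reports any code point of a label that falls outside them. The names REPERTOIRE_RANGES and outside_repertoire are hypothetical and not part of the LGR; the XML file remains the normative source for the repertoire.

```python
# Inclusive code point ranges transcribed from the repertoire table above.
REPERTOIRE_RANGES = [
    (0x002D, 0x002D), (0x0030, 0x0039), (0x0061, 0x007A),
    (0x0621, 0x063A), (0x0641, 0x064A), (0x0660, 0x0669),
    (0x0679, 0x0681), (0x0683, 0x068D), (0x068F, 0x068F),
    (0x0691, 0x0691), (0x0693, 0x0693), (0x0696, 0x0696),
    (0x0698, 0x069A), (0x06A6, 0x06A6), (0x06A9, 0x06AB),
    (0x06AF, 0x06AF), (0x06B1, 0x06B1), (0x06B3, 0x06B3),
    (0x06B7, 0x06B7), (0x06BA, 0x06BC), (0x06BE, 0x06BE),
    (0x06C1, 0x06C3), (0x06CC, 0x06CD), (0x06D0, 0x06D0),
    (0x06D2, 0x06D3), (0x06F0, 0x06F9), (0x06FB, 0x06FB),
    (0x06FD, 0x06FE), (0x075C, 0x075C), (0x0763, 0x0763),
    (0x0768, 0x0768), (0x076A, 0x076B), (0x076D, 0x077D),
]

# Expand the ranges into a set for constant-time membership tests.
REPERTOIRE = {cp for lo, hi in REPERTOIRE_RANGES for cp in range(lo, hi + 1)}


def outside_repertoire(label):
    """Return the code points of `label` that are not in the repertoire."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) not in REPERTOIRE]


# Size of the transcribed repertoire.
print(len(REPERTOIRE))
# PEH ALEF KEHEH SEEN TEH ALEF NOON: every code point is in the table.
print(outside_repertoire("\u067E\u0627\u06A9\u0633\u062A\u0627\u0646"))
# U+200C ZERO WIDTH NON-JOINER is not part of the repertoire.
print(outside_repertoire("a\u200Cb"))
```

This check covers repertoire membership only; context rules and whole label evaluation rules are separate steps.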

Variant Sets

Summary

Number of variant sets 21
Largest variant set 13
Ordinary Variants by Type allocatable (28), blocked (282)
Reflexive Variants by Type  

The following tables list each pair of variant mappings on one row. For each pair of code points, by convention, the lower code point is taken as the source of the mapping in the forward → direction and the reverse direction ← is not listed separately. The variant mappings defined in an LGR are required to be symmetric, that is, both the forward and reverse mappings must be specified.

A mapping where source and target are the same is reflexive. Variant sets consisting of only a single reflexive mapping are not shown as a set. Instead, the variant type of the mapping is listed in the Variants column of the Repertoire by Code Point table. Reflexive mappings that are part of a larger set are indicated with a “≡”.

Where the types of the forward and reverse mappings are the same, a single value is given in the Type(s) column; otherwise the types for the forward and reverse mappings are given in that order, as indicated by the arrows. The same applies to any comments.

In a properly specified LGR, all members of each variant set are variants of each other, a property called transitivity. Because of that, all variant sets are necessarily disjoint. In each set, shading is used to group mappings from the same source code point or sequence.
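
The row convention described above can be sketched in Python as follows. This is an illustration only: the helper names variant_rows and is_symmetric are hypothetical, and the sketch works on bare code points without variant types.

```python
from itertools import combinations


def variant_rows(members):
    """List each unordered pair once, with the lower code point as source."""
    return list(combinations(sorted(set(members)), 2))


def is_symmetric(mappings):
    """Check that every (source, target) mapping also has its reverse."""
    return all((target, source) in mappings for source, target in mappings)


# Variant Set 1 in this document: three forms of the digit zero.
set_1 = [0x0030, 0x0660, 0x06F0]
rows = variant_rows(set_1)
for source, target in rows:
    print(f"{source:04X} -> {target:04X}")

# Expanding each row into both directions yields a symmetric mapping set.
full = set(rows) | {(target, source) for source, target in rows}
print(len(rows), len(full), is_symmetric(full))
```

A set with n members produces n(n-1)/2 rows under this convention, so the 13-member Variant Set 13 is listed in 78 rows.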

Variant Set 1 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0030 0 0660 ٠ blocked    
2 0030 0 06F0 ۰ blocked    
3 0660 ٠ 06F0 ۰ blocked    

Variant Set 2 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0031 1 0661 ١ blocked    
2 0031 1 06F1 ۱ blocked    
3 0661 ١ 06F1 ۱ blocked    

Variant Set 3 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0032 2 0662 ٢ blocked    
2 0032 2 06F2 ۲ blocked    
3 0662 ٢ 06F2 ۲ blocked    

Variant Set 4 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0033 3 0663 ٣ blocked    
2 0033 3 06F3 ۳ blocked    
3 0663 ٣ 06F3 ۳ blocked    

Variant Set 5 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0034 4 0664 ٤ blocked    
2 0034 4 06F4 ۴ blocked    
3 0664 ٤ 06F4 ۴ blocked    

Variant Set 6 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0035 5 0665 ٥ blocked    
2 0035 5 06F5 ۵ blocked    
3 0665 ٥ 06F5 ۵ blocked    

Variant Set 7 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0036 6 0666 ٦ blocked    
2 0036 6 06F6 ۶ blocked    
3 0666 ٦ 06F6 ۶ blocked    

Variant Set 8 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0037 7 0667 ٧ blocked    
2 0037 7 06F7 ۷ blocked    
3 0667 ٧ 06F7 ۷ blocked    

Variant Set 9 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0038 8 0668 ٨ blocked    
2 0038 8 06F8 ۸ blocked    
3 0668 ٨ 06F8 ۸ blocked    

Variant Set 10 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0039 9 0669 ٩ blocked    
2 0039 9 06F9 ۹ blocked    
3 0669 ٩ 06F9 ۹ blocked    

Variant Set 11 — 6 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0622 آ 0623 أ blocked    
2 0622 آ 0625 إ blocked    
3 0622 آ 0627 ا → allocatable, ← blocked
4 0622 آ 0773 ݳ blocked    
5 0622 آ 0774 ݴ blocked    
6 0623 أ 0625 إ blocked    
7 0623 أ 0627 ا → allocatable, ← blocked
8 0623 أ 0773 ݳ blocked    
9 0623 أ 0774 ݴ blocked    
10 0625 إ 0627 ا → allocatable, ← blocked
11 0625 إ 0773 ݳ blocked    
12 0625 إ 0774 ݴ blocked    
13 0627 ا 0773 ݳ → blocked, ← allocatable
14 0627 ا 0774 ݴ → blocked, ← allocatable
15 0773 ݳ 0774 ݴ blocked    

Variant Set 12 — 4 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0624 ؤ 0648 و → allocatable, ← blocked
2 0624 ؤ 0778 ݸ blocked    
3 0624 ؤ 0779 ݹ blocked    
4 0648 و 0778 ݸ → blocked, ← allocatable
5 0648 و 0779 ݹ → blocked, ← allocatable
6 0778 ݸ 0779 ݹ blocked    

Variant Set 13 — 13 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0626 ئ 0649 ى blocked    
2 0626 ئ 064A ي blocked    
3 0626 ئ 067B ٻ blocked    
4 0626 ئ 06CC ی blocked    
5 0626 ئ 06CD ۍ blocked    
6 0626 ئ 06D0 ې blocked    
7 0626 ئ 06D2 ے blocked    
8 0626 ئ 06D3 ۓ blocked    
9 0626 ئ 0775 ݵ blocked    
10 0626 ئ 0777 ݷ blocked    
11 0626 ئ 077A ݺ blocked    
12 0626 ئ 077B ݻ blocked    
13 0649 ى 064A ي blocked    
14 0649 ى 067B ٻ blocked    
15 0649 ى 06CC ی blocked    
16 0649 ى 06CD ۍ blocked    
17 0649 ى 06D0 ې blocked    
18 0649 ى 06D2 ے blocked    
19 0649 ى 06D3 ۓ blocked    
20 0649 ى 0775 ݵ blocked    
21 0649 ى 0777 ݷ blocked    
22 0649 ى 077A ݺ blocked    
23 0649 ى 077B ݻ blocked    
24 064A ي 067B ٻ blocked    
25 064A ي 06CC ی allocatable    
26 064A ي 06CD ۍ blocked    
27 064A ي 06D0 ې blocked    
28 064A ي 06D2 ے blocked    
29 064A ي 06D3 ۓ blocked    
30 064A ي 0775 ݵ blocked    
31 064A ي 0777 ݷ blocked    
32 064A ي 077A ݺ blocked    
33 064A ي 077B ݻ blocked    
34 067B ٻ 06CC ی blocked    
35 067B ٻ 06CD ۍ blocked    
36 067B ٻ 06D0 ې blocked    
37 067B ٻ 06D2 ے blocked    
38 067B ٻ 06D3 ۓ blocked    
39 067B ٻ 0775 ݵ blocked    
40 067B ٻ 0777 ݷ blocked    
41 067B ٻ 077A ݺ blocked    
42 067B ٻ 077B ݻ blocked    
43 06CC ی 06CD ۍ blocked    
44 06CC ی 06D0 ې blocked    
45 06CC ی 06D2 ے blocked    
46 06CC ی 06D3 ۓ blocked    
47 06CC ی 0775 ݵ blocked    
48 06CC ی 0777 ݷ blocked    
49 06CC ی 077A ݺ blocked    
50 06CC ی 077B ݻ blocked    
51 06CD ۍ 06D0 ې blocked    
52 06CD ۍ 06D2 ے blocked    
53 06CD ۍ 06D3 ۓ blocked    
54 06CD ۍ 0775 ݵ blocked    
55 06CD ۍ 0777 ݷ blocked    
56 06CD ۍ 077A ݺ blocked    
57 06CD ۍ 077B ݻ blocked    
58 06D0 ې 06D2 ے blocked    
59 06D0 ې 06D3 ۓ blocked    
60 06D0 ې 0775 ݵ blocked    
61 06D0 ې 0777 ݷ blocked    
62 06D0 ې 077A ݺ blocked    
63 06D0 ې 077B ݻ blocked    
64 06D2 ے 06D3 ۓ → blocked, ← allocatable
65 06D2 ے 0775 ݵ blocked    
66 06D2 ے 0777 ݷ blocked    
67 06D2 ے 077A ݺ → blocked, ← allocatable
68 06D2 ے 077B ݻ → blocked, ← allocatable
69 06D3 ۓ 0775 ݵ blocked    
70 06D3 ۓ 0777 ݷ blocked    
71 06D3 ۓ 077A ݺ blocked    
72 06D3 ۓ 077B ݻ blocked    
73 0775 ݵ 0777 ݷ blocked    
74 0775 ݵ 077A ݺ blocked    
75 0775 ݵ 077B ݻ blocked    
76 0777 ݷ 077A ݺ blocked    
77 0777 ݷ 077B ݻ blocked    
78 077A ݺ 077B ݻ blocked    

Variant Set 14 — 6 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0629 ة 0647 ه → allocatable, ← blocked
2 0629 ة 06BE ھ blocked    
3 0629 ة 06C1 ہ blocked    
4 0629 ة 06C2 ۂ blocked    
5 0629 ة 06C3 ۃ allocatable    
6 0647 ه 06BE ھ blocked    
7 0647 ه 06C1 ہ allocatable    
8 0647 ه 06C2 ۂ blocked    
9 0647 ه 06C3 ۃ blocked    
10 06BE ھ 06C1 ہ blocked    
11 06BE ھ 06C2 ۂ blocked    
12 06BE ھ 06C3 ۃ blocked    
13 06C1 ہ 06C2 ۂ → blocked, ← allocatable
14 06C1 ہ 06C3 ۃ → blocked, ← allocatable
15 06C2 ۂ 06C3 ۃ blocked    

Variant Set 15 — 2 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 062A ت 067A ٺ blocked    

Variant Set 16 — 2 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 062B ث 067D ٽ blocked    

Variant Set 17 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0643 ك 06A9 ک allocatable    
2 0643 ك 06AA ڪ allocatable    
3 06A9 ک 06AA ڪ allocatable    

Variant Set 18 — 2 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0646 ن 06BA ں allocatable    

Variant Set 19 — 2 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0679 ٹ 06BB ڻ blocked    

Variant Set 20 — 2 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 0683 ڃ 0684 ڄ blocked    

Variant Set 21 — 3 Members

# Source Glyph Target Glyph   Type(s) Ref Comment
1 06AB ګ 06AF گ blocked    
2 06AB ګ 0763 ݣ blocked    
3 06AF گ 0763 ݣ blocked    
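
To illustrate how the variant sets above give rise to variant labels, the following Python sketch enumerates every label obtained by substituting members of the same variant set at each position. It is a simplified illustration under stated assumptions: only Variant Sets 17 and 18 are transcribed, the names VARIANT_SETS, alternatives and candidate_variants are hypothetical, and neither the variant types nor the whole label evaluation rules are applied, so no disposition is computed here.

```python
from itertools import product

# Two variant sets transcribed from the tables above (members only;
# the allocatable/blocked types are deliberately left out of this sketch).
VARIANT_SETS = [
    frozenset({0x0643, 0x06A9, 0x06AA}),  # Variant Set 17
    frozenset({0x0646, 0x06BA}),          # Variant Set 18
]


def alternatives(cp):
    """Return the sorted members of the variant set containing cp, or [cp]."""
    for members in VARIANT_SETS:
        if cp in members:
            return sorted(members)
    return [cp]


def candidate_variants(label):
    """Enumerate every label obtained by swapping in variant code points."""
    choices = [alternatives(ord(ch)) for ch in label]
    labels = ("".join(chr(cp) for cp in combo) for combo in product(*choices))
    return [candidate for candidate in labels if candidate != label]


label = "\u06A9\u0646"  # KEHEH followed by NOON
for variant in candidate_variants(label):
    print(" ".join(f"{ord(ch):04X}" for ch in variant))
```

For this two-character label the sketch yields 3 × 2 − 1 = 5 candidates, because the original label itself is left out of the result.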

Classes, Rules and Actions

Character Classes

The following table lists all top-level classes with their definition and a list of their members intersected with the current repertoire.

Name Definition Count Members or Ranges Ref Comment
implicit Tag=ascii-digits 10 Elements: { 0030 0031 0032 0033 0034 0035 0036 0037 0038 0039 }    
implicit Tag=arabic-indic-digits 10 Elements: { 0660 0661 0662 0663 0664 0665 0666 0667 0668 0669 }    
implicit Tag=extended-arabic-indic-digits 10 Elements: { 06F0 06F1 06F2 06F3 06F4 06F5 06F6 06F7 06F8 06F9 }    

Legend

Members or Ranges
Lists the members of the class as code points (xxx) or ranges of code points (xxx-yyy). Any class too numerous to list in full is elided with "...".
Tag=ttt
A named class is defined by all code points that share the given tag value (ttt).
Implicit
An anonymous class that is implicitly defined based on a tag value.
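
As a sketch of how a tag value implicitly defines a class, the Python fragment below groups code points by their tags. The mapping TAGS is a hand-built excerpt of the repertoire, not a parsed copy of the XML, and the name implicit_classes is hypothetical.

```python
from collections import defaultdict

# Excerpt of the repertoire: code point -> set of LGR tags.
TAGS = {}
for base, tag in (
    (0x0030, "ascii-digits"),
    (0x0660, "arabic-indic-digits"),
    (0x06F0, "extended-arabic-indic-digits"),
):
    for offset in range(10):
        TAGS[base + offset] = {tag}
TAGS[0x0061] = set()  # LATIN SMALL LETTER A carries no tag


def implicit_classes(tags_by_cp):
    """Group code points into one class per tag value."""
    classes = defaultdict(set)
    for cp, tags in tags_by_cp.items():
        for tag in tags:
            classes[tag].add(cp)
    return dict(classes)


for name, members in implicit_classes(TAGS).items():
    print(name, len(members), " ".join(f"{cp:04X}" for cp in sorted(members)))
```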

Whole label evaluation and context rules

The following table lists all the top-level, or named, rules defined in the LGR and indicates whether they are used as a trigger in an action or as context (when or not-when) for a code point. (Any use of context rules for variants is not indicated.)

Name Regular Expression Used as Trigger Used as Context Anchor Ref Comment
leading-combining-mark (start ((class property:gc:Mn) or (class property:gc:Mc)))   [120] RFC5891 restrictions on placement of combining marks
hyphen-minus-disallowed (choice [start (anchor)][(anchor) end][start (any)(any)(002D)(anchor)])  [120] RFC5891 restrictions on placement of U+002D
leading-digit (start (anchor))  [121] RFC5893 RTL labels cannot start with a digit
mixed-digits (choice [(ascii-digits)(0+ any)(arabic-indic-digits)(0+ any)(extended-arabic-indic-digits)][(ascii-digits)(0+ any)(extended-arabic-indic-digits)(0+ any)(arabic-indic-digits)][(arabic-indic-digits)(0+ any)(ascii-digits)(0+ any)(extended-arabic-indic-digits)][(arabic-indic-digits)(0+ any)(extended-arabic-indic-digits)(0+ any)(ascii-digits)][(extended-arabic-indic-digits)(0+ any)(arabic-indic-digits)(0+ any)(ascii-digits)][(extended-arabic-indic-digits)(0+ any)(ascii-digits)(0+ any)(arabic-indic-digits)])   [121] RFC5893 RTL labels with a mix of European, Arabic-Indic and Extended Arabic-Indic digits are invalid
precedes-right-joining (anchor)((class property:jt:D) or (class property:jt:R))  must precede a code point joining on the right
no-mix-teh-marbuta-goal (choice [(0629)(0+ any)(06C3)][(06C3)(0+ any)(0629)])   None
no-mix-kaf-keheh (choice [(0643)(0+ any)(06A9)][(06A9)(0+ any)(0643)])   [112] do not mix Arabic letters KAF and KEHEH in the same label
no-mix-kaf-swash (choice [(0643)(0+ any)(06AA)][(06AA)(0+ any)(0643)])   None
no-mix-heh-doachashmee (choice [(0647)(0+ any)(06BE)][(06BE)(0+ any)(0647)])   None
no-mix-heh-goal (choice [(0647)(0+ any)(06C1)][(06C1)(0+ any)(0647)])   None
no-mix-alef-maksura-farsi-yeh (choice [(0649)(0+ any)(06CC)][(06CC)(0+ any)(0649)])   [112] do not mix Arabic letters ALEF MAKSURA and FARSI YEH in the same label
no-mix-kaf-with-ring-gaf (choice [(06AB)(0+ any)(06AF)][(06AF)(0+ any)(06AB)])   None
no-mix-kaf-with-ring-keheh-with-three-dots-above (choice [(06AB)(0+ any)(0763)][(0763)(0+ any)(06AB)])   None
no-mix-gaf-keheh-with-three-dots-above (choice [(06AF)(0+ any)(0763)][(0763)(0+ any)(06AF)])   [112] do not mix Arabic letters GAF and KEHEH WITH THREE DOTS ABOVE in the same label

Legend

Regular Expression
A regular expression equivalent to the rule, shown in the standard notation with some extensions as noted.
[] - a choice
When there are various choices in a rule, each choice is represented by a set enclosed in square brackets.
[∩,−,Δ,∪] - set operators
Sets may be combined by set operators (∩ = intersection, − = difference, Δ = symmetric difference and ∪ = union).
()= - empty set
Indicates that the following set is empty because of the result of set operations, or because none of its elements are part of the repertoire defined here. An empty set that is not optional means that a rule can never match.

Note: The terminology used in the regular expressions follows RFC7940.
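
The regular expressions in the table above can be read as simple predicates. The Python sketch below is one possible reading of four of them, written against the expressions as printed in the table; the function names are hypothetical and the code is not derived from the normative XML. In the two context rules, the index i stands for the position of the anchor.

```python
ASCII_DIGITS = frozenset(range(0x0030, 0x003A))
ARABIC_INDIC_DIGITS = frozenset(range(0x0660, 0x066A))
EXTENDED_ARABIC_INDIC_DIGITS = frozenset(range(0x06F0, 0x06FA))
ALL_DIGITS = ASCII_DIGITS | ARABIC_INDIC_DIGITS | EXTENDED_ARABIC_INDIC_DIGITS


def matches_no_mix(label, cp_a, cp_b):
    """(choice [(a)(0+ any)(b)][(b)(0+ any)(a)]): both occur in the label."""
    cps = {ord(ch) for ch in label}
    return cp_a in cps and cp_b in cps


def matches_mixed_digits(label):
    """Each alternative of mixed-digits names all three digit classes,
    in some order, so a match needs one digit from every class."""
    cps = {ord(ch) for ch in label}
    return all(cps & cls for cls in (
        ASCII_DIGITS, ARABIC_INDIC_DIGITS, EXTENDED_ARABIC_INDIC_DIGITS))


def hyphen_minus_disallowed(label, i):
    """Context check for the U+002D at index i (the anchor)."""
    return (
        i == 0                            # [start (anchor)]
        or i == len(label) - 1            # [(anchor) end]
        or (i == 3 and label[2] == "-")   # [start (any)(any)(002D)(anchor)]
    )


def leading_digit(label, i):
    """Context check for a digit at index i: (start (anchor))."""
    return i == 0 and ord(label[i]) in ALL_DIGITS


# KAF and KEHEH in the same label trigger no-mix-kaf-keheh.
print(matches_no_mix("\u0643\u0628\u06A9", 0x0643, 0x06A9))
# One digit from each of the three classes triggers mixed-digits.
print(matches_mixed_digits("a1\u0661\u06F1"))
# A hyphen in the fourth position directly after another hyphen.
print(hyphen_minus_disallowed("ab--c", 3))
```

Note that, read this way, a label containing digits from only two of the three classes does not match mixed-digits, because every alternative printed in the table names all three classes.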

Actions

The following table lists the actions that are used to assign dispositions to labels and variant labels, based on the specified conditions. The order of actions defines their precedence: the first action triggered by a label is the one defining its disposition.

# Condition Rule / Variant Set   Disposition Ref Comment
1 if label matches leading-combining-mark invalid   labels with leading combining marks are invalid
2 if label matches mixed-digits invalid   RTL labels with a mix of European, Arabic-Indic and Extended Arabic-Indic digits are invalid
3 if label matches no-mix-teh-marbuta-goal invalid   None
4 if label matches no-mix-kaf-keheh invalid   do not mix Arabic letters KAF and KEHEH in the same label
5 if label matches no-mix-kaf-swash invalid   None
6 if label matches no-mix-heh-doachashmee invalid   None
7 if label matches no-mix-heh-goal invalid   None
8 if label matches no-mix-alef-maksura-farsi-yeh invalid   do not mix Arabic letters ALEF MAKSURA and FARSI YEH in the same label
9 if label matches no-mix-kaf-with-ring-gaf invalid   None
10 if label matches no-mix-kaf-with-ring-keheh-with-three-dots-above invalid   None
11 if label matches no-mix-gaf-keheh-with-three-dots-above invalid   do not mix Arabic letters GAF and KEHEH WITH THREE DOTS ABOVE in the same label
12 if at least one variant is in {blocked} blocked   default action
13 if at least one variant is in {allocatable} allocatable   default action
14 if any label (catch-all) valid   catch all (default action)

Legend

{...} - variant type set
In the "Rule/Variant Set" column the notation {...} means a set of variant types.
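
The precedence described for the actions table can be sketched as a first-match evaluation. In the Python fragment below, the function disposition and its two parameters are hypothetical: matched_rules stands for the names of the whole label evaluation rules a label matches, and variant_types for the variant types used to derive a variant label. How those inputs are computed is outside this sketch.

```python
# Rule names in the order of actions 1 to 11 in the table above.
INVALIDATING_RULES = (
    "leading-combining-mark",
    "mixed-digits",
    "no-mix-teh-marbuta-goal",
    "no-mix-kaf-keheh",
    "no-mix-kaf-swash",
    "no-mix-heh-doachashmee",
    "no-mix-heh-goal",
    "no-mix-alef-maksura-farsi-yeh",
    "no-mix-kaf-with-ring-gaf",
    "no-mix-kaf-with-ring-keheh-with-three-dots-above",
    "no-mix-gaf-keheh-with-three-dots-above",
)


def disposition(matched_rules, variant_types):
    """Return the disposition set by the first action that triggers."""
    for rule in INVALIDATING_RULES:       # actions 1 to 11
        if rule in matched_rules:
            return "invalid"
    if "blocked" in variant_types:        # action 12
        return "blocked"
    if "allocatable" in variant_types:    # action 13
        return "allocatable"
    return "valid"                        # action 14, catch-all


print(disposition(set(), set()))
print(disposition({"mixed-digits"}, {"blocked"}))
print(disposition(set(), {"blocked", "allocatable"}))
```

Because the first triggered action wins, a variant label that uses at least one blocked mapping is blocked even if its other mappings are allocatable.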

Table of References

[0] The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5)
Any code point cited was originally encoded in Unicode Version 1.1
[1] The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5)
Any code point cited was originally encoded in Unicode Version 2.0
[5] The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5)
Any code point cited was originally encoded in Unicode Version 3.2
[100] Internetstiftelsen i Sverige (IIS), Arabic https://github.com/dotse/IDN-ref-tables/blob/master/language-tables/arabic-lang-ref-table.txt
[107] MSR-2 Maximum Starting Repertoire https://www.icann.org/en/system/files/files/msr-2-overview-14apr15-en.pdf
Code points cited are obsolete
[108] Arabic VIP Report, page 5 http://archive.icann.org/en/topics/new-gtlds/arabic-vip-issues-report-07oct11-en.pdf
Code points cited are combining vowel marks
[110] TF-AIDN, "Proposal for Arabic Script Root Zone LGR", Version 3.4, 18 November 2015 https://www.icann.org/en/system/files/files/arabic-lgr-proposal-18nov15-en.pdf
Code points cited are used in Maghreb
[111] TF-AIDN, "Proposal for Arabic Script Root Zone LGR", Version 3.4, 18 November 2015 https://www.icann.org/en/system/files/files/arabic-lgr-proposal-18nov15-en.pdf
Code points cited as having no evidence for active use
[112] Sections 4 and 5 of TF-AIDN, "Proposal for Arabic Script Root Zone LGR", Version 3.4, 18 November 2015 https://www.icann.org/en/system/files/files/arabic-lgr-proposal-18nov15-en.pdf
Arabic variants and rules prohibiting the mixing of certain Arabic code points
[120] RFC5891, Internationalized Domain Names in Applications (IDNA): Protocol http://tools.ietf.org/html/rfc5891
[121] RFC5893, Right-to-Left Scripts for Internationalized Domain Names for Applications (IDNA) http://tools.ietf.org/html/rfc5893
[130] RFC5564, Linguistic Guidelines for the Use of the Arabic Language in Internet Domains http://tools.ietf.org/html/rfc5564
[201] Omniglot Arabic http://www.omniglot.com/writing/arabic.htm
[401] Minutes of Meeting by پاکستان. IDN ccTLD Language Table Sub-Committee formed by the Ministry of IT, Government of Pakistan - http://www.cle.org.pk/IDN/IDN2010/minutesofmeeting.pdf
IDN (Internationalized Domain Names) Language Table to be used by the Pakistan IDN ccTLD registry for the registration of languages spoken in Pakistan