HOW TO READ & WRITE ARABIC          94.8.4  TAKAHASHI Naoto

1. STARTING UP

1.1 INVOKING MULE

You must invoke Mule as an X client if you want to use Arabic.
Make sure that the environment variable DISPLAY is properly set.
So far only a 16-dot font is available for Arabic. First, set your
X resources appropriately, then invoke Mule from a shell window with
the following command to get enough line spacing.

        % mule -fsp 0+9


1.2 ENTERING AND LEAVING ARABIC-MODE

Hit C-] to enter arabic-mode. Whenever you are in arabic-mode,
you are also in visual-mode. Hitting C-] again takes you out of
arabic-mode, but you remain in visual-mode. Hitting C-] in
visual-mode brings you back into arabic-mode. You can exit
visual-mode by hitting C-c C-c. See the figure below.


                         C-c C-c
     +----------------------------------------------+
     |   +--------------------+                     |
     |   |      C-c C-c       |                     |
     V   V                    |                     |
+-------------+         +-----------+   C-]   +-----------+
|             |   C-]   |arabic-mode| ------> |           |
|initial state| ------> |    and    |         |visual-mode|
|             |         |visual-mode| <------ |           |
+-------------+         +-----------+   C-]   +-----------+


The string "Arabic L2R" or "Arabic R2L" in the mode-line means that
you are in both arabic-mode and visual-mode. If you see "L2R" or
"R2L" but not "Arabic" in the mode-line, you are in visual-mode but
not in arabic-mode.

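The transitions described in this section can be written down as a
small state table. The Python sketch below is only an illustrative
model, not code from Mule: the state labels and the helper names
press and mode_line are invented for this example, and the empty
mode-line string for the initial state is an assumption, since this
document does not say what is shown there.

```python
# Illustrative model of the mode transitions in section 1.2.
# The state labels are invented for this sketch; they are not Mule identifiers.
INITIAL = "initial state"
ARABIC = "arabic-mode and visual-mode"
VISUAL = "visual-mode"

# (current state, key) -> next state, as described in the text and figure.
TRANSITIONS = {
    (INITIAL, "C-]"): ARABIC,
    (ARABIC, "C-]"): VISUAL,
    (VISUAL, "C-]"): ARABIC,
    (ARABIC, "C-c C-c"): INITIAL,
    (VISUAL, "C-c C-c"): INITIAL,
}


def press(state, key):
    """Return the state after pressing key; unlisted keys leave it unchanged."""
    return TRANSITIONS.get((state, key), state)


def mode_line(state, display_direction=None):
    """Return the mode-line string for a state.

    display_direction follows section 1.3: None stands for nil (L2R),
    any true value stands for non-nil (R2L).
    """
    direction = "R2L" if display_direction else "L2R"
    if state == ARABIC:
        return "Arabic " + direction
    if state == VISUAL:
        return direction
    return ""  # assumption: nothing is shown in the initial state
```

For example, press(INITIAL, "C-]") gives the combined state, and a
second press(..., "C-]") leaves arabic-mode while staying in
visual-mode.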

1.3 DISPLAY DIRECTION

Each buffer in Mule has a buffer-local variable called
"display-direction". If this variable is set to nil (the
default), lines begin at the left edge of the screen. If
display-direction is non-nil, lines are aligned to the right and
text is written from right to left.

If you are in visual-mode, the value of display-direction is
reflected in the mode-line: if it is nil, "L2R" is displayed; if it
is non-nil, "R2L" is displayed. In visual-mode, you can set
display-direction to nil by typing 'C-c <', and to t by typing
'C-c >'.

If you read a file (C-x C-f) which has the extension ".l2r", the
buffer automatically enters visual-mode and display-direction is
set to nil. Likewise, if a file has the extension ".r2l", the
buffer automatically enters visual-mode and display-direction is
set to t.

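The file extension rule above can be summarized in a few lines of
Python. This is a sketch for illustration only, not Mule's own code:
the function name settings_for_file is hypothetical, None and True
stand in for nil and t, and matching the extension case-sensitively
is an assumption.

```python
import os


def settings_for_file(filename):
    """Return (visual_mode, display_direction) for a newly read file.

    None stands for nil and True for t. Files with any other extension
    get the defaults: no visual-mode, display-direction nil.
    """
    extension = os.path.splitext(filename)[1]
    if extension == ".l2r":   # assumption: compared case-sensitively
        return True, None
    if extension == ".r2l":
        return True, True
    return False, None
```

For example, settings_for_file("letter.r2l") returns (True, True),
i.e. visual-mode with right-aligned lines.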

2. EDITING ARABIC TEXT

2.1 INPUT

In arabic-mode, you can input Arabic characters and Arabic digits
from the keyboard. To input ASCII characters or ASCII digits, you
have to exit arabic-mode by hitting C-]. The translation table is
given below. When you are in arabic-mode, you can see the
keyboard layout by typing C-z.

Please note that this table is by no means fixed; it is just a
quick hack. Your suggestions on the Arabic keyboard layout will
be greatly appreciated.


        translate table in arabic-mode
        ------------------------------
        "       isolated hamza
        a~      madda above alif
        a'      hamza above alif
        w'      hamza above waaw
        a''     hamza below alif
        y'      hamza above yaa
        a       alif
        b       baa
        o       taa marbuTa
        t       taa
        c       thaa
        j       jiim
        H       Haa
        K       khaa
        d       daal
        x       dhaal
        r       raa
        z       zaay
        s       siin
        /       shiin
        S       Saad
        D       Daad
        T       Taa
        Z       Zaa
        `       ayn
        G       ghayn
        f       faa
        q       qaaf
        k       kaaf
        l       laam
        m       miim
        n       nuun
        h       haa
        w       waaw
        A       alif maqSura
        y       yaa
        C       chim (Farsi)
        g       gaaf (Farsi)
        p       paa (Farsi)
        X       zhaa (Farsi)
        _       make connection
        |       cut connection

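One way to read the table is as a lookup from key sequences to
letters, where a longer sequence such as a'' takes priority over its
prefixes a' and a. The Python sketch below applies that reading to a
typed string. It is only an illustration: the longest-match rule and
the names TABLE, KEYS and read_keys are assumptions made for this
example, and Mule itself may process keystrokes one at a time.

```python
# The translate table from this section: key sequence -> letter name.
TABLE = {
    '"': "isolated hamza",
    "a~": "madda above alif",
    "a'": "hamza above alif",
    "w'": "hamza above waaw",
    "a''": "hamza below alif",
    "y'": "hamza above yaa",
    "a": "alif", "b": "baa", "o": "taa marbuTa", "t": "taa",
    "c": "thaa", "j": "jiim", "H": "Haa", "K": "khaa",
    "d": "daal", "x": "dhaal", "r": "raa", "z": "zaay",
    "s": "siin", "/": "shiin", "S": "Saad", "D": "Daad",
    "T": "Taa", "Z": "Zaa", "`": "ayn", "G": "ghayn",
    "f": "faa", "q": "qaaf", "k": "kaaf", "l": "laam",
    "m": "miim", "n": "nuun", "h": "haa", "w": "waaw",
    "A": "alif maqSura", "y": "yaa",
    "C": "chim (Farsi)", "g": "gaaf (Farsi)",
    "p": "paa (Farsi)", "X": "zhaa (Farsi)",
    "_": "make connection", "|": "cut connection",
}

# Try longer key sequences first so that a'' wins over a' and a.
KEYS = sorted(TABLE, key=len, reverse=True)


def read_keys(typed):
    """Split a typed string into table entries and return their names."""
    names = []
    position = 0
    while position < len(typed):
        for key in KEYS:
            if typed.startswith(key, position):
                names.append(TABLE[key])
                position += len(key)
                break
        else:
            raise ValueError("no table entry for %r" % typed[position])
    return names
```

For example, read_keys("ktab") yields kaaf, taa, alif, baa, and
read_keys("a''b") yields hamza below alif followed by baa.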

The appropriate ligature is generated automatically whenever a
character is input. The special laam + alif ligature is generated
whenever an alif is input to the left of a laam. If you want to
cut the connection between two adjacent Arabic characters, type a
`|' (vertical bar) at that point in arabic-mode. A character
input right after a `|' produces a glyph that is not connected to
its right-hand neighbor. Typing a `_' (underscore) connects the
two characters at that point, if possible.

When display-direction is nil (i.e. lines are aligned to the left),
the cursor stays at the same position after an Arabic character is
inserted. It moves to the right after an Arabic digit or an ASCII
character is inserted.

When display-direction is non-nil (i.e. lines are aligned to the
right), the cursor moves to the left after an Arabic character is
inserted. It stays at the same position after an Arabic digit or
an ASCII character is inserted.

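The two cursor rules above form a small two-by-two table. The Python
sketch below restates them; it is an illustration rather than Mule
code, and the function name cursor_step and the two character-kind
labels are invented for this example.

```python
def cursor_step(display_direction, char_kind):
    """Return the cursor's column change after one character is inserted.

    display_direction: None for nil (left-aligned), true for non-nil.
    char_kind: "arabic" for an Arabic character, or "digit-or-ascii"
    for an Arabic digit or an ASCII character.
    Result: -1 = moves left, 0 = stays, +1 = moves right.
    """
    if char_kind not in ("arabic", "digit-or-ascii"):
        raise ValueError("unknown character kind: %r" % char_kind)
    if not display_direction:
        # Lines aligned to the left.
        return 0 if char_kind == "arabic" else 1
    # Lines aligned to the right.
    return -1 if char_kind == "arabic" else 0
```

In both settings the cursor stays put exactly when the character's
writing direction runs against the display direction.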

2.2 DELETION, KILL & YANK

Use C-d to delete the character under the cursor. If you are in
arabic-mode, the necessary ligature is regenerated after the
character is deleted.

The DEL key behaves differently depending on the value of
display-direction: if the value is nil (aligned to the left), it
deletes the character to the left of the cursor; if the value is
non-nil (aligned to the right), it deletes the character to the
right of the cursor. If the display direction and the direction
of the input characters are the same, the most recently input
character can be deleted with the DEL key, whatever the value of
display-direction.

M-d (arabic-kill-word), M-DEL (arabic-backward-kill-word), C-k
(arabic-kill-line) and C-w (arabic-kill-region) remove the
specified stretch of text and put it in the kill-ring. M-w
(arabic-copy-region-as-kill) also puts the specified stretch of
text in the kill-ring, but leaves the original text unchanged.

The strings in the kill-ring can be reinserted into the buffer with
C-y (arabic-yank) and M-y (arabic-yank-pop).

Make sure that you are in arabic-mode when you kill or yank
something; otherwise ligatures are not maintained or, at worst,
an unexpected region is deleted or a garbage string is inserted
into the buffer.


2.3 CURSOR MOTION

The following cursor motion commands are supplied in visual-mode
and in arabic-mode to handle bidirectional text easily. All of
these commands accept a numeric prefix argument.


key     command name                   function
-----------------------------------------------------------------
C-f     visual-forward-char            move the cursor visually
                                       forward by 1 character

C-b     visual-backward-char           move the cursor visually
                                       backward by 1 character

C-p     visual-previous-line           move the cursor up
                                       by 1 line

C-n     visual-next-line               move the cursor down
                                       by 1 line

C-a     visual-beginning-of-line       move the cursor to the
                                       visual beginning of line

C-e     visual-end-of-line             move the cursor to the
                                       visual end of line

M-f     visual-forward-word            move the cursor visually
                                       forward by 1 word

M-b     visual-backward-word           move the cursor visually
                                       backward by 1 word

M-<     visual-beginning-of-buffer     move the cursor to the
                                       visual beginning of buffer

M->     visual-end-of-buffer           move the cursor to the
                                       visual end of buffer


Note that the ordinary cursor motion commands (forward-char,
backward-char, etc.) follow the logical order of the text, whereas
the commands above follow the visual order. To see the
difference, compare how C-f behaves inside and outside
visual-mode. (You can exit visual-mode by typing "C-c C-c".)


2.4 LR COMMANDS

Some of you may be confused by the words "forward" and "backward".
Here is a summary:


             display-direction  display-direction
                  is nil           is non-nil
-------------------------------------------------
forward            right              left

backward           left               right

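The summary table amounts to a single rule: "forward" means right
when display-direction is nil and left otherwise, and "backward" is
the opposite. A minimal Python sketch of that rule follows; the
function name absolute_direction is invented for this example and
is not part of Mule.

```python
def absolute_direction(relative, display_direction):
    """Translate "forward"/"backward" into "left"/"right".

    display_direction: None for nil, any true value for non-nil.
    """
    if relative not in ("forward", "backward"):
        raise ValueError("expected 'forward' or 'backward'")
    # With display-direction nil, text runs left to right,
    # so moving forward means moving right.
    forward_is_right = not display_direction
    if relative == "forward":
        return "right" if forward_is_right else "left"
    return "left" if forward_is_right else "right"
```

For example, absolute_direction("forward", True) returns "left",
matching the right-hand column of the table.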

If you are using arrow keys to move the cursor, you may want to
move the cursor left or right no matter what display-direction is.
Likewise, you may want the cursor to go to the leftmost column
when you hit C-a, and to the rightmost column when you hit C-e.
In such cases, rewrite the key definitions in visual.el and
arabic.el to use the following commands. These commands are called
"LR commands" because they act according to the absolute direction
(left or right) rather than the relative direction (forward or
backward).


            ** LR commands in visual-mode **

command name                  function
--------------------------------------------------------
visual-move-to-left-char      move the cursor left
                              by one character

visual-move-to-right-char     move the cursor right
                              by one character

visual-move-to-left-word      move the cursor left
                              by one word

visual-move-to-right-word     move the cursor right
                              by one word

visual-left-end-of-line       move the cursor to the
                              leftmost column

visual-right-end-of-line      move the cursor to the
                              rightmost column

visual-delete-left-char       delete the character on
                              the left of visual point

visual-delete-right-char      delete the character on
                              the right of visual point

visual-kill-left-word         kill one word on the left
                              of visual point

visual-kill-right-word        kill one word on the right
                              of visual point


            ** LR commands in arabic-mode **

command name                   function
----------------------------------------------------------
arabic-delete-left-char        do visual-delete-left-char
                               and make Arabic ligatures

arabic-delete-right-char       do visual-delete-right-char
                               and make Arabic ligatures

arabic-kill-left-word          do visual-kill-left-word
                               and make Arabic ligatures

arabic-kill-right-word         do visual-kill-right-word
                               and make Arabic ligatures


3. HARDCOPY

You can use m2ps to get a hardcopy of a file which contains Arabic
characters. See m2ps.1 for details. Note that input files to m2ps
must be written in the *internal* coding system. To save the
contents of a buffer in that coding system, use the following
command in Mule:

        C-u C-x C-w _filename_ RET *internal* RET

Please note that the current version of m2ps does not support the
r2l printing direction (flushright mode). If you try to print a
file which was created under the r2l display direction, it will be
printed left-aligned. Furthermore, the word order may come out
wrong.

4. LIMITATIONS

There are many limitations in this release. We need your help.

4.1 NON-SPACING MARKS IN ARABIC

Only two non-spacing marks, madda and hamza, are available in this
release. Other marks, e.g. fatHa (short 'a'), Damma (short 'u'),
kasra (short 'i'), shadda (doubling sign), sukuun (no vowel sign)
and waSla (joining hamza), cannot be displayed. It seems that
short vowels and waSla are not necessary for writing ordinary
Arabic text, but shadda is often marked in printed Arabic. Please
let me know if shadda is really indispensable; in that case I will
try to implement it in some way.


4.2 FILE FORMAT

This package uses its own format (coding system) for file I/O.
You cannot read files saved in other formats, e.g. ISO 8859-6,
ISO 10646, UNICODE, ArabTeX, xaw, etc. As a matter of fact, I do
not know which format is most widely used to save Arabic text.
If you have texts saved in a particular format and would like to
edit them with Mule, please send me the documentation of your
format. I will try to implement file I/O routines for it.


4.3 MISCELLANEOUS LIMITATIONS

* Tab does not work if display-direction is non-nil.

* Transpose commands and rectangle commands do not work in most cases.


5. ADDRESS

Bug reports and comments should be sent to this mailing list
(mule@etl.go.jp) or directly to me (ntakahas@etl.go.jp).
Suggestions and requests of any kind are greatly appreciated.


                                TAKAHASHI Naoto
                                Electrotechnical Laboratory, Japan
                                ntakahas@etl.go.jp