[Egyptian] Vertical vs horizontal writing-mode

ishida at w3.org
Wed Jul 20 11:55:31 BST 2016


On 13/07/2016 23:16, Mark-Jan Nederhof wrote:
 > A belated thank you for the information. I now had some time to go
 > through the links that you sent in more detail. The issue of
 > directionality is not crucial to our proposal, and there was so much
 > confusion about the matter that it seemed best to not further discuss
 > it during the meeting. All the Egyptologists fully understood, because
 > the problem is unavoidable in everyday transcription of hieroglyphic
 > texts, but apart from the Egyptologists we couldn't get anyone to care
 > about this matter, so there may be no point in trying to keep it as
 > part of the proposal.
 >
 > This leaves two options: we omit mention of directionality from the
 > proposal altogether, or mention the matter in passing, with the
 > suggestion that we might revisit it at some future time. To determine
 > what is best, can I ask you the following? Do you know of some other
 > language/script in which the encoding of text itself is different
 > depending on whether the text direction is horizontal or vertical?
 > (To be clear, I don't care about ltr versus rtl. The central issue is
 > horizontal versus vertical.)
 >
 > While going through the material you sent, I got the impression that
 > the encoding is always the same, for CJK and other scripts. As I tried
 > to explain in the paper and in the presentation, the situation for
 > Ancient Egyptian is very different, because the signs tend to be
 > divided into groups ('quadrats') somewhat differently depending on
 > whether the text direction is horizontal or vertical, and the division
 > into groups is part of the encoding. In a way, one could say that a
 > certain encoding is only genuine for horizontal text, or for vertical
 > text, but not usually for both. Do you see the problem, and the reason
 > why we brought this up? Would you agree with me this problem does not
 > occur in CJK and similar scripts?

Let's refer to horizontal 'writing-mode' vs vertical 'writing-mode' to 
make the terminology clearer.

So, in general for CJK text one would expect to see the same sequence of 
code points for a text whether it is rendered horizontally or 
vertically.  However, there are some differences...

[a] there are certain characters that by convention are more likely to 
be found in one writing-mode than the other.  For example, vertical text 
usually uses corner brackets for quotation marks, whereas horizontal CJK 
text uses quotation marks. For examples, see bullet (b) under 
https://www.w3.org/TR/jlreq/#differences_in_vertical_and_horizontal_composition_in_use_of_punctuation_marks

So for good quality rendering of text, the choice of code point goes 
with the choice of writing-mode for a small number of characters, and 
you can't just flip between the two by switching the CSS.

Other times you may find full-width characters being used for Latin 
letters and digits in vertical text when they are not in horizontal. 
For example an acronym such as FIFA is likely to be rendered as 
non-rotated, full-width characters in vertical Japanese, but as ordinary 
proportionally-spaced characters in horizontal.
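
As a rough illustration of the two code point differences described 
under [a], here is a minimal Python sketch. The function names and the 
mapping table are made up for this example, and real content would need 
more care (single quotation marks, mixed-script runs, digits handled by 
other conventions, and so on).

```python
# Illustrative sketch only: names and mapping are assumptions, not a standard API.

# Double quotation marks as often used in horizontal text, mapped to the
# corner brackets usually preferred in vertical text.
QUOTES_TO_CORNER_BRACKETS = {
    "\u201C": "\u300C",  # LEFT DOUBLE QUOTATION MARK  -> LEFT CORNER BRACKET
    "\u201D": "\u300D",  # RIGHT DOUBLE QUOTATION MARK -> RIGHT CORNER BRACKET
}


def to_vertical_quotes(text: str) -> str:
    """Swap double quotation marks for corner brackets."""
    return text.translate(str.maketrans(QUOTES_TO_CORNER_BRACKETS))


def to_fullwidth(text: str) -> str:
    """Map printable ASCII (U+0021..U+007E) to full-width forms (U+FF01..U+FF5E)."""
    return "".join(
        chr(ord(ch) + 0xFEE0) if 0x21 <= ord(ch) <= 0x7E else ch
        for ch in text
    )


if __name__ == "__main__":
    print(to_vertical_quotes("\u201CFIFA\u201D"))
    print(to_fullwidth("FIFA"))
```

The point is that the stored text itself changes here, which is 
different from the glyph-level changes described next.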


[b] the visual appearance of some characters in CJK varies according to 
the writing-mode.  For example, parentheses are rotated 90° between 
vertical and horizontal. Other characters need completely different 
glyphs to be swapped in. For example the horizontal Japanese full stop 
has an advance width the same as other characters but has just a small 
circle in the lower left corner.  In vertical text that circle appears 
in the top right corner (ie. it can't be achieved by rotation).  In 
these cases you need extra glyphs in the font that are activated by 
sensitivity to the writing-mode.  For examples, see
http://r12a.github.io/scripts/tutorial/part4#rotations


[c] Sometimes text flows horizontally within vertical columns in CJK 
(known as tate chū yoko).  See an example at 
http://r12a.github.io/scripts/tutorial/part4#tatechuyoko. This is 
something that has no correspondence in horizontal text.


So in summary, sometimes the sequence of characters needs to be 
different for vertical vs horizontal text, but most of the time apparent 
differences are achieved through rendering algorithms operating on and 
selecting appropriate glyphs.

What does remain the same, however, is the logical progression of 
codepoints in memory.  That sequence, as is usually the case throughout 
Unicode, typically follows the pronounced order of the 'letters' 
involved or some other rule such as combining characters following base 
characters.

If the expected order of codepoints in a word varies for sequences of 
characters in vertical vs horizontal writing modes, then problems arise 
in searching or processing text.


Btw, there are also plenty of examples in Unicode of scripts that treat 
visual display in terms of syllables, clusters or groups of characters. 
The underlying sequence of codepoints in many Brahmi-derived scripts is 
different from the order in which the respective glyphs are displayed. 
For example a RA at the start (nominally the left) of a Hindi sequence 
of consonants in the word 'irsya' is likely to be displayed above the 
'a' (far to the right).  For examples, see 
http://r12a.github.io/scripts/tutorial/part3#positional
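
To make the stored order concrete, the following Python sketch lists the 
code points of the word, assuming it is spelled ईर्ष्या (that spelling is 
my reading of 'irsya' above). The helper name `describe` is just for 
this example.

```python
import unicodedata

# Assumed spelling of the Hindi word romanised above as 'irsya',
# written with escapes so the code point order is explicit.
WORD = "\u0908\u0930\u094D\u0937\u094D\u092F\u093E"


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' string per code point, in stored order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for line in describe(WORD):
        print(line)
```

RA is the second code point in memory, straight after the initial 
vowel, even though its glyph ends up as a mark over the right-hand end 
of the cluster when the word is displayed.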

This, like the other things noted above (with the exception of the 
first) is achieved through applying some magical rendering process, 
using smoke and mirrors to transform the underlying, logically-ordered 
codepoint sequence.

Fwiw, in vertical arrangements the 'syllabic' clusters in Indic scripts 
are treated as indivisible units that run horizontally.


So, coming back to Egyptian hieroglyphs, and making it clear that i know 
very little about how Egyptian hieroglyphs work, i find myself wondering 
the following:


a. perhaps some combination of the above smoke and mirrors techniques 
may be adequate to manage some of the differences between layout in 
horizontal vs vertical writing-mode when the thing we are struggling 
with is the spatial relationships between the elements circumscribed by 
a quadrat when they are rendered.

b. perhaps it's not particularly problematic that you can't 
automatically flip between horizontal vs vertical without changing code 
points, especially when one considers that there is anyway so much 
variation in 'spelling' of Egyptian content, often to fit the visual 
space available.

c. if the control characters used to indicate the positioning of 
hieroglyphs within the quadrat display space are treated like other 
Unicode control characters, ie. they are not part of the semantics and 
are ignored for sorting, searching, and processing the text for meaning, 
rather they are just cues for visual arrangement, then perhaps it's not 
a big issue either if they are different for vertically vs horizontally 
rendered content.
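
A minimal Python sketch of what point c could look like in practice 
follows. Everything in it is an assumption made for illustration: it 
takes the positioning controls to be two joiner characters at U+13430 
and U+13431, uses three arbitrary hieroglyph code points as stand-ins 
for signs, and the two groupings are invented rather than taken from a 
real text.

```python
# Illustrative sketch only. Assumption: the positioning controls are the
# two joiner code points below; the real set may be different or larger.
VERTICAL_JOINER = "\U00013430"    # assumed: stacks the next sign below
HORIZONTAL_JOINER = "\U00013431"  # assumed: places the next sign beside
LAYOUT_CONTROLS = {VERTICAL_JOINER, HORIZONTAL_JOINER}

# Three arbitrary hieroglyph code points standing in for signs A, B, C.
A, B, C = "\U00013000", "\U00013001", "\U00013002"


def strip_layout_controls(text: str) -> str:
    """Drop positioning controls, keeping only the signs themselves."""
    return "".join(ch for ch in text if ch not in LAYOUT_CONTROLS)


def same_content(first: str, second: str) -> bool:
    """Compare two encodings while ignoring how signs are grouped."""
    return strip_layout_controls(first) == strip_layout_controls(second)


# Invented groupings: A stacked over B then C, versus A then B beside C.
horizontal_encoding = A + VERTICAL_JOINER + B + C
vertical_encoding = A + B + HORIZONTAL_JOINER + C

if __name__ == "__main__":
    print(horizontal_encoding == vertical_encoding)
    print(same_content(horizontal_encoding, vertical_encoding))
```

The two strings differ as stored, but compare equal once the controls 
are ignored, which is the behaviour a search or sort would want if the 
controls are only cues for visual arrangement.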


does that help?
ri





