[Egyptian] Some general considerations

Sun Jul 24 11:33:23 BST 2016

Hi guys!

(no worries, this is my last mail on the topic, enough time and energy spent on this.)
>> Third
>> 
> And your solution is a good one — I have absolutely no doubt — as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the ‘orthographic’ system of ancient Egyptian and to which we want to have access (see further Simon’s mail earlier this week).
> 
> In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs.

This is patently inaccurate. 
Three simple examples should suffice here (sorry for providing textbook examples known by all).

1 - spatial distribution affecting phono-morphology
[t] followed by [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same ‘linear order’, 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes).

2 - spatial distribution as a condition for reading (and adding semantic value)
[signs; image not preserved] makes no sense; when combined as [group; image not preserved], it is clear that it should be read /ptH/ ‘Ptah’ (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth). The position of the sign is both a condition for reading and added semantic information (not only Ptah, but Ptah as demiurge).

3 - spatial distribution affecting the function of a sign
Hieratic example? If the rowing man is followed by an n ([image not preserved]), the n has to be read /n/ (phonemogram); if the n is positioned under the rowing man ([image not preserved]), we are dealing with a compound classifier made of two signs: the /n/ has an iconic value and the whole group refers to a man rowing in water.

Etc., etc.
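
To make example 1 concrete, here is a minimal sketch in Python. It only restates the two cases given above ([t] followed by [w] versus t&w); the table layout and the function names are my own illustrative assumptions, not part of any existing tool or proposal.

```python
# Toy lookup table (hypothetical layout): possible readings of the same two
# signs, keyed by how they are arranged.
#   "sequence" = [t] followed by [w]
#   "group"    = t&w (both signs in one group)
READINGS = {
    ("t", "w", "sequence"): {"tw"},
    ("t", "w", "group"): {"tw", "wt"},
}


def possible_readings(first, second, arrangement):
    """Return the set of readings recorded for this arrangement."""
    return READINGS.get((first, second, arrangement), set())


def is_ambiguous(first, second, arrangement):
    """True when the arrangement allows more than one reading."""
    return len(possible_readings(first, second, arrangement)) > 1
```

The 'linear order' of the signs is the same in both entries; only the arrangement key differs, and that key alone decides whether one or two readings are available.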

So position is ‘just an esthetic, layout matter, not a linguistic one’ as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. It’s a pity to make such unwarranted statements in a discussion also aimed at non-specialists, to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone.

Accordingly, if Unicode aims first and foremost at rendering the ‘linguistic’ dimension of writing, the examples above should suffice to show the importance of the ‘quadrat’ organization of this script. Again, you might not like it from a computer/font-oriented perspective; I agree it’s not convenient to encode, and it might not even be possible to encode it at all in Unicode because the standard does not have the capabilities needed (and only higher-level protocols could then handle this). All of this is fine with me, but it would be great not to distort the presentation of the data.

[I leave aside here other semiotic dimensions of writing: the spatial arrangement is part of the ‘orthography’ of the scribes, not necessarily meaningful at the ‘linguistic’ level strictly speaking, but at the level of scribal practices, etc. Why don’t we use IPA for our modern languages? Simply because writing is much more than ‘linguistic’ stricto sensu, and it makes sense to know who writes ‘next’ and who plays with the script and writes ‘neckst’. As simple as that.]

> No more than illuminated initial capital letters in medieval manuscripts. 
> Like these:
> 
> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg
> 
> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to encode or represent them in Unicode (and in fact they are not encoded).

This comparison is hilarious.

> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts.

Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels.
>> 1) JSesh approach 
>> 
> 
>> 
> Sure, everything needs not be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.
> 
> Again, what data?
> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code?

See above.
>> 2) Groups in fonts
>> 
>> 
> OK, sure. But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such an information, as a lay user, when using ligatures? (this is a real question, nothing ironic here).
> 
> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. the ligature IS the information.
> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively, if I write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file; there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures.

All this was clear, sure. What I meant is that the ligatures are purely ‘graphic’, right? No information is stored about the position of one sign with respect to another?
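
To make the question concrete, here is a minimal sketch in Python contrasting the two ways of storing a group. Both models are simplifying assumptions of mine: the ligature table, the glyph name and the ("v", ...) / ("h", ...) tuples are hypothetical and do not describe any actual font or proposal.

```python
# Model 1 (assumption): the text stores only a linear run of sign codes and
# the font maps that run to one opaque glyph. The glyph name is hypothetical.
LIGATURES = {("n", "t"): "glyph_n_over_t"}


def render_with_ligatures(codes):
    """Return the ligature glyph if the font has one, else the plain run."""
    return LIGATURES.get(tuple(codes), "+".join(codes))


# Model 2 (assumption): the position is explicit in the text itself, as
# ("v", upper, lower) for stacking or ("h", left, right) for juxtaposition.
def stacked_pairs(node):
    """Yield (upper, lower) for every vertical pair of plain signs."""
    if isinstance(node, str):
        return
    op, first, second = node
    if op == "v" and isinstance(first, str) and isinstance(second, str):
        yield (first, second)
    yield from stacked_pairs(first)
    yield from stacked_pairs(second)
```

In the first model the stored text is just the run n, t, so a query such as 'n directly above t' has nothing to work with unless one inspects the font. In the second model the same query is a simple walk over the stored structure, e.g. list(stacked_pairs(("h", "H", ("v", "n", "t")))).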

>> 3) the 4 “small sign in the corner of big sign” control characters.
>> 
> I’m glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 ‘variables’ (Top vs Bottom and Left-Right), if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub.
> Please keep in mind that the four operators come from another type of syntax.
> 
> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A. instead of top-bottom, left-right + four corners.
> Or perhaps I am not understanding your system.

OK, then I disagree because of the polysemic value of A/B and B/A.

>> 4) Vertical and horizontal script and control characters
>> 
>> 
> 
> I do not understand why you entirely get rid of the notion of ‘quadrat’ in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.
> 
> If you want to have the "notion of quadrate" in Unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. If I understood correctly, one of the proposals that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. This can be a sound way of thinking in an MdC (as you don't like JSesh ;-) ) perspective, but I doubt it makes sense in Unicode (and more importantly, I doubt the people of the Unicode Consortium will think it makes sense).
> 
> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a Chinese character.
> 
> But I assume (I hope) we don't want to do that, right?

We use spaces between words; would it be bad to use quadrat separators between quadrats? Mutatis mutandis, it feels to me like saying ‘oh, no, there are blank spaces all around!’.

More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is build quadrats (implicitly or explicitly).
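
As an illustration of the quadrat as a unit, here is a minimal sketch in Python that splits the encoding quoted above, /hthr-H*(n:t)-tA:tA/, into quadrats. It assumes the MdC-style reading of the operators ('-' between quadrats, ':' for stacking, '*' for juxtaposition, with '*' binding tighter than ':'); the function names and the ("v", ...) / ("h", ...) output format are mine and purely illustrative.

```python
def split_top(text, sep):
    """Split text on sep, ignoring separators inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_group(text):
    """Parse one quadrat into nested ("v", [...]) / ("h", [...]) tuples."""
    rows = split_top(text, ":")          # ':' stacks rows vertically
    if len(rows) > 1:
        return ("v", [parse_group(r) for r in rows])
    cols = split_top(text, "*")          # '*' places signs side by side
    if len(cols) > 1:
        return ("h", [parse_group(c) for c in cols])
    if text.startswith("(") and text.endswith(")"):
        return parse_group(text[1:-1])   # strip one pair of parentheses
    return text                          # a single sign code


def parse_quadrats(text):
    """Split on '-' and parse each quadrat separately."""
    return [parse_group(q) for q in split_top(text, "-")]
```

Calling parse_quadrats("hthr-H*(n:t)-tA:tA") gives three units. Each unit keeps its internal arrangement, so a renderer would only have to decide where to place the units (in a line or in a column), not how to rebuild them.
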
>> 5) special characters, vertical/horizontal texts and input methods
>> 
>> 
> 
> Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.
> 
> 
> With quadrats marked by special control characters it would be even worse, as you would have to take into consideration even more possible combinations.
> 
>   
>> 6) Ramesside “groups” (or “tall groups”).
>> 
>> 
> 
> Why are they groups or quadrats and not ’small columns’? 
> 
> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example.
>  
> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?). 
> 
> Not true.
> This can be interpreted just as a question of layout, not as a question of grouping.

Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you’re entitled to have your own kind of Egyptian if you want!
I’m not arguing against nonsense. But that’s a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your ‘small columns’ hypothesis.
The Ramesside example that you provide simply does not work this way.

And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts is based on a quadratic structure that is close in any respect to Egyptian. Comparing apples with pears won’t help: and again, that’s a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them (unlike the quadrat structure).

> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement. 
> 
> All the Egyptian words quoted in my article published in the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-)

I’m afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as evidence disproving my point: you’re happy with the way your hieroglyphs are rendered? Perfect. But why should people who want to render a simple hieroglyphic line such as the one below face difficulties? This escapes me.

[hieroglyphic line; image not preserved] (KRI I, 4)
>> 7) What are the “square groups”?
>> 
>> 
> 
> I provided the definition agreed on — I think — by everyone, if you have a better one, I’m listening of course.
> 
> see above.
> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of Unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database; and in this respect, to respond to your last remark..).
> 
> I’m sorry to put it so bluntly, but if Unicode is not to be useful for the majority of egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they’re doing well with it. 
> 
> Perhaps the opposite should also be considered? perhaps the unicode-based system to write hieroglyphs (considering that the very idea at the basis of unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard unicode system, then perhaps it is them who should use other tools?
> 
> I guess this is an option that should be considered.. no?

We are talking about feelings and different perceptions here, so it’s hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc.) will never be happy with what Unicode has to offer, because it is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting, for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the most open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating online resources, etc. I feel like we are very much in favor of standardization and striving to make resources accessible; two goals of Unicode, as you mentioned.

These resources are meant to last; it would be a pity not to think about the standard carefully before going in any direction.

That’s all folks!
Have a nice weekend,

Stéphane