[Egyptian] Some general considerations

Sat Jul 23 16:00:30 BST 2016

Hi Marwan,

Thanks for your mail!
Some very quick answers to your suggestions.
> First:
> 
No worries, everyone is entitled to have an opinion based on his/her own experience, and I would certainly agree with the fact that no one (except maybe for Serge) has a proper understanding of all the aspects involved here.
> Second:
> 
Sure, no ‘right’ solution, we simply want a solution that meets our minimal needs. The definition of these needs is what we do not seem to agree on. 

> We are not ancient Egyptians, we don’t know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations. Which are not “right” or “wrong”, but rather “more efficient” or “less efficient” in respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

Certainly, but they give us some clues that we should not ignore (see e.g. below, re point 6). 
> Third
> 
As this issue recurs again and again, I would like to stress once more that, unlike what is done for most other scripts, we are not producing/inputting new texts, but transcribing old ones, hopefully without losing too much information. And your solution is a good one (I have absolutely no doubt about that), as long as you do not want to *search* for the relative position of signs with respect to one another. This is, however, a piece of information that is, like it or not, part of the ‘orthographic’ system of ancient Egyptian and to which we want to have access (see further Simon’s mail earlier this week).

> 1) JSesh approach 
> 
Do not underestimate your addressees too much, Marwan ;) 
Furthermore, this is not a JSesh-based mentality, but an MdC-based mentality: the people behind this standard had a pretty good idea of what they were doing, trust me. There are a lot of problems, certainly, but one should simply not throw the baby out with the bath water.
> I had the feeling during the meeting and reading your emails that many of you think in a very JSesh-based mentality. Like I had the feeling that at least some of the participants’ goal is to imitate JSesh in Unicode.
> 
> 
> But Unicode *is* *not* JSesh, or at least should not be JSesh.
> Using control functions, parentheses etc can be an efficient way in JSesh, but I doubt that a system requiring to input 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan’s last proposal) makes much sense in a unicode perspective. Especially considering that you can obtain exactly the same result without control character and just with 1 (sic!) general ligature within the font that will be automatically rendered by usual standard already available rendering algorithms.
> 
> 
> In other words, what can be an efficient solution to a problem in JSesh, is not necessarily a good solution in Unicode. And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithm (i.e. ligatures embedded in fonts), with layouts at the editor level etc. that can help solving the problems on different level than just on the Unicode encoding level.
> 
Sure, not everything needs to be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.

> 2) Groups in fonts
> 
>  
> Someone in the list was saying that the groups need to be listed, but that not all of them necessarily need to be built in the font (or something like that).
> 
> This is not true: compound characters and groups will not automatically build on their own, and if you envisage the possibility of building a group, then you *do* need to have it in the font, otherwise it won't display correctly and you will have random broken control characters popping out here and there in your hieroglyphic text. This perhaps will not bother you in your database etc., but it will be just unacceptable for common users (and, for instance, for editors).
> 
>  
> If you are accepting that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font.
> 
>  
> Whether you will build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.
> 
> 

OK, sure. But then again, control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed to store such information, as a lay user, when using ligatures? (This is a real question, nothing ironic here.)

> 
> 3) the 4 “small sign in the corner of big sign” control characters.
> 
I’m glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 ‘variables’ (Top vs Bottom and Left vs Right). If the sequence of signs indicates Left vs Right unambiguously (as in your example), then of course we only need to encode Top vs Bottom explicitly. This is basically how we represented things with Michael at the pub.
Please keep in mind that the four operators come from another type of syntax.
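
To make the two-variable reading concrete, here is a rough Python sketch. All names are purely illustrative (they come from no proposal), and I assume that "start" means the side on which reading begins. It only shows that the order of the two signs plus one explicit Top/Bottom choice is enough to recover the four corners.

```python
from enum import Enum


class Vertical(Enum):
    """The one variable that has to be encoded explicitly."""
    TOP = "top"
    BOTTOM = "bottom"


def corner(small_index: int, big_index: int, vertical: Vertical) -> str:
    """Derive the corner from sequence order plus the explicit vertical choice.

    small_index and big_index are the positions of the small and the big
    sign in the encoded sequence (hypothetical convention: a small sign
    encoded first sits on the 'start' side of the big sign).
    """
    if small_index == big_index:
        raise ValueError("the two signs must occupy different positions")
    horizontal = "start" if small_index < big_index else "end"
    return f"{vertical.value}-{horizontal}"


# Two orders times two vertical values give the four corners.
all_corners = {
    corner(s, b, v)
    for (s, b) in [(0, 1), (1, 0)]
    for v in Vertical
}
```

With this convention, `corner(0, 1, Vertical.TOP)` gives `"top-start"` and `corner(1, 0, Vertical.BOTTOM)` gives `"bottom-end"`.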

> 4) Vertical and horizontal script and control characters
> 
> 

I do not understand why you get rid of the notion of ‘quadrat’ entirely in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.
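
As a rough illustration of this point, here is a small Python sketch that reads the string above with the usual MdC operators (`-` between quadrats, `:` for stacking, `*` for juxtaposition, parentheses for subgroups, with `*` binding tighter than `:`). The function names and the nested-tuple output are my own assumptions, not part of any standard, and `slots` is a deliberate simplification: it only assigns each quadrat a cell and leaves the mirroring of glyphs to a renderer.

```python
import re

TOKEN = re.compile(r"[A-Za-z0-9]+|[-:*()]")
OPERATORS = {"-", ":", "*", "(", ")"}


def parse(text):
    """Parse an MdC-like string into a list of quadrats (nested tuples)."""
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise ValueError("unexpected character in input")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        if peek() == "(":
            take()
            node = vertical()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return node
        if peek() is None or peek() in OPERATORS:
            raise ValueError("sign code expected")
        return take()

    def horizontal():
        # '*' binds tighter than ':', so it is handled one level lower.
        items = [atom()]
        while peek() == "*":
            take()
            items.append(atom())
        return items[0] if len(items) == 1 else ("h", *items)

    def vertical():
        items = [horizontal()]
        while peek() == ":":
            take()
            items.append(horizontal())
        return items[0] if len(items) == 1 else ("v", *items)

    quadrats = [vertical()]
    while peek() == "-":
        take()
        quadrats.append(vertical())
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return quadrats


def slots(quadrats, orientation):
    """Give each quadrat a (column, row) cell; its inner structure is untouched."""
    n = len(quadrats)
    if orientation == "hltr":
        return [(i, 0) for i in range(n)]
    if orientation == "hrtl":
        return [(n - 1 - i, 0) for i in range(n)]
    if orientation in ("vltr", "vrtl"):
        return [(0, i) for i in range(n)]
    raise ValueError("unknown orientation: " + orientation)


text = parse("hthr-H*(n:t)-tA:tA")
```

The same three quadrats come out of `parse` whichever orientation is requested afterwards; only the cells returned by `slots` change. That is only possible because the quadrat boundaries are part of the encoded data.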

> 5) special characters, vertical/horizontal texts and input methods
> 
> 

Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.

>  
> 6) Ramesside “groups” (or “tall groups”).
> 
> 

Why are they groups or quadrats and not ‘small columns’? 

The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example.

Again, it might not be easy to handle from a technical point of view, I do agree, and you might want to split them into smaller groups for the sake of convenience, OK, but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?)

Again, if Unicode is meant for writing names of tourists (not even in cartouches, as it seems) or for preparing simplified layouts for online grammar teaching, etc., that’s really fine. 
But if you want to be able to use it minimally for (1) standardized electronic corpora and (2) journals like LingAeg, etc., dealing with the Egyptian language, this would be a requirement. 

> Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as a single independent “group”.
> 
Are you sure? Come on...

> Rather, it would make much more sense to deal with these “Ramesside short vertical strings of text disposed in horizontal lines” as if they were.. well, short vertical strings of text disposed in horizontal lines.
> 
> This can be *easily* done at the layout level, either with xml/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that, already. It is enough to use a vertical font, and create a page layout with horizontal sequences (= the lines) of very short vertical columns.
> 
> 

The wording is transparent: easy, but unfortunately inaccurate and irrelevant for scholarly uses (not palaeography, of course: palaeography has to do with the actual appearance and style of individual signs). 

> 7) What are the “square groups”?
> 
> 

I provided the definition agreed on (I think) by everyone; if you have a better one, I’m listening, of course.

A final remark (aimed at everyone, not an answer to Marwan, of course). I spent hours and days reading and thinking about the arguments during the last few weeks, and I’ve got the impression that we are repeating the same obvious points again and again. I’m sorry to put it so bluntly, but if Unicode is not to be useful for the majority of Egyptologists, so be it. It will remain what it is, a standard not used by the community: journals, corpora, etc. will keep on using JSesh and other tools; they’re doing well with them. 

On the other hand, the Egyptologists around the table seem to agree that we ‘simply’ need (and it does not sound extraordinary to me):
1 - to build quadrats.
2 - to create (sub)groups of signs within quadrats.
3 - to be able to position (groups of) signs with respect to other (groups of) signs either vertically, horizontally or in other ‘corner’-like positions (the INSERT-like operators). [As a side-note to Bob: the positioning of groups of signs in corners is trivial in monumental inscriptions: the low number of them in Ramses comes from the fact that we encoded very few hieroglyphic texts; check the first pages of the KRI to get an idea.]
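
For those who prefer to see it spelled out, the three points can be sketched as a tiny data model in Python. This is only an illustration under my own assumptions: the class and function names are hypothetical, and the corner label is a free-form string. A text is a list of quadrats (point 1), a `Group` nests signs vertically or horizontally (point 2), and an `Insert` places something in a corner of something else (point 3). Once the structure is stored like this, the relative position of signs becomes searchable.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union

Node = Union[str, "Group", "Insert"]  # a bare string is a single sign code


@dataclass
class Group:
    axis: str            # "v" = stacked top to bottom, "h" = side by side
    items: List[Node]


@dataclass
class Insert:
    corner: str          # hypothetical label such as "bottom-end"
    base: Node
    inserted: Node


def children(node: Node) -> List[Node]:
    if isinstance(node, Group):
        return node.items
    if isinstance(node, Insert):
        return [node.base, node.inserted]
    return []


def signs(node: Node) -> List[str]:
    """All sign codes contained in a node, in encoding order."""
    if isinstance(node, str):
        return [node]
    return [s for child in children(node) for s in signs(child)]


def stacked_pairs(node: Node) -> Iterator[Tuple[str, str]]:
    """Yield (upper, lower) for signs in adjacent items of a vertical group."""
    if isinstance(node, Group) and node.axis == "v":
        for upper, lower in zip(node.items, node.items[1:]):
            for a in signs(upper):
                for b in signs(lower):
                    yield (a, b)
    for child in children(node):
        yield from stacked_pairs(child)


def quadrats_with(text: List[Node], upper: str, lower: str) -> List[int]:
    """Indices of the quadrats in which `upper` stands directly above `lower`."""
    return [i for i, q in enumerate(text) if (upper, lower) in stacked_pairs(q)]


# The hthr-H*(n:t)-tA:tA example as a list of three quadrats.
example = [
    "hthr",
    Group("h", ["H", Group("v", ["n", "t"])]),
    Group("v", ["tA", "tA"]),
]
hits = quadrats_with(example, "n", "t")
```

Here `hits` contains only the index of the second quadrat, the one where `n` stands above `t`. A query of this kind has nothing to work with if the arrangement is stored only as a ligature inside a font.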

These are the basic principles, illustrated ad nauseam in the (files attached to the) earlier mails; we do not count anything else as a mandatory requirement. If this is not possible to envision, I think that we can close the discussion, without anger or regret: Unicode cannot be used by most of us. That’s a pity, but so be it. 

And I would find it really great if one could now stop asking for more data regarding these questions or postponing the decision for bad reasons: we provided more than is needed. If you do not want to take these cases into account because they are not frequent enough (on which basis, e.g., for someone working only with this kind of text?) or because they are hard to implement, that’s fine; but please do not invoke the lack of evidence.

Have a good weekend folks!

Stéphane
