[Egyptian] Brackets in the TLA encoding

Sat Jul 23 13:47:15 BST 2016

Hi Simon, Hi Stéphane, Hi All,

This is very helpful.

Physicists tell us that if you want to gather and use data, you need 
hypotheses first, or else you don't know what to look for. For me the 
relevant hypotheses are:
(1) The primitives we have in our current document allow description of 
most of the groups in an accurate enough way. Both 'most of' and 
'accurate enough' are subjective of course. There is no escaping that.
(2) It would be quite difficult to reduce the expressive power without
losing coverage. There is an implicit parameter, which is a limit on 
the depth of nesting, which I assume is 3. As Simon also confirmed once 
more, 2 is not enough, even for the most basic, run-of-the-mill classical
(horizontal) inscriptions.
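
To make the nesting-depth parameter in (2) concrete, here is a minimal Python
sketch that counts how deeply vertical and horizontal groups are nested in an
encoding of the kind Simon lists in the message quoted below. It assumes only
':' (vertical), '*' (horizontal) and round brackets, with '*' binding tighter
than ':'. Everything else, including '&' and quoted tokens, is treated as part
of a sign. The names parse and depth are illustrative and not part of any
existing tool.

```python
import re

# Assumption: ':' = vertical, '*' = horizontal, '(' ')' = explicit grouping.
# Any other run of non-space characters is treated as one sign token.
TOKEN = re.compile(r'[:*()]|[^:*()\s]+')


def parse(encoding):
    """Parse an encoding into a sign string or a ('v'|'h', [children]) tuple."""
    tokens = TOKEN.findall(encoding)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of encoding")
        tok = tokens[pos]
        pos += 1
        return tok

    def vertical():
        # Lowest precedence: a ':'-separated chain of horizontal groups.
        parts = [horizontal()]
        while peek() == ':':
            take()
            parts.append(horizontal())
        return parts[0] if len(parts) == 1 else ('v', parts)

    def horizontal():
        # Higher precedence: a '*'-separated chain of signs or bracketed groups.
        parts = [atom()]
        while peek() == '*':
            take()
            parts.append(atom())
        return parts[0] if len(parts) == 1 else ('h', parts)

    def atom():
        tok = take()
        if tok == '(':
            inner = vertical()
            if take() != ')':
                raise ValueError("missing closing bracket")
            return inner
        if tok in (':', '*', ')'):
            raise ValueError("unexpected token: " + tok)
        return tok

    tree = vertical()
    if pos != len(tokens):
        raise ValueError("trailing tokens after group")
    return tree


def depth(tree):
    """A bare sign has depth 0; each enclosing group adds 1."""
    if isinstance(tree, str):
        return 0
    _, parts = tree
    return 1 + max(depth(p) for p in parts)
```

Under these assumptions, depth(parse('K4:G1*(Z7:X1)')) gives 3 and
depth(parse('M17*(Q3:N35)')) gives 2. Brackets that merely restate the
precedence do not change the result: Hiero1:(Hiero2*Hiero3) and
Hiero1:Hiero2*Hiero3 parse to the same tree.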

As to (1), we have moved away quite considerably from descriptive power
that is machine-interpretable. This was motivated by people finding the
original encoding too complicated, and arguing that fonts would do a lot of
fine-tuning anyway for particular choices of signs. Also, we don't really
care about a sign being printed 0.5 mm too far to the left or to the right,
as long as the user gets a rough idea of what the text looks like.

These arguments all sound reasonable, but realise two things:
* If even stupid machines don't know how to render an encoding roughly
as it was intended, perhaps there is not enough information present for 
humans to know what was meant either.
* As stressed once more by Stéphane, the kinds of groups we are talking 
about are productive. We don't want to be manually fine-tuning the appearance
of an unbounded number of groups, so some approximately correct automatic 
rendering would be quite useful.

I think we are still okay with the present version of the proposal, but we have moved 
a long way from existing routines that do the rendering in a deterministic, predictable 
manner. Program code now needs many more refinements, and the result is not quite 
well-defined.

As to both (1) and (2), the provided examples include quite a few cases of insertions 
and stacking, insertions into stacked groups, and even groups with insertions that 
are themselves inserted. So far I don't see either hypothesis refuted. 

I had to struggle quite a bit to get rid of prefix operators. As anyone with the slightest
knowledge of formal languages knows, prefix or suffix operators are ideal for
automatic processing, because the problem of ambiguity simply does not exist. 
By contrast, endless volumes of textbooks have been written since the late 1950s about 
the ambiguities caused by infix operators, and about how to solve them using principled or
not so principled methods involving operator precedence and low-level hacks in 
shift-reduce parsers. Using infix operators is really only justifiable if notation is
meant for human consumption. That is why I was very surprised to hear objections
with the argument that font technology is too primitive to handle prefix operators.
If anything, I would have imagined that primitive tools would have a lot of
difficulty with parsing in the presence of operator precedence and such. 
I implemented OpenType substitution rules that analysed bracket structures
and prefix operators myself, and that works fine. It would be a nightmare for me 
to have to implement OpenType substitution rules in the presence of operator 
precedence. There may be something in the arguments people use that I don't 
understand. 
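
As a generic illustration of why prefix notation needs no precedence rules,
here is a minimal Python sketch. It is not the syntax of the proposal or of
the original encoding: it simply assumes, hypothetically, that ':' and '*' are
written as binary prefix operators, so that each operator is followed by
exactly two operands. The function name parse_prefix is made up for this
example.

```python
def parse_prefix(tokens):
    """Parse a token list in which ':' and '*' are binary prefix operators.

    Returns a sign string or an (operator, left, right) tuple. No precedence
    table and no brackets are needed: the operator read first decides the shape.
    """
    it = iter(tokens)

    def read():
        tok = next(it, None)
        if tok is None:
            raise ValueError("unexpected end of input")
        if tok in (':', '*'):
            left = read()   # first operand
            right = read()  # second operand
            return (tok, left, right)
        return tok          # anything else is a sign

    tree = read()
    if next(it, None) is not None:
        raise ValueError("trailing tokens")
    return tree
```

With this reader, the token sequence ': A * B C' can only mean A above the
horizontal group B C, and '* : A B C' can only mean the vertical group A B
followed by C. The infix string A:B*C needs an extra convention (precedence or
brackets) to choose between those two readings.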

Anyway, one thing to look out for (I say this in particular to Simon, Stéphane and
Serge, with whom this was discussed in detail in Cambridge), is that in the process of
getting rid of prefix operators, and avoiding ambiguity, the following coverage was lost: 
it is no longer possible to insert A into the top-left corner of B, and to insert the resulting 
group into the bottom-left corner of C. The same holds for the right corners. I have yet to 
see a group where this matters. It is still possible however to insert A into the bottom-left
corner of B, and to insert the resulting group in the bottom-left corner of C. The same holds
for two upper-left corners and the corresponding right corners. There are groups of 
these forms among the provided examples.

The problem of course is that inability to find certain structures in the corpora we happen 
to have at this very moment does not prove their non-existence. At best it means our
encoding won't be lacking too much in terms of coverage.

Best regards,
Mark-Jan

On Thursday 21 Jul 2016 15:24:51 Simon Schweitzer wrote:
> Hi all,
> 
> @Stéphane: thank you for your .gly-files! In this mail, I want to add 
> some remarks concerning the subgroup topic.
> 
> As in Ramses, there are many encodings with "(" and ")" in the TLA. I 
> collected these encodings and I want to present my evaluation to you:
> * In some cases, the encoding is invalid, e.g. (F12-S29):D21, which 
> should be understood as F12*S29:D21.
> * Sometimes, the encoding of the brackets is superfluous. There are many 
> cases of Hiero1:(Hiero2*Hiero3). The brackets are not necessary: use 
> Hiero1:Hiero2*Hiero3 !
> * But in many cases, the parsing without the brackets would be misleading:
> 1) There are many vertical groups in horizontal groups in vertical 
> groups. I list only 10 examples:
> N35:"⸮"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1; 
> ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
> Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22; 
> ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21; 
> ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
> D21:V28*(X1:B1) (〈Stele des Montuhotep (Cambridge E.9.1922)〉, l. C.3; 
> ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
> D21:V7*(W24:X1) (〈Stele des Montuhotep (Kairo CG 20539)〉, l. I.b.18; 
> ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
> D35A:(X1:Z4A)*G37 (〈Stele des Antef (Louvre C 167 = E 3111)〉, l. C.1; 
> ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
> D36:D21:N29*(X1:"⸮"*Z4*"?") (〈Stele des Antef (Glasgow D1922.13)〉, l. 
> A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> I10&I9:X2*(X4:Z2) (〈Stele des Antef (Glasgow D1922.13)〉, l. A.7; 
> ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> K4:G1*(Z7:X1) (Sinuhe AOS, vs. 18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)
> As you can see, this kind of grouping occurs in hieroglyphic and in 
> hieratic texts, and this feature is also attested in the "classical" 
> period from the Middle Kingdom (the examples from the stela of 
> Montuhotep and Antef).
> 2) horizontal grouping of vertical groups in columns
> If the text is written vertically, there are cases of horizontal groups 
> of vertical groups, e.g. in the Buch von der Himmelskuh 
> (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4), M17*(Q3:N35), (X1:Z1)*I12, 
> M17*S29*(A2:Z2) and so on.
> 
> Best regards,
> 
> Simon