<div dir="ltr">Hello everyone<br><div class="gmail_extra"><br><div class="gmail_quote"><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">Third</span></b></p></div></blockquote><div>And your solution is a good one — I have absolutely no doubt — as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the ‘orthographic’ system of ancient Egyptian and to which we want to have access (see further Simon’s mail earlier this week).</div></div></div></div></blockquote><div><br></div><div>In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs. </div><div>This is demonstrated by the texts themselves: if the relative position of the signs were linguistically important, you would have some form of regularity, with some combinations being possible and others being forbidden. This is not the case.</div><div><br></div><div>Combining three signs into a group, or writing them one after the other is linguistically exactly the same.</div><div>it is just an esthetic, a layout, matter.</div><div>Not a linguistic one.</div><div><br></div><div>No more than illuminated initial capital letters in medieval manuscripts. 
</div><div>Like these:</div><div><br></div><div><a href="http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg" target="_blank">http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg</a><br></div><div><br></div><div>They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to encode or represent them in Unicode (and in fact they are not encoded).</div><div><br></div><div>As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunae in a text or to suggest alternative readings in hieratic scripts.</div><div>But in fact, in order to find what sign could be missing in a given lacuna, or what sign could be hidden behind a hieratic ligature, you need dictionaries and corpora of texts where you can search for sequences of signs *independently* from their spatial position (which sign x is attested after sign y, whether combined in a similar group or not?); you do not need to code anything in Unicode. 
It can be a plus, but it is not indispensable.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><div><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">1) JSesh approach</span></b> </p></div></blockquote></div><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><span lang="EN-GB"><br></span></p></div></blockquote></span>Sure, everything needs not be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.</div></div></div></blockquote><div><br></div><div>Again, what data?</div><div>What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code?</div><div><br></div><div>Perhaps this question has already been asked, but honestly so far I haven't heard or read any convincing answer, and I haven't seen any common example (I cannot exclude that there could be some uncommon case; I obviously have not seen all the Egyptian texts existing in the world) where the position of a sign with respect to the other signs around it carries any linguistic information.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">2) Groups in fonts</span></b></p><div style="text-align:justify"><br></div></div></blockquote></span>OK, sure. 
But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such an information, as a lay user, when using ligatures? (this is a real question, nothing ironic here).</div></blockquote><div><br></div><div>I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. The ligature IS the information.</div><div>And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that would allow everyone to see exactly the same groups, because everyone would be using the same font or at least the same set of ligatures. Alternatively, if I write my text with font x and ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so).</div><div><br></div><div><span style="text-align:justify"> </span><span style="text-align:justify"> </span><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">3) the 4 “small sign in the corner of big sign” control characters.</span></b></p></div></blockquote></span>I’m glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 ‘variables’ (Top vs Bottom and Left-Right), if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. 
This is basically how we represented things with Michael at the pub.</div><div>Please keep in mind that the four operators come from another type of syntax.</div></div></blockquote><div><br></div><div>This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A, instead of top-bottom, left-right + four corners.</div><div>Or perhaps I am not understanding your system.</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">4) Vertical and horizontal script and control characters</span></b></p><div><br></div></div></blockquote><div><br></div></span>I do not understand why you do entirely get rid of the notion of ‘quadrat’ in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation  (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.</div></div></blockquote><div><br></div><div>If you want to have the "notion of quadrat" in Unicode, then you have to introduce at least one more control character that you will have to use in front of *every single quadrat* in your text. If I understood correctly, one of the proposals that were circulating was indeed suggesting something like that, which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. 
This can be a sound way of thinking from an MdC perspective (as you don't like JSesh ;-) ), but I doubt it makes sense in Unicode (and, more importantly, I doubt the people at the Unicode Consortium will think it makes sense).</div><div><br></div><div>Unless you want to code every single possible "quadrat" as an independent glyph: that would thus end up being conceptually comparable to a Chinese character.</div><div><br></div><div>But I assume (I hope) we don't want to do that, right?</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">5) special characters, vertical/horizontal texts and input methods</span></b></p><div><br></div></div></blockquote><div><br></div></span>Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.</div></div></blockquote><div><br></div><div><br></div><div>With quadrats marked by special control characters it would be even worse, as you would have to take into consideration even more possible combinations.</div><div><br></div><div> <span style="text-align:justify"> </span></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">6) Ramesside “groups” (or “tall groups”).</span></b></p><div><br></div></div></blockquote><div><br></div></span>Why are they groups or quadrats and not ’small columns’? 
</div><div><br></div><div>The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example.</div></div></blockquote><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div>but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?). </div></div></blockquote><div><br></div><div>Not true.</div><div>This can be interpreted just as a question of layout, not as a question of grouping.</div><div>Simply, there could be signs that can be graphically stretched/enlarged to fill a given space on their own, while others will be squeezed to fit together within an equivalent space.</div><div><br></div><div>This does not say anything at all about them being "groups" or not. 
It can be interpreted just as a question of layout.</div><div><br></div><div>And actually this is very common in various writing systems around the world, and no one would consider the fact that some isolated glyphs appear as big as some "combined ones" as a reason to consider the "combined ones" as "groups".<br></div><div><br></div><div>Just a few examples from the internet:</div><div><br></div><div>Have a look at this Chinese (bopomofo + characters) text:</div><div><br></div><div><a href="https://upload.wikimedia.org/wikipedia/commons/1/1b/Bopomofo_in_Regular,_Handwritten_Regular_%26_Cursive_formats.jpg">https://upload.wikimedia.org/wikipedia/commons/1/1b/Bopomofo_in_Regular,_Handwritten_Regular_%26_Cursive_formats.jpg</a><br></div><div><br></div><div>According to your way of interpreting groups ("some signs do occupy the full height of the horizontal line, which is the basis for deciding what counts as a unit (quadrat) or not."), the four little characters within the parentheses on the right of the image should be interpreted as a single "group" (or as a combined "character", as we are in China) just because they, combined, are as big as some of the other single characters.</div><div>This is not the case. No one in Asia would consider such a combination as a "group" or as a "combined character". They are just four independent characters (or "four groups", assuming character = group, which conceptually is a fairly sound equivalence) which happen to fit into the space in a slightly different way compared to the other characters of the text. </div><div>Their appearance is just a question of *scale*, thus of layout. 
Not of grouping, and it does not say anything about what a "group" should be.</div><div><br></div><div>Here is an even better example, a Chinese text with bopomofo and hanzi characters:</div><div><br></div><div><a href="http://chinesehacks.com/app/uploads/2010/06/zhuyin.jpg">http://chinesehacks.com/app/uploads/2010/06/zhuyin.jpg</a><br></div><div><br></div><div>No one would consider those small signs on the right of the big characters as "groups" just because when combined together they end up being as big as the single bigger characters.</div><div>And as you can see, conceptually, you can interpret this text as a sequence of short vertical columns organized in horizontal lines. Some of these short vertical columns are occupied by a single "scaled up" character (like the A1 in the Egyptian text above) while other short vertical columns are occupied by multiple "scaled down" signs one above the other.</div><div>It is the same with Ramesside writing.</div><div><br></div><div>No one, however, would consider those smaller "scaled down" signs written one above the other as "groups" just because together they are as big as the big ones.</div><div><br></div><div><br></div><div>Same with Latin script, actually: for instance, take again the manuscript page above:</div><div><br></div><div><a href="http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg">http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg</a><br></div><div><br></div><div>According to your definition, one should take the tallest sign (i.e. the initial "B") as the unit to define what a "group" is. 
As a consequence, one should consider the 6 lines of text beside it not as "lines of text", but as a single "group", and according to your approach those lines should not be represented with a specific layout, but should be somehow combined with control characters.</div><div><br></div><div>I hope we agree that this would make no sense.</div><div><br></div><div>So if in Chinese (and Japanese, and Korean) and in Latin script you can have glyphs of different sizes in the same text, without the need to conceptually cluster the "smallest ones" into "groups" that will appear as big as the biggest signs, why do you feel this need with Egyptian?</div><div><br></div><div>And note that this is different from Egyptian ordinary square writing, as in ordinary square writing one can assume that all the signs (both those combined into square groups and those outside square groups) are more or less in the same *scale* (as a general principle; again, I know there are exceptions). In Ramesside writing, instead, the *scale* of the signs is not always the same. Some are "scaled down" to fit together within one of those "small vertical columns" while others (like the A1) are just "scaled up" to fill, alone, the same space as one of those "small vertical columns".</div><div><br></div><div>Exactly like in the Chinese examples above, or exactly like with illuminated capital letters in manuscripts.</div><div><br></div><div>It is (or can be interpreted as) a question of layout.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div>But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement. 
</div></div></blockquote><div><br></div><div>All the Egyptian words quoted in my article published in the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-)</div><div><br></div><div>Anecdote aside, you can do everything scientific you need with Unicode without ending up using tens of control characters.</div><div>And if you can't, then perhaps you should consider that Unicode is not the best tool for your job..</div><div><br></div><div>It is actually funny: I have tried to understand all the various proposals involving control characters, and as a result I have pointed out *several* practical problems in the use of control characters that can seriously affect the *linguistic* information of the text (like control characters requiring one to change the actual input order of the hieroglyphs to display them correctly), some still without an answer, or with answers so complex that they will hardly be realizable (see the question of turning vertical text into horizontal text, which will require the introduction of at least one more control character).</div><div><br></div><div>On the other hand, almost no one so far has pointed out any real problem in the use of ligatures (possibly combined with a very restricted number, 1 or 2, of control characters as suggested by Bob) for which there isn't already a possible solution which is already implemented in some script around the world (i.e. 
unwanted ligatures can be broken with a zero-width character as in Indian scripts, and different ligatures using the same signs could be selected with "variant characters" as in emoji, etc.).</div><div><br></div><div>And still, a system that does not use multiple control characters will be good only for writing tourists' names..</div><div><br></div><div>Please, no offense, but again: the story of the elephant..</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><span lang="EN-GB">Trying to describe (and encode) them

as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as single

independent “groups”. </span></p></div></blockquote></span>Are you sure? Come on...<span> <br></span></div></div></blockquote><div><br></div><div>It is you who are talking about encoding (as glyphs in a font, not necessarily as Unicode characters) tens of thousands of possible and often unique combinations, not me..</div><div><br></div><div>Ramesside "tall" groups can be interpreted as short vertical texts. Thus, thinking of encoding them (or building them with control characters) is like thinking of encoding whole columns of text as groups.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><div></div></span>The wording is transparent: easy, but unfortunately inaccurate and irrelevant for scholarly uses (not palaeography, of course: palaeography has to do with the actual appearance and style of individual signs).<span style="text-align:justify"> </span></div></div></blockquote><div><br></div><div>Again, why?</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div><span><blockquote type="cite"><div dir="ltr"><p class="MsoNormal" style="text-align:justify"><b><span lang="EN-GB">7) What are the “square groups”?</span></b></p><div><br></div></div></blockquote><div><br></div></span>I provided the definition agreed on, I think, by everyone, if you have a better one, I’m listening of course.</div></div></blockquote><div><br></div><div>See above.</div><div>And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). 
It is a question of finding the *most efficient* definition in order to reach our goal of having a working system of Unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database; and, in this respect, to respond to your last remark..).</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div>I’m sorry to put it so bluntly, but if Unicode is not to be useful for the majority of egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they’re doing well with it. </div></div></blockquote><div><br></div><div>Perhaps the opposite should also be considered? Perhaps the Unicode-based system to write hieroglyphs (considering that the very idea at the basis of Unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of Egyptologists (corpus linguists as well as all the other thousands of non-corpus-linguist Egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard Unicode system, then perhaps it is they who should use other tools?</div><div><br></div><div>I guess this is an option that should be considered.. no?</div><div><br></div><div>Best</div><div><br></div><div>Marwan</div></div></div></div>