From everson at evertype.com Fri Jul 15 19:52:00 2016
From: everson at evertype.com (Michael Everson)
Date: Fri, 15 Jul 2016 19:52:00 +0100
Subject: [Egyptian] Cambridge I&E Workshop: Some follow-ups for July
In-Reply-To:
References: <6ef6585e-8ddf-152d-7d16-612bfc0b7641@w3.org>
Message-ID: <930878D5-3729-40CF-9ABF-22F8D3B296B4@evertype.com>

On 15 Jul 2016, at 19:48, Nigel Strudwick wrote:
>
> Richard
>
> I think we all agreed a specific list for this subject, set up by you, was the way forward.

Well, this list was used for previous discussion, and it's still here (with an old 2008 archive no less), and you're all subscribed to it.

Michael

From mn31 at st-andrews.ac.uk Sat Jul 16 14:21:46 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 16 Jul 2016 14:21:46 +0100
Subject: [Egyptian] Cambridge I&E Workshop: Some follow-ups for July
In-Reply-To: <930878D5-3729-40CF-9ABF-22F8D3B296B4@evertype.com>
References: <930878D5-3729-40CF-9ABF-22F8D3B296B4@evertype.com>
Message-ID: <1516249.IGz3T585fz@thuis>

On Friday 15 Jul 2016 19:52:00 Michael Everson wrote:
> On 15 Jul 2016, at 19:48, Nigel Strudwick wrote:
> >
> > Richard
> >
> > I think we all agreed a specific list for this subject, set up by you, was the way forward.
>
> Well, this list was used for previous discussion, and it's still here (with an old 2008 archive no less), and you're all subscribed to it.
>
> Michael

Just a thought: we are a very small, select group. For example, Richard wrote to me with some very detailed and insightful comments about writing-mode in CJK text, with implications for Ancient Egyptian. It would be a shame not to have such material archived publicly to share it with others interested in these matters.
Mark-Jan

From bobqq at live.co.uk Mon Jul 18 12:38:05 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Mon, 18 Jul 2016 12:38:05 +0100
Subject: [Egyptian] Simple higher level protocols
Message-ID:

Hi All

I've posted a short note on simple HLP at http://hieroglyphseverywhere.blogspot.co.uk/2016/07/simple-higher-level-protocols-and.html. HLP is a topic we didn't get to discuss much last week but important to bear in mind when thinking of unencoded characters and edge-cases for control characters. At some point it would be useful to try and put together a wish-list.

Bob
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bobqq at live.co.uk Wed Jul 20 10:58:26 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Wed, 20 Jul 2016 10:58:26 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
Message-ID:

Hi all,

I've not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don't have unlimited time to research all this myself and I'm not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I'm doing with the group-joiners].

One part I'm reasonably happy that I understand is ligatures involving the cobra-pattern (also seen with tongue). This ligature has a special role in orthography, especially in column vertical writing and "tall groups". I've seen plenty of instances.

Conversely there is the bird pattern a+b, b+c or a+b+c where b is a bird. Pretty much all I've seen is a and c as individual hieroglyphs. What other bird patterns attested? Groups?

There are characters in Hieroglyphica e.g. HG-D0160 to HG-D0188 and many more HG arrangements that could be implemented using controls (as with monograms).
If this is what some of these proposed controls are about I think next step repertoire might be a better context in which to discuss the topic to begin with.

Mark-Jan - you've been using your "insert" operator in RES typography for many years. It would be great to see a list of at least some of the clusters you've encountered.

Anyway, great if anyone has any examples/evidence to share.

Bob
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncs3 at cam.ac.uk Wed Jul 20 11:20:34 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Wed, 20 Jul 2016 11:20:34 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID: <1A8607F8-B7D1-4320-822B-717E663993E4@cam.ac.uk>

Bob,

I'd have more chances of understanding this (as an Egyptologist) if I could see it. Would you be able to post a sketch of what you mean? Handwritten would do.

Nigel

On 20 Jul 2016, at 10:58, Bob Richmond wrote:
> Hi all,
>
> I've not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don't have unlimited time to research all this myself and I'm not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I'm doing with the group-joiners].
>
> One part I'm reasonably happy that I understand is ligatures involving the cobra-pattern (also seen with tongue). This ligature has a special role in orthography, especially in column vertical writing and "tall groups". I've seen plenty of instances.
>
> Conversely there is the bird pattern a+b, b+c or a+b+c where b is a bird. Pretty much all I've seen is a and c as individual hieroglyphs. What other bird patterns attested? Groups?
>
> There are characters in Hieroglyphica e.g. HG-D0160 to HG-D0188 and many more HG arrangements that could be implemented using controls (as with monograms).
> If this is what some of these proposed controls are about I think next step repertoire might be a better context in which to discuss the topic to begin with.
>
> Mark-Jan - you've been using your "insert" operator in RES typography for many years. It would be great to see a list of at least some of the clusters you've encountered.
>
> Anyway, great if anyone has any examples/evidence to share.
>
> Bob
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From mn31 at st-andrews.ac.uk Wed Jul 20 11:21:34 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 11:21:34 +0100
Subject: [Egyptian] List of considerations
Message-ID: <6022711.NM8k5uuFyp@thuis>

Dear All,

I'm afraid the momentum is lost if we wait any longer with resuming the discussion. So let me make an inventory about how I see things. This assumes familiarity with the issues in the last version of the proposal:
https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode.pdf
as well as with the discussions in Cambridge, inside and outside the Fitzwilliam.

One thought is that we should make the encoding as simple as possible, but not simpler.

Another thought is that we need a systematic design, not a bunch of individual control characters thrown together. This pertains both to functionality (what can be expressed) and to syntax (how is it expressed). As for functionality, the primitives should cover a natural range of what Egyptologists want the encoding to express at a very minimum, recognizing the need to sacrifice precision for simplicity. As to syntax, we should not lose sight of the bigger picture, or we might tie ourselves into a knot with operator precedence.

More on the functionality: our team (TLA/Ramses/St Andrews) had relatively few qualms simplifying the semantics of the insertions, to allow an inserted group to spill over to outside the bounding box of the 'big' sign.
This makes it easier to use, in particular removing the need of the EMPTY to artificially increase the size of the bounding box. (It makes the mapping to richer and more precise encodings outside Unicode more difficult, but this aside.) With the simplification, we could abandon the four insertions at the four sides, because then these are basically horizontal and vertical grouping in combination with the "JOIN" from our original proposal. However, if we then drop the JOIN, it would be quite odd to be able to have a primitive for "insert into a corner" but not for the functionality of "insert into a side". If you look at typical uses of the & in the past, then you see many can only be expressed as "insert into a side", or alternatively horizontal/vertical grouping with JOIN. So JOIN plus the four insertions into the four corners forms a logical whole. In more detail, when we insert G into a corner of S: if G is small, it might fit into the bounding box of S; if G is big, it may extend to outside the bounding box; the extension would be to the left or right for signs with unit height (as for most birds), and to the top or bottom for signs that are less high (as for the tongue). In the case of birds, insertion into the lower-left corner would no longer mean strictly in the corner, but normally just above the feet of the bird. Here we sacrifice precision for simplicity of use. Note that PLOTTEXT distinguished insertion-into-lower-left-corner and insertion-above-the-bird's-feet. An example of insert into a side is Hm-kA, with the Hm club half inside the pair of raised arms. It would now be encoded as a vertical group of Hm and kA, with a JOIN in between. There is only one case I can think of where insert-at-the-bottom is not very well expressed in terms of vertical grouping with JOIN, and that is with two X1 next to one another at the bottom within the bounding box of S22. One could probably live with an approximation. 
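The simplified corner-insertion rule described above can be sketched in Python. This is only an illustration under assumptions that are not in the message: the function name `overflow_direction`, the corner labels, the `UNIT` constant and the `fits` flag (which a real font would have to derive from glyph metrics) are all hypothetical.

```python
UNIT = 1.0  # assumed height of a full-height sign (hypothetical unit)


def overflow_direction(big_height, corner, fits):
    """Return the side on which an inserted group G spills out of sign S.

    big_height: height of S relative to UNIT
    corner: "top_left", "top_right", "bottom_left" or "bottom_right"
    fits: True if G fits inside the bounding box of S (assumed to be given)
    """
    if fits:
        return None  # G stays inside the bounding box of S
    vertical, horizontal = corner.split("_")
    if big_height >= UNIT:
        # Signs with unit height (most birds): G extends to the left or right.
        return horizontal
    # Signs that are less high (such as the tongue): G extends to the top or bottom.
    return vertical


print(overflow_direction(1.0, "bottom_left", fits=False))  # left
print(overflow_direction(0.4, "top_right", fits=False))    # top
```

The sketch only captures the direction of the spill; how far the group extends, and where exactly it sits relative to a bird's feet, is left to the font.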
More on syntax: Suppose we have * and : and INSERT and perhaps STACK, all represented as infix operators, then the question is what

  A * B INSERT C : D STACK E

means. No one has yet provided a reasonable syntax with infix operators but without brackets that disambiguates in a satisfactory manner. From my understanding of the well-established academic discipline of the design of programming languages, I would assume such a syntax does not exist.

Personally I welcome the prospect to put some separating characters between signs within a horizontal/vertical group, because that simplifies OpenType substitution rules; it would also streamline the syntax of the JOIN with the 'normal' way of horizontal/vertical grouping. So, some part of the notation could well be infix, but not having brackets leads us into the abyss of structural ambiguity.

Mixed systems of operator precedence plus brackets to override operator precedence where needed are a bit old-fashioned, and are only helpful if we assume that the control characters would be actually typed one by one by the user, instead of having a specialized graphical editor for hieroglyphic text that relieves the user of having to worry about syntax.

To recapitulate what I wrote earlier, my proposal for representing horizontal grouping would be exemplified by:

  OPEN_HOR arg1 NORMAL_SEP arg2 JOIN_SEP arg3 CLOSE

where I combine a normal separator with the joining (fitting) one. Here arg1 and arg2 could be single signs or vertical groups or insertions, etc.

An example of vertical grouping could be:

  OPEN_VERT arg1 NORMAL_SEP arg2 CLOSE

Insertion could be:

  OPEN_INSERT arg1 TOP_LEFT big_sign TOP_RIGHT arg2 BOTTOM_RIGHT arg3 CLOSE

which would mean insert arg1 into the top left corner of the big sign, insert arg2 into the top right corner and insert arg3 into the bottom right corner.

Note: if we insert G into S, then S is usually a single sign, but not always.
Consider for example a superimposition (stacking) of P6 and D36, with N5 inserted into the lower-left corner and Z1 inserted into the lower-right corner. So 'big_sign' above could be a group as well. If for now we want to assume it is always a single sign, that is fine, and we can drop the restriction some time in the future, when font technology has evolved.

This should be a guiding principle in general: We can put restrictions anywhere we want, motivated by limitations on today's font technology, as long as it doesn't cause major problems 10 or 20 years down the line. Future generations will be grateful if we dare think a little ahead.

Other issues:

* Richard has written quite a few interesting things about horizontal 'writing-mode' vs vertical 'writing-mode' in other scripts. I don't think the matter has been exhaustively discussed for hieroglyphs. More about this later.

* Cartouches (enclosures/boxes): we need to have a suggestion that fits into the design. It is fine to postpone formal proposal of cartouches, but again, we need design, not a bunch of loose characters thrown together. My proposal for syntax would be exemplified by:

    OPEN_CARTOUCHE first_group NORMAL_SEP second_group JOIN_SEP third_group CLOSE

  which would fit in with the proposed syntax of the other elements. The problem with reinterpreting the enclosure characters among the existing 1071 Unicode signs is that the isolated hieratic open-cartouche and close-cartouche are then missing. This would be a big problem. So, if we reinterpret pairs of the existing characters to produce full-form enclosures, we need to at least add two more characters for transcription of hieratic.

* The EMPTY glyph: With the new semantics of the four insert-into-a-corner primitives, the EMPTY is less urgent, but it is still very useful. Can we take an existing EMPTY character from Unicode? It would be nice to pick a specific one.
  This means one fewer character needs to be proposed, but it would be good to mention in the proposal that using an EMPTY in place of a hieroglyph is legitimate.

* Stacking. Almost all the Egyptologists wanted to have a stacking primitive, while some UTC members were objecting on technical grounds, stacking being difficult to implement dynamically. I think the discussion at the meeting stranded because it was comparing apples and oranges (dynamic versus precomposed), and opposition against stacking was based on the wrong arguments. More about this later.

* Insert-into. This requires further investigation. Some observations:
  - One (very) convincing example is wabt, with the leg-and-jug-of-flowing-water with the feminine ending inside.
  - If D031 didn't already exist, how would one encode it? I think D032 with the Hm sign in a vertical group with JOIN is possible.
  - N018A and N018B I would be tempted to see as atomic signs, unless there are many similar combinations of N018 (or X004B) with other flat signs.
  - To me O010C seems definitely D002 inserted into O018.
  - This raises the question whether the notation of a box/enclosure for the Hwt sign with something inside is appropriate, or whether 'insert-into' is preferable. For a cartouche/serekh/castle-walls, the text inside can be quite long. Does that apply to Hwt as well? If the length of the text inside Hwt is always quite limited, it doesn't seem to be of the same nature as a cartouche.

I'm revising the proposal to match the above. Feedback sooner rather than later would be helpful.
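The bracketed notation proposed in this message can be checked mechanically. What follows is a minimal sketch, not the formal grammar from the draft proposal. It assumes the token names used in the examples above (OPEN_HOR, OPEN_VERT, OPEN_CARTOUCHE, NORMAL_SEP, JOIN_SEP, CLOSE), treats any other token as a sign, and leaves out OPEN_INSERT, whose syntax is only shown by one example. The function names and the tuple representation are hypothetical.

```python
OPENERS = {"OPEN_HOR": "hor", "OPEN_VERT": "vert", "OPEN_CARTOUCHE": "cartouche"}
SEPARATORS = {"NORMAL_SEP": "normal", "JOIN_SEP": "join"}


def parse(tokens):
    """Parse a complete token list into a nested (kind, parts, separators) tree."""
    tree, rest = _parse_arg(list(tokens))
    if rest:
        raise ValueError(f"unexpected trailing tokens: {rest}")
    return tree


def _parse_arg(tokens):
    """Parse one argument: a single sign or a bracketed group."""
    if not tokens:
        raise ValueError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head in SEPARATORS or head == "CLOSE":
        raise ValueError(f"expected a sign or an opener, got {head}")
    if head not in OPENERS:
        return head, rest  # a plain sign
    parts, seps = [], []
    arg, rest = _parse_arg(rest)
    parts.append(arg)
    # Each separator must be followed by exactly one further argument.
    while rest and rest[0] in SEPARATORS:
        seps.append(SEPARATORS[rest[0]])
        arg, rest = _parse_arg(rest[1:])
        parts.append(arg)
    if not rest or rest[0] != "CLOSE":
        raise ValueError("missing CLOSE")
    return (OPENERS[head], parts, seps), rest[1:]


example = "OPEN_HOR A NORMAL_SEP OPEN_VERT B NORMAL_SEP C CLOSE JOIN_SEP D CLOSE"
print(parse(example.split()))
```

Because every group is closed explicitly, the parser never has to consult a precedence table: each token sequence has at most one reading.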
Best regards, Mark-Jan From mn31 at st-andrews.ac.uk Wed Jul 20 11:46:04 2016 From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof) Date: Wed, 20 Jul 2016 11:46:04 +0100 Subject: [Egyptian] Stacking and syntax Message-ID: <5973769.kzhsOBgLFk@thuis> Dear all, More considerations about stacking and syntax (apologies to Michel and Michael for overlap with another message I sent last week): (1) My understanding is that we need to work under the assumption that the control characters do nothing complicated by themselves, but substitution rules are used to map sequences of signs plus control characters to precomposed groups, which are stored as separate glyphs in the font. Is this correct? If the above is correct, then there are some follow-up questions. (2) Should we not worry about the 64K limit on the number of glyphs in OpenType? It would be interesting to know how many different groups (not just ligatures) there are in Ramses? (3) I wonder whether in the discussion about stacking (superimposed signs, monograms) we were comparing apples and oranges. I proposed the stacking operation could be done dynamically, which is consistent with experiments I've done with OpenType, and which would require linearly many anchor points to be stored in the font, if we restrict our attention to pairs of signs being stacked (not three or more signs). In the easiest case each anchor point could be the center point of the bounding box, and determining these could even be automated, say by a Python script in FontForge. Michael says that this is not how font designers like to do things, and they would still like to store precomposed glyphs. Okay, let's take this as given. But then what is the objection against stacking? That there would be too many precomposed glyph combinations? If we assume that all stacked combinations are stored as glyphs in the font, we would have N * N such glyphs. But how about normal groups with pure horizontal and vertical grouping? 
If you similarly want to store these as precomposed glyphs, and if we assume that such groups can have up to 4 glyphs, you already need N * N * N * N combinations, which dwarfs the costs of implementing stacking.

If we do _not_ assume all horizontal/vertical groups are precomposed, but only the ones we have found in some corpus (the 'fallback assumption') then obviously, we have much fewer glyphs to store than N * N * N * N. But then why is it not acceptable to precompose only those stacked pairs that are known from a corpus?

So if we compare apples and apples, so to speak, both normal horizontal/vertical groups and stackings require excessive storage space. If we compare oranges and oranges, both normal horizontal/vertical groups and stackings are feasible, and stackings more so than horizontal/vertical groups. Do I see this wrong?

(4) Coming back to syntax in more detail than in previous message. The Richmond & Glass proposal had three characters, with & (ligatures) having tightest binding, then * (horizontal grouping), then : (vertical grouping). But we need (very limited) recursion of horizontal/vertical grouping, say up to three levels; two is surely too little to handle perfectly ordinary Middle Egyptian horizontal texts. And we need finer control characters, such as the INSERT. This implies we quickly need many levels of operator precedence. For example:

  "A INSERT_TOP_RIGHT B VERTICALGROUPING C"

could mean:

  " ( A INSERT_TOP_RIGHT B ) VERTICALGROUPING C "

or

  " A INSERT_TOP_RIGHT ( B VERTICALGROUPING C ) "

Both interpretations correspond to attested groups. If we try to disambiguate by operator precedence, then either we need several copies of each control character that differ in tightness of binding, leading to horrible complexity, or we need brackets.
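The structural ambiguity described above can be made concrete by enumerating every way of bracketing a bracket-free infix expression. This is a small illustrative sketch; the function name `bracketings` is hypothetical, and the operators are treated as opaque binary operators with no precedence at all.

```python
def bracketings(tokens):
    """Return every full bracketing of an alternating operand/operator list."""
    if len(tokens) == 1:
        return [tokens[0]]
    results = []
    # Operators sit at the odd positions; try each one as the top-level split.
    for i in range(1, len(tokens), 2):
        for left in bracketings(tokens[:i]):
            for right in bracketings(tokens[i + 1:]):
                results.append(f"( {left} {tokens[i]} {right} )")
    return results


for reading in bracketings("A INSERT_TOP_RIGHT B VERTICALGROUPING C".split()):
    print(reading)

# The five-operand expression from the earlier message on syntax:
print(len(bracketings("A * B INSERT C : D STACK E".split())))  # 14
```

With two operators there are two readings, which are the two given in the message. With the four operators of "A * B INSERT C : D STACK E" there are already 14, which is why either a precedence table or explicit brackets is needed.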
In order to avoid the problems with operator precedence, I chose an entirely different syntax for the draft proposal, under the motto: if we need brackets, we might as well use them consistently in combination with prefix operators, and get rid of infix operators altogether. A formal grammar is in the draft proposal in the appendix, and you can see it is very simple and uniform. Now, I do understand that the closer we stay to traditions of "ordinary" writing systems, the better it is for adoption of our encoding. If including infix operators is the only way to make the encoding palatable to font designers, so be it. But can I just ask: If it is the case that sequences of hieroglyphs and control characters are replaced by precomposed single glyphs (question (1) above), then would the choice between prefix or infix or postfix (reverse Polish notation) make any real difference for actual realization in terms of feature files of fonts ? Regards, Mark-Jan From ishida at w3.org Wed Jul 20 11:55:31 2016 From: ishida at w3.org (ishida at w3.org) Date: Wed, 20 Jul 2016 11:55:31 +0100 Subject: [Egyptian] Vertical vs horizontal writing-mode Message-ID: On 13/07/2016 23:16, Mark-Jan Nederhof wrote: > A belated thank you for the information. I now had some time to go > through the links that you sent in more detail. The issue of directionality > is not crucial to our proposal, and there was so much confusion about > the matter that it seemed best to not further discuss it during the > meeting. All the Egyptologists fully understood, because the problem is > unavoidable in everyday transcription of hieroglyphic texts, but apart from > the Egyptologists we couldn't get anyone to care about this matter, so > there may be no point in trying to keep it as part of the proposal. > > This leaves two options: we omit mention of directionality from the proposal > altogether, or mention the matter in passing, with the suggestion that > we might revisit it at some future time. 
To determine what is best, can > I ask you the following? Do you know of some other language/script in > which the encoding of text itself is different depending on whether the > text direction is horizontal or vertical? (To be clear, I don't care about > ltr versus rtl. The central issue is horizontal versus vertical.) > > While going through the material you sent, I got the impression that > the encoding is always the same, for CJK and other scripts. As I tried to > explain in the paper and in the presentation, the situation for Ancient Egyptian > is very different, because the signs tend to be divided into groups ('quadrats') > somewhat differently depending on whether the text direction is horizontal > or vertical, and the division into groups is part of the encoding. In a way, > one could say that a certain encoding is only genuine for horizontal text, > or for vertical text, but not usually for both. Do you see the problem, and the > reason why we brought this up? Would you agree with me this problem > does not occur in CJK and similar scripts ? Let's refer to horizontal 'writing-mode' vs vertical 'writing-mode' to make the terminology clearer. So, in general for CJK text one would expect to see the same sequence of code points for a text whether it is rendered horizontally or vertically. However, there are some differences... [a] there are certain characters that by convention are more likely to be found in one versus the other. For example, vertical text usually uses corner brackets for quotation marks, whereas horizontal CJK use quotation marks. For examples, see bullet (b) under https://www.w3.org/TR/jlreq/#differences_in_vertical_and_horizontal_composition_in_use_of_punctuation_marks So for good quality rendering of text, the choice of code point goes with the choice of writing-mode for a small number of characters, and you can't just flip between the two by switching the CSS. 
Other times you may find full-width characters being used for latin letters and digits in vertical script when they are not in horizontal. For example an acronym such as FIFA is likely to be rendered as non-rotated, full-width characters in vertical Japanese, but as ordinary proportionally-spaced characters in horizontal.

[b] the visual appearance of some characters in CJK varies according to the writing-mode. For example, parentheses are rotated 90° between vertical and horizontal. Other characters need completely different glyphs to be swapped in. For example the horizontal Japanese full stop has an advance width the same as other characters but has just a small circle in the lower left corner. In vertical text that circle appears in the top right corner (ie. it can't be achieved by rotation). In these cases you need extra glyphs in the font that are activated by sensitivity to the writing-mode. For examples, see http://r12a.github.io/scripts/tutorial/part4#rotations

[c] Sometimes text flows horizontally within vertical columns in CJK (known as tate chū yoko). See an example at http://r12a.github.io/scripts/tutorial/part4#tatechuyoko. This is something that has no correspondence in horizontal text.

So in summary, sometimes the sequence of characters needs to be different for vertical vs horizontal text, but most of the time apparent differences are achieved through rendering algorithms operating on and selecting appropriate glyphs.

What does remain the same, however, is the logical progression of codepoints in memory. That sequence, as is usually the case throughout Unicode, typically follows the pronounced order of the 'letters' involved or some other rule such as combining characters following base characters. If the expected order of codepoints in a word varies for sequences of characters in vertical vs horizontal writing modes, then problems arise in searching or processing text.
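The point about searching can be illustrated with a toy example. This is a sketch under stated assumptions: the control names `HOR` and `VERT` are placeholders rather than proposed characters, the signs are the abstract letters A, B and C, and the matcher ignores layout controls entirely, which is a design choice of the sketch and not something stated above.

```python
# Hypothetical layout controls; placeholders, not proposed characters.
CONTROLS = {"HOR", "VERT"}


def signs_only(encoded):
    """Drop layout controls, keeping the signs in their stored order."""
    return [tok for tok in encoded if tok not in CONTROLS]


def contains(text, query):
    """True if the signs of query occur contiguously, in order, in text."""
    t, q = signs_only(text), signs_only(query)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


horizontal = ["A", "HOR", "B", "VERT", "C"]
vertical_same_order = ["A", "VERT", "B", "VERT", "C"]
vertical_reordered = ["B", "VERT", "A", "VERT", "C"]
query = ["A", "HOR", "B"]

print(contains(horizontal, query))           # True
print(contains(vertical_same_order, query))  # True
print(contains(vertical_reordered, query))   # False
```

As long as only the layout cues differ between the horizontal and the vertical encoding, the search still succeeds. Once the stored order of the signs themselves differs, the same query no longer matches.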
Btw, there are also plenty of examples in Unicode of scripts that treat visual display in terms of syllables, clusters or groups of characters. The underlying sequence of codepoints in many Brahmi-derived scripts is different from the order in which the respective glyphs are displayed. For example a RA at the start (nominally the left) of a Hindi sequence of consonants in the word 'irsya' is likely to be displayed above the 'a' (far to the right). For examples, see http://r12a.github.io/scripts/tutorial/part3#positional

This, like the other things noted above (with the exception of the first) is achieved through applying some magical rendering process, using smoke and mirrors to transform the underlying, logically-ordered codepoint sequence.

Fwiw, in vertical arrangements the 'syllabic' clusters in indic scripts are treated as indivisible units that run horizontally.

So, coming back to Egyptian hieroglyphs, and making it clear that i know very little about how Egyptian hieroglyphs work, i find myself wondering the following:

a. perhaps some combination of the above smoke and mirrors techniques may be adequate to manage some of the differences between layout in horizontal vs vertical writing-mode when the thing we are struggling with is the spatial relationships between the elements circumscribed by a quadrat when they are rendered.

b. perhaps it's not particularly problematic that you can't automatically flip between horizontal vs vertical without changing code points, especially when one considers that there is anyway so much variation in 'spelling' of egyptian content, often to fit the visual space available.

c. if the control characters used to indicate the positioning of hieroglyphs within the quadrat display space are treated like other Unicode control characters, ie.
they are not part of the semantics and are ignored for sorting, searching, and processing the text for meaning, rather they are just cues for visual arrangement, then perhaps it's not a big issue either if they are different for vertical vs horizontally rendered content.

does that help?
ri

From mn31 at st-andrews.ac.uk Wed Jul 20 12:37:29 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 12:37:29 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID: <2414353.YqfQxLpJiR@thuis>

The evidence for the insertions is of the same kind as the evidence for horizontal and vertical grouping. Look at an inscription. What do you see? Signs and groups are below and above other signs and groups, or next to one another. Therefore primitives for horizontal and vertical grouping are natural. But not everything is horizontal or vertical. Sometimes you see one sign in the corner of another. This warrants the insertions as primitives.

This was recognized already by PLOTTEXT in the late 1980s. I reinvented the wheel around 2000-2002 with RES, coming to almost the same conclusions. JSesh has insertions as well. I have also shown that OCR tools are able to automatically recognize insertions.

So the empirical evidence is that any known way of encoding hieroglyphic text that explicitly describes the graphical form has insertions.

Horizontal/vertical grouping plus 4 or 5 insertions plus JOIN do not cover everything, but it will be sufficient for many applications, and it is the most we can hope for within the limitations of Unicode, as far as precision and coverage are concerned (ignoring cartouches and such for now).

Mark-Jan

On Wednesday 20 Jul 2016 10:58:26 Bob Richmond wrote:
> Hi all,
>
> I've not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don't have unlimited time to research all this myself and I'm not an Egyptologist.
> Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I'm doing with the group-joiners].
>
> One part I'm reasonably happy that I understand is ligatures involving the cobra-pattern (also seen with tongue). This ligature has a special role in orthography, especially in column vertical writing and "tall groups". I've seen plenty of instances.
>
> Conversely there is the bird pattern a+b, b+c or a+b+c where b is a bird. Pretty much all I've seen is a and c as individual hieroglyphs. What other bird patterns attested? Groups?
>
> There are characters in Hieroglyphica e.g. HG-D0160 to HG-D0188 and many more HG arrangements that could be implemented using controls (as with monograms). If this is what some of these proposed controls are about I think next step repertoire might be a better context in which to discuss the topic to begin with.
>
> Mark-Jan - you've been using your "insert" operator in RES typography for many years. It would be great to see a list of at least some of the clusters you've encountered.
>
> Anyway, great if anyone has any examples/evidence to share.
>
> Bob
>

From bobqq at live.co.uk Wed Jul 20 13:04:18 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Wed, 20 Jul 2016 13:04:18 +0100
Subject: [Egyptian] Aaron IE Experimental Font version 1
Message-ID:

Hi All

I've put together this "Aaron IE Experimental" font available for the I&E group to use for documentation, communication, whatever.

The PDF describes the font.

I've included the PDF source docx file so you can see how a plain text system works. Source seems fine with Word (latest version). I found LibreOffice and OpenOffice are still buggy with Complex Text handling but didn't explode into pieces when I loaded the doc! Your mileage may differ.

Apologies for our less technical readers for the gobbledygook.

Bob
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AaronIECambridgeExperimentalFont1.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 24482 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AaronIECambridgeExperimentalFont1.pdf
Type: application/pdf
Size: 439594 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AaronIEExperimental.ttf
Type: application/octet-stream
Size: 742088 bytes
Desc: not available
URL:

From hawilbrink at hotmail.com Wed Jul 20 15:00:18 2016
From: hawilbrink at hotmail.com (heleen wilbrink)
Date: Wed, 20 Jul 2016 16:00:18 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
Message-ID:

Hi guys,

Sorry to have kept you waiting. Here are the main conclusions and next steps (bold) as I interpreted them. I hope you will find them helpful.

Have a great day,
Heleen

Repertoire

A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like eg IFAO) for each hieroglyph from the proposal of Michel Suignard.

Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted.
* Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan
* The hieroglyphs with references can be added in tranches
* The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Moller to which the hieratic dot will be added
* Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August
* Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September

Monograms: I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).

* Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B

Control characters

On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"

* It was agreed that this proposal will be finished in August, I thought by Michael, and discussed at the Unicode meeting on the 6th of September. Now it seems over the mail that several people are working on it separately
* Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B.
give guidelines to Simon/Serge/Stephane for what examples they should look in their databases Input methods and fonts So and Marwan have integrated their input systems and font into SINUHESo/Marwan could you share with the us the online location so we can start using it?Bob has made and shared an experimental font for us to use -------------- next part -------------- An HTML attachment was scrubbed... URL: From mn31 at st-andrews.ac.uk Wed Jul 20 15:41:41 2016 From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof) Date: Wed, 20 Jul 2016 15:41:41 +0100 Subject: [Egyptian] Cambridge: conclusions and next steps In-Reply-To: References: Message-ID: <17511299.cOQPMXRhrm@bear> On Wednesday 20 Jul 2016 16:00:18 heleen wilbrink wrote: > let Michael, Bob > and Mark-Jan do a conference call or so soon to A. make sure there is one > proposal that all agree on and can be shared with the group and B. give > guidelines to Simon/Serge/Stephane for what examples they should look in their > databases Heleen, It is kind of you to try to coordinate, but you don't seem to understand the issues. There are already two competing proposals since 2016-06-30, when we uploaded ours onto L2. What I've tried to do in the past emails is to point out that a naive attempt to pick and choose from the two proposals leads to inconsistencies. We need a coherent design, not majority votes for wishlists of individual elements, nor a friendly compromise that leaves us with ambiguous or inconsistent notation. The problems I have pointed out are quite technical, difficult, and not amenable to oral communication alone, and a conference call is not going to bring us any closer to a resolution. 
Mark-Jan

From ncs3 at cam.ac.uk Wed Jul 20 15:58:59 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Wed, 20 Jul 2016 15:58:59 +0100
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To: <17511299.cOQPMXRhrm@bear>
References: <17511299.cOQPMXRhrm@bear>
Message-ID: <9DD2825D-7541-46FB-9282-FB8C1E7BEC29@cam.ac.uk>

Mark-Jan

I was going to send this privately, but I think it has wider implications.

I hate to say this, but a little more tact in emails is never a bad thing. Do consider retracting or modifying that statement.

If Heleen, and by inference I, cannot understand the issues, then you and the other people putting forward proposals should bear some of the responsibility because you have not communicated it in a way those of us not intimately involved can understand.

I made a great play in my closing remarks that what is being done needs to be understandable and usable by the Egyptologists for whom you say you are doing this (as otherwise it reverts to being an intellectual exercise). And please remember that all of you had been thrashing these things around for some time but it took someone like me, strongly backed by Heleen, to get you all together to move things on this much.

Personally, I cannot see why a video call (or calls) between a handful of you cannot work, but you may be right. Perhaps you do need another meeting, a technical one, just for the small group of you where you don't need to worry about people like me asking silly questions and holding you back. I am fine with that. But I see from this present debate the somewhat entrenched positions, which I thought were breaking down on the Monday evening and Tuesday, re-emerging.

Nigel

On 20 Jul 2016, at 15:41, Mark-Jan Nederhof wrote:

> On Wednesday 20 Jul 2016 16:00:18 heleen wilbrink wrote:
>> let Michael, Bob
>> and Mark-Jan do a conference call or so soon to A. make sure there is one
>> proposal that all agree on and can be shared with the group and B. give
>> guidelines to Simon/Serge/Stephane for what examples they should look in their
>> databases
>
> Heleen,
>
> It is kind of you to try to coordinate, but you don't seem to understand the issues.
> There are already two competing proposals since 2016-06-30, when we uploaded
> ours onto L2. What I've tried to do in the past emails is to point out that a naive
> attempt to pick and choose from the two proposals leads to inconsistencies. We
> need a coherent design, not majority votes for wishlists of individual elements,
> nor a friendly compromise that leaves us with ambiguous or inconsistent notation.
>
> The problems I have pointed out are quite technical, difficult, and not amenable
> to oral communication alone, and a conference call is not going to bring us any closer
> to a resolution.
>
> Mark-Jan
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From ishida at w3.org Wed Jul 20 16:07:40 2016
From: ishida at w3.org (ishida at w3.org)
Date: Wed, 20 Jul 2016 16:07:40 +0100
Subject: [Egyptian] Mailing list stuff
Message-ID: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org>

folks,

please note that Michael has changed the email archive so that it can be viewed by the public at large (not just subscribers). Hopefully, it is also visible to search engines.

Btw, if you have been accessing the archive you may need to change your existing bookmarks or links to point to http://evertype.com/pipermail/egyptian_evertype.com/

cheers,
ri

From ncs3 at cam.ac.uk Wed Jul 20 16:26:40 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Wed, 20 Jul 2016 16:26:40 +0100
Subject: [Egyptian] Mailing list stuff
In-Reply-To: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org>
References: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org>
Message-ID: <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk>

Does this mean it is no longer a private list? if so, I want out? Honestly; these are still private deliberations.

Nigel

From ishida at w3.org Wed Jul 20 17:22:14 2016
From: ishida at w3.org (ishida at w3.org)
Date: Wed, 20 Jul 2016 17:22:14 +0100
Subject: [Egyptian] Mailing list stuff
In-Reply-To: <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk>
References: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org> <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk>
Message-ID: <5d349150-6a18-4e47-cf0f-36d775ca7adc@w3.org>

On 20/07/2016 16:26, Nigel Strudwick wrote:
> Does this mean it is no longer a private list? if so, I want out? Honestly; these are still private deliberations.

My understanding is that only subscribers can post to the mailing list, but anyone can read the mails in the archive.

Sorry Nigel, but since you were originally pushing for a W3C list i thought you were in favour of making the archive public. This is the standard approach for W3C and Unicode lists, since it makes the information widely available and historically accessible (see http://evertype.com/pipermail/egyptian_evertype.com/2016-July/000083.html) and is likely to attract other useful experts to the discussion over time.

If you or other group members don't like that, then we should ask Michael to revert the list to a closed circulation. For my part, i think it would be a pity, though.
cheers,
ri

From mn31 at st-andrews.ac.uk Wed Jul 20 17:23:11 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 17:23:11 +0100
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To: <9DD2825D-7541-46FB-9282-FB8C1E7BEC29@cam.ac.uk>
References: <17511299.cOQPMXRhrm@bear> <9DD2825D-7541-46FB-9282-FB8C1E7BEC29@cam.ac.uk>
Message-ID: <29403533.ajqXQdU4yC@bear>

Nigel,

You are absolutely right. The tone of my message was unacceptable. I offered my apologies to Heleen off-list.

For now, I need to take a break until Friday. Too much time pressure with unrelated matters.

Mark-Jan

From everson at evertype.com Wed Jul 20 19:47:38 2016
From: everson at evertype.com (Michael Everson)
Date: Wed, 20 Jul 2016 19:47:38 +0100
Subject: [Egyptian] Mailing list stuff
In-Reply-To: <5d349150-6a18-4e47-cf0f-36d775ca7adc@w3.org>
References: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org> <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk> <5d349150-6a18-4e47-cf0f-36d775ca7adc@w3.org>
Message-ID: <9DC6E31B-AF43-4ECE-B706-69BA1445FBA4@evertype.com>

On 20 Jul 2016, at 17:22, ishida at w3.org wrote:

> My understanding is that only subscribers can post to the mailing list, but anyone can read the mails in the archive.

Correct, with the current settings.

Michael

From runa.uei at gmail.com Wed Jul 20 19:49:40 2016
From: runa.uei at gmail.com (So Miyagawa)
Date: Wed, 20 Jul 2016 20:49:40 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps

Hi Heleen and all,

I put HieroJIS and links to its demo and how-to videos on GitHub with basic vocabulary (not including the data of TLA).
https://github.com/somiyagawa/Toolkit-for-Coptic-and-Ancient-Egyptian-including-HieroJIS

If Marwan and TLA agree, I will put up all the parts of SINUHE. SINUHE is HieroJIS & Marwan's group writing font using TLA data. (Sublime INputting of Unicode for Hieroglyphic Egyptian)

Now, several students of Goettingen and Macquarie are contributing to our data. We will have a workshop mainly for these contributors in November or December.

I'm now implementing a CATEGORY or DESCRIPTION input system, e.g. SEATEDMAN --> A1, MAN --> all the glyphs in A.
If you have a good list of descriptions of glyphs other than Gardiner's, please let me know.

Best,
So

________________________________
So Miyagawa
Georg-August-Universität Göttingen (Egyptology & Coptology, Ph.D. candidate), SFB1136 "Bildung und Religion in Kulturen des Mittelmeerraums und seiner Umwelt von der Antike bis zum Mittelalter und zum Klassischen Islam" (Research Fellow), KELLIA (Research Fellow), Coptic SCRIPTORIUM (Research Member), Unicode Consortium (Student Member)
Kyoto University (Linguistics, Ph.D. candidate)
SFB1136: https://www.uni-goettingen.de/de/531081.html
CV: https://docs.google.com/document/d/1HhhKovsJzqZQGCn6W1oNweqyqKYUfUvFTxAlStKICdM/edit?usp=sharing
Academia.edu: https://uni-goettingen.academia.edu/SoMiyagawa

"But the king of Assyria found treachery in Hoshea, for he had sent messengers to So, king of Egypt, and offered no tribute to the king of Assyria, as he had done year by year." (2 Kings 17:4, ESV)

From schweitzer at bbaw.de Thu Jul 21 08:39:14 2016
From: schweitzer at bbaw.de (Simon Schweitzer)
Date: Thu, 21 Jul 2016 09:39:14 +0200
Subject: [Egyptian] Ligature joiners - evidence needed please

Hi all,

concerning the one ligature joiner vs. the four corner ligature joiner:

> I've not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don't have unlimited time to research all this myself and I'm not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I'm doing with the group-joiners].

Why should one use four instead of one ligature joiner?
Because the encoding with four control characters offers more information. It is readable for the egyptologist and for the font developer. Cf. my example from last week: In the temple of Kom Ombo, there are two ligatures with the knife T31 and the bread X1. We can encode this in RES with insert[bs](T31,X1) and insert[te](T31,X1). The position of the X1 is obvious. But if we have T31&X1, one cannot decide the position of X1. And a font developer could create such a ligature with the "corner control character" even if he does not see the original reference, but he cannot do anything if he only has T31&X1. But what about T31&N29 or T31&O49, ligatures that occur in the Kom Ombo temple? We (the egyptologists, the font developers, the fonts) cannot interpret the ligature only with this encoding.
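An editorial illustration of the point above, as a minimal Python sketch and not anyone's actual implementation: it reads the two notations quoted in the message and shows that the `&` form carries no position, while the RES-style `insert[...]` form does. The `Ligature` type and `parse` function are hypothetical names introduced here, the reading of `bs` and `te` as bottom-start and top-end is an assumption about RES, and only simple two-sign groups are handled.

```python
import re
from typing import NamedTuple, Optional


class Ligature(NamedTuple):
    base: str               # sign that receives the insertion, e.g. "T31"
    inserted: str           # sign placed into the base, e.g. "X1"
    corner: Optional[str]   # e.g. "bs" or "te"; None if the notation does not say


# Matches two-sign RES-style forms such as insert[bs](T31,X1).
_INSERT = re.compile(r"insert\[(\w+)\]\((\w+),(\w+)\)")


def parse(code: str) -> Ligature:
    """Parse a two-sign ligature written either as insert[c](A,B) or as A&B."""
    m = _INSERT.fullmatch(code)
    if m:
        corner, base, inserted = m.groups()
        return Ligature(base, inserted, corner)
    if code.count("&") == 1:
        base, inserted = code.split("&")
        return Ligature(base, inserted, None)  # position is simply not recorded
    raise ValueError(f"unsupported notation: {code!r}")


# The two Kom Ombo ligatures stay distinct in the corner notation ...
kom_ombo = [parse("insert[bs](T31,X1)"), parse("insert[te](T31,X1)")]
# ... but both would have to be written T31&X1 with a single joiner.
single_joiner = parse("T31&X1")
```

With the corner notation the two entries in `kom_ombo` differ in their `corner` field; with the single joiner there is only one possible value, whose `corner` is `None`, so neither a reader nor a font can tell which of the two attested ligatures was meant.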
Okay, the Kom Ombo temple belongs to the Ptolemaic and not to the "classical" writing system. But the encoding of this temple is important for the TLA project, because the Kom Ombo project wants to encode their data in our system, so that we will have such encodings in our material. But there are such problems in more classical data, too:

For example the ligatures with E6. Bob, you offer three ligatures in your EGPZ 1.0 BETA Specification (August 2007): U+eb13 E6&Z2d, U+eb14 E6&X1, and U+eb15 E6&Z1. These ligatures could be encoded as insert[te](E6,Z2), insert[te](E6,X1), insert[b](E6,Z1) in RES. But there are examples where the plural strokes are in another corner. These examples (DZA 24.607.440 or DZA 28.723.300) could be encoded as insert[bs](E6,Z2) in RES. As in T31&X1, the encoding of E6&Z2 is ambiguous. The three ligatures with E6 use different corners. Therefore, it is not clear how to interpret E6&D2 or E6&X1&Z2. But if we have the four corner control characters, we could encode something like the RES code insert[te](E6,D2) instead of E6&D2 in DZA 28.722.290. The E6&X1&Z2 would be the readable insert[bs](E6,X1*Z2) in RES (DZA 28.722.530). BTW, this example is an argument for the special grouping character "g*" which Michael wrote on the whiteboard last week: X1 g* Z2 insert_bs E6 and the g*-group has to be parsed first. But this is another topic...

Best regards,

Simon

From s.polis at ulg.ac.be Thu Jul 21 10:56:34 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Thu, 21 Jul 2016 11:56:34 +0200
Subject: [Egyptian] Ligature joiners - evidence needed please
Message-ID: <847EA36B-D71C-4F32-986F-7796A7912272@ulg.ac.be>

Hi all,

Please find attached data from Ramses regarding:

1) free-groups.gly, i.e. groups which involve an absolute positioning of signs in MdC (**, etc.); a lot of these groups are directly relevant for (1) making a case about the need for the four corner INSERT operators, (2) exemplifying the need for a "stack" operator, etc., etc.
2) ligatures.gly, i.e. groups that use the "&" operator; directly relevant for illustrating all the cases of the 4 corners operators.
3) subgroups.gly, i.e. groups with the parentheses in MdC, which are directly relevant for illustrating the several levels of embedding of hieroglyphs.

Note that:
* the number before indicates in how many spellings the group/ligature occurs in our corpus.
* the encoding of these spellings was made by PhD or MA students; there are inconsistencies in the encoding because they first and foremost tried to stick to the visual arrangement of the signs in the edition of the text.
* 98% of the data come from horizontal text (I admit that the *vertical argument* made by Bob several times escapes me a bit, but I might not understand fully what you mean, Bob).

We are happy to share this material under 2 conditions:

1 - A reference to Ramses should be made for any use, minimally referring to the website.
2 - These groups are there for helping the development of an appropriate encoding scheme for Unicode as regards the control characters, but they should by no means be integrated as glyphs or characters into Unicode.

I'm sorry to insist again on this and to stress that it would be against all the principles advocated for during the meeting by the Egyptologists (as well as against the conditions for using the material from Ramses): (1) such combinations of signs are productive in ancient Egyptian and we want to be able to encode them without adding new groups in Unicode, this would make no sense; (2) we have to be able to search easily for the signs in these groups as well as the position of these signs in the groups.
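An editorial illustration of requirement (2), as a minimal Python sketch under stated assumptions: groups are stored as RES-style strings of the kind quoted earlier in this thread, sign codes follow the Gardiner pattern (a capital letter, an optional lowercase letter, digits, an optional lowercase suffix), and the base of an insertion is a single sign. The `find` function and the sample `corpus` are hypothetical and are not taken from Ramses or the TLA.

```python
import re

# Gardiner-style sign codes such as T31, X1, Z2d or Aa1.
SIGN = re.compile(r"[A-Z][a-z]?\d+[a-z]?")
# RES-style insertion with a single-sign base: insert[corner](base,inserted-material)
INSERT = re.compile(r"insert\[(\w+)\]\(([^,]+),(.+)\)")


def find(corpus, sign, corner=None):
    """Return the encodings that contain `sign`.

    If `corner` is given, keep only encodings where `sign` is part of the
    material inserted at that corner.
    """
    hits = []
    for code in corpus:
        if corner is None:
            if sign in SIGN.findall(code):
                hits.append(code)
            continue
        m = INSERT.fullmatch(code)
        if m and m.group(1) == corner and sign in SIGN.findall(m.group(3)):
            hits.append(code)
    return hits


# Sample encodings taken from the RES examples quoted in this thread.
corpus = [
    "insert[bs](T31,X1)",
    "insert[te](T31,X1)",
    "insert[bs](E6,X1*Z2)",
    "insert[te](E6,D2)",
]
```

Because the signs remain separate tokens in the encoding, a query such as `find(corpus, "X1")` needs no table of precomposed groups, and `find(corpus, "X1", "bs")` can additionally restrict the search by position. Neither query would be possible if each combination were encoded as an opaque single character.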
The only way out in my view is: a) A well defined set of **insert** operators with a precise semantics. b) A way to make sub-groups (parentheses, precedence operators, begin-end marker, whatever you like). More comments soon regarding the other topics! Have a nice day, St?phane (also on behalf of Serge, of course) -------------- next part -------------- A non-text attachment was scrubbed... Name: freeGroups.gly Type: application/octet-stream Size: 22407 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ligatures.gly Type: application/octet-stream Size: 8364 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: subGroups.gly Type: application/octet-stream Size: 13240 bytes Desc: not available URL: -------------- next part -------------- ------------------------------------------------------ Chercheur qualifi? F.R.S.-FNRS Universit? de Li?ge Service d'?gyptologie D?partement des sciences de l?Antiquit? Place du 20-Ao?t, B-4000 Li?ge http://www.egypto.ulg.ac.be ------------------------------------------------------ > Le 21 juil. 2016 ? 09:39, Simon Schweitzer a ?crit : > > Hi all, > > concerning the one ligature joiner vs. the four corner ligature joiner: > >> I?ve not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don?t have unlimited time to research all this myself and I?m not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I?m doing with the group-joiners]. > > Why should one use four instead of one ligature joiner? > Because the encoding with four control characters offers more information. It is readable for the egyptologist and for the font developer. Cf. 
my example from the last week: In the temple of Kom Ombo, there are two ligatures with the knife T31 and the bread X1. We can encode this in RES with insert[bs](T31,X1) and insert[te](T31,X1). The position of the X1 is obvious. But if we have T31&X1, one cannot decide the position of X1. And a font developer could create such ligature with the "corner control character" even if he do not see the original reference, but he cannot do anything if he only has T31&X1. But what about T31&N29 or T31&O49, ligatures, that occurs in the Kom Ombo temple? We (the egyptologists, the font developers, the fonts) cannot interpret the ligature only with this encoding. > > Okay, the Kom Ombo temple belongs to the ptolemaic and not to the "classical" writing system. But the encoding of this temple is important for the TLA project, because the Kom Ombo project wants to encode their data in our system, so that we will have such encodings in our material. But there are such problems in more classical data, too: > > For example the ligatures with E6. Bob, you offers three ligatures in your EGPZ 1.0 BETA Specification (August 2007): U+eb13 E6&Z2d, U+eb14 E6&X1, and U+eb15 E6&Z1. These ligatures could be encoded as insert[te](E6,Z2), insert[te](E6,X1), insert[b](E6,Z1) in RES. But there are examples where the plral strokes are in another corner. These examples (DZA 24.607.440 or DZA 28.723.300 ) could be encoded as insert[bs](E6,Z2) in RES. As in T31&X1, the encoding of E6&Z2 is ambiguous. The three ligatures which E6 use different corners. Therefore, it is not clear how to interpret E6&D2 or E6&X1&Z2. But if we have the four corner control characters, we could encode something like the RES code insert[te](E6,D2) instead of E6&D2 in DZA 28.722.290 . The E6&X1&Z2 would be the readable insert[bs](E6,X1*Z2) in RES (DZA 28.722.530 ). 
BTW, this example is an argument for the special grouping character "g*" which Michael wrote on the whiteboard last week: X1 g* Z2 insert_bs E6 and the g*-group has to be parsed first. But this is another topic... > > Best regards, > > Simon > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com From bobqq at live.co.uk Thu Jul 21 11:00:23 2016 From: bobqq at live.co.uk (Bob Richmond) Date: Thu, 21 Jul 2016 11:00:23 +0100 Subject: [Egyptian] Ligature joiners - evidence needed please Message-ID: Hi Simon Thanks for the examples. Just to be clear to all. I?d like to make a good case for controls and identify/illustrate any variations or issues needing discussion. If anyone has time to create a doc/PDF with a bunch of graphic illustrations of clusters showing awkward cases this would save me time ? I?ve plenty else to do on this ASAP so help appreciated. Bob From: Simon Schweitzer Sent: 21 July 2016 08:40 To: Egyptian Hieroglyphs in the UCS Subject: Re: [Egyptian] Ligature joiners - evidence needed please Hi all, concerning the one ligature joiner vs. the four corner ligature joiner: > I?ve not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don?t have unlimited time to research all this myself and I?m not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I?m doing with the group-joiners]. Why should one use four instead of one ligature joiner? Because the encoding with four control characters offers more information. It is readable for the egyptologist and for the font developer. Cf. my example from the last week: In the temple of Kom Ombo, there are two ligatures with the knife T31 and the bread X1. We can encode this in RES with insert[bs](T31,X1) and insert[te](T31,X1). 
The position of the X1 is obvious. But if we have T31&X1, one cannot decide the position of X1. And a font developer could create such ligature with the "corner control character" even if he do not see the original reference, but he cannot do anything if he only has T31&X1. But what about T31&N29 or T31&O49, ligatures, that occurs in the Kom Ombo temple? We (the egyptologists, the font developers, the fonts) cannot interpret the ligature only with this encoding. Okay, the Kom Ombo temple belongs to the ptolemaic and not to the "classical" writing system. But the encoding of this temple is important for the TLA project, because the Kom Ombo project wants to encode their data in our system, so that we will have such encodings in our material. But there are such problems in more classical data, too: For example the ligatures with E6. Bob, you offers three ligatures in your EGPZ 1.0 BETA Specification (August 2007): U+eb13 E6&Z2d, U+eb14 E6&X1, and U+eb15 E6&Z1. These ligatures could be encoded as insert[te](E6,Z2), insert[te](E6,X1), insert[b](E6,Z1) in RES. But there are examples where the plral strokes are in another corner. These examples (DZA 24.607.440 or DZA 28.723.300 ) could be encoded as insert[bs](E6,Z2) in RES. As in T31&X1, the encoding of E6&Z2 is ambiguous. The three ligatures which E6 use different corners. Therefore, it is not clear how to interpret E6&D2 or E6&X1&Z2. But if we have the four corner control characters, we could encode something like the RES code insert[te](E6,D2) instead of E6&D2 in DZA 28.722.290 . The E6&X1&Z2 would be the readable insert[bs](E6,X1*Z2) in RES (DZA 28.722.530 ). BTW, this example is an argument for the special grouping character "g*" which Michael wrote on the whiteboard last week: X1 g* Z2 insert_bs E6 and the g*-group has to be parsed first. But this is another topic... 
Best regards, Simon _______________________________________________ Egyptian mailing list Egyptian at evertype.com http://evertype.com/mailman/listinfo/egyptian_evertype.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncs3 at cam.ac.uk Thu Jul 21 11:22:21 2016 From: ncs3 at cam.ac.uk (Nigel Strudwick) Date: Thu, 21 Jul 2016 11:22:21 +0100 Subject: [Egyptian] Ligature joiners - evidence needed please In-Reply-To: References: Message-ID: Bob Was just glancing to see if there were any awkward groups in the TT99 material and one jumped out which is not so often mentioned, that is with the tongue sign when used as imy-r ?overseer?. It is impossible to get grouping it right in JSesh without considerable manual manipulation, or export to Illustrator (which I prefer anyway). I attach some graphics, one from a quick go in JSesh, and others from the facsimile originals. The ?tail? of the tongue has something in common with D. Title groups are often enough to defeat any setting program. The Z1 sign is a pain too, but that may be more of the fact that many fonts oversize it. Using ?3? for the squashed plural signs produces a better effect. But I realise this is not the issue about which you are concerned. Nigel On 21 Jul 2016, at 11:00, Bob Richmond wrote: > Hi Simon > > Thanks for the examples. > > Just to be clear to all. I?d like to make a good case for controls and identify/illustrate any variations or issues needing discussion. > > If anyone has time to create a doc/PDF with a bunch of graphic illustrations of clusters showing awkward cases this would save me time ? I?ve plenty else to do on this ASAP so help appreciated. > > Bob > > From: Simon Schweitzer > Sent: 21 July 2016 08:40 > To: Egyptian Hieroglyphs in the UCS > Subject: Re: [Egyptian] Ligature joiners - evidence needed please > > Hi all, > > concerning the one ligature joiner vs. 
the four corner ligature joiner: > > > I?ve not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don?t have unlimited time to research all this myself and I?m not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I?m doing with the group-joiners]. > > Why should one use four instead of one ligature joiner? > Because the encoding with four control characters offers more > information. It is readable for the egyptologist and for the font > developer. Cf. my example from the last week: In the temple of Kom Ombo, > there are two ligatures with the knife T31 and the bread X1. We can > encode this in RES with insert[bs](T31,X1) and insert[te](T31,X1). The > position of the X1 is obvious. But if we have T31&X1, one cannot decide > the position of X1. And a font developer could create such ligature with > the "corner control character" even if he do not see the original > reference, but he cannot do anything if he only has T31&X1. But what > about T31&N29 or T31&O49, ligatures, that occurs in the Kom Ombo temple? > We (the egyptologists, the font developers, the fonts) cannot interpret > the ligature only with this encoding. > > Okay, the Kom Ombo temple belongs to the ptolemaic and not to the > "classical" writing system. But the encoding of this temple is important > for the TLA project, because the Kom Ombo project wants to encode their > data in our system, so that we will have such encodings in our material. > But there are such problems in more classical data, too: > > For example the ligatures with E6. Bob, you offers three ligatures in > your EGPZ 1.0 BETA Specification (August 2007): U+eb13 E6&Z2d, U+eb14 > E6&X1, and U+eb15 E6&Z1. These ligatures could be encoded as > insert[te](E6,Z2), insert[te](E6,X1), insert[b](E6,Z1) in RES. 
> But there are examples where the plural strokes are in another corner.
> These examples (DZA 24.607.440 or DZA 28.723.300) could be encoded as
> insert[bs](E6,Z2) in RES. As with T31&X1, the encoding E6&Z2 is
> ambiguous. The three ligatures with E6 use different corners.
> Therefore, it is not clear how to interpret E6&D2 or E6&X1&Z2. But if we
> have the four corner control characters, we could encode something like
> the RES code insert[te](E6,D2) instead of E6&D2 in DZA 28.722.290.
> The E6&X1&Z2 would be the readable insert[bs](E6,X1*Z2) in RES
> (DZA 28.722.530). BTW, this example is an argument for the special
> grouping character "g*" which Michael wrote on the whiteboard last
> week: X1 g* Z2 insert_bs E6, and the g*-group has to be parsed first.
> But this is another topic...
>
> Best regards,
>
> Simon
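
The ambiguity Simon describes above can be made concrete with a small Python sketch. It is only an illustration, not part of RES or of any proposal: the function name `describe`, the regular expression, and the English glosses for the place codes (bs read as bottom-start, te as top-end, and so on) are assumptions made here. An insert[...] encoding yields a position for the inserted sign; a plain & ligature yields none.

```python
import re

# Place codes as they appear in the message; the glosses are an assumption.
PLACES = {
    "ts": "top-start", "te": "top-end",
    "bs": "bottom-start", "be": "bottom-end",
    "t": "top", "b": "bottom", "s": "start", "e": "end",
}

# Hypothetical, simplified pattern: insert[place](base,inserted)
INSERT_RE = re.compile(r"^insert\[([a-z]+)\]\((\w+),(.+)\)$")


def describe(encoding):
    """Return (base, inserted, place); place is None if the encoding does not state it."""
    match = INSERT_RE.match(encoding)
    if match:
        place, base, inserted = match.groups()
        return base, inserted, PLACES[place]
    if "&" in encoding:
        # A plain ligature joiner names the signs but not where they sit.
        base, inserted = encoding.split("&", 1)
        return base, inserted, None
    raise ValueError(f"not a ligature encoding: {encoding!r}")


for enc in ("insert[bs](T31,X1)", "insert[te](T31,X1)", "T31&X1"):
    print(enc, "->", describe(enc))
```

For T31&X1 the place comes back as None, which is the information a font developer lacks without seeing the original.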
From s.polis at ulg.ac.be Thu Jul 21 11:41:43 2016
From: s.polis at ulg.ac.be (Stéphane Polis)
Date: Thu, 21 Jul 2016 12:41:43 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps

Hi Heleen,
Hi everyone,

Thanks for the summary, it's a great start for putting things on the table as clearly as possible.

> Repertoire
>
> A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like e.g. IFAO) for each hieroglyph from the proposal of Michel Suignard. Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted.
> Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan
> The hieroglyphs with references can be added in tranches
> The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Möller to which the hieratic dot will be added
> Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August
> Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September

I was not there on Tuesday morning, so apologies if you reached an agreement that I am unaware of, but this is not how I see the forthcoming steps in terms of repertoire.

The proposal of Michel for the hieroglyphica represents a significant work, but a work that is impossible to document quickly. And we certainly do not want to introduce signs without knowing what they mean, what they represent and where they are attested (many problematic examples were discussed by Ingelore as well as Serge and myself during the meeting).
Sourcing these hieroglyphs is an enormous endeavor, even for professional Egyptologists, and it will take some years (probably not so many, but we are talking in years before being done); anyway, there is no hurry for anyone to use Unicode for these signs as far as I can tell. What we can do quite quickly, on the other hand, is:

1. Extract all the hieroglyphs that we presently use in the TLA and Ramses, look for the ones missing in Unicode, and prepare an addition to Unicode for these signs that we can document precisely.
2. Do the same for the signs identified in Möller by Michael; check the proposal that he prepared (ask our colleague from Mainz) and prepare a document for addition to Unicode.
3. In October, some scholars will begin to work full-time on the sign-list, and the joint work between the TLA and Ramses should become progressively available online (Thot Sign-list), with references to signs, classes and forms, as well as to the scholars responsible for the identification of the values, shapes, etc. This will provide a good basis for integrating the hieroglyphica *step-by-step* in Unicode, when we can actually say what each individual sign is.

In a nutshell, the work done by Michel will be more than useful, but should be integrated step by step in the future (probably a third step).
I think I should get the support of all the Egyptologists on this. Correct me guys if I'm wrong.

> Monograms:
>
> I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).
> Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B

Agreed.
> Control characters
>
> On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"

Sure. I just sent the data from Ramses that are needed for illustrating and arguing in favour of these control characters.

> It was agreed that this proposal will be finished in August, I thought by Michael, and discussed at the Unicode meeting on the 6th of September. Now it seems from the mail that several people are working on it separately
> Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B. give guidelines to Simon/Serge/Stephane for what examples they should look for in their databases

If possible, I think that the best would be to produce a single "consensus" document, which could be signed and approved by all the parties.
I leave it up to you to decide who is to produce the first draft. I'm certainly happy to read through it carefully, provide examples, and check everything I can, but I cannot write the first draft, which would be beyond my technical capacities and the time I have left during the next weeks.

Best wishes,

Stéphane

> Input methods and fonts
>
> So and Marwan have integrated their input systems and font into SINUHE
> So/Marwan could you share with us the online location so we can start using it?
> Bob has made and shared an experimental font for us to use
From hafemann at bbaw.de Thu Jul 21 13:19:00 2016
From: hafemann at bbaw.de (Ingelore Hafemann)
Date: Thu, 21 Jul 2016 14:19:00 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps

Dear Stéphane, dear all,

thanks, Stéphane, for your contribution; I was just about to write a very similar mail at this moment.

I would like to agree with Stéphane on all points, especially about the sign list, and so does Simon. In Cambridge I hope to have shown in my record how difficult it is to check all signs. We could give Michel an electronic list of the encoded hieroglyphs that occur in our TLA, excluding the well-referenced Gardiner codes. So we have the references of the TLA, but another problem: students have used variants of signs in the process of encoding our texts, and we have to check them. Our aim is to differentiate between signs and variants as far as possible.

1. We can also give you a list of signs missing in Unicode, with references from our corpus, as Stéphane proposed in his point 1.

Adding to Stéphane's point 3: in Berlin we have checked and described nearly 2/3 of Group A (Man), all of Group G (Birds, 313), all of Group P (Ships, 121) and all of Group Z (Strokes, Geometrical F., 39), and a lot of signs of the other groups less systematically and by the way. We are continuing this work in Berlin with one half woman-power and are happy that in October one or more scholars will start in Liège. The Thot sign list, our common product, will be published step by step. I guess, as Stéphane does, that it will take a few years to finish it. We could give part of the already finished signs to Michel step by step. This should be possible if it seems useful.
Concerning point 2: I would again recommend contacting Ursula Verhoeven and Svenja Gülden in Mainz, who are engaged in creating a new electronic and digitized "Möller".

Concerning the problems of control characters, Simon has discussed the latest issues and sent some evidence from our text archive.

Many greetings and best wishes
Ingelore and Simon

-- 
Dr. Ingelore Hafemann
Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache
Postal address: Jägerstraße 22/23
Office: Unter den Linden 8
10117 Berlin
Tel: (030) 20370 447

From schweitzer at bbaw.de Thu Jul 21 14:24:51 2016
From: schweitzer at bbaw.de (Simon Schweitzer)
Date: Thu, 21 Jul 2016 15:24:51 +0200
Subject: [Egyptian] Brackets in the TLA encoding

Hi all,

@Stéphane: thank you for your .gly files!

In this mail, I want to add some remarks concerning the subgroup topic. As in Ramses, there are many encodings with "(" and ")" in the TLA. I collected these encodings and I want to present my evaluation:

* In some cases, the encoding is invalid, e.g. (F12-S29):D21, which should be understood as F12*S29:D21.
* Sometimes, the brackets are superfluous. There are many cases of Hiero1:(Hiero2*Hiero3). The brackets are not necessary: use Hiero1:Hiero2*Hiero3!
* But in many cases, the parsing without the brackets would be misleading:

1) There are many vertical groups in horizontal groups in vertical groups. I list only 10 examples:

N35:"?"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1; ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22; ID:4VLZLA44UVGJZN22WIWP774LOQ)
Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21; ID:4VLZLA44UVGJZN22WIWP774LOQ)
Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
D21:V28*(X1:B1) ("Stele des Montuhotep (Cambridge E.9.1922)", l.
C.3; ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
D21:V7*(W24:X1) ("Stele des Montuhotep (Kairo CG 20539)", l. I.b.18; ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
D35A:(X1:Z4A)*G37 ("Stele des Antef (Louvre C 167 = E 3111)", l. C.1; ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
D36:D21:N29*(X1:"?"*Z4*"?") ("Stele des Antef (Glasgow D1922.13)", l. A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
I10&I9:X2*(X4:Z2) ("Stele des Antef (Glasgow D1922.13)", l. A.7; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
K4:G1*(Z7:X1) (Sinuhe AOS, vs. 18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)

As you can see, this kind of grouping occurs in hieroglyphic and in hieratic texts, and this feature is also attested in the "classical" period from the Middle Kingdom (the examples from the stelae of Montuhotep and Antef).

2) Horizontal grouping of vertical groups in columns

If the text is written vertically, there are cases of horizontal groups of vertical groups, e.g. in the Buch von der Himmelskuh (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4): M17*(Q3:N35), (X1:Z1)*I12, M17*S29*(A2:Z2) and so on.

Best regards,
Simon

From hawilbrink at hotmail.com Thu Jul 21 20:19:09 2016
From: hawilbrink at hotmail.com (heleen wilbrink)
Date: Thu, 21 Jul 2016 21:19:09 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps

Dear Ingelore and Stéphane, dear all,

It is great to see so much valuable input on our mail group. I would like to clarify one thing regarding the repertoire. I completely agree with you, Stéphane and Ingelore, and I am sorry that my wording was apparently not so clear.

For the short term (hopefully by mid August, as was discussed in Cambridge), the check on Michael's Möller list proposal can be done, and other missing hieroglyphs that have references can be added to it. If August is not feasible, it could, I guess, be later this year. The compilation of the entire list with references will indeed take much longer and can be done step by step.

Thanks, Ingelore, for the reminder of your suggestion to get in contact with Ursula and Svenja.
Is anyone from the Cambridge group in close contact with them and willing to contact them? All the best, Heleen Verstuurd vanaf mijn iPhone > Op 21 jul. 2016 om 14:20 heeft Ingelore Hafemann het volgende geschreven: > > Dear St?phan, dear all, > thanks St?phane for your contribution, I was just getting to write a very similar mail at this moment. > I would like to confirm to St?phane in al points - especially about the sign list and even Simon does. In Cambridge I hope to have show in my record how difficult it is to check all signs. We could give toMichel an electronic list of encoded hieroglyphs occured in our TLA , excluded the well referenced Gardiner Codes. So we have the reference of the TLA but an other problem: Students have used variants of signs in the process of encoding our texts, - we have to check it. Our aim is to differentiate between signs and variants as far as it its possible. > 1. We can give you a a list of signs missing in Unicode too, with references of our corpus - as St?phane proposed to do in point 1. below. > Adding to point 3. of Steph?ne below: In Berlin we have checked and described nearly 2/3 of the Group A (Man), all of Group G (Birds, 313) all of Group P (Ships, 121) and all of Group Z (Strokes, Geometrical F., 39) and a lot signs of the other groups more unsystematical and by-the way. These work we continue in Berlin with one half women power and a re happy that in October one or more scholars will start in Liege. The Thot-sign list - our common product - will be published step by step. I guess - as St?phane does - it will takes few years to finish it. We could give part of the allready finished signs to Michel step by step. This should be possible if it seems useful. 
> Concerning point 2.: I would again recommend to contact Ursula Verhoeven and Svenja G?lden in Mainz, who are engaged in creating a new electronic and digitized "M?ller" > > Concerning the problems of control characters Simon has discussed the last issues and send some evidences of our texts archive. > Many greetings and best wishes > Ingelore and Simon > > > >> Am 21.07.2016 um 12:41 schrieb St?phane polis: >> Hi Heleen, >> Hi everyone, >> >> Thanks for the summary, it?s a great start for putting things on the table as clearly as possible . >> >>> Repertoire >>> >>> A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like eg IFAO) for each hieroglyph from the proposal of Michel Suignard. Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted. >>> Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan >>> The hieroglyphs with references can be added in tranches >>> The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Moller to which the hieratic dot will be added >>> Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August >>> Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September >> I was not there on Tuesday morning, so apologies if you reached an agreement that I am unaware of, but this is not how I see the forthcoming steps in terms of repertoire. >> The proposal of Michel for the hieroglyphica represents a significant work, but a work that is impossible to document quickly. 
And we certainly do not want to introduce signs without knowing what they mean, what they represent and where they are attested (many problematic examples were discussed by Ingelore as well as Serge and myself during the meeting). Sourcing these hieroglyphs is a enormous endeavor, even for professional egyptologists and it will take some years (probably not so many, but we are talking in years before being done) and anyway there is no hurry for anyone to use Unicode for these signs as far as I can tell. What we can do quite quickly, on the other hand, is: >> >> 1. Extracting all the hieroglyphs that we presently use in the TLA and Ramses, looking for the ones missing in Unicode and preparing an addition in Unicode for these signs that we can document precisely. >> 2. To do the same for the signs identified in M?ller by Michael; check the proposal that he prepared (ask our colleague from Mainz) and prepare a document for addition in Unicode. >> 3. In October, some scholars will begin to work full-time on the sign-list and the joint-work between the TLA and Ramses should become progressively available online (Thot Sign-list), with references to signs, classes and forms, as well as to the scholars responsible for the identification of the values, shapes, etc. This will provide a good basis for integrating the hieroglyphica *step-by-step* in Unicode, when we can actually say what each individual sign is. >> >> In a nutshell, the work done by Michel will be more than useful, but should be integrated step by step in the future (probably a third step). >> I think I should get the support of all the egyptologists on this. Correct me guys if I?m wrong. >> >> >>> Monograms: >>> >>> I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). 
Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals). >>> Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B >> Agreed. >> >>> Control characters >>> >>> On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: ?The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls? >> Sure. I just sent the data from Ramses that are needed for illustrating and arguing in favour of theses control characters. >>> It was agreed that this proposal will be finished in August, I thought by Michael, and discussed on the Unicode meeting on the 6th of September. Now it seems over the mail that several people are working on it separately >>> Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B. give guidelines to Simon/Serge/Stephane for what examples they should look in their databases >> If possible, I think that the best would be to produce a single ?consensus? document, which could be signed and approved by all the parties. >> I leave it up to you to know who is to produce the first draft. I?m certainly happy to read through it carefully, provide examples, and check everything I can, but I cannot write the first draft, which would be beyond my technical capacities and the time I?ve left during the next weeks. >> >> Best wishes, >> >> St?phane >> >> >>> Input methods and fonts >>> >>> So and Marwan have integrated their input systems and font into SINUHE >>> So/Marwan could you share with the us the online location so we can start using it? 
>>> Bob has made and shared an experimental font for us to use >>> _______________________________________________ >>> Egyptian mailing list >>> Egyptian at evertype.com >>> http://evertype.com/mailman/listinfo/egyptian_evertype.com >> >> >> _______________________________________________ >> Egyptian mailing list >> Egyptian at evertype.com >> http://evertype.com/mailman/listinfo/egyptian_evertype.com > > -- > Dr. Ingelore Hafemann > Strukturen und Transformationen des Wortschatzes > der ?gyptischen Sprache > Post: J?gerstra?e 22/23 > Sitz: Unter den Linden 8 > 10117 Berlin > Tel: (030) 20370 447 > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- _______________________________________________ Egyptian mailing list Egyptian at evertype.com http://evertype.com/mailman/listinfo/egyptian_evertype.com From runa.uei at gmail.com Thu Jul 21 23:47:36 2016 From: runa.uei at gmail.com (So Miyagawa) Date: Thu, 21 Jul 2016 22:47:36 +0000 Subject: [Egyptian] Cambridge: conclusions and next steps In-Reply-To: References: Message-ID: Dear Heleen, I know Svenja. She is working for AKU, the Hieratic database project based in Mainz. She will give a good insight to us. Also, if you are interested in Demotic Unicode, in S?K in Vienna, I discussed unicode-ization of Demotic with Fabian from University of Heidelberg. We reached a conclusion that we need three Unicode blocks for Demotic, Early Demotic, Ptolemaic and Roman. He is one of the leaders of Demotic database project called DPDP. Best wishes from D?sseldorf International Airport, So On Thu, Jul 21, 2016 at 21:20 heleen wilbrink wrote: > Dear Ingelore and St?phane, dear all, > > It is great to see so much valuable input on our mail group. I would like > to clarify one thing regarding the repertoire. 
I completely agree with you > St?phane and Ingelore and I am sorry that my wording was apparently not so > clear. > > For short term (hopefully by mid August, as was discussed in Cambridge) > the check on the Mollerlist proposal of Michael can be done and other > missing hieroglyphs that have references can be added to it. If August is > not feasible it could be I guess later this year. The compilation of the > entire list with references indeed will take much longer and can be done > step by step. > > Thanks Ingelore for reminding your suggestion to get in contact with > Ursula and Svenja. Is anyone from the Cambridge group in close contact with > them and willing to contact them? > > All the best, > Heleen > > Verstuurd vanaf mijn iPhone > > Op 21 jul. 2016 om 14:20 heeft Ingelore Hafemann het > volgende geschreven: > > Dear St?phan, dear all, > > thanks St?phane for your contribution, I was just getting to write a > very similar mail at this moment. > > I would like to confirm to St?phane in al points - especially about the > sign list and even Simon does. In Cambridge I hope to have show in my > record how difficult it is to check all signs. We could give toMichel an > electronic list of encoded hieroglyphs occured in our TLA , excluded the > well referenced Gardiner Codes. So we have the reference of the TLA but an > other problem: Students have used variants of signs in the process of > encoding our texts, - we have to check it. Our aim is to differentiate > between signs and variants as far as it its possible. > > 1. We can give you a a list of signs missing in Unicode too, with > references of our corpus - as St?phane proposed to do in point 1. below. > > Adding to point 3. of Steph?ne below: In Berlin we have checked and > described nearly 2/3 of the Group A (Man), all of Group G (Birds, 313) all > of Group P (Ships, 121) and all of Group Z (Strokes, Geometrical F., 39) > and a lot signs of the other groups more unsystematical and by-the way. 
> These work we continue in Berlin with one half women power and a re happy > that in October one or more scholars will start in Liege. The Thot-sign > list - our common product - will be published step by step. I guess - as > St?phane does - it will takes few years to finish it. We could give part of > the allready finished signs to Michel step by step. This should be possible > if it seems useful. > > Concerning point 2.: I would again recommend to contact Ursula Verhoeven > and Svenja G?lden in Mainz, who are engaged in creating a new electronic > and digitized "M?ller" > > Concerning the problems of control characters Simon has discussed the last > issues and send some evidences of our texts archive. > > Many greetings and best wishes > > Ingelore and Simon > > > > > Am 21.07.2016 um 12:41 schrieb St?phane polis: > > Hi Heleen, > Hi everyone, > > Thanks for the summary, it?s a great start for putting things on the table as clearly as possible . > > > Repertoire > > A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like eg IFAO) for each hieroglyph from the proposal of Michel Suignard. Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted. 
> Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan > The hieroglyphs with references can be added in tranches > The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Moller to which the hieratic dot will be added > Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August > Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September > > I was not there on Tuesday morning, so apologies if you reached an agreement that I am unaware of, but this is not how I see the forthcoming steps in terms of repertoire. > The proposal of Michel for the hieroglyphica represents a significant work, but a work that is impossible to document quickly. And we certainly do not want to introduce signs without knowing what they mean, what they represent and where they are attested (many problematic examples were discussed by Ingelore as well as Serge and myself during the meeting). Sourcing these hieroglyphs is a enormous endeavor, even for professional egyptologists and it will take some years (probably not so many, but we are talking in years before being done) and anyway there is no hurry for anyone to use Unicode for these signs as far as I can tell. What we can do quite quickly, on the other hand, is: > > 1. Extracting all the hieroglyphs that we presently use in the TLA and Ramses, looking for the ones missing in Unicode and preparing an addition in Unicode for these signs that we can document precisely. > 2. To do the same for the signs identified in M?ller by Michael; check the proposal that he prepared (ask our colleague from Mainz) and prepare a document for addition in Unicode. > 3. 
In October, some scholars will begin to work full-time on the sign list, and the joint work between the TLA and Ramses should become progressively available online (Thot Sign-List), with references to signs, classes and forms, as well as to the scholars responsible for the identification of the values, shapes, etc. This will provide a good basis for integrating the hieroglyphica *step-by-step* in Unicode, when we can actually say what each individual sign is.
>
> In a nutshell, the work done by Michel will be more than useful, but should be integrated step by step in the future (probably a third step).
> I think I should get the support of all the Egyptologists on this. Correct me guys if I'm wrong.
>
> Monograms:
>
> I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).
> Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B
>
> Agreed.
>
> Control characters
>
> On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"
>
> Sure. I just sent the data from Ramses that are needed for illustrating and arguing in favour of these control characters.
>
> It was agreed that this proposal will be finished in August, I thought by Michael, and discussed at the Unicode meeting on the 6th of September.
Now it seems over the mail that several people are working on it separately
> Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B. give guidelines to Simon/Serge/Stephane for what examples they should look for in their databases
>
> If possible, I think that the best would be to produce a single "consensus" document, which could be signed and approved by all the parties.
> I leave it up to you to decide who is to produce the first draft. I'm certainly happy to read through it carefully, provide examples, and check everything I can, but I cannot write the first draft, which would be beyond my technical capacities and the time I have left over the next weeks.
>
> Best wishes,
>
> Stéphane
>
> Input methods and fonts
>
> So and Marwan have integrated their input systems and font into SINUHE
> So/Marwan, could you share with us the online location so we can start using it?
> Bob has made and shared an experimental font for us to use
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com
>
> --
> Dr. Ingelore Hafemann
> Strukturen und Transformationen des Wortschatzes
> der ägyptischen Sprache
> Post: Jägerstraße 22/23
> Sitz: Unter den Linden 8
> 10117 Berlin
> Tel: (030) 20370 447

--
---------------------------------
# So Miyagawa
## [so? mij??g?w?]
### Websites
* Profile: https://www.uni-goettingen.de/de/531081.html
* academia.edu: https://uni-goettingen.academia.edu/SoMiyagawa
* GitHub general: https://github.com/somiyagawa
* GitHub toolkit: https://github.com/somiyagawa/toolkitForCopticAndAncientEgyptian
* CV: https://docs.google.com/document/d/1HhhKovsJzqZQGCn6W1oNweqyqKYUfUvFTxAlStKICdM/edit?usp=sharing
* Facebook: https://www.facebook.com/runauei
* LinkedIn: https://de.linkedin.com/pub/so-miyagawa/62/777/720

### Status
1. Research Fellow at Georg-August-Universität Göttingen Sonderforschungsbereich (SFB) 1136 "Bildung und Religion in Kulturen des Mittelmeerraums und seiner Umwelt von der Antike bis zum Mittelalter und zum Klassischen Islam"
2. Research Fellow at "KELLIA: Koptische/Coptic Electronic Language and Literature International Alliance" (DFG/NEH)
3. Ph.D. candidate at Georg-August-Universität Göttingen, Philosophische Fakultät, Seminar für Ägyptologie und Koptologie
4. Ph.D. candidate at Kyoto University, Faculty of Letters, Department of Linguistics
5. Member in Coptic SCRIPTORIUM research team
6. Student Member at Unicode Consortium
---------------------------------

From s.polis at ulg.ac.be  Fri Jul 22 09:07:58 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Fri, 22 Jul 2016 10:07:58 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID: <43F73292-362C-43AD-9C20-54A5AB3606C9@ulg.ac.be>

Dear friends,

If you wish, I can take care of contacting our colleagues from Mainz. I have to travel there soon in order to discuss the structure of their hieratic sign-list in relation to the Thot Sign-List: we want to investigate how they could be linked and whether the structures are compatible.

Best wishes,

Stéphane

------------------------------------------------------
Chercheur qualifié F.R.S.-FNRS
Université
de Liège
Service d'égyptologie
Département des sciences de l'Antiquité
Place du 20-Août, B-4000 Liège
http://www.egypto.ulg.ac.be
------------------------------------------------------

> On 22 Jul 2016, at 00:47, So Miyagawa wrote:
>
> Dear Heleen,
>
> I know Svenja. She is working for AKU, the hieratic database project based in Mainz. She will give us good insight. Also, if you are interested in Demotic Unicode: at the SÄK in Vienna, I discussed the "unicode-ization" of Demotic with Fabian from the University of Heidelberg. We reached the conclusion that we need three Unicode blocks for Demotic: Early Demotic, Ptolemaic and Roman. He is one of the leaders of the Demotic database project called DPDP.
>
> Best wishes from Düsseldorf International Airport,
> So
>
> On Thu, Jul 21, 2016 at 21:20, heleen wilbrink wrote:
>
> Dear Ingelore and Stéphane, dear all,
>
> It is great to see so much valuable input on our mail group. I would like to clarify one thing regarding the repertoire. I completely agree with you, Stéphane and Ingelore, and I am sorry that my wording was apparently not so clear.
>
> For the short term (hopefully by mid August, as was discussed in Cambridge), the check on Michael's Möller-list proposal can be done, and other missing hieroglyphs that have references can be added to it. If August is not feasible, it could, I guess, be later this year. The compilation of the entire list with references will indeed take much longer and can be done step by step.
>
> Thanks, Ingelore, for the reminder of your suggestion to get in contact with Ursula and Svenja. Is anyone from the Cambridge group in close contact with them and willing to contact them?
>
> All the best,
> Heleen

From bobqq at live.co.uk  Fri Jul 22 14:34:20 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Fri, 22 Jul 2016 14:34:20 +0100
Subject: [Egyptian] Early note for comment on the 'Four corner' system for ligatures
Message-ID:

Hi All

This is a rough note to get the ball rolling on the 4 corner system of positional ligatures. I've not covered everything by any means. I've not heard anything from others on this topic yet, so I would like to have feedback asap.

I think the 5th (centre) position code talked about is similar to the Monogram in linking to the repertoire discussion and should be treated as a distinct item. Neither affects any other revisions to the three character system under discussion. Anyone disagree?

Bob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BobsInitialThoughtsFourCorners1.pdf
Type: application/pdf
Size: 397094 bytes
Desc: not available
URL:

From mn31 at st-andrews.ac.uk  Fri Jul 22 18:05:15 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Fri, 22 Jul 2016 18:05:15 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To:
References:
Message-ID: <1897138.Z3PCXJQWcV@bear>

Dear all,

We adapted our proposal. Please find it in:

https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf

We have responded to the criticism about brackets and prefix operators.
We have gotten rid of all prefix operators and replaced them with infix operators. As for brackets, it is quite possible to get rid of them too (not entirely, there are still the cartouches), but the price to pay for this is added complexity of the syntax, due to needing several copies of each primitive with different operator precedence, perhaps three or four. Section 9 outlines how this would be done. This adds to the complexity that already exists after the prefix operators were removed; have a look at appendix A and see whether you can verify the grammar is unambiguous.

There are no names on the proposal. That is partly because not everyone from TLA/Ramses/St Andrews has had a chance to look at it yet, and partly because anyone is welcome to have their name added if they feel they contributed and subscribe to the content.

Mark-Jan

From mn31 at st-andrews.ac.uk  Fri Jul 22 19:10:33 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Fri, 22 Jul 2016 19:10:33 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To: <1897138.Z3PCXJQWcV@bear>
References: <1897138.Z3PCXJQWcV@bear>
Message-ID: <4803394.MerqE8crjr@thuis>

On Friday 22 Jul 2016 18:05:15 Mark-Jan Nederhof wrote:
> have a look at
> appendix A and see whether you can verify the grammar is unambiguous.

PS I now see it isn't. Fix will follow later, hopefully.

Mark-Jan

From bobqq at live.co.uk  Fri Jul 22 23:19:12 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Fri, 22 Jul 2016 23:19:12 +0100
Subject: [Egyptian] Two group joiners
Message-ID:

Hi All

I've attached a rough description of the two group joiner additions to the encoding system. This is for comment and feedback. These are primarily about improving vertical text and 'tall group' support in horizontal text while maintaining a straightforward sequence model for users.

Mainly for the more technically oriented members of the group but feedback appreciated from all.
Thanks
Bob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TwoGroupJoinersForEgyptianDraft2.pdf
Type: application/pdf
Size: 954772 bytes
Desc: not available
URL:

From odusseus at gmail.com  Sat Jul 23 10:46:07 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 23 Jul 2016 11:46:07 +0200
Subject: [Egyptian] Some general considerations
Message-ID:

Hello Everyone

First of all: I am attaching a pdf version of this email because I am using a few images, and I am not sure they will be displayed in the right places in the email. If you don't see any image in the text below or something does not make sense, please refer to the pdf.

-------

I dare to write here an email pointing out a few general and specific observations on both what was said in Cambridge and what has been discussed in these emails in the past few days. I have the feeling that many of you will not like what I am going to write, but well...

First of all, as you know, I didn't know any of you before Cambridge, so I had the feeling of being a bit of an "external observer". Which, in turn, led me to a few observations.

*First:*

Do you know the Indian story of the three (or more) blind men who are put in front of an elephant and are asked to find out what they have in front of them ( https://en.wikipedia.org/wiki/Blind_men_and_an_elephant )? One is in front of the trunk, one next to the ear and one near the tail. The three blind men start touching the elephant and try to describe it and to figure out what kind of animal it is, but they end up fighting because each can only touch a small part of the animal and misses the general picture. Besides the "entrenched positions"
mentioned by Nigel, I had the feeling that some of the participants in Cambridge were a bit like the blind men: knowing their specific fields very well, but missing the general picture a bit, and thus ending up misunderstanding the others.

Now, I don't want to sound arrogant. I don't think I have a vision of the whole picture and I don't think I am more knowledgeable than any of you. I put myself among those blind men as well. But consider that, correct me if I am wrong, So and I (note that I am talking only in my name, not in So's name) are the only people at that workshop who:

a) are Egyptologists and therefore know both how Egyptian hieroglyphs work and what Egyptologists need (or at least what we need as Egyptologists)

b) have been playing for a while with Unicode characters, fonts, input methods etc. and therefore have a certain understanding of how these technical tools work

c) have a good practical understanding of how non-Latin complex horizontal/vertical scripts work. In particular, So is familiar with Japanese, Chinese and Korean, I think, while I know Arabic-based scripts and Indian ones pretty well (I lived in Nepal and when I was there I ended up teaching Nepali to Nepali children in a Nepali school) and I have played a lot with Chinese and Japanese scripts.

Then, perhaps, our small contribution should also be considered in order to make sense of the whole big elephant. All the more so considering that, in spite of having never met before, both So and I ended up developing very similar solutions to some of the problems you are discussing, solutions that, by the way, seem to me very similar to what Ishida was suggesting in some of his emails. Solutions that, in fact, are already implemented by various scripts around the world.

*Second:*

Many of the problems you are discussing don't have a single "right" solution, because in fact many of those problems depend on how you interpret the data (i.e. the ancient hieroglyphs).
Therefore, the aim should not be to find the "right" solution, but rather to find the easiest *interpretation* that could lead to the easiest implementation in Unicode. We are not ancient Egyptians; we don't know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations. These are not "right" or "wrong", but rather "more efficient" or "less efficient" with respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

*Third*

As I said during the meeting in Cambridge, I am not against the introduction and use of some control characters per se. Still, honestly, the more I read about your proposals and about your control characters, the more I am convinced that using rendering algorithms at the font level, and general or contextual ligatures embedded within the fonts, as the main way to combine and display groups would be much easier and much more efficient than using control characters.

This said, allow me to call your attention to a few more specific points.

*1) JSesh approach*

I had the feeling during the meeting, and reading your emails, that many of you think with a very JSesh-based mentality. I had the feeling that the goal of at least some of the participants is to imitate JSesh in Unicode. But Unicode *is* *not* JSesh, or at least should not be JSesh. Using control functions, parentheses etc. can be an efficient way in JSesh, but I doubt that a system requiring one to input 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan's last proposal) makes much sense from a Unicode perspective. Especially considering that you can obtain exactly the same result without control characters, with just 1 (sic!) general ligature within the font that will be automatically rendered by the standard rendering algorithms already available.
In other words, what can be an efficient solution to a problem in JSesh is not necessarily a good solution in Unicode. And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithms (i.e. ligatures embedded in fonts), with layouts at the editor level etc., which can help solve the problems on different levels than just the Unicode encoding level.

*2) Groups in fonts*

Someone on the list was saying that the groups need to be listed, but that not all of them necessarily need to be built into the font (or something like that). This is not true: compound characters and groups will not automatically build on their own, and if you envisage the possibility of building a group, then you *do* need to have it in the font, otherwise it won't display correctly and you will have random broken control characters popping out here and there in your hieroglyphic text. This perhaps will not bother you in your database etc., but it will be just unacceptable for common users (and, for instance, for editors). If you accept that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font, whether you build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.

*3) the 4 "small sign in the corner of big sign" control characters.*

If you really want to have control characters to combine small signs in the corner of bigger signs, you can do that with just 2 control characters; you don't need 4 of them. This is because the distinction you are making between "big" and "small" signs is superfluous. It is enough to have 2 transversal control characters, which we can represent as "A\B" and "A/B". They will join the main signs by virtually/ideally putting them at two opposite corners of the square, on the basis of their relative order.
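
A minimal sketch of this two-joiner idea, under loudly stated assumptions: the joiner symbols, the function names `clusters` and `place`, the letters standing in for signs, and the unit-square coordinates are all hypothetical and come from no proposal or font. It only illustrates that the relative order of the signs, not a "big"/"small" label, can decide which corner each sign is placed in.

```python
# Hypothetical stand-ins for the two transversal joiners; no such
# code points are assigned, the names exist only for this sketch.
BACKSLASH = "\\"  # "A\B": first sign top-left, last sign bottom-right
SLASH = "/"       # "A/B": first sign bottom-left, last sign top-right
JOINERS = (BACKSLASH, SLASH)


def clusters(tokens):
    """Group signs that are chained together by one kind of joiner."""
    out = []        # each entry is [joiner or None, [signs]]
    pending = None  # joiner seen since the last sign
    for tok in tokens:
        if tok in JOINERS:
            if pending is not None or not out:
                raise ValueError("a joiner must sit between two signs")
            pending = tok
        elif pending is None:
            out.append([None, [tok]])
        else:
            kind, signs = out[-1]
            if kind not in (None, pending):
                raise ValueError("mixed joiners in one cluster")
            out[-1][0] = pending
            signs.append(tok)
            pending = None
    if pending is not None:
        raise ValueError("dangling joiner at end of input")
    return out


def place(cluster):
    """Spread the signs along the diagonal of a unit square.

    Returns (sign, x, y) with x measured from the left and y from the
    top, both in the range 0..1. A lone sign is centred.
    """
    kind, signs = cluster
    n = len(signs)
    placed = []
    for i, sign in enumerate(signs):
        f = i / (n - 1) if n > 1 else 0.5
        y = 1 - f if kind == SLASH else f
        placed.append((sign, f, y))
    return placed


# The same joiner serves all three orders; only the sequence changes.
tw = place(clusters(["t", BACKSLASH, "w"])[0])
wt = place(clusters(["w", BACKSLASH, "t"])[0])
twt = place(clusters(["t", BACKSLASH, "w", BACKSLASH, "t"])[0])
```

In this sketch the placement is purely positional; how large each sign is drawn at its corner would remain a matter for the font.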
So for instance, if you want to render the group [image: Inline image 2] tw, you will just type in [image: Inline image 3] t + "A\B-control-character" + [image: Inline image 4] w.

If instead you want to code for [image: Inline image 6] wt, then you will just need to type in [image: Inline image 5] w + "A\B-control-character" + [image: Inline image 7] t.

If you want to render [image: Inline image 10] twt, then you just type in [image: Inline image 11] t + "A\B-control-character" + [image: Inline image 12] w + "A\B-control-character" + [image: Inline image 13] t

In all these cases, you can simply use the same control character. You don't need two distinct control characters for that, because there is no need to specify which one is the "big" sign and which one is the "small" one: what really matters is not their size, but their relative position.

NOTE: if by any chance you are going to adopt this system of 2 control characters, then I'd like to be mentioned in the proposal. You know, it could be useful on the CV.

*4) Vertical and horizontal script and control characters*

You are talking about using control characters to render texts in horizontal and vertical layouts. This, however, can generate some quite relevant problems. In particular, you have to consider that it will hardly be possible to automatically convert a vertical text encoded with control characters into horizontal text. In other words, if you have a text written in vertical columns that uses control characters, and for whatever reason you need to turn it into horizontal text (for instance for editorial reasons: your editor wants your text in horizontal lines and not in vertical lines, e.g. because you are quoting a hieroglyphic passage within an English paragraph), you will hardly be able to do it automatically, and you will likely have to retype it entirely.
The problem is that the control characters that you will need and use in your vertical text will not be the same, and will not be input in the same places, as those that you will need in your horizontal rendition of the same text.

Let's take, for instance, the following example.

[image: Inline image 14]

If this text is typed with a vertical font (as I would expect it to be), then to display it correctly you will have to type the following sequence of Unicode characters:

[image: Inline image 15] + [image: Inline image 16] + "left/right-ctrl-character" (sic!) + [image: Inline image 17] + "up/down-control-character" (sic!) + [image: Inline image 18] + [image: Inline image 19] + [image: Inline image 20]

NOTE that you will probably (as far as I know, correct me if I am wrong) have to use the "left/right-ctrl-character" (not an up/down ctrl-character) to combine the [image: Inline image 21] and [image: Inline image 22], and then the "up/down ctrl-character" (not a "left/right ctrl-character") to combine them with the [image: Inline image 23], because in vertical texts the baseline of reference is the left line of the column. Note that this will change the order in which the signs have to be input, which means it will interfere with the searchability of the text. This, however, is a problem that it should be possible to solve somehow. Perhaps; I am not sure. I am not an expert on these details.

If, however, you try to display this same text (i.e. this same sequence of Unicode characters) horizontally, it won't be enough to change the direction of the text and to use a horizontal font, because what you would obtain would be something wrong like this:

[image: Inline image 24][image: Inline image 25] "broken-control-character" [image: Inline image 26] "broken-control-character"
[image: Inline image 27][image: Inline image 29][image: Inline image 30]

In order to display the same text horizontally in a graphically acceptable way, in fact, you will have to type the following sequence of Unicode characters:

[image: Inline image 31] + [image: Inline image 32] + [image: Inline image 33] + "up-down-control-character" + [image: Inline image 34] + [image: Inline image 35] + "up-down-control-character" + [image: Inline image 36]

Which would be displayed as:

[image: Inline image 37]

Essentially, you will have to type the text anew. Not very practical, in my opinion.

Using rendering algorithms, i.e. general or contextual ligatures embedded within the *font* at the font level (instead of control characters), would be a very easy way to solve the problem. In fact it would be enough to have two fonts, one for vertical texts and one for horizontal texts (or one single font with the possibility to switch between the two layouts), with different sets of ligatures embedded in them. In that way, you will just have to type the following plain sequence of Unicode characters (without any control character):

[image: Inline image 38] + [image: Inline image 39] + [image: Inline image 41] + [image: Inline image 42] + [image: Inline image 43] + [image: Inline image 45]

And the ligature algorithm in the vertical font will render the ligature [image: Inline image 46]. Then, to turn this same vertical string of text into horizontal text you will just have to change the font, and the algorithm in the horizontal font will automatically display the correct [image: Inline image 47] and [image: Inline image 48] ligatures. Without any need to retype anything.

NOTE that the hieroglyphic texts you see in the examples above have all been obtained in this way, namely without control characters and just with my font with embedded ligatures.

As Ishida suggested in one of his emails, this is essentially how Asian languages deal with these problems.
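
A minimal sketch of the font-level approach described above, with loudly labelled assumptions: the sign names `s1` to `s6`, the two ligature tables and the function `shape` are hypothetical and do not reflect any real font data. The stored text is one plain sign sequence, and each "font" carries its own ligature table, applied by a greedy longest-match pass that is only loosely similar in spirit to ligature substitution in a font.

```python
def shape(signs, ligatures):
    """Replace runs of signs by ligature glyphs, longest match first.

    `ligatures` maps tuples of sign names to a glyph name. Signs that
    take part in no ligature are passed through unchanged.
    """
    max_len = max((len(key) for key in ligatures), default=1)
    out, i = [], 0
    while i < len(signs):
        # Try the longest possible run starting at position i first.
        for size in range(min(max_len, len(signs) - i), 1, -1):
            key = tuple(signs[i:i + size])
            if key in ligatures:
                out.append(ligatures[key])
                i += size
                break
        else:
            out.append(signs[i])
            i += 1
    return out


# Hypothetical tables: the vertical font pairs two signs side by side,
# the horizontal font stacks two different pairs.
VERTICAL_LIGATURES = {("s2", "s3"): "s2_beside_s3"}
HORIZONTAL_LIGATURES = {
    ("s3", "s4"): "s3_over_s4",
    ("s5", "s6"): "s5_over_s6",
}

# One stored sequence, with no control characters in it.
text = ["s1", "s2", "s3", "s4", "s5", "s6"]

vertical = shape(text, VERTICAL_LIGATURES)
horizontal = shape(text, HORIZONTAL_LIGATURES)
```

Switching layout here means passing a different table to `shape`; the stored sequence `text` is never edited, so a search over the plain signs would behave the same for both layouts.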
It is very efficient, it works, it does not require any new special character, and frankly I still don't understand why Egyptian should be different.

*5) special characters, vertical/horizontal texts and input methods*

Note that, because of the problem with control characters and horizontal/vertical texts highlighted above, you *won't be able* to use the same simple predictive input method to type vertical and horizontal texts, because the sequences of signs and control characters needed will be *totally* different, and will both need to be encoded independently within the input method itself. In fact, you will probably have to explicitly list in advance within the input method every single possible combination of hieroglyphic signs + control characters, or you will have to just type your text sign by sign, control character by control character (even if you adopt shortcuts to input the control characters and the most common groups, the problem will still be there).

Again, with general or contextual ligatures embedded at the font level this problem would not exist, because the input method would just have to input the plain sequence of signs, and it would be the rendering algorithm within the fonts that takes care of displaying the signs in the correct spatial order.

*6) Ramesside "groups" (or "tall groups").*

You all seem to assume that the clusters of signs we see in Ramesside texts are "groups" (or ligatures) analogous to the Middle Egyptian square groups, and therefore have to be created, manipulated and displayed as graphical units.

For the non-Egyptologists among us, this is a Ramesside text:

[image: Inline image 49]

In red you have what is generally referred to as a "Ramesside group" (or "tall group").

[image: Inline image 50]

Here, instead, you have an ordinary Egyptian text,

[image: Inline image 51]

and in red you have what are generally considered ordinary "square groups".
[image: Inline image 52]

As you can see, the difference is that in ordinary Egyptian writing, signs are grouped into regular square spaces (or half-squares). In Ramesside writing, instead, signs tend to be combined into vertically elongated rectangles. These rectangles can contain multiple ordinary square groups. For instance, the following Ramesside group from the text above:

[image: Inline image 53]

could be split into two ordinary square groups, [image: Inline image 54] and [image: Inline image 55], in the ordinary square-group writing.

As said, people often assume that these clusters of signs in Ramesside texts have to be understood as "groups" analogous to the "square groups" of the ordinary writing. But what if the Ramesside "groups" were actually not really "groups"? Or better, what if there were an easier and more efficient way to analyse, interpret, describe, and therefore display them?

People often seem to assume that there are only two main ways to write a string of text, namely vertically or horizontally. This, however, is not true. It is also possible to write short horizontal strings of text within a larger main vertical frame, and similarly it is possible to write short vertical strings of text within a larger main horizontal frame. The second case, in particular, is attested in Asian scripts. Have a look at the following image:

[image: Inline image 57]

This is a Japanese text. As you can see from the next image, it is written in short vertical "columns" (red), organized into one general horizontal "line" (green).

[image: Inline image 56]

Namely, the text is written (the text in the pic is right to left, I am transcribing it left to right because it is easier to explain the concept)

---------------
?????
?????
---------------

But it has to be read:

?? | ?? | ?? | ?? | ??

namely:

??????????

Japanese writing has "groups" (i.e.
characters, marked in white in the following picture) that in some general respects could be compared to Egyptian groups (they are not identical, I know, but *in some respects* they are conceptually comparable).

[image: Inline image 58]

Now, *no one* dealing with Japanese writing would consider the short vertical strings of text (the red bits above) as some sort of "non-ordinary elongated groups (characters)". And *no one* would ever consider coding such hypothetical elongated vertical groups into Unicode, or devising control characters or anything else to pre-compose them. They are just sequences of *regular* groups (characters) written vertically within a main horizontal layout. And in order to represent such a text in Unicode, you just have to play with the layout of your text in your text editor. You don't need special control characters or anything else, it is *just* a question of *layout*.

Now, Ramesside inscriptions can be analysed in exactly the same way.

[image: Inline image 59]

In other words, what you are interpreting as the Ramesside "abnormal elongated/tall groups" (red in the pic above) could actually be interpreted just as short strings of text written vertically (i.e. short *columns*) within a bigger main horizontal layout (i.e. in lines, in green).

Such an interpretation has a few advantages, both with respect to the actual data from the ancient texts and as far as Unicode etc. is concerned.

First, someone was pointing out that there are thousands of such "Ramesside groups" and that a large part of them is attested only once. Well, if you interpret them as short columns of text, rather than as groups, you understand why: those are short strings of text, they are not isolated independent graphic and orthographic units. This also helps to answer the question: "how many groups are you expecting to find in the future?" Virtually, potentially, infinite.
If tomorrow we should find a new Ramesside temple with a new long text inscribed in "Ramesside groups", there could be hundreds, even thousands of these new short vertical strings of text. Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as a single independent "group". It does not make much sense, just as it would not make sense to try to describe and code every single horizontal sequence of groups (characters) in the Japanese texts above.

Rather, it would make much more sense to deal with these "Ramesside short vertical strings of text disposed in horizontal lines" as if they were.. well, short vertical strings of text disposed in horizontal lines. This can be *easily* done at the layout level, either with XML/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that already. It is enough to use a vertical font, and create a page layout with horizontal sequences (= the lines) of very short vertical columns. That is all you need.

The Ramesside "group" (or "tall group") [image: Inline image 60] used by Bob as an example in his last proposal, for instance, would not need to be built and displayed as a "group" (i.e. with control characters etc.) at all; it could just be inputted as a plain sequence of independent signs displayed with a vertical font within a vertical column, in a general layout that presents a series of horizontal sequences (i.e. "lines") of such short columns. No need for ligatures, no need for control characters, nothing. Just vertical fonts and properly set up layouts for the page where the signs will be displayed.

Obviously, there will still be signs that will need to be combined in "groups", i.e. in "real" square groups.
As you can see from the pic above, however, if you interpret the Ramesside texts as composed of short vertical columns, the number of groups needed decreases significantly. And in general, those groups are often the same basic groups that you find in the good old ordinary Egyptian square-group orthography. So there is no need to list and encode (in Unicode or at the font level, it doesn't matter) thousands of unique groups; we would just need to encode the basic square groups and then play a bit with the layout of the text.

And this brings me to the next point:

*7) What are the "square groups"?*

I have never specifically worked on square groups from a linguistic point of view, and I do not know if there is any specific study about square groups and their graphic behaviour within a larger linguistic frame (i.e. for instance comparing them with the behaviour of similar "groups" in other writing systems). This is one of those points where the experience of some of you could be extremely useful. Such a study could be used, for instance, to define some contextual rules to allow the font to automatically manage the combination of at least some of these groups, or at least to define some rules to automatically prioritize one ligature over another (if we are working with ligatures at the font level).

If such a study exists, then it would be useful to take it into consideration in the discussion. If such a study does not exist, then perhaps it could be worth considering doing it *before* submitting any new proposal involving control characters to the Unicode Consortium, because I think it would be better to be sure we really understand how those groups work before suggesting a method to encode them. I am obviously talking about control characters etc.; I am not talking about expanding the basic set of glyphs.
Otherwise, we would be encoding something whose actual functioning has never been studied, and therefore we would risk encoding features and elements that are actually superfluous, or that could have been managed in a more efficient way at other levels (ligatures within the fonts, layout tables at the text editor level etc.).

Ok, I guess this is more or less all I wanted to say.

Now feel free to ignore me :-)

Best

Marwan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Some general considerations.pdf
Type: application/pdf
Size: 282164 bytes
Desc: not available
URL:

From mn31 at st-andrews.ac.uk Sat Jul 23 13:47:15 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 23 Jul 2016 13:47:15 +0100
Subject: [Egyptian] Brackets in the TLA encoding
In-Reply-To:
References:
Message-ID: <1922257.noGW0DXB86@thuis>

Hi Simon, Hi Stéphane, Hi All,

This is very helpful. Physicists tell us that if you want to gather and use data, you need hypotheses first, or else you don't know what to look for. For me the relevant hypotheses are:

(1) The primitives we have in our current document allow description of most of the groups in an accurate enough way. Both 'most of' and 'accurate enough' are subjective of course. There is no escaping that.

(2) It would be quite difficult to reduce the expressive power without losing coverage.

There is an implicit parameter, which is a limit on the depth of nesting, which I assume is 3.
As also Simon confirmed once more, 2 is not enough, even for the most basic, run-of-the-mill classical (horizontal) inscriptions.

As to (1), we have moved away quite considerably from descriptive power that is machine-interpretable. This was motivated by people finding the original encoding too complicated, and arguing that fonts would do a lot of fine-tuning anyway for particular choices of signs. Also, we don't really care about a sign being printed 0.5 mm too much to the left or to the right, as long as the user gets a rough idea of what the text looks like. These arguments all sound reasonable, but realise two things:

* If even stupid machines don't know how to render an encoding roughly as it was intended, perhaps there is not enough information present for humans to know what was meant either.

* As stressed once more by Stéphane, the kinds of groups we are talking about are productive. We don't want to be manually fine-tuning the appearance of an unbounded number of groups, so some approximately correct automatic rendering would be quite useful.

I think we are still okay with the present version of the proposal, but we have moved a long way from existing routines that do the rendering in a deterministic, predictable manner, to needing lots more refinements to program code and the result being not quite well-defined.

As to both (1) and (2), the provided examples include quite a few cases of insertions and stacking, insertions into stacked groups, and even groups with insertions that are themselves inserted. So far I don't see either hypothesis refuted.

I had to struggle quite a bit to get rid of prefix operators.
As anyone with the slightest knowledge of formal languages knows, prefix or suffix operators are ideal for automatic processing, because the problem of ambiguity simply does not exist, whereas endless volumes of textbooks since the late 1950s have been written about the ambiguities caused by infix operators and how to solve them using principled or not so principled methods involving operator precedence and low-level hacks in shift-reduce parsers. Using infix operators is really only justifiable if notation is meant for human consumption.

That is why I was very surprised to hear objections with the argument that font technology is too primitive to handle prefix operators. If anything, I would have imagined that primitive tools would have a lot of difficulty with parsing in the presence of operator precedence and such. I implemented OpenType substitution rules that analysed bracket structures and prefix operators myself, and that works fine. It would be a nightmare for me to have to implement OpenType substitution rules in the presence of operator precedence. There may be something in the arguments people use that I don't understand.

Anyway, one thing to look out for (I say this in particular to Simon, Stéphane and Serge, with whom this was discussed in detail in Cambridge), is that in the process of getting rid of prefix operators, and avoiding ambiguity, the following coverage was lost: it is not possible anymore to insert A into the top-left corner of B, and to insert the resulting group into the bottom-left corner of C. The same holds for the right corners. I have yet to see a group where this matters.

It is still possible however to insert A into the bottom-left corner of B, and to insert the resulting group in the bottom-left corner of C. The same holds for two upper-left corners and the corresponding right corners. There are groups of these forms among the provided examples.
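The contrast between prefix and infix notation can be sketched in a few lines of Python. This is an illustration only. The prefix token stream (`v` for stacking, `h` for juxtaposition, each taking two arguments) is invented for the example and is not the notation of any proposal discussed here. The infix reader assumes that `*` binds tighter than `:`, in line with Simon's remark quoted further down that Hiero1:(Hiero2*Hiero3) can be written without brackets. The `depth` function is one possible reading of "depth of nesting" (the number of direction changes along the deepest path), which is likewise an assumption.

```python
import re

# A tree is either a sign name (str) or a tuple (op, left, right),
# with op "v" (stack vertically) or "h" (juxtapose horizontally).

def parse_prefix(tokens):
    """Prefix notation: the operator comes first, so a single recursive
    pass is enough and no precedence table is needed."""
    tok = tokens.pop(0)
    if tok in ("v", "h"):
        left = parse_prefix(tokens)
        right = parse_prefix(tokens)
        return (tok, left, right)
    return tok

def parse_infix(text):
    """Infix notation: needs a precedence rule ('*' binds tighter than ':')
    plus brackets to override it."""
    tokens = re.findall(r"[A-Za-z0-9]+|[:*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def atom():
        tok = take()
        if tok == "(":
            tree = vertical()
            take()  # consume the closing bracket
            return tree
        return tok

    def horizontal():
        tree = atom()
        while peek() == "*":
            take()
            tree = ("h", tree, atom())
        return tree

    def vertical():
        tree = horizontal()
        while peek() == ":":
            take()
            tree = ("v", tree, horizontal())
        return tree

    return vertical()

def depth(tree, parent=None):
    """Count the changes of direction along the deepest path."""
    if isinstance(tree, str):
        return 0
    op, left, right = tree
    own = 0 if op == parent else 1
    return own + max(depth(left, op), depth(right, op))

bracketed = parse_infix("D21:V28*(X1:B1)")
unbracketed = parse_infix("D21:V28*X1:B1")
prefixed = parse_prefix(["v", "D21", "h", "V28", "v", "X1", "B1"])
print(bracketed, depth(bracketed))
print(unbracketed, depth(unbracketed))
print(prefixed == bracketed)
```

The bracketed encoding and the prefix token stream yield the same tree, a vertical group inside a horizontal group inside a vertical group, whereas dropping the brackets silently produces a different, shallower structure. The prefix reader needs neither brackets nor a precedence rule to get there.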
The problem of course is that inability to find certain structures in the corpora we happen to have at this very moment does not prove their non-existence. At best it means our encoding won't be too much lacking in terms of coverage.

Best regards,

Mark-Jan

On Thursday 21 Jul 2016 15:24:51 Simon Schweitzer wrote:
> Hi all,
>
> @Stéphane: thank you for your .gly-files! In this mail, I want to add
> some remarks concerning the subgroup topic.
>
> As in Ramses, there are many encodings with "(" and ")" in the TLA. I
> collected these encodings and I want to present you my evaluation:
> * In some cases, the encoding is invalid, e.g. (F12-S29):D21, which
> should be understood as F12*S29:D21.
> * Sometimes, the encoding of the brackets is superfluous. There are many
> cases of Hiero1:(Hiero2*Hiero3). The brackets are not necessary: use
> Hiero1:Hiero2*Hiero3 !
> * But in many cases, the parsing without the brackets would be misleading:
> 1) There are many vertical groups in horizontal groups in vertical
> groups. I list only 10 examples:
> N35:"?"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1;
> ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
> Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22;
> ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21;
> ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
> D21:V28*(X1:B1) ("Stele des Montuhotep (Cambridge E.9.1922)", l. C.3;
> ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
> D21:V7*(W24:X1) ("Stele des Montuhotep (Kairo CG 20539)", l. I.b.18;
> ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
> D35A:(X1:Z4A)*G37 ("Stele des Antef (Louvre C 167 = E 3111)", l. C.1;
> ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
> D36:D21:N29*(X1:"?"*Z4*"?") ("Stele des Antef (Glasgow D1922.13)", l.
> A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> I10&I9:X2*(X4:Z2) ("Stele des Antef (Glasgow D1922.13)", l. A.7;
> ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> K4:G1*(Z7:X1) (Sinuhe AOS, vs.
18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)
> As you can see, this kind of grouping occurs in hieroglyphic and in
> hieratic texts, and this feature is also attested in the "classical"
> period from the Middle Kingdom (the examples from the stela of
> Montuhotep and Antef).
> 2) horizontal grouping of vertical groups in columns
> If the text is written vertically, there are cases of horizontal groups
> of vertical groups, e.g. in the Buch von der Himmelskuh
> (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4), M17*(Q3:N35), (X1:Z1)*I12,
> M17*S29*(A2:Z2) and so on.
>
> Best regards,
>
> Simon
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com
>

From bobqq at live.co.uk Sat Jul 23 13:51:53 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 23 Jul 2016 13:51:53 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To:
References:
Message-ID:

Hi Marwan

Thank you for sharing your thoughts. I'm sympathetic to much of what you have written. Lots of good points.

When I set the ball rolling on the topic 18 months ago I submitted a rough background note to UTC, L2/15-069. I gave two general approaches.

1. An implied clustering scheme (similar to what you are suggesting in your note). I first raised this "Simplified Egyptian" notion at I&E 2006 when we were discussing the initial repertoire.

2. An example of an explicit approach (much like what we have now as the UTC recommendation).

Like you, I like the Simplified approach, and it would work well for much casual use.

One reason the explicit approach was actually chosen is that it enables the author of a transcription to be clear about intended layout. If the look of a text were reliant on a particular font's clustering model, there are opportunities for confusion long term. Some explicit structure seemed essential.

The need to input joiner characters was a tradeoff -
however, specialist software for Egyptologists will be able to use specialist input methods for fast input of text. And remember the most popular input method is copy and paste! The simplicity of a single LIGATURE was proposed with consideration about input methods and the practicality/usability of editing in general purpose software.

I've not had time yet to analyse the Ramses data fully and have not yet received corresponding TLA data, but on the evidence so far the 4 corner method is possibly needed for at most something like 1 in 5,000 clusters, so this low frequency should be taken into account when we decide what to do.

On vertical and horizontal: have you looked at the 3 controls + 2 group controls (as per note yesterday)? Have you tried experimenting with the font I sent out in the week (this has a couple of vertical/tall group examples in the doc about the font)?

Regards,

Bob

From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Marwan Kilani
Sent: 23 July 2016 10:46
To: Egyptian Hieroglyphs in the UCS
Subject: [Egyptian] Some general considerations

Hello Everyone

First of all: I am attaching a pdf version of this email because I am using a few images, and I am not sure they will be displayed in the right places in the email. If you don't see any image in the text below or something does not make sense, please refer to the pdf.

-------

I dare to write here an email pointing out a few general and specific observations on both what has been said in Cambridge, and what has been discussed in these emails in the past few days. I have the feeling that many of you will not like what I am going to write, but well..

First of all, as you know I didn't know any of you before Cambridge, so I had the feeling of being a bit of an "external observer". Which, in turn, led me to a few observations.
First: Do you know the Indian story of the three (or more) blind men who are put in front of an elephant and are asked to find out what they have in front of them ( https://en.wikipedia.org/wiki/Blind_men_and_an_elephant )? One in front of the trunk, one next to the ear and one near the tail. The three blind men start touching the elephant and try to describe it and to figure out what kind of animal it is, but they end up fighting because each of them can only touch a small part of the animal and misses the general picture.

Besides the "entrenched positions" mentioned by Nigel, I had the feeling that some of the participants in Cambridge were a bit like the blind men, knowing their specific fields very well, but missing the general picture a bit, thus ending up misunderstanding the others.

Now, I don't want to sound arrogant. I don't think I have a vision of the whole picture and I don't think I am more knowledgeable than any of you. I put myself as well among those blind men. But considering that, correct me if I am wrong, So and I (note that I am talking only in my name, not in So's name) are the only people at that workshop who:

a) are Egyptologists and therefore know both how Egyptian hieroglyphs work and what Egyptologists need (or at least what we need as Egyptologists)

b) have been playing for a while with Unicode characters, fonts, input methods etc. and therefore have a certain understanding of how these technical tools work

c) have a good practical understanding of how non-Latin complex horizontal/vertical scripts work. In particular So is familiar with Japanese, Chinese and Korean, I think? While I know pretty well Arabic-based scripts, Indian ones (I lived in Nepal and when I was there I ended up teaching Nepali to Nepali children in a Nepali school) and I have played a lot with Chinese and Japanese scripts.

Then, perhaps, our small contribution should also be considered to make sense of the whole big elephant.
All the more so considering that, in spite of having never met before, both So and I ended up developing very similar solutions to some of the problems you are discussing, solutions that, by the way, seem to me very similar to what Ishida was suggesting in some of his emails. Solutions that, in fact, are already implemented by various scripts around the world.

Second: Many of the problems you are discussing don't have a single "right" solution, because in fact many of those problems depend on how you interpret the data (i.e. the ancient hieroglyphs). Therefore, the aim should not be to find the "right" solution, but rather to find the easiest *interpretation* that could lead to the easiest implementation in Unicode. We are not ancient Egyptians, we don't know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations. Which are not "right" or "wrong", but rather "more efficient" or "less efficient" with respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

Third: As said during the meeting in Cambridge, I am not against the introduction and use of some control characters per se. Still, honestly, the more I read about your proposals and about your control characters, the more I am convinced that using rendering algorithms at the font level and general or contextual ligatures embedded within the fonts as the main way to combine and display groups would be much easier and much more efficient than using control characters.

This said, allow me to call your attention to a few more specific points.

1) JSesh approach

I had the feeling during the meeting and reading your emails that many of you think in a very JSesh-based mentality. I had the feeling that the goal of at least some of the participants is to imitate JSesh in Unicode. But Unicode *is* *not* JSesh, or at least should not be JSesh.
Using control functions, parentheses etc. can be an efficient way in JSesh, but I doubt that a system that requires inputting 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan's last proposal) makes much sense from a Unicode perspective. Especially considering that you can obtain exactly the same result without any control character and with just 1 (sic!) general ligature within the font, which will be rendered automatically by the standard rendering algorithms that are already available.

In other words, what can be an efficient solution to a problem in JSesh is not necessarily a good solution in Unicode. And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithms (i.e. ligatures embedded in fonts), with layouts at the editor level etc., which can help to solve the problems on levels other than just the Unicode encoding level.

2) Groups in fonts

Someone on the list was saying that the groups need to be listed, but that not all of them necessarily need to be built into the font (or something like that).

This is not true: compound characters and groups will not build themselves automatically. If you envisage the possibility of building a group, then you *do* need to have it in the font, otherwise it won't display correctly and you will have random broken control characters popping up here and there in your hieroglyphic text. This perhaps will not bother you in your database etc., but it will be just unacceptable for common users (and, for instance, for editors).

If you accept that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font, whether you build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.

3) The 4 "small sign in the corner of big sign" control characters.
If you really want to have control characters to combine small signs in the corner of bigger signs, you can do that with just 2 control characters; you don't need 4 of them. This is because the distinction you are making between "big" and "small" signs is superfluous. It is enough to have 2 transversal control characters, which we can represent as "A\B" and "A/B". They will join the main signs by virtually/ideally putting them at two opposite corners of the square, on the basis of their relative order.

So for instance, if you want to render the group tw, you just type t + "A\B-control-character" + w.

If instead you want to encode wt, then you just need to type w + "A\B-control-character" + t.

If you want to render twt, then you just type t + "A\B-control-character" + w + "A\B-control-character" + t.

In all these cases, you can simply use the same control character. You don't need two distinct control characters for that, because there is no need to specify which one is the "big" sign and which one is the "small" one: what really matters is not their size, but their relative position.

NOTE: if by any chance you are going to adopt this system of 2 control characters, then I'd like to be mentioned in the proposal. You know, it could be useful on the CV.

4) Vertical and horizontal script and control characters

You are talking about using control characters to render texts in horizontal and vertical layouts. This, however, can generate some quite serious problems. In particular, you have to consider that it will hardly be possible to automatically convert a vertical text encoded with control characters into horizontal text.
In other words, if you have a text written in vertical columns that uses control characters, and for whatever reason you need to turn it into horizontal text (for instance for editorial reasons: your editor wants your text in horizontal lines and not in vertical columns, or you are quoting a hieroglyphic passage within an English paragraph), you will hardly be able to do it automatically, and you will likely have to retype it entirely.

The problem is that the control characters that you need and use in your vertical text will not be the same, and will not be input in the same places, as those that you need in your horizontal rendition of the same text.

Let's take, for instance, the following example. If this text is typed with a vertical font (as I would expect it to be), then to display it correctly you will have to type the following sequence of Unicode characters:

+ + "left/right-ctrl-character" (sic!) + + "up/down-control-character" (sic!) + + +

NOTE that you will probably (as far as I know, correct me if I am wrong) have to use the "left/right-ctrl-character" (not an up/down ctrl-character) to combine the and and then the "up/down-ctrl-character" (not a "left/right-ctrl-character") to combine them with the , because in vertical texts the baseline of reference is the left line of the column. Note that this will change the order in which the signs have to be input, which means it will interfere with the searchability of the text. This, however, is a problem that it should be possible to solve somehow. Perhaps, I am not sure. I am not an expert on these details.

If however you try to display this same text (i.e. this same sequence of Unicode characters) horizontally, it won't be enough to change the direction of the text and to use a horizontal font, because what you would obtain would be something wrong like this:

"broken-control-character" "broken-control-character"
In order to display the same text horizontally in a graphically acceptable way, in fact, you will have to type the following sequence of Unicode characters:

+ + + "up-down-control-character" + + + "up-down-control-character" +

Which would be displayed as:

Essentially, you will have to type the text anew. Not very practical, in my opinion.

Using rendering algorithms, i.e. general or contextual ligatures embedded within the *font* at the font level (instead of control characters), would be a very easy way to solve the problem. In fact it would be enough to have two fonts, one for vertical texts and one for horizontal texts (or one single font with the possibility to switch between the two layouts), with different sets of ligatures embedded in them.

In that way, you will just have to type the following plain sequence of Unicode characters (without any control character):

+ + + + +

And the ligature algorithm in the vertical font will render the ligature . Then, to turn this same vertical string of text into horizontal text, you will just have to change the font, and the algorithm in the horizontal font will automatically display the correct and ligatures, without any need to retype anything.

NOTE that the hieroglyphic texts you see in the examples above have all been obtained in this way, namely without control characters and just with my font with embedded ligatures.

As Ishida suggested in one of his emails, this is essentially how Asian languages deal with these problems. It is very efficient, it works, it does not require any new special character, and frankly I still don't understand why Egyptian should be different.
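The font-level idea in point 4 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sign names s1 to s6 are hypothetical placeholders (the signs of the example were inline images), the two ligature tables are invented, and the greedy longest-match loop only imitates in spirit what a ligature lookup in a font does; it is not the behaviour of any particular rendering engine.

```python
# Hypothetical plain sequence of signs, stored once, with no control characters.
TEXT = ["s1", "s2", "s3", "s4", "s5", "s6"]

# Two invented "fonts": each maps a run of signs to one pre-composed group glyph.
VERTICAL_FONT = {
    ("s1", "s2"): "[s1 beside s2]",
    ("s3", "s4"): "[s3 over s4]",
}
HORIZONTAL_FONT = {
    ("s2", "s3"): "[s2 over s3]",
    ("s5", "s6"): "[s5 over s6]",
}


def shape(signs, ligatures):
    """Replace runs of signs by group glyphs, trying the longest run first."""
    longest = max((len(run) for run in ligatures), default=1)
    glyphs = []
    i = 0
    while i < len(signs):
        # Try candidate runs of decreasing length, down to two signs.
        for n in range(min(longest, len(signs) - i), 1, -1):
            run = tuple(signs[i:i + n])
            if run in ligatures:
                glyphs.append(ligatures[run])
                i += n
                break
        else:
            # No ligature starts here: emit the sign on its own.
            glyphs.append(signs[i])
            i += 1
    return glyphs


print(shape(TEXT, VERTICAL_FONT))
print(shape(TEXT, HORIZONTAL_FONT))
```

In this sketch, switching between vertical and horizontal rendering means passing a different table to shape; the stored sequence TEXT is never modified. Note that the relative positions of the signs then live only in the tables, not in the text itself.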
5) Special characters, vertical/horizontal texts and input methods

Note that because of the problem with control characters and horizontal/vertical texts highlighted above, you *won't be able* to use the same simple predictive input method to type vertical and horizontal texts, because the sequences of signs and control characters needed will be *totally* different, and both will need to be encoded independently within the input method itself. In fact, you will probably have to list explicitly in advance, within the input method, every single possible combination of hieroglyphic signs + control characters, or you will have to type your text sign by sign, control character by control character (even if you adopt shortcuts to input the control characters and the most common groups, the problem will still be there).

Again, with general or contextual ligatures embedded at the font level this problem would not exist, because the input method would just have to input the plain sequence of signs, and it would be the rendering algorithm within the fonts that takes care of displaying the signs in the correct spatial order.

6) Ramesside "groups" (or "tall groups").

You all seem to assume that the clusters of signs we see in Ramesside texts are "groups" (or ligatures) analogous to the Middle Egyptian square groups, and therefore have to be created, manipulated and displayed as graphical units.

For the non-Egyptologists among us, this is a Ramesside text; in red you have what is generally referred to as a "Ramesside group" (or "tall group").

Here, instead, you have an ordinary Egyptian text, and in red you have what are generally considered ordinary "square groups".

As you can see, the difference is that in ordinary Egyptian writing, signs are grouped into regular square spaces (or half-squares). In Ramesside writing, instead, signs tend to be combined into vertically elongated rectangles. These rectangles can contain multiple ordinary square groups.
For instance, the following Ramesside group from the text above: could be split into two ordinary square groups, and , in the ordinary square-group writing.

As said, people often assume that these clusters of signs in Ramesside texts have to be understood as "groups" analogous to the "square groups" of the ordinary writing.

But what if the Ramesside "groups" were actually not really "groups"? Or better, what if there were an easier and more efficient way to analyse, interpret, describe, and therefore display them?

People often seem to assume that there are only two main ways to write a string of text, namely vertically or horizontally. This, however, is not true. It is also possible to write short horizontal strings of text within a larger main vertical frame, and similarly it is possible to write short vertical strings of text within a larger main horizontal frame. The second case, in particular, is attested in Asian scripts. Have a look at the following image:

This is a Japanese text. As you can see from the next image, it is written in short vertical "columns" (red), organized into one general horizontal "line" (green).

Namely, the text is written (the text in the picture runs right to left; I am transcribing it left to right because it is easier to explain the concept):

---------------
?????
?????
---------------

But it has to be read:

?? | ?? | ?? | ?? | ??

namely:

??????????

Japanese writing has "groups" (i.e. characters, marked in white in the following picture) that in some general respects could be compared to Egyptian groups (they are not identical, I know, but *in some respects* they are conceptually comparable).

Now, *no one* dealing with Japanese writing would consider the short vertical strings of text (the red bits above) as some sort of "non-ordinary elongated groups (characters)". And *no one* would ever consider encoding such hypothetical elongated vertical groups in Unicode, or devising control characters or anything else to pre-compose them.
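The reading order of such a layout can be made concrete with a small sketch. It is only an illustration under stated assumptions: Latin letters stand in for the Japanese characters (which did not survive in this archive), and lay_out and reading_order are hypothetical helper names, not functions of any library or text editor.

```python
def lay_out(units, height):
    """Arrange a logical sequence into rows so that every `height`
    consecutive units form one short vertical column."""
    return ["".join(units[r::height]) for r in range(height)]


def reading_order(rows):
    """Recover the short columns, left to right, from the displayed rows."""
    return ["".join(column) for column in zip(*rows)]


logical = "AaBbCcDdEe"          # stored text: a plain sequence, nothing else
rows = lay_out(logical, 2)      # what appears on the page, row by row
for row in rows:
    print(row)
print(" | ".join(reading_order(rows)))
```

The stored text stays a plain sequence; only the layout step knows that it is displayed as short columns within one horizontal line.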
These short vertical strings are just sequences of *regular* groups (characters) written vertically within a main horizontal layout. And in order to represent such a text in Unicode, you just have to play with the layout of your text in your text editor. You don't need special control characters or anything else; it is *just* a question of *layout*.

Now, Ramesside inscriptions can be analysed in exactly the same way. In other words, what you are interpreting as the Ramesside "abnormal elongated/tall groups" (red in the picture above) could actually be interpreted just as short strings of text written vertically (i.e. short *columns*) within a bigger main horizontal layout (i.e. in lines, in green).

Such an interpretation has a few advantages, both with respect to the actual data from the ancient texts and as far as Unicode etc. is concerned.

First, someone was pointing out that there are thousands of such "Ramesside groups" and that a large part of them is attested only once. Well, if you interpret them as short columns of text, rather than as groups, you understand why: those are short strings of text, they are not isolated independent graphic and orthographic units.

This also helps to answer the question: "how many groups are you expecting to find in the future?" Virtually, potentially, infinitely many. If tomorrow we should find a new Ramesside temple with a new long text inscribed in "Ramesside groups", there could be hundreds, even thousands, of these new short vertical strings of text. Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as a single independent "group". It does not make much sense, just as it would not make sense to try to describe and encode every single horizontal sequence of groups (characters) in the Japanese text above.

Rather, it would make much more sense to deal with these "Ramesside short vertical strings of text disposed in horizontal lines"
as if they were... well, short vertical strings of text disposed in horizontal lines.

This can be *easily* done at the layout level, either with XML/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that already. It is enough to use a vertical font and create a page layout with horizontal sequences (= the lines) of very short vertical columns. That is all you need.

The Ramesside "group" (or "tall group") used by Bob as an example in his last proposal, for instance, would not need to be built and displayed as a "group" (i.e. with control characters etc.) at all. It could just be input as a plain sequence of independent signs displayed with a vertical font within a vertical column, in a general layout that presents horizontal sequences (i.e. "lines") of such short columns.

No need for ligatures, no need for control characters, nothing. Just vertical fonts and properly set up layouts for the page where the signs will be displayed.

Obviously, there will still be signs that need to be combined in "groups", i.e. in "real" square groups. As you can see from the picture above, however, if you interpret the Ramesside texts as composed of short vertical columns, the number of groups needed decreases significantly. And in general, those groups are often the same basic groups that you find in good old ordinary Egyptian square-group orthography.

So there is no need to list and encode (in Unicode or at the font level, it doesn't matter) thousands of unique groups; we would just need to encode the basic square groups and then play a bit with the layout of the text.

And this brings me to the next point:

7) What are the "square groups"?

I have never specifically worked on square groups from a linguistic point of view, and I do not know if there is any specific study about square groups and their graphic behaviour within a larger linguistic frame (i.e. for instance comparing them with the behaviour of similar "groups"
in other writing systems). This is one of those points where the experience of some of you could be extremely useful.

Such a study could be used, for instance, to define some contextual rules to allow the font to automatically manage the combination of at least some of these groups, or at least to define some rules to automatically prioritize one ligature over another (if we are working with ligatures at the font level).

If such a study exists, then it would be useful to take it into consideration in the discussion. If such a study does not exist, then perhaps it would be worth considering doing it *before* submitting any new proposal involving control characters to the Unicode Consortium, because I think it would be better to be sure we really understand how those groups work before suggesting a method to encode them. I am obviously talking about control characters etc.; I am not talking about expanding the basic set of glyphs.

Otherwise, we would be encoding something whose actual functioning has never been studied, and therefore we would risk encoding features and elements that are actually superfluous, or that could have been managed in a more efficient way at other levels (ligatures within the fonts, layout tables at the text-editor level etc.).

OK, I guess this is more or less all I wanted to say. Now feel free to ignore me :-)

Best
Marwan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 799 bytes
Desc: not available
URL:
From s.polis at ulg.ac.be Sat Jul 23 16:00:30 2016
From: s.polis at ulg.ac.be (Stéphane Polis)
Date: Sat, 23 Jul 2016 17:00:30 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References:
Message-ID: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Hi Marwan,

Thanks for your mail! Some very quick answers to your suggestions.

> First:
>

No worries, everyone is entitled to have an opinion based on his/her own experience, and I would certainly agree that no one (except maybe Serge) has a proper understanding of all the aspects involved here.

> Second:
>

Sure, no "right" solution; we simply want a solution that meets our minimal needs. The definition of these needs is what we do not seem to agree on.

> We are not ancient Egyptians, we don't know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations. Which are not "right" or "wrong", but rather "more efficient" or "less efficient" in respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

Certainly, but they give us some clues that we should not ignore (see e.g. below, re point 6).
> Third
>

As this is an issue recurring again and again, I would like to stress one more time that, unlike what is done for most other scripts, we are not producing/inputting new texts, but transcribing old ones, hopefully without losing too much information. And your solution is a good one (I have absolutely no doubt) as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the "orthographic" system of ancient Egyptian and to which we want to have access (see further Simon's mail earlier this week).

> 1) JSesh approach
>

Do not underestimate your addressees too much, Marwan ;) Furthermore, this is not a JSesh-based mentality, but an MdC-based mentality: the people behind this standard had a pretty good idea of what they were doing, trust me. There are a lot of problems, but one should simply not throw out the baby with the bath water.

> I had the feeling during the meeting and reading your emails that many of you think in a very JSesh-based mentality. Like I had the feeling that at least some of the participants' goal is to imitate JSesh in Unicode.
>
> But Unicode *is* *not* JSesh, or at least should not be JSesh.
> Using control functions, parentheses etc can be an efficient way in JSesh, but I doubt that a system requiring to input 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan's last proposal) makes much sense in a Unicode perspective. Especially considering that you can obtain exactly the same result without control character and just with 1 (sic!) general ligature within the font that will be automatically rendered by usual standard already available rendering algorithms.
>
> In other words, what can be an efficient solution to a problem in JSesh, is not necessarily a good solution in Unicode.
> And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithm (i.e. ligatures embedded in fonts), with layouts at the editor level etc. that can help solving the problems on different level than just on the Unicode encoding level.
>

Sure, not everything needs to be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.

> 2) Groups in fonts
>
> Someone in the list was saying that the groups need to be listed, but that not all of them necessarily need to be built in the font (or something like that).
>
> This is not true: compound characters and groups will not automatically build on their own, and if you envisaged the possibility to build a group that you *do* need to have in the font, otherwise it won't display correctly and you will have random broken control-characters popping out here and there in your hieroglyph text. This perhaps will not bother you in your database etc, but it will be just unacceptable for common users (and, for instance, for editors).
>
> If you are accepting that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font.
>
> Whether you will build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.
>

OK, sure. But then again, control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed to store such information, as a lay user, when using ligatures? (This is a real question, nothing ironic here.)

> 3) the 4 "small sign in the corner of big sign" control characters.
>

I'm glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 "variables"
(Top vs Bottom and Left-Right); if the sequence of signs indicates the Left-Right unambiguously (as in your example), then we only need to encode the Top vs Bottom explicitly, of course. This is basically how we represented things with Michael at the pub. Please keep in mind that the four operators come from another type of syntax.

> 4) Vertical and horizontal script and control characters
>

I do not understand why you entirely get rid of the notion of "quadrat" in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.

> 5) special characters, vertical/horizontal texts and input methods
>

Your point escapes me here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.

> 6) Ramesside "groups" (or "tall groups").
>

Why are they groups or quadrats and not "small columns"? The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example. Again, it might not be easy to handle from a technical point of view, I do agree, and you might want to split them into smaller groups for the sake of convenience, OK, but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?) Again, if Unicode is meant for writing names of tourists (not even in cartouches, as it seems) or for preparing simplified layouts for online teaching grammars, etc., that's really fine.
But if you want to be able to use it minimally for (1) standardized electronic corpora and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement.

> Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as single independent "groups".
>

Are you sure? Come on...

> Rather, it would make much more sense to deal with these "Ramesside short vertical strings of text disposed in horizontal lines" as if they were.. well, short vertical strings of text disposed in horizontal lines.
>
> This can be *easily* done at the layout level, either with xml/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that, already. It is enough to use a vertical font, and create a page layout with horizontal sequences (= the lines) of very short vertical columns.
>

The wording is transparent: easy, but unfortunately inaccurate and irrelevant for scholarly uses (not palaeography, of course: palaeography has to do with the actual appearance and style of individual signs).

> 7) What are the "square groups"?
>

I provided the definition agreed on (I think) by everyone; if you have a better one, I'm listening of course.

A final remark (aimed at everyone, not an answer to Marwan, of course). I spent hours and days reading and thinking about the arguments during the last few weeks, and I've got the impression that we are repeating the same obvious points again and again. I'm sorry to put it so bluntly, but if Unicode is not to be useful for the majority of Egyptologists, so be it. It will remain what it is, a standard not used by the community: journals, corpora, etc. will keep on using JSesh and other tools; they're doing well with them.

On the other hand, the Egyptologists around the table seem to agree that we "simply" need (and it does not sound extraordinary to me):

1 - to build quadrats.
2 - to create (sub)groups of signs within quadrats.
3 - to be able to position (groups of) signs with respect to other (groups) either vertically, horizontally or in other "corner"-like positions (the INSERT-like operators). [As a side-note to Bob: the positioning of groups of signs in corners is trivial in monumental inscriptions: the low number of them in Ramses comes from the fact that we encoded very few hieroglyphic texts; check the first pages of the KRI to get an idea.]

These are the basic principles, illustrated ad nauseam in the (files attached to the) mails before (and we leave out everything else as mandatory requirements). If this is not possible to envision, I think that we can close the discussion, without anger or regret: Unicode cannot be used by most of us. That's a pity, but so be it.

And I would find it really great if one could now stop asking for more data regarding these questions or postponing the decision for bad reasons: we provided more than is needed. If you do not want to take these cases into account because they are not frequent enough (on which basis, e.g., for someone working only with this kind of text?) or because they are hard to implement, that's fine; but please do not invoke the lack of evidence.

Have a good weekend folks!

Stéphane
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mn31 at st-andrews.ac.uk Sat Jul 23 16:22:56 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 23 Jul 2016 16:22:56 +0100
Subject: [Egyptian] Vertical vs horizontal writing-mode
In-Reply-To:
References:
Message-ID: <1596000.3AgpvMF7jM@thuis>

Dear Richard,

A belated thank you.

On Wednesday 20 Jul 2016 11:55:31 ishida at w3.org wrote:
> a.
perhaps some combination of the above smoke and mirrors techniques > may be adequate to manage some of the differences between layout in > horizontal vs vertical writing-mode when the thing we are struggling > with is the spacial relationships between the elements circumscribed by > a quadrat when they are rendered. I'm not aware of a fail-safe way to convert vertical to horizontal. If smoke and mirrors do something reasonable, then that is better than nothing. Here 'nothing' could mean putting groups side by side horizontally, no matter how high a group is. Note that the usual encodings of hieroglyphic text place no restriction on the width of a group for horizontal writing, or the height of a group for vertical writing. Our point in the original proposal was that if you want to do more than 'nothing', then the rendering application must at the very least be able to detect that the writing-mode has changed and that there is something to be done in the first place. Horizontal and vertical text use the same code points, just in slightly different ways, so there is no automatic way to detect a change of writing-mode. Unless there are characters that contain the needed information. You mention other scripts where some characters are more likely to be found in one or the other writing mode, but there seem to be no existing scripts where some character uniquely identifies the intended writing-mode. So there is no precedent for us to invoke to argue for something similar for Ancient Egyptian. > b. perhaps it's not particularly problematic that you can't > automatically flip between horizontal vs vertical without changing code > points, especially when one considers that there is anyway so much > variation in 'spelling' of egyptian content, often to fit the visual > space available. It seems acceptable for now to assume all text is horizontal, and it is up to the encoder to manually rearrange groups if not. But it is not ideal. 
I would suggest keeping the issue on the agenda, and at least flagging up that there is a problem to be solved. If it cannot be solved within the constraints of Unicode, too bad.

> c. if the control characters used to indicate the positioning of
> hieroglyphs within the quadrat display space are treated like other
> Unicode control characters, ie. they are not part of the semantics and
> are ignored for sorting, searching, and processing the text for meaning,
> rather they are just cues for visual arrangement, then perhaps it's not
> a big issue either if they are different for vertical vs horizontally
> rendered content.

You're right, this matter is irrelevant for (most) sorting and searching. But let me remark that as part of our investigations, Stéphane and Serge analysed occurrences of certain groups, through automatic search, distinguishing between horizontal and vertical text. Unless I misunderstood, this was possible because knowledge about the writing-mode was available in the Ramses corpus. So there is at least one example of text with indication of writing mode being more useful than text without that indication.

Best regards,
Mark-Jan

From bobqq at live.co.uk Sat Jul 23 17:10:00 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 23 Jul 2016 17:10:00 +0100
Subject: [Egyptian] Brackets in the TLA encoding
In-Reply-To: <1922257.noGW0DXB86@thuis>
References: <1922257.noGW0DXB86@thuis>
Message-ID:

Hi Mark-Jan

A mathematician or physicist will also tell you not to use Einstein's equations when Newton's suffice. Especially while cycling down a hill at speed, however well versed you are in tensor calculus.

On a more serious note, have you organized the data from Stéphane to map against your model with 4 corners yet? No point in us both doing the same thing. Do you have a list of examples that you think need large levels of nesting in your model? Have you any comment on group joiners?
You mention below "Using infix operators is really only justifiable if notation is meant for human consumption." Quite. Human and machine consumption is exactly what we are designing for. Text is about people, not parsers. Try opening my font description document in Word and insert/delete characters in hieroglyphic strings, see the control codes come and go. Think about what end controls in your model imply.

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Mark-Jan Nederhof
Sent: 23 July 2016 13:47
To: egyptian at evertype.com
Subject: Re: [Egyptian] Brackets in the TLA encoding

Hi Simon, Hi Stéphane, Hi All,

This is very helpful. Physicists tell us that if you want to gather and use data, you need hypotheses first, or else you don't know what to look for. For me the relevant hypotheses are:

(1) The primitives we have in our current document allow description of most of the groups in an accurate enough way. Both 'most of' and 'accurate enough' are subjective of course. There is no escaping that.

(2) It would be quite difficult to reduce the expressive power before we would lose coverage. There is an implicit parameter, which is a limit on the depth of nesting, which I assume is 3. As Simon also confirmed once more, 2 is not enough, even for the most basic, run-of-the-mill classical (horizontal) inscriptions.

As to (1), we have moved away quite considerably from descriptive power that is machine-interpretable. This was motivated by people finding the original encoding too complicated, and arguing that fonts would do a lot of fine-tuning anyway for particular choices of signs. Also, we don't really care about a sign being printed 0.5 mm too much to the left or to the right, as long as the user gets a rough idea of what the text looks like.
These arguments all sound reasonable, but realise two things:

* If even stupid machines don't know how to render an encoding roughly as it was intended, perhaps there is not enough information present for humans to know what was meant either.
* As stressed once more by Stéphane, the kinds of groups we are talking about are productive. We don't want to be manually fine-tuning the appearance of an unbounded number of groups, so some approximately correct automatic rendering would be quite useful.

I think we are still okay with the present version of the proposal, but we have moved a long way from existing routines that do the rendering in a deterministic, predictable manner, to needing lots more refinements to program code and the result being not quite well-defined.

As to both (1) and (2), the provided examples include quite a few cases of insertions and stacking, insertions into stacked groups, and even groups with insertions that are themselves inserted. So far I don't see either hypothesis refuted.

I had to struggle quite a bit to get rid of prefix operators. As anyone with the slightest knowledge of formal languages knows, prefix or suffix operators are ideal for automatic processing, because the problem of ambiguity simply does not exist, whereas endless volumes of textbooks since the late 1950s have been written about the ambiguities caused by infix operators and how to solve them using principled or not so principled methods involving operator precedence and low-level hacks in shift-reduce parsers. Using infix operators is really only justifiable if notation is meant for human consumption. That is why I was very surprised to hear objections with the argument that font technology is too primitive to handle prefix operators. If anything, I would have imagined that primitive tools would have a lot of difficulty with parsing in the presence of operator precedence and such.
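
The contrast between prefix and infix operators described above can be sketched in a few lines of Python. This is only a minimal illustration, not the grammar of the proposal: the operator names V (vertical stacking) and H (horizontal grouping), their fixed arity of two, and the sign names A, B, C are all hypothetical. It shows that two different groupings of the same three signs give different prefix token sequences, which a single left-to-right reader recovers without any precedence table, whereas their bracket-free infix spellings coincide.

```python
# Hypothetical operators: V = vertical stacking, H = horizontal grouping.
ARITY = {"V": 2, "H": 2}
SYMBOL = {"V": ":", "H": "*"}  # infix spellings, as in MdC-like notations


def to_prefix(tree):
    """Serialise a tree as prefix tokens: operator first, then operands."""
    if isinstance(tree, str):
        return [tree]
    tokens = [tree[0]]
    for child in tree[1:]:
        tokens.extend(to_prefix(child))
    return tokens


def to_infix(tree):
    """Serialise a tree as infix tokens WITHOUT brackets or precedence."""
    if isinstance(tree, str):
        return [tree]
    op, left, right = tree
    return to_infix(left) + [SYMBOL[op]] + to_infix(right)


def read_prefix(tokens):
    """Rebuild the tree from prefix tokens in one left-to-right pass."""
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok in ARITY:
            children = []
            for _ in range(ARITY[tok]):
                children.append(read())
            return (tok, *children)
        return tok  # a plain sign

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return tree


# A stacked over the row (B beside C), versus the stack (A over B) beside C.
stack_over_row = ("V", "A", ("H", "B", "C"))
stack_beside_sign = ("H", ("V", "A", "B"), "C")

print(to_prefix(stack_over_row))     # prefix forms differ ...
print(to_prefix(stack_beside_sign))
print(to_infix(stack_over_row))      # ... bracket-free infix forms do not
print(to_infix(stack_beside_sign))
```

An infix notation has to add brackets or a precedence convention to tell the two trees apart, which is the extra machinery the paragraph above refers to.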
I implemented OpenType substitution rules that analysed bracket structures and prefix operators myself, and that works fine. It would be a nightmare for me to have to implement OpenType substitution rules in the presence of operator precedence. There may be something in the arguments people use that I don't understand.

Anyway, one thing to look out for (I say this in particular to Simon, Stéphane and Serge, with whom this was discussed in detail in Cambridge), is that in the process of getting rid of prefix operators, and avoiding ambiguity, the following coverage was lost: it is not possible anymore to insert A into the top-left corner of B, and to insert the resulting group into the bottom-left corner of C. The same holds for the right corners. I have yet to see a group where this matters. It is still possible however to insert A into the bottom-left corner of B, and to insert the resulting group in the bottom-left corner of C. The same holds for two upper-left corners and the corresponding right corners. There are groups of these forms among the provided examples.

The problem of course is that inability to find certain structures in the corpora we happen to have at this very moment does not prove their non-existence. At best it means our encoding won't be too much lacking in terms of coverage.

Best regards,
Mark-Jan

On Thursday 21 Jul 2016 15:24:51 Simon Schweitzer wrote:
> Hi all,
> > @Stéphane: thank you for your .gly-files! In this mail, I want to add
> some remarks concerning the subgroup topic.
> > As in Ramses, there are many encodings with "(" and ")" in the TLA. I
> collected these encodings and I want to present you my evaluation:
> * In some cases, the encoding is invalid, e.g. (F12-S29):D21, which should be understood as F12*S29:D21.
> * Sometimes, the brackets are superfluous. There are many cases of Hiero1:(Hiero2*Hiero3). The brackets are not necessary: use Hiero1:Hiero2*Hiero3 !
> * But in many cases, the parsing without the brackets would be misleading:
> 1) There are many vertical groups in horizontal groups in vertical groups. I list only 10 examples:
> N35:"?"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1; ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
> Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22; ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21; ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
> D21:V28*(X1:B1) ("Stele des Montuhotep (Cambridge E.9.1922)", l. C.3; ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
> D21:V7*(W24:X1) ("Stele des Montuhotep (Kairo CG 20539)", l. I.b.18; ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
> D35A:(X1:Z4A)*G37 ("Stele des Antef (Louvre C 167 = E 3111)", l. C.1; ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
> D36:D21:N29*(X1:"?"*Z4*"?") ("Stele des Antef (Glasgow D1922.13)", l. A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> I10&I9:X2*(X4:Z2) ("Stele des Antef (Glasgow D1922.13)", l. A.7; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> K4:G1*(Z7:X1) (Sinuhe AOS, vs. 18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)
> As you can see, this kind of grouping occurs in hieroglyphic and in
> hieratic texts, and this feature is also attested in the "classical"
> period from the Middle Kingdom (the examples from the stelae of
> Montuhotep and Antef).
> 2) Horizontal grouping of vertical groups in columns. If the text is
> written vertically, there are cases of horizontal groups of vertical
> groups, e.g. in the Buch von der Himmelskuh
> (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4), M17*(Q3:N35), (X1:Z1)*I12,
> M17*S29*(A2:Z2) and so on.
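
The remarks above about superfluous and necessary brackets can be checked mechanically. The following is a minimal Python sketch, not the TLA or Ramses parser: it assumes, in line with the statement that Hiero1:(Hiero2*Hiero3) equals Hiero1:Hiero2*Hiero3, that "*" (horizontal) binds more tightly than ":" (vertical), and it treats anything else between operators (including ligatures such as Aa1&D58 and quoted placeholders) as one opaque sign. The function names `parse` and `depth` are made up for this illustration.

```python
import re

# Operators and brackets are single tokens; any other run of
# non-space characters is treated as one opaque sign (an assumption).
TOKEN = re.compile(r'[:*()]|[^:*()\s]+')


def parse(text):
    """Parse an MdC-like group, assuming '*' binds tighter than ':'."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = take()
        if tok == "(":
            node = vertical()
            if take() != ")":
                raise ValueError("missing closing bracket")
            return node
        if tok in (":", "*", ")"):
            raise ValueError(f"unexpected {tok!r}")
        return tok

    def horizontal():
        parts = [atom()]
        while peek() == "*":
            take()
            parts.append(atom())
        return parts[0] if len(parts) == 1 else ("h", tuple(parts))

    def vertical():
        parts = [horizontal()]
        while peek() == ":":
            take()
            parts.append(horizontal())
        return parts[0] if len(parts) == 1 else ("v", tuple(parts))

    tree = vertical()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return tree


def depth(node):
    """Nesting depth of groups: a bare sign has depth 0."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1])


# Superfluous brackets: both spellings give the same tree.
print(parse("Hiero1:(Hiero2*Hiero3)") == parse("Hiero1:Hiero2*Hiero3"))

# Necessary brackets: dropping them changes the structure and the depth.
print(parse("D21:V28*(X1:B1)"), depth(parse("D21:V28*(X1:B1)")))
print(parse("D21:V28*X1:B1"), depth(parse("D21:V28*X1:B1")))
```

Under these assumptions, D21:V28*(X1:B1) comes out as a vertical group containing a horizontal group containing a vertical group (three levels), while the bracket-free spelling collapses to two levels.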
> > Best regards,
> > Simon

_______________________________________________
Egyptian mailing list
Egyptian at evertype.com
http://evertype.com/mailman/listinfo/egyptian_evertype.com

From odusseus at gmail.com Sat Jul 23 18:12:18 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 23 Jul 2016 19:12:18 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID:

Hello everyone

*Third*
> > And your solution is a good one, I have absolutely no doubt, as long as
> you do not want to *search* for the relative position of signs with respect
> to one another. This is however a piece of information that is, like it or
> not, part of the "orthographic" system of ancient Egyptian and to which we
> want to have access (see further Simon's mail earlier this week).
>
In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs. This is demonstrated by the texts themselves: if the relative position of the signs were linguistically important, you would have some form of regularity, with some combinations being possible and others being forbidden. This is not the case. Combining three signs into a group, or writing them one after the other, is linguistically exactly the same. It is just an esthetic, a layout, matter. Not a linguistic one. No more than illuminated initial capital letters in medieval manuscripts.
Like these: http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg

They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to encode or represent them in Unicode (and in fact they are not encoded).

As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts. But in fact, in order to find what sign could be missing in a given lacuna, or what sign could be hidden behind a hieratic ligature, you need dictionaries and corpora of texts where you can search for sequences of signs *independently* from their spatial position (what sign x is attested after sign y, whether combined in a similar group or not?); you don't need to code anything in Unicode. It can be a plus, but it is not indispensable.

> *1) JSesh approach*
> > > Sure, everything needs not be dealt with at the level of Unicode, but the
> data needed should not be hidden in the ligatures embedded in the font
> either.
>
Again, what data? What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code? Perhaps this question has been already asked, but honestly so far I haven't heard or read any convincing answer, and I haven't seen any common example (I cannot exclude there could be some uncommon case, I obviously have not seen all the Egyptian texts existing in the world) where the position of a sign with respect to the other signs around it carries any linguistic information.

*2) Groups in fonts*
> > OK, sure. But then again control characters have the advantage of being
> explicit about the relative position of signs when a group is not in a
> font: how would you proceed for storing such an information, as a lay user,
> when using ligatures?
(this is a real question, nothing ironic here).
>
I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. The ligature IS the information. And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively, if I write my text with font x and ligature x, it would be enough to embed the font itself in the text file; there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures.

> *3) the 4 "small sign in the corner of big sign" control characters.*
> > I'm glad to see that your solution is exactly the same as the one we
> suggest! Indeed, there are logically 2 "variables" (Top vs Bottom and
> Left-Right), if the sequence of signs indicates the Left-Right
> unambiguously (like in your example), then we only need to encode
> explicitly the Top vs Bottom, of course. This is basically how we
> represented things with Michael at the pub.
> Please keep in mind that the four operators come from another type of
> syntax.
>
This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A, instead of top-bottom, left-right + four corners. Or perhaps I am not understanding your system.

> *4) Vertical and horizontal script and control characters*
> > > I do not understand why you do entirely get rid of the notion of "quadrat"
> in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text
> would be displayed correctly whatever the orientation (hltr, hrtl, vrtl,
> vltr), no? Of course, one needs the notion of quadrat in the encoding for
> this; and this is a nice case for showing that we *need* this well-defined
> notion in the encoding, not just sequences of glyphs.
>
If you want to have the "notion of quadrat" in Unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrat* in your text. If I understood correctly, one of the proposals that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it makes sense in Unicode (and more importantly, I doubt the people of the Unicode Consortium will think it makes sense). Unless you want to code every single possible "quadrat" as an independent glyph: that would thus end up being conceptually comparable to a Chinese character. But I assume (I hope) we don't want to do that, right?

*5) special characters, vertical/horizontal texts and input methods*
> > > Your point escapes me, here. Unless it is a result of getting rid of the
> quadrats: within quadrats, the groups of hieroglyphs are essentially the
> same.
>
With quadrats marked by special control characters it would be even worse, as you would have to take into consideration even more possible combinations.

> *6) Ramesside "groups" (or "tall groups").*
> > > Why are they groups or quadrats and not "small columns"?
> > The answer is simple and straightforward (and provided by your example):
> because they correspond to the size occupied by the A1 sign. Look at the
> example.
> > but some signs do occupy the full height of the horizontal line (unlike in
> your Japanese example, which is nice, but as such irrelevant), which is the
> basis for deciding what counts as a unit (quadrat) or not. (Or do you have
> another definition in mind?).
>
Not true. This can be interpreted just as a question of layout, not as a question of grouping.
Simply, there could be signs that can be graphically stretched/enlarged to fully fill a given space alone, while others will be squeezed to fit together within an equivalent space. This does not say anything at all about them being "groups" or not. It can be interpreted just as a question of layout. And actually this is very common in various writing systems around the world, and no one would consider the fact that some isolated glyphs appear as big as some "combined" ones as a reason to consider the "combined ones" as "groups".

Just a few examples from the internet: have a look at this Chinese (bopomofo + characters) text:
https://upload.wikimedia.org/wikipedia/commons/1/1b/Bopomofo_in_Regular,_Handwritten_Regular_%26_Cursive_formats.jpg

According to your way of interpreting groups ("some signs do occupy the full height of the horizontal line, which is the basis for deciding what counts as a unit (quadrat) or not."), the four little characters within the parentheses on the right of the image should be interpreted as a single "group" (or as a combined "character", as we are in China) just because they, combined, are as big as some of the other single characters. This is not the case. No one in Asia would consider such a combination as a "group" or as a "combined character". They are just four independent characters (or "four groups" assuming character = group, which conceptually is a fairly sound equivalence) which happen to fit into the space in a slightly different way compared to the other characters of the text. Their appearance is just a question of *scale*, thus of layout. Not of grouping, and it does not say anything about what a "group" should be.

Here is an even better example, a Chinese text with bopomofo and hanzi characters:
http://chinesehacks.com/app/uploads/2010/06/zhuyin.jpg

No one would consider those small signs on the right of the big characters as "groups" just because when combined together they end up being as big as the single bigger characters.
And as you can see, conceptually, you can interpret this text as a sequence of short vertical columns organized in horizontal lines. Some of these short vertical columns are occupied by a single "scaled up" character (like the A1 in the Egyptian text above) while other short vertical columns are occupied by multiple "scaled down" signs one above the other. It is the same with Ramesside writing. No one, however, would consider those smaller "scaled down" signs written one above the other as "groups" just because together they are as big as the big ones.

Same with Latin script, actually: for instance take again the manuscript page above:
http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg

According to your definition, one should take the tallest sign (i.e. the initial "B") as the unit to define what a "group" is. As a consequence, one should consider the 6 lines of text beside it not as "lines of text", but as a single "group", and according to your approach those lines should not be represented with a specific layout, but should be somehow combined with control characters. I hope we agree that this would make no sense.

So if in Chinese (and Japanese, and Korean) and in Latin script you can have glyphs of different sizes in the same text, without the need to conceptually cluster the "smallest ones" into "groups" that will appear as big as the biggest signs, why do you feel this need with Egyptian?

And note that this is different from Egyptian ordinary square writing, as in ordinary square writing one can assume that all the signs (both those combined into square groups and those outside square groups) are more or less in the same *scale* (as a general principle; again, I know there are exceptions). In Ramesside writing, instead, the *scale* of the sign is not always the same.
Some are "scaled down" to fit together within one of those "small vertical columns" while others (like the A1) are just "scaled up" to fill up alone the same space of one of those "small vertical columns". Exactly like in the Chinese examples here above, or exactly like with illuminated capital letters in manuscripts. It is (or it can be interpreted) as a question of layout.

> But if you want to be able to use it minimally for (1) standardized
> electronic corpus and (2) journals like LingAeg, etc. dealing with the
> Egyptian language, this would be a requirement.
>
All the Egyptian words quoted in my article published in the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-)

Beside the anecdote, you can do everything scientific you need with Unicode without ending up using tens of control characters. And if you can't, then perhaps you should consider that Unicode is not the best tool for your job..

It is actually funny: I have tried to understand all the various proposals including control characters, and as a result I have pointed out *several* practical problems in the use of control characters that can seriously affect the *linguistic* information of the text (like control characters requiring a change to the actual inputting order of the actual hieroglyphs to display them correctly), some still without an answer, or with answers so complex that they will hardly be realizable (see the question of turning vertical text into horizontal text, which will require the introduction of at least one more control character).
On the other hand, almost no one so far has pointed out any real problem in the use of ligatures (possibly combined with a very restricted number, 1 or 2, of control characters as suggested by Bob) for which there isn't already a possible solution which is already implemented in some script around the world (i.e. unwanted ligatures can be broken with a zero-width character like in Indian scripts, different ligatures using the same signs could be selected with "variant characters" like in emoji, etc). And still, a system that does not use multiple control characters will be good only for writing tourists' names.. Please, no offense, but again: the story of the elephant..

Trying to describe (and encode) them as independent units (or devising
> control characters to build them) is more or less like trying to encode
> every single column of hieroglyphic text as single independent "groups".
> > Are you sure? Come on...
>
It is you who is talking about encoding (as glyphs in a font, not necessarily as Unicode characters) tens of thousands of possible and often unique combinations, not me.. Ramesside "tall" groups can be interpreted as short vertical texts. Thus thinking of encoding them (or building them with control characters) is like thinking of encoding whole columns of text as groups.

> The wording is transparent: easy, but unfortunately inaccurate and
> irrelevant for scholarly uses (not palaeography, of course: palaeography
> has to do with the actual appearance and style of individual signs).
>
Again, why?

> *7) What are the "square groups"?*
> > > I provided the definition agreed on, I think, by everyone, if you have a
> better one, I'm listening of course.
>
See above. And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any).
It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of Unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database; and in this respect, to respond to your last remark..).

I'm sorry to put it so bluntly, but if Unicode is not to be useful for the
> majority of egyptologists, so be it. It will remain what it is, a standard
> not used by the community: Journals, Corpora, etc. will keep on using JSESH
> and other tools, they're doing well with it.
>
Perhaps the opposite should also be considered? Perhaps the Unicode-based system to write hieroglyphs (considering that the very idea at the basis of Unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist Egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard Unicode system, then perhaps it is them who should use other tools? I guess this is an option that should be considered.. no?

Best
Marwan

From everson at evertype.com Sun Jul 24 00:33:37 2016
From: everson at evertype.com (Michael Everson)
Date: Sun, 24 Jul 2016 00:33:37 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To: <1897138.Z3PCXJQWcV@bear>
References: <1897138.Z3PCXJQWcV@bear>
Message-ID:

I use different words and different glyphs for these.

But you have END, SEPARATOR, and EMPTY which we did not discuss in Cambridge.

> On 22 Jul 2016, at 18:05, Mark-Jan Nederhof wrote:
> > Dear all,
> > We adapted our proposal.
Please find it in: > > https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf > > We have responded to the criticism about brackets and prefix operators. We have gotten > rid of all prefix operators and replaced them by infix operators. As for brackets, it is well > possible to get rid of them too (not entirely, there are still the cartouches), > but the price to pay for this is added complexity of the syntax due to needing several > copies of each primitive with different operator precedence, perhaps three or four. > It is outlined in Section 9 how this would be done. This adds to the complexity > that already exists, after the prefix operators were removed; have a look at > appendix A and see whether you can verify the grammar is unambiguous. > > There are no names on the proposal. That is partly because not everyone from > TLA/Ramses/St Andrews has had a chance to look at it yet, and partly because > anyone is welcome to have their name added if they feel they contributed and > subscribe to the content. > > Mark-Jan > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com From mn31 at st-andrews.ac.uk Sun Jul 24 00:50:50 2016 From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof) Date: Sun, 24 Jul 2016 00:50:50 +0100 Subject: [Egyptian] proposal 2016-07-22 In-Reply-To: References: <1897138.Z3PCXJQWcV@bear> Message-ID: <2376856.de12YEsUQd@thuis> END and EMPTY have always been there. END is part of the brackets that you didn't like, as you said during my presentation. I'm sure EMPTY was mentioned during my presentation as well. The SEPARATOR was introduced because (I believe) you said that there should be something between each pair of characters that is combined in some way. 
The choice for a single operator SEPARATOR for both horizontal and vertical grouping was motivated in my message 13/07/2016 11:47 (addressed to you personally, before there was the email list). Mark-Jan On Sunday 24 Jul 2016 00:33:37 Michael Everson wrote: > I use different words and different glyphs for these. > > But you have END, SEPARATOR, and EMPTY which we did not discuss in Cambridge. > > > > On 22 Jul 2016, at 18:05, Mark-Jan Nederhof wrote: > > > > Dear all, > > > > We adapted our proposal. Please find it in: > > > > https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf > > > > We have responded to the criticism about brackets and prefix operators. We have gotten > > rid of all prefix operators and replaced them by infix operators. As for brackets, it is well > > possible to get rid of them too (not entirely, there are still the cartouches), > > but the price to pay for this is added complexity of the syntax due to needing several > > copies of each primitive with different operator precedence, perhaps three or four. > > It is outlined in Section 9 how this would be done. This adds to the complexity > > that already exists, after the prefix operators were removed; have a look at > > appendix A and see whether you can verify the grammar is unambiguous. > > > > There are no names on the proposal. That is partly because not everyone from > > TLA/Ramses/St Andrews has had a chance to look at it yet, and partly because > > anyone is welcome to have their name added if they feel they contributed and > > subscribe to the content. 
> > > > Mark-Jan

From s.polis at ulg.ac.be Sun Jul 24 11:33:23 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Sun, 24 Jul 2016 12:33:23 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID:

Hi guys!

(no worries, this is my last mail on the topic, enough time and energy spent on this.)

>> Third
>> > And your solution is a good one, I have absolutely no doubt, as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the "orthographic" system of ancient Egyptian and to which we want to have access (see further Simon's mail earlier this week).
> > In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs.

This is patently inaccurate. Three simple examples should suffice here (sorry for providing textbook examples known by all).

1 - spatial distribution affecting phono-morphology
[t] followed by [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same "linear order", 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes).

2 - spatial distribution as a condition for reading (and adding semantic value)
makes no sense, when combined as it is clear that it should be read /ptH/ "Ptah" (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth).
The position of the sign is both a condition for reading and added semantic information (not only Ptah, but Ptah as demiurge).

3 - spatial distribution affecting the function of a sign
Hieratic example... If the rowing man is followed by an n ([image]), the n has to be read /n/ (phonemogram); if the n is positioned under the rowing man ([image]), we are dealing with a compound classifier made of two signs: the /n/ has an iconic value and the whole group refers to a man rowing in water.

Etc., etc.

So position is 'just an esthetic, layout matter, not a linguistic one', as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. It's a pity to make such unwarranted statements in a discussion also aimed at non-specialists, to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone.

Accordingly, if Unicode aims first and foremost at rendering the "linguistic" dimension of writing, the examples above should suffice to show the importance of the "quadrat" organization of this script. Again, you might not like it from a computer/font-oriented perspective; I agree it's not convenient to encode, and it might not even be possible to encode it at all in Unicode because the standard does not have the capabilities needed (and only higher-level protocols could then handle this). All of this is fine with me, but it would be great not to distort the presentation of the data.

[I leave aside here other semiotic dimensions of writing: the spatial arrangement is part of the "orthography" of the scribes, not necessarily meaningful at the "linguistic" level strictly speaking, but at the level of scribal practices, etc. Why don't we use IPA for our modern languages? Simply because writing is much more than "linguistic" stricto sensu, and it makes sense to know who writes "next" and who plays with the script and writes "neckst". As simple as that.]
> No more than illuminated initial capital letters in medieval manuscripts.
> Like these:
>
> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg
>
> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to (and in fact they are not) encode or represent them in Unicode.

This comparison is hilarious.

> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts.

Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels.

>> 1) JSesh approach
>>
> Sure, not everything needs to be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.
>
> Again, what data?
> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code?

See above.

>> 2) Groups in fonts
>>
> OK, sure. But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such information, as a lay user, when using ligatures? (This is a real question, nothing ironic here.)
>
> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. The ligature IS the information.
> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font.
> And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively, if I write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so)), because everyone would be using the same font or at least the same set of ligatures.

All this was clear, sure. What I meant is that the ligatures are purely "graphic", right? No information is stored about the position of one sign with respect to another?

>> 3) the 4 "small sign in the corner of big sign" control characters.
>>
> I'm glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 "variables" (Top vs Bottom and Left-Right); if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub.
> Please keep in mind that the four operators come from another type of syntax.
>
> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A, instead of top-bottom, left-right + four corners.
> Or perhaps I am not understanding your system.

OK, then I disagree because of the polysemic value of A/B and B/A.

>> 4) Vertical and horizontal script and control characters
>>
> I do not understand why you entirely get rid of the notion of "quadrat" in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.
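
The encoding /hthr-H*(n:t)-tA:tA/ quoted above can be read as a small tree. As a minimal sketch, assuming the usual Manuel de Codage operators (`-` between quadrats, `:` for vertical stacking, `*` for horizontal juxtaposition, with `*` binding tighter than `:`, and parentheses for sub-groups), the following Python parses such a string into a list of quadrats. The names `Sign`, `Group`, `parse` and `signs` are hypothetical and are not taken from JSesh, RES or any of the proposals discussed in this thread.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Sign:
    code: str  # sign code as written in the encoding, e.g. "tA"


@dataclass
class Group:
    op: str               # ":" = parts stacked vertically, "*" = parts side by side
    parts: List["Node"]


Node = Union[Sign, Group]

TOKEN = re.compile(r"[A-Za-z0-9]+|[-:*()]")


def parse(text: str) -> List[Node]:
    """Parse an MdC-like string into a list of quadrats (illustrative only)."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def atom() -> Node:
        tok = take()
        if tok == "(":
            node = vertical()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if not tok[0].isalnum():
            raise ValueError(f"unexpected token {tok!r}")
        return Sign(tok)

    def horizontal() -> Node:      # '*' binds tightest
        parts = [atom()]
        while peek() == "*":
            take()
            parts.append(atom())
        return parts[0] if len(parts) == 1 else Group("*", parts)

    def vertical() -> Node:        # ':' binds more loosely than '*'
        parts = [horizontal()]
        while peek() == ":":
            take()
            parts.append(horizontal())
        return parts[0] if len(parts) == 1 else Group(":", parts)

    quadrats = [vertical()]        # '-' separates whole quadrats
    while peek() == "-":
        take()
        quadrats.append(vertical())
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return quadrats


def signs(node: Node) -> List[str]:
    """Return the sign codes of one quadrat in encoding order."""
    if isinstance(node, Sign):
        return [node.code]
    return [code for part in node.parts for code in signs(part)]


quadrats = parse("hthr-H*(n:t)-tA:tA")
for q in quadrats:
    print(signs(q), q)
```

Because the result is a list of quadrats rather than a flat run of signs, a renderer could in principle place the same three units left to right or top to bottom without re-encoding, which is the orientation argument made above. Actual layout, scaling and sign lookup are outside this sketch.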
>
> If you want to have the "notion of quadrate" in Unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. If I understood correctly, one of the proposals that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it makes sense in Unicode (and more importantly, I doubt the people of the Unicode Consortium will think it makes sense).
>
> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a Chinese character.
>
> But I assume (I hope) we don't want to do that, right?

We use spaces between words; would it be bad to use quadrat separators between quadrats? Mutatis mutandis, it feels to me like saying "oh, no, there are blank spaces all around!".

More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is build quadrats (implicitly or explicitly).

>> 5) special characters, vertical/horizontal texts and input methods
>>
> Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.
>
> With quadrats marked by special control characters it would be even worse, as you would have to take into consideration even more possible combinations.

>> 6) Ramesside "groups" (or "tall groups").
>>
> Why are they groups or quadrats and not "small columns"?
>
> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example.
>
> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?).
>
> Not true.
> This can be interpreted just as a question of layout, not as a question of grouping.

Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you're entitled to have your own kind of Egyptian if you want! I'm not arguing against nonsense. But that's a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your "small columns" hypothesis. The Ramesside example that you provide simply does not work this way.

And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. Comparing apples with pears won't help: and again, that's a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them (unlike the quadrat structure).

> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement.
>
> All the Egyptian words quoted in my article published in the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names..
> ;-)

I'm afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as any evidence for disproving my point: you're happy with the way your hieroglyphs are rendered? Perfect. But why should it be the case that people wishing to render a simple hieroglyphic line such as the one below would face difficulties? This escapes me.

[image] (KRI I, 4)

>> 7) What are the "square groups"?
>>
> I provided the definition agreed on - I think - by everyone; if you have a better one, I'm listening of course.
>
> see above.
> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of Unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database, and in this respect, to respond to your last remark..).
>
> I'm sorry to put it so bluntly, but if Unicode is not to be useful for the majority of Egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSesh and other tools, they're doing well with it.
>
> Perhaps the opposite should also be considered... perhaps the Unicode-based system to write hieroglyphs (considering that the very idea at the basis of Unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist Egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard Unicode system, then perhaps it is them who should use other tools?
>
> I guess this is an option that should be considered..
> no?

We are talking about feelings and different perceptions here, so it's hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc.) will never be happy with what Unicode has to offer, because it is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting, for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the most open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating online resources, etc. I feel like we are very much in favor of standardization and striving to make resources accessible; two goals of Unicode, as you mentioned.

These resources are meant to last; it would be a pity not to think about the standard carefully before going in any direction.

That's all folks!
Have a nice weekend,

Stéphane

From ncs3 at cam.ac.uk Sun Jul 24 12:56:05 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Sun, 24 Jul 2016 12:56:05 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>

Greetings

Just wanted to say that Stéphane's last paragraph admirably summarises my perspective.

I haven't been reading these posts closely since I don't have the theoretical background on Unicode to have a view. But I do have some observations and a question.

I would say that, among the categories Stéphane mentions in his last paragraph, it is probably also worth making a distinction in the text corpora section between hieroglyphic and hieratic texts, simply due to the generally more regular arrangement of the latter, and hieroglyphic's notorious ability to be squeezed, expanded etc. to fit a space for a variety of reasons. So while those doing corpora won't be as fussed about precise arrangements as I can be, there are going to be times when there will be challenges with hieroglyphs.

Something that did come as a shock during the meeting, dawning on me in the course of a slightly tense exchange with Michael, was that Unicode fonts clearly can put a lot of burden on the font designer, especially for glyphs, if they are to include multiple and complex ligatures etc.
This is where the current fonts score well (such as the Cleo Fonts) in that they are about the design only, and the arrangement is then carried out in the specialist software (e.g. JSesh), where the ultimate control lies with the Egyptologist and not the font designer. [I know that won't be well received, but you know where I come from on all this!]

One question that I should have asked at the meeting is this: not knowing how control characters actually manifest themselves, will a text formatted with Unicode with all the control characters be able to be exported into a plain MdC format, or will one of these input systems be able to import an MdC text and format it correctly? I ask as there will be a lot of MdC plain text around on people's hard discs. Or will we ultimately have to revise the MdC system to handle all these other codes?

Apologies in advance for the evident failures of comprehension in some of the points above. I don't really need comments on the first points I made, but I would be interested in how the MdC issues might be handled.

Best, Nigel

From mn31 at st-andrews.ac.uk Sun Jul 24 17:11:05 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sun, 24 Jul 2016 17:11:05 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
References: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
Message-ID: <3369008.Gi8cJxaxqj@thuis>

Dear Nigel,

We address just this point in our proposal: conversion to and from more precise kinds of encoding outside Unicode. In fact, it was one of the fundamental considerations that guided us to the Unicode encoding that we are proposing. See the fifth bullet point on p. 1 and the beginning of Section 12 of:

https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf

In Section 12, RES is illustrated a number of times, but much applies equally to MdC (I should really say JSesh, because MdC is hopelessly vague, due to lack of a published standard). I implemented automatic conversion routines from JSesh to RES, of which the Unicode encoding is (roughly) a subset, and it should be equally possible to convert from JSesh directly to Unicode. Whether automatic conversion from Unicode to JSesh would be possible already, or whether JSesh would need to be extended, is a matter I gladly leave to Serge.

However, if I say "you can convert" it doesn't mean the output hieroglyphic text would look the same, and not even that the output of conversion is "correct", whatever that means. E.g. JSesh has absolute positioning and scaling, which are by definition impossible in Unicode. So this then has to be represented differently somehow.
Perhaps heuristics and/or hacks can achieve at least something reasonable, but it seems unlikely that you would be able to blithely dump all your old MdC code into a Unicode plain-text document after automatic conversion without at least some visual inspection and manual correction. That holds for the proposed Unicode encoding, and would hold for any alternative Unicode encoding.

Sorry if this is a disappointment to you, but it is inherent in the exercise: if you convert from a specialized, powerful and precise format to a simpler format, you need to sacrifice information. If you cannot afford to lose that information, you should not convert.

Mark-Jan

On Sunday 24 Jul 2016 12:56:05 Nigel Strudwick wrote:
> Greetings
>
> Just wanted to say that Stéphane's last paragraph admirably summarises my perspective.
>
> I haven't been reading these posts closely since I don't have the theoretical background on Unicode to have a view. But some observations and a question.
>
> But I would say that, of the categories Stéphane mentions in his last paragraph, it is probably also worth making a distinction in the text corpora section between hieroglyphic and hieratic texts, simply due to the generally more regular arrangement of the latter, and hieroglyphic's notorious ability to be squeezed, expanded etc. to fit a space for a variety of reasons. So while those doing corpora won't be as fussed about precise arrangements as I can be, there are going to be times when there will be challenges with hieroglyphs.
>
> Something that did come as a shock during the meeting, dawning on me in the course of a slightly tense exchange with Michael, was that Unicode fonts clearly can put a lot of burden on the font designer, especially for glyphs, if they are to include multiple and complex ligatures etc. This is where the current fonts score well (such as the Cleo Fonts) in that they are about the design only, and the arrangement is then carried out in the specialist software (e.g.
> JSesh) where the ultimate control lies with the Egyptologist and not the font designer. [I know that won't be well received but you know where I come from on all this!]
>
> One question that I should have asked at the meeting is this: not knowing how control characters actually manifest themselves, will a text formatted with Unicode with all the control characters be able to be exported into a plain MdC format, or will one of these input systems be able to import an MdC text and format it correctly? I ask as there will be a lot of MdC plain text around on people's hard discs. Or will we ultimately have to revise the MdC system to handle all these other codes?
>
> Apologies in advance for the evident failures of comprehension in some of the points above. I don't really need comments on the first points I made, but I would be interested in how the MdC issues might be handled.
>
> Best, Nigel
>
> On 24 Jul 2016, at 11:33, Stéphane polis wrote:
>
> > Hi guys!
> >
> > (no worries, this is my last mail on the topic, enough time and energy spent on this.)
> >>> Third
> >>>
> >> And your solution is a good one, I have absolutely no doubt, as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the "orthographic" system of ancient Egyptian and to which we want to have access (see further Simon's mail earlier this week).
> >>
> >> In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs.
> >
> > This is patently inaccurate.
> > Three simple examples should suffice here (sorry for providing textbook examples known by all).
> > > > 1 - spatial distribution affecting phono-morphology > > [t] followed by [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same ?linear order?, 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes). > > > > 2 - spatial distribution as a condition for reading (and adding semantic value) > > > > makes no sense, when combined as > > > > it is clear that it should be read /ptH/ ?Ptah? (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth). The position of the sign is both a condition for reading and an added semantic information (not only Ptah, but Ptah as demiurge). > > > > 3 - spatial distribution affecting the function of a sign > > Hieratic example? If the rowing man is followed by a n ( > > > > ) the n has to be read /n/ (phonemogram), if the n is positioned under the rowing man ( > > > > ), we are dealing with a compound classifier made of two signs, the /n/ has an iconic value and the whole group refer to a man rowing in water. > > > > Etc., etc. > > > > So position is 'just an esthetic, layout matter, not a linguistic one? as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. That?s a pity to make such unwarranted statements in a discussion also aimed at non-specialists to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone. > > > > Accordingly, if Unicode aims first and foremost at rendering the ?linguistic? dimension of writing, the examples above should suffice to show the importance of the ?quadrat' organization of this script. Again, you might not like it from a computer/font oriented perspective; I agree it?s not convenient to encode, it might even not be possible to encode it at all in Unicode because the standard has not the capabilities needed (and only higher level protocols could then handle this). 
All of this is fine with me, but it would be great not to distord presentation of the data. > > > > [I leave here alone other semiotic dimensions of writing: the spatial arrangement is part of the ?orthography? of the scribes, not necessarily meaningful at the ?linguistic? level strictly speaking, but at the level of scribal practices, etc.: why don?t we use IPA for our modern languages? simply because writing is much more than ?linguistic? stricto sensu and that it make sense to know who writes ?next? and who plays with the script and writes ?neckst?. As simple as that.] > > > > > >> No more than illuminated initial capital letters in medieval manuscripts. > >> Like these: > >> > >> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg > >> > >> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is not specific need to (and in fact they are not) encode or represent them in unicode. > > > > This comparaison is hilarious. > > > >> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts. > > > > Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels. > >>> 1) JSesh approach > >>> > >>> > >>> > >> Sure, everything needs not be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either. > >> > >> Again, what data? > >> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code? > > > > See above. > >>> 2) Groups in fonts > >>> > >>> > >> OK, sure. 
But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such an information, as a lay user, when using ligatures? (this is a real question, nothing ironic here). > >> > >> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. the ligature IS the information. > >> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, if would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively if i write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures. > > > > All this was clear, sure. What I meant is that the ligature are purely ?graphic? right? No information is stored about the position of one sign with respect to another? > > > >>> 3) the 4 ?small sign in the corner of big sign? control characters. > >>> > >> I?m glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 ?variables? (Top vs Bottom and Left-Right), if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub. > >> Please keep in mind that the four operators come from another type of syntax. > >> > >> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A. instead of top-bottom, left-right + four corners. > >> Or perhaps I am not understanding your system. 
> > > > OK, then I disagree because of the polysemic value of A/B and B/A. > > > >>> 4) Vertical and horizontal script and control characters > >>> > >>> > >> > >> I do not understand why you do entirely get rid of the notion of ?quadrat? in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs. > >> > >> If you want to have the "notion of quadrate" in unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. if i understood correctly, one of the proposal that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters that hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it make sense in unicode (and more important i doubt the people of the unicode consortium will think it make sense). > >> > >> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a chinese character. > >> > >> But I assume (i hope) we don't want do do that, right? > > > > We use spaces between words, would it be bad to use quadrats separators between quadrats? Mutatis mutandis, it feels to me like saying ?oh, no, there are blank spaces all around!?. 
> > > > More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is building quadrats (implicitly or explicitly). > >>> 5) special characters, vertical/horizontal texts and input methods > >>> > >>> > >> > >> Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same. > >> > >> > >> With quadrats marked by special control characters would be even worse, as you would have to take into consideration even more possible combinations. > >> > >> > >>> 6) Ramesside ?groups? (or ?tall groups?). > >>> > >>> > >> > >> Why are they groups or quadrats and not ?small columns?? > >> > >> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example. > >> > >> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?). > >> > >> Not true. > >> This can be interpreted just as a question of layout, not as a question of grouping. > > > > Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you?re entitled to have your own kind of Egyptian if you want! > > I?m not arguing against non-sense. But that?s a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your ?small columns? hypothesis. > > The Ramesside example that you provide simply does not work this way. 
> > > > And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. Comparing apple with pears won?t help: and again, that?s a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them. (unlike the quadrat structure). > > > > > >> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement. > >> > >> All the egyptian words quoted in my article published on the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-) > > > > I?m afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as any evidence for disproving my point: you?re happy with the way your hieroglyphs are rendered? Perfect. But why should it be the case that people willing to render a simple hieroglyphic line as the one below would face difficulties? This escapes me. > > > > > > > > (KRI I, 4) > >>> 7) What are the ?square groups?? > >>> > >>> > >> > >> I provided the definition agreed on ? I think ? by everyone, if you have a better one, I?m listening of course. > >> > >> see above. > >> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). 
It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database, and on his respect, to respond to you last remark..). > >> > >> I?m sorry to put it so bluntly, but if Unicode is not to be useful for the majority of egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they?re doing well with it. > >> > >> Perhaps the opposite should also be considered? perhaps the unicode-based system to write hieroglyphs (considering that the very idea at the basis of unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard unicode system, then perhaps it is them who should use other tools? > >> > >> I guess this is an option that should be considered.. no? > > > > We are talking about feelings and different perceptions here, so it?s hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc) will never be happy with what Unicode has to offer, because this is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the more open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) 
> > as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating online resources, etc. I feel like we are very much in favor of standardization and striving to make resources accessible; two goals of Unicode, as you mentioned.
> >
> > These resources should be built to last; it would be a pity not to think about the standard carefully before going in any direction.
> >
> > That's all folks!
> > Have a nice weekend,
> >
> > Stéphane
> > _______________________________________________
> > Egyptian mailing list
> > Egyptian at evertype.com
> > http://evertype.com/mailman/listinfo/egyptian_evertype.com
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From odusseus at gmail.com Sun Jul 24 17:15:24 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sun, 24 Jul 2016 18:15:24 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID:

Just one comment:

> And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian.

Funny to read such a comment when, at that meeting in Cambridge, a Japanese linguist and Egyptologist explained, by referring to scientific publications, that Japanese (and Chinese and Korean) are indeed based on quadratic structures that are very close in many respects to Egyptian.

There are many things that could be said in reply to your email, but let's be honest: with these premises, it won't make any sense.
p.s.:
No, actually, I am going to add something:
Look what happens (image 1 below) if you cut your Ramesside line into small columns (which is just a way of analyzing the text, as is yours of taking them as groups; but as we are not Egyptians, it is not the "truth", it is not and does not claim to be the "true nature of the text"). As I was saying: look what happens (image 1 below) if you cut your Ramesside line into small columns and put them one under the other.

Such a nice example of Egyptian vertical text, isn't it? :-)

And now, count how many groups you would need to compose such a text with a vertical font within a vertical layout.
You can see the result in the second image: the groups you will need are marked in green: 7 groups. That's all.
Note that you will not need to combine the D snake and the ns tongue as a group with the following signs, because with a vertical font you could put the baseline of the sign under its "horizontal" bit, and you could consider the tail as hanging below the baseline, as it does for the Latin letters "q", "g", "p" and so on. And consider that the "30" is already a single character in Unicode (as is the t&w, btw, so I would suggest you find a more relevant example to argue for the importance of the relative position of signs).

So if you use my way of interpreting Ramesside texts as short vertical strings of text within a main horizontal layout, you would need just 7 (sic!!!), in general very basic, groups to display it correctly. All the other signs could just be input one after the other, without control characters, without ligatures, without anything, within short columns one next to the other. And you could just use a basic layout algorithm to make them fit the space in a nice way (you know, like when you expand or squeeze the letters within a line to make your text look better? Exactly the same thing, but vertically).
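
The "expand or squeeze, but vertically" idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not an implementation that anyone in this thread proposed: the function justify_column, its parameters and the abstract height units are all hypothetical.

```python
def justify_column(sign_heights, column_height, min_gap=0.0):
    """Place signs top to bottom so that they fill one short column.

    Returns (scale, offsets): a common scale factor (at most 1.0) and
    the top offset of each sign inside the column.
    """
    if not sign_heights:
        return 1.0, []
    total = sum(sign_heights)
    gaps = len(sign_heights) - 1
    # Squeeze only when the signs plus the minimum gaps would overflow.
    scale = min(1.0, (column_height - gaps * min_gap) / total)
    free = column_height - scale * total
    if gaps:
        gap, y = free / gaps, 0.0      # expand: share free space between signs
    else:
        gap, y = 0.0, free / 2         # a lone sign is centred vertically
    offsets = []
    for height in sign_heights:
        offsets.append(y)
        y += scale * height + gap
    return scale, offsets


# Two signs of height 4 in a column of height 10: the gap is stretched.
print(justify_column([4, 4], 10))
# Two signs of height 6 do not fit: both are scaled down.
print(justify_column([6, 6], 10, min_gap=1))
```

The sketch treats every sign in a column alike; any grouping inside a column would still have to be handled separately.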
Instead, with your way of interpreting Ramesside writing, which assumes that every "tall group" is a real "group" that needs to be built and displayed within a single purely horizontal layout, and with your system of control characters, you would need 15 (sic!!!) groups, many of them very rare if not unique, and you would need tens of control characters nested one into the other to correctly build and display each of them.

I really wonder which approach would be more efficient, more economical and more easily implemented.

Image 1:
[image: Inline image 1]

Image 2:
[image: Inline image 2]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image_2.png
Type: image/png
Size: 62103 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image_1.png
Type: image/png
Size: 59893 bytes
Desc: not available
URL:

From odusseus at gmail.com Sun Jul 24 17:17:13 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sun, 24 Jul 2016 18:17:13 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID:

Have a nice Sunday evening

Marwan

(sorry, that was meant to be in the previous email, obviously :-p )

On Sun, Jul 24, 2016 at 6:15 PM, Marwan Kilani wrote:

> Just one comment:
>
>> And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian.
>
> Funny to read such a comment when, at that meeting in Cambridge, a Japanese linguist and Egyptologist explained, by referring to scientific publications, that Japanese (and Chinese and Korean) are indeed based on quadratic structures that are very close in many respects to Egyptian.
>
> There are many things that could be said in reply to your email, but let's be honest: with these premises, it won't make any sense.
>
> p.s.:
> No, actually, I am going to add something:
> Look what happens (image 1 below) if you cut your Ramesside line into small columns (which is just a way of analyzing the text, as is yours of taking them as groups; but as we are not Egyptians, it is not the "truth", it is not and does not claim to be the "true nature of the text"). As I was saying: look what happens (image 1 below) if you cut your Ramesside line into small columns and put them one under the other.
>
> Such a nice example of Egyptian vertical text, isn't it? :-)
>
> And now, count how many groups you would need to compose such a text with a vertical font within a vertical layout.
> You can see the result in the second image: the groups you will need are marked in green: 7 groups. That's all.
> Note that you will not need to combine the D snake and the ns tongue as a group with the following signs, because with a vertical font you could put the baseline of the sign under its "horizontal" bit, and you could consider the tail as hanging below the baseline, as it does for the Latin letters "q", "g", "p" and so on. And consider that the "30" is already a single character in Unicode (as is the t&w, btw, so I would suggest you find a more relevant example to argue for the importance of the relative position of signs).
>
> So if you use my way of interpreting Ramesside texts as short vertical strings of text within a main horizontal layout, you would need just 7 (sic!!!), in general very basic, groups to display it correctly. All the other signs could just be input one after the other, without control characters, without ligatures, without anything, within short columns one next to the other.
> And you could just use a basic layout algorithm to make them fit the space in a nice way (you know, like when you expand or squeeze the letters within a line to make your text look better? Exactly the same thing, but vertically).
>
> Instead, with your way of interpreting Ramesside writing, which assumes that every "tall group" is a real "group" that needs to be built and displayed within a single purely horizontal layout, and with your system of control characters, you would need 15 (sic!!!) groups, many of them very rare if not unique, and you would need tens of control characters nested one into the other to correctly build and display each of them.
>
> I really wonder which approach would be more efficient, more economical and more easily implemented.
>
> Image 1:
> [image: Inline image 1]
>
> Image 2:
> [image: Inline image 2]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image_1.png
Type: image/png
Size: 59893 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image_2.png
Type: image/png
Size: 62103 bytes
Desc: not available
URL:

From bobqq at live.co.uk Sun Jul 24 17:38:59 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sun, 24 Jul 2016 17:38:59 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be> <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
Message-ID:

Replying to Nigel

The point you make about distinguishing between transcription from hieratic to hieroglyphic and transcription from (original) hieroglyphic to (digital format) hieroglyphic is important. This is one snag with the MdC tradition, which encourages 'one size fits all' thinking about arrangements of hieroglyphs and groups.
Likewise, vertical and horizontal writing have their own considerations.

Fonts. As you note, a font such as Cleo defines the glyphs only. Nevertheless, in designing her font Cleo gave attention to the relative proportions of signs and their use in combinations, using the tools at her disposal at the time. Her use case was based on the Gardiner Egyptian Grammar model, which attempts to render classic Middle Egyptian style well at 18pt text and acceptably at 12pt. Contrast this with the Hieroglyphica font, which provides more detail but is optimised for larger point sizes.

Traditional MdC applications such as JSesh come with font+application and the two are intended to be used in concert. The Egyptologist has little freedom unless both font and application meet their needs. Portability from one app+font to another app+font is not ideal.

Fonts with shaping in Unicode. Glyphs as before. But the font designer now has the opportunity to have more control over how their font looks and to deal better with proportion and aesthetics. Whether this is a burden will depend on the tools available. This is one personal interest of mine, and font practicalities are part of my thinking behind control characters and their straightforward implementation.

An application such as JSesh can choose to ignore some or all of the features built into the font and do its own thing. It can also add functionality on top of basic text. There is no loss, only the potential of gain.

My personal prototype tools work with MdC (including JSesh extensions) and Unicode plain text, and in my experience over the last 18 months it all works pretty well.

It is desirable that applications such as JSesh add support for Unicode. JSesh 5.5 does not have 'Gardiner codes' for all the Unicode (2009) hieroglyphs, so there is housekeeping needed, but nothing Serge or I regard as problematic.

Incidentally, my own software is agnostic about what is settled on for the initial Unicode plain text controls (I could even add in support for RES if required!)
but it is on hold until the standards situation is clarified.

Big topic.

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Nigel Strudwick
Sent: 24 July 2016 12:56
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] Some general considerations

Greetings

Just wanted to say that Stéphane's last paragraph admirably summarises my perspective.

I haven't been reading these posts closely since I don't have the theoretical background on Unicode to have a view. But some observations and a question.

But I would say that, of the categories Stéphane mentions in his last paragraph, it is probably also worth making a distinction in the text corpora section between hieroglyphic and hieratic texts, simply due to the generally more regular arrangement of the latter, and hieroglyphic's notorious ability to be squeezed, expanded etc. to fit a space for a variety of reasons. So while those doing corpora won't be as fussed about precise arrangements as I can be, there are going to be times when there will be challenges with hieroglyphs.

Something that did come as a shock during the meeting, dawning on me in the course of a slightly tense exchange with Michael, was that Unicode fonts clearly can put a lot of burden on the font designer, especially for glyphs, if they are to include multiple and complex ligatures etc. This is where the current fonts score well (such as the Cleo Fonts) in that they are about the design only, and the arrangement is then carried out in the specialist software (e.g. JSesh) where the ultimate control lies with the Egyptologist and not the font designer. [I know that won't be well received but you know where I come from on all this!]

One question that I should have asked at the meeting is this: not knowing how control characters actually manifest themselves, will a text formatted with Unicode with all the control characters be able to be exported into a plain MdC format, or will one of these input systems be able to import an MdC text and format it correctly? I ask as there will be a lot of MdC plain text around on people's hard discs. Or will we ultimately have to revise the MdC system to handle all these other codes?

Apologies in advance for the evident failures of comprehension in some of the points above. I don't really need comments on the first points I made, but I would be interested in how the MdC issues might be handled.

Best, Nigel

On 24 Jul 2016, at 11:33, Stéphane polis wrote:

> Hi guys!
>
> (no worries, this is my last mail on the topic, enough time and energy spent on this.)
>>> Third
>>>
>> And your solution is a good one, I have absolutely no doubt, as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the "orthographic" system of ancient Egyptian and to which we want to have access (see further Simon's mail earlier this week).
>>
>> In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs.
>
> This is patently inaccurate.
> Three simple examples should suffice here (sorry for providing textbook examples known by all).
>
> 1 - spatial distribution affecting phono-morphology
> [t] followed by [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same "linear order", 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes).
> > 2 - spatial distribution as a condition for reading (and adding > semantic value) makes no sense, when combined > as it is clear that it should be read /ptH/ > "Ptah" (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth). The position of the sign is both a condition for reading and an added semantic information (not only Ptah, but Ptah as demiurge). > > 3 - spatial distribution affecting the function of a sign Hieratic > example? If the rowing man is followed by a n ( > ) the n has to be read /n/ (phonemogram), if the n is positioned under > the rowing man ( ), we are dealing with a > compound classifier made of two signs, the /n/ has an iconic value and the whole group refers to a man rowing in water. > > Etc., etc. > > So position is 'just an esthetic, layout matter, not a linguistic one' as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. It's a pity to make such unwarranted statements in a discussion also aimed at non-specialists to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone. > > Accordingly, if Unicode aims first and foremost at rendering the "linguistic" dimension of writing, the examples above should suffice to show the importance of the 'quadrat' organization of this script. Again, you might not like it from a computer/font oriented perspective; I agree it's not convenient to encode, it might even not be possible to encode it at all in Unicode because the standard does not have the capabilities needed (and only higher level protocols could then handle this). All of this is fine with me, but it would be great not to distort the presentation of the data. > > [I leave here alone other semiotic dimensions of writing: the spatial > arrangement is part of the "orthography" of the scribes, not > necessarily meaningful at the "linguistic"
level strictly speaking, > but at the level of scribal practices, etc.: why don't we use IPA for > our modern languages? simply because writing is much more than > "linguistic" stricto sensu and that it makes sense to know who writes > "next" and who plays with the script and writes "neckst". As simple as > that.] > > >> No more than illuminated initial capital letters in medieval manuscripts. >> Like these: >> >> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg >> >> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to (and in fact they are not) encode or represent them in unicode. > > This comparison is hilarious. > >> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts. > > Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels. >>> 1) JSesh approach >>> >>> >>> >> Sure, everything needs not be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either. >> >> Again, what data? >> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code? > > See above. >>> 2) Groups in fonts >>> >>> >> OK, sure. But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such information, as a lay user, when using ligatures? (this is a real question, nothing ironic here).
>> >> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. the ligature IS the information. >> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively if i write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures. > > All this was clear, sure. What I meant is that the ligatures are purely "graphic", right? No information is stored about the position of one sign with respect to another? > >>> 3) the 4 "small sign in the corner of big sign" control characters. >>> >> I'm glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 "variables" (Top vs Bottom and Left-Right), if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub. >> Please keep in mind that the four operators come from another type of syntax. >> >> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A. instead of top-bottom, left-right + four corners. >> Or perhaps I am not understanding your system. > > OK, then I disagree because of the polysemic value of A/B and B/A. > >>> 4) Vertical and horizontal script and control characters >>> >>> >> >> I do not understand why you do entirely get rid of the notion of "quadrat" in your example.
If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs. >> >> If you want to have the "notion of quadrate" in unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. if i understood correctly, one of the proposals that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it makes sense in unicode (and more important i doubt the people of the unicode consortium will think it makes sense). >> >> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a chinese character. >> >> But I assume (i hope) we don't want to do that, right? > > We use spaces between words, would it be bad to use quadrat separators between quadrats? Mutatis mutandis, it feels to me like saying "oh, no, there are blank spaces all around!". > > More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is building quadrats (implicitly or explicitly). >>> 5) special characters, vertical/horizontal texts and input methods >>> >>> >> >> Your point escapes me, here.
Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same. >> >> >> With quadrats marked by special control characters would be even worse, as you would have to take into consideration even more possible combinations. >> >> >>> 6) Ramesside "groups" (or "tall groups"). >>> >>> >> >> Why are they groups or quadrats and not "small columns"? >> >> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example. >> >> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?). >> >> Not true. >> This can be interpreted just as a question of layout, not as a question of grouping. > > Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you're entitled to have your own kind of Egyptian if you want! > I'm not arguing against non-sense. But that's a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your "small columns" hypothesis. > The Ramesside example that you provide simply does not work this way. > > And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. Comparing apples with pears won't help: and again, that's a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them. (unlike the quadrat structure). > > >> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc.
dealing with the Egyptian language, this would be a requirement. >> >> All the egyptian words quoted in my article published in the last >> issue of JNES were written with my font with just standard ligatures, >> without control characters. No one complained, no one told me >> anything, so I assume that such a system is indeed suitable for >> scientific publications, not only for writing tourists' names.. ;-) > > I'm afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as any evidence for disproving my point: you're happy with the way your hieroglyphs are rendered? Perfect. But why should it be the case that people willing to render a simple hieroglyphic line as the one below would face difficulties? This escapes me. > > > (KRI I, 4) >>> 7) What are the "square groups"? >>> >>> >> >> I provided the definition agreed on – I think – by everyone, if you have a better one, I'm listening of course. >> >> see above. >> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database, and in this respect, to respond to your last remark..). >> >> I'm sorry to put it so bluntly, but if Unicode is not to be useful for the majority of egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they're doing well with it. >> >> Perhaps the opposite should also be considered?
perhaps the unicode-based system to write hieroglyphs (considering that the very idea at the basis of unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard unicode system, then perhaps it is them who should use other tools? >> >> I guess this is an option that should be considered.. no? > > We are talking about feelings and different perceptions here, so it's hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc) will never be happy with what Unicode has to offer, because this is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the more open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating Online resources, etc. I feel like we are very much in favor of standardization and striving for making resources accessible; two goals of Unicode, as you mentioned. > > These resources should be there for lasting, it would be a pity not to think about the standard carefully before going in any direction. > > That's all folks!
> Have a nice weekend, > > Stéphane _______________________________________________ Egyptian mailing list Egyptian at evertype.com http://evertype.com/mailman/listinfo/egyptian_evertype.com From bobqq at live.co.uk Sun Jul 24 19:41:23 2016 From: bobqq at live.co.uk (Bob Richmond) Date: Sun, 24 Jul 2016 19:41:23 +0100 Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23) Message-ID: Hi Mark-Jan We've been talking about plain text on and off for over 18 months so I was interested to read your L2/16-177 "new system" earlier this month and your latest update yesterday (https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf). Good to see your number of control characters now reduced, more consideration has been given to OpenType and some discussion in Cambridge has been factored in. I understand what you are trying to do and there are points I'd like to discuss when we have time. There are items you note that indeed need to be progressed and agreed (EMPTY, STACK CARTOUCHE ...) However I am disappointed you have not taken on board the fundamental failing with the scheme you've been developing that makes it unsuitable for Unicode plain text consideration, however useful it may be for other purposes. I thought this was clear at Cambridge but apparently not. Unicode plain text hieroglyphic is exposed to a huge audience and the very first consideration is don't make simple things complicated. Use of BEGIN/END for every group breaks that basic rule. Your scheme is unnecessarily complicated so fails at the outset. It's a continuing distraction for the less technically minded in this group to continue to hold it up as a viable alternative to the current UTC recommendation. It is not. MdC X1:R1 in your scheme uses 3 control characters.
Quite honestly I find it hard to understand why anyone thinks this is possibly a good idea. If anyone reading this does support 3 over 1 please let me know your reasoning; I'm actually quite curious why I've had to spend time on this. What you've tried to do is build a theoretical model which describes cluster layout given a set of constraints and that's all good as an academic exercise. I'd be interested to see it tested against texts and data. All features of it could be added in some way to the three control system as HLP or extra controls. So, there's no need to throw your work away. Some parts apply to plain text implementation e.g. your description on making an MdC-like font. Plenty more you could re-purpose should you wish to continue to be involved in Unicode developments. What I suggest you do is consider how you might use elements of your scheme in a simple higher level protocol on top of plain text as it is at present (I'll be circulating my considered view on adding to the 3 control set on Tuesday). Brackets in an HLP could be ok for rare cases. May be useful for TLA and Ramses. Meanwhile if you have any ideas on how you would like to see e.g. 4 corners added to the existing proposal I'd be pleased to hear from you or anyone else on the topic. Regards, Bob From s.polis at ulg.ac.be Mon Jul 25 11:11:43 2016 From: s.polis at ulg.ac.be (=?utf-8?Q?St=C3=A9phane_polis?=) Date: Mon, 25 Jul 2016 12:11:43 +0200 Subject: [Egyptian] Some general considerations In-Reply-To: References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be> Message-ID: Yes, Marwan, I agree that we should indeed stop here. The vertical text that you produce is nothing even close to what could have been an ancient Egyptian vertical text, even less a "nice example" (or was it ironic?). If you want to produce such texts for your own purposes, that's fine, but for rendering ancient texts that is not an option.
(and as I said during the meeting, converting a hieroglyphic text from vertical to horizontal is not an easy business, it implies interpretation and restructuring). As such, counting the number of groups makes little sense (minimally, p-w should be p*w; i-n should be i*n, iyA would imply 2 groups + A1, etc., etc.; etc.) and won't lead to any firm conclusions in terms of what are the best options for Unicode. So I'd rather stop here and wish you all the best for your future work on ancient Egyptian. Stéphane > Le 24 juil. 2016 à 18:15, Marwan Kilani a écrit : > > Just one comment: > > And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. > > Funny to read such a comment while at that meeting in Cambridge a Japanese linguist and Egyptologist explained, by referring to scientific publications, that Japanese (and chinese and korean) are indeed based on quadratic structures that are indeed very close in many respect to Egyptian. > > There were many things that could be said as a reply to your email, but let's be honest: with these premises, it won't make any sense. > > p.s.: > No, actually, i am going to add something: > Look what happens (image 1 below) if you cut your ramesside line in small columns (which is just a way of analyzing the text, as it is yours taking them as groups, but as we are not egyptians it is not "truth", it is not and doesn't want to be the "true nature of the text"), I was saying: Look what happens (image 1 below) if you cut your ramesside line in small columns and you put them one under the other.. > > Such a nice example of Egyptian vertical text, isn't it? :-) > > And now, count how many groups you would need to compose such a text with a vertical font within a vertical layout. > You can see the result in the second image: the groups you will need are marked in green: 7 groups.
That's all. > Note that you will not need to combine the D snake and the ns tongue as a group with the following signs, because with a vertical font you could put the baseline of the sign under its "horizontal" bit, and you could consider the tail as hanging below the baseline, as it is for the latin letters "q", "g" "p" and so on. And consider that the "30" is already a single character in unicode (as the t&w, btw.. so i would suggest you to find a more relevant example to argue for the importance of the relative position of signs..) > > So if you use my way of interpreting ramesside texts as short vertical strings of texts within an main horizontal layout, you would just need 7 (sic!!!), in general very basic, groups to display it correctly. All the other signs could just be inputted one after the other, without control characters, without ligatures, without anything, within short columns one next to the other. And you could just use basic layout algorithm to make them with the space in a nice way (you know, like when you expand or to squeeze the letters within a line to make you text look better? exactly the same thing, but vertically). > > Instead, with your way of interpreting ramesside writing, which assumes that every "tall group" is a real "group" that need to be built and displayed within a single purely horizontal layout, and with your system of control characters, you would need 15 (sic!!!) groups, many of them very rare in not even unique, and you would need tens of control characters nested one into the other to correctly build and correctly display each of them. > > I really wonder which approach would be more efficient, more economic and more easily implemented.. > > Image 1: > > > > > > Image 2: > > > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com -------------- next part -------------- An HTML attachment was scrubbed... 
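For readers less familiar with the MdC notation used in the messages above (/hthr-H*(n:t)-tA:tA/, p-w versus p*w), here is a minimal sketch of how such a string breaks down into quadrats. It assumes only the operators that appear in this thread: `-` between quadrats, `:` for vertical stacking, `*` for horizontal juxtaposition, and parentheses for sub-groups, with `*` binding tighter than `:`. The function names are hypothetical, real MdC has many more features (ligatures with `&`, cartouches, shading), and this is not a description of how JSesh or any other tool parses it.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_group(text):
    """Parse one quadrat into a nested (layout, children) tree.

    ':' (vertical) is split first, so '*' (horizontal) binds tighter.
    A bare sign code is returned as a plain string.
    """
    rows = split_top_level(text, ":")
    if len(rows) > 1:
        return ("vertical", [parse_group(r) for r in rows])
    cols = split_top_level(text, "*")
    if len(cols) > 1:
        return ("horizontal", [parse_group(c) for c in cols])
    if text.startswith("(") and text.endswith(")"):
        return parse_group(text[1:-1])  # strip one level of brackets
    return text


def parse_mdc(line):
    """Split a line into quadrats on '-' and parse each one."""
    return [parse_group(q) for q in split_top_level(line, "-")]


def count_signs(node):
    """Count the individual signs inside a parsed quadrat."""
    if isinstance(node, str):
        return 1
    return sum(count_signs(child) for child in node[1])


quadrats = parse_mdc("hthr-H*(n:t)-tA:tA")
print(len(quadrats), "quadrats,", sum(count_signs(q) for q in quadrats), "signs")
print(len(parse_mdc("p-w")), "quadrats for p-w;", len(parse_mdc("p*w")), "for p*w")
```

Under this reading p-w is two quadrats and p*w is one, which is the distinction at stake when the number of groups in a line is being counted.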
From s.polis at ulg.ac.be Mon Jul 25 11:25:19 2016 From: s.polis at ulg.ac.be (=?utf-8?Q?St=C3=A9phane_polis?=) Date: Mon, 25 Jul 2016 12:25:19 +0200 Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23) In-Reply-To: References: Message-ID: Hi all, If i may jump in one last time. As I said several times, I leave entirely up to the specialists all the issues concerning the syntax of the operators. What matters for me is what you can effectively achieve, and Mark-Jan's proposal covers precisely what we minimally would like to have (which is why I support it wholeheartedly). Now, if it really ends up being too much for Unicode and there is no way to make this happen there, but you are confident that it can be handled at the level of HLPs, then can I ask a very naive question: what need is there for any control character in Unicode? After all, Marwan's font with the ligatures seems to work quite well for basic purposes, so it can be considered as a good solution for some users esp. when combined with So's input system. For other uses, we will need several types of grouping, groups inserted in groups, etc.: why would some bits end up in Unicode, while others would be up to HLP? Shouldn't we try to have a coherent scheme and not something made of bits and pieces? A real (even if maybe naive) concern. Take care, Stéphane > Le 24 juil. 2016 à 20:41, Bob Richmond a écrit : > > Hi Mark-Jan > > We've been talking about plain text on and off for over 18 months so I was interested to read your L2/16-177 "new system" earlier this month and your latest update yesterday (https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf ). Good to see your number of control characters now reduced, more consideration has been given to OpenType and some discussion in Cambridge has been factored in. I understand what you are trying to do and there are points I'd like to discuss when we have time.
There are items you note that indeed need to be progressed and agreed (EMPTY, STACK CARTOUCHE ...) > > However I am disappointed you have not taken on board the fundamental failing with the scheme you've been developing that makes it unsuitable for Unicode plain text consideration, however useful it may be for other purposes. I thought this was clear at Cambridge but apparently not. > > Unicode plain text hieroglyphic is exposed to a huge audience and the very first consideration is don't make simple things complicated. Use of BEGIN/END for every group breaks that basic rule. Your scheme is unnecessarily complicated so fails at the outset. It's a continuing distraction for the less technically minded in this group to continue to hold it up as a viable alternative to the current UTC recommendation. It is not. > > MdC X1:R1 in your scheme uses 3 control characters. Quite honestly I find it hard to understand why anyone thinks this is possibly a good idea. If anyone reading this does support 3 over 1 please let me know your reasoning; I'm actually quite curious why I've had to spend time on this. > > What you've tried to do is build a theoretical model which describes cluster layout given a set of constraints and that's all good as an academic exercise. I'd be interested to see it tested against texts and data. All features of it could be added in some way to the three control system as HLP or extra controls. > > So, there's no need to throw your work away. Some parts apply to plain text implementation e.g. your description on making an MdC-like font. Plenty more you could re-purpose should you wish to continue to be involved in Unicode developments. > > What I suggest you do is consider how you might use elements of your scheme in a simple higher level protocol on top of plain text as it is at present (I'll be circulating my considered view on adding to the 3 control set on Tuesday). Brackets in an HLP could be ok for rare cases. May be useful for TLA and Ramses.
> > Meanwhile if you have any ideas on how you would like to see e.g. 4 corners added to the existing proposal I'd be pleased to hear from you or anyone else on the topic. > > Regards, > Bob From odusseus at gmail.com Mon Jul 25 11:49:55 2016 From: odusseus at gmail.com (Marwan Kilani) Date: Mon, 25 Jul 2016 12:49:55 +0200 Subject: [Egyptian] Some general considerations In-Reply-To: References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be> Message-ID: On Mon, Jul 25, 2016 at 12:11 PM, Stéphane polis wrote: > Yes, Marwan, I agree that we should indeed stop here. > > The vertical text that you produce is nothing even close to what could > have been an ancient Egyptian vertical text, even less a "nice example" (or > was it ironic?). > Stéphane, I am sorry to disappoint you, but you are not an ancient Egyptian scribe and I am sorry to disappoint you, but your perception of egyptian writing is not the "truth". Your aim is not to *produce* an ancient egyptian text. The aim is, or should be, to analyze what ancient Egyptians wrote in a way to transcribe what ancient Egyptians wrote in a precise and efficient way. Not to determine what according to some abstract concept is the "true nature" of a text and trying to encode this "true nature". And the aim of that "vertical rendition" was not to produce a "real" ancient egyptian vertical text. it was to make explicit and explain one specific step (and its advantages) of the inputting procedure and approach that i am suggesting. And my approach allows to transcribe it as precisely as yours. But more efficiently. Perhaps you should try to understand how it works (because clearly you don't) before stating what is possible and what is not.
And perhaps you should try to understand a bit better how fonts, ligatures, layouts, and unicode in general work because you seem to believe that it is possible/practical to do things that are actually impossible/impractical (stacking in hieroglyphs, using tens of control characters to display 3 hieroglyphs), and you seem to believe that it is impossible/impractical to do things that are instead very possible/practical (as coding spatial information in ligatures - you can give names to ligatures, so you can call your p*w-ligature "p*w" and you have your spatial information that you can retrieve from your font whenever you want) Let alone your comments about non-western non-latin scripts or your tw/wt example (which is a non-existing problem that has *already* been solved in the *current* unicode set, where t&w is *already* coded as an independent character. So if you want to display tw = /tw/ you just input t + w, if you want to display wt = /wt/ you just input w + t, and if you want to display t&w = /wt/-/tw/ leaving the ambiguity of the pronunciation you just input the unicode character tw (U+13172) which was probably encoded *exactly* for this purpose - pretty smart idea btw? - you should find some other common examples that cannot be solved in this very easy and practical way to explain why coding position is important, because arguing for a proposal on the basis of a problem that has *already* been solved in the current unicode set, well.. and btw, if you knew how ligatures work, you would see that in the case of the wt/tw problem it is actually possible to code *more* information, about the spatial organization *AND* about the reading order of the signs, by using ligatures, than by using control characters.. but I assume you are not really interested in that right?) So well.. All the best with your work with ancient Egyptian as well Marwan > Le 24 juil. 2016 à
18:15, Marwan Kilani a ?crit : > > Just one comment: > > And all the parallels from other scripts are pointless (and admittedly >> funny in the framework of this discussion), since none of these scripts are >> based on a quadratic structure that is close in any respect to Egyptian. >> > > Funny to read such a comment while at that meeting in Cambridge a Japanese > linguist and Egyptologist explained, by referring to scientific > publications, that Japanese (and chinese and korean) are indeed based on > quadratic structures that are indeed very close in many respect to Egyptian. > > There were many things that could be said as a reply to your email, but > let be honest: with these premises, it won't make any sense. > > p.s.: > No, actually, i am going to add something: > Look what happens (image 1 below) if you cut your ramesside line in small > columns (which is just a way of analyzing the text, as it is yours taking > them as groups, but as we are not egyptians it is not "truth", it is not > and don't want to be the "true nature of the text"), I was saying: Look > what happens (image 1 below) if you cut your ramesside line in small > columns and you put them one under the other.. > > Such a nice example of Egyptian vertical text, isn't it? :-) > > And now, count how many groups you would need to compose such a text with > a vertical font within a vertical layout. > You can see the result in the second image: the groups you will need are > marked in green: 7 groups. That's all. > Note that you will not need to combine the D snake and the ns tongue as a > group with the following signs, because with a vertical font you could put > the baseline of the sign under its "horizontal" bit, and you could consider > the tail as hanging below the baseline, as it is for the latin letters "q", > "g" "p" and so on. And consider that the "30" is already a single character > in unicode (as the t&w, btw.. 
so i would suggest you to find a more > relevant example to argue for the importance of the relative position of > signs..) > > So if you use my way of interpreting ramesside texts as short vertical > strings of texts within an main horizontal layout, you would just need 7 > (sic!!!), in general very basic, groups to display it correctly. All the > other signs could just be inputted one after the other, without control > characters, without ligatures, without anything, within short columns one > next to the other. And you could just use basic layout algorithm to make > them with the space in a nice way (you know, like when you expand or to > squeeze the letters within a line to make you text look better? exactly the > same thing, but vertically). > > Instead, with your way of interpreting ramesside writing, which assumes > that every "tall group" is a real "group" that need to be built and > displayed within a single purely horizontal layout, and with your system of > control characters, you would need 15 (sic!!!) groups, many of them very > rare in not even unique, and you would need tens of control characters > nested one into the other to correctly build and correctly display each of > them. > > I really wonder which approach would be more efficient, more economic and > more easily implemented.. > > Image 1: > > > > > > Image 2: > > > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com > > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
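The tw/wt point in the message above comes down to three different code point sequences. A small sketch, assuming X1 (U+133CF) for t and G43 (U+13171) for w, which is my reading of the signs meant; U+13172 is the code point cited in the message itself. The helper name is made up for illustration.

```python
# Code points: T and W are assumptions about which signs are meant;
# TW is the value quoted in the message (u13172).
T = "\U000133CF"   # assumed: Gardiner X1, the bread loaf "t"
W = "\U00013171"   # assumed: Gardiner G43, the quail chick "w"
TW = "\U00013172"  # single character for the combined t&w sign


def code_points(text):
    """Return the U+XXXXX notation for each character in text."""
    return [f"U+{ord(ch):05X}" for ch in text]


OPTIONS = {
    "t then w, read /tw/": T + W,
    "w then t, read /wt/": W + T,
    "t&w as one character, reading left open": TW,
}

for label, text in OPTIONS.items():
    print(f"{label}: {code_points(text)}")

# A plain substring search for the two-sign sequence does not find the
# single combined character, and vice versa.
print(TW in (T + W), (T + W) in TW)
```

The last line is the practical consequence for searching: a corpus that mixes the one-character and two-character spellings needs some extra mapping layer if a query for one is meant to find the other.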

From ncs3 at cam.ac.uk Mon Jul 25 11:54:44 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Mon, 25 Jul 2016 11:54:44 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID:

Can we please stop this sort of pointless arguing and get back to the basic issue of getting the encoding members back onto sorting out this Unicode control character business?

I'm trying to keep out of it, but I think there are times when everyone needs to be called to order and reminded of the main thing we need to achieve first after Cambridge.

Nigel

From odusseus at gmail.com Mon Jul 25 11:58:03 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Mon, 25 Jul 2016 12:58:03 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID:

P.S. Stéphane, please obviously don't take my tone, which I know could have sounded a bit harsh, on a personal level. I am not taking any of the remarks personally, and feel free to criticize my arguments as much as you wish. Please do the same. It is nothing personal; I am talking about work, and this has nothing to do with me or you as a person, obviously. I hope that is clear to everyone.

Marwan

From s.polis at ulg.ac.be Mon Jul 25 11:58:52 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Mon, 25 Jul 2016 12:58:52 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
Message-ID: <7366EDBB-A193-4482-BCBC-47D2D40AB972@ulg.ac.be>

> On 25 Jul 2016, at 12:49, Marwan Kilani wrote:
>
> On Mon, Jul 25, 2016 at 12:11 PM, Stéphane polis wrote:
> Yes, Marwan, I agree that we should indeed stop here.
>
> The vertical text that you produce is nothing even close to what could have been an ancient Egyptian vertical text, even less a "nice example" (or was it ironic?).
>
> Stéphane, I am sorry to disappoint you, but you are not an ancient Egyptian scribe and I am sorry to disappoint you, but your perception of Egyptian writing is not the "truth".
>
> Your aim is not to *produce* an ancient Egyptian text.
>
> The aim is, or should be, to analyze what ancient Egyptians wrote in order to transcribe it in a precise and efficient way. Not to determine what, according to some abstract concept, is the "true nature" of a text and to try to encode this "true nature".

I never spoke about any "truth" at any point; I'm talking about what is (likely to be) attested in Egyptian texts and what is not.

> And the aim of that "vertical rendition" was not to produce a "real" ancient Egyptian vertical text. It was to make explicit and explain one specific step (and its advantages) of the inputting procedure and approach that I am suggesting.
>
> And my approach allows one to transcribe it as precisely as yours. But more efficiently.
> Perhaps you should try to understand how it works (because clearly you don't) before stating what is possible and what is not.
> And perhaps you should try to understand a bit better how fonts, ligatures, layouts, and Unicode in general work, because you seem to believe that it is possible/practical to do things that are actually impossible/impractical (stacking in hieroglyphs, using tens of control characters to display 3 hieroglyphs), and you seem to believe that it is impossible/impractical to do things that are instead very possible/practical (such as coding spatial information in ligatures - you can give names to ligatures, so you can call your p*w-ligature "p*w" and you have your spatial information, which you can retrieve from your font whenever you want)
>
> Let alone your comments about non-western non-Latin scripts or your tw/wt example (which is a non-existent problem that has *already* been solved in the *current* Unicode set, where t&w is *already* coded as an independent character. So if you want to display tw = /tw/ you just input t + w, if you want to display wt = /wt/ you just input w + t, and if you want to display t&w = /wt/-/tw/, leaving the ambiguity of the pronunciation, you just input the Unicode character tw (u13172), which was probably encoded *exactly* for this purpose - pretty smart idea btw? - you should find some other common examples that cannot be solved in this very easy and practical way to explain why coding position is important, because arguing for a proposal on the basis of a problem that has *already* been solved in the current Unicode set, well.. and btw, if you knew how ligatures work, you would see that in the case of the wt/tw problem it is actually possible to code *more* information, about the spatial organization *AND* about the reading order of the signs, by using ligatures than by using control characters.. but I assume you are not really interested in that, right?)

Sorry Marwan, but we live on different planets: your observation was general (along the lines of "no linguistic meaning for the organization of the hieroglyphs"), and I provided general examples showing that this is not the case. That this is handled in one way or another and finds practical solutions in Unicode has nothing to do with the general principles.

I stop here with pointless argumentation.

Cheers,
St.

> So well..
>
> All the best with your work with ancient Egyptian as well
>
> Marwan

From odusseus at gmail.com Mon Jul 25 12:15:22 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Mon, 25 Jul 2016 13:15:22 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To: <7366EDBB-A193-4482-BCBC-47D2D40AB972@ulg.ac.be>
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be> <7366EDBB-A193-4482-BCBC-47D2D40AB972@ulg.ac.be>
Message-ID:

> Sorry Marwan, but we live on different planets: your observation was general (along the lines of "no linguistic meaning for the organization of the hieroglyphs"), and I provided general examples showing that this is not the case...

... according to your interpretation of those cases ;-)
Because other interpretations could lead to different conclusions, as for the t&w wt/tw case: it is enough to interpret the t&w as a single glyph, as was done in the original Unicode set, and your problem/case disappears.
And just to clarify, I am not arguing that t&w is truly, in its "essence", a single glyph; I am not talking about what it *is*, I am talking about how we can interpret it in a practical way...
we can agree that this was probably not how ancient Egyptians "truly" saw those two signs, but it is a very *practical* and *efficient* interpretation - and the same holds true for the other cases you suggested, in different ways: the rower with water can be interpreted as a variant of the standard rower, as semantically the water does not seem to me to add anything (as far as I know it is implicit for rowers to row in water), and in the case of Ptah it is the choice of the signs with which the name is written (earth-god-sky) that allows us to say that we are dealing with the demiurge and not with an ordinary Ptah, and this holds true whether you write this sequence horizontally or vertically or in a circle (actually thank Gods Egyptians did not use "circular groups").

But yes, let's follow Nigel's suggestion and let's stop here

Marwan

From ncs3 at cam.ac.uk Mon Jul 25 12:17:22 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Mon, 25 Jul 2016 12:17:22 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be> <7366EDBB-A193-4482-BCBC-47D2D40AB972@ulg.ac.be>
Message-ID: <3A122131-0AD2-4A6F-ABF2-731D1C4D7DF9@cam.ac.uk>

Thank you. I was about to shout STOP!

From bobqq at live.co.uk Mon Jul 25 12:43:08 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Mon, 25 Jul 2016 12:43:08 +0100
Subject: [Egyptian] UTC meeting, August
Message-ID:

Hi All

Mark-Jan, Stéphane and I have been invited to participate remotely in a UTC discussion next week, on Thursday (4th August). The deadline for written documents is tomorrow (Tuesday 26th) and PDFs will then be made available on the Unicode website.

So if there's anything specific you'd like any of us to take into account, please raise it on the list or communicate privately.

Bob
URL: From ncs3 at cam.ac.uk Mon Jul 25 12:49:56 2016 From: ncs3 at cam.ac.uk (Nigel Strudwick) Date: Mon, 25 Jul 2016 12:49:56 +0100 Subject: [Egyptian] UTC meeting, August In-Reply-To: References: Message-ID: I take it the intention is that this will focus everyone on sorting something out? Or am I being naive? :D Best, Nigel On 25 Jul 2016, at 12:43, Bob Richmond wrote: > Hi All > > Mark Jan, St?phane and myself have been invited to participate remotely in a UTC discussion next week Thursday (4th August). Deadline for written documents is tomorrow (Tuesday 26th) and PDFs will then be made available on the Unicode website. > > So if there?s anything specifically you?d like any of us to take into account please raise on list or communicate privately. > > Bob > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com From mn31 at st-andrews.ac.uk Mon Jul 25 12:53:27 2016 From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof) Date: Mon, 25 Jul 2016 12:53:27 +0100 Subject: [Egyptian] submitting proposal In-Reply-To: <3A122131-0AD2-4A6F-ABF2-731D1C4D7DF9@cam.ac.uk> References: <3A122131-0AD2-4A6F-ABF2-731D1C4D7DF9@cam.ac.uk> Message-ID: <1710622.6bilBc1MUz@bear> Dear All, Let me first thank our kind hosts for bringing so many experts together in Cambridge, both from the UTC and Egyptologists working on corpus linguistics. As far as I know, this was a unique event, creating an extraordinary opportunity to make progress. I learned a lot from discussions with UTC members and with Egyptologists. As for the prospect of a consensus proposal, there is no consensus. The recent emails made this very clear. Time is running out. If no new proposal is submitted now, we have to wait many more months for the next opportunity, and there is no guarantee (and personally I think no hope at all) there would be anything resembling a full or partial consensus then. 
We (TLA/Ramses/St Andrews) are therefore submitting a revised proposal to the UTC, which is grosso modo the version that I distributed during the weekend. For me, this ends the discussion for the time being. Best regards, Mark-Jan From ncs3 at cam.ac.uk Mon Jul 25 13:26:33 2016 From: ncs3 at cam.ac.uk (Nigel Strudwick) Date: Mon, 25 Jul 2016 13:26:33 +0100 Subject: [Egyptian] submitting proposal In-Reply-To: <1710622.6bilBc1MUz@bear> References: <3A122131-0AD2-4A6F-ABF2-731D1C4D7DF9@cam.ac.uk> <1710622.6bilBc1MUz@bear> Message-ID: <6B102BB2-2B7E-4FAF-8F2A-03A53D1F70F5@cam.ac.uk> Right. I obviously do not know what the future holds if we do not clarify things now, but I would innately share the gut feeling that we are otherwise headed for several more years without agreement. This is precisely why I convened the meeting in Cambridge. If ?there is no consensus?, then perhaps everyone needs to look deeply at what they are doing and think, just think, about a compromise. Please everyone, in the interests of co-operation, try and pull together, accept some compromise, and perhaps let the great and good of the UTC know where the conflicting issues are and help everyone to see sense. This is as bad as classic British industrial relations. where no-one wants to give in. If we achieve nothing, then the whole Unicode business becomes nothing more or less than an intellectual exercise, proving that we can do something for no end. Let us try and make something happen. ?Yes we can?, to quote a well-known public figure. Nigel On 25 Jul 2016, at 12:53, Mark-Jan Nederhof wrote: > Dear All, > > Let me first thank our kind hosts for bringing so many experts together in Cambridge, > both from the UTC and Egyptologists working on corpus linguistics. As far as I know, > this was a unique event, creating an extraordinary opportunity to make progress. > I learned a lot from discussions with UTC members and with Egyptologists. 
>
> As for the prospect of a consensus proposal, there is no consensus. The recent emails made this very clear.
>
> Time is running out. If no new proposal is submitted now, we have to wait many more months for the next opportunity, and there is no guarantee (and personally I think no hope at all) there would be anything resembling a full or partial consensus then.
>
> We (TLA/Ramses/St Andrews) are therefore submitting a revised proposal to the UTC, which is grosso modo the version that I distributed during the weekend.
>
> For me, this ends the discussion for the time being.
>
> Best regards,
> Mark-Jan

From bobqq at live.co.uk Mon Jul 25 17:16:34 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Mon, 25 Jul 2016 17:16:34 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
Message-ID:

Hi Stéphane

Congratulations for being the first to (heartfully) express public support for a representation of MdC X1:R1 that uses 3 control characters.

Certainly Unicode could handle it but I'm a little sceptical most Egyptologists could. I hope we'll have this topic cleared up in the next few days otherwise I think we are going to have to draw more Egyptologists into the discussion.

Incidentally have you ever attempted to edit text written in a Complex Script using a word processor? If you haven't I suggest you do then perhaps think again about what you are putting your name to.

I mentioned the background on the need for control characters in my reply to Marwan's post at the weekend.

Reading what you've said here I suspect the underlying issue is you feel that if we don't do everything now all is lost. Relax. This is not the case with Unicode.
The approach is to start off with something we know works, perhaps with some minor limitations, and build from that on the basis of experience and evidence. That's what we did with the repertoire. I've listened carefully to feedback on the three controls and will propose some adjustments and additions which I hope go a long way to addressing points raised. For the occasional clusters or writings we don't address yet you can use an HLP, then when good evidence exists on exactly what is needed we can extend.

You say you leave it to the specialists but then seem to want to ignore what we recommend! I can't think of anyone I've met with solid software experience who would see the syntax of "A System of ..." as a practical step forward - if you find one put them in touch with me, I enjoy new experiences!

Sorry Stéphane you get the brunt, price you pay for being the first :).

Regards,
Bob

From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Stéphane polis
Sent: 25 July 2016 11:25
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)

Hi all,

If I may jump in one last time.

As I said several times, I leave entirely up to the specialists all the issues concerning the syntax of the operators.

What matters for me is what you can effectively achieve, and Mark-Jan's proposal covers precisely what we minimally would like to have (which is why I support it heartfully).
Now, if it really ends up to be too much for Unicode and that there is no way to make this happen there, but that you are confident that it can be handled at the level of HLPs, then can I ask a very naive question: what need is there for any control character in Unicode?

After all, Marwan's font with the ligatures seems to work quite well for basic purposes, so it can be considered as a good solution for some users esp. when combined with So's input system.
For other uses, we will need several types of grouping, groups inserted in groups, etc.: why would some bits end up in Unicode, while others would be up to HLP? Shouldn't we try to have a coherent scheme and not something made of bits and pieces?

A real (even if maybe naive) concern.
Take care,

Stéphane

On 24 Jul 2016, at 20:41, Bob Richmond wrote:

Hi Mark-Jan

We've been talking about plain text on and off for over 18 months so I was interested to read your L2/16-177 "new system" earlier this month and your latest update yesterday (https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf). Good to see your number of control characters now reduced, more consideration has been given to OpenType and some discussion in Cambridge has been factored in. I understand what you are trying to do and there are points I'd like to discuss when we have time. There are items you note that indeed need to be progressed and agreed (EMPTY, STACK CARTOUCHE ...)

However I am disappointed you have not taken on board the fundamental failing with the scheme you've been developing that makes it unsuitable for Unicode plain text consideration, however useful it may be for other purposes. I thought this was clear at Cambridge but apparently not.

Unicode plain text hieroglyphic is exposed to a huge audience and the very first consideration is don't make simple things complicated. Use of BEGIN/END for every group breaks that basic rule. Your scheme is unnecessarily complicated so fails at the outset. It's a continuing distraction for the less technically minded in this group to continue to hold it up as a viable alternative to the current UTC recommendation. It is not.

MdC X1:R1 in your scheme uses 3 control characters. Quite honestly I find it hard to understand why anyone thinks this is possibly a good idea. If anyone reading this does support 3 over 1 please let me know your reasoning; I'm actually quite curious why I've had to spend time on this.
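The "3 over 1" count can be made concrete with a small sketch. The token names below (VJ, BEGIN, END) are hypothetical stand-ins and not code points from either proposal; the sketch only assumes, as described above, that one scheme joins X1 and R1 with a single vertical joiner while the other additionally brackets every group.

```python
# Hypothetical control tokens; neither proposal's actual code points are used here.
CONTROLS = {"VJ", "BEGIN", "END"}

def count_controls(sequence):
    """Count how many tokens in an encoded sequence are control characters."""
    return sum(1 for token in sequence if token in CONTROLS)

# MdC X1:R1 (X1 stacked above R1), written out under each assumed scheme.
joiner_only = ["X1", "VJ", "R1"]
bracketed = ["BEGIN", "X1", "VJ", "R1", "END"]

print(count_controls(joiner_only))  # 1
print(count_controls(bracketed))    # 3
```

The same counting function could be run over longer sequences to compare the overhead of the two styles on real groups, once the actual control sets are settled.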
What you've tried to do is build a theoretical model which describes cluster layout given a set of constraints and that's all good as an academic exercise. I'd be interested to see it tested against texts and data. All features of could be added in some way to the three control system as HLP or extra controls.

So, there's no need to throw your work away. Some parts apply to plain text implementation e.g. your description on making an MdC-like font. Plenty more you could re-purpose should you wish to continue to be involved in Unicode developments.

What I suggest you do is consider how you might use elements of your scheme in a simple higher level protocol on top of plain text as it is at present (I'll be circulating my considered view on adding to the 3 control set on Tuesday). Brackets in an HLP could be ok for rare cases. May be useful for TLA and Ramses.

Meanwhile if you have any ideas on how you would like to see e.g. 4 corners added to the existing proposal I'd be pleased to hear from you or anyone else on the topic.

Regards,
Bob

From s.polis at ulg.ac.be Tue Jul 26 10:51:51 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Tue, 26 Jul 2016 11:51:51 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References:
Message-ID: <2E8CBE1F-4748-4EDA-BEBD-050F343F300D@ulg.ac.be>

> Sorry Stéphane you get the brunt, price you pay for being the first :).

Hi Bob,

There is nothing to be sorry about: a discussion is a discussion and you can have your opinion, based on your experience and I respect it (same with Marwan: as he said, there is nothing personal here, and I definitely agree).
Now, if one wants to reach a consensus at some point, as Nigel was pleading for, it would be great not to distort what I said in order to perpetuate this "unbalanced relationship" that leads to nothing (if I am to trust what you say as IT and font specialist, why would you systematically discard the evidence provided and the argument developed as regards the basic capabilities that would be important for us, i.e. the Egyptologists working on electronic corpora and developing the textual resources for the field, to have? Respecting you as a specialist of your field, when the reverse does not seem to hold, is not always an easy task I confess and requires some sangfroid.)

> Congratulations for being the first to (heartfully) express public support for a representation of MdC X1:R1 that uses 3 control characters.
>
> Certainly Unicode could handle it but I'm a little sceptical most Egyptologists could. I hope we'll have this topic cleared up in the next few days otherwise I think we are going to have to draw more Egyptologists into the discussion.

Why do you feel the need to distort my words (if not to antagonize the positions): I said that the syntax (number of control characters, etc.) had no importance for me whatsoever, leaving this up to specialists; what matters are the principles, and I quote "what you can effectively achieve" with these control characters. At the moment, these basic needs (and significant compromises and simplifications have been made, don't you agree?) are only covered by Mark-Jan's proposal. That's it, nothing less, nothing more. If you have a better "technical" solution for these capabilities, please go for it and submit it to the group; the way it's done doesn't matter to me at all!

> Incidentally have you ever attempted to edit text written in a Complex Script using a word processor? If you haven't I suggest you do then perhaps think again about what you are putting your name to.
No (depending on your definition of complex scripts obviously), and that's why I trust people like you, So, Marwan and others to find an acceptable solution for the edition; I fully accept that you're the specialists and that it is not a simple issue. But we're in the 21st century and you're imaginative IT guys: I'm sure that the basic arrangement of signs that Egyptians could handle is not beyond your reach, even with Unicode! (Note, as I said several times: ancient Egyptian is not English, 5.000.000 words max., including Demotic, and very soon most of the texts will be available freely in MdC, so the encoding speed/efficiency that you repeatedly mention is almost of no concern for long texts that could be converted; and for short quotations, words, etc., I'm pretty confident that between 30 sec. a sentence with limited capabilities and 2 min. a sentence for a more adequate encoding, most Egyptologists would choose the second [again, I'm not representative of anyone but myself, that's a general and subjective feeling; you can ask for a consultation using the IAE mailing list I suppose, but be ready for some actual craziness ;)].)

Now, let me return the argument: have you spent the last 15 years reading, teaching, publishing, encoding and annotating all sorts of hieratic and hieroglyphic texts from OK inscriptions down to Late Period rituals? If you haven't I suggest you do and perhaps think again about what kind of encoding scheme you're developing. That's completely caricatural and ridiculous, isn't it? You certainly know what John Locke said about this kind of argument from authority, and I would never use them as you do in any sort of debate. Up until now, egyptological evidence was systematically provided by us for any kind of principle advocated for. There might be other ways to deal with these cases than with control characters in Unicode (ligatures in fonts, HLP, compound characters in Unicode, name it), fine, but please no "further evidence is needed"
for the basic capabilities we're talking about.

> Reading what you've said here I suspect the underlying issue is you feel that if we don't do everything now all is lost.

No, my sole concern is that we do not go in one direction that will be problematic in the future.
As such I will be really happy to see how your proposal can be expanded, e.g., for dealing with multiple levels of embedding: complex groups inserted in corners, several levels of embedding, etc. (you have all the data needed, right?). I'm simply not ready to buy a pig in a poke: we provide you with egyptological evidence, if you can do the same for the encoding scheme, perfect! My only concern is about long term evolution: I agree that having the equivalent of ":" and "*" in MdC would already be a big plus, I wrote it several times, but it should fit into the broader picture of a coherent scheme and not lead to some "bricolage" on top of a (too) simple scheme.

Furthermore, I get perfectly that Unicode can be expanded and that it's not a one time thing, but my understanding (correct me if I'm wrong) is that withdrawing and/or revising substantially a scheme is not likely to happen at any point, right? (just because resources will be quickly created, and that you do not want to be incompatible, right?)

> You say you leave it to the specialists but then seem to want to ignore what we recommend! I can't think of anyone I've met with solid software experience who would see the syntax of "A System of ..." as a practical step forward - if you find one put them in touch with me I enjoy new experiences!

Nope, again: I'm not ignoring what you recommend. I'm listening as carefully as I can (sorry for not being Bill Gates though) and, as I said, I'm open to any syntax guaranteeing coherent and adequate long-term extensions: simply show us how you'd proceed; evidence is needed both ways in well-balanced relationships. And, I hate to return the argument, but "
I can't think of anyone I've met with solid egyptological experience who would see the actual limitations of your scheme as a practical step forward for Egyptology in the long run." [and please take this as a joke, the argument from authority story.]

In a nutshell: I'm happily supporting any proposal that implements the basic capabilities described in Mark-Jan's document (because, really, no further evidence is needed), or even part of it, *as long as one can show how it can be developed satisfactorily in the future* (I'm not an IT guy and I'm eager to learn and also enjoy new experiences). You asked for egyptological evidence, you have it; the opposite should hold, no? Again, I'd rather not buy a pig in a poke, so just tell me how your scheme can be expanded, at which level, etc. for supporting the capabilities of Mark-Jan's proposal, with a comprehensive description that even people like me could understand, and I'm quite confident that we should be able to produce some consensus document!

Take care,

Stéphane
From s.polis at ulg.ac.be Tue Jul 26 11:57:45 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Tue, 26 Jul 2016 12:57:45 +0200
Subject: [Egyptian] Two group joiners
In-Reply-To:
References:
Message-ID:

Dear Bob,

Thanks for this document! Some remarks.

P. 1 - tall groups are not "best known from Late Egyptian": probably rarer in Old Kingdom inscriptions (but this should be checked and I don't know of studies about this aspect, Nigel?), they are everywhere in hieroglyphic inscriptions (and perhaps more and more systematically present) from the First Intermediate Period onward (down to the Late Period). Look for instance at the biographical texts on FIP and Middle Kingdom stelae; you'll find several examples on every single document. The same remark applies to your note after the title "horizontal text" obviously.

P. 1 - Note. "The topic etc.": I understand that you do not want to conflate the issues, but in a broader perspective why not acknowledge the fact that the same principles apply, namely making groups that are somehow scaled down to fit in horizontal, vertical or diagonal arrangements. Your solution is probably convenient for vertical and horizontal grouping in most cases (even if it probably implies later additions of lower/higher level operators for more complex embedding), but could you explain how you intend to deal with e.g. groups like

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-6.pdf
Type: application/pdf
Size: 4051 bytes
Desc: not available
URL:
-------------- next part --------------

(in s*xm-ib*), frequent as well in the same context? Offering a solution to this issue would be a big step forward in my view!

Best wishes,

Stéphane

------------------------------------------------------
Chercheur qualifié F.R.S.-FNRS
Université de Liège
Service d'égyptologie
Département des sciences de l'Antiquité
Place du 20-Août, B-4000 Liège
http://www.egypto.ulg.ac.be
------------------------------------------------------

> On 23 Jul 2016, at 00:19, Bob Richmond wrote:

From odusseus at gmail.com Tue Jul 26 12:02:54 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Tue, 26 Jul 2016 13:02:54 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <2E8CBE1F-4748-4EDA-BEBD-050F343F300D@ulg.ac.be>
References: <2E8CBE1F-4748-4EDA-BEBD-050F343F300D@ulg.ac.be>
Message-ID:

Hello everyone,

If I may jump in with a little consideration about the evidence, taking into account that as said I didn't know any of you before the meeting in Cambridge, so it could be that you have already discussed that in other contexts - in that case just ignore my comment, but at the same time perhaps in that case it could be useful to put this material available online somewhere (a website? a shared drive? something) or to share a link if these data are already online.

I think (but again, it is just my feeling, to take as a possible way to try to solve the dispute, not as "truth") that part of the problem could be that the kind of "evidence" you IT people and you Egyptologists are talking about is conceptually slightly different. In particular it seems to me that Egyptologists here tend to think in a qualitative way, while IT people tend to think in a quantitative way - or at least need quantitative data.

For instance, the IT people want to have evidence for the need of 5 level recursivity in Egyptian groups (just a random example). I am sure that Egyptologists will be able to find an example of actual text with an actual group with a 5 level recursivity. And I have the feeling that for them this will be enough as "evidence".

From an IT point of view (and a Unicode point of view), however, this is not enough. The point is not only to see if some feature is attested, but also how frequent it is.
5 level recursivity is attested in Egyptian writing, good. But how often does it occur in texts, in the whole corpus of Egyptian texts? Is it something that happens every three words? Is it something that happens only in specific texts in specific periods (like e.g. 5% of the total), and in those texts it is very frequent, or is it something attested only a few times among all the Egyptian texts (from the Old Kingdom to the Roman period)? Because if something is very frequent, then for instance it makes sense to consider coding it in Unicode. If however something is quite rare, then other solutions (e.g. HLP) could be more efficient to deal with it. Giving evidence that something is attested, without giving quantitative data about its frequency/importance/relevance, is probably not enough.

And note that this is not a problem only for the encoding of Egyptian: this is a very common problem that exists for the encoding of major living languages and scripts used by hundreds of millions of people.

Just a very simple example (it is an example about ligatures, but it would be the same with control characters): the Devanagari script, the writing system of Hindi, Sanskrit and other Indian languages, is essentially syllabic (with a consonant(s) + vowel structure, so only open syllables), and uses ligatures to represent syllables with consonant clusters. In other words: in Devanagari, "ma", "ta", "ka" are coded as single characters, while syllables like "tma" or "kta" are rendered with ligatures as single characters. Now Indian languages have hundreds of possible consonantal clusters, with 2, 3, 4 (and potentially more) consonants clustered together.
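As a side illustration of the Devanagari point: in Unicode plain text a conjunct such as "kta" is stored as a sequence of ordinary code points (consonant, virama, consonant), and whether it is displayed as a single ligature is left to the font. A minimal sketch, using only Python's standard unicodedata module:

```python
import unicodedata

# "kta" as stored in plain text: KA + VIRAMA + TA.
# The font, not the encoding, decides whether this is drawn as one ligature.
kta = "\u0915\u094D\u0924"

for ch in kta:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(len(kta))  # 3 code points, even when displayed as one conjunct
```

The design choice this shows is that rare clusters need no new characters: any consonant sequence can be written, and a font without the matching ligature falls back to showing the virama visibly.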
If in addition you consider that Devanagari was not only used for centuries for Sanskrit, Prakrit and other ancient languages, but is a script still used today for modern, living languages (which therefore could need, for instance, to transcribe words from other languages with even more complex consonantal clusters that are not attested in Indian languages, e.g. if you want to transcribe in Devanagari the Icelandic word "islansklukkur" - nsklu does not sound like a cluster that could be "native" in Indian languages - or just consider that in Nepal, for instance, Devanagari is used to transcribe dozens of different languages, many of which are not even Indo-European and therefore have very specific phonologies and clusters), the number of possible ligatures that could exist, that are attested, and for which one could indeed supply some form of evidence, would be huge. Thinking of coding all of these combinations just because they are attested (or could be needed) would not be very practical and in fact is not what happens in "Unicode reality".

What happens is that only the most common and most frequent ligatures are encoded and specifically dealt with. For all the other less common consonant clusters, other strategies are adopted. For instance, if you want to display a complex/rare ligature that is not part of the standard set, you just use a diacritic that will indicate that the signs bearing it have to be read and understood as ligated with the following one, although in practice they are not displayed as such. And again, we are talking about Devanagari, which is a purely phonetic writing system used by hundreds of millions of people.

So again, to go back to Egyptian: showing that a given graphic phenomenon/combination/etc. is attested is not enough to claim that a given feature is needed and needs to be encoded (now, or perhaps in the future). Its frequency and relative importance also have to be shown.
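The distinction between "attested" and "frequent" could be quantified along the following lines. This is only a sketch: the sample strings are invented and are not corpus data, and it assumes a simplified MdC-like notation in which "-" separates top-level groups and parentheses mark nested sub-groups.

```python
from collections import Counter

def nesting_depth(group):
    """Return the maximum parenthesis depth inside one MdC-like group."""
    depth = max_depth = 0
    for ch in group:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth

def depth_distribution(texts):
    """Relative frequency of each nesting depth over all groups in the texts."""
    counts = Counter(
        nesting_depth(group)
        for text in texts
        for group in text.split("-")
        if group
    )
    total = sum(counts.values())
    return {depth: n / total for depth, n in sorted(counts.items())}

# Invented sample, not real corpus data.
sample = ["A1-B1:C1-D1*(E1:F1)", "G1-(H1*(I1:J1)):K1"]
print(depth_distribution(sample))  # {0: 0.6, 1: 0.2, 2: 0.2}
```

Run over a real encoded corpus, a table of this kind would show whether deep nesting occurs in a sizeable share of groups or only in a handful of cases.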
Now again, I repeat: I don't know if this has already been done. If so, ignore my comments above, but at the same time it would be interesting and useful to have access to such data. So far, however, what I have seen are just "samples" of specific cases showing that a given feature is attested, but no data about how common and frequent such specific cases are.

So for instance (and believe me, Nigel :-p, it is not to open the discussion again), take the case of tw/wt. OK, a good example of ambiguity that can be interpreted as deriving at least in part from the spatial distribution of signs. But this is one case. How many other cases of similar ambiguities involving other signs are there, and how frequent and common are they in the whole corpus of surviving Egyptian texts? And it is a serious question, because I haven't worked for 15 years on Egyptian texts, but still I have been dealing with Egyptian texts for more than 10 years now - geez... - and right now I can't think of any other really common and really frequent similar case (in my period of competence at least). I can think of exceptional examples perhaps, but nothing common and systematic.

And this is important, because if this is a frequent feature recurring often and with many different characters, then it would make sense to deal with it at the Unicode level, with control characters or whatever. If instead it is something that essentially happens only with the signs t and w, and perhaps in a bunch of other random cases in the whole of Egyptian literary history, then there could be more practical/reasonable/efficient (choose the word you prefer) solutions, from an IT/Unicode perspective, to deal with them (such as for instance introducing a t&w character, as they did in the basic Unicode set).

So again, evidence that a given feature is attested is good, but it would be better to have info also about its frequency, relevance etc.
And it these data are already available somewhere, then it could be useful to share them (again?) here Best :-) Marwan On Tue, Jul 26, 2016 at 11:51 AM, St?phane polis wrote: > > Sorry St?phane you get the brunt, price you pay for being the first J. > > > Hi Bob, > > There is nothing to be sorry about: a discussion is a discussion and you > can have your opinion, based on your experience and I respect it (same with > Marwan: as he said, there is nothing personal here, and I definitely > agree). > > Now, if one wants to reach a consensus at some point, as Nigel was > pleading it, it would be great not to distord what I said in order to > perpetuate this ?unbalanced relationship? that leads to nothing (if I am to > trust what you say as IT and font specialist, why would you systematically > discard the evidence provided and the argument developed as regards the > basic capabilities that would be important for us, i.e. the egyptologists > working on electronic corpora and developing the textual resources for the > filed, to have. Respecting you as a specialist of your field, when the > reverse does not seem to hold, is not always an easy task I confess and > requires some sangfroid.) > > Congratulations for being the first to (heartfully) express public support > for a representation of MdC X1:R1 that uses *3 control characters*. > > Certainly Unicode could handle it but I?m a little sceptical most > Egyptologists could. I hope we?ll have this topic cleared up in the next > few days otherwise I think we are going to have to draw more Egyptologists > into the discussion. > > > Why do you feel the need to distord my words (if not to antagonize the > positions): I said that the syntax (number of control characters, etc.) had > no importance for me whatsoever, leaving this up to specialists; what > matters are the principles, and I quote ?what you can effectively achieve? > with these control characters. At the moment, these basic needs ? 
and > significant compromises and simplifications have been made, don?t you > agree? ? are only covered by Mark-Jan?s proposal. That?s it, nothing less, > nothing more. If you have a better ?technical? solution for these > capabilities, please go for it and submit them to the group, the way it?s > done doesn?t matter to me at all! > > Incidentally have you ever attempted to edit text written in a Complex > Script using a word processor? If you haven?t I suggest you do then perhaps > think again about what you are putting your name to. > > > No (depending on your definition of complex scripts obviously), and that?s > why I trust people like you, So, Marwan and others to find acceptable > solution for the edition, I fully accept that you?re the specialists and > that it is not a simple issue. But we?re in the 21st century and you?re > imaginative IT guys: I?m sure that the basic arrangement of signs that > Egyptians could handle is not beyond you reach, even with Unicode! (Note, > as I said several times: ancient Egyptian is not English, 5.000.000 words > max., including Demotic, and very soon most of the texts will be available > freely in MdC, so the encoding speed/efficiency that you repeatedly mention > is almost of no concern for long texts that could be converted; and for > short quotations, words, etc., I?m pretty confident that between 30 sec. a > sentence with limited capabilities and 2 min. a sentence for a more > adequate encoding, most Egyptologists would choose the second [again, I?m > not representative of anyone but myself, that?s a general and subjective > feeling; you can ask for a consultation using the IAE mailing list I > suppose, but be ready for some actual craziness ;)]. > > Now, let me return the argument: have you spent the last 15 years reading, > teaching, publishing, encoding and annotating all sorts of hieratic and > hieroglyphic texts from OK inscriptions down to Late Period rituals? 
> If you haven't I suggest you do, and perhaps think again about what kind of encoding scheme you're developing. That's completely caricatural and ridiculous, isn't it? You certainly know what John Locke said about this kind of argument from authority, and I would never use it as you do in any sort of debate. Up until now, Egyptological evidence was systematically provided by us for any kind of principle advocated. There might be other ways to deal with these cases than with control characters in Unicode (ligatures in fonts, HLP, compound characters in Unicode, you name it), fine, but please no "further evidence is needed" for the basic capabilities we're talking about.
>
>> Reading what you've said here I suspect the underlying issue is you feel that if we don't do everything now all is lost.
>
> No, my sole concern is that we do not go in a direction that will be problematic in the future. As such I will be really happy to see how your proposal can be expanded, e.g., for dealing with multiple levels of embedding: complex groups inserted in corners, several levels of embedding, etc. (you have all the data needed, right?). I'm simply not ready to buy a pig in a poke: we provide you with Egyptological evidence; if you can do the same for the encoding scheme, perfect! My only concern is about long-term evolution: I agree that having the equivalent of ":" and "*" in MdC would already be a big plus, I wrote it several times, but it should fit into the broader picture of a coherent scheme and not lead to some "bricolage" on top of a (too) simple scheme.
>
> Furthermore, I understand perfectly that Unicode can be expanded and that it's not a one-time thing, but my understanding (correct me if I'm wrong) is that withdrawing and/or substantially revising a scheme is not likely to happen at any point, right? (Just because resources will be quickly created, and you do not want to be incompatible, right?)
>
>> You say you leave it to the specialists but then seem to want to ignore what we recommend! I can't think of anyone I've met with solid software experience who would see the syntax of "A System of ..." as a practical step forward - if you find one put them in touch with me, I enjoy new experiences!
>
> Nope, again: I'm not ignoring what you recommend. I'm listening as carefully as I can (sorry for not being Bill Gates though) and, as I said, I'm open to any syntax guaranteeing coherent and adequate long-term extensions: simply show us how you'd proceed; evidence is needed both ways in well-balanced relationships. And, I hate to return the argument, but "I can't think of anyone I've met with solid Egyptological experience who would see the actual limitations of your scheme as a practical step forward for Egyptology in the long run." [And please take this as a joke, the argument from authority story.]
>
> In a nutshell: I'm happily supporting any proposal that implements the basic capabilities described in Mark-Jan's document (because, really, no further evidence is needed), or even part of it, *as long as one can show how it can be developed satisfactorily in the future* (I'm not an IT guy and I'm eager to learn and also enjoy new experiences). You asked for Egyptological evidence, you have it; the opposite should hold, no? Again, I'd rather not buy a pig in a poke, so just tell me how your scheme can be expanded, at which level, etc., to support the capabilities of Mark-Jan's proposal, with a comprehensive description that even people like me could understand, and I'm quite confident that we should be able to produce some consensus document!
>
> Take care,
>
> Stéphane
>
>> Regards,
>> Bob
>>
>> From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Stéphane Polis
>> Sent: 25 July 2016 11:25
>> To: Egyptian Hieroglyphs in the UCS
>> Subject: Re: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
>>
>>> Hi all,
>>>
>>> If I may jump in one last time.
>>>
>>> As I said several times, I leave entirely up to the specialists all the issues concerning the syntax of the operators.
>>>
>>> What matters for me is what you can effectively achieve, and Mark-Jan's proposal covers precisely what we minimally would like to have (which is why I support it heartfully). Now, if it really ends up being too much for Unicode and there is no way to make this happen there, but you are confident that it can be handled at the level of HLPs, then can I ask a very naive question: what need is there for any control character in Unicode?
>>>
>>> After all, Marwan's font with the ligatures seems to work quite well for basic purposes, so it can be considered a good solution for some users, esp. when combined with So's input system. For other uses, we will need several types of grouping, groups inserted in groups, etc.: why would some bits end up in Unicode, while others would be up to HLP? Shouldn't we try to have a coherent scheme and not something made of bits and pieces?
>>>
>>> A real (even if maybe naive) concern.
>>>
>>> Take care,
>>>
>>> Stéphane
>>>
>>> On 24 Jul 2016, at 20:41, Bob Richmond wrote:
>>>
>>>> Hi Mark-Jan
>>>>
>>>> We've been talking about plain text on and off for over 18 months so I was interested to read your L2/16-177 "new system" earlier this month and your latest update yesterday (https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf). Good to see your number of control characters now reduced, more consideration has been given to OpenType and some discussion in Cambridge has been factored in.
>>>> I understand what you are trying to do and there are points I'd like to discuss when we have time. There are items you note that indeed need to be progressed and agreed (EMPTY, STACK CARTOUCHE ...).
>>>>
>>>> However I am disappointed you have not taken on board the fundamental failing with the scheme you've been developing that makes it unsuitable for Unicode plain text consideration, however useful it may be for other purposes. I thought this was clear at Cambridge but apparently not.
>>>>
>>>> Unicode plain text hieroglyphic is exposed to a huge audience and the very first consideration is *don't make simple things complicated*. Use of BEGIN/END for every group breaks that basic rule. Your scheme is unnecessarily complicated so fails at the outset. It's a continuing distraction for the less technically minded in this group to continue to hold it up as a viable alternative to the current UTC recommendation. It is not.
>>>>
>>>> MdC X1:R1 in your scheme uses *3 control characters*. Quite honestly I find it hard to understand why anyone thinks this is possibly a good idea. *If anyone reading this does support 3 over 1 please let me know your reasoning; I'm actually quite curious why I've had to spend time on this.*
>>>>
>>>> What you've tried to do is build a theoretical model which describes cluster layout given a set of constraints, and that's all good as an academic exercise. I'd be interested to see it tested against texts and data. All of its features could be added in some way to the three-control system as HLP or extra controls.
>>>>
>>>> So, there's no need to throw your work away. Some parts apply to plain text implementation, e.g. your description on making an MdC-like font. Plenty more you could re-purpose should you wish to continue to be involved in Unicode developments.
>>>>
>>>> What I suggest you do is consider how you might use elements of your scheme in a simple higher level protocol on top of plain text as it is at present (I'll be circulating my considered view on adding to the 3 control set on Tuesday). Brackets in an HLP could be ok for rare cases. May be useful for TLA and Ramses.
>>>>
>>>> Meanwhile if you have any ideas on how you would like to see e.g. 4 corners added to the existing proposal I'd be pleased to hear from you or anyone else on the topic.
>>>>
>>>> Regards,
>>>> Bob
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From ncs3 at cam.ac.uk Tue Jul 26 12:32:01 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Tue, 26 Jul 2016 12:32:01 +0100
Subject: [Egyptian] Two group joiners
In-Reply-To:
References:
Message-ID: <4A025D35-16E9-4821-A5E6-66ECCCBAB862@cam.ac.uk>

As I am called out in para 1, here's my 10¢/10p. Nice to be able to add a bit of Egyptology into this...

I would say that as a general point these tall groups seem to be an affectation of certain types of monuments IN HIEROGLYPHS at certain periods (excluding Graeco-Roman stuff, which is beyond me). The following comments are based on a brief bit of thought and should in no way be regarded as definitive.

Firstly, you need to remember that the best quality monumental texts are almost always vertical, so it's not relevant to those. This would go for temples and tomb walls.
So I would indeed exclude the OK (although there is bound to be the odd exception!).

Longer series of horizontal texts seem to come in with the increasing use of stelae, which is a development of the First Intermediate Period. You only need to look at some of the private stelae from the 11th dyn and before to see what I mean. This continues into the MK and beyond.

As a general rule, the more elaborate the graphic carving/painting of the glyphs, the less frequent these tall groups are. So as a general rule, you don't see them on earlier fancy royal stelae with elaborate signs, but once you get into the NK with the more lengthy inscriptions, even royal ones, cut into sandstone with no pretence of using beautiful painted signs, yes, they get stacked more. My guess is that it is a result of texts getting longer, but don't quote me on that.

Any help?

Nigel

On 26 Jul 2016, at 11:57, Stéphane Polis wrote:

> Dear Bob,
>
> Thanks for this document! Some remarks.
>
> P. 1 - tall groups are not "best known from Late Egyptian": probably rarer in Old Kingdom inscriptions (but this should be checked and I don't know of studies about this aspect, Nigel?), they are everywhere in hieroglyphic inscriptions (and perhaps more and more systematically present) from the First Intermediate Period onward (down to the Late Period). Look for instance at the biographical texts on FIP and Middle Kingdom stelae; you'll find several examples on every single document. The same remark applies to your note after the title "horizontal text" obviously.
>
> P. 1 - Note. "The topic etc.": I understand that you do not want to conflate the issues, but in a broader perspective why not acknowledge the fact that the same principles apply, namely making groups that are somehow scaled down to fit in horizontal, vertical or diagonal arrangements.
> Your solution is probably convenient for vertical and horizontal grouping in most cases (even if it probably implies later additions of additional lower/higher level operators for more complex embedding), but could you explain how you intend to deal with e.g. groups like (in s*xm-ib*), frequent as well in the same context?
>
> Offering a solution to this issue would be a big step forward in my view!
>
> Best wishes,
>
> Stéphane
>
> ------------------------------------------------------
> Chercheur qualifié F.R.S.-FNRS
>
> Université de Liège
> Service d'égyptologie
> Département des sciences de l'Antiquité
> Place du 20-Août,
> B-4000 Liège
>
> http://www.egypto.ulg.ac.be
> ------------------------------------------------------
>
>> On 23 Jul 2016, at 00:19, Bob Richmond wrote:
>>

From bobqq at live.co.uk Tue Jul 26 13:31:13 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Tue, 26 Jul 2016 13:31:13 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <2E8CBE1F-4748-4EDA-BEBD-050F343F300D@ulg.ac.be>
References: <2E8CBE1F-4748-4EDA-BEBD-050F343F300D@ulg.ac.be>
Message-ID:

Hi Stéphane

I'm glad we agree on the importance of trying to avoid problems downstream. It's dead easy to build a complex untested system; the challenge is to create something that actually works and maximizes usability without introducing complexity and future problems. If that means sacrificing some functionality short term it's a small price to pay.

I've much sympathy with what you would like to be able to do and if I could magic up a silver bullet I surely would. If someone else has a suggested improvement, great, so long as it works. This is not a competition, however much some seem to want to paint it as such.

Point in hand.
By putting your name to a specific technical proposal you and others who are listed are telling UTC that this is the system you would like to use day to day. In every detail. It really is that simple. If it's just intended as a discussion document, not an implementation proposal, that should be made clear.

Bob

From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Stéphane Polis
Sent: 26 July 2016 10:52
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)

Sorry Stéphane you get the brunt, price you pay for being the first :).

Hi Bob,

There is nothing to be sorry about: a discussion is a discussion and you can have your opinion, based on your experience, and I respect it (same with Marwan: as he said, there is nothing personal here, and I definitely agree).

Now, if one wants to reach a consensus at some point, as Nigel was pleading for, it would be great not to distort what I said in order to perpetuate this "unbalanced relationship" that leads to nothing. (If I am to trust what you say as IT and font specialist, why would you systematically discard the evidence provided and the argument developed as regards the basic capabilities that would be important for us, i.e. the Egyptologists working on electronic corpora and developing the textual resources for the field, to have? Respecting you as a specialist of your field, when the reverse does not seem to hold, is not always an easy task I confess, and requires some sangfroid.)

Congratulations for being the first to (heartfully) express public support for a representation of MdC X1:R1 that uses 3 control characters.

Certainly Unicode could handle it but I'm a little sceptical most Egyptologists could. I hope we'll have this topic cleared up in the next few days, otherwise I think we are going to have to draw more Egyptologists into the discussion.
From bobqq at live.co.uk Tue Jul 26 13:45:15 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Tue, 26 Jul 2016 13:45:15 +0100
Subject: [Egyptian] Two group joiners
In-Reply-To:
References:
Message-ID:

Hi Stéphane

-> "possibly best known from Late Egyptian" (thinking e.g. tall narrow groups on the well-known Israel stela)? Doesn't really matter for point in hand so long as controls work well.

Your example JSesh: m&&&(x:ib*Z1) falls into the "4 corners"/ligature topic. I don't think it features in the Ramses data you sent.

Ultimately it ought to be a supported Unicode cluster. It was in the original proposal. In the revision I've had to make to take into account concerns raised in April (and since) it currently isn't available and you'd need a HLP for release 1. We can discuss that after I've finished writing it up today - there is some wiggle room!

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Stéphane Polis
Sent: 26 July 2016 11:58
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] Two group joiners

Dear Bob,

Thanks for this document! Some remarks.

P. 1 - tall groups are not "best known from Late Egyptian": probably rarer in Old Kingdom inscriptions (but this should be checked and I don't know of studies about this aspect, Nigel?), they are everywhere in hieroglyphic inscriptions (and perhaps more and more systematically present) from the First Intermediate Period onward (down to the Late Period). Look for instance at the biographical texts on FIP and Middle Kingdom stelae; you'll find several examples on every single document. The same remark applies to your note after the title "horizontal text" obviously.

P. 1 - Note. "The topic etc.": I understand that you do not want to conflate the issues, but in a broader perspective why not acknowledge the fact that the same principles apply, namely making groups that are somehow scaled down to fit in horizontal, vertical or diagonal arrangements. Your solution is probably convenient for vertical and horizontal grouping in most cases (even if it probably implies later additions of additional lower/higher level operators for more complex embedding), but could you explain how you intend to deal with e.g. groups like (in s*xm-ib*), frequent as well in the same context?

Offering a solution to this issue would be a big step forward in my view!

Best wishes,

Stéphane

From everson at evertype.com Wed Jul 27 18:29:25 2016
From: everson at evertype.com (Michael Everson)
Date: Wed, 27 Jul 2016 18:29:25 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References:
Message-ID: <3E09BAFD-CF24-47FB-AECC-E90001DC7D07@evertype.com>

On 25 Jul 2016, at 17:16, Bob Richmond wrote:

> Congratulations for being the first to (heartfully) express public support for a representation of MdC X1:R1 that uses 3 control characters.

Well, I don't. "Ligate X with Y" does not tell font developers anything. This is why the five controls are superior, and why a generic ligator is dangerous, because if you end up with six controls then you will have multiple spellings for many many many clusters.

Michael.

From bobqq at live.co.uk Wed Jul 27 19:03:53 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Wed, 27 Jul 2016 19:03:53 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <3E09BAFD-CF24-47FB-AECC-E90001DC7D07@evertype.com>
References: <3E09BAFD-CF24-47FB-AECC-E90001DC7D07@evertype.com>
Message-ID:

I've no problem with some kind of 4 corner system in theory and spent some time over the last two weeks trying to come up with a workable version that isn't insanely complicated. Incidentally nobody to date has actually said how they might like it to work so I've been on my own. Apart from Mark-Jan et al (in http://www.unicode.org/L2/L2016/16210-egyptian-control.pdf) who propose that a "Cobra with group tucked inside" should be encoded with the group preceding the Cobra. For the common and simple example 'Dd' (Cobra containing hand) he proposes the 'd' hieroglyph comes before the 'D' in the code sequence. I'm actually quite shocked that several experienced Egyptologists want to do this.

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Michael Everson
Sent: 27 July 2016 18:29
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)

On 25 Jul 2016, at 17:16, Bob Richmond wrote:

> Congratulations for being the first to (heartfully) express public support for a representation of MdC X1:R1 that uses 3 control characters.

Well, I don't. "Ligate X with Y" does not tell font developers anything. This is why the five controls are superior, and why a generic ligator is dangerous, because if you end up with six controls then you will have multiple spellings for many many many clusters.

Michael.

From s.polis at ulg.ac.be Thu Jul 28 09:17:13 2016
From: s.polis at ulg.ac.be (Stéphane Polis)
Date: Thu, 28 Jul 2016 10:17:13 +0200
Subject: [Egyptian] Two group joiners
In-Reply-To:
References:
Message-ID: <95BE07BA-CE76-49E8-B6DE-304C925ABFD3@ulg.ac.be>

Hi Bob,

> -> "possibly best known from Late Egyptian" (thinking e.g. tall narrow groups on the well-known Israel stela)? Doesn't really matter for point in hand so long as controls work well.

No, I agree, but it matters in terms of text/period frequency (see the comments made by Marwan): the way you describe it might lead UTC members to think that they are rather exceptional and limited to some hieroglyphic corpora. Quite to the contrary, they are very well documented (if admittedly not pervasive) from the FIP onwards. As such, the support of (several) levels of embedding is not just an adornment but, in my view at least, a necessity for the arrangements of hieroglyphs in Unicode.

> Your example JSesh: m&&&(x:ib*Z1) falls into the "4 corners"/ligature topic.

Yes indeed.

> I don't think it features in the Ramses data you sent.

Certainly not: it comes from a Middle Kingdom stela (one of the BM collection, if I remember correctly). I repeated this enormous *caveat* several times regarding Ramses, but it's probably worth saying it again: the data from Ramses are (close to being) comprehensive for *Ramesside hieratic* documents (with a small sample of hieroglyphic texts for the same period). For most of the hieroglyphic texts and for all the other periods, I'm afraid one has no way to query significant amounts of text at the graphemic level at the moment: the only way to get an idea of what the encoding should cover is therefore either by experience of the types of clusters attested or by going over publications a bit systematically.

These kinds of groups are well attested in horizontal and vertical hieroglyphic texts as well; frequency? No precise idea, I would say 1 or 2 such groups per biographical text on stelae during the MK, to give you an idea. The pattern is productive, which certainly makes for quite a lot of them if you consider a broader corpus, but I would be unable to give you figures.

Now, this is a good illustration of my concerns regarding the coherence of the system and its later extension, and I would be really happy to have your opinion on what follows.

1) Let's say that your group joiners are integrated in Unicode. (As I said, probably not a bad thing per se: it would allow us to cover a lot of cases, and other operators of the same nature could be added later for additional levels of embedding. One does not look at the broader picture in a first step, but let's consider that it's fine.)

2) We have groups such as m&&&(x:ib*Z1). We know they exist and are well attested, but we decide to ignore them for the time being. Fine.

3) We want to support them at some point in the future in a later extension, and let's imagine that "insert_Top_Right" is available as an operator; we nevertheless need to find a way to say that it concerns a *group* of some sort and not only the "x" coming after the "m", for instance. Do you have an idea how to do this without some sort of parenthetical system? [This is a real question, no irony of any sort: I guess that one will not integrate as many INSERT operators as required by the priorities and possible combinations with respect to ":" and "*".]

4) One could potentially decide to finally integrate some kind of parenthetical system for handling such cases in Unicode, m [top_right] (x:ib*Z1). However this would potentially lead to two ways of encoding other (vertical and horizontal) groups. For instance, sin of your draft could be: or .

5) The final decision is that such groups cannot be supported by Unicode (because of previous decisions as regards the syntax): it would mess up the syntax previously defined for the group joiner and this is not acceptable. BAD.

I might be completely missing something, but as we know that *making groups of signs* and *integrating them in various positions* (vertical, horizontal, corners) is part of the basic principles of the hieroglyphic script, and there will probably be no huge discoveries in the years to come in this respect, why not try to find a way to handle these groupings coherently from the start, in order to avoid problems that we know we'll have to face?

I understood that parentheses are very difficult in Unicode, but can't we think of a kind of "glue" operator (precedence, as in Mark-Jan's et al. proposal, p. 8-9), e.g. with the dot, or , that would create groups by binding signs together. What would be the problem with such an approach (it was initially suggested by Serge, not my idea, I insist)? Can you explain to me why it would be problematic?
In my view, it has the advantage of dealing with groups in a homogeneous way (and we know this is something that has to be supported), of avoiding parentheses (because the Unicode specialists seem to agree that they are a bad idea), of being easily readable and understandable by the lay Egyptologist (if you want signs to stick together and they don't, just add a dot), etc. Cases with several levels of precedence are rarer and could be handled by repeating such an operator (it then admittedly becomes a bit more difficult to read, but it is still easily understandable by humans and machines alike). This solution is economical (the addition of a single operator that could be repeated, instead of opening/closing brackets or several types of group joiners as in your proposal) and adequate (because it covers all the kinds of groups that we have discussed so far); it won't pollute the syntax (if you do not need it, don't use it); etc. What's the argument against such an approach (again, a real question)? As I said, I'm also eager to learn and understand the issues!

If one could build a consensus document that goes in this direction, I'm in.

Best wishes,
Stéphane

> Ultimately it ought to be a supported Unicode cluster. It was in the original proposal. In the revision I've had to make to take into account concerns raised in April (and since) it currently isn't available and you'd need a HLP for release 1. We can discuss that after I've finished writing it up today - there is some wiggle room!
>
> Bob
>
>
> -----Original Message-----
> From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Stéphane polis
> Sent: 26 July 2016 11:58
> To: Egyptian Hieroglyphs in the UCS
> Subject: Re: [Egyptian] Two group joiners
>
> Dear Bob,
>
> Thanks for this document! Some remarks.
>
> P. 1 - tall groups are not "best known from Late Egyptian": probably rarer in Old Kingdom inscriptions (but this should be checked and I don't know of studies about this aspect, Nigel?), they are everywhere in hieroglyphic inscriptions (and perhaps more and more systematically present) from the First Intermediate Period onward (down to the Late Period). Look for instance at the biographical texts on FIP and Middle Kingdom stelae; you'll find several examples on every single document. The same remark applies to your note after the title "horizontal text" obviously.
>
> P. 1 - Note. "The topic etc.": I understand that you do not want to conflate the issues, but in a broader perspective why not acknowledge the fact that the same principles apply, namely making groups that are somehow scaled down to fit in horizontal, vertical or diagonal arrangements. Your solution is probably convenient for vertical and horizontal grouping in most cases (even if it probably implies later additions of lower/higher level operators for more complex embedding), but could you explain how you intend to deal with e.g. groups like (in s*xm-ib*), frequent as well in the same context? Offering a solution to this issue would be a big step forward in my view! Best wishes, Stéphane

From s.polis at ulg.ac.be Thu Jul 28 10:26:12 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Thu, 28 Jul 2016 11:26:12 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <3E09BAFD-CF24-47FB-AECC-E90001DC7D07@evertype.com>
Message-ID:

> Apart from Mark-Jan et al (in http://www.unicode.org/L2/L2016/16210-egyptian-control.pdf) who propose that a "Cobra with group tucked inside" should be encoded with the group preceding the Cobra.
> For the common and simple example 'Dd' (Cobra containing hand) he proposes the 'd' hieroglyph comes before the 'D' in the code sequence. I'm actually quite shocked that several experienced Egyptologists want to do this.

The issue of the reading order is a real one and probably not an easy one to solve; but it should be tackled and integrated in the discussions, I definitely agree. At first, I was of the opinion that we should stick strictly to the graphemic order (as some clusters are not always easy to interpret in terms of reading; this is the conclusion that we reached with Michael at the pub and discussed with Mark-Jan, Serge and others), but on second thought I understood why it would not be ideal or very practical in Unicode (notably for searches), and admittedly weird for Egyptologists: encoding Dd as is not natural, to say the least. This is why Serge suggested to deal with such case as , if I remember correctly, but not an ideal solution either of course.

In order to cover both the graphemic and the reading order, I see two options at the moment (there might be others and I'm listening carefully to what you guys think).

****** 1. Insertion with a flexible syntax and "orientation" operators *******

I mean here insertion operators that you could position before or after a sign depending on the reading order. Let's take a basic example. I agree that you would like to read Dd [scrubbed attachment: PastedGraphic-7.pdf] as , with the inserted sign coming after. On the other hand, for tA (the feminine article) [scrubbed attachment: PastedGraphic-8.pdf], you would like to have , with the inserted sign coming before.
For this to be possible, we would have to define two operators that "orient" the insertion: one such as ">", saying that the sign is inserted in what follows, and one such as "<" for the opposite direction:

D [insert_bottom_left]< d
t [insert_bottom_left]> A

Possible, but probably not ideal given the problems that you would face for multiple insertions. Let's take, for instance, (i)w.t [scrubbed attachment: PastedGraphic-9.pdf] (t&w&D54). Ideally, you would like to read something which is not t&w&D54, but rather: w [insert_bottom_left]< t [insert_top_right]< D54, in order to stick somehow to an ideal reading order (even if the sequential reading is questionable in such cases). However, with such a syntax it is unclear how one knows that the target of the insertion of D54 is "w" and not "t". An elegant solution for handling this could be to use:

***** 2. A single "scope operator" ****

I mean here an operator that marks one sign as the scope of all the insertions around it. Let's say we use "!" as a scope marker; then the examples above would become:

D! [insert_bottom_left] d
t [insert_bottom_left] A!
w! [insert_bottom_left] t [insert_top_right] D54

The graphemic order is explicit and the reading order is respected. I do not know if this is manageable or practical from an IT/font point of view (it does not seem terribly problematic though), but this second solution has the advantage of being economical (a single operator), graphically explicit (one knows exactly which sign goes into which and at which position), and "linguistically" searchable (the reading order is respected). For instance, xm [scrubbed attachment: PastedGraphic-10.pdf] as , respects the reading order and is graphically explicit.
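To make the "scope operator" idea concrete, here is a minimal Python sketch of how a flat sequence with one "!"-marked sign could be resolved into a host sign plus corner insertions. Everything in it is an assumption for illustration: the "!" marker and the [insert_...] spellings are the ad hoc notation of this message, not Unicode characters, and the names parse_scoped and Cluster are hypothetical. It only handles single inserted signs, not inserted groups.

```python
from dataclasses import dataclass, field

SCOPE = "!"  # ad hoc scope marker from the message above (assumption, not a Unicode control)


@dataclass
class Cluster:
    host: str                                            # sign that receives all insertions
    insertions: dict = field(default_factory=dict)       # corner name -> inserted sign
    reading_order: list = field(default_factory=list)    # signs in the order they were typed


def parse_scoped(text: str) -> Cluster:
    """Parse 'sign [insert_x] sign ...' in which exactly one sign carries '!'."""
    tokens = text.split()
    signs, ops = tokens[0::2], tokens[1::2]
    if len(tokens) % 2 == 0 or not all(
        op.startswith("[insert_") and op.endswith("]") for op in ops
    ):
        raise ValueError("expected: sign ([insert_corner] sign)*")
    hosts = [i for i, s in enumerate(signs) if s.endswith(SCOPE)]
    if len(hosts) != 1:
        raise ValueError("exactly one sign must carry the scope marker")
    h = hosts[0]
    cluster = Cluster(
        host=signs[h][:-1],
        reading_order=[s[:-1] if s.endswith(SCOPE) else s for s in signs],
    )
    for i, sign in enumerate(signs):
        if i == h:
            continue
        # A sign typed before the host uses the operator on its right;
        # a sign typed after the host uses the operator on its left.
        op = ops[i] if i < h else ops[i - 1]
        cluster.insertions[op[len("[insert_"):-1]] = sign
    return cluster


if __name__ == "__main__":
    print(parse_scoped("w! [insert_bottom_left] t [insert_top_right] D54"))
    print(parse_scoped("t [insert_bottom_left] A!"))
```

The typed order is kept as the reading order, while the marker alone decides which sign is the graphical host, which is the separation the proposal is after.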
There will be ambiguous cases, I have no doubt, but this is unavoidable if one wants to make any room for the "reading" order in the encoding of ligatures.

What do you think?

Best wishes,
Stéphane

From mn31 at st-andrews.ac.uk Thu Jul 28 10:32:05 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Thu, 28 Jul 2016 10:32:05 +0100
Subject: [Egyptian] Two group joiners
In-Reply-To: <95BE07BA-CE76-49E8-B6DE-304C925ABFD3@ulg.ac.be>
References: <95BE07BA-CE76-49E8-B6DE-304C925ABFD3@ulg.ac.be>
Message-ID: <4429959.tRbiO4p2SD@thuis>

Dear Stéphane,

In support of what you wrote on coherence: A long time ago we passed beyond the stage of knowing we need certain primitives.
The main question is how to fit them together into a coherent scheme with a syntax that is consistent and unambiguous, without sacrificing (too much of) the power of these primitives and how they can be combined.

> I understood that parentheses are very difficult in Unicode, but can't we think of a kind of "glue" operator (precedence, as in Mark-Jan's et al. proposal, p. 8-9), e.g. with the dot, or , that would create groups by binding signs together. What would be the problem of such an approach (it was initially suggested by Serge, not my idea, I insist)? Can you explain to me why it would be problematic?

With respect to the discussion in Section 9 of:
http://www.unicode.org/L2/L2016/16210-egyptian-control.pdf
the '.' above is a notational variant of the 'levels' of operator precedence, there achieved by having several copies of each primitive, for different levels. Here, for every '.' added behind an operator one goes up one level of operator precedence.

Some trick using operator precedence, either using the '.' or using several control characters per primitive with different binding values, would be an alternative to having brackets. There is a relatively straightforward mechanical mapping from one notation to the other, so there should be no difference in expressive power, and I don't think that among us (TLA/Ramses/St Andrews) there is any firm commitment to brackets or otherwise.

But as explained in Section 9, there are costs to abandoning brackets. Specifying the syntax with operator precedence formally would be a bit tricky; it might not be easy for a human to know when to use :... or *.. or insert_top_right. ; OpenType cannot really parse, as it is not a general-purpose programming language; and trying to implement about a dozen different binding values (4 primitives times 3 levels of operator precedence) might be pushing our luck.
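The "mechanical mapping" between the dotted notation and brackets can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the syntax of L2/16-210: it covers just ':' and '*', it assumes that '*' binds tighter than ':' by default (as in MdC), and it assumes that each trailing '.' makes an operator bind tighter than any operator with fewer dots. The function names are hypothetical.

```python
import re

BASE = {":": 0, "*": 1}  # assumed default: '*' binds tighter than ':' (as in MdC)
TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*|[:*]\.*)")


def tokenize(text: str) -> list:
    """Split into sign names and operators such as ':', '*', ':.', '*..'."""
    text, pos, out = text.strip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out


def strength(op: str) -> int:
    """Binding strength: the dot count dominates, the base operator breaks ties."""
    return 2 * op.count(".") + BASE[op[0]]


def to_brackets(text: str) -> str:
    """Rewrite a dotted-precedence expression as a fully bracketed one."""
    tokens = tokenize(text)
    if len(tokens) % 2 == 0 or any(
        (t[0] in BASE) != (i % 2 == 1) for i, t in enumerate(tokens)
    ):
        raise ValueError("expected: sign (operator sign)*")
    pos = 0

    def parse(min_strength: int) -> str:
        # Standard precedence climbing, left-associative.
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and strength(tokens[pos]) >= min_strength:
            op = tokens[pos]
            pos += 1
            right = parse(strength(op) + 1)
            left = f"({left}{op[0]}{right})"
        return left

    return parse(0)


if __name__ == "__main__":
    print(to_brackets("x:ib*Z1"))   # default precedence
    print(to_brackets("x:.ib*Z1"))  # one dot glues x and ib first
```

Under these assumptions "x:ib*Z1" groups ib with Z1 first, while "x:.ib*Z1" groups x with ib first; going the other way (brackets to dots) would need one extra dot per nesting level, which is where the "dozen binding values" concern comes from.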
My working assumption is that an input method editor would be used to edit hieroglyphic text in practice, so the actual syntax should not matter too much, as long as it is hidden from the user.

Best regards,
Mark-Jan

From s.polis at ulg.ac.be Thu Jul 28 11:37:36 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Thu, 28 Jul 2016 12:37:36 +0200
Subject: [Egyptian] Two group joiners
In-Reply-To: <4429959.tRbiO4p2SD@thuis>
References: <95BE07BA-CE76-49E8-B6DE-304C925ABFD3@ulg.ac.be> <4429959.tRbiO4p2SD@thuis>
Message-ID: <8934E9F2-FB32-4790-827C-C66EA6F38524@ulg.ac.be>

Dear Mark-Jan,

Yeap, I do agree. As you can see, I'm simply trying to be imaginative and to think about the best possible consensus before the very close deadline (so as not to risk everything collapsing or heading into unwarranted territory, because I do care). As such, I understand perfectly the cost of getting rid of parentheses; my point is that, as we know that we need to make clusters (and there is not much room for discussion about this basic principle), the only alternative would be this "glue-precedence" operator, discussed in Section 9 of the document, which has not yet been discussed by others. It's one or the other; there is logically no third option as far as I understand.

I would like to hear what you and others think about the "scope operator" for the insertion, but to sum up, a basic consensus document for control characters could in my view contain only 8 control characters (and we know that we could handle most [if not all] of the cases discussed so far with these characters, that the scheme will be coherent, that it will be easy to expand with other operators without destroying everything, etc.), namely:

1) Vertical [:] // Already in MdC and in Bob's proposal
2) Horizontal [*] // Already in MdC and in Bob's proposal
3) Groups (precedence operator) [.] // Needed for clusters (insertion, vertical, horizontal) and for avoiding the multiplication of group joiners of various levels in the future.
4) Top-left [?] // Agreed on by most (if not all)
5) Bottom-left [?] // Agreed on by most (if not all)
6) Top-Right [?] // Agreed on by most (if not all)
7) Bottom-Right [?] // Agreed on by most (if not all)
8) Scope operator [!] // A suggestion to be discussed

Maybe a quadrat separator [-] should be added in order to avoid issues, e.g. with multiple corner insertions following one another (but this should be tested of course), and because, after all, the basic unit of this writing system is the quadrat (not the sign, even if it happens that 1 sign = 1 quadrat).

These operators correspond to a very "flat" and linear syntax (without the "begin" and "end" markers that seem to be so problematic), which is a requirement from the Unicode specialists as I understand it.

If needed, I am ready to leave out of this "consensus" document:
- The join
- The insert-center
- The stack
- The empty

Not because I think we don't need them (I think they would be great to have), but (1) because I would like the discussion to focus on the 8/9 operators above, and (2) because these additions are easy to make in a second step and their absence will not impede short-term encoding much.

That's about it for me at the moment. I leave the floor to you, Bob and others. Let me stress that it's probably difficult for me/us to make more steps in the direction of a consensus. At this stage, everyone should probably try to understand others' points of view and accept that compromises are needed rather quickly...

As you can see in the summary above, the central point is the group/precedence operator. Then the "scope" operator and what you think about it. The rest should be easily agreed on. Unicode is not a one-time thing, as Bob said, sure, but one decision can affect all the future developments.
Take care,
Stéphane

------------------------------------------------------
Chercheur qualifié F.R.S.-FNRS
Université de Liège
Service d'égyptologie
Département des sciences de l'Antiquité
Place du 20-Août, B-4000 Liège
http://www.egypto.ulg.ac.be
------------------------------------------------------

From mn31 at st-andrews.ac.uk Thu Jul 28 12:00:08 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Thu, 28 Jul 2016 12:00:08 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References:
Message-ID: <15473721.ga2UxhmWhY@thuis>

Dear Stéphane,

> The issue of the reading order is a real one and probably not an easy one to solve; but it should be tackled and integrated in the discussions, I definitely agree.

As we discussed before, I have considered splitting RES 'insert' into two primitives, called 'insert' and 'inserted'. They differ only in the order of the 'big sign' and the 'inserted sign'. So insert(A,B) could mean 'insert B into A' and inserted(A,B) could mean 'A is inserted into B'. This would solve a long-standing problem in RES that the reading order is out of sync with the notational order. What you propose goes in the same direction.

There are advantages and disadvantages of course.

> ****** 1. Insertion with a flexible syntax and "orientation" operators *******
> [...]
> D [insert_bottom_left]< d
> t [insert_bottom_left]> A
>
> Possible, but probably not ideal given the problems that you would face for multiple insertions.

I think it is feasible. It would require the encoder to actually ask themselves what the reading order is.

> ***** 2. A single "scope operator" ****
>
> I mean here an operator that marks one sign as the scope of all the insertions around. Let's say we use "!" as a scope marker, then the examples above would become:
>
> D! [insert_bottom_left] d
> t [insert_bottom_left] A!
> w! [insert_bottom_left] t [insert_top_right] D54
>
> The graphemic order is explicit and the reading order is respected.

I don't think this works because what is inserted is a group in general, and then where do you place the '!' ? I think if you want to go down this route, it seems more attractive to introduce a separate tier of annotation, in which you place indices at each sign, enumerating reading order. This would be outside Unicode.

> This is why Serge suggested to deal with such case as , if I remember correctly, but not an ideal solution either of course.

I think we discarded this idea some weeks ago, but the general direction of thinking, that of trying to express the insertion-into-the-cobra differently from other insertions in the bottom-left corner, without introducing additional notational elements, might still be worth considering.

I had my doubts about using [:_and_kerning] for this purpose, as it requires 'magic' from the font and/or rendering engine, to know that the cobra is special, and it is not just pushing the two groups in the vertical arrangement together, but is actually inserting the whole second group inside the first group. So we would get two ways of expressing what is really insertion, but expressed in two very different ways.

One option might be to use 'insert-into-the-middle' for the cobra combination. In the current syntax, 'insert-into-the-middle' cannot be combined with some other insertion-into-a-corner, but this may not be needed. Because for 'insert-into-the-middle' the inserted sign comes after the main sign, the special case of the cobra would be solved, with respect to reading order.

I believe we have assumed implicitly that 'insert-into-the-middle' is only used for signs that more or less enclose some whitespace in the middle, whereas the cobra doesn't. We have to drop this assumption of course for the above to work.

By the way, our decision in Cambridge to replace 'insertion-into-top' and 'insert-into-bottom' by a suitable vertical grouping, plus 'kerning' (fitting, JOIN), and similarly for insertion into left and right, is not entirely unproblematic. Consider for example the two raised arms (kA). Inserted (from above?) might be a larger group, such as the bull's head with Z1*Z1*Z1 underneath. Trying to encode this with:
bullshead:Z1*Z1*Z1 [:_and_kerning] raisedarms
has the same problem as using kerning for the cobra combination, namely to require magic, to know to squeeze the three Z1 together and then to 'slide' the whole group into the raised arms, and to know that not only the three Z1 go into the cobra, but also the bull's head.

So how about we use 'insert-into-the-middle' for this as well? Does this then introduce new problems with reading order? Sometimes, as in Hm-kA, but in other cases it is fine as with t or nTr or nsw inside. We should accept that if we encode graphical order only (whatever that means for the 2 dimensional writing system) we cannot guarantee to also always perfectly capture reading order.
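The 'insert' / 'inserted' split mentioned earlier in this message can be sketched as two constructors that yield the same layout and differ only in encoded order. This is a hypothetical Python illustration, not RES syntax; the Insertion type and its field names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insertion:
    host: str            # the 'big sign'
    guest: str           # the 'inserted sign'
    encoded_order: tuple  # order in which the two signs appear in the encoding


def insert(big: str, small: str) -> Insertion:
    """insert(A, B): 'insert B into A'; the big sign is encoded first."""
    return Insertion(host=big, guest=small, encoded_order=(big, small))


def inserted(small: str, big: str) -> Insertion:
    """inserted(A, B): 'A is inserted into B'; the inserted sign is encoded first."""
    return Insertion(host=big, guest=small, encoded_order=(small, big))


if __name__ == "__main__":
    print(insert("D", "d"))    # Dd: cobra first, then hand
    print(inserted("t", "A"))  # tA: inserted sign first, then the bird
```

The layout (host and guest) is identical for both primitives; the encoder picks whichever one makes the encoded order match the reading order.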
Best regards,
Mark-Jan

From bobqq at live.co.uk Thu Jul 28 12:40:14 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Thu, 28 Jul 2016 12:40:14 +0100
Subject: [Egyptian] Two group joiners
In-Reply-To: <4429959.tRbiO4p2SD@thuis>
References: <95BE07BA-CE76-49E8-B6DE-304C925ABFD3@ulg.ac.be> <4429959.tRbiO4p2SD@thuis>
Message-ID:

Hi Mark Jan

I'll be replying to the very helpful points and suggestions from Stéphane later. Great you've been learning more about Unicode technicalities.

But I want to skip to your concluding remark right now because it's tremendously relevant to the overall discussion. You stated "My working assumption is that an input method editor would be used to edit hieroglyphic text in practice, so the actual syntax should not matter too much, as long as it is hidden from the user.".

NO, NO, Ten thousand times. NO.

Michel among others pointed out we need to work within conventions and techniques for Complex Scripts used in living languages. We cannot expect developers of general purpose software or web sites to spend time and money to support hieroglyphic. There is no need to presume or guess. Complex scripts are already supported in word processors and web browsers so we can test out usability. In fact it was only after experiments proved hieroglyphic was now feasible with current technology that I started this process.

FACT ----It is impossible to fully hide control characters from users in real life.----

Input methods can help. Specialist software can indeed shield the user from complexity and do wonderful things using Unicode. And believe me I'd far rather be spending my time on that right now!

I don't want to enter a pointless debate on what is simply a fact. So I'll try to find some time later today to put together an illustrated note on the topic so we can wrap it up.
Regards,
Bob

From mn31 at st-andrews.ac.uk Thu Jul 28 12:41:48 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Thu, 28 Jul 2016 12:41:48 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <15473721.ga2UxhmWhY@thuis>
References: <15473721.ga2UxhmWhY@thuis>
Message-ID: <5487135.A6Y7aQEzAa@thuis>

Apropos, the reason neither of the two versions of our proposal gave much consideration to reading order is that the signals we were receiving from the UTC, even before we produced the first version, seemed to go in the direction of giving precedence to graphical order. For example in:

Recommendations to UTC #147 May 2016 on Script Proposals
http://www.unicode.org/L2/L2016/16156-script-recs.pdf

we find:

"
For complex writing systems such as Egyptian hieroglyphs or Japanese, the visual order of characters should be separate from phonetic order. In complex writing systems, combining visual and phonetic order can result in visual ambiguity, where one reading could have two different visual sequences. One could, however, keep a separate field (or use mark-up) with the phonetic reading, which could be carried along with the data.
"

I interpret this to say reading ('phonetic') order is secondary to graphical order as far as Unicode is concerned.
But it also suggests there might be room for taking reading order into account, and not just in mark-up but also in the Unicode encoding itself. I can't be certain my interpretation of this is correct. We should ask the UTC for clarification next week.

Mark-Jan

From s.polis at ulg.ac.be Thu Jul 28 13:32:04 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Thu, 28 Jul 2016 14:32:04 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <15473721.ga2UxhmWhY@thuis>
References: <15473721.ga2UxhmWhY@thuis>
Message-ID:

Dear Mark-Jan,

>> The issue of the reading order is a real one and probably not an easy one to solve; but it should be tackled and integrated in the discussions, I definitely agree.
>
> As we discussed before, I have considered splitting RES 'insert' into two primitives, called 'insert' and 'inserted'.
> They differ only in the order of the 'big sign' and the 'inserted sign'. So insert(A,B) could mean 'insert B into A' and
> inserted(A,B) could mean 'A is inserted into B'. This would solve a long-standing problem in RES that
> the reading order is out of sync with the notational order. What you propose goes in the same direction.
>
> There are advantages and disadvantages of course.
>
>> ****** 1. Insertion with a flexible syntax and "orientation" operators *******
>> [...]
>> D [insert_bottom_left]< d
>> t [insert_bottom_left]> A
>>
>> Possible, but probably not ideal given the problems that you would face for multiple insertions.
>
> I think it is feasible. It would require the encoder to actually ask themselves what the reading order is.

Good, that's already something to consider then.

>> ***** 2. A single "scope operator" ****
>>
>> I mean here an operator that marks one sign as the scope of all the insertions around. Let's say we use "!" as a scope marker, then the examples above would become:
>>
>> D! [insert_bottom_left] d
>> t [insert_bottom_left] A!
>> w! [insert_bottom_left] t [insert_top_right] D54
>>
>> The graphemic order is explicit and the reading order is respected.
>
> I don't think this works because what is inserted is a group in general, and then where do you
> place the '!' ?

Sorry, I'm tired, slow, and not sure I understand? Could you give an example of what you mean here?

> I think if you want to go down this route, it seems more attractive to introduce
> a separate tier of annotation, in which you place indices at each sign, enumerating reading order.
> This would be outside Unicode.
>
>> This is why Serge suggested to deal with such case as , if I remember correctly, but not an ideal solution either of course.
>
> I think we discarded this idea some weeks ago, but the general direction of thinking, that of trying
> to express the insertion-into-the-cobra differently from other insertions in the bottom-left corner,
> without introducing additional notational elements, might still be worth considering.
>
> I had my doubts about using [:_and_kerning] for this purpose, as it requires 'magic'
> from the font and/or rendering engine, to know that the cobra is special, and it is not
> just pushing the two groups in the vertical arrangement together, but is actually
> inserting the whole second group inside the first group. So we would get two ways of expressing
> what is really insertion, but expressed in two very different ways.

Agreed.

> One option might be to use 'insert-into-the-middle' for the cobra combination. In the
> current syntax, 'insert-into-the-middle' cannot be combined with some other
> insertion-into-a-corner, but this may not be needed. Because for 'insert-into-the-middle'
> the inserted sign comes after the main sign, the special case of the cobra would be solved,
> with respect to reading order.

This could be a practical solution of course, but this would stretch quite a bit the definition of the "insert-into-the-middle"
(as you say below actually) for no good reason (except to solve a reading issue), and would lead one to use "insert-into-the-middle" in an unprincipled way. I mean, what would be the rationale for choosing between "bottom-left" and "middle"? The only one is a graphic one, and I'm not sure that a reading motivation is good. But an option to keep in mind of course.

> I believe we have assumed implicitly that 'insert-into-the-middle' is only used for signs
> that more or less enclose some whitespace in the middle, whereas the cobra doesn't.
> We have to drop this assumption of course for the above to work.
>
> By the way, our decision in Cambridge to replace 'insertion-into-top' and 'insert-into-bottom'
> by a suitable vertical grouping, plus 'kerning' (fitting, JOIN), and similarly for insertion into
> left and right, is not entirely unproblematic. Consider for example the two raised arms (kA).
> Inserted (from above?) might be a larger group, such as the bull's head with
> Z1*Z1*Z1 underneath. Trying to encode this with:
> bullshead:Z1*Z1*Z1 [:_and_kerning] raisedarms
> has the same problem as using kerning for the cobra combination, namely to require magic,
> to know to squeeze the three Z1 together and then to 'slide' the whole group into the raised
> arms, and to know that not only the three Z1 go into the cobra, but also the bull's head.
>
> So how about we use 'insert-into-the-middle' for this as well? Does this then introduce
> new problems with reading order? Sometimes, as in Hm-kA, but in other cases it is
> fine as with t or nTr or nsw inside. We should accept that if we encode graphical order only
> (whatever that means for the 2 dimensional writing system) we cannot guarantee to also
> always perfectly capture reading order.

Nope, that's for sure. For the cases that you mention, I wonder whether we should not collect more evidence and prepare a later addition with "insert-top" and "insert-bottom" based on this (for insert bottom, one can think of the combination of "pt (sky)" with several signs, etc.), rather than stretch the definition of "insert-middle". But this certainly remains an open debate.

As a general point: mixing insertion and kerning (or join) for a single phenomenon annoys me a bit to be honest. And I think this is also your view, right?

Best wishes,
Stéphane

From everson at evertype.com Thu Jul 28 14:12:09 2016
From: everson at evertype.com (Michael Everson)
Date: Thu, 28 Jul 2016 14:12:09 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <3E09BAFD-CF24-47FB-AECC-E90001DC7D07@evertype.com>
Message-ID: <56DE0DF7-6F74-4EDC-83EB-86653C343A9A@evertype.com>

On 27 Jul 2016, at 19:03, Bob Richmond wrote:
>
> I've no problem with some kind of 4 corner system in theory and spent some time over last two weeks trying to come up with a workable version that isn't insanely complicated. Incidentally nobody to date has actually said how they might like it to work so I've been on my own.

Instructions to the font developer: Put an element into one of the four quadrants. The syntax would indicate that Character A, when followed by a top-left or bottom-left control, will go on the top or bottom left of the next character, Character B. And if Character B is followed by a top-right or bottom-right control, then it is the next character, Character C, which will go on the top or bottom right of Character B.

> Apart from Mark-Jan et al (in http://www.unicode.org/L2/L2016/16210-egyptian-control.pdf) who propose that a "Cobra with group tucked inside" should be encoded with the group preceding the Cobra.
> For the common and simple example 'Dd' (Cobra containing hand) he proposes the 'd' hieroglyph comes before the 'D' in the code sequence. I'm actually quite shocked that several experienced Egyptologists want to do this.

The Cobra can be dealt with in one of two ways:

Hand + bottom-left + Cobra
or
Cobra + vertical-stack + Hand

The decision here is *conventional* and *arbitrary*. Either option will work. Both options should not be allowed. Egyptologists just decide whether to consider the cobra as a diagonal character like the Chick, or a horizontal character which just happens to have a tail. It doesn't matter which, so long as ONE choice is made.

That choice will be explained in the Unicode Technical Report which will eventually be one of the instruments which can provide font developers information on Egyptian.

Michael

From everson at evertype.com Thu Jul 28 14:15:51 2016
From: everson at evertype.com (Michael Everson)
Date: Thu, 28 Jul 2016 14:15:51 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <5487135.A6Y7aQEzAa@thuis>
References: <15473721.ga2UxhmWhY@thuis> <5487135.A6Y7aQEzAa@thuis>
Message-ID: <45D0121F-5FF6-4CF5-8DE7-AE828BCAE90B@evertype.com>

On 28 Jul 2016, at 12:41, Mark-Jan Nederhof wrote:

> "For complex writing systems such as Egyptian hieroglyphs or Japanese, the visual order of characters should be separate from phonetic order. In complex writing systems, combining visual and phonetic order can result in visual ambiguity, where one reading could have two different visual sequences. One could, however, keep a separate field (or use mark-up) with the phonetic reading, which could be carried along with the data."

OK

> I interpret this to say that reading ('phonetic') order is secondary to graphical order as far as Unicode is concerned. But it also suggests there might be room for taking reading order into account, and not just in mark-up but also in the Unicode encoding itself.
> I can't be certain my interpretation of this is correct. We should ask the UTC for clarification next week.

It depends on the script. Devanagari and Thai have similar structures though Thai doesn't do conjuncts the same way. Devanagari is encoded phonetically, Thai visually.

In my view it is graphical order which is the only practical way of encoding Egyptian.

Michael

From bobqq at live.co.uk Thu Jul 28 15:59:15 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Thu, 28 Jul 2016 15:59:15 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <56DE0DF7-6F74-4EDC-83EB-86653C343A9A@evertype.com>
References: <3E09BAFD-CF24-47FB-AECC-E90001DC7D07@evertype.com> <56DE0DF7-6F74-4EDC-83EB-86653C343A9A@evertype.com>
Message-ID:

For hieroglyphic font development, the coding for clusters doesn't bother me at all so long as I have suitable reference docs and data, whether a+G043+b means what we all know it means or whether 10 controls come into play. GSUB is good enough for basic fonts but I'm not going to be managing, hand arranging and scaling thousands of clusters and I surely hope others don't either. More elaborate approaches will use GPOS and other interesting layout devices, but that is too technical a topic to bore this general group with. It sure ain't like the ABC world. As you know better than anyone.

I am not arguing against layout features for Egyptology purposes, just saying it's the tiniest issue when making a hieroglyphic font with shaping.

True, we don't need to be total slaves to reading order (the Egyptians weren't) but if methods exist that preserve the regular sequence why be perverse?
Bob

From mn31 at st-andrews.ac.uk Thu Jul 28 16:03:20 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Thu, 28 Jul 2016 16:03:20 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <15473721.ga2UxhmWhY@thuis>
Message-ID: <3771665.YKeWPZMlE7@thuis>

Dear Stéphane,

> >> ***** 2. A single "scope operator" ****
> >>
> >> I mean here an operator that marks one sign as the scope of all the insertions around. Let's say we use "!" as a scope marker, then the examples above would become:
> >>
> >> D! [insert_bottom_left] d
> >> t [insert_bottom_left] A!
> >> w! [insert_bottom_left] t [insert_top_right] D54
> >>
> >> The graphemic order is explicit and the reading order is respected.
> >
> > I don't think this works because what is inserted is a group in general, and then where do you
> > place the '!' ?
>
> Sorry, I'm tired, slow, and not sure I understand? Could you give an example of what you mean here?

Sorry, I misread the first time around. The way I understand it now, it might be an unambiguous syntax, at least if we rule out groups with more than one core_group in which to insert. I have an uneasy feeling though about the interpretation of the [insert_bottom_left] depending on some mark somewhere else to the left or to the right. And would it be possible to have no marked element at all, and what would that mean? (Defaulting to the left-most or right-most element?)

> > One option might be to use 'insert-into-the-middle' for the cobra combination.
> > In the current syntax, 'insert-into-the-middle' cannot be combined with some other
> > insertion-into-a-corner, but this may not be needed. Because for 'insert-into-the-middle'
> > the inserted sign comes after the main sign, the special case of the cobra would be solved,
> > with respect to reading order.
>
> This could be a practical solution of course, but this would stretch quite a bit the definition of the "insert-into-the-middle" (as you say below actually) for no good reason (except to solve a reading issue), and would lead one to use "insert-into-the-middle" in an unprincipled way. I mean, what would be the rationale for choosing between "bottom-left" and "middle"? The only one is a graphic one, and I'm not sure that a reading motivation is good. But an option to keep in mind of course.

Yes, I see problems there. As Michael reminded us, the font designer needs instructions to know what to do. If the instructions are few, this is easier than if there are many instructions, with many special rules and exceptions. I also would hope to keep open the possibility of doing the rendering totally automatically, certainly outside the context of OpenType, in applications where general-purpose programming languages can be used. Having an encoding where every primitive has a simple procedural interpretation (or, as simple as possible) would therefore be a tremendous advantage.

And as you suggested, the encoder needs to know how to encode. Having many special cases makes it more difficult to decide.

There are trade-offs, as many times before.

> Nope, that's for sure. For the cases that you mention, I wonder whether we should not collect more evidence and prepare a later addition with "insert-top" and "insert-bottom" based on this (for insert bottom, one can think of the combination of "pt (sky)" with several signs, etc.), rather than stretch the definition of "insert-middle". But this certainly remains an open debate.

We could keep this open.
My working hypothesis is that where one would feel the need for insert-top or insert-bottom there is usually some case to be made for either vertical+kerning or insert-inside, which would avoid having more primitives. After having gone through a fair number of original inscriptions, I have seen few exceptions. But by all means we should keep looking and perhaps our views might change.

> As a general point: mixing insertion and kerning (or join) for a single phenomenon annoys me a bit to be honest. And I think this is also your view, right?

I would agree it is best to have sharp distinctions where sharp distinctions can be made. This said, we are dealing with an extraordinarily complex writing system. We need not be under the illusion there is only one correct way to encode a text, at least if we are working from an original inscription. (If we take JSesh as input, the choices have been made already.)

For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra. Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra.
So how should this then be encoded?
(d insert-left-corner cobra) :[fit] f
or
(d:f) insert-left-corner cobra
?
My view is that both should be allowable. [Modulo the notation, which might or might not be in sync with reading order.] Would you agree?

Best regards,
Mark-Jan

From bobqq at live.co.uk Fri Jul 29 14:35:16 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Fri, 29 Jul 2016 14:35:16 +0100
Subject: [Egyptian] Using hieroglyphs in a Word processor
Message-ID:

Hi All

Before we get to discuss what is needed to get consensus for UTC-related issues I thought it would be a good idea to get back on track about what this is all about.
So I drafted this note about one fundamental point: hieroglyphs in word processing. Maybe I'm wrong but I suspect this is one key part of what many Egyptologists want to be able to do. The PDF is attached.

I first did experiments along these lines back in 2015 and successful prototyping coloured my thoughts on how to approach the topic. So it's old news to me but possibly new to some of you. I had very little time available to dig into this before lunch in Cambridge!

Bob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: EgyptianInWPComplexScript.pdf
Type: application/pdf
Size: 729319 bytes

From bobqq at live.co.uk Fri Jul 29 19:02:47 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Fri, 29 Jul 2016 19:02:47 +0100
Subject: [Egyptian] Two group joiners
In-Reply-To: <8934E9F2-FB32-4790-827C-C66EA6F38524@ulg.ac.be>
References: <95BE07BA-CE76-49E8-B6DE-304C925ABFD3@ulg.ac.be> <4429959.tRbiO4p2SD@thuis> <8934E9F2-FB32-4790-827C-C66EA6F38524@ulg.ac.be>
Message-ID:

Hi Stéphane

Thank you for suggesting what you'd find useful as the basis for attempting to find a possible consensus. Let's see what can be done. If you've had time to read my earlier document about Word Processing I hope you now have a clearer idea of why I'm holding firm on complexity and the interests of Egyptologists in general.

As far as I can tell there is a consensus for

1) Vertical [:] // Already in MdC and in Bob's proposal
2) Horizontal [*] // Already in MdC and in Bob's proposal

So it would be helpful if further discussions started from this point. Does anyone disagree or can we take this and move on? [However interesting the evolving http://www.unicode.org/L2/L2016/16210-egyptian-control.pdf is as a theoretical model against which other concepts may be tested].
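
To make the two consensus joiners concrete, here is a minimal sketch in Python of how a single group written with only ':' (vertical) and '*' (horizontal) could be read. It assumes the Manuel de Codage convention that '*' binds more tightly than ':', and it assumes no brackets or other controls. The function name parse_group and the tuple representation are hypothetical and are not taken from any of the proposals.

```python
def parse_group(text):
    """Parse one group such as 'A*B:C' into a nested tuple.

    Assumptions (for illustration only): '*' binds tighter than ':',
    as in MdC, and the group contains no brackets or other operators.
    """
    rows = []
    for row in text.split(":"):
        signs = [sign.strip() for sign in row.split("*")]
        if any(not sign for sign in signs):
            raise ValueError(f"empty sign in {text!r}")
        # A row with one sign stays a bare sign; otherwise it is a
        # horizontal arrangement of its signs.
        rows.append(signs[0] if len(signs) == 1 else ("horizontal", *signs))
    # A group with one row stays as it is; otherwise rows are stacked.
    return rows[0] if len(rows) == 1 else ("vertical", *rows)


print(parse_group("A*B:C"))
```

Under these assumptions 'A*B:C' reads as A and B side by side, stacked above C. Anything beyond this, such as insertion into corners, is exactly where the proposals under discussion start to differ.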
I'd also like to add another potential consensus item:

0) Any representation of the Westcar Papyrus (as in the WP doc as a fairly typical hieratic transcription) should not be significantly more complex in a consensus system unless there is a compelling reason.

Again, does anyone disagree? If we can be settled on these points I can try to map something out on the other items.

On 4 corners etc. it's more complicated, partly because not all corners are equal. [I think Mark-Jan and myself might agree there's mileage in treating the Cobra-pattern distinctly for instance.] But let's come to that once we've a basis to move forward.

If we can be settled with points 0-2 I'm happy to try to map out what I think are the next steps (taking into account your other points). And great if anyone else wants to contribute their take.

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Stéphane polis
Sent: 28 July 2016 11:38
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] Two group joiners

Dear Mark-Jan,

Yeap, I do agree. As you can see, I'm simply trying to be imaginative and to think about the best possible consensus before the very close deadline (in order not to risk everything collapsing or going into unwarranted territory, because I do care).

As such, I understand perfectly the cost of getting rid of parentheses; my point is that, as we know that we need to make clusters (and there is not much room for discussion about this basic principle), the only alternative would be this "glue-precedence" operator, discussed in Section 9 of the document, that has not yet been discussed by others. It's one or the other, but there is logically no third option as far as I understand.

I would like to hear what you and others think about the "scope operator" for the insertion, but to sum up, a basic consensus document for control characters could in my view contain only 8 control characters (and we know that we could handle most [if not all] of the cases discussed so far with these characters, that the scheme will be coherent, that it will be easy to expand with other operators without destroying everything, etc.), namely:

1) Vertical [:] // Already in MdC and in Bob's proposal
2) Horizontal [*] // Already in MdC and in Bob's proposal
3) Groups (precedence operator) [.] // Needed for clusters (insertion, vertical, horizontal) and for avoiding the multiplication of group-joiners of various levels in the future.
4) Top-left [?] // Agreed on by most (if not all)
5) Bottom-left [?] // Agreed on by most (if not all)
6) Top-Right [?] // Agreed on by most (if not all)
7) Bottom-Right [?] // Agreed on by most (if not all)
8) Scope operator [!] // A suggestion to be discussed

Maybe a quadrat separator [-] should be added in order to avoid issues, e.g. with multiple corner insertions following one another (but this should be tested of course), and because after all, the basic unit of this writing system is the quadrat (not the sign, even if it happens that 1 sign = 1 quadrat).

These operators correspond to a very "flat" and linear syntax (without "begin" and "end" markers that seem to be so problematic), which is a requirement from the Unicode specialists as I understand it.

If needed, I am ready to leave out of this "consensus" document:
- The join
- The insert-center
- The stack
- The empty

Not because I think we don't need them (I think they would be great to have), but (1) because I would like the discussion to focus on the 8/9 operators above, (2) because these additions are easy to make in a second step and will not impede short-term encoding much.

That's about it for me at the moment. I leave the floor to you, Bob and others.
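
As a rough illustration of how the suggested scope operator could be interpreted mechanically, here is a minimal sketch in Python. It is one possible reading only, under stated assumptions: the input is a flat, whitespace-separated sequence that alternates signs and corner operators; exactly one sign carries the '!' marker and is taken as the base; every other sign goes into the corner named by the operator that stands between it and the base side of the sequence. The operator names [insert_top_left] and [insert_bottom_right] are assumed by analogy with the two names used in the examples, and the function name resolve_scope is hypothetical.

```python
# Corner operators in the bracketed ASCII notation used in this thread.
# Two of these names are assumed by analogy (see the lead-in).
CORNER_OPS = {
    "[insert_top_left]": "top_left",
    "[insert_bottom_left]": "bottom_left",
    "[insert_top_right]": "top_right",
    "[insert_bottom_right]": "bottom_right",
}


def resolve_scope(sequence):
    """Return (base_sign, {corner: inserted_sign}) for a flat sequence."""
    tokens = sequence.split()
    signs, ops = tokens[0::2], tokens[1::2]
    if len(tokens) % 2 == 0 or any(op not in CORNER_OPS for op in ops):
        raise ValueError("expected: sign (corner-operator sign)*")
    marked = [i for i, sign in enumerate(signs) if sign.endswith("!")]
    if len(marked) != 1:
        # The question of what an unmarked sequence should mean is left
        # open here; this sketch simply rejects it.
        raise ValueError("exactly one sign must carry the '!' marker")
    base = marked[0]
    inserted = {}
    for i, sign in enumerate(signs):
        if i == base:
            continue
        # Signs before the base use the operator that follows them;
        # signs after the base use the operator that precedes them.
        op = ops[i] if i < base else ops[i - 1]
        corner = CORNER_OPS[op]
        if corner in inserted:
            raise ValueError(f"corner {corner} used twice")
        inserted[corner] = sign
    return signs[base][:-1], inserted


for example in (
    "D! [insert_bottom_left] d",
    "t [insert_bottom_left] A!",
    "w! [insert_bottom_left] t [insert_top_right] D54",
):
    print(example, "->", resolve_scope(example))
```

The sketch only handles single signs; it says nothing about inserted groups, which is the harder case for any flat syntax of this kind.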
Let me stress that it's probably difficult for me/us to make more steps in the direction of a consensus. At this stage, everyone should probably try to understand the others' points of view and accept that compromises are needed rather quickly...

As you can see in the summary above, the central point is about the group/precedence operator. Then the "scope" operator and what you think about it. The rest should be easily agreed on. Unicode is not a one-time thing, as Bob said, sure, but one decision can affect all the future developments.

Take care,
Stéphane

------------------------------------------------------
Chercheur qualifié F.R.S.-FNRS
Université de Liège
Service d'égyptologie
Département des sciences de l'Antiquité
Place du 20-Août, B-4000 Liège
http://www.egypto.ulg.ac.be
------------------------------------------------------

> On 28 Jul 2016, at 11:32, Mark-Jan Nederhof wrote:
>
> Dear Stéphane,
>
> To support what you wrote on coherence: A long time ago we passed
> beyond the stage of knowing we need certain primitives. The main
> question is how to fit them together into a coherent scheme with a
> syntax that is consistent and unambiguous, without sacrificing (too much) of the power of these primitives and how they can be combined.
>
>> I understood that parentheses are very difficult in Unicode, but can't we think of a kind of "glue" operator (precedence, as in Mark-Jan et al.'s proposal, p. 8-9), e.g. with the dot, or , that would create groups by binding signs together. What would be the problem of such an approach (it was initially suggested by Serge, not my idea, I insist)? Can you explain to me why it would be problematic?
>
> With respect to the discussion in Section 9 of:
> http://www.unicode.org/L2/L2016/16210-egyptian-control.pdf
> the '.' above is a notational variant of the 'levels' of operator
> precedence, there achieved by having several copies of each primitive,
> for different levels. Here, for every '.'
> added behind an operator one goes up one level of operator precedence.
>
> Some trick using operator precedence, either using the '.' or using
> several control characters per primitive with different binding
> values, would be an alternative to having brackets. There is a
> relatively straightforward mechanical mapping from one notation to the
> other, so there should be no difference to the expressive power and I don't think that among us (TLA/Ramses/St Andrews) there is any firm commitment to brackets or otherwise.
>
> But as explained in Section 9, there are costs to abandoning brackets.
> Specifying the syntax with operator precedence formally would be a bit
> tricky, it might not be easy for the human to know when to use :... or
> *.. or insert_top_right. , and OpenType cannot really parse, as it is
> not a general-purpose programming language, and trying to implement
> about a dozen different binding values
> (4 primitives times 3 levels of operator precedence) might be pushing our luck.
> My working assumption is that an input method editor would be used to
> edit hieroglyphic text in practice, so the actual syntax should not matter too much, as long as it is hidden from the user.
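
The "mechanical mapping" between a dotted precedence notation and brackets can be sketched as follows, in Python. This is a minimal illustration under assumptions that are not settled in the thread: only ':' and '*' are handled; every trailing '.' raises the binding strength by one level (more dots bind tighter); at the same level '*' binds tighter than ':'; operators of equal binding associate to the left; tokens are separated by whitespace. The function names binding and to_brackets are hypothetical.

```python
import re

# Assumed base ranking at equal dot level: '*' binds tighter than ':'.
BASE_RANK = {":": 0, "*": 1}
OPERATOR = re.compile(r"([:*])(\.*)")


def binding(token):
    """Return the binding value (dots, rank) of an operator, else None."""
    match = OPERATOR.fullmatch(token)
    if match is None:
        return None
    return (len(match.group(2)), BASE_RANK[match.group(1)])


def to_brackets(sequence):
    """Rewrite a flat dotted sequence as a fully bracketed string."""
    tokens = sequence.split()
    pos = 0

    def parse(min_binding):
        nonlocal pos
        if pos >= len(tokens) or binding(tokens[pos]) is not None:
            raise ValueError("expected a sign")
        left = tokens[pos]
        pos += 1
        while pos < len(tokens):
            strength = binding(tokens[pos])
            if strength is None:
                raise ValueError("expected an operator")
            if strength < min_binding:
                break  # this operator belongs to an enclosing group
            primitive = tokens[pos][0]
            pos += 1
            # The right operand may only absorb strictly tighter operators,
            # which makes equal-strength operators associate to the left.
            right = parse((strength[0], strength[1] + 1))
            left = f"({left} {primitive} {right})"
        return left

    return parse((0, 0))


print(to_brackets("A : B * C"))
print(to_brackets("A :. B * C"))
```

With no dots the default ranking applies, so B and C are grouped first; adding one dot to ':' makes it bind tighter than the plain '*', so A and B are grouped first. A font would have to achieve the same effect without a general-purpose parser, which is the difficulty raised above.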
>
> Best regards,
> Mark-Jan

From odusseus at gmail.com Sat Jul 30 09:43:16 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 30 Jul 2016 10:43:16 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <45D0121F-5FF6-4CF5-8DE7-AE828BCAE90B@evertype.com>
References: <15473721.ga2UxhmWhY@thuis> <5487135.A6Y7aQEzAa@thuis> <45D0121F-5FF6-4CF5-8DE7-AE828BCAE90B@evertype.com>
Message-ID:

"It depends on the script. Devanagari and Thai have similar structures though Thai doesn't do conjuncts the same way. Devanagari is encoded phonetically, Thai visually"

Correct me if I am wrong, I don't know much about Thai:

As far as I know, Thai script has both isolated signs (like letters in the Latin alphabet) and signs combined together into groups/ligatures. In particular, some vowels are written as independent signs, while others are combined with consonants in graphic units that represent syllables.

Now, as far as I know, those signs that are displayed as independent signs are encoded visually. This means that some vowels that are displayed as independent glyphs before the consonants are encoded before the consonants, although they have to be read after them. The reason, I guess, is that these letters are perceived as independent glyphs, and therefore the fact that a vowel has to be inputted/displayed before the consonant, although it has to be read afterwards, can be interpreted just as an orthographic/spelling convention. As long as the signs are treated as independent, non-ligated glyphs it makes sense. I guess perhaps you can find similar examples also in other languages.
Just spelling conventions. At the same time, in Thais, however, as far as I know, everything combined into groups/ligatures, is encoded phonetically, not visually. Diacritics for tones or vowels that have to be displayed above the consonant and have to be read after it are encoded after the consonant, according to their phonetic reading (and not before it, as one could argue for a "top-bottom" visual encoding). is it correct? Or am I wrong? In devanagari everything is encoded phonetically because per se everything is combined into ligatures/groups. Actually, question for the Unicode guys of the group: is there in unicode any writing system that uses groups or ligatures (not isolated glyphs), and encode them visually, rather than phonetically? Or more in general, what is the proportion of writing systems using ligatures/groups that encode them visually, and what is the proportion of those encoding them phonetically? Just to understand where some of the methods suggested for Egyptian here would stand Best Marwan On Thu, Jul 28, 2016 at 3:15 PM, Michael Everson wrote: > On 28 Jul 2016, at 12:41, Mark-Jan Nederhof wrote: > > > "For complex writing systems such as Egyptian hieroglyphs or Japanese, > the visual order of characters should be separate from phonetic order. In > complex writing systems, combining visual and phonetic order can result in > visual ambiguity, where one reading could have two different visual > sequences. One could, however, keep a separate field (or use mark-up) with > the phonetic reading, which could be carried along with the data.? > > OK > > > I interpret this to say reading ('phonetic') order is secondary to > graphical order what Unicode is concerned. But it also suggests there might > be room for taking reading order into account, and not just in mark-up but > also into the Unicode encoding itself. I can't be certain my interpretation > of this is correct. We should ask the UTC for clarification next week. > > It depends on the script. 
Devanagari and Thai have similar structures > though Thai doesn?t do conjuncts the same way. Devanagari is encoded > phonetically, Thai visually. > > In my view it is graphical order which is the only practical way of > encoding Egyptian. > > Michael > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From odusseus at gmail.com Sat Jul 30 09:52:00 2016 From: odusseus at gmail.com (Marwan Kilani) Date: Sat, 30 Jul 2016 10:52:00 +0200 Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23) In-Reply-To: <3771665.YKeWPZMlE7@thuis> References: <15473721.ga2UxhmWhY@thuis> <3771665.YKeWPZMlE7@thuis> Message-ID: "For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra. Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra." Sorry, to go back to the same thing, but.. what is the practical need of encoding such a distinction? having the inside or outside the cobra has (I guess I should say "as far as I know") no meaning and no importance whatsoever. There is no linguistic, no semantic, nothing.. it is just a graphical variant, a calligraphic choice of the scribe. Unicode should be about standardized transcriptions, not about paleographic details. The important thing here is that the "f" comes after "D+d". That's all. Why should we encode such calligraphic variants in the first place? What is the utility of that? Marwan On Thu, Jul 28, 2016 at 5:03 PM, Mark-Jan Nederhof wrote: > Dear St?phane, > > > >> ***** 2. 
A single ?scope operator? **** > > >> > > >> I mean here an operator that marks one sign as the scope of all the > insertions around. Let?s say we use ?!? as a scope marker, then the > examples above would become: > > >> > > >> D! [insert_bottom_left] d > > >> t [insert_bottom_left] A! > > >> w! [insert_bottom_left] t [insert_top_right] D54 > > >> > > >> The graphemic order is explicit and the reading order is respected. > > > > > > I don't think this works because what is inserted is a group in > general, and then where do you > > > place the '!? ? > > > > Sorry, I?m tired, slow, and not sure I understand? Could you give an > example of what you mean here? > > Sorry, I misread the first time around. The way I understand it now, it > might be an unambiguous syntax, > at least if we rule out groups with more than one core_group in which to > insert. I have an uneasy feeling > though about the interpretation of the [insert_bottom_left] depending on > some mark somewhere else > to the left or to the right. And would it be possible to have no marked > element at all, and what would > that mean? (Defaulting to the left-most or right-most element?) > > > > One option might be to use 'insert-into-the-middle' for the cobra > combination. In the > > > current syntax, 'insert-into-the-middle' cannot be combined with some > other > > > insertion-into-a-corner, but this may not be needed. Because for > 'insert-into-the-middle' > > > the inserted sign comes after the main sign, the special case of the > cobra would be solved, > > > with respect to reading order. > > > > This could be a practical solution of course, but this would stretch > quite a bit the definition of the ?insert-into-the-middle? (as you say > below actually) for no good reason (except to solve a reading issue), and > would lead one to use ?insert-into-the-middle? in unprincipled way. I mean, > what would be the rationale for choosing ?bottom-left? and ?middle?? 
> > The only one is a graphic one, and I'm not sure that a reading motivation is good. But an option to keep in mind of course.
>
> Yes, I see problems there. As Michael reminded us, the font designer needs instructions to know what to do. If the instructions are few, this is easier than if there are many instructions, with many special rules and exceptions. I also would hope to keep open the possibility of doing the rendering totally automatically, certainly outside the context of OpenType, in applications where general-purpose programming languages can be used. Having an encoding where every primitive has a simple procedural interpretation (or, as simple as possible) would therefore be a tremendous advantage.
>
> And as you suggested, the encoder needs to know how to encode. Having many special cases makes it more difficult to decide.
>
> There are trade-offs, as many times before.
>
> > Nope, that's for sure. For the cases that you mention, I wonder whether we should not collect more evidence and prepare a later addition with "insert-top" and "insert-bottom" based on this (for insert bottom, one can think of the combination of "pt (sky)" with several signs, etc.), rather than to stretch the definition of "insert-middle". But this remains certainly an open debate.
>
> We could keep this open. My working hypothesis is that where one would feel the need for insert-top or insert-bottom there is usually some case to be made for either vertical+kerning or insert-inside, which would avoid having more primitives. After having gone through a fair number of original inscriptions, I have seen few exceptions. But by all means we should keep looking and perhaps our views might change.
>
> > As a general point: mixing insertion and kerning (or join) for a single phenomenon annoys me a bit to be honest. And I think this is also your view, right?
>
> I would agree it is best to have sharp distinctions where sharp distinctions can be made. This said, we are dealing with an extraordinarily complex writing system. We need not be under the illusion there is only one correct way to encode a text, at least if we are working from an original inscription. (If we take JSesh as input, the choices have been made already.)
>
> For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra. Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra.
> So how should this then be encoded?
> (d insert-left-corner cobra) :[fit] f
> or
> (d:f) insert-left-corner cobra
> ?
> My view is that both should be allowable. [Modulo the notation, which might or might not be in sync with reading order.] Would you agree?
>
> Best regards,
> Mark-Jan

From odusseus at gmail.com Sat Jul 30 10:59:33 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 30 Jul 2016 11:59:33 +0200
Subject: [Egyptian] Using hieroglyphs in a Word processor
In-Reply-To:
References:
Message-ID:

Hello Bob,

thank you for that. Your document is very interesting, but at the same time it highlights a problem with control characters that I pointed out a while ago, but which has still not been addressed or answered, as far as I know.
The point is: as already said, whether you use control characters or ligatures, groups will be displayed correctly *only* if they are present, as precomposed glyphs, in the font. Control characters do not make things up on their own.

Now, the problem I see is: what happens if the group you want to display is not present in the font?

With ligatures embedded in the font, essentially nothing happens: the single characters would be displayed just one after the other without being grouped together (or they would be grouped together in simpler groups, if some simpler group combining at least some of the signs is present in the font). It would not be visually too fancy, but it would still be readable and would still be overall OK as text.

On the other hand, your file shows very well what would happen in such cases with control characters: the result would be broken control characters popping up here and there between your actual hieroglyphs. Essentially, the editor would not display only the proper hieroglyphic signs; it would display the hieroglyphic signs *AND* the control characters, which will appear as unrecognized/unprocessed glyphs. The result will be those sequences of hieroglyphs + control characters that you highlighted in yellow in your proposal.

I think this is a huge problem. Leaving aside secondary issues, like the fact that such broken sequences would be ugly (much uglier than just ungrouped hieroglyphs) and that no journal editor will ever agree to publish a text with broken control characters, the huge problem I see is that such sequences of unprocessed signs + control characters would essentially make the text unreadable and thus useless.

The examples appearing in your paper are relatively simple, and therefore it is relatively easy to visually focus on the hieroglyphic signs, to ignore the unprocessed control characters and just to read through them. It's OK.
But one can also assume that those simple groups will be quickly integrated into the font. What about complex groups (which I think are the main reason people here want to use control characters in the first place)? What about Ramesside groups?

If your Ramesside group is not present as a precomposed glyph in your font, your text processor will not display it correctly. Whether you use ligatures or control characters, it won't display correctly.

Now, if you use font-embedded ligatures, this won't be a big problem, because the unrecognized Ramesside group will just be displayed as a plain sequence of hieroglyphs, which perhaps could be reorganized into simpler groups (that are more readable than plain hieroglyphs). It won't be as nice as the proper desired group, but it will work and the resulting text will still be readable and usable.

But what about control characters? What will be displayed is just the raw sequence of hieroglyphs *AND* unprocessed control characters. That means that users will see just a string of many unprocessed/unrecognized control characters with some hieroglyphs scattered among them (and if you consider that the proposal by Stéphane and others requires 10 control characters to code a group of just 3 signs like Htp, you can imagine how many control characters you will need to code even just a "simple" Ramesside group). The result would be graphically unreadable, and probably completely undecipherable for the large majority of Egyptologists, who won't know how to read through, mentally "parse" and mentally "compose" the control characters (which again will be very numerous, nested one into the other, etc.).
If you add that some of the control characters you (plural, not necessarily you, Bob) are suggesting require modifying the input order of the signs, you would just end up with a completely broken text, not only with tens (literally tens) of unprocessed control characters scattered everywhere, but also with proper hieroglyphic signs displayed in a totally random and totally wrong order (like having the hand-d displayed before the cobra-D in the group Dd, with a broken control character between the two). It would be next to impossible to easily make any sense of such a text.

Now, in my opinion this is a huge problem, because it is a problem that is very likely to occur. Consider: it will occur if your font does not have a precomposed glyph for the group you want to display. It will also occur (and this is the important part) if your font has the precomposed glyph, but the text editor (whatever it is) is not able to process your control characters. And this is something that is *very likely* to happen, especially if you are considering introducing "exotic" control characters that have no parallel in other already encoded scripts. Because let's be honest: very mainstream text processors such as Office and OpenOffice, or browsers like Safari, still have problems processing widely used control characters and complex scripts, and Egyptian definitely does not have the commercial weight for us to hope for a fast and bug-free implementation of exotic, hieroglyph-specific control characters.

This means that until text processors and browsers have implemented the recognition of control characters, you will have a very high probability of displaying broken text with unprocessed control characters that no one will be able to use in any way, because it will be just crazy to read and to make sense of it (or even almost impossible, if you include control characters that mess up the sequence of the hieroglyphs themselves).
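
The two fallback behaviours contrasted above can be illustrated with a minimal sketch. It assumes a toy model in which encoded text is a list of tokens: Gardiner sign codes (I10 cobra, D46 hand, I9 viper) and bracketed control tokens. The control names, the function names and the use of a box for an unprocessed control are all hypothetical, chosen only to mirror the notation used in this thread; none of this is an encoded Unicode character or an existing API.

```python
# Hypothetical control tokens, loosely modelled on the notation used in this
# thread. They are illustrative names only, not encoded Unicode characters.
CONTROLS = {"[vertical]", "[horizontal]", "[insert_bottom_left]", "[insert_top_right]"}


def is_control(token: str) -> bool:
    """Return True if the token is one of the assumed control tokens."""
    return token in CONTROLS


def unprocessed_view(tokens: list[str]) -> str:
    """What software without control support might show: every token is
    displayed, and each control appears as a visible 'unknown glyph' box."""
    return " ".join("\u25a1" if is_control(t) else t for t in tokens)


def plain_fallback(tokens: list[str]) -> str:
    """Graceful degradation: drop the controls, keep the signs in stored order."""
    return " ".join(t for t in tokens if not is_control(t))


# Dd=f stored in graphical order (assumed): the hand is named first and
# inserted into the cobra, with the viper stacked below.
graphical_order = ["D46", "[insert_bottom_left]", "I10", "[vertical]", "I9"]
# The same group stored in reading order (assumed): cobra, hand, viper.
reading_order = ["I10", "[insert_bottom_left]", "D46", "[vertical]", "I9"]

print(unprocessed_view(graphical_order))  # D46 □ I10 □ I9
print(plain_fallback(graphical_order))    # D46 I10 I9  (hand before cobra)
print(plain_fallback(reading_order))      # I10 D46 I9  (reading order D, d, f)
```

Under these assumptions, a plain fallback is only readable when the stored order of the signs is also the reading order; with an order-changing control, dropping the controls still leaves the hand before the cobra.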
So my question is very simple:

How are you (all) planning to deal with such a scenario?

In how many practical cases would it be more useful to have a broken string with visible control characters, rather than a readable plain sequence of signs, possibly recomposed into simple ligatures? You are right, you will lose the information about the original precise spatial organization of the signs, but again, in *practice*, in how many cases will preserving exact information about the precise arrangement be more important than just displaying the text in a readable way?

And note: I am seriously asking these questions, as I think they are important questions that require precise answers.

Best
Marwan

On Fri, Jul 29, 2016 at 3:35 PM, Bob Richmond wrote:

> Hi All
>
> Before we get to discuss what is needed to get consensus for UTC-related issues I thought it would be a good idea to get back on track about what this is all about.
>
> So I drafted this note about one fundamental point - hieroglyphic in Word processing. Maybe I'm wrong but I suspect this is one key part of what many Egyptologists want to be able to do. PDF is attached.
>
> I first did experiments along these lines back in 2015 and successful prototyping coloured my thoughts on how to approach the topic. So it's old news to me but possibly new to some of you - I'd very little time available to dig into this before lunch in Cambridge!
>
> Bob

From mn31 at st-andrews.ac.uk Sat Jul 30 11:18:32 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 30 Jul 2016 11:18:32 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <3771665.YKeWPZMlE7@thuis>
Message-ID: <6717832.Pdsaq0kWUC@thuis>

On Saturday 30 Jul 2016 10:52:00 Marwan Kilani wrote:
> "For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra. Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra."
>
> Sorry to go back to the same thing, but what is the practical need for encoding such a distinction?
>
> Having them inside or outside the cobra has (I guess I should say "as far as I know") no meaning and no importance whatsoever. There is no linguistic or semantic difference, nothing.
>
> It is just a graphical variant, a calligraphic choice of the scribe.
>
> Unicode should be about standardized transcriptions, not about paleographic details. The important thing here is that the "f" comes after "D+d". That's all.
> Why should we encode such calligraphic variants in the first place?
>
> What is the utility of that?

I must have been spectacularly unclear.

If one accepts that an encoding should contain primitives that describe the approximate spatial arrangement of signs, then it is inevitable there will be boundary cases where it is unclear how to encode some text. That holds for a system with 20 control characters as well as for a system with 3 control characters.

Of course, if one does not accept that an encoding should contain primitives that describe the spatial arrangement of signs, then we're back to square one.
I suggest you reread Stéphane's messages on the subject, who motivated time and time again why at least one prominent potential user community most certainly needs to have access to the graphical realisation of a text. Stéphane explained this with extraordinary detail and above all patience. I don't see any need to restart this. There are diminishing returns for repeating the same discussions ad infinitum.

Mark-Jan

From odusseus at gmail.com Sat Jul 30 11:46:12 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 30 Jul 2016 12:46:12 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <6717832.Pdsaq0kWUC@thuis>
References: <3771665.YKeWPZMlE7@thuis> <6717832.Pdsaq0kWUC@thuis>
Message-ID:

We are back to square one because you (and I say "you" as you are the ones putting forward the proposal) have not approached one of the basic questions yet:

what is a spatially relevant feature that needs to be encoded/displayed, and what is not?

Stéphane gave some scattered examples, but there is no definition yet; no general coherent framework about *how Egyptian works spatially*, and therefore about what is meaningful and what is not, has been suggested. Only random examples taken here and there (some even borderline cases, like your Dd=f, where you are essentially discussing an issue that *does not exist* because the difference between the two is not meaningful in any way, as far as I know) have been pointed out. Or at least this is what I have seen.

What you describe is a calligraphic variant: is it important to be able to represent calligraphic variants or not?

This is an *important* question that *has to be tackled* because it has consequences for all the subsequent steps.
Just to say: if you want to be able to display calligraphic variants, then you have to include corresponding precomposed glyphs in your font (because control characters don't make up anything on their own), with all the problems this will generate (and besides the fact that Unicode is not for calligraphic variants).
If instead you don't want to do that, then you have to make explicit that calligraphic variants *are not* meant to be handled with Unicode, and this can have consequences for the features you need in your encoding system, and it also brings up the next questions: if calligraphic variants are not meant to be encoded, then 1) what is a calligraphic variant and 2) what else could be left out?

This is an important question, as important as understanding what is a real glyph that should have an independent Unicode slot, and what instead is just a graphic variant.

It is the same principle that made people at the workshop argue against encoding signs into the Unicode set "at random": you first need to know what is a really meaningful variant, and what instead is an allograph, to have a meaningful set of hieroglyphs.

With spatial distribution it should be the same: what is meaningful and what is not.

What is the general theoretical framework we are working in? If any?

Essentially, what (in absolute terms, in concepts, not in random examples) are we trying to encode and what are we not trying to encode?
And possibly why (meaning-wise, quantity-wise, etc.)?

And this holds true whether you want to use ligatures, control characters, or whatever else.

Marwan

On Sat, Jul 30, 2016 at 12:18 PM, Mark-Jan Nederhof wrote:

> On Saturday 30 Jul 2016 10:52:00 Marwan Kilani wrote:
> > "For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra.
> > Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra."
> >
> > Sorry to go back to the same thing, but what is the practical need for encoding such a distinction?
> >
> > Having them inside or outside the cobra has (I guess I should say "as far as I know") no meaning and no importance whatsoever. There is no linguistic or semantic difference, nothing.
> >
> > It is just a graphical variant, a calligraphic choice of the scribe.
> >
> > Unicode should be about standardized transcriptions, not about paleographic details. The important thing here is that the "f" comes after "D+d". That's all.
> > Why should we encode such calligraphic variants in the first place?
> >
> > What is the utility of that?
>
> I must have been spectacularly unclear.
>
> If one accepts that an encoding should contain primitives that describe the approximate spatial arrangement of signs, then it is inevitable there will be boundary cases where it is unclear how to encode some text. That holds for a system with 20 control characters as well as for a system with 3 control characters.
>
> Of course, if one does not accept that an encoding should contain primitives that describe the spatial arrangement of signs, then we're back to square one. I suggest you reread Stéphane's messages on the subject, who motivated time and time again why at least one prominent potential user community most certainly needs to have access to the graphical realisation of a text. Stéphane explained this with extraordinary detail and above all patience. I don't see any need to restart this. There are diminishing returns for repeating the same discussions ad infinitum.
>
> Mark-Jan

From mn31 at st-andrews.ac.uk Sat Jul 30 12:28:24 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 30 Jul 2016 12:28:24 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <6717832.Pdsaq0kWUC@thuis>
Message-ID: <2535989.LuAgqhuLG9@thuis>

There are motivations for the particular choice of primitives we have now, which was the result of half a year of concentrated study by the folks from TLA, Ramses and St Andrews, with constructive criticism from some participants in Cambridge, and some further ideas discussed on this email list.

Is the current collection of primitives and the way they are to be used set in stone? No.

Do we have a single easy one-line answer why we need this set of primitives and no other? No.

Could the theoretical and empirical support in favour of our set of primitives, or any other, ever be exhaustively explored? No.

Does that mean our set of primitives is completely arbitrary and baseless? If you think that, you have not read the text of our proposal and have not followed the discussions on the list.

For the rest you are contributing no new thoughts. Life is too short to have to repeat the same discussions over and over and over again. I hope you excuse me but this is for me my last response on this thread.

Mark-Jan

On Saturday 30 Jul 2016 12:46:12 Marwan Kilani wrote:
> We are back to square one because you (and I say "you" as you are the ones putting forward the proposal) have not approached one of the basic questions yet:
>
> what is a spatially relevant feature that needs to be encoded/displayed, and what is not?
>
> Stéphane gave some scattered examples, but there is no definition yet; no general coherent framework about *how Egyptian works spatially*, and therefore about what is meaningful and what is not, has been suggested. Only random examples taken here and there (some even borderline cases, like your Dd=f, where you are essentially discussing an issue that *does not exist* because the difference between the two is not meaningful in any way, as far as I know) have been pointed out. Or at least this is what I have seen.
>
> What you describe is a calligraphic variant: is it important to be able to represent calligraphic variants or not?
>
> This is an *important* question that *has to be tackled* because it has consequences for all the subsequent steps.
>
> Just to say: if you want to be able to display calligraphic variants, then you have to include corresponding precomposed glyphs in your font (because control characters don't make up anything on their own), with all the problems this will generate (and besides the fact that Unicode is not for calligraphic variants).
> If instead you don't want to do that, then you have to make explicit that calligraphic variants *are not* meant to be handled with Unicode, and this can have consequences for the features you need in your encoding system, and it also brings up the next questions: if calligraphic variants are not meant to be encoded, then 1) what is a calligraphic variant and 2) what else could be left out?
>
> This is an important question, as important as understanding what is a real glyph that should have an independent Unicode slot, and what instead is just a graphic variant.
>
> It is the same principle that made people at the workshop argue against encoding signs into the Unicode set "at random": you first need to know what is a really meaningful variant, and what instead is an allograph, to have a meaningful set of hieroglyphs.
>
> With spatial distribution it should be the same: what is meaningful and what is not.
>
> What is the general theoretical framework we are working in? If any?
>
> Essentially, what (in absolute terms, in concepts, not in random examples) are we trying to encode and what are we not trying to encode?
> And possibly why (meaning-wise, quantity-wise, etc.)?
>
> And this holds true whether you want to use ligatures, control characters, or whatever else.
>
> Marwan
>
> On Sat, Jul 30, 2016 at 12:18 PM, Mark-Jan Nederhof wrote:
>
> > On Saturday 30 Jul 2016 10:52:00 Marwan Kilani wrote:
> > > "For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra. Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra."
> > >
> > > Sorry to go back to the same thing, but what is the practical need for encoding such a distinction?
> > >
> > > Having them inside or outside the cobra has (I guess I should say "as far as I know") no meaning and no importance whatsoever. There is no linguistic or semantic difference, nothing.
> > >
> > > It is just a graphical variant, a calligraphic choice of the scribe.
> > >
> > > Unicode should be about standardized transcriptions, not about paleographic details. The important thing here is that the "f" comes after "D+d". That's all.
> > > Why should we encode such calligraphic variants in the first place?
> > >
> > > What is the utility of that?
> >
> > I must have been spectacularly unclear.
> >
> > If one accepts that an encoding should contain primitives that describe the approximate spatial arrangement of signs, then it is inevitable there will be boundary cases where it is unclear how to encode some text. That holds for a system with 20 control characters as well as for a system with 3 control characters.
> >
> > Of course, if one does not accept that an encoding should contain primitives that describe the spatial arrangement of signs, then we're back to square one. I suggest you reread Stéphane's messages on the subject, who motivated time and time again why at least one prominent potential user community most certainly needs to have access to the graphical realisation of a text. Stéphane explained this with extraordinary detail and above all patience. I don't see any need to restart this. There are diminishing returns for repeating the same discussions ad infinitum.
> >
> > Mark-Jan

From odusseus at gmail.com Sat Jul 30 13:13:03 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 30 Jul 2016 14:13:03 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <6717832.Pdsaq0kWUC@thuis> <2535989.LuAgqhuLG9@thuis>
Message-ID:

I asked a very simple question, which was not about primitives, nor about the need for primitives or anything like that.

You wrote a nice email with lots of details, but you didn't reply to that very simple question:

In principle, should it be possible and useful to encode each possible *calligraphic* variant separately, or not?

Is that one of the things Unicode should be about or not?
And this is not a question I am asking you in particular, but everyone, Egyptologists and non-Egyptologists alike. And I think you should be interested in the answers as well.

On Jul 30, 2016 1:30 PM, "Mark-Jan Nederhof" wrote:

There are motivations for the particular choice of primitives we have now, which was the result of half a year of concentrated study by the folks from TLA, Ramses and St Andrews, with constructive criticism from some participants in Cambridge, and some further ideas discussed on this email list.

Is the current collection of primitives and the way they are to be used set in stone? No.

Do we have a single easy one-line answer why we need this set of primitives and no other? No.

Could the theoretical and empirical support in favour of our set of primitives, or any other, ever be exhaustively explored? No.

Does that mean our set of primitives is completely arbitrary and baseless? If you think that, you have not read the text of our proposal and have not followed the discussions on the list.

For the rest you are contributing no new thoughts. Life is too short to have to repeat the same discussions over and over and over again. I hope you excuse me but this is for me my last response on this thread.

Mark-Jan

On Saturday 30 Jul 2016 12:46:12 Marwan Kilani wrote:
> We are back to square one because you (and I say "you" as you are the ones putting forward the proposal) have not approached one of the basic questions yet:
>
> what is a spatially relevant feature that needs to be encoded/displayed, and what is not?
>
> Stéphane gave some scattered examples, but there is no definition yet; no general coherent framework about *how Egyptian works spatially*, and therefore about what is meaningful and what is not, has been suggested.
> Only random examples taken here and there (some even borderline cases, like your Dd=f, where you are essentially discussing an issue that *does not exist* because the difference between the two is not meaningful in any way, as far as I know) have been pointed out. Or at least this is what I have seen.
>
> What you describe is a calligraphic variant: is it important to be able to represent calligraphic variants or not?
>
> This is an *important* question that *has to be tackled* because it has consequences for all the subsequent steps.
>
> Just to say: if you want to be able to display calligraphic variants, then you have to include corresponding precomposed glyphs in your font (because control characters don't make up anything on their own), with all the problems this will generate (and besides the fact that Unicode is not for calligraphic variants).
> If instead you don't want to do that, then you have to make explicit that calligraphic variants *are not* meant to be handled with Unicode, and this can have consequences for the features you need in your encoding system, and it also brings up the next questions: if calligraphic variants are not meant to be encoded, then 1) what is a calligraphic variant and 2) what else could be left out?
>
> This is an important question, as important as understanding what is a real glyph that should have an independent Unicode slot, and what instead is just a graphic variant.
>
> It is the same principle that made people at the workshop argue against encoding signs into the Unicode set "at random": you first need to know what is a really meaningful variant, and what instead is an allograph, to have a meaningful set of hieroglyphs.
>
> With spatial distribution it should be the same: what is meaningful and what is not.
>
> What is the general theoretical framework we are working in? If any?
>
> Essentially, what (in absolute terms, in concepts, not in random examples) are we trying to encode and what are we not trying to encode?
> And possibly why (meaning-wise, quantity-wise, etc.)?
>
> And this holds true whether you want to use ligatures, control characters, or whatever else.
>
> Marwan
>
> On Sat, Jul 30, 2016 at 12:18 PM, Mark-Jan Nederhof wrote:
>
> > On Saturday 30 Jul 2016 10:52:00 Marwan Kilani wrote:
> > > "For example take Dd=f, with cobra + hand + viper. Sometimes you see hand:viper clearly entirely inside the bounding box of the cobra. Sometimes you see the hand in the cobra, while the viper is entirely below, with the tail of the viper extending below the tip of the tail of the cobra. Sometimes, the head of the viper is inside the bounding box of the cobra, while the tail of the viper extends below the tail of the cobra."
> > >
> > > Sorry to go back to the same thing, but what is the practical need for encoding such a distinction?
> > >
> > > Having them inside or outside the cobra has (I guess I should say "as far as I know") no meaning and no importance whatsoever. There is no linguistic or semantic difference, nothing.
> > >
> > > It is just a graphical variant, a calligraphic choice of the scribe.
> > >
> > > Unicode should be about standardized transcriptions, not about paleographic details. The important thing here is that the "f" comes after "D+d". That's all.
> > > Why should we encode such calligraphic variants in the first place?
> > >
> > > What is the utility of that?
> >
> > I must have been spectacularly unclear.
> >
> > If one accepts that an encoding should contain primitives that describe the approximate spatial arrangement of signs, then it is inevitable there will be boundary cases where it is unclear how to encode some text.
> > That holds for a system with 20 control characters as well as for a system with 3 control characters.
> >
> > Of course, if one does not accept that an encoding should contain primitives that describe the spatial arrangement of signs, then we're back to square one. I suggest you reread Stéphane's messages on the subject, who motivated time and time again why at least one prominent potential user community most certainly needs to have access to the graphical realisation of a text. Stéphane explained this with extraordinary detail and above all patience. I don't see any need to restart this. There are diminishing returns for repeating the same discussions ad infinitum.
> >
> > Mark-Jan

From everson at evertype.com Sat Jul 30 13:22:21 2016
From: everson at evertype.com (Michael Everson)
Date: Sat, 30 Jul 2016 13:22:21 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References: <6717832.Pdsaq0kWUC@thuis> <2535989.LuAgqhuLG9@thuis>
Message-ID: <3CBB8161-1005-4246-BDC4-B134AB829CCF@evertype.com>

Palaeographic reproduction is out of scope for Unicode. The precision needed for that is already available in Illustrator and such. Unicode encoding permits searchable and interchangeable text. A certain normalization is expected, I think.

I really think we should start out with a robust mechanism for representing text, whether paragraphs of Egyptian text on its own or inline citations within paragraphs in English, German, French, etc. Can we possibly focus on normalized Egyptian in LTR order?

Earlier I said:

The Cobra can be dealt with in one of two ways:

Hand + bottom-left + Cobra

or

Cobra + vertical-stack + Hand

The decision here is *conventional* and *arbitrary*. Either option will work. The two options should not both be allowed. Egyptologists just decide whether to consider the cobra as a diagonal character like the Chick, or a horizontal character which just happens to have a tail. It doesn't matter which, so long as ONE choice is made. That choice will be explained in the Unicode Technical Report which will eventually be one of the instruments which can provide font developers information on Egyptian.

Comment please.

Michael

From odusseus at gmail.com Sat Jul 30 13:33:29 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 30 Jul 2016 14:33:29 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <3CBB8161-1005-4246-BDC4-B134AB829CCF@evertype.com>
References: <6717832.Pdsaq0kWUC@thuis> <2535989.LuAgqhuLG9@thuis> <3CBB8161-1005-4246-BDC4-B134AB829CCF@evertype.com>
Message-ID:

I agree with you, Michael. One of the two has to be conventionally chosen (the most frequent one?); the other should not be allowed. This is a problem that will emerge with a lot of groups if we don't first clarify what is common and what is rare, what is meaningful and what is just a (calli)graphic variant.

From odusseus at gmail.com Sat Jul 30 14:21:58 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 30 Jul 2016 15:21:58 +0200
Subject: [Egyptian] another approach, and possibly a compromise?
Message-ID:

May I dare to suggest another approach to the problem, a compromise solution that could be both easy and efficient and that could satisfy more or less everyone?

Let's start from the assumption that at some point we will need a common archive or database listing all the attested groups.
This is something that *will have to be done* at some point, whether you want to use 0 or 100 control characters, because groups will need to be inserted as precomposed glyphs into the fonts to be displayed correctly, and the people who design the fonts will need a list of the groups they have to draw. This is a must: such a reference list must be created at some point, whatever solution for grouping is decided. I think it should not be too difficult to set up something like this: one could think of an online reference database, where a unified and generally recognized reference list of all the attested groups would be freely available. The list will give information about the individual signs composing each group, and about their spatial organization within the group itself. This can be done by just writing, for each group, the list of signs composing it with the relevant MdC operators. Groups, or sequences of text, could also be searched on the basis of their spatial organization.

Now, as for Unicode: yes to control characters, but what about deciding that:

a) sequences of signs that can be combined into only one attested group (as with many Ramesside groups) will be rendered *only* with plain ligatures, without control characters.

The advantages would be:

- 1) complex groups could be rendered without the need for recursion or complex nested control characters, because it is logical to assume that the more complex a group is, the less likely it is that its component signs will appear in more than one spatial organization. So they can be dealt with by plain ligatures without the risk of ambiguity.

- 2) Because of the point above, this will probably solve the problem of nesting control characters, or at least greatly reduce it. I doubt that there are many sequences of signs with nesting features that can be combined in multiple different groups.

- 3) If for some reason the group is not rendered correctly by the browser/text editor, it would not be a problem, because the signs composing the group will be displayed ungrouped, as plain text, without broken control characters among them. The text would still be readable and usable.

- 4) The syntax of the signs will not be affected by control characters, because control characters will not be involved. The ligatures will be triggered just by inputting the signs in their correct reading order.

Note that there would not be any loss of information, because these groups would be unique spatial compositions, i.e. the spatial organization of the signs within the group will be implicit in the ligature itself (which could be named with the string of signs + MdC operators describing the group), and will be available in the database mentioned above. As long as the database is used as a reference for the creation of fonts, the ligature needed for the group will be present and therefore the group itself will be displayed correctly.

b) sequences of signs that can be combined into more than one different group will be rendered differently. In particular, one group, the most common (or the one with the most complex organization?), will be rendered with a plain ligature. Any other group will be rendered with control characters (whatever control characters you prefer).

So for instance, take the sequence:

owl G17 + arm D36

The basic group, rendered with the plain ligature, could be the owl above the arm, which will be created automatically by inputting owl + arm. Then there could be a secondary group, the arm across the owl (I know this is already encoded as a distinct character; this is just for the sake of the example), which will be created with a control character, e.g. by inputting owl + control character + arm.

The advantages would be:

- 5) even if a text editor/browser is not able to render control characters correctly, there will always be one basic group with a plain ligature that can be rendered (whether as a group or as a sequence of signs) without broken control characters popping up here and there.

- 6) you can have as many variants as you want, and therefore the precision of the spatial distribution will be preserved, because the less common spatial organizations (i.e. the less common groups) will be distinguished from the basic one through the use of control characters. If you don't need spatial precision, you can just use the plain basic ligature; if you want to be precise, you can select the specific organization you need.

- 7) short sequences of signs are more likely to have more than one possible spatial organization, i.e. short sequences of signs are more likely to be composed into more than one different group. Still, groups built from short sequences of signs are much less likely to require features like nested control characters, recursion, etc., so it will probably be possible to deal with them with just very simple sequences of control characters.

- 8) if a new group should be discovered for a sequence of signs that until now was combined into only one group, it could just be added to the list and rendered with a control character.

Because of point 7) above, and because complex groups will likely be unique and therefore dealt with by plain ligatures (point 1) above), it is quite likely that this system could reduce (if not solve) the problem of the multiplication and nesting of too many control characters.
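
The two rules above, (a) one plain ligature per sign sequence and (b) a control character only to select a less common arrangement, can be sketched in a few lines of Python. This is only an illustration under assumptions, not part of the proposal: the `CTRL` token, the two tables, the `render` function, the `overlay(...)` label and the table entries themselves are all hypothetical, and the layout strings merely borrow the MdC `:` operator for "above".

```python
# Hypothetical sketch of rules (a) and (b); no real font or Unicode API is used.

# Placeholder for whatever control character would eventually be encoded.
CTRL = "<ctrl>"

# Toy stand-in for the attested-groups database. Keys are sign sequences in
# reading order (Gardiner codes); values are illustrative layout strings.
DEFAULT_GROUPS = {
    ("G17", "D36"): "G17:D36",  # owl above arm: the plain-ligature default
    ("I10", "D46"): "I10:D46",  # cobra + hand (Dd), layout chosen by convention
}
# Less common arrangements, selected only by an explicit control character.
VARIANT_GROUPS = {
    ("G17", "D36"): "overlay(G17,D36)",  # arm across owl (ad hoc label)
}


def render(tokens, supports_controls=True, has_ligatures=True):
    """Turn a list of sign codes (and CTRL tokens) into a list of layout strings."""
    tokens = list(tokens)  # work on a copy, never mutate the caller's list
    out = []
    i = 0
    while i < len(tokens):
        # Rule (b): sign + control + sign selects a less common arrangement.
        if i + 2 < len(tokens) and tokens[i + 1] == CTRL:
            pair = (tokens[i], tokens[i + 2])
            if supports_controls and pair in VARIANT_GROUPS:
                out.append(VARIANT_GROUPS[pair])
                i += 3
                continue
            # Fallback (points 3 and 5): drop the control, keep the plain signs.
            del tokens[i + 1]
        # Rule (a), and the default of rule (b): plain ligature, no control.
        pair = tuple(tokens[i:i + 2])
        if has_ligatures and pair in DEFAULT_GROUPS:
            out.append(DEFAULT_GROUPS[pair])
            i += 2
            continue
        # No group applies: show the sign on its own.
        out.append(tokens[i])
        i += 1
    return out


print(render(["G17", "D36"]))                                # ['G17:D36']
print(render(["G17", CTRL, "D36"]))                          # ['overlay(G17,D36)']
print(render(["G17", CTRL, "D36"], supports_controls=False)) # ['G17:D36']
print(render(["I10", "D46", "I9"]))                          # ['I10:D46', 'I9']
```

The sketch only handles pairs of signs; a real table would need longest-match lookup over longer sequences. The point it is meant to show is the fallback chain: variant group, then default ligature, then plain signs in reading order.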

Now it seems to me such a system would satisfy everyone's needs and deal with a lot of problems:

- searchability will be guaranteed, because groups will be dealt with mainly by plain ligatures, and therefore the phonetic order can be respected when inputting the individual signs

- spatial information will be preserved, because it will be encoded either in the ligature itself or in the variants using control characters. It will in any case be present in the basic database (which *has to be created* anyway).

- The system will allow control characters to be used in an efficient way that won't require dozens of them, both because unique groups will be rendered with plain ligatures, and because those groups that will most likely need control characters will be relatively simple, based on short sequences of signs. This, I think, is more in line with the basic principles of Unicode.

- Note that the system is different from Bob's original simplified Egyptian proposal, as it suggests using plain ligatures for all the sequences of signs that can be combined into only one group, and for one of the possible groups for those sequences that can be combined into more than one group. At the same time the system is relatively similar to how the emoji variants work, and to what we were discussing with some of you during the workshop.

- The use of plain ligatures for the most common/basic forms of the groups would solve the Dd problem pointed out by Michael: no need to choose between the up/down or corner control character, because the plain ligature can be used

- It would be very easy to create input methods for such a system

Now, we are here to reach an agreement on a system that can work, and it seems to me that such a system would solve a lot of the issues that are being discussed.

So what do you think about it?

Marwan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncs3 at cam.ac.uk Sat Jul 30 14:52:12 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Sat, 30 Jul 2016 14:52:12 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <3CBB8161-1005-4246-BDC4-B134AB829CCF@evertype.com>
References: <6717832.Pdsaq0kWUC@thuis> <2535989.LuAgqhuLG9@thuis> <3CBB8161-1005-4246-BDC4-B134AB829CCF@evertype.com>
Message-ID: <04BD83BF-9326-4814-91BB-D9C373186D22@cam.ac.uk>

I am sure I have misunderstood this, but as an Egyptologist I would prefer something that preserves the order of the transliterated group Dd, so D d.

Or else just leave us to do our best in JSesh without troubling Unicode (my preferred solution as you all know!)

Nigel

From bobqq at live.co.uk Sat Jul 30 14:57:36 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 30 Jul 2016 14:57:36 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <2535989.LuAgqhuLG9@thuis>
References: <6717832.Pdsaq0kWUC@thuis> <2535989.LuAgqhuLG9@thuis>
Message-ID:

Hi Mark-Jan

I was grateful to receive data from TLA and Ramses this week and last, which I've been working to organize.

On repeated occasions, since well before Cambridge, I've asked for data and examples to help put these questions on a secure footing.

Now you tell us there's been "six months of concentrated study" from a bunch of people. Am I, and have I been, wasting my time if you've already done this work?

I'd really appreciate it if you could share with the group whatever you can of the following: images of the difficult cases, data, analysis and other evidence that you've been collecting and analysing during the last 6 months to inform your work.

Thank you
Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Mark-Jan Nederhof
Sent: 30 July 2016 12:28
To: Egyptian Hieroglyphs in the UCS
Subject: Re: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)

There are motivations for the particular choice of primitives we have now, which was the result of half a year of concentrated study by the folks from TLA, Ramses and St Andrews, with constructive criticism from some participants in Cambridge, and some further ideas discussed on this email list. Is the current collection of primitives and the way they are to be used set in stone? No. Do we have a single easy one-line answer why we need this set of primitives and no other? No.
Could the theoretical and empirical support in favour of our set of primitives, or any other, ever be exhaustively explored? No. Does that mean our set of primitives is completely arbitrary and baseless? If you think that, you have not read the text of our proposal and have not followed the discussions on the list. For the rest you are contributing no new thoughts. Life is too short to have to repeat the same discussions over and over and over again. I hope you will excuse me, but this is my last response on this thread.

Mark-Jan

On Saturday 30 Jul 2016 12:46:12 Marwan Kilani wrote:
> We are back to square one because you (and I say you as you are putting forward the proposal) have not yet approached one of the basic questions:
>
> what is a spatially relevant feature that needs to be encoded/displayed, and what is not?
>
> Stéphane gave some scattered examples, but there is no definition yet: no general coherent framework about *how Egyptian works spatially*, and therefore about what is meaningful and what is not, has been suggested. Only random examples taken here and there (some even borderline cases, like your Dd=f, where you are essentially discussing an issue that *does not exist*, because the difference between the two is not meaningful in any way, as far as I know) have been pointed out. Or at least this is what I have seen.
>
> What you describe is a calligraphic variant: is it important to be able to represent calligraphic variants or not?
>
> This is an *important* question that *has to be tackled* because it has consequences for all the successive steps.
>
> Just to say: if you want to be able to display calligraphic variants, then you have to include corresponding precomposed glyphs in your font (because control characters don't make up anything on their own), with all the problems this will generate (and besides the fact that Unicode is not for calligraphic variants). If instead you don't want to do that, then you have to make explicit that calligraphic variants *are not* meant to be handled with Unicode, and this can have consequences for the features you need in your encoding system. It also brings up the next questions: if calligraphic variants are not meant to be encoded, then 1) what is a calligraphic variant and 2) what else could be left out?
>
> This is an important question, as important as understanding what is a real glyph that should have an independent Unicode slot, and what instead is just a graphic variant.
>
> It is the same principle that made people at the workshop argue against encoding signs into the Unicode set "at random": you first need to know what is a really meaningful variant, and what instead is an allograph, to have a meaningful set of hieroglyphs.
>
> With spatial distribution it should be the same: what is meaningful and what is not.
>
> What is the general theoretical framework we are working in? If any?
>
> Essentially, what (in absolute terms, in concepts, not in random examples) are we trying to encode and what are we not trying to encode? And possibly why (meaning-wise, quantity-wise etc.)?
>
> And this holds true whether you want to use ligatures, control characters, or whatever else.
>
> Marwan

From bobqq at live.co.uk Sat Jul 30 14:59:32 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 30 Jul 2016 14:59:32 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To: <04BD83BF-9326-4814-91BB-D9C373186D22@cam.ac.uk>
References: <6717832.Pdsaq0kWUC@thuis> <2535989.LuAgqhuLG9@thuis> <3CBB8161-1005-4246-BDC4-B134AB829CCF@evertype.com> <04BD83BF-9326-4814-91BB-D9C373186D22@cam.ac.uk>
Message-ID:

Don't worry Nigel, if you read the document I sent round yesterday you'll see all is well in the real world.

Bob

From bobqq at live.co.uk Sat Jul 30 17:51:06 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 30 Jul 2016 17:51:06 +0100
Subject: [Egyptian] An afternoon at the Ashmolean. And a challenge.
Message-ID:

Hi All

I'm not an Egyptologist.

Nevertheless I spent a couple of hours today inspecting the hieroglyphic artefacts on display at the Ashmolean with notebook and camera phone in hand. My main objective was to try to spot any writings that might cause problems with group joiners. The collection isn't much help for 'tall groups' but has plenty of vertical texts.

I was also looking for any LIGATURE related problems. None discovered, just the usual suspects among the ligated clusters.
Not that I expected anything: the anomaly frequency in the total corpuses appears very low, order of 0.01% averaged probably (not saying that it should be ignored- it's a consensus topic; but important to put in proportion). I'm hoping to do similar at the BM Tuesday. Not very scientific I know but worth doing to keep grounded. If anyone else has a local museum here's the Bob challenge. Find and photograph any problem clusters you can spot and send for my collection! Thank you Bob -------------- next part -------------- An HTML attachment was scrubbed... URL: From bobqq at live.co.uk Sat Jul 30 17:55:26 2016 From: bobqq at live.co.uk (Bob Richmond) Date: Sat, 30 Jul 2016 17:55:26 +0100 Subject: [Egyptian] another approach, and possibly a compromise? In-Reply-To: References: Message-ID: Hi Marwan The attested groups database as a Unicode resource is part of the proposal Andrew and I submitted hence part of the current UTC recommendation. I?d hoped to have a draft done this Summer but obviously with the current disruption I had to put that to one side. No time atm for general discussion on controls and theory, sorry. Bob From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Marwan Kilani Sent: 30 July 2016 14:22 To: Egyptian Hieroglyphs in the UCS Subject: [Egyptian] another approach, and possibly a compromise? May I dare to suggest another approach to the problem, suggesting a compromise solution that could be at the same time easy and efficient and that could satisfy more or less everyone? Let?s start from the assumption that at some point we will need a common archive or database listing all the attested groups. This is something that *will have to be done* at some point, whether you want to use 0 or 100 control characters, because groups will need to be inserted as precomposed glyphs into the fonts to be displayed correctly, and the people who will design the fonts will need a list of the groups they have to draw. 
This is a must, such a reference list must be created at some point, whatever solution for grouping will be decided. And I think it should not be too difficult to set up something like this, and one could think to an online database of reference, where a unified and generally recognized list of reference with all the attested groups will be freely available. The list will give information about the individual signs composing each group, and about their spatial organization within the group itself. This can be done by just writing for each group the list of signs composing it with the relevant MdC operators. Groups, or sequences of texts could be searched also on the basis of their spatial organization Now, as for the Unicode: Yes about control characters, but: What about deciding that: a) sequences of signs that can be combined into only one attested groups (as many Ramesside groups) will be rendered *only* with plain ligatures, without control characters. The advantages would be: - 1) complex groups could be rendered without the need of recursivity or complex nested control characters, because it is logic to assume that the more complex a group is, and the less likely it is that its composing signs will appear in more than one spatial organization. So they can be dealt with plain ligatures without the risk of ambiguity. - 2) Because of the point above, this will probably solve the problem of nesting control characters, or at least it will greatly reduce it. I doubt that there are many sequences of signs with nesting features that can be combined in multiple different groups. - 3) If for some reason the group should not be rendered correctly by the browser/text editor, it would not be a problem because the signs composing the group will be displayed ungrouped, as plain text, without broken control characters among them. 
The text would still be readable and usable - 4) The syntax of the signs will not be affected by control characters, because control characters will not be involved. The ligatures will be triggered jsut by imputing the sign in their correct reading order. Note that there would not be any loss of information, because these groups would be unique spatial compositions, i.e. the spatial organization of the signs within the group will be implicit in the ligature itself (which could be named with the string of signs + MdC operators describing the group), and will be available in the database mentioned above. As long as the database will be used as a reference for the creation of fonts, the ligature needed for the group will be present and therefore the group itself will be displayed correctly. b) sequences of signs that can be combined into more than one different group will be rendered differently. In particular, one, the most common group (or the one with the most complex organization?) will be rendered with a plain ligature. Any other group will be rendered with control characters (whatever control characters you prefer). So for instance the sequence: owl G17 + arm D36 The basic group, rendered with the plain ligature could be the owl above the arm, that will be automatically created by inputting owl + arm. Then there could be a secondary group, the arm across the owl (I know this is already encoded as a distinct character, just for the sake of the example), that will be created with a control character, so e.g. by inputting owl + control character + arm. 
The advantages would be: - 5) even if a text editor/browser will not be able to correctly render control characters, there will always be one basic group with a plain ligature that can be rendered (whether as a group or as a sequence of signs) without broken control characters popping out here and there - 6) you can have as many variants as you want, and therefore the precision of the spatial distribution will be preserved, because the less common spatial organizations (i.e. the less common groups) will be distinguished by the basic one through the use of control characters. If you don?t need spatial precision, you can just use the plain basic ligature, if you want to be precise you can select the specific organization you need. - 7) short sequences of signs are more likely to have more than one possible spatial organization, i.e. short sequences of signs are more likely to be composed in more than one different group. Still, groups built from short sequences of signs are much less likely to require features like nested control characters, recursivity, etc., so it will probably be possible to deal with them with just very simple sequences of control characters. - 8) if a new group for sequence of signs that until now would be combined into only one group should be discovered, it could just be added to the list and rendered with a control characters Because of the point 7) here above, and because of the fact that complex groups will likely be unique and therefore dealt with plain ligatures (point 1) above), it is quite likely that this system could reduce (if not even solve) the problem of the multiplication and nesting of too many control characters. 
Now it seems to me such a system would satisfy everyone?s needs and deal with a lot of problems: - searchability will be granted because groups will be dealt mainly with plain ligatures, and therefore the phonetic order can be respected in imputing the single sings - spatial information will be preserved, because it will be either encoded in the ligature itself, or in the variants using control characters. And will in any case case be present in the basic database (that *has to be created* anyway). - The system will allow to use control characters in a efficient way, that won?t require tens of them, both because unique groups will be rendered with plain ligatures, and because those groups that will must likely need control characters will be relatively simple, based on short sequences of signs. This, I think, is more in line with the basic principles of Unicode. - Note that the system is different from Bob?s original simplified Egyptian proposal, as it suggests to use plain ligatures for all the sequences of signs that can be combined in only one group, and for one of the possible groups for those sequences that can be combined in more than one group. At the same time the system is relatively similar to how the emoji variants work, and to what we were discussion with some of you during the workshop. - The use of plain ligatures for the most common/basic forms of the groups would solve the Dd problem pointed out by Michael: no need to chose between up/down or corner control character, because the pain ligature can be used - It would be very easy to create input methods for such a system Now, as we are here to reach an agreement on a system that can work. It seems to me that such a system would solve a lot of the issues that are being discussed. So what do you think about it? Marwan -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ncs3 at cam.ac.uk Sat Jul 30 17:57:34 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Sat, 30 Jul 2016 17:57:34 +0100
Subject: [Egyptian] An afternoon at the Ashmolean. And a challenge.
In-Reply-To:
References:
Message-ID: <768D94D8-CDFC-47FB-8BD4-58342A0493E8@cam.ac.uk>

Didn't I send one the other day?

Nigel

On 30 Jul 2016, at 17:51, Bob Richmond wrote:

> Hi All
>
> I'm not an Egyptologist.
>
> Nevertheless I spent a couple of hours today inspecting the hieroglyphic artefacts on display at the Ashmolean with notebook and camera phone in hand. My main objective was to try to spot any writings that might cause problems with group joiners. Collection isn't much help for 'tall groups' but plenty of vertical texts.
>
> I was also looking for any LIGATURE related problems. None discovered, just the usual suspects among the ligated clusters. Not that I expected anything: the anomaly frequency in the total corpuses appears very low, order of 0.01% averaged probably (not saying that it should be ignored- it's a consensus topic; but important to put in proportion).
>
> I'm hoping to do similar at the BM Tuesday. Not very scientific I know but worth doing to keep grounded.
>
> If anyone else has a local museum here's the Bob challenge. Find and photograph any problem clusters you can spot and send for my collection!
>
> Thank you
> Bob
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From bobqq at live.co.uk Sat Jul 30 18:19:06 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 30 Jul 2016 18:19:06 +0100
Subject: [Egyptian] An afternoon at the Ashmolean. And a challenge.
In-Reply-To: <768D94D8-CDFC-47FB-8BD4-58342A0493E8@cam.ac.uk>
References: <768D94D8-CDFC-47FB-8BD4-58342A0493E8@cam.ac.uk>
Message-ID:

Hi Nigel

Yes you did - thanks - but your example encodes fine - I'm looking for the weird and unusual ones.
Bob