From everson at evertype.com Fri Jul 15 19:52:00 2016
From: everson at evertype.com (Michael Everson)
Date: Fri, 15 Jul 2016 19:52:00 +0100
Subject: [Egyptian] Cambridge I&E Workshop: Some follow-ups for July
In-Reply-To:
References: <6ef6585e-8ddf-152d-7d16-612bfc0b7641@w3.org>
Message-ID: <930878D5-3729-40CF-9ABF-22F8D3B296B4@evertype.com>

On 15 Jul 2016, at 19:48, Nigel Strudwick wrote:
>
> Richard
>
> I think we all agreed a specific list for this subject, set up by you, was the way forward.

Well, this list was used for previous discussion, and it's still here (with an old 2008 archive no less), and you're all subscribed to it.

Michael

From mn31 at st-andrews.ac.uk Sat Jul 16 14:21:46 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 16 Jul 2016 14:21:46 +0100
Subject: [Egyptian] Cambridge I&E Workshop: Some follow-ups for July
In-Reply-To: <930878D5-3729-40CF-9ABF-22F8D3B296B4@evertype.com>
References: <930878D5-3729-40CF-9ABF-22F8D3B296B4@evertype.com>
Message-ID: <1516249.IGz3T585fz@thuis>

On Friday 15 Jul 2016 19:52:00 Michael Everson wrote:
> On 15 Jul 2016, at 19:48, Nigel Strudwick wrote:
> >
> > Richard
> >
> > I think we all agreed a specific list for this subject, set up by you, was the way forward.
>
> Well, this list was used for previous discussion, and it's still here (with an old 2008 archive no less), and you're all subscribed to it.
>
> Michael

Just a thought: we are a very small, select group. For example, Richard wrote to me with some very detailed and insightful comments about writing-mode in CJK text, with implications for Ancient Egyptian. It would be a shame not to have such material archived publicly to share it with others interested in these matters.

Mark-Jan

From bobqq at live.co.uk Mon Jul 18 12:38:05 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Mon, 18 Jul 2016 12:38:05 +0100
Subject: [Egyptian] Simple higher level protocols
Message-ID:

Hi All

I've posted a short note on simple HLP at http://hieroglyphseverywhere.blogspot.co.uk/2016/07/simple-higher-level-protocols-and.html.

HLP is a topic we didn't get to discuss much last week but important to bear in mind when thinking of unencoded characters and edge-cases for control characters. At some point it would be useful to try and put together a wish-list.

Bob

From bobqq at live.co.uk Wed Jul 20 10:58:26 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Wed, 20 Jul 2016 10:58:26 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
Message-ID:

Hi all,

I've not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don't have unlimited time to research all this myself and I'm not an Egyptologist.
Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I'm doing with the group-joiners].

One part I'm reasonably happy that I understand is ligatures involving the cobra-pattern (also seen with tongue). This ligature has a special role in orthography, especially in column vertical writing and "tall groups". I've seen plenty of instances.

Conversely there is the bird pattern a+b, b+c or a+b+c where b is a bird. Pretty much all I've seen is a and c as individual hieroglyphs. What other bird patterns are attested? Groups?

There are characters in Hieroglyphica e.g. HG-D0160 to HG-D0188 and many more HG arrangements that could be implemented using controls (as with monograms). If this is what some of these proposed controls are about I think next step repertoire might be a better context in which to discuss the topic to begin with.

Mark-Jan - you've been using your "insert" operator in RES typography for many years. It would be great to see a list of at least some of the clusters you've encountered.

Anyway, great if anyone has any examples/evidence to share.

Bob

From ncs3 at cam.ac.uk Wed Jul 20 11:20:34 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Wed, 20 Jul 2016 11:20:34 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID: <1A8607F8-B7D1-4320-822B-717E663993E4@cam.ac.uk>

Bob,

I'd have more chance of understanding this (as an Egyptologist) if I could see it. Would you be able to post a sketch of what you mean? Handwritten would do.

Nigel

From mn31 at st-andrews.ac.uk Wed Jul 20 11:21:34 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 11:21:34 +0100
Subject: [Egyptian] List of considerations
Message-ID: <6022711.NM8k5uuFyp@thuis>

Dear All,

I'm afraid the momentum is lost if we wait any longer with resuming the discussion. So let me make an inventory about how I see things. This assumes familiarity with the issues in the last version of the proposal:

https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode.pdf

as well as with the discussions in Cambridge, inside and outside the Fitzwilliam.
One thought is that we should make the encoding as simple as possible, but not simpler. Another thought is that we need a systematic design, not a bunch of individual control characters thrown together. This pertains both to functionality (what can be expressed) and to syntax (how is it expressed).

As for functionality, the primitives should cover a natural range of what Egyptologists want the encoding to express at a very minimum, recognizing the need to sacrifice precision for simplicity. As to syntax, we should not lose sight of the bigger picture, or we might tie ourselves into a knot with operator precedence.

More on the functionality: our team (TLA/Ramses/St Andrews) had relatively few qualms simplifying the semantics of the insertions, to allow an inserted group to spill over to outside the bounding box of the 'big' sign. This makes it easier to use, in particular removing the need of the EMPTY to artificially increase the size of the bounding box. (It makes the mapping to richer and more precise encodings outside Unicode more difficult, but this aside.)

With the simplification, we could abandon the four insertions at the four sides, because then these are basically horizontal and vertical grouping in combination with the "JOIN" from our original proposal. However, if we then drop the JOIN, it would be quite odd to be able to have a primitive for "insert into a corner" but not for the functionality of "insert into a side". If you look at typical uses of the & in the past, then you see many can only be expressed as "insert into a side", or alternatively horizontal/vertical grouping with JOIN. So JOIN plus the four insertions into the four corners forms a logical whole.
In more detail, when we insert G into a corner of S:

- if G is small, it might fit into the bounding box of S;
- if G is big, it may extend to outside the bounding box; the extension would be to the left or right for signs with unit height (as for most birds), and to the top or bottom for signs that are less high (as for the tongue).

In the case of birds, insertion into the lower-left corner would no longer mean strictly in the corner, but normally just above the feet of the bird. Here we sacrifice precision for simplicity of use. Note that PLOTTEXT distinguished insertion-into-lower-left-corner and insertion-above-the-bird's-feet.

An example of insert into a side is Hm-kA, with the Hm club half inside the pair of raised arms. It would now be encoded as a vertical group of Hm and kA, with a JOIN in between.

There is only one case I can think of where insert-at-the-bottom is not very well expressed in terms of vertical grouping with JOIN, and that is with two X1 next to one another at the bottom within the bounding box of S22. One could probably live with an approximation.

More on syntax: Suppose we have * and : and INSERT and perhaps STACK, all represented as infix operators, then the question is what

  A * B INSERT C : D STACK E

means. No one has yet provided a reasonable syntax with infix operators but without brackets that disambiguates in a satisfactory manner. From my understanding of the well-established academic discipline of the design of programming languages, I would assume such a syntax does not exist.

Personally I welcome the prospect to put some separating characters between signs within a horizontal/vertical group, because that simplifies OpenType substitution rules; it would also streamline the syntax of the JOIN with the 'normal' way of horizontal/vertical grouping. So, some part of the notation could well be infix, but not having brackets leads us into the abyss of structural ambiguity.
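As an editorial illustration of the ambiguity described above, here is a minimal Python sketch that enumerates every way the flat infix sequence A * B INSERT C : D STACK E could be bracketed if no precedence or bracketing rules were fixed. The function names and the use of plain ASCII tokens are assumptions made for illustration; they are not part of any proposal discussed on this list.

```python
from functools import lru_cache


def count_bracketings(num_operands: int) -> int:
    """Count the binary bracketings of a flat infix expression.

    With no precedence rules, n operands joined by n - 1 infix
    operators can be read in Catalan(n - 1) different ways.
    """
    @lru_cache(maxsize=None)
    def count(n: int) -> int:
        if n == 1:
            return 1
        # Choose which operator is applied last, then multiply the
        # number of readings of the left and right parts.
        return sum(count(k) * count(n - k) for k in range(1, n))

    return count(num_operands)


def bracketings(tokens):
    """List every fully bracketed reading of [operand, op, operand, ...]."""
    if len(tokens) == 1:
        return [tokens[0]]
    readings = []
    for i in range(1, len(tokens), 2):  # operators sit at odd positions
        for left in bracketings(tokens[:i]):
            for right in bracketings(tokens[i + 1:]):
                readings.append(f"({left} {tokens[i]} {right})")
    return readings


# Hypothetical ASCII stand-ins for the control characters.
expr = ["A", "*", "B", "INSERT", "C", ":", "D", "STACK", "E"]
readings = bracketings(expr)
for reading in readings:
    print(reading)
print(len(readings), "possible readings")
```

For five operands this lists 14 readings, so an encoding without brackets has to rule out 13 of them by convention alone.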
Mixed systems of operator precedence plus brackets to override operator precedence where needed are a bit old-fashioned, and are only helpful if we assume that the control characters would be actually typed one by one by the user, instead of having a specialized graphical editor for hieroglyphic text that relieves the user of having to worry about syntax.

To recapitulate what I wrote earlier, my proposal for representing horizontal grouping would be exemplified by:

  OPEN_HOR arg1 NORMAL_SEP arg2 JOIN_SEP arg3 CLOSE

where I combine a normal separator with the joining (fitting) one. Here arg1 and arg2 could be single signs or vertical groups or insertions, etc. An example of vertical grouping could be:

  OPEN_VERT arg1 NORMAL_SEP arg2 CLOSE

Insertion could be:

  OPEN_INSERT arg1 TOP_LEFT big_sign TOP_RIGHT arg2 BOTTOM_RIGHT arg3 CLOSE

which would mean insert arg1 into the top left corner of the big sign, insert arg2 into the top right corner and insert arg3 into the bottom right corner.

Note: if we insert G into S, then S is usually a single sign, but not always. Consider for example a superimposition (stacking) of P6 and D36, with N5 inserted into the lower-left corner and Z1 inserted into the lower-right corner. So 'big_sign' above could be a group as well. If for now we want to assume it is always a single sign, that is fine, and we can drop the restriction some time in the future, when font technology has evolved.

This should be a guiding principle in general: We can put restrictions anywhere we want, motivated by limitations on today's font technology, as long as it doesn't cause major problems 10 or 20 years down the line. Future generations will be grateful if we dare think a little ahead.

Other issues:

* Richard has written quite a few interesting things about horizontal 'writing-mode' vs vertical 'writing-mode' in other scripts. I don't think the matter has been exhaustively discussed for hieroglyphs. More about this later.
* Cartouches (enclosures/boxes): we need to have a suggestion that fits into the design. It is fine to postpone formal proposal of cartouches, but again, we need design, not a bunch of loose characters thrown together. My proposal for syntax would be exemplified by:

  OPEN_CARTOUCHE first_group NORMAL_SEP second_group JOIN_SEP third_group CLOSE

which would fit in with the proposed syntax of the other elements. The problem with reinterpreting the enclosure characters among the existing 1071 Unicode signs is that the isolated hieratic open-cartouche and close-cartouche are then missing. This would be a big problem. So, if we reinterpret pairs of the existing characters to produce full-form enclosures, we need to at least add two more characters for transcription of hieratic.

* The EMPTY glyph: With the new semantics of the four insert-into-a-corner primitives, the EMPTY is less urgent, but it is still very useful. Can we take an existing EMPTY character from Unicode? It would be nice to pick a specific one. This means one fewer character needs to be proposed, but it would be good to mention in the proposal that using an EMPTY in place of a hieroglyph is legitimate.

* Stacking. Almost all the Egyptologists wanted to have a stacking primitive, while some UTC members were objecting on technical grounds, stacking being difficult to implement dynamically. I think the discussion at the meeting stranded because it was comparing apples and oranges (dynamic versus precomposed), and opposition against stacking was based on the wrong arguments. More about this later.

* Insert-into. This requires further investigation. Some observations:

- One (very) convincing example is wabt, with the leg-and-jug-of-flowing-water with the feminine ending inside.

- If D031 didn't already exist, how would one encode it? I think D032 with the Hm sign in a vertical group with JOIN is possible.
- N018A and N018B I would be tempted to see as atomic signs, unless there are many similar combinations of N018 (or X004B) with other flat signs.

- To me O010C seems definitely D002 inserted into O018.

- This raises the question whether the notation of a box/enclosure for the Hwt sign with something inside is appropriate, or whether 'insert-into' is preferable. For a cartouche/serekh/castle-walls, the text inside can be quite long. Does that apply to Hwt as well? If the length of the text inside Hwt is always quite limited, it doesn't seem to be of the same nature as a cartouche.

I'm revising the proposal to match the above. Feedback sooner rather than later would be helpful.

Best regards,

Mark-Jan

From mn31 at st-andrews.ac.uk Wed Jul 20 11:46:04 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 11:46:04 +0100
Subject: [Egyptian] Stacking and syntax
Message-ID: <5973769.kzhsOBgLFk@thuis>

Dear all,

More considerations about stacking and syntax (apologies to Michel and Michael for overlap with another message I sent last week):

(1) My understanding is that we need to work under the assumption that the control characters do nothing complicated by themselves, but substitution rules are used to map sequences of signs plus control characters to precomposed groups, which are stored as separate glyphs in the font. Is this correct?

If the above is correct, then there are some follow-up questions.

(2) Should we not worry about the 64K limit on the number of glyphs in OpenType? It would be interesting to know how many different groups (not just ligatures) there are in Ramses?

(3) I wonder whether in the discussion about stacking (superimposed signs, monograms) we were comparing apples and oranges.
I proposed the stacking operation could be done dynamically, which is consistent with experiments I've done with OpenType, and which would require linearly many anchor points to be stored in the font, if we restrict our attention to pairs of signs being stacked (not three or more signs). In the easiest case each anchor point could be the center point of the bounding box, and determining these could even be automated, say by a Python script in FontForge.

Michael says that this is not how font designers like to do things, and they would still like to store precomposed glyphs. Okay, let's take this as given. But then what is the objection against stacking? That there would be too many precomposed glyph combinations?

If we assume that all stacked combinations are stored as glyphs in the font, we would have N * N such glyphs. But how about normal groups with pure horizontal and vertical grouping? If you similarly want to store these as precomposed glyphs, and if we assume that such groups can have up to 4 glyphs, you already need N * N * N * N combinations, which dwarfs the costs of implementing stacking.

If we do _not_ assume all horizontal/vertical groups are precomposed, but only the ones we have found in some corpus (the 'fallback assumption') then obviously, we have far fewer glyphs to store than N * N * N * N. But then why is it not acceptable to precompose only those stacked pairs that are known from a corpus?

So if we compare apples and apples, so to speak, both normal horizontal/vertical groups and stackings require excessive storage space. If we compare oranges and oranges, both normal horizontal/vertical groups and stackings are feasible, and stackings more so than horizontal/vertical groups. Do I see this wrong?

(4) Coming back to syntax in more detail than in the previous message. The Richmond & Glass proposal had three characters, with & (ligatures) having tightest binding, then * (horizontal grouping), then : (vertical grouping).
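The three-level precedence just described can be sketched as a small precedence-climbing parser. This is only an editorial illustration under stated assumptions: the ASCII tokens, the `PRECEDENCE` table, the `parse` function and the left-associative reading of repeated operators are hypothetical choices, not taken from the Richmond & Glass document itself.

```python
# Binding strength as described in the message: & tightest, then *, then :
# (a higher number binds tighter). The table itself is an assumption.
PRECEDENCE = {"&": 3, "*": 2, ":": 1}


def parse(tokens):
    """Return the fully bracketed reading of a flat infix token list.

    Tokens alternate operand, operator, operand, ...; operands are
    assumed to be sign names that never collide with the operators.
    """
    pos = 0

    def parse_expr(min_prec):
        nonlocal pos
        left = tokens[pos]  # an operand
        pos += 1
        # Absorb operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and PRECEDENCE[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # The right operand may only absorb strictly tighter operators,
            # which makes equal-precedence operators left-associative.
            right = parse_expr(PRECEDENCE[op] + 1)
            left = f"({left} {op} {right})"
        return left

    return parse_expr(1)


print(parse("A : B * C & D".split()))  # (A : (B * (C & D)))
print(parse("A & B * C : D".split()))  # (((A & B) * C) : D)
```

With only three operators a single fixed reading per sequence is easy to obtain; each further operator kind needs its own slot in the precedence table.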
But we need (very limited) recursion of horizontal/vertical grouping, say up to three levels; two is surely too little to handle perfectly ordinary Middle Egyptian horizontal texts. And we need finer control characters, such as the INSERT. This implies we quickly need many levels of operator precedence. For example:

  "A INSERT_TOP_RIGHT B VERTICALGROUPING C"

could mean:

  " ( A INSERT_TOP_RIGHT B ) VERTICALGROUPING C "

or

  " A INSERT_TOP_RIGHT ( B VERTICALGROUPING C ) "

Both interpretations correspond to attested groups. If we try to disambiguate by operator precedence, then either we need several copies of each control character that differ in tightness of binding, leading to horrible complexity, or we need brackets.

In order to avoid the problems with operator precedence, I chose an entirely different syntax for the draft proposal, under the motto: if we need brackets, we might as well use them consistently in combination with prefix operators, and get rid of infix operators altogether. A formal grammar is in the draft proposal in the appendix, and you can see it is very simple and uniform.

Now, I do understand that the closer we stay to traditions of "ordinary" writing systems, the better it is for adoption of our encoding. If including infix operators is the only way to make the encoding palatable to font designers, so be it. But can I just ask: If it is the case that sequences of hieroglyphs and control characters are replaced by precomposed single glyphs (question (1) above), then would the choice between prefix or infix or postfix (reverse Polish notation) make any real difference for actual realization in terms of feature files of fonts?

Regards,

Mark-Jan

From ishida at w3.org Wed Jul 20 11:55:31 2016
From: ishida at w3.org (ishida at w3.org)
Date: Wed, 20 Jul 2016 11:55:31 +0100
Subject: [Egyptian] Vertical vs horizontal writing-mode
Message-ID:

On 13/07/2016 23:16, Mark-Jan Nederhof wrote:
> A belated thank you for the information.
> I now had some time to go through the links that you sent in more detail. The issue of directionality is not crucial to our proposal, and there was so much confusion about the matter that it seemed best to not further discuss it during the meeting. All the Egyptologists fully understood, because the problem is unavoidable in everyday transcription of hieroglyphic texts, but apart from the Egyptologists we couldn't get anyone to care about this matter, so there may be no point in trying to keep it as part of the proposal.
>
> This leaves two options: we omit mention of directionality from the proposal altogether, or mention the matter in passing, with the suggestion that we might revisit it at some future time. To determine what is best, can I ask you the following? Do you know of some other language/script in which the encoding of text itself is different depending on whether the text direction is horizontal or vertical? (To be clear, I don't care about ltr versus rtl. The central issue is horizontal versus vertical.)
>
> While going through the material you sent, I got the impression that the encoding is always the same, for CJK and other scripts. As I tried to explain in the paper and in the presentation, the situation for Ancient Egyptian is very different, because the signs tend to be divided into groups ('quadrats') somewhat differently depending on whether the text direction is horizontal or vertical, and the division into groups is part of the encoding. In a way, one could say that a certain encoding is only genuine for horizontal text, or for vertical text, but not usually for both. Do you see the problem, and the reason why we brought this up? Would you agree with me this problem does not occur in CJK and similar scripts?

Let's refer to horizontal 'writing-mode' vs vertical 'writing-mode' to make the terminology clearer.
So, in general for CJK text one would expect to see the same sequence of code points for a text whether it is rendered horizontally or vertically. However, there are some differences...

[a] there are certain characters that by convention are more likely to be found in one versus the other. For example, vertical text usually uses corner brackets for quotation marks, whereas horizontal CJK uses quotation marks. For examples, see bullet (b) under https://www.w3.org/TR/jlreq/#differences_in_vertical_and_horizontal_composition_in_use_of_punctuation_marks

So for good quality rendering of text, the choice of code point goes with the choice of writing-mode for a small number of characters, and you can't just flip between the two by switching the CSS.

Other times you may find full-width characters being used for Latin letters and digits in vertical script when they are not in horizontal. For example an acronym such as FIFA is likely to be rendered as non-rotated, full-width characters in vertical Japanese, but as ordinary proportionally-spaced characters in horizontal.

[b] the visual appearance of some characters in CJK varies according to the writing-mode. For example, parentheses are rotated 90° between vertical and horizontal. Other characters need completely different glyphs to be swapped in. For example the horizontal Japanese full stop has an advance width the same as other characters but has just a small circle in the lower left corner. In vertical text that circle appears in the top right corner (ie. it can't be achieved by rotation). In these cases you need extra glyphs in the font that are activated by sensitivity to the writing-mode. For examples, see http://r12a.github.io/scripts/tutorial/part4#rotations

[c] Sometimes text flows horizontally within vertical columns in CJK (known as tate chū yoko). See an example at http://r12a.github.io/scripts/tutorial/part4#tatechuyoko. This is something that has no correspondence in horizontal text.
So in summary, sometimes the sequence of characters needs to be different for vertical vs horizontal text, but most of the time apparent differences are achieved through rendering algorithms operating on and selecting appropriate glyphs.

What does remain the same, however, is the logical progression of codepoints in memory. That sequence, as is usually the case throughout Unicode, typically follows the pronounced order of the 'letters' involved or some other rule such as combining characters following base characters. If the expected order of codepoints in a word varies for sequences of characters in vertical vs horizontal writing modes, then problems arise in searching or processing text.

Btw, there are also plenty of examples in Unicode of scripts that treat visual display in terms of syllables, clusters or groups of characters. The underlying sequence of codepoints in many Brahmi-derived scripts is different from the order in which the respective glyphs are displayed. For example a RA at the start (nominally the left) of a Hindi sequence of consonants in the word 'irsya' is likely to be displayed above the 'a' (far to the right). For examples, see http://r12a.github.io/scripts/tutorial/part3#positional

This, like the other things noted above (with the exception of the first) is achieved through applying some magical rendering process, using smoke and mirrors to transform the underlying, logically-ordered codepoint sequence.

Fwiw, in vertical arrangements the 'syllabic' clusters in indic scripts are treated as indivisible units that run horizontally.

So, coming back to Egyptian hieroglyphs, and making it clear that i know very little about how Egyptian hieroglyphs work, i find myself wondering the following:

a. perhaps some combination of the above smoke and mirrors techniques may be adequate to manage some of the differences between layout in horizontal vs vertical writing-mode when the thing we are struggling with is the spatial relationships between the elements circumscribed by a quadrat when they are rendered.

b. perhaps it's not particularly problematic that you can't automatically flip between horizontal vs vertical without changing code points, especially when one considers that there is anyway so much variation in 'spelling' of egyptian content, often to fit the visual space available.

c. if the control characters used to indicate the positioning of hieroglyphs within the quadrat display space are treated like other Unicode control characters, ie. they are not part of the semantics and are ignored for sorting, searching, and processing the text for meaning, rather they are just cues for visual arrangement, then perhaps it's not a big issue either if they are different for vertical vs horizontally rendered content.

does that help?

ri

From mn31 at st-andrews.ac.uk Wed Jul 20 12:37:29 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 12:37:29 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID: <2414353.YqfQxLpJiR@thuis>

The evidence for the insertions is of the same kind as the evidence for horizontal and vertical grouping. Look at an inscription. What do you see? Signs and groups are below and above other signs and groups, or next to one another. Therefore primitives for horizontal and vertical grouping are natural. But not everything is horizontal or vertical. Sometimes you see one sign in the corner of another. This warrants the insertions as primitives.

This was recognized already by PLOTTEXT in the late 1980s. I reinvented the wheel around 2000-2002 with RES, coming to almost the same conclusions. JSesh has insertions as well.
I have also shown that OCR tools are able to automatically recognize insertions. So the empirical evidence is that any known way of encoding hieroglyphic text that explicitly describes the graphical form has insertions.

Horizontal/vertical grouping plus 4 or 5 insertions plus JOIN do not cover everything, but it will be sufficient for many applications, and it is the most we can hope for within the limitations of Unicode, as far as precision and coverage are concerned (ignoring cartouches and such for now).

Mark-Jan

From bobqq at live.co.uk Wed Jul 20 13:04:18 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Wed, 20 Jul 2016 13:04:18 +0100
Subject: [Egyptian] Aaron IE Experimental Font version 1
Message-ID:

Hi All

I've put together this "Aaron IE Experimental" font available for the I&E group to use for documentation, communication, whatever. The PDF describes the font.

I've included the PDF source docx file so you can see how a plain text system works. Source seems fine with Word (latest version). I found LibreOffice and OpenOffice are still buggy with Complex Text handling but didn't explode into pieces when I loaded the doc! Your mileage may differ.

Apologies to our less technical readers for the gobbledygook.

Bob

-------------- next part --------------
A non-text attachment was scrubbed...
Name: AaronIECambridgeExperimentalFont1.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 24482 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AaronIECambridgeExperimentalFont1.pdf
Type: application/pdf
Size: 439594 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AaronIEExperimental.ttf
Type: application/octet-stream
Size: 742088 bytes
Desc: not available
URL:

From hawilbrink at hotmail.com Wed Jul 20 15:00:18 2016
From: hawilbrink at hotmail.com (heleen wilbrink)
Date: Wed, 20 Jul 2016 16:00:18 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
Message-ID:

Hi guys,

Sorry to have kept you waiting. Here are the main conclusions and next steps (bold) as I interpreted them. I hope you will find them helpful.
Have a great day,
Heleen

Repertoire

A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like eg IFAO) for each hieroglyph from the proposal of Michel Suignard.

Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted.

- Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan
- The hieroglyphs with references can be added in tranches
- The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Moller to which the hieratic dot will be added
- Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August
- Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September

Monograms

I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).

- Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B

Control characters

On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"

- It was agreed that this proposal will be finished in August, I thought by Michael, and discussed at the Unicode meeting on the 6th of September. Now it seems over the mail that several people are working on it separately
- Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B. give guidelines to Simon/Serge/Stephane for what examples they should look for in their databases

Input methods and fonts

- So and Marwan have integrated their input systems and font into SINUHE
- So/Marwan could you share with us the online location so we can start using it?
- Bob has made and shared an experimental font for us to use

From mn31 at st-andrews.ac.uk Wed Jul 20 15:41:41 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 15:41:41 +0100
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID: <17511299.cOQPMXRhrm@bear>

On Wednesday 20 Jul 2016 16:00:18 heleen wilbrink wrote:
> let Michael, Bob
> and Mark-Jan do a conference call or so soon to A. make sure there is one
> proposal that all agree on and can be shared with the group and B. give
> guidelines to Simon/Serge/Stephane for what examples they should look in their
> databases

Heleen,

It is kind of you to try to coordinate, but you don't seem to understand the issues. There are already two competing proposals since 2016-06-30, when we uploaded ours onto L2. What I've tried to do in the past emails is to point out that a naive attempt to pick and choose from the two proposals leads to inconsistencies. We need a coherent design, not majority votes for wishlists of individual elements, nor a friendly compromise that leaves us with ambiguous or inconsistent notation.

The problems I have pointed out are quite technical, difficult, and not amenable to oral communication alone, and a conference call is not going to bring us any closer to a resolution.

Mark-Jan

From ncs3 at cam.ac.uk Wed Jul 20 15:58:59 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Wed, 20 Jul 2016 15:58:59 +0100
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To: <17511299.cOQPMXRhrm@bear>
References: <17511299.cOQPMXRhrm@bear>
Message-ID: <9DD2825D-7541-46FB-9282-FB8C1E7BEC29@cam.ac.uk>

Mark-Jan

I was going to send this privately, but I think it has wider implications.

I hate to say this, but a little more tact in emails is never a bad thing. Do consider retracting or modifying that statement.

If Heleen, and by inference I, cannot understand the issues, then you and the other people putting forward proposals should bear some of the responsibility, because you have not communicated them in a way those of us not intimately involved can understand.

I made a great play in my closing remarks that what is being done needs to be understandable and usable by the Egyptologists for whom you say you are doing this (as otherwise it reverts to being an intellectual exercise). And please remember that all of you had been thrashing these things around for some time, but it took someone like me, strongly backed by Heleen, to get you all together to move things on this much.

Personally, I cannot see why a video call (or calls) between a handful of you cannot work, but you may be right. Perhaps you do need another meeting, a technical one, just for the small group of you, where you don't need to worry about people like me asking silly questions and holding you back. I am fine with that. But I see from this present debate the somewhat entrenched positions, which I thought were breaking down on the Monday evening and Tuesday, re-emerging.

Nigel

From ishida at w3.org Wed Jul 20 16:07:40 2016
From: ishida at w3.org (ishida at w3.org)
Date: Wed, 20 Jul 2016 16:07:40 +0100
Subject: [Egyptian] Mailing list stuff
Message-ID: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org>

folks,

please note that Michael has changed the email archive so that it can be viewed by the public at large (not just subscribers). Hopefully, it is also visible to search engines.

Btw, if you have been accessing the archive you may need to change your existing bookmarks or links to point to http://evertype.com/pipermail/egyptian_evertype.com/

cheers,
ri

From ncs3 at cam.ac.uk Wed Jul 20 16:26:40 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Wed, 20 Jul 2016 16:26:40 +0100
Subject: [Egyptian] Mailing list stuff
In-Reply-To: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org>
References: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org>
Message-ID: <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk>

Does this mean it is no longer a private list? If so, I want out. Honestly, these are still private deliberations.

Nigel

From ishida at w3.org Wed Jul 20 17:22:14 2016
From: ishida at w3.org (ishida at w3.org)
Date: Wed, 20 Jul 2016 17:22:14 +0100
Subject: [Egyptian] Mailing list stuff
In-Reply-To: <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk>
References: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org> <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk>
Message-ID: <5d349150-6a18-4e47-cf0f-36d775ca7adc@w3.org>

On 20/07/2016 16:26, Nigel Strudwick wrote:
> Does this mean it is no longer a private list? If so, I want out. Honestly, these are still private deliberations.

My understanding is that only subscribers can post to the mailing list, but anyone can read the mails in the archive.

Sorry Nigel, but since you were originally pushing for a W3C list i thought you were in favour of making the archive public. This is the standard approach for W3C and Unicode lists, since it makes the information widely available and historically accessible (see http://evertype.com/pipermail/egyptian_evertype.com/2016-July/000083.html) and is likely to attract other useful experts to the discussion over time.

If you or other group members don't like that, then we should ask Michael to revert the list to a closed circulation. For my part, i think it would be a pity, though.

cheers,
ri

From mn31 at st-andrews.ac.uk Wed Jul 20 17:23:11 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Wed, 20 Jul 2016 17:23:11 +0100
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To: <9DD2825D-7541-46FB-9282-FB8C1E7BEC29@cam.ac.uk>
References: <17511299.cOQPMXRhrm@bear> <9DD2825D-7541-46FB-9282-FB8C1E7BEC29@cam.ac.uk>
Message-ID: <29403533.ajqXQdU4yC@bear>

Nigel,

You are absolutely right. The tone of my message was unacceptable. I offered my apologies to Heleen off-list.

For now, I need to take a break until Friday. Too much time pressure with unrelated matters.

Mark-Jan

From everson at evertype.com Wed Jul 20 19:47:38 2016
From: everson at evertype.com (Michael Everson)
Date: Wed, 20 Jul 2016 19:47:38 +0100
Subject: [Egyptian] Mailing list stuff
In-Reply-To: <5d349150-6a18-4e47-cf0f-36d775ca7adc@w3.org>
References: <4305cf67-a096-7a88-15e1-b2a7a88b7fd3@w3.org> <9BA81627-D9DE-445D-9EFA-7B4DD65606AA@cam.ac.uk> <5d349150-6a18-4e47-cf0f-36d775ca7adc@w3.org>
Message-ID: <9DC6E31B-AF43-4ECE-B706-69BA1445FBA4@evertype.com>

On 20 Jul 2016, at 17:22, ishida at w3.org wrote:
> My understanding is that only subscribers can post to the mailing list, but anyone can read the mails in the archive.

Correct, with the current settings.

Michael

From runa.uei at gmail.com Wed Jul 20 19:49:40 2016
From: runa.uei at gmail.com (So Miyagawa)
Date: Wed, 20 Jul 2016 20:49:40 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID:

Hi Heleen and all,

I put HieroJIS and links to its demo and how-to videos on GitHub with basic vocabulary (not including the data of TLA).

https://github.com/somiyagawa/Toolkit-for-Coptic-and-Ancient-Egyptian-including-HieroJIS

If Marwan and TLA agree, I will put everything from SINUHE there too. SINUHE (Sublime INputting of Unicode for Hieroglyphic Egyptian) is HieroJIS & Marwan's group writing font using TLA data.

Now, several students of Goettingen and Macquarie are contributing to our data. We will have a workshop mainly for these contributors in November or December.

I'm now in the process of implementing a CATEGORY or DESCRIPTION input system, like SEATEDMAN --> A1, MAN --> all the glyphs in A.
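
[Editorial sketch] A minimal illustration of how a category/description lookup of this kind might behave. It is not So's implementation: only the two mappings SEATEDMAN --> A1 and MAN --> category A come from the message above, while the table names, the tiny sample sign list, and the function names are hypothetical.

```python
# Hypothetical tables; only SEATEDMAN -> A1 and MAN -> category A are from the message.
DESCRIPTION_TO_SIGN = {"SEATEDMAN": "A1"}
CATEGORY_NAME_TO_LETTER = {"MAN": "A"}

# A small made-up sample of Gardiner codes to search in (not a real sign list).
SAMPLE_SIGNS = ["A1", "A2", "A17", "Aa1", "D2", "E6", "T31", "X1", "Z1", "Z2"]


def category_of(code):
    """Return the leading letters of a Gardiner code, e.g. 'A' for 'A17', 'Aa' for 'Aa1'."""
    letters = ""
    for ch in code:
        if not ch.isalpha():
            break
        letters += ch
    return letters


def lookup(query):
    """Resolve a typed description or category name to a list of sign codes."""
    key = query.strip().upper()
    if key in DESCRIPTION_TO_SIGN:
        # A description names exactly one sign.
        return [DESCRIPTION_TO_SIGN[key]]
    if key in CATEGORY_NAME_TO_LETTER:
        # A category name expands to every sign in that Gardiner category.
        letter = CATEGORY_NAME_TO_LETTER[key]
        return [s for s in SAMPLE_SIGNS if category_of(s) == letter]
    return []


if __name__ == "__main__":
    print(lookup("SEATEDMAN"))
    print(lookup("man"))
```

Comparing on the full letter prefix rather than the first character keeps category Aa separate from category A.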

If you know of a good list of glyph descriptions other than Gardiner's, please let me know.

Best,
So

________________________________
So Miyagawa
Georg-August-Universität Göttingen (Egyptology & Coptology, Ph.D. candidate), SFB1136 "Bildung und Religion in Kulturen des Mittelmeerraums und seiner Umwelt von der Antike bis zum Mittelalter und zum Klassischen Islam" (Research Fellow), KELLIA (Research Fellow), Coptic SCRIPTORIUM (Research Member), Unicode Consortium (Student Member)
Kyoto University (Linguistics, Ph.D. candidate)
SFB1136: https://www.uni-goettingen.de/de/531081.html
CV: https://docs.google.com/document/d/1HhhKovsJzqZQGCn6W1oNweqyqKYUfUvFTxAlStKICdM/edit?usp=sharing
Academia.edu: https://uni-goettingen.academia.edu/SoMiyagawa

"But the king of Assyria found treachery in Hoshea, for he had sent messengers to So, king of Egypt, and offered no tribute to the king of Assyria, as he had done year by year." (2 Kings 17:4, ESV)

On Wed, Jul 20, 2016 at 4:00 PM, heleen wilbrink wrote:

> Hi guys,
>
> Sorry to have kept you waiting. Here are the main conclusions and next steps (bold) as I interpreted them. I hope you will find them helpful.
>
> Have a great day,
> Heleen
>
> Repertoire
>
> 1. A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like eg IFAO) for each hieroglyph from the proposal of Michel Suignard. Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted.
> 2. *Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan*
> 3. The hieroglyphs with references can be added in tranches
> 4. The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Moller to which the hieratic dot will be added
> 5. *Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August*
> 6. *Michael will finish the proposal in the second half of August* so that it can be included in the Unicode meeting on the 6th of September
>
> Monograms:
>
> 1. I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).
> 2. *Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B*
>
> Control characters
>
> 1. On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"
> 2. It was agreed that this proposal will be finished in August, I thought by Michael, and discussed on the Unicode meeting on the 6th of September. Now it seems over the mail that several people are working on it separately
> 3. *Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B. give guidelines to Simon/Serge/Stephane for what examples they should look in their databases*
>
> Input methods and fonts
>
> 1. So and Marwan have integrated their input systems and font into SINUHE
> 2. *So/Marwan could you share with the us the online location so we can start using it?*
> 3. Bob has made and shared an experimental font for us to use

From schweitzer at bbaw.de Thu Jul 21 08:39:14 2016
From: schweitzer at bbaw.de (Simon Schweitzer)
Date: Thu, 21 Jul 2016 09:39:14 +0200
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID:

Hi all,

concerning the one ligature joiner vs. the four corner ligature joiner:

> I've not yet seen any data, evidence or rationale about the suggested 5 positional ligature system discussed at Cambridge. I don't have unlimited time to research all this myself and I'm not an Egyptologist. Technically such a system can be added to current controls but it needs to be well-defined, detailed and justified if it is to be submitted. [This is what I'm doing with the group-joiners].

Why should one use four instead of one ligature joiner?

Because the encoding with four control characters offers more information. It is readable for the Egyptologist and for the font developer. Cf. my example from last week: in the temple of Kom Ombo, there are two ligatures with the knife T31 and the bread X1. We can encode these in RES with insert[bs](T31,X1) and insert[te](T31,X1). The position of the X1 is obvious. But if we have T31&X1, one cannot decide the position of X1. A font developer could create such a ligature with the "corner control character" even if he does not see the original reference, but he cannot do anything if he only has T31&X1. And what about T31&N29 or T31&O49, ligatures that occur in the Kom Ombo temple? We (the Egyptologists, the font developers, the fonts) cannot interpret the ligature from this encoding alone.
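
[Editorial sketch] The ambiguity described above can be shown in a few lines of Python. This is an illustration only, not part of RES or of either proposal: the regular expression handles just the simple two-sign `insert[pos](A,B)` form quoted in the message, and the function names are hypothetical. It shows that the two Kom Ombo ligatures, distinct in RES, collapse to the same string once the corner information is dropped.

```python
import re
from collections import defaultdict

# Matches only the simple two-sign form used in the message, e.g. insert[bs](T31,X1).
INSERT_RE = re.compile(r"insert\[(\w+)\]\((\w+),(\w+)\)")


def to_ampersand(res_code):
    """Drop the position from a simple RES insert and return the A&B form."""
    match = INSERT_RE.fullmatch(res_code)
    if match is None:
        raise ValueError(f"not a simple two-sign insert: {res_code!r}")
    _position, base, inserted = match.groups()
    return f"{base}&{inserted}"


def collisions(res_codes):
    """Group RES codes by their A&B form and keep the forms shared by more than one code."""
    by_form = defaultdict(list)
    for code in res_codes:
        by_form[to_ampersand(code)].append(code)
    return {form: codes for form, codes in by_form.items() if len(codes) > 1}


# The two ligatures from the Kom Ombo example in the message.
KOM_OMBO = ["insert[bs](T31,X1)", "insert[te](T31,X1)"]

if __name__ == "__main__":
    for form, codes in collisions(KOM_OMBO).items():
        print(form, "<-", codes)
```

Going from the positional form to `A&B` is a lossy projection, so no font or reader can invert it without consulting the original source; that is the argument made in the message.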

Okay, the Kom Ombo temple belongs to the Ptolemaic and not to the "classical" writing system. But the encoding of this temple is important for the TLA project, because the Kom Ombo project wants to encode their data in our system, so we will have such encodings in our material. And there are such problems in more classical data, too:

For example the ligatures with E6. Bob, you offer three ligatures in your EGPZ 1.0 BETA Specification (August 2007): U+eb13 E6&Z2d, U+eb14 E6&X1, and U+eb15 E6&Z1. These ligatures could be encoded as insert[te](E6,Z2), insert[te](E6,X1), insert[b](E6,Z1) in RES. But there are examples where the plural strokes are in another corner. These examples (DZA 24.607.440 or DZA 28.723.300) could be encoded as insert[bs](E6,Z2) in RES. As with T31&X1, the encoding E6&Z2 is ambiguous. The three ligatures with E6 use different corners. Therefore, it is not clear how to interpret E6&D2 or E6&X1&Z2. But if we have the four corner control characters, we could encode something like the RES code insert[te](E6,D2) instead of E6&D2 in DZA 28.722.290. The E6&X1&Z2 would be the readable insert[bs](E6,X1*Z2) in RES (DZA 28.722.530). BTW, this example is an argument for the special grouping character "g*" which Michael wrote on the whiteboard last week: X1 g* Z2 insert_bs E6, where the g*-group has to be parsed first. But this is another topic...

Best regards,

Simon

From s.polis at ulg.ac.be Thu Jul 21 10:56:34 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Thu, 21 Jul 2016 11:56:34 +0200
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID: <847EA36B-D71C-4F32-986F-7796A7912272@ulg.ac.be>

Hi all,

Please find attached data from Ramses regarding:

1) free-groups.gly, i.e. groups which involve an absolute positioning of signs in MdC (**, etc.); many of these groups are directly relevant for (1) making a case for the four corner INSERT operators, (2) exemplifying the need for a "stack" operator, etc.
2) ligatures.gly, i.e. groups that use the "&" operator; directly relevant for illustrating all the cases of the 4 corners operators.
3) subgroups.gly, i.e. groups with parentheses in MdC, which are directly relevant for illustrating the several levels of embedding of hieroglyphs.

Note that:

* the number before each group indicates in how many spellings the group/ligature occurs in our corpus.
* the encoding of these spellings was made by PhD or MA students; there are inconsistencies in the encoding because they first and foremost tried to stick to the visual arrangement of the signs in the edition of the text.
* 98% of the data come from horizontal text (I admit that the *vertical argument* made by Bob several times escapes me a bit, but I might not fully understand what you mean, Bob).

We are happy to share this material under 2 conditions:

1 - A reference to Ramses should be made for any use, minimally referring to the website.
2 - These groups are there to help the development of an appropriate encoding scheme for Unicode as regards the control characters, but they should by no means be integrated as glyphs or characters into Unicode.

I'm sorry to insist again on this and to stress that it would be against all the principles advocated by the Egyptologists during the meeting (as well as against the conditions for using the material from Ramses): (1) such combinations of signs are productive in ancient Egyptian and we want to be able to encode them without adding new groups in Unicode, which would make no sense; (2) we have to be able to search easily for the signs in these groups as well as the position of these signs in the groups.

The only way out in my view is:

a) A well-defined set of **insert** operators with a precise semantics.
b) A way to make sub-groups (parentheses, precedence operators, begin-end marker, whatever you like).

More comments soon regarding the other topics!

Have a nice day,

Stéphane (also on behalf of Serge, of course)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: freeGroups.gly
Type: application/octet-stream
Size: 22407 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ligatures.gly
Type: application/octet-stream
Size: 8364 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: subGroups.gly
Type: application/octet-stream
Size: 13240 bytes
Desc: not available
URL:
-------------- next part --------------
------------------------------------------------------
Chercheur qualifié F.R.S.-FNRS
Université de Liège
Service d'égyptologie
Département des sciences de l'Antiquité
Place du 20-Août, B-4000 Liège
http://www.egypto.ulg.ac.be
------------------------------------------------------

From bobqq at live.co.uk Thu Jul 21 11:00:23 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Thu, 21 Jul 2016 11:00:23 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
Message-ID:

Hi Simon

Thanks for the examples.

Just to be clear to all: I'd like to make a good case for controls and identify/illustrate any variations or issues needing discussion.

If anyone has time to create a doc/PDF with a bunch of graphic illustrations of clusters showing awkward cases, this would save me time. I've plenty else to do on this ASAP, so help is appreciated.

Bob

From ncs3 at cam.ac.uk Thu Jul 21 11:22:21 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Thu, 21 Jul 2016 11:22:21 +0100
Subject: [Egyptian] Ligature joiners - evidence needed please
In-Reply-To:
References:
Message-ID:

Bob

I was just glancing to see if there were any awkward groups in the TT99 material, and one jumped out which is not so often mentioned: the tongue sign when used as imy-r "overseer". It is impossible to get the grouping right in JSesh without considerable manual manipulation, or export to Illustrator (which I prefer anyway). I attach some graphics, one from a quick go in JSesh, and others from the facsimile originals. The "tail" of the tongue has something in common with D.

Title groups are often enough to defeat any setting program.

The Z1 sign is a pain too, but that may be more because many fonts oversize it. Using "3" for the squashed plural signs produces a better effect. But I realise this is not the issue about which you are concerned.

Nigel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-4.png
Type: image/png
Size: 16344 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.png
Type: image/png
Size: 58499 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-3.png
Type: image/png
Size: 44217 bytes
Desc: not available
URL:

From s.polis at ulg.ac.be Thu Jul 21 11:41:43 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Thu, 21 Jul 2016 12:41:43 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID:

Hi Heleen,
Hi everyone,

Thanks for the summary, it's a great start for putting things on the table as clearly as possible.

> Repertoire
>
> A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like e.g. IFAO) for each hieroglyph from the proposal of Michel Suignard. Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted.
> Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan
> The hieroglyphs with references can be added in tranches
> The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Möller to which the hieratic dot will be added
> Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing including the references before mid August
> Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September

I was not there on Tuesday morning, so apologies if you reached an agreement that I am unaware of, but this is not how I see the forthcoming steps in terms of repertoire.

The proposal of Michel for the hieroglyphica represents a significant work, but a work that is impossible to document quickly. And we certainly do not want to introduce signs without knowing what they mean, what they represent and where they are attested (many problematic examples were discussed by Ingelore as well as Serge and myself during the meeting). Sourcing these hieroglyphs is an enormous endeavor, even for professional Egyptologists, and it will take some years (probably not so many, but we are talking in years before being done), and anyway there is no hurry for anyone to use Unicode for these signs as far as I can tell. What we can do quite quickly, on the other hand, is:

1. Extract all the hieroglyphs that we presently use in the TLA and Ramses, look for the ones missing in Unicode, and prepare an addition to Unicode for those signs that we can document precisely.
2. Do the same for the signs identified in Möller by Michael; check the proposal that he prepared (ask our colleague from Mainz) and prepare a document for addition to Unicode.
3. In October, some scholars will begin to work full-time on the sign-list, and the joint work between the TLA and Ramses should become progressively available online (Thot Sign-list), with references to signs, classes and forms, as well as to the scholars responsible for the identification of the values, shapes, etc. This will provide a good basis for integrating the hieroglyphica *step-by-step* in Unicode, when we can actually say what each individual sign is.

In a nutshell, the work done by Michel will be more than useful, but should be integrated step by step in the future (probably a third step).
I think I should get the support of all the Egyptologists on this. Correct me guys if I'm wrong.

> Monograms:
>
> I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).
> Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B

Agreed.

> Control characters
>
> On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"

Sure. I just sent the data from Ramses that are needed for illustrating and arguing in favour of these control characters.

> It was agreed that this proposal will be finished in August, I thought by Michael, and discussed on the Unicode meeting on the 6th of September. Now it seems over the mail that several people are working on it separately
> Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and can be shared with the group and B. give guidelines to Simon/Serge/Stephane for what examples they should look in their databases

If possible, I think that the best would be to produce a single "consensus" document, which could be signed and approved by all the parties.
I leave it up to you to decide who is to produce the first draft. I'm certainly happy to read through it carefully, provide examples, and check everything I can, but I cannot write the first draft, which would be beyond my technical capacities and the time I have left in the coming weeks.

Best wishes,

Stéphane

> Input methods and fonts
>
> So and Marwan have integrated their input systems and font into SINUHE
> So/Marwan could you share with us the online location so we can start using it?
> Bob has made and shared an experimental font for us to use

From hafemann at bbaw.de Thu Jul 21 13:19:00 2016
From: hafemann at bbaw.de (Ingelore Hafemann)
Date: Thu, 21 Jul 2016 14:19:00 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID:

Dear Stéphane, dear all,

thanks, Stéphane, for your contribution; I was just about to write a very similar mail.

I would like to agree with Stéphane on all points, especially about the sign list, and so does Simon. In Cambridge I hope to have shown in my presentation how difficult it is to check all signs. We could give Michel an electronic list of the encoded hieroglyphs that occur in our TLA, excluding the well-referenced Gardiner codes. So we have the TLA references, but there is another problem: students have used variants of signs in the process of encoding our texts, and we have to check them. Our aim is to differentiate between signs and variants as far as possible.

On point 1: we can also give you a list of signs missing in Unicode, with references from our corpus, as Stéphane proposed in point 1 below.

Adding to Stéphane's point 3 below: in Berlin we have checked and described nearly 2/3 of Group A (Man), all of Group G (Birds, 313), all of Group P (Ships, 121) and all of Group Z (Strokes, Geometrical Figures, 39), and many signs of the other groups less systematically and in passing. We are continuing this work in Berlin with half a person's working time, and are happy that in October one or more scholars will start in Liège. The Thot sign list, our joint product, will be published step by step. I guess, as Stéphane does, that it will take a few years to finish. We could pass the already finished signs to Michel step by step, if that seems useful.

Concerning point 2: I would again recommend contacting Ursula Verhoeven and Svenja Gülden in Mainz, who are engaged in creating a new electronic, digitized "Möller".

Concerning the problems of control characters, Simon has discussed the latest issues and sent some evidence from our text archive.

Many greetings and best wishes
Ingelore and Simon

--
Dr. Ingelore Hafemann
Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache
Post: Jägerstraße 22/23
Sitz: Unter den Linden 8
10117 Berlin
Tel: (030) 20370 447

From schweitzer at bbaw.de Thu Jul 21 14:24:51 2016
From: schweitzer at bbaw.de (Simon Schweitzer)
Date: Thu, 21 Jul 2016 15:24:51 +0200
Subject: [Egyptian] Brackets in the TLA encoding
Message-ID:

Hi all,

@Stéphane: thank you for your .gly-files!

In this mail, I want to add some remarks concerning the subgroup topic. As in Ramses, there are many encodings with "(" and ")" in the TLA. I collected these encodings and want to present my evaluation:

* In some cases, the encoding is invalid, e.g. (F12-S29):D21, which should be understood as F12*S29:D21.
* Sometimes the brackets are superfluous. There are many cases of Hiero1:(Hiero2*Hiero3). The brackets are not necessary: use Hiero1:Hiero2*Hiero3!
* But in many cases, parsing without the brackets would be misleading:

1) Vertical groups in horizontal groups in vertical groups

There are many of these. I list only 10 examples:

N35:"?"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1; ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22; ID:4VLZLA44UVGJZN22WIWP774LOQ)
Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21; ID:4VLZLA44UVGJZN22WIWP774LOQ)
Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
D21:V28*(X1:B1) ("Stele des Montuhotep (Cambridge E.9.1922)", l. C.3; ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
D21:V7*(W24:X1) ("Stele des Montuhotep (Kairo CG 20539)", l. I.b.18; ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
D35A:(X1:Z4A)*G37 ("Stele des Antef (Louvre C 167 = E 3111)", l. C.1; ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
D36:D21:N29*(X1:"?"*Z4*"?") ("Stele des Antef (Glasgow D1922.13)", l. A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
I10&I9:X2*(X4:Z2) ("Stele des Antef (Glasgow D1922.13)", l. A.7; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
K4:G1*(Z7:X1) (Sinuhe AOS, vs. 18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)

As you can see, this kind of grouping occurs in hieroglyphic and in hieratic texts, and this feature is also attested in the "classical" period from the Middle Kingdom (the examples from the stelae of Montuhotep and Antef).

2) Horizontal grouping of vertical groups in columns

If the text is written vertically, there are cases of horizontal groups of vertical groups, e.g. in the Buch von der Himmelskuh (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4): M17*(Q3:N35), (X1:Z1)*I12, M17*S29*(A2:Z2) and so on.

Best regards,

Simon

From hawilbrink at hotmail.com Thu Jul 21 20:19:09 2016
From: hawilbrink at hotmail.com (heleen wilbrink)
Date: Thu, 21 Jul 2016 21:19:09 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID:

Dear Ingelore and Stéphane, dear all,

It is great to see so much valuable input on our mail group. I would like to clarify one thing regarding the repertoire. I completely agree with you, Stéphane and Ingelore, and I am sorry that my wording was apparently not so clear.

For the short term (hopefully by mid-August, as was discussed in Cambridge), the check on Michael's Möller-list proposal can be done, and other missing hieroglyphs that have references can be added to it. If August is not feasible, it could, I guess, be done later this year. The compilation of the entire list with references will indeed take much longer and can be done step by step.

Thanks, Ingelore, for the reminder of your suggestion to get in contact with Ursula and Svenja. Is anyone from the Cambridge group in close contact with them and willing to contact them?

All the best,
Heleen

From runa.uei at gmail.com Thu Jul 21 23:47:36 2016
From: runa.uei at gmail.com (So Miyagawa)
Date: Thu, 21 Jul 2016 22:47:36 +0000
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:
Message-ID: Dear Heleen, I know Svenja. She is working for AKU, the Hieratic database project based in Mainz. She will give a good insight to us. Also, if you are interested in Demotic Unicode, in S?K in Vienna, I discussed unicode-ization of Demotic with Fabian from University of Heidelberg. We reached a conclusion that we need three Unicode blocks for Demotic, Early Demotic, Ptolemaic and Roman. He is one of the leaders of Demotic database project called DPDP. Best wishes from D?sseldorf International Airport, So On Thu, Jul 21, 2016 at 21:20 heleen wilbrink wrote: > Dear Ingelore and St?phane, dear all, > > It is great to see so much valuable input on our mail group. I would like > to clarify one thing regarding the repertoire. I completely agree with you > St?phane and Ingelore and I am sorry that my wording was apparently not so > clear. > > For short term (hopefully by mid August, as was discussed in Cambridge) > the check on the Mollerlist proposal of Michael can be done and other > missing hieroglyphs that have references can be added to it. If August is > not feasible it could be I guess later this year. The compilation of the > entire list with references indeed will take much longer and can be done > step by step. > > Thanks Ingelore for reminding your suggestion to get in contact with > Ursula and Svenja. Is anyone from the Cambridge group in close contact with > them and willing to contact them? > > All the best, > Heleen > > Verstuurd vanaf mijn iPhone > > Op 21 jul. 2016 om 14:20 heeft Ingelore Hafemann het > volgende geschreven: > > Dear St?phan, dear all, > > thanks St?phane for your contribution, I was just getting to write a > very similar mail at this moment. > > I would like to confirm to St?phane in al points - especially about the > sign list and even Simon does. In Cambridge I hope to have show in my > record how difficult it is to check all signs. 
> We could give Michel an electronic list of the encoded hieroglyphs that occur in our TLA, excluding the well-referenced Gardiner codes. So we have the TLA reference, but there is another problem: students used variants of signs when encoding our texts, and we have to check this. Our aim is to differentiate between signs and variants as far as possible.
>
> 1. We can also give you a list of signs missing in Unicode, with references from our corpus, as Stéphane proposed in point 1 below.
>
> Adding to Stéphane's point 3 below: in Berlin we have checked and described nearly 2/3 of Group A (Man), all of Group G (Birds, 313), all of Group P (Ships, 121) and all of Group Z (Strokes, Geometrical F., 39), and a lot of signs from the other groups less systematically and along the way. We continue this work in Berlin with half of one person's working time, and are happy that in October one or more scholars will start in Liège. The Thot sign list, our common product, will be published step by step. I guess, as Stéphane does, that it will take a few years to finish. We could give parts of the already finished signs to Michel step by step. This should be possible if it seems useful.
>
> Concerning point 2: I would again recommend contacting Ursula Verhoeven and Svenja Gülden in Mainz, who are engaged in creating a new electronic and digitized "Möller".
>
> Concerning the problems of control characters, Simon has discussed the latest issues and sent some evidence from our text archive.
>
> Many greetings and best wishes
>
> Ingelore and Simon
>
> On 21.07.2016 at 12:41, Stéphane Polis wrote:
>> Hi Heleen,
>> Hi everyone,
>>
>> Thanks for the summary, it's a great start for putting things on the table as clearly as possible.
>>
>>> Repertoire
>>>
>>> A list will be made with the references to the publication of an original text (photo or facsimile, not a print font like e.g. IFAO) for each hieroglyph from the proposal of Michel Suignard. Background: The characters in the proposal of Michel do not have sources yet. These sources are needed in order for the proposal to be accepted.
>>> Michel will do the coordination for this list with input from Stephane/Serge/Simon/Mark-Jan
>>> The hieroglyphs with references can be added in tranches
>>> The first tranche will be the proposal of Michael Everson on hieratic hieroglyphs from Möller, to which the hieratic dot will be added
>>> Stephane/Serge/Simon/Mark-Jan will review the proposal of Michael and add any other hieroglyphs that they are missing, including the references, before mid August
>>> Michael will finish the proposal in the second half of August so that it can be included in the Unicode meeting on the 6th of September
>>
>> I was not there on Tuesday morning, so apologies if you reached an agreement that I am unaware of, but this is not how I see the forthcoming steps in terms of repertoire.
>> The proposal of Michel for the hieroglyphica represents a significant work, but a work that is impossible to document quickly. And we certainly do not want to introduce signs without knowing what they mean, what they represent and where they are attested (many problematic examples were discussed by Ingelore as well as Serge and myself during the meeting). Sourcing these hieroglyphs is an enormous endeavor, even for professional Egyptologists, and it will take some years (probably not so many, but we are talking in years before being done), and anyway there is no hurry for anyone to use Unicode for these signs as far as I can tell. What we can do quite quickly, on the other hand, is:
>>
>> 1. Extract all the hieroglyphs that we presently use in the TLA and Ramses, look for the ones missing in Unicode and prepare an addition to Unicode for those signs that we can document precisely.
>> 2. Do the same for the signs identified in Möller by Michael; check the proposal that he prepared (ask our colleagues from Mainz) and prepare a document for addition to Unicode.
>> 3. In October, some scholars will begin to work full-time on the sign-list, and the joint work between the TLA and Ramses should become progressively available online (Thot Sign-list), with references to signs, classes and forms, as well as to the scholars responsible for the identification of the values, shapes, etc. This will provide a good basis for integrating the hieroglyphica *step-by-step* in Unicode, when we can actually say what each individual sign is.
>>
>> In a nutshell, the work done by Michel will be more than useful, but should be integrated step by step in the future (probably a third step).
>> I think I should get the support of all the Egyptologists on this. Correct me guys if I'm wrong.
>>
>>> Monograms:
>>>
>>> I think the consensus is to make a list of attested combinations and then decide on the approach, either monograms (A) or the use of a control character (B). This list will be used as a proposal for monograms (A) or will be used by developers for their fonts (B). Most Egyptologists were in favor of solution B because of the ease of searching and the ease of implementation (in fonts and not in Unicode proposals).
>>> Stephane/Serge/Simon/Mark-Jan will make a list and share it with the group. Then a decision is made for A or B
>>
>> Agreed.
>>
>>> Control characters
>>>
>>> On the last day of the conference there seemed to be consensus on the control characters, which Michael summed up by mail as: "The proposal is to remove the LIG character from the ballot and to add in 7 more, the two group controls of Bob, and then the four corner and the center controls"
>>
>> Sure. I just sent the data from Ramses that are needed for illustrating and arguing in favour of these control characters.
>>
>>> It was agreed that this proposal will be finished in August, I thought by Michael, and discussed at the Unicode meeting on the 6th of September.
>>> Now it seems from the mail that several people are working on it separately
>>> Suggestion: let Michael, Bob and Mark-Jan do a conference call or so soon to A. make sure there is one proposal that all agree on and that can be shared with the group and B. give guidelines to Simon/Serge/Stephane on what examples they should look for in their databases
>>
>> If possible, I think that the best would be to produce a single "consensus" document, which could be signed and approved by all the parties.
>> I leave it up to you to decide who is to produce the first draft. I'm certainly happy to read through it carefully, provide examples, and check everything I can, but I cannot write the first draft, which would be beyond my technical capacities and the time I have left during the next weeks.
>>
>> Best wishes,
>>
>> Stéphane
>>
>>> Input methods and fonts
>>>
>>> So and Marwan have integrated their input systems and font into SINUHE
>>> So/Marwan, could you share the online location with us so we can start using it?
>>> Bob has made and shared an experimental font for us to use
>>> _______________________________________________
>>> Egyptian mailing list
>>> Egyptian at evertype.com
>>> http://evertype.com/mailman/listinfo/egyptian_evertype.com
>
> --
> Dr. Ingelore Hafemann
> Strukturen und Transformationen des Wortschatzes
> der ägyptischen Sprache
> Post: Jägerstraße 22/23
> Sitz: Unter den Linden 8
> 10117 Berlin
> Tel: (030) 20370 447

--
---------------------------------
# So Miyagawa
## [so? mij??g?w?]

### Websites
* Profile: https://www.uni-goettingen.de/de/531081.html
* academia.edu: https://uni-goettingen.academia.edu/SoMiyagawa
* GitHub general: https://github.com/somiyagawa
* GitHub toolkit: https://github.com/somiyagawa/toolkitForCopticAndAncientEgyptian
* CV: https://docs.google.com/document/d/1HhhKovsJzqZQGCn6W1oNweqyqKYUfUvFTxAlStKICdM/edit?usp=sharing
* Facebook: https://www.facebook.com/runauei
* LinkedIn: https://de.linkedin.com/pub/so-miyagawa/62/777/720

### Status
1. Research Fellow at Georg-August-Universität Göttingen Sonderforschungsbereich (SFB) 1136 "Bildung und Religion in Kulturen des Mittelmeerraums und seiner Umwelt von der Antike bis zum Mittelalter und zum Klassischen Islam"
2. Research Fellow at "KELLIA: Koptische/Coptic Electronic Language and Literature International Alliance" (DFG/NEH)
3. Ph.D. candidate at Georg-August-Universität Göttingen, Philosophische Fakultät, Seminar für Ägyptologie und Koptologie
4. Ph.D. candidate at Kyoto University, Faculty of Letters, Department of Linguistics
5. Member in Coptic SCRIPTORIUM research team
6. Student Member at Unicode Consortium
---------------------------------

From s.polis at ulg.ac.be Fri Jul 22 09:07:58 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Fri, 22 Jul 2016 10:07:58 +0200
Subject: [Egyptian] Cambridge: conclusions and next steps
In-Reply-To:
References:

Message-ID: <43F73292-362C-43AD-9C20-54A5AB3606C9@ulg.ac.be>

Dear friends,

If you wish, I can take care of contacting our colleagues from Mainz. I have to travel there soon in order to discuss the structure of their hieratic sign-list in relation to the Thot Sign-List: we want to investigate how they could be linked and whether the structures are compatible.

Best wishes,

Stéphane

------------------------------------------------------
Chercheur qualifié F.R.S.-FNRS
Université de Liège
Service d'égyptologie
Département des sciences de l'Antiquité
Place du 20-Août, B-4000 Liège
http://www.egypto.ulg.ac.be
------------------------------------------------------

> On 22 Jul 2016, at 00:47, So Miyagawa wrote:
>
> Dear Heleen,
>
> I know Svenja. She is working for AKU, the hieratic database project based in Mainz. She will give us good insight. Also, if you are interested in Demotic Unicode: at SÄK in Vienna, I discussed the unicode-ization of Demotic with Fabian from the University of Heidelberg. We reached the conclusion that we need three Unicode blocks for Demotic: Early Demotic, Ptolemaic and Roman. He is one of the leaders of the Demotic database project called DPDP.
>
> Best wishes from Düsseldorf International Airport,
> So
>
> On Thu, Jul 21, 2016 at 21:20 heleen wilbrink wrote:
> Dear Ingelore and Stéphane, dear all,
>
> It is great to see so much valuable input on our mail group. I would like to clarify one thing regarding the repertoire. I completely agree with you, Stéphane and Ingelore, and I am sorry that my wording was apparently not so clear.
>
> For the short term (hopefully by mid August, as was discussed in Cambridge), the check on Michael's Möller-list proposal can be done, and other missing hieroglyphs that have references can be added to it. If August is not feasible, I guess it could be later this year. Compiling the entire list with references will indeed take much longer and can be done step by step.
>
> Thanks Ingelore for the reminder of your suggestion to get in contact with Ursula and Svenja. Is anyone from the Cambridge group in close contact with them and willing to contact them?
>
> All the best,
> Heleen

From bobqq at live.co.uk Fri Jul 22 14:34:20 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Fri, 22 Jul 2016 14:34:20 +0100
Subject: [Egyptian] Early note for comment on the 'Four corner' system for ligatures
Message-ID:

Hi All

This is a rough note to get the ball rolling on the 4 corner system of positional ligatures. I've not covered everything by any means. I've not heard anything from others on this topic yet, so I would like to have feedback asap.

I think the 5th centre position code talked about is similar to the Monogram in linking to repertoire discussion and should be treated as a distinct item. Neither affects any other revisions to the three character system under discussion. Anyone disagree?

Bob

-------------- next part --------------
A non-text attachment was scrubbed...
Name: BobsInitialThoughtsFourCorners1.pdf
Type: application/pdf
Size: 397094 bytes
Desc: not available
URL:

From mn31 at st-andrews.ac.uk Fri Jul 22 18:05:15 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Fri, 22 Jul 2016 18:05:15 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To:
References:
Message-ID: <1897138.Z3PCXJQWcV@bear>

Dear all,

We adapted our proposal. Please find it in:

https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf

We have responded to the criticism about brackets and prefix operators.
We have gotten rid of all prefix operators and replaced them by infix operators. As for brackets, it is well possible to get rid of them too (not entirely, there are still the cartouches), but the price to pay for this is added complexity of the syntax due to needing several copies of each primitive with different operator precedence, perhaps three or four. It is outlined in Section 9 how this would be done. This adds to the complexity that already exists, after the prefix operators were removed; have a look at appendix A and see whether you can verify the grammar is unambiguous.

There are no names on the proposal. That is partly because not everyone from TLA/Ramses/St Andrews has had a chance to look at it yet, and partly because anyone is welcome to have their name added if they feel they contributed and subscribe to the content.

Mark-Jan

From mn31 at st-andrews.ac.uk Fri Jul 22 19:10:33 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Fri, 22 Jul 2016 19:10:33 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To: <1897138.Z3PCXJQWcV@bear>
References: <1897138.Z3PCXJQWcV@bear>
Message-ID: <4803394.MerqE8crjr@thuis>

On Friday 22 Jul 2016 18:05:15 Mark-Jan Nederhof wrote:
> have a look at
> appendix A and see whether you can verify the grammar is unambiguous.

PS I now see it isn't. Fix will follow later, hopefully.

Mark-Jan

From bobqq at live.co.uk Fri Jul 22 23:19:12 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Fri, 22 Jul 2016 23:19:12 +0100
Subject: [Egyptian] Two group joiners
Message-ID:

Hi All

I've attached a rough description of the two group joiner additions to the encoding system. This is for comment and feedback.

These are primarily about improving vertical text and 'tall group' support in horizontal text while maintaining a straightforward sequence model for users.

Mainly for the more technically oriented members of the group but feedback appreciated from all.
Thanks

Bob

-------------- next part --------------
A non-text attachment was scrubbed...
Name: TwoGroupJoinersForEgyptianDraft2.pdf
Type: application/pdf
Size: 954772 bytes
Desc: not available
URL:

From odusseus at gmail.com Sat Jul 23 10:46:07 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 23 Jul 2016 11:46:07 +0200
Subject: [Egyptian] Some general considerations
Message-ID:

Hello Everyone

First of all: I am attaching a pdf version of this email because I am using a few images, and I am not sure they will be displayed in the right places in the email. If you don't see any image in the text below or something does not make sense, please refer to the pdf.

-------

I dare to write here an email pointing out a few general and specific observations on both what was said in Cambridge and what has been discussed in these emails in the past few days. I have the feeling that many of you will not like what I am going to write, but well...

First of all, as you know, I didn't know any of you before Cambridge, so I had the feeling of being a bit of an "external observer". This, in turn, led me to a few observations.

*First:*

Do you know the Indian story of the three (or more) blind men who are put in front of an elephant and asked to find out what they have in front of them (https://en.wikipedia.org/wiki/Blind_men_and_an_elephant)? One stands in front of the trunk, one next to the ear and one near the tail. The three blind men start touching the elephant and try to describe it and to figure out what kind of animal it is, but they end up fighting because each can touch only a small part of the animal and misses the general picture.

Besides the "entrenched positions"
mentioned by Nigel, I had the feeling that some of the participants in Cambridge were a bit like the blind men: knowing their specific fields very well, but missing the general picture a bit, and thus ending up misunderstanding the others.

Now, I don't want to sound arrogant. I don't think I have a vision of the whole picture and I don't think I am more knowledgeable than any of you. I put myself among those blind men as well. But, correct me if I am wrong, So and I (note that I am speaking only in my own name, not in So's) are the only people at that workshop who:

a) are Egyptologists and therefore know both how Egyptian hieroglyphs work and what Egyptologists need (or at least what we need as Egyptologists)

b) have been playing for a while with Unicode characters, fonts, input methods etc. and therefore have a certain understanding of how these technical tools work

c) have a good practical understanding of how non-Latin complex horizontal/vertical scripts work. In particular, So is familiar with Japanese, Chinese and Korean, I think? I, in turn, know Arabic-based scripts and Indian ones pretty well (I lived in Nepal, and when I was there I ended up teaching Nepali to Nepali children in a Nepali school), and I have played a lot with Chinese and Japanese scripts.

So perhaps our small contribution should also be considered when making sense of the whole big elephant. All the more so considering that, in spite of never having met before, So and I both ended up developing very similar solutions to some of the problems you are discussing, solutions that, by the way, seem to me very similar to what Ishida was suggesting in some of his emails. Solutions that, in fact, are already implemented by various scripts around the world.

*Second:*

Many of the problems you are discussing don't have a single "right" solution, because in fact many of those problems depend on how you interpret the data (i.e. the ancient hieroglyphs).
Therefore, the aim should not be to find the "right" solution, but rather to find the easiest *interpretation* that could lead to the easiest implementation in Unicode. We are not ancient Egyptians; we don't know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations, which are not "right" or "wrong", but rather "more efficient" or "less efficient" with respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

*Third*

As I said during the meeting in Cambridge, I am not against the introduction and use of some control characters per se. Still, honestly, the more I read about your proposals and about your control characters, the more I am convinced that using rendering algorithms at the font level, with general or contextual ligatures embedded within the fonts, as the main way to combine and display groups would be much easier and much more efficient than using control characters.

That said, allow me to call your attention to a few more specific points.

*1) JSesh approach*

I had the feeling, during the meeting and when reading your emails, that many of you think with a very JSesh-based mentality. I had the feeling that the goal of at least some of the participants is to imitate JSesh in Unicode. But Unicode *is* *not* JSesh, or at least should not be JSesh. Using control functions, parentheses etc. can be efficient in JSesh, but I doubt that a system requiring the input of 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan's last proposal) makes much sense from a Unicode perspective. Especially considering that you can obtain exactly the same result without any control character, with just 1 (sic!) general ligature within the font that will be rendered automatically by the usual, standard, already available rendering algorithms.
In other words, what can be an efficient solution to a problem in JSesh is not necessarily a good solution in Unicode. And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithms (i.e. ligatures embedded in fonts), with layouts at the editor level etc., which can help solve the problems on levels other than just the Unicode encoding level.

*2) Groups in fonts*

Someone on the list was saying that the groups need to be listed, but that not all of them necessarily need to be built into the font (or something like that). This is not true: compound characters and groups will not build themselves automatically, and if you envisage the possibility of building a group, then you *do* need to have it in the font; otherwise it won't display correctly and you will have random broken control characters popping up here and there in your hieroglyphic text. Perhaps this will not bother you in your database etc., but it will be just unacceptable for common users (and, for instance, for editors). If you accept that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font, whether you build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.

*3) the 4 "small sign in the corner of big sign" control characters.*

If you really want to have control characters to combine small signs in the corner of bigger signs, you can do that with just 2 control characters; you don't need 4 of them. This is because the distinction you are making between "big" and "small" signs is superfluous. It is enough to have 2 transversal control characters, which we can represent as "A\B" and "A/B". They will join the main signs by virtually/ideally putting them at two opposite corners of the square, on the basis of their relative order.
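A minimal sketch of this two-joiner idea in Python, under loudly stated assumptions: the backslash and slash below are ordinary ASCII stand-ins rather than real or proposed code points, the function name is hypothetical, and the reading that "A\B" runs from top-left to bottom-right while "A/B" runs from bottom-left to top-right (with chained signs spread evenly along that diagonal) is one possible interpretation, not something fixed in this thread.

```python
from typing import List, Tuple

# Stand-ins for the two hypothetical diagonal joiners; these are ordinary
# ASCII characters here, not real or proposed Unicode code points.
BACKSLASH_JOIN = "\\"  # assumed: first sign top-left, last sign bottom-right
SLASH_JOIN = "/"       # assumed: first sign bottom-left, last sign top-right


def place_on_diagonal(tokens: List[str]) -> List[Tuple[str, float, float]]:
    """Return (sign, x, y) positions in a unit square, with y growing downwards.

    `tokens` alternates signs and one kind of joiner, e.g. ["t", "\\", "w"].
    Only the relative order of the signs decides where each one lands.
    """
    signs = tokens[0::2]
    joiners = set(tokens[1::2])
    if (
        len(tokens) % 2 == 0
        or len(joiners) != 1
        or not joiners <= {BACKSLASH_JOIN, SLASH_JOIN}
    ):
        raise ValueError("expected: sign (joiner sign)+ with a single joiner kind")
    joiner = joiners.pop()
    steps = len(signs) - 1
    placed = []
    for index, sign in enumerate(signs):
        x = index / steps  # spread the signs evenly along the diagonal
        y = x if joiner == BACKSLASH_JOIN else 1.0 - x
        placed.append((sign, x, y))
    return placed


print(place_on_diagonal(["t", "\\", "w"]))
print(place_on_diagonal(["w", "\\", "t"]))
```

The point of the sketch is only that no "big sign" or "small sign" flag appears anywhere: swapping the order of the two signs swaps their corners, with the same joiner.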
So for instance, if you want to render the group [image: Inline image 2] tw, you will just type in [image: Inline image 3] t + "A\B-control-character" + [image: Inline image 4] w.

If instead you want to code for [image: Inline image 6] wt, then you will just need to type in [image: Inline image 5] w + "A\B-control-character" + [image: Inline image 7] t.

If you want to render [image: Inline image 10] twt, then you just type in [image: Inline image 11] t + "A\B-control-character" + [image: Inline image 12] w + "A\B-control-character" + [image: Inline image 13] t

In all these cases, you can simply use the same control character. You don't need two distinct control characters for that, because there is no need to specify which one is the "big" sign and which one is the "small" one: what really matters is not their size, but their relative position.

NOTE: if by any chance you are going to adopt this system of 2 control characters, then I'd like to be mentioned in the proposal. You know, it could be useful on the CV.

*4) Vertical and horizontal script and control characters*

You are talking about using control characters to render texts horizontally and vertically. This, however, can generate some quite significant problems. In particular, you have to consider that it will hardly be possible to automatically convert a vertical text encoded with control characters into horizontal text. In other words, if you have a text written in vertical columns that uses control characters and, for whatever reason, you need to turn it into horizontal text (for instance for editorial reasons: your editor wants your text in horizontal lines and not in vertical ones, e.g. because you are quoting a hieroglyphic passage within an English paragraph), you will hardly be able to do it automatically, and you will likely have to retype it entirely.
The problem is that the control characters that you will need and use in your vertical text will not be the same, and will not be input in the same places, as those that you will need in your horizontal rendition of the same text.

Let's take, for instance, the following example.

[image: Inline image 14]

If this text is typed with a vertical font (as I would expect it to be), then to display it correctly you will have to type the following sequence of Unicode characters:

[image: Inline image 15] + [image: Inline image 16] + "left/right-ctrl-character" (sic!) + [image: Inline image 17] + "up/down-control-character" (sic!) + [image: Inline image 18] + [image: Inline image 19] + [image: Inline image 20]

NOTE that you will probably (as far as I know, correct me if I am wrong) have to use the "left/right-ctrl-character" (not an up/down ctrl-character) to combine the [image: Inline image 21] and [image: Inline image 22], and then the "up/down ctrl character" (not a "left/right ctrl character") to combine them with the [image: Inline image 23], because in vertical texts the baseline of reference is the left line of the column. Note that this will change the order in which the signs have to be input, which means it will interfere with the searchability of the text. This, however, is a problem that it should be possible to solve somehow. Perhaps; I am not sure. I am not an expert in these details.

If, however, you try to display this same text (i.e. this same sequence of Unicode characters) horizontally, it won't be enough to change the direction of the text and to use a horizontal font, because what you would obtain would be something wrong like this:

[image: Inline image 24][image: Inline image 25] "broken-control-character" [image: Inline image 26] "broken-control-character"
[image: Inline image 27][image: Inline image 29][image: Inline image 30]

In order to display the same text horizontally in a graphically acceptable way, in fact, you will have to type the following sequence of Unicode characters:

[image: Inline image 31] + [image: Inline image 32] + [image: Inline image 33] + "up-down-control-character" + [image: Inline image 34] + [image: Inline image 35] + "up-down-control-character" + [image: Inline image 36]

Which would be displayed as:

[image: Inline image 37]

Essentially, you will have to type the text anew. Not very practical, in my opinion.

Using rendering algorithms, i.e. general or contextual ligatures embedded within the *font* at the font level (instead of control characters), would be a very easy way to solve the problem. In fact, it would be enough to have two fonts, one for vertical texts and one for horizontal texts (or one single font with the possibility of switching between the two layouts), with different sets of ligatures embedded in them. In that way, you will just have to type the following plain sequence of Unicode characters (without any control character):

[image: Inline image 38] + [image: Inline image 39] + [image: Inline image 41] + [image: Inline image 42] + [image: Inline image 43] + [image: Inline image 45]

And the ligature algorithm in the vertical font will render the ligature [image: Inline image 46]. Then, to turn this same vertical string of text into horizontal text, you will just have to change the font, and the algorithm in the horizontal font will automatically display the correct [image: Inline image 47] and [image: Inline image 48] ligatures, without any need to retype anything.

NOTE that the hieroglyphic text you see in the examples above has all been obtained in this way, namely without control characters and just with my font with embedded ligatures.

As Ishida suggested in one of his emails, this is essentially how Asian languages deal with these problems.
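The font-level approach described here can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the sign codes (`s1` to `s6`), the glyph names and both ligature tables are invented placeholders, and the greedy longest-match loop merely stands in for what a real shaping engine does with a font's ligature substitutions; it does not reproduce the font mentioned above.

```python
from typing import Dict, List, Tuple

# Hypothetical ligature tables: each maps a run of plain sign codes to the
# name of a pre-composed group glyph stored in that font. The sign codes and
# glyph names are placeholders, not taken from any real font.
LigatureTable = Dict[Tuple[str, ...], str]

HORIZONTAL_FONT: LigatureTable = {
    ("s1", "s2"): "group_s1_over_s2",
    ("s4", "s5"): "group_s4_over_s5",
}
VERTICAL_FONT: LigatureTable = {
    ("s2", "s3"): "group_s2_beside_s3",
}


def shape(signs: List[str], font: LigatureTable) -> List[str]:
    """Greedy longest-match ligature substitution over a plain sign sequence."""
    longest = max((len(key) for key in font), default=1)
    glyphs: List[str] = []
    i = 0
    while i < len(signs):
        # Try the longest candidate run first, down to runs of two signs.
        for size in range(min(longest, len(signs) - i), 1, -1):
            run = tuple(signs[i:i + size])
            if run in font:
                glyphs.append(font[run])
                i += size
                break
        else:
            glyphs.append(signs[i])  # no ligature applies: emit the sign itself
            i += 1
    return glyphs


# The stored text is the same plain sequence in both cases; only the font changes.
text = ["s1", "s2", "s3", "s4", "s5", "s6"]
horizontal = shape(text, HORIZONTAL_FONT)
vertical = shape(text, VERTICAL_FONT)
print(horizontal)
print(vertical)
```

In this model the encoded text never changes, so searching and retyping are unaffected by the choice of layout; the cost is that every group one wants to see must exist as an entry in each font's table.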
It is very efficient, it works, it does not require any new special character, and frankly I still don't understand why Egyptian should be different.

*5) special characters, vertical/horizontal texts and input methods*

Note that because of the problem with control characters and horizontal/vertical texts highlighted above, you *won't be able* to use the same simple predictive input method to type vertical and horizontal texts, because the sequences of signs and control characters needed will be *totally* different, and both will need to be encoded independently within the input method itself. In fact, you will probably have to list explicitly, in advance, within the input method every single possible combination of hieroglyphic signs + control characters, or you will have to type your text sign by sign, control character by control character (even if you adopt shortcuts to input the control characters and the most common groups, the problem will still be there).

Again, with general or contextual ligatures embedded at the font level this problem would not exist, because the input method would just have to input the plain sequence of signs, and the rendering algorithm within the fonts would take care of displaying the signs in the correct spatial order.

*6) Ramesside "groups" (or "tall groups").*

You all seem to assume that the clusters of signs we see in Ramesside texts are "groups" (or ligatures) analogous to the Middle Egyptian square groups, and therefore have to be created, manipulated and displayed as graphical units.

For the non-Egyptologists among us, this is a Ramesside text:

[image: Inline image 49]

In red you have what is generally referred to as a "Ramesside group" (or "tall group").

[image: Inline image 50]

Here, instead, you have an ordinary Egyptian text,

[image: Inline image 51]

and in red you have what are generally considered ordinary "square groups".
[image: Inline image 52]

As you can see, the difference is that in ordinary Egyptian writing, signs are grouped into regular square spaces (or half-squares). In Ramesside writing, instead, signs tend to be combined into vertically elongated rectangles. These rectangles can contain multiple ordinary square groups. For instance, the following Ramesside group from the text above:

[image: Inline image 53]

could be split into two ordinary square groups, [image: Inline image 54] and [image: Inline image 55], in the ordinary square-group writing.

As said, people often assume that these clusters of signs in Ramesside texts have to be understood as "groups" analogous to the "square groups" of the ordinary writing. But what if the Ramesside "groups" were actually not really "groups"? Or better, what if there were an easier and more efficient way to analyse, interpret, describe, and therefore display them?

People often seem to assume that there are only two main ways to write a string of text, namely vertically or horizontally. This, however, is not true. It is also possible to write short horizontal strings of text within a larger main vertical frame, and similarly it is possible to write short vertical strings of text within a larger main horizontal frame.

The second case, in particular, is attested in Asian scripts. Have a look at the following image:

[image: Inline image 57]

This is a Japanese text. As you can see from the next image, it is written in short vertical "columns" (red), organized into one general horizontal "line" (green).

[image: Inline image 56]

Namely, the text is written (the text in the pic is right to left; I am transcribing it left to right because it is easier to explain the concept)

---------------
?????
?????
---------------

But it has to be read:

?? | ?? | ?? | ?? | ??

namely:

??????????

Japanese writing has "groups" (i.e.
characters, marked in white in the following picture) that in some general respects could be compared to Egyptian groups (they are not identical, I know, but *in some respects* they are conceptually comparable).

[image: Inline image 58]

Now, *no one* dealing with Japanese writing would consider the short vertical strings of text (the red bits above) as some sort of "non-ordinary elongated groups (characters)". And *no one* would ever consider coding such hypothetical elongated vertical groups into Unicode, or devising control characters or anything else to pre-compose them. They are just sequences of *regular* groups (characters) written vertically within a main horizontal layout. And in order to represent such a text in Unicode, you just have to play with the layout of your text in your text editor. You don't need special control characters or anything else; it is *just* a question of *layout*.

Now, Ramesside inscriptions can be analysed in exactly the same way.

[image: Inline image 59]

In other words, what you are interpreting as the Ramesside "abnormal elongated/tall groups" (red in the pic above) could actually be interpreted just as short strings of text written vertically (i.e. short *columns*) within a bigger main horizontal layout (i.e. in lines, in green).

Such an interpretation has a few advantages, both with respect to the actual data from the ancient texts and as far as Unicode etc. is concerned.

First, someone was pointing out that there are thousands of such "Ramesside groups" and that a large part of them are attested only once. Well, if you interpret them as short columns of text, rather than as groups, you understand why: those are short strings of text; they are not isolated, independent graphic and orthographic units. This also helps to answer the question: "how many groups are you expecting to find in the future?" Virtually, potentially, infinite.
If tomorrow we should find a new Ramesside temple with a new long text inscribed in "Ramesside groups", there could be hundreds, even thousands of these new short vertical strings of text. Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as a single independent "group". It does not make much sense, just as it would not make sense to try to describe and code every single horizontal sequence of groups (characters) in the Japanese texts above.

Rather, it would make much more sense to deal with these "Ramesside short vertical strings of text disposed in horizontal lines" as if they were.. well, short vertical strings of text disposed in horizontal lines.

This can be *easily* done at the layout level, either with xml/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that already. It is enough to use a vertical font, and create a page layout with horizontal sequences (= the lines) of very short vertical columns. That is all you need.

The Ramesside "group" (or "tall group")

[image: Inline image 60]

used by Bob as an example in his last proposal, for instance, would not need to be built and displayed as a "group" (i.e. with control characters etc.) at all; it could just be input as a plain sequence of independent signs displayed with a vertical font within a vertical column, in a general layout that presents a series of horizontal sequences (i.e. "lines") of such short columns. No need for ligatures, no need for control characters, nothing. Just vertical fonts and properly set up layouts for the page where the signs will be displayed.

Obviously, there will still be signs that will need to be combined in "groups", i.e. in "real" square groups.
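[Editorial sketch] The "short columns inside a horizontal line" layout described above can be illustrated with a few lines of Python. This is only a toy model under stated assumptions: the function name `columns_to_rows`, the fixed column height and the letters standing in for signs are all hypothetical, and real layout engines work with glyph metrics rather than a simple grid.

```python
def columns_to_rows(signs, column_height):
    """Lay out signs (given in reading order) as short vertical columns
    placed side by side, and return the grid row by row."""
    # Cut the reading-order sequence into columns of at most column_height signs.
    columns = [signs[i:i + column_height]
               for i in range(0, len(signs), column_height)]
    rows = []
    for r in range(column_height):
        # Row r takes the r-th sign of every column; short columns are padded.
        rows.append([col[r] if r < len(col) else "" for col in columns])
    return rows


reading_order = list("abcdefghij")  # placeholder signs, stored as plain text
for row in columns_to_rows(reading_order, 2):
    print(" ".join(row))
```

The stored sequence stays in plain reading order; only the layout step decides that it is shown as five columns of two signs each, which mirrors the Japanese example given earlier in the message.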
As you can see from the pic above, however, if you interpret the Ramesside texts as composed of short vertical columns, the number of groups needed decreases significantly. And in general, those groups are often the same basic groups that you find in good old ordinary Egyptian square-group orthography. So there is no need to list and encode (in Unicode or at the font level, it doesn't matter) thousands of unique groups; we would just need to encode the basic square groups and then play a bit with the layout of the text.

And this brings me to the next point:

*7) What are the "square groups"?*

I have never specifically worked on square groups from a linguistic point of view, and I do not know if there is any specific study of square groups and their graphic behaviour within a larger linguistic frame (for instance, comparing them with the behaviour of similar "groups" in other writing systems). This is one of those points where the experience of some of you could be extremely useful. Such a study could be used, for instance, to define some contextual rules that allow the font to manage the combination of at least some of these groups automatically, or at least to define some rules to automatically prioritize one ligature over another (if we are working with ligatures at the font level).

If such a study exists, then it would be useful to take it into consideration in the discussion.

If such a study does not exist, then perhaps it would be worth considering doing it *before* submitting any new proposal involving control characters to the Unicode Consortium, because I think it would be better to be sure we really understand how those groups work before suggesting a method to encode them. I am obviously talking about control characters etc.; I am not talking about expanding the basic set of glyphs.
Otherwise, we would be encoding something whose actual functioning has never been studied, and therefore we would risk encoding features and elements that are actually superfluous, or that could have been managed more efficiently at other levels (ligatures within the fonts, layout tables at the text-editor level etc.).

Ok, I guess this is more or less all I wanted to say. Now feel free to ignore me :-)

Best
Marwan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Some general considerations.pdf
Type: application/pdf
Size: 282164 bytes
Desc: not available
URL: 

From mn31 at st-andrews.ac.uk Sat Jul 23 13:47:15 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 23 Jul 2016 13:47:15 +0100
Subject: [Egyptian] Brackets in the TLA encoding
In-Reply-To: 
References: 
Message-ID: <1922257.noGW0DXB86@thuis>

Hi Simon, Hi Stéphane, Hi All,

This is very helpful.

Physicists tell us that if you want to gather and use data, you need hypotheses first, or else you don't know what to look for. For me the relevant hypotheses are:

(1) The primitives we have in our current document allow description of most of the groups in an accurate enough way. Both 'most of' and 'accurate enough' are subjective of course. There is no escaping that.

(2) It would be quite difficult to reduce the expressive power before we would lose coverage.

There is an implicit parameter, which is a limit on the depth of nesting, which I assume is 3.
As also Simon confirmed once more, 2 is not enough, even for the most basic, run-of-the-mill classical (horizontal) inscriptions.

As to (1), we have moved away quite considerably from descriptive power that is machine-interpretable. This was motivated by people finding the original encoding too complicated, and arguing that fonts would do a lot of fine-tuning anyway for particular choices of signs. Also, we don't really care about a sign being printed 0.5 mm too much to the left or to the right, as long as the user gets a rough idea of what the text looks like. These arguments all sound reasonable, but realise two things:

* If even stupid machines don't know how to render an encoding roughly as it was intended, perhaps there is not enough information present for humans to know what was meant either.

* As stressed once more by Stéphane, the kinds of groups we are talking about are productive. We don't want to be manually fine-tuning the appearance of an unbounded number of groups, so some approximately correct automatic rendering would be quite useful.

I think we are still okay with the present version of the proposal, but we have moved a long way from existing routines that do the rendering in a deterministic, predictable manner, to needing lots more refinements to program code and the result being not quite well-defined.

As to both (1) and (2), the provided examples include quite a few cases of insertions and stacking, insertions into stacked groups, and even groups with insertions that are themselves inserted. So far I don't see either hypothesis refuted.

I had to struggle quite a bit to get rid of prefix operators.
As anyone with the slightest knowledge of formal languages knows, prefix or suffix operators are ideal for automatic processing, because the problem of ambiguity simply does not exist, whereas endless volumes of textbooks since the late 1950s have been written about the ambiguities caused by infix operators and how to solve them using principled or not so principled methods involving operator precedence and low-level hacks in shift-reduce parsers. Using infix operators is really only justifiable if notation is meant for human consumption.

That is why I was very surprised to hear objections with the argument that font technology is too primitive to handle prefix operators. If anything, I would have imagined that primitive tools would have a lot of difficulty with parsing in the presence of operator precedence and such. I implemented OpenType substitution rules that analysed bracket structures and prefix operators myself, and that works fine. It would be a nightmare for me to have to implement OpenType substitution rules in the presence of operator precedence. There may be something in the arguments people use that I don't understand.

Anyway, one thing to look out for (I say this in particular to Simon, Stéphane and Serge, with whom this was discussed in detail in Cambridge), is that in the process of getting rid of prefix operators, and avoiding ambiguity, the following coverage was lost: it is not possible anymore to insert A into the top-left corner of B, and to insert the resulting group into the bottom-left corner of C. The same holds for the right corners. I have yet to see a group where this matters.

It is still possible however to insert A into the bottom-left corner of B, and to insert the resulting group in the bottom-left corner of C. The same holds for two upper-left corners and the corresponding right corners. There are groups of these forms among the provided examples.
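[Editorial sketch] The prefix-versus-infix point made above can be illustrated with two tiny parsers. This is only a sketch under stated assumptions: the operator names `VERT` and `HORIZ` and the prefix notation itself are hypothetical (they are not the operators of the proposal under discussion), the infix parser covers only `:`, `*` and brackets, with `*` binding tighter than `:` as in the TLA strings quoted later in this message, and `depth` is just one possible way of counting nesting.

```python
import re


def parse_prefix(tokens):
    """Prefix notation: each operator is followed by exactly two operands,
    so no precedence rules or brackets are needed."""
    pos = 0

    def node():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in ("VERT", "HORIZ"):
            left = node()
            right = node()
            return (tok, left, right)
        return tok  # a sign name

    tree = node()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return tree


def parse_infix(text):
    """Infix notation: needs a precedence rule ('*' before ':') and brackets."""
    tokens = re.findall(r"[A-Za-z0-9]+|[:*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = vertical()
            if peek() != ")":
                raise ValueError("missing closing bracket")
            pos += 1
            return inner
        if tok in (":", "*", ")"):
            raise ValueError("unexpected " + tok)
        return tok

    def horizontal():  # higher precedence: A*B
        nonlocal pos
        left = atom()
        while peek() == "*":
            pos += 1
            left = ("HORIZ", left, atom())
        return left

    def vertical():  # lower precedence: A:B
        nonlocal pos
        left = horizontal()
        while peek() == ":":
            pos += 1
            left = ("VERT", left, horizontal())
        return left

    tree = vertical()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return tree


def depth(tree):
    """Number of nested operators on the deepest path (0 for a single sign)."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))


infix_tree = parse_infix("D21:V28*(X1:B1)")
prefix_tree = parse_prefix("VERT D21 HORIZ V28 VERT X1 B1".split())
print(infix_tree == prefix_tree, depth(infix_tree))
```

Both parsers build the same tree for this example, but only the infix one needs a precedence hierarchy and bracket handling, which is the extra machinery the paragraph above refers to.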
The problem of course is that inability to find certain structures in the corpora we happen to have at this very moment does not prove their non-existence. At best it means our encoding won't be too much lacking in terms of coverage.

Best regards,

Mark-Jan

On Thursday 21 Jul 2016 15:24:51 Simon Schweitzer wrote:
> Hi all,
>
> @Stéphane: thank you for your .gly-files! In this mail, I want to add
> some remarks concerning the subgroup topic.
>
> As in Ramses, there are many encodings with "(" and ")" in the TLA. I
> collected these encodings and I want to present you my evaluation:
> * In some cases, the encoding is invalid, e.g. (F12-S29):D21, which
> should be understood as F12*S29:D21.
> * Sometimes, the encoding of the brackets is superfluous. There are many
> cases of Hiero1:(Hiero2*Hiero3) The brackets are not necessary: use
> Hiero1:Hiero2*Hiero3 !
> * But in many cases, the parsing without the brackets would be misleading:
> 1) There are many vertical groups in horizontal groups in vertical
> groups. I list only 10 examples:
> N35:"?"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1;
> ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
> Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22;
> ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21;
> ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
> D21:V28*(X1:B1) ("Stele des Montuhotep (Cambridge E.9.1922)", l. C.3;
> ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
> D21:V7*(W24:X1) ("Stele des Montuhotep (Kairo CG 20539)", l. I.b.18;
> ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
> D35A:(X1:Z4A)*G37 ("Stele des Antef (Louvre C 167 = E 3111)", l. C.1;
> ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
> D36:D21:N29*(X1:"?"*Z4*"?") ("Stele des Antef (Glasgow D1922.13)", l.
> A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> I10&I9:X2*(X4:Z2) ("Stele des Antef (Glasgow D1922.13)", l. A.7;
> ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> K4:G1*(Z7:X1) (Sinuhe AOS, vs.
18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)
> As you can see, this kind of grouping occurs in hieroglyphic and in
> hieratic texts, and this feature is also attested in the "classical"
> period from the Middle Kingdom (the examples from the stela of
> Montuhotep and Antef).
> 2) horizontal grouping of vertical groups in columns
> If the text is written vertically, there are cases of horizontal groups
> of vertical groups, e.g. in the Buch von der Himmelskuh
> (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4), M17*(Q3:N35), (X1:Z1)*I12,
> M17*S29*(A2:Z2) and so on.
>
> Best regards,
>
> Simon
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com
>

From bobqq at live.co.uk Sat Jul 23 13:51:53 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 23 Jul 2016 13:51:53 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To: 
References: 
Message-ID: 

Hi Marwan

Thank you for sharing your thoughts. I'm sympathetic to much of what you have written. Lots of good points.

When I got the ball rolling on the topic 18 months ago I submitted a rough background note to UTC, L2/15-069. I gave two general approaches.

1. An implied clustering scheme (similar to what you are suggesting in your note). I first raised this "Simplified Egyptian" notion at I&E 2006 when we were discussing the initial repertoire.

2. An example of an explicit approach (much like what we have now as the UTC recommendation).

Like you, I like the Simplified approach, and it would work well for much casual use. One reason the explicit approach was actually chosen is that it enables the author of a transcription to be clear about the intended layout. If the look of a text were reliant on a particular font's clustering model, there would be opportunities for confusion in the long term. Some explicit structure seemed essential.

The need to input joiner characters was a tradeoff - 
however specialist software for Egyptologists will be able to use specialist input methods for fast input of text. And remember the most popular input method is copy and paste! The simplicity of a single LIGATURE was proposed with consideration for input methods and the practicality/usability of editing in general-purpose software.

I've not had time yet to analyse the Ramses data fully and have not yet received corresponding TLA data, but on the evidence so far the 4-corner method is possibly needed for at most something like 1 in 5,000 clusters, so this low frequency should be taken into account when we decide what to do.

On vertical and horizontal, have you looked at the 3 controls + 2 group controls (as per my note yesterday)? Have you tried experimenting with the font I sent out in the week (this has a couple of vertical/tall group examples in the doc about the font)?

Regards,

Bob

From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Marwan Kilani
Sent: 23 July 2016 10:46
To: Egyptian Hieroglyphs in the UCS
Subject: [Egyptian] Some general considerations

Hello Everyone

First of all: I am attaching a pdf version of this email because I am using a few images, and I am not sure they will be displayed in the right places in the email. If you don't see any image in the text below or something does not make sense, please refer to the pdf.

-------

I dare to write here an email pointing out a few general and specific observations on both what has been said in Cambridge, and what has been discussed in these emails in the past few days. I have the feeling that many of you will not like what I am going to write, but well..

First of all, as you know I didn't know any of you before Cambridge, so I felt a bit like an "external observer". Which, in turn, led me to a few observations.
First:

Do you know the Indian story of the three (or more) blind men who are put in front of an elephant and are asked to find out what they have in front of them ( https://en.wikipedia.org/wiki/Blind_men_and_an_elephant )? One in front of the trunk, one next to the ear and one near the tail. The three blind men start touching the elephant and try to describe it and to figure out what kind of animal it is, but they end up fighting because each can only touch a small part of the animal and misses the general picture.

Besides the "entrenched positions" mentioned by Nigel, I had the feeling that some of the participants in Cambridge were a bit like the blind men, knowing their specific fields very well but missing the general picture a bit, and thus ending up misunderstanding the others.

Now, I don't want to sound arrogant. I don't think I have a vision of the whole picture and I don't think I am more knowledgeable than any of you. I put myself among those blind men as well. But consider that, correct me if I am wrong, So and I (note that I am talking only in my name, not in So's name) are the only people at that workshop who:

a) are Egyptologists and therefore know both how Egyptian hieroglyphs work and what Egyptologists need (or at least what we need as Egyptologists)

b) have been playing for a while with Unicode characters, fonts, input methods etc. and therefore have a certain understanding of how these technical tools work

c) have a good practical understanding of how non-Latin complex horizontal/vertical scripts work. In particular, So is familiar with Japanese, Chinese and Korean, I think? While I know Arabic-based scripts and Indian ones pretty well (I lived in Nepal and when I was there I ended up teaching Nepali to Nepali children in a Nepali school), and I have played a lot with Chinese and Japanese scripts.

Then, perhaps, our small contribution should also be considered in making sense of the whole big elephant.
This is even more the case considering that, in spite of having never met before, both So and I ended up developing very similar solutions to some of the problems you are discussing, solutions that, by the way, seem to me very similar to what Ishida was suggesting in some of his emails. Solutions that, in fact, are already implemented by various scripts around the world.

Second:

Many of the problems you are discussing don't have a single "right" solution, because in fact many of those problems depend on how you interpret the data (i.e. the ancient hieroglyphs). Therefore, the aim should not be to find the "right" solution, but rather to find the easiest *interpretation* that could lead to the easiest implementation in Unicode. We are not ancient Egyptians; we don't know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations. Which are not "right" or "wrong", but rather "more efficient" or "less efficient" with respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

Third:

As said during the meeting in Cambridge, I am not against the introduction and use of some control characters per se. Still, honestly, the more I read about your proposals and about your control characters, the more I am convinced that using rendering algorithms at the font level, and general or contextual ligatures embedded within the fonts, as the main way to combine and display groups would be much easier and much more efficient than using control characters.

This said, allow me to call your attention to a few more specific points.

1) JSesh approach

I had the feeling during the meeting, and reading your emails, that many of you think in a very JSesh-based mentality. I had the feeling that the goal of at least some of the participants is to imitate JSesh in Unicode. But Unicode *is* *not* JSesh, or at least should not be JSesh.
Using control functions, parentheses etc. can be an efficient way in JSesh, but I doubt that a system that requires inputting 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan's last proposal) makes much sense from a Unicode perspective. Especially considering that you can obtain exactly the same result without control characters and just with 1 (sic!) general ligature within the font, which will be automatically rendered by the usual, standard, already available rendering algorithms.

In other words, what can be an efficient solution to a problem in JSesh is not necessarily a good solution in Unicode. And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithms (i.e. ligatures embedded in fonts), with layouts at the editor level etc., which can help to solve the problems at a different level than just the Unicode encoding level.

2) Groups in fonts

Someone in the list was saying that the groups need to be listed, but that not all of them necessarily need to be built in the font (or something like that).

This is not true: compound characters and groups will not automatically build on their own, and if you envisage the possibility of building a group, then you *do* need to have it in the font, otherwise it won't display correctly and you will have random broken control characters popping out here and there in your hieroglyphic text. This perhaps will not bother you in your database etc., but it will be just unacceptable for common users (and, for instance, for editors).

If you are accepting that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font, whether you build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.

3) the 4 "small sign in the corner of big sign" control characters.
If you really want to have control characters to combine small signs in the corner of bigger signs, you can do that with just 2 control characters, you don't need 4 of them. This is because the distinction you are making between "big" and "small" signs is superfluous.

It is enough to have 2 transversal control characters, which we can represent as "A\B" and "A/B". They will join the main signs by virtually/ideally putting them at two opposite corners of the square, on the basis of their relative order.

So for instance, if you want to render the group tw, you will just type in

t + "A\B-control-character" + w.

If instead you want to code for wt, then you will just need to type in

w + "A\B-control-character" + t.

If you want to render twt, then you just type in

t + "A\B-control-character" + w + "A\B-control-character" + t

In all these cases, you can simply use the same control character. You don't need two distinct control characters for that, because there is no need to specify which one is the "big" sign and which one is the "small" one: what really matters is not their size, but their relative position.

NOTE: if by any chance you are going to adopt this system of 2 control characters, then I'd like to be mentioned in the proposal. You know, it could be useful on the CV.

4) Vertical and horizontal script and control characters

You are talking about using control characters to render texts in horizontal and vertical texts. This, however, can generate some quite relevant problems. In particular, you have to consider that it will be hardly possible to automatically convert a vertical text encoded with control characters into horizontal text.
In other words, if you have a text written in vertical columns that uses control characters, and for whatever reason you need to turn it into horizontal text (for instance for editorial reasons: your editor wants your text in horizontal lines and not in vertical lines, say because you are quoting a hieroglyphic passage within an English paragraph), you will hardly be able to do it automatically, and you will likely have to retype it entirely.

The problem is that the control characters that you will need and use in your vertical text will not be the same, and will not be inputted in the same places, as those that you will need in your horizontal rendition of the same text.

Let's take, for instance, the following example.

If this text is typed with a vertical font (as I would expect it to be), then to display it correctly you will have to type the following sequence of Unicode characters:

+ + "left/right-ctrl-character" (sic!) + + "up/down-control-character" (sic!) + + +

NOTE that you will probably (as far as I know, correct me if I am wrong) have to use the "left/right-ctrl-character" (not an up/down ctrl-character) to combine the and and then the "up/down ctrl character" (not a "left/right ctrl character") to combine them with the , because in vertical texts the baseline of reference is the left line of the column. Note that this will change the order in which the signs have to be inputted, which means it will interfere with the searchability of the text. This however is a problem that it should be possible to solve somehow. Perhaps, I am not sure. I am not an expert in these details.

If however you try to display this same text (i.e. this same sequence of Unicode characters) horizontally, it won't be enough to change the direction of the text and to use a horizontal font, because what you would obtain would be something wrong like this:

"broken-control-character" "broken-control-character"
In order to display the same text horizontally in a graphically acceptable way, in fact, you will have to type the following sequence of Unicode characters:

+ + + "up-down-control-character" + + + "up-down-control-character" +

Which would be displayed as:

Essentially, you will have to type the text anew. Not very practical, in my opinion.

Using rendering algorithms, i.e. general or contextual ligatures embedded within the *font* at the font level (instead of control characters), would be a very easy way to solve the problem.

In fact it would be enough to have two fonts, one for vertical texts and one for horizontal texts (or one single font with the possibility to switch between the two layouts), with different sets of ligatures embedded in them. In that way, you will just have to type the following plain sequence of Unicode characters (without any control character):

+ + + + +

And the ligature algorithm in the vertical font will render the ligature . Then, to turn this same vertical string of text into horizontal text you will just have to change the font, and the algorithm in the horizontal font will automatically display the correct and ligatures. Without any need to retype anything.

NOTE that the hieroglyphic texts you see in the examples above have all been obtained in this way, namely without control characters and just with my font with embedded ligatures.

As Ishida suggested in one of his emails, this is essentially how Asian languages deal with these problems. It is very efficient, it works, it does not require any new special character, and frankly I still don't understand why Egyptian should be different.
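The font-level approach described in point 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sign names (`s1` to `s6`), the two ligature tables and the function `apply_ligatures` are hypothetical and are not taken from any real font or from the examples in this message. The greedy longest-match substitution is loosely modelled on how font ligature substitution behaves in general, not on a specific rendering engine.

```python
# Illustrative sketch only: sign names and ligature tables are hypothetical.

def apply_ligatures(signs, ligatures):
    """Replace runs of plain signs by ligature glyphs (greedy, longest match first)."""
    longest = max((len(key) for key in ligatures), default=1)
    out = []
    i = 0
    while i < len(signs):
        # Try the longest possible run starting at position i, then shorter ones.
        for size in range(min(longest, len(signs) - i), 1, -1):
            run = tuple(signs[i:i + size])
            if run in ligatures:
                out.append(ligatures[run])
                i += size
                break
        else:
            # No ligature starts here: keep the plain sign.
            out.append(signs[i])
            i += 1
    return out


# Two "fonts" that differ only in the ligatures they embed (hypothetical).
HORIZONTAL_FONT = {
    ("s2", "s3"): "s2_over_s3",
    ("s5", "s6"): "s5_over_s6",
}
VERTICAL_FONT = {
    ("s2", "s3"): "s2_beside_s3",
}

# The encoded text is one plain sequence of signs, with no control characters.
text = ["s1", "s2", "s3", "s4", "s5", "s6"]

horizontal = apply_ligatures(text, HORIZONTAL_FONT)
vertical = apply_ligatures(text, VERTICAL_FONT)
```

With these tables the same encoded sequence yields four glyphs under the horizontal table and five under the vertical one, and switching orientation means switching tables rather than retyping. The sketch also shows the trade-off: the relative position of the signs is stored in the font table and not in the text itself.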
5) special characters, vertical/horizontal texts and input methods

Note that because of the problem with control characters and horizontal/vertical texts highlighted above, you *won't be able* to use the same simple predictive input method to type vertical and horizontal texts, because the sequences of signs and control characters needed will be *totally* different, and will both need to be encoded independently within the input method itself.

In fact, you will probably have to explicitly list in advance within the input method every single possible combination of hieroglyphic signs + control characters, or you will have to just type your text sign by sign, control character by control character (even if you adopt shortcuts to input the control characters and the most common groups, the problem will still be there).

Again, with general or contextual ligatures embedded at the font level this problem would not exist, because the input method would just have to input the plain sequence of signs, and it would be the rendering algorithm within the fonts that takes care of displaying the signs in the correct spatial order.

6) Ramesside "groups" (or "tall groups").

You all seem to assume that the clusters of signs we see in Ramesside texts are "groups" (or ligatures) analogous to the Middle Egyptian square groups, and therefore have to be created, manipulated and displayed as graphical units.

For the non-Egyptologists among us, this is a Ramesside text; in red you have what is generally referred to as a "Ramesside group" (or "tall group").

Here, instead, you have an ordinary Egyptian text, and in red you have what are generally considered ordinary "square groups".

As you can see, the difference is that in ordinary Egyptian writing, signs are grouped into regular square spaces (or half-squares). In Ramesside writing, instead, signs tend to be combined into vertically elongated rectangles. These rectangles can contain multiple ordinary square groups.
For instance, the following Ramesside group from the text above: could be split into two ordinary square groups, and , in the ordinary square groups writing.

As said, people often assume that these clusters of signs in Ramesside texts have to be understood as "groups" analogous to the "square groups" of the ordinary writing.

But what if the Ramesside "groups" were actually not really "groups"? Or better, what if there were an easier and more efficient way to analyse, interpret, describe, and therefore display them?

People often seem to assume that there are only two main ways to write a string of text, namely vertically or horizontally. This however is not true. It is also possible to write short horizontal strings of text within a larger main vertical frame, and similarly it is possible to write short vertical strings of text within a larger main horizontal frame. The second case, in particular, is attested in Asian scripts.

Have a look at the following image:

This is a Japanese text. As you can see from the next image, it is written in short vertical "columns" (red), organized into one general horizontal "line" (green).

Namely, the text is written (the text in the pic is right to left, I am transcribing it left to right because it is easier to explain the concept)

---------------
?????
?????
---------------

But it has to be read:

?? | ?? | ?? | ?? | ??

namely:

??????????

Japanese writing has "groups" (i.e. characters, marked in white in the following picture) that in some general respect could be compared to Egyptian groups (they are not identical, I know, but *in some respects* they are conceptually comparable).

Now, *no one* dealing with Japanese writing would consider the short vertical strings of text (the red bits above) as some sort of "non-ordinary elongated groups (characters)". And *no one* would ever consider coding such hypothetical elongated vertical groups into Unicode, or devising control characters or anything else to pre-compose them.
They are just sequences of *regular* groups (characters) written vertically within a main horizontal layout. And in order to represent such a text in Unicode, you just have to play with the layout of your text in your text editor. You don't need special control characters or anything else, it is *just* a question of *layout*.

Now, Ramesside inscriptions can be analysed in exactly the same way. In other words, what you are interpreting as the Ramesside "abnormal elongated/tall groups" (red in the pic above) could actually be interpreted just as short strings of text written vertically (i.e. short *columns*) within a bigger main horizontal layout (i.e. in lines, in green).

Such an interpretation has a few advantages, both in respect to the actual data from the ancient texts and for what concerns Unicode etc.

First, someone was pointing out that there are thousands of such "Ramesside groups" and that a large part of them are attested only once. Well, if you interpret them as short columns of text, rather than as groups, you understand why: those are short strings of text, they are not isolated independent graphic and orthographic units.

This also helps to answer the question: "how many groups are you expecting to find in the future?" Virtually, potentially, infinite. If tomorrow we should find a new Ramesside temple with a new long text inscribed in "Ramesside groups", there could be hundreds, even thousands of these new short vertical strings of text.

Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as a single independent "group". It does not make much sense, just as it would not make sense to try to describe and code every single horizontal sequence of groups (characters) in the Japanese texts above.

Rather, it would make much more sense to deal with these "Ramesside short vertical strings of text disposed in horizontal lines"
as if they were.. well, short vertical strings of text disposed in horizontal lines.

This can be *easily* done at the layout level, either with xml/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that, already. It is enough to use a vertical font, and create a page layout with horizontal sequences (= the lines) of very short vertical columns.

That is all you need.

The Ramesside "group" (or "tall group") used by Bob as an example in his last proposal, for instance, would not need to be built and displayed as a "group" (i.e. with control characters etc.) at all: it could just be inputted as a plain sequence of independent signs displayed with a vertical font within a vertical column, in a general layout that presents series of horizontal sequences (i.e. "lines") of such short columns.

No need for ligatures, no need for control characters, nothing. Just vertical fonts and properly set up layouts for the page where the signs will be displayed.

Obviously, there will still be signs that will need to be combined in "groups", i.e. in "real" square groups. As you can see from the pic above, however, if you interpret the Ramesside texts as composed of short vertical columns, the number of groups needed decreases significantly. And in general, those groups are often the same basic groups that you find in good old ordinary Egyptian square groups orthography.

So no need to list and encode (as Unicode or at the font level, doesn't matter) thousands of unique groups: we would just need to encode the basic square groups and then play a bit with the layout of the text.

And this brings me to the next point:

7) What are the "square groups"?

I have never specifically worked on square groups from a linguistic point of view, and I do not know if there is any specific study about square groups and their graphic behaviour within a larger linguistic frame (i.e. for instance comparing them with the behaviour of similar "groups"
in other writing systems). This is one of those points where the experience of some of you could be extremely useful.

Such a study could be used, for instance, to define some contextual rules to allow the font to automatically manage the combination of at least some of these groups, or at least to define some rules to automatically prioritize one ligature over the other (if we are working with ligatures at the font level).

If such a study exists, then it would be useful to take it into consideration in the discussion.

If such a study does not exist, then perhaps it could be worth considering doing it *before* submitting any new proposal involving control characters to the Unicode Consortium, because I think it would be better to be sure we really understand how those groups work before suggesting a method to encode them. I am obviously talking about control characters etc., I am not talking about expanding the basic set of glyphs.

Otherwise, we would be encoding something whose actual functioning has never been studied, and therefore we would risk encoding features and elements that are actually superfluous, or that could have been managed in a more efficient way at other levels (ligatures within the fonts, layout tables at the text editor level etc.).

Ok, I guess this is more or less all I wanted to say. Now feel free to ignore me :-)

Best

Marwan
From s.polis at ulg.ac.be Sat Jul 23 16:00:30 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Sat, 23 Jul 2016 17:00:30 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References:
Message-ID: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Hi Marwan,

Thanks for your mail! Some very quick answers to your suggestions.

> First:

No worries, everyone is entitled to have an opinion based on his/her own experience, and I would certainly agree that no one (except maybe Serge) has a proper understanding of all the aspects involved here.

> Second:

Sure, no "right" solution, we simply want a solution that meets our minimal needs. The definition of these needs is what we do not seem to agree on.

> We are not ancient Egyptians, we don't know how ancient Egyptian scribes perceived their writing system. We can just observe it from the outside and suggest interpretations. Which are not "right" or "wrong", but rather "more efficient" or "less efficient" in respect to the problems we want to solve through these interpretations. This is something that, I think, should be kept in mind.

Certainly, but they give us some clues that we should not ignore (see e.g. below, re point 6).
> Third

As this is an issue recurring again and again, I would like to stress one more time that, unlike what is done for most other scripts, we are not producing/inputting new texts, but transcribing old ones, hopefully without losing too much information. And your solution is a good one (I have absolutely no doubt) as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the "orthographic" system of ancient Egyptian and to which we want to have access (see further Simon's mail earlier this week).

> 1) JSesh approach

Do not underestimate your addressees too much, Marwan ;) Furthermore, this is not a JSesh-based mentality, but an MdC-based mentality: the people behind this standard had a pretty good idea of what they were doing, trust me. There are a lot of problems, but one should simply not throw the baby out with the bath water.

> I had the feeling during the meeting and reading your emails that many of you think in a very JSesh-based mentality. I had the feeling that the goal of at least some of the participants is to imitate JSesh in Unicode.
>
> But Unicode *is* *not* JSesh, or at least should not be JSesh.
> Using control functions, parentheses etc. can be an efficient way in JSesh, but I doubt that a system that requires inputting 10 (sic!) control characters to display 3 (sic!) hieroglyphs in a pretty common group (the Htp example in Mark-Jan's last proposal) makes much sense from a Unicode perspective. Especially considering that you can obtain exactly the same result without control characters and just with 1 (sic!) general ligature within the font, which will be automatically rendered by the usual, standard, already available rendering algorithms.
>
> In other words, what can be an efficient solution to a problem in JSesh is not necessarily a good solution in Unicode.
> And Unicode is not even a stand-alone thing: Unicode can work in combination with rendering algorithms (i.e. ligatures embedded in fonts), with layouts at the editor level etc., which can help to solve the problems at a different level than just the Unicode encoding level.

Sure, not everything needs to be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.

> 2) Groups in fonts
>
> Someone in the list was saying that the groups need to be listed, but that not all of them necessarily need to be built in the font (or something like that).
>
> This is not true: compound characters and groups will not automatically build on their own, and if you envisage the possibility of building a group, then you *do* need to have it in the font, otherwise it won't display correctly and you will have random broken control characters popping out here and there in your hieroglyphic text. This perhaps will not bother you in your database etc., but it will be just unacceptable for common users (and, for instance, for editors).
>
> If you are accepting that a group can be built, then you need a way to display that group, i.e. you must have that group saved somewhere in your font, whether you build those characters through control characters or with rendering and ligature algorithms embedded in the fonts.

OK, sure. But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed to store such information, as a lay user, when using ligatures? (This is a real question, nothing ironic here.)

> 3) the 4 "small sign in the corner of big sign" control characters.

I'm glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 "variables"
(Top vs Bottom and Left-Right); if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode the Top vs Bottom explicitly, of course. This is basically how we represented things with Michael at the pub. Please keep in mind that the four operators come from another type of syntax.

> 4) Vertical and horizontal script and control characters

I do not understand why you get rid entirely of the notion of "quadrat" in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.

> 5) special characters, vertical/horizontal texts and input methods

Your point escapes me here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.

> 6) Ramesside "groups" (or "tall groups").

Why are they groups or quadrats and not "small columns"? The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example. Again, it might not be easy to handle from a technical point of view, I do agree, and you might want to split them into smaller groups for the sake of convenience, OK, but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?)

Again, if Unicode is meant for writing names of tourists (not even in cartouches, as it seems) or for preparing simplified layouts for online teaching grammar, etc., that's really fine.
But if you want to be able to use it minimally for (1) standardized electronic corpora and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement.

> Trying to describe (and encode) them as independent units (or devising control characters to build them) is more or less like trying to encode every single column of hieroglyphic text as a single independent "group".

Are you sure? Come on...

> Rather, it would make much more sense to deal with these "Ramesside short vertical strings of text disposed in horizontal lines" as if they were.. well, short vertical strings of text disposed in horizontal lines.
>
> This can be *easily* done at the layout level, either with xml/CSS style sheets, or with any other layout algorithm of any text editor. Word can do that, already. It is enough to use a vertical font, and create a page layout with horizontal sequences (= the lines) of very short vertical columns.

The wording is transparent: easy, but unfortunately inaccurate and irrelevant for scholarly uses (not palaeography, of course: palaeography has to do with the actual appearance and style of individual signs).

> 7) What are the "square groups"?

I provided the definition agreed on, I think, by everyone; if you have a better one, I'm listening of course.

A final remark (aimed at everyone, not an answer to Marwan, of course). I spent hours and days reading and thinking about the arguments during the last few weeks, and I've got the impression that we are repeating the same obvious points again and again. I'm sorry to put it so bluntly, but if Unicode is not to be useful for the majority of Egyptologists, so be it. It will remain what it is, a standard not used by the community: journals, corpora, etc. will keep on using JSesh and other tools, they're doing well with them.

On the other hand, the Egyptologists around the table seem to agree that we "simply" need (and it does not sound extraordinary to me):

1 - to build quadrats.
2 - to create (sub)groups of signs within quadrats.

3 - to be able to position (groups of) signs with respect to other (groups) either vertically, horizontally or in other "corner"-like positions (the INSERT-like operators). [As a side-note to Bob: the positioning of groups of signs in corners is trivial in monumental inscriptions: the low number of them in Ramses comes from the fact that we encoded very few hieroglyphic texts; check the first pages of the KRI to get an idea.]

These are the basic principles, illustrated ad nauseam in the (files attached to the) mails before (and we leave out everything else as mandatory requirements). If this is not possible to envision, I think that we can close the discussion, without anger or regret: Unicode cannot be used by most of us. That's a pity, but so be it.

And I would find it really great if one could now stop asking for more data regarding these questions or postponing the decision for bad reasons: we provided more than is needed. If you do not want to take these cases into account because they are not frequent enough (on which basis, e.g., for someone working only with this kind of texts?) or because they are hard to implement, that's fine; but please do not invoke the lack of evidence.

Have a good weekend folks!

Stéphane

From mn31 at st-andrews.ac.uk Sat Jul 23 16:22:56 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sat, 23 Jul 2016 16:22:56 +0100
Subject: [Egyptian] Vertical vs horizontal writing-mode
In-Reply-To:
References:
Message-ID: <1596000.3AgpvMF7jM@thuis>

Dear Richard,

A belated thank you.

On Wednesday 20 Jul 2016 11:55:31 ishida at w3.org wrote:
> a.
perhaps some combination of the above smoke and mirrors techniques
> may be adequate to manage some of the differences between layout in
> horizontal vs vertical writing-mode when the thing we are struggling
> with is the spatial relationships between the elements circumscribed by
> a quadrat when they are rendered.

I'm not aware of a fail-safe way to convert vertical to horizontal. If smoke and mirrors do something reasonable, then that is better than nothing. Here 'nothing' could mean putting groups side by side horizontally, no matter how high a group is. Note that the usual encodings of hieroglyphic text place no restriction on the width of a group for horizontal writing, or the height of a group for vertical writing.

Our point in the original proposal was that if you want to do more than 'nothing', then the rendering application must at the very least be able to detect that the writing-mode has changed and that there is something to be done in the first place.

Horizontal and vertical text use the same code points, just in slightly different ways, so there is no automatic way to detect a change of writing-mode. Unless there are characters that contain the needed information. You mention other scripts where some characters are more likely to be found in one or the other writing mode, but there seem to be no existing scripts where some character uniquely identifies the intended writing-mode. So there is no precedent for us to invoke to argue for something similar for Ancient Egyptian.

> b. perhaps it's not particularly problematic that you can't
> automatically flip between horizontal vs vertical without changing code
> points, especially when one considers that there is anyway so much
> variation in 'spelling' of egyptian content, often to fit the visual
> space available.

It seems acceptable for now to assume all text is horizontal, and it is up to the encoder to manually rearrange groups if not. But it is not ideal.
I would suggest keeping the issue on the agenda, and at least flagging up that there is a problem to be solved. If it cannot be solved within the constraints of Unicode, too bad.

> c. if the control characters used to indicate the positioning of
> hieroglyphs within the quadrat display space are treated like other
> Unicode control characters, ie. they are not part of the semantics and
> are ignored for sorting, searching, and processing the text for meaning,
> rather they are just cues for visual arrangement, then perhaps it's not
> a big issue either if they are different for vertical vs horizontally
> rendered content.

You're right this matter is irrelevant for (most) sorting and searching. But let me remark that as part of our investigations, Stéphane and Serge analysed occurrences of certain groups, through automatic search, distinguishing between horizontal and vertical text. Unless I misunderstood, this was possible because knowledge about the writing-mode was available in the Ramses corpus. So there is at least one example of text with indication of writing mode being more useful than text without that indication.

Best regards,

Mark-Jan

From bobqq at live.co.uk Sat Jul 23 17:10:00 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sat, 23 Jul 2016 17:10:00 +0100
Subject: [Egyptian] Brackets in the TLA encoding
In-Reply-To: <1922257.noGW0DXB86@thuis>
References: <1922257.noGW0DXB86@thuis>

Hi Mark-Jan

A mathematician or physicist will also tell you not to use Einstein's equations when Newton's suffice. Especially while cycling down a hill at speed, however well versed you are in tensor calculus.

On a more serious note, have you organized the data from Stéphane to map against your model with 4 corners yet? No point in us both doing the same thing. Do you have a list of examples that you think need large levels of nesting in your model? Have you any comment on group joiners?
You mention below "Using infix operators is really only justifiable if notation is meant for human consumption." Quite. Human and machine consumption is exactly what we are designing for. Text is about people not parsers.

Try opening my font description document in Word and insert/delete characters in hieroglyphic strings, see the control codes come and go. Think about what end controls in your model imply.

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Mark-Jan Nederhof
Sent: 23 July 2016 13:47
To: egyptian at evertype.com
Subject: Re: [Egyptian] Brackets in the TLA encoding

Hi Simon, Hi Stéphane, Hi All,

This is very helpful.

Physicists tell us that if you want to gather and use data, you need hypotheses first, or else you don't know what to look for. For me the relevant hypotheses are:

(1) The primitives we have in our current document allow description of most of the groups in an accurate enough way. Both 'most of' and 'accurate enough' are subjective of course. There is no escaping that.

(2) It would be quite difficult to reduce the expressive power before we would lose coverage. There is an implicit parameter, which is a limit on the depth of nesting, which I assume is 3. As also Simon confirmed once more, 2 is not enough, even for the most basic, run-of-the-mill classical (horizontal) inscriptions.

As to (1), we have moved away quite considerably from descriptive power that is machine-interpretable. This was motivated by people finding the original encoding too complicated, and arguing that fonts would do a lot of fine-tuning anyway for particular choices of signs. Also, we don't really care about a sign being printed 0.5 mm too much to the left or to the right, as long as the user gets a rough idea of what the text looks like.
These arguments all sound reasonable, but realise two things:

* If even stupid machines don't know how to render an encoding roughly as it was intended, perhaps there is not enough information present for humans to know what was meant either.
* As stressed once more by Stéphane, the kinds of groups we are talking about are productive. We don't want to be manually fine-tuning the appearance of an unbounded number of groups, so some approximately correct automatic rendering would be quite useful.

I think we are still okay with the present version of the proposal, but we have moved a long way from existing routines that do the rendering in a deterministic, predictable manner, to needing lots more refinements to program code and the result being not quite well-defined.

As to both (1) and (2), the provided examples include quite a few cases of insertions and stacking, insertions into stacked groups, and even groups with insertions that are themselves inserted. So far I don't see either hypothesis refuted.

I had to struggle quite a bit to get rid of prefix operators. As anyone with the slightest knowledge of formal languages knows, prefix or suffix operators are ideal for automatic processing, because the problem of ambiguity simply does not exist, whereas endless volumes of textbooks since the late 1950s have been written about the ambiguities caused by infix operators and how to solve them using principled or not so principled methods involving operator precedence and low-level hacks in shift-reduce parsers. Using infix operators is really only justifiable if notation is meant for human consumption.

That is why I was very surprised to hear objections with the argument that font technology is too primitive to handle prefix operators. If anything, I would have imagined that primitive tools would have a lot of difficulty with parsing in the presence of operator precedence and such.
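The contrast between prefix and infix notation drawn above can be illustrated with a minimal sketch. The operator names `V` (stack vertically) and `H` (join horizontally), their fixed arity of two, and the precedence table are assumptions made purely for illustration; they are not the control characters or the grammar of the proposal. The prefix reader needs no precedence information at all, whereas the infix reader cannot work without it.

```python
# Illustration only: "V" (vertical stack) and "H" (horizontal join) are
# hypothetical binary operators, not the operators of the proposal.

def parse_prefix(tokens):
    """Parse a prefix expression such as ['V', 'A', 'H', 'B', 'C'].

    Each operator is simply followed by its two operands, so no
    precedence rules are needed.
    """
    def read(pos):
        tok = tokens[pos]
        if tok in ("V", "H"):
            left, pos = read(pos + 1)
            right, pos = read(pos)
            return (tok, left, right), pos
        return tok, pos + 1

    tree, end = read(0)
    if end != len(tokens):
        raise ValueError("trailing tokens")
    return tree


# Infix input needs an explicit precedence table. The values are an
# assumption: ':' (vertical) binds more loosely than '*' (horizontal).
PRECEDENCE = {":": 1, "*": 2}
NAMES = {":": "V", "*": "H"}

def parse_infix(tokens, min_prec=1):
    """Precedence-climbing parser for bracket-free infix input.

    Consumes tokens from the front of the list it is given.
    """
    left = tokens.pop(0)
    while tokens and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        right = parse_infix(tokens, PRECEDENCE[op] + 1)
        left = (NAMES[op], left, right)
    return left
```

Under these assumptions, `parse_prefix(['V', 'A', 'H', 'B', 'C'])` and `parse_infix(['A', ':', 'B', '*', 'C'])` build the same tree, but only the second depends on the precedence table; changing the table changes the reading of the same infix string.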
I implemented OpenType substitution rules that analysed bracket structures and prefix operators myself, and that works fine. It would be a nightmare for me to have to implement OpenType substitution rules in the presence of operator precedence. There may be something in the arguments people use that I don't understand.

Anyway, one thing to look out for (I say this in particular to Simon, Stéphane and Serge, with whom this was discussed in detail in Cambridge), is that in the process of getting rid of prefix operators, and avoiding ambiguity, the following coverage was lost: it is not possible anymore to insert A into the top-left corner of B, and to insert the resulting group into the bottom-left corner of C. The same holds for the right corners. I have yet to see a group where this matters.

It is still possible however to insert A into the bottom-left corner of B, and to insert the resulting group in the bottom-left corner of C. The same holds for two upper-left corners and the corresponding right corners. There are groups of these forms among the provided examples.

The problem of course is that inability to find certain structures in the corpora we happen to have at this very moment does not prove their non-existence. At best it means our encoding won't be too much lacking in terms of coverage.

Best regards,

Mark-Jan

On Thursday 21 Jul 2016 15:24:51 Simon Schweitzer wrote:
> Hi all,
>
> @Stéphane: thank you for your .gly-files! In this mail, I want to add
> some remarks concerning the subgroup topic.
>
> As in Ramses, there are many encodings with "(" and ")" in the TLA. I
> collected these encodings and I want to present you my evaluation:
> * In some cases, the encoding is invalid, e.g. (F12-S29):D21, which
> should be understood as F12*S29:D21.
> * Sometimes, the encoding of the brackets is superfluous. There are
> many cases of Hiero1:(Hiero2*Hiero3). The brackets are not necessary:
> use
> Hiero1:Hiero2*Hiero3 !
> * But in many cases, the parsing without the brackets would be misleading:
> 1) There are many vertical groups in horizontal groups in vertical
> groups. I list only 10 examples:
> N35:"?"*"?"*(W22:Z2) (Rezepte Papyrus Zagreb E-597-3, l. 1; ID:ABLN5PNQ2BBENE7LWO72KDRPPU)
> Aa1&D58:(X1:N35)*N25 (Stele Louvre C 284 ("Bentresch-Stele"), l. 22; ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa13:S40*(X1:O49) (Stele Louvre C 284 ("Bentresch-Stele"), l. 21; ID:4VLZLA44UVGJZN22WIWP774LOQ)
> Aa15:W19*(X1:X1) (Harfnerlieder Text C, l. 2; ID:H6Z5TORPQFFZXOU6CJODODZHYQ)
> D21:V28*(X1:B1) (“Stele des Montuhotep (Cambridge E.9.1922)”, l. C.3; ID:LQ7QIJTK7NFWTIFBS7R6AIYWGY)
> D21:V7*(W24:X1) (“Stele des Montuhotep (Kairo CG 20539)”, l. I.b.18; ID:ZOLMMIAB2NHV7PSOSOOLAHN64U)
> D35A:(X1:Z4A)*G37 (“Stele des Antef (Louvre C 167 = E 3111)”, l. C.1; ID:DWHZIO5ZCFBURLZ6G4T26YIP7U)
> D36:D21:N29*(X1:"?"*Z4*"?") (“Stele des Antef (Glasgow D1922.13)”, l. A.3; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> I10&I9:X2*(X4:Z2) (“Stele des Antef (Glasgow D1922.13)”, l. A.7; ID:OIYODBZ74RHM7OPTR72OLCMJ3A)
> K4:G1*(Z7:X1) (Sinuhe AOS, vs. 18; ID:RP2F6BGNDBAARDBNDHGFSFPIEM)
> As you can see, this kind of grouping occurs in hieroglyphic and in
> hieratic texts, and this feature is also attested in the "classical"
> period from the Middle Kingdom (the examples from the stela of
> Montuhotep and Antef).
> 2) horizontal grouping of vertical groups in columns
> If the text is written vertically, there are cases of horizontal groups
> of vertical groups, e.g. in the Buch von der Himmelskuh
> (ID:WHEOIX5P5ZAVVFU4BGO6OWASV4), M17*(Q3:N35), (X1:Z1)*I12,
> M17*S29*(A2:Z2) and so on.
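Why the brackets in the examples above cannot simply be dropped can be sketched in a few lines. The sketch assumes that `*` (horizontal) binds more tightly than `:` (vertical), which is consistent with the remark that Hiero1:(Hiero2*Hiero3) may be written Hiero1:Hiero2*Hiero3. Ligatures written with `&` and quoted placeholders are treated as opaque sign names, and the function names `parse` and `depth` are hypothetical; none of this is taken from the TLA or Ramses software.

```python
import re

# Sketch only: a token is ':', '*', '(' , ')' or an opaque sign name
# (so "Aa1&D58" and quoted placeholders count as single signs).
TOKEN = re.compile(r'[:*()]|[^:*()\s]+')

def parse(text):
    """Return a nested tree: ('v', [...]) for stacks, ('h', [...]) for rows."""
    tokens = TOKEN.findall(text)
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = vertical()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return node
        if tok in (":", "*", ")"):
            raise ValueError(f"unexpected {tok!r}")
        return tok

    def horizontal():
        nonlocal pos
        items = [atom()]
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            items.append(atom())
        return items[0] if len(items) == 1 else ("h", items)

    def vertical():
        nonlocal pos
        items = [horizontal()]
        while pos < len(tokens) and tokens[pos] == ":":
            pos += 1
            items.append(horizontal())
        return items[0] if len(items) == 1 else ("v", items)

    tree = vertical()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return tree

def depth(node):
    """Nesting depth as counted by this sketch: a bare sign is 0."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node[1])
```

With these assumptions, `parse("D21:V28*(X1:B1)")` places the stack X1 over B1 beside V28, whereas `parse("D21:V28*X1:B1")` yields three stacked rows with B1 alone at the bottom, so the two strings describe different groups. The `depth` helper counts nesting in one possible way; the proposal may count levels differently.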
>
> Best regards,
>
> Simon
>
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From odusseus at gmail.com Sat Jul 23 18:12:18 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sat, 23 Jul 2016 19:12:18 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Hello everyone

*Third*

> And your solution is a good one, I have absolutely no doubt, as long as
> you do not want to *search* for the relative position of signs with respect
> to one another. This is however a piece of information that is, like it or
> not, part of the “orthographic” system of ancient Egyptian and to which we
> want to have access (see further Simon’s mail earlier this week).

In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs. This is demonstrated by the texts themselves: if the relative position of the signs were linguistically important, you would have some form of regularity, with some combinations being possible and others being forbidden. This is not the case. Combining three signs into a group, or writing them one after the other, is linguistically exactly the same. It is just an esthetic, layout matter. Not a linguistic one. No more than illuminated initial capital letters in medieval manuscripts.
Like these:

http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg

They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to encode or represent them in Unicode (and in fact they are not).

As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts. But in fact, in order to find what sign could be missing in a given lacuna, or what sign could be hidden behind a hieratic ligature, you need dictionaries and corpora of texts where you can search for sequences of signs *independently* from their spatial position (which sign x is attested after sign y, whether combined in a similar group or not?); you don't need to code anything in Unicode. It can be a plus, but it is not indispensable.

> *1) JSesh approach*
>
> Sure, everything needs not be dealt with at the level of Unicode, but the
> data needed should not be hidden in the ligatures embedded in the font
> either.

Again, what data? What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code? Perhaps this question has been already asked, but honestly so far I haven't heard or read any convincing answer, and I haven't seen any common example (I cannot exclude that there could be some uncommon case; I obviously have not seen all the Egyptian texts existing in the world) where the position of a sign with respect to the other signs around it carries any linguistic information.

> *2) Groups in fonts*
>
> OK, sure. But then again control characters have the advantage of being
> explicit about the relative position of signs when a group is not in a
> font: how would you proceed for storing such an information, as a lay user,
> when using ligatures?
(this is a real question, nothing ironic here).

I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. The ligature IS the information. And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups, because everyone would be using the same font or at least the same set of ligatures. (Alternatively, if I write my text with font x and ligature x, it would be enough to embed the font itself in the text file; there are various ways to do so.)

> *3) the 4 “small sign in the corner of big sign” control characters.*
>
> I’m glad to see that your solution is exactly the same as the one we
> suggest! Indeed, there are logically 2 “variables” (Top vs Bottom and
> Left-Right), if the sequence of signs indicates the Left-Right
> unambiguously (like in your example), then we only need to encode
> explicitly the Top vs Bottom, of course. This is basically how we
> represented things with Michael at the pub.
> Please keep in mind that the four operators come from another type of
> syntax.

This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A, instead of top-bottom, left-right + four corners. Or perhaps I am not understanding your system.

> *4) Vertical and horizontal script and control characters*
>
> I do not understand why you do entirely get rid of the notion of “quadrat”
> in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text
> would be displayed correctly whatever the orientation (hltr, hrtl, vrtl,
> vltr), no? Of course, one needs the notion of quadrat in the encoding for
> this; and this is a nice case for showing that we *need* this well-defined
> notion in the encoding, not just sequences of glyphs.

If you want to have the "notion of quadrat" in Unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrat* in your text. If I understood correctly, one of the proposals that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it makes sense in Unicode (and more importantly I doubt the people of the Unicode Consortium will think it makes sense). Unless you want to code every single possible "quadrat" as an independent glyph: that would thus end up being conceptually comparable to a Chinese character. But I assume (I hope) we don't want to do that, right?

> *5) special characters, vertical/horizontal texts and input methods*
>
> Your point escapes me, here. Unless it is a result of getting rid of the
> quadrats: within quadrats, the groups of hieroglyphs are essentially the
> same.

With quadrats marked by special control characters it would be even worse, as you would have to take into consideration even more possible combinations.

> *6) Ramesside “groups” (or “tall groups”).*
>
> Why are they groups or quadrats and not “small columns”?
>
> The answer is simple and straightforward (and provided by your example):
> because they correspond to the size occupied by the A1 sign. Look at the
> example.
>
> but some signs do occupy the full height of the horizontal line (unlike in
> your Japanese example, which is nice, but as such irrelevant), which is the
> basis for deciding what counts as a unit (quadrat) or not. (Or do you have
> another definition in mind?).

Not true. This can be interpreted just as a question of layout, not as a question of grouping.
Simply, there could be signs that can be graphically stretched/enlarged to fill a given space alone, while others will be squeezed to fit together within an equivalent space. This does not say anything at all about them being "groups" or not. It can be interpreted just as a question of layout. And actually this is very common in various writing systems around the world, and no one would consider the fact that some isolated glyphs appear as big as some "combined ones" as a reason to consider the "combined ones" as "groups".

Just a few examples from the internet: have a look at this Chinese (bopomofo + characters) text:

https://upload.wikimedia.org/wikipedia/commons/1/1b/Bopomofo_in_Regular,_Handwritten_Regular_%26_Cursive_formats.jpg

According to your way of interpreting groups ("some signs do occupy the full height of the horizontal line, which is the basis for deciding what counts as a unit (quadrat) or not."), the four little characters within the parentheses on the right of the image should be interpreted as a single "group" (or as a combined "character", as we are in China) just because they, combined, are as big as some of the other single characters. This is not the case. No one in Asia would consider such a combination as a "group" or as a "combined character". They are just four independent characters (or "four groups" assuming character = group, which conceptually is a fairly sound equivalence) which happen to fit into the space in a slightly different way compared to the other characters of the text. Their appearance is just a question of *scale*, thus of layout. Not of grouping, and it does not say anything about what a "group" should be.

Here, an even better example, a Chinese text with bopomofo and hanzi characters:

http://chinesehacks.com/app/uploads/2010/06/zhuyin.jpg

No one would consider those small signs on the right of the big characters as "groups" just because when combined together they end up being as big as the single bigger characters.
And as you can see, conceptually, you can interpret this text as a sequence of short vertical columns organized in horizontal lines. Some of these short vertical columns are occupied by a single "scaled up" character (like the A1 in the Egyptian text above) while other short vertical columns are occupied by multiple "scaled down" signs one above the other. It is the same with Ramesside writing. No one, however, would consider those smaller "scaled down" signs written one above the other as "groups" just because together they are as big as the big ones.

Same with Latin script, actually: for instance take again the manuscript page above:

http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg

According to your definition, one should take the tallest sign (i.e. the initial "B") as the unit to define what a "group" is. As a consequence, one should consider the 6 lines of text beside it not as "lines of text", but as a single "group", and according to your approach those lines should not be represented with a specific layout, but should be somehow combined with control characters. I hope we agree that this would make no sense.

So if in Chinese (and Japanese, and Korean) and in Latin script you can have glyphs of different sizes in the same text, without the need to conceptually cluster the "smallest ones" into "groups" that will appear as big as the biggest signs, why do you feel this need with Egyptian? And note that this is different from Egyptian ordinary square writing, as in ordinary square writing one can assume that all the signs (both those combined into square groups and those outside square groups) are more or less in the same *scale* (as a general principle; again, I know there are exceptions). In Ramesside writing, instead, the *scale* of the sign is not always the same.
Some are "scaled down" to fit together within one of those "small vertical columns" while others (like the A1) are just "scaled up" to fill alone the same space as one of those "small vertical columns". Exactly like in the Chinese examples above, or exactly like with illuminated capital letters in manuscripts. It is (or it can be interpreted as) a question of layout.

> But if you want to be able to use it minimally for (1) standardized
> electronic corpus and (2) journals like LingAeg, etc. dealing with the
> Egyptian language, this would be a requirement.

All the Egyptian words quoted in my article published in the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-)

Anecdote aside, you can do everything scientific you need with Unicode without ending up using tens of control characters. And if you can't, then perhaps you should consider that Unicode is not the best tool for your job..

It is actually funny: I have tried to understand all the various proposals including control characters, and as a result I have pointed out *several* practical problems in the use of control characters that can seriously affect the *linguistic* information of the text (like control characters requiring a change in the actual inputting order of the hieroglyphs to display them correctly), some still without an answer, or with an answer so complex that it will be hardly realizable (see the question of turning vertical text into horizontal text, which will require the introduction of at least one more control character).
On the other hand, almost no one so far has pointed out any real problem in the use of ligatures (possibly combined with a very restricted number, 1 or 2, of control characters as suggested by Bob) for which there isn't already a possible solution which is already implemented in some script around the world (i.e. unwanted ligatures can be broken with a zero-width character like in Indian scripts, different ligatures using the same signs could be selected with "variant characters" like in emoji, etc.). And still, a system that does not use multiple control characters will be good only for writing tourists' names.. Please, no offense, but again: the story of the elephant..

> Trying to describe (and encode) them as independent units (or devising
> control characters to build them) is more or less like trying to encode
> every singly column of hieroglyphic text as single independent “groups”.
>
> Are you sure? Come on...

It is you who is talking about encoding (as glyphs in font, not necessarily as Unicode characters) tens of thousands of possible and often unique combinations, not me.. Ramesside "tall" groups can be interpreted as short vertical texts. Thus thinking of encoding them (or building them with control characters) is like thinking of encoding as groups whole columns of text.

> The wording is transparent: easy, but unfortunately inaccurate and
> irrelevant for scholarly uses (not palaeography, of course: palaeography
> has to do with the actual appearance and style of individual signs).

Again, why?

> *7) What are the “square groups”?*
>
> I provided the definition agreed on, I think, by everyone, if you have a
> better one, I’m listening of course.

See above. And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any).
It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of Unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database; and in this respect, to respond to your last remark..).

> I’m sorry to put it so bluntly, but if Unicode is not to be useful for the
> majority of egyptologists, so be it. It will remain what it is, a standard
> not used by the community: Journals, Corpora, etc. will keep on using JSESH
> and other tools, they’re doing well with it.

Perhaps the opposite should also be considered? Perhaps the Unicode-based system to write hieroglyphs (considering that the very idea at the basis of Unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist Egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard Unicode system, then perhaps it is them who should use other tools? I guess this is an option that should be considered.. no?

Best

Marwan

From everson at evertype.com Sun Jul 24 00:33:37 2016
From: everson at evertype.com (Michael Everson)
Date: Sun, 24 Jul 2016 00:33:37 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To: <1897138.Z3PCXJQWcV@bear>
References: <1897138.Z3PCXJQWcV@bear>

I use different words and different glyphs for these.

But you have END, SEPARATOR, and EMPTY which we did not discuss in Cambridge.

> On 22 Jul 2016, at 18:05, Mark-Jan Nederhof wrote:
>
> Dear all,
>
> We adapted our proposal.
Please find it in:
>
> https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf
>
> We have responded to the criticism about brackets and prefix operators. We have gotten
> rid of all prefix operators and replaced them by infix operators. As for brackets, it is well
> possible to get rid of them too (not entirely, there are still the cartouches),
> but the price to pay for this is added complexity of the syntax due to needing several
> copies of each primitive with different operator precedence, perhaps three or four.
> It is outlined in Section 9 how this would be done. This adds to the complexity
> that already exists, after the prefix operators were removed; have a look at
> appendix A and see whether you can verify the grammar is unambiguous.
>
> There are no names on the proposal. That is partly because not everyone from
> TLA/Ramses/St Andrews has had a chance to look at it yet, and partly because
> anyone is welcome to have their name added if they feel they contributed and
> subscribe to the content.
>
> Mark-Jan

From mn31 at st-andrews.ac.uk Sun Jul 24 00:50:50 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sun, 24 Jul 2016 00:50:50 +0100
Subject: [Egyptian] proposal 2016-07-22
In-Reply-To:
References: <1897138.Z3PCXJQWcV@bear>
Message-ID: <2376856.de12YEsUQd@thuis>

END and EMPTY have always been there. END is part of the brackets that you didn't like, as you said during my presentation. I'm sure EMPTY was mentioned during my presentation as well.

The SEPARATOR was introduced because (I believe) you said that there should be something between each pair of characters that is combined in some way.
The choice for a single operator SEPARATOR for both horizontal and vertical grouping was motivated in my message 13/07/2016 11:47 (addressed to you personally, before there was the email list).

Mark-Jan

On Sunday 24 Jul 2016 00:33:37 Michael Everson wrote:
> I use different words and different glyphs for these.
>
> But you have END, SEPARATOR, and EMPTY which we did not discuss in Cambridge.
>
> > On 22 Jul 2016, at 18:05, Mark-Jan Nederhof wrote:
> >
> > Dear all,
> >
> > We adapted our proposal. Please find it in:
> >
> > https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf
> >
> > We have responded to the criticism about brackets and prefix operators. We have gotten
> > rid of all prefix operators and replaced them by infix operators. As for brackets, it is well
> > possible to get rid of them too (not entirely, there are still the cartouches),
> > but the price to pay for this is added complexity of the syntax due to needing several
> > copies of each primitive with different operator precedence, perhaps three or four.
> > It is outlined in Section 9 how this would be done. This adds to the complexity
> > that already exists, after the prefix operators were removed; have a look at
> > appendix A and see whether you can verify the grammar is unambiguous.
> >
> > There are no names on the proposal. That is partly because not everyone from
> > TLA/Ramses/St Andrews has had a chance to look at it yet, and partly because
> > anyone is welcome to have their name added if they feel they contributed and
> > subscribe to the content.
> >
> > Mark-Jan

From s.polis at ulg.ac.be Sun Jul 24 11:33:23 2016
From: s.polis at ulg.ac.be (Stéphane Polis)
Date: Sun, 24 Jul 2016 12:33:23 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Hi guys!

(no worries, this is my last mail on the topic, enough time and energy spent on this.)

>> Third
>>
>> And your solution is a good one, I have absolutely no doubt, as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the “orthographic” system of ancient Egyptian and to which we want to have access (see further Simon’s mail earlier this week).
>
> In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs.

This is patently inaccurate. Three simple examples should suffice here (sorry for providing textbook examples known by all).

1 - spatial distribution affecting phono-morphology
[t] followed by [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same “linear order”, 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes).

2 - spatial distribution as a condition for reading (and adding semantic value)
[...] makes no sense, when combined as [...] it is clear that it should be read /ptH/ “Ptah” (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth).
The position of the sign is both a condition for reading and an added semantic information (not only Ptah, but Ptah as demiurge). 3 - spatial distribution affecting the function of a sign Hieratic example? If the rowing man is followed by a n () the n has to be read /n/ (phonemogram), if the n is positioned under the rowing man (), we are dealing with a compound classifier made of two signs, the /n/ has an iconic value and the whole group refer to a man rowing in water. Etc., etc. So position is 'just an esthetic, layout matter, not a linguistic one? as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. That?s a pity to make such unwarranted statements in a discussion also aimed at non-specialists to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone. Accordingly, if Unicode aims first and foremost at rendering the ?linguistic? dimension of writing, the examples above should suffice to show the importance of the ?quadrat' organization of this script. Again, you might not like it from a computer/font oriented perspective; I agree it?s not convenient to encode, it might even not be possible to encode it at all in Unicode because the standard has not the capabilities needed (and only higher level protocols could then handle this). All of this is fine with me, but it would be great not to distord presentation of the data. [I leave here alone other semiotic dimensions of writing: the spatial arrangement is part of the ?orthography? of the scribes, not necessarily meaningful at the ?linguistic? level strictly speaking, but at the level of scribal practices, etc.: why don?t we use IPA for our modern languages? simply because writing is much more than ?linguistic? stricto sensu and that it make sense to know who writes ?next? and who plays with the script and writes ?neckst?. As simple as that.] 
> No more than illuminated initial capital letters in medieval manuscripts.
> Like these:
>
> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg
>
> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to encode or represent them in Unicode (and in fact they are not).

This comparison is hilarious.

> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts.

Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels.

>> 1) JSesh approach
>>
> Sure, not everything needs to be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.
>
> Again, what data?
> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code?

See above.

>> 2) Groups in fonts
>>
> OK, sure. But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such information, as a lay user, when using ligatures? (this is a real question, nothing ironic here).
>
> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. The ligature IS the information.
> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font.
> And with a common database and a common font of reference, it would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups; or alternatively, if I write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures.

All this was clear, sure. What I meant is that the ligatures are purely "graphic", right? No information is stored about the position of one sign with respect to another?

>> 3) the 4 "small sign in the corner of big sign" control characters.
>>
> I'm glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 "variables" (Top vs Bottom and Left-Right); if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub.
> Please keep in mind that the four operators come from another type of syntax.
>
> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A, instead of top-bottom, left-right + four corners.
> Or perhaps I am not understanding your system.

OK, then I disagree because of the polysemic value of A/B and B/A.

>> 4) Vertical and horizontal script and control characters
>>
>
> I do not understand why you entirely get rid of the notion of "quadrat" in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.
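[Illustrative sketch, not part of the original message. It assumes the usual MdC reading of the operators in /hthr-H*(n:t)-tA:tA/: "-" separates quadrats, ":" stacks groups vertically, "*" juxtaposes them horizontally, and parentheses nest. The function names `parse` and `stacked_pairs` are hypothetical, and sign codes are treated as opaque alphanumeric tokens. The parsed structure carries no orientation, so the same list of quadrats could in principle be laid out hltr, hrtl, vrtl or vltr, and the relative position of signs remains searchable.]

```python
import re

# Sign codes are treated as opaque alphanumeric tokens (an assumption).
TOKEN = re.compile(r"[A-Za-z0-9]+|[-:*()]")
OPERATORS = {"-", ":", "*", "(", ")"}


def parse(text):
    """Parse an MdC-like string into a list of quadrats.

    A quadrat is either a sign code (str) or a nested tuple
    ("above", ...) / ("beside", ...).
    """
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise ValueError(f"unexpected character in {text!r}")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a sign'}, got {tok!r}")
        pos += 1
        return tok

    def item():
        # item := sign | "(" vertical ")"
        if peek() == "(":
            take("(")
            node = vertical()
            take(")")
            return node
        tok = take()
        if tok in OPERATORS:
            raise ValueError(f"expected a sign, got {tok!r}")
        return tok

    def horizontal():
        # "*" binds tighter than ":"
        parts = [item()]
        while peek() == "*":
            take("*")
            parts.append(item())
        return parts[0] if len(parts) == 1 else ("beside", *parts)

    def vertical():
        parts = [horizontal()]
        while peek() == ":":
            take(":")
            parts.append(horizontal())
        return parts[0] if len(parts) == 1 else ("above", *parts)

    quadrats = [vertical()]
    while peek() == "-":
        take("-")
        quadrats.append(vertical())
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return quadrats


def stacked_pairs(node):
    """Yield (upper, lower) for adjacent members of every 'above' group."""
    if isinstance(node, str):
        return
    kind, *parts = node
    if kind == "above":
        yield from zip(parts, parts[1:])
    for part in parts:
        yield from stacked_pairs(part)


quadrats = parse("hthr-H*(n:t)-tA:tA")
print(quadrats)
print([pair for q in quadrats for pair in stacked_pairs(q)])
```

[With a flat sequence of glyphs, or with groups baked into font ligatures, the second query ("which sign sits directly above which?") has nothing to run against; that is the kind of search the message is asking for.]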
>
> If you want to have the "notion of quadrate" in Unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. If I understood correctly, one of the proposals that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters than hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it makes sense in Unicode (and more importantly, I doubt the people of the Unicode Consortium will think it makes sense).
>
> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a Chinese character.
>
> But I assume (I hope) we don't want to do that, right?

We use spaces between words; would it be bad to use quadrat separators between quadrats? Mutatis mutandis, it feels to me like saying "oh, no, there are blank spaces all around!".

More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is build quadrats (implicitly or explicitly).

>> 5) special characters, vertical/horizontal texts and input methods
>>
>
> Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.
>
> With quadrats marked by special control characters it would be even worse, as you would have to take into consideration even more possible combinations.

>> 6) Ramesside "groups" (or "tall groups").
>>
>
> Why are they groups or quadrats and not "small columns"?
>
> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example.
>
> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?).
>
> Not true.
> This can be interpreted just as a question of layout, not as a question of grouping.

Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you're entitled to have your own kind of Egyptian if you want!
I'm not arguing against nonsense. But that's a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your "small columns" hypothesis.
The Ramesside example that you provide simply does not work this way.

And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. Comparing apples with pears won't help; and again, that's a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them (unlike the quadrat structure).

> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement.
>
> All the Egyptian words quoted in my article published in the last issue of JNES were written with my font with just standard ligatures, without control characters. No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-)

I'm afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as any evidence for disproving my point: you're happy with the way your hieroglyphs are rendered? Perfect. But why should it be the case that people willing to render a simple hieroglyphic line such as the one below would face difficulties? This escapes me.

[image]
(KRI I, 4)

>> 7) What are the "square groups"?
>>
>
> I provided the definition agreed on - I think - by everyone; if you have a better one, I'm listening of course.
>
> see above.
> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of Unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database, and in this respect, to respond to your last remark..).
>
> I'm sorry to put it so bluntly, but if Unicode is not to be useful for the majority of Egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they're doing well with it.
>
> Perhaps the opposite should also be considered? Perhaps the Unicode-based system to write hieroglyphs (considering that the very idea at the basis of Unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist Egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard Unicode system, then perhaps it is them who should use other tools?
>
> I guess this is an option that should be considered.. no?

We are talking about feelings and different perceptions here, so it's hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc.) will never be happy with what Unicode has to offer, because it is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting, for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the most open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating online resources, etc. I feel like we are very much in favor of standardization and striving to make resources accessible; two goals of Unicode, as you mentioned.

These resources should be built to last; it would be a pity not to think about the standard carefully before going in any direction.

That's all folks!
Have a nice weekend,

Stéphane
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-1.pdf
Type: application/pdf
Size: 3613 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-2.pdf
Type: application/pdf
Size: 3441 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-3.pdf
Type: application/pdf
Size: 3713 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PastedGraphic-5.pdf
Type: application/pdf
Size: 3638 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Capture d'écran 2016-07-24 à 11.47.11.png
Type: image/png
Size: 54301 bytes
Desc: not available
URL:

From ncs3 at cam.ac.uk Sun Jul 24 12:56:05 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Sun, 24 Jul 2016 12:56:05 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>

Greetings

Just wanted to say that Stéphane's last paragraph admirably summarises my perspective.

I haven't been reading these posts closely since I don't have the theoretical background on Unicode to have a view. But some observations and a question.

I would say that, of the categories Stéphane mentions in his last paragraph, it is probably also worth making a distinction in the text corpora section between hieroglyphic and hieratic texts, simply due to the generally more regular arrangement of the latter, and hieroglyphic's notorious ability to be squeezed, expanded etc. to fit a space for a variety of reasons. So while those doing corpora won't be as fussed about precise arrangements as I can be, there are going to be times when there will be challenges with hieroglyphs.

Something that did come as a shock during the meeting, dawning on me in the course of a slightly tense exchange with Michael, was that Unicode fonts clearly can put a lot of burden on the font designer, especially for glyphs, if they are to include multiple and complex ligatures etc. This is where the current fonts score well (such as the Cleo Fonts) in that they are about the design only, and the arrangement is then carried out in the specialist software (e.g. JSesh) where the ultimate control lies with the Egyptologist and not the font designer. [I know that won't be well received but you know where I come from on all this!]

One question that I should have asked at the meeting is this: not knowing how control characters actually manifest themselves, will a text formatted with Unicode with all the control characters be able to be exported into a plain MdC format, or will one of these input systems be able to import an MdC text and format it correctly? I ask as there will be a lot of MdC plain text around on people's hard discs. Or will we ultimately have to revise the MdC system to handle all these other codes?

Apologies in advance for the evident failures of comprehension in some of the points above. I don't really need comments on the first points I made, but I would be interested in how the MdC issues might be handled.

Best, Nigel

From mn31 at st-andrews.ac.uk Sun Jul 24 17:11:05 2016
From: mn31 at st-andrews.ac.uk (Mark-Jan Nederhof)
Date: Sun, 24 Jul 2016 17:11:05 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
References:

<4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
Message-ID: <3369008.Gi8cJxaxqj@thuis>

Dear Nigel,

We address just this point in our proposal: conversion to and from more precise kinds of encoding outside Unicode. In fact, it was one of the fundamental considerations that guided us to the Unicode encoding that we are proposing. See the fifth bullet point on p. 1 and the beginning of Section 12 of:

https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf

In Section 12, RES is illustrated a number of times, but much applies equally to MdC (I should really say JSesh, because MdC is hopelessly vague, due to lack of a published standard). I implemented automatic conversion routines from JSesh to RES, of which the Unicode encoding is (roughly) a subset, and it should be equally possible to convert from JSesh directly to Unicode. Whether automatic conversion from Unicode to JSesh would be possible already, or whether JSesh would need to be extended, is a matter I gladly leave to Serge.

However, if I say "you can convert" it doesn't mean the output hieroglyphic text would look the same, nor even that the output of conversion is "correct", whatever that means. E.g. JSesh has absolute positioning and scaling, which are by definition impossible in Unicode. So this then has to be represented differently somehow. Perhaps heuristics and/or hacks are able to achieve at least something reasonable, but it seems unlikely you would be able to blithely dump all your old MdC code into a Unicode plain-text document after automatic conversion without at least some visual inspection and manual correction. That holds for the proposed Unicode encoding, and would hold for any alternative Unicode encoding.

Sorry if this is a disappointment to you, but this is inherent in the exercise: if you convert from a specialised, powerful and precise format to a simpler format, you need to sacrifice information. If you cannot afford to lose that information, you should not convert.
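
[Illustrative sketch, not part of the original message and not the conversion routines mentioned above. It only shows, under invented assumptions, why converting from a richer format to a simpler one is lossy: the `PlacedSign` record and the `to_plain` function are hypothetical, and no real JSesh, RES or Unicode data structure is implied. Signs whose absolute position or scale would be discarded are reported so that they can be inspected by hand.]

```python
from dataclasses import dataclass


@dataclass
class PlacedSign:
    """A sign in a hypothetical 'rich' format with absolute layout data."""
    code: str           # sign code, e.g. a Gardiner number
    x: float = 0.0      # absolute horizontal offset
    y: float = 0.0      # absolute vertical offset
    scale: float = 1.0  # absolute scaling factor


def to_plain(signs):
    """Drop layout data; return (codes, codes whose layout was lost)."""
    codes = [s.code for s in signs]
    needs_review = [s.code for s in signs
                    if (s.x, s.y, s.scale) != (0.0, 0.0, 1.0)]
    return codes, needs_review


rich = [PlacedSign("A1"), PlacedSign("N35", x=0.2, y=-0.1, scale=0.6)]
codes, needs_review = to_plain(rich)
print(codes)         # sign sequence that survives the conversion
print(needs_review)  # signs to check visually after conversion
```

[The conversion always succeeds, but the second list is the part that still needs the visual inspection and manual correction described above.]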
Mark-Jan

On Sunday 24 Jul 2016 12:56:05 Nigel Strudwick wrote:
> Greetings
>
> Just wanted to say that Stéphane's last paragraph admirably summarises my perspective.
>
> I haven't been reading this posts closely since I don't have the theoretical background on Unicode to have a view. But some observations and a question.
>
> But I would say that of the categories Stéphane mentions in his last paragraph, it is probably also making a distinction in the text corpora section between hieroglyphic and hieratic texts, simply due to the generally more regular arrangement of the latter, and hieroglyphic's notorious ability to be squeezed, expanded etc to fit a space for a variety of reasons. So while those doing corpora won't be as fussed about precise arrangements as I can be, there are going to be times where there will be challenges with hieroglyphs.
>
> Something that did come as a shock during the meeting was the dawning on me, in the course of a slightly tense exchange with Michael, was that Unicode fonts clearly can put a lot of burden on the font designer, especially for glyphs, if they are to include multiple and complex ligatures etc. This is where the current fonts score well (such as the Cleo Fonts) in that they are about the design only, and the arrangement is then carried out in the specialist software (e.g. JSesh) where the ultimate control lies with the Egyptologist and not the font designer. [I know that won't be well received but you know where I come from on all this!]
>
> One question that I should have asked at the meeting is this: not knowing how control characters actually manifest themselves, will a text formatted with unicode with all the control characters be able to be exported into a plain MdC format, or will one of these input systems be able to import an MdC text and format it correctly? I ask as there will be a lot of MdC plain text around on people's hard discs.
> Or will be have ultimately to revise the MdC system to handle all these other codes?
>
> Apologies in advance for the evident failures of comprehension in some of the points above. I don't really need comments on the first points I made, but I would be interested on how the MdC issues might be handled.
>
> Best, Nigel
>
> On 24 Jul 2016, at 11:33, Stéphane polis wrote:
>
> > Hi guys!
> >
> > (no worries, this is my last mail on the topic, enough time and energy spent on this.)
> >>> Third
> >>>
> >> And your solution is a good one – I have absolutely no doubt – as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the "orthographic" system of ancient Egyptian and to which we want to have access (see further Simon's mail earlier this week).
> >>
> >> In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs.
> >
> > This is patently inaccurate.
> > Three simple examples should suffice here (sorry for providing textbook examples known by all).
> >
> > 1 - spatial distribution affecting phono-morphology
> > [t] followed by [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same "linear order", 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes).
> >
> > 2 - spatial distribution as a condition for reading (and adding semantic value)
> >
> > makes no sense, when combined as
> >
> > it is clear that it should be read /ptH/ "Ptah" (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth). The position of the sign is both a condition for reading and an added semantic information (not only Ptah, but Ptah as demiurge).
> >
> > 3 - spatial distribution affecting the function of a sign
> > Hieratic example?
> > If the rowing man is followed by a n (
> >
> > ) the n has to be read /n/ (phonemogram), if the n is positioned under the rowing man (
> >
> > ), we are dealing with a compound classifier made of two signs, the /n/ has an iconic value and the whole group refer to a man rowing in water.
> >
> > Etc., etc.
> >
> > So position is 'just an esthetic, layout matter, not a linguistic one' as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. That's a pity to make such unwarranted statements in a discussion also aimed at non-specialists to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone.
> >
> > Accordingly, if Unicode aims first and foremost at rendering the "linguistic" dimension of writing, the examples above should suffice to show the importance of the 'quadrat' organization of this script. Again, you might not like it from a computer/font oriented perspective; I agree it's not convenient to encode, it might even not be possible to encode it at all in Unicode because the standard has not the capabilities needed (and only higher level protocols could then handle this). All of this is fine with me, but it would be great not to distord presentation of the data.
> >
> > [I leave here alone other semiotic dimensions of writing: the spatial arrangement is part of the "orthography" of the scribes, not necessarily meaningful at the "linguistic" level strictly speaking, but at the level of scribal practices, etc.: why don't we use IPA for our modern languages? simply because writing is much more than "linguistic" stricto sensu and that it make sense to know who writes "next" and who plays with the script and writes "neckst". As simple as that.]
> >
> >
> >> No more than illuminated initial capital letters in medieval manuscripts.
> >> Like these:
> >>
> >> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg
> >>
> >> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is not specific need to (and in fact they are not) encode or represent them in unicode.
> >
> > This comparaison is hilarious.
> >
> >> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts.
> >
> > Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels.
> >>> 1) JSesh approach
> >>>
> >>>
> >>>
> >> Sure, everything needs not be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.
> >>
> >> Again, what data?
> >> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code?
> >
> > See above.
> >>> 2) Groups in fonts
> >>>
> >>>
> >> OK, sure. But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such an information, as a lay user, when using ligatures? (this is a real question, nothing ironic here).
> >>
> >> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. the ligature IS the information.
> >> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font.
> >> And with a common database and a common font of reference, if would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively if i write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures.
> >
> > All this was clear, sure. What I meant is that the ligature are purely "graphic" right? No information is stored about the position of one sign with respect to another?
> >
> >>> 3) the 4 "small sign in the corner of big sign" control characters.
> >>>
> >> I'm glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 "variables" (Top vs Bottom and Left-Right), if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub.
> >> Please keep in mind that the four operators come from another type of syntax.
> >>
> >> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A. instead of top-bottom, left-right + four corners.
> >> Or perhaps I am not understanding your system.
> >
> > OK, then I disagree because of the polysemic value of A/B and B/A.
> >
> >>> 4) Vertical and horizontal script and control characters
> >>>
> >>>
> >>
> >> I do not understand why you do entirely get rid of the notion of "quadrat" in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs.
> >>
> >> If you want to have the "notion of quadrate" in unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. if i understood correctly, one of the proposal that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters that hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it make sense in unicode (and more important i doubt the people of the unicode consortium will think it make sense).
> >>
> >> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a chinese character.
> >>
> >> But I assume (i hope) we don't want do do that, right?
> >
> > We use spaces between words, would it be bad to use quadrats separators between quadrats? Mutatis mutandis, it feels to me like saying "oh, no, there are blank spaces all around!".
> >
> > More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is building quadrats (implicitly or explicitly).
> >>> 5) special characters, vertical/horizontal texts and input methods
> >>>
> >>>
> >>
> >> Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same.
> >>
> >>
> >> With quadrats marked by special control characters would be even worse, as you would have to take into consideration even more possible combinations.
> >>
> >>
> >>> 6) Ramesside "groups" (or "tall groups").
> >>>
> >>>
> >>
> >> Why are they groups or quadrats and not "small columns"?
> >>
> >> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example.
> >>
> >> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?).
> >>
> >> Not true.
> >> This can be interpreted just as a question of layout, not as a question of grouping.
> >
> > Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you're entitled to have your own kind of Egyptian if you want!
> > I'm not arguing against non-sense. But that's a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your "small columns" hypothesis.
> > The Ramesside example that you provide simply does not work this way.
> >
> > And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. Comparing apple with pears won't help: and again, that's a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them. (unlike the quadrat structure).
> >
> >
> >> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement.
> >>
> >> All the egyptian words quoted in my article published on the last issue of JNES were written with my font with just standard ligatures, without control characters.
> >> No one complained, no one told me anything, so I assume that such a system is indeed suitable for scientific publications, not only for writing tourists' names.. ;-)
> >
> > I'm afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as any evidence for disproving my point: you're happy with the way your hieroglyphs are rendered? Perfect. But why should it be the case that people willing to render a simple hieroglyphic line as the one below would face difficulties? This escapes me.
> >
> >
> >
> > (KRI I, 4)
> >>> 7) What are the "square groups"?
> >>>
> >>>
> >>
> >> I provided the definition agreed on – I think – by everyone, if you have a better one, I'm listening of course.
> >>
> >> see above.
> >> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database, and on his respect, to respond to you last remark..).
> >>
> >> I'm sorry to put it so bluntly, but if Unicode is not to be useful for the majority of egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they're doing well with it.
> >>
> >> Perhaps the opposite should also be considered?
> >> perhaps the unicode-based system to write hieroglyphs (considering that the very idea at the basis of unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard unicode system, then perhaps it is them who should use other tools?
> >>
> >> I guess this is an option that should be considered.. no?
> >
> > We are talking about feelings and different perceptions here, so it's hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc) will never be happy with what Unicode has to offer, because this is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the more open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating Online resources, etc. I feel like we are very much in favor of standardization and striving for making resources accessible; two goals of Unicode, as you mentioned.
> >
> > These resources should be there for lasting, it would be a pity not to think about the standard carefully before going in any direction.
> >
> > That's all folks!
> > Have a nice weekend,
> >
> > Stéphane
> > _______________________________________________
> > Egyptian mailing list
> > Egyptian at evertype.com
> > http://evertype.com/mailman/listinfo/egyptian_evertype.com
> >
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com

From odusseus at gmail.com Sun Jul 24 17:15:24 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sun, 24 Jul 2016 18:15:24 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID:

Just one comment:

> And all the parallels from other scripts are pointless (and admittedly
> funny in the framework of this discussion), since none of these scripts are
> based on a quadratic structure that is close in any respect to Egyptian.

Funny to read such a comment when, at that meeting in Cambridge, a Japanese linguist and Egyptologist explained, by referring to scientific publications, that Japanese (and Chinese and Korean) are indeed based on quadratic structures that are very close in many respects to Egyptian.

There are many things that could be said in reply to your email, but let's be honest: with these premises, it won't make any sense.

p.s.:
No, actually, I am going to add something:
Look what happens (image 1 below) if you cut your Ramesside line into small columns (which is just a way of analyzing the text, as is yours of taking them as groups, but as we are not Egyptians it is not "truth", it is not and does not want to be the "true nature of the text"), I was saying: look what happens (image 1 below) if you cut your Ramesside line into small columns and put them one under the other..

Such a nice example of Egyptian vertical text, isn't it? :-)

And now, count how many groups you would need to compose such a text with a vertical font within a vertical layout.
You can see the result in the second image: the groups you will need are marked in green: 7 groups. That's all.
Note that you will not need to combine the D snake and the ns tongue as a group with the following signs, because with a vertical font you could put the baseline of the sign under its "horizontal" bit, and you could consider the tail as hanging below the baseline, as it is for the Latin letters "q", "g", "p" and so on. And consider that the "30" is already a single character in Unicode (as is the t&w, btw.. so I would suggest you find a more relevant example to argue for the importance of the relative position of signs..)

So if you use my way of interpreting Ramesside texts as short vertical strings of text within a main horizontal layout, you would just need 7 (sic!!!), in general very basic, groups to display it correctly. All the other signs could just be input one after the other, without control characters, without ligatures, without anything, within short columns one next to the other. And you could just use a basic layout algorithm to make them fit the space in a nice way (you know, like when you expand or squeeze the letters within a line to make your text look better? exactly the same thing, but vertically).

Instead, with your way of interpreting Ramesside writing, which assumes that every "tall group" is a real "group" that needs to be built and displayed within a single purely horizontal layout, and with your system of control characters, you would need 15 (sic!!!) groups, many of them very rare if not unique, and you would need tens of control characters nested one into the other to correctly build and correctly display each of them.

I really wonder which approach would be more efficient, more economic and more easily implemented..

Image 1:
[image: Inline image 1]

Image 2:
[image: Inline image 2]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image_2.png
Type: image/png
Size: 62103 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image_1.png
Type: image/png
Size: 59893 bytes
Desc: not available
URL:

From odusseus at gmail.com Sun Jul 24 17:17:13 2016
From: odusseus at gmail.com (Marwan Kilani)
Date: Sun, 24 Jul 2016 18:17:13 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID:

Have a nice Sunday evening
Marwan
(sorry, that was meant to be in the previous email, obviously :-p )

From bobqq at live.co.uk Sun Jul 24 17:38:59 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sun, 24 Jul 2016 17:38:59 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To: <4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

<4B6C12F0-C9B1-4168-ACE1-AE0385979BA2@cam.ac.uk>
Message-ID:

Replying to Nigel

The distinction you draw between transcription from hieratic to hieroglyphic and transcription from (original) hieroglyphic to (digital format) hieroglyphic is important. This is one snag with the MdC tradition, which encourages 'one size fits all' thinking about arrangements of hieroglyphs and groups. Likewise vertical and horizontal writing have their own considerations.

Fonts. As you note, a font such as Cleo defines the glyphs only. Nevertheless, in designing her font Cleo gave attention to the relative proportions of signs and their use in combinations, using the tools at her disposal at the time. Her use case was based on the Gardiner Egyptian Grammar model, which attempts to render classic Middle Egyptian style well at 18pt text and acceptably at 12pt. Contrast with the Hieroglyphica font, which provides more detail but is optimised for larger point sizes.

Traditional MdC applications such as JSesh come with font+application and the two are intended to be used in concert. The Egyptologist has little freedom unless both font and application meet their needs. Portability from one app+font to another app+font is not ideal.

Fonts with shaping in Unicode. Glyphs as before. But the font designer now has the opportunity to have more control over how their font looks and to deal better with proportion and aesthetics. Whether this is a burden will depend on the tools available. This is one personal interest of mine, and font practicalities are part of my thinking behind control characters and their straightforward implementation.

An application such as JSesh can choose to ignore some or all of the features built into the font and do its own thing. It can also add functionality on top of basic text. There is no loss, only the potential of gain.
My personal prototype tools work with MdC (including JSesh extensions) and Unicode plain text and in my experience over last 18 months it all works pretty well. It is desirable that application such as JSesh add in support for Unicode. JSesh 5.5 does not have 'Gardiner codes' for all the Unicode (2009) hieroglyphs so there is housekeeping needed but nothing Serge or I regard as problematic. Incidentally. My own software is agnostic about what is settled on for initial Unicode plain text controls (I could even add in support for RES if required!) but on hold until standards situation is clarified. Big topic. Bob -----Original Message----- From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Nigel Strudwick Sent: 24 July 2016 12:56 To: Egyptian Hieroglyphs in the UCS Subject: Re: [Egyptian] Some general considerations Greetings Just wanted to say that St?phane?s last paragraph admirably summarises my perspective. I haven?t been reading this posts closely since I don?t have the theoretical background on Unicode to have a view. But some observations and a question. But I would say that of the categories St?phane mentions in his last paragraph, it is probably also making a distinction in the text corpora section between hieroglyphic and hieratic texts, simply due to the generally more regular arrangement of the latter, and hieroglyphic?s notorious ability to be squeezed, expanded etc to fit a space for a variety of reasons. So while those doing corpora won?t be as fussed about precise arrangements as I can be, there are going to be times where there will be challenges with hieroglyphs. Something that did come as a shock during the meeting was the dawning on me, in the course of a slightly tense exchange with Michael, was that Unicode fonts clearly can put a lot of burden on the font designer, especially for glyphs, if they are to include multiple and complex ligatures etc. 
This is where the current fonts score well (such as the Cleo Fonts) in that they are about the design only, and the arrangement is then carried out in the specialist software (e.g. JSesh) where the ultimate control lies with the Egyptologist and not the font designer. [I know that won?t be well received but you know where I come from on all this!] One question that I should have asked at the meeting is this: not knowing how control characters actually manifest themselves, will a text formatted with unicode with all the control characters be able to be exported into a plain MdC format, or will one of these input systems be able to import an MdC text and format it correctly? I ask as there will be a lot of MdC plain text around on people?s hard discs. Or will be have ultimately to revise the MdC system to handle all these other codes? Apologies in advance for the evident failures of comprehension in some of the points above. I don?t really need comments on the first points I made, but I would be interested on how the MdC issues might be handled. Best, Nigel On 24 Jul 2016, at 11:33, St?phane polis wrote: > Hi guys! > > (no worries, this is my last mail on the topic, enough time and energy > spent on this.) >>> Third >>> >> And your solution is a good one ? I have absolutely no doubt ? as long as you do not want to *search* for the relative position of signs with respect to one another. This is however a piece of information that is, like it or not, part of the ?orthographic? system of ancient Egyptian and to which we want to have access (see further Simon?s mail earlier this week). >> >> In general, the spatial distribution of signs (i.e. the "grouping") has *no meaning at all* in Egyptian. No semantic, phonological, or morphological information is coded in the relative position of the signs. > > This is patently inaccurate. > Three simple examples should suffice here (sorry for providing textbook examples known by all). 
> > 1 - spatial distribution affecting phono-morphology [t] followed by > [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same ?linear order?, 1 reading in one case, 2 readings in the other (referring potentially to 2+ morphemes). > > 2 - spatial distribution as a condition for reading (and adding > semantic value) makes no sense, when combined > as it is clear that it should be read /ptH/ > ?Ptah? (> p(t) + t(A) + H(H)), referring to the god in his demiurgic dimension (separating the sky from the earth). The position of the sign is both a condition for reading and an added semantic information (not only Ptah, but Ptah as demiurge). > > 3 - spatial distribution affecting the function of a sign Hieratic > example? If the rowing man is followed by a n ( > ) the n has to be read /n/ (phonemogram), if the n is positioned under > the rowing man ( ), we are dealing with a > compound classifier made of two signs, the /n/ has an iconic value and the whole group refer to a man rowing in water. > > Etc., etc. > > So position is 'just an esthetic, layout matter, not a linguistic one? as you say? This kind of assertion reflects badly on our understanding of the hieroglyphic system. That?s a pity to make such unwarranted statements in a discussion also aimed at non-specialists to whom we try to explain things as straightforwardly as possible in order to come up with a solution that is satisfying for everyone. > > Accordingly, if Unicode aims first and foremost at rendering the ?linguistic? dimension of writing, the examples above should suffice to show the importance of the ?quadrat' organization of this script. Again, you might not like it from a computer/font oriented perspective; I agree it?s not convenient to encode, it might even not be possible to encode it at all in Unicode because the standard has not the capabilities needed (and only higher level protocols could then handle this). 
All of this is fine with me, but it would be great not to distort the presentation of the data.
>
> [I leave aside here other semiotic dimensions of writing: the spatial arrangement is part of the "orthography" of the scribes, not necessarily meaningful at the "linguistic" level strictly speaking, but at the level of scribal practices, etc. Why don't we use IPA for our modern languages? Simply because writing is much more than "linguistic" stricto sensu, and it makes sense to know who writes "next" and who plays with the script and writes "neckst". As simple as that.]
>
>> No more than illuminated initial capital letters in medieval manuscripts.
>> Like these:
>>
>> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg
>>
>> They are nice, they are fancy, they are there (hundreds of thousands of them), but they do not carry any additional linguistic information whatsoever. Thus, there is no specific need to encode or represent them in Unicode (and in fact they are not encoded).
>
> This comparison is hilarious.
>
>> As far as I know (and please correct me if I am wrong), the only utility that recording the position of signs could have on a practical level would be to fill lacunas in text or to suggest alternative readings in hieratic scripts.
>
> Nope, see above. This is one side-interest of the control characters that I mentioned in the discussions, indeed, but definitely not the only one, since the arrangement is meaningful at multiple levels.
>>> 1) JSesh approach
>>>
>> Sure, not everything needs to be dealt with at the level of Unicode, but the data needed should not be hidden in the ligatures embedded in the font either.
>>
>> Again, what data?
>> What is the *linguistic* (not graphic, not philological) information carried by groups that is so important to code?
>
> See above.
>>> 2) Groups in fonts
>>>
>> OK, sure.
But then again control characters have the advantage of being explicit about the relative position of signs when a group is not in a font: how would you proceed for storing such an information, as a lay user, when using ligatures? (this is a real question, nothing ironic here). >> >> I am not sure I understand your question: the information about the spatial distribution is already coded in the ligature. the ligature IS the information. >> And if the ligature does not exist, it takes (literally) 30 seconds to create it within a font. And with a common database and a common font of reference, if would be extremely easy to create a common shared table of ligatures that will allow everyone to see exactly the same groups (or alternatively if i write my text with the font x with the ligature x, it would be enough to embed the font itself in the text file (there are various ways to do so), because everyone would be using the same font or at least the same set of ligatures. > > All this was clear, sure. What I meant is that the ligature are purely ?graphic? right? No information is stored about the position of one sign with respect to another? > >>> 3) the 4 ?small sign in the corner of big sign? control characters. >>> >> I?m glad to see that your solution is exactly the same as the one we suggest! Indeed, there are logically 2 ?variables? (Top vs Bottom and Left-Right), if the sequence of signs indicates the Left-Right unambiguously (like in your example), then we only need to encode explicitly the Top vs Bottom, of course. This is basically how we represented things with Michael at the pub. >> Please keep in mind that the four operators come from another type of syntax. >> >> This is not what I am suggesting. I am suggesting 4 control characters: top-bottom, left-right, A\B, B/A. instead of top-bottom, left-right + four corners. >> Or perhaps I am not understanding your system. > > OK, then I disagree because of the polysemic value of A/B and B/A. 
> >>> 4) Vertical and horizontal script and control characters >>> >>> >> >> I do not understand why you do entirely get rid of the notion of ?quadrat? in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text would be displayed correctly whatever the orientation (hltr, hrtl, vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding for this; and this is a nice case for showing that we *need* this well-defined notion in the encoding, not just sequences of glyphs. >> >> If you want to have the "notion of quadrate" in unicode, then you have to introduce at least another control character that you will have to use in front of *every single quadrate* in your text. if i understood correctly, one of the proposal that were circulating was indeed suggesting something like that. Which means that to display any text you will probably end up using more control characters that hieroglyphic signs themselves. This can be a sound way of thinking in a MdC (as you don't like JSesh ;-) ) perspective, but I doubt it make sense in unicode (and more important i doubt the people of the unicode consortium will think it make sense). >> >> Unless you want to code every single possible "quadrate" as an independent glyph: that would thus end up being conceptually comparable to a chinese character. >> >> But I assume (i hope) we don't want do do that, right? > > We use spaces between words, would it be bad to use quadrats separators between quadrats? Mutatis mutandis, it feels to me like saying ?oh, no, there are blank spaces all around!?. > > More seriously, taking signs as basic units for Unicode might be a practical solution (even if it leads to many subsequent difficulties, see the vertical vs horizontal discussion), but denying the essential function of quadrats is kind of funny when discussing the introduction of control characters (1, 2, 38, it does not matter): what they do is building quadrats (implicitly or explicitly). 
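To make the quadrat argument above concrete, here is a minimal sketch of how an MdC-style string such as hthr-H*(n:t)-tA:tA could be split into quadrats, independently of the writing direction. It assumes only the usual MdC reading of the operators ("-" separates quadrats, ":" stacks vertically, "*" juxtaposes horizontally and binds more tightly than ":"). The function name `parse` and the tuple layout are illustrative choices, not part of any proposal discussed on this list, and sign codes are treated as opaque tokens.

```python
import re

# Simplified MdC subset: sign codes plus the operators - : * and parentheses.
TOKEN = re.compile(r"[A-Za-z0-9]+|[-:*()]")


def parse(mdc):
    """Return a list of quadrats; each is a sign code or an (op, children) tuple."""
    tokens = TOKEN.findall(mdc)
    if "".join(tokens) != mdc:
        raise ValueError("unsupported character in %r" % mdc)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is not None:
            pos += 1
        return tok

    def atom():
        # A sign code, or a parenthesised subgroup.
        tok = take()
        if tok == "(":
            node = vgroup()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        if tok is None or tok in ("-", ":", "*", ")"):
            raise ValueError("expected a sign code")
        return tok

    def chain(operator, label, operand):
        # Collect operands joined by one operator; a single operand stays bare.
        items = [operand()]
        while peek() == operator:
            take()
            items.append(operand())
        return items[0] if len(items) == 1 else (label, items)

    def hgroup():
        return chain("*", "h", atom)      # "*" binds tightest

    def vgroup():
        return chain(":", "v", hgroup)    # ":" stacks horizontal groups

    quadrats = [vgroup()]
    while peek() == "-":                  # "-" separates quadrats
        take()
        quadrats.append(vgroup())
    if peek() is not None:
        raise ValueError("unexpected token %r" % peek())
    return quadrats


print(parse("hthr-H*(n:t)-tA:tA"))
```

For the string quoted above this yields three quadrats: the bare sign hthr, a horizontal group of H with the stack n over t, and the stack tA over tA. A renderer could then lay out the same three units in rows or in columns without touching the encoding.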
>>> 5) special characters, vertical/horizontal texts and input methods >>> >>> >> >> Your point escapes me, here. Unless it is a result of getting rid of the quadrats: within quadrats, the groups of hieroglyphs are essentially the same. >> >> >> With quadrats marked by special control characters would be even worse, as you would have to take into consideration even more possible combinations. >> >> >>> 6) Ramesside ?groups? (or ?tall groups?). >>> >>> >> >> Why are they groups or quadrats and not ?small columns?? >> >> The answer is simple and straightforward (and provided by your example): because they correspond to the size occupied by the A1 sign. Look at the example. >> >> but some signs do occupy the full height of the horizontal line (unlike in your Japanese example, which is nice, but as such irrelevant), which is the basis for deciding what counts as a unit (quadrat) or not. (Or do you have another definition in mind?). >> >> Not true. >> This can be interpreted just as a question of layout, not as a question of grouping. > > Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a conference: you?re entitled to have your own kind of Egyptian if you want! > I?m not arguing against non-sense. But that?s a pity, because there are actually many cases (esp. in Ptolemaic temples) that fit perfectly with your ?small columns? hypothesis. > The Ramesside example that you provide simply does not work this way. > > And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. Comparing apple with pears won?t help: and again, that?s a bit of a pity, because we have many cases in Egyptian of layouts similar to the ones you mentioned. These are nice cases of special layouts, indeed, and I agree with your analysis: we do not want to encode them. (unlike the quadrat structure). 
> > >> But if you want to be able to use it minimally for (1) standardized electronic corpus and (2) journals like LingAeg, etc. dealing with the Egyptian language, this would be a requirement. >> >> All the egyptian words quoted in my article published on the last >> issue of JNES were written with my font with just standard ligatures, >> without control characters. No one complained, no one told me >> anything, so I assume that such a system is indeed suitable for >> scientific publications, not only for writing tourists' names.. ;-) > > I?m afraid that one paper consisting of a lexicographical discussion about the use of one word in one hieratic document cannot be taken as any evidence for disproving my point: you?re happy with the way your hieroglyphs are rendered? Perfect. But why should it be the case that people willing to render a simple hieroglyphic line as the one below would face difficulties? This escapes me. > > > (KRI I, 4) >>> 7) What are the ?square groups?? >>> >>> >> >> I provided the definition agreed on ? I think ? by everyone, if you have a better one, I?m listening of course. >> >> see above. >> And see my "Second" introductory point in the previous email: it is not a question of finding the "right" definition (as there isn't any). It is a question of finding the *most efficient* definition in order to efficiently reach our goal of having a working system of unicode-based hieroglyphs that can be used by the *whole* Egyptological community (and not only by a handful of people working on a specific corpus or database, and on his respect, to respond to you last remark..). >> >> I?m sorry to put it so bluntly, but if Unicode is not to be useful for the majority of egyptologists, so be it. It will remain what it is, a standard not used by the community: Journals, Corpora, etc. will keep on using JSESH and other tools, they?re doing well with it. >> >> Perhaps the opposite should also be considered? 
perhaps the unicode-based system to write hieroglyphs (considering that the very idea at the basis of unicode is to make writing standardized and accessible to as many people as possible) should indeed aim at being useful for the majority of the Egyptologists (corpus-linguists as well as all the other thousands of non-corpus-linguist egyptologists), and if some team working with some specific database should have some specific need that cannot be satisfied with such a standard unicode system, then perhaps it is them who should use other tools? >> >> I guess this is an option that should be considered.. no? > > We are talking about feelings and different perceptions here, so it?s hard to be objective. My own view (admittedly subjective) is as follows: most Egyptologists (publishing texts, monuments, etc) will never be happy with what Unicode has to offer, because this is not precise enough at multiple levels (Nigel expressed this position multiple times during the meeting for instance). On the other hand, grammarians (broadly speaking) and people working on corpora are probably the ones who are the more open to standardization (hieroglyphs were not even there in the TLA at the beginning). I see this community (much more than historians, etc.) as intensive users of Unicode: using it for exchanging texts, publishing volumes, creating Online resources, etc. I feel like we are very much in favor of standardization and striving for making resources accessible; two goals of Unicode, as you mentioned. > > These resources should be there for lasting, it would be a pity not to think about the standard carefully before going in any direction. > > That?s all folks! 
> Have a nice weekend,
>
> Stéphane

_______________________________________________
Egyptian mailing list
Egyptian at evertype.com
http://evertype.com/mailman/listinfo/egyptian_evertype.com

From bobqq at live.co.uk Sun Jul 24 19:41:23 2016
From: bobqq at live.co.uk (Bob Richmond)
Date: Sun, 24 Jul 2016 19:41:23 +0100
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
Message-ID:

Hi Mark-Jan

We've been talking about plain text on and off for over 18 months, so I was interested to read your L2/16-177 "new system" earlier this month and your latest update yesterday (https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf). It is good to see that your number of control characters is now reduced, that more consideration has been given to OpenType, and that some of the discussion in Cambridge has been factored in. I understand what you are trying to do and there are points I'd like to discuss when we have time. There are items you note that indeed need to be progressed and agreed (EMPTY, STACK CARTOUCHE ...).

However, I am disappointed you have not taken on board the fundamental failing of the scheme you've been developing, which makes it unsuitable for Unicode plain text consideration, however useful it may be for other purposes. I thought this was clear at Cambridge, but apparently not.

Unicode plain text hieroglyphic is exposed to a huge audience, and the very first consideration is: don't make simple things complicated. Use of BEGIN/END for every group breaks that basic rule. Your scheme is unnecessarily complicated, so it fails at the outset. It's a continuing distraction for the less technically minded in this group to continue to hold it up as a viable alternative to the current UTC recommendation. It is not.

MdC X1:R1 in your scheme uses 3 control characters.
Quite honestly, I find it hard to understand why anyone thinks this could be a good idea. If anyone reading this does support 3 over 1, please let me know your reasoning; I'm actually quite curious why I've had to spend time on this.

What you've tried to do is build a theoretical model which describes cluster layout given a set of constraints, and that's all good as an academic exercise. I'd be interested to see it tested against texts and data. All of its features could be added in some way to the three-control system as HLP or extra controls.

So, there's no need to throw your work away. Some parts apply to plain text implementation, e.g. your description of making an MdC-like font. There is plenty more you could re-purpose should you wish to continue to be involved in Unicode developments.

What I suggest you do is consider how you might use elements of your scheme in a simple higher level protocol on top of plain text as it is at present (I'll be circulating my considered view on adding to the 3 control set on Tuesday). Brackets in an HLP could be OK for rare cases. They may be useful for TLA and Ramses.

Meanwhile, if you have any ideas on how you would like to see e.g. 4 corners added to the existing proposal, I'd be pleased to hear from you or anyone else on the topic.

Regards,
Bob

From s.polis at ulg.ac.be Mon Jul 25 11:11:43 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Mon, 25 Jul 2016 12:11:43 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID:

Yes, Marwan, I agree that we should indeed stop here.

The vertical text that you produce is nothing even close to what could have been an ancient Egyptian vertical text, even less a "nice example" (or was it ironic?). If you want to produce such texts for your own purposes, that's fine, but for rendering ancient texts that is not an option (and, as I said during the meeting, converting a hieroglyphic text from vertical to horizontal is not an easy business: it implies interpretation and restructuring). As such, counting the number of groups makes little sense (minimally, p-w should be p*w; i-n should be i*n; iyA would imply 2 groups + A1; etc.) and won't lead to any firm conclusions about the best options for Unicode.

So I'd rather stop here and wish you all the best for your future work on ancient Egyptian.

Stéphane

> On 24 Jul 2016, at 18:15, Marwan Kilani wrote:
>
> Just one comment:
>
>> And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian.
>
> Funny to read such a comment when, at that meeting in Cambridge, a Japanese linguist and Egyptologist explained, by referring to scientific publications, that Japanese (and Chinese and Korean) are indeed based on quadratic structures that are very close in many respects to Egyptian.
>
> There were many things that could be said as a reply to your email, but let's be honest: with these premises, it won't make any sense.
> > p.s.: > No, actually, i am going to add something: > Look what happens (image 1 below) if you cut your ramesside line in small columns (which is just a way of analyzing the text, as it is yours taking them as groups, but as we are not egyptians it is not "truth", it is not and don't want to be the "true nature of the text"), I was saying: Look what happens (image 1 below) if you cut your ramesside line in small columns and you put them one under the other.. > > Such a nice example of Egyptian vertical text, isn't it? :-) > > And now, count how many groups you would need to compose such a text with a vertical font within a vertical layout. > You can see the result in the second image: the groups you will need are marked in green: 7 groups. That's all. > Note that you will not need to combine the D snake and the ns tongue as a group with the following signs, because with a vertical font you could put the baseline of the sign under its "horizontal" bit, and you could consider the tail as hanging below the baseline, as it is for the latin letters "q", "g" "p" and so on. And consider that the "30" is already a single character in unicode (as the t&w, btw.. so i would suggest you to find a more relevant example to argue for the importance of the relative position of signs..) > > So if you use my way of interpreting ramesside texts as short vertical strings of texts within an main horizontal layout, you would just need 7 (sic!!!), in general very basic, groups to display it correctly. All the other signs could just be inputted one after the other, without control characters, without ligatures, without anything, within short columns one next to the other. And you could just use basic layout algorithm to make them with the space in a nice way (you know, like when you expand or to squeeze the letters within a line to make you text look better? exactly the same thing, but vertically). 
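The "basic layout algorithm" mentioned above is not specified in the message. As one possible reading, the sketch below stretches or squeezes a short column of signs to fill a fixed height, by analogy with horizontal justification of a line of letters. Everything in it is an assumption for illustration: the function name `justify_column`, the abstract height units, the choice to spread leftover space evenly between signs, and the choice to scale signs down uniformly when they do not fit.

```python
def justify_column(heights, column_height):
    """Place signs top to bottom in a column; return (offset, height) pairs.

    heights: natural heights of the signs, in arbitrary units (hypothetical).
    column_height: the height of the line the short column must fill.
    """
    if not heights:
        return []
    total = sum(heights)
    if total > column_height:
        # Squeeze: scale every sign down uniformly so the column fits exactly.
        scale = column_height / total
        heights = [h * scale for h in heights]
        total = column_height
    # Expand: share the leftover space equally between neighbouring signs.
    gaps = len(heights) - 1
    gap = (column_height - total) / gaps if gaps else 0.0
    placed = []
    y = 0.0
    for h in heights:
        placed.append((y, h))
        y += h + gap
    return placed


# Three signs of heights 2, 3 and 1 in a column of height 10.
print(justify_column([2, 3, 1], 10))
```

A single sign is simply top-aligned here; a real renderer would probably centre it or scale it, which is a design decision this sketch leaves open.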
>
> Instead, with your way of interpreting Ramesside writing, which assumes that every "tall group" is a real "group" that needs to be built and displayed within a single purely horizontal layout, and with your system of control characters, you would need 15 (sic!!!) groups, many of them very rare if not even unique, and you would need tens of control characters nested one into the other to build and display each of them correctly.
>
> I really wonder which approach would be more efficient, more economic and more easily implemented..
>
> Image 1:
>
> Image 2:
>

From s.polis at ulg.ac.be Mon Jul 25 11:25:19 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Mon, 25 Jul 2016 12:25:19 +0200
Subject: [Egyptian] On "A system of control characters for Ancient Egyptian hieroglyphic text" (2016-07-23)
In-Reply-To:
References:
Message-ID:

Hi all,

If I may jump in one last time.

As I said several times, I leave all the issues concerning the syntax of the operators entirely up to the specialists. What matters for me is what you can effectively achieve, and Mark-Jan's proposal covers precisely what we minimally would like to have (which is why I support it wholeheartedly).

Now, if it really ends up being too much for Unicode and there is no way to make this happen there, but you are confident that it can be handled at the level of HLPs, then can I ask a very naive question: what need is there for any control character in Unicode? After all, Marwan's font with the ligatures seems to work quite well for basic purposes, so it can be considered a good solution for some users, esp. when combined with So's input system.
For other uses, we will need several types of grouping, groups inserted in groups, etc.: why would some bits end up in Unicode, while other would be up to HLP? Shouldn?t we try to have a coherent scheme and not something made of bits and pieces? A real (even if maybe naive) concern. Take care, St?phane > Le 24 juil. 2016 ? 20:41, Bob Richmond a ?crit : > > Hi Mark-Jan > > We?ve been talking about plain text on and off for over 18 months so I was interested to read your L2/16-177 ?new system? earlier this month and your latest update yesterday (https://mjn.host.cs.st-andrews.ac.uk/tmp/unicode2.pdf ). Good to see your number of control characters now reduced, more consideration has been given to OpenType and some discussion in Cambridge has been factored in. I understand what you are trying to do and there are points I?d like to discuss when we have time. There are items you note that indeed need to be progressed and agreed (EMPTY, STACK CARTOUCHE ?) > > However I am disappointed you have not taken on board the fundamental failing with the scheme you?ve been developing that makes it unsuitable for Unicode plain text consideration, however useful it may be for other purposes. I thought this was clear at Cambridge but apparently not. > > Unicode plain text hieroglyphic is exposed to a huge audience and the very first consideration is don?t make simple things complicated. Use of BEGIN/END for every group breaks that basic rule. Your scheme is unnecessarily complicated so fails at the outset. It?s a continuing distraction for the less technically minded in this group to continue to hold it up as a viable alternative to the current UTC recommendation. It is not. > > MdC X1:R1 in your scheme uses 3 control characters. Quite honestly I find it hard to understand why anyone thinks this is possibly a good idea. If anyone reading this does support 3 over 1 please let me know your reasoning I?m actually quite curious why I?ve had to spend time on this. 
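The "3 over 1" count for MdC X1:R1 can be reproduced with a small model. The sketch below assumes two simplified encodings: a flat one with a single joiner between neighbouring signs, and a bracketed one that adds a BEGIN and an END control around every group, as described in the message above. This is a counting model only, not the actual syntax of either proposal; the tuple layout and the name `count_controls` are hypothetical, and the flat model deliberately ignores how nested groups would be disambiguated.

```python
def count_controls(node, bracket_every_group):
    """Count control characters needed to encode a group tree.

    node: a sign code (str) or an (op, children) tuple, e.g. ("v", ["X1", "R1"]).
    bracket_every_group: if True, each group also costs one BEGIN and one END.
    """
    if isinstance(node, str):
        return 0                      # a bare sign needs no control character
    _op, children = node
    joiners = len(children) - 1       # one joiner between neighbouring members
    brackets = 2 if bracket_every_group else 0
    nested = sum(count_controls(child, bracket_every_group) for child in children)
    return joiners + brackets + nested


x1_over_r1 = ("v", ["X1", "R1"])                 # MdC X1:R1
print(count_controls(x1_over_r1, False))         # flat joiner model: 1
print(count_controls(x1_over_r1, True))          # bracketed model: 3

nested = ("h", ["H", ("v", ["n", "t"])])         # MdC H*(n:t)
print(count_controls(nested, False), count_controls(nested, True))
```

Under these assumptions the overhead of the bracketed model grows by two controls per group, which is the cost being weighed against its ability to express nesting explicitly.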
> > What you?ve tried to do is build a theoretical model which describes cluster layout given a set of constraints and that?s all good as an academic exercise. I?d be interested to see it tested against texts and data. All features of could be added in some way to the three control system as HLP or extra controls. > > So, there?s no need to throw your work away. Some parts apply to plain text implementation e.g. your description on making an MdC-like font. Plenty more you could re-purpose should you wish to continue to be involved in Unicode developments. > > What I suggest you do is consider how you might use elements of your scheme in a simple higher level protocol on top of plain text as it is at present (I?ll be circulating my considered view on adding to the 3 control set on Tuesday). Brackets in an HLP could be ok for rare cases. May be useful for TLA and Ramses. > > Meanwhile if you have any ideas on how you would like to see e.g. 4 corners added to the existing proposal I?d be pleased to hear from your or anyone else on the topic. > > Regards, > Bob > > > > > > > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From odusseus at gmail.com Mon Jul 25 11:49:55 2016 From: odusseus at gmail.com (Marwan Kilani) Date: Mon, 25 Jul 2016 12:49:55 +0200 Subject: [Egyptian] Some general considerations In-Reply-To: References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID:

On Mon, Jul 25, 2016 at 12:11 PM, Stéphane polis wrote:

> Yes, Marwan, I agree that we should indeed stop here.
>
> The vertical text that you produce is nothing even close to what could have been an ancient Egyptian vertical text, even less a "nice example" (or was it ironic?).

Stéphane, I am sorry to disappoint you, but you are not an ancient Egyptian scribe, and your perception of Egyptian writing is not the "truth".

Your aim is not to *produce* an ancient Egyptian text.

The aim is, or should be, to analyze what the ancient Egyptians wrote so as to transcribe it in a precise and efficient way, not to determine what, according to some abstract concept, is the "true nature" of a text and to try to encode this "true nature".

And the aim of that "vertical rendition" was not to produce a "real" ancient Egyptian vertical text. It was to make explicit and explain one specific step (and its advantages) of the inputting procedure and approach that I am suggesting.

And my approach allows one to transcribe it as precisely as yours, but more efficiently.
Perhaps you should try to understand how it works (because clearly you don't) before stating what is possible and what is not.
And perhaps you should try to understand a bit better how fonts, ligatures, layouts, and unicode in general work because you seem to believe that is possible/practical to do things that are actually impossible/impractical (stacking in hieroglyphs, using tens of control characters to display 3 hieroglyphs), and you seem to believe that is impossible/impractical to do things that are instead very possible/practical (as coding spatial information in ligatures - you can give names to ligature, so you can call your p*w-ligature "p*w" and you have your spatial information that you can retrieve from your font whenever you want) Let alone your comments about non-western non-latin scripts or your tw/wt example (which is a non-existing problem that has *already* been solved in the *current* current unicode set, where t&w is *already* coded as an independent character. So if you want to display tw = /tw/ you just input t + w, if you want to display wt = /wt/ you just input w + t, and if you want to display t&w = /wt/-/tw/ leaving the ambiguity of the pronunciation you just input the unicode character tw (u13172) which was probably encoded *exactly* for this purpose - pretty smart idea btw? - you should find some other common examples that cannot be solved in this very easy and practical way to explain why coding position is important, because arguing for a proposal on the basis of a problem that has *already* been solved in the current unicode set, well.. and btw, if you knew how ligatures work, you would see that in the case of the wt/tw problem it is actually possible to code *more* information, about the spatial organization *AND* about the reading order of the signs, by using ligatures, than by using control characters.. but I assume you are not really interested in that right?) So well.. All the best with your work with ancient Egyptian as well Marwan > Le 24 juil. 2016 ? 
18:15, Marwan Kilani a ?crit : > > Just one comment: > > And all the parallels from other scripts are pointless (and admittedly >> funny in the framework of this discussion), since none of these scripts are >> based on a quadratic structure that is close in any respect to Egyptian. >> > > Funny to read such a comment while at that meeting in Cambridge a Japanese > linguist and Egyptologist explained, by referring to scientific > publications, that Japanese (and chinese and korean) are indeed based on > quadratic structures that are indeed very close in many respect to Egyptian. > > There were many things that could be said as a reply to your email, but > let be honest: with these premises, it won't make any sense. > > p.s.: > No, actually, i am going to add something: > Look what happens (image 1 below) if you cut your ramesside line in small > columns (which is just a way of analyzing the text, as it is yours taking > them as groups, but as we are not egyptians it is not "truth", it is not > and don't want to be the "true nature of the text"), I was saying: Look > what happens (image 1 below) if you cut your ramesside line in small > columns and you put them one under the other.. > > Such a nice example of Egyptian vertical text, isn't it? :-) > > And now, count how many groups you would need to compose such a text with > a vertical font within a vertical layout. > You can see the result in the second image: the groups you will need are > marked in green: 7 groups. That's all. > Note that you will not need to combine the D snake and the ns tongue as a > group with the following signs, because with a vertical font you could put > the baseline of the sign under its "horizontal" bit, and you could consider > the tail as hanging below the baseline, as it is for the latin letters "q", > "g" "p" and so on. And consider that the "30" is already a single character > in unicode (as the t&w, btw.. 
so i would suggest you to find a more > relevant example to argue for the importance of the relative position of > signs..) > > So if you use my way of interpreting ramesside texts as short vertical > strings of texts within an main horizontal layout, you would just need 7 > (sic!!!), in general very basic, groups to display it correctly. All the > other signs could just be inputted one after the other, without control > characters, without ligatures, without anything, within short columns one > next to the other. And you could just use basic layout algorithm to make > them with the space in a nice way (you know, like when you expand or to > squeeze the letters within a line to make you text look better? exactly the > same thing, but vertically). > > Instead, with your way of interpreting ramesside writing, which assumes > that every "tall group" is a real "group" that need to be built and > displayed within a single purely horizontal layout, and with your system of > control characters, you would need 15 (sic!!!) groups, many of them very > rare in not even unique, and you would need tens of control characters > nested one into the other to correctly build and correctly display each of > them. > > I really wonder which approach would be more efficient, more economic and > more easily implemented.. > > Image 1: > > > > > > Image 2: > > > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com > > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
From ncs3 at cam.ac.uk Mon Jul 25 11:54:44 2016
From: ncs3 at cam.ac.uk (Nigel Strudwick)
Date: Mon, 25 Jul 2016 11:54:44 +0100
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID:

Can we please stop this sort of pointless arguing and get back to the basic issue: getting the encoding members back onto sorting out this Unicode control character business?

I'm trying to keep out of it, but I think there are times when everyone needs to be called to order and reminded of the main thing we need to achieve first after Cambridge.

Nigel

On 25 Jul 2016, at 11:49, Marwan Kilani wrote:

> On Mon, Jul 25, 2016 at 12:11 PM, Stéphane polis wrote:
>> Yes, Marwan, I agree that we should indeed stop here.
>>
>> The vertical text that you produce is nothing even close to what could have been an ancient Egyptian vertical text, even less a "nice example" (or was it ironic?).
>
> Stéphane, I am sorry to disappoint you, but you are not an ancient Egyptian scribe, and your perception of Egyptian writing is not the "truth".
>
> Your aim is not to *produce* an ancient Egyptian text.
>
> The aim is, or should be, to analyze what the ancient Egyptians wrote so as to transcribe it in a precise and efficient way, not to determine what, according to some abstract concept, is the "true nature" of a text and to try to encode this "true nature".
>
> And the aim of that "vertical rendition" was not to produce a "real" ancient Egyptian vertical text. It was to make explicit and explain one specific step (and its advantages) of the inputting procedure and approach that I am suggesting.
>
> And my approach allows one to transcribe it as precisely as yours, but more efficiently.
> Perhaps you should try to understand how it works (because clearly you don't) before stating what is possible and what is not.
> And perhaps you should try to understand a bit better how fonts, ligatures, layouts, and unicode in general work because you seem to believe that is possible/practical to do things that are actually impossible/impractical (stacking in hieroglyphs, using tens of control characters to display 3 hieroglyphs), and you seem to believe that is impossible/impractical to do things that are instead very possible/practical (as coding spatial information in ligatures - you can give names to ligature, so you can call your p*w-ligature "p*w" and you have your spatial information that you can retrieve from your font whenever you want) > > Let alone your comments about non-western non-latin scripts or your tw/wt example (which is a non-existing problem that has *already* been solved in the *current* current unicode set, where t&w is *already* coded as an independent character. So if you want to display tw = /tw/ you just input t + w, if you want to display wt = /wt/ you just input w + t, and if you want to display t&w = /wt/-/tw/ leaving the ambiguity of the pronunciation you just input the unicode character tw (u13172) which was probably encoded *exactly* for this purpose - pretty smart idea btw? - you should find some other common examples that cannot be solved in this very easy and practical way to explain why coding position is important, because arguing for a proposal on the basis of a problem that has *already* been solved in the current unicode set, well.. and btw, if you knew how ligatures work, you would see that in the case of the wt/tw problem it is actually possible to code *more* information, about the spatial organization *AND* about the reading order of the signs, by using ligatures, than by using control characters.. but I assume you are not really interested in that right?) > > So well.. > > All the best with your work with ancient Egyptian as well > > Marwan > > > > >> Le 24 juil. 2016 ? 
18:15, Marwan Kilani a ?crit : >> >> Just one comment: >> >> And all the parallels from other scripts are pointless (and admittedly funny in the framework of this discussion), since none of these scripts are based on a quadratic structure that is close in any respect to Egyptian. >> >> Funny to read such a comment while at that meeting in Cambridge a Japanese linguist and Egyptologist explained, by referring to scientific publications, that Japanese (and chinese and korean) are indeed based on quadratic structures that are indeed very close in many respect to Egyptian. >> >> There were many things that could be said as a reply to your email, but let be honest: with these premises, it won't make any sense. >> >> p.s.: >> No, actually, i am going to add something: >> Look what happens (image 1 below) if you cut your ramesside line in small columns (which is just a way of analyzing the text, as it is yours taking them as groups, but as we are not egyptians it is not "truth", it is not and don't want to be the "true nature of the text"), I was saying: Look what happens (image 1 below) if you cut your ramesside line in small columns and you put them one under the other.. >> >> Such a nice example of Egyptian vertical text, isn't it? :-) >> >> And now, count how many groups you would need to compose such a text with a vertical font within a vertical layout. >> You can see the result in the second image: the groups you will need are marked in green: 7 groups. That's all. >> Note that you will not need to combine the D snake and the ns tongue as a group with the following signs, because with a vertical font you could put the baseline of the sign under its "horizontal" bit, and you could consider the tail as hanging below the baseline, as it is for the latin letters "q", "g" "p" and so on. And consider that the "30" is already a single character in unicode (as the t&w, btw.. 
so i would suggest you to find a more relevant example to argue for the importance of the relative position of signs..) >> >> So if you use my way of interpreting ramesside texts as short vertical strings of texts within an main horizontal layout, you would just need 7 (sic!!!), in general very basic, groups to display it correctly. All the other signs could just be inputted one after the other, without control characters, without ligatures, without anything, within short columns one next to the other. And you could just use basic layout algorithm to make them with the space in a nice way (you know, like when you expand or to squeeze the letters within a line to make you text look better? exactly the same thing, but vertically). >> >> Instead, with your way of interpreting ramesside writing, which assumes that every "tall group" is a real "group" that need to be built and displayed within a single purely horizontal layout, and with your system of control characters, you would need 15 (sic!!!) groups, many of them very rare in not even unique, and you would need tens of control characters nested one into the other to correctly build and correctly display each of them. >> >> I really wonder which approach would be more efficient, more economic and more easily implemented.. 
>> >> Image 1: >> >> >> >> >> >> Image 2: >> >> >> >> >> _______________________________________________ >> Egyptian mailing list >> Egyptian at evertype.com >> http://evertype.com/mailman/listinfo/egyptian_evertype.com > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com > > > _______________________________________________ > Egyptian mailing list > Egyptian at evertype.com > http://evertype.com/mailman/listinfo/egyptian_evertype.com From odusseus at gmail.com Mon Jul 25 11:58:03 2016 From: odusseus at gmail.com (Marwan Kilani) Date: Mon, 25 Jul 2016 12:58:03 +0200 Subject: [Egyptian] Some general considerations In-Reply-To: References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID:

P.S.
Stéphane, please don't take my tone, which I know could have sounded a bit harsh, on a personal level. I am not taking any of your remarks personally, and feel free to criticize my arguments as much as you wish. Please do the same. It is nothing personal: I am talking about work, and this has nothing to do with me or you as a person, obviously.

I hope it is clear for everyone.

Marwan

On Mon, Jul 25, 2016 at 12:54 PM, Nigel Strudwick wrote:

> Can we please stop this sort of pointless arguing and get back to the basic issue of getting the encoding members back onto sorting out this Unicode control character business?
>
> I'm trying to keep out of it, but I think there are times when everyone needs to be called to order and reminded of the main thing we need to achieve first after Cambridge.
>
> Nigel

From s.polis at ulg.ac.be Mon Jul 25 11:58:52 2016
From: s.polis at ulg.ac.be (Stéphane polis)
Date: Mon, 25 Jul 2016 12:58:52 +0200
Subject: [Egyptian] Some general considerations
In-Reply-To:
References: <96C8F2FC-A766-4379-BCBE-B5A6FB700A3B@ulg.ac.be>

Message-ID: <7366EDBB-A193-4482-BCBC-47D2D40AB972@ulg.ac.be>

> On 25 Jul 2016, at 12:49, Marwan Kilani wrote:
>
> On Mon, Jul 25, 2016 at 12:11 PM, Stéphane polis wrote:
> Yes, Marwan, I agree that we should indeed stop here.
>
> The vertical text that you produce is nothing even close to what could have been an ancient Egyptian vertical text, even less a "nice example" (or was it ironic?).
>
> Stéphane, I am sorry to disappoint you, but you are not an ancient Egyptian scribe and, I am sorry to disappoint you, your perception of Egyptian writing is not the "truth".
>
> Your aim is not to *produce* an ancient Egyptian text.
>
> The aim is, or should be, to analyze what ancient Egyptians wrote so as to transcribe it in a precise and efficient way. Not to determine what, according to some abstract concept, is the "true nature" of a text and to try to encode this "true nature".

I never spoke about any "truth" at any point; I'm talking about what is (likely to be) attested in Egyptian texts and what is not.

> And the aim of that "vertical rendition" was not to produce a "real" ancient Egyptian vertical text. It was to make explicit and explain one specific step (and its advantages) of the inputting procedure and approach that I am suggesting.
>
> And my approach allows one to transcribe it as precisely as yours. But more efficiently.
> Perhaps you should try to understand how it works (because clearly you don't) before stating what is possible and what is not.
> And perhaps you should try to understand a bit better how fonts, ligatures, layouts, and Unicode in general work, because you seem to believe that it is possible/practical to do things that are actually impossible/impractical (stacking in hieroglyphs, using tens of control characters to display 3 hieroglyphs), and you seem to believe that it is impossible/impractical to do things that are instead very possible/practical (such as coding spatial information in ligatures: you can give names to ligatures, so you can call your p*w-ligature "p*w" and you have your spatial information, which you can retrieve from your font whenever you want).
>
> Let alone your comments about non-Western, non-Latin scripts or your tw/wt example (which is a non-existent problem that has *already* been solved in the *current* Unicode set, where t&w is *already* coded as an independent character. So if you want to display tw = /tw/ you just input t + w, if you want to display wt = /wt/ you just input w + t, and if you want to display t&w = /wt/-/tw/, leaving the ambiguity of the pronunciation, you just input the Unicode character tw (u13172), which was probably encoded *exactly* for this purpose - pretty smart idea, btw? - you should find some other common examples that cannot be solved in this very easy and practical way to explain why coding position is important, because arguing for a proposal on the basis of a problem that has *already* been solved in the current Unicode set, well.. And btw, if you knew how ligatures work, you would see that in the case of the wt/tw problem it is actually possible to code *more* information, about the spatial organization *AND* about the reading order of the signs, by using ligatures than by using control characters.. but I assume you are not really interested in that, right?)

Sorry Marwan, but we live on different planets: your observation was general (along the lines of "no linguistic meaning for the organization of the hieroglyphs"), and I provided general examples showing that this is not the case. That this is handled in one way or another and finds practical solutions in Unicode has nothing to do with the general principles.

I stop here with pointless argumentation.

Cheers,
St.
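
[Editorial illustration] The tw/wt point debated in the messages above comes down to three distinct encodings: the sequence t + w, the sequence w + t, and the single precomposed character cited in the thread as u13172. A minimal Python sketch of that distinction follows. Only U+13172 comes from the thread; the code points used here for the single signs t (U+133CF) and w (U+13171) are assumptions of this sketch and should be checked against the Unicode code charts. The helper name describe is hypothetical.

```python
import unicodedata

# Only U+13172 is cited in the thread ("u13172").
# The code points for the single signs below are assumptions of this sketch.
T = chr(0x133CF)            # assumed: t, bread loaf (Gardiner X1)
W = chr(0x13171)            # assumed: w, quail chick (Gardiner G43)
TW_COMBINED = chr(0x13172)  # cited in the thread as the precomposed t&w sign


def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# The three encodings described in the thread.
encodings = {
    "tw (reading order t, w)": T + W,
    "wt (reading order w, t)": W + T,
    "t&w (reading order left open)": TW_COMBINED,
}

for label, text in encodings.items():
    print(label, describe(text))
```

The two sequences differ only in character order, so plain text already records the reading order there; the single precomposed character records the group without committing to either order. Whether that covers the cases Stéphane has in mind is exactly what the thread disputes, and the sketch takes no position on it.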