[Egyptian] another approach, and possibly a compromise?

Bob Richmond bobqq at live.co.uk
Sat Jul 30 17:55:26 BST 2016


Hi Marwan

 

The attested groups database as a Unicode resource is part of the proposal Andrew and I submitted, and hence part of the current UTC recommendation. I’d hoped to have a draft done this summer, but obviously with the current disruption I had to put that to one side.

 

No time atm for general discussion on controls and theory, sorry.

 

Bob

 

From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Marwan Kilani
Sent: 30 July 2016 14:22
To: Egyptian Hieroglyphs in the UCS <egyptian at evertype.com>
Subject: [Egyptian] another approach, and possibly a compromise?

 

May I dare to suggest another approach to the problem, suggesting a compromise solution that could be at the same time easy and efficient and that could satisfy more or less everyone?

Let’s start from the assumption that at some point we will need a common archive or database listing all the attested groups.

This is something that *will have to be done* at some point, whether you want to use 0 or 100 control characters, because groups will need to be inserted as precomposed glyphs into the fonts to be displayed correctly, and the people who will design the fonts will need a list of the groups they have to draw.

This is a must: such a reference list has to be created at some point, whatever solution for grouping is decided.

And I think it should not be too difficult to set up something like this. One could think of an online reference database, where a unified and generally recognized reference list with all the attested groups would be freely available.

The list will give information about the individual signs composing each group, and about their spatial organization within the group itself.

This can be done by just writing for each group the list of signs composing it with the relevant MdC operators.

Groups, or sequences of text, could also be searched on the basis of their spatial organization.
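
As a rough illustration of what one record in such a reference list could look like, here is a minimal Python sketch. It assumes each group is stored as a single MdC-style string in which ":" stacks signs vertically and "*" places them side by side. The class name AttestedGroup, the helper functions, and the sample entries other than owl over arm are hypothetical placeholders for illustration, not claims about what is attested.

```python
import re
from dataclasses import dataclass

# Gardiner-style sign codes such as G17, D36, Aa1 or N35A.
SIGN = re.compile(r"[A-Z][a-z]?\d+[A-Za-z]?")


@dataclass(frozen=True)
class AttestedGroup:
    """One record of the reference list (hypothetical structure)."""
    mdc: str  # signs plus MdC operators, e.g. "G17:D36"

    @property
    def signs(self):
        # The signs in reading order, with the operators stripped.
        return tuple(SIGN.findall(self.mdc))

    @property
    def layout(self):
        # The spatial organization alone: every sign replaced by "_".
        return SIGN.sub("_", self.mdc)


# Placeholder entries for illustration only.
GROUPS = [
    AttestedGroup("G17:D36"),    # owl above arm
    AttestedGroup("D21:X1*Z1"),
    AttestedGroup("N35:X1"),
]


def find_by_signs(groups, signs):
    """Return the groups built from exactly this sequence of signs."""
    wanted = tuple(signs)
    return [g for g in groups if g.signs == wanted]


def find_by_layout(groups, layout):
    """Return the groups sharing a spatial organization, e.g. "_:_"."""
    return [g for g in groups if g.layout == layout]
```

With records of this kind, a query such as find_by_layout(GROUPS, "_:_") would list every group made of one sign stacked above another, whatever the signs are.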

 

Now, as for the Unicode:

Yes about control characters, but:

What about deciding that:

a) sequences of signs that can be combined into only one attested group (like many Ramesside groups) will be rendered *only* with plain ligatures, without control characters.

The advantages would be:

-       1) complex groups could be rendered without the need for recursion or complex nested control characters, because it is logical to assume that the more complex a group is, the less likely it is that its component signs will appear in more than one spatial organization. So they can be handled with plain ligatures without the risk of ambiguity.

-       2) Because of the point above, this will probably solve the problem of nesting control characters, or at least it will greatly reduce it. I doubt that there are many sequences of signs with nesting features that can be combined in multiple different groups.

-       3) If for some reason the group should not be rendered correctly by the browser/text editor, it would not be a problem because the signs composing the group will be displayed ungrouped, as plain text, without broken control characters among them. The text would still be readable and usable

-       4) The syntax of the signs will not be affected by control characters, because control characters will not be involved. The ligatures will be triggered just by inputting the signs in their correct reading order.

 

Note that there would not be any loss of information, because these groups would be unique spatial compositions, i.e. the spatial organization of the signs within the group will be implicit in the ligature itself (which could be named with the string of signs + MdC operators describing the group), and will be available in the database mentioned above. As long as the database will be used as a reference for the creation of fonts, the ligature needed for the group will be present and therefore the group itself will be displayed correctly.

 

b) sequences of signs that can be combined into more than one different group will be rendered differently. In particular, one, the most common group (or the one with the most complex organization?) will be rendered with a plain ligature. Any other group will be rendered with control characters (whatever control characters you prefer).

 

So for instance the sequence: owl G17 + arm D36

 

The basic group, rendered with the plain ligature could be the owl above the arm, that will be automatically created by inputting owl + arm.
Then there could be a secondary group, the arm across the owl (I know this is already encoded as a distinct character, just for the sake of the example), that will be created with a control character, so e.g. by inputting owl + control character + arm.
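
The owl + arm example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token "<CTRL>" stands in for whatever control character is eventually chosen, the group names are invented labels, and the greedy longest-match lookup is one possible way a renderer could behave, not a description of how any font engine actually works.

```python
CTRL = "<CTRL>"  # hypothetical placeholder for a control character

# Hypothetical ligature table: input sequence -> rendered group.
LIGATURES = {
    ("G17", "D36"): "owl-above-arm",         # basic group, plain ligature
    ("G17", CTRL, "D36"): "arm-across-owl",  # secondary group, needs CTRL
}


def render(tokens, table=LIGATURES):
    """Greedy longest-match substitution of sign sequences by groups.

    Anything that matches no table entry is emitted unchanged, so signs
    without a known ligature simply appear one after the other.
    """
    max_len = max((len(key) for key in table), default=0)
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            key = tuple(tokens[i:i + n])
            if key in table:
                out.append(table[key])
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return out
```

Inputting owl + arm yields the basic group, inserting the control token between them yields the secondary group, and a renderer with an empty table falls back to the ungrouped signs.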

 

The advantages would be:

-       5) even if a text editor/browser will not be able to correctly render control characters, there will always be one basic group with a plain ligature that can be rendered (whether as a group or as a sequence of signs) without broken control characters popping out here and there

-       6) you can have as many variants as you want, and therefore the precision of the spatial distribution will be preserved, because the less common spatial organizations (i.e. the less common groups) will be distinguished from the basic one through the use of control characters. If you don’t need spatial precision, you can just use the plain basic ligature; if you want to be precise, you can select the specific organization you need.

-       7) short sequences of signs are more likely to have more than one possible spatial organization, i.e. short sequences of signs are more likely to be composed in more than one different group. Still, groups built from short sequences of signs are much less likely to require features like nested control characters, recursion, etc., so it will probably be possible to deal with them with just very simple sequences of control characters.

-       8) if a new group should be discovered for a sequence of signs that until now was combined into only one group, it could just be added to the list and rendered with a control character

 

Because of point 7) above, and because complex groups will likely be unique and therefore handled with plain ligatures (point 1) above), it is quite likely that this system could reduce (if not solve) the problem of the multiplication and nesting of too many control characters.

Now it seems to me such a system would satisfy everyone’s needs and deal with a lot of problems:

-       searchability will be guaranteed because groups will be handled mainly with plain ligatures, and therefore the phonetic order can be respected when inputting the individual signs

-       spatial information will be preserved, because it will be encoded either in the ligature itself or in the variants using control characters. And it will in any case be present in the basic database (that *has to be created* anyway).

-       The system will allow control characters to be used in an efficient way that won’t require tens of them, both because unique groups will be rendered with plain ligatures, and because those groups that will most likely need control characters will be relatively simple, based on short sequences of signs. This, I think, is more in line with the basic principles of Unicode.

-       Note that the system is different from Bob’s original simplified Egyptian proposal, as it suggests using plain ligatures for all the sequences of signs that can be combined in only one group, and for one of the possible groups for those sequences that can be combined in more than one group.

At the same time the system is relatively similar to how the emoji variants work, and to what we were discussing with some of you during the workshop.

-   The use of plain ligatures for the most common/basic forms of the groups would solve the Dd problem pointed out by Michael: no need to choose between up/down or corner control characters, because the plain ligature can be used

-       It would be very easy to create input methods for such a system
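
To make the searchability point concrete, here is a small Python sketch. It assumes a text is held as a list of sign codes and that "<CTRL>" is a hypothetical placeholder for any control character. Both names and the function find_sequence are illustrative only.

```python
CTRL = "<CTRL>"  # hypothetical placeholder for a control character


def strip_controls(tokens):
    """Drop control tokens, keeping only the signs in reading order."""
    return [t for t in tokens if t != CTRL]


def find_sequence(text, query):
    """Return the positions (in the control-free text) where query occurs."""
    plain = strip_controls(text)
    wanted = strip_controls(query)
    n = len(wanted)
    if n == 0:
        return []
    return [i for i in range(len(plain) - n + 1) if plain[i:i + n] == wanted]
```

Because the signs stay in reading order in both the basic and the variant spelling, a search for owl + arm finds the sequence whether or not a control token sits between the two signs.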

 

Now, we are here to reach an agreement on a system that can work, and it seems to me that such a system would solve a lot of the issues that are being discussed.
So what do you think about it?

 

Marwan
