[Egyptian] Some general considerations

Sat Jul 23 18:12:18 BST 2016

Hello everyone

> *Third*
>
> And your solution is a good one — I have absolutely no doubt — as long as
> you do not want to *search* for the relative position of signs with respect
> to one another. This is however a piece of information that is, like it or
> not, part of the ‘orthographic’ system of ancient Egyptian and to which we
> want to have access (see further Simon’s mail earlier this week).
>

In general, the spatial distribution of signs (i.e. the "grouping") has *no
meaning at all* in Egyptian. No semantic, phonological, or morphological
information is coded in the relative position of the signs.
This is demonstrated by the texts themselves: if the relative position of
the signs were linguistically important, you would have some form of
regularity, with some combinations being possible and others being
forbidden. This is not the case.

Combining three signs into a group or writing them one after the other is
linguistically exactly the same.
It is just an aesthetic matter, a matter of layout.
Not a linguistic one.

No more than illuminated initial capital letters in medieval manuscripts.
Like these:

http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg

They are nice, they are fancy, they are there (hundreds of thousands of
them), but they do not carry any additional linguistic information
whatsoever. Thus there is no specific need to encode or represent them in
Unicode (and in fact they are not encoded).

As far as I know (and please correct me if I am wrong), the only utility
that recording the position of signs could have on a practical level would
be to fill lacunas in text or to suggest alternative readings in hieratic
scripts.
But in fact, in order to find what sign could be missing in a given lacuna,
or what sign could be hidden behind a hieratic ligature, you need
dictionaries and corpora of texts where you can search for sequences of
signs *independently* from their spatial position (what sign x is attested
after sign y, whether combined in a similar group or not?). You don't need
to code anything in Unicode. It can be a plus, but it is not indispensable.
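
To make this concrete, here is a minimal sketch in Python of such a
position-independent search. It assumes the texts are stored in an MdC-like
notation where "-" separates groups, ":" stacks signs vertically, "*" puts
them side by side and parentheses nest sub-groups. The function names and
the toy corpus are invented for illustration, and notations that change the
input order of the signs are not covered.

```python
import re
from collections import Counter

# MdC-style layout operators (assumed subset): "-", ":", "*", "(" and ")".
LAYOUT_OPERATORS = re.compile(r"[-:*()]+")


def flatten(encoded: str) -> list[str]:
    """Return the signs in input order, discarding all layout operators."""
    return [sign for sign in LAYOUT_OPERATORS.split(encoded) if sign]


def followers(corpus: list[str], sign: str) -> Counter:
    """Count which signs are attested immediately after `sign`."""
    counts = Counter()
    for text in corpus:
        signs = flatten(text)
        for current, following in zip(signs, signs[1:]):
            if current == sign:
                counts[following] += 1
    return counts


# Hypothetical toy corpus: the same three signs, grouped in different ways.
corpus = ["H*(n:t)", "H-n-t", "H:n-t"]
print(followers(corpus, "n"))
```

All three spellings count as the same attestation of "t" after "n", because
the grouping is thrown away before the search.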

> *1) JSesh approach*
>
>
> Sure, everything needs not be dealt with at the level of Unicode, but the
> data needed should not be hidden in the ligatures embedded in the font
> either.
>

Again, what data?
What is the *linguistic* (not graphic, not philological) information
carried by groups that is so important to code?

Perhaps this question has already been asked, but honestly so far I haven't
heard or read any convincing answer, and I haven't seen any common example
(I cannot exclude that there could be some uncommon case, as I obviously
have not seen all the Egyptian texts existing in the world) where the
position of a sign with respect to the other signs around it carries any
linguistic information.

> *2) Groups in fonts*
>
> OK, sure. But then again control characters have the advantage of being
> explicit about the relative position of signs when a group is not in a
> font: how would you proceed for storing such an information, as a lay user,
> when using ligatures? (this is a real question, nothing ironic here).
>

I am not sure I understand your question: the information about the spatial
distribution is already coded in the ligature. The ligature IS the
information.
And if the ligature does not exist, it takes (literally) 30 seconds to
create it within a font. And with a common database and a common font of
reference, it would be extremely easy to create a common shared table of
ligatures that would allow everyone to see exactly the same groups, because
everyone would be using the same font or at least the same set of
ligatures. Alternatively, if I write my text with font x and ligature x, it
would be enough to embed the font itself in the text file (there are
various ways to do so).

> *3) the 4 “small sign in the corner of big sign” control characters.*
>
> I’m glad to see that your solution is exactly the same as the one we
> suggest! Indeed, there are logically 2 ‘variables’ (Top vs Bottom and
> Left-Right), if the sequence of signs indicates the Left-Right
> unambiguously (like in your example), then we only need to encode
> explicitly the Top vs Bottom, of course. This is basically how we
> represented things with Michael at the pub.
> Please keep in mind that the four operators come from another type of
> syntax.
>

This is not what I am suggesting. I am suggesting 4 control characters in
total: top-bottom, left-right, A\B and B/A, instead of top-bottom,
left-right + four corners.
Or perhaps I am not understanding your system.

> *4) Vertical and horizontal script and control characters*
>
>
> I do not understand why you do entirely get rid of the notion of ‘quadrat’
> in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/, the text
> would be displayed correctly whatever the orientation  (hltr, hrtl, vrtl,
> vltr), no? Of course, one needs the notion of quadrat in the encoding for
> this; and this is a nice case for showing that we *need* this well-defined
> notion in the encoding, not just sequences of glyphs.
>

If you want to have the "notion of quadrate" in Unicode, then you have to
introduce at least one more control character that you will have to use in
front of *every single quadrate* in your text. If I understood correctly,
one of the proposals that were circulating was indeed suggesting something
like that. This means that to display any text you will probably end up
using more control characters than hieroglyphic signs. This can be a sound
way of thinking from an MdC (as you don't like JSesh ;-) ) perspective, but
I doubt it makes sense in Unicode (and, more importantly, I doubt the people
of the Unicode Consortium will think it makes sense).

Unless you want to code every single possible "quadrate" as an independent
glyph: each one would thus end up being conceptually comparable to a Chinese
character.

But I assume (I hope) we don't want to do that, right?

> *5) special characters, vertical/horizontal texts and input methods*
>
>
> Your point escapes me, here. Unless it is a result of getting rid of the
> quadrats: within quadrats, the groups of hieroglyphs are essentially the
> same.
>

With quadrats marked by special control characters it would be even worse,
as you would have to take into consideration even more possible
combinations.

> *6) Ramesside “groups” (or “tall groups”).*
>
>
> Why are they groups or quadrats and not ’small columns’?
>
> The answer is simple and straightforward (and provided by your example):
> because they correspond to the size occupied by the A1 sign. Look at the
> example.
>

> but some signs do occupy the full height of the horizontal line (unlike in
> your Japanese example, which is nice, but as such irrelevant), which is the
> basis for deciding what counts as a unit (quadrat) or not. (Or do you have
> another definition in mind?).
>

Not true.
This can be interpreted just as a question of layout, not as a question of
grouping.
Simply, there could be signs that can be graphically stretched/enlarged to
fill a given space alone, while others will be squeezed to fit together
within an equivalent space.

This does not say anything at all about them being "groups" or not. It can
be interpreted just as a question of layout.

And actually this is very common in various writing systems around the
world, and no one would consider the fact that some isolated glyphs appear
as big as some "combined ones" as a reason to consider the "combined ones"
as "groups".

Just a few examples from the internet:

Have a look at this Chinese (bopomofo + characters) text:

https://upload.wikimedia.org/wikipedia/commons/1/1b/Bopomofo_in_Regular,_Handwritten_Regular_%26_Cursive_formats.jpg

According to your way of interpreting groups ("some signs do occupy the
full height of the horizontal line, which is the basis for deciding what
counts as a unit (quadrat) or not."), the four little characters within the
parentheses on the right of the image should be interpreted as a single
"group" (or as a combined "character", as we are in China) just because
they, combined, are as big as some of the other single characters.
This is not the case. No one in Asia would consider such a combination a
"group" or a "combined character". They are just four independent
characters (or "four groups", assuming character = group, which conceptually
is a fairly sound equivalence) which happen to fit into the space in a
slightly different way compared to the other characters of the text.
Their appearance is just a question of *scale*, thus of layout, not of
grouping, and it does not say anything about what a "group" should be.

Here is an even better example, a Chinese text with bopomofo and hanzi
characters:

http://chinesehacks.com/app/uploads/2010/06/zhuyin.jpg

No one would consider those small signs on the right of the big characters
as "groups" just because when combined together they end up being as big as
the single bigger characters.
And as you can see, conceptually, you can interpret this text as a sequence
of short vertical columns organized in horizontal lines. Some of these
short vertical columns are occupied by a single "scaled up" character (like
the A1 in the Egyptian text above) while other short vertical columns are
occupied by multiple "scaled down" signs one above the other.
It is the same with Ramesside writing.

No one, however, would consider those smaller "scaled down" signs written
one above the other as "groups" just because together they are as big as
the big ones.

The same goes for Latin script, actually: take again, for instance, the
manuscript page above:

http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4Y2.jpg

According to your definition, one should take the tallest sign (i.e. the
initial "B") as the unit to define what a "group" is. As a consequence, one
should consider the 6 lines of text beside it not as "lines of text", but
as a single "group", and according to your approach those lines should not
be represented with a specific layout, but should be somehow combined with
control characters.

I hope we agree that this would make no sense.

So if in Chinese (and Japanese, and Korean) and in Latin script you can
have glyphs of different sizes in the same text, without the need to
conceptually cluster the "smallest ones" into "groups" that will appear as
big as the biggest signs, why do you feel this need with Egyptian?

And note that this is different from ordinary Egyptian square writing, as
in ordinary square writing one can assume that all the signs (both those
combined into square groups and those outside square groups) are more or
less at the same *scale* (as a general principle; again, I know there are
exceptions). In Ramesside writing, instead, the *scale* of the signs is not
always the same. Some are "scaled down" to fit together within one of those
"small vertical columns", while others (like the A1) are just "scaled up" to
fill alone the same space as one of those "small vertical columns".

Exactly like in the Chinese examples here above, or exactly like with
illuminated capital letters in manuscripts.

It is (or can be interpreted as) a question of layout.

> But if you want to be able to use it minimally for (1) standardized
> electronic corpus and (2) journals like LingAeg, etc. dealing with the
> Egyptian language, this would be a requirement.
>

All the Egyptian words quoted in my article published in the last issue of
JNES were written with my font with just standard ligatures, without
control characters. No one complained, no one told me anything, so I assume
that such a system is indeed suitable for scientific publications, not only
for writing tourists' names... ;-)

Anecdote aside, you can do everything scientific you need with Unicode
without ending up using tens of control characters.
And if you can't, then perhaps you should consider that Unicode is not the
best tool for your job...

It is actually funny: I have tried to understand all the various proposals
including control characters, and as a result I have pointed out *several*
practical problems in the use of control characters that can seriously
affect the *linguistic* information of the text (like control characters
that require changing the actual input order of the hieroglyphs in order to
display them correctly). Some are still without an answer, or have an answer
so complex that it will hardly be feasible (see the question of turning
vertical text into horizontal text, which will require the introduction of
at least one more control character).

On the other hand, almost no one so far has pointed out any real problem in
the use of ligatures (possibly combined with a very restricted number, 1 or
2, of control characters as suggested by Bob) for which there isn't already
a possible solution implemented in some script around the world (e.g.
unwanted ligatures can be broken with a zero-width character as in Indic
scripts, and different ligatures using the same signs could be selected
with "variant characters" as in emoji, etc.).
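
As a rough illustration of these two mechanisms, here is a small Python
sketch. U+200C ZERO WIDTH NON-JOINER and U+FE00 VARIATION SELECTOR-1 are
real Unicode characters, and the three signs are simply the first code
points of the Egyptian Hieroglyphs block. Whether a given font actually
breaks a ligature at the ZWNJ, or picks another ligature after the variation
selector, is an assumption about the font and the renderer, not something
the code can show. The helper name is invented.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER: asks the renderer not to ligate
VS1 = "\ufe00"   # VARIATION SELECTOR-1: requests a variant form (assumed use)

# Three arbitrary signs from the Egyptian Hieroglyphs block, for illustration.
signs = ["\U00013000", "\U00013001", "\U00013002"]

plain = "".join(signs)                            # font may ligate freely
unligated = signs[0] + ZWNJ + signs[1] + signs[2]  # no ligature across ZWNJ
variant = signs[0] + signs[1] + VS1 + signs[2]     # hypothetical variant pick


def sign_sequence(text: str) -> list[str]:
    """Drop the two invisible helpers and keep only the signs, in order."""
    return [ch for ch in text if ch not in (ZWNJ, VS1)]


# The layout hints differ, but the underlying sequence of signs is identical.
assert sign_sequence(plain) == sign_sequence(unligated) == sign_sequence(variant)
print(len(plain), len(unligated), len(variant))
```

In each case one invisible character is added where needed, and the
sequence of signs that a search would look at stays the same.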

And still, a system that does not use multiple control characters will be
good only for writing tourists' names..

Please, no offense, but again: the story of the elephant..

> Trying to describe (and encode) them as independent units (or devising
> control characters to build them) is more or less like trying to encode
> every singly column of hieroglyphic text as single independent “groups”.
>
> Are you sure? Come on...
>

It is you who are talking about encoding (as glyphs in a font, not
necessarily as Unicode characters) tens of thousands of possible and often
unique combinations, not me...

Ramesside "tall" groups can be interpreted as short vertical texts. Thus
thinking of encoding them (or building them with control characters) is
like thinking of encoding whole columns of text as groups.

> The wording is transparent: easy, but unfortunately inaccurate and
> irrelevant for scholarly uses (not palaeography, of course: palaeography
> has to do with the actual appearance and style of individual signs).
>

Again, why?

> *7) What are the “square groups”?*
>
>
> I provided the definition agreed on — I think — by everyone, if you have a
> better one, I’m listening of course.
>

See above.
And see my "Second" introductory point in the previous email: it is not a
question of finding the "right" definition (as there isn't any). It is a
question of finding the *most efficient* definition in order to efficiently
reach our goal of having a working system of Unicode-based hieroglyphs that
can be used by the *whole* Egyptological community (and not only by a
handful of people working on a specific corpus or database; this also
responds to your last remark...).

> I’m sorry to put it so bluntly, but if Unicode is not to be useful for the
> majority of egyptologists, so be it. It will remain what it is, a standard
> not used by the community: Journals, Corpora, etc. will keep on using JSESH
> and other tools, they’re doing well with it.
>

Perhaps the opposite should also be considered? Perhaps the Unicode-based
system to write hieroglyphs (considering that the very idea at the basis of
Unicode is to make writing standardized and accessible to as many people as
possible) should indeed aim at being useful for the majority of
Egyptologists (corpus linguists as well as all the other thousands of
Egyptologists who are not corpus linguists), and if some team working with
some specific database should have some specific need that cannot be
satisfied with such a standard Unicode system, then perhaps it is they who
should use other tools?

I guess this is an option that should be considered.. no?

Best

Marwan