[Egyptian] Some general considerations

Sun Jul 24 17:38:59 BST 2016

Replying to Nigel

The distinction you draw between transcription of hieratic into hieroglyphic
and transcription from (original) hieroglyphic to (digital format)
hieroglyphic is an important one. This is one snag with the MdC tradition,
which encourages 'one size fits all' thinking about arrangements of
hieroglyphs and groups. Likewise vertical and horizontal writing have their
own considerations.

Fonts. As you note, a font such as Cleo defines the glyphs only. Nevertheless,
in designing her font Cleo gave attention to the relative proportions of signs
and their use in combinations, using the tools at her disposal at the time.
Her use case was based on the Gardiner Egyptian Grammar model, which attempts
to render classic Middle Egyptian style well at 18pt text and acceptably at
12pt. Contrast this with the Hieroglyphica font, which provides more detail
but is optimised for larger point sizes. Traditional MdC applications such as
JSesh come with font+application and the two are intended to be used in
concert. The Egyptologist has little freedom unless both font and application
meet their needs. Portability from one app+font to another app+font is not
ideal.

Fonts with shaping in Unicode. Glyphs as before. But the font designer now
has the opportunity to have more control over how their font looks and to
deal better with proportion and aesthetics. Whether this is a burden will
depend on the tools available. This is one personal interest of mine, and font
practicalities are part of my thinking behind control characters and their
straightforward implementation. An application such as JSesh can choose to
ignore some or all of the features built into the font and do its own thing.
It can also add functionality on top of basic text. There is no loss, only
the potential for gain.

My personal prototype tools work with MdC (including JSesh extensions) and
Unicode plain text, and in my experience over the last 18 months it all works
pretty well. It is desirable that applications such as JSesh add support
for Unicode. JSesh 5.5 does not have 'Gardiner codes' for all the Unicode
(2009) hieroglyphs, so there is housekeeping needed, but nothing Serge or I
regard as problematic.
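
To make the MdC import/export question a little more concrete, here is a
minimal sketch in Python. It is not my prototype software and not JSesh code.
It assumes only a small subset of MdC ('-' between groups, ':' for stacking,
'*' for side-by-side, and parentheses) and the function names parse and to_mdc
are invented for illustration. The point is that the group structure can be
read into a neutral tree and written back out, so whatever plain text controls
are eventually settled on could be generated from the same tree.

```python
import re

# Assumption: only this MdC subset is handled: sign codes, '-', ':', '*', '(' and ')'.
TOKEN = re.compile(r"[A-Za-z0-9]+|[-:*()]")
OPERATORS = set("-:*()")


def parse(mdc):
    """Parse MdC text into a list of groups (hypothetical helper).

    A group is a sign code (str) or a tuple ("v", ...) for vertical
    stacking or ("h", ...) for horizontal juxtaposition.
    """
    tokens = TOKEN.findall(mdc)
    if "".join(tokens) != mdc:
        raise ValueError("unsupported character in MdC input")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"unexpected token {tok!r}")
        pos += 1
        return tok

    def atom():
        if peek() == "(":
            take("(")
            node = vertical()
            take(")")
            return node
        tok = take()
        if tok in OPERATORS:
            raise ValueError(f"expected a sign code, got {tok!r}")
        return tok

    def horizontal():  # '*' binds more tightly than ':'
        parts = [atom()]
        while peek() == "*":
            take("*")
            parts.append(atom())
        return parts[0] if len(parts) == 1 else ("h", *parts)

    def vertical():
        parts = [horizontal()]
        while peek() == ":":
            take(":")
            parts.append(horizontal())
        return parts[0] if len(parts) == 1 else ("v", *parts)

    groups = [vertical()]
    while peek() == "-":
        take("-")
        groups.append(vertical())
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return groups


def to_mdc(groups):
    """Serialise the tree produced by parse() back to MdC text."""
    def render(node, parent=None):
        if isinstance(node, str):
            return node
        op, *children = node
        text = ("*" if op == "h" else ":").join(render(c, op) for c in children)
        # A horizontal run inside a stack needs no parentheses.
        bare = parent is None or (parent == "v" and op == "h")
        return text if bare else f"({text})"

    return "-".join(render(g) for g in groups)


tree = parse("H*(n:t)-tA:tA")
print(tree)          # [('h', 'H', ('v', 'n', 't')), ('v', 'tA', 'tA')]
print(to_mdc(tree))  # H*(n:t)-tA:tA
```

A real converter would also need the rest of MdC (ligatures with '&',
cartouches, shading, JSesh extensions and so on), which this sketch
deliberately leaves out.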

Incidentally, my own software is agnostic about what is settled on for
initial Unicode plain text controls (I could even add in support for RES if
required!) but is on hold until the standards situation is clarified.

Big topic.

Bob

-----Original Message-----
From: Egyptian [mailto:egyptian-bounces at evertype.com] On Behalf Of Nigel
Strudwick
Sent: 24 July 2016 12:56
To: Egyptian Hieroglyphs in the UCS <egyptian at evertype.com>
Subject: Re: [Egyptian] Some general considerations

Greetings

Just wanted to say that Stéphane’s last paragraph admirably summarises my
perspective.

I haven’t been reading these posts closely since I don’t have the theoretical
background on Unicode to have a view. But some observations and a question.

But I would say that of the categories Stéphane mentions in his last
paragraph, it is probably also worth making a distinction in the text corpora
section between hieroglyphic and hieratic texts, simply due to the generally
more regular arrangement of the latter, and hieroglyphic’s notorious ability
to be squeezed, expanded etc. to fit a space for a variety of reasons. So
while those doing corpora won’t be as fussed about precise arrangements as I
can be, there are going to be times when there will be challenges with
hieroglyphs.

Something that did come as a shock during the meeting, dawning on me in the
course of a slightly tense exchange with Michael, was that Unicode
fonts clearly can put a lot of burden on the font designer, especially for
glyphs, if they are to include multiple and complex ligatures etc. This is
where the current fonts score well (such as the Cleo Fonts) in that they are
about the design only, and the arrangement is then carried out in the
specialist software (e.g. JSesh) where the ultimate control lies with the
Egyptologist and not the font designer. [I know that won’t be well received
but you know where I come from on all this!]

One question that I should have asked at the meeting is this: not knowing
how control characters actually manifest themselves, will a text formatted
in Unicode with all the control characters be able to be exported into a
plain MdC format, or will one of these input systems be able to import an
MdC text and format it correctly? I ask as there will be a lot of MdC plain
text around on people’s hard discs. Or will we ultimately have to revise the
MdC system to handle all these other codes?

Apologies in advance for the evident failures of comprehension in some of
the points above. I don’t really need comments on the first points I made,
but I would be interested in how the MdC issues might be handled.

Best, Nigel

On 24 Jul 2016, at 11:33, Stéphane polis <s.polis at ulg.ac.be> wrote:

> Hi guys!
> 
> (no worries, this is my last mail on the topic, enough time and energy 
> spent on this.)
>>> Third
>>> 
>> And your solution is a good one — I have absolutely no doubt — as long as
you do not want to *search* for the relative position of signs with respect
to one another. This is however a piece of information that is, like it or
not, part of the ‘orthographic’ system of ancient Egyptian and to which we
want to have access (see further Simon’s mail earlier this week).
>> 
>> In general, the spatial distribution of signs (i.e. the "grouping") has
*no meaning at all* in Egyptian. No semantic, phonological, or morphological
information is coded in the relative position of the signs.
> 
> This is patently inaccurate. 
> Three simple examples should suffice here (sorry for providing textbook
examples known by all).
> 
> 1 - spatial distribution affecting phono-morphology [t] followed by 
> [w] can only be read /tw/, while t&w can be /tw/ or /wt/ => same ‘linear
order’, 1 reading in one case, 2 readings in the other (referring
potentially to 2+ morphemes).
> 
> 2 - spatial distribution as a condition for reading (and adding 
> semantic value) <PastedGraphic-1.pdf>  makes no sense, when combined 
> as <PastedGraphic-2.pdf>  it is clear that it should be read /ptH/ 
> ‘Ptah’ (> p(t) + t(A) + H(H)), referring to the god in his demiurgic
dimension (separating the sky from the earth). The position of the sign is
both a condition for reading and an added semantic information (not only
Ptah, but Ptah as demiurge).
> 
> 3 - spatial distribution affecting the function of a sign Hieratic 
> example? If the rowing man is followed by a n ( <PastedGraphic-3.pdf>
> ) the n has to be read /n/ (phonemogram), if the n is positioned under 
> the rowing man ( <PastedGraphic-5.pdf> ), we are dealing with a 
> compound classifier made of two signs, the /n/ has an iconic value and the
whole group refers to a man rowing in water.
> 
> Etc., etc.
> 
> So position is ‘just an esthetic, layout matter, not a linguistic one’ as
you say? This kind of assertion reflects badly on our understanding of the
hieroglyphic system. It’s a pity to make such unwarranted statements in a
discussion also aimed at non-specialists to whom we try to explain things as
straightforwardly as possible in order to come up with a solution that is
satisfying for everyone.
> 
> Accordingly, if Unicode aims first and foremost at rendering the
‘linguistic’ dimension of writing, the examples above should suffice to show
the importance of the ‘quadrat’ organization of this script. Again, you
might not like it from a computer/font oriented perspective; I agree it’s
not convenient to encode, it might not even be possible to encode it at all
in Unicode because the standard does not have the capabilities needed (and
only higher level protocols could then handle this). All of this is fine with
me, but it would be great not to distort the presentation of the data.
> 
> [I leave here alone other semiotic dimensions of writing: the spatial 
> arrangement is part of the ‘orthography’ of the scribes, not 
> necessarily meaningful at the ‘linguistic’ level strictly speaking, 
> but at the level of scribal practices, etc.: why don’t we use IPA for 
> our modern languages? simply because writing is much more than 
> ‘linguistic’ stricto sensu and that it makes sense to know who writes 
> ‘next’ and who plays with the script and writes ‘neckst’. As simple as 
> that.]
>  
> 
>> No more than illuminated initial capital letters in medieval manuscripts.

>> Like these:
>> 
>> http://www.sothebys.com/content/dam/stb/lots/L12/L12240/256L12240_6G4
>> Y2.jpg
>> 
>> They are nice, they are fancy, they are there (hundreds of thousands of
them), but they do not carry any additional linguistic information
whatsoever. Thus, there is no specific need to encode or represent them in
Unicode (and in fact they are not encoded).
> 
> This comparison is hilarious.
> 
>> As far as I know (and please correct me if I am wrong), the only utility
that recording the position of signs could have on a practical level would
be to fill lacunas in text or to suggest alternative readings in hieratic
scripts.
> 
> Nope, see above. This is one side-interest of the control characters that
I mentioned in the discussions, indeed, but definitely not the only one,
since the arrangement is meaningful at multiple levels.
>>> 1) JSesh approach
>>> 
>>> 
>>> 
>> Sure, everything needs not be dealt with at the level of Unicode, but the
data needed should not be hidden in the ligatures embedded in the font
either.
>> 
>> Again, what data?
>> What is the *linguistic* (not graphic, not philological) information
carried by groups that is so important to code?
> 
> See above.
>>> 2) Groups in fonts
>>> 
>>> 
>> OK, sure. But then again control characters have the advantage of being
explicit about the relative position of signs when a group is not in a font:
how would you proceed for storing such an information, as a lay user, when
using ligatures? (this is a real question, nothing ironic here).
>> 
>> I am not sure I understand your question: the information about the
spatial distribution is already coded in the ligature. the ligature IS the
information.
>> And if the ligature does not exist, it takes (literally) 30 seconds to
create it within a font. And with a common database and a common font of
reference, it would be extremely easy to create a common shared table of
ligatures that will allow everyone to see exactly the same groups (or
alternatively, if I write my text with the font x with the ligature x, it
would be enough to embed the font itself in the text file; there are various
ways to do so), because everyone would be using the same font or at least
the same set of ligatures.
> 
> All this was clear, sure. What I meant is that the ligatures are purely
‘graphic’, right? No information is stored about the position of one sign
with respect to another?
>   
>>> 3) the 4 “small sign in the corner of big sign” control characters.
>>> 
>> I’m glad to see that your solution is exactly the same as the one we
suggest! Indeed, there are logically 2 ‘variables’ (Top vs Bottom and
Left-Right), if the sequence of signs indicates the Left-Right unambiguously
(like in your example), then we only need to encode explicitly the Top vs
Bottom, of course. This is basically how we represented things with Michael
at the pub.
>> Please keep in mind that the four operators come from another type of
syntax.
>> 
>> This is not what I am suggesting. I am suggesting 4 control characters:
top-bottom, left-right, A\B, B/A. instead of top-bottom, left-right + four
corners.
>> Or perhaps I am not understanding your system.
> 
> OK, then I disagree because of the polysemic value of A/B and B/A.
>  
>>> 4) Vertical and horizontal script and control characters
>>> 
>>> 
>> 
>> I do not understand why you entirely get rid of the notion of
‘quadrat’ in your example. If you encode your text as /hthr-H*(n:t)-tA:tA/,
the text would be displayed correctly whatever the orientation (hltr, hrtl,
vrtl, vltr), no? Of course, one needs the notion of quadrat in the encoding
for this; and this is a nice case for showing that we *need* this
well-defined notion in the encoding, not just sequences of glyphs.
>> 
>> If you want to have the "notion of quadrate" in Unicode, then you have to
introduce at least one more control character that you will have to use in
front of *every single quadrate* in your text. If I understood correctly,
one of the proposals that were circulating was indeed suggesting something
like that. Which means that to display any text you will probably end up
using more control characters than hieroglyphic signs themselves. This can
be a sound way of thinking from an MdC (as you don't like JSesh ;-) )
perspective, but I doubt it makes sense in Unicode (and more importantly I
doubt the people of the Unicode Consortium will think it makes sense).
>> 
>> Unless you want to code every single possible "quadrate" as an
independent glyph: that would thus end up being conceptually comparable to a
chinese character.
>> 
>> But I assume (I hope) we don't want to do that, right?
> 
> We use spaces between words, would it be bad to use quadrat separators
between quadrats? Mutatis mutandis, it feels to me like saying ‘oh, no,
there are blank spaces all around!’.
> 
> More seriously, taking signs as basic units for Unicode might be a
practical solution (even if it leads to many subsequent difficulties, see
the vertical vs horizontal discussion), but denying the essential function
of quadrats is kind of funny when discussing the introduction of control
characters (1, 2, 38, it does not matter): what they do is building quadrats
(implicitly or explicitly).
>>> 5) special characters, vertical/horizontal texts and input methods
>>> 
>>> 
>> 
>> Your point escapes me, here. Unless it is a result of getting rid of the
quadrats: within quadrats, the groups of hieroglyphs are essentially the
same.
>> 
>> 
>> With quadrats marked by special control characters it would be even worse,
as you would have to take into consideration even more possible
combinations.
>> 
>>   
>>> 6) Ramesside “groups” (or “tall groups”).
>>> 
>>> 
>> 
>> Why are they groups or quadrats and not ’small columns’? 
>> 
>> The answer is simple and straightforward (and provided by your example):
because they correspond to the size occupied by the A1 sign. Look at the
example.
>>  
>> but some signs do occupy the full height of the horizontal line (unlike
in your Japanese example, which is nice, but as such irrelevant), which is
the basis for deciding what counts as a unit (quadrat) or not. (Or do you
have another definition in mind?). 
>> 
>> Not true.
>> This can be interpreted just as a question of layout, not as a question
of grouping.
> 
> Paraphrasing what Ariel Shisha-Halevy once said to a colleague during a
conference: you’re entitled to have your own kind of Egyptian if you want!
> I’m not arguing against nonsense. But that’s a pity, because there are
actually many cases (esp. in Ptolemaic temples) that fit perfectly with your
‘small columns’ hypothesis. 
> The Ramesside example that you provide simply does not work this way. 
> 
> And all the parallels from other scripts are pointless (and admittedly
funny in the framework of this discussion), since none of these scripts are
based on a quadratic structure that is close in any respect to Egyptian.
Comparing apples with pears won’t help: and again, that’s a bit of a pity,
because we have many cases in Egyptian of layouts similar to the ones you
mentioned. These are nice cases of special layouts, indeed, and I agree with
your analysis: we do not want to encode them (unlike the quadrat
structure).
> 
>  
>> But if you want to be able to use it minimally for (1) standardized
electronic corpus and (2) journals like LingAeg, etc. dealing with the
Egyptian language, this would be a requirement. 
>> 
>> All the egyptian words quoted in my article published on the last 
>> issue of JNES were written with my font with just standard ligatures, 
>> without control characters. No one complained, no one told me 
>> anything, so I assume that such a system is indeed suitable for 
>> scientific publications, not only for writing tourists' names.. ;-)
> 
> I’m afraid that one paper consisting of a lexicographical discussion about
the use of one word in one hieratic document cannot be taken as any evidence
for disproving my point: you’re happy with the way your hieroglyphs are
rendered? Perfect. But why should it be the case that people willing to
render a simple hieroglyphic line as the one below would face difficulties?
This escapes me.
> 
>  
> <Capture d’écran 2016-07-24 à 11.47.11.png> (KRI I, 4)
>>> 7) What are the “square groups”?
>>> 
>>> 
>> 
>> I provided the definition agreed on — I think — by everyone, if you have
a better one, I’m listening of course.
>> 
>> see above.
>> And see my "Second" introductory point in the previous email: it is not a
question of finding the "right" definition (as there isn't any). It is a
question of finding the *most efficient* definition in order to efficiently
reach our goal of having a working system of Unicode-based hieroglyphs that
can be used by the *whole* Egyptological community (and not only by a
handful of people working on a specific corpus or database, and in this
respect, to respond to your last remark..).
>> 
>> I’m sorry to put it so bluntly, but if Unicode is not to be useful for
the majority of egyptologists, so be it. It will remain what it is, a
standard not used by the community: Journals, Corpora, etc. will keep on
using JSESH and other tools, they’re doing well with it. 
>> 
>> Perhaps the opposite should also be considered? perhaps the unicode-based
system to write hieroglyphs (considering that the very idea at the basis of
unicode is to make writing standardized and accessible to as many people as
possible) should indeed aim at being useful for the majority of the
Egyptologists (corpus-linguists as well as all the other thousands of
non-corpus-linguist egyptologists), and if some team working with some
specific database should have some specific need that cannot be satisfied
with such a standard unicode system, then perhaps it is them who should use
other tools?
>> 
>> I guess this is an option that should be considered.. no?
> 
> We are talking about feelings and different perceptions here, so it’s hard
to be objective. My own view (admittedly subjective) is as follows: most
Egyptologists (publishing texts, monuments, etc) will never be happy with
what Unicode has to offer, because this is not precise enough at multiple
levels (Nigel expressed this position multiple times during the meeting for
instance). On the other hand, grammarians (broadly speaking) and people
working on corpora are probably the ones who are the most open to
standardization (hieroglyphs were not even there in the TLA at the
beginning). I see this community (much more than historians, etc.) as
intensive users of Unicode: using it for exchanging texts, publishing
volumes, creating online resources, etc. I feel like we are very much in
favor of standardization and striving for making resources accessible; two
goals of Unicode, as you mentioned.
> 
> These resources should be built to last, it would be a pity not to
think about the standard carefully before going in any direction.
> 
> That’s all folks!
> Have a nice weekend,
> 
> Stéphane
> _______________________________________________
> Egyptian mailing list
> Egyptian at evertype.com
> http://evertype.com/mailman/listinfo/egyptian_evertype.com
