[Egyptian] UMdC - A Unicode Coding Manual for Egyptian?

Bob Richmond bobqq at live.co.uk
Wed Jul 26 14:43:34 BST 2017


All

My error: I meant to say there are 8 format controls in total, of which 6 are new. The original 2 (VJ and HJ) were proposed over two years ago but are still waiting to be released in a future version of the Unicode standard.

One thing to be clear about: there are many data formats that can use Unicode hieroglyphic, so UMdC does not have to do everything. For instance, Daniel mentions TEI-XML, an established system, and there are many more.

Daniel:

The Rosetta data makes for an excellent example. Let's discuss this off list and put together a TEI version using the new Unicode characters. Replacing items like <seg rendition="#vert">𓀎𓏥</seg> with 𓀎:𓏥 aids readability considerably, and it's interesting to see TEI and putative UMdC side by side.
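
As a rough illustration of that replacement, here is a minimal Python sketch. It assumes the segments look exactly like the example above (a <seg> with rendition="#vert" containing only signs) and uses ":" as the joiner as shown in this message; the function name seg_to_plain is made up for the example, and nothing here is a defined part of UMdC.

```python
import re

# Matches a vertical-group segment written exactly as in the example above.
# Other rendition values or nested elements are not handled in this sketch.
VERT_SEG = re.compile(r'<seg rendition="#vert">([^<]*)</seg>')


def seg_to_plain(tei_fragment, joiner=":"):
    """Replace each vertical <seg> with its signs joined by `joiner`."""
    def join_signs(match):
        # Iterating over a Python str yields code points, so each
        # hieroglyph (one code point) becomes one item.
        return joiner.join(match.group(1))

    return VERT_SEG.sub(join_signs, tei_fragment)


print(seg_to_plain('<seg rendition="#vert">𓀎𓏥</seg>'))
```

Text outside such segments is passed through unchanged, so the same call can be run over a whole line of TEI content.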

Incidentally, I’m writing some guidelines for hieroglyphic in Markdown for publishing docs on GitHub etc.

Serge:


  1.  I envisage UMdC as a simple plain text markup system focussed on end-user usability rather than an XML file format, so more akin to Markdown than TEI-XML (but fundamentally less complex than either). However, I agree that (for instance) a tag convention for signs missing in Unicode can be useful. The possible approaches can be discussed in detail.
  2.  I agree.


Regards,
Bob



From: Serge Rosmorduc<mailto:rosmord at gmail.com>
Sent: 26 July 2017 06:49
To: Egyptian Hieroglyphs in the UCS<mailto:egyptian at evertype.com>
Subject: Re: [Egyptian] UMdC - A Unicode Coding Manual for Egyptian?

dear all,

My two cents on the subject:

a) If XML is used (and XML is needed for some uses), I believe we should use it for all structural information. We don't want to need to parse XML and some light MDC in the same file.

I also believe that regularity and ease of processing are more important than convenience when manually writing XML. In particular, I would use an XML element for each sign (it's needed in some cases to attach properties to the sign, so let's use it in all cases).

b) For simple texts, the MDC has the advantage of being somewhat readable, and will be even more so with Unicode.
Both uses might be kept alongside each other.
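
Point (a), one XML element per sign so that properties can be attached uniformly, might look like the following minimal Python sketch. The element names "text" and "sign" and the attributes "code" and "damaged" are hypothetical, chosen only for illustration; they are not taken from TEI or from any agreed scheme.

```python
import xml.etree.ElementTree as ET


def signs_to_xml(signs):
    """Wrap every sign in its own element (hypothetical names)."""
    root = ET.Element("text")
    for ch in signs:
        sign = ET.SubElement(root, "sign")
        # Record the code point so a processor need not decode the text.
        sign.set("code", f"U+{ord(ch):05X}")
        sign.text = ch
    return root


root = signs_to_xml("𓀎𓏥")
# Because every sign has an element, a property can be attached to any one
# of them without changing the structure ("damaged" is a made-up property).
root[0].set("damaged", "true")
print(ET.tostring(root, encoding="unicode"))
```

The cost is verbosity when writing by hand; the gain is that a processor only ever has to walk one kind of structure.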



best regards

Serge



On 25 Jul 2017 at 13:57, "Daniel Werning" <daniel.werning at topoi.org<mailto:daniel.werning at topoi.org>> wrote:
Dear Bob,

I am generally interested in joining the discussion.
I have experimented with some encoding in TEI XML based on the current state of Unicode encoding.
See: http://rosettastone.hieroglyphic-texts.net/tei-xml/.
The encoding of the arrangement will be easier with the six control characters in the pipeline for Unicode. However, there remain cases of non-Unicode signs and non-simple arrangements, which can, nicely I believe, be encoded in TEI XML. In any case, I imagine I can contribute based on my experience.

All the best,
Daniel (Werning)
--
_____________________________________________________________
Dr. Daniel A. Werning

  daniel.werning at topoi.org<mailto:daniel.werning at topoi.org>
  http://www.topoi.org/person/werning-daniel-a/

  Exzellenzcluster Topoi
  Humboldt-Universität zu Berlin
_____________________________________________________________



On 25.07.2017 at 12:11, Bob Richmond wrote:
Hi All

1071 hieroglyphs have been available in Unicode since version 5.2 (2009). Six formatting characters are now in the pipeline (since May). Eventually there will be more hieroglyphs, and likely more control characters too.

The idea of defining a data file format “UMdC” acknowledging Unicode was discussed at I&E 2006 and afterwards but the lack of Unicode availability in the standard and issues of application and system support made this seem a little premature. It seems to me the time is now ripe to revisit the topic.

The basics of UMdC (as I see it) are as follows:
 1. A well-defined file type “umdc” containing plain text and markup
    (capable of being edited in simple text editors such as Windows
    Notepad and HTML textarea blocks).
 2. Guidance on subset usage in database records.
 3. Basic plain text including the 1071 + 6 Egyptian characters
    (plus e.g. transliteration formats).
 4. Markup to deal with elements missing from Unicode such as
    hieroglyphs not in the 1071 set.
 5. Optional markup to help with preparing data for use with other
    formats such as HTML/CSS and Office applications.
 6. Optional markup to help with interoperability with MdC88 based data
    formats (including extensions such as JSesh).
 7. Specification of font requirements needed for representation of UMdC
    data.
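
To make items 3 and 4 a little more concrete, here is a minimal Python sketch of how a tool might tell which characters in a plain text line belong to the 1071 set and which would need some other treatment. It assumes only that the 1071 hieroglyphs added in Unicode 5.2 occupy U+13000 to U+1342E; the function names are made up for the example, and the six formatting characters are deliberately left out because their code points were not yet released.

```python
from itertools import groupby

# The 1071 hieroglyphs added in Unicode 5.2 occupy U+13000..U+1342E.
FIRST, LAST = 0x13000, 0x1342E


def is_encoded_hieroglyph(ch):
    """True if `ch` is one of the 1071 hieroglyphs from Unicode 5.2."""
    return FIRST <= ord(ch) <= LAST


def split_runs(text):
    """Group text into (is_hieroglyph, run) pairs, preserving order."""
    return [(flag, "".join(chars))
            for flag, chars in groupby(text, key=is_encoded_hieroglyph)]


# Sanity check: the range really does contain 1071 code points.
assert LAST - FIRST + 1 == 1071

for flag, run in split_runs("ra 𓇳"):
    print("hieroglyphs" if flag else "other", repr(run))
```

A validator built along these lines could flag anything outside the set, which is where markup for missing signs (item 4) would come in.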

So long as the markup system is sufficiently flexible (e.g. use of XML-like tags), version 1 of UMdC need not be overly featured, and additions can then be made as need is proven. It should be possible to create a version 1 specification supported with basic tools in months, not years.

I expect I’m not the only person who has already done related work. Does anyone have points to make about what they would like to see in UMdC? Would anyone like to get involved in defining the markup scheme?

Thanks

Bob Richmond


_______________________________________________
Egyptian mailing list
Egyptian at evertype.com<mailto:Egyptian at evertype.com>
http://evertype.com/mailman/listinfo/egyptian_evertype.com


