The International Phonetic Alphabet is based on the Latin alphabet A-Z, with a lot of extensions: “Latin script a” ɑ, “Latin epsilon” ɛ, “Latin gamma” ɣ, “Latin eng” ŋ, “Latin phi” ɸ, and so on. Notice the following:
- Latin ɛ is fairly similar to Greek ε, though its capital is Ɛ and the Greek’s is Ε.
- Latin ɣ is rather different from Greek γ, being symmetrical with a loop; its capital is Ɣ and the Greek’s is Γ.
- Latin ɸ is distinctly different from Greek φ, having strong serifs in its ascender and descender; it has no capital and the Greek’s capital is Φ.
And this is fine. These Latin letters were “disunified” from Greek a long time ago, and the UCS contains all of them as uniquely encoded characters. Three letters, however, were not disunified, and are problematic.
- U+03B2 ( β ) GREEK SMALL LETTER BETA
- U+03B8 ( θ ) GREEK SMALL LETTER THETA
- U+03C7 ( χ ) GREEK SMALL LETTER CHI
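The difference between the disunified letters and the three borrowed ones shows up directly in their character names. The following is a minimal sketch using Python's standard `unicodedata` module; the helper `script_word` is a hypothetical name introduced here, and taking the first word of a character name is only a rough stand-in for a proper script lookup.

```python
import unicodedata

# Three Latin letters that were disunified from Greek, followed by the
# three letters that IPA still borrows from the Greek block.
letters = ["\u025b", "\u0263", "\u0278", "\u03b2", "\u03b8", "\u03c7"]

def script_word(ch):
    """Return the first word of the character's Unicode name.

    Hypothetical helper: a rough script label, not a real script property lookup.
    """
    return unicodedata.name(ch).split()[0]

for ch in letters:
    # Print the code point, the character itself, and its formal name.
    print(f"U+{ord(ch):04X} {ch} {unicodedata.name(ch)} -> {script_word(ch)}")
```

The first three report LATIN and the last three report GREEK, which is exactly the asymmetry described above.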
Now the first and third of these do have non-Greek shapes, just as Latin phi does. Here’s an example from Daniel Jones’ Outline of English Phonetics (1932):
Now, the serifs on that beta’s descender are very atypical indeed in Greek typography. Moreover, the fact that the letter is unified with Greek can cause some trouble in sorting multilingual data, since in a typical English or German or French sort (for instance) the Latin alphabet sorts first, then the Greek alphabet, then the Cyrillic, and then others. In practice this means that β does not sort after b (where one might expect it), but after z.
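The effect is easy to reproduce. The sketch below sorts a few single letters with Python's built-in `sorted`, which compares strings by code point. That is an assumption made for illustration: it is not the language-aware collation an operating system or database would use, but it gives the same Latin-before-Greek result for these letters.

```python
# A few Latin letters mixed with the Greek-encoded beta and chi used in IPA.
letters = ["z", "\u03b2", "b", "c", "x", "\u03c7", "y"]

# Plain code point order: every Latin letter here comes before any Greek one.
by_code_point = sorted(letters)
print(by_code_point)  # ['b', 'c', 'x', 'y', 'z', 'β', 'χ']

# Beta lands after z rather than after b.
print(by_code_point.index("\u03b2") > by_code_point.index("z"))  # True
```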
The IPA chi can also differ from the typical Greek chi. In the 1949 Handbook of the IPA, the serifs on the letter are on the top-right to bottom-left branch of the x; the other branch is curved.
A point to remember is that the intent of the IPA chi was originally not that it was unified with Greek chi, but rather that it was different:
The non-roman letters of the International Phonetic Alphabet have been designed as far as possible to harmonise well with the roman letters. The Association does not recognise makeshift letters; it recognises only letters which have been carefully cut so as to be in harmony with the other letters. For instance, the Greek letters included in the International Phonetic Alphabet are cut in roman adaptations.
Let’s compare capital and small Latin Xx, Greek Χχ, and that IPA chi. Now, because Greek fonts have been in use for a good while, it’s possible that some people might prefer a greekish glyph to a latinish glyph. Nevertheless, take note of the weight of that older IPA chi, and compare it to the “stretched x” shape.
But in fact there’s another reason to encode a Latin chi. Lepsius made use of it in his transcription of Chukchi, and there its capital is entirely different from the capital used in Greek. Now, there is precedent for just this kind of thing being a reason to disunify: Cyrillic Ԛ and ԛ (used in Kurdish) were disunified from Latin Q and q because the capital Cyrillic one sometimes looks like an oversized small one.
So, what it looks like is that we have the following—Latin x, Greek chi, and Latin chi (both greekish and latinish glyphs are shown):
Let’s assume that LATIN LETTER CHI and LATIN LETTER BETA get encoded (leaving aside the question of THETA for now). Now the big question for the IPA is, what should be done when they are? The current recommendation is “use GREEK LETTER CHI”, but of course there’s no alternative. When there is… well, I for one would prefer a Latin letter that sorts between x and y, rather than a Greek letter that sorts between φ and ψ.
There is certainly data out there using the Greek letters β and χ and θ. Of course, there is also data out there using non-Unicode fonts, or SAMPA, or other things. In my opinion, the right thing to do is bite the bullet, get Latin beta, chi, and theta encoded, and get the recommendations promulgated through fonts and keyboard drivers. But I do not know what the view of the International Phonetic Association might be.
Here is an example of some functionality related to this. I created a number of folders named “a_la”, where the “_” is replaced by various letters.
It’s easy to see that in the Mac OS, Latin letters sort before Greek. Thorn þ sorts correctly after z. Eth ð after d. IPA ɡ after g, followed by IPA gamma ɣ. Small capital ɪ and Latin iota ɩ follow i, as expected. Then, after þ, we see that the Greek alphabet appears in its correct order. But I am sure that I want IPA beta to sort after b, not after þ, and likewise IPA chi after x. I am torn between wanting IPA theta to sort after t or after þ, but probably the former. Anyway, I want a disunification of these three IPA letters from Greek.
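Until such a disunification exists, the desired order can only be approximated by tailoring the sort. The following is a rough sketch of that idea, not a description of how the Mac OS or any real collation library works. The names `IPA_TAILORING` and `ipa_sort_key` are hypothetical, the folder names follow the “a_la” pattern used above, and the tailoring is only sensible for data known to be IPA, since it would misplace genuine Greek text.

```python
# Hypothetical tailoring: each Greek-encoded IPA letter is made to sort
# immediately after a chosen Latin base letter (theta after t, per the text).
IPA_TAILORING = {"\u03b2": "b", "\u03c7": "x", "\u03b8": "t"}

def ipa_sort_key(name):
    """Build a per-character key; tailored letters sort just after their base."""
    key = []
    for ch in name:
        base = IPA_TAILORING.get(ch)
        if base is None:
            key.append((ord(ch), 0))    # ordinary letter: its own code point
        else:
            key.append((ord(base), 1))  # tailored letter: right after the base
    return key

folders = ["azla", "a\u03c7la", "abla", "a\u03b8la", "axla",
           "a\u03b2la", "acla", "atla", "ayla", "aula"]

print(sorted(folders))                    # Greek-encoded letters come last
print(sorted(folders, key=ipa_sort_key))  # beta after b, theta after t, chi after x
```

A separately encoded Latin beta, chi, and theta would make this kind of per-application workaround unnecessary, because the default order would already be the expected one.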