teleforce · 3 years ago
As many have pointed out here, Mandarin, Thai, Cantonese and Vietnamese are tonal languages, and the meaning of a word depends on how you pronounce its syllables. Mandarin has four tones, Thai has five, and Cantonese and Vietnamese have six each. Overall, about 20% of the world's population, or 1.5 billion people, converse daily in tonal languages.

It would be very helpful if someone came up with automatic tonal detection systems for language learners to automatically check the correctness of their pronunciations as they speak, in real time. This could be accomplished with time-frequency analysis, similar to pronunciation apps such as ELSA Speak for English [1][2].
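As a rough illustration of the detection idea, here is a minimal Python sketch. It swaps in plain autocorrelation pitch tracking for a full time-frequency representation, and it only labels a contour as rising, falling or level; a real tone checker would also need voicing detection, speaker normalisation and handling of dipping contours. The function names, the 75-400 Hz search range and the synthetic glide are all assumptions made for the example.

```python
import math

def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate one frame's fundamental frequency (Hz) by autocorrelation."""
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0

def pitch_contour(samples, sample_rate, frame_ms=40):
    """Split the signal into consecutive frames and estimate F0 for each."""
    size = int(sample_rate * frame_ms / 1000)
    frames = [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]
    return [estimate_f0(frame, sample_rate) for frame in frames]

def classify_contour(contour, threshold=1.05):
    """Crude shape label from the ratio of the last to the first voiced F0."""
    voiced = [f for f in contour if f > 0]
    if len(voiced) < 2:
        return "unknown"
    ratio = voiced[-1] / voiced[0]
    if ratio > threshold:
        return "rising"
    if ratio < 1 / threshold:
        return "falling"
    return "level"

def synth_glide(f_start, f_end, seconds=0.4, sample_rate=8000):
    """Hypothetical test signal: a sine whose frequency glides linearly."""
    n = int(seconds * sample_rate)
    slope = (f_end - f_start) / seconds
    return [math.sin(2 * math.pi * (f_start * t + 0.5 * slope * t * t))
            for t in (i / sample_rate for i in range(n))]
```

Calling `classify_contour(pitch_contour(synth_glide(120, 180), 8000))` labels a glide from 120 Hz to 180 Hz. Recorded speech is far messier than a synthetic sine, so treat this as a starting point only.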

[1] Time–frequency representation:

https://en.wikipedia.org/wiki/Time%E2%80%93frequency_represe...

[2] ELSA Speak:

https://elsaspeak.com/en/

inkyoto · 3 years ago
> It would be very helpful if someone came up with automatic tonal detection systems for language learners to automatically check the correctness of their pronunciations […]

The number embedded in the Jyutping pronunciation indicates the tone.

However, automatic tone detection is not as straightforward as it might seem due to tone sandhi (the tone change of a word/syllable depending on surrounding morphemes), and not all tonal languages have tone sandhi. Cantonese, Teochew and Hokkien can have pretty complex tone sandhi, though.

For Cantonese and Jyutping specifically, the tone sandhi is marked as, e.g., «faan4*2», which means that the standard pronunciation «faan4» (the low falling tone) can change into «faan2» (the high rising tone) depending on the following morpheme. The tone sandhi does not have fixed rules and depends on each morpheme. Depending on the tonal language, the preceding morpheme can initiate a tone change, or it can be the following one. Certain tonal languages can have a complex tone sandhi that affects the tone not just in a single morpheme but in the rest of the morphemes that comprise a full word.

We would really be talking about a fully contextual, semantics-aware automated translator that applies the appropriate tone sandhi on a contextual basis to give a Western reader the correct tone rendition in the transliteration.

Umofomia · 3 years ago
Note that while Cantonese has the phenomenon of changed tones (變音 - https://en.wikipedia.org/wiki/Changed_tone), it is not actually considered tone sandhi (https://en.wikipedia.org/wiki/Tone_sandhi#What_tone_sandhi_i...).

Tone sandhi is phonologically motivated, i.e., the tone changes arise from the pronunciation of the surrounding words, and are thus largely predictable. Cantonese changed tones, however, are generally lexically motivated, i.e., the changed tone is part of the realization of the word itself. Cantonese changed tones are thus more akin to Mandarin's erhua phenomenon (https://en.wikipedia.org/wiki/Erhua), which is also seemingly as random with regard to the words to which it applies.

> For Cantonese and Jyutping specifically, the tone sandhi is marked as, e.g., «faan4*2»

Note that the Jyutping standard actually doesn't specify how tone changes are marked. I believe the * convention originated at https://www.cantonese.sheik.co.uk/ to facilitate Cantonese learning. Wiktionary has adopted a similar convention but using a hyphen (-) instead (https://en.wiktionary.org/wiki/Wiktionary:About_Chinese/Cant...).
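To make the two marking conventions concrete, here is a small Python sketch that parses a syllable written either as `faan4*2` or as `faan4-2`. The `Syllable` type and `parse_syllable` function are hypothetical names for this example, and the pattern only covers the simple "letters, tone digit, optional changed tone" shape discussed above, not any official Jyutping grammar.

```python
import re
from typing import NamedTuple, Optional

class Syllable(NamedTuple):
    base: str                     # romanized letters, e.g. "faan"
    tone: int                     # citation tone, 1-6
    changed_tone: Optional[int]   # tone after the change, if one is marked

# letters + tone digit, optionally followed by "*" or "-" and the changed tone
_PATTERN = re.compile(r"^([a-z]+)([1-6])(?:[*-]([1-6]))?$")

def parse_syllable(text: str) -> Syllable:
    """Parse one romanized syllable with an optional changed-tone marker."""
    match = _PATTERN.match(text.strip().lower())
    if match is None:
        raise ValueError(f"not a recognised syllable: {text!r}")
    base, tone, changed = match.groups()
    return Syllable(base, int(tone), int(changed) if changed else None)
```

With this, `parse_syllable("faan4*2")` and `parse_syllable("faan4-2")` yield the same record, so a learning tool could accept input in either convention.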

dumbotron · 3 years ago
I got curious about something with tonal languages: how are song melodies written for them? For some, the melody matches the tones of the words. Then there's Mandarin. Mandarin just follows the melody, and you can figure out the word by context. As an English speaker, this makes ~cents~ sense. Homophones aren't a big deal. If Mandarin doesn't need tones in lyrics, why does it need them normally?
Umofomia · 3 years ago
Interestingly, Cantonese songs tend to preserve tone better than songs sung in Mandarin. The paper "Tone and Melody in Cantonese" by Marjorie K.M. Chan [1] mentions the following:

> For Chinese, modern songs in Mandarin and Cantonese exhibit very different behaviour with respect to the extent to which the melodies affect the lexical tones. In modern Mandarin songs, the melodies dominate, so that the original tones on the lyrics seem to be completely ignored. In Cantonese songs, however, the melodies typically take the lexical tones into consideration and attempt to preserve their pitch contours and relative pitch heights.

[1] https://journals.linguisticsociety.org/proceedings/index.php...

hnfong · 3 years ago
Tangential fact - while, as you described, most popular songs in Cantonese have tones matching the melody (usually the melody is written first, then lyrics are filled in), many songs from Christian churches don't follow this practice. (I don't know why, maybe a lack of lyricists for translation from English/Latin to Cantonese during earlier years?)

So, in Hong Kong, when somebody writes a song/lyric that doesn't quite have matching tones, we ask "which church are you from?" to make fun of it.

ddeck · 3 years ago
It needs them because there are too few unique syllables in Mandarin. I'm sure a linguist can provide the proper terminology, but there are only around 400 unique sounds in Mandarin, ignoring tones. Even adding five tones still only increases this to ~1500 (not all are used). Compare this to English, where estimates are in the 10-15k range.

There are therefore an enormous number of homophones in Mandarin, which makes it very challenging to comprehend without context. I've often had native speaking friends eavesdrop on a conversation, only to tell me that they're not sure what is being discussed.

It also means that the language cannot be usefully written phonetically, and thus unique characters are required.

some discussion here:

https://chinese.stackexchange.com/questions/40574/why-does-m...

https://chinese.stackexchange.com/questions/39695/does-chine...

https://chinese.stackexchange.com/questions/14596/how-many-s...

Gigablah · 3 years ago
Just because you can figure it out by context in songs (rarely upon the first listen, mind you), that doesn’t mean the added cognitive load isn’t excessively burdensome in everyday speech.

maartenpi_ · 3 years ago
I've learned to speak some Mandarin. What helped me a lot while talking is using Google's Translate function to see if I get the tone right.

It's free and fast. It's very helpful when you want quick feedback.

I learned a lot from it about the pronunciation of consonants as well. The k-sound has way more air in it. You need to pronounce it as "kh". Same thing with the t-sound.

I just came back from a trip to China and it was noticeable how much more people understood me. Still need to work on vocabulary though...

peterfirefly · 3 years ago
> The k-sound has way more air in it. You need to pronounce it as "kh". Same thing with the t-sound.

Well, if you are Dutch, all other consonants from all other languages need a lot more aspiration ;)

oefrha · 3 years ago
You can even use Google Translate's text to speech API in whatever learning program you build for yourself:

  GET https://translate.google.com/translate_tts?ie=UTF-8&tl=zh_CN&client=tw-ob&q=<url encoded text>
This returns an audio/mpeg (mp3) response. Change the language code as appropriate.

It's not the most natural sounding TTS engine, but it's free, unauthenticated and trivial to use.
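For anyone wiring this into a script, a minimal Python sketch follows. It builds the same URL with the query parameters shown above; `tts_url` and `save_speech` are hypothetical helper names, and the `User-Agent` header is an assumption rather than a documented requirement. The endpoint is unofficial, so it may change or be rate-limited without notice.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

TTS_ENDPOINT = "https://translate.google.com/translate_tts"

def tts_url(text: str, lang: str = "zh_CN") -> str:
    """Build the request URL; urlencode takes care of escaping the text."""
    params = {"ie": "UTF-8", "tl": lang, "client": "tw-ob", "q": text}
    return f"{TTS_ENDPOINT}?{urlencode(params)}"

def save_speech(text: str, path: str, lang: str = "zh_CN") -> None:
    """Download the mp3 response to `path` (performs a network request)."""
    # Assumption: sending a browser-like User-Agent; this is not documented.
    request = Request(tts_url(text, lang), headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

For example, `save_speech("你好", "nihao.mp3")` would write the audio next to the script, assuming the endpoint still behaves as described.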

xept · 3 years ago
What Google's Translate function are you talking about? Voice input?
postcynical · 3 years ago
The annoying part in written Thai is thattherearenospacesbetweenwords.
seanmcdirmid · 3 years ago
Spaces between words are a relatively recent Irish invention (7th or 8th century) in western written language, so it’s not like it’s an obvious thing to have.
xvilka · 3 years ago
Same with Chinese; lexing and parsing therefore require knowing many more words than in languages with spaces between words.
dumbotron · 3 years ago
Hence expertsexchange.com
qingcharles · 3 years ago
Do any ideographic languages use spaces?

I'm used to it in Asian languages but it still does my head in when I try to read older Latin documents.

geomark · 3 years ago
yougetuseditafterawhile

The thing that bugs me about written Thai is that there are spaces now and then and you would expect them to be at sentence breaks but they seem to be randomly placed throughout the text, almost as if that's where the writer felt like he needed to take a breath instead of where one sentence ends and another begins.

deadfoxygrandpa · 3 years ago
idk the more chinese i learn, the more im convinced that the very concept of individual words is blurred and not quite the same because of the way the writing system works

中国共产党, is that one word? should you break it up as 中国 共产党? what about 中国 共产 党? i dont think its nearly as clear which of these is correct as it is in english
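The ambiguity is easy to reproduce in code. Below is a minimal Python sketch of forward maximum matching, one simple dictionary-based segmentation technique (not what any particular tool necessarily uses). The three toy lexicons are assumptions for the example; the point is that the "correct" split depends entirely on which entries the lexicon contains.

```python
def segment(text, lexicon):
    """Greedy longest-match segmentation against a set of known words."""
    max_len = max(map(len, lexicon))
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words

# Three hypothetical lexicons, three different answers for the same string.
coarse = {"中国共产党", "中国", "共产党"}
medium = {"中国", "共产党"}
fine = {"中国", "共产", "党"}
```

`segment("中国共产党", coarse)` keeps the string as one word, `medium` splits it in two, and `fine` splits it in three, which mirrors the three readings in the comment above.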

throwaway2037 · 3 years ago
There was a post on HN recently about the demise of speech recognition: https://news.ycombinator.com/item?id=35800935

That said, I still think your post raises some excellent points. To refine it, how about an online learning app that plays a sound or video clip from a national news TV or radio program. (Usually, the speakers have perfect pronunciation.) You repeat the words; the online app records it; then, it shows your tones/pitch/accent versus the native speaker. I think that could be incredibly useful. My point: Speech reco is an impossibly hard problem (currently) due to infinitely broad context in spoken languages. My idea would have "perfect context", so speech reco could really work.

When I was learning tones for a while, I used to record myself, then replay it. It's amazing to hear the difference between what you think you sound like and what you really sound like. An online app could help to fine tune your pronunciation very quickly.
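The "your pitch versus the native speaker" comparison could be scored roughly as in the Python sketch below. It assumes the two pitch contours have already been extracted as lists of frequencies in Hz; the semitone normalisation, the linear resampling and the RMS distance are illustrative choices, not an established scoring method, and all function names are hypothetical.

```python
import math

def to_semitones(contour_hz):
    """Express a contour in semitones relative to the speaker's own mean pitch."""
    log_mean = sum(math.log2(f) for f in contour_hz) / len(contour_hz)
    return [12 * (math.log2(f) - log_mean) for f in contour_hz]

def resample(values, n):
    """Linearly interpolate a sequence to n points (n must be at least 2)."""
    if len(values) == 1:
        return [values[0]] * n
    out = []
    for k in range(n):
        pos = k * (len(values) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def contour_distance(learner_hz, reference_hz, points=20):
    """RMS difference in semitones between two length-normalised contours."""
    a = resample(to_semitones(learner_hz), points)
    b = resample(to_semitones(reference_hz), points)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / points)
```

Because each contour is normalised to its own mean, a learner with a lower voice who produces the same shape as the reference scores close to zero, while a falling contour against a rising reference scores several semitones.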

nextaccountic · 3 years ago
This article is from 2010, and it says that as of 2010, progress on speech recognition flatlined since 2001

Is this still the case?

How does 2001 software compare to 2023 software? Or 2010 to 2023 for that matter

gumby · 3 years ago
> As many have pointed out here, Mandarin, Thai, Cantonese and Vietnamese are tonal languages

There are also plenty of tonal languages outside East/Southeast Asia, in South Asia, Africa, Europe, North America, and that's just ones I know of.

Languages also switch, like ancient Greek.

LAC-Tech · 3 years ago
Or indeed, ancient Chinese! Which IIRC was non tonal.

There's some linguistic pattern where consonant clusters at the end of words get dropped, but their 'effect' on the vowel remains, and that's how these kinds of tones develop.

IIRC there are also two different kinds of tones, pitch tones and register tones....

Languages are crazy.

faitswulff · 3 years ago
Hong Kong Cantonese has six tones. I knew it was different for Guangzhou Cantonese, but I wasn't sure exactly how many, so here's what Wikipedia says:

> In finals that end in a stop consonant, the number of tones is reduced to three; in Chinese descriptions, these "checked tones" are treated separately by diachronic convention, so that Cantonese is traditionally said to have nine tones. However, phonetically these are a conflation of tone and final consonant; the number of phonemic tones is six in Hong Kong and seven in Guangzhou.

keyboard_smash · 3 years ago
Yeah, the seventh tone in Guangzhou Cantonese that’s gone in Hong Kong Cantonese is the high-falling tone.

In Guangzhou Cantonese, 衫 (shirt) and 三 (three) are not homophones, but they are in Hong Kong Cantonese. The Jyutping romanization (from the Linguistic Society of Hong Kong) reflects this change in HK Cantonese (saam1), whereas Yale, based on the older pronunciation, could represent the difference in tone (sāam vs sàam).

Interestingly enough, the high-falling tone is still retained in Hong Kong Cantonese for one exceedingly common word, the final particle 㖭 (tim1/tìm)!

porphyra · 3 years ago
And to add to that, there are a bunch of sandhis, which are when the tones are shifted or modified when followed by certain other tones.

Using tones is so natural that many native Cantonese speakers are unaware that the language even has tones lol.

xbeta · 3 years ago
Isn't it true that Cantonese has 9 tones?
Umofomia · 3 years ago
Yes and no. Cantonese has 9 tone categories that have 6 distinct tone contours. The 3 additional tones fall under the checked tone category (https://en.wikipedia.org/wiki/Checked_tone) for historical purposes, but their realized pronunciations coincide with the tone contours of 3 of the other 6 tones, so for most practical purposes, many sources describe Cantonese as having 6 tones.

I have an old Quora answer here that goes into more detail: https://qr.ae/pyNupi

ronyeh · 3 years ago
I like to tell complete newbies that Cantonese roughly has "4 tones."

- High level

- Mid level

- Low (includes "low falling" and "low level")

- Rising (includes "low rising" and "mid rising")

I've combined similar tones into the Low and Rising categories. If you are a non-native Cantonese speaker, and don't differentiate between "low falling" and "low level", native Cantonese speakers will still understand you.

It's difficult for a non-native speaker to distinguish between "low rising" and "mid rising".... so just treat it as a rising tone. I'm a native speaker and sometimes I forget which type of rising tone a particular word is.... I didn't learn it that way, haha. I just learned to say the word the same way my parents did.

The 7th, 8th, and 9th tones are short versions of the three level tones, and they all end in a consonant (like "k"). If you pronounce them the same, but make the syllable very short, you'll be fine.

So yeah.... think of it as 4 tones, just like Mandarin. Three different level tones at high, middle, low pitches, and one rising tone :-).
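The simplification above can be written down as a lookup table. This Python sketch assumes the usual Jyutping numbering (1 high level, 2 high/mid rising, 3 mid level, 4 low falling, 5 low rising, 6 low level) and groups the tones the way the list describes; `beginner_category` is a hypothetical helper name.

```python
# Jyutping tone number -> simplified beginner category from the list above.
BEGINNER_CATEGORY = {
    1: "high level",
    3: "mid level",
    4: "low",      # low falling
    6: "low",      # low level
    2: "rising",   # mid (high) rising
    5: "rising",   # low rising
}

def beginner_category(jyutping_syllable: str) -> str:
    """Map a syllable such as 'faan4' to one of the four simplified tones."""
    tone = int(jyutping_syllable[-1])  # Jyutping puts the tone digit last
    return BEGINNER_CATEGORY[tone]
```

So `beginner_category("faan4")` and a syllable in tone 6 land in the same "low" bucket, which is exactly the distinction a newcomer is being told they can ignore at first.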

likpok · 3 years ago
It depends. Some people classify it as 6 tones, some as 9. The extra three are "entering <x> level tone", which are sort of shortened versions of a different tone.

So: some words end in a stop, which is sometimes counted as a different tone even though the pitch pattern isn't different. For example, consider fan versus fat.

https://en.wikipedia.org/wiki/Checked_tone

ackfoobar · 3 years ago
That's my favourite pet peeve.

TLDR: No. There are 6 tones in Cantonese; the 9 "categories" are defined with reference to Middle Chinese.

---

Middle Chinese had 4 tones[1]. The 4th tone, "entering" (or "checked"), covers words that end in stops (p/t/k). Because of the way it evolved, none of those words in Cantonese have tones 2, 4, or 5 (but not exactly, see below). In other words, they all have tones 1, 3, or 6.

To emphasize this observation and to make a connection to the 4 tones in Middle Chinese, some analyses call them tones 7, 8, 9, with names upper dark/lower dark/light entering[2].

But such an analysis has nothing to do with how a modern Cantonese-speaking brain processes the sounds. E.g. Cantonese has a tone change to tone 2 for the diminutive form; when this happens to a word that ends with p/t/k[3], the 9-tone framework cannot describe that.

---

Caveat: when I said "Cantonese" above I mean the dominant dialect of Cantonese spoken in Guangzhou/Hong Kong.

[1] https://en.wikipedia.org/wiki/Four_tones_(Middle_Chinese)

[2] https://en.wikipedia.org/wiki/Cantonese_phonology#Tones

[3] https://en.wiktionary.org/wiki/%E7%8E%89#Pronunciation
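The relationship between the two numbering schemes fits in a few lines of Python. This is a sketch under the description above: tones 7, 8 and 9 are the entering-tone labels for syllables ending in p/t/k, and they correspond to tones 1, 3 and 6 in the six-tone numbering. `to_six_tone` is a hypothetical helper name.

```python
# Nine-category label -> six-tone number with the same pitch.
CHECKED_TO_SIX = {7: 1, 8: 3, 9: 6}

def to_six_tone(syllable: str, tone: int) -> int:
    """Convert a tone in the nine-category scheme to the six-tone scheme."""
    if tone in CHECKED_TO_SIX:
        # Tones 7-9 only make sense on syllables ending in a stop.
        if syllable[-1] not in "ptk":
            raise ValueError(f"{syllable!r} does not end in p/t/k")
        return CHECKED_TO_SIX[tone]
    if 1 <= tone <= 6:
        return tone
    raise ValueError(f"unknown tone number: {tone}")
```

Going the other way is where the nine-category scheme runs out: a p/t/k syllable that takes the changed tone 2 has no slot among 7, 8 and 9, which is the limitation described above.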

majou · 3 years ago
Japanese has two tones, which is something I didn't know until recently.
GolDDranks · 3 years ago
This is not true in a strict sense; Standard Japanese has a pitch accent that has a "culminative" pitch contour over a word. Culminativity means that there is at most a single point of prominence. In Japanese, this gets realised as a drop in the pitch. (In variants of Japanese, there are more elaborate systems.)

Tone systems are different in the sense that each syllable has its own contour. (Of course, when realized, these get merged according to various phonological processes.) Japanese differs from tone systems in that it has only one culminative pitch contour over multi-syllable words.

(Disclosure: I am an expert in Japanese phonology, especially in pitch accent.)

adastra22 · 3 years ago
In the same way that Swedish has two tones. Bitonal systems aren't quite as difficult as monosyllabic polytonal languages like Mandarin or Cantonese, though. There are a handful of words in Japanese which are differentiated in pronunciation only by tone, but these are relatively rare. If you screw up the tone in Japanese it will sound like a bad foreign accent, but you will likely still be understood. Just about every word in Mandarin, on the other hand, has one or more conjugate tone pairings, and if you screw up the tone you're speaking nonsense.

(Source: 10 years learning Japanese, followed by marrying someone from Taiwan.)

panabee · 3 years ago
this is awesome. thanks for sharing.

there is a wealth of resources for learning mandarin but only a smattering for cantonese.

two favorites:

1. web dictionary http://www.cantonese.sheik.co.uk/scripts/wordsearch.php?leve...

2. iOS dictionary that is free and comprehensive, also covers mandarin: https://apps.apple.com/us/app/pleco-chinese-dictionary/id341...

unaffiliated with either company, just a longtime user.

other suggestions for cantonese resources welcome.

keyboard_smash · 3 years ago
I make a Cantonese dictionary app for Windows/Mac/Linux, similar to Pleco (has multiple sources, coloured characters for tones, etc)!

https://jyutdictionary.com/

panabee · 3 years ago
awesome, just sent an email about potentially sponsoring this.
lordnacho · 3 years ago
What I'm after is a resource for learning Mandarin if you're already a speaker of Cantonese.

I've found it surprisingly hard to find a summary of the differences.

By contrast learning German from Danish seemed to have some bits that made it clear reasonably fast.

Umofomia · 3 years ago
I'm not affiliated nor do I personally have any experience with this service, but have heard good things about the Canto To Mando Blueprint: https://www.thecmblueprint.com/
peterfirefly · 3 years ago
If you can read hanzi, can't you just start reading whatever you want in Mandarin? Obviously starting with something easy and then moving on to harder and harder texts, of course.

And can't you just start watching TV shows and movies in Mandarin? It should be a lot easier than for us mortals who start out without knowing any Sinitic languages. We have to work hard just to handle the tones and the near total lack of shared vocabulary.

I am puzzled that learning Mandarin for you hasn't gone much the same way as learning German did.

There is of course the standard way of playing with HelloChinese (or similar apps) and LingQ/Du Chinese (or similar apps) + reading easy readers (Mandarin Companion, Chinese Breeze, and similar). You should be able to speedrun them, compared to the long slog path that we mortals have to take.

(A fellow Dane with zero East Asian parentage.)

a_c · 3 years ago
Do you mind sharing why are you learning cantonese?
panabee · 3 years ago
our family grew up with cantonese as the second language. my mom is from hong kong and my dad from the southern part of china.

unfortunately, i failed to cherish this opportunity and spoke english predominantly as a child, leaving me with heavily-accented and vocabulary-limited cantonese.

i have spent an inordinate amount of time repairing these deficiencies and learning mandarin as well.

a little more self-awareness and foresight as a teenager could have saved me years of learning as an adult.

on the upside, self-learning has yielded insights into chinese language and culture, learning, and accents. unsure if these other perspectives are worth the extra time and effort, though. :)

panabee · 3 years ago
consolidating other resource links from the thread:

* Jyutping typing game https://chaaklau.github.io/cantorocks/

m348e912 · 3 years ago
I asked my Asian friend if this font is a good way to learn Chinese. He said a better option was to get a girlfriend who only speaks Cantonese.

Noted.

lvturner · 3 years ago
I’ve been married to a native Mandarin speaker for years now… this strategy does not always work.
wluu · 3 years ago
This is pretty good for someone who can speak Cantonese but can't read/write it.

As an example, I speak Mandarin and can't read/write (much) Chinese characters as I spoke it at home while growing up in Australia. So, I can imagine there'd be quite a lot who are in a similar situation to me but with Cantonese who would benefit from this (not just as a learning tool).

I've been using the Zhongwen[0] browser extension to "read" websites that have Chinese characters for many years, as hovering over Chinese characters will display a popup with the pinyin pronunciation. It may not be the speediest way of understanding a block of Chinese text.

I could imagine someone creating a browser extension that would replace the font used on the website(s) with the Cantonese Visual Font when the extension is enabled.

[0] https://github.com/cschiller/zhongwen

kelvie · 3 years ago
Something's not clear here to me: how does this handle words with multiple pronunciations using a font alone?
jamesdutc · 3 years ago
I am not a Cantonese speaker; however, in Mandarin, fonts with phonetic guidance are very common.

e.g., Hann-Tzong Wang's (王漢宗) free font collection[1] includes two typefaces with phonetic pronunciation guidance. These are wp{0..3}10-05.ttf and wp{0..3}10-08.ttf [2] As you can see from the filenames, there are actually four different font files for each of these two typefaces. The font files numbered {1..3} are for 「破音字」, characters with alternate pronunciation.

When a user types a word like 「給予」 (ㄐ一ˇㄩˇ/jǐ yǔ) for which there is an alternate, less-common pronunciation (ㄐ一ˇ/jǐ instead of ㄍㄟˇ/gěi for 給) they simply change the font for just the affected character to the variant with the correct pronunciation.

In the case of this Cantonese Font, the authors distribute a single .ttf (alongside a “phrasebook” .ttf whose purpose is not clear to me) and indicate in the Roadmap section of the website that ligature support must be enabled. If alternate pronunciations are common in Cantonese, then I suspect that they must use some ligature-based method. I would have to imagine there must be cases where this could be ambiguous, but I don't know how you would resolve those.

(In practice, just swapping the font on a single character works fairly well.)

[1] https://code.google.com/archive/p/wangfonts/

[2] https://dywang.csie.cyut.edu.tw/dywang/download/pdf/sample-o...

er4hn · 3 years ago
Thanks for sharing this. When I saw the link, I wanted to see if there was something similar for Mandarin. Looking at the PDF sample I don't see anything around how to pronounce the characters, i.e. there's nothing along the top or bottom that looks like pinyin with tonal markers. Am I missing something?
rahimnathwani · 3 years ago
Perhaps they use the same technology as ligatures? There could be a glyph for the standalone character, but also special glyphs for certain combos?

The page says they do handle variations:

  Pronunciation in the Cantonese Font adapts to the context. Based on what comes before or after, the Jyutping romanization changes to the right one. The magic behind this is a careful curation from 100,000 contexts where the pronunciation differs from the standalone character.

jkwchui · 3 years ago
Hello. Font's author here. You and Jeff are correct in guessing this is (ab)using ligatures maximally :) To satisfy your curiosity, we can go deeper.

----

Conceptually it is simple:

1. assign a default (most likely) sound for each character,

2. loop through contexts, extracting words (char-combos) where the sound is different from the default ("alt-word"),

3. create SVGs + font-paths (fallback for incompatible systems) for every char and every alt-word,

4. assign a ligature to substitute each char-sequence that forms the alt-word (e.g., "when 乾 隆 appears adjacently, replace with `uniF1234`", the codepoint for the alt-word 乾隆).
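The lookup logic behind these steps can be imitated outside a font. The Python sketch below is only an analogy for what the ligature table does, not the font's actual build code: the two tiny tables, the readings in them and the function names are assumptions for the example. Step 2 is modelled by `extract_alt_words`, and the substitution in step 4 by `annotate`, which prefers the longest matching alt-word and otherwise falls back to each character's default sound.

```python
# Step 1: a default reading per character (hypothetical mini table).
DEFAULT_READING = {"乾": "gon1", "隆": "lung4", "衣": "ji1"}

def extract_alt_words(word_readings):
    """Step 2: keep words in which some character departs from its default."""
    return {
        word: readings
        for word, readings in word_readings.items()
        if any(DEFAULT_READING.get(ch) != r for ch, r in zip(word, readings))
    }

# Hypothetical context data: 乾隆 needs an override, 乾衣 does not.
ALT_WORDS = extract_alt_words({
    "乾隆": ["kin4", "lung4"],
    "乾衣": ["gon1", "ji1"],
})

def annotate(text):
    """Step 4 analogue: substitute alt-words first, defaults otherwise."""
    max_len = max(map(len, ALT_WORDS))
    out, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + size]
            if chunk in ALT_WORDS:
                out.extend(zip(chunk, ALT_WORDS[chunk]))
                i += size
                break
        else:
            # No multi-character match starts here: use the default sound.
            out.append((text[i], DEFAULT_READING.get(text[i], "")))
            i += 1
    return out
```

Here `annotate("乾隆")` picks up the override for 乾, while `annotate("乾衣")` keeps the default reading. A real font expresses the same rule as glyph substitutions rather than Python dictionaries.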

It is not perfect, but I didn't expect this to work so well, and was stunned when the testers reported high accuracy. I had always believed that bespoke computation with word segmentation (with some 1M frequency-attached library) and a large data-bank (100k+ words) was necessary.

----

Practically it was a horrific, tedious, mind-numbing, gawd-awful set of "why this doesn't work":

1. SVG automation that works for 10^3 breaks with 10^5

2. what worked for Latin breaks for Unicode

3. what worked for Unicode breaks for PUA

4. what worked for monochrome breaks for color

5. what worked for single glyphs breaks for ligatures

6. what?! The assignments in the database are wrong??

7. [...]

As I was trying to coerce the system to do what it wasn't designed to do, many of these breaks are undocumented, pretty mysterious to solve, and some steps just got manually gritted through. (And each of the 15k+ glyphs got gritted through about five times.)

It does look pretty elegant at the end ;)

jeffparsons · 3 years ago
I've seen ligatures (or whatever the underlying feature is in font formats) used for some wild stuff, but this takes the cake. They're effectively encoding a small amount of natural language processing in a font.

Setting aside for a minute the question of whether you _should_, I wonder how far you can take this? I.e. what limits are there on how much context you can take into account, etc.?

pleasedontsell · 3 years ago
Maybe it uses font ligatures to change based on the surrounding characters.

https://en.wikipedia.org/wiki/Ligature_(writing)

jkwchui · 3 years ago
You are correct!
ghayes · 3 years ago
As far as I know, Mandarin doesn't have multiple pronunciations for the same character; does Cantonese? Aside from that, you could use ligatures for that, couldn't you?
Umofomia · 3 years ago
Mandarin definitely has many characters with multiple pronunciations. One large class come from literary vs. colloquial reading differences: https://en.wikipedia.org/wiki/Literary_and_colloquial_readin...

Another large class comes from vestiges of derivational morphology in Old Chinese: https://en.wikipedia.org/wiki/Homograph#In_Chinese For instance, the character 度 in modern Mandarin can be pronounced dù (when used as a noun) or duó (when used as a verb), which derive from Old Chinese /daːɡs/ and /daːɡ/, respectively.

With Simplified Chinese characters, some of them come from the merger of originally different words that had similar, but not exactly the same pronunciations. For instance, both 髮 (fà) and 發 (fā) were merged into 发.

akavi · 3 years ago
Mandarin absolutely does:

* 行: xíng or háng

* 的: de or dì

* 长: cháng or zhǎng

(plus I'm sure many more that I can't think of just right now)

joak · 3 years ago
In Mandarin there are actually different pronunciations depending on context.

Example

觉得 juede, to think

睡觉 shuijiao, to sleep

Here the same character is pronounced jue or jiao depending on context

gnownelag · 3 years ago
Both Mandarin and Cantonese actually have multiple pronunciations for the same character. Here is an example in both:

- 说服/說服 Mandarin: shuì fú Cantonese: seoi3 fuk6

- 说话/說話 Mandarin: shuō huà Cantonese: syut3 waa6

inkyoto · 3 years ago
Both do. A single, isolated Chinese character may have multiple unrelated meanings with some of them having an entirely unrelated pronunciation. It is, in fact, ubiquitous.

The idea that each honzi has exactly one meaning is a misconception.

With respect to ligatures, if by that you mean the length of the same word across different Sinitic languages, that depends on the specific language and its phonology. Mandarin, for instance, has lost a large number of finals over the course of its evolution, which has resulted in words generally being longer and requiring extra syllables to resolve the phonetic ambiguity. The Sinitic languages that have retained more finals (and sounds in general) tend to have more short words. Cantonese is one of them, albeit not the only one.

wpietri · 3 years ago
For those curious, the romanization system here is Jyutping: https://en.wikipedia.org/wiki/Jyutping

That's new to me; previously I had only seen Pinyin and Wade-Giles:

https://en.wikipedia.org/wiki/Pinyin

https://en.wikipedia.org/wiki/Wade%E2%80%93Giles

Wikipedia has a nice article on the history and a number of other systems:

https://en.wikipedia.org/wiki/Romanization_of_Chinese

tikkabhuna · 3 years ago
Happy to see this here. I think there's tons of potential for making Cantonese easier to learn. The big difficulties I've had as an English speaker learning are:

1. Multiple Romanisation formats (Jyutping vs Yale)

2. Many community-led dictionaries with varying completeness.

3. Many web resources for learning words/phrases/etc use a mixture of traditional characters, Jyutping, Yale, or something else.

It's very difficult to find the content in the format a learner needs. Hopefully something like this will help learners use content written using traditional characters.

jkwchui · 3 years ago
(Font author here)

I whole-heartedly agree. I am a native speaker, and "fluent" in jyutping, yet I have such a hard time with Yale.

One service I'm going to build is a mapping tool between {R1, R2, ...Rn} and {G1, G2, ...Gn}, where R is a romanization method and G are y/z-variants of glyphs. (These, for the most part, already exist inside packages I built for building the font, and just need a UI to expose them to the world.) It would sure save me lots of time trying to read Matthews-Yip...

tikkabhuna · 3 years ago
I was thinking the same thing. Perhaps creating an API around PyCantonese?

My thought is that if there's a common data format for a Cantonese sentence with jyutping/yale/traditional + translation(s), the user could then pick what to display.
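One possible shape for such a record, sketched in Python. The `Sentence` class, its field names and the `render` method are hypothetical; this is not PyCantonese's data model or API, only an illustration of keeping every representation side by side so the display layer can be chosen later.

```python
from dataclasses import dataclass, field

@dataclass
class Sentence:
    traditional: str                 # e.g. "你好"
    jyutping: list                   # one syllable per character
    yale: list                       # same length as `jyutping`
    translations: dict = field(default_factory=dict)  # language code -> text

    def render(self, layer="traditional"):
        """Return the sentence in the layer the learner picked."""
        if layer == "traditional":
            return self.traditional
        if layer in ("jyutping", "yale"):
            return " ".join(getattr(self, layer))
        if layer in self.translations:
            return self.translations[layer]
        raise KeyError(f"no such layer: {layer}")

# Illustrative record; a real data set would be loaded from a file or API.
hello = Sentence("你好", ["nei5", "hou2"], ["néih", "hóu"], {"en": "hello"})
```

A front end could then call `hello.render("jyutping")` or `hello.render("yale")` depending on the learner's preference, without the underlying content changing.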

It could then also be worked into games/learning exercises. Placeholders could be made with a number of options so users could learn how to slot different adjectives into sentences, for example.

(I have the same username on Reddit, by the way. Sorry I never got to test it out for you!)

lvturner · 3 years ago
This is amazing and very useful!

Does anyone know of such a font for Mandarin?

adastra22 · 3 years ago
Seriously. I could really use this right now for Mandarin.