Wait hold on I just realized.
-
@mcc the photo is exactly how the post looks on Tusky for me fwiw
@ehashman I have found an absolutely BAFFLING intersection of Android features and I'd wonder if it's just this tablet is messed up but I found people on Stack Overflow having the same problem with the same fix
-
@Heliograph @rk The 个 is a friend that you give to a number so that it does not get lonely
@mcc @Heliograph @rk I prefer to think of it as the units people (and other things) come in. As in, “Going down to the bar to drink a couple of pints, and maybe bring back a ge or two.”
-
@jonathankoren o-0 a "G" or two? @mcc @rk
-
@mcc not sure about the context it'll be used in, but the choices are:
* Mainland China
* Macau (you'd never guess this without looking it up)
* Hong Kong
* Singapore
@r @mcc i mostly remember macau as "the one that has 门 in it".
Android will render text from languages not in your list, that's why pleco shows the right forms. it just won't do so unless explicitly told to with an android.text.style.LocaleSpan, which most apps don't bother to do.
You can get the same problem in web browsers if the browser isn't told what language to use. I regularly see japanese forms in chinese subtitles because google isn't setting lang="zh-Hans" for their subtitles.
-
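For the web side of the LocaleSpan / lang point above, here is a minimal sketch in Python of what "telling the browser the language" can look like: each run of text is wrapped in an element that carries a lang attribute. The helper name tag_lang is made up for illustration; lang is the standard HTML attribute, and zh-Hans and ja are ordinary language tags.

```python
from html import escape


def tag_lang(text: str, lang: str) -> str:
    """Wrap text in a <span> that declares its language (hypothetical helper)."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# Same code points, different declared language. Which glyph shapes
# actually appear still depends on the fonts the renderer has available.
chinese = tag_lang("八人入", "zh-Hans")
japanese = tag_lang("八人入", "ja")
print(chinese)
print(japanese)
```

The two strings differ only in the declared language, which is exactly the out-of-band hint the renderer otherwise has to guess.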
@jonathankoren o-0 a "G" or two? @mcc @rk
@Heliograph @mcc @rk
Narrator: They brought home zero ges of companions
-
Update: I solved the problem, not by adding Chinese as an alternate language for my Android, but by deleting Japanese as an alternate language. Not sure when I did this or what I was trying to accomplish but I question Google's decision that informing it I may look at text in Japanese makes it conclude I DEFINITELY won't be looking at Chinese!
@mcc I have to wonder if this is downstream of Unicode's choices around CJK unification. Because I seem to remember reading that it ended up causing some situations where, in order to correctly render a block of text, you need out-of-band knowledge of which language it's in.
-
@ehashman I have found an absolutely BAFFLING intersection of Android features and I'd wonder if it's just this tablet is messed up but I found people on Stack Overflow having the same problem with the same fix
@mcc I'm so glad you found a fix!
-
Wait hold on I just realized. Is
八人入
A reasonable Chinese sentence
@mcc “八人入” isn’t wrong, but it feels incomplete — like “eight people enter...” and then you wait for the rest.
“八人入众” is more satisfying. It means “eight people enter a crowd” — a complete image, and a great start for a story.
To me, it feels like the opening of a wuxia (武侠 martial arts) tale.
In wuxia, heroes often disappear into crowds before they act. So “eight people enter a crowd” already makes me curious: who are they? what happens next?
Thanks for sharing!
-
@noone2333 thanks. Do I need a 个 or something on the 八?
-
@mcc
This is the problem with Han unification; we're partway back to code pages and picking the right font to render a particular language.
Like telling Danes and Swedes that ä and æ are the same character and so we'll just make them the same in Unicode.
@jannem Mmm, not sure about that. In my experience, “text encoding” and “language” are 2 orthogonal axes, and proper text handling requires you to know both.
This is one of the minor annoyances of Mastodon — it doesn't seem to be possible to mark parts of a post as being in different languages.
I don't have a huge problem with Han unification. I think it's a valid technical decision.
-
@krans @mcc
The bigger problem is that on the web and in apps there's usually no information on what language something is written in. Which means a browser or an app can only guess what font to render Unicode Han characters in. And when a user has installed support for more than one language, it is certain to go wrong frequently.
Edit: you don't need to know the language to always render "ä" correctly. You do need to know the language in order to render "骨".
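To make the ä / æ versus 骨 contrast above concrete, here is a small Python check that uses only the standard unicodedata module. It is just a sketch (the describe helper is a name invented here): ä and æ come out as two separate code points, while 骨 has one unified code point whichever regional glyph you expect, so the choice of shape falls to the font.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of one character (illustrative helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Escapes are used so the sample characters are unambiguous:
# a-diaeresis, ae ligature, and the Han character for "bone".
samples = ["\u00e4", "\u00e6", "\u9aa8"]
for ch in samples:
    print(describe(ch))
```

Nothing in the code point itself says "Chinese" or "Japanese", which is why the language has to arrive some other way.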
-
@jannem I agree. The root cause is that file formats, protocols and most programs are written almost entirely by English-speakers, who assume that only English-speaking people use computers and that all content will be in English.
For my entire lifetime, support for multilingual text has always been an afterthought — and many development frameworks make it incredibly difficult.
-
Update: I solved the problem, not by adding Chinese as an alternate language for my Android, but by deleting Japanese as an alternate language. Not sure when I did this or what I was trying to accomplish but I question Google's decision that informing it I may look at text in Japanese makes it conclude I DEFINITELY won't be looking at Chinese!
@mcc unfortunately there’s not really a good solution to this problem and Android, like everyone else, just has to pick a resolution method and stick with it. If you’ve heard of “Han Unification,” well it sounds like something that happened violently in 2200 BC but actually it happened quite recently in a Unicode meeting room and it causes this exact specific intractable issue
-
@noone2333 thanks. Do I need a 个 or something on the 八?
@mcc You can just say 八人.
个 isn't needed here. It’s cleaner and more natural without it, especially in short, poetic or title-like phrases.
-
In Pleco they look like this. I don't know if this is a different but regular hanzi font or if the CJK unification is messing me up somehow
EDIT: I currently think Tusky is showing me Japanese character variants https://social.mildlyfunctional.gay/@artemist/116146010272716935
@mcc i once got homework graded as incorrect because the japanese dictionary website i used did not use "lang" html attributes and firefox ended up selecting a korean font
-
This is what Tusky looks like.
@mcc I wonder if android has an API to indicate what language a text field is in? Phanpy web (iOS) handles the character variation just fine and I wonder if it’s because browsers let you set languages for text + it’s using the annotated post language?
-
@mcc it seems like it is actually using the declared language on my end because if I switch the post language here to Japanese I see the Japanese variants of the characters, and if I switch it back to Chinese I see the Chinese variants of the characters.
Test post marked as Japanese: 八人入
-
@Heliograph @rk The 个 is a friend that you give to a number so that it does not get lonely
@mcc @Heliograph @rk my mind went immediately to Knuth up-arrow, which gives numbers lots of friends