Wait hold on I just realized.
-
@jonathankoren o-0 a "G" or two? @mcc @rk
@Heliograph @mcc @rk
Narrator: They brought home zero ges of companions
-
Update: I solved the problem, not by adding Chinese as an alternate language for my Android, but by deleting Japanese as an alternate language. Not sure when I did this or what I was trying to accomplish but I question Google's decision that informing it I may look at text in Japanese makes it conclude I DEFINITELY won't be looking at Chinese!
@mcc I have to wonder if this is downstream of Unicode's choices around CJK unification. Because I seem to remember reading that it ended up causing some situations where, in order to correctly render a block of text, you need out-of-band knowledge of which language it's in.
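For anyone following along, here is a minimal Python sketch of what "out-of-band knowledge" means here. It only uses the standard library; the variable names are mine. The point is that the three characters from the original post are one code point each, so the stored text is byte-for-byte identical whether the author meant Chinese or Japanese.

```python
import unicodedata

text = "八人入"

# One code point per character, all from the unified CJK block.
# There is no separate "Japanese 入" and "Chinese 入" to choose between.
code_points = [f"U+{ord(ch):04X}" for ch in text]
names = [unicodedata.name(ch) for ch in text]

# The encoded bytes carry no language information either.
utf8_hex = text.encode("utf-8").hex(" ")

for cp, name in zip(code_points, names):
    print(cp, name)
print(utf8_hex)
```

Which glyph shape you see is decided later, by whatever font the renderer picks.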
-
@ehashman I have found an absolutely BAFFLING intersection of Android features and I'd wonder if it's just this tablet is messed up but I found people on Stack Overflow having the same problem with the same fix
@mcc I'm so glad you found a fix!
-
Wait hold on I just realized. Is
八人入
a reasonable Chinese sentence?
@mcc “八人入” isn’t wrong, but it feels incomplete — like “eight people enter...” and then you wait for the rest.
“八人入众” is more satisfying. It means “eight people enter a crowd” — a complete image, and a great start for a story.
To me, it feels like the opening of a wuxia (武侠 martial arts) tale.
In wuxia, heroes often disappear into crowds before they act. So “eight people enter a crowd” already makes me curious: who are they? What happens next? Thanks for sharing!
-
@noone2333 thanks. Do I need a 个 or something on the 八?
-
@mcc
This is the problem with Han unification: we're partway back to code pages and picking the right font to render a particular language. It's like telling Danes and Swedes that ä and æ are the same character, so we'll just make them the same in Unicode.
@jannem Mmm, not sure about that. In my experience, “text encoding” and “language” are 2 orthogonal axes, and proper text handling requires you to know both.
This is one of the minor annoyances of Mastodon — it doesn't seem to be possible to mark parts of a post as being in different languages.
I don't have a huge problem with Han unification. I think it's a valid technical decision.
-
@krans @mcc
The bigger problem is that on the web and in apps there's usually no information about what language something is written in. That means a browser or an app can only guess which font to use for Unicode Han characters, and when a user has installed support for more than one language, the guess is certain to go wrong frequently. Edit: you don't need to know the language to always render "ä" correctly. You do need to know the language in order to render "骨".
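A rough sketch of that guessing, in Python. This is not Android's or any browser's actual algorithm; `pick_cjk_font`, the lookup table, and the default are all hypothetical. Only the Noto Sans CJK family names are real. It just shows why an undeclared language forces the renderer to fall back on the user's language list.

```python
from typing import Optional, Sequence

# Regional builds of one CJK font family, keyed by language tag.
# The family names are real; the lookup itself is an illustration.
CJK_FONTS = {
    "ja": "Noto Sans CJK JP",
    "ko": "Noto Sans CJK KR",
    "zh-Hans": "Noto Sans CJK SC",
    "zh-Hant": "Noto Sans CJK TC",
}

# Arbitrary choice for this sketch, not a claim about any real system.
DEFAULT_LANG = "zh-Hans"


def pick_cjk_font(text_lang: Optional[str], user_langs: Sequence[str]) -> str:
    """Choose a font for Han characters (hypothetical helper)."""
    # 1. If the text declares its language, trust the declaration.
    if text_lang in CJK_FONTS:
        return CJK_FONTS[text_lang]
    # 2. Otherwise guess from the user's language list; first CJK entry wins.
    for lang in user_langs:
        if lang in CJK_FONTS:
            return CJK_FONTS[lang]
    # 3. Nothing to go on at all: fall back to the arbitrary default.
    return CJK_FONTS[DEFAULT_LANG]


# Untagged Chinese text, user has Japanese in their language list:
print(pick_cjk_font(None, ["en", "ja"]))
# Same text, but this time it declares its language:
print(pick_cjk_font("zh-Hans", ["en", "ja"]))
```

Under these assumptions the first call picks the Japanese build even though the text is Chinese, which is the kind of wrong guess described above. A declared language short-circuits the guess entirely.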
-
@jannem I agree. The root cause is that file formats, protocols and most programs are written almost entirely by English-speakers, who assume that only English-speaking people use computers and that all content will be in English.
For my entire lifetime, support for multilingual text has always been an afterthought — and many development frameworks make it incredibly difficult.
-
@mcc unfortunately there’s not really a good solution to this problem and Android, like everyone else, just has to pick a resolution method and stick with it. If you’ve heard of “Han Unification,” well it sounds like something that happened violently in 2200 BC but actually it happened quite recently in a Unicode meeting room and it causes this exact specific intractable issue
-
@mcc You can just say 八人.
个 isn't needed here. It’s cleaner and more natural without it, especially in short, poetic or title-like phrases.
-
In Pleco they look like this. I don't know if this is a different but regular hanzi font or if the CJK unification is messing me up somehow
EDIT: I currently think Tusky is showing me Japanese character variants https://social.mildlyfunctional.gay/@artemist/116146010272716935
@mcc i once got homework graded as incorrect because the japanese dictionary website i used did not use "lang" html attributes and firefox ended up selecting a korean font
-
This is what Tusky looks like.
@mcc I wonder if android has an API to indicate what language a text field is in? Phanpy web (iOS) handles the character variation just fine and I wonder if it’s because browsers let you set languages for text + it’s using the annotated post language?
-
@mcc it seems like it is actually using the declared language on my end because if I switch the post language here to Japanese I see the Japanese variants of the characters, and if I switch it back to Chinese I see the Chinese variants of the characters.
Test post marked as Japanese: 八人入
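A small sketch of what a web client could be doing with the declared post language, assuming it simply copies the language into the standard HTML `lang` attribute. The `tag_lang` helper is hypothetical and I haven't checked how Phanpy actually does it. The characters inside the two spans are identical; only the tag differs, and the browser can use it to choose glyph variants.

```python
from html import escape


def tag_lang(text: str, lang: str) -> str:
    """Wrap text in a span carrying a language tag (hypothetical helper)."""
    # Escape both the attribute value and the content so the output is safe HTML.
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


html_ja = tag_lang("八人入", "ja")
html_zh = tag_lang("八人入", "zh-Hans")

print(html_ja)
print(html_zh)
```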
-
@Heliograph @rk The 个 is a friend that you give to a number so that it does not get lonely
@mcc @Heliograph @rk my mind went immediately to Knuth up-arrow, which gives numbers lots of friends
-
WAIT WTF this is an actual Chinese IME and it seems to be showing me Japanese characters. Ok I think Lenovo is fucking with me, one minute
@mcc do you ever feel like the time between you making a lighthearted shitpost and then uncovering a pit of writhing software horrors gets shorter every year.
-
@0xabad1dea @mcc also an act of violence, I would argue
-
@mcc I couldn’t find an Android API to do this right, and I found what seems like a reasonable iOS API but it doesn’t do what I was expecting so I’m not actually sure it’s possible to do this well except with web technologies