Wait hold on I just realized. Is
八人入
A reasonable Chinese sentence
@mcc “八人入” isn’t wrong, but it feels incomplete — like “eight people enter...” and then you wait for the rest.
“八人入众” is more satisfying. It means “eight people enter a crowd” — a complete image, and a great start for a story.
To me, it feels like the opening of a wuxia (武侠 martial arts) tale.
In wuxia, heroes often disappear into crowds before they act. So “eight people enter a crowd” already makes me curious: who are they? What happens next? Thanks for sharing!
-
@noone2333 thanks. Do I need a 个 or something on the 八?
-
@mcc
This is the problem with Han unification; we're partway back to code pages and picking the right font to render a particular language. Like telling Danes and Swedes that ä and æ are the same character and so we'll just make them the same in Unicode.
@jannem Mmm, not sure about that. In my experience, “text encoding” and “language” are 2 orthogonal axes, and proper text handling requires you to know both.
This is one of the minor annoyances of Mastodon — it doesn't seem to be possible to mark parts of a post as being in different languages.
I don't have a huge problem with Han unification. I think it's a valid technical decision.
-
@krans @mcc
The bigger problem is that on the web and in apps there's usually no information about what language something is written in, which means a browser or an app can only guess which font to use for Unicode Han characters. And when a user has installed support for more than one language, it is certain to go wrong frequently.
Edit: you don't need to know the language to always render "ä" correctly. You do need to know the language in order to render "骨".
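A minimal Python sketch of the point above, using only the standard `unicodedata` module. The helper name `describe` is made up for illustration. It shows that "ä" and "æ" are separate code points, while "骨" is a single unified code point that carries no information about whether the text is Chinese or Japanese, so the choice of glyph has to come from somewhere else.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Escapes are used so the exact code points are unambiguous.
for ch in ("\u00e4", "\u00e6", "\u9aa8"):  # ä, æ, 骨
    print(ch, describe(ch))
```

The string itself looks identical whether the post was written in Chinese or Japanese; only out-of-band metadata (a declared language, a locale setting) can tell a renderer which regional glyph to pick.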
-
@jannem I agree. The root cause is that file formats, protocols and most programs are written almost entirely by English-speakers, who assume that only English-speaking people use computers and that all content will be in English.
For my entire lifetime, support for multilingual text has always been an afterthought — and many development frameworks make it incredibly difficult.
-
Update: I solved the problem, not by adding Chinese as an alternate language for my Android, but by deleting Japanese as an alternate language. Not sure when I did this or what I was trying to accomplish but I question Google's decision that informing it I may look at text in Japanese makes it conclude I DEFINITELY won't be looking at Chinese!
@mcc unfortunately there’s not really a good solution to this problem and Android, like everyone else, just has to pick a resolution method and stick with it. If you’ve heard of “Han Unification,” well it sounds like something that happened violently in 2200 BC but actually it happened quite recently in a Unicode meeting room and it causes this exact specific intractable issue
-
@mcc You can just say 八人.
个 isn't needed here. It’s cleaner and more natural without it, especially in short, poetic or title-like phrases.
-
In Pleco they look like this. I don't know if this is a different but regular hanzi font or if the CJK unification is messing me up somehow
EDIT: I currently think Tusky is showing me Japanese character variants https://social.mildlyfunctional.gay/@artemist/116146010272716935
@mcc i once got homework graded as incorrect because the japanese dictionary website i used did not use "lang" html attributes and firefox ended up selecting a korean font
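For anyone following along, here is a rough Python sketch of what per-segment language tagging with the HTML `lang` attribute could look like. The function name `tag_language` and the sample segments are invented for illustration; this is not how any particular dictionary site or Mastodon client does it. The tags `zh-Hans` and `ja` are standard BCP 47 language tags.

```python
from html import escape


def tag_language(segments):
    """Wrap each (lang, text) pair in a <span> carrying a lang attribute.

    Hypothetical helper: a browser can use the lang value to choose a
    region-appropriate font for Han characters instead of guessing.
    """
    return "".join(
        f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'
        for lang, text in segments
    )


# The same three code points, tagged once as Chinese and once as Japanese.
post = [("en", "Test: "), ("zh-Hans", "八人入"), ("ja", "八人入")]
print(tag_language(post))
```

With markup like this, the two copies of 八人入 are the same characters in the file, but the renderer has enough information to draw each with the right regional variant.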
-
This is what Tusky looks like.
@mcc I wonder if android has an API to indicate what language a text field is in? Phanpy web (iOS) handles the character variation just fine and I wonder if it’s because browsers let you set languages for text + it’s using the annotated post language?
-
@mcc it seems like it is actually using the declared language on my end because if I switch the post language here to Japanese I see the Japanese variants of the characters, and if I switch it back to Chinese I see the Chinese variants of the characters.
Test post marked as Japanese: 八人入
-
@Heliograph @rk The 个 is a friend that you give to a number so that it does not get lonely
@mcc @Heliograph @rk my mind went immediately to Knuth up-arrow, which gives numbers lots of friends
-
WAIT WTF this is an actual Chinese IME and it seems to be showing me Japanese characters. Ok I think Lenovo is fucking with me, one minute
@mcc do you ever feel like the time between you making a lighthearted shitpost and then uncovering a pit of writhing software horrors gets shorter every year.
-
@0xabad1dea @mcc also an act of violence, I would argue
-
@mcc I couldn’t find an Android API to do this right, and I found what seems like a reasonable iOS API but it doesn’t do what I was expecting so I’m not actually sure it’s possible to do this well except with web technologies
-
@mcc you've been on Chinese fanfiction sites I see
-
@mcc @Heliograph @rk
It's a Totoro umbrella.
-
Okay I now believe the problem is neither Tusky nor Lenovo but rather that Android is not a serious product and never has been. It seems Android may outright refuse to show scripts unless you've whitelisted the language. Problem: I think this menu is asking me which version of Chinese I want but the menu is in Chinese. I want to look at Chinese text so I can learn Chinese. I don't know it yet. I feel like I'm playing an adventure game.
* I may explore a PR later anyway.
@mcc i think the first one is japanese, the second simplified chinese.
-
@porglezomp I'm talking to someone and we think we found the android API