These forums have been archived and are now read-only.

The new forums are live and can be found at https://forums.eveonline.com/


New dev blog: Introducing Cerberus

ZaBob
The Scope
Gallente Federation
#81 - 2011-11-27 06:07:00 UTC
GateScout wrote:
Rakshasa Taisab wrote:
Quote:
pronunciation (in Japanese, used for sorting)


WTF do you mean by 'sorting'?

I have never come across that term when it comes to Japanese grammar.

I had a long post about this, but the forums just ate it. Go to Google and search for this: "Sorting in Japanese — An Unsolved Problem". Read that blog post and you'll understand.

If you're really interested in this topic, continue reading: http://www.localizingjapan.com


That's an interesting blog post, but I take a bit of issue with "unsolved problem". A better characterization would be "a problem without a commonly agreed-upon solution".

There is a very clear and distinct sorting order used in Japanese dictionaries. Well, there's more than one variation of it, but it's something that Japanese kids learn growing up.

Rather than describe it at length, let me point out that the key point in dictionary ordering is that it is NOT based on pronunciation for the kanji ideographs (it is, for kana). Rather, it is based on a key structural portion of each character (its radical) and on stroke count.
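For illustration, here is a minimal sketch of ordering by radical and then stroke count. The small table is hand-entered for this example (Kangxi radical numbers and total stroke counts for eight common characters), and the names KANJI_INFO and radical_stroke_key are made up here. A real dictionary uses a full index and further tie-breaking rules.

```python
# Hand-entered illustrative data, not a real dictionary index:
# character -> (Kangxi radical number, total stroke count).
KANJI_INFO = {
    "人": (9, 2), "休": (9, 6),
    "日": (72, 4), "明": (72, 8), "時": (72, 10),
    "木": (75, 4), "林": (75, 8), "森": (75, 12),
}

def radical_stroke_key(kanji):
    """Sort key: group by radical first, then order by stroke count."""
    radical, strokes = KANJI_INFO[kanji]
    return (radical, strokes)

shuffled = ["森", "明", "休", "木", "時", "人", "林", "日"]
ordered = sorted(shuffled, key=radical_stroke_key)
print(ordered)
```

Characters sharing a radical end up next to each other, and within each group the simpler characters come first.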

In fact, sorting international text is a pretty well-understood problem. That doesn't mean well agreed-upon -- but there IS a standard framework: the Unicode Collation Algorithm (Unicode Technical Standard #10), which goes with the Unicode 6.0 standard. http://www.unicode.org/reports/tr10/

The key is that you don't sort on character codes, either, and the exact desired collation order will depend on locale.

If you read the link above, you'll see you don't even sort on any simple mapping of character codes, but rather on "collation elements". That's because other languages have things like accent marks (of varying language-specific impact on sorting and character identity), combining forms, etc.
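To make that concrete, here is a toy sketch of a multi-level sort key next to a plain code point sort. This is NOT the Unicode Collation Algorithm, only a rough imitation of its idea of comparing base letters first, then accents, then case. The function name toy_collation_key and the word list are made up for the example, and a real application would use a proper collation library with locale tailoring.

```python
import unicodedata

def toy_collation_key(text):
    """Three-level key: base letters, then accents, then case (toy model only)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Level 1: drop combining marks and case, leaving only the base letters.
    primary = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    # Level 2: accents count, case still does not.
    secondary = decomposed.casefold()
    # Level 3: the original text breaks any remaining ties (case).
    return (primary, secondary, text)

words = ["zebra", "Eclair", "\u00e9clair", "apple", "eclair", "Zebra"]

by_code_point = sorted(words)
by_toy_key = sorted(words, key=toy_collation_key)

print(by_code_point)
print(by_toy_key)
```

The code point sort puts every capitalized word ahead of every lowercase one and pushes the accented word to the very end. The keyed sort keeps the three spellings of "eclair" together between "apple" and the two zebras.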

The bottom line is, while we may not have universal agreement on any particular collation sequence, sorting to a desired collation sequence IS a solved problem, and we pretty much understand where the disagreements lie.

We humans have evolved an amazing variety of ways of recording our thoughts on paper. And sorting is so important that even when we have tens of thousands of characters, we find ways to sort them.
ZaBob
The Scope
Gallente Federation
#82 - 2011-11-27 06:16:17 UTC
And finally -- the reason I visited this thread in the first place:

CCP Shiny and crew: Congratulations on diving into an endlessly complex and challenging problem.

It looks to me like you're pushing this further than usual: from plain "localized text" a bit into the territory of "natural language generation".

For the people wondering why not just use gettext -- gettext is pretty limited. It's good enough for what it's intended for, but when you start trying to compose complex explanations, that starts to break down.
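A rough sketch of the kind of trouble that shows up once messages are composed from parts: gluing fragments together bakes in one language's word order, while a full-sentence template with named placeholders lets each language place the parts where it needs them. This does not use gettext or anything from CCP's system. The catalog TEMPLATES, both function names, and the Japanese sentence are illustrative assumptions only.

```python
# Hypothetical message catalog: one full-sentence template per language.
TEMPLATES = {
    "en": "{pilot} destroyed {count} ships in {system}.",
    "ja": "{pilot}は{system}で艦船を{count}隻破壊しました。",
}

def render_fragments(pilot, count, system):
    """Fragment concatenation: the English word order is fixed in code."""
    return pilot + " destroyed " + str(count) + " ships in " + system + "."

def render_template(lang, pilot, count, system):
    """Template lookup: each language decides where the placeholders go."""
    return TEMPLATES[lang].format(pilot=pilot, count=count, system=system)

english = render_template("en", "ZaBob", 3, "Jita")
japanese = render_template("ja", "ZaBob", 3, "Jita")
print(english)
print(japanese)
```

In the Japanese template the system name comes before the count, which the fragment version cannot express without language-specific code. Plurals, counters and grammatical agreement pile more of the same on top.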

CCP deserves credit for trying to push things a bit further toward true internationalization.

Whether it was worth the effort or not -- aye, that's always the rub. You always find internationalization to be harder than you think it will be going in.
Ishtanchuk Fazmarai
#83 - 2011-11-27 11:14:12 UTC
ZaBob wrote:
(lengthy explanation, several posts long, about the intricacies of the Japanese language)


This is why they signed an agreement with a giant Korean corporation (NEXON) to market EVE Online to Japanese customers in Japanese... Lol

Translating is a very complex issue because every language represents a mindset: for instance, which things have a name and which things do not...

A quick instance: in English there is that word, "tonight", which does not translate into Spanish (the closest is "esta noche", which can be either last night or tonight), whereas in Spanish there is another word, "anoche", which does not translate to English as it means "last night".

So in English the upcoming night has got a name but not the last night, whereas in Spanish the last night has got a name but the upcoming one doesn't... and we're talking about languages geographically and culturally close.

I kinda wonder what concepts that we meet in EVE are actually Icelandic words (a part of Icelandic mindset, like "dry shark is yummy") roughly adapted into a different mindset (English)...

Roses are red / Violets are blue / I am an Alpha / And so it's you

Rakshasa Taisab
Sane Industries Inc.
#84 - 2011-11-27 16:34:05 UTC
GateScout wrote:
Rakshasa Taisab wrote:
Quote:
pronunciation (in Japanese, used for sorting)


WTF do you mean by 'sorting'?

I have never come across that term when it comes to Japanese grammar.

I had a long post about this, but the forums just ate it. Go to Google and search for this: "Sorting in Japanese — An Unsolved Problem". Read that blog post and you'll understand.

If you're really interested in this topic, continue reading: http://www.localizingjapan.com

Actually I know perfectly well about Japanese character encoding and lexical sorting, as both a coder and a Japanese speaker.

What threw me off was the use of 'pronunciation', when the more commonly used term is the 'reading' of a kanji or compound. The two main types of readings of a kanji character are called onyomi and kunyomi (-yomi means reading): the Sino-Japanese reading and the native reading, respectively.
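As a small sketch of why a reading field is useful for sorting: below, four common surnames are sorted once by their written form and once by a kana reading stored alongside. The NAMES list is hand-entered for the example, and comparing hiragana by code point is only a rough stand-in for proper kana ordering (it ignores the finer rules for voiced marks and small kana).

```python
# Hand-entered illustrative data: (written form, kana reading).
NAMES = [
    ("田中", "たなか"),
    ("鈴木", "すずき"),
    ("高橋", "たかはし"),
    ("佐藤", "さとう"),
]

# Sorting the written forms directly compares kanji code points.
by_code_point = sorted(written for written, _ in NAMES)

# Sorting on the reading field gives an order a reader can predict.
by_reading = [
    written for written, reading in sorted(NAMES, key=lambda pair: pair[1])
]

print(by_code_point)
print(by_reading)
```

The two orders differ: by reading, Tanaka lands after Takahashi, while the code point sort places it second.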

Nyan