SoylentNews is people

posted by NCommander on Tuesday May 19 2015, @10:00AM
from the translators-will-be-wanted dept.

As the dev cycle for the first release of rehash winds down (one remaining bug before we're ready to deploy), I've turned my attention to another large site-related project: the possibility of internationalizing and localizing the site. Internationalizing a very large legacy codebase is a lot of work, and I want to gauge interest. For clarification, this work would translate the UI, and the site interface itself, and not the articles. Historically, I'm aware of two successful translation projects, BarraPunto and Slashdot.JP. However, both of those projects did so by simply rewriting the templates in their native language instead of using a more flexible system that would allow for dynamic processing.

After a few hours of cursing, I was successful in modifying rehash to use Locale::Maketext::Lexicon and load translations dynamically from gettext when a template is loaded. The practical upshot is that if I continue with this effort, we will be able to import all the static strings in rehash and translate them through any service that can handle standard gettext POT files (such as Launchpad Rosetta), then have rehash load the specific language on the fly depending on a user's settings or their browser's preferences. This also has the benefit that translators would need only a minimal amount of HTML knowledge to successfully translate rehash.
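
To illustrate the "user's settings or browser preferences" selection described above, here is a minimal sketch in Python (not rehash's Perl, and not its actual code). The function names and the idea of passing in a set of available languages are assumptions made for the example; the browser preference is assumed to arrive as a standard HTTP Accept-Language header.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    choices = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position.
            choices.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(choices)]


def pick_language(user_setting, accept_language, available, default="en"):
    """Prefer the explicit user setting, then the browser's list, then the default."""
    candidates = [user_setting.lower()] if user_setting else []
    candidates += parse_accept_language(accept_language)
    for tag in candidates:
        if tag in available:
            return tag
        base = tag.split("-")[0]  # e.g. "es-mx" falls back to "es"
        if base in available:
            return base
    return default


if __name__ == "__main__":
    print(pick_language(None, "es-MX,es;q=0.9,en;q=0.8", {"en", "es"}))
```

An explicit setting is checked first so that a logged-in user's choice would override whatever the browser sends; anything unmatched falls back to English.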

What I want to know is the following:

  • Is there significant interest to prioritize this project?
  • Are there enough people who are willing to engage in translating rehash to their native language?
  • Assuming there is a successful translation committed and launched, would people be interested in running a version of SN in their own native language?

As long as the first two answers are yes, I'll push to get a localized codebase in place for the next rehash development cycle (likely landing in July or August). Please note there are a *lot* of strings to be translated; I'm expecting upwards of a couple thousand once I've finished scrubbing through all the templates and libraries. If you are interested in this, please note your native language below, and I'll get in contact with you once we're ready to start doing translation work.

Just to show that this is indeed possible, here's the output of the generate-pot-files script, with the few templates I've gone through:

slashlithium~/src/rehash-ncommander$ bin/generate-pot-files
 * themes/default/templates/about;about;default
   - Total strings extracted : 21
 * themes/default/templates/admin;menu;default
   - Total strings extracted : 7
 * themes/default/templates/articlemoved;misc;default
   - Total strings extracted : 3
 * themes/default/templates/bannedtext_ipid;misc;default
   - Total strings extracted : 2
 * themes/default/templates/bannedtext_palm;misc;default
   - Total strings extracted : 2
 * themes/default/templates/main;404;default
   - Total strings extracted : 11
READING PO FILE : i18n/rehash.pot
WRITING PO FILE : ./i18n/rehash.pot
DONE
 
  • (Score: 1, Informative) by Anonymous Coward on Tuesday May 19 2015, @10:31AM

    by Anonymous Coward on Tuesday May 19 2015, @10:31AM (#185004)

    this work would translate the UI, and the site interface itself, and not the articles

    if the articles aren't translated also, there's probably not much point

    no sense being able to read the menu but not the content

    translating menu and content otoh... (adds a whole new level of editing difficulty but may broaden the readership potential)

    also, probably tough getting a decent number of volunteers that do perl

  • (Score: 4, Interesting) by NCommander on Tuesday May 19 2015, @10:34AM

    by NCommander (2) Subscriber Badge <michael@casadevall.pro> on Tuesday May 19 2015, @10:34AM (#185006) Homepage Journal

    It's not perl. The translations file looks like this:

    #. (origin)
    #: themes/default/templates/main;404;default:22
    #, perl-maketext-format
    msgid "The requested URL (%1) was not found."
    msgstr ""

    Basically fill in the blank. In some places, because the layout of the page may have to change, there are HTML tags and such but no raw perl to manage.
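
As a rough illustration of the "fill in the blank" idea, the sketch below (Python, not the Perl that rehash uses) reads single-line msgid/msgstr pairs like the entry above and substitutes the %1 placeholder. The Spanish msgstr is made up for the example, the function names are hypothetical, and real PO files also have multi-line strings, escapes, and plural entries that this does not handle.

```python
import re

# The entry from the comment above, with an illustrative (invented) translation.
PO_TEXT = '''
#. (origin)
#: themes/default/templates/main;404;default:22
#, perl-maketext-format
msgid "The requested URL (%1) was not found."
msgstr "La URL solicitada (%1) no existe."
'''


def load_catalog(po_text):
    """Collect single-line msgid/msgstr pairs; comment lines (#...) are ignored."""
    catalog, msgid = {}, None
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith("msgid "):
            msgid = line.split(" ", 1)[1][1:-1]  # drop the surrounding quotes
        elif line.startswith("msgstr ") and msgid is not None:
            catalog[msgid] = line.split(" ", 1)[1][1:-1]
            msgid = None
    return catalog


def translate(catalog, msgid, *args):
    """Look up msgid and replace %1, %2, ... with the positional arguments."""
    template = catalog.get(msgid) or msgid  # an empty msgstr falls back to the original
    return re.sub(r"%(\d+)", lambda m: str(args[int(m.group(1)) - 1]), template)


if __name__ == "__main__":
    catalog = load_catalog(PO_TEXT)
    print(translate(catalog, "The requested URL (%1) was not found.", "/nope"))
```

Falling back to the msgid when msgstr is empty means a partially translated catalog still renders, just with some strings left in English.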

    I won't mind having a non-English version of SN, but translating the UI would be a prelude to that, as well as recruiting a new editorial team to be able to manage it. I only speak English, a little Spanish, and bad English.

    --
    Still always moving
    • (Score: 0) by Anonymous Coward on Tuesday May 19 2015, @10:47AM

      by Anonymous Coward on Tuesday May 19 2015, @10:47AM (#185011)

      Just refactor the code so a different language template can be sub'ed if/when desired. Seems like you've already done that. And build a MexPanish template for testing.

      For now, though, like the AC noted, don't see a point displaying vato menu with gringo articles.

    • (Score: 4, Interesting) by martyb on Tuesday May 19 2015, @01:11PM

      by martyb (76) Subscriber Badge on Tuesday May 19 2015, @01:11PM (#185048) Journal

      A few things come to mind.

      First, I sense that our community is not sufficiently large at this moment to warrant this. Down the road, though, I think this is a *great* idea. If we provide alternate 'versions' of the site in other languages, I believe any given story will suffer. It's the 'Network Effect' and I would rather see our existing community grow prior to embarking on providing alternate language implementations.

      Second, this is *great* for proofreading the existing UI! I can't wait to see all the extracted text strings and pore over them with a fine-toothed comb (heh. translate THAT) to identify, for example, inconsistent capitalization and spacing on Button elements.

      Third, there are some things about internationalization that defy simple substitution. I read a great article a few months ago but cannot locate it now. It was a (possibly fictional) account of trying to internationalize messages into Russian (Polish?). As I recall, the example was something mundane like returning the number of files found for a search. Simple parameterization was inadequate.

      "I found %1 file(s)"

      looks ugly when the substituted parameter has the value "1". So, the developer adds a test of whether the count is singular or not.

      if ($count == 1) { printf "I found 1 file\n" } else { printf "I found %d files\n", $count }

      Except, that in that language, simple translation like that fails (in English think enumeration like First, Second, Third, Fourth, Fifth instead of One, Two, Three, Four, Five.) I *really* wish I could locate the original article — this by no means does it justice. Hopefully I've provided enough info that it will jog someone's memory and they can reply with a link.

      Further, the amount of space required on the screen to represent the *concept* in each message may be very different. This potentially leads to a cascade of problems, especially on small-screen devices. Take a look at some of the 'slashboxes' and I'd be surprised if all of those entries, after translation, will still fit in the limited space provided. So now you need to increase the size of those boxes to accommodate the largest message, so now you have a lot of white space with short text entries, and ... you get the point.
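
One crude way to spot the strings most likely to overflow a fixed-width box is to compare the length of each translation with its English source. The sketch below is only an illustration: the function name, the 1.3 threshold, and the sample German strings are all assumptions, and character count is a rough stand-in for rendered width.

```python
def expansion_report(catalog, max_ratio=1.3):
    """Return (msgid, msgstr, ratio) for translations noticeably longer than the source."""
    flagged = []
    for msgid, msgstr in catalog.items():
        if not msgid or not msgstr:
            continue  # skip untranslated entries
        ratio = len(msgstr) / len(msgid)
        if ratio > max_ratio:
            flagged.append((msgid, msgstr, round(ratio, 2)))
    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical UI strings with illustrative German translations.
    sample = {
        "Submit Story": "Artikel einreichen",
        "Search": "Suche",
        "Log In": "Anmelden",
    }
    for msgid, msgstr, ratio in expansion_report(sample):
        print(f"{ratio:>5}  {msgid!r} -> {msgstr!r}")
```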

      tl;dr: short-term, knock your socks off but leave the strings in English. There's MUCH more to this than meets the eye. Here be dragons.

      --
      Wit is intellect, dancing.
      • (Score: 2) by M. Baranczak on Tuesday May 19 2015, @02:03PM

        by M. Baranczak (1673) on Tuesday May 19 2015, @02:03PM (#185060)

        I vaguely remember that article too. The core problem is that a lot of languages have more than one plural form for nouns, depending on the number. This includes the entire Slavic and Baltic families, Arabic, Irish, Icelandic and a couple others. More information:

        https://developer.mozilla.org/en-US/docs/Mozilla/Localization/Localization_and_Plurals [mozilla.org]
        http://localization-guide.readthedocs.org/en/latest/l10n/pluralforms.html [readthedocs.org]
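
The plural problem described above can be sketched in a few lines of Python. The Russian rule below mirrors the Plural-Forms expression commonly used for Russian in gettext catalogs (three forms); the function names and the tiny table of noun forms are assumptions made for the example and have nothing to do with rehash's code.

```python
def english_plural_index(n):
    """English has two forms: singular for exactly 1, plural otherwise."""
    return 0 if n == 1 else 1


def russian_plural_index(n):
    """Russian has three forms, chosen by the last one or two digits."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... (but not 11)
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ... (but not 12-14)
    return 2      # 0, 5-20, 25-30, ...


# Noun forms for "file", indexed by the value each rule returns.
RULES = {
    "en": (english_plural_index, ["file", "files"]),
    "ru": (russian_plural_index, ["файл", "файла", "файлов"]),
}


def count_phrase(lang, n):
    """Return the count followed by the correct noun form for that language."""
    index_for, forms = RULES[lang]
    return f"{n} {forms[index_for(n)]}"


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21, 22):
        print(count_phrase("en", n), "|", count_phrase("ru", n))
```

A single `count == 1` test cannot express this, which is why gettext lets each catalog declare its own number of plural forms and the rule for choosing among them.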

        • (Score: 2) by janrinok on Tuesday May 19 2015, @02:14PM

          by janrinok (52) Subscriber Badge on Tuesday May 19 2015, @02:14PM (#185066) Journal
          Having spent several years learning and using Russian - I can vouch for this! Translations are often not simple word substitutions. Grammar can also be affected by the number - using accusative case, genitive singular or genitive plural depending on the last digit of a larger number.
          • (Score: 2) by ticho on Tuesday May 19 2015, @02:49PM

            by ticho (89) on Tuesday May 19 2015, @02:49PM (#185076) Homepage Journal

            Fortunately, gettext handles this quite gracefully, if (and that's a big if) the developer does his job properly.

          • (Score: 2) by frojack on Tuesday May 19 2015, @05:51PM

            by frojack (1554) on Tuesday May 19 2015, @05:51PM (#185147) Journal

            Having spent several years learning and using Russian

            So give us an opinion then: How does google translate do on the SN home page: https://goo.gl/AbvqMm [goo.gl]
            Clearly not everything is likely to translate, but I've read the reverse translations and they are usable.

            --
            No, you are mistaken. I've always had this sig.
            • (Score: 3, Insightful) by janrinok on Tuesday May 19 2015, @09:24PM

              by janrinok (52) Subscriber Badge on Tuesday May 19 2015, @09:24PM (#185219) Journal

              I didn't realise that anyone was suggesting that each page that needs to be translated on the fly is going to be submitted to Google for translation.

              But the problem remains that where in English there are usually only two visually different forms of a word - singular and plural - there exist many different combinations in other languages. Providing a one-to-one translation is not always simple. A practical example: the front page shows how many messages are waiting for you to read. In English it is either (1) message or (more than 1) messages. In Slavonic languages it is not uncommon for 1, 2-4, and 5+ to require different forms (cases) of the word 'message(s)'. In English you can also correctly say 'with 5 messages', but in Slavonic languages the word 'with' requires the instrumental case - and the word 'messages' is written differently than previously stated. There are many such characteristics that further complicate the translation.

              The task is do-able, but one mustn't underestimate the size of the task that NCommander is proposing. It would be a significant effort for what appears to be, from my particular viewpoint, very little gain. FatPhil's suggestion of reducing the costs of running the site is more important than translating the front page into numerous languages - especially as you have demonstrated that Google could do the job reasonably well for those that want it. But I realise that it is not as interesting a challenge as writing a clever bit of software, unless there are some other benefits of doing the translation that are not immediately apparent to us all?

              Do we need a front page in different languages? No. Could we use an internal messaging system (e.g. to provide feedback on submissions, to let people know when their subs expire etc.)? Yes. Could we use an improved editor for changing a submission to something suitable for the front page? Yes. Could we benefit from more automatic processing of submissions to check links, carry out routine formatting, correcting spelling etc? Yes. Could we have secure VNC-type connections to the dev system so that both trainer and trainee can see the same page when new editors have to learn how to do the job? And that is only with my Editor's hat on. Can we have a way of entering a comment and then returning to the thread at the point where the comment was inserted? Can we have different ACs indicated by the addition of a number so that we can follow who is saying what to whom in a thread? Now some of these are in the pipeline but we haven't got them yet. When we have done all these, and also the things that others have been asking for, then that is the time to start looking for 'I-wonder-if-we-could...' tasks to keep people busy.

              • (Score: 2) by frojack on Wednesday May 20 2015, @12:27AM

                by frojack (1554) on Wednesday May 20 2015, @12:27AM (#185254) Journal

                I didn't realise that anyone was suggesting that each page that needs to be translated on the fly is going to be submitted to Google for translation.

                No, that wasn't what I was suggesting either. All I mentioned it for was to point out that the site is accessible in many languages, some with better translations than others, but most of the translations are good enough to get by, and all of them less work for our much overworked staff.

                I too think there is too little gain to waste our commander's time on this.

                --
                No, you are mistaken. I've always had this sig.
                • (Score: 2) by janrinok on Wednesday May 20 2015, @08:34AM

                  by janrinok (52) Subscriber Badge on Wednesday May 20 2015, @08:34AM (#185352) Journal
                  Sorry, perhaps I misunderstood. However, we seem to be agreed that the task doesn't justify the effort at present.
      • (Score: 2) by Ryuugami on Tuesday May 19 2015, @02:11PM

        by Ryuugami (2925) on Tuesday May 19 2015, @02:11PM (#185062)

        Hah! Found it. [cpan.org]

        I remember reading that article a few years back. I thought I had it bookmarked, but apparently not... had to search for it.

        Interestingly, it seems there was a SoylentNews submission [soylentnews.org] about it a few months ago. I seem to have missed that one :)

        --
        If a shit storm's on the horizon, it's good to know far enough ahead you can at least bring along an umbrella. - D.Weber
        • (Score: 2) by martyb on Friday May 22 2015, @05:38PM

          by martyb (76) Subscriber Badge on Friday May 22 2015, @05:38PM (#186550) Journal

          No wonder it sounded familiar -- I wrote that story! Thanks for finding it and replying!

          --
          Wit is intellect, dancing.
      • (Score: 1) by Yates on Tuesday May 19 2015, @03:58PM

        by Yates (3947) on Tuesday May 19 2015, @03:58PM (#185099)
        I believe this is the article that you were looking for.

        A Localization Horror Story: It Could Happen To You [cpan.org]
        • (Score: 1) by martyb on Friday May 22 2015, @09:36PM

          by martyb (76) Subscriber Badge on Friday May 22 2015, @09:36PM (#186675) Journal

          Yes! That's the one -- thanks!

          --
          Wit is intellect, dancing.
    • (Score: 2) by danomac on Tuesday May 19 2015, @04:12PM

      by danomac (979) on Tuesday May 19 2015, @04:12PM (#185105)

      I won't mind having a non-English version of SN, but translating the UI would be a prelude to that, as well as recruiting a new editorial team to be able to manage it. I only speak English, a little Spanish, and bad English.

      You don't speak cursive English? I know I do from time to time, mostly when unexpected things happen.

  • (Score: 5, Interesting) by bryan on Tuesday May 19 2015, @03:57PM

    by bryan (29) <bryan@pipedot.org> on Tuesday May 19 2015, @03:57PM (#185098) Homepage Journal

    I implemented machine translated content (stories and comments) on Pipedot [pipedot.org] a few months ago. The idea was that you could type in any language you want, and the site will automatically translate your comment [pipedot.org] into English. Alternatively, you can change your display language (in your profile settings) to view the entire site in your chosen language.

    This functionality actually started out as an April Fools joke. On April 1st [pipedot.org], I set the default language of the site to Esperanto [wikipedia.org] - the universal language! Of course, I don't think it's seen much of any use or interest since then.

  • (Score: 2) by jmoschner on Tuesday May 19 2015, @10:17PM

    by jmoschner (3296) on Tuesday May 19 2015, @10:17PM (#185231)

    I don't think it is worth it at this time. Perhaps once the user base has grown.

    Maybe as a compromise consider putting the bing [bing.com] or google [google.com] translation widget into each page so people can translate if they so wish without it taking up much dev time. Don't know, just an idea.