
SoylentNews is people

posted by takyon on Monday August 24 2015, @06:30PM
from the x1F4A9 dept.

The candidate list for Unicode 9 is taking shape, with the final list of new emojis scheduled for approval in mid-2016.

38 emoji characters have been accepted as candidates for the 2016 Unicode update, including Face Palm, Selfie, Shrug, Fingers Crossed, and Pregnant Woman.



  • (Score: 2) by kurenai.tsubasa (5227) on Tuesday August 25 2015, @05:23PM (#227684)

    Woah, there! I suppose UTF-16 makes sense when encoding text that's primarily in an East Asian language. Then there's that thing where the Windows world was all about UCS-2 (fixed-width 16-bit) for a while. Does anybody actually use UTF-32?
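For anyone who wants to check the size argument, here is a minimal sketch, assuming Python 3. The helper name `encoded_sizes` and the sample strings are made up for illustration. The `-le` codec variants are used so that no byte order mark gets counted.

```python
# Compare how many bytes the same text needs in each Unicode encoding.
# The "-le" codecs are chosen so that no BOM is prepended to the output.
def encoded_sizes(text: str) -> dict:
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}

for sample in ("hello", "日本語"):
    print(sample, encoded_sizes(sample))
```

The Japanese sample comes out smaller in UTF-16 than in UTF-8, while the ASCII sample doubles in size; UTF-32 is the largest in both cases.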

    Personally, I'm very happy with UTF-8. Almost everything is encoded as one byte, with the occasional two- or three-byte encoding when I go for punctuation that's not included in ISO-8859-* or if for whatever reason I need to enter an astrological sign. Did you know your entire post is valid UTF-8 (and valid in anything else that's backwards compatible with good ol' 7-bit ASCII)?
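A quick sketch of the byte counts, assuming Python 3. The helper `utf8_len` is a hypothetical name, and the sample characters are just examples picked to hit each length.

```python
# Number of bytes a single character occupies when encoded as UTF-8.
def utf8_len(ch: str) -> int:
    return len(ch.encode("utf-8"))

samples = {
    "A": "A",                  # plain ASCII letter
    "e-acute": "\u00e9",       # é, in ISO-8859-1 but outside ASCII
    "Aries": "\u2648",         # ♈, an astrological sign
    "poo": "\U0001F4A9",       # 💩, the emoji from the dept. line
}
for name, ch in samples.items():
    print(name, utf8_len(ch))

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_text = "good ol' 7-bit ASCII"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
```

The emoji is the only one of these that needs four bytes, since it lives outside the Basic Multilingual Plane.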

    I admit I'm not certain why there needs to be emoji.

    The ISO 8859-* character sets were entirely sufficient to deal with every non-asian language and perhaps we could have just told them, sorry your written language is simply incompatible with data processing.

    Just make sure you remember whether it was ISO-8859-1 or ISO-8859-5! Is that character supposed to be þ or ў? I suppose trial and error would reveal which variant it was.
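To make the trial and error concrete, a small sketch assuming Python 3 and its built-in codecs. The variable names are illustrative only.

```python
# The same single byte means different things depending on the charset label.
raw = bytes([0xFE])

as_latin1 = raw.decode("iso-8859-1")    # Western European reading
as_cyrillic = raw.decode("iso-8859-5")  # Cyrillic reading

print(repr(as_latin1), repr(as_cyrillic))
```

Nothing in the byte itself says which reading was intended, which is the whole problem.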

    From Wikipedia: [wikipedia.org]

    ISO/IEC 8859-1 is missing some characters for French and Finnish text and the euro sign. In order to provide some of these characters, ISO/IEC 8859-15 was developed as an update of ISO/IEC 8859-1. This required, however, the removal of some infrequently used characters from ISO/IEC 8859-1, including fraction symbols and letter-free diacritics: ¤, ¦, ¨, ´, ¸, ¼, ½, and ¾.

    The popular Windows-1252 character set adds all the missing characters provided by ISO/IEC 8859-15, plus a number of typographic symbols, by replacing the rarely used C1 controls in the range 128 to 159 (hex 80 to 9F). It is very common to mislabel text data with the charset label ISO-8859-1, even though the data is really Windows-1252 encoded. Many web browsers and e-mail clients will interpret ISO-8859-1 control codes as Windows-1252 characters in order to accommodate such mislabeling but it is not standard behaviour and care should be taken to avoid generating these characters in ISO-8859-1 labeled content.
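The mislabeling described in that passage can be sketched as follows, assuming Python 3. The sample bytes are invented for illustration: curly quotes and a euro sign as Windows-1252 would store them.

```python
# Bytes in the 0x80-0x9F range: printable in Windows-1252, C1 controls in ISO-8859-1.
raw = b"\x93quoted\x94 costs 5\x80"

as_cp1252 = raw.decode("cp1252")      # curly quotes and a euro sign
as_latin1 = raw.decode("iso-8859-1")  # invisible control characters instead

print(repr(as_cp1252))
print(repr(as_latin1))

# ISO-8859-15 puts the euro sign where ISO-8859-1 had the currency sign.
euro_in_latin9 = "\u20ac".encode("iso-8859-15")
```

Both decodes succeed without an error, so a strictly standards-following reader of ISO-8859-1 text silently gets control codes rather than typography.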

    Ugh, the Windows-1252 character set! That doesn't even mention CP437! How will I run ZZT [wikipedia.org] with either Windows-1252 or ISO-8859-*? What a nightmare. Shift-JIS! Wingdings! When does the madness end?!

    Anyway, with ISO-8859-*, what should I do if I need a combination of characters that none of the 15 variants include? This can happen even if I stick to European languages. Speaking of ISO-8859-5, for some reason № makes it in, but © is nowhere to be found. No ¥, £, or ¢, either. Eh, probably only communists use ISO-8859-5.
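Here is a rough way to check which parts can hold a given string, assuming Python 3 and its bundled `iso8859_N` codecs. The helpers `can_encode` and `encodable_parts` are hypothetical names, not anything standard.

```python
# Which ISO-8859 parts can represent a given piece of text?
# Part 12 was never published, so it is skipped.
ISO_8859_PARTS = [n for n in range(1, 17) if n != 12]

def can_encode(text: str, encoding: str) -> bool:
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        # LookupError covers a codec missing from this Python build.
        return False

def encodable_parts(text: str) -> list:
    return [n for n in ISO_8859_PARTS if can_encode(text, f"iso8859_{n}")]

print(encodable_parts("\u2116"))        # № (numero sign)
print(encodable_parts("\u00a9"))        # © (copyright sign)
print(encodable_parts("\u00fe\u045e"))  # þ and ў together
```

A string mixing þ and ў fits in none of the parts, while UTF-8 takes it without complaint.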

    I seriously doubt that 1 GB of RAM is considered too little these days because of Unicode, and I am certain that assigning codepoints to emoji (which may or may not ever be implemented in your font of choice) won't cause the demise of civilization.
