from the ¡sᴉɥʇ-sǝlpuɐɥ-ʍou-ǝʇᴉs-ǝɥʇ dept.
For purposes of breakage, anything that breaks the site layout or the Reply To, Parent, or Moderate buttons, or that breaks any comment beyond itself, is considered bad. We need to stop those. If you can break it (which shouldn't be hard), you earn a cookie, and I'll get you in the CREDITS file as something awesome.
For comments that are just plain unreadable, moderation will take care of them, and that isn't considered a bug. So go forth and BREAK my minions! ()}:o)↺
So, as I write this, day one has officially come to an end. I'm still somewhat in shock over it. Last night when I was editing the database to change over hostnames and such, I was thinking, man, it would be great if we got 100 regular users by tomorrow. Turns out I was wrong. By a factor of ten. Holy cow, people. I'm still in a state of disbelief, partially due to the epic turnout, but also because our very modest server hardware hasn't soiled itself from the influx (the numbers are, well, "impressive" is a way to put it). Anyway, I wanted to do a bit of a writeup of where we stand now, what works, and what doesn't. Check it out (and some raw numbers) after the break! Warning, it is a bit lengthy.
ZME Science reports on a Nature article (full article is paywalled) (DOI: 10.1038/nature18599) about a disease called disseminated neoplasia. The disease is a group of cancers which are thought to spread via seawater. They affect mussels, cockles, and golden carpet shell clams.
Among mussels and cockles, the cancer cells come from the same species, but the cancer infecting golden carpet shell clams comes from a different species, Venerupis corrugata, the pullet carpet shell.
Helsinki-based software developer Henri Sivonen has written a pair of blog posts about UTF-8: why it should be used, and how to inform the user agent when it is used.
The first blog post explains problems that can arise when UTF-8 is used without explicitly stating so. Here is a short selection from "Why Supporting Unlabeled UTF-8 in HTML on the Web Would Be Problematic":
UTF-8 has won. Yet, Web authors have to opt in to having browsers treat HTML as UTF-8 instead of the browsers Just Doing the Right Thing by default. Why?
I'm writing this down in comprehensive form, because otherwise I will keep rewriting unsatisfactory partial explanations repeatedly as bug comments again and again. For more on how to label, see another writeup.
Legacy Content Won't Be Opting Out
First of all, there is the "Support Existing Content" design principle. Browsers can't just default to UTF-8 and have HTML documents encoded in legacy encodings opt out of UTF-8, because there is unlabeled legacy content, and we can't realistically expect the legacy content to be actively maintained to add opt-outs now. If we are to keep supporting such legacy content, the assumption we have to start with is that unlabeled content could be in a legacy encoding.
In this regard, <meta charset=utf-8> is just like <!DOCTYPE html> and <meta name="viewport" content="width=device-width, initial-scale=1">. Everyone wants newly-authored content to use UTF-8, the No-Quirks Mode (better known as the Standards Mode), and to work well on small screens. Yet, every single newly-authored HTML document has to explicitly opt in to all three, since it isn't realistic to get all legacy pages to opt out.
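To make the excerpt above concrete, here is a minimal Python sketch (not taken from either blog post) of why unlabeled bytes are ambiguous. It assumes windows-1252 as the stand-in legacy encoding. The helper `has_utf8_label` is hypothetical and far simpler than what a real browser does; the 1024-byte window follows the HTML specification's requirement that the encoding declaration appear within the first 1024 bytes.

```python
import re

# Text with a non-ASCII character, as a page author might write it.
text = "café"

utf8_bytes = text.encode("utf-8")            # b'caf\xc3\xa9'
legacy_bytes = text.encode("windows-1252")   # b'caf\xe9'

# Unlabeled UTF-8 read with a legacy default turns into mojibake.
mojibake = utf8_bytes.decode("windows-1252")

# These unlabeled legacy bytes are not valid UTF-8, so a strict
# UTF-8 default cannot decode them at all.
try:
    legacy_bytes.decode("utf-8")
    legacy_is_valid_utf8 = True
except UnicodeDecodeError:
    legacy_is_valid_utf8 = False

# Hypothetical, simplified check for a <meta charset=utf-8> label.
# A real browser prescan handles many more cases than this regex does.
META_UTF8 = re.compile(rb"<meta\s+charset\s*=\s*[\"']?utf-8[\"']?", re.IGNORECASE)

def has_utf8_label(html_bytes: bytes) -> bool:
    """Return True if a UTF-8 meta label appears in the first 1024 bytes."""
    return META_UTF8.search(html_bytes[:1024]) is not None

print(mojibake)                # the UTF-8 bytes misread as windows-1252
print(legacy_is_valid_utf8)    # whether the legacy bytes decode as UTF-8
print(has_utf8_label(b"<!DOCTYPE html><meta charset=utf-8><title>x</title>"))
```

Neither default works for both byte strings, which is the bind the excerpt describes: without a label, the decoder has to guess.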
The second blog post explains how to tell the user agent explicitly that a document is encoded in UTF-8. From "Always Use UTF-8 & Always Label Your HTML Saying So":