SoylentNews is people

posted by martyb on Monday September 16 2019, @05:00PM
from the things-expand-to-exceed-the-space-provided dept.

https://danluu.com/web-bloat/

A couple years ago, I took a road trip from Wisconsin to Washington and mostly stayed in rural hotels on the way. I expected the internet to be slow in rural areas too sparsely populated to have cable internet, but I was still surprised that a large fraction of the web was inaccessible. Some blogs with lightweight styling were readable, as were pages by academics who hadn't updated the styling on their website since 1995. But very few commercial websites were usable (other than Google). When I measured my connection, I found that the bandwidth was roughly comparable to what I got with a 56k modem in the 90s. The latency and packet loss were significantly worse than the average day on dialup: latency varied between 500ms and 1000ms and packet loss varied between 1% and 10%. Those numbers are comparable to what I'd see on dialup on a bad day.

Despite my connection being only a bit worse than it was in the 90s, the vast majority of the web wouldn't load. Why shouldn't the web work with dialup or a dialup-like connection? It would be one thing if I tried to watch youtube and read pinterest. It's hard to serve videos and images without bandwidth. But my online interests are quite boring from a media standpoint. Pretty much everything I consume online is plain text, even if it happens to be styled with images and fancy javascript. In fact, I recently tried using w3m (a terminal-based web browser that, by default, doesn't support css, javascript, or even images) for a week and it turns out there are only two websites I regularly visit that don't really work in w3m (twitter and zulip, both fundamentally text based sites, at least as I use them)[1].

More recently, I was reminded of how poorly the web works for people on slow connections when I tried to read a joelonsoftware post while using a flaky mobile connection. The HTML loaded but either one of the five CSS requests or one of the thirteen javascript requests timed out, leaving me with a broken page. Instead of seeing the article, I saw three entire pages of sidebar, menu, and ads before getting to the title because the page required some kind of layout modification to display reasonably. Pages are often designed so that they're hard or impossible to read if some dependency fails to load. On a slow connection, it's quite common for at least one dependency to fail. After refreshing the page twice, the page loaded as it was supposed to and I was able to read the blog post, a fairly compelling post on eliminating dependencies.
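
The "at least one dependency fails" point can be made concrete with a little arithmetic. The sketch below is not from the original article: it assumes each request fails independently with the same probability, and the per-request failure rates are hypothetical values picked for illustration (they are not the packet loss figures quoted earlier, since TCP retransmits lost packets). Only the request count, 5 CSS plus 13 javascript, comes from the paragraph above.

```python
def p_any_failure(n_requests: int, p_fail: float) -> float:
    """Probability that at least one of n independent requests fails.

    Assumes every request fails with the same probability p_fail and that
    failures are independent, which is a simplification.
    """
    return 1.0 - (1.0 - p_fail) ** n_requests


# 5 CSS + 13 javascript requests, as counted in the paragraph above.
N_REQUESTS = 5 + 13

if __name__ == "__main__":
    # Hypothetical per-request failure (timeout) rates, for illustration only.
    for p in (0.01, 0.05, 0.10):
        chance = p_any_failure(N_REQUESTS, p)
        print(f"per-request failure {p:.0%}: broken page with probability {chance:.0%}")
```

Under these assumptions, even a small per-request failure rate compounds quickly when a page cannot render without every one of its dependencies, which is why a page with one or two requests fares so much better on the same connection.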

[1] excluding internal Microsoft stuff that's required for work. Many of the sites are IE only and don't even work in edge. I didn't try those sites in w3m but I doubt they'd work! In fact, I doubt that even half of the non-IE specific internal sites would work in w3m.


  • (Score: 3, Informative) by c0lo on Tuesday September 17 2019, @01:29AM (1 child)

    by c0lo (156) on Tuesday September 17 2019, @01:29AM (#894938)

    HTML spec did not specify rendering of any element, it's still the user agent (web browser) implementation that decides how to render any element.

    Here's the 1995 spec of HTML [ietf.org]. You may be interested in Section 5.4 - Headings: H1 ... H6 [ietf.org] which describes what the headings should look like ("H1 - Bold, very-large font, centered. One or two blank lines above and below.", etc).

    But you are right, the earliest HTML spec did not prescribe the rendering, it just didn't abstain from describing it. You will find quite a number of examples, including

    5.5.2. Preformatted Text: PRE


          The <PRE> element represents a character cell block of text and is
          suitable for text that has been formatted for a monospaced font.

          The <PRE> tag may be used with the optional WIDTH attribute. The
          WIDTH attribute specifies the maximum number of characters for a line

    5.7.2. Typographic Elements


    ...
                Typical renderings for idiomatic elements may vary between user
                agents. If a specific rendering is necessary -- for example, when
                referring to a specific text attribute as in "The italic parts are
                mandatory" -- a typographic element can be used to ensure that the
                intended typography is used where possible.

    ---

    Starting from HTML 3.2 [w3.org] - when the W3C took over the HTML spec - the number of thingies added for the sake of standardizing the presentation exploded:

    1. starting with the bgcolor, background attributes...
    2. ... then the table, TH, TR, TD elements with an extensive set of attributes targeting the appearance - size, borders, padding, word-wraps, v/h-aligns, etc.,
    3. ... (see yourself properly seated now, please)
      Font style elements [w3.org]:

      TT teletype or monospaced text
      I italic text style
      B bold text style
      U underlined text style
      STRIKE strike-through text style
      BIG places text in a large font
      SMALL places text in a small font
      SUB places text in subscript style
      SUP places text in superscript style

    ---

    It took 3 more years, until version 4.0 of HTML (1998), to anchor the idea of 2.4.1 Separate structure and presentation [w3.org]

    And even when the idea was floating (with the CSS 1 (1996) [w3.org]), the actual industry support lagged behind by at least 4 years [wikipedia.org]

    --
    https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
  • (Score: 2) by canopic jug on Tuesday September 17 2019, @04:20AM

    by canopic jug (3949) on Tuesday September 17 2019, @04:20AM (#894987)

    Early HTML, with its SGML [coverpages.org] roots, was intended to keep structure and presentation separate. In the "browser wars" that M$ started and Netscape allowed itself to be dragged into, all kinds of misfeatures were piled on, especially elements for typographic markup. HTML 3.2 was at the peak of that and a cap. It may have looked like a train wreck but it was marking an end point of that mess and signaled a turn back towards separating structure from layout/presentation, which we then started to get with HTML 4.

    Now that the W3C is fully out of the hands of academia and in the hands of the likes of Facebook and Microsoft, the standard will turn into another trainwreck.

    --
    Money is not free speech. Elections should not be auctions.