SoylentNews is people

The Fine print: The following are owned by whoever posted them. We are not responsible for them in any way.

Journal by cafebabe

(This is the sixth of many promised articles which explain an idea in isolation. It is hoped that ideas may be adapted, linked together and implemented.)

I've now reached the practical problems with resource naming and the practical problems with networking. These problems interact badly with video over the Internet.

In the 1990s, video on the Internet was typically sent to users as lossy UDP. Apple's QuickTime 3 had an option to optimize .mov files to aid this type of distribution. This functionality seems to be deprecated and we've devolved to the situation where "streaming" video involves mono-casting (that is, unicasting) small chunks of video over HTTP/1.1 over TCP. There is a top tier of companies which like this arrangement. This includes Google, Apple, Microsoft, Netflix and a few parties who provide infrastructure, such as Akamai and CloudFlare. Many of them prefer the analytics which can be gained from mono-casting. To differing degrees, each knows where and when you've paused a video. Big Data provides the opportunity to discover why this occurs. Perhaps the content is violent, challenging or difficult to understand. Perhaps you had a regular appointment to be elsewhere. It should be understood that mono-casting over TCP provides the least amount of privacy for users.

TCP is completely incompatible with multi-cast and therefore choosing TCP excludes the option of sending the same packets to different users. The investment in infrastructure would be completely undermined if anyone could live-stream multi-cast video without centralized intermediaries running demographic profiling advert brokers on sanitized, government approved content.

Admittedly, UDP has a reputation for packet floods but TCP congestion control is wishful thinking which is actively undermined by Google, Microsoft and others. Specifically, RFC 3390 specifies an upper bound on the data which may be sent over a fresh TCP connection before any acknowledgement arrives. In practice, the limit is about 4KB. However, Microsoft crap-floods 100KB over a fresh TCP connection. Google does similar. If you've ever heard an end-user say that YouTube is reliable or responsive, that's because YouTube's servers are deliberately mis-configured to shout over your other connections. Many of these top-tier companies are making a mockery of TCP congestion control.
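For reference, the "about 4KB" figure falls out of the formula given in RFC 3390, which bounds the initial window at min(4×MSS, max(2×MSS, 4380 bytes)). The sketch below only evaluates that formula; the function name and the sample MSS values are illustrative choices, not anything taken from a real TCP stack.

```python
def rfc3390_initial_window(mss: int) -> int:
    """Upper bound, in bytes, on the initial TCP window per RFC 3390.

    mss is the maximum segment size in bytes.
    """
    return min(4 * mss, max(2 * mss, 4380))


# Sample MSS values chosen for illustration only.
for mss in (536, 1220, 1460, 8960):
    print(f"MSS {mss:>5} bytes -> initial window {rfc3390_initial_window(mss)} bytes")
```

With a typical Ethernet MSS of 1460 bytes, the bound is 4380 bytes, which is a small fraction of a 100KB burst.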

Ignoring the packet floods, there are some fundamentals that cannot be ignored. Lossy video uses small packets. The scale of compression which can be attained within an IPv6 1280 byte packet is quite limited, especially when encryption headers are taken into account. The big guns of compression (lz4, lzo, lzma, Burrows-Wheeler transform) are still warming-up in the first 1MB. That dove-tails nicely with HTTP/1.1 but restricts UDP to old-school compression, such as LZ77, Shannon-Fano or arithmetic compression.

However, if we define video tiles in a suitable manner, incorporate degradation in the form of a Dirac codec style diff tree and make every video tile directly addressable then everything works for multiple users at different sites. The trick is to make video tiles of a suitable form. I'm glad that quadtree video has finally become mainstream with HEVC because this demonstrates the practical savings which can be made by amortizing co-ordinates. Further savings can be made via Barnsley's Collage theorem. The net result is that disparate tile types can be butted together and the seams will be no less awful than JPEG macro-blocks.
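One way to picture "directly addressable" tiles, offered purely as an illustrative sketch and not as the addressing scheme intended here, is to name each tile by its path through the quadtree: one quadrant digit (0 to 3) per level, from the root down. The function names and the digit layout (row bit high, column bit low) are assumptions made for the example.

```python
from typing import List, Tuple


def tile_path(x: int, y: int, depth: int) -> List[int]:
    """Quadrant digits from the root down to the tile at column x, row y.

    Hypothetical layout: digit = (row bit << 1) | column bit.
    """
    path = []
    for level in range(depth - 1, -1, -1):
        qx = (x >> level) & 1
        qy = (y >> level) & 1
        path.append((qy << 1) | qx)
    return path


def path_tile(path: List[int]) -> Tuple[int, int]:
    """Inverse of tile_path: recover (column, row) from quadrant digits."""
    x = y = 0
    for digit in path:
        x = (x << 1) | (digit & 1)
        y = (y << 1) | (digit >> 1)
    return x, y


# A depth of 5 covers a 32x32 grid of tiles, enough for 1920x1080 in 64x64 tiles.
print(tile_path(5, 3, 5))
print(path_tile(tile_path(5, 3, 5)))
```

Because each level shares its leading digits with its neighbours, the co-ordinates are amortized across the tree in the same spirit as the quadtree savings described above.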

Unfortunately, if we wish to be fully and practically compatible with embedded IPv6 devices, we have a practical UDP payload of 1024 bytes or 8192 bits. This determines maximum video quality when using different tile sizes:-

  • 32×32 pixel blocks permit 8 bits per pixel.
  • 64×64 pixel blocks permit 2 bits per pixel.
  • 128×128 pixel blocks permit 0.5 bits per pixel.
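The figures in this list are simply the payload bits divided by the pixels in a tile. The sketch below reproduces them and also shows the headroom left in a minimum-size IPv6 packet, assuming the fixed 40 byte IPv6 header and 8 byte UDP header with no extension headers. The constant and function names are illustrative.

```python
IPV6_MIN_MTU = 1280      # bytes; minimum MTU every IPv6 link must support
IPV6_HEADER = 40         # bytes; fixed header, assuming no extension headers
UDP_HEADER = 8           # bytes
PAYLOAD_BYTES = 1024     # practical payload chosen in the text
PAYLOAD_BITS = PAYLOAD_BYTES * 8

# Bytes left over for encryption headers, tile addressing and similar overhead.
HEADROOM_BYTES = IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER - PAYLOAD_BYTES


def bits_per_pixel(tile_side: int, payload_bits: int = PAYLOAD_BITS) -> float:
    """Bits available per pixel when one square tile fills one payload."""
    return payload_bits / (tile_side * tile_side)


print(f"headroom: {HEADROOM_BYTES} bytes")
for side in (32, 64, 128):
    print(f"{side}x{side}: {bits_per_pixel(side)} bits per pixel")
```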

There is a hard upper-bound on payload size but are these figures good or bad? Well, H.264 1920×1080p at 30 frames per second may encode from 4GB per hour to 6GB per hour. That's 224 billion pixels and an upper-bound of 48 billion bits - or about 0.21 bits per pixel.
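The comparison figure can be checked with a few lines of arithmetic. This is only a sketch: the function name and the bytes_per_gb parameter are illustrative, and it assumes the hourly sizes cover the video stream alone. Reading a GB as 2^30 bytes instead of 10^9 bytes raises the result by about 7%.

```python
def h264_bits_per_pixel(gb_per_hour: float,
                        width: int = 1920,
                        height: int = 1080,
                        fps: int = 30,
                        bytes_per_gb: int = 10**9) -> float:
    """Average bits per pixel for a stream of the given size per hour."""
    pixels_per_hour = width * height * fps * 3600
    bits_per_hour = gb_per_hour * bytes_per_gb * 8
    return bits_per_hour / pixels_per_hour


for size in (4, 6):
    print(f"{size}GB per hour: {h264_bits_per_pixel(size):.3f} bits per pixel")
```

Either way, the result sits well below the 0.5 bits per pixel available to a 128×128 tile in a single payload.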

So, what type of tiles would be encoded in the quadtree? Well, enabling different subsets of tiles would provide a codec which ranged from symmetric lossless to highly asymmetric lossy. At a minimum, an H.261 style DCT is essential. On its own, it would provide symmetric lossy encoding which matches MJPEG for quality while only incurring a small overhead for (rather redundant) quadtree encoding. It would also be useful to add texture compression and some rather basic RLE. This would be particularly useful for desktop windowing because it allows, for example, word-processing and web browsing to be rendered accurately and concisely.
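To give a feel for how little "rather basic RLE" needs, here is a minimal byte-oriented sketch using (count, value) pairs with runs capped at 255. It is an assumption made for illustration, not a proposed tile format; the function names are hypothetical.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (run length, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (run length, value) byte pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(encoded), 2):
        out += bytes([encoded[i + 1]]) * encoded[i]
    return bytes(out)


# One 64 pixel row of a mostly white window with a short black mark.
row = b"\xff" * 60 + b"\x00" * 4
packed = rle_encode(row)
print(len(row), "bytes ->", len(packed), "bytes")
```

Flat regions of this kind are common in word-processing and web pages, which is why even a scheme this crude can pay off for desktop windowing.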

What is the maximum overhead of quadtree encoding? Well, for every four leaves, there is one branch and for every four branches, there is another branch. By the geometric series, the ratio of branches to leaves is 1:3 - and branches are always a single byte prior to compression. So, quadtree overhead is small.
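The geometric series can be checked by counting nodes in a complete quadtree, where every branch has exactly four children. The sketch below does so for a few depths; the function name is illustrative.

```python
from typing import Tuple


def quadtree_counts(depth: int) -> Tuple[int, int]:
    """Return (branches, leaves) for a complete quadtree of the given depth."""
    leaves = 4 ** depth
    branches = sum(4 ** level for level in range(depth))
    return branches, leaves


for depth in (1, 2, 3, 5):
    branches, leaves = quadtree_counts(depth)
    print(f"depth {depth}: {branches} branches, {leaves} leaves, "
          f"ratio {branches / leaves:.4f}")
```

The ratio approaches one third from below and never reaches it, so at one byte per branch the cost stays under a third of a byte per leaf tile.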

So, we've established that it is possible to make a video system which is:-

  • Lossy or lossless.
  • Provides symmetric or asymmetric encode/decode time.
  • Provides low-latency encoding modes.
  • Works with directly addressable video tiles of 64×64 pixels or larger.
  • Works with embedded devices which implement a strict 1280 byte packet limit.
  • Works with 70% round-trip packet loss.
  • Allows frame rate degradation.
  • Can be configured to incur worst-case storage or bandwidth cost which is only marginally worse than MJPEG.
  • May be suitable as transport for a windowing system.
  • Allows content to be viewed on an unlimited number of devices and/or by an unlimited number of users at any number of sites.
  • Allows implementation of video walls of arbitrary size and resolution.

It won't match the bit-rate of H.264, WebM or HEVC but it works over a wide range of scenarios where these would completely fail.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 0) by Anonymous Coward on Thursday July 06 2017, @05:56AM (#535585)

    I don't want realtime streaming video. I want lossless. I want buffering. I want seeking. I want byte serving. I want range requests. I want TCP. I want to download one entire video into one file.

    I absolutely positively without a doubt do not want anything resembling lossy garbage ATSC digital TV.

    HTTP progressive pseudostreaming is superior to TV in every way and I am never going back to broadcast and I will never accept utterly shitty attempts to emulate the broadcast experience with streaming video.

    • (Score: 0) by Anonymous Coward on Thursday July 06 2017, @10:42AM (#535654)

      I don't want realtime streaming video. I want lossless. I want buffering. I want seeking.

      You are aware that online video is more than just streaming movies? I cannot imagine a video conference without real time streaming video. And in that context, quality degradation is certainly preferable to buffering, and seeking simply doesn't make sense.

      I want to download one entire video into one file.

      Well, then maybe you should use a download service instead of a streaming service?

    • (Score: 2) by cafebabe (894) on Thursday July 06 2017, @10:33PM (#535913)

      See UDP Servers, Part 2 [soylentnews.org]. Over-the-air digital television has one source of re-sends: FEC from the same source. TCP also has one source of re-sends: from the mono-cast source. UDP video may use FEC (like over-the-air television), requests from source (like TCP) or requests from peers (like BitTorrent).

      --
      1702845791×2