SoylentNews is people

Journal by cafebabe

(This is the sixth of many promised articles which explain an idea in isolation. It is hoped that ideas may be adapted, linked together and implemented.)

I've run into practical problems with resource naming and practical problems with networking. These problems interact badly with video over the Internet.

In the 1990s, video on the Internet was typically sent to users as lossy UDP. Apple's QuickTime 3 had an option to optimize .mov files to aid this type of distribution. This functionality seems to be deprecated and we've devolved to the situation where "streaming" video involves mono-casting (unicasting) small chunks of video over HTTP/1.1 over TCP. There is a top tier of companies which likes this arrangement. This includes Google, Apple, Microsoft, Netflix and a few parties who provide infrastructure, such as Akamai and CloudFlare. Many of them prefer the analytics which can be gained from mono-casting. To differing degrees, each knows where and when you've paused a video. Big Data provides the opportunity to discover why this occurs. Perhaps the content is violent, challenging or difficult to understand. Perhaps you had a regular appointment to be elsewhere. It should be understood that mono-casting over TCP provides the least amount of privacy for users.

TCP is completely incompatible with multi-cast and therefore choosing TCP excludes the option of sending the same packets to different users. The investment in infrastructure would be completely undermined if anyone could live-stream multi-cast video without centralized intermediaries running demographic profiling advert brokers on sanitized, government approved content.

Admittedly, UDP has a reputation for packet flood but TCP congestion control is wishful thinking which is actively undermined by Google, Microsoft and others. Specifically, RFC 3390 specifies that the initial window, the unacknowledged data sent over a fresh TCP connection, should be capped at min(4×MSS, max(2×MSS, 4380 bytes)). In practice, the limit is about 4KB. However, Microsoft crap-floods 100KB over a fresh TCP connection. Google does similar. If you've ever heard an end-user say that YouTube is reliable or responsive, that's because YouTube's servers are deliberately mis-configured to shout over your other connections. Many of these top-tier companies are making a mockery of TCP congestion control.
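To put a number on that cap, here is a minimal sketch of the RFC 3390 upper bound for a few segment sizes. The function name and the choice of MSS values are mine, for illustration only; 1220 assumes a 1280 byte IPv6 packet less 40 bytes of IPv6 header and 20 bytes of TCP header.

```python
def rfc3390_initial_window(mss: int) -> int:
    """Upper bound on the TCP initial window in bytes, per RFC 3390."""
    return min(4 * mss, max(2 * mss, 4380))


if __name__ == "__main__":
    # 536: classic minimum default MSS; 1220: minimum-MTU IPv6; 1460: Ethernet IPv4.
    for mss in (536, 1220, 1460):
        print(f"MSS {mss}: initial window <= {rfc3390_initial_window(mss)} bytes")
```

For the common Ethernet case this comes to 4380 bytes, which is where the "about 4KB" figure comes from.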

Ignoring the packet floods, there are some fundamentals that cannot be ignored. Lossy video uses small packets. The scale of compression which can be attained within an IPv6 1280 byte packet is quite limited, especially when encryption headers are taken into account. The big guns of compression (lz4, lzo, lzma, Burrows-Wheeler transform) are still warming up in the first 1MB. That dovetails nicely with HTTP/1.1 but restricts UDP to old-school compression, such as LZ77, Shannon-Fano or arithmetic compression.
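The cost of compressing every packet on its own can be seen with a rough experiment. The sketch below uses zlib (DEFLATE) purely as a convenient stand-in, and the synthetic word data, the 1 MiB size and the 1024 byte chunk size are all my assumptions, not measurements of any video codec. It compresses the same data once as a whole and once as independent 1024 byte chunks, as a lossy UDP transport would require.

```python
import random
import zlib

PAYLOAD = 1024  # bytes per independently decodable packet (assumed)


def sample_data(size: int, seed: int = 0) -> bytes:
    """Deterministic pseudo-text: random words drawn from a fixed vocabulary."""
    rng = random.Random(seed)
    letters = b"abcdefghijklmnopqrstuvwxyz"
    vocabulary = [
        bytes(rng.choice(letters) for _ in range(rng.randint(3, 9)))
        for _ in range(2000)
    ]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(vocabulary) + b" "
    return bytes(out[:size])


def compressed_sizes(data: bytes, chunk: int = PAYLOAD) -> tuple[int, int]:
    """Return (size compressed as one stream, total size compressed per chunk)."""
    whole = len(zlib.compress(data, 9))
    chunked = sum(
        len(zlib.compress(data[i:i + chunk], 9))
        for i in range(0, len(data), chunk)
    )
    return whole, chunked


if __name__ == "__main__":
    data = sample_data(1 << 20)
    whole, chunked = compressed_sizes(data)
    print(f"one stream: {whole} bytes, independent chunks: {chunked} bytes")
```

Each chunk has to start with an empty history, so it cannot reuse anything learned from earlier packets. That is the handicap described above.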

However, if we define video tiles in a suitable manner, incorporate degradation in the form of a Dirac codec style diff tree and make every video tile directly addressable then everything works for multiple users at different sites. The trick is to make video tiles of a suitable form. I'm glad that quadtree video has finally become mainstream with HEVC because this demonstrates the practical savings which can be made by amortizing co-ordinates. Further savings can be made via Barnsley's Collage theorem. The net result is that disparate tile types can be butted together and the seams will be no more awful than JPEG macro-blocks.

Unfortunately, if we wish to be fully and practically compatible with embedded IPv6 devices, we have a practical UDP payload of 1024 bytes or 8192 bits. This determines maximum video quality when using different tile sizes:-

  • 32×32 pixel blocks permit 8 bits per pixel.
  • 64×64 pixel blocks permit 2 bits per pixel.
  • 128×128 pixel blocks permit 0.5 bits per pixel.
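The figures in this list follow directly from dividing the payload by the tile area. A minimal sketch of the arithmetic, assuming one tile per 1024 byte payload with no allowance for headers (the function name is illustrative):

```python
PAYLOAD_BITS = 1024 * 8  # practical UDP payload from the text, in bits


def bits_per_pixel(tile_side: int, payload_bits: int = PAYLOAD_BITS) -> float:
    """Bits available per pixel when one square tile fills one payload."""
    return payload_bits / (tile_side * tile_side)


if __name__ == "__main__":
    for side in (32, 64, 128):
        print(f"{side}x{side}: {bits_per_pixel(side)} bits per pixel")
```

Any per-packet header or quadtree overhead would reduce these figures slightly.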

There is a hard upper-bound on payload size but are these figures good or bad? Well, H.264 1920×1080p at 30 frames per second may encode from 4GB per hour to 6GB per hour. That's 224 billion pixels and an upper-bound of 48 billion bits - or about 0.21 bits per pixel.
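The H.264 comparison can be checked the same way. This sketch assumes 1GB means 10^9 bytes; under that assumption the range works out to roughly 0.14 to 0.21 bits per pixel, and reading 6GB as binary gigabytes (6×2^30 bytes) gives about 0.23 instead. The function names are illustrative.

```python
def pixels_per_hour(width: int, height: int, fps: int) -> int:
    """Total pixels displayed in one hour of video."""
    return width * height * fps * 3600


def stream_bits_per_pixel(bytes_per_hour: float,
                          width: int = 1920,
                          height: int = 1080,
                          fps: int = 30) -> float:
    """Average encoded bits per displayed pixel."""
    return bytes_per_hour * 8 / pixels_per_hour(width, height, fps)


if __name__ == "__main__":
    print(f"pixels per hour: {pixels_per_hour(1920, 1080, 30):,}")
    for label, size in (("4 GB", 4e9), ("6 GB", 6e9), ("6 GiB", 6 * 2**30)):
        print(f"{label} per hour: {stream_bits_per_pixel(size):.3f} bits per pixel")
```

Either way, the result sits below the 0.5 bits per pixel available to a 128×128 tile.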

So, what type of tiles would be encoded in the quadtree? Well, enabling different subsets of tiles would provide a codec which ranged from symmetric lossless to highly asymmetric lossy. At a minimum, an H.261 style DCT is essential. On its own, it would provide symmetric lossy encoding which matches MJPEG for quality while only incurring a small overhead for (rather redundant) quadtree encoding. It would also be useful to add texture compression and some rather basic RLE. This would be particularly useful for desktop windowing because it allows, for example, word-processing and web browsing to be rendered accurately and concisely.

What is the maximum overhead of quadtree encoding? Well, for every four leaves, there is one branch and for every four branches, there is another branch. By the geometric series, the ratio of branches to leaves is 1:3 - and branches are always a single byte prior to compression. So, quadtree overhead is small.
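The geometric series can be confirmed by counting nodes. This is a sketch for the worst case of a complete quadtree, where every branch has exactly four children down to a fixed depth; the function name is illustrative.

```python
def quadtree_counts(depth: int) -> tuple[int, int]:
    """Return (branches, leaves) for a complete quadtree of the given depth."""
    leaves = 4 ** depth
    branches = sum(4 ** level for level in range(depth))
    return branches, leaves


if __name__ == "__main__":
    for depth in range(1, 7):
        branches, leaves = quadtree_counts(depth)
        print(f"depth {depth}: {branches} branches, {leaves} leaves, "
              f"ratio {branches / leaves:.4f}")
```

The ratio climbs towards 1/3 but never reaches it: three times the branch count is always one less than the leaf count.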

So, we've established that it is possible to make a video system which:-

  • Is lossy or lossless.
  • Provides symmetric or asymmetric encode/decode time.
  • Provides low-latency encoding modes.
  • Works with directly addressable video tiles of 64×64 pixels or larger.
  • Works with embedded devices which implement a strict 1280 byte packet limit.
  • Works with 70% round-trip packet loss.
  • Allows frame rate degradation.
  • Can be configured to incur worst-case storage or bandwidth cost which is only marginally worse than MJPEG.
  • May be suitable as transport for a windowing system.
  • Allows content to be viewed on an unlimited number of devices and/or by an unlimited number of users at any number of sites.
  • Allows implementation of video walls of arbitrary size and resolution.

It won't match the bit-rate of H.264, WebM or HEVC but it works over a wide range of scenarios where these would completely fail.

 
