from the reduce-the-size-of-your-pron-storage dept.
With unlimited data plans becoming increasingly expensive, or subscribers being forced to ditch their unlimited data due to overuse, anything that can reduce the amount of data we download is welcome. This is especially true for media including images or video, and Google just delivered a major gain when it comes to viewing images online.
The clever scientists at Google Research have come up with a new technique for keeping image size to an absolute minimum without sacrificing quality. So good is this new technique that it promises to reduce the size of an image on disk by as much as 75 percent.
The new technique is called RAISR, which stands for "Rapid and Accurate Image Super-Resolution." Typically, reducing the size of an image means lowering its quality or resolution. RAISR works by taking a low-resolution image and upsampling it, which basically means enhancing the detail using filtering. Anyone who's ever tried to do this manually knows that the end result looks a little blurred. RAISR avoids that thanks to machine learning.
[...] RAISR has been trained using low and high quality versions of images. Machine learning allows the system to figure out the best filters to recreate the high quality image using only the low quality version. What you end up with after lots of training is a system that can do the same high quality upsampling on most images without needing the high quality version for reference.
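To make the "learn the best filters from low/high quality pairs" idea concrete, here is a minimal one-dimensional sketch. It is not Google's implementation: the published description of RAISR, as far as I understand it, sorts image patches into buckets by local gradient features and learns a filter per bucket, whereas this toy learns a single 3-tap filter by least squares. All function names and the test signal are made up for illustration.

```python
import math

def downsample(high):
    # Average each pair of samples: a crude 2x downscale.
    return [(high[i] + high[i + 1]) / 2 for i in range(0, len(high) - 1, 2)]

def cheap_upscale(low):
    # Repeat every sample twice (nearest-neighbour 2x upscale).
    return [v for v in low for _ in range(2)]

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def patches(up, taps):
    # Yield (index, neighbourhood) for every sample with a full neighbourhood.
    half = taps // 2
    for i in range(half, len(up) - half):
        yield i, up[i - half:i + half + 1]

def learn_filter(up, high, taps=3):
    # Least squares via the normal equations: find the filter that best maps
    # each cheaply upscaled neighbourhood to the true high-res sample.
    ata = [[0.0] * taps for _ in range(taps)]
    atb = [0.0] * taps
    for i, p in patches(up, taps):
        for r in range(taps):
            atb[r] += p[r] * high[i]
            for c in range(taps):
                ata[r][c] += p[r] * p[c]
    return solve(ata, atb)

def apply_filter(up, w):
    # Edge samples without a full neighbourhood are left untouched.
    out = up[:]
    for i, p in patches(up, len(w)):
        out[i] = sum(wk * pk for wk, pk in zip(w, p))
    return out

def sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical training signal standing in for a "high quality" image row.
high = [math.sin(0.3 * i) + 0.5 * math.sin(1.1 * i) for i in range(64)]
up = cheap_upscale(downsample(high))
w = learn_filter(up, high)
print("cheap upscale error:", sq_error(up, high))
print("learned filter error:", sq_error(apply_filter(up, w), high))
```

On the training signal the learned filter can do no worse than the cheap upscale, because the do-nothing filter is one of the candidates the least-squares fit considers. How well it carries over to unseen signals depends on how similar they are to the training data.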
-- submitted from IRC
Related Stories
Neural SuperSampling Is a Hardware Agnostic DLSS Alternative by Facebook
A new paper published by Facebook researchers just ahead of SIGGRAPH 2020 introduces neural supersampling, a machine learning-based upsampling approach not too dissimilar from NVIDIA's Deep Learning Super Sampling. However, neural supersampling does not require any proprietary hardware or software to run and its results are quite impressive as you can see in the example images, with researchers comparing them to the quality we've come to expect from DLSS.
Video examples on Facebook's blog post.
The researchers use some extremely low-fi upscales to make their point, but you could also imagine scaling from a resolution like 1080p straight to 8K. Upscaling could be combined with eye tracking and foveated rendering to reduce rendering times even further.
Also at UploadVR and VentureBeat.
Journal Reference:
Lei Xiao, Salah Nouri, Matt Chapman, Alexander Fix, Douglas Lanman, Anton Kaplanyan, Neural Supersampling for Real-time Rendering - Facebook Research, (DOI: https://research.fb.com/publications/neural-supersampling-for-real-time-rendering/)
Related: With Google's RAISR, Images Can be Up to 75% Smaller Without Losing Detail
Nvidia's Turing GPU Pricing and Performance "Poorly Received"
HD Emulation Mod Makes "Mode 7" SNES Games Look Like New
Neural Networks Upscale Film From 1896 to 4K, Make It Look Like It Was Shot on a Modern Smartphone
Apple Goes on an Acquisition Spree, Turns Attention to NextVR
(Score: 5, Funny) by ikanreed on Thursday January 19 2017, @08:26PM
...without losing sugar.
With unlimited fruit plans becoming increasingly expensive, or eaters being forced to ditch their fruit due to under-consumption, anything that can reduce the amount of grapes we consume is welcome. This is especially true for beverages such as wine or juice, and the sun just delivered a major can when it comes to juicing fruit in a bucket.
(Score: 2) by nobu_the_bard on Thursday January 19 2017, @08:27PM
I read through the description of RAISR. I do not fully comprehend the implications yet, but it seems that the idea is, it is a form of upscaling (I suppose that explains the name). The images can be stored at lower resolutions and then upscaled with a technique that generates a more accurate result than common linear upscaling techniques. As the resultant image is closer to the high-res original than a linearly upscaled version would be, they are able to claim it does not lose detail.
(Score: 2) by Grishnakh on Thursday January 19 2017, @09:23PM
Which sounds like an outright lie. You can't retrieve information that isn't there; upsampling just makes something look sharper by interpolating, but it doesn't actually add lost information back to the image. To make an extreme argument, I can replace a picture of a line with a file that just describes two points on that line, and let an interpolation algorithm fill in the rest. But that's not going to replace the little circle that's in the middle of that line, which has been lost due to the low resolution of the file describing two points.
(Score: 2) by JoeMerchant on Thursday January 19 2017, @10:26PM
Presumably, they are doing something "more" than upscaling, which will give (at least qualitatively) better reproduction of the original image than upscaling alone, and require more data as well.
As far as I can see, it's just another evolution beyond jpeg et al. with more acceptable quality at the compression rate they are testing it at. Like .mp3 et al. in the audio space, it's focusing on the things people notice at the expense of reduced detail in the areas that people don't notice.
------
Nothing to see here, move along.
🌻🌻 [google.com]
(Score: 3, Insightful) by Immerman on Friday January 20 2017, @12:26AM
It's true that you can't restore information that's been removed. But it's also true that there's far more information in your average image than you will notice without exhaustive examination. If done right, those two truths may largely counteract each other.
I wouldn't trust the detail in an upscaled image for anything important, but how often is there anything important in the details of an image on a web page? How often do you even pay any attention to the detail?
Meanwhile, even simple bi-cubic upscaling can often reveal a great deal of information that was already present, but heavily obscured by the pixelated noise introduced by rendering pixels as colored blocks rather than sampling points.
(Score: 2) by FakeBeldin on Friday January 20 2017, @05:30PM
The fine summary remarks that they're using machine learning.
You can think of that as moving details from the image files to the "compression" algorithm.
An extreme case of this is the tongue-in-cheek LenPEG [dangermouse.net] image compression algorithm.
At any rate: low-res images for which the high-res version was learned by the algorithm can have details restored, as those details are embedded in the algorithm. The obvious drawback is that images that are sufficiently different from images on which the algorithm was trained will not be correctly transformed, as those details are missing in both the low-res version and the algorithm.
So probably the algorithm tosses out as much detail as can still be reconstructed later - a lot for known images, almost nothing for sufficiently "new" images. Given Google's huge database of images, I wouldn't be surprised if this worked well on 80% of the images.
And that would be worth the trade off, I think.
(Score: 0) by Anonymous Coward on Thursday January 19 2017, @08:43PM
Upscale was shown but was expecting to see triplets with the original image before the 75% business.
(Score: 2) by bob_super on Thursday January 19 2017, @08:45PM
Yup, they might need to brush up on it.
Smartly upscaling a low-rez image can create a nice sharp image. But it will fill in the blanks with information that is different from the original.
I guess in the era of photoshop-everything and look-at-my-breakfast-plate, it's not a big deal, until the wrong black guy gets sent to the Chair.
(Score: 0) by Anonymous Coward on Thursday January 19 2017, @08:57PM
i think you just sold the product; it's a convenient scapegoat to deny wrongdoing. The machine said so, your honor.
(Score: 1, Interesting) by Anonymous Coward on Thursday January 19 2017, @10:18PM
Here's the thing - not every bit carries an identical amount of information.
What they've done is "put" the information into the upscaling algorithms - or more precisely into the choice of algorithms, and the parameters to the algorithms.
There is still some loss of information. Just not as much as there would be in a naive application of upscaling.
Keep in mind that the target application is not archival, it's just for display to humans doing non-critical viewing on their phones.
So loss of things like shadow details and small color gradients isn't considered too important because they aren't readily apparent to the human eye.
(Score: 2) by bob_super on Thursday January 19 2017, @10:36PM
Would you mind reading TFS's title again?
without losing detail is most definitely incorrect.
(Score: 0) by Anonymous Coward on Friday January 20 2017, @12:03AM
I'm sorry. I thought we were talking about the actual system google developed. Not whatever some dumbshit wrote when they submitted the summary to soylent.
Your criticism is so much more informative.
Carry on!
(Score: 0) by Anonymous Coward on Thursday January 19 2017, @11:15PM
They're not losing any information. They are adding information by trying to be clever in figuring out what detail is supposed to be there. The point is correct, that you can't add detail that isn't there to begin with. Their deep learning algorithms have been trained on lots of real-life pictures so it makes the best guess what filtering algorithms to use to put information into the image. But keep in mind that the information, the detail, they are putting in isn't in the picture to begin with.
For instance, you can look at a low-resolution image of a circle, and if you recognize it to be a circle, you can redraw it in very sharp and fine detail. However, if what was in that low-resolution image was really not a circle, but a very round ellipse, but you can't tell because of the resolution, you might redraw it as a very fine and sharp circle, but you've added that information yourself.
(Score: 2) by JoeMerchant on Friday January 20 2017, @03:22AM
There's Shannon's information theory where bits are bits and you can only push so many bits per second through a channel with so much bandwidth.
Then there's the Fraunhofer style of information theory where some bits are more important than others, so preserve the ones that are perceived by people and discard those that won't be missed.
All this crap about "upscaling the image" is oversimplification to make a tech article that people think they understand in a 30 second skimming. There's something like upscaling going on in there, but if that's all that's going on in there, this wouldn't have been news 20 years ago... cubic spline interpolation has been around for a while, as has .jpg and many other forms of lossy optical compression.
You're right, though - a poorly lit, out of focus, shaky 16 megapixel image of breakfast is certainly overkill. It would be amusing if the algorithm included some AI that determined the "value" of the image and adjusted the compression levels accordingly.
🌻🌻 [google.com]
(Score: 1) by GDX on Saturday January 21 2017, @12:12AM
For me this algorithm is more similar to "spectral band replication" and "band folding" than to typical interpolation: you recreate deleted/lost information from the information that you have and some cues. This is a step beyond the Fraunhofer style of information theory.
For example, in HE-AAC, which uses SBR, the audio is resampled from 48kHz to 24kHz (this kills the audio signal from 12kHz to 24kHz) and compressed using AAC-LC, and then the SBR data is added. During decompression the audio is resampled from 24kHz to 48kHz and the SBR data is used to fake the missing audio signals in the 12-24kHz range.
(Score: 2) by JoeMerchant on Saturday January 21 2017, @12:21AM
Fraunhofer style compression has been "out there" for what? Like 20 years in widespread usage? It's about time to take another step forward.
🌻🌻 [google.com]
(Score: 2) by gidds on Friday January 20 2017, @01:26PM
Exactly.
And the real shame is that this sort of technology could be used in a real compression algorithm.
AIUI, many compression algorithms are based around a predictor: code that can make the best guess possible as to what the next byte/word/unit will be, based on the ones it's had already. Then, you encode the 'residual', the difference between the prediction and the actual value. The better the predictor, the smaller the residuals — and the better they can be compressed using existing techniques. (You can also apply lossy techniques to them, of course.)
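A minimal sketch of the predictor-plus-residual idea described above, assuming integer samples and two toy predictors (the function names are invented for this example and do not come from any particular codec):

```python
def previous_value(history):
    # Guess that the next sample equals the last one seen.
    return history[-1] if history else 0

def linear_extrapolation(history):
    # Guess that the next sample continues the most recent slope.
    if len(history) < 2:
        return previous_value(history)
    return 2 * history[-1] - history[-2]

def encode(samples, predict):
    # Store only the difference between each sample and its prediction.
    return [s - predict(samples[:i]) for i, s in enumerate(samples)]

def decode(residuals, predict):
    # The decoder runs the same predictor, so the residuals suffice.
    samples = []
    for r in residuals:
        samples.append(predict(samples) + r)
    return samples

ramp = [10, 20, 30, 40, 50]
print(encode(ramp, previous_value))        # [10, 10, 10, 10, 10]
print(encode(ramp, linear_extrapolation))  # [10, 10, 0, 0, 0]
```

The better predictor leaves more zeros in the residuals, which is what lets a general-purpose entropy coder squeeze them further. Decoding is exact in both cases because nothing is guessed without being corrected by a stored residual.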
So if this sort of AI makes better guesses about the detail of the image, then it can be used to improve image compression without making up detail out of whole cloth just because it's the sort of thing that other images have.
(Of course, if an ignorant amateur like me can come up with this idea, then I'm sure the experts have. Though none of the reports I've read about this story suggest so.)
[sig redacted]
(Score: 1) by j-beda on Thursday January 19 2017, @10:23PM
Is the new image 75% (3/4) the size of the old? Is the old image 125% (5/4) the size of the new - thus the new being 80% the size of the old? Is the new image 25% (1/4) the size of the old?
Is it something else?
Why do so many people want to sue the word "smaller" when combined with a percentage or fraction? Do they not know how ambiguous that is?
(Score: 2) by jelizondo on Thursday January 19 2017, @11:03PM
They want to sue the word because America is lawyer happy and they don't know how to use the word smaller. Talk about ambiguity!
(Score: 2) by Bill Dimm on Thursday January 19 2017, @11:45PM
I haven't read the article, but assuming they are using terminology in the standard way it means:
new_size = old_size - 0.75 * old_size = 25% of old_size
It's really not ambiguous. Where things do get ambiguous is when the old and new quantities are themselves percentages. If someone says "Previously precision was 70% and adjustments improved it by 10%" does that mean it is now 77% or is it 80% (i.e., is the 10% change relative or absolute)?
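The two readings can be spelled out in a few lines. This is just the arithmetic from the comment above; the function names and the 400 KB file size are made up for the example:

```python
def shrink_by(old_size, fraction):
    # "75% smaller" in the standard sense: remove that fraction of the old size.
    return old_size - fraction * old_size

def improve_relative(percent, change):
    # "Improved by 10%" read as 10% of the current value.
    return percent + percent * change / 100

def improve_absolute(percent, change):
    # "Improved by 10%" read as 10 percentage points.
    return percent + change

print(shrink_by(400, 0.75))      # 100.0, i.e. 25% of the old size
print(improve_relative(70, 10))  # 77.0
print(improve_absolute(70, 10))  # 80
```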
(Score: 4, Funny) by linkdude64 on Thursday January 19 2017, @11:16PM
Zoom in!
Enhance JPEG section A6 - tighten up on the reflection of his eye - there's a mirror in the room he's in. Now we just re-vectorize the compression algorithms on the Z-Axis to anti-alias and...oh god, it's...it's a copy of this horrible script!!!
(Score: 2, Funny) by Anonymous Coward on Thursday January 19 2017, @11:37PM
http://www.dailymotion.com/video/x2qlmuy [dailymotion.com]
Love this vid :)
(Score: 2) by Bot on Friday January 20 2017, @01:20AM
google pls
whatever savings in size and speed of web elements means only that the page will cram more of it.
Let us go to tiffs and resurrect the javascript engine of netscape navigator. So finally the web pages will return to be pages.
Seriously I see a better way to use this algo: to improve (sharpen, denoise, color correction) existing photos and video without resizing them.
Account abandoned.
(Score: 2) by takyon on Friday January 20 2017, @09:41AM
Optimize for size? I guess the Google Fiber experiment really is dead! Or mobile just really sucks!
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by lizardloop on Friday January 20 2017, @12:40PM
This was my immediate thought. Every attempt to make browsing the web faster has been met with just cramming more shit on to every page.
(Score: 0) by Anonymous Coward on Friday January 20 2017, @05:36PM
I've been kicking around the idea of image compression by using GAs (genetic algorithms) to evolve the most compact representation (with the lossy tolerance level set by user). Polygons, ovals, and maybe wave equations for repeating-but-varying polygon fill-ins (think Moiré or fractals), and blur filters can be combined in different orders and positions. The GA would find the best ordering and combos.
Here are examples where a single kind of shape type is used:
https://www.youtube.com/watch?v=25aXHBZFPgU [youtube.com]
https://www.youtube.com/watch?v=GCmMRUIGIwQ [youtube.com]
But I'm thinking of having more object and transformation types.
It would be computationally intense to generate the "render list" of polygons and transformations for a given image, but relatively quick to render. The GA does all the hard work so that the renderer doesn't have to.
Create a kind of render machine language of generation primitives. Let's say we limit each object to 6 parameters (some may be ignored in some instructions). A render list would then resemble:
By having a consistent instruction format, GA cross-breeding is easier. We can cross-breed on both instruction list level, and the parameter level. The actual primitives (instructions) used may require some experiments to see what works best.
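As a rough illustration of why a fixed instruction shape makes cross-breeding easy, here is a sketch with six parameters per instruction. The opcode names (POLY, OVAL, BLUR) and what each parameter means are purely hypothetical; the comment above does not specify them.

```python
import random

PARAMS = 6  # every instruction carries exactly six parameters

def make_instruction(op, *params):
    # Pad unused parameters with zeros so all instructions share one shape.
    padded = list(params) + [0] * (PARAMS - len(params))
    return (op, tuple(padded[:PARAMS]))

def crossover_lists(a, b, rng):
    # Instruction-list level: head of one parent, tail of the other.
    cut = rng.randint(0, min(len(a), len(b)))
    return a[:cut] + b[cut:]

def crossover_params(x, y, rng):
    # Parameter level: keep x's opcode, take each parameter from either parent.
    op, px = x
    _, py = y
    return (op, tuple(rng.choice(pair) for pair in zip(px, py)))

rng = random.Random(0)
parent_a = [make_instruction("POLY", 10, 10, 50, 40, 3),
            make_instruction("BLUR", 2)]
parent_b = [make_instruction("OVAL", 30, 30, 12, 8),
            make_instruction("POLY", 5, 60, 20, 20, 4),
            make_instruction("BLUR", 1)]
child = crossover_lists(parent_a, parent_b, rng)
mixed = crossover_params(parent_a[0], parent_b[1], rng)
print(child)
print(mixed)
```

Because every instruction has the same length, any cut point and any parameter swap produces a list the renderer can still parse, even if the resulting picture is nonsense. The fitness function is what weeds those out.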
It would be cool to watch the renderer reconstruct the image in slow motion by applying the operations one at a time. (Not the evolution steps, that takes too long to watch, just the final render list.) Even if it turned out to be a poor compression technique, the slow-mo rendering itself may be an entertaining use.
I suspect this kind of gizmo would work best on images with a degree of repetition in them, such as buildings, and worse on those with a lot of randomness, like a jungle. Maybe the GA can mix traditional compression and this new kind, evolving the best combo for different images or portions of images.
* There are different ways to arrange repeat instruction parameters. Perhaps the first 2 params could be the x and y offset, the second 2 be the delta on the first 2 per "loop", and the last two be the count of repeats for x and y respectively. The repeater(s) would apply to the prior instruction, be it a drawing or blurring instruction.
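One possible reading of that repeat layout, sketched in code. The interpretation is an assumption: each loop moves the copy by the current offset, and the offset then grows by the delta. The function names are invented for the example.

```python
def positions(offset, delta, count):
    # Displacements along one axis: step by the current offset each loop,
    # then let the offset grow by delta.
    out, pos, step = [], 0, offset
    for _ in range(count):
        pos += step
        out.append(pos)
        step += delta
    return out

def expand_repeat(params):
    # params = (x_offset, y_offset, x_delta, y_delta, x_count, y_count)
    ox, oy, dx, dy, nx, ny = params
    xs = positions(ox, dx, nx)
    ys = positions(oy, dy, ny)
    # Each (x, y) is where one copy of the prior instruction gets drawn.
    return [(x, y) for y in ys for x in xs]

print(positions(10, 2, 3))                  # [10, 22, 36]
print(expand_repeat((10, 5, 2, 0, 3, 2)))
```

A non-zero delta gives the "repeating-but-varying" spacing mentioned earlier, while a delta of zero yields a plain regular grid.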