SoylentNews is people

posted by Fnord666 on Sunday August 30 2020, @11:13AM
from the no-need-to-actually-go-there dept.

This AI Creates Detailed 3D Renderings from Thousands of Tourist Photos

A team of researchers at Google have come up with a technique that can combine thousands of tourist photos into detailed 3D renderings that take you inside a scene... even if the original photos used vary wildly in terms of lighting or include other problematic elements like people or cars.

The tech is called "NeRF in the Wild" or "NeRF-W" because it takes Google Brain's Neural Radiance Fields (NeRF) technology and applies it to "unstructured and uncontrolled photo collections" like the thousands of tourist photos used to create the demos[1][2] and the video samples[3] linked below.

It's basically an advanced, neural network-driven interpolation that manages to include geometric info about the scene while removing 'transient occluders' like people or cars and smoothing out changes in lighting.

[1] demo1.gif (36.75 MiB)
[2] demo2.gif (35.66 MiB)
[3] YouTube video (3m42s).

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections (arXiv:2008.02268v2 [cs.CV])


This discussion has been archived. No new comments can be posted.
  • (Score: 2) by FatPhil on Sunday August 30 2020, @01:11PM (8 children)

    by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Sunday August 30 2020, @01:11PM (#1044161) Homepage
    Makes me wonder at what point we will no longer be able to trust any photo we see on news sites.
    --
    Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
    • (Score: 4, Interesting) by takyon on Sunday August 30 2020, @01:39PM (3 children)

      by takyon (881) <{takyon} {at} {soylentnews.org}> on Sunday August 30 2020, @01:39PM (#1044166) Journal

      It has to be easy enough for newsrooms to use it. Like Adobe Photoshop with machine learning tools and filters [medium.com] on ARM Macs accelerated by the "Neural Engine" [wikipedia.org].

      News sites can also just grab images off Twitter and run those. The sophistication of fake images should go way up. Great new machine learning algorithms are coming out frequently, and consumer GPU performance will double/triple within a few years, with three-way competition (Intel) possibly driving down costs.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
      • (Score: 2) by bzipitidoo on Sunday August 30 2020, @09:55PM (2 children)

        by bzipitidoo (4388) Subscriber Badge on Sunday August 30 2020, @09:55PM (#1044355) Journal

        Something doesn't quite add up with stories of this sort. We're treated to a constant stream of stories that AI can do all these amazing things with photos, like reconstruct the scene from a different angle, fix all kinds of problems such as blur and color degradation, colorize, intelligently depixelate, and most of all, identify faces, and make "deep fakes".

        Then you run the GIMP (I have never used Photoshop in large part because it is commercial), and what do you see? Certainly not any AI engine. To fix color problems in old photos there is "white balance", which is a very good if totally brain-dead method of restoring colors to something close to their original values. All it does is search for the brightest red, green, and blue values, assume those were originally part of a white object, and adjust the values to make that color white, proportionally adjusting all the other values. Nice, but no real intelligence to that. Doesn't work on dark pictures that don't have some combination of brightly colored items that add up to white. Often there is something white in the picture-- clothes, clouds, or walls, usually.

        To fix blur is a lot harder. I did read of one way to use the GIMP to fix motion blur, if the motion is in a straight line. Doesn't work if the motion follows a curve, but only because the GIMP doesn't have any way to motion blur along a path other than a straight line. (What you do is motion blur in the direction of the blur. Fight blur with blur. It works, but not without causing another problem. Sharpened an old blurry photo extremely well, but it also created terrible shadows.) I admit I am not skilled in this area. For instance, I don't understand "convolution". The GIMP's interface just gives you a 5x5 matrix to fill in.

        Searching online for deblurring, there are a bunch of commercial offerings, and they do their utmost to drown out whatever libre software there may be. It's hard to tell how good these commercial offerings are, really. Can't believe the vendors. They have certainly cherry-picked their examples and exaggerated their quality and effectiveness. So I am unsure what the state of the art is on deblurring.
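
The white-balance method described above (find the brightest red, green, and blue values, treat them as white, scale everything else proportionally) can be sketched in a few lines of Python. This only illustrates the description in the comment and is not GIMP's actual code; the function name `white_balance` and the list-of-tuples pixel format are assumptions made for the example.

```python
def white_balance(pixels):
    """Scale each colour channel so that its brightest value becomes 255.

    pixels: list of (r, g, b) tuples with values in 0..255 (assumed format).
    """
    if not pixels:
        return []

    # Brightest value found in each channel; the method assumes these
    # belonged to something that was originally white.
    max_r = max(p[0] for p in pixels)
    max_g = max(p[1] for p in pixels)
    max_b = max(p[2] for p in pixels)

    def scale(value, channel_max):
        # An all-zero channel cannot be stretched; leave it at zero.
        if channel_max == 0:
            return 0
        return min(255, round(value * 255 / channel_max))

    # Every pixel is adjusted proportionally, channel by channel.
    return [
        (scale(r, max_r), scale(g, max_g), scale(b, max_b))
        for r, g, b in pixels
    ]


if __name__ == "__main__":
    faded = [(200, 180, 100), (50, 45, 25)]
    print(white_balance(faded))
```

The sketch has the same weakness noted above: if nothing in the picture was white to begin with, the brightest values still get stretched to white and the colors shift.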

        • (Score: 2) by takyon on Sunday August 30 2020, @10:44PM

          by takyon (881) <{takyon} {at} {soylentnews.org}> on Sunday August 30 2020, @10:44PM (#1044371) Journal

          I mention Photoshop because that's what "real people" use, and I know that there is at the very least an "object selection tool" [theverge.com] (see also [engadget.com]). Blender can use tensor cores for denoised ray tracing [blender.org], and maybe some video editing software [nvidia.com] can use them. Dedicated non-GPU AI/ML accelerators will be in every ARM-based Mac desktop, and similar capabilities should be usable on consumer x86 PCs via the GPU.

          The disconnect between the AI stories and what you see is because the cool new stuff [youtube.com] comes out on places like GitHub + arXiv. It took a while for the concept of "deep fakes" to go from research papers to user-friendly software. But people are making "deep fake memes" these days.

          As a fellow GIMP user, I already expect it to be years behind the competition. Maybe it is catching up since the competition is running out of features to add.

          This AI Removes Shadows From Your Photos! 🌒 [youtube.com]

          I see no reason why that shouldn't be a function in Photoshop within 5 years.

          --
          [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
        • (Score: 0) by Anonymous Coward on Sunday August 30 2020, @11:23PM

          by Anonymous Coward on Sunday August 30 2020, @11:23PM (#1044391)

          > Something doesn't quite add up with stories of this sort.
          > Then you run the GIMP

          Not to knock GIMP but I think you answered your own question.

          > I don't understand "convolution". The GIMP's interface just gives you a 5x5 matrix to fill in.

          Convolution is just a weighted sum of each pixel and its neighbours, with the matrix supplying the weights. [gimp.org]

          > So I am unsure what the state of the art is on deblurring.

          HQ predictive optical flow is some way off. [catalyzex.com]
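
For anyone else puzzled by the 5x5 matrix, here is a minimal sketch, assuming a grayscale image stored as a list of rows. Each output pixel is the sum of the surrounding 5x5 neighbourhood multiplied element by element with the matrix, so it is a sliding weighted sum rather than an ordinary matrix product. The names `apply_kernel`, `divisor`, and `offset` are illustrative, the edge handling (repeating the border pixel) is an assumption, and the kernel is not flipped, so strictly speaking this is correlation, which gives the same result for symmetric kernels.

```python
def apply_kernel(image, kernel, divisor=1.0, offset=0.0):
    """Apply a square, odd-sized kernel to a grayscale image.

    image:  list of rows, each a list of numbers (assumed format).
    kernel: list of rows of weights, e.g. 5 rows of 5 numbers.
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    # Clamp coordinates so border pixels are repeated
                    # when the kernel hangs over the edge of the image.
                    sy = min(max(y + ky - radius, 0), height - 1)
                    sx = min(max(x + kx - radius, 0), width - 1)
                    total += weight * image[sy][sx]
            row.append(total / divisor + offset)
        result.append(row)
    return result


# A kernel with a single 1 in the centre leaves the image unchanged.
IDENTITY_5X5 = [[1 if (r, c) == (2, 2) else 0 for c in range(5)] for r in range(5)]

# A kernel of all ones, divided by 25, averages each neighbourhood (a box blur).
BOX_BLUR_5X5 = [[1] * 5 for _ in range(5)]

if __name__ == "__main__":
    picture = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(apply_kernel(picture, IDENTITY_5X5))
    print(apply_kernel(picture, BOX_BLUR_5X5, divisor=25))
```

Sharpening, blurring, and edge detection differ only in which numbers go into the matrix, which is presumably why the dialog offers nothing but the grid.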

    • (Score: 2, Informative) by shrewdsheep on Sunday August 30 2020, @02:03PM

      by shrewdsheep (5215) on Sunday August 30 2020, @02:03PM (#1044171)

      Most probably we have passed this point already. The solution, however, exists. It is trusted computing, a set of techniques also known as DRM. The best of both worlds is possible, and the discussion needs to distinguish the different aspects (making trust in digital media possible versus restricting access) if DRM is to be fought off.

    • (Score: 3, Interesting) by HiThere on Sunday August 30 2020, @08:18PM (1 child)

      by HiThere (866) on Sunday August 30 2020, @08:18PM (#1044316) Journal

      That's been true ever since "framing a photo" was developed. Say f decades ago. Possibly earlier, but that's the first time I saw a news team select shots so that they conveyed an intended message rather than what was going on.

      If you've ever been on-site at a news scene, and then seen it on TV, you've probably seen what I mean in action. They take one small part of what's going on and focus on it, excluding anything contrary to the message they intend to communicate. I've seen major fires turned into city-wide conflagrations. I've seen minor altercations turned into a major riot. Technically what they showed was actually happening, but the context was a lie.

      When I was in college I thought I could read a newspaper story and figure out what was really happening. After I was on-site at a couple of stories I stopped believing that.

      --
      Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
      • (Score: 2) by bzipitidoo on Sunday August 30 2020, @09:58PM

        by bzipitidoo (4388) Subscriber Badge on Sunday August 30 2020, @09:58PM (#1044357) Journal

        Yes, the few times a story ran about something I knew personally, it was shocking just how much they'd distorted the facts to make it all a lot, lot more dramatic.

    • (Score: 1) by jurov on Monday August 31 2020, @12:44AM

      by jurov (6250) on Monday August 31 2020, @12:44AM (#1044417)

      But this doesn't depend on technology; photos have been retouched since forever... you haven't heard about Stalin?

  • (Score: 0) by Anonymous Coward on Sunday August 30 2020, @01:42PM (2 children)

    by Anonymous Coward on Sunday August 30 2020, @01:42PM (#1044169)

    Microsoft PhotoSynth was literally this. They killed it off 10 years ago.

    • (Score: 2, Interesting) by Anonymous Coward on Sunday August 30 2020, @02:25PM (1 child)

      by Anonymous Coward on Sunday August 30 2020, @02:25PM (#1044177)

      The real advance here is to use images captured under a multitude of lighting conditions and have AI normalize them into a single textured model. This is a longstanding problem even for controlled photogrammetry where shadows become geometry and it's often faster to model from scratch than fix up the point cloud.

      • (Score: 0) by Anonymous Coward on Sunday August 30 2020, @04:41PM

        by Anonymous Coward on Sunday August 30 2020, @04:41PM (#1044221)

        Precisely, photogrammetry using images that were taken in close proximity has been a thing for quite a while now. If you can fly a drone around a landmark capturing images of it, you can stitch them into a surprisingly accurate 3D model. For some types of objects, the result may even be better than what can be achieved using laser-based 3D scanning technology. Especially if the tones are relatively flat and you don't have the ability to dirty them up for the scan.

        There's also the added level of being able to display the same image under different weather conditions, which wasn't practical with the older technology unless you went back and actually recorded it under different conditions, or manually went in and changed the look of the model, or imported it into software to do raycasting on the model.

  • (Score: 2) by Rich on Sunday August 30 2020, @02:42PM (2 children)

    by Rich (945) on Sunday August 30 2020, @02:42PM (#1044180) Journal

    Would it be as good with a thousand Keyhole fly-bys rather than tourist photos? Because then the question would no longer be whether they can read newspaper headlines; they would have everything down to the millimeter range, and in 3D. Maybe the last "too good" pictures (I think from somewhere in Iran) were "leaked" to distract from such abilities.

    • (Score: 2) by takyon on Sunday August 30 2020, @10:46PM (1 child)

      by takyon (881) <{takyon} {at} {soylentnews.org}> on Sunday August 30 2020, @10:46PM (#1044374) Journal
      • (Score: 2) by Rich on Monday August 31 2020, @11:47AM

        by Rich (945) on Monday August 31 2020, @11:47AM (#1044546) Journal

        Yup. I meant those. I don't even think they were surprising. Recent commercially available resolutions from LEO seem to be below 30 cm. The NRO will probably laugh at that, but even if that was their limit, too, they could do overlays. I just read up on it (without going too deep) and it looks like naive summing of values yields 6 dB for every fourfold increase in the number of images, i.e. they'd get a 16-fold noise reduction from a burst of 256 images. They will surely have advanced algorithmic or machine learning techniques to also consider nearby pixels, phase correlation, transient separation from backgrounds, etc. to get to subpixel resolution. I'd be surprised if they could not get down to millimeter ranges. Add the stuff from the article, and it's all in nice 3D and colour.
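
The 6 dB and 16-fold figures follow from the usual square-root rule for averaging, which can be checked with a small sketch. It assumes perfectly aligned frames whose noise is independent from frame to frame with zero mean (a simplification), and the function names are made up for the example.

```python
import math
import random
import statistics


def noise_reduction_factor(num_frames):
    """Averaging N frames with independent noise cuts the noise by sqrt(N)."""
    return math.sqrt(num_frames)


def gain_db(num_frames):
    """The same improvement expressed in decibels (amplitude ratio)."""
    return 20 * math.log10(noise_reduction_factor(num_frames))


def measured_noise(num_frames, num_pixels=2000, seed=0):
    """Simulate the averaging and return the remaining noise level.

    Each simulated pixel has a true value of 0 and unit Gaussian noise
    per frame, so the result should be close to 1 / sqrt(num_frames).
    """
    rng = random.Random(seed)
    averages = [
        sum(rng.gauss(0.0, 1.0) for _ in range(num_frames)) / num_frames
        for _ in range(num_pixels)
    ]
    return statistics.pstdev(averages)


if __name__ == "__main__":
    for frames in (4, 16, 256):
        print(frames, noise_reduction_factor(frames), round(gain_db(frames), 2))
    print(measured_noise(16))
```

Here `gain_db(4)` comes out at about 6.02 dB and `noise_reduction_factor(256)` at 16, matching the numbers above. Real bursts would presumably fall short of this because of misalignment and noise that is correlated between frames.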

  • (Score: 3, Interesting) by Mojibake Tengu on Sunday August 30 2020, @04:37PM (1 child)

    by Mojibake Tengu (8598) on Sunday August 30 2020, @04:37PM (#1044218) Journal

    Consider this tech applied in real wars: every observable adversary soldier possibly recorded during mass combat, and later, after the war, identified through reconstruction and held personally and legally responsible for his deeds on the battlefield. Or even sooner, if taken captive.

    I'd call this a karma. It'll suck to become an aggressor in future wars.
    The question "Which state is better at this?" remains open, for today.

    --
    The edge of 太玄 cannot be defined, for it is beyond every aspect of design
    • (Score: 3, Insightful) by Freeman on Monday August 31 2020, @03:34PM

      by Freeman (732) on Monday August 31 2020, @03:34PM (#1044611) Journal

      Assuming both states are very good at it, there's no reason why it couldn't be just as easily manufactured. Thus, unbelievable.

      --
      Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
  • (Score: 3, Interesting) by Runaway1956 on Sunday August 30 2020, @05:54PM (1 child)

    by Runaway1956 (2926) Subscriber Badge on Sunday August 30 2020, @05:54PM (#1044246) Homepage Journal

    They scan millions of people's vacation pics to do this? Where did they get all the pics? Were they hoovered off of Facebook and other social media? Maybe from Google Drive? Possibly even from people's home computers?

    The question in my mind is whether they were given legal permission to use thousands of artists' work. Maybe, like the advertising companies, they just presume to have rights to people's information.

    --
    Abortion is the number one killer of children in the United States.
    • (Score: 0) by Anonymous Coward on Sunday August 30 2020, @06:25PM

      by Anonymous Coward on Sunday August 30 2020, @06:25PM (#1044256)

      The pics were presumably taken from the public web for the purpose of research, which falls under fair use.
