
posted by janrinok on Tuesday December 14 2021, @09:04PM
from the do-you-see-what-I-see? dept.

Breakthrough AI Technique Enables Real-Time Rendering of Scenes in 3D From 2D Images:

Humans are pretty good at looking at a single two-dimensional image and understanding the full three-dimensional scene that it captures. Artificial intelligence agents are not.

Yet a machine that needs to interact with objects in the world — like a robot designed to harvest crops or assist with surgery — must be able to infer properties about a 3D scene from observations of the 2D images it's trained on.

While scientists have had success using neural networks to infer representations of 3D scenes from images, these machine learning methods aren't fast enough to make them feasible for many real-world applications.

A new technique demonstrated by researchers at MIT and elsewhere is able to represent 3D scenes from images about 15,000 times faster than some existing models.

The method represents a scene as a 360-degree light field, which is a function that describes all the light rays in a 3D space, flowing through every point and in every direction. The light field is encoded into a neural network, which enables faster rendering of the underlying 3D scene from an image.

The light-field networks (LFNs) the researchers developed can reconstruct a light field after only a single observation of an image, and they are able to render 3D scenes at real-time frame rates.
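
To make the single-evaluation idea concrete, here is a minimal sketch in PyTorch. It is not the authors' implementation: the paper does parameterize rays as 6D Plücker coordinates, but the network size and the names below are illustrative assumptions.

    # Minimal LFN-style sketch (illustrative, not the authors' code).
    # A light field maps a ray directly to a color, so rendering one
    # pixel costs a single network evaluation.
    import torch
    import torch.nn as nn

    class LightFieldNetwork(nn.Module):
        def __init__(self, hidden=256):
            super().__init__()
            # Input: a ray as 6D Plucker coordinates (direction d, moment o x d).
            self.mlp = nn.Sequential(
                nn.Linear(6, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3),  # RGB
            )

        def forward(self, origins, dirs):
            d = torch.nn.functional.normalize(dirs, dim=-1)
            m = torch.cross(origins, d, dim=-1)   # moment vector o x d
            plucker = torch.cat([d, m], dim=-1)   # 6D ray coordinates
            return self.mlp(plucker)              # one evaluation -> one color

    # One batched forward pass colors every pixel's ray at once:
    rays_o = torch.zeros(640 * 480, 3)            # camera at the origin
    rays_d = torch.randn(640 * 480, 3)            # per-pixel view directions
    colors = LightFieldNetwork()(rays_o, rays_d)  # shape: (640*480, 3)

A volumetric method, by contrast, must query its network at many points along every ray to composite a single pixel, which is where speedups of the magnitude reported above come from.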

"The big promise of these neural scene representations, at the end of the day, is to use them in vision tasks. I give you an image and from that image you create a representation of the scene, and then everything you want to reason about you do in the space of that 3D scene," says Vincent Sitzmann, a postdoc in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

[...] "Neural rendering has recently enabled photorealistic rendering and editing of images from only a sparse set of input views. Unfortunately, all existing techniques are computationally very expensive, preventing applications that require real-time processing, like video conferencing. This project takes a big step toward a new generation of computationally efficient and mathematically elegant neural rendering algorithms," says Gordon Wetzstein, an associate professor of electrical engineering at Stanford University, who was not involved in this research. "I anticipate that it will have widespread applications, in computer graphics, computer vision, and beyond."

Project Website:
Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering

Journal Reference:
Vincent Sitzmann, Semon Rezchikov, William T. Freeman, et al. "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering," 4 June 2021, arXiv:2106.02634 [cs.CV]


Original Submission

Related Stories

Nvidia Algorithm Turns 2D Photos Into a 3D Scene "Almost Instantly"

Nvidia shows off AI model that turns a few dozen snapshots into a 3D-rendered scene

Nvidia's latest AI demo is pretty impressive: a tool that quickly turns a "few dozen" 2D snapshots into a 3D-rendered scene. In the video below you can see the method in action, with a model dressed like Andy Warhol holding an old-fashioned Polaroid camera. (Don't overthink the Warhol connection: it's just a bit of PR scene dressing.)

The tool is called Instant NeRF, referring to "neural radiance fields" — a technique developed by researchers from UC Berkeley, Google Research, and UC San Diego in 2020. If you want a detailed explainer of neural radiance fields, you can read one here, but in short, the method maps the color and light intensity of different 2D shots, then generates data to connect these images from different vantage points and render a finished 3D scene. In addition to images, the system requires data about the position of the camera.
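
For contrast with the light-field sketch above, here is a hedged sketch of the radiance-field idea: a network maps a 3D point and viewing direction to a color and a density, and a pixel is rendered by compositing many samples along its ray. The network shape, sample count, and near/far bounds below are illustrative assumptions, not Instant NeRF's actual implementation.

    # Illustrative NeRF-style volume rendering (not Nvidia's code).
    # Unlike a light field network, each ray needs many network queries.
    import torch
    import torch.nn as nn

    net = nn.Sequential(   # toy radiance field: (x, y, z, dx, dy, dz) -> (r, g, b, sigma)
        nn.Linear(6, 256), nn.ReLU(),
        nn.Linear(256, 4),
    )

    def render_ray(origin, direction, n_samples=64, near=0.1, far=4.0):
        t = torch.linspace(near, far, n_samples)          # sample depths along the ray
        pts = origin + t[:, None] * direction             # (n_samples, 3) sample points
        dirs = direction.expand(n_samples, 3)
        rgb_sigma = net(torch.cat([pts, dirs], dim=-1))
        rgb = torch.sigmoid(rgb_sigma[:, :3])             # per-sample color
        sigma = torch.relu(rgb_sigma[:, 3])               # per-sample density
        alpha = 1.0 - torch.exp(-sigma * (t[1] - t[0]))   # opacity per sample
        # Transmittance: how much light survives to reach each sample.
        trans = torch.cat([torch.ones(1), torch.cumprod(1.0 - alpha + 1e-10, dim=0)[:-1]])
        weights = alpha * trans
        return (weights[:, None] * rgb).sum(dim=0)        # composited pixel color

Sixty-four or more queries per pixel versus one is the gap that speed work like Instant NeRF is closing.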

Researchers have been improving this sort of 2D-to-3D model for a couple of years now, adding more detail to finished renders and increasing rendering speed. Nvidia says its new Instant NeRF model is one of the fastest yet developed and reduces rendering time from a few minutes to a process that is finished "almost instantly."

Also at Tom's Hardware and PetaPixel.

Previously: Breakthrough AI Technique Enables Real-Time Rendering of Scenes in 3D From 2D Images


Original Submission

Microsoft Accused of Selling AI Tool That Spews Violent, Sexual Images to Kids

https://arstechnica.com/tech-policy/2024/03/microsoft-accused-of-selling-ai-tool-that-spews-violent-sexual-images-to-kids/

Microsoft's AI text-to-image generator, Copilot Designer, appears to be heavily filtering outputs after Microsoft engineer Shane Jones warned that the tool randomly creates violent and sexual imagery and that Microsoft had ignored his warnings, CNBC reported.

Jones told CNBC that he repeatedly warned Microsoft of the alarming content he was seeing while volunteering in red-teaming efforts to test the tool's vulnerabilities. In response, Jones said, Microsoft failed to take the tool down, implement safeguards, or even post disclosures changing the product's rating to mature in the Android store.

[...] Bloomberg also reviewed Jones' letter and reported that Jones told the FTC that while Copilot Designer is currently marketed as safe for kids, it's randomly generating an "inappropriate, sexually objectified image of a woman in some of the pictures it creates." And it can also be used to generate "harmful content in a variety of other categories, including: political bias, underage drinking and drug use, misuse of corporate trademarks and copyrights, conspiracy theories, and religion to name a few."

[...] Jones' tests also found that Copilot Designer would easily violate copyrights, producing images of Disney characters, including Mickey Mouse or Snow White. Most problematically, Jones could politicize Disney characters with the tool, generating images of Frozen's main character, Elsa, in the Gaza Strip or "wearing the military uniform of the Israel Defense Forces."

Ars was able to generate interpretations of Snow White, but Copilot Designer rejected multiple prompts politicizing Elsa.

If Microsoft has updated the automated content filters, it's likely due to Jones protesting his employer's decisions. [...] Jones has suggested that Microsoft would need to substantially invest in its safety team to put in place the protections he'd like to see. He reported that the Copilot team is already buried by complaints, receiving "more than 1,000 product feedback messages every day." Because of this alleged understaffing, Microsoft is currently only addressing "the most egregious issues," Jones told CNBC.

Related stories on SoylentNews:
Cops Bogged Down by Flood of Fake AI Child Sex Images, Report Says - 20240202
New "Stable Video Diffusion" AI Model Can Animate Any Still Image - 20231130
The Age of Promptography - 20231008
AI-Generated Child Sex Imagery Has Every US Attorney General Calling for Action - 20230908
It Costs Just $400 to Build an AI Disinformation Machine - 20230904
US Judge: Art Created Solely by Artificial Intelligence Cannot be Copyrighted - 20230824
"Meaningful Harm" From AI Necessary Before Regulation, says Microsoft Exec - 20230514 (Microsoft's new quarterly goal?)
The Godfather of AI Leaves Google Amid Ethical Concerns - 20230502
Stable Diffusion Copyright Lawsuits Could be a Legal Earthquake for AI - 20230403
AI Image Generator Midjourney Stops Free Trials but Says Influx of New Users to Blame - 20230331
Microsoft's New AI Can Simulate Anyone's Voice With Three Seconds of Audio - 20230115
Breakthrough AI Technique Enables Real-Time Rendering of Scenes in 3D From 2D Images - 20211214


Original Submission

This discussion has been archived. No new comments can be posted.
  • (Score: 0) by Anonymous Coward on Tuesday December 14 2021, @09:31PM (#1205116)

    TFA: "Yet a machine that needs to interact with objects in the world — like a robot designed to harvest crops or assist with surgery — must be able to infer properties about a 3D scene from observations of the 2D images it's trained on."

    End result: https://mangaplanet.com/what-is-guro/ [mangaplanet.com]

  • (Score: 1, Interesting) by Anonymous Coward on Tuesday December 14 2021, @10:46PM (#1205137)

    Currently, getting something big scanned, like a racetrack & surrounding scenery for a driving game, is done in segments with lots of labor stitching pieces together. The survey with techs moving lidar or stereo cameras around the track could take days.

    If this can process images at normal frame rates, a drive around the track with a bunch of cameras working should do it in a few minutes.

  • (Score: 0) by Anonymous Coward on Tuesday December 14 2021, @10:56PM (#1205139)

    Imagine the boon for pornography ...

  • (Score: 2) by MIRV888 (11376) on Wednesday December 15 2021, @12:52AM (#1205162) (1 child)

    Just ask Sarah.

    • (Score: 0) by Anonymous Coward on Wednesday December 15 2021, @07:41AM (#1205242)

      and it runs on a 6502

  • (Score: 0) by Anonymous Coward on Wednesday December 15 2021, @07:49PM (#1205366)

    3D? Snore. Wake me up when it can project into 4D curved space-time.
