Date: 2009-04-23
from the cut-and-paste dept.
On Wednesday, Meta announced an AI model called the Segment Anything Model (SAM) that can identify individual objects in images and videos, even those not encountered during training, reports Reuters.
According to a blog post from Meta, SAM is an image segmentation model that can respond to text prompts or user clicks to isolate specific objects within an image. Image segmentation is a process in computer vision that involves dividing an image into multiple segments or regions, each representing a specific object or area of interest.
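As a rough illustration of what "dividing an image into segments" means, here is a minimal sketch in plain Python. It is not how SAM works; it uses classic connected-component labeling on a tiny hypothetical grid, where pixels of equal value that touch each other are grouped into one region. The function name `label_regions` and the toy data are made up for the example.

```python
from collections import deque

def label_regions(image):
    """Give every pixel a region id; 4-connected pixels of equal value share an id."""
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]  # -1 means "not labeled yet"
    next_id = 0
    for r in range(height):
        for c in range(width):
            if labels[r][c] != -1:
                continue
            # Flood-fill outward from this unlabeled pixel.
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and labels[ny][nx] == -1 and image[ny][nx] == image[y][x]:
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return labels, next_id

# Hypothetical 3x4 "image" with three flat-colored areas.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 1],
]
labels, count = label_regions(toy)
print(count)
for row in labels:
    print(row)
```

Real segmentation models replace the "equal value" test with learned features, but the output has the same shape: one region id per pixel.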
The purpose of image segmentation is to make an image easier to analyze or process. Meta also sees the technology as being useful for understanding webpage content, augmented reality applications, image editing, and aiding scientific study by automatically localizing animals or objects to track on video.
I'm a neuroscientist in a doctoral program, but I have a growing interest in deep learning methods (e.g., http://deeplearning.net/). As a neuroscientist using MR imaging methods, I often rely on tools to help me classify and define brain structures and functional activations. Some of the most advanced tools for image segmentation are built on magical-sounding techniques such as AdaBoosted weak learners, auto-encoders, Support Vector Machines, and the like.
While I do not have the time to become a computer-science expert in artificial intelligence methods, I would like to establish a basic skill level in the application of some of these methods. Soylenters, "Do I need to know the mathematical foundation of these methods intimately to be able to employ them effectively or intelligently?" and "What would be a good way of becoming more familiar with these methods, given my circumstances?"
NVIDIA Research's GauGAN AI Art Demo Responds to Words:
A picture worth a thousand words now takes just three or four words to create, thanks to GauGAN2, the latest version of NVIDIA Research's wildly popular AI painting demo.
The deep learning model behind GauGAN allows anyone to channel their imagination into photorealistic masterpieces — and it's easier than ever. Simply type a phrase like "sunset at a beach" and AI generates the scene in real time. Add an additional adjective like "sunset at a rocky beach," or swap "sunset" to "afternoon" or "rainy day" and the model, based on generative adversarial networks, instantly modifies the picture.
With the press of a button, users can generate a segmentation map, a high-level outline that shows the location of objects in the scene. From there, they can switch to drawing, tweaking the scene with rough sketches using labels like sky, tree, rock and river, allowing the smart paintbrush to incorporate these doodles into stunning images.
The new GauGAN2 text-to-image feature can now be experienced on NVIDIA AI Demos, where visitors to the site can experience AI through the latest demos from NVIDIA Research. With the versatility of text prompts and sketches, GauGAN2 lets users create and customize scenes more quickly and with finer control.
Kinda makes Turtle graphics from the 70s look rather basic. However, beware Rule 34…
MIT's newest computer vision algorithm identifies images down to the pixel:
For humans, identifying items in a scene [...] is as simple as looking at them. But for artificial intelligence and computer vision systems, developing a high-fidelity understanding of their surroundings takes a bit more effort. Well, a lot more effort. Around 800 hours of hand-labeling training images effort, if we're being specific. To help machines better see the way people do, a team of researchers at MIT CSAIL in collaboration with Cornell University and Microsoft have developed STEGO, an algorithm able to identify images down to the individual pixel.
Normally, creating CV training data involves a human drawing boxes around specific objects within an image — say, a box around the dog sitting in a field of grass — and labeling those boxes with what's inside ("dog"), so that the AI trained on it will be able to tell the dog from the grass. STEGO (Self-supervised Transformer with Energy-based Graph Optimization), conversely, uses a technique known as semantic segmentation, which applies a class label to each pixel in the image to give the AI a more accurate view of the world around it.
Whereas a labeled box would have the object plus other items in the surrounding pixels within the boxed-in boundary, semantic segmentation labels every pixel in the object, but only the pixels that comprise the object — you get just dog pixels, not dog pixels plus some grass too. It's the machine learning equivalent of using the Smart Lasso in Photoshop versus the Rectangular Marquee tool.
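The box-versus-mask difference can be made concrete with a small sketch. The 5x6 scene below is hypothetical (1 marks "dog" pixels, 0 marks "grass"), and `bounding_box` is a helper written only for this example. It counts how many grass pixels a tight bounding box drags along compared with a per-pixel mask.

```python
# Hypothetical scene: 1 = dog pixel, 0 = grass pixel.
scene = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

def bounding_box(mask):
    """Return (top, left, bottom, right) of the tightest box around all nonzero pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return min(rows), min(cols), max(rows), max(cols)

top, left, bottom, right = bounding_box(scene)
box_pixels = (bottom - top + 1) * (right - left + 1)  # everything inside the box
dog_pixels = sum(sum(row) for row in scene)           # what a per-pixel mask labels
grass_in_box = box_pixels - dog_pixels                # background swept in by the box

print(box_pixels, dog_pixels, grass_in_box)
```

In this toy case a third of the boxed pixels are grass, which is exactly the noise a per-pixel label avoids.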
The problem with this technique is one of scope. Conventional multi-shot supervised systems often demand thousands, if not hundreds of thousands, of labeled images with which to train the algorithm. Multiply that by the 65,536 individual pixels that make up even a single 256x256 image, all of which now need to be individually labeled as well, and the workload required quickly spirals into impossibility.
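To put numbers on that workload, here is a back-of-the-envelope sketch. The dataset sizes (1,000 and 100,000 images) are assumptions standing in for "thousands, if not hundreds of thousands"; only the 256x256 resolution comes from the text above.

```python
def labels_needed(num_images, width=256, height=256):
    """Per-pixel labels required if every pixel of every image is annotated."""
    return num_images * width * height

pixels_per_image = labels_needed(1)
print(f"{pixels_per_image:,} pixels per 256x256 image")

# Assumed dataset sizes, for illustration only.
for num_images in (1_000, 100_000):
    print(f"{num_images:,} images -> {labels_needed(num_images):,} pixel labels")
```

Even the smaller assumed dataset needs tens of millions of individual pixel labels, which is why approaches that avoid hand labeling are attractive.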
Instead, "STEGO looks for similar objects that appear throughout a dataset," the CSAIL team wrote in a press release Thursday. "It then associates these similar objects together to construct a consistent view of the world across all of the images it learns from."
(Score: 0) by Anonymous Coward on Monday April 10, @06:12PM (3 children)
Great, even more tools to enable fakery. Exactly what we needed.
These people are fuckin' lemmings(*), marching blindly but briskly towards the abyss, all the while chanting "technology is morally neutral, it's not our fault (x2), what could possibly go wrong"
(*) Apologies to all lemmings, we humans use this as a figure of speech. I realize you have a higher sense of self-preservation than those I applied this figure of speech to.
(Score: 2) by gznork26 on Monday April 10, @06:30PM (1 child)
Speaking of lemmings, James Thurber wrote a short story called "Interview with a Lemming" in which a scientist has a discussion with one. The scientist says that he has made studying lemmings his life's work, but there is one thing he doesn't understand: why lemmings run off a cliff to their deaths. The lemming replies that he has made studying humans his life's work, and the one thing he doesn't understand is why we don't.
(Score: 2) by GloomMower on Monday April 10, @07:02PM
https://www.adfg.alaska.gov/index.cfm?adfg=wildlifenews.view_article&articles_id=56 [alaska.gov]
The Disney documentary of lemmings going off the cliff was faked.
(Score: 1) by khallow on Monday April 10, @06:39PM
Well, I doubt you can show we need more or less fakery so it's not saying much.