Captions: Generative Video Startup Raises $60 Million in NYC

Generative video creation and editing platform Captions has raised $60 million in Series C funding. Founded in 2021 by former Microsoft engineer Gaurav Misra and Goldman Sachs alum Dwight Churchill, the company’s technologies — Lipdub, AI Edit and the 3D avatar app AI Creator — have amassed more than 10 million downloads for mobile, the firm says. The C round brings its total raise to $100 million for a stated market valuation of $500 million. With the new funding, Captions plans to expand its presence in New York City, which is “emerging as the epicenter for AI research,” according to Misra.

Meta’s 3D Gen Bridges Gap from AI to Production Workflow

Meta Platforms has introduced an AI model it says can generate 3D images from text prompts in under one minute. The new model, called 3D Gen, is billed as a “state-of-the-art, fast pipeline” for turning text input into high-resolution 3D images quickly. The app also adds textures to AI output or existing images through text prompts, and “supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications,” Meta explains, adding that in internal tests, 3D Gen outperforms industry baselines on “prompt fidelity and visual quality” and for speed.

Apple Launches Public Demo of Its Multimodal 4M AI Model

Apple has released a public demo of the 4M AI model it developed in collaboration with the Swiss Federal Institute of Technology Lausanne (EPFL). The technology debuts seven months after the model was first open-sourced, allowing informed observers the opportunity to interact with it and assess its capabilities. Apple says 4M was built by applying masked modeling to a single unified Transformer encoder-decoder “across a wide range of input/output modalities — including text, images, geometric and semantic modalities, as well as neural network feature maps.”

Nokia Makes the First-Ever 3D Spatial Audio Cell Phone Call

Nokia made what it claims is “the world’s first immersive voice and audio call” using cell phones, made possible by the new 3GPP Immersive Voice and Audio Services (IVAS) codec that lets consumers hear 3D spatial sound in real time. The codec, which Nokia participated in crafting, is a major leap from today’s standard monophonic smartphone voice call experience and is part of the upcoming 5G Advanced standard. The innovation paves the way toward enhanced immersive spatial communications, extended reality and metaverse applications, says Nokia, explaining that it works across “any connected device,” including smartphones, tablets and PCs.

Vision Pro Adds Dual 4K Virtual Display, URSA Cine Immersive

Apple is previewing visionOS 2, the next-gen operating system coming this fall for its Vision Pro mixed-reality headset. The new system includes a Mac Virtual Display that creates the ultra-wide equivalent of two side-by-side 4K monitors. The new OS updates navigational hand gestures and adds a Photos app feature that turns existing 2D pictures into spatial images. At WWDC Apple also announced that Blackmagic Design will release the URSA Cine Immersive, the first commercial camera system designed to capture images for the Vision Pro, and Canon unveiled a dual-lens optical setup for APS-C cameras.

Acer 3D Camera Makes Glasses-Free Content for Its Displays

Acer has extended its SpatialLabs branding from glasses-free 3D laptops to a 3D camera coming to market in Q3 starting at $549. The Acer SpatialLabs Eyes Stereo Camera has 8MP of resolution per eye and can live stream in 3D to YouTube and enable high-resolution 3D video calls through Zoom, Microsoft Teams and Google Meet. It has a built-in selfie mirror, auto and touch focus capabilities and electronic image stabilization (EIS). It is fully compatible with the Acer Aspire 3D 15 SpatialLabs Edition laptop, released in February, and will also work with other 3D displays, projectors and VR headsets.

Autodesk Buys Wonder Dynamics, AI VFX App Wonder Studio

Autodesk is going all-in on artificial intelligence with the acquisition of AI startup Wonder Dynamics, maker of the Wonder Studio VFX tool. Autodesk — whose products include Maya, 3ds Max and Flame — worked with Wonder on a Maya plug-in last year and appears to have been impressed. Wonder Studio was purpose-built to be compatible with 3D tools like Maya, largely automating the process of putting 3D characters within live-action scenes. Terms of the deal were not disclosed, and Autodesk did not detail plans for integrating Wonder Dynamics, but it’s likely the company’s AI expertise will make itself felt across the portfolio.

Looking Glass Debuts Two New Headset-Free Spatial Displays

Looking Glass has launched a new 32-inch, glasses-free spatial display and an OLED version of its 16-inch model. The screens come in both landscape and portrait orientations and are aimed at XR professionals requiring visualization for 3D digital images, video and applications in real time. The 3D displays broadcast 45-100 views for what the company says is an uncompromised group-view experience. Sensors for touchless gesture control are available and the devices support a wide variety of software, including plugins for Unity, Unreal, Blender and WebXR. The 16-inch OLED lists for $4,000 but is offered at $3,000 for a limited time.

Adobe Considers Sora, Pika and Runway AI for Premiere Pro

Adobe plans to add generative AI capabilities to its Premiere Pro editing platform and is exploring the update with third-party AI technologies including OpenAI’s Sora, as well as models from Runway and Pika Labs, making it easier “to draw on the strengths of different models” within everyday workflows, according to Adobe. Editors will gain the ability to generate and add objects into scenes or shots, remove unwanted elements with a click, and even extend frames and footage length. The company is also developing a video model for its own Firefly AI for video and audio work in Premiere Pro.

Stable Video 3D Generates Orbital Animation from One Image

Stability AI has released Stable Video 3D, a generative video model based on the company’s foundation model Stable Video Diffusion. SV3D, as it’s called, comes in two versions. Both can generate and animate multi-view 3D meshes from a single image. The more advanced version also lets users set “specified camera paths” for a “filmed” look to the video generation. “By adapting our Stable Video Diffusion image-to-video diffusion model with the addition of camera path conditioning, Stable Video 3D is able to generate multi-view videos of an object,” the company explains.

Alibaba’s EMO Can Generate Performance Video from Images

Alibaba is touting a new artificial intelligence system that can animate portraits, making people sing and talk in realistic fashion. Researchers at the Alibaba Group’s Institute for Intelligent Computing developed the generative video framework, calling it EMO, short for Emote Portrait Alive. Input a single reference image along with “vocal audio,” as in talking or singing, and “our method can generate vocal avatar videos with expressive facial expressions and various head poses,” the researchers say, adding that EMO can generate videos of any duration, “depending on the length of video input.”

AI Video Startup Haiper Announces Funding and Plans for AGI

London-based AI video startup Haiper has emerged from stealth mode with $13.8 million in seed funding and a platform that generates up to two seconds of HD video from text prompts or images. Founded by alumni from Google DeepMind, TikTok and various academic research labs, Haiper is built around a bespoke foundation model that aims to serve the needs of the creative community while the company pursues a path to artificial general intelligence (AGI). Haiper is offering a free trial of what is currently a web-based user interface similar to offerings from Runway and Pika.

ZTE Unveils Glasses-Free Android Tablet, the Nubia Pad 3D II

ZTE has launched what it calls the world’s first AI-powered, eyewear-free 5G 3D tablet, the Nubia Pad 3D II. The 12.1-inch LCD display supports 2,560 x 1,600 resolution and a 144Hz refresh rate. Powered by a Qualcomm Snapdragon 8 Gen 2 chipset, the Nubia Pad 3D II is equipped with an AI eye-tracking engine that utilizes “high-speed visual sensors and eye-detection algorithms” to enhance response speed and enable accurate synchronization with the user’s eyes in real time “for a more natural and realistic 3D display experience,” ZTE says. The device also converts 2D to 3D with Neovision 3D Anytime technology.

New Chinese Optical Disc Promises Petabyte-Plus of Storage

Researchers at China’s University of Shanghai for Science and Technology have invented an ultrahigh density optical disc format they claim can store up to 1.6 petabits (about 200 terabytes, or 200,000 gigabytes) of data. While the new discs are said to look like typical Blu-rays, the data is written to one hundred layers in a 3D stacking architecture by a 54-nanometer laser that is about one-tenth the size of visible light waves. The same laser is used to read the data back. The tech is said to present “a promising solution for cost effective, long-term archival data storage.”
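
As a quick check on the capacity figure, the conversion from petabits to terabytes and gigabytes can be sketched as below. This is an illustrative calculation only: it assumes decimal (SI) prefixes and 8 bits per byte, and the function names are hypothetical, not taken from the researchers.

```python
# Unit-conversion sketch. Assumptions: SI decimal prefixes, 8 bits per byte.
BITS_PER_BYTE = 8
TERABYTES_PER_PETABYTE = 1_000
GIGABYTES_PER_TERABYTE = 1_000


def petabits_to_terabytes(petabits: float) -> float:
    """Convert a capacity in petabits to terabytes."""
    petabytes = petabits / BITS_PER_BYTE
    return petabytes * TERABYTES_PER_PETABYTE


def petabits_to_gigabytes(petabits: float) -> float:
    """Convert a capacity in petabits to gigabytes."""
    return petabits_to_terabytes(petabits) * GIGABYTES_PER_TERABYTE


if __name__ == "__main__":
    claimed_petabits = 1.6  # capacity claimed for the disc
    print(f"{petabits_to_terabytes(claimed_petabits):,.0f} TB")
    print(f"{petabits_to_gigabytes(claimed_petabits):,.0f} GB")
```

Under those assumptions, 1.6 petabits works out to about 200 terabytes, or 200,000 gigabytes.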

OpenAI’s Generative Video Tech Is Described as ‘Eye-Popping’

OpenAI has debuted a generative video model called Sora that could be a game changer. In OpenAI’s demonstration clips, Sora depicts both fantasy and natural scenes with photorealistic intensity that makes the images appear to be photographed. Although Sora is said to be currently limited to one-minute clips, it is only a matter of time until that expands, which suggests the technology could have a significant impact on all aspects of production — from entertainment to advertising to education. Concerned about Sora’s disinformation potential, OpenAI is proceeding cautiously, and initially making it available only to a select group to help it troubleshoot.