MAGE AI Unifies Generative and Recognition Image Training

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have introduced a computer vision system that combines image recognition and image generation technology into one training model instead of two. The result, MAGE (short for MAsked Generative Encoder), holds promise for a wide variety of use cases and is expected to reduce costs through unified training, according to the team. “To the best of our knowledge, this is the first model that achieves close to state-of-the-art results for both tasks using the same data and training paradigm,” the researchers said.

Meta Develops Computer Vision AI That Learns Like Humans

Meta Platforms continues to make progress on a mission to develop artificial intelligence that can teach itself how the world works. Chief AI Scientist Yann LeCun has taken a special interest in developing the new model, called Image Joint Embedding Predictive Architecture, or I-JEPA, which learns by building an internal representation of the outside world and analyzing abstract representations of images instead of comparing pixels. The approach allows AI tech to learn more like humans do, with their ability to figure out complex tasks and adapt to new situations.

Apple Eyes AI Video Compression with WaveOne Acquisition

Apple has acquired WaveOne, a Mountain View-based startup that has been developing AI algorithms for video compression. Cupertino has been mum about the purchase, but the deal reportedly closed in January, and WaveOne employees are said to have been absorbed into Apple’s machine learning groups. WaveOne’s codecs use machine learning to squeeze more picture out of less bandwidth, including optimizing for signal interruptions so the picture doesn’t freeze or disappear, which makes them well suited to mobile. As Netflix and YouTube tout picture improvements, WaveOne could potentially give Apple TV+ and a mixed reality headset an advantage.

Pinterest Grows Its Active Users, Focuses on Video Shopping

Pinterest grew Q4 year-over-year revenue by 4 percent, to $877 million, while full-year sales jumped 9 percent in 2022, totaling $2.8 billion. The company said that global monthly active users also grew by 4 percent in the three-month period ending December 31, to a total of 450 million. CEO Bill Ready emphasized on the earnings call the intent to eventually “make every pin shoppable.” Similar to how it is monetizing still images, Pinterest is focusing on making videos “more actionable” by applying what it calls “our computer vision technology.”

CES: Encoding Environmental Intelligence with New Chip Tech

The design of truly contextual experiences — whether for utility or entertainment — requires a knowledge of both the user and the environment they are in. This becomes especially relevant when we think of what it means to build interesting mixed reality experiences. CES this year showcased a variety of computer vision AI software tools oriented towards understanding environmental context. At Eureka Park in the Venetian, however, MantiSpectra’s chip sensor technology provided a peek into the benefits for user experience enabled by environmental intelligence arising from hardware.

Google Search Reinvention Focuses on Visuals and Discovery

Google is the latest tech giant to be swayed by the influence of TikTok and Instagram as it reimagines a more visual, discovery-centric type of search. That was major media’s takeaway from the third annual Google Search On event, which continued the trend of trying to find more intuitive ways to search, namely visually and vocally, by snapping a photo or asking your phone a question. Thanks to advances in artificial intelligence, the Alphabet company says it is “going far beyond the search box to create search experiences that work more like our minds.”

New Amazon Devices Include Home Robot, Smart Thermostat

During its streamed media event this week, Amazon introduced new devices including a wheeled robot named Astro and a sale-by-invitation-only Ring autonomous security drone for the home. While the unusual products added sizzle, the focus was largely on basics like its first smart thermostat, updates to the Echo speaker line and Ring security products. Several of the new products appear to target market share of products already on offer, including through Amazon, and many emphasize synergy among Amazon’s hardware brands. The company’s fee-based premium services were also emphasized.

IBM Project CodeNet Employs AI Tools to Program Software

IBM’s AI research unit debuted Project CodeNet, a dataset to develop machine learning models for software programming. The name is a take-off on ImageNet, the influential dataset of photos that pushed the development of computer vision and deep learning. Creating “AI for code” systems has been challenging since software developers are constantly discovering new problems and exploring different solutions. IBM researchers have taken that into consideration in developing a multi-purpose dataset for Project CodeNet.

IBM CodeNet Enables AI Translation of Computer Languages

During its Think conference this week, IBM debuted Project CodeNet, an open-source dataset for benchmarking around AI for code. Project CodeNet consists of 14 million code examples, which makes it about 10 times larger than the most similar dataset, which has 52,000 examples. Project CodeNet also offers 500 million lines of code and 55 programming languages including C++, Java, Python, Go, COBOL, Pascal and Fortran, making it a Rosetta Stone for AI systems to automatically translate code into other programming languages.

Facebook Counters AI Bias with a Data Set Featuring Actors

Facebook released an open-source AI data set of 45,186 videos featuring 3,011 U.S. actors who were paid to participate. The data set is dubbed Casual Conversations because the diverse group was recorded giving unscripted answers to questions about age and gender. Skin tone and lighting conditions were also annotated by humans. Biases have been a problem in AI-enabled technologies such as facial recognition. Facebook is encouraging teams to use the new data set. Most AI data sets comprise people unaware they are being recorded.

CES: Seoul Robotics, Mobileye Enable Lidar for Smart Cities

During the all-digital CES 2021, lidar (light detection and ranging) technology was presented as a key tool for building autonomous vehicles, smart homes and infrastructure for smart cities. Lidar, which senses what an object is based on its shape, first appeared in the 1970s but, up until now, has been too expensive and complicated for broad industrial use. Seoul Robotics, Intel’s Mobileye and Blickfeld were among the companies at CES showcasing real-world lidar applications. Lidar is predicted to triple to an almost $3 billion market by 2025.

OpenAI Unveils AI-Powered DALL-E Text-to-Image Generator

OpenAI unveiled DALL-E, which generates images from text using two multimodal AI systems that leverage computer vision and NLP. The name is a reference to surrealist artist Salvador Dalí and Pixar’s animated robot WALL-E. DALL-E relies on a 12-billion parameter version of GPT-3. OpenAI demonstrated that DALL-E can manipulate and rearrange objects in generated imagery and also create images from scratch based on text prompts. The company has stated that it plans to “analyze how models like DALL·E relate to societal issues.”

Amazon Unveils Computer Vision Products for Industrial Use

Amazon announced the AWS Panorama Appliance, a plug-in that connects to a network and identifies video streams from cameras in customers’ industrial facilities. It enables AI services for construction, manufacturing, retail and other industries and is aimed at “industrial companies looking for a more holistic, computer vision-centric analytics solution.” It integrates with AWS IoT services including SiteWise. Also new is the AWS Panorama SDK that allows manufacturers to build new cameras for computer vision at the edge.

Facebook Builds Out Its Shopping Features Across Platforms

Next Tuesday, Facebook will begin the global rollout of a new tab in its main app called Facebook Shop, which allows users to browse product catalogs and buy items directly on the social media platform. The new feature, previously in beta with a small group of U.S. users, joins a similar feature launched on Instagram last month. Prior to Facebook Shop, businesses could add catalogs to their Facebook pages, but the new feature is a dedicated marketplace for multiple retailers. Instagram’s Checkout feature will also soon be broadly available.

Google Developing New Cloud Services During the Pandemic

According to Google Cloud chief executive Thomas Kurian, the coronavirus pandemic has had an impact on the development of new cloud features. “Every week, there’s a new set of dimensions, and we have to adapt, keep people positive, and focus through it,” he said. A new security product that encrypts data while it’s being processed, for example, is aimed at luring businesses in highly regulated industries to adopt cloud services. Another cloud-computing product is Assured Workloads for Government, a new way to secure public sector deals.