By ETCentric Staff, March 6, 2024
Anthropic has released Claude 3, claiming new industry benchmarks that see the family of three new large language models approaching “near-human” cognitive capability in some instances. Accessible via Anthropic’s website, the three new models — Claude 3 Haiku, Claude 3 Sonnet and Claude 3 Opus — represent successively increased complexity and parameter count. Sonnet is powering the current Claude.ai chatbot and is free, for now, requiring only an email sign-in. Opus comes with the $20 monthly subscription for Claude Pro. Both are generally available from the Anthropic website and via API in 159 countries, with Haiku coming soon. Continue reading Anthropic’s Claude 3 AI Is Said to Have ‘Near-Human’ Abilities
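For developers, access runs through Anthropic’s Messages API. The snippet below is a minimal sketch of calling the Opus model via the official Python SDK; the model ID, token limit and prompt are illustrative stand-ins rather than values from the announcement.

```python
# Minimal sketch: querying Claude 3 Opus through Anthropic's Python SDK
# (pip install anthropic). Assumes ANTHROPIC_API_KEY is set in the environment;
# the model ID and max_tokens value here are illustrative.
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-opus-20240229",  # swap in a Sonnet or Haiku ID as needed
    max_tokens=512,
    messages=[{"role": "user", "content": "Summarize the Claude 3 model family in two sentences."}],
)
print(message.content[0].text)
```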
By ETCentric Staff, February 26, 2024
Community message board and social news aggregator Reddit, founded in 2005, has filed to go public on the New York Stock Exchange in an IPO observers say may be completed in a matter of weeks. It is the first social media company to go public in many years, with Snap Inc.’s 2017 offering cited as the most recent stock market splash. Reddit’s bankers are reportedly seeking a $5 billion valuation, about half the $10 billion it was valued at in a 2021 private funding round. Reddit filed with the SEC the same day it announced an “expanded partnership” with Google to use Vertex AI. Continue reading Reddit Announces IPO on Heels of Expanded Deal with Google
By ETCentric Staff, February 21, 2024
Researchers at Amazon have trained what they are calling the largest text-to-speech model ever created, which they claim is exhibiting “emergent” qualities — the ability to inherently improve itself at speaking complex sentences naturally. Called BASE TTS, for Big Adaptive Streamable TTS with Emergent abilities, the new model could pave the way for more human-like interactions with AI, reports suggest. Trained on 100,000 hours of public domain speech data, BASE TTS offers “state-of-the-art naturalness” in English as well as some German, Dutch and Spanish. Text-to-speech models are used to develop voice assistants for smart devices and apps, as well as accessibility tools. Continue reading Amazon Claims ‘Emergent Abilities’ for Text-to-Speech Model
By ETCentric Staff, February 16, 2024
Stability AI, purveyor of the popular Stable Diffusion image generator, has introduced a completely new model called Stable Cascade. Now in preview, Stable Cascade uses a different architecture from Stable Diffusion’s SDXL, one the UK company’s researchers say is more efficient. Cascade builds on a compression architecture called Würstchen (German for “sausage”) that Stability began sharing in research papers early last year. Würstchen is a three-stage process that includes two-step encoding. It uses fewer parameters, meaning less data to train on, greater speed and reduced costs. Continue reading Stability AI Advances Image Generation with Stable Cascade
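In practice the three-stage Würstchen design surfaces as a two-pipeline workflow: a prior stage compresses the prompt into compact latents and a decoder stage expands them into the final image. The sketch below assumes the Hugging Face diffusers integration from the preview release; treat class names, checkpoints and settings as assumptions to verify against Stability’s documentation.

```python
# Hedged sketch of the two-stage Stable Cascade flow in Hugging Face diffusers.
# Pipeline classes, checkpoint names and dtypes are assumptions from the preview release.
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

prompt = "a watercolor illustration of a lighthouse at dusk"

# Stage 1: the prior produces highly compressed image embeddings from the prompt.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to("cuda")
prior_output = prior(prompt=prompt, num_inference_steps=20)

# Stage 2: the decoder expands those embeddings into the final image.
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.float16
).to("cuda")
image = decoder(
    image_embeddings=prior_output.image_embeddings.to(torch.float16),
    prompt=prompt,
    num_inference_steps=10,
).images[0]
image.save("stable_cascade.png")
```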
By Paula Parisi, January 31, 2024
Nightshade, a data-poisoning tool that helps artists fight AI copyright infringement, generated 250,000 downloads shortly after its January release, exceeding the expectations of its creators in the computer science department at the University of Chicago. Nightshade allows artists to prevent AI models from scraping and training on their work without consent. The Bureau of Labor Statistics counts more than 2.67 million artists working in the U.S., but social media feedback indicates the downloads have been worldwide. One of the coders says cloud mirror links had to be added so as not to overwhelm the University of Chicago’s web servers. Continue reading AI Poison Pill App Nightshade Has 250K Downloads in 5 Days
By Paula Parisi, January 17, 2024
Getty Images and Nvidia are expanding their AI partnership with the addition of the text-to-image platform Generative AI by iStock, designed to produce stock photos that can be used by individuals or enterprise customers. Built on Nvidia Picasso, a foundry for custom AI models, and trained exclusively on data from Getty Images’ proprietary creative libraries, Generative AI by iStock “has been engineered to guard against generations of known products, people, places or other copyrighted elements,” Getty explains, adding that “any licensed visual that a customer generates comes with iStock’s standard $10,000 USD legal coverage.” Continue reading CES: Getty Rolls Out iStock Generative AI Powered by Nvidia
By Paula Parisi, December 22, 2023
Google has unveiled a new large language model designed to advance video generation. VideoPoet is capable of text-to-video, image-to-video, video stylization, video inpainting and outpainting, and video-to-audio. “The leading video generation models are almost exclusively diffusion-based,” Google says, citing Imagen Video as an example. Google finds this counterintuitive, since “LLMs are widely recognized as the de facto standard due to their exceptional learning capabilities across various modalities.” VideoPoet eschews the diffusion approach of relying on separately trained models for different tasks in favor of integrating many video generation capabilities in a single LLM. Continue reading VideoPoet: Google Launches a Multimodal AI Video Generator
By Paula Parisi, December 15, 2023
Google is rolling out Gemini to developers, enticing them with tools including AI Studio, an easy-to-navigate web-based platform that will serve as a portal to the multi-tiered Gemini ecosystem, beginning with Gemini Pro, with Gemini Ultra to come next year. The service aims to allow developers to quickly create prompts and Gemini-powered chatbots, providing access to API keys to integrate them into apps. They’ll also be able to access code, should projects require a full-featured IDE. The site is essentially a revamped version of what was formerly Google’s MakerSuite. Continue reading Google Debuts Turnkey Gemini AI Studio for Developing Apps
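Once a key has been generated in AI Studio, it can be dropped into the google-generativeai Python SDK to call Gemini Pro from an app. The snippet below is a minimal sketch, with the placeholder key and prompt standing in for real values.

```python
# Minimal sketch: calling Gemini Pro with an AI Studio API key via the
# google-generativeai SDK (pip install google-generativeai).
# The key placeholder and prompt are illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_AI_STUDIO_API_KEY")  # key created in AI Studio

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Write a one-line welcome message for a travel app.")
print(response.text)
```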
By Paula Parisi, December 15, 2023
Microsoft is releasing Phi-2, a text-to-text small language model (SLM) that outperforms some LLMs, yet is light enough to run on a mobile device or laptop, according to Microsoft CEO Satya Nadella. The 2.7 billion-parameter SLM beat Meta Platforms’ Llama 2 and Mistral 7B from France (each with 7 billion parameters), says Microsoft, emphasizing that its complex reasoning and language comprehension are exceptional for a model with fewer than 13 billion parameters. For now, Microsoft is making it available “for research purposes only” under a custom license. Continue reading Microsoft Says Phi-2 Can Outperform Large Language Models
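Phi-2 is distributed through the Hugging Face Hub, so a typical research setup loads it with transformers. The sketch below assumes the microsoft/phi-2 checkpoint and a GPU with enough memory for float16 weights; the custom research license still applies.

```python
# Hedged sketch: running Phi-2 locally with Hugging Face transformers.
# The checkpoint is released for research use under a custom license.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

inputs = tokenizer("Explain in one sentence why the sky is blue.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```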
By Paula Parisi, December 12, 2023
The EU has reached a provisional agreement on the Artificial Intelligence Act, making it the first Western democracy to establish comprehensive AI regulations. The sweeping new law predominantly focuses on so-called “high-risk AI,” establishing parameters — largely in the form of reporting and third-party monitoring — “based on its potential risks and level of impact.” Parliament and the 27-country European Council must still hold final votes before the AI Act is finalized and goes into effect, but the agreement, reached Friday in Brussels after three days of negotiations, means the main points are set. Continue reading EU Makes Provisional Agreement on Artificial Intelligence Act
By Paula Parisi, December 12, 2023
Google’s personalized AI assistant NotebookLM is an experimental product that has been in early access since July. Now the company is integrating its new Gemini Pro LLM with NotebookLM and making it available to U.S. residents 18 and older. NotebookLM is engineered “to help you do your best thinking,” Google says, with documents uploaded to the service making it “an instant expert in the information you need,” allowing it to answer questions about your data. Unlike generic chatbots, NotebookLM draws responses from the documents you feed it, meaning it will be hyper-focused — a lite version of a custom-trained model. Continue reading Google’s NotebookLM is a Personalized Lite Language Model
By Paula Parisi, December 1, 2023
Amazon is debuting its Titan Image Generator in preview for AWS Bedrock customers. The new Titan generative AI model can create new images from a text prompt or existing image, and automatically adds watermarking to protect intellectual property. The move into generative imaging puts Amazon in competition with a growing field that includes large firms like Adobe and Google. Unlike those companies and others, the e-retail giant is at present focusing exclusively on enterprise customers. Amazon Bedrock is a managed service giving developers access to a range of foundation models from companies including Meta Platforms, Anthropic, and Amazon itself. Continue reading Amazon Previews Titan Image Generator for Bedrock Clients
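Because Titan Image Generator is exposed as a Bedrock model, enterprise developers call it through the Bedrock runtime rather than a standalone service. The sketch below uses boto3’s invoke_model; the model ID, region and request-body fields are assumptions based on Bedrock’s published Titan examples and should be checked against the current documentation.

```python
# Hedged sketch: text-to-image with Titan Image Generator via the Bedrock runtime.
# Model ID, region and request schema are assumptions to verify against AWS docs.
import base64
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

request_body = {
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {"text": "a minimalist poster of a red bicycle"},
    "imageGenerationConfig": {"numberOfImages": 1, "width": 1024, "height": 1024},
}

response = bedrock.invoke_model(
    modelId="amazon.titan-image-generator-v1",
    body=json.dumps(request_body),
)
payload = json.loads(response["body"].read())

# Titan returns generated images as base64-encoded strings.
with open("titan_image.png", "wb") as f:
    f.write(base64.b64decode(payload["images"][0]))
```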
By Paula Parisi, December 1, 2023
Amazon has added five new capabilities to its SageMaker service, including SageMaker HyperPod, which accelerates large language and foundation model training and tuning. SageMaker HyperPod is said to shorten training time by up to 40 percent using its purpose-built infrastructure designed for distributed training at scale. By optimizing acceleration, SageMaker Inference reduces foundation model deployment costs by 50 percent and latency by 20 percent on average, Amazon claims. “SageMaker HyperPod removes the undifferentiated heavy lifting involved in building and optimizing machine learning infrastructure,” said Amazon. Continue reading SageMaker HyperPod: Amazon Accelerates AI Model Training
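HyperPod clusters are provisioned through the SageMaker control plane rather than spun up per training job. The sketch below shows what creating a small cluster with boto3 might look like; the instance type, role ARN, lifecycle-script location and even the exact request shape are assumptions to check against the SageMaker HyperPod documentation.

```python
# Hedged sketch: provisioning a small SageMaker HyperPod cluster with boto3.
# All names, ARNs, S3 paths and instance choices below are placeholders/assumptions.
import boto3

sagemaker = boto3.client("sagemaker", region_name="us-east-1")

response = sagemaker.create_cluster(
    ClusterName="llm-training-cluster",
    InstanceGroups=[
        {
            "InstanceGroupName": "gpu-workers",
            "InstanceType": "ml.p4d.24xlarge",  # placeholder accelerator instance
            "InstanceCount": 2,
            "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",  # setup scripts
                "OnCreate": "on_create.sh",
            },
        }
    ],
)
print(response["ClusterArn"])
```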
By Paula Parisi, November 27, 2023
Stability AI has opened a research preview of its first foundation model for generative video, Stable Video Diffusion, offering text-to-video and image-to-video. Based on the company’s Stable Diffusion text-to-image model, the new open-source model generates video by animating existing still frames, including “multi-view synthesis.” While the company plans to enhance and extend the model’s capabilities, it currently comes in two versions: SVD, which transforms stills into 576×1024 videos of 14 frames, and SVD-XT, which generates up to 24 frames — each at between three and 30 frames per second. Continue reading Stability Introduces GenAI Video Model: Stable Video Diffusion
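Because the released checkpoints animate a still image, a typical workflow starts from a single conditioning frame. The sketch below assumes the Hugging Face diffusers integration and the SVD-XT checkpoint name from the research preview; resolution, chunk size and frame rate are illustrative.

```python
# Hedged sketch: image-to-video with the SVD-XT checkpoint via Hugging Face diffusers.
# Checkpoint name, resolution and settings are assumptions from the research preview.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = load_image("conditioning_frame.png")  # ideally a 1024x576 still
frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "generated_clip.mp4", fps=7)
```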
By Paula Parisi, November 20, 2023
Having made the leap from image generation to video generation over the course of a few months in 2022, Meta Platforms introduces Emu, its first visual foundational model, along with Emu Video and Emu Edit, positioned as milestones in the trek to AI moviemaking. Emu uses just two diffusion models to generate 512×512, four-second-long videos at 16 frames per second, Meta said, comparing that to 2022’s Make-A-Video, which requires a “cascade” of five models. Internal research found Emu video generations were “strongly preferred” over the Make-A-Video model based on quality (96 percent) and prompt fidelity (85 percent). Continue reading Meta Touts Its Emu Foundational Model for Video and Editing