Alibaba’s Qwen3-Omni AI Ingests Text, Images, Audio, Video
By Paula Parisi
September 24, 2025
Alibaba Cloud’s newest AI model, Qwen3-Omni-30B-A3B, has debuted with a splash. The Chinese company is touting it as “the first natively end-to-end omni-modal AI unifying text, image, audio & video in one model.” While Qwen3-Omni accepts text, image, audio and video prompts, it outputs only text and audio. Alibaba Cloud has released three versions of Qwen3-Omni so users can choose the one that fits their needs: general multimodal capabilities, deep reasoning or specialized audio understanding. Alibaba’s chip unit, T-Head, has also developed an AI chip that reportedly performs comparably to Nvidia’s H20.