ChatGPT Goes Multimodal: OpenAI Adds Vision, Voice Ability

OpenAI began previewing vision capabilities for GPT-4 in March, and the company is now starting to roll out image input and output to users of its popular ChatGPT chatbot. The multimodal expansion also includes audio functionality, with OpenAI proclaiming late last month that “ChatGPT can now see, hear and speak.” The upgrade vaults GPT-4 into the multimodal category with what OpenAI calls GPT-4V (for “Vision,” though equally applicable to “Voice”). “We’re rolling out voice and images in ChatGPT to Plus and Enterprise users,” OpenAI announced.
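For developers, image understanding is also surfaced through OpenAI’s chat completions API, where a message can mix text and image parts. The snippet below is a minimal sketch of what such a request looks like, assuming the openai v1 Python package, an account with access to a vision-capable model, and an illustrative model name and image URL; naming and availability may differ from what ships in the ChatGPT apps.

```python
# Minimal sketch of an image-input request via OpenAI's chat completions API.
# Assumptions: the openai v1 Python package is installed, OPENAI_API_KEY is
# set in the environment, and the account has access to a vision-capable
# model; the model name and image URL below are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # a vision-capable GPT-4 model
    messages=[
        {
            "role": "user",
            # Multimodal messages carry a list of parts: text plus images.
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```

The voice side of the rollout, per OpenAI’s announcement, is an app-level feature that pairs Whisper-based speech recognition with a new text-to-speech model, so the flow is speech in, text through the model, speech out.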

Meta Unveils Llama 2 LLM with Microsoft as Preferred Partner

This week, Meta Platforms released Llama 2, the next generation of its open-source large language model, which is free for research and commercial use. Llama 2’s pretrained and fine-tuned models are available in sizes ranging from 7 billion to 70 billion parameters. Meta also named Microsoft its “preferred partner for Llama 2,” making the models available through the Azure AI model catalog for use with cloud-native tools for content filtering and safety features. Meta says Llama 2 is “also optimized to run locally on Windows,” giving developers a seamless workflow across enterprise and consumer platforms.
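Beyond the Azure catalog, the weights are commonly obtained through Hugging Face after accepting Meta’s license. The snippet below is an illustrative sketch, not Meta’s or Microsoft’s official workflow, of loading the 7B chat variant with the transformers library; the repo ID is Meta’s gated Hugging Face listing, and the prompt is just an example.

```python
# Illustrative sketch: running the Llama 2 7B chat model locally with
# Hugging Face transformers. Assumes `pip install transformers accelerate`,
# a logged-in Hugging Face account, and acceptance of Meta's Llama 2
# license (the model repo is gated).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated repo on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # spread layers across available GPU/CPU memory
)

# Note: the chat checkpoints expect Llama 2's [INST] prompt format for
# best results; a bare prompt is used here only to keep the sketch short.
prompt = "Explain in one sentence why smaller model sizes matter."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice, the 7B and 13B checkpoints are the realistic choices for single-GPU or local experimentation, while the 70B model generally calls for multi-GPU serving.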