Microsoft’s Next Generation of Bing AI Interacts with Images

Microsoft’s AI-powered Bing search engine has drawn more than 100 million daily active users and logged half a billion chats. With OpenAI’s GPT-4 and DALL-E 2 models driving the action, it has also generated over 200 million images since debuting in limited preview in February. Seeking to build on that momentum, Microsoft is adding new features and integrating Bing more tightly with its Edge browser. The company is also ditching its waitlist and moving to an open preview. “We’re underway with the transformation of search,” CVP and consumer CMO Yusuf Mehdi said at a preview event last week.

Bing combines OpenAI’s powerful large language models with Microsoft’s own “immense search index for results that are current, cited and conversational — something you can’t get anywhere else but on Bing,” Microsoft writes in a blog post, adding that the ability to create and compose queries in chat and get conversational responses presents “a new level of ease” that is “fundamentally changing the way people find information.”

Among the functionalities being added to the next generation of AI-enhanced Bing and Edge — which Microsoft calls “your copilot for the Web” — are:

  • Moving from text-only search and chat to multimodal support, with image and video queries and responses. Coming soon.
  • Moving from single-use chat and search sessions to multi-session productivity experiences, with easier chat history retrieval and archiving and persistent chats within Edge.
  • Opening up platform capabilities to developers and third parties who want to build apps for Bing that help people take actions on their queries and complete tasks.

Bing’s new multimodality means “Bing Chat will soon respond with images — at least where it makes sense,” TechCrunch writes, explaining that “answers to questions (e.g. ‘Where is Machu Picchu?’) will be accompanied by relevant images if any exist, much like the standard Bing search flow but condensed into a card-like interface.”

Microsoft says filtering is in place to prevent explicit images from appearing. Sarah Bird, the head of responsible AI at Microsoft, told TechCrunch that Bing Chat draws not only on the moderation algorithms used in Bing search but also adds “toxicity classifiers,” AI models trained to detect potentially harmful or offensive prompts.
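Microsoft has not published how these classifiers are wired into Bing Chat. Purely as an illustration, a prompt-moderation gate of that kind might look like the following Python sketch, in which the scoring function, blocked-term list and threshold are invented stand-ins rather than Microsoft’s actual implementation.

```python
# Hypothetical sketch of a prompt-moderation gate; the classifier and
# threshold below are illustrative stand-ins, not Microsoft's implementation.
from dataclasses import dataclass


@dataclass
class ModerationResult:
    allowed: bool
    reason: str = ""


def score_toxicity(prompt: str) -> float:
    """Placeholder scorer: a real system would call a trained toxicity classifier."""
    blocked_terms = {"explicit", "violent"}  # illustrative only
    hits = sum(term in prompt.lower() for term in blocked_terms)
    return min(1.0, hits / 2)


def moderate_prompt(prompt: str, threshold: float = 0.5) -> ModerationResult:
    """Run a prompt through the toxicity gate before it reaches the chat model."""
    score = score_toxicity(prompt)
    if score >= threshold:
        return ModerationResult(allowed=False, reason=f"toxicity score {score:.2f}")
    return ModerationResult(allowed=True)


if __name__ == "__main__":
    print(moderate_prompt("How do I crochet an octopus?"))
```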

Bing Chat is also being programmed to interpret images in addition to understanding text. “Users will be able to upload images and search the web for related content, for example copying a link to an image of a crocheted octopus and asking Bing Chat the question ‘how do I make that?’ to get step-by-step instructions,” TechCrunch says.
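Microsoft has not exposed a public API for this image-in, answer-out flow. The Python sketch below is a hypothetical illustration of the concept only, with the endpoint URL, payload fields and response shape all invented for the example.

```python
# Hypothetical sketch of an image-grounded chat query. The endpoint URL,
# payload fields and response shape are invented for illustration; Bing Chat's
# actual interface is the browser and Edge sidebar UI, not this API.
import requests


def ask_about_image(image_url: str, question: str) -> str:
    payload = {
        "question": question,    # natural-language query, e.g. "how do I make that?"
        "image_url": image_url,  # link to the image the question refers to
    }
    # Placeholder endpoint; a real call would target an actual multimodal chat service.
    response = requests.post(
        "https://example.invalid/chat/multimodal", json=payload, timeout=30
    )
    response.raise_for_status()
    return response.json().get("answer", "")


if __name__ == "__main__":
    print(ask_about_image(
        "https://example.invalid/crocheted-octopus.jpg",
        "How do I make that?",
    ))
```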

Bing’s third-party service integration will see apps like OpenTable and the science app Wolfram Alpha interacting directly through Bing chat, sometimes with videos and charts. Bloomberg reports that at the Manhattan event, Bing reps demoed “how a user can type in, say, ‘Find me a dinner reservation for two in New York City tonight,’ and get a link to the reservation service OpenTable.”
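Microsoft has not yet detailed the developer interface for these integrations. Conceptually, routing a chat query to a third-party action could be sketched as below; the intent matching, the reservation helper and the URL format are assumptions made for illustration, not Bing’s or OpenTable’s actual interfaces.

```python
# Hypothetical sketch of routing a chat query to a third-party action.
# The intent detection, reservation helper and URL pattern are invented for
# illustration; they are not Bing's or OpenTable's actual interfaces.
from urllib.parse import urlencode


def find_reservation(city: str, party_size: int, date: str) -> str:
    """Return a booking deep link a chat answer could surface (illustrative URL format)."""
    query = urlencode({"covers": party_size, "dateTime": date, "term": city})
    return f"https://www.opentable.com/s?{query}"


def route_query(query: str) -> str:
    """Very rough intent routing: reservation requests go to the booking helper."""
    if "reservation" in query.lower():
        return find_reservation(city="New York City", party_size=2, date="2023-05-10T19:00")
    return "No third-party action matched; fall back to regular chat."


if __name__ == "__main__":
    print(route_query("Find me a dinner reservation for two in New York City tonight"))
```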
