With its first birthday right around the corner, Google’s conversational AI tool Bard is getting another upgrade.
Early reviews found Bard lagged behind competing chatbots in part because its responses were less human-like. Since then, Google has integrated its AI model Gemini, which helped Bard expand its training beyond text to video, audio and photos, and it’s now attempting to make further strides in utility as generative AI remains a hot ticket.
In a blog post Thursday, Bard product lead Jack Krawczyk said the tool now allows users to generate images for free.
When someone types in a prompt, like, “create an image of a hot air balloon flying over the mountains at sunset,” Bard generates what Google describes as “custom, wide-ranging visuals to help bring your idea to life.”
It does, however, take some time: around 13 seconds.
And while most queries we tried generated relevant images and/or responses, Bard doesn’t have a 100% accuracy rate.
For example, when I asked Bard to create an image for a news story about the most recent updates to the tool, it declined to generate the requested image. And when I asked it to simply create an image about Google Bard, it created a blonde cyborg.
Over the last year, the market has been flooded with chatbots like OpenAI’s ChatGPT, Microsoft’s Bing AI, Anthropic’s Claude — and, yes, Google’s Bard — as Big Tech seeks to stake its claim in the next wave of search. These chatbots access huge datasets and use large language models to deliver text, and now image or even video, responses to consumer queries. It’s a rapidly evolving field that has already gotten closer to human conversation. However, while the bots may confidently deliver answers, they aren’t always accurate — and they remain vulnerable to abuse.
Google’s post noted that Bard draws a distinction between visuals created with Bard and original human artwork, and that it embeds watermarks into the pixels of generated images. To test this, I asked it to create an image of Botticelli’s Birth of Venus. It offered up a replica, but sloppier. Those faces! Those hands! Each image also comes with an option to report a legal issue and to give it a thumbs up or thumbs down.
In the wake of Taylor Swift deepfakes, Google said it seeks to limit “violent, offensive or sexually explicit content” and applies filters to avoid the generation of images of named people. Indeed, Bard declined to create an image of Super Bowl quarterbacks Patrick Mahomes and Brock Purdy having a picnic, or one of Beyoncé at the bank.
“We’ll continue investing in new techniques to improve the safety and privacy protections of our models,” Krawczyk wrote.
When I asked Bard to generate an image of Lisa Lacy at work, Bard said it did not have enough information about that person to help. It was, however, able to create an image for the more generic query of a journalist at work — with not one but two sandwiches on his desk.
It declined to create an image of a man tossing a coin from Hoover Dam as “throwing objects into Hoover Dam is prohibited.” (It offered to create an image of a scenic view or a historical depiction instead.)
And Bard was happy to create images of historical moments like the signing of the Declaration of Independence.
In addition to the image generation tool, Google is expanding the availability of Gemini Pro in Bard from English to more than 40 languages. That includes its double-check feature, which allows users to fact-check Bard’s responses against web content.
This will help the tool expand to more than 230 countries and territories, according to the post.
Google first added Gemini Pro to Bard in December 2023, to give it “more advanced understanding, reasoning, summarizing and coding abilities.”
Editors’ note: CNET is using an AI engine to help create some stories. For more, see this post.