Google halts Gemini AI image generator after it created inaccurate historical content

Gemini chatbot illustration

Google said on Thursday 22nd February 2024 that it is pausing its Gemini artificial intelligence (AI) image generation feature, saying it had produced ‘inaccuracies’ in historical pictures.

Users had reported that the AI tool generated images of historical figures, such as the U.S. Founding Fathers, as people of colour, and called this inaccurate.

Google posted a statement on Thursday 22nd February 2024, saying that it will pause Gemini’s feature to generate images of people and will re-release an ‘improved’ version soon.

Is Google struggling to keep up with the AI race?

The image generator tool was launched at the start of February 2024 through Gemini, which was originally called Bard.

It is facing challenges at a time when Google is trying to catch up with Microsoft-backed OpenAI.

Google releases Gemini, its latest AI venture

Chatbot

Pressure mounts on Google to demonstrate how it plans to monetize AI.

Google launched what it describes as its largest and most capable artificial intelligence (AI) model on Wednesday 6th December 2023, as pressure mounts on the company to answer how it will monetize AI.

Gemini

The Gemini large language model will come in three sizes: Gemini Ultra, its largest and most capable category; Gemini Pro, which scales across a wide range of tasks; and Gemini Nano, which it will use for specific tasks and mobile devices.

Cloud

Google is reportedly planning to license Gemini to clients through Google Cloud to use in their own applications. Developers and enterprise customers can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI.

Android

Android developers will also be able to build with Gemini Nano. Gemini will additionally power Google products such as its Bard chatbot and Search Generative Experience, which tries to answer search queries with conversational-style text.

Ultra

Gemini Ultra is reportedly the first model to outperform human experts on MMLU (massive multitask language understanding), a benchmark that uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics to test both world knowledge and problem-solving abilities, the company said in a blog post on Wednesday 6th December 2023.

It can supposedly understand nuance and reasoning in complex subjects.

Advanced

The company gave examples of Gemini taking a screenshot of a chart, analysing hundreds of pages of research and then updating the chart.

Another example was analyzing a photo of a person’s math homework and identifying correct answers and pointing out incorrect ones.

The future is artificial.

Definition of the word Gemini: a constellation, an astrological sign, or ‘twins’ in Latin.

OpenAI says ChatGPT can now ‘speak,’ listen and process images

ChatGPT

I can see and hear and speak…

OpenAI’s ChatGPT can now ‘see, hear and speak,’ or, at least, understand spoken words, respond with a synthetic voice and process images, the company announced Monday 25th September 2023.

The update to the chatbot, OpenAI’s biggest since the introduction of GPT-4, allows users to opt into voice conversations on ChatGPT’s mobile app and choose from five different synthetic voices for the bot to respond with. Users will also be able to share images with ChatGPT and highlight areas of focus or analysis.

Roll out

The changes will be rolling out to paying users in the next two weeks, OpenAI said. ‘While voice functionality will be limited to the iOS and Android apps, the image processing capabilities will be available on all platforms’.

The big feature push comes alongside ever-rising stakes of the artificial intelligence (AI) race among chatbot leaders such as OpenAI, Microsoft, Google and Anthropic. In an effort to encourage consumers to adopt generative AI into their daily lives, tech giants are racing to launch not only new chatbot apps, but also new features. Google has announced updates to its Bard chatbot, and Microsoft added visual search to Bing.

Investment expansion

Earlier this year, Microsoft expanded its investment in OpenAI by an additional $10 billion, making it the biggest AI investment of the year. In April 2023, the startup reportedly structured a $300 million share sale at a valuation of between $27 billion and $29 billion, with investments from firms such as Sequoia Capital and Andreessen Horowitz.

Concerns

Experts have raised concerns about AI-generated synthetic voices, which in this case could allow users a more natural experience but also enable more convincing deepfakes. Cyber threat investigators and researchers have already begun to explore how deepfakes can be used to penetrate cybersecurity systems.

OpenAI acknowledged those concerns in its announcement, saying that synthetic voices were ‘created with voice actors we have directly worked with,’ rather than collected from strangers.

The release also provided little information about how OpenAI would use consumer voice inputs, or how the company would secure that data if it were used. OpenAI did not immediately respond to CNBC’s request for comment, and the company’s terms of service say that consumers own their inputs ‘to the extent permitted by applicable law.’

What does ‘ChatGPT’ actually mean?

ChatGPT stands for Chat Generative Pre-trained Transformer. It is the name of an artificial intelligence model that can generate natural language text based on user input.

It was developed by OpenAI, a research organization dedicated to creating and ensuring the safe and beneficial use of artificial intelligence (AI). ChatGPT can be used for various purposes, such as answering questions, having conversations, and producing creative writing.

Baidu launches raft of AI applications after its Ernie chatbot receives massive public approval

AI chatbot

More than 6 million users already

Baidu announced that more than 6 million users have used an AI-powered tool that sits inside its Google Drive-like cloud product.

At its 4th September 2023 event, Baidu also demonstrated generative AI-based products that could assist with traffic management, financial research and coal mine logistics.

ChatGPT, from Microsoft-backed OpenAI, is not officially available in China, where Google and Facebook are blocked.

10 new AI products announced by Baidu

Chinese tech giant Baidu announced more than 10 new AI-based applications on 4th September 2023, just days after its ChatGPT-like Ernie bot was released for public use.

Among the products revealed was a generative AI-integrated word processing app called WPS AI, created by Shanghai-listed Kingsoft Office. It was reported the company built the tool using the AI model on which Baidu’s Ernie bot is based, as well as Baidu’s ‘Qianfan’ cloud platform for AI models.

‘This AI malarkey is progressing at quite a rate’.

Nearly 10,000 businesses are actively using Baidu’s Qianfan cloud platform each month, the company claimed.

AI assistant

Baidu also announced that more than 6 million users have used an AI-powered tool that sits inside its Google Drive-like cloud product. The AI assistant can search documents, summarize and translate text and create content, the company claimed.

It wasn’t immediately clear to what extent those products were available for public use.

On 31st August 2023, Baidu released its Ernie bot to the public, signaling government approval of the AI-powered chatbot. Other Chinese companies also released similar AI products around the same time.