OpenAI says ChatGPT can now ‘speak,’ listen and process images

ChatGPT

I can see and hear and speak…

OpenAI’s ChatGPT can now ‘see, hear and speak,’ or, at least, understand spoken words, respond with a synthetic voice and process images, the company announced on Monday 25th September 2023.

The update to the chatbot, OpenAI’s biggest since the introduction of GPT-4, allows users to opt into voice conversations on ChatGPT’s mobile app and choose from five different synthetic voices for the bot to respond with. Users will also be able to share images with ChatGPT and highlight areas of focus or analysis.

Roll out

The changes will be rolling out to paying users in the next two weeks, OpenAI said. While voice functionality will be limited to the iOS and Android apps, the image processing capabilities will be available on all platforms.

The big feature push comes alongside ever-rising stakes of the artificial intelligence (AI) race among chatbot leaders such as OpenAI, Microsoft, Google and Anthropic. In an effort to encourage consumers to adopt generative AI into their daily lives, tech giants are racing to launch not only new chatbot apps, but also new features. Google has announced updates to its Bard chatbot, and Microsoft added visual search to Bing.

Investment expansion

Earlier this year, Microsoft expanded its investment in OpenAI by an additional $10 billion, making it the biggest AI investment of the year. In April 2023, the startup reportedly structured a $300 million share sale at a valuation of between $27 billion and $29 billion, with investments from firms such as Sequoia Capital and Andreessen Horowitz.

Concerns

Experts have raised concerns about AI-generated synthetic voices, which in this case could allow users a more natural experience but also enable more convincing deepfakes. Cyber threat investigators and researchers have already begun to explore how deepfakes can be used to penetrate cybersecurity systems.

OpenAI acknowledged those concerns in its announcement, saying that synthetic voices were ‘created with voice actors we have directly worked with,’ rather than collected from strangers.

The release also provided little information about how OpenAI would use consumer voice inputs, or how the company would secure that data if it were used. OpenAI did not immediately respond to CNBC’s request for comment, and the company’s terms of service say that consumers own their inputs ‘to the extent permitted by applicable law.’

What does ‘ChatGPT’ actually mean?

The ‘GPT’ in ChatGPT stands for Generative Pre-trained Transformer. ChatGPT is the name of an artificial intelligence model that can generate natural language text based on user input.

It was developed by OpenAI, a research organization dedicated to creating and ensuring the safe and beneficial use of artificial intelligence (AI). ChatGPT can be used for various purposes, such as answering questions, having conversations, and producing creative writing.

Amazon to invest up to $4 billion in leading edge tech Anthropic

Tech AI led investment

E-commerce conglomerate Amazon announced on Monday 25th September 2023 that it will invest up to $4 billion in artificial intelligence (AI) firm Anthropic, a rival to ChatGPT developer OpenAI, and take a minority ownership position in the company.

The move further reinforces Amazon’s aggressive AI push as it aims to keep pace with rivals such as Microsoft and Alphabet’s Google.

The two firms reportedly said that they are forming a strategic collaboration to advance generative AI, with the startup selecting Amazon Web Services as its primary cloud provider.

Money waiting to go into tech, turn it on

Tech money

Reports suggest as much as $3 trillion is waiting on the sidelines to be invested in tech.

AI FOMO

The reasoning is that AI is driving a fear of missing out (FOMO). We could very well be experiencing the fourth industrial revolution right now, and it is AI-driven. Strategically, companies can’t just sit around and wait. There’s a window where if they don’t join in or realise the potential and grab the opportunity, they’ll miss out.

IPOs

Three of the biggest initial public offerings (IPOs) in the tech sector in nearly two years raised some $6 billion collectively in less than a week. Nvidia has attracted much attention with the AI-driven interest it has created recently.

While a handful of tech IPOs and one big acquisition wouldn’t have been much cause for celebration in previous years, they are a welcome return after the pandemic-era investment drought.

The IPO market for tech was effectively shut down until Arm Holdings, Instacart and Klaviyo opened the door to investors again. Merger activity such as that driven by Microsoft Corp., OpenAI’s ChatGPT and Activision Blizzard Inc. is helping to lift the appetite for investment again. And it’s pretty much AI-induced.

Money ready to go

Some analysts suggest there is $3 trillion sitting on the sidelines ready to invest, mostly held by Big Tech and private equity companies. The fascination with artificial intelligence (AI) and fear of missing out (FOMO) will create massive AI-led tech investing opportunities. Everyone will want a slice of this cake.

This could very well be the biggest transformational spending wave that we’ve seen in years and certainly since the internet arrived in 1995.

Just look out for that ‘bubble’ again – it will pop! But much money will be made before that happens and then again after.

Baidu launches raft of AI applications after its Ernie chatbot receives massive public approval

AI chatbot

More than 6 million users already

Baidu also announced that more than 6 million users have used an AI-powered tool that sits inside its Google Drive-like cloud product.

At the 4th September event, Baidu also displayed generative AI-based products that could assist with traffic management, financial research and coal mine logistics.

ChatGPT, from Microsoft-backed OpenAI, is not officially available in China, where Google and Facebook are blocked.

10 new AI products announced by Baidu

Chinese tech giant Baidu announced more than 10 new AI-based applications on 4th September 2023, just days after its ChatGPT-like Ernie bot was released for public use.

Among the products revealed was a generative AI-integrated word processing app called WPS AI, created by Shanghai-listed Kingsoft Office. It was reported the company built the tool using the AI model on which Baidu’s Ernie bot is based, as well as Baidu’s ‘Qianfan’ cloud platform for AI models.

‘This AI malarkey is progressing at quite a rate’.

Nearly 10,000 businesses are actively using Baidu’s Qianfan cloud platform each month, the company claimed.

AI assistant

Baidu also announced that more than 6 million users have used an AI-powered tool that sits inside its Google Drive-like cloud product. The AI assistant can search documents, summarize and translate text and create content, the company claimed.

It wasn’t immediately clear to what extent those products were available for public use.

On 31st August 2023, Baidu released its Ernie bot to the public, signaling government approval of the AI-powered chatbot. Other Chinese companies also released similar AI products around the same time.

Amazon – leading or competing?

The power of AI

Amazon is one of the leading companies in the field of artificial intelligence (AI) and has been developing its own custom chips to power its AI applications and services.

Amazon’s AI chips are designed to perform tasks such as natural language processing, computer vision, speech recognition, and machine learning inference and training.

AI chips created by Amazon

  • AZ2: This is a processor built into the Echo Show 15 smart display and powers artificial intelligence tasks like understanding your voice commands and figuring out who is issuing those commands. The AZ2 chip also enables features such as visual ID, which can recognize faces and display personalized information on the screen.
  • Inferentia: This is a high-performance chip that Amazon launched to deliver low-cost and high-throughput inference for deep learning applications. Inferentia powers Amazon Elastic Compute Cloud (EC2) Inf1 instances, which are optimized for running inference workloads on AWS. Inferentia also powers some of Amazon’s own services, such as Alexa, Rekognition, and SageMaker Neo.
  • Trainium: This is a chip that Amazon designed to provide high-performance and low-cost training for machine learning models. Trainium powers Amazon EC2 Trn1 instances, which are designed to train increasingly complex models, such as large language models and vision transformers. Trainium also supports scale-out distributed training with ultra-high-speed connectivity between accelerators.

Despite these advancements, is Amazon racing to keep up?

Amazon is racing to catch up with Microsoft and Google in the field of generative AI, which is a branch of AI that can create new content or data from existing data. Generative AI can be used for applications such as natural language generation, image and video synthesis, text summarization, and personalization.

AI models from Amazon

  • Titan: This is a family of large language models (LLMs). Titan models can generate natural language texts for various domains and tasks, such as conversational agents, document summarization, product reviews, and more. Titan models are trained on a large and diverse corpus of text data from various sources, such as books, news articles, social media posts, and product descriptions.
  • Bedrock: This is a service that Amazon created to help developers enhance their software using generative AI. Bedrock provides access to pre-trained Titan models and tools to customize them for specific use cases. Bedrock also allows developers to deploy their generative AI applications on AWS using Inferentia or Trainium chips.

Generative AI

Amazon’s CEO, Andy Jassy, has said he thinks of generative AI as having three macro layers: the compute, the models, and the applications. He said that Amazon is investing heavily in all three layers and that its custom chips are a key part of its strategy to provide high-performance and low-cost compute for generative AI. He also said that Amazon is not used to chasing markets but creating them, and that he believes Amazon has the best platform for generative AI in the world.

Inferentia and Trainium offer AWS customers an alternative to training their large language models on Nvidia GPUs, which have been getting difficult and expensive to procure.

‘The entire world would like more chips for doing generative AI, whether that’s GPUs or whether that’s Amazon’s own chips that we’re designing’, Amazon Web Services CEO Adam Selipsky is reported to have said. ‘I think that we’re in a better position than anybody else on Earth to supply the capacity that our customers collectively are going to want’.

Fast actors

Yet others have acted faster, and invested more, to capture business from the generative AI boom. When OpenAI launched ChatGPT in November 2022, Microsoft gained widespread attention for hosting the chatbot and for reportedly investing a whopping $13 billion in OpenAI. It was quick to add the generative AI models to its own products, incorporating them into Bing in February 2023.

That same month, Google launched its own large language model, Bard, followed by a $300 million investment in OpenAI rival Anthropic. 

It wasn’t until April 2023 that Amazon announced its own family of large language models, called Titan, along with a service called Bedrock to help developers enhance software using generative AI.

Amazon is not used to chasing markets. Amazon is used to creating markets. And for the first time in a long time, it finds itself on the back foot and working to play catch-up.

And Meta?

Meta also recently released its own LLM, Llama 2. The open-source ChatGPT rival is now available for people to test on Microsoft’s Azure public cloud.

The AI battle continues…

Hackers to compete for $20 million prize

Hackers

The U.S. cyber hacker challenge is a new initiative launched by the Biden administration in August 2023 to use artificial intelligence (AI) to protect critical U.S. infrastructure from cybersecurity risks. 

The challenge will offer $20 million in prize money and includes collaboration from leading AI companies Anthropic, Google, Microsoft and OpenAI, who will make their technology available for the competition. The challenge was announced at the Black Hat USA hacking conference in Las Vegas.

The competition will consist of three stages:

  • Qualifying event in the spring of 2024
  • Semifinal at DEF CON 2024
  • Final at DEF CON 2025 

The competitors will be asked to use AI to secure vital software and open source their systems so that their solutions can be used widely (does that create a risk in itself?). The top three teams will be eligible for additional prizes, including a top prize of $4 million for the team that best secures vital software.

The challenge aims to explore what’s possible when experts in cybersecurity and AI have access to a suite of cross-company resources. The U.S. government hopes that the promise of AI can help further secure critical U.S. systems and protect Americans from future cyber attacks!

Limitations and risks using AI for security

However, there are flaws and drawbacks in using AI for cybersecurity, both for the attackers and the defenders.

  • Lack of transparency and explainability: AI systems are often complex and opaque, making it difficult to understand how they make decisions or what factors influence their outputs. This can lead to trust issues, ethical dilemmas, and legal liabilities.
  • Overreliance on AI: AI systems are not infallible and may make mistakes or produce false positives or negatives. Relying too much on AI, without human oversight or verification, can result in missed threats, erroneous actions, or unintended consequences.
  • Bias and discrimination: AI systems may inherit or amplify human biases or prejudices that are present in the data, algorithms, or design of the systems. This can result in unfair or discriminatory outcomes, such as excluding certain groups of people from access to services or opportunities, or targeting them for malicious attacks.
  • Vulnerability to attacks: AI systems may be susceptible to adversarial attacks, such as data poisoning, model stealing, evasion, or exploitation. These attacks can compromise the integrity, availability, or confidentiality of the systems, or manipulate them to produce malicious outputs.
  • High cost: Developing and maintaining AI systems for cybersecurity requires a lot of resources, such as computing power, memory, data, and skilled personnel. These resources may not be easily accessible or affordable for many organizations or individuals.

‘Well, what do you think of AI and cybersecurity sharing resources?’ ‘Ha! Playing right into our hands.’

These are some of the flaws of using AI for cybersecurity, but they are not insurmountable. With proper research, regulation, education, and collaboration, AI can be a powerful ally in enhancing cybersecurity and protecting against cyber threats – that is until it takes over, but that will never happen… will it?

Google says people should use its search engine to check whether information provided by its chatbot, Bard, is actually accurate

Robot AI

Accuracy

According to a recent news article, Google says people should use its search engine to check whether information provided by Bard is actually accurate, as it may display inaccurate or offensive information that doesn’t represent Google’s views. Just Google’s views, I wonder…?

Google’s UK boss Debbie Weinstein said Bard was not really the place that you go to search for specific information, but rather an experiment best suited for collaboration around problem solving and creating new ideas.

‘Just checking the answer with my search engine!’

Hallucinate

According to an Android Authority article, both Bard and ChatGPT can hallucinate or confidently lie when asked about obscure topics. Bard does offer a link to search results and will sometimes cite a source or two. However, Google states that Bard can even lie about its own inner workings so you cannot trust everything it says…?

Testing… 1… 2… 3…?

According to a report by Marie Haynes, Bard predicts it will generate accurate responses 85% of the time by September 2023, but in an experiment, it posted an accuracy score of 63%, meaning it had incorrect information in more than a third of its responses.

Early days, or harbouring a problem for the future?

AI race gathers momentum as China’s Baidu claims its Ernie Bot is better than ChatGPT on key tests

AI Robots Chatting

Baidu said its AI system called Ernie 3.5 outperformed OpenAI’s ChatGPT and GPT-4 in several key areas.

  • The chatbot was revealed in March 2023, and Baidu has since been publicly testing it in China. The chatbot is based on Baidu’s foundational AI model called ERNIE.
  • Baidu’s advancements underscore the intense competition taking place in the area of generative AI with technology giants in the US and China rapidly advancing their AI models.

ERNIE: Enhanced Representation through Knowledge Integration

US and China AI Bots go head to head

Ernie was first introduced in 2019, and since then, Baidu has been improving and upgrading it with new versions. The latest version, Ernie 3.5, was announced in June 2023, and Baidu claims it outperforms OpenAI’s ChatGPT and GPT-4 in several key areas.

Baidu’s Ernie is an artificial intelligence (AI) model that powers the company’s chatbot service, Ernie Bot. Ernie stands for Enhanced Representation through Knowledge Integration, and it is a natural language processing (NLP) deep-learning model that can understand and generate natural language.

Trained on large data sets

Ernie 3.5 is based on Baidu’s foundational AI model, which is trained on huge amounts of data from various domains, such as news, social media, encyclopedias, books, and more. Ernie 3.5 can handle various NLP tasks, such as question answering, dialogue generation, text summarization, sentiment analysis, and more.

According to a test by the China Science Daily journal, Ernie 3.5 surpassed ChatGPT and GPT-4 in general abilities and outperformed the more advanced GPT-4 on several Chinese-language capabilities.

ERNIE version 3.5 boosted its training and efficiency, making it faster and cheaper to upgrade to future versions. Baidu hopes that ERNIE Bot will become the next must-have app in China’s internet market, attracting users because of its natural and engaging conversations.

Integration

Baidu has been integrating ERNIE Bot across multiple business applications, ranging from cloud computing to smart speakers. 

ERNIE Bot is one example of how Baidu is investing in AI technology and competing with other tech giants in the US and China. Baidu’s founder, Robin Li, reportedly said that ‘foundation models are an engine driving global economic growth and represent a major strategic opportunity that cannot be missed’.

The major players, Alphabet (Google), Microsoft and Meta, all have their own versions of AI. Hopefully it will be used ‘intelligently’.