
OpenAI updates ChatGPT: now supports image and voice input

2025-04-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

Thanks to CTOnews.com readers for the tip. CTOnews.com, Sept. 25: OpenAI has announced a new version of ChatGPT with two new features: voice input and image input. According to OpenAI, the new features will roll out to ChatGPT Plus subscribers over the next two weeks, and other users will get them "soon".

The voice input feature works much like a voice assistant on a mobile phone. Users press a button and speak their question; ChatGPT converts the speech to text, generates an answer, then converts the answer to speech and plays it back. OpenAI says this interaction is more natural and convenient, and that because the answers come from a large language model (LLM), their quality will be higher. OpenAI has also developed a new text-to-speech model that can generate a human-like voice from just a few seconds of sample speech. Users can choose from five voices for ChatGPT, and the model has other potential uses. For example, OpenAI is working with Spotify to translate podcasts into other languages while preserving the hosts' voices. However, the model also carries risks, such as being misused to impersonate public figures or to commit fraud. As a result, OpenAI says, the model will not be made widely available and will instead be subject to strict controls and restrictions.
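The three-stage voice flow described above (speech to text, answer generation, text to speech) can be sketched as follows. This is a minimal illustration, not OpenAI's implementation: the `VoicePipeline` class and the three `fake_*` stage functions are hypothetical placeholders, and a real system would plug a speech recognizer, a language model, and a speech synthesizer into the same slots.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains three stages: speech-to-text, text answer, text-to-speech.

    All three callables are hypothetical stand-ins supplied by the caller.
    """
    transcribe: Callable[[bytes], str]   # audio in, question text out
    generate: Callable[[str], str]       # question text in, answer text out
    synthesize: Callable[[str], bytes]   # answer text in, audio out

    def answer(self, audio_question: bytes) -> bytes:
        question_text = self.transcribe(audio_question)
        answer_text = self.generate(question_text)
        return self.synthesize(answer_text)


# Placeholder stages: they treat UTF-8 bytes as "audio" so the
# example runs without any speech or model libraries.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(question: str) -> str:
    return f"You asked: {question}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_generate, fake_synthesize)
reply_audio = pipeline.answer(b"What is the weather?")
```

Keeping each stage behind a plain callable means any one of them can be swapped out, for example to try a different voice, without touching the other two.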

The image input feature is similar to Google Lens: users can photograph something they are interested in and upload it to ChatGPT, which will try to work out what the user wants to ask and respond accordingly. Users can also use the drawing tools in the app to help express their question, or add context with voice or text input. ChatGPT's advantage is that it supports multi-turn conversation rather than a one-shot search. If users are not satisfied with an answer or want more information, they can keep asking follow-up questions to get a more accurate and complete answer. Image search does have potential problems. For example, for images of people, OpenAI says it limits ChatGPT's ability to analyze and make direct statements about them, for reasons of both accuracy and privacy. This means it is not yet possible to upload a person's photo to find out who they are.

CTOnews.com notes that since ChatGPT launched in late 2022, OpenAI has been trying to add features and capabilities to its chatbot while avoiding new problems. With this update, the company is trying to strike that balance by deliberately limiting what its new models can do. But this approach is not a long-term solution: as more people use voice control and image search, and as ChatGPT gradually becomes a truly multimodal and useful virtual assistant, it will become increasingly difficult to maintain safe and reasonable boundaries.


