ChatGPT has gone viral in the world of generative Artificial Intelligence (AI) since OpenAI released the text-based tool in November 2022. The new language-processing AI has attracted billions of dollars in funding from tech investors.
What is ChatGPT?
ChatGPT is a machine learning model designed to interact conversationally, and it is far more advanced than earlier chatbots. It is a sibling model to InstructGPT, which is trained to follow the instructions in a prompt and provide a detailed response, and it is essentially a fine-tuned variant of the GPT-3.5 language-generation software.
ChatGPT’s dialogue format lets it answer follow-up questions, challenge incorrect premises, acknowledge its errors, and reject inappropriate requests. The model can assist with a wide range of tasks involving Natural Language Processing (NLP), and its high scalability makes it well suited to large-scale applications.
ChatGPT training process
The initial model of ChatGPT was trained using supervised fine-tuning: human AI trainers provided conversations in which they played both sides, the user and an AI assistant. To help compose responses, the trainers had access to model-written suggestions.
To build a reward model for reinforcement learning, OpenAI collected comparison data: sets of two or more model responses ranked by quality. The data came from conversations that the trainers had with the chatbot. A model-written message was selected at random, several alternative completions were sampled, and the AI trainers ranked them.
A reward model trained on these rankings made it possible to fine-tune the model using Proximal Policy Optimization (PPO). The process was repeated for several iterations.
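The step from trainer rankings to a reward model can be sketched in a few lines of Python. The sketch assumes a pairwise ranking loss, a common choice for reward models and the one described for InstructGPT; the exact loss used for ChatGPT is not stated here. The names `ranking_to_pairs`, `pairwise_loss`, `ranking_loss`, and `toy_reward` are hypothetical, and `toy_reward` is only a stand-in for a learned model.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn one ranking (best first) into (preferred, rejected) pairs."""
    return list(combinations(ranked_responses, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """-log(sigmoid(r_preferred - r_rejected)).

    Small when the reward model already scores the preferred response higher.
    """
    diff = reward_preferred - reward_rejected
    return math.log1p(math.exp(-diff))


def ranking_loss(ranked_responses, reward_fn):
    """Average pairwise loss over all pairs implied by one ranking."""
    pairs = ranking_to_pairs(ranked_responses)
    losses = [pairwise_loss(reward_fn(a), reward_fn(b)) for a, b in pairs]
    return sum(losses) / len(losses)


# Hypothetical stand-in for a learned reward model: longer answer = higher score.
def toy_reward(response):
    return len(response) / 10.0


if __name__ == "__main__":
    trainer_ranking = ["a long detailed answer", "short", "x"]  # best first
    print(ranking_loss(trainer_ranking, toy_reward))
```

When the reward function already agrees with every pair in a ranking, the loss falls below log 2 (about 0.69); a reversed ranking gives a higher loss. Training a real reward model means adjusting its parameters to push this loss down across many trainer rankings.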
ChatGPT was trained using Reinforcement Learning from Human Feedback (RLHF), the same method used for InstructGPT, but with a slightly different data collection setup. Both ChatGPT and GPT-3.5 were trained by OpenAI on Azure AI supercomputing infrastructure using a large dataset of text, which allowed them to learn the structures and patterns of language.
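For the policy-update step in this kind of training, Proximal Policy Optimization is usually described through its clipped surrogate objective, which limits how far a single update can move the model away from the version that generated the samples. The following is a minimal sketch for one sampled response, not OpenAI's implementation; the function name, the scalar inputs, and the clip range of 0.2 are assumptions made for illustration.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Clipped surrogate objective for one sample (a quantity to maximize).

    logp_new:  log-probability of the response under the updated policy
    logp_old:  log-probability under the policy that produced the sample
    advantage: how much better the response scored than expected,
               e.g. derived from a reward model (assumed given here)
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    # Taking the minimum removes any incentive to move the ratio outside
    # the range [1 - clip_eps, 1 + clip_eps].
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # The new policy makes a well-rewarded response twice as likely,
    # but the objective only credits a 1.2x increase.
    print(ppo_clipped_objective(math.log(0.2), math.log(0.1), advantage=1.0))
```

The clipping is what makes the updates "proximal": once the probability ratio leaves the allowed range in the direction the advantage favors, the objective stops growing, so there is nothing to gain from a larger step.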
OpenAI is a research institute and technology company that develops and promotes AI that is safe, beneficial, and aligned with human values. Founded in 2015, it conducts research in AI, machine learning, robotics, and human-AI interaction. It also develops and promotes AI technologies and tools that can be used responsibly and ethically in a variety of settings, making AI more accessible and beneficial for the community.
Benefits of ChatGPT
Chatbots have been of interest for decades, but most of them are still relatively primitive: they can only answer rudimentary questions on help desk pages or make some attempt to address the issues of frustrated customers. Now, with ChatGPT’s ability to carry a conversation through multiple queries and to generate software code, the world of NLP is entering a new chapter.
As a machine learning model, ChatGPT can assist with many tasks involving NLP. Because it was trained on a large dataset of text, it can understand a wide range of questions and requests and produce human-like responses to them. Some of its potential benefits include:
- Improved efficiency and accuracy in NLP-related tasks
- Quick and accurate responses to a wide range of questions
- Assistance with a wide range of tasks that require the understanding and generation of natural language
The limitations of ChatGPT
ChatGPT’s ability to follow a conversation is noteworthy. However, like many chatbots before it, it comes with its own baggage:
- Plausible-sounding but incorrect information: ChatGPT sometimes gives answers that sound plausible but are incorrect. Fixing this is difficult because there is currently no source of truth during reinforcement learning (RL) training. If the model is trained to be more cautious, it can end up declining questions that it could answer correctly. Supervised training can also mislead the model, because the ideal answer depends on what the model knows rather than on what the human demonstrator knows.
- Sensitive to slight tweaks in the input: ChatGPT’s responses can be inconsistent when the input phrasing is tweaked slightly or the same prompt is submitted multiple times. For example, the model may claim not to know the answer to one phrasing of a question yet answer a slight rephrasing correctly.
- Bias issue: Because of biases in the training data and over-optimization, the model is often excessively verbose and overuses certain phrases. For example, it often restates that it’s a language model trained by OpenAI.
- Does not ask clarifying questions: The current model takes a guess when the user provides an ambiguous query. Ideally, it should ask clarifying questions.
- Response to inappropriate or harmful requests: While the model usually refuses inappropriate requests, it sometimes responds to harmful instructions. Unsafe content is typically filtered by OpenAI’s Moderation API, which occasionally produces false negatives and false positives. The good news is that user feedback is being taken into consideration to improve the current system.
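The false negatives and false positives in the last point reflect a general trade-off in any filter that scores content and applies a cut-off, and the same tension appears when a model is trained to be more cautious. The sketch below uses made-up scores and a hypothetical `confusion_counts` helper; it does not call or reproduce the actual Moderation API.

```python
def confusion_counts(scored_examples, threshold):
    """Count false positives and false negatives for a score threshold.

    scored_examples: list of (score, is_harmful) pairs, where score is a
    hypothetical classifier's estimate that the text is harmful.
    """
    false_positives = sum(
        1 for score, harmful in scored_examples
        if score >= threshold and not harmful
    )
    false_negatives = sum(
        1 for score, harmful in scored_examples
        if score < threshold and harmful
    )
    return false_positives, false_negatives


# Made-up scores for illustration only; not output from any real system.
EXAMPLES = [
    (0.05, False),
    (0.30, False),
    (0.55, False),  # benign text that merely looks risky
    (0.45, True),   # harmful text phrased innocuously
    (0.80, True),
    (0.95, True),
]

if __name__ == "__main__":
    for threshold in (0.4, 0.6):
        fp, fn = confusion_counts(EXAMPLES, threshold)
        print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

With these invented numbers, the stricter threshold of 0.4 blocks one benign message and misses nothing, while the looser threshold of 0.6 lets one harmful message through and blocks nothing benign. No single cut-off removes both kinds of error, which is why user feedback on real cases matters.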
As we can see, several limitations remain, but regular model updates and an accessible interface to ChatGPT are helping to improve the areas where it’s lacking. Feedback from users will help identify novel risks and possible mitigations.
The training process is ongoing, and the abilities of this AI tool are constantly improving as it continues to learn. ChatGPT’s rising popularity has made it a potential major player in the world of NLP and has prompted market leader Google to unveil its own multi-format Artificial Intelligence tool, Google Bard.