Introduction
You have probably heard of Alexa, Google Assistant, and Siri. These are all AI-based virtual assistants that use machine learning algorithms and natural language processing. They can carry out a range of tasks, such as responding to queries, giving information, setting alarms, and playing audio.
However, a new type of AI has emerged, and by now almost everyone is familiar with it: ChatGPT. It is the most recent viral sensation to take the globe by storm, and according to reports, it may change how future generations look at artificial intelligence (AI).
ChatGPT :
ChatGPT is a text-generating application that produces human-like text in response to commands or input provided to it in the form of a prompt. ChatGPT, short for Chat Generative Pre-trained Transformer, uses machine learning techniques such as Reinforcement Learning from Human Feedback (RLHF) together with Natural Language Processing (NLP) to give users an interactive conversational experience. It is an AI tool with a simple chat interface that anyone can use. ChatGPT is fine-tuned from a model in the GPT-3.5 series, which completed training in early 2022. ChatGPT and GPT-3.5 were trained on Azure AI supercomputing infrastructure by OpenAI.
Background: What is OpenAI?
In 2015, Elon Musk, Sam Altman, Reid Hoffman, and others started a non-profit AI research lab called OpenAI. OpenAI is the brain behind ChatGPT and undertakes AI research with the aim of advancing and creating friendly AI.
Despite being the most well-known, ChatGPT was not the first AI chatbot. There have been many earlier chatbots, which I have discussed in my blog post on "past, present, and future chatbots". ELIZA is said to be the first-ever chatbot, created in the 1960s by Joseph Weizenbaum, a German-American computer scientist at the MIT Artificial Intelligence Laboratory. ELIZA identified keywords and key phrases in the user's input, an early form of Natural Language Processing, the technology that enables computers to work with human language. It was implemented with simple pattern-matching techniques, such that if it was told "My sister hates me", its response was "Who else in your family hates you?". To improve on this, many other chatbots were introduced afterwards.
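The keyword-and-template idea behind ELIZA can be sketched in a few lines of Python. This is only a minimal illustration, not ELIZA's original script: the two rules, the fallback reply, and the function name respond are assumptions made up for this example.

```python
import re

# Hypothetical rules for illustration only (not ELIZA's real script).
# Each rule pairs a regular expression with a response template; the
# captured text is copied into the reply.
RULES = [
    (re.compile(r"\bmy (?:sister|brother|mother|father) (\w+) me\b", re.IGNORECASE),
     "Who else in your family {0} you?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE),
     "How long have you been {0}?"),
]

# Reply used when no rule matches.
FALLBACK = "Please tell me more."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    return FALLBACK


print(respond("My sister hates me"))
print(respond("I am tired."))
print(respond("The weather is nice"))
```

The program has no understanding of the sentence. It only spots a keyword pattern and reuses the user's own words in a canned reply, which is why such chatbots break down quickly outside their rules.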
InstructGPT :
The earlier sibling of ChatGPT is InstructGPT, a model trained to follow an instruction in a prompt and provide a detailed response. InstructGPT and ChatGPT differ mainly in the way their training data was collected. The ChatGPT model is fine-tuned from the same language model as InstructGPT, and a similar methodology is used for fine-tuning it. "We had added some conversational data and tuned the training process a bit. So we didn’t want to oversell it as a big fundamental advance" - statement by OpenAI. As it turned out, the conversational data had a big positive impact on ChatGPT.
Techniques behind ChatGPT :
As we have seen above, ChatGPT uses Natural Language Processing (NLP) techniques and Reinforcement Learning from Human Feedback (RLHF), and it is fine-tuned from a model in the GPT-3.5 series. Most importantly, it is a Generative Pre-trained Transformer, popularly known as GPT.
GPT :
GPT is trained in a self-supervised fashion. The model is trained on a massive dataset to predict the next word in a sequence. This is known as causal language modeling. The language model is then fine-tuned on a supervised dataset for downstream tasks.
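To make "predict the next word" concrete, here is a toy sketch in Python. It is a stand-in, not how GPT works internally: GPT uses a transformer network that looks at the whole preceding context, while this example only counts which word follows which (a bigram model). The tiny corpus and the function names are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=5):
    """Generate text by feeding each prediction back in as the next input."""
    words = [start.lower()]
    for _ in range(max_words):
        nxt = predict_next(counts, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


# Made-up training text; the labels come from the text itself
# (each word's "label" is simply the word that follows it).
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))
print(generate(model, "dog", max_words=3))
```

No human labeling is needed here, which is what "self-supervised" means: the training signal is the next word already present in the data.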
GPT-3 is an autoregressive model: it predicts each word based only on the words that came before it. Large-scale applications such as search engines, content creation tools, and many others can be built on GPT-3. But GPT-3 fell short of human-like conversation. Why?
One of the issues with GPT-3 is that its output is often not in line with the user's directions or prompts, meaning that GPT-3 does not reliably produce responses that match user preferences. The main reason is that the model is trained to predict the next word in a sentence, not to generate human-preferred responses. Another issue is that, because nothing in its training constrains the content, it may produce inappropriate and harmful text. To resolve both of these problems, alignment and harmful output, a new language model was trained to address these challenges.
NLP (Natural Language Processing) :
NLP is the branch of AI that deals with the interaction between computers and humans using natural language. It is a crucial part of ChatGPT’s technology stack and allows the model to understand and generate text in a human-like way. When users interact with this AI, they may find it hard to tell whether a human or an AI is behind the responses. Tokenization, named entity recognition, sentiment analysis, and part-of-speech tagging are a few typical NLP methods; of these, tokenization is the step most directly involved whenever text is passed to a model like ChatGPT.
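As a small illustration of tokenization, the sketch below splits a sentence into word and punctuation tokens and maps each token to an integer ID. This is a simplified word-level version written for this article; GPT models use subword tokenization (byte-pair encoding), and the function names here are assumptions.

```python
import re


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[^\sA-Za-z0-9]", text)


def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace every token by its ID; models work on these numbers, not on text."""
    return [vocab[tok] for tok in tokens]


tokens = tokenize("ChatGPT answers questions, and ChatGPT writes text.")
vocab = build_vocab(tokens)
ids = encode(tokens, vocab)
print(tokens)
print(ids)
```

Note that the repeated word "ChatGPT" gets the same ID both times, so the model sees it as the same symbol wherever it appears.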
Reinforcement Learning from Human Feedback (RLHF) :
Reinforcement Learning from Human Feedback is a deep reinforcement learning technique that uses human feedback for learning. Human labelers guide the learning algorithm by ranking a list of responses produced by the model, from the one they prefer most to the one they prefer least. The model is then tuned toward the preferred kind of response, so it can keep adapting as more feedback is provided, which helps make it an application that users keep coming back to. In this way, the agent learns to favor safe and truthful responses.
Why is traditional reinforcement learning not used instead of RLHF?
Traditional reinforcement learning systems require a reward function to be defined up front, so that the agent can tell whether it is moving in the right direction while it tries to maximize its reward. But specifying such a reward function by hand is very challenging in modern reinforcement learning environments such as open-ended conversation. Hence, instead of hand-defining the reward function, we train a model to learn the reward function from human feedback. In this way, the agent can be optimized against the learned reward and can pick up complex behaviors that are hard to write down as rules.
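The idea of learning a reward function from human feedback can be sketched as follows. This is a toy example under strong assumptions, not OpenAI's implementation: each response is represented by two made-up numeric features (helpfulness, rudeness), the reward is a simple weighted sum, and the weights are fitted with a pairwise preference loss, -log(sigmoid(reward of preferred - reward of rejected)). In practice the reward model is a large neural network that reads the response text itself.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked):
    """Turn a best-to-worst human ranking into (preferred, rejected) pairs."""
    return list(combinations(ranked, 2))


def reward(weights, features):
    """Toy reward model: a weighted sum of the response's features."""
    return sum(w * f for w, f in zip(weights, features))


def train_reward_model(pairs, n_features, lr=0.5, epochs=200):
    """Fit weights so that preferred responses score higher than rejected ones."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = reward(weights, better) - reward(weights, worse)
            # Gradient step on -log(sigmoid(diff)): the update is large
            # when the model disagrees with the human preference.
            scale = 1.0 - 1.0 / (1.0 + math.exp(-diff))
            for i in range(n_features):
                weights[i] += lr * scale * (better[i] - worse[i])
    return weights


# Hypothetical responses as (helpfulness, rudeness), ranked best to worst
# by an imaginary human labeler.
ranked = [(0.9, 0.0), (0.5, 0.1), (0.4, 0.8)]
pairs = ranking_to_pairs(ranked)
weights = train_reward_model(pairs, n_features=2)
scores = [reward(weights, r) for r in ranked]
print(weights)
print(scores)
```

Nobody told the model that rudeness is bad. It inferred a negative weight for that feature purely from the rankings, and the learned reward can then be used to score new responses during reinforcement learning.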
Conclusion
This brings us to the end of the article. In this article, we discussed ChatGPT, OpenAI, InstructGPT, and the techniques used in ChatGPT. ChatGPT is an absolute sensation in the history of AI, but it still has a long way to go to reach human-level intelligence. You can try ChatGPT here. I hope you liked the article. Please let me know your thoughts and views on ChatGPT in the comments below.