For me, and possibly for many of you as well, the last few months have been all about generative AI. After working on adopting large language models at Ataccama, I decided to share a little about our journey. This will be a series of posts covering different steps along the way.
Let’s start with a short recap of the relevant terminology and a definition of what GPT actually is.
GPT stands for Generative Pre-Trained Transformer
A Transformer is a type of neural network architecture that Google introduced in 2017. It's a relatively simple neural network compared to many other architectures and is well-suited to run on currently available hardware. This means you can make an extensive neural network and still be able to handle the associated computational complexity. GPT-3 (released in 2020), for example, is a transformer with 175 billion parameters. This is one of the reasons we call these models “large language models.”
Pre-trained means that a model has been trained on large volumes of text without any specific task in mind. It uses that text to learn dependencies between words and other features of language. GPT-3's training data includes a large collection of books, the entire English Wikipedia, and a fairly large subset of the Internet: hundreds of billions of words in total. This is the second reason why we call these models large language models.
Generative means that the model can produce new text. When given an input, it generates text that is likely to follow based on all of the language understanding the model has learned. What surprises many people is that the generated output is actually original. The model doesn't just output pieces of what it has seen before; instead, it has learned the fundamental structure of the language and can produce entirely new content.
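To make "generates text that is likely to follow" concrete, here is a deliberately tiny sketch. It is not how GPT works internally (GPT is a Transformer, not a word-pair counter); it assumes a toy "bigram" model that only remembers which word followed which in a made-up training sentence. The names `train_bigram` and `generate` are invented for this illustration.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_words, rng):
    """Repeatedly sample a likely next word, given only the previous word."""
    output = [start]
    for _ in range(max_words - 1):
        candidates = follows.get(output[-1])
        if not candidates:  # no known continuation: stop early
            break
        words, counts = zip(*candidates.items())
        # more frequent continuations are proportionally more likely
        output.append(rng.choices(words, weights=counts, k=1)[0])
    return " ".join(output)

# Toy training text (an assumption for this example, not real training data)
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram(corpus)
print(generate(model, "the", 8, random.Random(0)))
```

Even this toy can produce word sequences that never appear verbatim in its training text, because it recombines learned word-to-word patterns. A large language model does something similar in spirit, but conditions on a long context rather than a single previous word.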
What's even more interesting is that it mimics reasoning. When you think about it, this makes sense. Humans also use language to reason. Try to think something through without using words. Difficult, isn't it? LLMs have learned to use language and can even simulate the thinking process.
I used GPT-3 in the explanations above because we're still waiting for more information about GPT-4 (OpenAI stopped being so open at some point).
As you may have noticed, this technology is not that new (the Transformer architecture behind modern LLMs arrived in 2017). So why is all this hype coming now, in 2023?
On November 30, 2022, OpenAI introduced ChatGPT. It's a "fine-tuned" version of the model, together with a chat-based interface that just about anyone can use. It quickly became the fastest-growing consumer application seen up to that point.
What does "fine-tuned" mean?
The plain “vanilla” GPT-3 model (also called the foundation model) is not easy to use. It knows the structure of language. However, it does not necessarily know how to listen to instructions or answer questions. It simply knows what is likely to follow after any given input from the language perspective (specifically, from the training data perspective).
If you ask GPT-3 a question, it’s possible that, from a language perspective, replying with another similar question is reasonable (a human might do this when conversing with another human).
It might also be acceptable, from a language perspective, to tell you not to bother the AI and to Google your question instead. Even telling you a wrong answer is possible, depending on the AI's sources.
Many of these scenarios are completely fine from a generic language model perspective. They’re just not very helpful to the end user in most cases. So, instead, you have to "fine-tune" the model to get the types of useful answers you need.
Fine-tuning means adjusting the language model to work well for a specific task. In the case of ChatGPT, its task is to be a helpful assistant.
Fine-tuning is typically done by giving the model examples of prompts paired with good responses and then adjusting its weights so that it becomes more likely to produce the desired responses. But collecting enough examples to make a tangible improvement can be tricky. What OpenAI did to solve this is called Reinforcement Learning from Human Feedback (RLHF).
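As a rough sketch of what "adjusting the weights so that it is more likely to produce desired responses" means, the example below assumes a drastically simplified "model": one adjustable score (logit) per candidate response to a single prompt. The responses, starting scores, learning rate, and the function name `fine_tune_step` are all hypothetical; a real model has billions of weights and is trained on many examples.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fine_tune_step(logits, target_index, lr=0.5):
    """One gradient step on cross-entropy loss toward the demonstrated response."""
    probs = softmax(logits)
    return [
        z - lr * (p - (1.0 if i == target_index else 0.0))
        for i, (z, p) in enumerate(zip(logits, probs))
    ]

responses = [
    "Why do you ask?",
    "Just Google it.",
    "Paris is the capital of France.",
]
logits = [1.0, 0.5, 0.0]  # hypothetical base-model preferences

before = softmax(logits)[2]
for _ in range(50):
    # the training example says response 2 is the one we want
    logits = fine_tune_step(logits, target_index=2)
after = softmax(logits)[2]

print(f"P(helpful answer): {before:.2f} -> {after:.2f}")
```

Each step nudges the score of the demonstrated response up and the others down, so the helpful answer goes from being the least likely option to the most likely one.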
The basic idea behind RLHF is that it's much easier to tell which answer (out of a few options) is good than to write a good answer from scratch. You start with a base foundation model, give it questions, let it generate multiple answers, and then rank these answers manually. This is the Human Feedback part. When you have collected enough of these rankings, you can fine-tune a separate model (often called a reward model) that ranks answers about as well as humans do.
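The ranking step can be sketched as follows. This is a minimal illustration, not OpenAI's implementation: it assumes each answer is summarized by two hand-made features, `[answers_the_question, is_polite]`, and fits a linear scoring function from pairwise human preferences using a logistic (Bradley-Terry style) objective. The features, comparison data, and function names are all made up for the example; a real reward model is itself a large neural network reading the raw text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reward(w, features):
    """Score an answer: higher means 'humans would probably prefer this'."""
    return sum(wi * fi for wi, fi in zip(w, features))

def train_reward_model(comparisons, n_features, lr=0.5, epochs=200):
    """Fit linear reward weights from (preferred, rejected) feature pairs."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for good, bad in comparisons:
            diff = [g - b for g, b in zip(good, bad)]
            # probability the model currently assigns to the human's choice
            p = sigmoid(reward(w, diff))
            # gradient ascent on log(p): push the preferred answer's score up
            w = [wi + lr * (1.0 - p) * di for wi, di in zip(w, diff)]
    return w

# Hypothetical human feedback: each pair is (preferred answer, rejected answer),
# with features [answers_the_question, is_polite].
comparisons = [
    ([1, 1], [0, 1]),
    ([1, 0], [0, 0]),
    ([1, 1], [1, 0]),
    ([1, 0], [0, 1]),
]
w = train_reward_model(comparisons, n_features=2)

for features in ([1, 1], [1, 0], [0, 1], [0, 0]):
    print(features, round(reward(w, features), 2))
```

Once trained, the scoring function can rank answers no human has looked at, which is what makes the next step scalable.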
Once this step is done, you can apply reinforcement learning (I blogged about it a couple of years ago). You let another instance of the model produce answers to questions, and the fine-tuned ranking model scores them. Then you let this loop run for some time, fine-tuning the answering model along the way so that it produces better and better answers to your questions.
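Here is a minimal sketch of that loop, under heavy simplifying assumptions: the "model" is just a probability distribution over three canned answers, the ranking model is replaced by a fixed list of scores, and the update rule is the basic REINFORCE policy-gradient algorithm. Production RLHF systems typically use more elaborate algorithms (such as PPO) and extra safeguards. All names and numbers below are hypothetical.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce(rewards, steps=3000, lr=0.2, seed=0):
    """Train a toy policy over canned answers using scores from a reward model."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)  # start with no preference between answers
    for _ in range(steps):
        probs = softmax(logits)
        # the "model" produces an answer by sampling from its current policy
        choice = rng.choices(range(len(rewards)), weights=probs, k=1)[0]
        # baseline: expected reward under the current policy (reduces variance)
        baseline = sum(p * r for p, r in zip(probs, rewards))
        advantage = rewards[choice] - baseline
        # REINFORCE: d log p(choice) / d logit_j = 1[j == choice] - p_j
        logits = [
            z + lr * advantage * ((1.0 if j == choice else 0.0) - p)
            for j, (z, p) in enumerate(zip(logits, probs))
        ]
    return softmax(logits)

# Stand-in for the ranking model's scores: helpful, evasive, rude
answer_rewards = [1.0, 0.2, -0.5]
final_probs = reinforce(answer_rewards)
print([round(p, 3) for p in final_probs])
```

Answers that score above the current average become more likely, and answers that score below it become less likely, so over many iterations the policy concentrates on the answer the ranking model likes best.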
It sounds easy enough. The reality, of course, is much messier. Long story short, OpenAI came out with a version of their model that is much more helpful than the original base model. At the same time, they made it available to the general public, for free, in an interface that is easy to use even for non-technical people.
The Hype Continues
Technologically, the jump from the 2020 version of the model to ChatGPT was not that big. It was the fine-tuning, together with the easy-to-use interface, that started the AI fever of 2023. It brought more attention, more researchers, and more funding to the field, and now we see possible applications everywhere.
To give you an idea of how fast this whole space is moving, check out this GitHub project that shows a timeline of all critical milestones, including research papers, interesting projects, and important product releases relevant to large language models. For July 2023, it showed 232 such events. That's more than seven per day on average! It's getting tough to keep up.
Fortunately, I'll continue to blog about these topics and keep a pulse on current events. If you'd like to keep up with all things Gen AI, follow my personal page on Medium. Also, sign up for our Generative AI event to hear from industry experts about the latest in GPT and Gen AI.