- License
- Apache 2.0
- API usage
- Paid
- Type
- Generative

Making talking robots with GPT-2
This is a tutorial on using machine learning to generate realistic English text in any style you like. It doesn't require any coding, and by the end you will have built a simple chatbot using the GPT-2 model, and hopefully learned a little along the way about how machine learning works and what it might (or might not) be useful for.
GPT-2 is a state-of-the-art text generation model developed by OpenAI. They initially released only a limited, smaller version of the model due to concerns about it being misused to generate fake news and the like. Is this something that worries you? At the end of this tutorial, have a think about whether you feel the same, or differently.
One of the cool things about GPT-2 (and machine learning approaches that use deep neural networks in general) is that we can fine-tune them. This means we can take an existing model (such as GPT-2's model of English text on the internet) and use it as a starting point to learn something more specific.
This is super useful, as it takes a lot of data to learn a model of language. If we wanted to, for instance, make a model that speaks like Columbo, there simply isn't enough example text in the Columbo scripts to do that from scratch. (For some idea of the amount of data you need: there probably isn't enough text in the complete works of Shakespeare to train a 'Shakespeare model', either.)
However, if we start from the pre-trained GPT-2 model, we can take advantage of the fact that GPT-2 has already learned a pretty usable representation of the English language from gigabytes and gigabytes of text scraped from the internet. We can then build on top of it, using a much smaller amount of example text in the style we want, so that the model learns to generate text in that style. This is what fine-tuning is, and we're going to try it out today! We're going to follow the process in this technical blog post, but without needing to write any code or install any software ourselves.
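You don't need to run anything to follow this tutorial, but if you're curious what "starting from a pre-trained model" looks like in miniature, here is a toy sketch in Python. It is not GPT-2 and not a neural network: as a stand-in, it just counts which word tends to follow which. The function names (`new_model`, `train`, `most_likely_next`) and the two tiny example texts are made up for illustration. The point is only that fine-tuning means carrying on training from what a model already knows, rather than starting from nothing.

```python
import copy
from collections import Counter, defaultdict


def new_model():
    """An empty toy model: for each word, a count of the words seen right after it."""
    return defaultdict(Counter)


def train(model, text, passes=1):
    """Update the model's counts from `text`.

    Calling this on a model that already has counts continues training
    from where it left off, instead of starting over.
    """
    words = text.lower().split()
    for _ in range(passes):
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model


def most_likely_next(model, word):
    """Return the word most often seen after `word`, or None if `word` is unknown."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical example data: a "big" general text and a small text in the target style.
general_text = "one day the cat sat on the mat . one day the dog sat on the rug ."
style_text = "just one more thing . just one more question ."

# 1. Pre-train on the general text.
pretrained = train(new_model(), general_text)

# 2. Train from scratch on the style text only (the approach that lacks data).
scratch = train(new_model(), style_text, passes=3)

# 3. Fine-tune: copy the pre-trained model and keep training on the style text.
finetuned = train(copy.deepcopy(pretrained), style_text, passes=3)

for name, model in [("pretrained", pretrained), ("scratch", scratch), ("finetuned", finetuned)]:
    print(name, most_likely_next(model, "one"), most_likely_next(model, "sat"))
```

In this toy setup, the from-scratch model has never seen the word "sat" and so has nothing to say about it, while the fine-tuned model has picked up the new style ("one" is now followed by "more") and still remembers the general text ("sat" is followed by "on"). Real fine-tuning of GPT-2 adjusts neural network weights rather than word counts, but the starting-point idea is the same.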