At this point, you’ve probably heard of ChatGPT and all of its amazing capabilities in editing, writing, and even coding. But did you know that ChatGPT is only one of a new class of AI models revolutionizing the world? Let’s dive into some other Large Language Models (LLMs) besides ChatGPT, how they work, and their potential implications.
GPT, GPT-2, and GPT-3
In many ways, these three AI models were the precursors to many of the advances in LLMs we see today. All three of them use the transformer model architecture. The idea behind this kind of model is to track relationships between the words in an input, such as a sentence, through multiple “attention mechanisms” that each focus on different aspects. For example, one “attention head” might focus on relationships between pronouns and the nouns they refer to, and another might focus on subjects and the verbs that go with them.
Attention Head visualization. Source:
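To make the idea of an attention head more concrete, here is a minimal sketch in plain Python of scaled dot-product attention, the calculation at the core of a transformer. It is an illustration under simplifying assumptions, not how the GPT models are actually implemented: the three tokens and all of the query, key, and value vectors are hand-picked for this example, whereas a real model learns them from data and runs many heads in parallel.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention_head(queries, keys, values):
    """Scaled dot-product attention for a single head.

    Each argument is a list of vectors, one per token. Returns the
    attention weights and, for each token, the weighted mix of values.
    """
    scale = math.sqrt(len(keys[0]))
    weights, outputs = [], []
    for q in queries:
        # How strongly this token "looks at" every token in the input.
        row = softmax([dot(q, k) / scale for k in keys])
        mixed = [
            sum(w * v[i] for w, v in zip(row, values))
            for i in range(len(values[0]))
        ]
        weights.append(row)
        outputs.append(mixed)
    return weights, outputs

# Hypothetical, hand-picked vectors for three tokens (illustration only).
tokens = ["the", "dog", "it"]
keys = [[0.0, 1.0], [4.0, 0.0], [1.0, 1.0]]
queries = [[0.0, 1.0], [1.0, 0.0], [4.0, 0.0]]  # "it" is set up to match "dog"
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights, outputs = attention_head(queries, keys, values)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how one token spreads its attention across the whole input. Because the query for “it” was chosen to line up with the key for “dog”, that row puts most of its weight on “dog”, which mirrors the pronoun-to-noun head described above.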
One of the main innovations of the “GPT” series was training this transformer architecture on very large amounts of data. The training data for GPT-3, the largest of the three models, drew on roughly 45 terabytes of raw text before filtering, amounting to hundreds of billions of words in total. This scale is a large part of why GPT-3 reached a level never before seen in AI for text-based question-answering, idea generation, and even writing stories and poetry. Another interesting use of the “GPT” series is in robotics: researchers have used its capabilities for autonomous robot task planning.
DALL·E and DALL·E 2
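Once thought to be unique to humans, even art is now being made by AI models. DALL·E is a 12-billion-parameter version of GPT-3 that harnesses GPT-3’s text comprehension abilities while also being trained on text-image pairs from the internet. Its successor, DALL·E 2, moves away from the GPT-3 design and instead pairs a text-image model called CLIP with a diffusion model that generates the image.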
Artwork by DALL·E 2, source: https://openai.com/dall-e-2/
Some of the interesting things you can do with DALL·E 2, the improved version of DALL·E, are creating a piece of artwork in a specified medium (watercolor, digital art, pixel art, cartoon, etc.) and drawing in the style of a famous artist (e.g. a mouse eating cheese in Van Gogh’s style). However, DALL·E 2 still has room for improvement, most notably in its ability to depict human hands.
Since the release of these models, many articles and experts have speculated about their implications. One such implication is in the context of schoolwork: with ChatGPT being able to write full essays, it can be very difficult to tell whether a student or an AI wrote an assignment. Similarly, there was a large controversy when an artist used Midjourney, an AI image generator similar to DALL·E 2, to generate a piece of artwork that went on to win its category at the Colorado State Fair.
In the future, it may become even more difficult to distinguish what is made by a human and what is made by a machine.
Another implication is that some could use these models for malicious purposes such as creating malware or generating propaganda. OpenAI, the company responsible for creating and updating these models, has attempted to mitigate this risk by imposing restrictions on what content you can generate (e.g. limiting racial bias, screening for political topics). However, there appear to be some clever ways to get past these filters (which will not be discussed here for fear of reproduction).
Finally, some have pointed out the problematic nature of some of the training data of these models. In particular, DALL·E was trained on the hard work of human artists, most likely without their permission, and it generates art that could eventually lead to the loss of jobs for artists. Indeed, people in industries such as art and coding have expressed concern that they could face more and more job loss as these AIs get better and better.
Large Language Models are some of the largest advances AI has seen thus far. AI can now write literary essays, generate detailed art, and write programs in multiple programming languages. At the same time, with more capabilities come more risks: loss of artistic integrity, malicious usage, and job loss. As with many other technological advances, it’s important for society to acknowledge the benefits of efficiency but also recognize and mitigate these emerging problems.