OpenAI's third-generation Generative Pre-trained Transformer, GPT-3, is now breaking the Internet. It has received its fair share of praise from experts for its seemingly intuitive ability to generate text and even code. However, all this hype might be obscuring GPT-3's limitations. Before we go there, let's begin by understanding what GPT-3 is and the promise it holds.
GPT-3 is a state-of-the-art language model powered by a neural network and touted to generate text indistinguishable from human writing. It has been trained on a vast share of the text available on the Internet. Its output is a chunk of text computed to be a plausible response to the given input. All of this is based on what we have already published or posted online.
At its core, GPT-3 is a revolutionary text predictor. You give it a chunk of text as input, and the model predicts what text is likely to follow. It then repeats this process, appending each prediction to the input and predicting again, until it reaches a length limit. Much of this marvel can be attributed to the large training dataset, which lets GPT-3 identify and riff on the linguistic patterns contained therein.
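The predict-append-repeat loop can be sketched with a toy model. The example below is only an illustration under loud assumptions: it replaces GPT-3's neural network with simple word-pair counts, always picks the single most frequent next word, and uses hypothetical function names (`train_bigram`, `generate`) that do not come from OpenAI. Only the shape of the loop is meant to carry over.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, max_words):
    """Predict one word, append it to the text, and repeat."""
    output = prompt.split()
    while len(output) < max_words:          # stop at the length limit
        candidates = counts.get(output[-1])
        if not candidates:                  # word never seen with a successor
            break
        next_word = candidates.most_common(1)[0][0]
        output.append(next_word)            # the prediction becomes new input
    return " ".join(output)


# Toy stand-in for the training data (an assumption for illustration).
corpus = "the cat sat on the mat the cat sat on the sofa"
model = train_bigram(corpus)
print(generate(model, "the", max_words=6))  # prints: the cat sat on the cat
```

The toy model can only echo patterns present in its tiny corpus, and it stops as soon as it meets a word it has never seen followed by another.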
OpenAI first described this beast in a research paper published in May 2020. It is now accessible via an API to select people who requested access to a private beta. Like its predecessors, GPT-3 is built on the Transformer architecture. The pre-training dataset is composed of Common Crawl, Wikipedia, WebText, Books, and some additional data sources, which together cover most of the major sources of text on the Internet. The trained model was assessed against various NLP benchmarks and delivered stellar performance on question-answering tasks, including closed-book queries.
GPT-3 is, by leaps and bounds, an improvement over its predecessor GPT-2, though it expands on the same architecture. Both GPT and GPT-2 are adaptations of the Transformer, an invention pioneered at Google in 2017. The Transformer calculates the probability that a word will appear given the words around it. This computation relies on a function called attention. GPT-2's largest version consists of 1.5 billion parameters. Before GPT-3, the largest Transformer-based language model in the world, introduced by Microsoft earlier in 2020, contained 17 billion parameters. GPT-3, however, is made up of a whopping 175 billion parameters, roughly ten times that figure and more than a hundred times the size of GPT-2!
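To give a feel for what the attention function does, here is a minimal sketch of scaled dot-product attention as described in the 2017 Transformer paper: each query is compared against every key, the scores are turned into weights with a softmax, and the output is the weighted average of the values. This is a simplified, pure-Python illustration, not GPT-3's actual code: the function names are hypothetical, and it leaves out the learned projections, multiple heads, and the causal masking that GPT-style models use.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])                       # dimensionality of each key vector
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors, one component at a time.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Two identical keys get equal weight, so the output is the mean of the values.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]], [[0.0, 2.0], [4.0, 6.0]]))
```

In a real Transformer the queries, keys, and values are produced from the input words by learned weight matrices, which is where most of the billions of parameters live.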
For such a supermassive generative model, the current buzz seems well placed. However, hiding behind the limelight of its sheer size are its limitations.
While its power and potential remain unparalleled, AI practitioners are debating whether the model is anything more than a very big Transformer. The argument goes that its impressive text generation can be attributed solely to the massive computational power, scale, and resources used to train it, and that GPT-3 is still far from reaching Artificial General Intelligence (AGI). Let's elaborate a little more on this.
With human intervention, GPT-3 is capable of performing a gamut of tasks. It can write code, compose prose and fiction, and generate business memos, among many others. However, GPT-3 has no internal representation of what each of these words means. It has no semantically grounded model of the world or of the topics on which it discourses. This implies that GPT-3 works purely through statistical computation, without understanding the content of its input or output text.
This is important because it indicates that GPT-3 cannot reason abstractly. It lacks the "brain" that enables humans to write content. Thus, when GPT-3 is faced with content that differs from, or is absent from, the corpus of Internet text it was trained on, this text generator is at a loss.
The OpenAI researchers themselves acknowledged: "GPT-3 samples [can] lose coherence over sufficiently long passages, contradict themselves, and occasionally contain non-sequitur sentences or paragraphs."
Another major limitation of GPT-3 is its algorithmic bias. As OpenAI itself acknowledges, GPT-3 exhibits biases relating to gender, race, and religion. These arise from biases in the training data, which reflects societal views and opinions. This further supports the view that GPT-3 is not a standalone intelligent system. Hence, in terms of pushing the field forward, GPT-3 hasn't offered much.
GPT-3 has a good share of the data science community excited and invested. On the flip side, however, the tool could have some adverse impacts on our society.
GPT-3's text generation is racially biased. There have been quite a few instances of people posting its output to prove this. Jerome Pesenti, Head of AI at Facebook, stated that GPT-3 is surprising and creative, but also unsafe due to harmful biases. OpenAI admits in its GPT-3 paper that its models exhibit algorithmic biases, which show up in the generated text. For example, the model tends to include violent words in text it generates about Islam. Anyone reminded of the time Microsoft's Tay went full Nazi? That incident, on some level, widened the gap between AI advocates and opponents. GPT-3 poses the same threat.
Such advancements in text-generating models can profoundly impact the future of literature. With language models like this, it is conceivable that a large portion of the written material available tomorrow will be computer-generated. The "high-quality" texts generated by the model are largely undetectable to unsuspecting readers. The OpenAI team, in their paper on GPT-3, warned about its malicious use in spam, phishing, and fraudulent behaviour like deep fakes. This highlights a significant portion of its negative impacts.
However, these are not the only issues associated with machine-generated content. One train of thought concerns the potential "data pollution" that text from GPT-3 will cause. The content generated by GPT-3 is based on data already present on the Internet, a large portion of which is neither well curated nor written by responsible, accountable individuals. GPT-3 inevitably follows the same path, and as its output is published back online, the overall quality of content will plummet. The conversation extends to the impact of this on future generations, who, at this rate, might have a hard time finding real quality work in a haystack of generated text.
GPT-3 can have a significant impact on the job market. It has been shown to generate efficient, non-trivial code. Sharif Shameem, the founder of debuild.co, has tweeted about how his company has leveraged GPT-3 to write code. This commercial revolution in app development could pose a threat to coders everywhere, whose relevance in the face of such advances in text generators is now in question.
This is not limited to developers. GPT-3 holds the potential to obviate the jobs of many, including journalists, writers, and scriptwriters, to name a few.
Nevertheless, there is a section of people who believe that GPT-3 could aid humans in various fields, rather than replace them. Shameem explained to the media that, in the future, doctors could "ask GPT-3 the cause of a specific set of patient's symptoms to give a reasonable response."
Another grave concern is its impact on the environment. GPT-3, though touted to be the next "big thing," is not necessarily a conceptual breakthrough. It can be considered an incremental improvement over GPT and GPT-2. Put simply, it is a good idea powered by ever more computing power, and more computing power naturally improves performance with each iteration.
This leads us to our next big question: if such improvements come only from applying massive computing power, what is the impact on the environment? At present, the only consensus, in research published on ScienceDirect, seems to be that such machine-learning technologies are incredibly energy-intensive. However, the exact size of their environmental footprint remains unknown. It is genuinely difficult to measure the effect of such activities on the environment.
Part of the reason, however, is that the tech industry, lacking any incentive, has never made an effort to compute the impact. Remember the Bitcoin and blockchain wave that mesmerized people a few years ago? It went on until someone discovered that Bitcoin mining consumed as much electricity as a small country. GPT-3 and machine learning may be very impressive and are undoubtedly profitable for tech giants. However, sooner or later, shouldn't we be asking if the planet can afford it?
Given its incredible computing power and user base, it is no surprise that GPT-3 has garnered this volume of attention. Nevertheless, there are many drawbacks, limitations, and societal and environmental impacts that need to be taken into account. GPT-3 is thus far from complete and requires massive improvements before it can go live.