• A new system called GPT-3 is shocking experts with its ability to use and understand language as well as human beings do

Word has been making its way out from the technology community: The world changed this summer with the rollout of an artificial intelligence system known as GPT-3. Its ability to interact in English and generate coherent writing has been startling hardened experts, who speak of "GPT-3 shock."

Where typical AI systems are trained for specific tasks—classifying images, playing Go—GPT-3 can handle tasks it was never specifically trained for. Research released by its maker, San Francisco-based OpenAI, has found that GPT-3 can work out analogy questions from the old SAT with better results than the average college applicant. It can generate news articles that readers may have trouble distinguishing from human-written ones.

And it can do tasks its creators never thought about. Beta testers in recent weeks have found that it can complete a half-written investment memo, produce stories and letters written in the style of famous people, generate business ideas and even write certain kinds of software code based on a plain-English description of the desired software. OpenAI has announced that after the test period, GPT-3 will be released as a commercial product.

The name stands for Generative Pre-trained Transformer, third generation. Like other AI systems today, GPT-3 is based on a large, organized collection of numeric weights, known as parameters, that determine its operation. The builder of the AI trains it using large digital data sets—in this case, a filtered version of the contents of the web, plus Wikipedia and some others. The number of parameters is a key measure of an AI model’s capacity; GPT-3 has 175 billion, which is more than 100 times that of its predecessor, GPT-2, and 10 times that of its nearest rival, Microsoft’s Turing NLG.
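The scale comparisons above can be checked with simple arithmetic; the figures for GPT-2 (1.5 billion parameters) and Turing NLG (17 billion) are the published counts for those models:

```python
# Parameter counts: GPT-3 versus its predecessor and nearest rival.
GPT3_PARAMS = 175_000_000_000        # 175 billion
GPT2_PARAMS = 1_500_000_000          # 1.5 billion (GPT-2's published count)
TURING_NLG_PARAMS = 17_000_000_000   # 17 billion (Microsoft's published count)

# More than 100 times GPT-2, and roughly 10 times Turing NLG.
print(GPT3_PARAMS / GPT2_PARAMS)        # about 116.7
print(GPT3_PARAMS / TURING_NLG_PARAMS)  # about 10.3
```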


AI has been through cycles of hype and bust before. Still, I was curious. I became a beta tester of the website simplify.so, which lets users enter English text for GPT-3 to simplify, and I gave the technology a trial run.

I copied and pasted the first paragraph of George Washington’s 1796 Farewell Address: “The period for a new election of a citizen to administer the executive government of the United States being not far distant, and the time actually arrived when your thoughts must be employed in designating the person who is to be clothed with that important trust, it appears to me proper, especially as it may conduce to a more distinct expression of the public voice, that I should now apprise you of the resolution I have formed, to decline being considered among the number of those out of whom a choice is to be made.”

GPT-3 gave me its translation: “I am not going to run for president.” Take a bow, HAL 9000.

I got similarly cogent summaries when I entered the First Amendment and other sources. I wondered whether GPT-3 was simply lifting language from websites, but I couldn’t find any evidence of that.

Yet when I gave it the famous first line of Jane Austen’s “Pride and Prejudice”—“It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife”—the AI was puzzling to watch. In the course of my first four tries, a few of its answers were sort of in the ballpark without being quite right. (For instance, “A man with a lot of money must be looking for a wife.”) Then on my fifth try, it seemed to crack up: “It is a truth universally acknowledged, that a single man with a good fortune must be in want of a wife, because men are very vain and they want to be seen as wealthy, and women are very greedy and they want to be seen as beautiful.”

I am anthropomorphizing GPT-3 here, but I shouldn’t. It’s a statistical model that doesn’t have mental states or engage in reasoning as we think of it. It isn’t a general AI like HAL 9000, Tony Stark’s JARVIS or the overlords of the Grimes song “We Appreciate Power.”

Shreya Shankar, a machine-learning engineer at the AI firm Viaduct, told me that more-advanced users can teach the system to do new tasks by presenting it with examples, often just a few. From there, it generalizes what the task is. For instance, when she wanted GPT-3 to translate equations from English into math symbols, she started by giving it a few equations in English and their equivalents written in symbols.


“At first, I had two or three examples,” Ms. Shankar said. “And I was able to get basic things like X squared plus 2X. But if I wanted elements like integrals, derivatives and logs, I needed to have examples of those. I ended up curating 10 or 11 examples, and GPT-3 ended up generalizing those quite well.” Once GPT-3 has been primed with examples, it still needs a human in the loop to separate the silver in its output from the dross, including its occasional nonsense and the bias it picked up from its internet training.
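The "priming" Ms. Shankar describes amounts to assembling a handful of worked examples into a single prompt and letting the model continue the pattern. A minimal sketch of that idea follows; the example pairs and the prompt layout here are hypothetical, not her actual prompts:

```python
# Sketch of few-shot priming: a handful of English-to-symbols examples,
# followed by a new query the model is asked to complete.
# These example pairs are illustrative, not the ones described in the article.
FEW_SHOT_EXAMPLES = [
    ("x squared plus two x", "x^2 + 2x"),
    ("the integral of x with respect to x", r"\int x \, dx"),
    ("the log of x plus one", r"\log(x + 1)"),
]

def build_prompt(examples, query):
    """Concatenate example pairs into one prompt; the final 'Math:' line
    is left blank for the model to fill in."""
    blocks = [f"English: {english}\nMath: {symbols}" for english, symbols in examples]
    blocks.append(f"English: {query}\nMath:")
    return "\n\n".join(blocks)

prompt = build_prompt(FEW_SHOT_EXAMPLES, "x cubed minus one")
print(prompt)
```

The prompt string would then be sent to the model as-is; adding more example pairs, as Ms. Shankar found, is how coverage is extended to integrals, derivatives and logs.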

If the price is right, there’s a good chance that GPT-3 will make major changes in our working lives. For a range of knowledge workers—news reporters, lawyers, coders and others—the introduction of systems like GPT-3 will likely shift their activities from drafting to editing. On the plus side, the biggest barrier to getting work done, the tyranny of the blank paper or the blank screen, may become much rarer. It’s simple enough just to keep clicking GPT-3’s “generate” button until something halfway usable appears.

The tyranny of the blank screen, though, forces us to think through a problem in a way that editing does not. Human nature probably means that people will often be more intent on massaging an AI’s output to the point that it looks acceptable than on doing their own work to sort through ambiguous data and conflicting arguments. Like GPS navigation, which started as just a tool but has reduced our engagement with the act of navigating, AI language generators may start by sparing us labor but soon spare us thought. (With regard to possible misuse, a representative of OpenAI told me that it bans uses of GPT-3 that may cause harm, including harassment, spamming, deception or radicalization.)

If this sounds overstated, it’s important to consider that AI language models are likely to get still stronger. Creating a more powerful rival to GPT-3 is within reach of other tech companies. The underlying methods of machine learning are widely known, and the data OpenAI used for training is publicly available. With GPT-3 having shown the potential of very large models, its 175 billion parameters may soon be surpassed. Indeed, Google researchers announced in June that they had built a 600-billion-parameter model for language translation, and researchers at Microsoft have said that they have their sights set on trillion-parameter models, though not necessarily for language.

If GPT-3 finds a market, it won’t be the last word, for better or worse.