Generative AI is overrated, long live old-school AI!
TL;DR: Don't be dazzled by generative AI's creative charm! Predictive AI, though less flashy, remains crucial for solving real-world challenges and unleashing AI's true potential. By merging the powers of both AI types and closing the prototype-to-production gap, we'll accelerate the AI revolution and transform our world. Keep an eye on both of these AI stars to watch the future unfold.
Throughout 2022, generative AI captured the public’s imagination. Now that GPT-4 is out, the hype is poised to reach new heights.
With the 2022 releases of Stable Diffusion, DALL-E 2, and ChatGPT, people could engage with AI first-hand, watching with awe as seemingly intelligent systems created art, composed songs, penned poetry, and wrote passable college essays.
Only a few months later, some investors have become interested only in companies building generative AI, relegating those working on predictive models to “old school” AI.
However, generative AI alone won’t fulfill the promise of the AI revolution. The sci-fi future that many people anticipate accompanying the widespread adoption of AI depends on the success of predictive models. Self-driving cars, robotic attendants, personalized healthcare, and many other innovations hinge on perfecting “old school” AI.
Not your average giraffe
Generative AI’s Great Leap Forward?
Predictive and generative AI are designed to perform different tasks.
Predictive models infer information about data points to make decisions. Is this an image of a dog or a cat? Is this tumor benign or malignant? A human supervises the model’s training, telling it whether its outputs are correct. Based on the training data it encounters, the model learns to respond differently to different scenarios.
Generative models produce new data points based on what they learn from their training data. These models typically train in an unsupervised manner, analyzing the data without human input and drawing conclusions.
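To make the distinction concrete, here is a minimal sketch in plain Python. It is a toy illustration, not how production systems are built: the nearest-centroid classifier, the bigram text generator, the feature values, and the tiny corpus are all assumptions chosen for brevity. The predictive half learns from labeled examples and outputs a decision; the generative half learns from unlabeled text and outputs new data.

```python
import random
from collections import defaultdict

# --- Predictive: a supervised nearest-centroid classifier (toy, hypothetical data) ---
def train_classifier(examples):
    """examples: list of (features, label). Returns label -> mean feature vector."""
    sums, counts = {}, defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: dist(centroids[label]))

# --- Generative: a bigram model learned from unlabeled text ---
def train_bigrams(text):
    """Record, for every word, the words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table

def generate(table, start, length, rng):
    """Sample up to `length` words by repeatedly picking an observed follower."""
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# Labeled data: (weight in kg, height in cm) with a human-provided label.
labeled = [([4.0, 30.0], "cat"), ([5.0, 35.0], "cat"),
           ([30.0, 60.0], "dog"), ([35.0, 70.0], "dog")]
centroids = train_classifier(labeled)
label = predict(centroids, [6.0, 33.0])  # a decision about an existing data point

# Unlabeled data: raw text, no human annotations.
corpus = "the cat sat on the mat and the dog sat on the rug"
table = train_bigrams(corpus)
sentence = generate(table, "the", 6, random.Random(0))  # a new data point
```

The classifier can be scored against known answers, while the generator's output is only constrained to look like its training text, which hints at why measuring a generative model's accuracy is harder.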
For years, generative models were tasked with the more complex problems, such as learning to generate photorealistic images or produce text that answers questions accurately, and progress was slow.
Then, an increase in the availability of computing power enabled machine learning teams to build foundation models: massive unsupervised models trained on vast amounts of data (sometimes all the data available on the internet). Over the past couple of years, ML engineers have calibrated these generative foundation models, feeding them subsets of annotated data to target outputs for specific objectives, so that they can be used for practical applications.
ChatGPT is a good example. It’s built on a version of GPT-n, a family of foundation models trained on vast amounts of unlabelled data. To create ChatGPT, OpenAI reportedly hired 6,000 annotators to label an appropriate subset of data. Its ML engineers then used that data to fine-tune the model, teaching it to generate specific information.
With these sorts of fine-tuning methods, generative models have begun to create outputs they were previously incapable of producing. The result has been a swift proliferation of functional generative models. This sudden expansion makes it appear that generative AI has leapfrogged the performance of existing predictive AI systems.
The newly released GPT-4 exhibits human-level performance on a variety of common and professional academic exams. Source: OpenAI GPT-4 Technical Report
Appearances, however, can be deceiving.
The Real-World Use Cases for Predictive and Generative AI
In current real-world use, people apply generative and predictive AI differently.
Predictive AI has primarily been used to free up people’s time by automating human processes at very high levels of accuracy and with minimal human oversight.
Most of the current use cases for generative AI, by contrast, still require human oversight. For instance, these models have been used to draft documents and co-author code, but humans are still “in the loop,” reviewing and editing the outputs. In other words, the current iteration of generative AI is primarily used to augment rather than replace human workloads.
Generative models haven’t yet been applied to high-stakes use cases, so whether they have significant error rates doesn’t matter much. Their current applications, such as creating art or writing essays, don’t carry many risks. What harm is done if a generative model produces an image of a woman with eyes too blue to be realistic?
... blue contacts, anyone?
On the other hand, many of the use cases for predictive AI carry risks that can have a very real impact on people’s lives. As a result, these models must meet high performance benchmarks before they’re released into the wild. Whereas a marketer might use a generative model to draft a blog post that’s 80 percent as good as the one they would have written themselves, no hospital would use a medical diagnostic system that predicts with only 80 percent accuracy.
On the surface, generative models may appear to have taken a giant leap forward in performance compared with their predictive counterparts. All else being equal, however, most predictive models are required to perform at a higher level of accuracy because their use cases demand it.
Even lower-stakes predictive AI models like email filtering must meet high performance thresholds. If a spam email lands in a user’s inbox, it’s not the end of the world, but if a critical email gets filtered directly to spam, the results could be severe.
Generative AI’s current level of performance is far from the threshold required to make the leap into production for high-risk applications. Using an error-prone generative text-to-image model to make art may have enthralled the general public. However, no medical publishing company would use that same model to generate images of benign and malignant tumors to teach medical students. The stakes are too high.
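The asymmetry in the spam example can be sketched in a few lines of Python. This is an illustrative assumption about how such a filter might be tuned, not a description of any real email product: the scores, labels, and the `choose_threshold` helper are hypothetical. The idea is to pick the decision threshold from the tolerated rate of legitimate mail sent to spam, and accept whatever share of spam slips through as a consequence.

```python
def choose_threshold(scores, is_spam, max_false_positive_rate):
    """Return the lowest threshold whose false positive rate is within the limit.

    A message is flagged as spam when its score >= threshold.
    Assumes the data contains at least one legitimate message.
    """
    legit = [s for s, spam in zip(scores, is_spam) if not spam]
    for threshold in sorted(set(scores)):
        false_positives = sum(1 for s in legit if s >= threshold)
        if false_positives / len(legit) <= max_false_positive_rate:
            return threshold
    return float("inf")  # no candidate threshold satisfies the limit

def rates(scores, is_spam, threshold):
    """Return (false positive rate, share of spam caught) at a given threshold."""
    legit = [s for s, spam in zip(scores, is_spam) if not spam]
    spam = [s for s, flag in zip(scores, is_spam) if flag]
    false_positive_rate = sum(1 for s in legit if s >= threshold) / len(legit)
    spam_caught = sum(1 for s in spam if s >= threshold) / len(spam)
    return false_positive_rate, spam_caught

# Hypothetical model scores (higher means "more likely spam") with true labels.
scores = [0.05, 0.10, 0.20, 0.40, 0.65, 0.55, 0.70, 0.80, 0.90, 0.95]
is_spam = [False, False, False, False, False, True, True, True, True, True]

# Tolerate no critical email landing in the spam folder.
threshold = choose_threshold(scores, is_spam, max_false_positive_rate=0.0)
false_positive_rate, spam_caught = rates(scores, is_spam, threshold)
```

On this toy data the strict limit forces a high threshold, so some spam reaches the inbox. That is the trade the paragraph above describes: the cheaper error is tolerated in order to avoid the costly one.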
The Business Value of AI
While predictive AI may have recently taken a backseat in terms of media coverage, in the near- to medium-term, these systems are still likely to deliver the most significant value for business and society.
Although generative AI creates new data about the world, it’s less helpful for solving problems with existing data. Most of the urgent, large-scale problems humans must solve necessitate making inferences about, and decisions based on, real-world data.
Predictive AI systems can read documents, control temperature, analyze weather patterns, evaluate medical images, assess property damage, and more. They can generate immense business value by automating data and document processing at vast scale. Financial institutions, for instance, use predictive AI to review and categorize millions of transactions daily, saving employees from these time- and labor-intensive tasks.
However, many of the real-world applications for predictive AI that have the potential to transform our day-to-day lives depend on perfecting existing models so that they achieve the performance benchmarks required to enter production. Closing the proof-of-concept to production performance gap is the most challenging part of model development, but it’s essential if AI systems are to reach their potential.
The Future of Generative and Predictive AI
So has generative AI been overhyped?
Not exactly. Having generative models capable of delivering value is an exciting development. For the first time, people can interact with AI systems that don’t just automate but create, an activity of which only humans were previously capable.
Nonetheless, the current performance metrics for generative AI aren’t as well defined as those for predictive AI, and measuring the accuracy of a generative model is complex. If the technology is one day going to be used for practical applications, such as writing a textbook, it will ultimately need to meet performance requirements similar to those of predictive models. Predictive and generative AI will also eventually merge: mimicking human intelligence and performance requires a single system that is both predictive and generative, and that system will need to perform both functions at high levels of accuracy.
Factuality evaluations for GPT-4 are still around or below 80% on a broad set of categories - not yet usable for high-risk use cases. Source: OpenAI GPT-4 Technical Report
In the meantime, however, if we want to accelerate the AI revolution, we shouldn’t abandon “old school AI” for its flashier cousin. Instead, we must focus on perfecting predictive AI systems and putting resources into closing the proof-of-concept-to-production gap for predictive models.
If we don’t, then ten years from now, we might be able to create a symphony from text-to-sound models, but we’ll still be driving ourselves.
Courtesy of Stable Diffusion
Ulrik Stig Hansen is Co-Founder & President of Encord. He holds an M.S. in Computer Science from Imperial College London.