What Should Businesses Be Doing With Generative AI? High-Value Use Cases With Short Delivery Timelines.
I typically provide frameworks that help you figure out, on your own, what you should be doing with Generative AI or any other technology. That’s what I consider most valuable and what my courses are built for. However, the overwhelming demand is for me to just tell you what to do with Generative AI.
In this article, I surrender to demand and explain the most promising Generative AI implementations.
This article covers 7 main use case categories:
Customer Intent Detection
Domain-Specific Intelligent Assistants
Orchestration Layer
Constrained Knowledge Search
Quick Start Templates For Everything
From Format X To Format Y
Third-Party Tools & Buy Vs. Build
I’ll start with the LLM-first products that power all these use cases. Most of what you’ve been told about LLMs and Generative AI is complete garbage. Remember all the social media posts that started with “OpenAI is dead…” and “Copilot has been dethroned!” Where are they now? The challengers those posts hyped showed terrific demos 9+ months ago. Shouldn’t their products be out by now?
Who Is Delivering LLM Products And What Are They Shipping?
We’re in the delivery phase, and LLM-first products aren’t what most people expect. Companies with pragmatic visions that were feasible to execute are on top. Microsoft, SAP, Palantir, Cisco, IBM, and many other technology incumbents have delivered more with Generative AI than Amazon, Meta, and Google.
SAP will be presenting Generative AI use cases and its product roadmap on March 6th. These types of events provide an escape from the hype and insights into where pragmatic companies see opportunities.
Deci and Writer are standout startups that just keep delivering. Deci has shipped several 7B LLMs that are competitive with Big Tech’s. The company’s goal is to deliver LLMs and support so any business can deliver Generative AI products. The startup’s focus is equal parts SoTA model metrics and business revenue metrics.
Writer’s CEO May Habib raised $100 million to build small language models (some as small as 128M parameters) for domain-specific use cases vs. one model to rule them all. Writer poached some of OpenAI’s early customers with high-reliability, low-generalization models built for very targeted enterprise applications.
Naas has a product roadmap that’s more promising than any other I’ve seen outside of Anthropic. Naas CEO Jeremy Ravenel’s vision is ambitious but grounded in what his team can deliver. The startup is building a knowledge graph to support its LLM product. Early features target specific workflows like sales, marketing, and social media.
Speaking of Anthropic, the company’s tilt toward robotics and devices is exceptionally promising. It’s not public yet, but expect announcements and leaks soon.
What should you build? Stay small. Build for specific use cases rather than wide-ranging functionality. Keep costs in mind at all times. Run a feasibility assessment to ensure the business can build it and that customers or users are ready. Humble beginnings should lead to bigger visions.
Dispelling A Few Myths – Use Cases To Avoid
The CEO of Palantir came right out and said what most of us have discovered: “Chatbots are not the best way to access an LLM.” Chatbots are the most common product theme because they are easy to conceptualize and build prototypes around.
Once in production, the shortcomings are apparent. Their behavior isn’t reliable enough for customer-facing use cases. LLMs can be prompted to say the wildest things without appropriate guardrails. They go from helpful to your wild uncle on Facebook without warning.
The next myth is that prompt engineering is a simple panacea for all an LLM’s shortcomings. In reality, prompt engineering can make LLMs more reliable around a very narrow set of functionality. The gains don’t generalize, and you must prompt-engineer a separate solution for every shortcoming and undesirable behavior.
It’s high effort, requiring multiple iterations, and doesn’t always work. Use cases that require significant prompt engineering will take a long time to deliver and require several iterations to work the instability out. I wouldn’t rule them out immediately but only move forward if the opportunity size is large enough to justify it and the business has the talent to execute.
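To make the narrowness concrete, here is a minimal, hypothetical sketch of what prompt engineering one behavior looks like in practice. All names are invented for illustration, and the LLM call itself is left out; the point is that a prompt template only constrains one narrow task, and teams still pair it with a post-hoc check because the prompt alone isn’t reliable.

```python
# Hypothetical sketch: prompt engineering pins down ONE narrow behavior.
# The template and function names are invented; the LLM call is stubbed out.

REFUND_PROMPT = """You are a refund-policy assistant.
Answer ONLY questions about refunds. If the question is about anything
else, reply exactly: OUT_OF_SCOPE.
Keep answers under 50 words and never promise a specific refund amount.

Question: {question}
Answer:"""


def build_prompt(question: str) -> str:
    """Wrap a user question in the narrowly scoped template."""
    return REFUND_PROMPT.format(question=question)


def violates_scope(answer: str) -> bool:
    """Cheap post-hoc guardrail on the model's output.

    Prompt instructions are not reliably followed, so production systems
    back them up with checks like this one. Flags answers that drift past
    the length limit or promise a dollar amount.
    """
    drifted = "OUT_OF_SCOPE" not in answer and len(answer.split()) > 50
    promised_amount = "$" in answer  # e.g. "you will get $50 back"
    return drifted or promised_amount


print(violates_scope("We will wire you $500 today."))  # True: guardrail catches it
```

Notice that none of this generalizes: a second undesirable behavior (say, the model giving legal advice) needs its own template language and its own checker, which is exactly why prompt-engineering-heavy use cases take so long to stabilize.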
Finally, LLMs from GPT to open-source models don’t work for most use cases out of the box. There’s always additional work required. The formula for success, and a common theme across use cases in this article, is a use case that plays to the LLM’s strengths, high-quality data, and a human in the loop. Any use case without that setup isn’t a good candidate for evaluation.
Step 1 in figuring out what to do with LLMs is to stop listening to people who don’t know what they’re talking about. Expertise matters for execution.
Customer Intent Detection
Walmart released a high-quality example of how LLMs can detect intent and support customers in new ways. The customer can search for an outcome instead of hunting for individual items. Walmart’s eCommerce platform supports search terms like “tailgate party” or “kid’s birthday party.” The search returns a list of items that fit the intent or outcome.
This isn’t a chatbot implementation, even though it supports natural language queries. There’s no back-and-forth between the customer and LLM. The response is a set of search results, not a conversation that could go off track.
I’m big on workflows and opportunity estimation, so we should ask 2 questions about every use case before moving forward: