Building Enduring Enterprise AI Products
On December 21, 2022, the front page of the New York Times Business section featured an article about “a new chat bot” that was coming for Google’s search business. It was the first time ChatGPT had registered with the NYT’s business readership, rather than just with those interested in this weird new consumer-facing marvel.
Around the same time, our Enterprise team at General Catalyst started to notice how often this generative chatbot came up in our regular conversations with large-company leaders. Generative AI had revived debates across enterprise departments about how artificial intelligence could (or, without intervention, would) affect their work.
We wondered if generative AI (genAI) adoption would mirror the promise of the cloud back in 2010. The transition to cloud ended up requiring significant investment in infrastructure and time, not to mention change management. But the promise—and implementation—of AI is proving to be very different.
Accelerated adoption across the board
Over the last six months, we have spoken to trusted enterprise leaders across tech and traditional industries to understand what impact genAI could have on their internal workflows and customer-facing products. At first, we had conversations around Enterprise AI Readiness—i.e., what their teams needed to do to get ready to adopt genAI, from database management to access controls. But in reality, these companies moved much more quickly: the APIs now available and out-of-the-box LLMs are so performant that some no longer think they need unique infrastructure and tooling to use them. Enterprises are adapting GPT-4 live through prompting, and in some cases the models are making up for companies’ bad data hygiene (though others underscored that data quality remains a significant issue).
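As one illustration of how low the barrier has become, here is a minimal sketch of the kind of out-of-the-box API call teams are experimenting with. It assumes the OpenAI Python client (v1-style interface); the system prompt and use case are hypothetical.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical internal-workflow use case: summarizing a support ticket.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are an assistant for an enterprise support team."},
            {"role": "user", "content": "Summarize this ticket in two sentences: <ticket text>"},
        ],
    )
    print(response.choices[0].message.content)

No new infrastructure is required beyond an API key, which is a big part of why adoption has outpaced the readiness conversations we expected to have.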
This fast-moving adoption of genAI is not happening only in tech-native companies. Even traditional enterprises have been testing and deploying AI. One global publisher we met has automated part of its peer review process with genAI, while Coca-Cola is already leveraging genAI for new marketing and consumer experiences. GenAI is not only being embedded in Snapchat, DuckDuckGo, Instacart, and Shopify, but is also being explored by defense organizations and health systems. GC’s portfolio company Hippocratic AI is leading this work in healthcare. We believe genAI will be built into nearly every business application in use today.
Now, let’s be clear: accelerated adoption—the phenomenon we are currently seeing in enterprise companies—does not equal enduring adoption.
First, while technical and business leaders alike have told us about countless ways their companies are testing genAI, they have also shared that they’re uncomfortable sharing their most valuable data with external models, and that these tools are failing them in analytical and predictive use cases. One Head of Engineering was blunt: “GPT-4 is useless for insights. Frankly, it lies.”
Second, we are seeing management clamoring for their teams to do something—without specific goals in mind. This urgency-sans-strategy has led teams to spend on myriad tools and consultants to match the pace of industry innovation. As VCs, we are seeing the direct impact of this in startups with skyrocketing adoption and fast-growing revenue. However, we lack clarity as to which products are genuinely differentiated, or which enterprise budgets will remain a year from now when management looks back and asks: how much value did genAI create, and at what cost?
Finally, for adoption to endure, companies will need to decide whether AI is something they will invest in for the long term. Most companies do not have the talent to develop these applications themselves, much less to do so responsibly with the needed QA, so they need to buy ready-made products. On the other hand, teams with technical talent have the capacity to build—but that still requires an allocation of resources. One Chief Architect marveled at how expensive it would be to adopt something his team could reasonably develop themselves. However, his priority was to focus his team on customer-facing applications, so he was open to buying tools that would increase the productivity of his internal team—though only when costs come down. We expect this will happen as teams release less expensive or open-source options that are equally effective.
An elusive key ingredient: Trust
Enduring AI adoption in the Enterprise requires a key ingredient that the ecosystem today lacks: Trust. Our team at GC has talked about this before in the context of Responsible AI and in our work on the Stakeholder-Aligned Enterprise. We still think it’s a good idea for humans to control AI systems, and for technologists to develop open, explainable, and accountable models. Corporate customers of Enterprise AI must be able to trust end results, as products will now be infused with the outputs of LLMs. Consumers of Enterprise products must understand how their data is being used. Enterprise leaders, too, must engender trust in the technology among their employees, including through retraining efforts. AI will not replace all jobs, but people trained to leverage AI might.
Part of the reason that trust remains an elusive goal is that there are still fundamental technical questions which are not yet fully resolved, including:
- Complex reasoning remains difficult
- True alignment of models, including mitigation of bias, remains an open question
- Structured data, especially time series data, remains difficult to leverage in an LLM context
The list of deficiencies goes on, compounded by limited enterprise understanding of alternative AI models (read: not genAI) that could help them achieve their goals.
The opportunity in Enterprise AI
Each challenge around AI development and trust (or lack thereof) creates an opportunity in Enterprise technology. A few we are energized about:
Control over proprietary data
Enterprises are still reluctant to export one of their most valuable assets: proprietary data. Proprietary data is any enterprise’s key to competitive defensibility, whether that takes the form of a customer experience, a vertical-specific insight, a market-wide benchmark, or product discovery. Even though foundation model providers offer tools intended to keep companies in control, the possibility of releasing competitive insights or user data makes enterprise leaders’ skin crawl. We think allowing enterprises to control both their infrastructure and their training data will create significant value.
We are particularly keen to hear from those working on data as the core of AI development. Enterprises will need to continue investing in instrumenting their businesses, specifically in mature data systems that know where and how data is flowing, and that do so at scale. No matter how the AI stack looks in 6 or 12 months’ time, high-quality data systems will underpin it.
Multi-model governance
By the end of the year, enterprises will likely have access not only to OpenAI’s GPTs, but also to models from Meta, Google Brain, Anthropic, Hugging Face, Amazon, NVIDIA, and Oracle. Performance will go up, and compute prices will go down. Companies will leverage different models for different use cases, alongside the robust models available as open-source software and models built internally. The challenge of using many different models is not just about the ecosystem of model providers, but also about the applications companies build on top of them. Some enterprise use cases will perform better with certain models, depending on the data on which those models were trained. Other applications may prioritize cost over performance, given how frequently they need to run a particular model. Enterprises will need multi-model governance to ensure proper access control, privacy, and security.
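As a sketch of what this governance layer might look like in practice (purely illustrative; the policy table, team names, and model choices below are hypothetical), consider a thin routing layer that maps each use case to an approved model and enforces who may call it and whether customer data may flow to it:

    from dataclasses import dataclass

    @dataclass
    class ModelPolicy:
        model: str                   # a hosted, open-source, or internally built model
        allowed_teams: set           # which teams may call this route
        allows_customer_data: bool   # whether customer data may be sent to this provider

    # Hypothetical routing table: use case -> governance policy
    POLICIES = {
        "marketing_copy": ModelPolicy("gpt-4", {"marketing"}, allows_customer_data=False),
        "support_triage": ModelPolicy("internal-llm", {"support"}, allows_customer_data=True),
        "code_assist":    ModelPolicy("open-source-llm", {"engineering"}, allows_customer_data=False),
    }

    def route(use_case: str, team: str, contains_customer_data: bool) -> str:
        policy = POLICIES[use_case]
        if team not in policy.allowed_teams:
            raise PermissionError(f"{team} is not approved for {use_case}")
        if contains_customer_data and not policy.allows_customer_data:
            raise ValueError(f"{use_case} may not receive customer data on {policy.model}")
        return policy.model

The point is not this particular code; it is that access control, privacy, and cost decisions end up encoded in a policy layer like this rather than scattered across individual applications.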
Data attribution
Unique data sets will be particularly valuable for unique use cases. At the beginning, companies are likely to experiment and see where AI can create value internally and externally. They will pull data from various sources to train and run these models. Over time, however, enterprises will need to balance their investment in data pipelines and compute against the models that create the most value. Enterprises will rely on data attribution platforms to understand where data is being used, what value it creates, and what that value creation costs. Then, they will look to solutions that can create value at lower cost, such as distributed computing that avoids paying to move data for analysis.
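A toy sketch of the bookkeeping such a platform would do, with datasets, costs, and value figures invented purely for illustration:

    # Hypothetical usage ledger: which dataset fed which model, at what compute cost,
    # and what value the resulting application was credited with.
    usage = [
        {"dataset": "support_tickets", "model": "triage-v2", "cost": 1200, "value": 8000},
        {"dataset": "clickstream",     "model": "recs-v1",   "cost": 5000, "value": 3500},
    ]

    totals = {}
    for row in usage:
        entry = totals.setdefault(row["dataset"], {"cost": 0, "value": 0})
        entry["cost"] += row["cost"]
        entry["value"] += row["value"]

    for dataset, t in totals.items():
        print(f"{dataset}: value per unit cost = {t['value'] / t['cost']:.2f}")

Even at this toy scale, the ranking makes the investment decision explicit: keep funding pipelines whose data pays for itself, and rework or retire those that do not.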
Difficult-to-build AI applications
Wrappers around GPT are prolific. Differentiated genAI applications are few and far between. And yet, difficult-to-build AI applications more broadly are already creating enormous value. These applications often require a unique background to build, such as a deep understanding of chemistry or physics. For example, GC has backed companies building transformative AI in healthcare, such as Aidoc and PathAI. In the Enterprise, we are energized by defensible, complex, robust ideas, particularly in applied deep learning.
Augmenting vs. reinventing with AI
Unlike prior generational shifts, the widespread adoption of AI is one that many incumbents will not only survive but thrive in. It was hard to move a business from mainframes to client/server and microprocessor architectures. It was hard to rewrite applications entirely to be web-based. This wave is different. Enterprises do not need to rebuild their tech stacks. Not only are they working with engaged customer bases, but they are also sitting on working systems, and these can be augmented with AI—without starting from scratch.
Though enterprises have not needed to invest heavily to adopt genAI, they will need to invest in the right infrastructure and tooling to adopt AI in a trustworthy way at scale. If you are an enterprise leader bolstering your AI strategy, we are keen to share perspectives with you. And if you are a founder building with these ideas in mind, please reach out to us at General Catalyst.