At Intel Innovation in September, we announced that Stability AI – the company behind the Stable Diffusion text-to-image model – would be one of the largest customers for an AI supercomputer built entirely on Intel technology: Intel® Gaudi®2 AI hardware accelerators and Intel® Xeon® processors.
As artificial intelligence (AI) workloads continue to evolve and get more complex, companies like Stability AI are rethinking how to meet the growing enterprise demand for AI compute. In a blog post, the company discusses the reasons it needed to explore alternatives for its AI compute needs and why it selected Intel’s technology.
Overall, the market is looking for alternatives in AI compute technology. Highly competitive solutions exist today, in both hardware and software, but they can be expensive and difficult to access. Intel's proven record of working with the open ecosystem and independent software vendors to scale technologies gives developers more choice and compatibility. That's why customers like Stability AI are choosing Intel's alternative solutions, which keep compute costs down without compromising performance.
The challenges that come with deploying GenAI in production are not unique to Stability AI. Research shows that a large share of companies are experimenting with generative AI, or GenAI (AI that "generates" new content based on queries of existing data), with about 10% of those already moving GenAI workflows into production.
But the reality is that many organizations that want to start adopting GenAI today get stuck at the pilot or proof-of-concept phase. They can’t scale their solution and make the jump into a production environment where their business can take full advantage of it. Here’s why organizations are getting stuck and what questions they need to ask to set themselves up for success.
To move from the proof-of-concept phase into production, organizations need the four S’s: speed, scale, sustainable cost and security. Organizations need all four to be successful, but often sacrifice one for another without realizing it.
Speed and Scale
To achieve speed, many companies use inexpensive APIs that are difficult to scale to the level needed by enterprise customers. Or, in pursuit of scale, they try to build their own large language models (LLMs), which demand a great deal of time, work and expertise that is hard to find. Because there has been little standardization over the past decade of enterprise AI development, many organizations feel they must sacrifice speed and invest the time and work in building their own LLMs.
But that's no longer the case. Today, most organizations don't need to hire an army of data scientists; they can partner with the AI community to take an open solution and customize it for their needs. This approach speeds up GenAI projects considerably, and enterprises should make sure they are not wasting time building something on their own from scratch.
Sustainable Cost
Many organizations also get lured into “pay as you go” pricing models from some providers as they start building pilots. While this approach may offer a cheap and easy way to get started, the cost quickly becomes unsustainable. A project may cost only tens of dollars to test, but when a company moves from a proof of concept to a production environment, the cost can quickly skyrocket to millions of dollars.
In contrast, a yearly contract for an enterprise license with another provider may seem expensive out of the gate, but that price won't change when moving from pilot to production. Over time, the savings add up. A "pay as you go" service could end up costing $3 to $5 per query, while deploying an enterprise-grade system into your own environment could cost just 0.1 cent per query.
A good analogy here is renting versus buying a car. If you only want to drive a car occasionally, then rent. If you want that car available every day and plan to put some serious miles on it, buying is a better option.
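The trade-off above can be sketched with a quick back-of-the-envelope calculation. The per-query figures come from the text; the annual license fee is a hypothetical number chosen purely for illustration:

```python
import math

# Per-query figures from the article (low end of the $3-$5 range);
# the annual license fee below is an assumed, illustrative value.
PAYG_COST_PER_QUERY = 3.00         # "pay as you go" pricing
ENTERPRISE_COST_PER_QUERY = 0.001  # ~0.1 cent per query in-house
ANNUAL_LICENSE = 500_000           # hypothetical yearly enterprise license


def total_cost_payg(queries: int) -> float:
    """Total annual cost under pay-as-you-go pricing."""
    return PAYG_COST_PER_QUERY * queries


def total_cost_enterprise(queries: int) -> float:
    """Total annual cost under a fixed license plus per-query cost."""
    return ANNUAL_LICENSE + ENTERPRISE_COST_PER_QUERY * queries


def break_even_queries() -> int:
    """Query volume at which the enterprise license becomes cheaper."""
    return math.ceil(
        ANNUAL_LICENSE / (PAYG_COST_PER_QUERY - ENTERPRISE_COST_PER_QUERY)
    )


if __name__ == "__main__":
    print(f"License pays for itself after ~{break_even_queries():,} queries")
    for volume in (10_000, 1_000_000):
        print(
            f"{volume:>9,} queries: PAYG ${total_cost_payg(volume):,.0f} "
            f"vs enterprise ${total_cost_enterprise(volume):,.0f}"
        )
```

Under these assumed numbers, the fixed license overtakes pay-as-you-go well before a production-scale query volume is reached, which is the "renting versus buying" point the analogy makes.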
Security
Before moving to production, organizations also must understand how their data is secured, where it goes (especially if they’re working with a partner), and ensure it meets the commitments they’ve made to their customers and any regulations that apply to them.
Putting Generative AI to Use
Artificial intelligence is a tectonic shift for industry. I’d compare the current moment in AI to 1996 for the internet: the potential is clear, but the true impact is unknown.
Compared to other evolutions, AI is seeing incredibly fast adoption and time to value. It’s still too early to understand all the opportunities AI will unlock – or its downstream implications – but the possibilities are incredible.
At Intel, we’re working to bring AI everywhere – to new use cases, industries, devices, people and more. On Dec. 14, at our AI Everywhere launch event, we’ll feature many more enterprise customers that are using Intel technologies to bring their GenAI products into production.
Suffice it to say, businesses that balance speed, scale, sustainable cost and security without sacrificing any of them will have the best chance of reaping the rewards of the generative AI revolution.
Arun Subramaniyan leads the Data Center & AI Cloud Execution and Strategy team at Intel Corporation.