Generative AI: Can Your Cloud Infrastructure Take You to the Candy Castle?

According to Gartner’s recent Hype Cycle for Artificial Intelligence 2024, investment in AI has hit a new high, thanks to a global focus on generative AI (GenAI). Yet Gartner also found that, to date, GenAI has not produced the anticipated business value. Having crossed Gartner’s “Peak of Inflated Expectations,” where there’s more hype than proof, we’ll soon slide into the “Trough of Disillusionment” as early adopters face performance snags that lower their ROI.

I know, it sounds like a tech version of the children’s board game, Candy Land, where players pass through places like the Peppermint Forest and Molasses Swamp on their way to the Candy Castle. But with AI the stakes are real: as the Harvard Business Review reports, up to 80% of AI projects fail, and real money is being lost.

For many companies, the biggest failure is not ensuring that their cloud infrastructure can handle GenAI research and development. The payoff for getting it right is large: unlocking insights within unstructured data delivers tremendous value across an enterprise. It can improve decision-making and product quality; enable marketers to reach the right audience with the right content; drive customer experiences with personalization; and unearth market trends. The list of possibilities is endless.

Yet, without an environment optimized for AI, you’ll be stuck at square one.

Why the Cloud?

Some argue that cloud-based GenAI is not cost-effective because it’s cheaper to deploy the required high-end processing and networking on-premises. However, running GenAI this way requires GPUs, which are not only expensive but scarce. To make that investment pay off, you also must run workloads 24×7 at 90% utilization of resources. Most organizations prefer to develop incrementally instead, which the cloud allows. And when it comes to unpredictable workloads, the elasticity of the cloud offers a far better approach.
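To make the utilization argument concrete, here is a minimal sketch of a break-even calculation. Every figure in it (purchase price, lifetime, operating cost, cloud hourly rate) is a hypothetical placeholder, not a quote from any vendor, and the cost model deliberately ignores things like staffing, networking and discounts.

```python
HOURS_PER_YEAR = 24 * 365


def yearly_ownership_cost(purchase_price, lifetime_years, yearly_operating_cost):
    """Straight-line yearly cost of owning one GPU server (simplified model)."""
    return purchase_price / lifetime_years + yearly_operating_cost


def on_prem_cost_per_useful_hour(purchase_price, lifetime_years,
                                 yearly_operating_cost, utilization):
    """Cost of each hour the owned hardware is actually doing work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    yearly = yearly_ownership_cost(purchase_price, lifetime_years,
                                   yearly_operating_cost)
    return yearly / (HOURS_PER_YEAR * utilization)


def break_even_utilization(purchase_price, lifetime_years,
                           yearly_operating_cost, cloud_hourly_rate):
    """Utilization above which owning beats paying the cloud rate per hour used."""
    yearly = yearly_ownership_cost(purchase_price, lifetime_years,
                                   yearly_operating_cost)
    return yearly / (HOURS_PER_YEAR * cloud_hourly_rate)


# Hypothetical inputs, for illustration only.
utilization_needed = break_even_utilization(
    purchase_price=30_000, lifetime_years=3,
    yearly_operating_cost=5_000, cloud_hourly_rate=2.0,
)
print(f"Owning only wins above {utilization_needed:.0%} utilization")
```

If your real workloads are bursty or still experimental, the utilization you can sustain is likely to sit well below the break-even figure this kind of calculation produces, which is where pay-as-you-go pricing tends to come out ahead.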

Another factor in the cloud’s favor is the types of GenAI models being used. Right now, there’s a battle between open-source and closed-source models. Unfortunately, closed-source models cannot be run on-premises, even though they can outperform their open-source rivals by a wide margin. Using closed-source models requires the cloud. Thankfully, the cloud offers a low cost of entry and is supported by an ecosystem of managed services and expert partners.

Improving Infrastructure

There are several ways companies can ensure their computing and storage infrastructure is capable of handling GenAI cost-efficiently, including:

Modernizing and organizing: Tune applications for high performance while placing files and metadata correctly to ensure cost-effective scaling.
Leveraging existing cloud credits: Cloud providers offer redeemable credits that can be used to reduce the cost of cloud computing services. Apply these first to test your architecture as thoroughly as possible.
Configuring correctly: Ensure compute and storage configurations are set properly to avoid unexpected cost overruns. Understand the size of your model so you can match it to the right GPU, and on the storage side, watch workloads and tweak accordingly to head off latency.
Consolidating data: You’ll be dealing with large sets of data from various sources. Clean, combine and consolidate what you can and ensure it’s all accessible. This makes the data more usable and yields more relevant insights, because you’ll be analyzing your complete data, not just a subset.
Model tuning: Even when you have a framework for performance and system evaluation in place, GenAI apps and models require continuous tuning and optimization. Cloud providers often offer multiple models for evaluation that are easy to find and deploy, which makes it simpler and cheaper to test your way to the right model.
Optimizing data: Providing access to a large volume of quality data creates a foundation on which AI can cross-reference and validate data, weeding out misinformation. For best results, position your data close to your collection and analytical resources.
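For the “Configuring correctly” point about matching model size to a GPU, a rough back-of-the-envelope estimate can be sketched as follows. The bytes-per-parameter values follow from the numeric formats themselves; the 1.2 overhead factor for activations and caches is a rule-of-thumb assumption, not a measured value, and real memory use depends on batch size, context length and the serving framework.

```python
# Storage size of one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}


def estimate_inference_memory_gb(num_params_billion, precision="fp16",
                                 overhead_factor=1.2):
    """Rough GPU memory needed to serve a model, in GB (1 GB = 1e9 bytes).

    overhead_factor is an assumed allowance for activations and caches.
    """
    bytes_per_param = BYTES_PER_PARAM[precision]
    # 1e9 parameters times bytes per parameter, divided by 1e9 bytes per GB.
    weights_gb = num_params_billion * bytes_per_param
    return weights_gb * overhead_factor


def fits_on_gpu(num_params_billion, gpu_memory_gb, precision="fp16",
                overhead_factor=1.2):
    """True if the estimate fits within a single GPU's memory."""
    needed = estimate_inference_memory_gb(num_params_billion, precision,
                                          overhead_factor)
    return needed <= gpu_memory_gb


# Illustration: a hypothetical 7-billion-parameter model on a 24 GB GPU.
needed_gb = estimate_inference_memory_gb(7, precision="fp16")
print(f"Estimated memory: {needed_gb:.1f} GB, fits on 24 GB: {fits_on_gpu(7, 24)}")
```

An estimate like this is only a first filter for choosing an instance type; the cloud credits mentioned above are a good way to confirm it with a real test run.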

Getting Started

A lot of organizations see GenAI struggles as a technology problem, but they’re actually a business issue. You need to identify what’s holding you back, then use the right tools to tackle the problem. Further, some wait to find a use case until they’ve worked through technical issues, when they really should find the use case first in order to gain a clear understanding of goals and what the ROI should look like.

Failing to understand your cause and criteria makes GenAI projects in the cloud unnecessarily complex. Every model and workload is different, so set ideal output and performance benchmarks, then work backwards from there. Again, use that bank of cloud credits you’ve built with providers to test every aspect of your infrastructure.

Begin with a proof of concept (PoC) involving at least 10 users to start getting feedback, even if they give the experience a thumbs down. Constantly monitor every input your GenAI application receives and every output it produces, and evaluate these against your standard benchmarks. This alone will provide insight into workload changes you’ll need to make in order to take things to the next level.
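As one way to picture that monitoring loop, the sketch below records each PoC interaction with its thumbs-up or thumbs-down and compares the totals against benchmarks. The `Interaction` record, the `evaluate` function and the threshold values are all hypothetical names and numbers chosen for illustration; a real deployment would log to a database or a managed monitoring service instead of an in-memory list.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Interaction:
    """One logged request/response pair from the PoC (hypothetical schema)."""
    prompt: str
    response: str
    latency_seconds: float
    thumbs_up: bool


def evaluate(interactions, min_approval_rate, max_mean_latency_seconds):
    """Compare logged PoC interactions against the chosen benchmarks."""
    if not interactions:
        raise ValueError("no interactions recorded yet")
    approval_rate = mean(1.0 if i.thumbs_up else 0.0 for i in interactions)
    mean_latency = mean(i.latency_seconds for i in interactions)
    return {
        "approval_rate": approval_rate,
        "mean_latency_seconds": mean_latency,
        "meets_approval": approval_rate >= min_approval_rate,
        "meets_latency": mean_latency <= max_mean_latency_seconds,
    }


# Made-up sample log, for illustration only.
log = [
    Interaction("Summarize Q3 report", "…", 1.0, True),
    Interaction("Draft a product email", "…", 2.0, True),
    Interaction("Find churn drivers", "…", 3.0, False),
    Interaction("Translate the FAQ", "…", 2.0, True),
]
report = evaluate(log, min_approval_rate=0.8, max_mean_latency_seconds=2.5)
print(report)
```

Keeping the thumbs-down entries in the log matters as much as the successes, since those are the cases that point to the workload changes you need to make.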

Finally, don’t go it alone. There are managed services with features like built-in security measures to prevent toxic content from making its way into your data. There are tools from major providers like Amazon and Google that provide guardrails. And there are consultancies that can bring it all together, using their hands-on expertise to create a cost-efficient and safe approach.

Simply put, GenAI can provide sweet success or leave a sour taste in your mouth. If you want to reach the Candy Castle and avoid your own Trough of Disillusionment, get your infrastructure AI-ready and know where you want it to take you.

About the author: Eduardo Mota is senior cloud data architect – AI/ML specialist, at DoiT, a provider of technology and cloud expertise to buy, optimize, and manage AWS, GCP, and Azure cloud services. An accomplished Cloud Architect and Machine Learning Specialist, he holds a Bachelor of Business Administration and multiple Machine Learning certifications, demonstrating his relentless pursuit of knowledge. Eduardo’s journey includes pivotal roles at DoiT and AWS, where his expertise in AWS and GCP cloud architecture and optimization strategies significantly impacted operational efficiency and cost savings for multiple organizations.