Scaling generative AI with cognitive infrastructure and FinOps | HCLTech

How cognitive infrastructure supports generative AI ambitions at scale

As organizations increasingly adopt generative AI for business value, it becomes crucial to establish a scalable, optimized and flexible digital foundation
 
3 minutes read
Nicholas Ismail
Global Head of Brand Journalism, HCLTech
In a recent discussion with HCLTech Trends and Insights, Gaurav Sharma, AVP, Hybrid Cloud Product Management, Hybrid Cloud Business Unit at HCLTech, shared insights on the pivotal role of cognitive infrastructure, automation and cost management in ensuring efficient and effective AI operations.

Building a foundation with cognitive infrastructure

Cognitive infrastructure forms the backbone of GenAI/AI at scale, enabling organizations to transition from proof of concept (POC) to full-scale production with agility and reliability. As Sharma emphasized, “generative AI requirements and AI requirements… have become the norm.” However, scaling GenAI presents unique infrastructure challenges, from high computational needs to low-latency networking and extensive data storage, each of which must be approached through a different implementation lens.

Sharma highlighted that optimizing infrastructure is essential for handling GenAI workloads, including large language models (LLMs) and specialized mid-sized models, especially when faced with high security and compliance requirements. “You need high throughput IOPS [input/output operations per second],” he noted, while mentioning the need for “security and compliance requirements” and sometimes even “a completely new architecture… which would have other requirements like liquid cooling as well.”
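As a back-of-envelope illustration of why GenAI workloads drive these high-throughput storage requirements, consider checkpointing a large model within a tight time window. The model size and window below are illustrative assumptions, not figures from the interview:

```python
# Illustrative sketch: checkpointing a large model within a fixed window
# implies a minimum sustained write bandwidth from the storage layer.

def required_gbps(checkpoint_size_gb: float, window_seconds: float) -> float:
    """Minimum sustained write bandwidth (GB/s) to finish in the window."""
    return checkpoint_size_gb / window_seconds

# Assumption: a 70B-parameter model in fp16 (~140 GB of weights),
# checkpointed in a 60-second window:
print(round(required_gbps(140, 60), 2))  # 2.33 GB/s sustained
```

Sustaining multiple gigabytes per second of writes, repeatedly during training, is well beyond what general-purpose enterprise storage is typically sized for, which is why these workloads call for a purpose-built foundation.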

To streamline this process, cognitive infrastructure offers pre-built “blueprints, architectures… [and] t-shirt sizes” tailored to different stages of the AI journey, eliminating the need for companies to start from scratch. This proactive approach helps organizations select the model and infrastructure that best align with their unique requirements and objectives, and deploy them faster to suit the organization’s needs.
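The t-shirt-sizing idea can be sketched as a lookup from workload needs to a pre-built configuration. The size names, GPU counts and capacities below are hypothetical placeholders for illustration, not HCLTech’s actual blueprints:

```python
# Hypothetical sketch of t-shirt-sized infrastructure blueprints.
# All values are illustrative assumptions, not real offerings.

BLUEPRINTS = {
    "S": {"gpus": 2,  "storage_tb": 10,  "use_case": "POC / small fine-tuning"},
    "M": {"gpus": 8,  "storage_tb": 50,  "use_case": "mid-sized model serving"},
    "L": {"gpus": 32, "storage_tb": 200, "use_case": "LLM training at scale"},
}

def pick_blueprint(required_gpus: int) -> str:
    """Return the smallest t-shirt size whose GPU count covers the need."""
    for size in ("S", "M", "L"):
        if BLUEPRINTS[size]["gpus"] >= required_gpus:
            return size
    raise ValueError("workload exceeds largest pre-built blueprint")

print(pick_blueprint(6))  # a mid-sized serving workload lands on "M"
```

The point of the pattern is that an organization chooses from a small, validated menu instead of designing compute, storage and networking from first principles for every use case.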

HCLTech’s Cognitive Infrastructure offering supports organizations on their GenAI/AI journey with automation- and cost-optimization-led strategies for the AI use cases that are crucial to scaling these ambitions.

Automation and cost optimization tools

Automation and cost optimization are integral to sustaining GenAI initiatives in both hybrid and public cloud settings. Automation should address “the management, provisioning [and] running of the GenAI workloads in the most efficient manner,” said Sharma. 

By leveraging third-party tools or proprietary accelerators/IPs, organizations can streamline operations and minimize resource waste, a necessity given the complex infrastructure demands of generative AI.

In addition to automation, cost optimization tools play a critical role. Running GenAI at scale is resource-intensive and can become costly if not managed effectively. 

“Cost is a big factor,” he noted, and one that directly influences whether POCs transition to full-scale implementations. Companies need to weigh the “value that it is bringing versus the cost it is incurring,” balancing both tangible and intangible returns on investment.
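The value-versus-cost check Sharma describes can be made concrete with a simple decision rule. The dollar figures and the additive treatment of intangible value below are assumptions for illustration only:

```python
# Illustrative sketch of the POC-to-production decision: scale up only if
# tangible plus (estimated) intangible value covers the run cost.
# Figures and the simple additive model are assumptions.

def should_scale(monthly_value: float, monthly_cost: float,
                 intangible_uplift: float = 0.0) -> bool:
    """True if tangible + intangible value exceeds the monthly run cost."""
    return monthly_value + intangible_uplift > monthly_cost

# A POC generating $40k tangible value and an estimated $15k intangible
# uplift, against $50k of monthly infrastructure cost:
print(should_scale(40_000, 50_000, intangible_uplift=15_000))  # True
```

In practice the intangible side (customer experience, employee productivity) is hard to quantify, which is exactly why Sharma frames it as a balancing exercise rather than a pure cost calculation.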

One way to address these costs is through financial operations (FinOps), which Sharma described as crucial for optimizing GenAI expenses. For instance, companies adopting private or hybrid AI setups can explore utility-based models or pay-as-you-go options to reduce upfront costs. 
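One common FinOps comparison behind the pay-as-you-go option is the break-even point between renting capacity and buying it outright. All prices in this sketch are hypothetical placeholders:

```python
# Hedged sketch of the FinOps trade-off: upfront (capex) purchase vs a
# pay-as-you-go model. All dollar amounts are illustrative assumptions.

def breakeven_months(upfront_cost: float, payg_monthly: float,
                     owned_monthly_opex: float) -> float:
    """Months after which buying outright becomes cheaper than renting."""
    saving_per_month = payg_monthly - owned_monthly_opex
    if saving_per_month <= 0:
        return float("inf")  # pay-as-you-go is always cheaper
    return upfront_cost / saving_per_month

# Assumption: $300k upfront purchase vs $20k/month pay-as-you-go,
# with $5k/month to operate the owned hardware:
print(breakeven_months(300_000, 20_000, 5_000))  # 20.0 months
```

A workload expected to run well past the break-even point favors the upfront model; an uncertain POC favors pay-as-you-go, which is why utility-based pricing lowers the barrier to experimentation.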

However, Sharma explained that “optimization is not the end goal.” Cost-efficiency should accompany every phase of the AI journey to prevent financial strain on the organization.

The significance of industry events for AI progress

Looking ahead, Sharma expressed enthusiasm for discussing these strategies further at the Gartner 2024 EMEA IT Infrastructure, Operations, and Cloud Strategies Conference, where HCLTech will be a Premier Sponsor. Highlighting the conference’s importance, he stated, “this is a very important event for anyone who’s looking forward to the infrastructure, the operations, strategies… in the GenAI and the AI era.”

He cited three primary benefits of attending: hearing insights from Gartner experts, networking with peers to foster mutual learning, and exploring emerging trends and thought leadership in generative AI.

The conference theme, focused on AI, promises to offer a platform for sharing best practices and discovering innovative approaches to infrastructure and operations in an AI-driven world.

In closing, Sharma explained that this journey is one of continuous evolution, refinement and adaptation, aligning infrastructure and costs with the overarching goal of maximizing value for both the organization and its end users/customers. 

In today’s fast-evolving landscape, businesses must harness the power of AI to remain competitive, innovative and adaptive to new challenges. As a Premier Sponsor at the Gartner 2024 EMEA IT Infrastructure, Operations & Cloud Strategies Conference, HCLTech is excited to showcase how organizations can drive transformation at scale with a sustainable and resilient digital foundation. Find out more here: HCLTech at Gartner IOCS EMEA Conference 2024.
