Enabling Hybrid Data Architecture for the AI Era

As hybrid and multi-cloud AI strategies gain urgency across global enterprises, Cloudera is positioning itself around a “true hybrid” architecture enabled by recent acquisitions and its Anywhere Cloud strategy. Leo Brunnick, Chief Product Officer at Cloudera, discusses the company’s roadmap, data fabric capabilities, and its containerized deployment model.

The demand for hybrid and multi-cloud AI deployments is growing. How is that driving your product strategy?

Cloudera has always supported hybrid, but the definition of hybrid has matured over time. Initially, hybrid meant you could run software on-prem or in the cloud, or even in both Amazon’s cloud and Microsoft’s cloud, but those deployments didn’t necessarily know about one another.

The next phase was enabling customers to run some workloads on prem and some in the cloud, while managing them through a single pane of glass across a complex data estate.

Now the world has moved further. What we are delivering, currently in tech preview, is the same piece of code, the same container, deployed on prem or across Amazon, Google, Microsoft, or Oracle Cloud. It is literally the same code base. This is enabled through our Anywhere Cloud architecture, made possible by our acquisition of Tykon six months ago.

With Anywhere Cloud, customers can burst workloads. For example, they may use on-prem GPUs because it’s cheaper, and when they need a spike, burst to the cloud. They can even choose spot instances dynamically, whichever provider is cheaper that day. Additionally, because this is a virtualized platform managed through a single console, customers gain multi-cloud failover. If Azure or AWS goes down, it doesn’t have to take your operations down.
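The bursting decision described above can be sketched as a simple cost comparison. This is an illustrative sketch only, not Cloudera’s API; the provider names, prices, and on-prem rate are all hypothetical.

```python
# Hypothetical sketch of cost-based bursting: run on-prem GPUs by
# default, and burst to whichever cloud's spot price undercuts the
# amortized on-prem rate today.
from dataclasses import dataclass

@dataclass
class SpotQuote:
    provider: str              # e.g. "aws", "azure", "gcp", "oci"
    price_per_gpu_hour: float  # current spot price

ON_PREM_COST_PER_GPU_HOUR = 1.10  # hypothetical amortized on-prem rate

def choose_target(quotes: list[SpotQuote]) -> str:
    """Stay on-prem unless some cloud spot price is cheaper today."""
    cheapest = min(quotes, key=lambda q: q.price_per_gpu_hour)
    if cheapest.price_per_gpu_hour < ON_PREM_COST_PER_GPU_HOUR:
        return cheapest.provider
    return "on-prem"

quotes = [
    SpotQuote("aws", 1.45),
    SpotQuote("azure", 0.98),
    SpotQuote("gcp", 1.20),
]
print(choose_target(quotes))  # azure: 0.98 undercuts the 1.10 on-prem rate
```

A real scheduler would also weigh data-egress cost, GPU type, and sovereignty constraints, but the core decision is this kind of per-workload cost comparison across providers.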

This isn’t just about vendor lock-in. It’s about true hybrid for resilience, sovereignty, and cost control.

You’ve made several acquisitions recently, including Verta, Octopai, and Tykon. What do they bring?

Across our customer executive forums globally, we consistently hear similar priorities: AI and agentic AI adoption, understanding complex data estates, and simplifying deployment in hybrid environments. We made three acquisitions in the span of a year.

Verta, in the AI space, accelerated our AI Studios and Agent Studios capabilities. Octopai, a data lineage company, provides more than lineage; it offers comprehensive data visualization and mapping, forming a core part of our data fabric. Tykon enabled us to own our Kubernetes layer. Originally, the goal was to avoid having six different product versions across Kubernetes environments. What we gained was much more: a powerful orchestration layer that enables rapid deployment from bare metal in minutes.

The combination of these capabilities allows us to deliver AI solutions with full data fabric inside a write-once, deploy-anywhere infrastructure. When we explain this architecture, customers immediately understand the value because no one else combines all these elements in this way.

Post-acquisitions, what synergies have you driven into your product roadmap? Have there been announcements that bring these together?

Yes, we’ve announced several roadmap advancements over the past few months.

First, we announced extended long-term support for our core CDP platform. Our 7.1.9 and 7.3.2 releases will now have six years of long-term support. That means customers don’t have to constantly migrate. Second, our Data Services platform, version 1.5.5, now enables on-prem capabilities that were not previously available. This includes full AI Workshop, AI Studios, Agent Studios, and on-prem inferencing. It also includes Trino support, REST catalog integration, and Apache Iceberg support, all capabilities customers have been requesting. We’ve also announced long-term support for this platform.

With the acquisition of Tykon, we are launching the Anywhere Cloud strategy, entering tech preview in Q1 2026.

Normally, building a fully virtualized platform would take years. In our case, we can pull all CDP platform elements into a container, including our data fabric and lakehouse architecture. Cloudera AI components are also integrated into that container.

The container itself is best-in-class and was already outperforming others in the market. As a result, in months, we’ve delivered the architecture customers have been waiting for. We believe it is a game changer. No other vendor has all of these components combined.

Customer and analyst feedback over the past 60 days has been very strong. In the recent Forrester Wave for Data Fabric, we were positioned ahead of most major vendors and were the only company to receive a perfect score in both vision and roadmap.

Tell us about your Open Data Lakehouse powered by Apache Iceberg. How does it strengthen flexibility?

At its core, Cloudera is a lakehouse company. Streaming, ingestion, analytics engines, object storage, and file storage all revolve around the lakehouse architecture.

This is crucial, as every enterprise has a complex data estate. Even organizations that think they are “pure cloud” are one acquisition away from complexity again. We focus on interoperability. By supporting open formats like Apache Iceberg and enabling REST APIs, customers can query across environments: data sitting in Snowflake, Cloudera, Oracle, or elsewhere can be joined within a single complex query. That reflects how real-world enterprise data environments actually look.
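A federated query of this kind typically uses three-part table names, where the first part identifies the catalog (and hence the backing system). The sketch below is illustrative, not Cloudera’s product syntax: the catalog names (`cloudera`, `snowflake`, `oracle`) and table names are hypothetical, standing in for connectors an engine like Trino would be configured with, with the Iceberg tables exposed through a REST catalog.

```python
import re

# Hypothetical federated SQL spanning three systems in one query.
# Each three-part name is catalog.schema.table; the catalog decides
# which backing store the engine reads from.
FEDERATED_QUERY = """
SELECT c.customer_id, c.region, SUM(o.amount) AS total_spend
FROM cloudera.sales.orders AS o
JOIN snowflake.crm.customers AS c
  ON o.customer_id = c.customer_id
JOIN oracle.finance.exchange_rates AS r
  ON o.currency = r.currency
GROUP BY c.customer_id, c.region
"""

def catalogs_referenced(sql: str) -> set[str]:
    """Extract the catalog component of each three-part table name."""
    return set(re.findall(r"\b(\w+)\.\w+\.\w+\b", sql))

print(sorted(catalogs_referenced(FEDERATED_QUERY)))
# ['cloudera', 'oracle', 'snowflake']
```

The point is that the query text itself is ordinary SQL; the engine’s catalog configuration, not the application, determines where each table physically lives.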

How are you simplifying GenAI, RAG, and agent-based deployments?

With the Verta acquisition, we accelerated development of our AI Studios, including RAG Studio, Agent Studio, and Fine-Tuning Studio. These environments are low-code or no-code. We’ve conducted hackathons with large enterprise customers where business users build deployable agents within a single day. That has been eye-opening for many organizations.

What type of customers benefit most from Cloudera?

Cloudera is particularly suited for large enterprises with complex estates. Banks, governments, and industrial companies typically operate mainframes, decades-old databases, client-server systems, SaaS applications, and multiple cloud platforms simultaneously. That is where we add value.

That complexity becomes critical when organizations begin deploying AI. AI is only as good as the underlying data, and most organizations do not yet have fully governed, lineage-tracked, and secured data environments. This becomes even more important with AI agents. Agents should operate like employees: just as an employee badge provides limited access, AI agents require fine-grained, role-based access control. Without proper governance, agents can act beyond their intended permissions.
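The “badge” analogy maps directly onto role-based access control. The sketch below is an assumed design, not a Cloudera API; the roles and permission strings are hypothetical.

```python
# Minimal RBAC sketch: an agent's role grants an explicit set of
# permissions, and any action outside that set is refused, just as
# an employee badge opens only certain doors.
ROLE_PERMISSIONS = {
    "support_agent": {"tickets:read", "tickets:comment"},
    "finance_agent": {"invoices:read", "invoices:approve"},
}

def check(agent_role: str, action: str) -> None:
    """Raise unless the agent's role explicitly grants the action."""
    allowed = ROLE_PERMISSIONS.get(agent_role, set())
    if action not in allowed:
        raise PermissionError(f"{agent_role} may not perform {action}")

check("support_agent", "tickets:read")          # permitted, no error
try:
    check("support_agent", "invoices:approve")  # outside its "badge"
except PermissionError as e:
    print(e)
```

The deny-by-default stance is the governance point: an agent whose role is missing from the table gets no permissions at all, rather than inheriting broad access.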

How do you see the future balance between cloud and on-prem infrastructure?

Market perception has shifted significantly. Previously, some analysts predicted only 2–3% of meaningful data would remain on-prem long term. However, more recent projections suggest closer to 40% on-prem and 60% in public cloud. That is because many enterprises are experiencing a “cloud hangover” of escalating and unpredictable cloud spending. Some organizations are spending hundreds of millions annually and struggling to forecast costs. At the same time, data volumes are expanding exponentially, especially with AI and metadata growth. We have customers managing petabytes, even exabytes, of data.

Looking ahead, the future is not cloud-only when it comes to data storage. It is definitively hybrid.
