
Cloud or Edge: Where should AI inference run?

 

Jack Gold

Within two to three years, 85% of enterprise AI workloads will be inference-based, a reversal of today's predominance of training workloads. This shift will have significant implications for how organizations run such workloads, and for determining the best place to deploy the resources that run them.

Most current AI training workloads need high-end specialized GPUs and run in the cloud at hyperscalers, which offer quick time to deployment, scalable compute resources, software/model availability, and ease of implementation. Most enterprise AI workloads running today are still experimental and/or small scale.

As AI moves to production-level, inference-based solutions, high-end GPUs become less important and standard server SoCs are often more appropriate. Many variables must be evaluated to pick the best infrastructure assets to run these workloads.

In the chart below, we look at several factors that should be evaluated to determine whether it is best to run production inference-based AI workloads in a centralized cloud environment, or whether it makes more sense to run them in an Edge solution localized to the users of the solution.

We provide guidance on which we expect to have advantages for enterprises deploying production systems, especially those interested in maximizing productivity and security at minimal total cost of ownership. Each organization is unique, but we believe these generalized guidelines are a valid place to start.

Cloud vs. Edge Deployment of AI Inference

[Chart: evaluation criteria for Cloud vs. Edge deployment of AI inference workloads]
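One way to think about the chart's criteria is as a weighted comparison across the two deployment options. The sketch below is purely illustrative and is not the author's methodology: every criterion name, weight, and score is a hypothetical assumption an organization would replace with its own priorities.

```python
# Hypothetical weighted-scoring sketch for a Cloud-vs-Edge decision.
# All criteria, weights, and scores below are illustrative assumptions,
# not the evaluation chart from the article.

CRITERIA_WEIGHTS = {  # relative importance of each factor (sums to 1.0)
    "latency": 0.25,
    "data_privacy": 0.25,
    "total_cost_of_ownership": 0.20,
    "scalability": 0.15,
    "ease_of_deployment": 0.15,
}

def weighted_score(scores: dict) -> float:
    """Combine per-criterion scores (0-10 scale) into one weighted total."""
    return sum(CRITERIA_WEIGHTS[c] * s for c, s in scores.items())

# Example: an enterprise that prizes low latency and data privacy.
edge_scores = {"latency": 9, "data_privacy": 9,
               "total_cost_of_ownership": 7, "scalability": 5,
               "ease_of_deployment": 6}
cloud_scores = {"latency": 5, "data_privacy": 5,
                "total_cost_of_ownership": 6, "scalability": 9,
                "ease_of_deployment": 8}

edge_total = weighted_score(edge_scores)    # 7.55 with these numbers
cloud_total = weighted_score(cloud_scores)  # 6.25 with these numbers
print("edge:", edge_total, "cloud:", cloud_total)
```

With these particular weights the Edge option wins, but a different organization, say one whose dominant concern is elastic scalability, would weight the criteria differently and could reach the opposite conclusion.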

Bottom Line: The chart above outlines a number of evaluation criteria for determining whether Cloud (including hybrid or remote cloud) or Edge deployment of AI inference workloads is the better alternative. Each organization may have different requirements, and these are guidelines. But for many enterprises, deploying inference-based AI workloads on standard systems at localized Edge computing resources provides a much better solution for price, performance, and security/privacy.

Jack Gold is founder and principal analyst at J.Gold Associates, LLC. With more than 45 years of experience in the computer and electronics industries, and as an industry analyst for more than 25 years, he covers the many aspects of business and consumer computing and emerging technologies. Follow him on Twitter @jckgld or LinkedIn at https://www.linkedin.com/in/jckgld.