The importance of Graphics Processing Units (GPUs) has grown in recent years thanks to their versatility across gaming, high-performance computing, and AI. Looking under the hood of your laptop, smartphone, smartwatch, or gaming console, you'll likely find a GPU, a hardware-based H.264, H.265, or AV1 decoder, or both. The benefits are clear for end users, and a great user experience is a fundamental expectation for many. This explains why every tech enthusiast is eager to equip their gaming rigs with the latest and greatest GPUs from AMD, Intel, or NVIDIA. These GPUs are critical for high-end workstations and gaming, where acceleration can significantly improve application performance and user experience.
Beyond that, datacenter and cloud GPUs have become the foundation of artificial intelligence (AI). They have transformed Natural Language Processing and Recommender Systems, which were once slow and inaccurate, into faster and more precise operations.
In the context of end user computing, DaaS, and professional graphics, GPUs facilitate tasks like video editing and 3D graphics rendering, making them indispensable for designers working with products from Adobe, Autodesk, Unreal Engine, or Dassault Systemes (to name a few).
Overall, GPUs have become essential tools for a wide range of applications, from high-performance computing to AI and machine learning, and their use cases are continuously expanding.
The Importance of GPUs in Desktop-as-a-Service (DaaS)
But why do GPUs and hardware-based encoders/decoders matter in the world of DaaS?
Although GPUs serve various purposes with their API support and encoders/decoders, ultimately, their goal is to deliver the best graphical user experience. They support application APIs like DirectX, OpenGL, Vulkan, and CUDA, and offload capture, encoding, and decoding from the CPU using H.264, H.265, and AV1. This optimization of resource consumption enhances the user experience with remoting protocols such as the Frame Remoting Protocol. The goal here is to create a superior user experience for those running the latest operating systems, such as Windows 11, and their applications on virtual desktops, virtual applications, cloud PCs, and cloud workstations powered by DaaS solutions such as Dizzion Frame, Amazon WorkSpaces, Citrix DaaS, Microsoft Azure Virtual Desktop (AVD), and VMware Horizon Cloud.
Why aren't all of the hundreds of millions of virtual desktop and application users worldwide using GPUs? There's a common misconception that GPUs are only for special "Cloud Workstation" use cases. The truth is that GPUs can be used for remoting protocols such as Frame Remoting Protocol in the DaaS context to deliver the best user experience, and also, many applications such as browsers, unified communications, and modern operating systems work best with a GPU or a “slice” of a GPU.
Where Frame Comes in
Modern DaaS remoting protocols, such as the Frame Remoting Protocol (FRP), leverage hardware-based encoding from NVIDIA and AMD GPUs to support codecs like H.264 and AV1. This enables users to run applications in the browser while offloading the CPU, resulting in a great user experience with high frame rates (fps) and various color encoding schemes such as YUV444. Another advantage of using GPUs or vGPUs is their ability to support application APIs like OpenGL, DirectX, Vulkan, and CUDA.
Frame Remoting Protocol 8 (FRP8) leverages WebRTC, utilizing UDP for transport to ensure an optimal user experience in both LAN and network-constrained environments, addressing key factors like bandwidth and latency. The use of H.264 for capture, encoding, and decoding capitalizes on its broad support across modern browsers and endpoint devices. This choice guarantees robust decoding capabilities, essential for delivering a high-quality user experience supporting 4K and multi-monitor setups (up to four screens), primarily via browser or browser engine integration.
In contrast, alternative VDI/DaaS solutions from vendors such as Amazon, Citrix, Microsoft, and VMware often treat browser/HTML5 access as a secondary option, offering only reduced functionality compared to their primary clients. Another advantage of FRP8’s reliance on WebRTC is its native support for essential Unified Communications features. This includes seamless access to webcam, microphone, and audio playback within the session, eliminating the need for additional software installations.
Frame's architecture is comprised of two main components: the Control Plane and the Data Plane. The Control Plane is responsible for resource provisioning, capacity management, user and admin interfaces, session policies, image management, brokering, and monitoring. On the other hand, the Data Plane connects the user's device or 'terminal' with their Workload VMs, which run virtual desktops or applications. This connection may be facilitated by a Streaming Gateway Appliance (SGA) for remote access. Frame Remoting Protocol 8 (FRP8) then establishes a direct and secure link between the terminal and Workload VM, enhancing security and transmission efficiency. This ensures the best user experience by providing the shortest path from the user to the workload VM.
GPU Considerations for Diverse DaaS Use Cases
You're probably wondering, "Do I really need GPUs for all my applications and use cases, including single task and knowledge workers?"
In my opinion, there are always nuances to consider. The fact is that DaaS with Windows 11, using the latest AMD and Intel CPUs, can work without GPUs. In fact, there are large-scale deployments, with over 90,000 customers, using virtual desktops and applications in CPU-only configurations. However, to be future-ready and ensure the best possible user experience, incorporating a GPU or a “slice” of a GPU is the way forward. Naturally, this comes with a cost impact, as is the case with many decisions. The art lies in finding the right balance between cost and user experience.
In the following sections, I'll walk you through some of the use cases and GPU options for Frame in public clouds and on-premises. I'll also discuss how to right-size the GPU-powered workload VMs, the tools to use, and the common mistakes to avoid when using GPUs with Frame.
Choosing the Right Option: CPU, GPU, vGPU, or Dedicated GPU
Frame has several strong processing options available for your Virtual Desktop and Application use cases. These include:
- CPU-only
- NVIDIA Virtual GPU (vGPU)
- AMD GPU (Virtual Functions, VF)
- Dedicated GPU options from AMD and NVIDIA
Selecting the optimal GPU option depends on the expected user experience, the use case, the application requirements, and the business case.
Dedicated and Virtual GPUs
Let's dive a bit deeper into the GPU options Frame can use in both public cloud and on-premises environments. From a GPU perspective, there are two technology directions:
Dedicated GPUs
Dedicated GPUs are also known as GPU “pass-through” or “DDA” (Discrete Device Assignment). This means the Virtual Machine (VM) has a dedicated “full GPU” at its disposal.
If the GPU board has multiple GPUs, each VM can access its own dedicated GPU. For instance, the NVIDIA A16 has four GPUs on one board, which means four VMs can be powered on and access a GPU in a 1:1 mapping. When the GPU board has only one GPU, one VM can be powered on and use that GPU. Commonly used GPUs such as the NVIDIA T4, L4, and L40 are powerful, but have only one GPU per board.
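The 1:1 mapping described above can be sketched as a small capacity calculation. This is a minimal illustration, not a Frame or NVIDIA tool: the `GpuBoard` type, the function name, and the choice of two boards per host are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuBoard:
    """Hypothetical description of a GPU board for capacity planning."""
    name: str
    gpus_per_board: int  # number of physical GPUs on one board


def passthrough_vm_capacity(board: GpuBoard, boards_per_host: int) -> int:
    """With pass-through, every powered-on VM owns exactly one physical GPU."""
    if boards_per_host < 0:
        raise ValueError("boards_per_host must be non-negative")
    return board.gpus_per_board * boards_per_host


a16 = GpuBoard("NVIDIA A16", gpus_per_board=4)
l4 = GpuBoard("NVIDIA L4", gpus_per_board=1)

# Two boards per host is an illustrative assumption, not a recommendation.
for board in (a16, l4):
    capacity = passthrough_vm_capacity(board, boards_per_host=2)
    print(f"{board.name}: {capacity} GPU-backed VMs per host")
```

The point of the sketch is that pass-through capacity is bounded by the physical GPU count alone; frame buffer size and GPU performance do not raise it.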
Dedicated GPU Options in Public Cloud
Today's commonly used GPU-based instances on Microsoft Azure, AWS, and Google Cloud are using dedicated GPUs. Typical instance families are Azure NC (NVIDIA-based), AWS G4 (AMD & NVIDIA GPU-based), AWS G5 (NVIDIA-based), and the GCP (NVIDIA-based) instances. These instances all use dedicated GPUs, and the NVIDIA-based instances include the NVIDIA vGPU licenses to be used in DaaS. Some of these instance families have machines with multiple dedicated GPUs in one VM, supporting specific high-end workstation applications, such as Autodesk VRED with real-time ray tracing.
Dedicated GPU Options in an On-prem Deployment
Dedicated GPU instances offer consistent high performance since the GPUs are not shared with 'noisy neighbors.' This means the GPU's frame buffer (memory), cores, and encoders/decoders are exclusively allocated to each instance. However, a downside from an infrastructure perspective is the inability to share the GPU with others, which also means the running costs cannot be distributed among multiple users.
Using dedicated GPUs is uncommon in an on-premises DaaS scenario due to limited flexibility and scalability. It's more common to use vGPUs with workstation-class GPU profile characteristics.
Virtual GPUs (vGPUs)
The term "Virtual GPUs" can have various interpretations. In this context, it refers to a virtual machine receiving a "slice" or "partition" of the GPU, managed by a software or hardware component at a lower level. This process of "slicing," "partitioning," or virtualizing the GPU can be achieved through software, hardware, or a combination of both, leading to a hybrid approach.
Virtual GPUs in an On-prem Deployment
For example, a hybrid setup might include NVIDIA's vGPU Manager software combined with SR-IOV, running on a Nutanix AHV hypervisor. In such a setup, different pre-defined vGPU profiles are available at the hypervisor level for use by virtual machines (VMs).
Similarly, AMD's slicing technology utilizes SR-IOV to create virtual functions for VMs. While this guide doesn't delve into specifics like the allocation of GPU frame buffer and cores for each VM or the intricacies of GPU scheduler configuration, understanding the performance and capabilities of vGPU profiles, virtual functions, and NVIDIA's vGPU software licensing options is crucial.
If you want to read more about NVIDIA vGPU software licensing and the capabilities provided by the software, I encourage you to read the NVIDIA vGPU licensing and packaging guide.
Virtual GPU Options in Public Cloud
Currently, a variety of vGPU options from both AMD and NVIDIA are available in the public cloud, particularly on Microsoft Azure, which offers instances powered by these leading manufacturers.
For instance, the Azure NVadsA10 instance is equipped with NVIDIA A10 GPUs and AMD EPYC Milan CPUs. The latest addition to Azure's vGPU family is the NGads_V620, which features AMD Radeon V620 GPUs and AMD EPYC Milan CPUs.
More detailed information about GPUs, instance types, and user experience can be found at ux.dizzion.com. You will see Frame in action with five engineering, architecture, and construction applications (e.g., Enscape, Autodesk Revit, Inventor, VRED, and Unreal Engine). Instances from Azure, AWS, and GCP are used at FHD and 4K resolution in LAN and WAN scenarios.
Optimizing Your GPU Setup: Essential Tools and Strategies
Understanding how the remoting protocol, operating system, and applications utilize the GPU is crucial. The goal is to monitor the usage of GPU cores, frame buffer, and encoders over a sufficient period. Armed with this data, you can begin the process of accurately sizing your workload machines.
Fortunately, various solutions from partners are available to help you easily monitor and capture GPU/application usage. Examples include ControlUp, Liquidware, Lakeside, and uberAgent.
Aside from these solutions, there are free community tools available, such as RD Analyzer, GPU profiler, GPU-Z, and Windows Performance Monitor found in Windows 10/11 and Server 2019/2022+.
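Whichever tool collects the data, the sizing step comes down to summarizing samples of GPU core, frame buffer, and encoder usage over the capture period. The sketch below is one way to do that, assuming the samples have already been exported as plain lists; the metric names and all values are hypothetical, and the nearest-rank 95th percentile is my choice of summary, not something the tools above prescribe.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(values):
    """Average, 95th percentile, and peak for one metric."""
    return {"avg": mean(values), "p95": percentile(values, 95), "peak": max(values)}


# Hypothetical samples, one value per polling interval.
samples = {
    "gpu_util_pct": [10, 12, 15, 11, 14, 18, 22, 20, 16, 13,
                     12, 17, 25, 30, 28, 19, 15, 14, 95, 21],
    "fb_used_mib": [600, 620, 640, 650, 700, 720, 760, 800, 790, 810,
                    805, 830, 900, 950, 940, 870, 860, 850, 1010, 880],
    "encoder_util_pct": [5, 6, 8, 5, 7, 9, 12, 10, 8, 6,
                         6, 9, 14, 18, 16, 10, 8, 7, 40, 11],
}

for metric, values in samples.items():
    print(metric, summarize(values))
```

Comparing the 95th percentile with the peak shows whether a single spike is driving the maximum. In this made-up data set one interval hits 95% GPU utilization while the rest stay at 30% or below, which is the kind of pattern a short capture window can easily miss or overstate.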
Of course, properly sizing your infrastructure and virtual machines is crucial for optimal performance. When optimizing your workload VMs, consider the following key questions:
Does a CPU-only configuration deliver the desired user experience, and does it meet your expectations? Which applications are in use, and would they benefit from GPU acceleration?
What is the right balance between user experience and costs? What are the current and projected expenses associated with each configuration?
Is a CPU-only setup future-proof, considering that the Windows OS and many applications, including browsers, Microsoft Teams, Zoom, and Microsoft Office, are increasingly relying on GPU capabilities?
Common GPU Mistakes and How to Avoid Them
Unfortunately, even the most seasoned of users may encounter these pitfalls when using GPUs with Frame. Here are my tips on how to avoid them and ensure the best user experience at all times.
Mistake: Sizing your GPUs without any OS and application usage insights.
Solution: Sizing without usage insights is like driving your car in the dark without the lights on; it's dangerous and you'll likely get lost. Use tools to get this data.
Mistake: Taking the specifications of your physical PC or workstation and mapping them 1:1 to DaaS workload VMs.
Solution: Utilization of physical resources is often much lower than the specifications suggest; don't size for the peaks. Again, use tools to get these utilization insights to improve your sizing. Very often, the frame buffer is the first limit you will reach with GPUs.
Mistake: Capturing GPU utilization for one hour is fine; I am in a hurry.
Solution: Capture utilization over a longer period so the data set is complete and covers enough of the workload.
Mistake: Expecting double the performance in your VM's applications when you have two GPUs.
Solution: 99% of applications aren't multi-GPU capable. It is better to use a single modern, state-of-the-art GPU.
Mistake: I bought NVIDIA GPUs for my on-premises Frame with Nutanix AHV deployment. I'm good to go!
Solution: Don't forget the proper NVIDIA vGPU software licenses; hardware plus software is the complete solution here.
Mistake: We are totally fine with Windows 10/11 and using a 1 GB frame buffer of the NVIDIA vGPU profile.
Solution: While a 1 GB frame buffer might seem adequate, it's important to reassess based on your specific needs. For setups involving multiple monitors, higher resolutions, or demanding applications, a larger frame buffer may be necessary. Consider starting with a 1 GB frame buffer for basic productivity applications on Windows 10, a 2 GB frame buffer for a dual monitor setup, and a 4 GB vGPU frame buffer for more advanced graphics requirements.
Mistake: Believing that the endpoint device doesn't impact the user experience because applications are running virtually, and the endpoint is merely a display device.
Solution: The reality is that the endpoint's capabilities significantly influence the overall user experience. Consider factors such as the device's hardware decoding capabilities, its support for multiple monitors and high resolutions, and whether it can run the latest browser or Frame App. These aspects are crucial in determining the quality of the virtual application or desktop experience.
Mistake: Performance and availability of GPUs in the public cloud. Sure. No problem.
Solution: Understand the different GPU options and their hardware characteristics, and recognize that resource availability and guaranteed capacity aren't always a given! Be sure to raise your GPU limits with your cloud provider well ahead of when you need those resources. Sometimes it can take days or more to get your limits raised.
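The frame buffer starting points from the list above, and their effect on how many VMs can share one physical GPU, can be sketched as follows. The function names and the 16 GB GPU are illustrative assumptions; the sketch also assumes every VM on the GPU uses the same profile size and that frame buffer is the limiting resource, which should be confirmed against measured utilization.

```python
def suggested_frame_buffer_gb(monitors: int, advanced_graphics: bool) -> int:
    """Starting points only: 1 GB basic, 2 GB dual monitor, 4 GB advanced graphics."""
    if monitors < 1:
        raise ValueError("at least one monitor is required")
    if advanced_graphics:
        return 4
    if monitors >= 2:
        return 2
    return 1


def vms_per_gpu(gpu_frame_buffer_gb: int, profile_gb: int) -> int:
    """Each vGPU gets a fixed share of frame buffer, so it caps VM density."""
    if profile_gb <= 0:
        raise ValueError("profile_gb must be positive")
    return gpu_frame_buffer_gb // profile_gb


GPU_FRAME_BUFFER_GB = 16  # illustrative physical GPU, an assumption for this example

use_cases = {
    "basic productivity, single monitor": (1, False),
    "knowledge worker, dual monitor": (2, False),
    "advanced graphics": (2, True),
}

for name, (monitors, advanced) in use_cases.items():
    profile = suggested_frame_buffer_gb(monitors, advanced)
    density = vms_per_gpu(GPU_FRAME_BUFFER_GB, profile)
    print(f"{name}: {profile} GB profile, {density} VMs per GPU")
```

Doubling the profile size halves the number of VMs per GPU, which is why the choice of frame buffer has such a direct effect on cost per user.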
If you're interested in learning more about cloud workstation configurations, I encourage you to read the Cloud Workstation Special Report by AEC Magazine.
GPU-accelerated DaaS, Powered by Frame
Incorporating GPUs into a DaaS solution, whether dedicated or virtual, is crucial for enhancing the user experience. The key to success is choosing the right (v)GPU option for your specific needs and navigating potential challenges.
Frame offers a wide range of cutting-edge AMD and NVIDIA GPU options to meet both your current and future requirements. Rest assured that your end users will have the performance and experience necessary to stay productive.
Ready to discover the power and simplicity of delivering virtual apps and desktops to users worldwide with just a browser? Sign up for a free Test Drive (it'll take less than 15 minutes to get started) or a Frame Trial.