In life, consistency is key, but in the modern data center the key is "on-demand availability".
VMs are usually assigned a fixed memory size at creation. In most cases this provides consistent memory management and performance; however, we do come across workloads that benefit from free-floating memory assignment to VMs.
Adaptive memory overcommit delivers value to customers in use cases that involve running workloads with unpredictable, bursty I/O, where peak memory usage is higher than average memory usage. The key word here is "adaptive". It implies that an administrator does not have to manually set the lower and upper limits of the memory that can be assigned to an overcommitted VM (although the option to do so exists), because this is handled by the hypervisor. Isn't that what abstraction is all about?
Virtual Desktop Infrastructure (VDI), especially persistent desktops or servers; test and/or development environments; production server virtualization; ROBO deployments; and clusters configured with high availability and/or disaster recovery are the primary target scenarios for free-floating memory assignment, or memory overcommit. This feature gives administrators the freedom to leverage unused memory in the cluster and spin up more VMs on a host.
Imagine a scenario where a cluster is shared by users across multiple time zones, with little to no overlap in when they access the VMs on the cluster. Companies with such over-allocated deployments would definitely want to get the best bang for their buck by managing memory across such asynchronous demand patterns.
The underlying workflow of the on-demand memory claim process can be based on either balloon drivers or hypervisor swapping to reclaim memory. The Acropolis Dynamic Scheduler (ADS) on the Acropolis Hypervisor is responsible for defining the maximum amount of memory per host, called the “memory pool”. Memory committed to the overcommitted VMs on a host cannot surpass that host's memory pool value. Of course, precautions are taken to prevent a severe performance impact by controlling the size of each overcommitted VM on the host, such that total memory usage never exceeds the host's memory pool limit.
Overcommitted VMs are monitored periodically so that the amount of memory assigned to them can be adapted. Better still, a small amount of memory is always kept unused within the memory overcommit pool, a.k.a. the buffer, so that a new VM can be allotted its minimum amount of memory without shrinking another VM first.
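To make the pool-and-buffer arithmetic concrete, here is a minimal sketch in Python. It is an illustration only, with hypothetical names and numbers (memory_pool_gib, buffer_gib, and so on); the actual ADS logic is internal to AHV:

```python
from dataclasses import dataclass

@dataclass
class Host:
    memory_pool_gib: float      # max memory the scheduler allows for VMs on this host
    buffer_gib: float           # slice kept unused so a new VM can always boot
    vm_alloc_gib: list[float]   # current memory allocation per VM

    def used_gib(self) -> float:
        return sum(self.vm_alloc_gib)

    def can_grow_vm(self, extra_gib: float) -> bool:
        # An existing VM may grow only if total allocations stay within
        # the pool while preserving the reserved buffer.
        return self.used_gib() + extra_gib <= self.memory_pool_gib - self.buffer_gib

    def can_place_vm(self, min_gib: float) -> bool:
        # A new VM needs only its minimum memory up front; the buffer
        # guarantees this is possible without shrinking another VM.
        return min_gib <= self.memory_pool_gib - self.used_gib()

host = Host(memory_pool_gib=256.0, buffer_gib=8.0, vm_alloc_gib=[64.0, 96.0, 48.0])
print(host.can_grow_vm(32.0))   # True: 208 + 32 <= 248
print(host.can_place_vm(16.0))  # True: 16 <= 256 - 208
```

The asymmetry is deliberate: growing an existing VM must leave the buffer untouched, while placing a new VM at its minimum size is exactly what the buffer is reserved for.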
What sets Nutanix's hypervisor, AHV, apart from other HCI vendors offering memory overcommit is that on a Nutanix deployment customers do not need to interact with the guest OS, which avoids creating any in-guest memory pressure.
The Linux kernel itself can monitor and control memory overcommitment for processes that request more virtual memory than others, using the vm.overcommit_memory setting. Briefly, vm.overcommit_memory can be configured according to the virtual memory requirements of the workloads running on a cluster; a short example of reading and changing these settings follows the three modes below.
vm.overcommit_memory set to 0 is the default in most Linux kernel versions. It leaves the kernel free to decide, heuristically, whether a given memory allocation should be allowed to overcommit.
vm.overcommit_memory set to 1 always lets the Linux kernel overcommit. This setting is therefore not advised for most workloads.
vm.overcommit_memory set to 2 enforces strict accounting: the kernel refuses allocations beyond a commit limit of swap plus a percentage of physical RAM defined by the vm.overcommit_ratio kernel setting.
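These knobs are exposed through the standard /proc/sys/vm interface (equivalently, sysctl). As a small illustration in Python, here is how a guest administrator might inspect them; the write calls are commented out because they require root, and the ratio of 80 is just an example value:

```python
from pathlib import Path

VM = Path("/proc/sys/vm")

def read_setting(name: str) -> int:
    return int((VM / name).read_text().strip())

def write_setting(name: str, value: int) -> None:
    # Requires root; equivalent to `sysctl -w vm.<name>=<value>`.
    (VM / name).write_text(f"{value}\n")

print("overcommit_memory:", read_setting("overcommit_memory"))  # usually 0
print("overcommit_ratio: ", read_setting("overcommit_ratio"))   # default 50

# Switch to strict accounting (mode 2) and cap the commit limit at
# swap + 80% of physical RAM (80 is an arbitrary example value):
# write_setting("overcommit_memory", 2)
# write_setting("overcommit_ratio", 80)
```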
Before wrapping up this post, let me briefly focus on some of the reservations that hold back enterprise customers from embracing memory overcommit. One of the primary reasons, already alluded to in this article, is the fear of degraded performance. A recent article I came across on InfoWorld puts a positive spin on memory overcommit by framing it as resource sharing. In fact, resource sharing has been one of the unique selling propositions of virtualization all along, and we have been sharing other resources, like CPU, on virtualized platforms for quite some time.
Overcommitment makes sense when implemented accurately, with guardrails in place, because system and user activity levels on VMs vary over time, and there is almost always some VM in dire need of memory that would benefit from this feature. However, I would definitely emphasize studying your workload and working with the right vendors before making the final decision on whether or not to adopt memory overcommit.