How to improve large DB read performance by 2X
Nutanix AOS 5.10 ships with a feature called Autonomous Extent Store (AES). AES effectively provides metadata locality to complement the data locality that has always existed. For large datasets (e.g. a 10TB database with 20% hot data), we observe a 2X improvement in throughput for random access across the 2TB hot dataset.
In our experiment we deliberately size the active working set so that it does not fit into the metadata cache. We access the 2TB hot dataset uniformly with a 100% random access pattern and record the time taken to access all 2TB. On the same hardware with AES enabled, the time is cut in half. As the chart shows, throughput doubles, as expected.
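The relationship between elapsed time and throughput in this experiment can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the 10TB size, the 20% hot fraction and the halving of elapsed time come from the text above, while the baseline elapsed time (`baseline_elapsed_s`) is a made-up placeholder, not a measured result.

```python
TB = 10**12  # decimal terabyte, in bytes

def throughput_mb_per_s(bytes_read: int, elapsed_s: float) -> float:
    """Average throughput in MB/s for reading bytes_read in elapsed_s seconds."""
    return bytes_read / elapsed_s / 1e6

# Working-set size from the text: 20% of a 10TB database is hot.
database_bytes = 10 * TB
hot_percent = 20
hot_bytes = database_bytes * hot_percent // 100  # 2TB

# HYPOTHETICAL baseline time, chosen only to make the arithmetic concrete.
baseline_elapsed_s = 10_000.0
# From the text: with AES enabled, the time to access all 2TB is cut in half.
aes_elapsed_s = baseline_elapsed_s / 2

baseline_tput = throughput_mb_per_s(hot_bytes, baseline_elapsed_s)
aes_tput = throughput_mb_per_s(hot_bytes, aes_elapsed_s)
ratio = aes_tput / baseline_tput

print(f"hot dataset: {hot_bytes / TB:.0f} TB")
print(f"throughput ratio (AES / baseline): {ratio:.1f}x")
```

Because the amount of data read is fixed at 2TB, the ratio does not depend on the placeholder baseline: halving the elapsed time always doubles the average throughput.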
It is the localization of metadata by AES that contributes to the 2X improvement. AES keeps most of the metadata local to the node, so there is no need to fetch it across the wire. Additionally, AES reduces the need to cache metadata in DRAM, since local access is so fast. For very large datasets, retrieving metadata can account for a large proportion of the access time. This is true for all storage, so speeding up metadata resolution can dramatically improve overall throughput, as we demonstrate.
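To see why metadata resolution can matter this much, it may help to treat each access as a metadata part plus a data part and apply Amdahl's law. The sketch below is a simplified model, not a description of AES internals: the function names are hypothetical, and the fractions are illustrative inputs rather than measurements from our experiment.

```python
def overall_speedup(metadata_fraction: float, metadata_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the metadata part gets faster.

    metadata_fraction: share of baseline access time spent resolving metadata (0..1).
    metadata_speedup:  how many times faster metadata resolution becomes.
    """
    remaining = (1.0 - metadata_fraction) + metadata_fraction / metadata_speedup
    return 1.0 / remaining

def metadata_fraction_for(target_speedup: float) -> float:
    """Metadata share of access time needed to reach target_speedup
    in the limiting case where metadata cost drops to zero."""
    return 1.0 - 1.0 / target_speedup

# Illustrative inputs only (assumptions, not measured values).
for fraction in (0.1, 0.3, 0.5):
    print(f"metadata share {fraction:.0%}: "
          f"10x faster metadata -> {overall_speedup(fraction, 10):.2f}x overall")

print(f"share needed for 2x in the limit: {metadata_fraction_for(2.0):.0%}")
```

Under this model, a 2X overall gain from metadata improvements alone implies that metadata resolution accounted for roughly half of the baseline access time, which is consistent with the observation that it can be a large proportion for very large datasets.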