Let's face it: today's storage is dumb. Mostly it is a dumping ground for data. As we produce more data, we simply buy more storage and fill it up. We don't know who is using what storage at a given point in time, which applications are hogging storage or have gone rogue, or what and how much sensitive information is stored, moved, or accessed, and by whom. Basically, we are blind to whatever is happening inside the storage array. Yet storage should just work: users should see it as an endless, invisible resource, while administrators should be able to unlock the value of the data itself through real-time analytical insight rather than fighting fires just to keep storage running and provisioned.
Storage systems these days are often quoted in petabytes and will eventually move to exabytes and beyond. Businesses are being crushed under the weight of this data sprawl, and a new tsunami of data is coming their way as the Internet of Things fully comes online in the next decade. How are administrators to deal with this ever-increasing appetite to store more data? It is time for a radical new approach to building a storage system: one that is aware of the information stored within it while dramatically reducing the time administrators spend managing the system.
Welcome to the new era of data-aware storage. It could not have come at a better time. Storage growth, as we all know, is out of control. Granted, the cost per GB keeps falling at roughly 40% per year, but capacity keeps growing at roughly 60% per year, so the price decline is largely consumed by growth while the amount of data under management balloons. And while cost is certainly an issue, the bigger issue is manageability, along with not knowing what we have buried in those mounds of data. Instead of being an asset, data becomes a dead weight that keeps getting heavier. If we don't do something about it, we will simply be overwhelmed, if we are not already.
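The compounding effect is easy to see with a little arithmetic. The sketch below simply compounds the two rates quoted above; the five-year horizon is an illustration, not a forecast:

```python
# Compound the two rates from the text: capacity grows ~60%/year,
# while the cost per GB falls ~40%/year.
YEARS = 5
CAPACITY_GROWTH = 0.60   # stored data grows ~60% per year
PRICE_DECLINE = 0.40     # cost per GB falls ~40% per year

capacity_mult = (1 + CAPACITY_GROWTH) ** YEARS  # data under management
price_mult = (1 - PRICE_DECLINE) ** YEARS       # cost per GB
spend_mult = capacity_mult * price_mult         # total storage spend

print(f"After {YEARS} years: {capacity_mult:.1f}x the data, "
      f"{spend_mult:.2f}x the spend")
# After 5 years: 10.5x the data, 0.82x the spend
```

In other words, even if total spend stays roughly flat, there is an order of magnitude more data to manage after five years, which is exactly the manageability problem described above.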
The question we ask is: why is it possible to build data-aware storage today when we couldn't yesterday? The answer is simple: flash technology, virtualization, and the availability of "free" CPU cycles make it possible to build storage that can do a lot of heavy lifting from the inside. Such intelligence was technically feasible before, but implementing it would have slowed primary storage to the point of uselessness, so in the past we simply let storage store data. Today, we can build in a great deal of intelligence without impacting performance or quality of service. We call this new type of storage data-aware storage.
When implemented correctly, data-aware storage can provide insights that were not possible before. It can reduce the risk of non-compliance, improve governance, automate many storage-management processes that are manual today, show how well storage is being utilized, and warn when a dangerous situation is about to occur, whether in compliance, capacity, performance, or SLA terms. You get the point: storage that is inherently smart and knows what type of data it holds, how it is growing, who is using it, who is abusing it, and so on.
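As a rough illustration of what "data awareness" means in practice, the sketch below aggregates hypothetical file metadata by owner and flags files whose types suggest sensitive content. The record format, field names, and classification rule are invented for this example and do not reflect any vendor's actual schema:

```python
from collections import defaultdict

# Hypothetical metadata records a data-aware system might index in
# real time; all fields and values here are illustrative.
files = [
    {"path": "/proj/a/genome.bam", "owner": "alice", "bytes": 4_000_000_000},
    {"path": "/proj/a/notes.txt",  "owner": "alice", "bytes": 2_048},
    {"path": "/hr/payroll.xlsx",   "owner": "bob",   "bytes": 1_500_000},
    {"path": "/tmp/scratch.bin",   "owner": "bob",   "bytes": 9_000_000_000},
]

SENSITIVE_SUFFIXES = (".xlsx", ".csv")  # e.g. spreadsheets that may hold PII

usage_by_owner = defaultdict(int)
sensitive = []
for f in files:
    usage_by_owner[f["owner"]] += f["bytes"]          # who is using what
    if f["path"].endswith(SENSITIVE_SUFFIXES):
        sensitive.append(f["path"])                   # where sensitive data lives

print(dict(usage_by_owner))
print(sensitive)
```

A real data-aware platform would maintain aggregates like these continuously inside the storage system itself, rather than by scanning billions of files after the fact.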
In this profile, we dive deep into a new technology called Qumulo Core, the industry's first data-aware scale-out NAS platform. Qumulo Core promises to radically change the scale-out NAS product category by using built-in data awareness to massively scale a distributed file system, while at the same time radically reducing the time needed to administer a system that can hold billions of files. File systems in the past could not scale to this level because the administrative tools would buckle under the weight of the system.
Converged infrastructure systems, the integration of compute, networking, and storage, have rapidly become the preferred foundational building block for businesses of all shapes and sizes. The success of these systems has been driven by an insatiable desire to make IT simpler, faster, and more efficient. IT can no longer afford the effort and time to custom-build infrastructure from best-of-breed, do-it-yourself components. Purpose-built converged infrastructure systems have been optimized for the most common IT workloads like Private Cloud, Big Data, Virtualization, Database and Desktop Virtualization (VDI).
Traditionally, these converged infrastructure systems have been built using a three-tier architecture in which compute, networking, and storage, integrated at the rack level, gave businesses the flexibility to cover the widest range of solution workload requirements while still using well-known infrastructure components. Recently, a more modular approach to convergence has emerged, which we term Hyper-Convergence. With hyper-convergence, the three-tier architecture is collapsed into a single system appliance that is purpose-built for virtualization, with hypervisor, compute, storage, and advanced data services all integrated into an x86 industry-standard building block.
In this paper we will examine the ideal solution environments where Hyper-Converged products have flourished. We will then give practical guidance on solution positioning for HP’s latest ConvergedSystem Hyper-Converged product offerings.
Over the past few years, to reduce cost and improve time-to-value, converged infrastructure systems, the integration of compute, networking, and storage, have been readily adopted by large enterprise users. The success of these systems results from the deployment of purpose-built, integrated converged infrastructure optimized for the most common IT workloads like Private Cloud, Big Data, Virtualization, Database and Desktop Virtualization (VDI). Traditionally, these converged infrastructure systems have been built using a three-tier architecture in which compute, networking, and storage, while integrated in the same rack, still consisted of best-in-breed standalone devices. These systems work well in stable, predictable environments. Many virtualized environments, however, are now dynamic, with unpredictable growth, and traditional three-tier architectures often lack the simplicity, scalability, and flexibility needed to operate in them.
Enter HyperConvergence, where the three-tier architecture has been collapsed into a single system that is purpose-built for virtualization from the ground up: virtualization, compute, and storage, along with advanced features such as deduplication, compression, and data protection, are all integrated into an x86 industry-standard building-block node. These devices are built on scale-out architectures with a 100% VM-centric management paradigm. The simplicity, scalability, and flexibility of this architecture make it a perfect fit for many virtualized environments.
Dell XC Web-scale Converged Appliances powered by Nutanix software are delivered as a series of HyperConverged products that are extremely flexible and scalable and can fit many enterprise workloads. In this solution brief we examine what constitutes a dynamic virtualized environment and how the Dell XC Web-scale Appliance series fits into such an environment. We can confidently state that by implementing Dell's flexible range of XC Web-scale appliances, businesses can deploy solutions across a broad spectrum of virtualized workloads where flexibility, scalability, and simplicity are critical requirements. Dell is an ideal partner to deliver Nutanix software because of its global reach, streamlined operations, and enterprise systems solutions expertise. The company is bringing HyperConverged platforms to the masses, and the second generation of the XC Series appliances enables it to reach an even broader set of customers.
All-flash arrays are changing the datacenter for the better. No longer do we worry about IOPS bottlenecks at the array: all-flash arrays (AFAs) can deliver a staggering number of IOPS, and AFAs able to deliver hundreds of thousands of IOPS are not uncommon. The problem now, however, is how to get those IOPS from the array to the servers. We recently had a chance to see how well an AFA paired with the EMC PowerPath driver eliminates this bottleneck, and we were blown away. Most comparisons of datacenter infrastructure show a 10-30% improvement in performance; the improvement we saw with PowerPath was extraordinary.
Getting bits from an array to a server is easy, very easy in fact. The trick is getting the bits from a server to an array efficiently when many virtual machines (VMs) on multiple physical hosts are transmitting bits over a physical network with a virtual fabric overlay; this is much more difficult. Errors can be introduced and must be dealt with; the most efficient path must be discovered and established, then continually re-evaluated and re-established; and any misconfiguration can produce suboptimal performance, in some cases even outages or data loss. To handle this "pathing," that is, how I/O travels from the VM to storage, the OS running on the host needs a driver; where multiple paths exist between server and array, a multipathing driver is needed to direct the traffic.
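To make the idea concrete, here is a toy round-robin path selector with failover, the kind of policy a generic multipath driver might apply. The class, path names, and logic are illustrative only; this is not how PowerPath or any real driver is implemented:

```python
import itertools

class MultipathDevice:
    """Toy round-robin path selector with failover, in the spirit of a
    generic OS multipath driver (not any vendor's actual algorithm)."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.healthy = set(self.paths)        # paths currently usable
        self._rr = itertools.cycle(self.paths)

    def mark_failed(self, path):
        self.healthy.discard(path)            # a real driver would re-probe later

    def next_path(self):
        # Rotate through paths, skipping any that have failed.
        for _ in range(len(self.paths)):
            p = next(self._rr)
            if p in self.healthy:
                return p
        raise IOError("all paths down")

# Illustrative path names in ESXi-like notation.
dev = MultipathDevice(["vmhba1:C0:T0:L0", "vmhba2:C0:T0:L0"])
print(dev.next_path())                 # alternates between healthy paths
dev.mark_failed("vmhba1:C0:T0:L0")
print(dev.next_path())                 # only the surviving path from now on
```

An "intelligent" driver goes further than this round-robin sketch, weighing per-path load and array-specific knowledge when choosing where to send each I/O, which is where the performance gap discussed below comes from.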
Windows, Linux, VMware, and most other modern operating systems include a basic multipath driver; however, these drivers tend to be generic, are not optimized to extract maximum performance from any particular array, and come with only rudimentary traffic optimization and management functions. In some cases these generic drivers are fine, but in most datacenters the infrastructure is overtaxed and its equipment needs to be used as efficiently as possible. Fortunately, storage companies such as EMC are committed to making their arrays perform as well as possible and spend considerable time and research developing multipathing drivers optimized for their arrays. EMC invited us to look at how PowerPath, their optimized "intelligent" multipath driver, performed on an XtremIO flash array connected to a Dell PowerEdge R710 server running ESXi 6.0 while simulating an Oracle workload. We examined the results of the various tests EMC ran comparing the PowerPath/VE multipath driver against VMware's ESXi Native Multipath driver, and we were impressed, very impressed, by the difference that an optimized multipath driver like PowerPath can make in a high-I/O-traffic scenario.