There is a serious re-hosting effort going on in data center storage as flash-filled systems replace large arrays of older spinning disks for tier 1 apps. Naturally as costs drop and the performance advantages of flash-accelerated IO services become irresistible, they also begin pulling in a widening circle of applications with varying QoS needs. Yet this extension leads to a wasteful tug-of-war between high-end flash only systems that can’t effectively serve a wide variety of application workloads and so-called hybrid solutions originally architected for HDDs that are often challenged to provide the highest performance required by those tier 1 applications.
Someday in its purest form all-flash storage theoretically could drop in price enough to outright replace all other storage tiers even at the largest capacities, although that is certainly not true today. Here at Taneja Group we think storage tiering will always offer a better way to deliver varying levels of QoS by balancing the latest in performance advances appropriately with the most efficient capacities. In any case, the best enterprise storage solutions today need to offer a range of storage tiers, often even when catering to a single application’s varying storage needs.
There are many entrants in the flash storage market, with the big vendors now rolling out enterprise solutions upgraded for flash. Unfortunately many of these systems are shallow retreads of older architectures, perhaps souped-up a bit to better handle some hybrid flash acceleration but not able to take full advantage of it. Or they are new dedicated flash-only point products with big price tags, immature or minimal data services, and limited ability to scale out or serve a wider set of data center QoS needs.
Oracle saw an opportunity for a new type of cost-effective flash-speed storage system that could meet the varied QoS needs of multiple enterprise data center applications – in other words, to take flash storage into the mainstream of the data center. Oracle decided they had enough storage chops (from Exadata, ZFS, Pillar, Sun, etc.) to design and build a “flash-first” enterprise system intended to take full advantage of flash as a performance tier, but also incorporate other storage tiers naturally including slower “capacity” flash, performance HDD, and capacity HDD. Tiering by itself isn’t a new thing – all the hybrid solutions do it and there are other vendor solutions that were designed for tiering – but Oracle built the FS1 Flash Storage System from the fast “flash” tier down, not by adding flash to a slower or existing HDD-based architecture working “upwards.” This required designing intelligent automated management to take advantage of flash for performance while leveraging HDD to balance out cost. This new architecture has internal communication links dedicated to flash media with separate IO paths for HDDs, unlike traditional hybrids that might rely solely on their older, standard HDD-era architectures that can internally constrain high-performance flash access.
Oracle FS1 is a highly engineered SAN storage system with key capabilities that set it apart from other all-flash storage systems, including built in QoS management that incorporates business priorities, best-practices provisioning, and a storage alignment capability that is application aware – for Oracle Database naturally, but that can also address a growing body of other key enterprise applications (such as Oracle JD Edwards, PeopleSoft, Siebel, MS Exchange/SQL Server, and SAP) – and a “service provider” capability to carve out multi-tenant virtual storage “domains” while online that are actually enforced at the hardware partitioning level for top data security isolation.
In this report, we’ll dive in and examine some of the great new capabilities of the Oracle FS1. We’ll look at what really sets it apart from the competition in terms of its QoS, auto-tiering, co-engineering with Oracle Database and applications, delivered performance, capacity scaling and optimization, enterprise availability, and OPEX reducing features, all at a competitive price point that will challenge the rest of the increasingly flash-centric market.
Virtualization is mature and widely adopted in the enterprise market, and convergence/hyperconvergence with virtualization is taking the market by storm. But what about mid-sized and SMB? Are they falling behind?
Many of them are. Generalist IT, low virtualization budgets, and small staff sizes all militate against complex virtualization projects and high costs. What this means is that when mid-sized and SMB want to virtualize, they either get sticker shock from high prices and high complexity, or dissatisfaction with cheap, poorly scalable and unreliable solutions. What they want and need is hyperconvergence for ease in management, lower CapEx and OpEx; and a simplified but highly scalable and available virtualization platform.
This is a tall order but not an impossible one: Scale Computing claims to meet these requirements for this large market segment, and Taneja Group’s HC3 Validation Report supports those claims. However, although lab results are vital to knowing the real story they are only part of that story. We also wanted to hear directly from IT about Scale in the real world of the mid-sized and SMB data center.
We undertook a Field Report project where we spoke at length with eight Scale customers. This report details our findings around the top common points we found throughout eight different environments: exceptional simplicity, excellent support, clear value, painless scalability, and high availability – all at a low price. These key features make a hyperconverged platform a reality for SMB and mid-market virtualization customers.
Every large IT shop has a long shelf of performance management solutions ranging from big platform bundles bought from legacy vendors, through general purpose log aggregators and event consoles, to a host of device-specific element managers. Despite the invested costs of acquiring, installing, supporting, and learning these often complex tools, only a few of them are in active daily use. Most are used only reactively and many just gather dust for a number of reasons. But, if only because of the ongoing costs of keeping management tools current, it’s only the solutions that get used that are worth having.
When it comes to picking which tool to use day-to-day, it’s not the theory of what it could do, it’s the actual value of what it does for the busy admin trying to focus on the tasks at-hand. And among the myriad of things an admin is responsible for, assuring performance requires the most management solution support. Performance related tasks include checking on the health of resources that the admin is responsible for, improving utilization, finding lurking or trending issues to attend to in order to head off disastrous problems later, working with other IT folks to diagnose and isolate service impacting issues, planning new activities, and communicating relevant insight to others – in IT, the broader business, and even to external stakeholders.
Admins responsible for infrastructure, when faced with these tasks, have huge challenges in large, heterogeneous, complex environments. While vendor-specific device and element managers drill into each piece of equipment, they help mostly with easily identifiable component failure. Both daily operational status and difficult infrastructure challenges involve looking across so-called IT domains (i.e. servers and storage) for thorny performance impacting trends or problems. The issue with larger platform tools is that they require a significant amount of installation, training, ongoing tool support, and data management that all detract from the time an admin can actually spend on primary responsibilities.
There is room for a new style of system management that is agile, insightful and empowering, and we think Galileo presents just such a compelling new approach. In this report we’ll explore some of the IT admin’s common performance challenges and then examine how Galileo Performance Explorer with its cloud-hosted collection and analysis helps conquer them. We’ll look at how Performance Explorer crosses IT domains to increase insight, easily implements and scales, fosters communication, and focuses on and enables the infrastructure admin to achieve daily operational excellence. We’ll also present a couple of real customer interviews in which, despite sunk costs in other solutions, adding Galileo to the data center quickly improved IT utilization, capacity planning, and the service levels delivered back to the business.
All-flash arrays are changing the datacenter for the better. No longer do we worry about IOPS bottlenecks from the array: all-flash arrays (AFA) can deliver a staggering amount of IOPs. AFAs with the ability to deliver hundreds of thousands of IOPs are not uncommon. The problem now, however, is how to get the IOPS from the array to the servers. We recently had a chance to see how well an AFA using EMC PowerPath driver works to eliminate this bottleneck—and we were blown away. Most comparisons with datacenter infrastructure show a 10-30% improvement in performance; but, the performance improvement that we saw with PowerPath was extraordinary.
Getting bits from an array to server is easy —very easy, in fact. The trick is getting the bits from a server to an array in an efficient manner when you have many virtual machines (VM) on multiple physical hosts that are transmitting the bits over a physical network with a virtual fabric overlay; this is much more difficult. Errors can get introduced and must be dealt with, the most efficient path must be obtained and established, re-evaluated and reestablished continually, and any misconfiguration can produce less than optimal performance. In some cases, this can cause outages or even data loss. In order to deal with the “pathing,” or how the I/O travels from the VM to storage, the OS running on the host needs a driver, or in the case where multiple paths can be taken from the server to the array, a multipathing driver needs to be used to direct the traffic.
Windows, Linux, VMware and most other modern operating systems include a basic multipath driver; however, these drivers tend to be generic and not code optimized to extract the maximum performance from an array and come with only rudimentary traffic optimization and management functions. In some cases these generic drivers are fine, but in the majority of datacenters the infrastructure is overtaxed and its equipment needs to be used in the most efficient manner possible. Fortunately, storage companies such as EMC are committed to making their arrays work as performant as possible and spend a considerable amount of time and research to develop multipathing drivers optimized for their arrays. EMC invited us to take a look at how PowerPath, their optimized “intelligent” multipath driver, performed on an XtremIO flash array connected to a Dell PowerEdge R710 server running ESX 5.5 while simulating an Oracle workload. We looked at the results of the various tests EMC ran comparing PowerPath/VE multipath driver against VMware’s ESXi Native Multipath driver and we were impressed—very impressed—by the difference that an optimized, multipath driver like PowerPath can make in a high IO traffic scenario