This post was originally written by Marten Terpstra at the Plexxi blog.
The days of completely separate storage network technologies are quickly fading. It feels like only a few years ago Fibre Channel was the way to build large-scale storage networks: big honking storage devices on a separate network, connected to Fibre Channel switches, connected with Fibre Channel adapters into servers. With 10GbE becoming cost effective and matching or outperforming 2, 4 or even 8Gbit Fibre Channel, Fibre Channel over Ethernet was invented, mostly as a mechanism to allow Ethernet-based attachment to existing Fibre Channel installations. It’s a bit clumsy, but it works.
A variety of Ethernet and IP based storage mechanisms have gained significant popularity in the past few years. Storage has become denser and cheaper, and advances in file system technology have made it much more convenient to distribute storage and access it as if it were one large system. Most of the customers we see have a mixture of server co-located storage (typically 4-10 TB per system) and mid-size storage arrays in the 10s to 100s of TB distributed across the network. Whether it’s NAS, iSCSI, ATA over Ethernet or any other Ethernet or IP based storage technology, it is clear that networked storage is where we are going. Don’t get me wrong, FC and FCoE will be around for a long time to come. Billions of dollars have been invested and those dollars will be protected. But even the bellwether FC storage companies provide Ethernet and IP based access to their arrays these days.
Interestingly enough though, while many have made or are making the transition to, dare I call it, traditional network technologies for storage transport, many still build separate, parallel Ethernet networks for storage. Servers have multiple NICs, with some of them dedicated to the storage network. The reasons for this separation are fairly straightforward. The storage network used to be dedicated, and it still is, just now running on cheap 10GbE. The people who manage the storage still have the same full control; it’s still a storage network. All of the management, provisioning and tuning tools still apply; all that really changed is the medium. But above all, any time I have had a “why don’t you converge your storage and data network” discussion with a customer, it has almost always come down to a fear of running storage traffic and regular data traffic on the same network. It’s a fear of interference, congestion, perhaps a fear of the unknown: “what happens to my storage performance when I run the two on the same network?”
The folks at Network Heresy have devoted a few articles to the age-old network problem of mice and elephant flows. Combining storage and regular data on the same network will amplify this problem. Storage functions like replication and archiving are extremely likely to be elephant flows and can have quite the impact on the rest of the flows. Perhaps that fear is rational?
There are certainly ways to try and engineer your way through this, but they are awfully complicated. I can create a different class of service for storage traffic, queue it differently, give it different drop preferences and use all the other tricks to try and ease the potential for interference. But still, the data traverses the same links, and all we have done is make one kind of traffic more important than the other. That may be perfectly valid for replication and archiving, but not at all for real-time access to storage by regular applications.
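To make the “different class of service” trick concrete, here is a minimal Python sketch that marks an iSCSI connection with a DSCP value so switches can queue storage traffic separately. The DSCP choice, port constant and function name are illustrative assumptions on my part; in practice initiators usually mark traffic in the kernel or the marking is applied at the switch, not in application code.

```python
import socket

# DSCP "Expedited Forwarding" (46), shifted into the upper six bits of the
# TOS byte. The value and class choice are illustrative, not a recommendation
# for any particular storage deployment.
DSCP_EF = 46
TOS_EF = DSCP_EF << 2

ISCSI_PORT = 3260  # default iSCSI target port


def open_marked_storage_connection(target_ip: str) -> socket.socket:
    """Open a TCP connection to an iSCSI target with its packets DSCP-marked,
    so the network can place storage traffic in its own queue/class."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_EF)
    s.connect((target_ip, ISCSI_PORT))
    return s
```

Even with marking like this, the point stands: the storage packets still ride the very same links as everything else; we have only told the switches whose packets to drop first when things get busy.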
So what if you could separate your storage and regular data traffic while staying on the same, single Ethernet network? What if you could isolate traffic to and from storage onto specific links, while all other traffic used other links? Give the lowest hop-count paths to applications and storage that require the lowest latency; give paths with more latency but more available bandwidth to the replication and archiving flows that are not particularly latency sensitive. You can build a single Ethernet network and still keep storage traffic and regular data traffic apart. I have seen solutions that propose a virtualized L2 network to separate the two types, but that only provides logical separation; the flows still cross the exact same links, with the same potential for interference.
A controller that manages topologies and paths based on end-to-end visibility, and that allows you to articulate in fairly abstract terms what you want separated, gives you this option. Have that same controller integrate with the storage management so that this can be fully or partially automated, and it gets even better. Think about it: a single storage and data network where you control how converged or diverged the various data streams are is very powerful. We have demonstrated this in the past, and there are real solutions based on these capabilities.
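As a way to picture what “articulate in fairly abstract terms what you want separated” might look like, here is a small, purely hypothetical Python sketch of a path-assignment policy: latency-sensitive storage I/O gets the shortest path, bulk replication gets the path with the most spare bandwidth. The class names, fields and toy topology are invented for illustration and are not the API of any actual controller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    links: List[str]              # link names the path traverses
    hop_count: int                # rough proxy for latency
    spare_bandwidth_gbps: float   # headroom available for bulk flows


@dataclass
class FlowClass:
    name: str
    latency_sensitive: bool


def assign_path(flow: FlowClass, candidates: List[Path]) -> Path:
    """Shortest path for latency-sensitive traffic; most spare bandwidth
    for bulk flows such as replication and archiving."""
    if flow.latency_sensitive:
        return min(candidates, key=lambda p: p.hop_count)
    return max(candidates, key=lambda p: p.spare_bandwidth_gbps)


# Toy topology: two disjoint paths between a server and a storage array.
paths = [
    Path(["s1-leaf1", "leaf1-array"], hop_count=2, spare_bandwidth_gbps=2.0),
    Path(["s1-leaf2", "leaf2-spine", "spine-array"], hop_count=3, spare_bandwidth_gbps=9.0),
]

print(assign_path(FlowClass("app-storage-io", latency_sensitive=True), paths).links)
print(assign_path(FlowClass("replication", latency_sensitive=False), paths).links)
```

The application I/O lands on the two-hop path while replication is steered onto the longer but roomier path, which is exactly the kind of physical separation on a single fabric described above.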
We are collectively making the step into network-based storage. We should also make the step to run it effectively, with maximum performance and minimal interference, on a single network. It really can be pretty simple…