Lately, I've been hanging around some cloud technologists, cloud service providers, virtualization customers, and security practitioners. I've been asking a lot of basic questions, trying to understand when and how cloud computing/virtualization will be ready to support any application or workload.
UPDATED 2/2 4:00 PM
Just to be clear: For this discussion I'm using the terms virtualization and cloud computing almost interchangeably, even though reasonable people can argue the differences.
Here's what I heard from two different camps:
Cloud service providers (sellers): We are there today, Dave... We can support any app and any workload; it's just that people aren't yet comfortable putting their data in the cloud.
Security practitioners (buyers): We honestly are not sure at this point how we're going to secure data in the cloud. The problem is that virtualization changes everything.
Which group do you think got my attention?
Here's what I learned from the security gurus: The current method used to secure information in the traditional data center is to create perimeters around assets and rely on the physical separation of resources, including servers, host bus adapters, internal buses, networks, disk storage, controllers, cache, memories, databases, tapes, and so forth.
Each of these physical resources has a system around it to manage authentication, access control, key management, auditing, etc. When data resides inside this physical entity, it is safe. These assets all have interfaces between them (connection points), and every time data passes between resources it becomes vulnerable. So technologies and processes are put in place to safely pass data between these resources, establish audit trails, and ensure secure recovery if there's a problem.
One of the fundamental enablers of security in this traditional example is the fact that each resource has a physical line of demarcation and a well-known and established connection point between resources. Simply put, a security practitioner knows what connects where and can secure it accordingly.
Here's where it gets hairy. When you add virtualization to the cloud, you still have all these connection points, but they are no longer physical; they are logical. You don't know what is happening where. A virtual machine is moved from one server to another, and the connection points continuously change. The problems for a security practitioner are ensuring that the connection between two resources is trusted, testing that the connection is safe, and finding ways to audit it.
In the traditional non-virtualized world, you can rely on physical fencing (e.g., "Only these LUNs can be accessed from these servers") and create a perimeter around each resource and protect the connection points between resources at handoff. In the virtual world, you have no idea where the connection points between resources exist because they are dynamically changing -- perpetually.
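To make the fencing problem concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the server, LUN, and VM names are made up, and real LUN masking lives in storage arrays and fabric switches, not in a dictionary. It shows only why a rule written against physical servers stops describing reality once a virtual machine can move.

```python
# Hypothetical static fence: which LUNs each physical server may reach.
LUN_MASK = {
    "server-a": {"lun-01", "lun-02"},
    "server-b": {"lun-03"},
}

# Hypothetical record of where each virtual machine is running right now.
vm_host = {"vm-payroll": "server-a"}


def may_access(server: str, lun: str) -> bool:
    """Physical rule: only these LUNs can be accessed from these servers."""
    return lun in LUN_MASK.get(server, set())


def vm_may_access(vm: str, lun: str) -> bool:
    """The fence only sees the server the VM happens to be running on."""
    return may_access(vm_host[vm], lun)


def migrate(vm: str, new_server: str) -> None:
    """Moving the VM silently changes which fence rules apply to it."""
    vm_host[vm] = new_server
```

With these made-up names, `vm_may_access("vm-payroll", "lun-01")` is true while the VM sits on `server-a` and false after `migrate("vm-payroll", "server-b")`, even though nobody touched the fence itself.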
So what's the answer? Simplify by getting rid of the complexity in the middle of the network, and secure the end points. Vendors such as IBM Corp. (NYSE: IBM), Microsoft Corp. (Nasdaq: MSFT), and VMware Inc. (NYSE: VMW) also continue to offer virtualization security solutions.
Another technique is to apply a set of technologies that perform authorization and access control at the application level and, on request, break data up into lots of smaller pieces, disperse them throughout the network, and reconstruct the data at the client end.
What this type of dispersal achieves is a form of encryption, without the need for key management, that can't be compromised with brute-force processing power from intruders. If one of the pieces of data is stolen, it is useless to the bad guys unless they have all the other pieces and the technology to put Humpty Dumpty together again.
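The idea can be sketched in a few lines of Python. This is a toy, not any vendor's algorithm: it uses simple XOR splitting, which I picked because it matches the "need all the pieces" property described above. The function names `disperse` and `reconstruct` are mine, and commercial products may use different schemes that tolerate lost pieces.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, pieces: int) -> list[bytes]:
    """Split data into `pieces` shares; every share is needed to rebuild it."""
    if pieces < 2:
        raise ValueError("need at least two pieces")
    # All but the last share are pure random bytes of the same length.
    shares = [secrets.token_bytes(len(data)) for _ in range(pieces - 1)]
    # The last share is the data XORed with every random share.
    last = reduce(xor_bytes, shares, data)
    return shares + [last]


def reconstruct(shares: list[bytes]) -> bytes:
    """XOR all shares together; the random parts cancel, leaving the data."""
    return reduce(xor_bytes, shares)
```

Calling `reconstruct(disperse(b"some record", 4))` returns the original bytes, while any three of the four pieces on their own look like random noise. Note that each piece in this toy is as large as the original data, which is one simple illustration of the overhead that comes with the technique.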
Vendors with dispersal technology include startup Cleversafe Inc., whose product works like a file system and is priced that way, at several thousand dollars per "node," where a node is a server running the file system and managing a bunch of storage behind it.
Also offering dispersal is (believe it or not) Unisys Corp. (NYSE: UIS), whose proprietary solutions designed for the U.S. military deploy the concept.
The file system from Google (Nasdaq: GOOG) is also based on data dispersal.
As with all technologies, data dispersal has its tradeoffs. It's new, and there's overhead associated with the technique. Nonetheless, it's a good example of the type of new thinking that is needed to secure data in the cloud.
— David Vellante spent 15 years at IDC and is a founder of The Wikibon Project. He can be reached on Twitter at @dvellante.