Chances are that you’ve left a considerable electronic trail behind you in your travels across the Internet. Your email address, mailing address, birth date, age, credit card numbers, and more are all stored in scores of e-commerce systems, social networking sites, and maybe even a job board or two. And your business likely has a trove of similar data about everyone you’ve ever done a transaction with over the Web.
Of course, the longer that data is there, the greater the probability that it will be inappropriately disclosed -- either accidentally, or through a cyber-attack. The resulting exposure can lead to identity theft, or any number of digital assaults on individuals’ privacy. And as the folks at TJX Corp. can testify, that sort of data breach can cost your company hundreds of millions of dollars as well.
A Dutch researcher proposes that the way to eliminate the risk of accidental data disclosure is to let the data slowly decay until all the data fades away. Dr. Harold van Heerde of the Centre for Telematics and Information Technology (CTIT) at the University of Twente is researching ways to gradually replace details from a database of personal information with more and more general information over time.
Of course, letting data degrade is the exact opposite of what most IT managers strive to do with customer data. After all, customer data is an asset: We use it more and more each day in an attempt to improve our relationship with customers, deliver better service, and understand patterns in their behavior. So high data quality is important. But for most uses beyond the transactional relationship with customers, we don’t need high-resolution data. Often, the data can be "anonymized" to a large degree for the purposes of larger analytical tasks, and there’s definitely a shelf-life attached to the value of data for any given transaction.
Van Heerde and a team of computer scientists from the Netherlands and France originally proposed the idea of data degradation to protect private information in a paper presented at the 2008 Conference on Information and Knowledge Management. The idea in itself seems simple enough -- as personally identifying information is gradually removed, the data remains useful for things like market analytics and other business intelligence applications, but becomes useless to anyone who gains access to it accidentally or through deliberate hacking.
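To make the idea concrete, here is a minimal Python sketch of age-based degradation. It is not the scheme from the paper: the `CustomerRecord` fields, the three thresholds in `LEVEL_THRESHOLDS_DAYS`, and the generalization steps (full address to city to country to nothing) are all assumptions chosen for illustration. It also only computes the degraded form of a record; a real system would have to overwrite the stored values irreversibly.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical thresholds (in days) at which a record moves to the next,
# coarser level of detail. These values are assumptions for illustration.
LEVEL_THRESHOLDS_DAYS = [30, 365, 1825]


@dataclass
class CustomerRecord:
    created: date
    street: str
    city: str
    country: str
    birth_date: date


def degradation_level(created: date, today: date) -> int:
    """Return 0 (full detail) up to 3 (fully faded) based on record age."""
    age_days = (today - created).days
    return sum(1 for threshold in LEVEL_THRESHOLDS_DAYS if age_days >= threshold)


def degraded_view(record: CustomerRecord, today: date) -> dict:
    """Return the record with details generalized according to its age."""
    level = degradation_level(record.created, today)
    if level == 0:
        # Fresh record: keep everything needed for the transaction.
        location = f"{record.street}, {record.city}, {record.country}"
        born = record.birth_date.isoformat()
    elif level == 1:
        # Drop the street and the exact birth date.
        location = f"{record.city}, {record.country}"
        born = str(record.birth_date.year)
    elif level == 2:
        # Keep only what coarse analytics might still use.
        location = record.country
        decade = record.birth_date.year // 10 * 10
        born = f"{decade}s"
    else:
        # Fully faded: nothing personal remains.
        location = None
        born = None
    return {"location": location, "born": born}
```

Even at the coarsest non-empty level, the record can still feed aggregate reports (customers per country, per age band), which is the trade-off the researchers describe.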
A similar sort of time-bomb approach to data destruction was introduced in some mobile applications based on Java in the past decade. Mobile clients that use "data fading" keep track of how much time has elapsed since the last successful synchronization of the data with the source, and then start to destroy the data after a certain maximum "quiet period."
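The "quiet period" mechanism described above can be sketched in a few lines of Python. This is an illustration only, not code from any actual mobile product: the `FadingStore` class, its method names, and the injectable `clock` parameter are hypothetical, and the sketch wipes the data all at once rather than destroying it gradually.

```python
import time
from typing import Callable, Optional


class FadingStore:
    """Client-side store that wipes its data after too long without a sync."""

    def __init__(self, max_quiet_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_quiet = max_quiet_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._data: Optional[dict] = None
        self._last_sync: Optional[float] = None

    def sync(self, fresh_data: dict) -> None:
        """Record a successful synchronization and reset the quiet period."""
        self._data = dict(fresh_data)
        self._last_sync = self._clock()

    def read(self) -> Optional[dict]:
        """Return the data, or None if the quiet period has expired."""
        if self._last_sync is None:
            return None
        if self._clock() - self._last_sync > self._max_quiet:
            # Quiet period exceeded: discard the local copy.
            self._data = None
            self._last_sync = None
        return self._data
```

A client that keeps checking in with the source keeps its data; a lost or stolen device that never syncs again ends up with nothing to disclose.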
There are some significant barriers to data fading on a database server -- many of them pointed out by van Heerde and his colleagues in their original paper. For example, there’s the issue of data that’s been “destroyed” remaining in database backups. And while there have been plenty of exposures of personal data through cyber-attack, the most wide-ranging and severe exposures have often been because of the loss of backup tapes in shipment or because data has simply "walked out the door" on removable media.
Also, data degradation can’t be entirely automatic -- it would require some integration with data retention policy tools, particularly with data that might fall under data retention regulations (or might be the target of legal discovery). If you’re purging your transactional databases of older data on a regular basis and moving it to an offline backup, you’re likely already doing most of what data degradation would achieve from the standpoint of protecting customer data.
There’s also the question of whether there’s anything really gained in terms of personal data protection from cyber-attacks. While letting data degrade can protect information from older transactions if a site is compromised, it still leaves the most recent and potentially most valuable data vulnerable.
"Data degradation however, as any data retention model, cannot defeat trail disclosures performed by an adversary spying the database system from its creation," van Heerde and his colleagues wrote. So data degradation technology itself can’t prevent a breach of sensitive data -- it’s just an enhancement to standard access controls.
Most of what van Heerde’s proposed technology would do could be simply handled by good data management practices. Unfortunately, like common sense, good data practices are not common enough.
-- Sean Gallagher is an award-winning IT journalist and the former head of InformationWeek Labs. Gallagher is now an independent journalist and technology consultant based in Baltimore. He can be reached at gallagher.sean.m@gmail.com.
An interesting article on a subject that may keep philosophers arguing for a long time: just what data should be "lost," what is important to keep, and how long should it be kept?
Seems almost like the argument over how many angels can dance on the head of a pin. Maybe unanswerable.
Then who will decide what to keep and how long, a committee, the president, the company CIO?
An interesting article and follow-up conversation.
This is not to say that there shouldn't be efforts made to think about and come up with new and better methods to protect data, but I think what the whole idea of fading speaks to is the yearning we have for online privacy and control over who has what information on us stored on their servers or storage sites. It's nice to think that there may be a way that our online trail could fade away or that the delete key could be as permanent as it sounds. But basically, we just need to get over it...it ain't gonna happen.
In case you haven’t heard, the Library of Congress is digitally archiving every public tweet since Twitter went online in March 2006. I’ve always wanted to have my writing in the Library of Congress. And now, it is—every tweet I’ve tweeted on Twitter since I wrote a story about it for GCN in March 2007.
So is this something else we have to be worried about?? A user's daily account of their actions being archived "forever" ???
The issue of data being attacked through cyber attacks and personal information being leaked out is a severe one. This can and does cost the IT companies huge losses. What I feel is that there should be specialized data centers which should provide the services of data storage and protection to other companies. Companies can outsource their responsibility of data storage and protection to these and these companies would be liable in case of any data theft. Since these companies will be specializing in storage and protection, they can do it in a more effective and economical way. I am not aware if these practices are being followed currently, but to me, it seems a smart solution to cater to the issue of information and privacy protection.
Ariella - I suspect they are not required to abide by HIPAA regulations. Personal information like social security numbers should be seen by as few people as possible (need to know). Any company that is still using SSNs for gratuitous purposes is playing with fire. For clients who haven't got the message, I usually assign unique indexes as necessary and they usually go along.
It's a good suggestion, mnt.code, to ask for a number other than a social security number. But that is not usually the way things are set up. For example, I have an employee number assigned by Pearson (for scoring on remote computers). The only time I use it is when I call with an issue. It is not used for logging in; the numbers used as a component for the log in are based on my social security number.
True, once the info gets out of the main system (and moves to an advertiser's storage, for example), the users stop having control over that information.
The issue you mention about information you want to keep intact being intentionally 'faded' might be too much to bear.
I don't want my data faded or degraded. Please, everyone with my data: leave it right where it is.
The credit card and shipping info stored by Amazon, for example, and the browser cookies that call that data: let's not touch that.
Ditto the personal information that makes it easy to access (and yes, change!) my student loans, bank accounts, credit cards, and business transactions and relationships of all sorts. Don't forget all the times I filed for unemployment insurance online.
Yes, there is a microscopic chance that my "identity" could be "stolen." But I am consciously taking that risk, in exchange for my life being otherwise made much easier.
That's it. Researchers are trying to account for the lazy practice at many organizations. The marketers will always want to have all information on line for personalization applications, and database managers would like as little as possible. I have worn both hats, but would side with the database managers on this one.
Data that isn't required for the function of the web site can be maintained off line. Rarely are all saved fields needed on line, or the most vulnerable data could be swapped out for a unique placeholder, employee numbers instead of SSNs for instance. The data that a hacker is looking for will always be the newest data, which renders "fading" moot.
Interesting that you should post about this now. Someone just told me about people signing on for a service to remove their virtual presence. I told him that I didn't think they could remove themselves altogether. If their personal data is stored -- even on a backup system -- it would not be within their power or that of their agent to delete their information.