Big data can often benefit from the creation of summary/abstraction databases that provide enough detail for analytics and are linked to the main data to support deeper analysis where needed. This can speed access and reduce database cost.
When have you known me not to have suggestions! The starting point is to look at how summaries of the "main data" could be constructed to facilitate general queries. For example, health care information could be summarized/abstracted by condition/diagnosis, medication, etc. These abstractions are then linked to the main database (conceptually by key; "diagnosis" is again an example). Now if somebody wants to look at the prevalence of conditions by location, date, etc., they can use the abstracted database. If they want to drill down, they can take their query results and use the "diagnosis" key to access the main database. By doing this they avoid spinning through detail records to analyze what's inherently summary data.
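To make the idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. Everything specific in it is an assumption for illustration only: the table names (encounters, diagnosis_summary), the columns, and the four made-up records. It shows one way a summary table could be built from the main data and then used for a prevalence query, with the "diagnosis" key used to drill back into the detail records only when needed.

```python
import sqlite3

def build_demo(conn):
    """Create a hypothetical 'main' table and a summary table derived from it."""
    conn.executescript("""
        CREATE TABLE encounters (              -- the "main data" (detail records)
            id INTEGER PRIMARY KEY,
            patient TEXT, diagnosis TEXT, location TEXT, visit_date TEXT
        );
        CREATE TABLE diagnosis_summary (       -- the abstraction
            diagnosis TEXT, location TEXT, month TEXT, cases INTEGER,
            PRIMARY KEY (diagnosis, location, month)
        );
    """)
    # Made-up sample records, purely illustrative.
    rows = [
        ("p1", "asthma", "Boston", "2013-01-05"),
        ("p2", "diabetes", "Boston", "2013-01-20"),
        ("p3", "asthma", "Boston", "2013-02-10"),
        ("p4", "asthma", "Denver", "2013-02-11"),
    ]
    conn.executemany(
        "INSERT INTO encounters (patient, diagnosis, location, visit_date) "
        "VALUES (?, ?, ?, ?)", rows)
    # Scan the detail records once to populate the summary.
    conn.execute("""
        INSERT INTO diagnosis_summary (diagnosis, location, month, cases)
        SELECT diagnosis, location, substr(visit_date, 1, 7), COUNT(*)
        FROM encounters
        GROUP BY diagnosis, location, substr(visit_date, 1, 7)
    """)

def prevalence(conn, location):
    """General query: answered from the summary table only."""
    return conn.execute(
        "SELECT diagnosis, SUM(cases) FROM diagnosis_summary "
        "WHERE location = ? GROUP BY diagnosis ORDER BY diagnosis",
        (location,)).fetchall()

def drill_down(conn, diagnosis, location):
    """Deeper analysis: follow the diagnosis key back to the main table."""
    return conn.execute(
        "SELECT patient, visit_date FROM encounters "
        "WHERE diagnosis = ? AND location = ? ORDER BY visit_date, patient",
        (diagnosis, location)).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_demo(conn)
    print(prevalence(conn, "Boston"))
    print(drill_down(conn, "asthma", "Boston"))
```

In a real system the summary would probably live in a separate, smaller database and be refreshed on a schedule; the single in-memory database here just keeps the sketch self-contained.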
Tom, you said "What we should do when we're thinking about big data, meaning large data of any sort, is thinking about how that data could be abstracted in different ways to allow query and analytic access while reducing overall database usage." Sounds good. But how do we do that? Have you got any suggestions of how to start?