Big data can often benefit from summary/abstraction databases that provide enough detail for analytics and are linked back to the main data to support deeper analysis where needed. This can speed access and reduce database costs.
Tom, you said, "What we should do when we're thinking about big data, meaning large data of any sort, is to think about how that data could be abstracted in different ways to allow query and analytic access while reducing overall database usage." Sounds good. But how do we do that? Do you have any suggestions on how to start?
When have you known me not to have suggestions! The starting point is to look at how summaries of the "main data" could be constructed to facilitate general queries. For example, health care information could be summarized/abstracted by condition/diagnosis, medication, and so on. These abstractions are then linked, conceptually by key ("diagnosis" again being an example), to the main database. Now if somebody wants to look at the prevalence of conditions by location, date, etc., they can use the abstracted database. If they want to drill down, they can take their query results and use the "conditions" key to access the main database. By doing this they avoid spinning through detail records to analyze what is inherently summary data.
Now that TGen has broken new ground in genomic research by using Dell's storage, cloud, and high-performance computing solutions, the institute discusses what comes next for it and for personalized medicine.
The Translational Genomics Research Institute wanted to save lives, but its efforts were hobbled by immense computing challenges related to collecting, processing, sharing, and storing enormous amounts of data.
There's a lot of hype about virtualization of networks, NaaS, and SDN, but there are a couple of proven applications that enterprises could adopt right now to potentially save money and improve operations.
Skype/Outlook UC integration means we're going to have competition and fragmentation of UC client architectures, but is that bad? Modern devices can support IM, email, voice, and video clients, so maybe it's the back end of UC we need to be worried about.
Workers are now used to portable device support throughout their everyday lives, so we should rethink the policy of giving stationary workers fixed-desk devices. Could portable support be smarter?
Input devices run the gamut, from the humble Missile Command-style trackball to advanced speech recognition. Unfortunately, these input devices can be used for evil as well as good. Case in point: mobile ads that want you to talk to them.
Enterprises want three things in storage systems: first, a speech-recognition means of capturing videoconference data for indexing; second, semantic/AI analysis of email and IM for content indexing; and third, a better system for managing hierarchical storage tiers.