3 Big-Data Mistakes to Avoid

Cormac Foster, Journalist, Analyst, Tech Manager | 1/16/2013 | 19 comments

CIOs and other IT leaders are increasingly being asked to work with colleagues across the organization to develop ways to mine structured and unstructured data in order to draw actionable business insights.

More data is available to enterprises than ever before, and analysis tools have never been cheaper. There's plenty of upside to undertaking these types of big-data projects, but there are also some big traps that can hurt your career and derail your funding for future projects.

Here are three procedural problems to avoid:

Putting the wrong people in charge of your big-data project
Big-data projects touch each part of your enterprise, requiring input, support, funding, and information from nearly every department. The resulting insights can ultimately benefit everyone in the company. Regardless of what it takes to get your numbers crunched and come up with clear answers, you need a strong leader for each big-data initiative. This leader might not be the person who automatically comes to mind.

Since the breadth and depth of data required is so large, big-data projects carry the potential for unprecedented scope creep. To avoid an endless string of "one more thing" bolt-on projects, your project leader needs to be firm, well respected, and operating with the very public support of executive management. Without these qualities, your project leader will flounder.

While the marketing department is probably your company's biggest consumer of data analysis, it's generally a mistake to put a pure marketer in charge of a big-data project. A CEB study quoted by the Harvard Business Review noted that many marketers are notoriously data averse. The handful of marketing executives who are data driven may veer too far in the opposite direction, relying on data to the exclusion of all other information sources.

You might be tempted to put a data scientist in charge. This is also ill advised, as data scientists tend to lack the political understanding needed to nurture relationships across the organization. They may also have difficulty handling the gut-level marketing requirements needed to produce useful end results.

Ideally, your big-data project leader should come from your company's program management department. This type of executive is the most likely to offer the necessary project management skills, far-reaching internal relationships, knowledge of the corporate culture, and an understanding of technology.

Letting your data analysis run amok
If analyzing big data couldn't help your executives glean additional insights about customers and the business, there wouldn't be much reason to do it. That said, you can't go into a project without a clear focus, or you'll be overwhelmed and distracted by too much information. Down the line, your analysis will pay off in unexpected ways, but that's a bonus. If you want your mining to generate a return, you have to start by answering a specific question or a very clearly defined set of questions.

Develop your focus and refine your queries first. Then stick to the plan. Are you trying to get a better handle on your true customer acquisition and costs? Are you looking at ways to reduce churn? Do you want to identify the prospects with the greatest likelihood of becoming customers? Focus on the primary questions first.

After you've addressed these priority questions, you can process the exciting accidental discoveries that are sure to emerge during your analysis. Getting distracted chasing new tangents as they arise will only dilute your results and slow down your project.

Failing to anticipate dirty data
Your data is probably a lot dirtier than you think. Big-data implementations usually pull together multiple data sources that have never before been combined. In some cases, the data has never been analyzed, even on its own. De-duping your data, standardizing your formatting, and otherwise cleaning your data takes time. The smaller your test run, the faster you can work. After your data is clean, you can still expect to encounter procedural issues. Once again, working with small data sets allows you to tweak the system in a timely manner.
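To make the cleanup steps concrete, here is a minimal Python sketch of standardizing formats, de-duping, and working on a small test run. Everything in it is an assumption made for illustration: the `name` and `email` fields, the two sample sources, the choice of email as the de-dupe key, and the helper names `normalize_record`, `clean`, and `sample`. Real data will need its own rules.

```python
import random


def normalize_record(record):
    """Standardize formatting so equivalent records compare equal."""
    return {
        # Collapse repeated whitespace and use consistent capitalization.
        "name": " ".join(record["name"].split()).title(),
        # Trim stray spaces and lowercase the address.
        "email": record["email"].strip().lower(),
    }


def clean(records):
    """Normalize every record, then drop duplicates by email (first one wins)."""
    seen = set()
    cleaned = []
    for record in records:
        normalized = normalize_record(record)
        if normalized["email"] in seen:
            continue
        seen.add(normalized["email"])
        cleaned.append(normalized)
    return cleaned


def sample(records, size, seed=0):
    """Draw a small, repeatable test sample so each cleaning pass runs quickly."""
    rng = random.Random(seed)
    return rng.sample(records, min(size, len(records)))


# Hypothetical rows from two systems that have never been combined before.
crm_rows = [
    {"name": "ada  lovelace", "email": "Ada@Example.com "},
    {"name": "Grace Hopper", "email": "grace@example.com"},
]
billing_rows = [
    {"name": "ADA LOVELACE", "email": "ada@example.com"},
    {"name": "alan turing", "email": " alan@example.com"},
]

combined = crm_rows + billing_rows
cleaned = clean(combined)                 # full pass over all rows
test_run = sample(combined, size=3)       # small subset for quick iteration
cleaned_sample = clean(test_run)
```

Running the rules on `test_run` first lets you adjust them cheaply before committing to a pass over the full data set.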

These are just some of the mistakes I've seen made by companies as they undertake big-data projects. I'd like to hear more about your experiences -- are they similar in nature? Are there other big-data pitfalls that come to mind? Tell us all about it in the comments section below.

Syerita Turner   3 Big-Data Mistakes to Avoid   1/28/2013 8:23:34 AM
Re: best questions?
@nasimson... That is a very good question. The answer is asking questions and conducting research in order to validate the data that you are questioning.
nasimson   3 Big-Data Mistakes to Avoid   1/27/2013 12:45:31 PM
Re: best questions?
The question now is: how do you ensure the credibility of the data?
nasimson   3 Big-Data Mistakes to Avoid   1/27/2013 12:42:51 PM
Re: best questions?
The project manager should start by calling the team together (being certain to include off-site staff via the best technology available) and delivering a presentation about the project and its significance in a way that gets everybody fired up.
nasimson   3 Big-Data Mistakes to Avoid   1/27/2013 12:41:25 PM
Productivity losses
To stop the productivity losses, a good first step is to reduce work in progress (WIP) by 25-50 percent. This reduces the back and forth and makes managers and experts more responsive in dealing with issues and questions. Though counter-intuitive, reducing the number of open projects by 25-50 percent can double task completion rates.
MDMConsult   3 Big-Data Mistakes to Avoid   1/26/2013 9:17:11 PM
Re: best questions?
Expensive, yes, but fixing dirty data problems is valuable. Data quality and data consistency are major roadblocks to reaching this value. Organizations tend to expect high-quality data to be clean data. It's crucial that the human element is in place, not just reliance on technologies alone.
Syerita Turner   3 Big-Data Mistakes to Avoid   1/22/2013 8:42:44 AM
Re: best questions?
@Cormac you are so right. Most companies operate like that; needing results at a quick rate and the functionality is not there. That is one mistake that really can be avoided by planning ahead of time so that these tight crunches are avoided.
Cormac Foster   3 Big-Data Mistakes to Avoid   1/21/2013 9:42:29 PM
Re: best questions?
I think it's safe to assume that most data that's only used by a single application or division is, to one extent or another, "dirty," even if the values are correct and the problem is only inconsistent formatting. Fixing the problems is valuable, regardless of your Big Data plans, so as long as everyone's on board with your test runs being exploratory sessions, you're in good shape. The problem comes when you're expected to turn around fast results on a broad scale with your first run.
Syerita Turner   3 Big-Data Mistakes to Avoid   1/21/2013 4:38:47 PM
Re: best questions?
Great post. I think that within my organization the failure to anticipate that they will receive dirty data is our main pitfall. They really don't consider the information that is coming in to us as something that could potentially be harmful. When they get a handle on ensuring that all information that comes in and goes out is secure, then they will really be on to a great security start.
Cormac Foster   3 Big-Data Mistakes to Avoid   1/18/2013 2:48:49 AM
Re: best questions?
I think mission creep is a big issue toward the beginning of a project, when you're still feeling out your processes and data sources. I don't mean to imply that it will ever be completely trivial to get ad hoc data cuts across massive quantities and types of data, but after many iterations (and all the standardization and cleaning that will require), it will certainly get easier. I imagine that big companies with extensive expertise mining data since long before Hadoop existed – banks, retailers, etc. – already have a fairly simple process for pulling new reports that doesn't involve a new spec at all. I think the biggest issue with Big Data is that we now have cheap enough software and computing cycles to bring that level of analysis to companies that couldn't afford it before. We also have more data, but I think that's secondary to the experience, tools, and discipline.
CurtisFranklin   3 Big-Data Mistakes to Avoid   1/17/2013 11:10:31 PM
Re: best questions?
@Cormac, with the developments in database and computing technology, do you foresee a day when mission creep won't be such a big issue -- when expanding processing to answer any question about any portion of the data set is essentially without additional cost? It seems that's something a few companies are promising, though I'm not sure anyone is truly close today.


The blogs and comments posted on EnterpriseEfficiency.com do not reflect the views of TechWeb, EnterpriseEfficiency.com, or its sponsors. EnterpriseEfficiency.com, TechWeb, and its sponsors do not assume responsibility for any comments, claims, or opinions made by authors and bloggers. They are no substitute for your own research and should not be relied upon for trading or any other purpose.