3 Big-Data Mistakes to Avoid

Cormac Foster, Journalist, Analyst, Tech Manager | 1/16/2013 | 19 comments

CIOs and other IT leaders are increasingly being asked to work with colleagues across the organization to develop ways to mine structured and unstructured data in order to draw actionable business insights.

More data is available to enterprises than ever before, and analysis tools have never been cheaper. There's plenty of upside to undertaking these types of big-data projects, but there are also some big traps that can hurt your career and derail your funding for future projects.

Here are three procedural problems to avoid:

Putting the wrong people in charge of your big-data project
Big-data projects touch each part of your enterprise, requiring input, support, funding, and information from nearly every department. The resulting insights can ultimately benefit everyone in the company. Regardless of what it takes to get your numbers crunched and come up with clear answers, you need a strong leader for each big-data initiative. This leader might not be the person who automatically comes to mind.

Because the breadth and depth of data required are so large, big-data projects carry the potential for unprecedented scope creep. To avoid an endless string of "one more thing" bolt-on projects, your project leader needs to be firm, well respected, and operating with the very public support of executive management. Without these qualities, your project leader will flounder.

While the marketing department is probably your company's biggest consumer of data analysis, it's generally a mistake to put a pure marketer in charge of a big-data project. A CEB study quoted by the Harvard Business Review noted that many marketers are notoriously data averse. The handful of marketing executives who are data driven may veer too far in the opposite direction, relying on data to the exclusion of all other information sources.

You might be tempted to put a data scientist in charge. This is also ill-advised, as data scientists tend to lack the necessary political understanding to nurture relationships across the organization. They may also have difficulty handling the gut-level marketing requirements needed to produce useful end results.

Ideally, your big-data project leader should come from your company's program management department. This type of executive is the most likely to offer the necessary project management skills, far-reaching internal relationships, knowledge of the corporate culture, and an understanding of technology.

Letting your data analysis run amok
If analyzing big data couldn't help your executives glean additional insights about customers and the business, there wouldn't be much reason to do it. That said, you can't go into a project without a clear focus; otherwise, you'll be overwhelmed and distracted by too much information. Down the line, your analysis will pay off in unexpected ways, but that's a bonus.

If you want your mining to generate a return, you have to start by answering a specific question or a very clearly defined set of questions. Develop your focus and refine your queries first. Then stick to the plan. Are you trying to get a better handle on your true customer acquisition and costs? Are you looking at ways to reduce churn? Do you want to identify the prospects with the greatest likelihood of becoming customers?

Focus on the primary questions first. After you've addressed these priority questions, you can process the exciting accidental discoveries that are sure to emerge during your analysis. Chasing new tangents as they arise will only dilute your results and slow down your project.

Failing to anticipate dirty data
Your data is probably a lot dirtier than you think. Big-data implementations usually pull together multiple data sources that have never before been combined. In some cases, the data has never been analyzed, even on its own. De-duping your data, standardizing your formatting, and otherwise cleaning your data takes time. The smaller your test run, the faster you can work. After your data is clean, you can still expect to encounter procedural issues. Once again, working with small data sets allows you to tweak the system in a timely manner.
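
To make the cleanup steps concrete, here is a minimal Python sketch of a small test run: standardize the formatting, then de-dupe. The field names (name, email), the sample rows, and the helper functions are all hypothetical, chosen only for illustration; real sources will need their own rules.

```python
import random


def normalize(record):
    """Standardize formatting so equivalent records compare equal."""
    return {
        # Collapse repeated whitespace and use consistent capitalization.
        "name": " ".join(record["name"].split()).title(),
        # Trim stray spaces and lowercase the address.
        "email": record["email"].strip().lower(),
    }


def dedupe(records, key="email"):
    """Keep the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            unique.append(rec)
    return unique


def sample(records, size, seed=0):
    """Draw a small, reproducible subset for a fast test run."""
    rng = random.Random(seed)
    return rng.sample(records, min(size, len(records)))


# Hypothetical rows merged from two sources that were never combined before.
raw = [
    {"name": "ada  lovelace", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "GRACE HOPPER", "email": "grace@example.com"},
]

# Normalize first; otherwise the two Ada rows would not look like duplicates.
clean = dedupe([normalize(r) for r in raw])
```

The order matters: formatting has to be standardized before de-duping, because records that differ only in case or spacing will not match until they share a format. Running the same steps on a small sample first makes it cheap to adjust the rules before committing to the full data set.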

These are just some of the mistakes I've seen made by companies as they undertake big-data projects. I'd like to hear more about your experiences -- are they similar in nature? Are there other big-data pitfalls that come to mind? Tell us all about it in the comments section below.

Comments
Syerita Turner   3 Big-Data Mistakes to Avoid   1/28/2013 8:23:34 AM
Re: best questions?
@nasimson... That is a very good question. You can ask questions and conduct research to validate the data that you are questioning.
nasimson   3 Big-Data Mistakes to Avoid   1/27/2013 12:45:31 PM
Re: best questions?
The question now is: how do you ensure the credibility of the data?
nasimson   3 Big-Data Mistakes to Avoid   1/27/2013 12:42:51 PM
Re: best questions?
The project manager should start by calling the team together (being certain to include off-site staff via the best technology available) and delivering a presentation about the project and its significance in a way that gets everybody fired up.
nasimson   3 Big-Data Mistakes to Avoid   1/27/2013 12:41:25 PM
Productivity losses
To stop the productivity losses, a good first step is to reduce work in progress (WIP) by 25-50 percent. This reduces the back and forth and makes managers and experts more responsive in dealing with issues and questions. Though counterintuitive, reducing the number of open projects by 25-50 percent can double task completion rates.
MDMConsult   3 Big-Data Mistakes to Avoid   1/26/2013 9:17:11 PM
Re: best questions?
Expensive, and yes, fixing dirty data problems is valuable. Data quality and data consistency are a major roadblock to reaching this value. Organizations tend to expect high-quality data to be clean data. It's crucial that the human element is in place, not just relying on technologies alone.
Syerita Turner   3 Big-Data Mistakes to Avoid   1/22/2013 8:42:44 AM
Re: best questions?
@Cormac, you are so right. Most companies operate like that: they need results quickly, and the functionality is not there. That is one mistake that really can be avoided by planning ahead of time so that these tight crunches are avoided.
Cormac Foster   3 Big-Data Mistakes to Avoid   1/21/2013 9:42:29 PM
Re: best questions?
I think it's safe to assume that most data that's only used by a single application or division is, to one extent or another, "dirty," even if the values are correct and the problem is only inconsistent formatting. Fixing the problems is valuable, regardless of your Big Data plans, so as long as everyone's on board with your test runs being exploratory sessions, you're in good shape. The problem comes when you're expected to turn around fast results on a broad scale with your first run.
Syerita Turner   3 Big-Data Mistakes to Avoid   1/21/2013 4:38:47 PM
Re: best questions?
Great post. I think that within my organization, failing to anticipate dirty data is our main pitfall. They really don't consider the information that is coming in to us as something that could potentially be harmful. When they get a handle on ensuring that all information that comes in and goes out is secure, then they will really be off to a great start on security.
Cormac Foster   3 Big-Data Mistakes to Avoid   1/18/2013 2:48:49 AM
Re: best questions?
I think mission creep is a big issue toward the beginning of a project, when you're still feeling out your processes and data sources. I don't mean to imply that it will ever be completely trivial to get ad hoc data cuts across massive quantities and types of data, but after many iterations (and all the standardization and cleaning that will require), it will certainly get easier. I imagine that big companies with extensive expertise mining data since long before Hadoop existed – banks, retailers, etc. – already have a fairly simple process for pulling new reports that doesn't involve a new spec at all. I think the biggest issue with Big Data is that we now have cheap enough software and computing cycles to bring that level of analysis to companies that couldn't afford it before. We also have more data, but I think that's secondary to the experience, tools, and discipline.
CurtisFranklin   3 Big-Data Mistakes to Avoid   1/17/2013 11:10:31 PM
Re: best questions?
@Cormac, with the developments in database and computing technology, do you foresee a day when mission creep won't be such a big issue -- when expanding processing to answer any question about any portion of the data set is essentially without additional cost? It seems that's something a few companies are promising, though I'm not sure anyone is truly close today.
