May 17, 2011

Back in December, when making predictions for the upcoming year regarding important BI trends, I wrote that we could expect to see use of text mining and analysis increase in 2011, just as it has almost every year since we’ve measured its adoption (see “What Lies Ahead: BI and Data Warehousing Predictions for 2011,” 14 December 2010).

A major driver behind this trend is that organizations are now faced with more and more unstructured data sources that they want to use to optimize their BI, marketing, and various performance management practices. In particular, Web, contact center, surveys, maintenance logs, sensors, and consumer social media sites are all contributing to the exploding amounts of unstructured data that almost every organization in every industry is generating.

Other important factors include the fact that text mining and analysis tools have been around for a while now and, as such, have become more comprehensive and user friendly. Don’t get me wrong, I’m not saying that developing text mining applications is an easy project by any means, but the tools are more comprehensive and have better interfaces, and the practice of text analysis is better understood today. In addition, the past few years have seen major BI players buying up independent text analytics vendors. Moreover, and probably most important, the technology is getting embedded in other software (e.g., CRM systems, focused applications).

Due to these developments, a more favorable attitude toward text mining and analysis appears to be emerging, whereby organizations seem to find the technology more approachable. Consequently, not only do I expect the use of text analysis and text mining software to increase this year, I also expect that more organizations will turn to integrating unstructured data into their data warehouses to support their BI initiatives. As I wrote in December, organizations are no longer content with using text analysis/mining tools in a standalone manner; they now have advanced their data integration capabilities to the extent that they are blending unstructured data into their data warehouses to support their corporate data analysis efforts.

Of course, the ability to manage and analyze unstructured data is important if organizations want to integrate data from social media into their BI and data warehousing systems, though, at this time, not many end-user organizations (i.e., non-Internet-based companies) are doing so. (Today, this remains mostly the realm of the major Internet players, such as Google, Yahoo!, and Facebook.) But this, too, I believe will soon start to change as people begin to develop a better understanding of just how to go about analyzing social media (i.e., learning the types of consumer trends they need to look for and how to do it) and how to apply the findings. For example, they are learning how to use such findings to optimize online as well as more traditional marketing efforts and to support product development efforts. (For some real examples of how organizations are using text mining for social media analysis and other applications, see “Miners that Shed Light: Some Innovative Predictive Analytics,” 7 September 2010.)

While social media analysis is destined to play an increasingly important role, we are also seeing text mining and analysis being applied to analyze unstructured data found in maintenance logs, sensor systems, and other operational devices in order to perform predictive maintenance on machinery, electronics, and other complex equipment. For example, major aircraft and equipment manufacturers — ranging from providers of helicopters to earth-moving equipment and process control systems — are working on text-mining-based predictive maintenance efforts. (For some real examples of how organizations are using text mining and analysis for preventative maintenance, see “Text Mining and Data Warehousing for Optimizing Preventative Maintenance,” 22 June 2010.)

I believe that companies should pay close attention to this trend because predictive maintenance represents a major change in how organizations are going to plan for and carry out equipment maintenance. Consequently, it has become a very important area in the application of data mining, text analysis, and predictive analytics in general.

The bottom line is that organizations are showing a lot of interest in using text analysis and mining to support various BI, performance management, and other optimization efforts. Of course, just how much interest there is, and how many organizations are adopting the technology, is the big question. Also, just because organizations express interest does not necessarily mean that they are fielding text mining and analysis applications (i.e., deploying them in production settings). And what about the serious issues and roadblocks they’re encountering in their efforts? To answer these and other questions, I’m conducting a survey on text mining and analysis. I strongly urge you to take our survey. Surveys like these are invaluable because they help us separate the facts from hype when it comes to corporate technology adoption and use trends, and they provide information that you can use to plan and benchmark your own organization’s BI efforts. As has always been our policy, responses will remain confidential; they will be aggregated to determine overall corporate adoption trends. I will present my findings in upcoming Cutter Consortium research.


  One Response to “As Unstructured Data Rises, So Does View of Text Mining”

  1.

    The daunting task has to move beyond interest into practical use and lessons learned (not just observed). We (my office) currently have a BI/BA pilot project underway to capture/digest variant forms and media/mediums of information, structured and unstructured. Given the evolution and progression of capabilities in the BI/BA marketplace, we proved in the “proof of principle” phase that the “tools,” if you will, exist. This was also an important highlight at the Teradata 2010 Partners User Conference and Executive Forum. The challenge always amounts to cultural transformation and bridging the gap between the functional cylinders of excellence within organizations.
