Category

Data Integration

Jun 14, 2016
 
How to Architect a Data Lake

“How do you architect a lake?” If the question sounds like the opening line of a joke, the answer would clearly come as: “You don’t. You can only discover one.” Whether it is data warehouses or marts, data lakes, or reservoirs, the IT industry has a penchant for metaphor. The subliminal images conjured in the human mind by the above terms are, in my opinion, of critical importance in guiding thinking about the fundamental meanings and architectures of these constructs. Thus, a data warehouse is a large, cavernous, but well-organized location for gathering and storing data prior to its final use and a place where consumers are less than welcome for fear of being knocked Read more

Feb 09, 2016
 
Call for Papers: IoT Data Management and Analytics

The Internet of Things (IoT) is rapidly emerging as a transformational paradigm with a multitude of products and services now available and being adopted by corporations as well as individuals seeking to harness the vast opportunities it offers. But we do face a major obstacle in realizing the full potential of the IoT. The sheer variety, volume, and velocity of data generated by the IoT present unprecedented challenges in deriving meaningful and actionable insights and call for a strategic approach to data management and usage. Such an approach also needs to address business strategies, business processes, enterprise architecture, systems and applications, and security and privacy considerations. It is also important to examine and decide what data Read more

Jun 30, 2015
 
The Data Lake as an Exploration Platform

The data lake is an attractive use case for enterprises seeking to capitalize on Hadoop’s big data processing capabilities. This is because it offers a platform for solving a major problem affecting most organizations: how to collect, store, and assimilate a range of data that exists in multiple, varying, and often incompatible formats strung out across the organization in different sources and file systems. In the data lake scenario, Hadoop serves as a repository for managing multiple kinds of data: structured, unstructured, and semistructured. But what do you do with all this data once you get it into Hadoop? After all, unless it is used to gain some sort of business value, the data lake Read more

Jun 23, 2015
 
Call for Papers: How Wearables Will Impact Corporate Technology, Business, Social, and Legal Landscapes

Wearable devices like smart watches (Apple Watch, Samsung Gear, etc.), activity/fitness trackers (Fitbit, Jawbone, etc.), smart badges for location tracking and security (Ekahau), and smart glasses (Google Glass, Vuzix, etc.) are generating considerable interest in the consumer and business worlds. Recently, new types of wearables have also appeared, including “smart fabrics/clothing” and virtual reality headsets. Wearables offer a host of opportunities, ranging from creating a new touchpoint for companies to engage directly with customers to changing the way healthcare is provided and medical research is conducted. Moreover, the use of wearable devices to assist employees with their jobs is now underway and is expected to advance in the near future. In short, wearables are making Read more

Here Comes the Big Data

Dec 16, 2014
 

About two decades ago I thought I had a handle on big data. I was doing some data warehousing work with a telephone utility that had about 100 million transactions. That was a lot of data, I said to myself. Then, about 10 years ago, I was doing a review of a firm that audited financial trading on one of the major stock markets and I asked its big data guy how many transactions the company processed. His initial answer was, “On a slow day we get about 2.5 billion transactions.” “How many do you have on a busy day?” I asked with an air of shock. “4 or 5 billion,” he responded. Now that Read more

Dec 19, 2013
 
Will the Laggards Speed Up, Please?

The Oil and Gas (O&G) industry, especially its so-called “upstream” segment, exploration and production (or going from the rock to the pipeline), is totally based on data. A seismic survey may collect petabytes of acoustic signals. Increasingly, when wells are drilled, sensors are inserted in them, and these sensors collect data for the next 30 years of production. Two completely different applications, but in both cases they result in masses of data. O&G’s dependency on data began decades ago when the Schlumberger brothers invented the “electric log” in 1926. And yet, this sector has been one of the most conservative, even lagging, adopters of modern modeling and management techniques for both information and processes. Over Read more

Aug 15, 2012
 

For those not familiar with the structure of subways, the third rail is the one that carries the power for the trains. Under no circumstances do you ever want to touch it. The phrase “third rail” has made its way into political discourse as a description of a “no-win” issue — something that should be avoided at all costs. In corporate IT, email and the increasingly sophisticated messaging structure that supports it are third rail issues. As I said, the third rail is what carries the power to a subway train or electrified rapid transit. Should you be so unlucky as to find yourself on one of those tracks, it’s safe to touch the two Read more