
Tame the Big-Data Deluge by Devising a Metadata Strategy

Ride the crest of the big-data wave by using metadata management as your surfboard. As more of your organization's information assets become untethered from relational databases, you'll rely increasingly on metadata to classify, qualify, and otherwise manage today's diverse data resources.

"Metadata" is one of those terms that appears to have as many meanings as there are people using the word. The standard definition of metadata is "data about data." That's like defining a tree as "wood with leaves."

A slightly better definition of metadata comes from Mika Javanainen's November 5, 2014, article on TechRadar: "The attributes, properties and tags that describe and classify information." These attributes may include the data type (text document, image, JavaScript, etc.), creation date, author, or workflow state.
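
As a concrete example, a minimal metadata record along those lines might be sketched in Python as shown below; the field names and values are purely illustrative, not any standard schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# Illustrative metadata record; the fields are hypothetical, not a standard schema.
@dataclass
class DocumentMetadata:
    data_type: str               # e.g., "text document", "image", "JavaScript"
    author: str
    created: datetime
    workflow_state: str          # e.g., "draft", "in review", "approved"
    tags: List[str] = field(default_factory=list)

meta = DocumentMetadata(
    data_type="text document",
    author="j.smith",
    created=datetime(2014, 11, 5),
    workflow_state="draft",
    tags=["proposal", "q4"],
)
print(meta)
```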

Like many definitions, this one fails to convey how important metadata is to organizing and managing massive data stores composed of diverse elements that relate and interact in often unpredictable ways. As Javanainen points out, metadata's most important role may be as a bridge between the diverse information residing in an organization's CRM, ERP, and other siloed databases, which house both structured and unstructured data.

Javanainen recommends creating metadata templates for employees across the organization to standardize on, such as templates for proposals, contracts, invoices, and product information. Standard templates allow metadata attributes to be applied automatically and consistently to data at the point of ingestion.
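
A rough sketch of how such a template might be applied at ingestion time appears below; the template contents, attribute names, and ingest function are invented for illustration and do not reflect any particular product.

```python
from datetime import datetime, timezone

# Hypothetical per-document-class templates; attribute names are illustrative only.
TEMPLATES = {
    "proposal": {"department": "sales",   "retention_years": 7,  "workflow_state": "draft"},
    "contract": {"department": "legal",   "retention_years": 10, "workflow_state": "unsigned"},
    "invoice":  {"department": "finance", "retention_years": 10, "workflow_state": "received"},
}

def ingest(doc_class, author, content):
    """Stamp a new item with template attributes plus per-item metadata at ingestion."""
    metadata = dict(TEMPLATES[doc_class])          # template applied automatically and consistently
    metadata.update({
        "doc_class": doc_class,
        "author": author,
        "created": datetime.now(timezone.utc).isoformat(),
    })
    return {"metadata": metadata, "content": content}

record = ingest("invoice", author="a.jones", content=b"%PDF-1.4 ...")
print(record["metadata"])
```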

Managing the transition from structured RDBMSs to unstructured big data

As Ventana Research's Mark Smith points out in a November 12, 2014, article on the Smart Data Collective site, most big data in organizations resides in conventional relational databases (76 percent, according to the company's research), followed by flat files (61 percent) and data-warehouse appliances (46 percent).

However, when enterprise data managers were asked which tools they plan to use for their future big-data tasks, 46 percent named in-memory databases, 44 percent cited Hadoop, 43 percent named specialized databases, and 42 percent pointed to NoSQL databases.

Companies intend to use a mixed bag of technologies as they begin to implement their big-data strategies. Source: Ventana Research

The companies surveyed by Ventana Research identified metadata management as the single most important aspect of their big-data integration plans (58 percent), followed by joining disparate data sources (56 percent) and establishing rules for processing and routing data (56 percent).

A new company named Primary Data intends to help organizations realize the full value of their metadata resources. Forbes' Tom Coughlin describes the company's unique approach in a November 26, 2014, article.

The Primary Data platform uses data virtualization to create a single global namespace for managing direct-attached, network-attached, and both public and private cloud storage. To improve performance and efficiency, content metadata is stored on fast flash-based storage servers, while the data the metadata refers to is housed on lower-cost (and slower) hard disk drives.
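
The toy sketch below illustrates only the general pattern of keeping a fast metadata tier separate from a slower bulk-data tier behind a single namespace; it is a conceptual illustration, not Primary Data's actual API or architecture.

```python
# Conceptual toy only -- not Primary Data's API. It separates fast metadata lookups
# from bulk data held in a slower, cheaper tier behind a single namespace.
class GlobalNamespace:
    def __init__(self):
        self.metadata_tier = {}   # fast tier (flash-backed in a real system): path -> attributes
        self.data_tier = {}       # slow, low-cost tier (disk or cloud): object id -> bytes

    def put(self, path, data, **attributes):
        object_id = "obj-{}".format(len(self.data_tier))
        self.data_tier[object_id] = data                    # bulk content lands on the slow tier
        self.metadata_tier[path] = {"object_id": object_id, **attributes}

    def stat(self, path):
        # Metadata queries are served entirely from the fast tier.
        return self.metadata_tier[path]

    def get(self, path):
        return self.data_tier[self.metadata_tier[path]["object_id"]]

ns = GlobalNamespace()
ns.put("/projects/q4/report.docx", b"...", owner="a.jones", policy="archive-after-90-days")
print(ns.stat("/projects/q4/report.docx"))
```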

Primary Data's metadata server creates a logical abstraction of physical storage that automates data movement and placement via an intelligent policy engine. Source: Storage Newsletter
