Channel: Morpheus Blog

New Breed of JSON Tools Closes the Gap with XML


As JSON's popularity for web app development increases, the range of tools supporting the environment expands into XML territory. The key is to maintain the simplicity and other strengths of JSON while broadening the environment's appeal for web developers.

JSON or XML? As with most programming choices, determining which approach to adopt for your web app's server calls is not a simple "either-or" proposition.

JSON, or JavaScript Object Notation, was conceived as a simple, concise data format for encoding data structures in an efficient text format. In particular, JSON is less "verbose" than XML, according to InCadence Strategic Solutions VP Michael C. Daconta in an April 16, 2014, article on GCN.com.

Despite JSON's efficiency, Daconta lists four scenarios in which XML is still preferred over JSON:

  • When you need to tag the data via markup, which JSON doesn't support
  • When you need to validate a document before you transmit it
  • When you want to extend a document's elements via substitution groups or other methods
  • When you want to take advantage of one of the many XML tools, such as XSLT or XPath

SOAP APIs and REST APIs are also not strictly either/or

One of JSON's claims to fame is that it is so simple to use it doesn't require a formal specification. George Anadiotis explains in a January 28, 2014, post on the Linked Data Orchestration site that many real-world REST-based JSON apps require schemas, albeit much different schemas than their SOAP-based XML counterparts.

 

The basic JSON-REST model decouples the client and server, which separates the app's internal data representation from the wire format. Source: Safety Net

The most obvious difference is that there are far fewer JSON tools than there are XML tools. This is understandable considering that XML has been around for decades and JSON is a relative newcomer. However, JSON tools are able to reverse-engineer schemas based on the JSON fragments you arrange on a template. The tools' output is then edited manually to complete the schema.

Anadiotis presents a five-step development plan for a JSON schema:

  1. Create sample JSON for exchanging data objects
  2. Use a JSON schema tool to generate a first-draft schema based on your sample JSON fragments
  3. Edit the schema manually until it is complete
  4. Use a visualization tool to create an overview of the schema (optional)
  5. Run a REST API metadata framework to provide the API's documentation
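Step 2 of the plan can be approximated in a few lines. What follows is only a minimal sketch of how a tool might derive a first-draft schema from one sample fragment; the function name `infer_schema` and the sample record are hypothetical, and real schema generators handle many more cases (mixed arrays, optional keys, formats).

```python
import json


def infer_schema(value):
    """Return a first-draft, JSON-Schema-style description of a sample value."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        # Draft only: describe the items from the first element, if there is one.
        schema = {"type": "array"}
        if value:
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(val) for key, val in value.items()},
            # Every key seen in the sample is marked required; manual editing relaxes this.
            "required": sorted(value),
        }
    raise TypeError(f"unsupported JSON value: {value!r}")


# Hypothetical sample fragment (step 1 of the plan).
sample = json.loads('{"id": 7, "name": "Ada", "tags": ["x"], "active": true}')
draft = infer_schema(sample)
print(json.dumps(draft, indent=2))
```

The draft is deliberately naive. It treats every sampled key as required and looks only at the first array element, which is why the manual editing in step 3 remains necessary.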

Tool converts JSON to CSV for easy editing

JSON may trail XML in quantity and quality of available toolkits, but the developer community is working hard to close the gap. An example is the free JSON-to-CSV converter developed by Eric Mills of the Sunlight Foundation. The converter lets you paste JSON code into a box and then automatically reformat and recolor it in an easy-to-read table.

 

In this example, the JSON-to-CSV converter transforms JSON code into a table of data about Ohio state legislators. Source: Programmable Web

Mills' goal in creating the converter was to make JSON "as approachable as a spreadsheet," as Janet Wagner reports in a March 31, 2014, post on the Programmable Web site. While the converter is intended primarily as a teaching tool that demonstrates the potential of JSON as a driver of the modern web, Mills plans to continue supporting the converter if it is used widely.
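The basic idea behind a JSON-to-CSV conversion can be sketched with Python's standard library. This is not the code behind Mills' converter; it assumes the input is a JSON array of objects, flattens nested objects into dotted column names, and uses made-up sample records.

```python
import csv
import io
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining key names with dots."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def json_to_csv(text):
    """Convert a JSON array of objects to CSV text."""
    rows = [flatten(record) for record in json.loads(text)]
    # Columns appear in the order they are first seen across all records.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a record become empty cells
    return out.getvalue()


# Hypothetical sample data, for illustration only.
sample = (
    '[{"name": "A. Smith", "district": {"chamber": "upper", "number": 3}},'
    ' {"name": "B. Jones", "party": "I"}]'
)
csv_text = json_to_csv(sample)
print(csv_text)
```

Records that lack a column simply get an empty cell, which is what makes irregular JSON readable as a table.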

Conversely, if you'd like to convert Excel/CSV data to HTML, JSON, XML, and other web formats, take the free Mr. Data Converter tool out for a spin.


Relational or Graph: Which Is Best for Your Database?


Choosing between the structured relational database model and the "unstructured" graph model is less and less an either-or proposition. For some organizations, the best approach is to process their graph data using standard relational operators, while others are better served by migrating their relational data to a graph model.

The conventional wisdom is that relational is relational and graph is graph, and never the twain shall meet. In fact, relational and graph databases now encounter each other all the time, and both can be better off for it.

The most common scenario in which "unstructured" graph data coexists peaceably with relational schema is placement of graph content inside relational database tables. Alekh Jindal of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) points out in a July 9, 2014, post on the Intel Science and Technology Center for Big Data blog that most graph data originates in an RDBMS.

Rather than extract the graph data from the RDBMS for import to a graph processing system, Jindal suggests applying the graph-analytics features of the relational database. When a graph is stored as a set of nodes and a set of edges in an RDBMS, built-in relational operators such as selection, projection, and join can be applied to capture node/edge access, neighborhood access, graph traversal, and other basic graph operations. Combining these basic operations makes possible more complex analytics.

Similarly, stored procedures can be used as driver programs to capture the iterative operations of graph algorithms. The downside of expressing graph analytics as SQL queries is the performance hit resulting from multiple self-joins on tables of nodes and edges. Query pipelining and other parallel-processing features of RDBMSs can be used to mitigate any resulting slowdowns.
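To make the idea concrete, here is a small sketch that stores a graph as a nodes table and an edges table and then uses a join for neighborhood access and a driver loop for traversal. It uses SQLite from Python purely for illustration (Jindal's experiments used a column-oriented RDBMS), the Python loop stands in for a stored procedure, and the table names, column names, and four-node graph are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE edges (src INTEGER, dst INTEGER);
    INSERT INTO nodes VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO edges VALUES (1, 2), (2, 3), (3, 4), (1, 3);
""")


def neighbors(node_id):
    """Neighborhood access: a selection on edges joined back to nodes."""
    rows = conn.execute(
        "SELECT n.label FROM edges e JOIN nodes n ON n.id = e.dst "
        "WHERE e.src = ? ORDER BY n.id",
        (node_id,),
    ).fetchall()
    return [label for (label,) in rows]


def hop_distances(start):
    """Traversal: each loop iteration joins the current frontier with edges."""

    def reached():
        return conn.execute("SELECT COUNT(*) FROM dist").fetchall()[0][0]

    conn.execute("CREATE TEMP TABLE dist (id INTEGER PRIMARY KEY, hops INTEGER)")
    conn.execute("INSERT INTO dist VALUES (?, 0)", (start,))
    hops = 0
    while True:
        before = reached()
        # Nodes already in dist are skipped by the primary key plus OR IGNORE.
        conn.execute(
            "INSERT OR IGNORE INTO dist "
            "SELECT DISTINCT e.dst, ? FROM dist d JOIN edges e ON e.src = d.id "
            "WHERE d.hops = ?",
            (hops + 1, hops),
        )
        if reached() == before:  # no new nodes: the traversal is complete
            break
        hops += 1
    result = dict(conn.execute("SELECT id, hops FROM dist").fetchall())
    conn.execute("DROP TABLE dist")
    return result


print(neighbors(1))
print(hop_distances(1))
```

Each pass through the loop is one more self-join against the edges table, which is exactly the cost the paragraph above warns about on large graphs.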

When Jindal compared the performance of a column-oriented relational database and Apache Giraph on PageRank and ShortestPath, the former outperformed the latter in two graph-analytics datasets: one from LiveJournal with 4.8 million nodes and 68 million edges; and one from Twitter with 41 million nodes and 1.4 billion edges.

 

A column-oriented RDBMS matched or exceeded the performance of a native graph database in processing two graph datasets. Source: Alekh Jindal, MIT CSAIL

When migrating data from relational to graph makes sense

While there are many instances in which extending the relational model to accommodate graph-data processing is the best option, there are others where a switch to the graph model is called for. One such case is the massive people database maintained by Whitepages, which resided for many years in siloed PostgreSQL, MySQL, and Oracle databases.

As explained in a November 12, 2014, post on Linkurious, Whitepages discovered that many of its business customers were using the directory to ask graph-like questions, primarily for fraud prevention. In particular, the businesses wanted to know whether a particular phone number was associated with a real person at a physical address, and what other phone numbers and addresses have been associated with a particular person.

The development team hired by Whitepages used the Titan scalable graph database to meet the company's need for scalability, availability, high performance (processing 30,000 vertices per second), and high ingest rate (greater than 200 updates per second). The resulting graph schema more accurately modeled the way Whitepages customers were querying the database: from location to location, and number to number.

 

The Whitepages graph schema tracks people as they change physical address and telephone number, among other attributes. Source: Linkurious

Whitepages has made its graph infrastructure available to the public via the WhitePages PRO API 2.0.

Whether you find your organization's data better suited to either the graph or relational model, the Morpheus Virtual Appliance will help you with real-time database and system operational insights. Get your MongoDB, MySQL, Elasticsearch, or Redis databases provisioned with a simple point-and-click interface, and manage SQL, NoSQL, and In-Memory databases across hybrid clouds. 

HTML5 Promises Simpler Embedded Videos and Better Performance


Embedding videos in web pages is simpler, and playback quality is improved, with HTML5's new video specification.

TL;DR: The new HTML standard was a long time coming, but now that it has arrived, developers can put HTML5's advanced video features to good use. Use these sites to determine how compatible your browser is with HTML5, how to embed an open-source HTML5 video player in your pages, and how to make best use of all the attributes associated with HTML5's new "video" tag.

Let's face it: Adobe's Flash player is trouble. It's trouble because Flash is susceptible to so many zero-day attacks, making your systems more vulnerable to malware. It's also trouble because it's everywhere: Flash was the long-time de facto standard for streaming Web animations and other media on PCs and, to a lesser extent, mobile devices. (Emphasis on was.)

YouTube's recent decision to dump Flash in favor of HTML5 is the stake in the heart of a proprietary technology that has outstayed its welcome. In a January 28, 2015, article, CNET's Stephen Shankland describes the ascendance of HTML5 video. Flash will survive for a short time as a vestigial browser extension needed to accommodate sites that haven't converted to the new video standard. (Note that Google Chrome has used a built-in Flash player since 2010.)

However, Flash's ultimate demise may not be far off. Mozilla is developing a version of Firefox that doesn't need the Flash player, as Tech Times' Timothy Torres reports in a February 16, 2015, article. The trend is away from browser extensions generally and the Flash player in particular.

In October 2014, the W3C's final recommendation for the HTML5 specification was released. You can get a sense of how well your browser is prepared for the new web standard by visiting the HTML5 Test site. The service generates a numerical score indicating your browser's degree of support for HTML5. The top score is 555. Below the overall score are category tables showing how well your browser scored in such areas as parsing, video, audio, elements, forms, storage, 2D and 3D graphics, and user interaction.

The HTML5 Test site automatically generates an overall score of your browser's support for HTML5, and shows how well your browser does in specific categories. Source: HTML5 Test

The test doesn't attempt to cover all aspects of HTML5 and its many extensions. You can also view the aggregate scores of other desktop, tablet, and mobile browsers. On desktops, Google Chrome scores highest overall, followed by Opera, Firefox, Safari, and Internet Explorer.

The aggregate scores of browsers' support for HTML5 indicate that Chrome supports the new standard best, followed by Opera, Firefox, Safari, and IE. Source: HTML5 Test

Open-source HTML5 video player, and an HTML5 video tutorial

The site supporting the open-source Video.js HTML5 video player offers a primer on the spec's "video" tag, which works much like the "img" tag in earlier HTML versions. The goal is to allow developers to embed videos in their pages without being concerned about which players site visitors have installed. Video performance is enhanced because the browser doesn't need to call a separate plug-in or extension to play the video.

HTML5 Rocks' Pete LePage put together an extensive HTML5 video tutorial that covers everything from the spec's simple embedded playback controls to using the "source" element to specify multiple source files. For example, you'll optimize video performance by including the type attribute in the source element.

Use the type attribute in the "video" tag's source element to improve the performance of videos embedded in web pages. Source: HTML5 Rocks

The tutorial also covers using the "track" element to add subtitles and other text to a video, and special "video" attributes, such as autoplay, preload, loop, muted, and height & width. As you might expect, the tutorial includes several videos that demonstrate the techniques it presents.

Using a browser to manage your heterogeneous databases doesn't get any simpler than by using the new Morpheus Virtual Appliance. The Morpheus database-as-a-service (DBaaS) provides a single dashboard for provisioning, deploying, and monitoring MySQL, MongoDB, Redis, and ElasticSearch databases via a simple point-and-click interface.

Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over. You can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync.

Visit the Morpheus site for pricing information and to create a free account.

The Many Forms of HTML5 Local Storage


New local storage options in HTML5 are sure to improve the performance of any applications that run in browsers.

TL;DR: With HTML5, you can now store as much as 5MB of data locally in the browser, and decide whether or not the data should persist when the browser session ends. Here's a rundown of the HTML5 Web Storage options for app developers, as well as a few pitfalls you'll want to avoid.

After years of fits and starts, the W3C's final recommendation for the HTML5 specification was released in October 2014. Perhaps the greatest impact HTML5 will have for app developers is local data storage. Toptal's Demir Selmanovic points out in The 5 Most Common HTML5 Mistakes that Web Storage's local data stores are not encrypted, which introduces a potential security risk.

On the plus side, Web Storage data never travels to web servers, so it is more secure than old-style cookies and Flash LSOs. However, HTML5's localStorage and sessionStorage values are easy for bad guys to modify, so you should avoid storing security tokens locally.

HTML5 has already gone through several iterations of local storage, the simplest of which is JavaScript variables. According to Sitepoint's Craig Buckler in HTML5 Browser Storage: The Past, Present and Future, you can store application data in a single global variable.

The simplest approach to local storage for application data in HTML5 is to use a single JavaScript global variable. Source: Sitepoint

Alternatively, values can be stored in the page DOM as node attributes or properties. This is particularly beneficial for widget-specific values, but doing so is riskier than using JavaScript variables because you can't predict how your data will be interpreted by future browsers and other libraries.

Web Storage's window.localStorage and window.sessionStorage objects have identical APIs and are used to retain persistent data and session-only data, respectively. Name/value pairs are used to store domain-specific strings, and up to 5MB of data can be stored locally, none of which ever travels to the server.

HTML5's window.localStorage object allows local data to persist after the browser session is closed. Source: Sitepoint

Web Storage supports only string values, and it's unstructured, so it doesn't allow transactions, indexing, or searching. Conversely, IndexedDB's data store is structured, transactional, and more like NoSQL in terms of performance. Its synchronous and asynchronous API makes possible more robust client-side data storage and access, although the API's size and complexity make creating an IndexedDB polyfill a challenge.

File API enhancements facilitate local file access

When users interact with files in their browsers, the many back-and-forth trips between the client and server can be frustrating -- to users and developers alike. HTML5's File API lets users access and alter files in the browser with much less interaction with the server.

In an October 29, 2014, tutorial on Scotch.io, Spencer Cooley describes how to allow browser users to select one or more image files using JavaScript, and then display the file without requiring a call to the server.

HTML5 allows you to access an image file in your browser without having to communicate with the server. Source: Scotch.io

After accessing the FileList object, you render the file in the browser by loading one of the file objects into FileReader to generate a local URL that serves as the src in an image element.

Load a file object into FileReader to create a local URL to use as the src of an image element. Source: Scotch.io

The simplest way to manage your databases in a browser is by using the new Morpheus Virtual Appliance, which provides a single dashboard for provisioning, deploying, and monitoring your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus offers a simple point-and-click interface for analyzing SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

Container Virtualization versus Hypervisor Virtualization


The goal: combine the speed and small footprint of containers with the proven track record of hypervisor VMs.

TL;DR: Containers are taking the virtualization world by storm, but most analysts see the technology complementing and integrating with the traditional hypervisor virtual machine model rather than replacing it. The first steps are already being taken to merge the performance benefits of containers with the manageability and security of hypervisor VMs.

With the possible exception of Hollywood, there's no hype like tech hype. And lately much of the tech hype has centered on containers, which many pundits and press types praise as the cure for everything from server overload to psoriasis.

Eventually all the hot air surrounding container technology will blow away and IT will be left with an innovative approach to server virtualization that blends with rather than replaces existing methods. Even though containers have been around since before the 2013 introduction of the open-source Docker technology (now seen as a milestone in the industry), containers' impact on cloud services in particular is expected to continue to rise at a steep trajectory.

Where does that leave hypervisor-based virtual machines? As evidenced by VMware's recent release of vSphere 6.0, the more traditional virtualization architecture still has plenty to offer. Silicon Angle's Maria Deutscher reports in a February 10, 2015, article that the new version's long-distance migration option allows managers to relocate instances thousands of miles away without having to take them offline.

In addition, vSphere's cloning function reduces the amount of data that needs to travel across the network. Launch times are cut from minutes to seconds because fewer duplicate files are required to initialize. That's especially important because a big edge for containers over hypervisor VMs is that containers are faster and require much less overhead.

As Linux Journal's David Strauss explains in an article from June 2013, each VM requires its own operating system image in the hypervisor model, while multiple containers run within a single OS, in addition to sharing other binary and library resources. With hypervisor VMs, you often need more memory and disk space for the OS than you do for the application it's hosting.

Containers (the model at right) reduce virtualization overhead compared to hypervisors (shown on the left) by sharing OS, binaries, and libraries among instances. Source: Linux Journal

Container performance improvements come with caveats

Containers' smaller server footprint can more than double the number of instances each server can run. However, as TechTarget's Jim O'Reilly points out in a February 2015 article, doubling the number of instances also doubles the server's I/O load. O'Reilly cites a study conducted by IBM Research that found containers outperform hypervisors 2:1 in benchmarks that included LINPACK, random disk reads and writes, and SQL performance with local solid-state drives.

While hypervisor VMs allow provisioning without any hardware deployment, containers eliminate the need for OS deployment and boot-up. Source: Linux Journal

Despite containers' performance advantages over hypervisor VMs, some analysts caution that VMs remain the best choice in public-cloud, multi-tenant environments. Tom Nolle states in a December 2014 TechTarget article that the VM boundary makes it more difficult for hackers to attack adjacent applications than with separate containers. Also, it's more difficult to prevent one container from hogging resources needed by neighboring containers.

Nolle envisions containers running inside VMs, which is the goal of the recent alliance between Docker and VMware, as reported by Datamation's James Maguire in an August 28, 2014, article. Nolle anticipates that both technologies will benefit from such a symbiotic relationship.

One way to ensure peak performance for your databases is by using the new Morpheus Virtual Appliance. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

What Is a Lost Update in Database Systems?


On occasion, data in a database can become incorrect due to a lost update. Find out what a lost update is and what can be done to prevent it!

TL;DR: When database transactions are executed, they are typically sequential and very dutifully update the data as expected. On occasion, transactions happen nearly simultaneously, which can lead to something called a lost update. A lost update may cause data within the database to be incorrect, which can lead to problems with normal operations, such as fulfilling customer orders.

What is a Lost Update?

A lost update occurs when two different transactions are trying to update the same column on the same row within a database at the same time. Typically, one transaction updates a particular column in a particular row, while another that began very shortly afterward did not see this update before updating the same value itself. The result of the first transaction is then "lost", as it is simply overwritten by the second transaction.

The process involved in a lost update. Source: Vlad Mihalcea's Blog.

The image above shows the sequence of events that can occur in a lost update. Notice that both transactions are seeing a beginning value of 7 for the quantity column; however, the second transaction needs to see a value of 6 to be correct, as the first transaction was initiated in order to update that same column.

Since the second transaction is unaware of the change made by the first transaction, it simply changes the quantity to 10, overwriting and thus losing the update made by the first transaction. As you can see, if the quantity is first lowered to 6 before the second transaction makes an update, then the second transaction would do the correct calculation and update that value to 9. Instead, the second transaction still sees 7, and thus updates it to 10 instead of 9, causing the quantity to be incorrect!
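The arithmetic is easy to reproduce. The sketch below is a plain-Python simulation of the two interleavings rather than real database code: the quantity starts at 7, the first transaction subtracts 1, and the second adds 3. The function names are made up for illustration.

```python
def run_interleaved(initial=7):
    """Both transactions read before either one writes."""
    row = {"quantity": initial}
    seen_by_first = row["quantity"]       # first transaction reads 7
    seen_by_second = row["quantity"]      # second transaction also reads 7
    row["quantity"] = seen_by_first - 1   # first transaction writes 6
    row["quantity"] = seen_by_second + 3  # second writes 10; the decrement is lost
    return row["quantity"]


def run_serial(initial=7):
    """The second transaction starts only after the first has finished."""
    row = {"quantity": initial}
    row["quantity"] = row["quantity"] - 1  # first transaction: 7 -> 6
    row["quantity"] = row["quantity"] + 3  # second transaction: 6 -> 9
    return row["quantity"]


print(run_interleaved())  # 10 (incorrect)
print(run_serial())       # 9 (correct)
```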

What Can Be Done to Prevent a Lost Update?

One recommended method for preventing lost updates is to use optimistic concurrency control to perform what is called optimistic locking on the data. Optimistic concurrency control typically uses four phases in order to help to ensure that data isn’t lost:

Begin - A timestamp is recorded to pinpoint the beginning of the transaction.

Modify - Read values and make writes tentatively.

Validate - Make a check to ensure that other transactions have not modified any data that is used by the current transaction (including any transactions that have completed or are still active after the current transaction's start time).

Commit or Rollback - If there are no conflicts, the transaction can be committed. Otherwise, the transaction can be aborted (or other resolution methods can be employed) to prevent a lost update from occurring.

The image below demonstrates what happens when an optimistic lock is used.

An example of optimistic locking. Source: IBM developerWorks.
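One common way to implement the validate step is a version column that every write must match. The sketch below assumes that approach rather than the timestamp bookkeeping described above, and uses SQLite only so that it is self-contained. The `product` table, its columns, and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, quantity INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO product VALUES (1, 7, 0)")
conn.commit()


def read(product_id):
    """Read the current quantity together with the version it was read at."""
    return conn.execute(
        "SELECT quantity, version FROM product WHERE id = ?", (product_id,)
    ).fetchone()


def try_write(product_id, new_quantity, expected_version):
    """Validate and write in one statement: no row matches if the version moved."""
    cur = conn.execute(
        "UPDATE product SET quantity = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_quantity, product_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1


def adjust(product_id, delta, retries=3):
    """Re-read and retry when validation fails, instead of overwriting."""
    for _ in range(retries):
        quantity, version = read(product_id)
        if try_write(product_id, quantity + delta, version):
            return True
    return False


# Reproduce the interleaving from the lost-update example.
q1, v1 = read(1)                      # first transaction sees 7
q2, v2 = read(1)                      # second transaction also sees 7
first_ok = try_write(1, q1 - 1, v1)   # succeeds: quantity is now 6
second_ok = try_write(1, q2 + 3, v2)  # rejected: the version it read is stale
retried = adjust(1, +3)               # re-reads 6 and writes 9
final_quantity = read(1)[0]
print(first_ok, second_ok, retried, final_quantity)
```

The stale write is refused rather than silently applied, and the retry works from the current value, so the quantity ends at 9 instead of 10.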

Get Your Own Hosted Database

Whatever your strategy is for preventing lost updates, you will want reliable and stable database hosting. Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard. With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

When It Makes Sense to Combine MongoDB and Redis


 

How two MongoDB database developers cured performance bottlenecks by offloading transactions to Redis.

TL;DR: Not all real-world data-management problems can be addressed by a single DBMS. These two database developers addressed performance bottlenecks in their MongoDB databases, and avoided expensive hardware upgrades, by offloading some time-sensitive transactions to Redis in-memory databases that were subsequently synched with their corresponding MongoDB documents.

Henry Ford is famous for telling the people buying his Model T that they could have any color they wanted, so long as it was black. For years the database industry was not much different: You could have any database you wanted, so long as it was SQL.

Today it seems there are new databases appearing every time you turn around, few of which bear much resemblance to the traditional relational model. At the same time, organizations are collecting and using more data -- and more data types -- than ever before. Timely, accurate data analysis is critical to every company's success.

Rarely is one database going to fit the bill. That makes the ability to integrate with other databases a key feature of every DBMS on the market. More and more organizations are meeting their data-management needs by combining MongoDB's ability to accommodate unstructured data with Redis's in-memory performance.

A MongoDB database undone by a simple document counter

A January 30, 2014, post by DJ Walker-Morgan on the Compose blog describes a company that was facing the prospect of an expensive server upgrade to accommodate an increase in database writes. Each time the application ran a task, counters in the documents increased by one, which caused a write operation.

The system was maxing out at 1500 updates per second. Sharding the data across multiple MongoDB instances would allow 1500 more updates per second for each shard, but doing so would require the time and expense of more CPUs. Instead, the company reworked the application to send increment requests to a Redis in-memory database, mapped to the correct key and field, using the "hincrby" command. Redis's key/value store offers a high transaction rate and low data storage that's perfect for such incremental operations. The Redis store is ultimately harvested, aggregated, and saved in a MongoDB document.
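The pattern can be sketched without a running server. In the code below, `CounterBuffer` is an in-memory stand-in that models only the increment-and-return behavior of the Redis HINCRBY command, and a plain dict stands in for the MongoDB collection. The class, the `flush_to_documents` helper, and the `task:42` key are hypothetical and are not taken from the Compose post.

```python
from collections import defaultdict


class CounterBuffer:
    """In-memory stand-in for a Redis hash store; only hincrby is modeled."""

    def __init__(self):
        self._hashes = defaultdict(lambda: defaultdict(int))

    def hincrby(self, key, field, amount=1):
        # Like the Redis HINCRBY command: increment a hash field, return the new value.
        self._hashes[key][field] += amount
        return self._hashes[key][field]

    def drain(self):
        """Return every buffered count and reset the buffer."""
        snapshot = {key: dict(fields) for key, fields in self._hashes.items()}
        self._hashes.clear()
        return snapshot


def flush_to_documents(buffer, documents):
    """Harvest the buffer and apply one aggregated write per document."""
    writes = 0
    for doc_id, fields in buffer.drain().items():
        doc = documents.setdefault(doc_id, {})
        for field, amount in fields.items():
            doc[field] = doc.get(field, 0) + amount
        writes += 1
    return writes


buffer = CounterBuffer()
documents = {"task:42": {"runs": 100}}  # stand-in for a MongoDB collection

for _ in range(1500):                   # 1,500 task runs touch only the buffer
    buffer.hincrby("task:42", "runs")

writes = flush_to_documents(buffer, documents)
print(writes, documents["task:42"]["runs"])
```

Here 1,500 increments reach the document store as a single write. The trade-off is that counts not yet harvested live only in the in-memory store.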

MongoDB-Redis combo takes the sting out of keeping score

Karl Seguin, the creator of an open-source leaderboard component for game developers, improved the performance of score-keeping by combining a MongoDB data store with Redis's sorted sets. Seguin describes how he did so on the Practical NoSQL blog.

When you enter a key, a score, and a value in Redis, it supplies simple and efficient ranking methods. The key is the leaderboard id, the scope, and the scope+leaderboard start time. Daily scores for each leaderboard id are added to an ordered set. The "member" parameter ensures that when values are updated, only the member's highest score is recorded.

Redis's sorted sets simplify the process of ranking players in game apps. Source: Practical NoSQL
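The ranking behavior described above can be modeled in a few lines. This is an in-memory stand-in and not Seguin's component: it keeps only each member's best score, as the text describes, and recomputes the order on demand, whereas a real Redis sorted set maintains it incrementally. The class name, the key format, and the player names are assumptions.

```python
class Leaderboard:
    """In-memory stand-in for one sorted set, keyed by member."""

    def __init__(self):
        self._scores = {}

    def submit(self, member, score):
        # Record the score only if it beats the member's previous best.
        if score > self._scores.get(member, float("-inf")):
            self._scores[member] = score

    def top(self, count):
        """Highest scores first; ties are broken by member name."""
        ordered = sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))
        return ordered[:count]

    def rank(self, member):
        """1-based rank: one plus the number of members with a higher score."""
        score = self._scores[member]
        return 1 + sum(1 for other in self._scores.values() if other > score)


def board_key(leaderboard_id, scope, start):
    """Hypothetical key layout: leaderboard id, scope, and scope start time."""
    return f"{leaderboard_id}:{scope}:{start}"


boards = {}
key = board_key("lb1", "daily", "2015-02-20")
board = boards.setdefault(key, Leaderboard())
board.submit("ann", 50)
board.submit("bob", 70)
board.submit("ann", 40)  # lower than ann's best, so it is not recorded
board.submit("cy", 60)
print(board.top(2), board.rank("ann"))
```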

Seguin estimates that the time required to complete a read operation on a leaderboard with 5 million records was reduced by the new technique from approximately 5 minutes to only 147 milliseconds. Still, a score update could require as many as 8 operations: 1 read, 4 MongoDB writes, and 3 Redis writes. However, there are far fewer writes than reads overall, and the performance boost didn't require any new hardware, although it did create the need to manage a second data store.

With the new Morpheus Virtual Appliance you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. The Morpheus database-as-a-service (DBaaS) lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

Morpheus lets you migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

How to Ensure MongoDB Security Options Are Enabled


MongoDB's recent releases add the authentication, auditing, and other management controls serious business databases require.

TL;DR: Protect against data breaches by ensuring the industrial-strength security features built into the latest releases of MongoDB are configured correctly. These include user authentication, audit trails, encryption, and environment/process controls.

Any commercial database has to have security built in. One of the early knocks on the open-source MongoDB NoSQL database was that it lacked the management and security features of Oracle and other relational DBMSs.

The release of MongoDB 2.6 in 2014 addressed these concerns by upgrading the MongoDB Management Service (MMS). Java World's Brian Crucitti writes in an April 10, 2014, article that version 2.6 added continuous backup, point-in-time recovery, and monitoring and alerts on more than 100 parameters. Other new security features in version 2.6 included authentication/authorization, field-level security, and encryption.

The new version 3.0 of MongoDB due in March builds on these enhancements by adding a pluggable storage engine API that allows multiple storage engines to coexist within a single replica set, according to the company. The new release's WiredTiger storage engine features document-level locking, as InfoQ's Alex Giamas explains in a February 20, 2015, article.

The WiredTiger storage engine is said to be seven to 10 times faster than its predecessor, and it compresses data 80 percent more efficiently than earlier releases, according to MongoDB's Eliot Horowitz, as quoted by ZDNet's Toby Wolpe in a February 3, 2015, article.

MongoDB data disclosure highlights built-in security features

In a February 10, 2015, post, Information Age's Ben Rossi reports that three students at Saarland University in Germany discovered 40,000 unsecured MongoDB databases on commercial servers. MongoDB points out in a February 13, 2015, follow-up post that the breached databases failed to enable the database's built-in security features, which would have precluded such vulnerabilities.

German university students discovered 40,000 publicly accessible MongoDB databases on commercial servers throughout the world. Source: Jens Heyens, Kai Greshake, and Eric Petryk, via Information Age

It is standard operating procedure to implement access controls whenever a database moves from a closed development environment to the public domain. The four essentials of any commercial database are authentication, operational audit trails, encryption at the communication and storage layers, and environment/process controls.

As part of the security-first mantra, the most popular MongoDB installer, RPM for the RedHat and CentOS Linux distributions, creates a process that restricts access to localhost. MongoDB also supports less-restrictive configurations that prevent unauthorized access such as those by the German students. (The MongoDB site features a tutorial for installing the database on Red Hat Enterprise, CentOS, Fedora, and Amazon Linux.)

Among the MongoDB security options are to configure a firewall to block client access to shards, and to configure Mongos to capture only local traffic. Source: IBM DeveloperWorks

The MongoDB Manual features a Security Checklist that includes authentication, role-based access control, encrypting data communications and storage, restricting network access, auditing system activity, running MongoDB with a dedicated user, secure configuration options, and security standards compliance. For deployments within the U.S. Department of Defense, you can request a Security Technical Implementation Guide.
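Two of the checklist items, authentication and restricted network access, can be checked mechanically. The following is a minimal sketch, assuming the deployment's settings have already been parsed into a nested dictionary that mirrors the `net.bindIp` and `security.authorization` options of MongoDB's YAML configuration file. The function name and the wording of the findings are illustrative and are not part of any MongoDB tool.

```python
def audit_mongod_config(config):
    """Return a list of warnings for a parsed mongod configuration.

    `config` is assumed to be a nested dict mirroring the YAML config file,
    e.g. {"net": {"bindIp": "127.0.0.1"}, "security": {"authorization": "enabled"}}.
    """
    findings = []

    # Access control has to be switched on explicitly.
    authorization = config.get("security", {}).get("authorization", "disabled")
    if authorization != "enabled":
        findings.append("security.authorization is not enabled")

    # Assumption: in the releases discussed here, an unset bindIp means the
    # server listens on every interface, so it is treated like 0.0.0.0.
    bind_ip = config.get("net", {}).get("bindIp", "")
    addresses = [a.strip() for a in bind_ip.split(",") if a.strip()]
    if not addresses or "0.0.0.0" in addresses:
        findings.append("net.bindIp allows connections on all interfaces")

    return findings


locked_down = {"net": {"bindIp": "127.0.0.1"}, "security": {"authorization": "enabled"}}
wide_open = {"net": {"bindIp": "0.0.0.0"}}

print(audit_mongod_config(locked_down))
print(audit_mongod_config(wide_open))
```

A check like this covers only two items on the list; encryption, auditing, and process controls still need to be reviewed separately.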

When it comes to database security, the new Morpheus Virtual Appliance has you covered front, back, and sideways. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.


Cloud-based Continuous Integration Projects Prove the Concept


Hosted continuous integration services support fast and efficient app development by combining an army of components.

TL;DR: Can the production-line principles of manufacturing industries be applied to application development? New cloud-based continuous integration services are putting the proposal to the test. They work with a range of tools and platforms to automate and systematize any and all coding tasks.

"Continuous integration" is one of those things that sound like a good idea, until you try them. In the real world, you never have a "perfect" version of an application out in the field. Fortunately, the "good enough" programs created using cloud-based continuous-integration services come together quicker and simpler than their offline counterparts.

The pre-eminent CI server is Jenkins; as Tech Republic's Nick Hardiman writes in a February 3, 2015, article, alternatives include Atlassian Bamboo, CruiseControl, and JetBrains TeamCity. (An extensive list of continuous-integration products is available on Wikipedia.) Among the cloud versions of popular IDEs are Cloud9, Compilr, and Nitrous. The hosted-CI services Codeship, Semaphore, and Travis can read from GitHub and write to Heroku.

In a January 5, 2015, post, Yegor Bugayenko updates his thorough comparison of 13 cloud CI services, which he originally posted in October 2014. Bugayenko's top four services are Travis ($129 per month), Appveyor ($39 per month), Wercker (free), and Shippable ($1 a month). Appveyor is the only one of the 13 to run Windows builds. Bugayenko notes that although Java and Ruby are considered platform independent, builds that run on Linux often don't pass on Windows or Macs.

The 13 hosted CI services run primarily on Linux and cost from nothing to $129 per month. Source: Yegor Bugayenko

Large development projects put cloud CI services to the test

One of the largest open-source projects is OpenStack. The developer community supporting OpenStack has created a system for code review, testing, and continuous integration. Among the infrastructure's components are unit tests, functional tests, integration tests, a patch review system, and automatic builds. Adalberto Medeiros describes the project in a September 23, 2014, post on IBM's ThoughtsOnCloud.

The OpenStack CI environment relies on such tools as DevStack, Grenade, and Tempest to automate the process. The Gerrit build and patch review system works with these tools to verify every proposed change to the project.

The OpenStack CI process integrates diverse components to verify changes before they are added to OpenStack projects. Source: IBM ThoughtsOnCloud

Gerrit uses Zuul to check all patches for dependencies, and to merge the changes once they pass. The Jenkins build automation system handles the changes as jobs, Nodepool creates the images and VMs to run the jobs, and Jenkins Job Builder automates the creation of the jobs required to test the environment. (Miguel Zuniga's presentation Continuous Integration and Deployment Using OpenStack can be viewed on the OpenStack site.)

The reality for a great number of CI projects is that developers ignore the build status warnings, as Bugayenko explains in an October 29, 2014, post entitled Continuous Integration Is Dead. This is a follow-up to the September 26, 2014, post entitled Why Continuous Integration Doesn't Work.

Bugayenko's proposed solution is to create a read-only master branch that prohibits anything from being merged into the master. When anyone proposes a change, a script is created that will merge, test, and commit. Any branch that breaks a unit test causes the entire branch to be rejected. Bugayenko claims that such a system makes coders immediately responsible for the code they write.

Fast and efficient management of heterogeneous databases is the cornerstone of the new Morpheus Virtual Appliance. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.


Cloud Data Centers Are Poised to Take the Spotlight


More and more companies are moving some or all of their data-center operations to specialty cloud-based services.

TL;DR: IT departments dealing with the increase in mobile network end points, and the corresponding tsunami of data, are turning to cloud data centers to complement rather than replace their on-site data processing. Workloads of data-center-as-a-service operations are expected to pass traditional data-center workloads by the end of the decade.

The traditional data center, with row after row of server racks and miles of cabling, is officially an endangered species. Taking its place are industrial-strength mega facilities that offer data-center-as-a-service to organizations of all sizes.

That's one of the conclusions of the Cisco Global Cloud Index, which forecasts that cloud data center traffic will increase at a 32 percent compound annual growth rate through 2018, compared to a CAGR of 8 percent for traditional data center traffic. The cloud's share of overall traffic will increase from 54 percent in 2013 to 76 percent in 2018.

Cloud data center traffic will increase at a 32 percent CAGR through 2018 and will account for 76 percent of all traffic by that year. Source: Cisco Global Cloud Index

Cisco's study projects that 78 percent of all data center workloads will be processed in the cloud by 2018, with cloud workloads increasing at a CAGR of 24 percent. Over the same period, traditional data center workloads will decline at a rate of 2 percent per year.

A primary driver of this transition is the growing popularity of virtualization, which increases workload density (the average number of workloads per physical server). Cloud server workload density is forecast to grow from 5.2 in 2013 to 7.5 in 2018. For traditional data center servers, workload density will increase from 2.2 in 2013 to 2.5 in 2018.
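To put the growth figures above in perspective, the compound annual growth rate can be worked out directly from a start value, an end value, and the number of years. The sketch below applies the standard CAGR formula to the workload density numbers and to the 32 percent traffic forecast; the function names are illustrative, and the results are derived here rather than quoted from Cisco.

```python
def cagr(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


def grow(start, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start * (1 + rate) ** years


years = 2018 - 2013

# Workload density figures quoted above.
cloud_density_cagr = cagr(5.2, 7.5, years)
traditional_density_cagr = cagr(2.2, 2.5, years)

# A 32 percent CAGR compounds to roughly a fourfold increase over five years.
cloud_traffic_multiple = grow(1.0, 0.32, years)

print(f"cloud density CAGR:       {cloud_density_cagr:.1%}")
print(f"traditional density CAGR: {traditional_density_cagr:.1%}")
print(f"cloud traffic multiple:   {cloud_traffic_multiple:.2f}x")
```

By this arithmetic, cloud workload density grows about three times as fast per year as density on traditional servers.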

On-premises data centers to evolve rather than go extinct

Even with the shift to data processing in the cloud, there's still plenty of life left in the on-site data center. First and foremost, companies have made considerable investments in their IT departments, and they won't be walking away from that investment without having several darn good reasons.

A December 10, 2014, article by Upsite Technologies explains that the typical data center goes from 5 to 15 years between major upgrades. In the interim, IT departments will focus on discrete, small-scale proof-of-concept projects to determine how the use of cloud data centers can support their existing operations. The real change happens when a data center reaches its end-of-life stage.

Spending on data-center infrastructure by colocation and outsourcing services will eclipse end-user equipment investments by 2020. Source: DCD Intelligence

Of course, industry trends have a way of shortening technology lifecycles. The overall amount of data being generated, the diverse types of data, and the growing number of network end points created by a more mobile workforce all conspire to push existing data infrastructures to the breaking point and beyond.

IT Business Edge's Arthur Cole writes in a February 25, 2015, article that construction of so-called hyperscale facilities is booming, while spending on data center construction is flat. The Open Data Center Alliance has issued guidelines for implementing a cloud infrastructure that focuses on a seamless transition for IT consumers and managers alike.

The new Morpheus Virtual Appliance offers companies of all sizes a giant step forward in their transition to cloud services. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.


Redis Production Checklist


Don't Go Into Redis Production Without this Checklist

Are you going into production with Redis? Make sure you have done everything on this checklist!

TL;DR: If you are going into production with Redis, it is a good idea to check over a few things before you go live. For production, you will want to be sure everything is configured properly and that maintenance will be as easy as possible for your team.

With that in mind, here is a checklist you can use to help you better prepare for using Redis in production.

Run the Redis benchmark

One test you can run is the Redis benchmark. The benchmark will perform a stress test on your Redis installation to ensure everything will run smoothly with your current settings. An example of running the benchmark from the command line is shown below.

An example of running the Redis benchmark.

In this case, the test will run 100,000 requests from 50 different clients, which are all sending 10 commands at once.
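The original screenshot is not reproduced here, so the following is a sketch of how the figures described above map onto `redis-benchmark` options, assuming the run used `-n` for the request count, `-c` for the number of clients, and `-P` for the pipeline depth. The host, port, and the `-q` quiet flag are added for illustration.

```python
import shlex


def benchmark_command(requests=100_000, clients=50, pipeline=10,
                      host="127.0.0.1", port=6379):
    """Build the argument list for a redis-benchmark run."""
    return [
        "redis-benchmark",
        "-h", host,
        "-p", str(port),
        "-n", str(requests),   # total number of requests
        "-c", str(clients),    # parallel client connections
        "-P", str(pipeline),   # commands pipelined by each client
        "-q",                  # quiet mode: print only the summary figures
    ]


command = benchmark_command()
print(shlex.join(command))
# To actually run it (requires the Redis tools and a reachable server):
# import subprocess; subprocess.run(command, check=True)
```

Run the benchmark against a staging server with production-like settings rather than against the live instance, since the stress test competes with real traffic.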

Firewall the Redis port

The Redis port should only be directly accessible to the specific computers that are being used to implement your Redis application. As a result, the Redis port should be restricted by a firewall to prevent outside (and potentially unwanted) access to the system.

Set an authentication password

Enabling the Redis authentication layer causes queries from unauthenticated clients to be refused. To be authenticated, the client must send an AUTH command with the correct password.

This can act as a redundant layer of security in case, for instance, the firewall fails. The authentication password can be set by a system administrator inside the redis.conf file.

Backup and logging

Backups and logs are always good to have if something should go wrong. Redis provides two options, which can be used individually or both at the same time.

RDB (Redis Database File) - With RDB enabled, you will have access to point-in-time snapshots of the dataset, which allows you to easily restore the dataset if needed. However, since snapshots are taken at intervals, you may lose some data between the last snapshot and the incident.

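AOF (Append Only File) - With AOF enabled, Redis will log all write operations. Because each write is appended to the log as it happens, you can recover far more recent data than with snapshots alone. How much can still be lost depends on how often the log is flushed to disk: with the default once-per-second fsync policy, at most about one second of writes.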

It is often recommended that both of these options be enabled, so that you have both available when you need to recover data after an incident.

Disable potentially harmful commands

Some commands could be harmful in the wrong hands, so disabling them in production may be a good idea. Some of these could include:

  • FLUSHALL - removes all keys from all databases
  • FLUSHDB - removes all keys from the current database
  • CONFIG - allows runtime server configuration
  • SHUTDOWN - shuts down Redis

To disable a command, simply rename it to an empty string.

Disabling a Redis command by renaming it to an empty string.
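The password, persistence, and command-disabling items from this checklist all end up as directives in redis.conf. As a rough sketch, the helper below renders those directives (`requirepass`, `save`, `appendonly`, and `rename-command`) from a few inputs; the function name, the snapshot rule of 900 seconds and 1 change, and the placeholder password are assumptions chosen for illustration.

```python
def render_redis_conf(password, disabled_commands,
                      snapshot_rules=((900, 1),), use_aof=True):
    """Render redis.conf lines for the password, persistence, and command items."""
    lines = [f"requirepass {password}"]

    # RDB: snapshot after `seconds` if at least `changes` keys were modified.
    for seconds, changes in snapshot_rules:
        lines.append(f"save {seconds} {changes}")

    # AOF: log every write operation.
    lines.append(f"appendonly {'yes' if use_aof else 'no'}")

    # Renaming a command to the empty string disables it.
    for command in disabled_commands:
        lines.append(f'rename-command {command} ""')

    return "\n".join(lines)


conf = render_redis_conf(
    "use-a-long-random-secret",
    ["FLUSHALL", "FLUSHDB", "CONFIG", "SHUTDOWN"],
)
print(conf)
```

Keep in mind that disabling CONFIG and SHUTDOWN also removes them for administrators, so some teams rename them to a hard-to-guess string instead of an empty one.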

Get Your Own Hosted Database

Once you have your Redis implementation prepared for production, you will want reliable and stable database hosting. Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard.

With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Bimodal IT: The Future of the Data Center or Another Empty Buzzword?


Few ideas have polarized IT managers more than bimodal IT: Is it the key to modernizing the data center, or a sure-fire recipe for disaster? Some analysts claim adoption of the model -- which runs agile and legacy development projects in parallel -- is well underway. Others assert that the model is a rehash of an inherently flawed approach that dates to the beginning of the century. The upshot: There's more than one way to manage change.

Depending on who you believe, bimodal IT is either the cure for everything that ails data centers, or a lame attempt to preserve the data silos that threaten the survival of companies of all sizes and types. As usual, the truth is positioned somewhere between these two extremes.

Technology research firm Gartner is credited as the creator and chief promoter of the concept. In a January 14, 2015, article on ZDNet, Adjuvi Chief Strategy Officer Dion Hinchcliffe presents bimodal/multimodal IT as a way for organizations to adopt agile development and other new technologies quickly, while retaining the reliability, stability, and security of traditional IT processes.

In the bimodal approach to IT, the agile development model is implemented in parallel with conventional IT workflows. Source: ZDNet

According to Hinchcliffe, Gartner's bimodal IT architecture is one mode too few. He cites the trimodal design proposed by Simon Wardley, which Wardley explains in a November 16, 2014, post on the Gardeviance blog. Wardley labels the three modes as pioneers, settlers, and city planners; the added twist is an overlay of self-organizing cell-based structures in which each cell conforms to the two-pizza model (a development team small enough to feed with only two pizzas -- hold the anchovies).

The trimodal IT model divides pioneers, settlers, and city planners, each of which is subdivided into cells for easier manageability. Source: Simon Wardley

Wardley criticizes the bimodal-IT concept in a November 13, 2014, post on the same blog, stating that it's "2004 dressed up as 2014 and it is guaranteed to get you into a mess." The agile developers work fast and don't mind errors -- in fact, they depend on them. The traditional developers work slowly and deliberately, and they have a low tolerance for errors. This is bound to lead to a stalemate, according to Wardley.

Bimodal IT: 'Rock-solid fluidity' or 'balderdash'?

One of bimodal IT's most vehement critics is Intellyx President Jason Bloomberg, who calls the concept "balderdash." Bloomberg's October 12, 2014, post on the company's blog states that Gartner is simply telling its clients what they want to hear rather than what they need to hear. Bimodal IT, he claims, is nothing more than an excuse for continuing to do IT poorly.

Bloomberg admits that change is difficult and expensive, and there's no need to fix what isn't broken. However, change is occurring throughout organizations at a rapid pace -- mostly originating from outside the data center. The need to maintain compliance, security, and other governance persists when IT modernizes, but governance must be done in a more agile, automated way.

Gartner Fellow Daryl Plummer counters this criticism by pointing out that adoption of the bimodal-IT model is well underway. In an October 6, 2014, press release, Plummer claims it is both "rock solid" and "fluid"; he states that 45 percent of CIOs report they now have a "fast mode of operation." Gartner projects that 75 percent of IT organizations will have some form of bimodal in place by 2017.

When you cut through all the rhetoric, what you're left with is the need to get users the data they need to thrive and ensure the company achieves its goals. For managing heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases, there's no more efficient, effective, and affordable way than by using the new Morpheus Virtual Appliance, which combines all the controls users need in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

The Importance of API Management to Hybrid Clouds


APIs assume the gatekeeper function to ensure only the appropriate business assets are accessible to the public.

TL;DR: Hybrid clouds provide data managers with an unprecedented level of flexibility in shuffling data between public and private clouds as needs dictate. APIs are the conduit that makes certain company data available to partners, customers, and other external sources, while also securing the organization's sensitive information.

API management is serious business. The consequences of bad API management can be dire. In a February 27, 2015, post on the Expert Integrated Systems blog, IBM's Claudio Tagliabue cites the example of Moonpig, a UK-based service whose vulnerable API exposed its customers' credit-card and other sensitive information. Paul Price explains the vulnerability in a January 5, 2015, post.

Tagliabue compares good API practices to service-oriented architecture principles: data consistency, performance, and granularity. Hybrid cloud services depend on APIs -- particularly RESTful APIs -- to expose select assets of the business to the public, and to place private assets behind a firewall.

Hybrid clouds rely on APIs to ensure the right assets are exposed to partners, customers, and the public. Source: Claudio Tagliabue

Rob Zazueta, who works for API management service Mashery, identifies three "pillars" of API management: security, scalability, and support. Zazueta is quoted by Forbes' Adrian Bridgwater in a February 12, 2015, article.

In terms of security, use of the OAuth standard for controlling access makes things simpler for developers. The API must also support throttling to control the flow of traffic through the backend, and caching to ensure fast response to the most common requests. In terms of support, Zazueta claims the best thing you can do for developers is allow them to request the access they need to a controlled set of data directly via a developer portal.
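Throttling, mentioned above, is often implemented with a token bucket, which permits short bursts while capping the average request rate. The sketch below is one common way to do it and is not taken from Mashery or from Zazueta's remarks; the class name, rate, and capacity are illustrative, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow about `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def allow(self, now):
        # Refill in proportion to the time elapsed since the last call.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=2, capacity=3)

# Five requests arrive at t=0: only the burst allowance of three gets through.
burst = [bucket.allow(now=0.0) for _ in range(5)]
print(burst)

# One second later, two tokens have been refilled.
later = [bucket.allow(now=1.0) for _ in range(3)]
print(later)
```

In a real gateway there would typically be one bucket per API key, so a single heavy consumer cannot starve the others.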

APIs are making middle managers an endangered species

When a company has few employees, it stands to reason it has fewer managers. When that company is Uber, there's little need for any middle management. ProgrammableWeb Editor in Chief David Berlind posits in a February 4, 2015, article that APIs are removing the bottom rung on the corporate ladder leading from the front line to management. Berlind is responding to an earlier post by Segment CEO Peter Reinhardt on the use of APIs by contractors for such services as Uber, Lyft, 99designs Tasks, and HomeJoy.

Contractors for services such as Uber interact with an API in place of a human manager. Source: Uber

The contractors are managed by the APIs, and the services are driven to minimize the cost of executing the API methods. Reinhardt expects API integration to continue, resulting in the automation of such human endeavors as flipping a house by combining Redfin's API to buy a house and a Zirtual assistant to manage the house's renovation.

APIs are also key to extending data centers to the cloud. Data Center Knowledge's Bill Kleyman writes in a March 2, 2015, article that APIs are integrated with data center management consoles. For example, the Neutron networking component of OpenStack Havana integrates directly with OpenFlow to enhance multi-tenancy and cloud scaling.

The new Morpheus Virtual Appliance is designed to make working with public, private, and hybrid clouds a breeze. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.


When to Choose Redis over MongoDB


In certain situations, you may wish to choose Redis over MongoDB when you decide which database to use. Find out when it is best to use Redis over MongoDB.

TL;DR: When choosing a database, there are a number of NoSQL solutions from which you can choose. Among these, both Redis and MongoDB are popular choices as databases for various applications.

Both databases have their strengths and weaknesses, so knowing these can help you choose which one will best suit the needs of your application. If you are considering Redis, when should you favor using it over MongoDB?

MongoDB Strengths

MongoDB excels when you are going to be storing and querying extremely large datasets. This is especially true if you will need to continually add more data to the dataset, as MongoDB makes scaling large amounts of data a fairly easy task.

In addition, MongoDB is a good choice when you want your data organized as a document store. This allows for a hierarchical structure where you can have documents within documents in order to easily locate necessary data. For example, if you store a blog post which contains comments, the comments could naturally be subdocuments of the blog post content document.

An example of a MongoDB query. Source: MongoDB.
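The blog-post example above can be made concrete with a plain Python dictionary, which has the same shape as the document a MongoDB driver would store. This is only a sketch: the field names, the sample comments, and the `comments_by` helper are hypothetical, and no database connection is involved.

```python
# A blog post stored as one document, with comments embedded as subdocuments.
post = {
    "_id": 1,
    "title": "Choosing a NoSQL database",
    "body": "Post text goes here.",
    "comments": [
        {"author": "ana", "text": "Helpful overview."},
        {"author": "raj", "text": "What about real-time data?"},
    ],
}


def comments_by(document, author):
    """Return the embedded comments written by `author`."""
    return [c for c in document["comments"] if c["author"] == author]


print(len(post["comments"]))
print(comments_by(post, "raj"))
```

Because the comments travel with the post, a single read returns everything needed to render the page.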

Redis Strengths

Redis excels at providing speed for a database where the size is generally well-known. It does this by placing the database in memory, which allows for high-performance caching and extremely fast location of the needed data.

Redis is organized as a key-value store, so you can quickly access data using the known key. This makes it ideal for applications that require real-time functionality. For example, if you display real-time stock prices, you can use Redis to rapidly get the latest stock price by its key and get it displayed to the user.

Redis values are not limited to plain strings: it also supports richer data structures such as lists, sets, and hashes. These closely resemble the equivalent structures in many programming languages, which can make the learning curve a bit easier for developers.

An example of a Redis list. Source: Redis.
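To show how key lookups and lists behave, here is a small in-memory stand-in that mimics the semantics of a few Redis commands (SET, GET, LPUSH, and LRANGE). It is not a Redis client and talks to no server; the class name and the ticker keys are made up for the stock-price example above.

```python
from collections import deque


class TinyStore:
    """In-memory stand-in that mimics a few Redis commands; not a Redis client."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def lpush(self, key, *values):
        # Like LPUSH: each value is pushed onto the head of the list in turn.
        items = self.data.setdefault(key, deque())
        for value in values:
            items.appendleft(value)
        return len(items)

    def lrange(self, key, start, stop):
        # Like LRANGE with non-negative indexes: `stop` is inclusive.
        items = list(self.data.get(key, ()))
        return items[start:stop + 1]


store = TinyStore()
store.set("price:ACME", "41.27")   # hypothetical ticker key
print(store.get("price:ACME"))

store.lpush("recent:ACME", "41.10", "41.19", "41.27")
print(store.lrange("recent:ACME", 0, 1))
```

The latest price is a single lookup by key, and the two most recent prices are a range read from the head of the list.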

When to Choose Redis

In certain situations, Redis will be a better choice than MongoDB. Most likely, you will want to choose Redis if you meet most of these conditions:

  • You will have a generally consistent and knowable database size (so that the database can fit all or mostly in memory).
  • Speed will be extremely important and you will be able to use a known key for data retrieval.
  • You want to store data for real-time display or real-time analytics.
  • You want to use the data structures Redis supports, such as strings, lists, sets, and hashes.

With this in mind, you should be able to determine whether Redis will be your choice when compared to MongoDB!

Get Your Own Hosted Database

Whether you choose Redis or MongoDB for your application, you will want reliable and stable database hosting. Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard.

With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

How the Cloud Complements In-memory Databases


In-memory databases and cloud-hosted apps are finding favor for their speed and convenience, particularly when used in tandem.

TL;DR: It's always a bonus when two promising, up-and-coming technologies fit well together. That appears to be the case for in-memory databases and cloud-hosted applications, both of which are noted for their performance and accessibility. An example is the use of Redis to create a low-level cache that improves the performance of Rails models.

The outlook is rosy for in-memory databases, although they won't make their disk-based counterparts obsolete anytime soon. Eventually, however, DRAM prices will drop low enough to make in-memory databases suitable for bread-and-butter applications, not just the forecasting and planning apps that benefit most from in-memory's high performance.

Companies already use a mix of fixed-disk, solid-state, flash, and cloud storage. The ability to find just the right combination of storage options for your changing data loads is the key to keeping performance high and costs low. When it comes to speed, in-memory databases and cloud services have a distinct advantage over in-house fixed-disk servers.

In a recent interview, Oracle's Rich Clayton disparaged in-house IT, claiming it is too slow for planning and forecasting applications. Clayton is quoted by Diginomica's Phil Wainewright in a February 17, 2015, article as saying that within two years, up to 80 percent of planning and forecasting apps will be running in the cloud, although operational financials will remain the province of in-house systems.

The need for speed is also what's driving adoption of in-memory databases, which will generate $95 billion in annual revenue by the end of 2018, according to Gartner. Information Age's Ben Rossi reports on the in-memory database forecast in a February 27, 2015, article.

Protecting against in-memory data loss due to a server outage

Analysts claim the biggest shortcoming of in-memory databases is their reliance on volatile memory. Gartner analyst Donald Feinberg points out that it's much more difficult to configure and manage a high-availability server for in-memory databases than it is for databases stored on fixed disks. Feinberg is quoted by Computing's Graeme Burton in a September 22, 2014, article.

One solution for protecting against data loss in in-memory databases is to implement remote direct memory access (RDMA), which allows the database to write to a second server. While doing so introduces some latency, Feinberg claims RDMA doesn't affect database performance.

Remote direct memory access allows an in-memory database to write to two servers simultaneously. Source: IBM

Create a low-level cache layer in Rails with Redis

One of the most practical uses of an in-memory database is to create a low-level cache to improve database performance. In a January 15, 2015, post on Sitepoint, Vasu K explains how to cache Rails models with Redis. While the primary source of Rails bottlenecks is in the View layer, a model-layer cache can boost database speed considerably.

Redis is particularly handy for this task because it is simple to set up and manage. Start by moving into the Redis app directory and executing the commands below:

Creating a low-level cache for Rails models starts by executing these commands from the Redis app directory. Source: Vasu K via Sitepoint

Once you've created your models and category listing page with category descriptions and tags, you start your browser and go to /category to view the mini-profiler that benchmarks execution times of actions performed at the backend. Next, use the Ruby client for Redis built into Rails to direct the database to use Redis as the cache store.

The Ruby client for Redis lets you instruct Rails to designate Redis as its cache store. Source: Vasu K via Sitepoint

One option for writing objects to Redis is to iterate over each property in the object, and then save the properties as a hash. However, it's faster and simpler to save them as a JSON-encoded string and read them back using JSON.load. Because the decoded objects are hashes rather than model instances, this necessitates updating the views to use the hash syntax to display the categories.
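The JSON approach can be sketched independently of Rails. The example below uses Python with a plain dictionary standing in for Redis, so it shows only the pattern (serialize on a miss, decode on a hit), not the code from the Sitepoint post; the class name, cache key, and `load_categories` function are hypothetical.

```python
import json


class JsonCache:
    """Cache-aside helper; `store` is any dict-like object standing in for Redis."""

    def __init__(self, store):
        self.store = store
        self.misses = 0

    def fetch(self, key, load):
        cached = self.store.get(key)
        if cached is not None:
            return json.loads(cached)      # hit: decode the stored string
        self.misses += 1
        value = load()                     # miss: run the expensive query
        self.store[key] = json.dumps(value)
        return value


def load_categories():
    # Hypothetical stand-in for a database query.
    return [{"name": "databases", "tags": ["redis", "mongodb"]}]


cache = JsonCache(store={})
first = cache.fetch("categories", load_categories)
second = cache.fetch("categories", load_categories)
print(first == second, cache.misses)
```

The second call is served from the cache, and what comes back is plain dictionaries and lists rather than model objects, which is why the display code has to switch to key-based access.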

The simplest way to manage in-memory databases along with SQL and NoSQL databases is by using the new Morpheus Virtual Appliance. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.



Improve the Performance of MongoDB Databases by Applying Tag-aware Sharding


Tag-aware sharding allows DBAs to optimize the performance of their MongoDB databases by helping the balancer organize shards so that a collection's data can be accessed quickly. You can apply tags based on how frequently the data is accessed, the physical location of the users or data center, and the amount of system memory the shard requires, among other data characteristics.

The performance of a multi-cluster MongoDB database is all about balance: each of the shards in a cluster should have the right number of chunks, and each chunk should be composed of related data. One of the best ways to achieve the right balance of chunks in a shard, and shards in a cluster, is by using tag-aware sharding.

The basics of tag-aware sharding are presented in a GitHub text file. When you run the balancer on a sharded collection, it migrates the collection's chunks to the shard associated with a tag whose shard key range has an upper bound greater than the chunk's lower bound. Chunks that violate the configured tag are moved to the appropriate shard.
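The placement rule can be modeled roughly as below. This is a simplified in-memory sketch, not MongoDB's balancer: it assumes integer shard keys, tag ranges with an inclusive lower and exclusive upper bound, and hypothetical tag and shard names.

```python
from typing import List, NamedTuple, Optional

class TagRange(NamedTuple):
    tag: str
    low: int    # inclusive lower bound of the shard key range
    high: int   # exclusive upper bound of the shard key range

def tag_for_chunk(chunk_low: int, ranges: List[TagRange]) -> Optional[str]:
    """Return the tag whose range covers the chunk's lower bound, if any."""
    for r in sorted(ranges, key=lambda r: r.low):
        if r.low <= chunk_low < r.high:
            return r.tag
    return None

def chunks_to_move(chunks, ranges, shard_tags):
    """List (chunk_low, current_shard, wanted_tag) for misplaced chunks."""
    moves = []
    for chunk_low, shard in chunks:
        tag = tag_for_chunk(chunk_low, ranges)
        # A chunk violates its tag when its current shard does not carry it.
        if tag is not None and tag not in shard_tags.get(shard, set()):
            moves.append((chunk_low, shard, tag))
    return moves

ranges = [TagRange("archive", 0, 1000), TagRange("recent", 1000, 2000)]
shard_tags = {"shard0": {"recent"}, "shard1": {"archive"}}
chunks = [(0, "shard0"), (500, "shard1"), (1500, "shard0")]
moves = chunks_to_move(chunks, ranges, shard_tags)
print(moves)
```

Only the chunk starting at key 0 is reported, because it falls in the "archive" range but sits on a shard tagged "recent".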

Apply a unique tag to each shard to move unsharded collections out of the primary shard for your database. Source: Ask Asya

In the real world, this relatively straightforward process can get complicated very quickly. The folks behind the Bugsnag web-monitoring tool found this out soon after applying tag-aware sharding to their MongoDB sharded cluster. Simon Maynard explains in an October 7, 2014, blog post that the company added tags for each of its sharded collections to address slow responses by its unsharded collections when the primary shard was getting a lot of hits.

Tags were applied only to Bugsnag's large shards, which were used to store crashes; users' collections were stored on a smaller machine with sufficient memory to hold the entire dataset. When old data was deleted, it left the shards out of balance because the balancing algorithm ignores the size of each chunk when it moves chunks across shards. While MongoDB 2.6 added a command that merges empty chunks with their neighbors, the process is manual. Maynard wrote a script to automate the process.
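The idea of folding empty chunks into their neighbors can be illustrated with a small in-memory model. This is not Maynard's script and makes no calls to MongoDB; it assumes each chunk is a dict with hypothetical `min`, `max`, `docs`, and `shard` fields, sorted by `min`.

```python
def merge_empty_chunks(chunks):
    """Merge each empty chunk into the preceding chunk on the same shard.

    The input list is assumed to be sorted by 'min' and is left unmodified.
    """
    merged = []
    for chunk in chunks:
        prev = merged[-1] if merged else None
        if (prev is not None
                and chunk["docs"] == 0
                and prev["shard"] == chunk["shard"]
                and prev["max"] == chunk["min"]):
            # Contiguous and empty: extend the previous chunk's range.
            prev["max"] = chunk["max"]
        else:
            merged.append(dict(chunk))
    return merged

chunks = [
    {"min": 0, "max": 10, "docs": 5, "shard": "A"},
    {"min": 10, "max": 20, "docs": 0, "shard": "A"},
    {"min": 20, "max": 30, "docs": 0, "shard": "A"},
    {"min": 30, "max": 40, "docs": 7, "shard": "B"},
    {"min": 40, "max": 50, "docs": 0, "shard": "B"},
]
merged = merge_empty_chunks(chunks)
print(len(chunks), "chunks reduced to", len(merged))
```

Fewer, fuller chunks give a balancer that counts chunks rather than bytes a more accurate picture of where the data actually lives.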

Maynard also wrote a script that resizes chunks that have become too large, and he explains how Bugsnag was able to optimize storage by eliminating orphan documents, removing chunks that were no longer necessary, and using shell commands to monitor shard distribution: db.collection.getShardDistribution(), db.stats(), and sh.status().

Tag-aware use cases: Archives, shard by location, shard to a specific server

The power and versatility of tag-aware sharding are highlighted in a November 5, 2014, post on the MongoDB blog by Francesca Krihely. For example, much of an organization's data is rarely accessed, so storing that data on high-performance hardware is wasteful. You can use tag-aware sharding to assign tags to various storage tiers, apply a unique shard key range, and have the documents moved to the appropriate shard during balancing.

Similarly, it isn't uncommon to want to store user information at a specific data-center location. The MongoDB Manual includes a tutorial on data-center awareness, but there are a few caveats that apply. For instance, since you can't change the value of a shard key, you'll have to delete and then reinsert the document for any user who changes location.

An example of the use of tag-aware sharding to place users in shards based on their location. Source: Albert Spijkers

One of the most useful applications of tag-aware sharding is for memory optimization. Collections with heavy indexing can be tagged to a physical server that has sufficient RAM to accommodate those shards.

One of the most efficient ways to monitor and optimize heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases is via the new Morpheus Virtual Appliance, which seamlessly provisions and manages SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds. Morpheus lets you bring up a new instance of a database in just seconds via a point-and-click interface.

A free full replica set is provisioned for each database instance, and your MySQL and Redis databases are backed up. Morpheus supports a range of database tools for connecting to, configuring, and managing your databases. Visit the Morpheus site to create a free account.

The Ups and Downs of Open-Source Project Sponsorships


Big-name tech companies often realize a tremendous return on their open-source investment. However, when a project's single sponsor has a sudden change in strategy, developers can be left in the lurch. The most recent examples of orphaned open-source projects are Groovy and Grails, which Pivotal plans to cease supporting as of March 31.

A lot of developers were caught by surprise when Pivotal announced on January 19, 2015, that it was ending its sponsorship of the open-source Groovy language for the Java Virtual Machine (JVM) and its Grails web development environment. In a January 20, 2015, post on I Programmer, Alex Armstrong states that Pivotal's decision demonstrates the dangers of relying on a single vendor to keep an open-source tool viable.

Pivotal's Mike Maxey explains the company's decision to end its sponsorship of Groovy and Grails at versions 2.4 and 3.0, respectively, as motivated by the company's plan to focus on commercial and open-source platform-as-a-service (PaaS), data, and agile development projects. Pivotal will honor existing commercial support contracts for Groovy and Grails beyond the March 31, 2015, cutoff date for its sponsorship.

As I Programmer's Armstrong points out, Pivotal's announcement is not the end of the world for Groovy and Grails. The community of developers using the tools will be able to support them in the short term, but ultimately deep-pocket sponsors will be needed to fund the development required to keep pace with the competition.

The egalitarian reputation of the open-source community is challenged by the results of a Network World survey identifying the sponsors of 36 big-name open-source nonprofits and foundations. John Gold reports on the survey results in a January 9, 2015, article. The most prominent name on the rosters of open-source sponsors was Google, which helps fund eight of the 36 organizations.

The most-prevalent sponsor of the 36 open-source projects surveyed was Google, which contributes to eight of the groups. Source: Network World

Other major players among the open-source sponsors were Canonical, SUSE, HP, and VMware, which each supported five of the groups; and Nokia, Oracle, Cisco, IBM, Dell, Intel, and NEC, each of which contributed to four open-source projects. Pro Publica's nonprofit records indicate that the average annual revenue of the 36 organizations in Network World's survey is $4.36 million, although two organizations skew the average: the Wikimedia Foundation ($27 million in revenue) and Linux Foundation ($17 million).
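To see how much the two largest organizations skew the average, the figures quoted above can be worked through as follows. This is a rough calculation that assumes the rounded numbers in the text are exact.

```python
ORG_COUNT = 36
MEAN_REVENUE_M = 4.36        # reported average, in millions of dollars
OUTLIERS_M = [27.0, 17.0]    # Wikimedia Foundation, Linux Foundation

# Recover the implied total, remove the two outliers, and re-average.
total_m = ORG_COUNT * MEAN_REVENUE_M
rest_mean_m = (total_m - sum(OUTLIERS_M)) / (ORG_COUNT - len(OUTLIERS_M))
print(f"Average without the two largest: ${rest_mean_m:.2f} million")
```

Under those assumptions, the remaining 34 organizations average roughly $3.3 million a year.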

What open-source sponsors receive in return

You might think that a sponsor's funding gives it an opportunity to contribute to the project's code, but Network World's Gold claims that isn't always the case. In fact, the primary benefit to sponsors is access to the open-source project's community of users, whether as potential employees or customers, or both.

In contrast to seeking sponsorship from one or more tech giants, FreedomSponsors takes a crowdfunding approach to supporting free and open-source software projects. As the site's FAQ explains, it works by allowing people to place money bounties for solutions to the problems that are most important to them, and allowing developers to collect bounties when they provide the solution.

FreedomSponsors takes a crowdfunding approach to supporting open-source projects based on "bounties" for specific solutions. Source: FreedomSponsors

Until it is able to create a reputation system, FreedomSponsors will rely on a trust-but-verify model, which doesn't necessarily engender a lot of confidence in the system. Still, the people behind the site get double bonus points for making the effort.

A more typical approach to vendor support for an open-source project is Microsoft's MS Open Tech subsidiary, which has joined with education-technology service Remote-Learner to integrate Office 365 with the Moodle cloud-based learning system. The project is described in a January 19, 2015, article on Business Cloud News.

The benefits of the project to Microsoft are obvious: direct hooks between Office 365 and Moodle ensure an installed base of mostly young students for Microsoft's nascent cloud version of Office.

For a worry-free approach to managing heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases, check out the new Morpheus Virtual Appliance, which combines all the controls you need in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

Your Options for Importing XML Data into a MySQL Table


Avoid these common glitches that may arise when attempting to transfer XML data into a MySQL table.

TL;DR: XML files generally have little in common with MySQL tables, which explains why importing data from an XML file to a MySQL table can be so troublesome. Here's how to use MySQL's LOAD XML command, stored procedures, and prepared statements to ensure all the XML data you want to import, and only the XML data you want to import, transfers safely and smoothly into your MySQL table.

Importing XML data into a MySQL database should be a piece of cake. Like many seemingly straightforward data-transfer operations, however, the process can be anything but cake-like.

Consider the common situation of needing to import only a select number of attributes from an XML file to a MySQL table, as presented in a Stack Overflow post from May 2014. The XML file contains team rosters that include information about each team's players. Only select fields of player data are to be imported to the MySQL table: player_id (the primary key and autoincrement), first_name, last_name, and team.

MySQL's LOAD XML command allows you to import only select attributes from an XML file into a MySQL table. Source: Stack Overflow

MySQL's LOAD XML command works the same way as LOAD DATA in this respect: you can discard XML attributes you don't want to import by assigning their input values to user variables and then not assigning those variables to a table column.

The LOAD XML command imports XML attributes without an input value by using user-variable assignments. Source: Stack Overflow

In another Stack Overflow post from March 2011, import of an XML file to a MySQL table failed because the column count didn't match the value count at row 1. The MySQL table had a field ID that wasn't in the XML file. The goal was to import the XML file using MySQL queries in a way that bypasses the ID column and instead uses the autoincrement function for the ID column. The solution was to specify fields in the LOAD XML command:

By specifying fields in the LOAD XML command you can bypass the table's ID column and use the autoincrement function in its place. Source: Stack Overflow
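The same two ideas, importing only selected attributes and naming the target columns so the auto-increment key fills itself in, can be sketched outside MySQL. The example below swaps in Python's ElementTree and an in-memory SQLite table in place of LOAD XML; the roster XML and its attribute names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical roster XML; only some attributes are wanted in the table.
ROSTER_XML = """
<rosters>
  <player first_name="Ana" last_name="Diaz" team="Owls" jersey="7" age="24"/>
  <player first_name="Ben" last_name="Cole" team="Hawks" jersey="12" age="29"/>
</rosters>
"""

WANTED = ("first_name", "last_name", "team")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player ("
    "player_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "first_name TEXT, last_name TEXT, team TEXT)"
)

# Keep only the wanted attributes; jersey and age are simply ignored.
rows = [
    tuple(player.get(name) for name in WANTED)
    for player in ET.fromstring(ROSTER_XML).iter("player")
]

# Naming the columns leaves player_id out, so the database assigns it.
conn.executemany(
    "INSERT INTO player (first_name, last_name, team) VALUES (?, ?, ?)", rows
)
conn.commit()

imported = conn.execute(
    "SELECT player_id, first_name, last_name, team FROM player ORDER BY player_id"
).fetchall()
print(imported)
```

Leaving the key column out of the column list avoids the "column count doesn't match value count" class of error described above.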

Use stored procedures and prepared statements to insert XML data

In a pair of articles from March and April 2014, Database Journal's Rob Gravelle explains how to use a stored procedure to import XML data into a MySQL table, and how to enhance the process by using a prepared statement and by adding error handling and validation.

Using stored procedures to import XML comes with limitations, as Gravelle points out. First, you haven't been able to run LOAD XML INFILE and LOAD DATA INFILE statements in a stored procedure since MySQL 5.0.7. Also, you can't make the procs very dynamic, so they can't support a variety of file types. Finally, you can't map XML data to table structures. However, if the XML file has a rigid and known structure, you can input the data with a single call.

Gravelle's example XML file is a list of applicants, each of which has three attributes: ID, first name, and last name. The target table has the same three fields: an int ID (the primary key) and two varchars.

Using a stored procedure to import an XML file to a MySQL table requires matching the file's structure to the table beforehand. Source: Database Journal

MySQL's Load_File() function is used to import the XML data into a local variable, and the ExtractValue() function is used to query the XML data using XPath. The row count lets you iterate over every XML row, since you're not likely to know how many records will be imported with each run.
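The count-then-iterate pattern looks roughly like this when expressed in Python. It is an analogy only: ElementTree's limited XPath support stands in for ExtractValue(), and the applicant XML and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical applicant list with three attributes per row.
APPLICANTS_XML = """
<applicant_list>
  <applicant id="1" fname="Ada" lname="Lopez"/>
  <applicant id="2" fname="Lin" lname="Park"/>
  <applicant id="3" fname="Sam" lname="Reed"/>
</applicant_list>
"""

root = ET.fromstring(APPLICANTS_XML)

# Count the rows first, since the number of records is not known in advance.
row_count = len(root.findall("applicant"))

records = []
for i in range(1, row_count + 1):          # XPath positions start at 1
    node = root.find(f"applicant[{i}]")    # select the i-th applicant
    records.append((int(node.get("id")), node.get("fname"), node.get("lname")))

print(records)
```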

MySQL's SQL interface for prepared statements can be accessed from within a stored procedure, although doing so is not as fast as using the binary protocol through a prepared statement API. Using the prepared-statement approach, you can compare the number of XML node attributes to the number of columns in the target table. If they don't match, an error is displayed and the processing is paused.

The prepared statement lets you ensure the number of XML node attributes matches the number of columns in the target table. Source: Database Journal
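The validation step can be sketched independently of MySQL. The function below is a hypothetical Python illustration, not Gravelle's stored procedure: it raises an error as soon as a row's attribute count differs from the number of target columns.

```python
import xml.etree.ElementTree as ET

def check_attribute_count(xml_text, row_tag, columns):
    """Raise ValueError if any row's attribute count differs from the column count."""
    root = ET.fromstring(xml_text)
    for position, node in enumerate(root.iter(row_tag), start=1):
        if len(node.attrib) != len(columns):
            raise ValueError(
                f"row {position}: {len(node.attrib)} attributes, "
                f"{len(columns)} columns expected"
            )
    return True

COLUMNS = ("id", "fname", "lname")   # hypothetical target table columns

GOOD_XML = '<list><applicant id="1" fname="Ada" lname="Lopez"/></list>'
BAD_XML = '<list><applicant id="1" fname="Ada"/></list>'

print(check_attribute_count(GOOD_XML, "applicant", COLUMNS))
try:
    check_attribute_count(BAD_XML, "applicant", COLUMNS)
except ValueError as err:
    print("import stopped:", err)
```

Checking before inserting means a malformed file stops the import up front instead of leaving a partly loaded table.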

The new Morpheus Virtual Appliance makes managing heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases a breeze. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor all your databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

FREAK Encryption Vulnerability Puts Web Servers at Risk


Servers that haven't been configured to block export ciphers could be targeted by FREAK man-in-the-middle attacks.

TL;DR: A backward-compatibility feature built into the SSL/TLS encryption protocol as a result of 20-year-old software-export controls is being leveraged by hackers to launch man-in-the-middle attacks on web servers. A recent survey found that many popular cloud services remained vulnerable to FREAK attacks more than 24 hours after the glitch was disclosed.

A vulnerability recently disclosed in the SSL/TLS encryption method has left a great number of web servers vulnerable to a man-in-the-middle attack. Researchers at Skyhigh Networks report that as of noon PST on March 4, 2015, 24 hours after the FREAK vulnerability was first reported, 766 cloud services were still at risk, based on the company's analysis of 10,000 services. A running tally of the vulnerable sites is maintained on the Freak Attack site.

The Register's John Leydon explains in a March 5, 2015, post that the average company uses 122 of the services. While there have been no reports of attacks targeting the FREAK vulnerability, researchers demonstrated how easy such an attack would be by breaking into the NSA's public-facing site. Techworm's Vijay reports in a March 5, 2015, article that the researchers needed only $104 and eight hours of computing time on Amazon's cloud computing service to compromise the NSA site.

According to Symantec technical director Rick Andrews, any web server whose configuration allows use of export ciphers is vulnerable to FREAK. Andrews is quoted by Computer Business Review's Jimmy Nicholls in a March 5, 2015, article.

FREAK stands for "Factoring attack on RSA-Export Keys." It allows hackers to force browsers to downgrade to weaker 512-bit RSA encryption from the current default 2,048-bit keys, or the intermediate 1,024-bit keys. Computerworld's Jeremy Kirk explains in a March 3, 2015, article that the U.S. government's export restrictions from the 1990s prohibited export of software supporting strong encryption. Even after the restrictions were lifted, the export mode feature remained in the SSL/TLS protocol in order to maintain backward-compatibility with old products.

Cryptography researchers claim it would take only seven hours and the equivalent of 75 PCs to break 512-bit encryption, but millions of PC equivalents and months or years to break 1,024-bit or 2,048-bit encryption. Source: Matthew D. Green, Johns Hopkins University, via the Washington Post

How to patch web servers to prevent a FREAK attack

From a client perspective, the simplest way to guard against a FREAK-based attack is to avoid using Apple's Safari browser or the browser built into Android devices. Apple plans to issue a FREAK patch for Safari in the second week of March, and Google reports having pushed a patch to its Android partners. FREAK doesn't affect recent versions of the Google Chrome, Internet Explorer, and Firefox browsers.

In a March 2, 2015, post, Akamai's Bill Brenner describes the OpenSSL command you can run to determine whether a web server is vulnerable to an export-cipher attack:

Running this OpenSSL command should generate an "alert handshake failure" message, which indicates the server is not vulnerable to an export-cipher attack. Source: Akamai

Substitute your domain name for "www.akamai.com". If you see an "alert handshake failure" message, the host is protected against a FREAK attack. Skyhigh Networks recommends that administrators disable support for all known insecure ciphers, including export ciphers, and enable forward secrecy. Instructions for doing so are available on the Mozilla site.
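The check can be scripted along the following lines. This is a sketch under stated assumptions: the command form (`openssl s_client` with `-connect` and `-cipher EXPORT`) is the widely circulated version of the test and is assumed here rather than taken from the Akamai figure, and the function names are hypothetical.

```python
import subprocess

def export_cipher_command(host, port=443):
    """Build the s_client invocation (assumed form; see the note above)."""
    return ["openssl", "s_client", "-connect", f"{host}:{port}", "-cipher", "EXPORT"]

def is_protected(s_client_output):
    """True when the handshake was refused, i.e. no export cipher was accepted."""
    return "handshake failure" in s_client_output.lower()

def check_host(host):
    """Run the command against a host and interpret its combined output."""
    result = subprocess.run(
        export_cipher_command(host),
        input="",              # close stdin so s_client exits after the handshake
        capture_output=True,
        text=True,
        timeout=30,
    )
    return is_protected(result.stdout + result.stderr)

# Interpreting a captured message without touching the network:
print(is_protected("sslv3 alert handshake failure"))
```

The local OpenSSL build must still include export ciphers for the test to be meaningful; many newer builds have removed them, in which case the command fails before it ever contacts the server and the result says nothing about the host.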

State Machine Attacks on TLS (SMACK) offers a video demonstration of a FREAK attack as well as a list of vulnerable TLS client libraries.

Security is hard-wired into the new Morpheus Virtual Appliance. Your data is protected at the persistence layer via user authentication and Access Control Lists (ACLs). With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

3 Approaches to Creating a SQL-Join Equivalent in MongoDB


Integrating MongoDB document data with SQL and other table-centric data sources needn't be so processor-intensive.

TL;DR: While there's no such operation as a SQL-style table join in MongoDB, you can achieve the same effect without relying on table schema. Here are three techniques for combining data stored in MongoDB document collections with minimal query-processing horsepower required.

The signature relational-database operation is the table join: combine data from table 1 with data from table 2 to create table 3. The schema-less document-container structure of MongoDB and other non-relational databases makes such table joins impossible.

Instead, as the MongoDB Manual explains, MongoDB either denormalizes the data by storing related items in a single document, or it relates that data in separate documents. One way to relate documents is via manual references: the _id field of one document is saved in the other document as a reference. The application simply runs a second query to return the related data.
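A manual reference amounts to two lookups made by the application. The sketch below uses plain Python lists as stand-ins for collections, so no MongoDB driver is involved; the collection contents and the `author_id` field name are hypothetical.

```python
# In-memory stand-ins for two collections.
authors = [{"_id": 101, "name": "A. Author"}]
books = [{"_id": 1, "title": "First Book", "author_id": 101}]

def find_one(collection, doc_id):
    """Return the document with the given _id, or None."""
    return next((doc for doc in collection if doc["_id"] == doc_id), None)

def book_with_author(book_id):
    book = find_one(books, book_id)                  # first query
    author = find_one(authors, book["author_id"])    # second query follows the reference
    return {**book, "author": author}

result = book_with_author(1)
print(result["title"], "by", result["author"]["name"])
```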

When you need to link multiple documents in multiple collections, DBRefs let you relate documents using the value of one document’s _id field, collection name, and, optionally, its database name. The application resolves DBRefs by running additional queries to return the referenced documents.
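A DBRef carries enough information to locate a document anywhere: the `$ref` (collection), `$id`, and optional `$db` fields. Resolving one can be sketched as below, again with nested Python dicts standing in for databases and collections; the `resolve` helper and the sample data are hypothetical.

```python
# Nested dicts stand in for databases and their collections.
databases = {
    "library": {
        "authors": [{"_id": 101, "name": "A. Author"}],
        "categories": [{"_id": 7, "label": "Fiction"}],
    }
}

def resolve(ref, default_db):
    """Follow a DBRef-style dict to the document it points at."""
    db_name = ref.get("$db", default_db)   # $db is optional
    for doc in databases[db_name][ref["$ref"]]:
        if doc["_id"] == ref["$id"]:
            return doc
    return None

book = {
    "_id": 1,
    "title": "First Book",
    "author": {"$ref": "authors", "$id": 101, "$db": "library"},
    "category": {"$ref": "categories", "$id": 7},
}

author = resolve(book["author"], default_db="library")
category = resolve(book["category"], default_db="library")
print(author["name"], "/", category["label"])
```

Because each reference names its own collection, one document can point into several collections, at the cost of one extra lookup per reference.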

A tutorial in the MongoDB Manual demonstrates use of denormalization in a social-media application. The manual also provides a SQL-to-aggregation mapping chart.

Simple function for 'joining' data within a single MongoDB collection

An alternative approach to relating data in a MongoDB collection is via a function you run in the MongoDB client console. The process is explained in a Stack Overflow post from March 2014.

For example, in a library database, you first create fields for "authors", "categories", "books", and "lending".

The fields to be "joined" in the MongoDB database are "authors", "categories", "books", and "lending". Source: Stack Overflow

Then you apply the function.

Run MongoDB's find() method to retrieve related documents in a collection. Source: Stack Overflow

The result is the rough equivalent of a join operation on SQL tables.

After running the find() method the documents in the collection related as specified are returned. Source: Stack Overflow

Ensuring MongoDB apps integrate with your organization's other data

Lack of a one-to-one join equivalent is only one of the many ways MongoDB differs from SQL databases. In a July 17, 2013, post, Julian Hyde, lead developer of the Mondrian open-source OLAP engine, explains how he built a MongoDB-to-SQL interface using the Optiq dynamic data management framework.

Optiq features a SQL parser and a query optimizer powered by rewrite rules. Hyde created rules to map SQL tables onto MongoDB collections, and to map relational operations onto MongoDB's find and aggregate operators. The result is the equivalent of a JDBC driver for MongoDB based on a hybrid query-processing engine intended to shift as much query processing as possible to MongoDB. Joins and other operations are handled by the client.

The process allows you to convert each MongoDB collection to a table. The COLUMNS and TABLES system tables are supplied by Optiq, and the ZIPS view is defined in mongo-zips-model.json.

The Optiq framework allows a MongoDB collection to be converted to a SQL-style table. Source: Julian Hyde

Simple management of SQL, NoSQL, and in-memory databases is a key feature of the new Morpheus Virtual Appliance. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with all your databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.
