
Has Node.js Adoption Peaked? If So, What's Next for Server-Side App Development?


The consensus among experts is that Node.js will continue to play an important role in web app development despite the impending release of the io.js fork. Still, some developers have decided to switch to the Go programming language and other alternatives, which they consider better suited to large, distributed web apps.

The developer community appears to be tiring of the constant churn in platforms and toolkits. Jimmy Breck-McKye points out in a December 1, 2014, post on his Lazy Programmer blog that it has been only two years since the arrival of Node.js, the JavaScript framework for developing server-side apps quickly and simply.

Soon Node.js was followed by Backbone.js/Grunt, Require.js/Handlebars, and most recently, Angular, Gulp, and Browserify. How is a programmer expected to invest in any single set of development tools when the tools are likely to be eclipsed before the developer can finish learning them?

Node.js still has plenty of supporters, despite the recent forking of the product with the release of io.js by a group of former Node contributors. In a December 29, 2014, post on the LinkedIn Pulse blog, Kurt Cagle identifies Node as one of the Ten Trends in Data Science for 2015. Cagle nearly gushes over the framework, calling it "the nucleus of a new stack that is likely going to relegate Ruby and Python to has-been languages." Node could even supplant PHP someday, according to Cagle.

The internal thread architecture of Node.js handles incoming requests to the HTTP server much as it handles SQL requests. Source: Stack Overflow

Taking the opposite view is Shiju Varghese, who writes in an August 31, 2014, post on his Medium blog that after years of developing with Node, he has switched to Go for web development and as a "technology ecosystem for building distributed apps." Among Node's shortcomings, according to Varghese, are its error handling, debugging, and usability.

More importantly, Varghese claims Node is a nightmare to maintain for large, distributed apps. For anyone building RESTful apps on Node.js, he recommends the Hapi.js framework created by WalMart. Varghese predicts that the era of using dynamic languages for "performance-critical" web apps is drawing to a close.

The Node.js fork may -- or may not -- be temporary

When io.js was released in late November 2014, developers feared they would be forced to choose between the original version of the open-source framework supported by Joyent, and the new version created by former Node contributors. As ReadWrite's Lauren Orsini describes in a December 10, 2014, article, the folks behind io.js were unhappy with Joyent's management of the framework.

Io.js is intended to have "an open governance model," according to the framework's readme file. It is described as an "evented IO for V8 JavaScript." Node.js and io.js are both server-side frameworks that allow web apps to handle user requests in real time, and the io.js development team reportedly intends to maintain compatibility with the "Node ecosystem."

At present, most corporate developers are taking a wait-and-see approach to the Node rift, according to InfoWorld's Paul Krill. In a December 8, 2014, article, Krill writes that many attendees at Intuit's Node Day conference see the fork as a means of pressuring Joyent to "open up a little bit," as one conference-goer put it. Many expect the two sides to reconcile before long -- and before parallel, incompatible toolsets are released.

Still, the io.js fork is expected to be released in January 2015, according to InfoQ's James Chester in a December 9, 2014, post. Isaac Z. Schlueter, one of the Node contributors backing io.js, insists in an FAQ that the framework is not intended to compete with Node, but rather to improve it.

Regardless of the outcome of the current schism, the outlook for Node developers looks rosy. Indeed.com's recent survey of programmer job postings indicates that the number of openings for Node developers is on the rise, although it still trails jobs for Ruby and Python programmers.

Openings for developers who can work with Node.js are on the rise, according to Indeed.com. Source: FreeCodeCamp

Regardless of your preferred development framework, you can rest assured that your MySQL, MongoDB, Redis, and ElasticSearch databases are accessible when you need them on the Morpheus database-as-a-service (DBaaS). Morpheus supports a range of tools for connecting to, configuring, and managing your databases.

You can provision, deploy, and host all your databases on Morpheus with just a few clicks using the service's dashboard. Visit the Morpheus site to create a free account!


What to Do When MySQL Ignores Your Index


If you find yourself with a MySQL query that's taking forever to complete, the suspects at the top of your troubleshooting list usually relate to how MySQL chooses to use the indexes on the table being searched. Discovering the cause of the slowdown is only the beginning: the "key" is to ensure the system selects the optimal index.

MySQL's index_merge optimization is intended to let some queries whose WHERE clauses touch several single-column indexes use more than one index at a time. The goal is to speed up queries by reading only specific indexes rather than scanning the entire table via the default PRIMARY KEY.

In a December 2012 post on the MySQL Performance Blog, Ernie Souhrada uses the example of the query "SELECT foo FROM bar WHERE indexed_colA = X OR indexed_colB = Y", which applies the index merge union algorithm to scan the indexes on "indexed_colA" and "indexed_colB" simultaneously and then perform a set-theoretic union of the two result sets. (Using "AND" in place of "OR" generates the set-theoretic intersection of the result sets.)
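As a rough sketch of the pattern (the table, column, and index names below are placeholders rather than Souhrada's actual schema), EXPLAIN reveals whether MySQL chose the index merge plan:

    -- Hypothetical example based on the query quoted above; idx_colA and
    -- idx_colB are assumed names for the single-column indexes.
    EXPLAIN
    SELECT foo
    FROM bar
    WHERE indexed_colA = 'X' OR indexed_colB = 'Y';

    -- When the union algorithm is chosen, the plan typically reports:
    --   type:  index_merge
    --   key:   idx_colA,idx_colB
    --   Extra: Using union(idx_colA,idx_colB); Using where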

Index_merge sometimes clobbers performance, however. Souhrada gives the example of an index merge run against a table of 4.5 million rows. The EXPLAIN run on the SELECT indicated that three different indexes were being used and that only about 8,100 of those rows would be examined.

 

A query executing multiple indexes appears to be searching only a fraction of the table's rows. Source: MySQL Performance Blog

In fact, the query was taking 3.3 seconds, far longer than expected. The specific conditions in this query were causing MySQL to search 8.5 million rows rather than the expected 8,100. The query was reading the index entries for all columns in the WHERE clause that were included in the merge operation. It then performed a set intersection on the results.

The solution in this case was to use index hints, which are described in the MySQL Reference Manual. Rewriting the query to focus on user_type reduces processing time from 3.3 seconds to a millisecond.

 

Using an index hint to focus the query on a single index removed the performance bottleneck. Source: MySQL Performance Blog
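As a minimal sketch of the general form such a hint takes (the table, columns, and the user_type index name are placeholders, not the exact query from Souhrada's post):

    -- user_type is assumed to be a single-column index on the table.
    SELECT foo
    FROM bar USE INDEX (user_type)
    WHERE user_type = 'X'
      AND other_col = 'Y';

The hint simply narrows the optimizer's choices to the named index, so it only helps when you already know that index is the right one for the query.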

While an index hint was the best solution in this case, in other situations an index hint can cause more problems than it solves. You simply can't anticipate future changes to the database and to MySQL itself in subsequent versions, any of which could transform the index hint into a time bomb waiting to explode your database.

Many ways to ensure MySQL queries read your preferred index

As the above example shows, finding the source of the slow index processing is only the beginning of your troubleshooting. There are almost always multiple potential solutions to a query-index problem. On his OpenArk.org blog, Shlomi Noach lists seven alternative approaches to speeding up MySQL index queries.

In Noach's example, a query against a table of roughly 10 million rows should have used the "type" index, scanning about 110 rows, filtered "using where" and sorted "using filesort". Instead, the query was using the PRIMARY KEY and scanning all 10 million rows, filtered "using where".

For some reason, MySQL was ignoring the plan it had identified as most efficient. The first options are to rebuild the table or to run ANALYZE TABLE to update the index statistics, which takes far less time than a rebuild and may be enough to produce better query plans. The bulldozer approach is FORCE INDEX; a gentler variant is USE INDEX, which limits MySQL to considering only the index you specify.

 

One option to address slow MySQL queries is to state explicitly the index to be searched. Source: OpenArk.org

Alternatively, you could instruct MySQL to ignore the PRIMARY KEY:

An alternative for speeding up queries is to specify that MySQL should ignore the PRIMARY KEY. Source: OpenArk.org

One downside to these approaches is that index hints are not standard SQL. Moving some of the logic into the application is another possibility, but that solution is rarely quick or simple. MySQL also lets a hint be scoped to the sort itself, as in IGNORE INDEX FOR ORDER BY (PRIMARY), which rules out the PRIMARY KEY for ordering only. Finally, you can realize much the same result by "tricking" MySQL into believing the PRIMARY KEY is not optimal for the ORDER BY clause -- for example, by adding a redundant sorting column -- so that MySQL concludes on its own that the "type" index is best.
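A hedged sketch of what those alternatives look like in SQL appears below, using a placeholder table t with a PRIMARY KEY on id and a secondary index named type; the actual table and query in Noach's post differ:

    ANALYZE TABLE t;                       -- refresh the index statistics

    SELECT * FROM t FORCE INDEX (`type`)   -- insist on the "type" index
    WHERE type = 'x'
    ORDER BY id LIMIT 10;

    SELECT * FROM t IGNORE INDEX (PRIMARY)               -- or rule out the PRIMARY KEY...
    WHERE type = 'x'
    ORDER BY id LIMIT 10;

    SELECT * FROM t IGNORE INDEX FOR ORDER BY (PRIMARY)  -- ...for the sort only
    WHERE type = 'x'
    ORDER BY id LIMIT 10;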

DBAs can spend a considerable amount of their workday puzzling over these types of MySQL performance issues. What few admins look forward to is writing backup scripts, scheduling backups, and ensuring their databases are fully recoverable. The Morpheus Virtual Appliance is your new solution for database provisioning and management on private, public, and hybrid clouds. Start your free trial now.

 

New Breed of JSON Tools Closes the Gap with XML


As JSON's popularity for web app development increases, the range of tools supporting the environment expands into XML territory. The key is to maintain the simplicity and other strengths of JSON while broadening the environment's appeal for web developers.

JSON or XML? As with most programming choices, determining which approach to adopt for your web app's server calls is not a simple "either-or" proposition.

JSON, or JavaScript Object Notation, was conceived as a simple, concise data format for encoding data structures in an efficient text format. In particular, JSON is less "verbose" than XML, according to InCadence Strategic Solutions VP Michael C. Daconta in an April 16, 2014, article on GCN.com.

Despite JSON's efficiency, Daconta lists four scenarios in which XML is still preferred over JSON:

  • When you need to tag the data via markup, which JSON doesn't support
  • When you need to validate a document before you transmit it
  • When you want to extend a document's elements via substitution groups or other methods
  • When you want to take advantage of one of the many XML tools, such as XSLT or XPath

SOAP APIs and REST APIs are also not strictly either/or

One of JSON's claims to fame is that it is so simple to use it doesn't require a formal specification. George Anadiotis explains in a January 28, 2014, post on the Linked Data Orchestration site that many real-world REST-based JSON apps require schemas, albeit much different schemas than their SOAP-based XML counterparts.

 

The basic JSON-REST model decouples the client and server, which separates the app's internal data representation from the wire format. Source: Safety Net

The most obvious difference is that there are far fewer JSON tools than XML tools. This is understandable considering that XML has been around much longer and JSON is a relative newcomer. However, JSON tools can reverse-engineer schemas based on the JSON fragments you arrange on a template; the tools' output is then edited manually to complete the schema.

Anadiotis presents a five-step development plan for a JSON schema:

  1. Create sample JSON for exchanging data objects
  2. Use a JSON schema tool to generate a first-draft schema based on your sample JSON fragments
  3. Edit the schema manually until it is complete
  4. Use a visualization tool to create an overview of the schema (optional)
  5. Run a REST API metadata framework to provide the API's documentation

Tool converts JSON to CSV for easy editing

JSON may trail XML in the quantity and quality of available toolkits, but the developer community is working hard to close the gap. One example is the free JSON-to-CSV converter developed by Eric Mills of the Sunlight Foundation. The converter lets you paste JSON code into a box, then automatically reformats and recolors it as an easy-to-read table.

 

In this example, the JSON-to-CSV converter transforms JSON code into a table of data about Ohio state legislators. Source: Programmable Web

Mills' goal in creating the converter was to make JSON "as approachable as a spreadsheet," as Janet Wagner reports in a March 31, 2014, post on the Programmable Web site. While the converter is intended primarily as a teaching tool that demonstrates the potential of JSON as a driver of the modern web, Mills plans to continue supporting it if it is used widely.

Conversely, if you'd like to convert Excel/CSV data to HTML, JSON, XML, and other web formats, take the free Mr. Data Converter tool out for a spin.

How the Internet of Things Will Affect Database Management


There's no denying the rise of the Internet of Things will challenge existing database systems to adapt to accommodate huge volumes of unstructured data from diverse sources. Some analysts question whether RDBMSs have the scalability, flexibility, and connectivity required to collect, store, and categorize the disparate data types organizations will be dealing with in the future. Others warn against counting out RDBMSs prematurely, pointing out that there's plenty of life left in the old data structures.

Imagine billions of devices of every type flooding data centers with information: a secured entryway reporting on people's comings and goings; a smart shelf indicating a shortage of key production supplies; a pallet sensor reporting an oversupply of stocked items.

The Internet of Things poses unprecedented challenges for database administrators in terms of scalability, flexibility, and connectivity. How do you collect, categorize, and extract business intelligence from such disparate data sources? Can RDBMSs be extended to accommodate the coming deluge of device-collected data? Or are new, unstructured data models required?

As you can imagine, there's little consensus among experts on how organizations should prepare their information systems for these new types and sources of data. Some claim that RDBMSs such as MySQL can be extended to handle data from unconventional sources, many of which lack the schema, or preconditioning, required to establish the relations that are the foundation of standard databases. Other analysts insist that only unstructured, "schema-less" DBMSs such as NoSQL are appropriate for data collection from intelligent devices and sensors.

 

The standard application model is transformed to encompass the cloud by the need to accommodate tomorrow's diverse data sources and types. Source: Technische Universität München

In a November 28, 2014, article, ReadWrite's Matt Asay reports on a recent survey conducted by Machine Research that found NoSQL is the key to "managing more heterogeneous data generated by millions and millions of sensors, devices and gateways." Not surprisingly, the two primary reasons for the assessment are NoSQL's flexibility in handling unstructured data, and its scalability, which the researchers claim RDBMSs simply can't match.

Reports of the death of RDBMSs are slightly exaggerated

There are a couple of problems with this claim, however. First, there's an acute shortage of NoSQL developers, according to statistics compiled by research firm VisionMobile. The company pegs the current number of NoSQL developers at 300,000, but it estimates that by the year 2020 the number will jump to 4.5 million.

 

VisionMobile's research indicates that demand for NoSQL developers will explode in coming years. Source: VisionMobile

Many experts counter that the forecast demise of RDBMSs is premature because it underestimates the ability of RDBMS vendors to extend their products' scalability and capacity to accommodate unstructured data. For example, DeepDB offers a drop-in replacement for MySQL's default InnoDB storage engine that its developers claim improves server performance by a factor of 100.

At this point, IT managers can be excused for thinking "Here we go again." Does the Internet of Things signal yet another sea-change for their data centers? Or is this the most recent case of the hype outpacing the substance? According to a December 2014 TechTarget article by Alan R. Earls, corporate bandwidth will be overwhelmed by the rush of small data packets coming from Internet-connected devices.

In particular, distributed data centers will be required that move data security -- and the bulk of data analysis -- to the edge of the network. This will require pushing the application layer to the router and integrating a container with logic. As Accenture Technology Lab researcher Walid Negm points out, the cloud is increasingly serving as either an extension of or replacement for the conventional network edge.

The best way to get a jump on the expansion of data networks and provision your databases is via a secure, reliable, and scalable platform for private, public, and hybrid clouds. The Morpheus Virtual Appliance supports MongoDB, MySQL, Elasticsearch, and Redis with simple point-and-click database provisioning. Get started on your free trial now!

Relational or Graph: Which Is Best for Your Database?


Choosing between the structured relational database model or the "unstructured" graph model is less and less an either-or proposition. For some organizations, the best approach is to process their graph data using standard relational operators, while others are better served by migrating their relational data to a graph model.

The conventional wisdom is that relational is relational and graph is graph, and never the twain shall meet. In fact, relational and graph databases now encounter each other all the time, and both can be better off for it.

The most common scenario in which "unstructured" graph data coexists peaceably with relational schema is placement of graph content inside relational database tables. Alekh Jindal of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) points out in a July 9, 2014, post on the Intel Science and Technology Center for Big Data blog that most graph data originates in an RDBMS.

Rather than extract the graph data from the RDBMS for import to a graph processing system, Jindal suggests applying the graph-analytics features of the relational database. When a graph is stored as a set of nodes and a set of edges in an RDBMS, built-in relational operators such as selection, projection, and join can be applied to capture node/edge access, neighborhood access, graph traversal, and other basic graph operations. Combining these basic operations makes possible more complex analytics.
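As an illustrative sketch only (the table and column names are assumptions, not Jindal's schema), a graph stored as node and edge tables can answer neighborhood and traversal queries with nothing more than ordinary selections and joins:

    -- A graph represented as two relational tables.
    CREATE TABLE nodes (id BIGINT PRIMARY KEY, label VARCHAR(100));
    CREATE TABLE edges (src BIGINT, dst BIGINT, PRIMARY KEY (src, dst));

    -- Neighborhood access: the direct neighbors of node 42.
    SELECT n.*
    FROM edges e
    JOIN nodes n ON n.id = e.dst
    WHERE e.src = 42;

    -- A two-hop traversal is just another self-join on the edge table.
    SELECT DISTINCT e2.dst
    FROM edges e1
    JOIN edges e2 ON e2.src = e1.dst
    WHERE e1.src = 42;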

Similarly, stored procedures can be used as driver programs to capture the iterative operations of graph algorithms. The down side of expressing graph analytics as SQL queries is the performance hit resulting from multiple self-joins on tables of nodes and edges. Query pipelining and other parallel-processing features of RDBMSs can be used to mitigate any resulting slowdowns.

When Jindal compared the performance of a column-oriented relational database and Apache Giraph on PageRank and ShortestPath, the former outperformed the latter in two graph-analytics datasets: one from LiveJournal with 4.8 million nodes and 68 million edges; and one from Twitter with 41 million nodes and 1.4 billion edges.

 

A column-oriented RDBMS matched or exceeded the performance of a native graph database in processing two graph datasets. Source: Alekh Jindal, MIT CSAIL

When migrating data from relational to graph makes sense

While there are many instances in which extending the relational model to accommodate graph-data processing is the best option, there are others where a switch to the graph model is called for. One such case is the massive people database maintained by Whitepages, which resided for many years in siloed PostgreSQL, MySQL, and Oracle databases.

As explained in a November 12, 2014, post on Linkurious, Whitepages discovered that many of its business customers were using the directory to ask graph-like questions, primarily for fraud prevention. In particular, the businesses wanted to know whether a particular phone number was associated with a real person at a physical address, and what other phone numbers and addresses have been associated with a particular person.

The development team hired by Whitepages used the Titan scalable graph database to meet the company's need for scalability, availability, high performance (processing 30,000 vertices per second), and a high ingest rate (greater than 200 updates per second). The resulting graph schema more accurately modeled the way Whitepages customers were querying the database: from location to location, and number to number.

 

The Whitepages graph schema tracks people as they change physical address and telephone number, among other attributes. Source: Linkurious

Whitepages has made its graph infrastructure available to the public via the WhitePages PRO API 2.0.

Whether you find your organization's data better suited to either the graph or relational model, the Morpheus Virtual Appliance will help you with real-time database and system operational insights. Get your MongoDB, MySQL, Elasticsearch, or Redis databases provisioned with a simple point-and-click interface, and manage SQL, NoSQL, and In-Memory databases across hybrid clouds. 

Overcoming Barriers to Adoption of Network Functions Virtualization


IT managers are equally intrigued by the promise of network functions virtualization, and leery of handing over control of their critical networks to unproven software, much of which will be managed outside their data centers. Some of the questions surrounding NFV will be addressed by burgeoning standards efforts, but most organizations continue to adopt a "show me" attitude toward the technology.

Big things are predicted for software defined networks (SDN) and network functions virtualization (NFV), but as with any significant change in the global network infrastructure, the road to networking hardware independence will have its share of bumps.

For one thing, securing networks that have no physical boundary is no walk in the park. Viodi's Alan Weissberger explains in a December 29, 2014, post that replacing traditional hardware functions with software extends the potential attack space "exponentially." When you implement multiple virtual appliances on a single physical server, for example, they'll all be affected by a single breach of that server.

Even with the security concerns, the benefits of virtualization in terms of flexibility and potential cost savings are difficult for organizations of all sizes to ignore. In a December 23, 2014, article on TechWeek Europe, Ciena's Benoit de la Tour points out that virtualization allows network operators to expand or remove firewalling, load balancing, and other services and appliance functionality instantly.

Simplifying hardware management is one of NFV's principal selling points. John Fruehe writes on the Moor Insights & Strategy blog that NFV replaces some specialty networking hardware with software that runs on commercial off-the-shelf (COTS) x86 servers, or as VMs running on those servers. It also simplifies network architectures by reducing the total number of physical devices.

NFV offers organizations the potential to simplify network management by reducing the overall hardware footprint. Source: Moor Insights & Strategy

Potential NFV limitations: licensing and carrier control

The maturation of the technology underlying NFV concepts is shown in the creation of the Open Platform for NFV (OPNFV), a joint project of the Linux Foundation and such telecom/network companies as AT&T, Cisco, HP, NTT Docomo, and Vodafone. As ZDNet's Steven J. Vaughan-Nichols reports in a September 30, 2014, article, OPNFV is intended to create a "carrier-grade, integrated, open source NFV reference platform."

Linux Foundation Executive Director Jim Zemlin explains that the platform would be similar to Linux distributions serving a variety of needs and allowing code to be integrated upstream and downstream. Even with an open-source base, some potential NFV adopters are hesitant to cede so much control of their networks to carriers. For one thing, companies don't want to find themselves caught in the middle of feuding carriers and equipment vendors.

 

SDN and NFV have many similarities, but also some important differences, principally who hosts the bulk of the network hardware. Source: Moor Insights & Strategy

More importantly, IT managers are concerned about ensuring the reliability of their networks in such widespread virtual environments. Red Hat's Mark McLoughlin states in an October 8, 2014, post on OpenSource.com that network functions implemented as horizontal scale-out applications will address reliability the way cloud apps do: each application tier will be distributed among multiple failure domains. Scheduling of performance-sensitive applications will be in the hands of the telcos, which makes SLAs more important than ever.

Existing software licensing agreements also pose a challenge to organizations hoping to benefit from use of NFV. A November 26, 2014, article by TechTarget's Rob Lemos describes a hospital that attempted to switch from a license based on total unique users to one based on concurrent users as it implemented network virtualization. The process of renegotiating the licenses took four years.

Lemos points out that organizations often neglect to consider the implications of renegotiating software licenses when they convert to virtualized operations. By contrast, when you use the new Morpheus Virtual Appliance, you know exactly what you're getting -- and what you're paying for -- ahead of time. With the Morpheus database-as-a-service, you have full control of the management of your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases via a single dashboard.

Morpheus lets you create a new instance of any SQL, NoSQL, or in-memory database in just seconds via a point-and-click interface. A free full replica set is provided for each database instance, and your MySQL and Redis databases are backed up. Visit the Morpheus site for pricing information and to create a free account.


Diagnose and Optimize MySQL Performance Bottlenecks


A common source of MySQL performance problems is tables with outdated, redundant, and otherwise-useless data. Slow queries can be fixed by optimizing one or all tables in your database in a way that doesn't lock users out any longer than necessary.

MySQL was originally designed to be the little database that could, yet MySQL installations keep getting bigger and more complicated: larger databases (often running in VMs), and larger and more widely disparate clusters. As database configurations increase in size and complexity, DBAs are more likely to encounter performance slowdowns. Yet the bigger and more complex the installation, the more difficult it is to diagnose and address the speed sappers.

The MySQL Reference Manual includes an overview of factors that affect database performance, as well as sections explaining how to optimize SQL statements, indexes, InnoDB tables, MyISAM tables, MEMORY tables, locking operations, and MySQL Server, among other components.

At the hardware level, the most common sources of performance hits are disk seeks, disk reading and writing, CPU cycles, and memory bandwidth. Of these, memory management generally and disk I/O in particular top the list of performance-robbing suspects. In a June 16, 2014, article, ITworld's Matthew Mombrea focuses on the likelihood of encountering disk thrashing (a.k.a. I/O thrashing) when hosting multiple virtual machines running MySQL Server, each of which contains dozens of databases.

Data is constantly being swapped between RAM and disk, and obviously it's faster to access data in system memory than data on disk. When insufficient RAM is available to MySQL, dozens or hundreds of concurrent queries to disk will result in I/O thrashing. Comparing the server's load value to its CPU utilization will confirm this: high load value and low CPU utilization indicates high disk I/O wait times.

Determining how frequently you need to optimize your tables

The key to a smooth-running database is ensuring your tables are optimized. Striking the right balance between optimizing too often and optimizing too infrequently is a challenge for any DBA working with large MySQL databases. This quandary was presented in a Stack Overflow post from February 2012.

For a statistical database having more than 2,000 tables, each of which has approximately 100 million rows, how often should the tables be optimized when only 60 percent of them are updated every day (the remainder are archives)? You need to run OPTIMIZE on the table in three situations:

  • When its datafile is fragmented on disk
  • When many of its rows are updated or change size
  • When deleting many records and not adding many others

Run CHECK TABLE when you suspect the table's data is corrupted, and then REPAIR TABLE when corruption is reported. Use ANALYZE TABLE to update index cardinality.
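In SQL terms, those maintenance operations look like the following; the table name is a placeholder, and OPTIMIZE and REPAIR are best scheduled for a low-traffic window since they can lock the table:

    OPTIMIZE TABLE stats_2014;   -- defragment the datafile and reclaim space
    CHECK TABLE stats_2014;      -- verify the table when corruption is suspected
    REPAIR TABLE stats_2014;     -- repair it (MyISAM, ARCHIVE, CSV only) if CHECK reports errors
    ANALYZE TABLE stats_2014;    -- update index cardinality statistics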

In a separate Stack Overflow post from March 2011, the perils of optimizing too frequently are explained. Many databases use InnoDB with a single file rather than separate files per table. Optimizing in such situations can cause more disk space to be used rather than less. (Also, tables are locked during optimization, so large tables may be inaccessible for long periods.)

From the command line, you can use mysqlcheck to optimize one or all databases:

Run "mysqlcheck" from the command line to optimize one or all of your databases quickly. Source: Stack Overflow

Alternatively, you can run this PHP script to optimize all the tables in your database:

This PHP script will optimize all the tables in a database in one fell swoop. Source: Stack Overflow

Other suggestions are to implode the table names into one string so that you need only a single OPTIMIZE TABLE statement, and to use MySQL Administrator in the MySQL GUI Tools.
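The "implode the table names into one string" suggestion works because OPTIMIZE TABLE accepts a comma-separated list, so a single statement along these lines (the table names are placeholders) covers several tables at once:

    OPTIMIZE TABLE stats_2014_q1, stats_2014_q2, stats_2014_q3, stats_2014_q4;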

Monitoring and optimizing your MySQL, MongoDB, Redis, and ElasticSearch databases is a point-and-click process in the new Morpheus Virtual Appliance. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds. You can provision your database with astounding ease, and each database instance includes a free full replica set. The service supports a range of database tools and lets you analyze all your databases from a single dashboard. Visit the Morpheus site to create a free account.


When One Data Model Just Won't Do: Database Design that Supports Polyglot Persistence


The demands of modern database development mandate an approach that matches the model (structured or unstructured) to the nature of the underlying data, as well as the way the data will be used. Choice of data model is no longer an either/or proposition: now you can have your relational and key-value, too. The multimodel approach must be applied deliberately to reduce operational complexity and ensure reliability.

"When your only tool is a hammer, all your problems start looking like nails." Too often that old adage has applied to database design: When your only tool is a relational DBMS, all your data starts to look structured.

Well, today's proliferation of data types defies squeezing it all into a single model. The age of the multimodel database has arrived, and developers are responding by adopting designs that apply the most appropriate model to the various data types that comprise their diverse databases.

In a January 6, 2015, article on InfoWorld, FoundationDB's Stephen Pimentel explains that the rise in NoSQL, JSON, graphs, and other non-SQL data models is the result of today's applications needing to work with various data types and storage requirements. Rather than creating multiple distinct databases, developers are increasingly basing their databases on a single backend that supports multiple data models.

Author and software developer Martin Fowler describes polyglot persistence as the ability of applications to manage their own data using various technologies based on the characteristics and use of that data. Rather than selecting the tool first and then fitting the data to it, developers determine how the various data elements will be manipulated and then choose the appropriate tools for those specific purposes.

Multimodel databases apply different data models in a single database based on the characteristics of various data elements. Source: Martin Fowler

Multimodel databases are by definition more complicated than their single-model counterparts. Managing this complexity is the principal challenge of developers, primarily because each data storage mechanism requires its own interface and creates a potential performance bottleneck. However, the alternative of attempting to apply the relational model to NoSQL-type unstructured data will require a tremendous amount of development and maintenance effort.

Putting the multimodel database design into practice

John P. Wood highlights the primary shortcoming of RDBMSs in clustered environments: the way they enforce data integrity places inordinate demands on processing power and storage. RDBMSs depend on continual, fast, simple access to all data in order to prevent duplicates, enforce constraints, and otherwise maintain the database.

While you can scale out relational databases via master-slave replication, sharding, and other approaches, doing so increases the app's complexity. More importantly, a key-value store is often a better fit for the data than an RDBMS's rows and columns, even with object/relational mapping tools.

Wood describes two scenarios in which polyglot persistence improves database performance: when performing complex calculations on massive data sets; and when needing to store data that varies greatly from document to document, or that is constantly changing structure. In the first instance, data is moved from the relational to the NoSQL database and then processed by the application to maximize the benefits of clustering. In the second, structure is applied to the document on the fly to allow data inside the document to be queried.

The basic relational (SQL) model compared to the document (NoSQL) model. Source: Aaron Stannard

The trend toward supporting multiple data models in a single database is evident in the new Morpheus Virtual Appliance, which supports heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus lets you monitor and analyze all your databases using a single dashboard to provide instant insight into consumption and availability of system resources.

The Morpheus Virtual Appliance is the first and only database provisioning and management platform that works with private, public, and hybrid clouds. A free full replica set is provisioned for each database instance, and backups are created for your MySQL and Redis databases.

Visit the Morpheus site to create a free trial account!

The Benefits of Virtual Appliances Expand to Encompass Nearly All Data Center Ops


Virtual appliances deliver the potential to enhance data security and operational efficiency in IT departments of all shapes, sizes, and types. As the technology expands to encompass ever more data-center operations, it becomes nearly impossible for managers to exclude virtual appliances from their overall IT strategies.

Why have virtual appliances taken the IT world by storm? They just make sense. By combining applications with just as much operating system and other resources as they need, you're able to minimize overhead and maximize processing efficiency. You can run the appliances on standard hardware or in virtual machines.

At the risk of sounding like a late-night TV commercial, "But wait, there's more!" The Turnkey Linux site summarizes several other benefits of virtual appliances: they streamline complicated, labor-intensive processes; they make software deployment a breeze by encapsulating all the app's dependencies, thus precluding conflicts due to incompatible OSes and missing libraries; and last but not least, they enhance security by running in isolation, so a problem with or breach of one appliance doesn't affect any other network components.

In fact, the heightened focus in organizations of all sizes on data security is the impetus that will lead to a doubling of the market for virtual security appliances between 2013 and 2018, according to research firm Infonetics. The company forecasts that revenues from virtual security appliances will total $1.2 billion in 2018, as cited by Fierce IT's Fred Donovan in a November 11, 2014, article.

 

Growth in the market for virtual appliances will be spurred in large part by increased emphasis on data security in organizations. Source: Infonetics Research, via Fierce IT

In particular, virtual appliances are seen as the primary platform for implementation of software-defined networks and network functions virtualization, both of which are expected to boom starting in 2016, according to Infonetics.

The roster of top-notch virtual appliances continues to grow

There are now virtual appliances available for such core functions as ERP, CRM, content management, groupware, file serving, help desks, and domain controllers. TechRepublic's Jack Wallen lists 10 of his favorite virtual appliances, which include Drupal appliance, LAMP stack, Zimbra appliance, Openfiler appliance, and the Opsview Core Virtual Appliance.

If you prefer the DIY approach, the TKLDev development environment for Turnkey Linux appliances claims to make building Turnkey Core from scratch as easy as running make.

The TKLDev tool lets you build Turnkey Core simply by running make. Source: Turnkey Linux

The source code for all the appliances in the Turnkey library is available on GitHub, as are all other repositories and the TKLDev documentation.

Also available are Turnkey LXC (LinuX Containers) and the Turnkey LXC appliance. Turnkey LXC is described by Turnkey Linux's Alon Swartz in a December 19, 2013, post as a "middle ground between a chroot on steroids and a full-fledged virtual machine." The environment allows multiple isolated containers to be run on a single host.

The most recent addition to the virtual-appliance field is the Morpheus Virtual Appliance, which is the first and only database provisioning and management platform that supports private, public, and hybrid clouds. Morpheus offers the simplest way to provision heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases.

The Morpheus Virtual Appliance offers real-time monitoring and analysis of all your databases via a single dashboard to provide instant insight into consumption and availability of system resources. A free full replica set is provisioned for each database instance, and backups are created for your MySQL and Redis databases.

Visit the Morpheus site to create a free trial account. You'll also find out how to get started using Morpheus, which is the only database-as-a-service to support SQL, NoSQL, and in-memory databases.

Tame the Big-Data Deluge by Devising a Metadata Strategy


Ride the crest of the big-data wave by using metadata management as your surfboard. As more of your organization's information assets become untethered from relational databases, you'll rely increasingly on metadata to classify, qualify, and otherwise manage today's diverse data resources.

"Metadata" is one of those terms that appears to have as many meanings as there are people using the word. The standard definition of metadata is "data about data." That's like defining a tree as "wood with leaves."

A slightly better definition of metadata comes from Mika Javanainen's November 5, 2014, article on TechRadar: "The attributes, properties and tags that describe and classify information." This may include the data type (text document, image, JavaScript, etc.), creation date, author, or workflow state.

Like many definitions, this one fails to communicate the importance of metadata to the task of organizing and managing massive data stores comprised of diverse elements that relate and interact in ways that are often unpredictable. As Javanainen points out, metadata's most important role may be as a bridge between diverse information residing in organizations: CRM, ERP, and other siloed databases housing both structured and unstructured data.

Javanainen recommends creating metadata templates for employees in the organization to standardize on, such as ones for proposals, contracts, invoices, and product information. This allows metadata attributes to be applied automatically and consistently to data at the point of ingestion.

Managing the transition from structured RDBMSs to unstructured big data

As Ventana Research's Mark Smith points out in a November 12, 2014, article on the Smart Data Collective site, most big data in organizations resides in conventional relational databases (76 percent, according to the company's research), followed by flat files (61 percent) and data-warehouse appliances (46 percent).

However, when enterprise data managers were asked which tools they plan to use for their future big-data tasks, 46 percent named in-memory databases, 44 percent cited Hadoop, 43 percent named specialized databases, and 42 percent plan to adopt NoSQL.

 

Companies intend to use a mixed bag of technologies as they begin to implement their big-data strategies. Source: Ventana Research

The companies surveyed by Ventana Research identified metadata management as the single most important aspect of their big-data integration plans (58 percent), followed by joining disparate data sources (56 percent) and establishing rules for processing and routing data (56 percent).

A new company named Primary Data intends to help organizations realize the full value of their metadata resources. Forbes' Tom Coughlin describes the company's unique approach in a November 26, 2014, article.

The Primary Data platform uses data virtualization to create a single global namespace that can be used to manage direct attached, network attached, and both public and private cloud storage. To improve performance and efficiency, content metadata is stored on fast flash-based storage servers, while the data the metadata refers to is housed on lower-cost (and slower) hard disk drives.

 

Primary Data's metadata server creates a logical abstraction of physical storage that automates data movement and placement via an intelligent policy engine. Source: Storage Newsletter

 

The Storage Network of the Future Will Feature Objects Prominently


Old meets new head on when unstructured object storage systems are integrated with conventional array- and server-based SANs and NAS. Modern data-storage architectures must take advantage of the best each technology has to offer in order to address the soaring amount of data in organizations, and the need-it-now mentality of users.

What are we going to do with all this data?

That's the question organizations of all shapes and sizes are pondering as they consider their storage options. Increasingly, storage-area networks (SAN), network-attached storage (NAS), and object storage all play a role in your company's data strategy.

In Gigaom's November 13, 2014, cloud sector roadmap, object storage is identified as one of the cornerstones of infrastructure as a service. It uses a REST API to expose scalable object-based storage functions. In conjunction with a content delivery network (CDN), object storage delivers images, videos, documents, and other static data elements by caching the high-access items at the edge of the network, near consumers.

A primary distinction between network-attached storage and object storage is that the former uses a native file system, while the latter supports any file system, or none at all. Use of the REST API allows files to be stored and retrieved from any application that can make HTTP calls.

A popular use for object storage is to capture point-in-time snapshots of virtual machines, block storage, and managed database services. The five trends most likely to affect the object-storage market through mid-2016, according to Gigaom's analysis, are scalability and reliability; security and compliance; integration and interoperability; data management; and data ingestion.

The five trends projected to have the greatest impact on the object-storage market through mid-2016. Source: Gigaom

Security in particular poses some concerns for companies adopting object storage. Aspects to consider when planning an object-storage strategy include disposing of data at the end of its life, API uptime, and read/write success rates.

Matching the storage approach to the data and its users

There will always be a mix-and-match aspect to devising the best data-management strategy for your company. For one thing, SANs and NAS have been the key to enterprise data storage for many years and show no signs of disappearing anytime soon. In a December 1, 2014, article, Computerworld's Chris Poelker points out that SAN and NAS are converging with each other, as well as with object storage and other new technologies.

Old silos of data on various storage platforms in organizations are merging in a new converged infrastructure. Source: Computerworld

While object storage is still perceived as best for software-defined cloud storage of relatively static data elements (archived or accessed infrequently), its use will likely expand as access speeds increase and latency decreases.

In a December 3, 2014, article, The Register's Trevor Pott identifies strengths and weaknesses of the two primary storage options for enterprises:

  • Object storage is best when you have lots of unstructured data you access infrequently.
  • Server-based SANs are preferred when speed and low latency are most important.

According to Pott, the primary advantage of server SANs is their support for virtualization. They allow virtualization teams to provision the storage they need without having to go through or compete with other departments. It makes sense to manage your storage via the same interface you use to manage your virtual machines, including user profiles and APIs.

The ultimate fate of both object storage and server SANs depends on the delivery of software that's smart enough to manage the transfer and monitoring of petabyte-scale data stores. But you don't have to wait for the next generation of data-storage software to realize the cost, efficiency, and other benefits of cloud storage. New companies are emerging that are addressing these issues in innovative and cost-effective ways. Do you have any good experiences with such companies? Leave your suggestions in the comments section!

How to Use SQL Server Transaction Logs to Improve Your DB's Performance


Every change to the objects contained in your database is recorded in SQL Server's transaction logs. That makes the logs the logical first stop when troubleshooting a problem with your database. Unfortunately, the transaction logs themselves can sometimes cause problems of their own. Here's how to put the logs to best use.

SQL Server's transaction logs can be a fountain of information about your databases. Unfortunately, the logs can sometimes be the source of the performance trouble their information is intended to help identify and prevent.

The Microsoft Developer Network's documentation for the SQL Server 2014 Transaction Log lists five primary purposes for the logs:

  • Recovery of individual transactions
  • Recovery of all incomplete transactions when SQL Server is started
  • Rolling a restored database, file, filegroup, or page forward to the point of failure
  • Supporting transactional replication
  • Supporting high availability and disaster recovery solutions: AlwaysOn Availability Groups, database mirroring, and log shipping

The documentation also points out a common source of problems related to transaction logs: truncation. You have to truncate the log periodically to keep it from filling all available disk space. Truncation deletes inactive virtual log files from the logical transaction log, freeing space in the physical log file for reuse.

The transaction log is truncated automatically after a checkpoint in the simple recovery model, and under the full recovery model (or bulk-logged recovery model) after a log backup, if there has been a checkpoint since the most recent backup (copy-only log backups are the exception). The MSDN document explains the factors that can delay an automatic log truncation.
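Two quick checks follow from this; the database name and backup path below are placeholders. sys.databases reports each database's recovery model, and under the full (or bulk-logged) model it is the routine log backup that allows the log to be truncated and reused:

    -- Which recovery model is each database using?
    SELECT name, recovery_model_desc FROM sys.databases;

    -- Under FULL or BULK_LOGGED, back up the log so it can be truncated.
    BACKUP LOG SalesDB TO DISK = N'D:\backups\SalesDB_log.trn';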

How to use fn_dblog to analyze your transaction log

There's a wealth of system information in the SQL Server transaction log, but accessing and interpreting it can be a challenge. In a March 10, 2014, post, Remus Rusanu describes how to put the fn_dblog function to use to glean useful information from the log. (Note that fn_dblog was formerly known as DBCC LOG, as Thomas LaRock explains on the SolarWinds LogicalRead site.)

The write-ahead logging (WAL) protocol ensures that any change to data stored in the database is recorded somewhere in the transaction log. This includes minimally logged, bulk logged, and so-called non-logged operations such as TRUNCATE.

Rusanu provides the example of a log containing three concurrent transactions: one with two inserts and one delete; one with an insert that was rolled back, so there is no corresponding delete operation; and one with two deletes and one insert. The log sequence number (LSN) determines the order in the log of the "concurrent" operations.

 

This SQL Server transaction log shows interleaved operations from multiple simultaneous transactions. Source: Remus Rusanu

As shown in the above example, the [Transaction ID] column holds the system-generated transaction ID for each logged operation. Start your analysis at the LOP_BEGIN_XACT operations, which record the date and time of the transaction, the user SID, and other useful information. Rusanu provides a detailed examination of a single transaction as well as an in-depth look at a log after truncation.
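A hedged sketch of the kind of query Rusanu describes appears below; the column names are those exposed by the undocumented fn_dblog function, and the filter pulls one row per transaction start:

    SELECT [Current LSN], Operation, [Transaction ID], [Transaction Name], [Begin Time]
    FROM fn_dblog(NULL, NULL)            -- NULL, NULL = no starting or ending LSN filter
    WHERE Operation = 'LOP_BEGIN_XACT'   -- one row for each transaction's start
    ORDER BY [Current LSN];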

A transaction log that just keeps growing can leave you scratching your head. A DBA Stack Exchange post from December 2012 presents a typical situation: a transaction log that refuses to truncate. The most likely causes are a long-running transaction, such as index maintenance, or leaving the database in the default "Full" recovery model while going a long time between log backups.

One of the proposed solutions is to query the sys.databases catalog view: its log_reuse_wait column holds a reason code, and its log_reuse_wait_desc column holds a description of why the log cannot currently be truncated.
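For example, a query along these lines (the database name is a placeholder) shows why the log cannot yet be truncated:

    SELECT name, log_reuse_wait, log_reuse_wait_desc
    FROM sys.databases
    WHERE name = N'SalesDB';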

 

How to Ensure Your SSL-TLS Connections Are Secure


Encryption is becoming an essential component of nearly all applications, but managing the Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates that are at the heart of most protected Internet connections is anything but simple. A new tool from Google can help ensure your apps are protected against man-in-the-middle attacks.

In the not-too-distant past, only certain types of Internet traffic were encrypted, primarily online purchases and transmissions of sensitive business information. Now the push is on to encrypt everything -- or nearly everything -- that travels over the Internet. While some analysts question whether the current SSL/TLS encryption standards are up to the task, certificate-based encryption isn't likely to be replaced anytime soon.

The Electronic Frontier Foundation's Let's Encrypt program proposes a new certificate authority (CA) intended to make HTTPS the default on all websites. The EFF claims the current CA system for HTTPS is too complex, too costly, and too easy for the bad guys to beat.

Nearly every web user has encountered a warning or error message generated by a misconfigured certificate. The pop-ups are usually full of techno-jargon that can confuse engineers, let alone your typical site visitors. In fact, a recent study by researchers at Google and the University of Pennsylvania entitled Improving SSL Warnings: Comprehension and Adherence (pdf) found that 66 percent of people using the Chrome browser clicked right through certificate warnings.

As Threatpost's Brian Donahue reports in a February 3, 2015, article, redesigning the messages to provide better visual cues and more dire warnings convinced 62 percent of users to choose the preferred, safe response, compared to only 37 percent who did so when confronted with the old warnings. The "opinionated design" concept combines a plain-English explanation ("Your connection is not private" in red letters) with added steps required to continue despite the warning.

Researchers were able to increase the likelihood that users would make the safe choice by redesigning SSL certificate warnings from cryptic (top) to straightforward (bottom). Source: Sophos Naked Security

Best practices for developing SSL-enabled apps

SSL has become a key tool in securing IT infrastructures. Because SSL certificates are valid only for the time they specify, monitoring the certificates becomes an important part of app management. A Symantec white paper entitled SSL for Apps: Best Practices for Developers (pdf) outlines the steps required to secure your apps using SSL/TLS.

When establishing an SSL connection, the server returns one or more certificates to form a "chain of trust." The certificates may not arrive in a predictable order, and the server may return more certificates than necessary or require the client to retrieve intermediate certificates elsewhere. In the latter case, a certificate with a caIssuers entry in its authorityInfoAccess extension lists a protocol and location from which the issuing certificate can be retrieved.

Once you've determined the end-entity SSL certificate, you verify that the chain from the end-entity certificate to the trusted root certificate or intermediate certificate is valid.
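
For client code, the simplest way to honor that chain of trust is to let the platform's TLS stack perform the validation during the handshake. A minimal Python sketch, assuming the standard library's ssl module and a placeholder host name:

```python
# Minimal sketch: connect over TLS and let the default context verify
# the certificate chain and host name during the handshake.
import socket
import ssl

hostname = "example.com"                 # placeholder endpoint
context = ssl.create_default_context()   # uses the system trust store

with socket.create_connection((hostname, 443)) as sock:
    # wrap_socket raises ssl.SSLCertVerificationError (an ssl.SSLError)
    # if the chain, expiry, or host name check fails.
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        cert = tls.getpeercert()
        print("Protocol: ", tls.version())
        print("Issued to:", dict(x[0] for x in cert["subject"]))
        print("Expires:  ", cert["notAfter"])
```

Printing the notAfter field is one simple way to feed certificate-expiry monitoring, which the Symantec paper flags as part of routine app management.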

To help developers ensure their apps are protected against man-in-the-middle attacks resulting from corrupted SSL certificates, Google recently released a tool called nogotofail. As PC World's Lucian Constantin explains in a November 4, 2014, article, apps become vulnerable to such attacks because of bad client configurations or unpatched libraries that may override secure default settings.

Nogotofail simulates man-in-the-middle attacks, using deep packet inspection to track all SSL/TLS traffic rather than monitoring only the ports usually associated with secure connections, such as 443. The tool can be deployed as a router, VPN server, or network proxy.

Security is at the heart of the new Morpheus Virtual Appliance, which lets you seamlessly provision and manage SQL, NoSQL, and in-memory databases across hybrid clouds. Each database instance you create includes a free full replica set for built-in fault tolerance and failover. You can administer your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single dashboard via a simple point-and-click interface. 

Visit the Morpheus site to sign up for a FREE Trial!

MongoDB 3.0 First Look: Faster, More Storage Efficient, Multi-model


Document-level locking and pluggable storage APIs top the list of new features in MongoDB 3.0, but the big-picture view points to a more prominent role for NoSQL databases in companies of all types and sizes. The immediate future of databases is relational, non-relational, and everything in between -- sometimes all at once.

Version 3.0 of MongoDB, the leading NoSQL database, is being touted as the first release that is truly ready for the enterprise. The new version was announced in February and shipped in early March. At least one early tester, Adam Comerford, reports that MongoDB 3.0 is indeed more efficient at managing storage, and faster at reading compressed data.

The new feature in MongoDB 3.0 gaining the lion's share of analysts' attention is the pluggable storage API and the addition of the WiredTiger storage engine, which MongoDB acquired in December 2014. JavaWorld's Andrew C. Oliver states in a February 3, 2015, article that WiredTiger will likely boost performance over MongoDB's default MMapV1 engine in apps where reads don't greatly outnumber writes.

Oliver points out that WiredTiger's B-tree and Log Structured Merge (LSM) algorithms benefit apps with large caches (B-tree) and with data that doesn't cache well (LSM). WiredTiger also promises data compression that reduces storage needs by up to 80 percent, according to the company.
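
As a rough illustration of how that compression is surfaced to applications, the sketch below creates a collection that uses WiredTiger's zlib block compressor via pymongo. The URI, database, and collection names are placeholders, and the server must already be running MongoDB 3.0+ with the WiredTiger engine.

```python
# Minimal sketch: create a zlib-compressed collection on a WiredTiger-
# backed MongoDB 3.0+ server (the default block compressor is snappy).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["metrics"]

# Extra keyword arguments are passed through to the create command,
# so per-collection storage-engine options can be set here.
db.create_collection(
    "events",
    storageEngine={"wiredTiger": {"configString": "block_compressor=zlib"}},
)
db.events.insert_one({"sensor": "s-42", "reading": 21.7})
```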

The addition of the WiredTiger storage engine is one of the new features in MongoDB 3.0 that promises to improve performance, particularly for enterprise customers. Source: Software Development Times

Other enhancements in MongoDB 3.0 include the following:

  • Document-level locking for concurrency control via WiredTiger
  • Collection-level concurrency control and more efficient journaling in MMapV1
  • A pluggable API for integration with in-memory, encrypted, HDFS, hardware-optimized, and other environments
  • The Ops Manager graphical management console in the enterprise version

Computing's John Leonard emphasizes in a February 3, 2015, article that MongoDB 3.0's multi-model functionality via the WiredTiger API positions the database to compete with DataStax' Apache Cassandra NoSQL database and Titan graph database. Leonard also highlights the new version's improved scalability.

Putting MongoDB 3.0 to the (performance) test

MongoDB 3.0's claims of improved performance were borne out by preliminary tests conducted by Adam Comerford and reported on his Adam's R&R blog in posts on February 4, 2015, and February 5, 2015. Comerford repeated compression tests with the WiredTiger storage engine in release candidate 7 (RC7) -- expected to be the last before the final version comes out in March -- that he ran originally using RC0 several months ago. The testing was done on an Ubuntu 14.10 host with an ext4 file system.

The results showed that WiredTiger's on-disk compression reduced storage to 24 percent of non-compressed storage, and to only 16 percent of the storage space used by MMapV1. Similarly, the defaults for WiredTiger with MongoDB (the WT/snappy bar below) used 50 percent of non-compressed WiredTiger and 34.7 percent of MMapV1.

Testing WiredTiger storage (compressed and non-compressed) compared to MMapV1 storage showed a tremendous advantage for the new MongoDB storage engine. Source: Adam Comerford

Comerford's tests of the benefits of compression for reads when available I/O capacity is limited demonstrated much faster performance when reading compressed data using snappy and zlib, respectively. A relatively slow external USB 3.0 drive was used to simulate "reasonable I/O constraints." The times indicate how long it took to read the entire 16GB test dataset from the on-disk testing into memory from the same disk.

Read tests from compressed and non-compressed disks in a simulated limited-storage environment indicate faster reads with WiredTiger in all scenarios. Source: Adam Comerford


All signs point to a more prominent role in organizations of all sizes for MongoDB in particular and NoSQL in general. Running relational and non-relational databases side-by-side is becoming the rule rather than the exception. The new Morpheus Virtual Appliance puts you in good position to be ready for multi-model database environments. It supports rapid provisioning and deployment of MongoDB v3.0 across public, private and hybrid clouds. Sign Up for a Free Trial now!


The Key to Selecting a Programming Language: Focus


There isn't a single best programming language. Rather than flitting from one language to the next as each comes into fashion, determine the platform you want to develop apps for -- the web, mobile, gaming, embedded systems -- and then focus on the predominant language for that area.

"Which programming languages do you use?"

In many organizations, that has become a loaded question. There is a decided trend toward open source development tools, as indicated by the results of a Forrester Research survey of 1,400 developers. ZDNet's Steven J. Vaughan-Nichols reports on the study in an October 29, 2014, article.

Conventional wisdom says open-source development tools are popular primarily because they cost less than their proprietary counterparts. That belief is turned on its head by the Forrester survey, which found performance and reliability are the main reasons why developers prefer to work with open-source tools. (Note that Windows still dominates on the desktop, while open source leads on servers, in data centers, and in the cloud.)

Then again, "open source" encompasses a universe of different development tools for various platforms: the web, mobile, gaming, embedded systems -- the list goes on. A would-be developer can waste a lot of time bouncing from Rails to Django to Node.js to Scala to Clojure to Go. As Quincy Larson explains in a November 14, 2014, post on the FreeCodeCamp blog, the key to a successful career as a programmer is to focus.

Larson recounts his seven months of self-study of a half-dozen different programming languages before landing his first job as a developer -- in which he used none of them. Instead, his team used Ruby on Rails, a relative graybeard among development environments. The benefits of focusing on a handful of tools are many: developers quickly become experts, productivity is enhanced because people can collaborate without a difference in tools getting in the way, and programmers aren't distracted by worrying about missing out on the flavor of the month.

Larson recommends choosing a single type of development (web, gaming, mobile) and sticking with it; learning only one language (JavaScript/Node.js, Rails/Ruby, or Django/Python); and following a single online curriculum (such as FreeCodeCamp.com or NodeSchool.io for JavaScript, TheOdinProject.com or TeamTreehouse.com for Ruby, and Udacity.com for Python).


A cutout from Lifehacker's "Which Programming Language?" infographic lists the benefits of languages by platform. Source: Lifehacker

Why basing your choice on potential salary is a bad idea

Just because you can make a lot of money developing in a particular language doesn't mean it's the best career choice. ReadWrite's Matt Asay points out in a November 28, 2014, article that a more rewarding criterion in the long run is which language will ensure you can find a job. Asay recommends checking RedMonk's list of popular programming languages.

Boiling the decision down to its essence, the experts quoted by Asay suggest JavaScript for the Internet, Go for the cloud, and Swift (Apple) or Java (Android) for mobile. Of course, as with most tech subjects, opinions vary widely. In terms of job growth, Ruby appears to be fading, Go and Node.js are booming, and Python is holding steady.

But don't bail on Ruby or other old-time languages just yet. According to Quartz's programmer salary survey, Ruby on Rails pays best, followed by Objective C, Python, and Java.

While Ruby's popularity may be on the wane, programmers can still make good coin if they know Ruby on Rails. Source: Quartz, via Readwrite

Also championing old-school languages is ReadWrite's Lauren Orsini in a September 1, 2014, article. Orsini cites a study by researchers at Princeton and UC Berkeley that found inertia is the primary driver of developers' choice of language. People stick with a language because they know it, not because of any particular features of the language. Exhibits A, B, C, and D of this phenomenon are PHP, Python, Ruby, and JavaScript -- and that doesn't even include the Methuselah of languages: C.

No matter your language of choice, you'll find it combines well with the new Morpheus Virtual Appliance, which lets you monitor and manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

Bimodal IT: The Future of the Data Center or Another Empty Buzzword?


Few ideas have polarized IT managers more than bimodal IT: Is it the key to modernizing the data center, or a sure-fire recipe for disaster? Some analysts claim adoption of the model -- which runs agile and legacy development projects in parallel -- is well underway. Others assert that the model is a rehash of an inherently flawed approach that dates to the beginning of the century. The upshot: There's more than one way to manage change.

Depending on whom you believe, bimodal IT is either the cure for everything that ails data centers, or a lame attempt to preserve the data silos that threaten the survival of companies of all sizes and types. As usual, the truth is positioned somewhere between these two extremes.

Technology research firm Gartner is credited as the creator and chief promoter of the concept. In a January 14, 2015, article on ZDNet, Adjuvi Chief Strategy Officer Dion Hinchcliffe presents bimodal/multimodal IT as a way for organizations to adopt agile development and other new technologies quickly, while retaining the reliability, stability, and security of traditional IT processes.

In the bimodal approach to IT, the agile development model is implemented in parallel with conventional IT workflows. Source: ZDNet

According to Hinchcliffe, Gartner's bimodal IT architecture is one mode too few. He cites the trimodal design proposed by Simon Wardley, which Wardley explains in a November 16, 2014, post on the Gardeviance blog. Wardley labels the three modes as pioneers, settlers, and town planners; the added twist is an overlay of self-organizing cell-based structures in which each cell conforms to the two-pizza model (a development team small enough to feed with only two pizzas -- hold the anchovies).

The trimodal IT model divides work among pioneers, settlers, and town planners, each of which is subdivided into cells for easier manageability. Source: Simon Wardley

Wardley criticizes the bimodal-IT concept in a November 13, 2014, post on the same blog, stating that it's "2004 dressed up as 2014 and it is guaranteed to get you into a mess." The agile developers work fast and don't mind errors -- in fact, they depend on them. The traditional developers work slowly and deliberately, and they have a low tolerance for errors. This is bound to lead to a stalemate, according to Wardley.

Bimodal IT: 'Rock-solid fluidity' or 'balderdash'?

One of bimodal IT's most vehement critics is Intellyx President Jason Bloomberg, who calls the concept "balderdash." Bloomberg's October 12, 2014, post on the company's blog states that Gartner is simply telling its clients what they want to hear rather than what they need to hear. Bimodal IT, he claims, is nothing more than an excuse for continuing to do IT poorly.

Bloomberg admits that change is difficult and expensive, and there's no need to fix what isn't broken. However, change is occurring throughout organizations at a rapid pace -- mostly originating from outside the data center. The need to maintain compliance, security, and other governance persists when IT modernizes, but governance must be done in a more agile, automated way.

Gartner Fellow Daryl Plummer counters this criticism by pointing out that adoption of the bimodal-IT model is well underway. In an October 6, 2014, press release, Plummer claims it is both "rock solid" and "fluid"; he states that 45 percent of CIOs report they now have a "fast mode of operation." Gartner projects that 75 percent of IT organizations will have some form of bimodal in place by 2017.

When you cut through all the rhetoric, what you're left with is the need to get users the data they need to thrive and ensure the company achieves its goals. For managing heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases, there's no more efficient, effective, and affordable way than by using the new Morpheus Virtual Appliance, which combines all the controls users need in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

MongoDB Poised to Play a Key Role in Managing the Internet of Things


Rather than out-and-out replacing their relational counterparts, MongoDB and other NoSQL databases will coexist with traditional RDBMSs. However, as more -- and more varied -- data swamps companies, the scalability and data-model flexibility of NoSQL will make it the management platform of choice for many of tomorrow's data-analysis applications.

There's something comforting in the familiar. When it comes to databases, developers and users are warm and cozy with the standard, nicely structured tables-and-rows relational format. In the not-too-distant past, nearly all of the data an organization needed fit snugly in the decades-old relational model.

Well, things change. What's changing now is the nature of a business's data. Much time and effort has been spent converting today's square-peg unstructured data into the round hole of relational DBMSs. But rather than RDBMSs being modified to support the characteristics of non-textual, non-document data, companies are now finding it more effective to adapt databases designed for unstructured data to accommodate traditional data types.

Two trends are converging to make this transition possible: NoSQL databases such as MongoDB are maturing to add the data-management features businesses require; and the amount and types of data are exploding with the arrival of the Internet of Things (IoT).

Heterogeneous DBs are the wave of the future

As ReadWrite's Matt Asay reports in a November 28, 2014, article, any DBAs who haven't yet added a NoSQL database or two to their toolbelt are in danger of falling behind. Asay cites a report by Machine Research that found relational and NoSQL databases are destined to coexist in the data center: the former will continue to be used to process "structured, highly uniform data sets," while the latter will manage the unstructured data created by "millions and millions of sensors, devices, and gateways."

Relational databases worked for decades because you could predict the characteristics of the data they held. One of the distinguishing aspects of IoT data is its unpredictability: you can't be sure where it will come from, or what forms it will take. Managing this data requires a new set of skills, which has led some analysts to caution that a severe shortage of developers trained in NoSQL may impede the industry's growth.

The expected increase in NoSQL-based development in organizations could be hindered by a shortage of skilled staff. Source: VisionMobile, via ReadWrite

The ability to scale to accommodate data elements measured in the billions is a cornerstone of NoSQL databases, but Asay points out the feature that will drive NoSQL adoption is flexible data modeling. Whatever devices or services are deployed in the future, NoSQL is ready for them.
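
A small pymongo sketch of what that flexibility looks like in practice: readings from very different device types can share one collection without a schema migration. The connection URI, collection, and field names are purely illustrative.

```python
# Minimal sketch: heterogeneous IoT documents in a single collection.
from datetime import datetime, timezone
from pymongo import MongoClient

readings = MongoClient("mongodb://localhost:27017")["iot"]["readings"]

now = datetime.now(timezone.utc)
readings.insert_many([
    {"device": "thermostat-7", "ts": now, "temp_c": 21.4},
    {"device": "meter-12", "ts": now, "kwh": 3.2,
     "phase": {"a": 1.1, "b": 1.0, "c": 1.1}},        # nested document
    {"device": "gateway-3", "ts": now,
     "events": ["reboot", "fw-update"]},               # array-valued field
])
```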

Document locking one sign of MongoDB's growing maturity

According to software consultant Andrew C. Oliver -- a self-described "unabashed fan of MongoDB" -- the highlight of last summer's MongoDB World conference was the announcement that document-level locking is now supported. Oliver gives his take on the conference happenings in a July 3, 2014, article on InfoWorld.

Oliver compares MongoDB's document-level locking to row-level locking in an RDBMS, although documents may contain much more data than a row in an RDBMS. Some conference-goers projected that multiple documents may one day be written with ACID consistency, even if only "locally" within a single shard.

Another indication of MongoDB becoming suitable for a wider range of applications is the release of the SlamData analytics tool that works without having to export data via ETL from MongoDB to an RDBMS or Hadoop. InfoWorld's Oliver describes SlamData in a December 11, 2014, article.

In contrast to the Pentaho business-intelligence tool that also supports MongoDB, SlamData CEO Jeff Carr states that the company's product doesn't require a conversion of document databases to the RDBMS format. SlamData is designed to allow people familiar with SQL to analyze data based on queries of MongoDB document collections via a notebook-like interface.

The SlamData business-intelligence tool for MongoDB uses a notebook metaphor for charting data based on collection queries. Source: InfoWorld

There's no simpler or more-efficient way to manage heterogeneous databases than by using the point-and-click interface of the new Morpheus Virtual Appliance, which lets you monitor and analyze heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

Preparing Developers for a Multi-language Multi-paradigm Future


Tried-and-true languages such as Java, C++, Python, and JavaScript continue to dominate the most popular lists, but modern app development requires a multi-language approach to support diverse platforms and links to backend servers. The future will see new languages being used in conjunction with the old reliables.

Every year, new programming languages are developed. Recent examples are Apple's Swift and Carnegie Mellon University's Wyvern. Yet for more than a decade, the same handful of languages have retained their popularity with developers -- Java, JavaScript, C/C++/C#/Objective-C, Python, Ruby, PHP -- even though each is considered to have serious shortcomings for modern app development.

According to TIOBE Software's TIOBE Index for January 2015, JavaScript recorded the greatest increase in popularity in 2014, followed by PL/SQL and Perl.

The same old programming languages dominate the popularity polls, as shown by the most-recent TIOBE Index. Source: TIOBE Software

Of course, choosing the best language for any development project rarely boils down to a popularity contest. When RedMonk's Donnie Berkholz analyzed GitHub language trends in May 2014, aggregating new users, issues, and repositories, he concluded that only five languages have mattered on GitHub since 2008: JavaScript, Ruby, Java, PHP, and Python.


An analysis of language activity on GitHub between 2008 and 2013 indicates growing fragmentation. Source: RedMonk

Two important caveats to Berkholz's analysis are that GitHub focused on Ruby on Rails when it launched but has since gone more mainstream; and that Windows and iOS development barely register because both are generally open source-averse. As IT World's Phil Johnson points out in a May 7, 2014, article, while it's dangerous to draw conclusions about language popularity based on this or any other single analysis, it seems clear the industry is diverging rather than converging.

Today's apps require a multi-language, multi-paradigm approach

Even straightforward development projects require expertise in multiple languages. TechCrunch's Danny Crichton states in a July 10, 2014, article that creating an app for the web and mobile entails HTML, CSS, and JavaScript for the frontend (others as well, depending on the libraries required); Java and Objective-C (or Swift) for Android and iPhone, respectively; and for links to backend servers, Python, Ruby, or Go, as well as SQL or other database query languages.

Crichton identifies three trends driving multi-language development. The first is faster adoption of new languages: GitHub and similar sites encourage broader participation in developing libraries and tutorials; and developers are more willing to learn new languages. Second, apps have to run on multiple platforms, each with unique requirements and characteristics. And third, functional programming languages are moving out of academia and into the mainstream.

Researcher Benjamin Erb suggests that rather than functional languages replacing object-oriented languages, the future will be dominated by multi-paradigm development, in particular to address concurrency requirements. In addition to supporting objects, inheritance, and imperative code, multi-paradigm languages incorporate higher-order functions, closures, and restricted mutability.
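
As a tiny illustration of that mix (using Python here purely as an example of a language that supports both styles), the sketch below combines a class with a higher-order function and a closure:

```python
# Minimal sketch of multi-paradigm style: an object, a closure, and a
# higher-order function working together.
class Sensor:
    def __init__(self, name, readings):
        self.name = name
        self.readings = readings

def above(threshold):
    # Returns a closure that remembers `threshold`.
    return lambda value: value > threshold

sensors = [Sensor("s1", [18.5, 22.1, 19.9]), Sensor("s2", [25.3, 24.8])]
hot = above(20.0)

# filter() is a higher-order function taking the closure as an argument.
alerts = {s.name: list(filter(hot, s.readings)) for s in sensors}
print(alerts)   # {'s1': [22.1], 's2': [25.3, 24.8]}
```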

One way to future-proof your SQL, NoSQL, and in-memory databases is by using the new Morpheus Virtual Appliance, which lets you manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

Avoid Being Locked into Your Cloud Services


Before you sign on the dotted line for a cloud service supporting your application development or other core IT operation, make sure you have an easy, seamless exit strategy in place. Just because an infrastructure service is based on open-source software doesn't mean you won't be locked in by the service's proprietary APIs and other specialty features.

In the quest for ever-faster app design, deployment, and updating, developers increasingly turn to cloud infrastructure services. These services promise to let developers focus on their products rather than on the underlying servers and other exigencies required to support the development process.

However, when you choose cloud services to streamline development, you run the risk of being locked in, at either the code level or the architecture level. Florian Motlik, CTO of continuous-integration service Codeship, writes in a February 21, 2015, article on Gigaom that infrastructure services mask the complexities underlying cloud-based development.

Depending on the type of cloud infrastructure service you choose, the vendor may manage more or less of your data operations. Source: Crucial

Even when the services you use adhere strictly to open systems, there is always a cost associated with switching providers: transfer the data, change the DNS, and thoroughly test the new setup. Of particular concern are services such as Google App Engine that lock you in at the code level. However, Amazon Web Services Lambda, Heroku, and other infrastructure services that let you write Node.js functions and invoke them either via an API or on specific events in S3, Kinesis, or DynamoDB entail a degree of architecture lock-in as well.

To minimize lock-in, Motlik recommends using a micro-services architecture based on technology supported by many different providers, such as Rails or Node.

Cloud Computing Journal's Gregor Petri identifies four types of cloud lock-in:

  • Horizontal: locks you into a specific product and prevents you from switching to a competing service
  • Vertical: limits your choices in other levels of the stack, such as the database or OS
  • Diagonal: locks you into a single vendor's family of products, perhaps in exchange for reduced management and training costs, or to realize a substantial discount
  • Generational: prevents you from adopting new technologies as they become available

Gregor Petri identifies four types of cloud lock-in: horizontal, vertical, diagonal, and generational. Source: Cloud Computing Journal

Will virtualization bring about the demise of cloud lock-in?

Many cloud services are addressing the lock-in trap by making it easier for potential customers to migrate their data and development tools/processes from other platforms to the services' own environments. Infinitely Virtual founder and CEO Adam Stern claims that virtualization has "all but eliminated" lock-in related to operating systems and open source software. Stern is quoted by Linux Insider's Jack M. Germain in an article from November 2013.

Alsbridge's Rick Sizemore points out that even with the availability of tools for migrating data between VMware, OpenStack, and Amazon Web Services, customers may be locked in by contract terms that limit when they can remove their data. Sizemore also cautions that services may combine open source tools in a proprietary way that locks in your data.

In a February 9, 2015, article in Network World, HotLink VP Jerry McLeod points out that you can minimize the chances of becoming locked into a particular service by ensuring that you can move hybrid workloads seamlessly between disparate platforms. McLeod warns that vendors may attempt to lock in their customers by requiring that they sign long-term contracts.

Seamless workload migration and customer-focused contract terms are only two of the features that make the new Morpheus Virtual Appliance a "lock-in free" zone. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and failover.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.
