Channel: Morpheus Blog

Redis Replication vs Sharding


With Redis, you can use replication, sharding, or both. Find out how each one can help your database.

TL;DR: When running a Redis database (or any database for that matter), it is a good idea to have high availability of data, as well as good performance. Replication and sharding can both be helpful in providing for these needs. Whether your database is in need of one, the other, or both, it is helpful to know what each of these does.

What Is Database Replication?

Replication, also often called mirroring, is simply copying all of the data to another location. This allows data to be pulled from two or more locations, which ensures high availability. It can also be quite helpful should the main data location go down for some reason, as the data can still be read from one of the replicas.

In Redis, you can set up replication once you have at least one slave installation (Redis uses a master/slave setup for replication). In the slave's configuration file, you use the slaveof directive, as in the following example.

An example of the slaveof command. Source: Redis.

If you set a master password, you will also need to set it in the masterauth setting. Once this is done, you can start (or restart) the Redis service on the slave to enable replication.
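The directives involved are short enough to show inline. A minimal sketch of the relevant lines in the slave's redis.conf (the master address and password here are placeholders):

```
# redis.conf on the slave (host, port, and password are illustrative)
slaveof 192.168.1.10 6379
masterauth my-master-password
```

After saving the file and restarting the Redis service on the slave, the initial sync from the master begins automatically.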

What is Database Sharding?

Sharding, also often called partitioning, involves splitting data up based on keys. This increases performance because it reduces the hit on each of the individual resources, allowing them to share the burden rather than having it all in one place.

For example, you can use a hash function on your Redis keys to turn them into numbers. Then, if you want two shards, you can send all the even-numbered keys to one instance and all of the odd-numbered keys to the other. Other algorithms extend the same idea to different numbers of shards.
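As a sketch of the even/odd scheme above, here is a small Python example (the hash choice and key names are illustrative, not a Redis API):

```python
import zlib

def shard_for(key, num_shards=2):
    """Hash a Redis key to an integer, then pick a shard index by modulo."""
    return zlib.crc32(key.encode("utf-8")) % num_shards

# With num_shards=2 this is exactly the even/odd split described above:
# keys with an even hash go to instance 0, keys with an odd hash to instance 1.
for key in ("user:1001", "user:1002", "session:abc"):
    print(key, "-> shard", shard_for(key))
```

Note that changing num_shards remaps most keys to different instances, which is why growing clusters often use consistent hashing instead of a plain modulo.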

Redis sharding can be implemented in several ways:

Client-side partitioning - Clients select the proper Redis instance to read or write a particular key.

Proxy-assisted partitioning - A proxy handles requests and routes each one to the proper Redis instance.

Query routing - The query is sent to a random instance, which then takes on the responsibility of redirecting the client to the proper Redis instance.

Using Replication and Sharding Together

If you want both high availability and improved performance, both replication and sharding can be used together to provide this. With sharding, you will have two or more instances with particular data based on keys. You can then replicate each of these instances to produce a database that is both replicated and sharded, which will provide for reliability and speed at the same time!

An example of sharding and replication together. Source: Slideshare.

Get Your Own Hosted Database

Whatever your strategy is for obtaining high availability and/or performance on your Redis database, you will want reliable and stable database hosting. Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases from a single dashboard.

With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!


Code Sharing Services Keep Pace with Application Development Trends


Code sharing reaches new levels with these services that flatten the app-development flow chart via reusability to the nth degree.

TL;DR: Go beyond copying and pasting chunks of code by using these services that let you share app models and components quickly via the web; develop apps in a flash using an integration service layer; and use the same code to create native apps for iOS, Android, and Windows Phone.

The developer tradition of sharing code is taking several new forms to keep pace with changes in the way applications are created these days. Services such as GitHub, SourceForge, CodePlex, JavaForge, Bitbucket, Beanstalk, Pastebin, Pastie, and Google Code are being joined by next-generation code-sharing services that support visual development, application integration, and cross-platform mobile applications.

One of the newest code-sharing services is Mendix Model Share, which lets you share live app models and embed them in websites and blog/forum posts. You can also reuse them instantly for easy collaboration. Forbes' Adrian Bridgwater explains in a March 9, 2015, post that rapid application development (RAD) programmers can use Model Share to swap fully functional application models via the web.

For example, someone who has developed a functioning app, such as a currency converter, or a subcomponent or executable application model could post it to a site and get near-real-time feedback from other developers. The downside of sharing executables without the underlying code is that the recipient is flying blind: they don't have access to the app's underpinnings. Of course, if everything works and the functions are well documented, that's not a problem.

Integration service layer streamlines in-house app development

These days, a lot of the coding being done in organizations comes from outside the IT department. In a March 2015 article on TechTarget, George Lawton describes how companies are supporting their "citizen integrators" by creating an integration service layer. This layer lets developers focus on the business layers of their apps while leaving transportation-layer concerns to be handled by the integration service.

U.K.-based Channel 4 used the MuleSoft Mule ESB enterprise service bus to create an integration service layer for Channel 4's app-building employees. The alternative was to use point-to-point integration via web services, but Channel 4 determined that an integration service layer would make it easier for business managers to create their own applications.

MuleSoft's Mule ESB acts as a "transit system" for data traveling between apps in an enterprise or on the Internet. Source: MuleSoft

Channel 4's developers were so accustomed to point-to-point integration that the transition to use of an integration layer proved challenging. The new development methodology was facilitated by continuous integration, environment builds, and deployment processes, as well as guidelines for development teams to ensure consistency and optimal component sharing.

Share code when developing multiplatform mobile apps

Imagine being able to use JavaScript and TypeScript to create native applications for iOS, Android, and Windows Phone that share code among the platforms. That's the promise of NativeScript, a Telerik technology scheduled to debut in late April 2015. InfoWorld's Paul Krill provides an overview of the NativeScript system in a March 10, 2015, article.

Apps created with NativeScript have a native UI rather than the app being HTML-rendered in a web view, as with hybrid apps and traditional browser apps, according to Telerik VP Todd Anglin. The JavaScript engines in iOS, Android, and Windows Phone control the native UI layer. A JavaScript proxy in NativeScript exposes the underlying native iOS/Android/Windows APIs to those engines: JavaScriptCore, V8, and Chakra, respectively.

The NativeScript modules expose the native device and platform capabilities of Android and iOS, providing access via non-platform-specific code. Source: Telerik

Support for multiple platforms is a key feature of the new Morpheus Virtual Appliance. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with your SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

Three Ways to Search Smarter in MySQL


Add relevant search results, expand your search to all fields in a database, and run "fuzzy" searches in MySQL.

TL;DR: The bigger and more complicated the database, the more talented its search capabilities need to be. These three techniques let you broaden the results returned by MySQL searches: one via automatic relevance feedback, another by searching all fields in all tables of the database, and a third by finding approximate matches to your search term.

A database doesn't do users much good if they can't find the data they're searching for. The larger and more complex the database elements, the more sophisticated your searches have to be to return the information you need.

In MySQL, full-text indexes apply to CHAR, VARCHAR, or TEXT columns in InnoDB and MyISAM tables. As the MySQL Reference Manual explains, full-text searches are done using the MATCH() ... AGAINST syntax: the former accepts a comma-separated list identifying the columns to be searched; the latter accepts a string to search for, optionally followed by a modifier indicating the type of search to be conducted.

The three types of full-text searches are natural language, boolean, and query expansion, which broadens the results of a natural-language search based on automatic relevance feedback (also called blind query expansion). A MySQL Tutorial steps through the process of crafting and applying a query expansion search.

When you add WITH QUERY EXPANSION to the AGAINST() function, the MySQL search engine first finds all rows matching the search query, then it checks those rows for the relevant words, and finally it searches again based on the relevant words rather than the user's original keywords. The tutorial uses the example of a search of a car database with and without query expansion.

A MySQL full-text search without query expansion (top) and with query expansion (bottom). Source: MySQL Tutorial
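In SQL terms, the two searches differ only in the modifier. A sketch using hypothetical table and column names (the tutorial's own example uses a products table):

```sql
-- Natural-language search only
SELECT productName FROM products
WHERE MATCH(productName) AGAINST('1992');

-- The same search, broadened by blind query expansion
SELECT productName FROM products
WHERE MATCH(productName) AGAINST('1992' WITH QUERY EXPANSION);
```

The second query typically returns a superset of the first, trading precision for recall, so it is best reserved for short queries that return too few rows on their own.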

Multiple ways to search all tables in MySQL

When a Stack Overflow poster asked how to search all fields in all tables of a MySQL database, several different solutions were offered. The simplest is to run mysqldump and then search the resulting dump file as text. To make the result easier to read, use the --extended-insert=FALSE flag with mysqldump.
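A sketch of that approach from the command line (database name and credentials are placeholders):

```
# One row per INSERT line makes the dump grep-friendly
mysqldump --extended-insert=FALSE -u root -p mydb > mydb.sql
grep -n "search term" mydb.sql
```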

Another option is to use the search feature in phpMyAdmin: select the database, choose Search, enter your search term, and select the tables to be searched. A third solution is the following function:

This MySQL function searches all fields in all tables of a MySQL database. Source: Stack Overflow

Use SoundEx to include approximations in search results

Approximate string matching is an algorithmic trick upon which empires have been built. However, the fuzzy searching techniques that are common in Web search engines are lacking in MySQL, at least in terms of a direct equivalent search function. In a March 9, 2015, post, Database Journal's Rob Gravelle describes how to use the SoundEx phonetic algorithm to perform fuzzy searches in MySQL.

SoundEx reduces a word to a short code representing how it sounds when spoken. This makes it easy to spot misspellings, which are likely to share a SoundEx code with the correct spelling. Gravelle's function takes three arguments: the needle (the search target), the haystack (the text to be searched), and splitChar (usually a space).

Gravelle devises a function he names soundex_match_all that extends SoundEx to search for strings of words.

Use of MySQL's SoundEx function to search for approximate matches is extended to strings of words by this function. Source: Database Journal
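MySQL's built-in SOUNDEX() works on single strings. As a rough illustration of the idea behind Gravelle's soundex_match_all, here is a Python reimplementation that applies SoundEx word by word (the function names here are ours, not the article's):

```python
# Letter-to-digit table used by the classic SoundEx algorithm
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Reduce a word to its four-character SoundEx code, e.g. 'Robert' -> 'R163'."""
    word = word.upper()
    encoded = [word[0]]
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            encoded.append(digit)
        if ch not in "HW":  # H and W do not break up a run of equal codes
            prev = digit
    return ("".join(encoded) + "000")[:4]

def soundex_match(needle, haystack, split_char=" "):
    """True if any word in haystack sounds like needle (mirrors needle/haystack/splitChar)."""
    target = soundex(needle)
    return any(soundex(w) == target for w in haystack.split(split_char) if w)

print(soundex_match("Rupert", "ask Robert about the report"))  # True
```

Because "Rupert" and "Robert" both reduce to R163, the misspelled or misheard name still matches, which is exactly the fuzzy behavior the SQL version provides inside MySQL.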

The new Morpheus Virtual Appliance lets you provision, deploy, and monitor heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single point-and-click console. With the Morpheus database-as-a-service (DBaaS) you can manage all your SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

Make Sure to Do these Things Before Going Into Production with MongoDB


When using MongoDB in production, there are a number of recommendations to consider that can be quite helpful with issues such as performance, availability, and security. Some of the key considerations for using MongoDB in a production environment are outlined in this checklist.

Use 64-bit Builds

While MongoDB provides a 32-bit build, it is intended only for development machines: the 32-bit version is limited to about two gigabytes of data, while the 64-bit version uses memory-mapped files to allow far more storage.

Updates to One or More Documents

MongoDB only updates one document by default, but if you want to be able to update multiple documents that match your query, then you will need to set the multi parameter to a value of true. For example, MongoDB provides the following example that can be used to do this in the mongo shell:

An example of using the mongo shell to change settings. Source: MongoDB.

Simply change bool_multi to a value of true to allow updating of multiple documents.
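As a sketch of what this looks like in the mongo shell (the collection, query, and field names here are illustrative):

```
// Without { multi: true }, only the first matching document is updated
db.inventory.update(
    { qty: { $lt: 50 } },
    { $set: { reorder: true } },
    { multi: true }
)
```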

Make Sure Journaling Is Enabled

To maintain the durability of write operations, it is recommended to keep journaling enabled. This allows MongoDB to write change operations to the journal before changes are applied to the data files. Everything in the journal can then be recovered if MongoDB encounters an unexpected stoppage, which ensures the data remains in a consistent state.
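Journaling is on by default in 64-bit builds, but it can be stated explicitly in the YAML configuration file (typically mongod.conf):

```
# mongod.conf
storage:
  journal:
    enabled: true
```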

BSON Document Size

The size of a BSON document in MongoDB is limited to 16 megabytes. If you determine that you will need documents of a larger size, then the MongoDB team recommends you use GridFS, which uses chunking behind the scenes so that you can use larger documents.

An example of a BSON document. Source: MongoDB.
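For example, the mongofiles utility that ships with MongoDB stores and lists files via GridFS from the command line (the database and file names here are placeholders):

```
mongofiles --db mydb put large-dataset.json
mongofiles --db mydb list
```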

RAID Arrays

If you are using a RAID array for your disks, it is recommended that you use a RAID-10 implementation, as other types of RAID arrays often fall short in either performance or availability by comparison. For example, RAID-5 or RAID-6 will not provide the performance MongoDB needs.

Networking and Security

It is strongly recommended that a production implementation of MongoDB be run only in an environment that can be trusted. First, the network in which it runs should disallow access from any unknown networks, systems, or machines. This is best ensured by allowing access only to the systems that you absolutely know require it, such as your monitoring services or your application servers.

Also, you will want to review the recommendations of the MongoDB team for configuring network interfaces and firewalls.

There are also further suggestions for firewall configuration for those using Linux implementations and those using Windows implementations.
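On Linux, a minimal iptables sketch of that allow-known-hosts-only posture for MongoDB's default port (the app-server address is a placeholder):

```
# Accept connections to MongoDB only from a known application server
iptables -A INPUT -s 10.0.0.5 -p tcp --dport 27017 -j ACCEPT
# Drop everything else aimed at that port
iptables -A INPUT -p tcp --dport 27017 -j DROP
```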

Get Your Own Hosted Database

Once you have your MongoDB implementation set up for production, you will want reliable and stable database hosting. Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases from a single dashboard.

With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Happy Apps - How to Prevent a Heroku Dyno from Idling


When one of your Heroku apps is accessed infrequently, it can take more than 20 seconds for it to spin out of idle mode. Keep the apps active by automatically pinging their servers, either by using a free add-on or by running a custom function.

The app-development two step: Step one, you build and test your app; step two, you find a service to host the app so potential customers can kick its tires. No matter which tools and services you use to create your web application, the chances are good you can use the Heroku platform as a service (PaaS) to make it available to the public.

Heroku made its name by offering to host your web apps for free. You switch to the paid version of the service once you scale up as the app gains traction with customers. But Heroku's true claim to fame is that the service lets you deploy your apps with just a couple of clicks. As ReadWrite's Lauren Orsini explains in a September 23, 2014, article, hosting an app is much trickier than hosting a site.

Orsini describes using a free add-on called Heroku Scheduler to ping her apps once an hour. A primary reason for pinging an app regularly is to avoid having to spin up a new dyno each time someone accesses the app after a delay. If it has shifted into idle, it can take more than 20 seconds for the app to open. Some potential customers may not wait that long.

This is the problem addressed in a Stack Overflow post that generated 14 suggestions for preventing dynos from idling. Topping the responses was use of the free New Relic add-on, which has an availability monitor that can ping sites at a set interval. Alternatives include Kaffeine, Pingdom, and Uptimerobot.

If you prefer a solution that doesn't rely on a third-party service, you can run the KeepAlive function shown below:

The KeepAlive function automatically pings apps at a set interval to prevent Heroku dynos from idling. Source: Stack Overflow
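The original snippet circulates in several languages; here is a minimal Python sketch of the same idea, with the URL and interval as placeholders and the HTTP fetch injectable so the loop can be exercised without a live app:

```python
import threading
import urllib.request

def keep_alive(url, interval_secs=1500, fetch=urllib.request.urlopen):
    """Ping url immediately, then every interval_secs, so the dyno never idles."""
    def ping():
        try:
            fetch(url).read()
        except Exception:
            pass  # a failed ping should not stop future pings
        timer = threading.Timer(interval_secs, ping)
        timer.daemon = True  # don't keep the process alive just for pings
        timer.start()
    ping()
```

Heroku idles free dynos after roughly 30 minutes without traffic, so any interval comfortably under that (1500 seconds is 25 minutes) keeps the app warm.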

Automatic deployment of code from GitHub and Dropbox

Heroku recently released GitHub Integration, which automates the process of deploying code stored on GitHub. InfoQ's Richard Seroter explains in a January 21, 2015, article that manual deploys are recommended when making changes, such as testing a new feature branch. Automatic deployments are initiated each time developers push to a designated branch, or for teams, when the continuous integration server finishes and commits successfully.

To use Heroku's Dropbox Sync option, create a "Heroku" subfolder on the Dropbox account. The source code of the applications you deploy will be copied to that folder. Deployment from Dropbox to Heroku is done by using the Heroku dashboard to kick off a manual commit. You can't use both Dropbox Sync and GitHub Integration in the same app, but you can use Dropbox Sync with Heroku's standard git function.

Multiple developers can work on an application simultaneously by connecting their Dropbox accounts to the app, which delivers any changes to the source code automatically. However, Heroku warns that Dropbox lacks the robust features of a true source control management tool. For example, if a developer force-pushes to the Heroku git repo, the Dropbox folders are unlinked due to the difficulty of generating a differences report.

The update's many new features put Heroku in a good position against the competition in the PaaS market, according to InfoQ's Seroter. The table below compares the native code deployment capabilities of the four leading PaaS providers.

The addition of GitHub Integration, Dropbox Sync, and other new Heroku features gives the service an edge over most PaaS competitors. Source: InfoQ

The most efficient way to manage your apps, databases, IT operations, and business services in real time is by using the Happy Apps service. Happy Apps lets you set up rules so you are alerted via SMS and email whenever incidents or specific events occur. You can group and monitor multiple apps, databases, web servers, and app servers. In addition to an overall status, you can view the status of each individual group member.

Happy Apps is the only app-management service to support SSH and agent-based connectivity to all your apps on public, private, and hybrid clouds. The service provides dependency maps for determining the impact your IT systems will have on other apps. All checks performed on your apps are collected in easy-to-read reports that can be analyzed to identify repeating patterns and performance glitches over time. Visit the Happy Apps site to sign up for a free trial.

Upcoming OpenSSL Security Overhaul Is Long Overdue


The many OpenSSL vulnerabilities coming to light in recent months have motivated a thorough audit of the open system's code. But this hasn't prevented companies from implementing proprietary SSL alternatives, including application delivery controllers running a streamlined, closed SSL stack, and Google's own BoringSSL implementation.

It's only March, but it has already been a rough year for OpenSSL security. On January 8, the OpenSSL Project issued updates that addressed eight separate security holes, two of which were rated as "moderate" in severity. SC Magazine's Adam Greenberg reports on the patches in a January 8, 2015, article.

Then in the first week of March, the FREAK vulnerability was disclosed, which made one-fourth of all SSL-encrypted sites susceptible to man-in-the-middle attacks, as Informationweek Dark Reading's Kelly Jackson Higgins explains in a March 3, 2015, article.

Now site managers are sweating yet another OpenSSL patch for a security hole that could be just as serious as FREAK. In a mailing list notice posted on March 16, 2015, the OpenSSL Project's Matt Caswell announced the March 19, 2015, release of a patch for multiple OpenSSL vulnerabilities, at least one of which is classified as "high" severity. The Register's Darren Pauli reports in a March 17, 2015, post that the updates apply to OpenSSL versions 1.0.2a, 1.0.1m, 1.0.0r, and 0.9.8zf.

Web giants finance long-overdue OpenSSL security audit

The alert about the new OpenSSL vulnerability comes just more than a week after it was announced that the NCC Group security firm would be conducting an audit of OpenSSL code. The goal of the audit is to spot errors in the code before they are discovered in the wild, as ZDNet's Steven J. Vaughan-Nichols writes in a March 7, 2015, article.

NCC Group principal security engineer Thomas Ritter states that the OpenSSL codebase is now stable enough to undergo a thorough analysis and revision. The focus of the NCC Group audit will be on Transport Layer Security attacks related to protocol flow, state transitions, and memory management. Preliminary results are expected by early summer 2015, according to Ritter.

OpenSSL is only one of the many Secure Sockets Layer/Transport Layer Security implementations for encrypting web content. Source: Ale Agostini

Serious vulnerabilities discovered in OpenSSL in the recent past, such as Heartbleed and Early CCS, caused sites to rush to apply patches. The OpenSSL audit is the first project under the Linux Foundation's Core Infrastructure Initiative, which is funded in large part by contributions from Google, Amazon, Cisco Systems, Microsoft, and Facebook, as the Register's Pauli notes in a March 10, 2015, article.

Proprietary SSL implementations as a safer alternative to OpenSSL

The number and severity of OpenSSL security holes have caused some organizations to build their own proprietary SSL stack on application delivery controllers, as FirstPost's Shibu Paul describes in a March 9, 2015, article. ADCs are a new type of advanced load balancer for frontend servers that use a streamlined version of the SSL stack designed to be small enough to execute in the kernel.

An advantage of proprietary SSL stacks is that hackers don't have access to the code the way they do for open systems. If an organization discovers a vulnerability in its proprietary SSL stack, it can address the problem without the public being aware of it. That's why companies using ADCs weren't susceptible to Heartbleed or man-in-the-middle attacks.

Application delivery controllers are touted as a more-secure alternative to OpenSSL because they rely on a proprietary SSL stack. Source: Scope Middle East

Google's response to the many OpenSSL vulnerabilities was to create its own version of the encryption standard, called BoringSSL. As Matthew McKenna writes in a February 25, 2015, post on the TechZone 360 site, having to manage more than 70 OpenSSL patches was making it difficult for the company to maintain consistency across its multiple code bases.

Maintaining security without impacting manageability is a key precept of the new Morpheus Virtual Appliance, which lets you provision, deploy, and monitor heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single point-and-click console. With the Morpheus database-as-a-service (DBaaS) you can manage all your SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

MySQL vs. MongoDB: The Pros and Cons When Building a Social Network


Choosing the right database for a project is an extremely important step in planning and development. Picking the wrong setup can cost quite a bit of time and money, and can leave you with numerous upset users in the process. Both MongoDB and MySQL are excellent databases when used in their expected ways, but which one is better for building a social network?

What is MongoDB?

MongoDB is a NoSQL database, which means that related data gets stored in single documents for fast retrieval. This is often a good model for when data won’t need to be duplicated in multiple documents (which can cause inconsistencies).

An example of a MongoDB document. Source: MongoDB.

MongoDB is easily scalable in such cases, so the database can have rapid horizontal growth while automatically keeping things in order. This can be especially good when you have large amounts of data and need a quick response time.

What is MySQL?

MySQL is a relational database, which means that data gets stored (preferably) in normalized tables so that there is no duplication. This is a good model when you need data to be consistent and reliable at all times (such as personal information or financial data).

 

An example of a MySQL table. Source: MySQL.

While horizontal scaling can be more difficult, it does adhere to the ACID model (atomicity, consistency, isolation, durability), which means you have far fewer worries about data reliability.

How Does Social Networking Work?

Social networks offer different ways for people to connect. Whether it is through a mutual friendship, a business associate, or following a well-known person or business for updates, there are numerous methods of getting information out over social networks.

The key ingredient is the connections: a user who is connected with or following other people will typically see the updates from all of those connections upon logging in to the system.

An example social network relationship diagram. Source: SarahMei.com.

Comparison of the Databases

Given that social data involves many relationships, particularly among users, it lends itself better to a relational database over time. Even though a NoSQL solution like MongoDB can seem like a great way to retrieve lots of data quickly, the relational nature of users in a social network can cause lots of duplication to occur.

Such duplication lends itself to data becoming inconsistent and/or unreliable over time, or to queries becoming much more difficult to handle if the duplication is removed (since documents will likely need to point to other documents, which is not optimal for a NoSQL type of database).

As a result, MySQL would be the better recommendation, since it will have the data reliability and relational tools necessary to handle the interactions and relationships among numerous users. You may also decide to use both MySQL and MongoDB together to utilize the best features of each database.

Get MySQL or MongoDB

Whether you decide to use one or both databases, the new Morpheus Virtual Appliance seamlessly provisions and manages both SQL and NoSQL databases across private and public (or even hybrid) clouds. With its easy-to-use interface, you can have a new instance of a database up and running in seconds.

Visit the Morpheus site to create a free account.

New Compilers Streamline Optimization and Enhance Code Conversion


Researchers are developing compiler technologies that optimize and regenerate code in multiple languages and for many different platforms in only one or a handful of steps. While much of their work focuses on Java and JavaScript, their innovations will impact developers working in nearly all programming languages.

Who says you can't teach an old dog new tricks? One of the staples of any developer's code-optimization toolkit is a compiler, which checks your program's syntax, semantics, and other aspects for errors and otherwise optimizes its performance.

Infostructure Associates' Wayne Kernochan explains in an October 2014 TechTarget article that compilers are particularly adept at improving the performance of big data and business-critical online transaction processing (OLTP) applications. As recent developments in compiler technology point out, the importance of the programs goes far beyond these specialty apps.

Google is developing two new Java compilers named Jack (Java Android Compiler Kit) and Jill (Jack Intermediate Library Linker) that are part of Android SDK 21.1. I Programmer's Harry Fairhead writes in a December 12, 2014, article that Jack compiles Java code directly to a .dex Dalvik Executable rather than using the standard javac compiler to convert the source code to Java bytecode and then to Dalvik bytecode by feeding it through the dex compiler.

In addition to skipping the conversion to Java bytecode, Jack also optimizes and applies ProGuard's obfuscation in a single step. The .dex code Jack generates can be fed to either the Dalvik engine or the new ART Android RunTime engine, which uses ahead-of-time compilation to improve speed.

Jill converts .jar library files into the .jack library format to allow it to be merged with the rest of the object code.

Google's new Jack and Jill Java compilers promise to speed up compilation by generating Dalvik bytecode without first having to convert it from Java bytecode. Source: I Programmer

In addition to streamlining compilation, Jack and Jill reduce Google's reliance on Java APIs, which are the subject of the company's ongoing lawsuit with Oracle. At present, the compilers don't support Java 8, but in terms of retaining compatibility with Java, it appears Android has become the tail wagging the dog.

Competition among open-source compiler infrastructures heats up

The latest versions of LLVM and the GNU Compiler Collection (GCC) are in a race to see which can outperform the other. Both open-source compiler infrastructures generate object code from many kinds of source code; they support C/C++, Objective-C, Fortran, and other languages. InfoWorld's Serdar Yegulalp reports in a September 8, 2014, article that testing conducted by Phoronix of LLVM 3.5 and a pre-release version of GCC 5 found that LLVM recorded faster C/C++ compile times. However, LLVM trailed GCC when processing some encryption algorithms and other tests.

Version 3.5 of the LLVM compiler infrastructure outperformed GCC 5 in some of Phoronix's speed tests but trailed in others, including audio encoding. Source: Phoronix

The ability to share code between JavaScript and Windows applications is a key feature of the new DuoCode compiler, which supports cross-compiling of C# code into JavaScript. InfoWorld's Paul Krill describes the new compiler in a January 22, 2015, article. DuoCode uses Microsoft's Roslyn compiler for code parsing, syntactic tree (AST) generation, and contextual analysis. DuoCode then handles the code translation and JavaScript generation, including source maps.

Another innovative approach to JavaScript compiling is the experimental Higgs compiler created by University of Montreal researcher Maxime Chevalier-Boisvert. InfoWorld's Yegulalp describes the project in a September 19, 2014, article. Higgs differs from other just-in-time (JIT) JavaScript compilers such as Google's V8, Mozilla's SpiderMonkey, and Apple's LLVM-backed FTLJIT project in that it has a single level rather than being multitiered, and it accumulates type information as machine-level code rather than using type analysis.

When it comes to optimizing your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases, the new Morpheus Virtual Appliance makes it as easy as pointing and clicking in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.


How Three Companies Improved Performance and Saved Money with MongoDB


Too often the increased efficiencies and performance improvements promised by new data technologies seem to vanish into thin air when the systems hit the production floor. Not so for these three companies that implemented MongoDB databases in very different environments, but realized very similar benefits: faster app speeds and lower overall system costs.

A hedge fund reduced its software licensing costs by a factor of 40, and its data storage by 40 percent. In addition, its quantitative analysts' modeling is now 25 times faster.

A retailer has installed in-store touch screens that give its customers an enjoyable, interactive shopping experience. The company can create and modify its online catalogs in just minutes to keep pace with ever-changing fashion trends.

A firm that provides affiliate-marketing and partner-management services for enterprises was able to expand without incurring the added expenses for hardware and services it anticipated. The company's customers realized improved performance because the new system's compression and other storage enhancements allowed more of their report requests to be processed in RAM.

All three of these success stories were made possible by converting the companies' traditional databases to MongoDB.

Hedge fund adopts a self-service model for financial analyses

In the past, whenever British hedge fund AHL Man Group wanted to add any new data sources, it became a long, drawn-out process that piled onto the IT department's busy workload. As ComputerWeekly's Brian McKenna reports in a January 21, 2015, article, AHL decided to standardize on Python in 2012, and subsequently discovered that Python interfaced very smoothly with its MongoDB databases.

By the end of 2013 the company had completed a proof-of-concept project, after which it was able to finalize its transition to MongoDB by the end of May 2014, at which time its legacy-system licenses expired. The result was a 40-fold decrease in licensing costs, and a 40 percent reduction in disk-storage requirements. In addition, the switch to a self-service model has allowed some of the firm's analysts to perform their "quant" modeling up to 25 times faster than previously.

Retailer's in-store tablets keep pace with fashion trends

Another January 21, 2015, article on the Apparel site recounts how retailer Chico's FAS developed a MongoDB-based application for its in-store touch-screen Tech Tablets that customers use as virtual catalogs. In addition to highlighting Chico's latest styles, the tablets show product videos and testimonials. The key benefit of the MongoDB application is the ability to create and adapt catalogs in minutes rather than the weeks required previously.

It took Chico's only five months to develop and implement the MongoDB-based app, which easily scaled to meet the retailer's increased demand in the holiday shopping season. More importantly, the app created an interactive, personalized shopping experience that's sure to bring its customers back for more.

MongoDB distro lets expanding marketer avoid high hardware costs

As a company grows, its data networks have to grow along with it, which often increases cost and complexity exponentially. Affiliate-marketing and partner-management firm Performance Horizon Group (PHG) faced skyrocketing hardware expenses as it grew its operations supporting enterprise clients in more than 150 countries.

By implementing Tokutek's TokuMX distribution of MongoDB, PHG reduced its need for new servers by a factor of eight, according to PHG CTO Pete Cheyne. In addition, each of the new servers required only half the RAM of its existing machines while accommodating a growing number of data sets. PHG's implementation of TokuMX is described in a December 2, 2014, Tokutek press release.

Any organization can improve the efficiency of its database-management operations by adopting the new Morpheus Virtual Appliance, which lets you manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

The Key to Selecting a Programming Language: Focus


There isn't a single best programming language. Rather than flitting from one language to the next as each comes into fashion, determine the platform you want to develop apps for -- the web, mobile, gaming, embedded systems -- and then focus on the predominant language for that area.

"Which programming languages do you use?"

In many organizations, that has become a loaded question. There is a decided trend toward open source development tools, as indicated by the results of a Forrester Research survey of 1,400 developers. ZDNet's Steven J. Vaughan-Nichols reports on the study in an October 29, 2014, article.

Conventional wisdom says open-source development tools are popular primarily because they cost less than their proprietary counterparts. That belief is turned on its head by the Forrester survey, which found performance and reliability are the main reasons why developers prefer to work with open-source tools. (Note that Windows still dominates on the desktop, while open source leads on servers, in data centers, and in the cloud.)

Then again, "open source" encompasses a universe of different development tools for various platforms: the web, mobile, gaming, embedded systems -- the list goes on. A would-be developer can waste a lot of time bouncing from Rails to Django to Node.js to Scala to Clojure to Go. As Quincy Larson explains in a November 14, 2014, post on the FreeCodeCamp blog, the key to a successful career as a programmer is to focus.

Larson recounts his seven months of self-study of a half-dozen different programming languages before landing his first job as a developer -- in which he used none of them. Instead, his team used Ruby on Rails, a relative graybeard among development environments. The benefits of focusing on a handful of tools are many: developers quickly become experts, productivity is enhanced because people can collaborate without a difference in tools getting in the way, and programmers aren't distracted by worrying about missing out on the flavor of the month.

Larson recommends choosing a single type of development (web, gaming, mobile) and sticking with it; learning only one language (JavaScript/Node.js, Rails/Ruby, or Django/Python); and following a single online curriculum (such as FreeCodeCamp.com or NodeSchool.io for JavaScript, TheOdinProject.com or TeamTreehouse.com for Ruby, and Udacity.com for Python).


A cutout from Lifehacker's "Which Programming Language?" infographic lists the benefits of languages by platform. Source: Lifehacker

Why basing your choice on potential salary is a bad idea

Just because you can make a lot of money developing in a particular language doesn't mean it's the best career choice. Readwrite's Matt Asay points out in a November 28, 2014, article that a more rewarding criterion in the long run is which language will ensure you can find a job. Asay recommends checking RedMonk's list of popular programming languages.

Boiling the decision down to its essence, the experts quoted by Asay suggest JavaScript for the Internet, Go for the cloud, and Swift (Apple) or Java (Android) for mobile. Of course, as with most tech subjects, opinions vary widely. In terms of job growth, Ruby appears to be fading, Go and Node.js are booming, and Python is holding steady.

But don't bail on Ruby or other old-time languages just yet. According to Quartz's programmer salary survey, Ruby on Rails pays best, followed by Objective C, Python, and Java.

While Ruby's popularity may be on the wane, programmers can still make good coin if they know Ruby on Rails. Source: Quartz, via Readwrite

Also championing old-school languages is Readwrite's Lauren Orsini in a September 1, 2014, article. Orsini cites a study by researchers at Princeton and UC Berkeley that found inertia is the primary driver of developers' choice of language. People stick with a language because they know it, not because of any particular features of the language. Exhibits A, B, C, and D of this phenomenon are PHP, Python, Ruby, and JavaScript -- and that doesn't even include the Methuselah of languages: C.

No matter your language of choice, you'll find it combines well with the new Morpheus Virtual Appliance, which lets you monitor and manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

MongoDB Poised to Play a Key Role in Managing the Internet of Things


Rather than out-and-out replacing their relational counterparts, MongoDB and other NoSQL databases will coexist with traditional RDBMSs. However, as more -- and more varied -- data swamps companies, the scalability and data-model flexibility of NoSQL will make it the management platform of choice for many of tomorrow's data-analysis applications.

There's something comforting in the familiar. When it comes to databases, developers and users are warm and cozy with the standard, nicely structured tables-and-rows relational format. In the not-too-distant past, nearly all of the data an organization needed fit snugly in the decades-old relational model.

Well, things change. What's changing now is the nature of a business's data. Much time and effort has been spent converting today's square-peg unstructured data into the round hole of relational DBMSs. But rather than RDBMSs being modified to support the characteristics of non-textual, non-document data, companies are now finding it more effective to adapt databases designed for unstructured data to accommodate traditional data types.

Two trends are converging to make this transition possible: NoSQL databases such as MongoDB are maturing to add the data-management features businesses require; and the amount and types of data are exploding with the arrival of the Internet of Things (IoT).

Heterogeneous DBs are the wave of the future

As ReadWrite's Matt Asay reports in a November 28, 2014, article, any DBAs who haven't yet added a NoSQL database or two to their toolbelt are in danger of falling behind. Asay cites a report by Machine Research that found relational and NoSQL databases are destined to coexist in the data center: the former will continue to be used to process "structured, highly uniform data sets," while the latter will manage the unstructured data created by "millions and millions of sensors, devices, and gateways."

Relational databases worked for decades because you could predict the characteristics of the data they held. One of the distinguishing aspects of IoT data is its unpredictability: you can't be sure where it will come from, or what forms it will take. Managing this data requires a new set of skills, which has led some analysts to caution that a severe shortage of developers trained in NoSQL may impede the industry's growth.

The expected increase in NoSQL-based development in organizations could be hindered by a shortage of skilled staff. Source: VisionMobile, via ReadWrite

The ability to scale to accommodate data elements measured in the billions is a cornerstone of NoSQL databases, but Asay points out the feature that will drive NoSQL adoption is flexible data modeling. Whatever devices or services are deployed in the future, NoSQL is ready for them.

Document locking one sign of MongoDB's growing maturity

According to software consultant Andrew C. Oliver -- a self-described "unabashed fan of MongoDB" -- the highlight of last summer's MongoDB World conference was the announcement that document-level locking is now supported. Oliver gives his take on the conference happenings in a July 3, 2014, article on InfoWorld.

Oliver compares MongoDB's document-level locking to row-level locking in an RDBMS, although documents may contain much more data than a row in an RDBMS. Some conference-goers projected that multiple documents may one day be written with ACID consistency, even if done so "locally" to a single shard.

Another indication of MongoDB becoming suitable for a wider range of applications is the release of the SlamData analytics tool that works without having to export data via ETL from MongoDB to an RDBMS or Hadoop. InfoWorld's Oliver describes SlamData in a December 11, 2014, article.

In contrast to the Pentaho business-intelligence tool that also supports MongoDB, SlamData CEO Jeff Carr states that the company's product doesn't require a conversion of document databases to the RDBMS format. SlamData is designed to allow people familiar with SQL to analyze data based on queries of MongoDB document collections via a notebook-like interface.

The SlamData business-intelligence tool for MongoDB uses a notebook metaphor for charting data based on collection queries. Source: InfoWorld

There's no simpler or more-efficient way to manage heterogeneous databases than by using the point-and-click interface of the new Morpheus Virtual Appliance, which lets you monitor and analyze heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

MongoDB 3.0 First Look: Faster, More Storage Efficient, Multi-model


Document-level locking and pluggable storage APIs top the list of new features in MongoDB 3.0, but the big-picture view points to a more prominent role for NoSQL databases in companies of all types and sizes. The immediate future of databases is relational, non-relational, and everything in between -- sometimes all at once.

Version 3.0 of MongoDB, the leading NoSQL database, is being touted as the first release that is truly ready for the enterprise. The new version was announced in February and shipped in early March. At least one early tester, Adam Comerford, reports that MongoDB 3.0 is indeed more efficient at managing storage, and faster at reading compressed data.

The new feature in MongoDB 3.0 gaining the lion's share of analysts' attention is the addition of the WiredTiger storage engine and pluggable API that MongoDB acquired in December 2014. JavaWorld's Andrew C. Oliver states in a February 3, 2015, article that WiredTiger will likely boost performance over MongoDB's default MMapV1 engine in apps where reads don't greatly outnumber writes.

Oliver points out that WiredTiger's B-tree and Log Structured Merge (LSM) algorithms benefit apps with large caches (B-tree) and with data that doesn't cache well (LSM). WiredTiger also promises data compression that reduces storage needs by up to 80 percent, according to the company.


The addition of the WiredTiger storage engine is one of the new features in MongoDB 3.0 that promises to improve performance, particularly for enterprise customers. Source: Software Development Times

Other enhancements in MongoDB 3.0 include the following:

  • Document-level locking for concurrency control via WiredTiger
  • Collection-level concurrency control and more efficient journaling in MMapV1
  • A pluggable API for integration with in-memory, encrypted, HDFS, hardware-optimized, and other environments
  • The Ops Manager graphical management console in the enterprise version
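As a quick illustration of the pluggable-engine idea (this sketch is not from the original article; the data path and compressor choice are assumptions), WiredTiger can be selected when the server starts:

```
# Launch MongoDB 3.0 with WiredTiger instead of the default MMapV1.
# The dbpath and block compressor below are illustrative assumptions.
mongod --storageEngine wiredTiger \
       --dbpath /var/lib/mongo-wt \
       --wiredTigerCollectionBlockCompressor snappy
```

Existing MMapV1 data files are not converted automatically, so switching engines on an existing deployment typically involves a dump and restore.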

Computing's John Leonard emphasizes in a February 3, 2015, article that MongoDB 3.0's multi-model functionality via the WiredTiger API positions the database to compete with DataStax' Apache Cassandra NoSQL database and Titan graph database. Leonard also highlights the new version's improved scalability.


Putting MongoDB 3.0 to the (performance) test

MongoDB 3.0's claims of improved performance were borne out by preliminary tests conducted by Adam Comerford and reported on his Adam's R&R blog in posts on February 4, 2015, and February 5, 2015. Comerford repeated compression tests with the WiredTiger storage engine in release candidate 7 (RC7) -- expected to be the last before the final version comes out in March -- that he ran originally using RC0 several months ago. The testing was done on an Ubuntu 14.10 host with an ext4 file system.

The results showed that WiredTiger's on-disk compression reduced storage to 24 percent of non-compressed storage, and to only 16 percent of the storage space used by MMapV1. Similarly, the defaults for WiredTiger with MongoDB (the WT/snappy bar below) used 50 percent of non-compressed WiredTiger and 34.7 percent of MMapV1.

Testing WiredTiger storage (compressed and non-compressed) compared to MMapV1 storage showed a tremendous advantage for the new MongoDB storage engine. Source: Adam Comerford

Comerford's tests of the benefits of compression for reads when available I/O capacity is limited demonstrated much faster performance when reading compressed data using snappy and zlib, respectively. A relatively slow external USB 3.0 drive was used to simulate "reasonable I/O constraints." The times indicate how long it took to read the entire 16GB test dataset from the on-disk testing into memory from the same disk.

Read tests from compressed and non-compressed disks in a simulated limited-storage environment indicate faster reads with WiredTiger in all scenarios. Source: Adam Comerford


All signs point to a more prominent role in organizations of all sizes for MongoDB in particular and NoSQL in general. Running relational and non-relational databases side-by-side is becoming the rule rather than the exception. The new Morpheus Virtual Appliance puts you in good position to be ready for multi-model database environments. It supports rapid provisioning and deployment of MongoDB v3.0 across public, private and hybrid clouds. Sign Up for a Free Trial now!

Preparing Developers for a Multi-language Multi-paradigm Future


Tried-and-true languages such as Java, C++, Python, and JavaScript continue to dominate the most popular lists, but modern app development requires a multi-language approach to support diverse platforms and links to backend servers. The future will see new languages being used in conjunction with the old reliables.

Every year, new programming languages are developed. Recent examples are Apple's Swift and Carnegie Mellon University's Wyvern. Yet for more than a decade, the same handful of languages have retained their popularity with developers -- Java, JavaScript, C/C++/C#/Objective-C, Python, Ruby, PHP -- even though each is considered to have serious shortcomings for modern app development.

According to TIOBE Software's TIOBE Index for January 2015, JavaScript recorded the greatest increase in popularity in 2014, followed by PL/SQL and Perl.

The same old programming languages dominate the popularity polls, as shown by the most-recent TIOBE Index. Source: TIOBE Software

Of course, choosing the best language for any development project rarely boils down to a popularity contest. When RedMonk's Donnie Berkholz analyzed GitHub language trends in May 2014, aggregating new users, issues, and repositories, he concluded that only five languages have mattered on GitHub since 2008: JavaScript, Ruby, Java, PHP, and Python.


An analysis of language activity on GitHub between 2008 and 2013 indicates growing fragmentation. Source: RedMonk

Two important caveats to Berkholz's analysis are that GitHub focused on Ruby on Rails when it launched but has since gone more mainstream; and that Windows and iOS development barely register because both are generally open source-averse. As IT World's Phil Johnson points out in a May 7, 2014, article, while it's dangerous to draw conclusions about language popularity based on this or any other single analysis, it seems clear the industry is diverging rather than converging.

Today's apps require a multi-language, multi-paradigm approach

Even straightforward development projects require expertise in multiple languages. TechCrunch's Danny Crichton states in a July 10, 2014, article that creating an app for the web and mobile entails HTML, CSS, and JavaScript for the frontend (others as well, depending on the libraries required); Java and Objective-C (or Swift) for Android and iPhone, respectively; and for links to backend servers, Python, Ruby, or Go, as well as SQL or other database query languages.

Crichton identifies three trends driving multi-language development. The first is faster adoption of new languages: GitHub and similar sites encourage broader participation in developing libraries and tutorials; and developers are more willing to learn new languages. Second, apps have to run on multiple platforms, each with unique requirements and characteristics. And third, functional programming languages are moving out of academia and into the mainstream.

Researcher Benjamin Erb suggests that rather than functional languages replacing object-oriented languages, the future will be dominated by multi-paradigm development, in particular to address concurrency requirements. In addition to supporting objects, inheritance, and imperative code, multi-paradigm languages incorporate higher-order functions, closures, and restricted mutability.

One way to future-proof your SQL, NoSQL, and in-memory databases is by using the new Morpheus Virtual Appliance, which lets you manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site for pricing information and to create a free account.

Don't Go Into Elasticsearch Production without this Checklist


Elasticsearch is a great tool to provide fast and powerful search services to your web sites or applications, but care should be taken when moving from development to production. By following the checklist below, you can avoid some issues that may arise if you use development settings in a production environment!

Configure Your Log and Data Paths

To minimize the chances of data loss in a production environment, it is highly recommended that you change your log and data paths from the default paths to something that is less likely to be accidentally overwritten.

You can make these changes in the configuration file (which uses YAML syntax) under path, as in the following example, which uses suggested production paths from the Elasticsearch team:

Suggested settings for the log and data paths. Source: Elasticsearch.
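A minimal elasticsearch.yml sketch along those lines (the directories are illustrative assumptions, not required paths):

```yaml
# elasticsearch.yml -- keep data and logs outside the install directory
# so upgrades and reinstalls cannot overwrite them
path:
  data: /var/data/elasticsearch
  logs: /var/log/elasticsearch
```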

Configure Your Node and Cluster Names

When you are looking for a particular node or cluster, it helps to have names that describe each one and distinguish it from the others.

The default cluster name of "elasticsearch" would allow any node to join the cluster, even unintentionally. It is therefore a good idea to give each cluster a distinct identifier.

The default node names are chosen randomly from a set of roughly 3,000 Marvel character names. While this wouldn't be so bad for a node or two, it can get confusing once you have more than a few nodes. The better option is to use descriptive names from the beginning, avoiding confusion as nodes are added later.
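In elasticsearch.yml, both names are one-line settings (the values here are illustrative assumptions):

```yaml
# elasticsearch.yml -- example names, not defaults
cluster.name: logging_prod    # distinct name keeps unrelated nodes from joining
node.name: logging_node_1     # descriptive, instead of a random Marvel character
```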

Configure Memory Settings

If the operating system swaps memory, the Elasticsearch process itself can be swapped out, which severely degrades performance in production. To prevent this, the Elasticsearch team suggests disabling swapping, configuring it to occur only in emergency conditions, or (for Linux/Unix users) using mlockall to lock the process's address space into RAM so it cannot be swapped.
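For the mlockall route, the relevant setting (as named in Elasticsearch 1.x; check your version's documentation) looks like this:

```yaml
# elasticsearch.yml -- lock the JVM's address space into RAM
bootstrap.mlockall: true
```

If the lock fails (commonly because of memlock ulimit restrictions), Elasticsearch notes it in the startup log.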

Configure Virtual Memory Settings

Elasticsearch indices use mmapfs/niofs, but the default mmap count on many operating systems is too low, which can produce errors such as "out of memory" exceptions. To fix this, raise the limit to accommodate Elasticsearch indices. The following example shows how the Elasticsearch team recommends increasing the limit on Linux systems (run the command as root):

Suggested command to increase the mmap count for Linux systems. Source: Elasticsearch.
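The command itself is a one-liner (262144 is the commonly cited value; treat it as a starting point):

```
# Run as root: raise the kernel mmap limit for Elasticsearch's mmapfs indices.
sysctl -w vm.max_map_count=262144

# Make the change survive a reboot:
echo "vm.max_map_count=262144" >> /etc/sysctl.conf
```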

Ensure Elasticsearch Is Monitored

It is a good idea to monitor your Elasticsearch installation so that you can see the status or be alerted if or when something goes wrong. A service such as Happy Apps can provide this type of monitoring for you (and can monitor the rest of your app as well).

Get Elasticsearch in the Cloud

When you launch an application that uses Elasticsearch, you will want reliable and stable database hosting. The Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard.

With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Troubleshoot Lost MySQL Remote Connections


There aren't many MySQL databases that don't need to support users' remote connections. While many failed remote connections can be traced to a misconfigured my.cnf file, the many nuances of MySQL remote links make troubleshooting a dropped network connection anything but straightforward.

Some Linux administrators were rattled this week to learn that Qualys had discovered a bug in the GNU C Library (glibc), dubbed Ghost, that could render affected systems vulnerable to a remote code execution attack. In a January 27, 2015, article, The Register's Neil McAllister describes the dangers Ghost poses to Linux and a handful of other OSes.

Ghost affects versions of glibc back to 2.2, which was released in 2000, but as threats go, this one appears to be fairly mild. For one thing, the routines involved are old and rarely used these days; even when they are used, they aren't called in a manner the vulnerability could exploit. Still, Linux vendors Debian, Red Hat, and Ubuntu have released patches for Ghost.

As Ars Technica's Dan Goodin explains in a January 27, 2015, article, Ghost was initially thought to affect MySQL servers, Secure Shell servers, form-submission apps, and mail servers beyond the Exim server on which Qualys demonstrated the remote code execution attack. However, Qualys has since confirmed that Ghost does not impact Apache, Cups, Dovecot, GnuPG, isc-dhcp, lighttpd, mariadb/mysql, nfs-utils, nginx, nodejs, openldap, openssh, postfix, proftpd, pure-ftpd, rsyslog, samba, sendmail, sysklogd, syslog-ng, tcp_wrappers, vsftpd, or xinetd.

Finding a solution to common MySQL remote-connection glitches

While protecting against Ghost may be as simple as applying a patch, managing remote connections on MySQL servers and clients can leave DBAs pounding their keyboards. ITworld's Stephen Glasskeys writes in a December 19, 2014, post about the hoops he had to jump through to find the cause of a failed remote connection on a Linux MySQL server.

After using the ps command to list processes, Glasskeys found that the --skip-networking option was enabled, which tells MySQL not to listen for remote TCP/IP connections. Running KDE's Find Files/Folders tool determined that rc.mysqld was the only script file containing the text "--skip-networking".

Diagnosing the cause of failed remote connections on a MySQL server led to the file rc.mysqld. Source: ITworld

To restore remote connections, open rc.mysqld and comment out the command by placing a pound sign (#) at the beginning of the line. Then edit the MySQL configuration file /etc/my.cnf as follows, making sure bind-address is set to 0.0.0.0:

Edit the /etc/my.cnf file to ensure bind-address is set to 0.0.0.0. Source: ITworld

Finally, use the "iptables --list" command to make sure the Linux server is set to accept requests on MySQL's port 3306, and the "iptables" command to enable them if it's not. After you restart MySQL, you can test the remote connection using the credentials and other options as they appear in the my.cnf file on the Linux server.
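Collected into one place, the sequence looks roughly like this (file locations vary by distribution, so treat the paths as assumptions):

```
# 1. In rc.mysqld, comment out the option that disables networking:
#        # --skip-networking
# 2. In /etc/my.cnf, under [mysqld], listen for remote connections:
#        bind-address = 0.0.0.0
# 3. Check that the firewall accepts MySQL's port 3306...
iptables --list | grep 3306
# 4. ...and open it if necessary:
iptables -A INPUT -p tcp --dport 3306 -j ACCEPT
# 5. Restart MySQL and test the remote connection.
```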

When MySQL's % wildcard operator leaves a remote connection hanging

A Stack Overflow post from April 2013 describes a situation where MySQL's % wildcard failed to allow a user ("user@%") to connect remotely. For such remote connections, each machine's my.cnf must bind MySQL to an IP address reachable on port 3306. The user also has to be created both for localhost and for the % wildcard, with permissions granted on all databases. (You may also need to open port 3306, depending on your OS.)

Enable remote connections in the MySQL my.cnf file by adding each machine's IP and creating users in localhost and the % wildcard. Source: ITworld
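A sketch of the account setup, wrapped in the mysql client (the user name and password are placeholders, and the broad grant is for illustration; grant only the privileges you actually need):

```
mysql -u root -p <<'SQL'
-- create the account for local and remote (%) connections
CREATE USER 'appuser'@'localhost' IDENTIFIED BY 'secret';
CREATE USER 'appuser'@'%' IDENTIFIED BY 'secret';
-- grant permissions on all databases, as the post describes
GRANT ALL PRIVILEGES ON *.* TO 'appuser'@'localhost';
GRANT ALL PRIVILEGES ON *.* TO 'appuser'@'%';
FLUSH PRIVILEGES;
SQL
```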

Diagnosing failed remote connections and other database glitches is facilitated by the point-and-click interface of the new Morpheus Virtual Appliance, which lets you manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.


Troubleshooting Problems with MySQL Replication

One of the most common MySQL operations is replicating databases between master and slave servers. While most such connections are straightforward to establish and maintain, on occasion something goes amiss: some master data may not replicate on the slave, or read requests may be routed to the master rather than to a slave, for example. Finding a solution to a replication failure sometimes requires a little extra detective work.

Replication is one of the most basic operations in MySQL -- and any other database: it's used to copy data from one database server (the master) to one or more others (the slaves). The process improves performance by allowing loads to be distributed among multiple slave servers for reads, and by limiting the master server to writes.

Additional benefits of replication are security via slave backups; analytics, which can be performed on the slaves without affecting the master's performance; and widespread data distribution, which is accomplished without requiring access to the master. (See the MySQL Reference Manual for more on replication.)

As with any other aspect of database management, replication doesn't always proceed as expected. The Troubleshooting Replication section of the MySQL Reference Manual instructs you to check for messages in your error log when something goes wrong with replication. If the error log doesn't point you to the solution, ensure that binary logging is enabled in the master by issuing a SHOW MASTER STATUS statement. If it's enabled, "Position" is nonzero; if it isn't, make sure the master is running with the --log-bin option.

The manual offers several other replication-troubleshooting steps:

  • The master and slave must both start with the --server-id option, and each server must have a unique ID value;
  • Run SHOW SLAVE STATUS to ensure the Slave_IO_Running and Slave_SQL_Running values are both "yes";
  • Run SHOW PROCESSLIST and look in the State column to verify that the slave is connecting to the master;
  • If a statement succeeded on the master but failed on the slave, the nuclear option is to do a full database resynchronization, which entails deleting the slave's database and copying a new snapshot from the master. (Several less-drastic alternatives are described in the MySQL manual.)
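The SHOW SLAVE STATUS check above lends itself to scripting. This hedged sketch parses the \G-style output (field names as documented in the MySQL manual) and reports whether both replication threads are running:

```python
def slave_threads_running(status_text):
    """Parse SHOW SLAVE STATUS\\G output and return True only if both
    Slave_IO_Running and Slave_SQL_Running report "Yes"."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")
```

Feed it the raw output of `mysql -e "SHOW SLAVE STATUS\G"` and alert whenever it returns False.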

Solutions to real-world MySQL replication problems

What do you do when MySQL indicates the master-slave connection is in order, yet some data on the master isn't being copied to the slave? That's the situation described in a Stack Overflow post from March 2010.

Even though replication appears to be configured correctly, data is not being copied from the master to the slave. Source: Stack Overflow

The first step is to run "show master status" or "show master status\G" on the master database to get the correct log file and position values for the slave. The slave status above indicates the slave is connected to the master and awaiting log events; syncing the slave to the correct log file position should restore copying.

To ensure a good sync, stop the master, dump the database, record the master log file positions, restart the master, import the database to the slave, and start the slave in slave mode with the correct master log file position.
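The last step, pointing the slave at the recorded binlog position, is done with CHANGE MASTER TO. The helper below just assembles that statement; the host, file, and position are placeholder examples:

```python
def change_master_statement(host, log_file, log_pos):
    """Assemble the CHANGE MASTER TO statement that points a slave at
    the master binlog position recorded during the dump."""
    return (f"CHANGE MASTER TO MASTER_HOST='{host}', "
            f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};")
```

Run the generated statement on the slave (followed by START SLAVE) after importing the dump.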

Another Stack Overflow post from March 2014 presents a master/slave setup using JDBC drivers in which transactions marked as read-only were still pinging the master. Since the MySQL JDBC driver was managing the connections to the physical servers -- master and slave -- the connection pool and Spring transaction manager weren't aware that the database connection was linking to multiple servers.

The solution is to return control to Spring, after which the transaction on the connection will be committed. The transaction debug message will indicate that queries will be routed to the slave server so long as the connection is in read-only mode. By resetting the connection before it is returned to the pool, the read-only mode is cleared and the last log message will show that queries are now being routed to the master server.
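The routing behavior can be pictured with a toy model (Python here purely for illustration; the original setup used the MySQL JDBC driver with Spring):

```python
class ReplicationConnection:
    """Toy model of the driver behavior described above: while a
    connection is in read-only mode, queries route to the slave;
    clearing read-only before the connection goes back to the pool
    restores routing to the master."""

    def __init__(self):
        self.read_only = False

    def target(self):
        return "slave" if self.read_only else "master"

    def return_to_pool(self):
        # Reset so the next borrower hits the master by default
        self.read_only = False
```

The reset in return_to_pool mirrors the fix in the post: without it, a recycled connection would keep routing queries to the slave.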

The point-and-click dashboard in the new Morpheus Virtual Appliance makes it a breeze to diagnose and repair replication errors -- and other hiccups -- in your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus lets you seamlessly provision, monitor, and analyze SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.

Avoid Being Locked into Your Cloud Services

Before you sign on the dotted line for a cloud service supporting your application development or other core IT operation, make sure you have an easy, seamless exit strategy in place. Just because an infrastructure service is based on open-source software doesn't mean you won't be locked in by the service's proprietary APIs and other specialty features.

In the quest for ever-faster app design, deployment, and updating, developers increasingly turn to cloud infrastructure services. These services promise to let developers focus on their products rather than on the underlying servers and other exigencies required to support the development process.

However, when you choose cloud services to streamline development, you run the risk of being locked in, at either the code level or the architecture level. Florian Motlik, CTO of continuous-integration service Codeship, writes in a February 21, 2015, article on Gigaom that infrastructure services mask the complexities underlying cloud-based development.

Depending on the type of cloud infrastructure service you choose, the vendor may manage more or less of your data operations. Source: Crucial

Even when the services you use adhere strictly to open systems, there is always a cost associated with switching providers: transfer the data, change the DNS, and thoroughly test the new setup. Of particular concern are services such as Google App Engine that lock you in at the code level. However, Amazon Web Services Lambda, Heroku, and other infrastructure services that let you write Node.js functions and invoke them either via an API or on specific events in S3, Kinesis, or DynamoDB entail a degree of architecture lock-in as well.

To minimize lock-in, Motlik recommends using a micro-services architecture based on technology supported by many different providers, such as Rails or Node.

Cloud Computing Journal's Gregor Petri identifies four types of cloud lock-in:

  • Horizontal locks you into a specific product and prevents you from switching to a competing service;
  • Vertical limits your choices at other levels of the stack, such as the database or OS;
  • Diagonal locks you into a single vendor's family of products, perhaps in exchange for reduced management and training costs or a substantial discount;
  • Generational prevents you from adopting new technologies as they become available.

Gregor Petri identifies four types of cloud lock-in: horizontal, vertical, diagonal, and generational. Source: Cloud Computing Journal

Will virtualization bring about the demise of cloud lock-in?

Many cloud services are addressing the lock-in trap by making it easier for potential customers to migrate their data and development tools/processes from other platforms to the services' own environments. Infinitely Virtual founder and CEO Adam Stern claims that virtualization has "all but eliminated" lock-in related to operating systems and open source software. Stern is quoted by Linux Insider's Jack M. Germain in an article from November 2013.

Alsbridge's Rick Sizemore points out that even with the availability of tools for migrating data between VMware, OpenStack, and Amazon Web Services, customers may be locked in by contract terms that limit when they can remove their data. Sizemore also cautions that services may combine open source tools in a proprietary way that locks in your data.

In a February 9, 2015, article in Network World, HotLink VP Jerry McLeod points out that you can minimize the chances of becoming locked into a particular service by ensuring that you can move hybrid workloads seamlessly between disparate platforms. McLeod warns that vendors may attempt to lock in their customers by requiring that they sign long-term contracts.

Seamless workload migration and customer-focused contract terms are only two of the features that make the new Morpheus Virtual Appliance a "lock-in free" zone. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor your MongoDB, Redis, MySQL, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.

Don't Go Into Elasticsearch Production without this Checklist

Elasticsearch is a great tool to provide fast and powerful search services to your web sites or applications, but care should be taken when moving from development to production. By following the checklist below, you can avoid some issues that may arise if you use development settings in a production environment!

Configure Your Log and Data Paths

To minimize the chances of data loss in a production environment, it is highly recommended that you change your log and data paths from the default paths to something that is less likely to be accidentally overwritten.

You can make these changes in the configuration file (which uses YAML syntax) under path, as in the following example, which uses suggested production paths from the Elasticsearch team:

Suggested settings for the log and data paths. Source: Elasticsearch.
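As a sketch, the settings look like this in elasticsearch.yml (the paths below are illustrative; choose locations that won't be wiped by an upgrade or reinstall):

```yaml
# elasticsearch.yml -- illustrative log and data paths
path:
  logs: /var/log/elasticsearch
  data: /var/data/elasticsearch
```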

Configure Your Node and Cluster Names

When you are looking for a particular node or cluster, it helps to have a name that describes what it does and distinguishes it from the others.

The default cluster name of "elasticsearch" could allow any node with default settings to join the cluster, even if this was not intended. Thus, it is a good idea to give the cluster a distinct identifier instead.

The default node names are chosen randomly from a set of roughly 3,000 Marvel character names. While this is harmless for a node or two, it becomes confusing once you add more than a few nodes. The better option is to use descriptive names from the beginning to avoid confusion as nodes are added later.
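Both names are one-line settings in elasticsearch.yml; for example (the names here are placeholders):

```yaml
# elasticsearch.yml -- example cluster and node names
cluster.name: orders_production
node.name: orders_node_1
```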

Configure Memory Settings

Memory swapping by the operating system can cause the Elasticsearch process itself to be swapped out, which is bad for performance in production. Suggestions from the Elasticsearch team to fix this include disabling swapping entirely, configuring swapping to run only in emergency conditions, or (for Linux/Unix users) using mlockall to lock the process address space into RAM so it cannot be swapped.
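In the Elasticsearch versions current when this was written, the mlockall option was a one-line setting in elasticsearch.yml (later versions renamed it bootstrap.memory_lock):

```yaml
# elasticsearch.yml -- ask the process to lock its address space in RAM
bootstrap.mlockall: true
```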

Configure Virtual Memory Settings

Elasticsearch indices use mmapfs/niofs, but the default mmap count on the operating system can be too low, which leads to errors such as "out of memory" exceptions. To fix this, raise the limit to accommodate Elasticsearch indices. The following example shows how the Elasticsearch team recommends increasing the limit on Linux systems (run the command as root):

Suggested command to increase the mmap count for Linux systems. Source: Elasticsearch.
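To make the higher limit survive a reboot, it can also be added to /etc/sysctl.conf; 262144 is the value commonly recommended for Elasticsearch:

```
# /etc/sysctl.conf -- persist the higher mmap limit across reboots
vm.max_map_count = 262144
```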

Ensure Elasticsearch Is Monitored

It is a good idea to monitor your Elasticsearch installation so that you can see the status or be alerted if or when something goes wrong. A service such as Happy Apps can provide this type of monitoring for you (and can monitor the rest of your app as well).

Get ElasticSearch in the Cloud

When you launch your application that uses Elasticsearch, you will want reliable and stable database hosting. Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard.

With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Three Different Approaches to Hardening MySQL Security

The new MySQL 5.7.6 developer milestone 16 features noteworthy security upgrades, but others propose more radical approaches to database security. One method puts applications in charge of testing and reporting on their own security, while another separates the app from all security responsibility by placing each in its own virtual machine.

When a database release claims to improve performance over its predecessors by a factor of two to three times, you take notice. That's what Kay Ewbank claims in a March 12, 2015, post on the iProgrammer site about MySQL 5.7.6 developer milestone 16. The new version was released on March 9, 2015, and is available for download (its source code can be downloaded from GitHub).

In a March 10, 2015, post on the MySQL Server blog, Geir Hoydalsvik lists milestone 16's many new features and fixes. (Prepare to give your mouse scroll wheel a workout: the list is long.) Ewbank points in particular to the InnoDB data engine's CREATE TABLESPACE syntax for creating general tablespaces in which you can choose your own mapping between tables and tablespaces. This allows you to group all the tables of one customer in a single tablespace, for example.

(Note Hoydalsvik's warning that the milestone release is "for use at your own risk" and may require data format changes or a complete data dump.)

One of the update's security enhancements relates to the way the server checks the validity of the secure_file_priv system variable, which is intended to limit the effects of data import and export operations. In the new release, secure_file_priv can be set to null to disable all data imports and exports. Also, the default value now depends on the INSTALL_LAYOUT CMake option.

The default value of the secure_file_priv system variable is platform specific in MySQL 5.7.6 developer milestone 16. Source: MySQL Release Notes
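As a my.cnf sketch (illustrative; check the release notes for the exact semantics on your platform), disabling import and export entirely looks like this:

```ini
# /etc/my.cnf -- setting secure_file_priv to NULL disables all
# data import and export operations (MySQL 5.7.6 and later)
[mysqld]
secure_file_priv = NULL
```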

Apps that continuously test and report on their own security

Data security generally entails scanning applications to spot problems and missing patches. In a March 5, 2015, article on InformationWeek's Dark Reading site, Jeff Williams proposes building security into the application via "instrumentation," which entails continuous testing and reporting by the app of its own security status.

Instrumentation collects security information from the apps without requiring scans because the programs test their own security and report the results back to the server. Williams gives the example of a report that identifies every non-parameterized query in an organization, enforcing a common SQL injection defense: requiring that all queries be parameterized.
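The defense being reported on, parameterized queries, looks like this in practice. The sketch below uses sqlite3 purely so the example is self-contained and runnable; MySQL client libraries expose the same placeholder idea:

```python
import sqlite3

# In-memory database standing in for MySQL in this demo
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES (?)", ("alice",))

def find_user(name):
    # The input is bound as a parameter, never spliced into the SQL
    # string, so injection payloads are treated as literal data
    cur = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return cur.fetchall()
```

A normal lookup like find_user("alice") returns the row, while an injection attempt such as find_user("alice' OR '1'='1") matches nothing, because the whole payload is compared as a plain string.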

The opposite extreme: Separating security from the application

Another novel approach to database security is exemplified in Waratek AppSecurity for Java, which is reviewed by SC Magazine's Peter Stephenson in a March 2, 2015, article. The premise is that security is too important to be left to the application's developers. Instead, create a sandbox for Java, similar to a firewall but without the tendency to report false positives.

Waratek's product assigns each app the equivalent of its own virtual container, complete with a hypervisor. The container holds its own security rules, which frees developers to focus solely on their applications. Stephenson offers the example of a container that defends against a SQL injection attack on a MySQL database.

Waratek AppSecurity for Java creates a secure virtual machine that applies security rules from outside the application. Source: Waratek

Application security is at the core of the new Morpheus Virtual Appliance. With the Morpheus database-as-a-service (DBaaS) you can provision, deploy, and monitor heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with your SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.
