
Alternative Approaches to Assigning User Privileges in MySQL


Until more sophisticated User Role-type controls are added to MySQL, developers will have to use GRANT and REVOKE statements to manage user privileges, or the Administrative Roles options provided in MySQL Workbench. Troubleshooting table-creation glitches in MySQL can be the source of much developer frustration, particularly when trying to assign privileges in a single database.

MySQL is not noted for the ease with which you can determine which users can access which features and functions. As Database Journal's Rob Gravelle explains in a February 13, 2014, article, SQL-type User Role controls were originally anticipated in MySQL 5.0, but Oracle has postponed the feature to MySQL 7.0.

Gravelle describes three tools that add User Roles to MySQL: Google's aptly named google_mysql_tools, the open-source project SecuRich, and MySQL Workbench, whose Administrative Roles feature is described below. (Note that google_mysql_tools is written in Python and thus requires the MySQLdb connector.)

The MySQL Reference Manual presents the basics on how to use MySQL's GRANT statements to assign privileges to user accounts, including access to secure connections and server resources. As you might expect, the REVOKE statement is used to revoke privileges. The typical scenario is to create an account using CREATE USER, and then define its privileges and characteristics using GRANT.
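For example, the typical sequence looks something like this (the account, host, and database names here are placeholders, not the manual's exact example):

    CREATE USER 'appuser'@'localhost' IDENTIFIED BY 'password';
    GRANT SELECT, INSERT, UPDATE ON mydb.* TO 'appuser'@'localhost';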

The standard method of assigning user privileges in MySQL is to use the GRANT statement. Source: MySQL Reference Manual

Privileges can be granted globally using "ON *.*" syntax, at the database level using "ON db_name.*", at the table level using "ON db_name.tbl_name", and at the column level using the following syntax:
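A sketch of the column-level form (table, column, and account names are illustrative):

    GRANT SELECT (col1), INSERT (col1, col2)
    ON mydb.mytbl
    TO 'someuser'@'somehost';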

Assign user privileges at the column level in MySQL by enclosing the column or columns within parentheses. Source: MySQL Reference Manual

Other privileges apply to stored routines and proxy users. The "WITH" clause is used to allow one user to grant privileges to other users, to limit the user's access to resources, and to require that a user use secure connections in a specific way.

Assigning Administrative Roles via MySQL Workbench

Applying roles to users in MySQL Workbench is as easy as selecting the user account, choosing the Administrative Roles tab, and checking the boxes, as shown in the image below.

MySQL Workbench's Administrative Roles tab lets you assign user privileges by checking the appropriate boxes. Source: MySQL Reference Manual

Likewise, choose the Schema Privileges tab to assign such privileges as the ability to create temporary tables.

The inability to create tables can be a thorny problem for MySQL developers. A Stack Overflow post from February 2011 highlights several possible solutions to a recalcitrant create-table command. The first proposed solution is to grant all privileges via "GRANT ALL PRIVILEGES ON mydb.* TO 'myuser'@'%' WITH GRANT OPTION;". Such a "Super User" account is not recommended for production databases, however, nor for granting privileges on a single database.

Alternatively, you could use the following syntax to limit the privilege to a particular database:
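Based on the caption below, the statements presumably resemble the following (database and account names are placeholders):

    GRANT ALL PRIVILEGES ON mydb.* TO 'myuser'@'%';
    GRANT ALL PRIVILEGES ON mydb.* TO 'myuser'@'localhost';
    FLUSH PRIVILEGES;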

Grant a MySQL user the ability to create tables in a single database by using the "@%" and "@localhost" qualifiers. Source: Stack Overflow

A similar problem encountered when trying to allow MySQL users to create tables is presented in a Stack Exchange post from July 2014. The developer wants the user to be able to create, update, and delete tables, but to be prevented from changing the password or viewing all the records in the database. (The default setting in MySQL allows users to change their own passwords, but only administrators can change other users' passwords.)

Using MySQL Workbench, you can open the Users and Privileges options and create a role that has no administrative privileges but "all object rights" and "DDL rights" for the specific schema. Limiting users to a single schema prevents them from viewing or changing tables in any other schema, apart from the read-only information_schema administrative schema.

Much of the pain of managing MySQL, MongoDB, Redis, and ElasticSearch databases is mitigated by using the Morpheus database-as-a-service. Morpheus lets you provision, deploy, and host your databases in just seconds using a simple point-and-click interface, and backups are provided for MySQL and Redis databases.

Morpheus is the first and only DBaaS that supports SQL, NoSQL, and in-memory databases. The service lets you use a range of database tools for connecting, configuring, and managing your databases. Visit the Morpheus site to create a free account.


Is MySQL the Right Tool for Your Job?


When searching for the right database to use for a particular application, you have a number of determinations to make. Depending on the structure of your data, how much data you have, how fast queries need to be, and other considerations, a MySQL database may just be the tool that best fits the job at hand.

What is MySQL?

MySQL is a popular open-source relational database management system (RDBMS), which means the database model is a set of relations. The idea is to have a very organized structure with data that is always consistent, preferably with no duplication. This can be achieved by properly normalizing the database.

What are some advantages of MySQL?

Consistent data – A normalized MySQL database is quite reliable when it comes to having accurate data when queried. Since there is no duplicate data stored in another location, any query for a piece of data will return the most current and correct data.

Use of SQL – SQL (Structured Query Language) is a very popular means of writing queries that can add, update, or retrieve stored data. This means that many developers and database administrators will already be familiar with the query syntax that will be needed when working with MySQL.
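For instance, a simple query (the table and column names here are hypothetical) reads almost like English:

    SELECT first_name, last_name
    FROM customers
    WHERE city = 'Portland';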

An example of SQL syntax. Source: 1KeyData.

ACID model – ACID stands for Atomicity, Consistency, Isolation, and Durability. This helps to ensure that all database transactions are reliable. For example, atomicity means that if any part of a database transaction should fail, then the entire transaction fails (even if some parts of it would succeed). This helps prevent the potential problems that can occur if partial transactions are executed.

When is MySQL the right tool?

MySQL can be more difficult to scale than a NoSQL database, so if you have a very large amount of data that will consistently be growing in size, you may want to consider a NoSQL solution, which allows for quick storage and queries with fewer round trips to the database.

On the other hand, MySQL is typically the right tool in situations where you need your data and any transactions dealing with the data to be consistent and reliable. This is certainly true when you are dealing with sensitive data such as financial or confidential information, which needs to be accurate at all times.

Of course, there are also cases where you deal with both big data and sensitive data. In such instances, you can get both MySQL and a NoSQL system to work together to use the best features of each database where they are needed.

An example of MySQL and MongoDB modeling the same data. Source: ScaleBase.

Get a Hosted MySQL Database

If you want to use MySQL for your application, one way to do so is to use a service like Morpheus, which offers databases as a service on the cloud. With Morpheus, you can easily set up one or more databases (including MongoDB, ElasticSearch, and more).

In addition to this, all databases are deployed on a high performance infrastructure with Solid State Drives, and are backed up, replicated, and archived. Open a free account today!

How to Ensure Peak Performance When Using Hash-based Sharding in MongoDB


Striking the perfect balance between write and query performance in a MongoDB database distributed between clustered servers depends on choosing an appropriate hash-based shard key. Conversely, choosing the wrong key can slow writes and reads to a crawl, in addition to squandering server storage space.

Hash-based sharding was introduced in version 2.4 of MongoDB as a way to allow shards to be distributed efficiently among clustered servers. As the MongoDB manual explains, your choice of shard key -- and the resulting data partitioning -- is a balancing act between write and query performance.

Using a randomized shard key offers many benefits for scaling write operations in a cluster, but random shard keys can cause query performance to suffer because they don't support query isolation: the mongos router must query all or nearly all shards. Step-by-step instructions for creating a hashed index and for sharding a collection using a hashed shard key are provided in the MongoDB manual.

As straightforward as the concept of hash-based sharding appears, implementing the technique on a live MongoDB database can be anything but trouble-free. A post on the MongoDB blog highlights the tradeoffs required to establish the optimal sharding balance for a specific database.

Once you've named the collection to be sharded and the hashed "identifier" field for the documents in the collection, you create the hashed index on the field and then shard the collection using the same field. The post uses as an example a collection named "mydb.webcrawler" and an identifier field named "url".
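In the mongo shell of that era, creating the hashed index is a one-liner (a sketch using the post's collection and field names):

    db.webcrawler.ensureIndex( { url: "hashed" } )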

After naming the collection and the hashed identifier field, you create the field's hashed index. Source: MongoDB Blog
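Sharding the collection on the same field is equally brief:

    sh.shardCollection( "mydb.webcrawler", { url: "hashed" } )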

Next, use the same field to shard the collection. Source: MongoDB Blog

While it's best to shard a collection before adding data and to pre-split the chunks, the balancer automatically repositions chunks to ensure even distribution of the data when you shard an existing collection. The split and moveChunk commands apply to hash-based shard keys, but using their "find" mechanism can be problematic because the specifier document is hashed to determine the containing chunk. The solution is to use the "bound" parameter instead when manipulating chunks or entire collections manually.

When hash-based sharding impedes performance

The consequences of choosing the wrong shard key when you hash a MongoDB collection are demonstrated in a Stack Overflow post from September 2013. After the poster sharded a collection on the hashed _id field, the resulting _id_hashed index was taking up nearly a gigabyte of space. The poster asked whether the index could be deleted, since only the _id field is used to query the documents.

Hash-based sharding requires a hashed index on the shard key, which the cluster uses to route each query to the appropriate shard. In this case, the optimizer uses the _id index because it is unique and generates a more efficient plan, but the _id_hashed index still cannot be dropped.

In an October 14, 2014, post on the Wenda.io site, the process of applying a hash-based shard to a particular field is explained. The goal is to allow the application to generate a static hash for the field value so that the hash will always be the same if the value is the same.

When you designate a field in a document as a hash-sharded field, a hash value for that field is generated automatically just before the document is read or written. Outgoing queries are assigned that hash value, which is then used for shard targeting. However, this approach can affect default chunk balancing, and it depends on selecting an appropriate hash function.

Much of the hassle of managing MongoDB collections -- as well as MySQL, Redis, and ElasticSearch databases -- is eliminated by the simple interface of the Morpheus database-as-a-service (DBaaS). Morpheus lets you provision, deploy, and host heterogeneous databases via a single console.

Morpheus is the first and only DBaaS that supports SQL, NoSQL, and in-memory databases. A free full replica set is deployed with each database instance you provision, and your MySQL and Redis databases are backed up. Visit the Morpheus site to create a free account.

When is MongoDB the Right Tool for Your Job?


With many tools available for information storage, sometimes it can be difficult to determine the best one to use for a particular case. Find out when MongoDB may be the right tool for the job.

TL;DR: MongoDB has become quite popular in recent years, but is it the right tool to use for your application? When choosing a database, it is a good idea to pick one that has the features you most need and performs well in your particular situation. This way, you are less likely to be hit with surprises down the road.

What is MongoDB?

MongoDB is a NoSQL, document-oriented database. This means that it does not use SQL (Structured Query Language) for queries, and also does not use the relational tables used in traditional relational databases. Instead, it stores related information in a single document using a JSON-like structure (called BSON).

What are some advantages of MongoDB?

Big Data - Since MongoDB is easily scalable and can search through large amounts of data quickly in most cases, it is a good database to use when you have massive amounts of data. Its scalability helps when you are consistently adding more data to the mix.

BSON - BSON (Binary JSON) is a binary method of storing simple data structures using the same type of format as JSON (JavaScript Object Notation). Given that numerous programmers understand JSON already, using the BSON format for documents makes it easy for programmers to access the needed data.
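For instance, a query against a hypothetical users collection looks like this in the mongo shell:

    db.users.find( { age: { $gt: 18 } } )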

An example MongoDB query using the BSON format. Source: MongoDB.

Document-Oriented - Unlike relational databases, which need to be normalized to try to eliminate duplicate data, MongoDB stores data in as few documents as possible instead. This means that related data is usually easier to put together and to locate later, making it more user-friendly in that area.
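A sketch of what such a document might look like (all field names and values here are hypothetical):

    {
        "name": "Jane Doe",
        "address": { "street": "123 Main St", "city": "Springfield" },
        "orders": [ { "item": "notebook", "qty": 2 } ]
    }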

An example MongoDB document. Source: MongoDB.

When is MongoDB the right tool?

While being document-oriented is more user-friendly, the cost is that some data will likely be duplicated, and duplicates must eventually be reconciled to the most recent, correct value. With that in mind, a normalized relational database is typically better when you are storing sensitive information (such as personal or financial information).

On the other hand, MongoDB is often a great database when you are dealing with big data and need to be able to make speedy queries on that data. For example, eBay uses MongoDB to store their media metadata, which is quite a large amount of information.

Of course, there are also cases where you deal with both big data and sensitive data. In such instances, you can get both MongoDB and a relational database to work together to use the best features of each database where they are needed.

Get a Hosted MongoDB Database

If you want to use MongoDB for your application, one way to do so is to use a service like Morpheus, which offers databases as a service on the cloud. With Morpheus, you can easily set up one or more databases (including MongoDB, MySQL, and more).

In addition to all of this, databases are deployed on a high performance infrastructure with Solid State Drives, and are backed up, replicated, and archived. You can learn more by viewing pricing information, or you can even open a free account now!

The Most Important Takeaways from MySQL Prepared Statements


Since MySQL sends queries to the server and returns results in text format, each query must be fully parsed and the result set converted to a string before being sent to the client. This overhead can cause performance issues, so MySQL implemented a new feature called Prepared Statements when it released version 4.1.

What is a MySQL prepared statement?

A MySQL prepared statement is a method that can be used to pass a query containing one or more placeholders to the MySQL server. Prepared statements make use of the binary client/server protocol that works between a MySQL client and server, allowing a quicker response time than the typical text/parse/conversion exchange.

Here is an example query that demonstrates how a placeholder can be used (this is similar to using a variable in programming):
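In sketch form (the statement name, table, and column are illustrative; the question mark is the placeholder):

    PREPARE statement_products FROM 'SELECT * FROM products WHERE price > ?';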


Example of a MySQL placeholder

This query needs to be parsed only once, since different values can be substituted for the placeholder on each execution. This provides a performance boost, which is even more pronounced if the query is run numerous times.

In addition to enhanced performance, placeholders can help you avoid a number of SQL injection vulnerabilities, since parameter values are sent separately from the query text rather than concatenated into a string that can be more easily manipulated.

Using MySQL Prepared Statements

A prepared statement in MySQL is essentially performed using four keywords:

  1. PREPARE - This prepares the statement for execution
  2. SET - Sets a value for the placeholder
  3. EXECUTE - This executes the prepared statement
  4. DEALLOCATE PREPARE - This deallocates the prepared statement from memory.

With that in mind, here is an example of a MySQL prepared statement:
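In outline (using the statement name, table, and value from the walkthrough that follows):

    PREPARE statement_user FROM 'SELECT * FROM users WHERE username = ?';
    SET @username = 'sally_224';
    EXECUTE statement_user USING @username;
    DEALLOCATE PREPARE statement_user;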


Example of a MySQL prepared statement

Notice how the four keywords are used to complete the prepared statement:

  1. The PREPARE statement defines a name for the prepared statement and a query to be run.
  2. The SELECT statement that is prepared will select all of the user data from the users table for the specified user. A question mark is used as a placeholder for the user name, which will be defined next.
  3. A variable named @username is set and is given a value of 'sally_224'. The EXECUTE statement is then used to execute the prepared statement using the value in the placeholder variable.
  4. To end everything and ensure the statement is deallocated from memory, the DEALLOCATE PREPARE statement is used with the name of the prepared statement that is to be deallocated (statement_user in this case).

Get your own MySQL Database

To use prepared statements, you will need to have a MySQL database set up and running. One way to easily obtain a database is to use a service like Morpheus, which offers databases as a service on the cloud. With Morpheus, you can easily and quickly set up your choice of several databases (including MySQL, MongoDB, and more). In addition, databases are backed up, replicated, and archived, and are deployed on a high performance infrastructure with Solid State Drives. 

Diagnose and Optimize MySQL Performance Bottlenecks


A common source of MySQL performance problems is tables with outdated, redundant, and otherwise-useless data. Slow queries can be fixed by optimizing one or all tables in your database in a way that doesn't lock users out any longer than necessary.

MySQL was originally designed to be the little database that could, yet MySQL installations keep getting bigger and more complicated: larger databases (often running in VMs), and larger and more widely disparate clusters. As database configurations increase in size and complexity, DBAs are more likely to encounter performance slowdowns. Yet the bigger and more complex the installation, the more difficult it is to diagnose and address the speed sappers.

The MySQL Reference Manual includes an overview of factors that affect database performance, as well as sections explaining how to optimize SQL statements, indexes, InnoDB tables, MyISAM tables, MEMORY tables, locking operations, and MySQL Server, among other components.

At the hardware level, the most common sources of performance hits are disk seeks, disk reading and writing, CPU cycles, and memory bandwidth. Of these, memory management generally and disk I/O in particular top the list of performance-robbing suspects. In a June 16, 2014, article, ITworld's Matthew Mombrea focuses on the likelihood of encountering disk thrashing (a.k.a. I/O thrashing) when hosting multiple virtual machines running MySQL Server, each of which contains dozens of databases.

Data is constantly being swapped between RAM and disk, and it's obviously faster to access data in system memory than on disk. When insufficient RAM is available to MySQL, dozens or hundreds of concurrent queries hitting the disk will result in I/O thrashing. Comparing the server's load value to its CPU utilization will confirm this: a high load value combined with low CPU utilization indicates long disk I/O wait times.

Determining how frequently you need to optimize your tables

The key to a smooth-running database is ensuring your tables are optimized. Striking the right balance between optimizing too often and optimizing too infrequently is a challenge for any DBA working with large MySQL databases. This quandary was presented in a Stack Overflow post from February 2012.

For a statistical database having more than 2,000 tables, each of which has approximately 100 million rows, how often should the tables be optimized when only 60 percent of them are updated every day (the remainder are archives)? You need to run OPTIMIZE on the table in three situations:

  • When its datafile is fragmented on disk
  • When many of its rows are updated or change size
  • When deleting many records and not adding many others

Run CHECK TABLE when you suspect the table's data is corrupted, and then REPAIR TABLE when corruption is reported. Use ANALYZE TABLE to update index cardinality.
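Each of these maintenance statements is a one-liner (the table name here is illustrative):

    OPTIMIZE TABLE stats_table;
    CHECK TABLE stats_table;
    REPAIR TABLE stats_table;
    ANALYZE TABLE stats_table;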

In a separate Stack Overflow post from March 2011, the perils of optimizing too frequently are explained. Many databases use InnoDB with a single file rather than separate files per table. Optimizing in such situations can cause more disk space to be used rather than less. (Also, tables are locked during optimization, so large tables may be inaccessible for long periods.)

From the command line, you can use mysqlcheck to optimize one or all databases:
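For example (credentials are placeholders; -o is shorthand for --optimize):

    mysqlcheck -o mydb -u root -p
    mysqlcheck -o --all-databases -u root -p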

Run "mysqlcheck" from the command line to optimize one or all of your databases quickly. Source: Stack Overflow

Alternatively, you can run this PHP script to optimize all the tables in your database:
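A rough sketch of such a script, assuming the mysqli extension and placeholder credentials:

    <?php
    // Connect, list every table in the database, and optimize each in turn
    $db = mysqli_connect('localhost', 'user', 'password', 'mydb');
    $result = mysqli_query($db, 'SHOW TABLES');
    while ($row = mysqli_fetch_row($result)) {
        mysqli_query($db, 'OPTIMIZE TABLE `' . $row[0] . '`');
    }
    mysqli_close($db);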

This PHP script will optimize all the tables in a database in one fell swoop. Source: Stack Overflow

Other suggestions are to implode the table names into one comma-separated string so that you need only a single OPTIMIZE TABLE query, and to use MySQL Administrator in the MySQL GUI Tools.

Monitoring and optimizing your MySQL, MongoDB, Redis, and ElasticSearch databases is a point-and-click process in the new Morpheus Virtual Appliance. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds. You can provision your database with astounding ease, and each database instance includes a free full replica set. The service supports a range of database tools and lets you analyze all your databases from a single dashboard. Visit the Morpheus site to create a free account.

Overcoming Barriers to Adoption of Network Functions Virtualization


IT managers are equally intrigued by the promise of network functions virtualization and leery of handing over control of their critical networks to unproven software, much of which will be managed outside their data centers. Some of the questions surrounding NFV will be addressed by burgeoning standards efforts, but most organizations continue to adopt a "show me" attitude toward the technology.

Big things are predicted for software defined networks (SDN) and network functions virtualization (NFV), but as with any significant change in the global network infrastructure, the road to networking hardware independence will have its share of bumps.

For one thing, securing networks that have no physical boundary is no walk in the park. Viodi's Alan Weissberger explains in a December 29, 2014, post that replacing traditional hardware functions with software extends the potential attack space "exponentially." When you implement multiple virtual appliances on a single physical server, for example, they'll all be affected by a single breach of that server.

Even with the security concerns, the benefits of virtualization in terms of flexibility and potential cost savings are difficult for organizations of all sizes to ignore. In a December 23, 2014, article on TechWeek Europe, Ciena's Benoit de la Tour points out that virtualization allows network operators to expand or remove firewalling, load balancing, and other services and appliance functionality instantly.

Simplifying hardware management is one of NFV's principal selling points. John Fruehe writes on the Moor Insights & Strategy blog that NFV replaces some specialty networking hardware with software that runs on commercial off-the-shelf (COTS) x86 servers, or as VMs running on those servers. It also simplifies network architectures by reducing the total number of physical devices.

NFV offers organizations the potential to simplify network management by reducing the overall hardware footprint. Source: Moor Insights & Strategy

Potential NFV limitations: licensing and carrier control

The maturation of the technology underlying NFV concepts is shown in the creation of the Open Platform for NFV, a joint project of the Linux Foundation and such telecom/network companies as AT&T, Cisco, HP, NTT Docomo, and Vodafone. As ZDNet's Steven J. Vaughan-Nichols reports in a September 30, 2014, article, OPNFV is intended to create a "carrier-grade, integrated, open source NFV reference platform."

Linux Foundation Executive Director Jim Zemlin explains that the platform would be similar to Linux distributions serving a variety of needs and allowing code to be integrated upstream and downstream. Even with an open-source base, some potential NFV adopters are hesitant to cede so much control of their networks to carriers. For one thing, companies don't want to find themselves caught in the middle of feuding carriers and equipment vendors.

 

SDN and NFV have many similarities, but also some important differences, principally who hosts the bulk of the network hardware. Source: Moor Insights & Strategy

More importantly, IT managers are concerned about ensuring the reliability of their networks in such widespread virtual environments. Red Hat's Mark McLoughlin states in an October 8, 2014, post on OpenSource.com that network functions implemented as horizontal scale-out applications will address reliability the way cloud apps do: each application tier will be distributed among multiple failure domains. Scheduling of performance-sensitive applications will be in the hands of the telcos, which makes SLAs more important than ever.

Existing software licensing agreements also pose a challenge to organizations hoping to benefit from use of NFV. A November 26, 2014, article by TechTarget's Rob Lemos describes a hospital that attempted to switch from a license based on total unique users to one based on concurrent users as it implemented network virtualization. The process of renegotiating the licenses took four years.

Lemos points out that organizations often neglect to consider the implications of renegotiating software licenses when they convert to virtualized operations. Conversely, when you use the new Morpheus Virtual Appliance, you know exactly what you're getting -- and what you're paying for -- ahead of time. With the Morpheus database-as-a service, you have full control of the management of your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases via a single dashboard.

Morpheus lets you create a new instance of any SQL, NoSQL, or in-memory database in just seconds via a point-and-click interface. A free full replica set is provided for each database instance, and your MySQL and Redis databases are backed up. Visit the Morpheus site for pricing information and to create a free account.

When Do You Need ACID Compliance?


ACID compliance can be very beneficial in some situations, while others may not require it. Find out where you need ACID compliance and where you don't.

TL;DR: One decision that must be made when setting up your databases is whether or not ACID compliance will be required. In some instances, it is very important to have it implemented; however, other situations allow for you to be more flexible with your implementation.

What is ACID Compliance?

A quick overview of ACID. Source: LastBuild Quality Assurance

To know when it is best to use ACID compliance, you should have a good understanding of the benefits it brings when used. ACID stands for Atomicity, Consistency, Isolation, and Durability. Each of these is described in more detail below:

Atomicity - Database transactions often have multiple parts of the transaction that need to be completed. Atomicity ensures that a transaction will not be successful unless all parts of it are successful. Since an incomplete transaction will fail, the database is less likely to get corrupted or incomplete data as a result.

Consistency - For a database to have consistency, the data within it must comply with its data validation rules. In an ACID-compliant database, transactions that do not follow those data validation rules are rolled back so that the database is in the state it was prior to the transaction. Again, this makes it less likely for the database to have faulty or corrupted data.

Isolation - When handling multiple transactions at the same time, a database that has isolation can handle each of the transactions without any of them affecting any of the others. For example, suppose two customers are each trying to purchase an item that is in stock (five still available). If one customer wants four of the items and the other wants three, isolation ensures that one transaction is performed ahead of the other, which keeps you from selling more than you have in stock!

Durability - A database with durability ensures that all completed transactions are saved, even if some sort of technology failure occurs.
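To make atomicity concrete, here is a sketch in SQL (assuming a hypothetical accounts table): the transfer either commits in full or not at all.

    START TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT; -- on any error, ROLLBACK instead leaves both rows untouched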

Example showing how an ACID constraint can be implemented. Source: Wikipedia

When Is ACID Compliance Beneficial?

ACID compliance is essential in some cases. For instance, financial data/transactions or personal data (such as health care information) really need to be ACID compliant for the safety and privacy of the customer. In some cases, there may even be a regulatory authority which requires ACID compliance for particular situations. In any of these cases, it is a good idea to implement ACID compliance for your database.

When is ACID Compliance Not Necessary?

There are cases where a small amount of older or less consistent data can be acceptable. For instance, blog post comments or data saved for an autocomplete feature may not need to be under the scrutiny that other types of data require, so these often do not necessitate ACID compliance.

Get Your Own Database

Whether you require ACID compliance or not, Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard. With Morpheus, you have support for SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!


How to Ensure Your SSL-TLS Connections Are Secure


Encryption is becoming an essential component of nearly all applications, but managing the Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates that are at the heart of most protected Internet connections is anything but simple. A new tool from Google can help ensure your apps are protected against man-in-the-middle attacks.

In the not-too-distant past, only certain types of Internet traffic were encrypted, primarily online purchases and transmissions of sensitive business information. Now the push is on to encrypt everything -- or nearly everything -- that travels over the Internet. While some analysts question whether the current SSL/TLS encryption standards are up to the task, certificate-based encryption isn't likely to be replaced anytime soon.

The Electronic Frontier Foundation's Let's Encrypt program proposes a new certificate authority (CA) intended to make HTTPS the default on all websites. The EFF claims the current CA system for HTTPS is too complex, too costly, and too easy for the bad guys to beat.

Nearly every web user has encountered a warning or error message generated by a misconfigured certificate. The pop-ups are usually full of techno-jargon that can confuse engineers, let alone your typical site visitors. In fact, a recent study by researchers at Google and the University of Pennsylvania entitled Improving SSL Warnings: Comprehension and Adherence (pdf) found that 66 percent of people using the Chrome browser clicked right through the CA warnings.

As Threatpost's Brian Donahue reports in a February 3, 2015, article, redesigning the messages to provide better visual cues and more dire warnings convinced 62 percent of users to choose the preferred, safe response, compared to only 37 percent who did so when confronted with the old warnings. The "opinionated design" concept combines a plain-English explanation ("Your connection is not private" in red letters) with added steps required to continue despite the warning.

Researchers were able to increase the likelihood that users would make the safe choice by redesigning SSL certificate warnings from cryptic (top) to straightforward (bottom). Source: Sophos Naked Security

Best practices for developing SSL-enabled apps

SSL has become a key tool in securing IT infrastructures. Because SSL certificates are valid only for the time they specify, monitoring the certificates becomes an important part of app management. A Symantec white paper entitled SSL for Apps: Best Practices for Developers (pdf) outlines the steps required to secure your apps using SSL/TLS.

When establishing an SSL connection, the server returns one or more certificates to create a "chain of trust." The certificates may not be received in a predictable order. Also, the server may return more certificates than necessary or require that the client look for necessary certificates elsewhere. In the latter case, a certificate with a caIssuers entry in its authorityInfoAccess extension will list a protocol and location for retrieving the issuing certificate.

Once you've determined the end-entity SSL certificate, you verify that the chain from the end-entity certificate to the trusted root certificate or intermediate certificate is valid.

To help developers ensure their apps are protected against man-in-the-middle attacks resulting from corrupted SSL certificates, Google recently released a tool called nogotofail. As PC World's Lucian Constantin explains in a November 4, 2014, article, apps become vulnerable to such attacks because of bad client configurations or unpatched libraries that may override secure default settings.

Nogotofail simulates man-in-the-middle attacks using deep packet inspection to track all SSL/TLS traffic rather than monitoring only the two ports usually associated with secure connections, such as port 443. The tool can be deployed as a router, VPN server, or network proxy.

Security is at the heart of the new Morpheus Virtual Appliance, which lets you seamlessly provision and manage SQL, NoSQL, and in-memory databases across hybrid clouds. Each database instance you create includes a free full replica set for built-in fault tolerance and failover. You can administer your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single dashboard via a simple point-and-click interface. 

Visit the Morpheus site to sign up for a FREE Trial!

The Benefits of Virtual Appliances Expand to Encompass Nearly All Data Center Ops


Virtual appliances deliver the potential to enhance data security and operational efficiency in IT departments of all shapes, sizes, and types. As the technology expands to encompass ever more data-center operations, it becomes nearly impossible for managers to exclude virtual appliances from their overall IT strategies.

Why have virtual appliances taken the IT world by storm? They just make sense. By combining applications with just as much operating system and other resources as they need, you're able to minimize overhead and maximize processing efficiency. You can run the appliances on standard hardware or in virtual machines.

At the risk of sounding like a late-night TV commercial, "But wait, there's more!" The Turnkey Linux site summarizes several other benefits of virtual appliances: they streamline complicated, labor-intensive processes; they make software deployment a breeze by encapsulating all the app's dependencies, thus precluding conflicts due to incompatible OSes and missing libraries; and last but not least, they enhance security by running in isolation, so a problem with or breach of one appliance doesn't affect any other network components.

In fact, the heightened focus in organizations of all sizes on data security is the impetus that will lead to a doubling of the market for virtual security appliances between 2013 and 2018, according to research firm Infonetics. The company forecasts that revenues from virtual security appliances will total $1.2 billion in 2018, as cited by Fierce IT's Fred Donovan in a November 11, 2014, article.

 

Growth in the market for virtual appliances will be spurred in large part by increased emphasis on data security in organizations. Source: Infonetics Research, via Fierce IT

In particular, virtual appliances are seen as the primary platform for implementation of software-defined networks and network functions virtualization, both of which are expected to boom starting in 2016, according to Infonetics.

The roster of top-notch virtual appliances continues to grow

There are now virtual appliances available for such core functions as ERP, CRM, content management, groupware, file serving, help desks, and domain controllers. TechRepublic's Jack Wallen lists 10 of his favorite virtual appliances, which include the Drupal appliance, LAMP stack, Zimbra appliance, Openfiler appliance, and the Opsview Core Virtual Appliance.

If you prefer the DIY approach, the TKLDev development environment for Turnkey Linux appliances claims to make building Turnkey Core from scratch as easy as running make.

The TKLDev tool lets you build Turnkey Core simply by running make. Source: Turnkey Linux

The source code for all the appliances in the Turnkey library is available on GitHub, as are all other repositories and the TKLDev documentation.

Also available are Turnkey LXC (LinuX Containers) and the Turnkey LXC appliance. Turnkey LXC is described by Turnkey Linux's Alon Swartz in a December 19, 2013, post as a "middle ground between a chroot on steroids and a full fledged virtual machine." The environment allows multiple isolated containers to be run on a single host.

The most recent addition to the virtual-appliance field is the Morpheus Virtual Appliance, which is the first and only database provisioning and management platform that supports private, public, and hybrid clouds. Morpheus offers the simplest way to provision heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases.

The Morpheus Virtual Appliance offers real-time monitoring and analysis of all your databases via a single dashboard to provide instant insight into consumption and availability of system resources. A free full replica set is provisioned for each database instance, and backups are created for your MySQL and Redis databases.

Visit the Morpheus site to create a free trial account. You'll also find out how to get started using Morpheus, which is the only database-as-a-service to support SQL, NoSQL, and in-memory databases.

When One Data Model Just Won't Do: Database Design that Supports Polyglot Persistence


The demands of modern database development mandate an approach that matches the model (structured or unstructured) to the nature of the underlying data, as well as the way the data will be used. Choice of data model is no longer an either/or proposition: now you can have your relational and key-value, too. The multimodel approach must be applied deliberately to reduce operational complexity and ensure reliability.

"When your only tool is a hammer, all your problems start looking like nails." Too often that old adage has applied to database design: When your only tool is a relational DBMS, all your data starts to look structured.

Well, today's proliferation of data types defies squeezing it all into a single model. The age of the multimodel database has arrived, and developers are responding by adopting designs that apply the most appropriate model to the various data types that comprise their diverse databases.

In a January 6, 2015, article on InfoWorld, FoundationDB's Stephen Pimentel explains that the rise in NoSQL, JSON, graphs, and other non-SQL data models is the result of today's applications needing to work with various data types and storage requirements. Rather than creating multiple distinct databases, developers are increasingly basing their databases on a single backend that supports multiple data models.

Software design author Martin Fowler describes polyglot persistence as the ability of applications to manage their own data using various technologies based on the characteristics and use of that data. Rather than selecting the tool first and then fitting the data to the tool, developers will determine how the various data elements will be manipulated and then choose the appropriate tools for those specific purposes.

Multimodel databases apply different data models in a single database based on the characteristics of various data elements. Source: Martin Fowler

Multimodel databases are by definition more complicated than their single-model counterparts. Managing this complexity is the principal challenge of developers, primarily because each data storage mechanism requires its own interface and creates a potential performance bottleneck. However, the alternative of attempting to apply the relational model to NoSQL-type unstructured data will require a tremendous amount of development and maintenance effort.

Putting the multimodel database design into practice

John P. Wood highlights the primary shortcoming of RDBMSs in clustered environments: the way they enforce data integrity places inordinate demands on processing power and storage requirements. RDBMSs depend on fast, simple access to all data continually to prevent duplicates, enforce constraints, and otherwise maintain the database.

While you can scale out relational databases via master-slave replication, sharding, and other approaches, doing so increases the app's complexity. More importantly, a key-value store is often a better fit for that data than an RDBMS's rows and columns, even with object/relational mapping tools.

Wood describes two scenarios in which polyglot persistence improves database performance: when performing complex calculations on massive data sets; and when needing to store data that varies greatly from document to document, or that is constantly changing structure. In the first instance, data is moved from the relational to the NoSQL database and then processed by the application to maximize the benefits of clustering. In the second, structure is applied to the document on the fly to allow data inside the document to be queried.

The basic relational (SQL) model compared to the document (NoSQL) model. Source: Aaron Stannard

The trend toward supporting multiple data models in a single database is evident in the new Morpheus Virtual Appliance, which supports heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus lets you monitor and analyze all your databases using a single dashboard to provide instant insight into consumption and availability of system resources.

The Morpheus Virtual Appliance is the first and only database provisioning and management platform that works with private, public, and hybrid clouds. A free full replica set is provisioned for each database instance, and backups are created for your MySQL and Redis databases.

Visit the Morpheus site to create a free trial account!

ETL: Teaching an Old Data Cleanup Tool New Tricks


The role of the traditional data-warehouse extract, transform, load function has broadened to become the foundation for a new breed of graphical business-intelligence tools. It has also spawned a market for third-party ETL tools that support a range of data types, sources, and systems.

The data-warehousing concept of extract, transform, load (ETL) almost seems quaint in the burgeoning era of unstructured data stored in petabyte-sized containers. Some analysts have gone so far as to declare ETL all but dead. In fact, technologies as useful and pervasive as ETL rarely disappear -- they just find new roles to play.

After all, ETL is intended to improve accessibility to and analysis of data. This function becomes even more important as data stores grow and analyses become more complex. In addition, the people analyzing the data are now more likely to be business managers using point-and-click dashboards rather than statisticians using sophisticated modeling tools. The IT chestnut "garbage in, garbage out" has never been more relevant.

In a February 2, 2015, post on the Smart Data Collective, Mario Barajas asserts that the best place to ensure the quality of data is at the source: the data input layer. ETL is used at the post-input stage to aggregate data in report-ready form. The technology becomes the lingua franca that "preps" diverse data types for analysis, rather than a data validator.

Gartner analyst Rita Sallam refers to this second generation of ETL as "smart data discovery," which she expects will deliver sophisticated business-intelligence capabilities to the 70 percent of people in organizations who currently lack access to such data-analysis tools. Sallam is quoted in a January 28, 2015, article on FirstPost.

Big-data analysis without the coding

Off-the-shelf ETL products were almost unheard of when data warehouses first arrived. The function was either built in by the warehouse vendor or hand-coded by the customer. Two trends have converged to create a market for third-party ETL tools: the need to accommodate unstructured data (think Twitter streams and video feeds); and to integrate multiple platforms (primarily mobile and other external apps).

ETL has morphed from a specialty function either built into a data-warehouse system or coded by customers, to a product category that extends far beyond any single data store. Source: Data-Informed

Representing this new era of ETL are products such as ClearStory, Paxata, Tamr, and Trifacta. As Gigaom's Barb Darrow explains in a February 4, 2015, article, the tools are intended to facilitate data sharing and integration with a company's partners. The key is to be able to do so at the speed of modern business. This is where next-gen ETL differs from its slow, deliberate data-warehouse counterpart.

Running at the speed of business is one of the primary benefits of the new Morpheus Virtual Appliance. The Morpheus database-as-a-service (DBaaS) lets you provision and manage SQL, NoSQL, and in-memory databases across hybrid clouds via a simple point-and-click interface. The service supports heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases.

A full replica set is provisioned with each database instance for fault tolerance and fail over, and Morpheus ensures that the replica and master are synched in near real time. The integrated Morpheus dashboard lets you monitor read/writes, queries per second, IOPs, and other stats across all your SQL, NoSQL, and in-memory databases. Visit the Morpheus site for pricing information and to create a free account.

The Best Database for Storing Cached Web Sites


If you need to store cached web sites, it is a good idea to begin with a database that is optimized for caching. What database is built to perform this sort of task?

TL;DR: If you need to store cached data, such as cached web sites, you will want to be able to retrieve content quickly when needed. Some databases are built better for such purposes, since there will be a large amount of data stored and retrieved on a regular basis. A good option for this is an in-memory, key-value data store.

What is an In-Memory Database?

An in-memory database (often shortened to IMDB), is a database that stores data using the main memory of a machine as the primary storage facility, rather than storing the data on a disk as in a typical disk-optimized database.

Typically, the biggest feature of in-memory databases is that they offer faster performance than databases that use disk storage. This is due in part to fewer CPU instructions being run, as well as drastically improved query time, since the data is in main memory rather than needing to be read from a disk on the machine.

The differences between a disk database and an in-memory database. Source: OpenSourceForU.

What is a Key-Value Store?

A key-value store is a system in which an associative array (often called a map) is used for storing information, as opposed to the tabular structure associated with relational databases. This mapping results in key-value pairs that resemble the associative arrays found in most popular programming languages.

As a result, it is often easy for programmers to write queries for such databases, since these often use a JSON-like structure that is very familiar to most programmers. The figure below shows one example of how the key-value store Redis can store data and be queried for that data:
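For example, a basic exchange with the redis-cli client (the key and value here are arbitrary):

    > SET greeting "Hello"
    OK
    > GET greeting
    "Hello"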

An example of a Redis query. Source: Redis.

Storing Cached Web Sites

When storing cached web sites, you will likely be storing extremely large amounts of data, since each stored web site will have files, images, media, and dependencies that will need to be stored to have a complete cache.

While a relational database could store the information, it would likely be less optimized than a NoSQL solution for this purpose, and an in-memory database is likely to be far faster at storing and retrieving items from the massive amount of available data. With this in mind, a database that is an in-memory, key-value store is likely going to be the most effective and efficient solution.

Databases such as Redis and Memcached fall into this category. If you are using Nginx as a reverse proxy server, Redis even has an Nginx module available that can serve the cached content from Redis directly. This makes Redis an excellent choice for performing the task of storing cached web sites and retrieving information from that store.

Get Your Own Redis Database

Whether your caching needs are simple or complex, Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard. With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Security Focus Shifts from Physical Servers to Cloud Services


Tying security to the data itself allows IT to defend against internal and external threats, and avoid never-ending patch cycles.

TL;DR: Do you know where your organization's critical data is? As cloud services proliferate, it has become nearly impossible to secure physical servers and data-center perimeters. Growing threats from outside and within organizations have led IT managers to focus their security efforts on the data itself rather than the hardware the data is stored on.

You can't blame IT managers for thinking all their security efforts are futile. Nor can they be faulted for believing the deck is stacked against them. Today's hackers are more numerous, more proficient, and more focused on stealing companies' most valuable assets.

Even worse, outside threats may not be data managers' biggest security problem. As IT Business Edge's Sue Marquette Poremba writes in a February 2, 2015, article, recent surveys indicate IT departments' greatest concern is often the security threat posed by insiders: privileged users and employees with high-level access to sensitive data who either cause a breach intentionally, or through carelessness or lack of proper training.

Kaspersky Labs' IT Security Risk Survey 2014 identifies the biggest internal and external threats to medium-sized businesses' data security. Source: Kaspersky Labs

Poremba cites the 2015 Insider Security Threat Report compiled by Vormetric Data Security, which found that 59 percent of the IT personnel surveyed believe insiders pose the greatest data security risk to their firms. Vormetric is one of a growing number of security services to recommend customers focus their security efforts on the data rather than on securing the perimeter.

An opportunity to get off the non-stop-patch merry-go-round

One indication of the uphill battle companies face in keeping their systems safe is the sorry state of software patches. Security software vendor Secunia reports that 48 percent of Java users lack the latest patches. CSO's Maria Korolov reports on the Secunia survey in a January 26, 2015, article.

Secunia claims that in the past year, 119 new vulnerabilities were discovered in Java, which is installed on 65 percent of all computers. That's a lot of surface area for potential breaches. And Java is far from the only possible hack target: Veracode's recent scan of Linux systems found that 41 percent of enterprise applications using the GNU C Library (glibc) are susceptible to the Ghost buffer-overflow vulnerability because the apps use the gethostbyname function. Dark Reading's Kelly Jackson Higgins reports on the finding in a February 5, 2015, article.

Many analysts are predicting that an entirely new approach to data security is beginning to take hold in organizations: one that de-emphasizes server software and focuses instead on the data itself. Information Age's Ben Rossi writes in a January 25, 2015, article that physical servers are becoming "disposable," and in their place are API-driven cloud services.

Security controls are built into cloud services, according to Rossi: virtual servers feature dedicated firewalls, role access policy, and network access rights; files stored in the cloud have simple access policies and encryption mechanisms built in; and user-specific identity policies restrict their access to data and resources.

Security is at the heart of the new Morpheus Virtual Appliance, which lets you seamlessly provision and manage SQL, NoSQL, and in-memory databases across hybrid clouds. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over. You can administer your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single dashboard via a simple point-and-click interface.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

How to Use MySQL Performance Schema to Fine-tune Your Database


Simple queries of Performance Schema tables can boost MySQL database performance without the heavy lifting.

TL;DR: Identify indexes that are no longer being used, spot the queries that take longest to process, and diagnose other hiccups in your MySQL database's performance by querying the Performance Schema tables.

When Oracle released MySQL 5.7 last year, the company boasted that the new version handled queries at twice the speed of its predecessor, as InfoWorld's Joab Jackson reported in a March 31, 2014, article. Companies could therefore use fewer servers to run large jobs, or take less time to run complex queries on the same number of servers.

MySQL 5.7 also added new performance schema for diagnosing and repairing bottlenecks and other problems. The Performance Schema tool is designed to monitor MySQL Server execution at a low level to avoid imposing a performance hit of its own. The 52 tables in "performance_schema" can be queried to report on the performance of indexes, threads, events, queries, temp tables, and other database elements.

In his January 12, 2015, overview of MySQL Workbench Performance reports, Database Journal's Rob Gravelle points out that using Performance Schema entails a fair amount of "instrumentation," including relying on server defaults for monitoring coverage, using canned Performance Reports, and retrieving the underlying SQL statements.

The MySQL Reference Manual explains that Performance Schema monitors all server events, including mutexes and other synchronization calls, file and table I/O, and table locks. It tracks current events as well as event histories and summaries, but tables stored in the corresponding performance_schema are temporary and do not persist on disk. (Note that the complementary Information Schema -- also called the data dictionary or system catalog -- lets you access server metadata, such as database and table names, column data types, and access privileges.)

Putting MySQL's Performance Schema tool to use

Starting with MySQL version 5.6.6, Performance Schema is enabled by default. Database Journal's Gravelle explains in a December 8, 2014, article that you can enable or disable it explicitly by starting the server with the performance_schema variable set via a switch in your my.cnf file. Verify that it is enabled by issuing "SHOW VARIABLES LIKE 'performance_schema';" at the mysql prompt.

Initialize Performance Schema via a switch in the my.cnf file, and verify initialization by running the "SHOW VARIABLES LIKE" command. Source: Database Journal
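For reference, a minimal sketch of that setup (the my.cnf location and exact option syntax vary by platform and MySQL version):

    # my.cnf -- enable the Performance Schema at server startup
    [mysqld]
    performance_schema = ON

    -- Then, from the mysql client, confirm that instrumentation is active:
    SHOW VARIABLES LIKE 'performance_schema';
    -- One row is returned; a Value of ON means the tables are being populated.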

The 52 Performance Schema tables include configuration tables, object tables, current tables, history tables, and summary tables. You can query these tables to identify unused indexes, for example.

Run this query to determine which of your database's indexes are merely taking up space. Source: Database Journal
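A sketch of one such query, using the documented table_io_waits_summary_by_index_usage summary table (this is a common formulation, not necessarily the article's exact statement):

    SELECT OBJECT_SCHEMA, OBJECT_NAME, INDEX_NAME
    FROM performance_schema.table_io_waits_summary_by_index_usage
    WHERE INDEX_NAME IS NOT NULL        -- skip rows representing full-table scans
      AND INDEX_NAME <> 'PRIMARY'       -- keep primary keys regardless of use
      AND COUNT_STAR = 0                -- the index has serviced no I/O events
    ORDER BY OBJECT_SCHEMA, OBJECT_NAME;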

Adding "AND OBJECT_SCHEMA = 'test'" to the WHERE clause lets you limit results to a specific schema. Another query helps you determine which long-running queries are monopolizing system resources unnecessarily.

This Performance Schema query generates a process list by the time required for the queries to run. Source: Database Journal
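One plausible form of that query reads the threads table directly (the column selection here is illustrative):

    SELECT PROCESSLIST_ID, PROCESSLIST_USER, PROCESSLIST_DB,
           PROCESSLIST_STATE, PROCESSLIST_TIME, PROCESSLIST_INFO
    FROM performance_schema.threads
    WHERE PROCESSLIST_ID IS NOT NULL    -- foreground (client) threads only
    ORDER BY PROCESSLIST_TIME DESC;     -- longest-running statements first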

While most performance_schema tables are read-only, some setup tables support the data manipulation language (DML) to allow configuration changes to be made. The Visual Guide to the MySQL Performance Schema lists the five setup tables: actors, objects, instruments, consumers, and timers.
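As a hedged illustration of that configurability, the following UPDATE enables collection and timing for one family of instruments (the stage/% pattern is just an example; many other instrument families exist):

    -- Turn on collection and timing for all stage-level instruments:
    UPDATE performance_schema.setup_instruments
    SET ENABLED = 'YES', TIMED = 'YES'
    WHERE NAME LIKE 'stage/%';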

There's no simpler or more efficient way to monitor the performance of heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases than by using the new Morpheus Virtual Appliance. Morpheus lets you seamlessly provision and manage SQL, NoSQL, and in-memory databases across hybrid clouds via a single point-and-click dashboard. Each database instance you create includes a free full replica set for built-in fault tolerance and failover.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.


Java vs. JavaScript: And the Winner Is... Both?

The two popular development environments are expected to benefit from long-awaited new features in their upcoming releases.

TL;DR: Long seen as rivals -- and often considered outdated -- Java and JavaScript remain the most popular development environments, despite their perceived shortcomings. Many of the criticisms of previous versions in terms of modularity, portability, and security are being addressed in new releases of both systems now under development.

At one time in the not-too-distant past, both Java and JavaScript were reported to be ready to breathe their last. Java was deemed so insecure that in 2013 Carnegie Mellon University's Software Engineering Institute and the U.S. Department of Homeland Security warned that people should disable Java in their browsers unless using it was "absolutely necessary."

Meanwhile, cross-site scripting attacks and other vulnerabilities led many to question the security of JavaScript. Still, few people have come out in favor of disabling JavaScript altogether because doing so renders most websites all but unusable. Chris Hoffman explains why in a February 28, 2013, article on the How-To Geek.

Fast-forward to 2015, and the outlook for both Java and JavaScript appears much rosier. Not only do both continue to top the list of most popular development platforms (as Mashable's Todd Wasserman reports in a January 18, 2015, article), but enhanced versions of each are expected to arrive (relatively) soon.

Some of the improvements are already apparent. For example, Cisco Systems' 2015 Annual Security Report finds that Java exploits decreased 34 percent in 2014 from the previous year, as ProgrammableWeb's Michael Vizard reports in a January 29, 2015, article. The drop is attributed primarily to Java's implementation of automatic updates.

The prevalence of malware targeting Java dropped considerably in 2014, due in large part to implementation of automatic updates. Source: DarkVision Hardware

While attacks targeting JavaScript have increased as more developers choose the scripting language for both client- and server-side apps, JavaScript creator Brendan Eich recently predicted that the language would ultimately supplant Java as the premier development platform for the web and elsewhere. In a February 13, 2015, article, InfoWorld's Paul Krill reports on Eich's presentation at the recent Node Summit conference in San Francisco.

Is Java development finally back on track?

When Oracle acquired Java as part of its purchase of Sun Microsystems in 2010, the product was in sorry shape, as ITworld's Andy Patrizio writes in a February 13, 2015, article. The platform was plagued by critical bugs, and most of Sun's Java development team departed soon after the merger, which helps explain why three years passed between the releases of Java SE 7 in 2011 and SE 8 in 2014.

With the impending arrival of the more modular Java 9, the tide is definitely turning. JavaWorld's Jeff Hanson explains in a February 2, 2015, article that despite Java's object-oriented nature, the current version doesn't meet developers' modularity requirements: object-level identity is unreliable, interfaces aren't versioned, and classes aren't unique at the deployment level.

OSGi (originally the Open Services Gateway initiative) has been the primary alternative for modular Java, providing an execution environment, module layer, life-cycle layer, and service registry. However, OSGi lacks a formal mechanism for native package installation. Project Jigsaw is expected to be Java 9's modularity component, while JDK Enhancement Proposals (JEPs) 200 and 201 will build on Jigsaw to segment the JDK and modularize the JDK source code, respectively.

JavaScript's fortunes are tied to the growing popularity of HTML5, the combination of which is becoming the de facto standard for websites, enterprise apps, and mobile apps, according to Developer Tech's Carlos Goncalves in a January 23, 2015, article. The downside of this tandem is a lack of security: code is stored on both the client and server as clear text. JavaScript obfuscation is being used increasingly not only to protect systems and applications, but also to enforce copyrights and licenses.

One way to future-proof your databases is by using the new Morpheus Virtual Appliance, which provides a single dashboard for provisioning, deploying, and managing your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus offers a simple point-and-click interface for analyzing SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and failover.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site for pricing information and to create a free account.

When to Replace a Database Column with an ID

Sometimes database tables grow and require some columns to be moved to a separate table. Find out when this may be a good idea!

TL;DR: As a database table grows, certain columns can become candidates for being moved into a separate table and referred to by a particular ID in the original table. To know when this is a good idea, you need to determine the purpose of the data that is in the column in question and the potential for it to expand and evolve over time.

Determine the Purpose of the Data

Suppose you had a small application using a single table, like the one shown below:

An example database table.

This table is simply used for storing user information. You have a user ID, last name, first name, and user type. All of these seem good, as long as the number of user types is small and doesn't change very often. However, what if different user types need to be added on a somewhat regular basis over time?

Since the user type is currently a string of text (a character-varying type, such as VARCHAR), initially adding a user or later updating the user type requires that particular string to be entered. There is a chance (especially if an update is done via a manual query) that the string could be mistyped slightly. This silently creates a new user type, which would exclude the user from the originally intended group!

Using an ID Instead

Now that the user type column is becoming problematic, you need a way to keep the data in the column consistent with the actual user types. You could simply change the column to a number; however, this requires that anyone adding or updating records in the table know what those numbers mean (e.g., 1 is for administrator, 2 is for premium user, and so on).

Another method would be to change the type to a number and add a user type description column to the table as well, but that adds another column to keep track of and could still fall prey to the description being entered incorrectly.

Moving a Column to a Table

Since you want to have the user type be a number, but still want to know what that number means, user type information is a good candidate to be moved to its own table with an ID as the primary key and a description column. The ID can then be referenced in the original users table. This helps ensure that both the user and user type data are kept consistent and accurate.

The resulting tables are shown below:

An example of two tables working together.
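A minimal sketch of how the two tables might be defined in MySQL (the names and column sizes are illustrative, not taken from the original example):

    -- The lookup table holds each type exactly once, keyed by ID.
    CREATE TABLE user_types (
        user_type_id INT AUTO_INCREMENT PRIMARY KEY,
        description  VARCHAR(50) NOT NULL
    );

    -- The users table stores only the ID; the foreign key rejects
    -- any value that doesn't match an existing user type.
    CREATE TABLE users (
        user_id      INT AUTO_INCREMENT PRIMARY KEY,
        last_name    VARCHAR(50) NOT NULL,
        first_name   VARCHAR(50) NOT NULL,
        user_type_id INT NOT NULL,
        FOREIGN KEY (user_type_id) REFERENCES user_types (user_type_id)
    );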

Notice that you can now simply point to the user type ID in the user_types table. This table can now be referenced if you need to know what that number means, and keeps each of those types with a consistent ID and description.

Get Your Own Database

Whether your database tables will be simple or complex, Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard. With Morpheus, you have support for SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Pros and Cons of Database Normalization

To normalize or not to normalize? Find out when normalization of a database is helpful and when it is not.

TL;DR: When using a relational database, normalization can help keep the data free of errors and can also help ensure that the size of the database doesn't grow large with duplicated data. At the same time, some types of operations can be slower in a normalized environment. So, when should you normalize and when is it better to proceed without normalization?

What is Normalization?

Database normalization is the process of organizing data within a database in the most efficient manner possible. For example, you likely do not want a username stored in several different tables within your database when you could store it in a single location and point to that user via an ID instead.

By keeping the unchanging user ID in the various tables that need the user, you can always point it back to the appropriate table to get the current username, which is stored in only a single location. Any updates to the username occur only in that place, making the data more reliable.

An example of using the first normal form. Source: ChatterBox's .NET.
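As a concrete illustration, consider a hypothetical orders table that stores only the user's ID; the current username is retrieved through a join (all names here are invented for the example):

    SELECT o.order_id, u.username
    FROM orders AS o
    JOIN users AS u ON u.user_id = o.user_id;
    -- Renaming a user means updating a single row in users;
    -- every subsequent join sees the new value immediately.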

What Is Good about Database Normalization?

A normalized database is advantageous when operations will be write-intensive or when ACID compliance is required. Some advantages include:

  1. Updates run quickly due to no data being duplicated in multiple locations.
  2. Inserts run quickly since there is only a single insertion point for a piece of data and no duplication is required.
  3. Tables are typically smaller than the tables found in non-normalized databases. This usually allows the tables to fit into the buffer pool, thus offering faster performance.
  4. Data integrity and consistency are an absolute must if the database is to be ACID-compliant. A normalized database helps immensely with such an undertaking.

What Are the Drawbacks of Database Normalization?

A normalized database is not as advantageous under conditions where an application is read-intensive. Here are some of the disadvantages of normalization:

  1. Since data is not duplicated, table joins are required. This makes queries more complicated, and thus read times are slower.
  2. Since joins are required, indexing does not work as efficiently. Again, this makes read times slower because the joins don't typically work well with indexing.

What if the Application is Read-Intensive and Write-Intensive?

In some cases, it isn't as clear that one strategy should be used over the other. Obviously, some applications really need both normalized and non-normalized data to work as efficiently as possible.

In such cases, companies will often use more than one database: a relational database such as MySQL for ACID-compliant and write-intensive operations, and a NoSQL database such as MongoDB for read-intensive operations on data where duplication is not as big of an issue.

What NoSQL and SQL databases do well. Source: SlideShare.

Get Your Own Hosted Database

Whether your database will be normalized or not, or whether you decide you need multiple databases, Morpheus Virtual Appliance is a tool that allows you to manage heterogeneous databases in a single dashboard. With Morpheus, you have support for SQL, NoSQL, and in-memory databases like Redis across public, private, and hybrid clouds. So, visit the Morpheus site for pricing information or to create a free account today!

Is Scala the Development Environment of the Future, or a Programming Dead End?

The Scala development environment seems to have as many naysayers as it has ardent supporters. Yet the language's combination of object-oriented and functional programming features is being emulated in such environments as Apple's Swift and Microsoft's C#. Despite recent forks released by Scala proponents dissatisfied by the slow pace of the language's development, the consensus is that Scala is a technology with a bright future.

When Java 8 was released in March 2014, it marked the most significant update of the venerable development environment in more than a decade. Many analysts interpreted the addition of such features as lambda expressions and functional programming as bad news for the Scala programming language, which already supported functional programming and runs in Java virtual machines.

In a September 5, 2014, interview with InfoWorld's Paul Krill, Typesafe chairman and co-founder Martin Odersky countered the argument that Scala and Java 8 are now so compatible that Scala is redundant. (Typesafe sells Scala-based middleware.) Odersky points out that the industry is moving increasingly toward functional programming, and Java 8 will spur the trend. However, Scala offers a much richer functional programming environment than Java 8, and Scala will leverage Java 8's VM improvements.

The current version of Scala, 2.11, features a streamlined compiler and support for case classes with more than 22 parameters. Three new versions of Scala are in the pipeline, according to Odersky; their releases are scheduled roughly 18 months apart:

  • Version 2.12 will emphasize integration with Java 8
  • Aida will focus on making Scala's libraries less complicated to use and will integrate Java 8's parallel collections, or streams
  • Don Giovanni will represent a major reworking of the environment with a goal of making the core simpler and compiling more efficient

Much of the promise of Scala as a key language for future development is its fusion of object-oriented and functional programming. This hybridization is also evident in Apple's new Swift language and is expected to be evident in the next version of Microsoft's C#.

As a "blended" language, Scala combines attributes of both object-oriented and functional programming. Source: app-genesis

In a February 20, 2014, interview with ReadWrite's Matt Asay, Odersky states that Java 8's implementation of lambdas will add new methods to the java.util.stream.Stream type that will facilitate writing high-level collection code. It will also ease the transition to Java-based functional interfaces.

Scala forks: A sign of trouble or indication of strength?

A few months after the releases of Java 8 and Scala 2.11, not one but two different versions of the Scala compiler were announced. InfoQ's Benjamin Darfler reported in a September 16, 2014, article that Shapeless library principal engineer Miles Sabin announced on the Typelevel blog a fork of the Scala compiler that is merge-compatible with the Typesafe Scala compiler.

Just three days later, another Scala compiler fork was announced by Typesafe co-founder Paul Phillips, who left Typesafe in 2013. Unlike Sabin's "conservative" fork, the Scala compiler developed by Phillips is not intended to be merged with the Typesafe version down the road. Both Sabin and Phillips believe Scala's development is proceeding too slowly (version 2.12 is scheduled for release in early 2016).

Typesafe chairman Odersky welcomed the forks, writing on the Typesafe blog that having "advanced and experimental branches" of the language and compiler in parallel with the "stable main branch" serves the needs of diverse developers. Odersky is encouraged that Typelevel's compiler will remain merge-compatible with standard Scala and believes some of that compiler's innovative features may eventually be added to the standard.

Typelevel's "conservative" fork of the standard Scala compiler is designed to provide a simple migration path to the Typesafe version. Source: Typelevel, via GitHub

During this time of transition in the development world, one of the key features of the new Morpheus Virtual Appliance database-as-a-service (DBaaS) is its support for a wide range of development tools for connecting to, configuring, and managing heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus lets you monitor and analyze all your databases from a single dashboard, providing instant insight into consumption and availability of system resources.

The Morpheus Virtual Appliance is the first and only database provisioning and management platform that works with private, public, and hybrid clouds. A free full replica set is provisioned automatically for each database instance, and backups are created for your MySQL and Redis databases.

Visit the Morpheus site to create a free trial account today!

How the Internet of Things Will Affect Database Management

There's no denying the rise of the Internet of Things will challenge existing database systems to adapt to accommodate huge volumes of unstructured data from diverse sources. Some analysts question whether RDBMSs have the scalability, flexibility, and connectivity required to collect, store, and categorize the disparate data types organizations will be dealing with in the future. Others warn against counting out RDBMSs prematurely, pointing out that there's plenty of life left in the old data structures.

Imagine billions of devices of every type flooding data centers with information: a secured entryway reporting on people's comings and goings; a smart shelf indicating a shortage of key production supplies; a pallet sensor reporting an oversupply of stocked items.

The Internet of Things poses unprecedented challenges for database administrators in terms of scalability, flexibility, and connectivity. How do you collect, categorize, and extract business intelligence from such disparate data sources? Can RDBMSs be extended to accommodate the coming deluge of device-collected data? Or are new, unstructured data models required?

As you can imagine, there's little consensus among experts on how organizations should prepare their information systems for these new types and sources of data. Some claim that RDBMSs such as MySQL can be extended to handle data from unconventional sources, many of which lack the schema, or preconditioning, required to establish the relations that are the foundation of standard databases. Other analysts insist that only unstructured, "schema-less" DBMSs such as NoSQL are appropriate for data collection from intelligent devices and sensors.

The standard application model is transformed to encompass the cloud by the need to accommodate tomorrow's diverse data sources and types. Source: Technische Universität München

In a November 28, 2014, article, ReadWrite's Matt Asay reports on a recent survey conducted by Machine Research that found NoSQL is the key to "managing more heterogeneous data generated by millions and millions of sensors, devices and gateways." Not surprisingly, the two primary reasons for the assessment are NoSQL's flexibility in handling unstructured data, and its scalability, which the researchers claim RDBMSs simply can't match.

Reports of the death of RDBMSs are slightly exaggerated

There are a couple of problems with this claim, however. First, there's an acute shortage of NoSQL developers, according to statistics compiled by research firm VisionMobile. The company pegs the current number of NoSQL developers at 300,000 and estimates that the number will jump to 4.5 million by 2020.

VisionMobile's research indicates that demand for NoSQL developers will explode in coming years. Source: VisionMobile

Many experts posit that the forecast demise of RDBMSs is premature because the forecasters underestimate the ability of RDBMS vendors to extend their products' scalability and capacity for unstructured data. For example, DeepDB improves MySQL's scalability by replacing the default InnoDB storage engine with an alternative that its vendor claims boosts server performance by a factor of 100.

At this point, IT managers can be excused for thinking, "Here we go again." Does the Internet of Things signal yet another sea change for their data centers? Or is this the most recent case of hype outpacing substance? According to a December 2014 TechTarget article by Alan R. Earls, corporate bandwidth will be overwhelmed by the rush of small data packets coming from Internet-connected devices.

In particular, distributed data centers will be required that move data security -- and the bulk of data analysis -- to the edge of the network. This will require pushing the application layer to the router and integrating a container with logic. As Accenture Technology Lab researcher Walid Negm points out, the cloud is increasingly serving as either an extension of or replacement for the conventional network edge.

The best way to get a jump on the expansion of data networks is to provision your databases via a secure, reliable, and scalable platform for private, public, and hybrid clouds. Morpheus Virtual Appliance supports MongoDB, MySQL, Elasticsearch, and Redis with simple point-and-click database provisioning. Get started on your free trial now!
