
Troubleshooting Problems with MySQL Replication


One of the most common MySQL operations is replicating databases between master and slave servers. While most such connections are straightforward to establish and maintain, on occasion something goes amiss: some master data may not replicate on the slave, or read requests may be routed to the master rather than to the slave, for example. Finding a solution to a replication failure sometimes requires a little extra detective work.

Replication is one of the most basic operations in MySQL -- and any other database: it's used to copy data from one database server (the master) to one or more others (the slaves). The process improves performance by allowing loads to be distributed among multiple slave servers for reads, and by limiting the master server to writes.

Additional benefits of replication are security via slave backups; analytics, which can be performed on the slaves without affecting the master's performance; and widespread data distribution, which is accomplished without requiring access to the master. (See the MySQL Reference Manual for more on replication.)

As with any other aspect of database management, replication doesn't always proceed as expected. The Troubleshooting Replication section of the MySQL Reference Manual instructs you to check for messages in your error log when something goes wrong with replication. If the error log doesn't point you to the solution, ensure that binary logging is enabled in the master by issuing a SHOW MASTER STATUS statement. If it's enabled, "Position" is nonzero; if it isn't, make sure the master is running with the --log-bin option.
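
A quick check along those lines might look like the following sketch (the server output shown in the comments is illustrative only):

    -- On the master: confirm binary logging is enabled
    SHOW MASTER STATUS;
    -- Expect a row such as File: mysql-bin.000003, Position: 154
    -- An empty result suggests the server is running without --log-bin
    SHOW VARIABLES LIKE 'log_bin';   -- should report ON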

The manual offers several other replication-troubleshooting steps (the corresponding statements are sketched after the list):

  • The master and slave must both start with the --server-id option, and each server must have a unique ID value;
  • Run SHOW SLAVE STATUS to ensure the Slave_IO_Running and Slave_SQL_Running values are both "yes";
  • Run SHOW PROCESSLIST and look in the State column to verify that the slave is connecting to the master;
  • If a statement succeeded on the master but failed on the slave, the nuclear option is to do a full database resynchronization, which entails deleting the slave's database and copying a new snapshot from the master. (Several less-drastic alternatives are described in the MySQL manual.)
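
For reference, here is a sketch of the statements behind those checks (run on the master or slave as noted; output abbreviated):

    -- On both servers: the IDs must be set and must differ
    SHOW VARIABLES LIKE 'server_id';

    -- On the slave: both replication threads should report Yes
    SHOW SLAVE STATUS\G
    --   Slave_IO_Running: Yes
    --   Slave_SQL_Running: Yes

    -- On the master: the slave's connection should appear in the State column
    SHOW PROCESSLIST;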

Solutions to real-world MySQL replication problems

What do you do when MySQL indicates the master-slave connection is in order, yet some data on the master isn't being copied to the slave? That's the situation described in a Stack Overflow post from March 2010.

Even though replication appears to be configured correctly, data is not being copied from the master to the slave. Source: Stack Overflow

The first step is to run "show master status" or "show master status\G" on the master database to get the correct values for the slave. The slave status above indicates the slave is connected to the master and awaiting log events. Synching the correct log file position should restore copying to the slave.

To ensure a good sync, stop the master, dump the database, record the master log file positions, restart the master, import the database to the slave, and start the slave in slave mode with the correct master log file position.
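
A hedged sketch of that sequence, using placeholder host, account, and coordinate values (the dump and import themselves are done with mysqldump or a similar tool):

    -- On the master: block writes briefly and record the binlog coordinates
    FLUSH TABLES WITH READ LOCK;
    SHOW MASTER STATUS;            -- note the File and Position values
    -- ...take the dump here, then release the lock
    UNLOCK TABLES;

    -- On the slave: import the dump, then point replication at those coordinates
    STOP SLAVE;
    CHANGE MASTER TO
        MASTER_HOST     = 'master.example.com',    -- placeholder host
        MASTER_USER     = 'repl',                  -- placeholder replication account
        MASTER_PASSWORD = 'replica-password',      -- placeholder
        MASTER_LOG_FILE = 'mysql-bin.000003',      -- from SHOW MASTER STATUS
        MASTER_LOG_POS  = 154;                     -- from SHOW MASTER STATUS
    START SLAVE;
    SHOW SLAVE STATUS\G            -- confirm both threads are running again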

Another Stack Overflow post from March 2014 presents a master/slave setup using JDBC drivers in which transactions marked as read-only were still pinging the master. Since the MySQL JDBC driver was managing the connections to the physical servers -- master and slave -- the connection pool and Spring transaction manager weren't aware that the database connection was linking to multiple servers.

The solution is to return control to Spring, after which the transaction on the connection will be committed. The transaction debug message will indicate that queries will be routed to the slave server so long as the connection is in read-only mode. By resetting the connection before it is returned to the pool, the read-only mode is cleared and the last log message will show that queries are now being routed to the master server.

The point-and-click dashboard in the new Morpheus Virtual Appliance makes it a breeze to diagnose and repair replication errors -- and other hiccups -- in your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases. Morpheus lets you seamlessly provision, monitor, and analyze SQL, NoSQL, and in-memory databases across hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.


How Three Companies Improved Performance and Saved Money with MongoDB


Too often the increased efficiencies and performance improvements promised by new data technologies seem to vanish into thin air when the systems hit the production floor. Not so for these three companies that implemented MongoDB databases in very different environments, but realized very similar benefits: faster app speeds and lower overall system costs.

A hedge fund reduced its software licensing costs by a factor of 40, and its data storage by 40 percent. In addition, its quantitative analysts' modeling is now 25 times faster.

A retailer has installed in-store touch screens that give its customers an enjoyable, interactive shopping experience. The company can create and modify its online catalogs in just minutes to keep pace with ever-changing fashion trends.

A firm that provides affiliate-marketing and partner-management services for enterprises was able to expand without incurring the added expenses for hardware and services it anticipated. The company's customers realized improved performance because the new system's compression and other storage enhancements allowed more of their report requests to be processed in RAM.

All three of these success stories were made possible by converting the companies' traditional databases to MongoDB.

Hedge fund adopts a self-service model for financial analyses

In the past, whenever British hedge fund AHL Man Group wanted to add any new data sources, it became a long, drawn-out process that piled onto the IT department's busy workload. As ComputerWeekly's Brian McKenna reports in a January 21, 2015, article, AHL decided to standardize on Python in 2012, and subsequently discovered that Python interfaced very smoothly with its MongoDB databases.

By the end of 2013 the company had completed a proof-of-concept project, after which it was able to finalize its transition to MongoDB by the end of May 2014, at which time its legacy-system licenses expired. The result was a 40-fold decrease in licensing costs, and a 40 percent reduction in disk-storage requirements. In addition, the switch to a self-service model has allowed some of the firm's analysts to perform their "quant" modeling up to 25 times faster than previously.

Retailer's in-store tablets keep pace with fashion trends

Another January 21, 2015, article on the Apparel site recounts how retailer Chico's FAS developed a MongoDB-based application for its in-store touch-screen Tech Tablets that customers use as virtual catalogs. In addition to highlighting Chico's latest styles, the tablets show product videos and testimonials. The key benefit of the MongoDB application is the ability to create and adapt catalogs in minutes rather than the weeks required previously.

It took Chico's only five months to develop and implement the MongoDB-based app, which easily scaled to meet the retailer's increased demand in the holiday shopping season. More importantly, the app created an interactive, personalized shopping experience that's sure to bring its customers back for more.

MongoDB distro lets expanding marketer avoid high hardware costs

As a company grows, its data networks have to grow along with it, which often increases cost and complexity exponentially. Affiliate-marketing and partner-management firm Performance Horizon Group (PHG) faced skyrocketing hardware expenses as it grew its operations supporting enterprise clients in more than 150 countries.

By implementing Tokutek's TokuMX distribution of MongoDB, PHG reduced its need for new servers by a factor of eight, according to PHG CTO Pete Cheyne. In addition, each of the new servers required only half the RAM of its existing machines while accommodating a growing number of data sets. PHG's implementation of TokuMX is described in a December 2, 2014, Tokutek press release.

Any organization can improve the efficiency of its database-management operations by adopting the new Morpheus Virtual Appliance, which lets you manage heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with one click, and each instance includes a free full replica set for failover and fault tolerance. You can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

How to Minimize Data Wrangling and Maximize Data Intelligence


It's not unusual for data analysts to spend more than half their time cleaning and converting data rather than extracting business intelligence from it. As data stores grow in size and data types proliferate, a new generation of tools is arriving that promises to put sophisticated analysis capabilities into the hands of non-data scientists.

One of the hottest job titles in technology is Data Scientist, perhaps surpassed only by the newest C-level position: Chief Data Scientist. IT's long-standing skepticism about such trends is evident in the joke cited by InfoWorld's Yves de Montcheuil that a data scientist is a business analyst who lives in California.

There's nothing funny about every company's need to translate its data into business intelligence. That's where data scientists take the lead role, but as the amount and types of data proliferate, data scientists find themselves spending the bulk of their time cleaning and converting data rather than analyzing and communicating it to business managers.

A recent survey of data scientists (registration required) conducted by IT-project crowdsourcing firm CrowdFlower found that two out of three analysts claim cleaning and organizing data is their most time-consuming task, and 52 percent report their biggest obstacle is poor quality data. While the respondents named 48 different technologies they use in their work, the most popular is Excel (55.6 percent), followed by the open source language R (43.1 percent) and the Tableau data-visualization software (26.1 percent).

Data scientists identify their greatest challenges as time spent cleaning data, poor data quality, lack of time for analysis, and ineffective data modeling. Source: CrowdFlower

What's holding data analysis back? The data scientists surveyed cite a lack of tools required to do their job effectively (54.3 percent), failure of their organizations to state goals and objectives clearly (52.3 percent), and insufficient investment in training (47.7 percent).

A dearth of tools, unclear goals, and too little training are reported as the principal impediments to data scientists' effectiveness. Source: CrowdFlower

New tools promise to 'consumerize' big data analysis

It's a common theme in technology: In the early days, only an elite few possess the knowledge and tools required to understand and use it, but over time the products improve and drop in price, businesses adapt, and the technology goes mainstream. New data-analysis tools are arriving that promise to deliver the benefits of the technology to non-scientists.

Steve Lohr profiles several of these products in an August 17, 2014, article in the New York Times. For example, ClearStory Data's software combines data from multiple sources and converts it into charts, maps, and other graphics. Taking a different approach to the data-preparation problem is Paxata, which offers software that retrieves, cleans, and blends data for analysis by various visualization tools.

The not-for-profit Open Knowledge Labs bills itself as a community of "civic hackers, data wranglers and ordinary citizens intrigued and excited by the possibilities of combining technology and information for good." The group is seeking volunteer "data curators" to maintain core data sets such as GDP and ISO-codes. OKL's Rufus Pollock describes the project in a January 3, 2015, post.

Open Knowledge Labs is seeking volunteer coders to curate core data sets as part of the Frictionless Data Project. Source: Open Knowledge Labs

There's no simpler or more straightforward way to manage your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases than by using the new Morpheus Virtual Appliance. Morpheus lets you seamlessly provision, monitor, and analyze SQL, NoSQL, and in-memory databases across hybrid clouds via a single point-and-click dashboard. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.

New Compilers Streamline Optimization and Enhance Code Conversion


Researchers are developing compiler technologies that optimize and regenerate code in multiple languages and for many different platforms in only one or a handful of steps. While much of their work focuses on Java and JavaScript, their innovations will impact developers working in nearly all programming languages.

Who says you can't teach an old dog new tricks? One of the staples of any developer's code-optimization toolkit is a compiler, which checks your program's syntax and semantics for errors and optimizes its performance.

Infostructure Associates' Wayne Kernochan explains in an October 2014 TechTarget article that compilers are particularly adept at improving the performance of big data and business-critical online transaction processing (OLTP) applications. As recent developments in compiler technology point out, the importance of the programs goes far beyond these specialty apps.

Google is developing two new Java compilers named Jack (Java Android Compiler Kit) and Jill (Jack Intermediate Library Linker) that are part of Android SDK 21.1. I Programmer's Harry Fairhead writes in a December 12, 2014, article that Jack compiles Java code directly to a .dex Dalvik Executable rather than using the standard javac compiler to convert the source code to Java bytecode and then to Dalvik bytecode by feeding it through the dex compiler.

In addition to skipping the conversion to Java bytecode, Jack also optimizes and applies Proguard's obfuscation in a single step. The .dex code Jack generates can be fed to either the Dalvik engine or the new Android Runtime (ART), which uses ahead-of-time compilation to improve speed.

Jill converts .jar library files into the .jack library format to allow it to be merged with the rest of the object code.

Google's new Jack and Jill Java compilers promise to speed up compilation by generating Dalvik bytecode without first having to convert it from Java bytecode. Source: I Programmer

In addition to streamlining compilation, Jack and Jill reduce Google's reliance on Java APIs, which are the subject of the company's ongoing lawsuit with Oracle. At present, the compilers don't support Java 8, but in terms of retaining compatibility with Java, it appears Android has become the tail wagging the dog.

Competition among open-source compiler infrastructures heats up

The latest versions of the LLVM and GNU Compiler Collection (GCC) are in a race to see which can out-perform the other. Both open-source compiler infrastructures generate object code from a wide range of source languages, including C/C++, Objective-C, and Fortran. InfoWorld's Serdar Yegulalp reports in a September 8, 2014, article that testing conducted by Phoronix of LLVM 3.5 and a pre-release version of GCC 5 found that LLVM recorded faster C/C++ compile times. However, LLVM trailed GCC when processing some encryption algorithms and other tests.

Version 3.5 of the LLVM compiler infrastructure outperformed GCC 5 in some of Phoronix's speed tests but trailed in others, including audio encoding. Source: Phoronix

The ability to share code between JavaScript and Windows applications is a key feature of the new DuoCode compiler, which supports cross-compiling of C# code into JavaScript. InfoWorld's Paul Krill describes the new compiler in a January 22, 2015, article. DuoCode uses Microsoft's Roslyn compiler for code parsing, abstract syntax tree (AST) generation, and contextual analysis. DuoCode then handles the code translation and JavaScript generation, including source maps.

Another innovative approach to JavaScript compiling is the experimental Higgs compiler created by University of Montreal researcher Maxime Chevalier-Boisvert. InfoWorld's Yegulalp describes the project in a September 19, 2014, article. Higgs differs from other just-in-time (JIT) JavaScript compilers such as Google's V8, Mozilla's SpiderMonkey, and Apple's LLVM-backed FTLJIT project in that it has a single level rather than being multi-tiered, and it accumulates type information as machine-level code rather than using type analysis.

When it comes to optimizing your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases, the new Morpheus Virtual Appliance makes it as easy as pointing and clicking in a single dashboard. Morpheus is the first and only database-as-a-service (DBaaS) that supports SQL, NoSQL, and in-memory databases across public, private, and hybrid clouds.

With Morpheus, you can invoke a new database instance with a single click, and each instance includes a free full replica set for failover and fault tolerance. Your MySQL and Redis databases are backed up and you can administer your databases using your choice of tools. Visit the Morpheus site to create a free account.

Database - Beginning with Cloud Database As A Service


Note: When we recently launched, we were thrilled to have SQL Guru Pinal Dave give Morpheus a spin. It turns out that he had a great experience, and as he is indeed an SQLAuthority, we thought we'd share his post here as well. Without further delay, Pinal shares his thoughts below:


I love my weekend projects. Everybody does different activities on the weekend – like traveling, reading or just doing nothing. Every weekend I try to do something creative and different in the database world. The goal is to learn something new, and if I enjoy the learning experience, I share it with the world. This weekend, I decided to explore Cloud Database As A Service – Morpheus. In my career I have managed many databases in the cloud and I have good experience in managing them.

I should highlight that today’s applications use multiple databases, from SQL for transactions and analytics and NoSQL for documents to in-memory stores for caching and indexing engines for search. Provisioning and deploying these databases often require extensive expertise and time. Often these databases are also not deployed on the same infrastructure, which can create unnecessary latency between the application layer and the databases. Not to mention the different quality of service depending on the infrastructure and the service provider where they are deployed.

Moreover, there are additional problems that I have experienced with traditional database setup when hosted in the cloud:

  • Database provisioning & orchestration
  • Slow speed due to hardware issues
  • Poor Monitoring Tools
  • High Network Latency

Now if you have great software and an expert network engineer, you can continuously work on the above problems and overcome them. However, not every organization has the luxury of having top-notch experts in the field. The above issues are related to infrastructure, but there are a few more problems related to the software/application side as well.

Here are the top three things that can become problems if you do not have an application expert:

  • Replication and Clustering
  • Simple provisioning of the hard drive space
  • Automatic Sharding

Well, Morpheus looks like a product built by experts who have faced similar situations in the past. The product pretty much addresses all the pain points of developers and database administrators.

What is different about Morpheus is that it offers a variety of databases as a service, from MySQL, MongoDB, and ElasticSearch to Redis. Thus users can pick and choose any combination of these databases. All of them can be provisioned in a matter of minutes with a simple and intuitive point-and-click user interface. The Morpheus cloud is built on Solid State Drives (SSD) and is designed for high-speed database transactions. In addition, it offers a direct link to Amazon Web Services to minimize latency between the application layer and the databases.

Here are a few steps to get started with Morpheus. Follow along with me. First go to http://www.gomorpheus.com and register for a new, free account.

Step 1: Signup

It is very simple to sign up for Morpheus.

Step 2: Select your database

I use MySQL for my daily routine, so I selected MySQL. Upon clicking the big red button to add an instance, it prompted a dialog for creating a new instance.

Step 3: Create User

Now we just have to create a user in our portal, which we will use to connect to a database hosted at Morpheus. Click on your database instance and it will bring you to the User screen. Here you will once again notice a big red button to create a new user. I created a user with my first name.

Step 4: Configure Your MySQL Client

I used MySQL workbench and connected to MySQL instance, which I had created with an IP address and user.

That’s it! You are connected to the MySQL instance. Now you can create your objects just as you would on your local box, and you will have all the features of Morpheus when you are working with your database.
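
For example, a trivial illustrative table (hypothetical names, not part of the original walkthrough) could be created and queried like this:

    CREATE DATABASE IF NOT EXISTS weekend_project;
    USE weekend_project;

    CREATE TABLE authors (
        id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    );

    INSERT INTO authors (name) VALUES ('Pinal');
    SELECT * FROM authors;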

Dashboard

While working with Morpheus, I was most impressed with its dashboard. In future blog posts, I will write more about this feature. Also, with Morpheus you use the same process for provisioning and connecting to the other databases: MongoDB, ElasticSearch, and Redis.

Don't Fall Victim to One of These Common SQL Programming 'Gotchas'


TL;DR: Even experienced SQL programmers can sometimes be thrown for an infinite loop -- or other code failure -- by one of the many pitfalls of the popular development platform. Get the upper hand by monitoring and managing your databases in the cloud via the Morpheus database-as-a-service.

When a database application goes belly up, the cause can often be traced to sloppy coding -- but not always. Every now and then, the reason for a misbehaving app is an idiosyncrasy in the platform itself. Here's how to prevent your database from tripping over one of these common SQL errors.

In an August 11, 2014, article in the Database Journal, Rob Gravelle describes a "gotcha" (not a bug) in the way MySQL handles numeric value overflows. When a value supplied by an automated script or application is outside the range of a column data type, MySQL truncates the value to an entry within the acceptable range.

Generally, a database system will respond to an invalid value by generating an error that lets the script decide whether to proceed or abort, or it will substitute the invalid entry with its best guess as to which valid entry was intended. Of course, truncating or substituting the entered value is almost certainly going to introduce an error into the table. Gravelle explains how to override MySQL's default handling of overflow conditions to ensure it generates an error, which is the standard response to invalid entries by most other databases.
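
A minimal sketch of that override in MySQL (the table and values are hypothetical; exact behavior depends on the server version and its default sql_mode):

    -- Enable strict handling so out-of-range values raise an error
    SET SESSION sql_mode = 'STRICT_ALL_TABLES';

    CREATE TABLE overflow_demo (small_val TINYINT);   -- TINYINT max is 127
    INSERT INTO overflow_demo VALUES (300);
    -- Strict mode:     ERROR 1264, out of range value for column 'small_val'
    -- Non-strict mode: the value is silently truncated to 127 (with a warning)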

Don't fall victim to one of the most-common developer mistakes

According to Justin James on the Tech Republic site, the most common database programming no-no is misuse of primary keys. James insists that primary keys should have nothing at all to do with the application data in a row. Except in the "most unusual of circumstances," primary keys should be generated sequentially or randomly by the database upon row insertion and should not be changed. If they're not system-generated values managed by the database, you're likely to encounter problems when you change the underlying data or migrate the data to another system.
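
A sketch of that convention (hypothetical table), with the database generating the surrogate key and the business data kept out of it:

    CREATE TABLE customers (
        id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- system-managed key
        email VARCHAR(255) NOT NULL UNIQUE,   -- natural identifier, but not the key
        name  VARCHAR(100) NOT NULL
    );
    -- The email can change, or the rows can be migrated, without touching the
    -- primary key or any foreign keys that reference it.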

Another frequent cause of program problems is overuse of stored procedures, which James describes as a "maintenance disaster." There's no easy way to determine which applications are using a particular stored procedure, so you end up writing a new one when you make a significant change to an app rather than adapting an existing stored procedure. Instead, James recommends that you use advanced object-relational mappers (ORMs).

Not every developer is sold on ORMs, however. On his Experimental Thoughts blog, Jeff Davis explains why he shies away from ORMs. Because ORMs add more lines of code between the application and the data, they invite more semantic errors. Davis points out that debugging in SQL is simpler when you can query the database as if you were an application. The more lines of code between the application error and the database, the more difficult it is to find the glitch.

One of the common database errors identified by Thomas Larock on the SQL Rockstar site is playing it safe by overusing the BIGINT data type. If you're certain no value in a column will exceed 100,000, there's no need to use the 8-byte BIGINT data type when the 4-byte INT data type will suffice. You may not think a mere 4 bytes is significant, but what if the table ends up with 2 million rows? Then your app is wasting nearly 8MB of storage. Similarly, if you know you won't need calendar dates before the year 1900 or after 2079, using SMALLDATETIME will make your app much more efficient.

On the SQL Skills site, Kimberly Tripp highlights another common database-design error: use of non-sequential globally unique identifiers, or GUIDs. In addition to creating fragmentation in the base table, non-sequential GUIDs are four times wider than an INT-based identity.

Is your app's failure to launch due to poor typing skills?

Maybe the best way to start your hunt for a coding error is via a printout rather than with a debugger. That's the advice of fpweb.net's tutorial on troubleshooting SQL errors. The problem could be due to a missing single-quote mark, or misuse of double quotes inside a string. If you get the “No value given for one or more required parameters” error message, make sure your column and table names are spelled correctly.
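
For instance, a stray apostrophe is enough to break a statement (a contrived query with placeholder names):

    -- Broken: the apostrophe in O'Brien ends the string literal early
    SELECT * FROM customers WHERE last_name = 'O'Brien';

    -- Fixed: double the single quote inside the literal
    SELECT * FROM customers WHERE last_name = 'O''Brien';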

Likewise, if the error states “Data type mismatch in criteria expression,” you may have inserted letters or symbols in a column set for numeric values only (or vice-versa). The FromDual site provides a complete list of MySQL error codes and messages, including explanations for many of the codes, possible sources for the errors, and instructions for correcting many of the errors.

For many of the MySQL error messages, the FromDual index provides an explanation, reasons for the message's appearance, and potential fixes.

Cloud database service helps ensure clean, efficient code

One of the benefits of the Morpheus cloud database-as-a-service is the ability to analyze your database in real time to identify and address potential security vulnerabilities and other system errors. TechTarget's Brien Posey points to another benefit of the database-as-a-service model: By building redundancy into all levels of their infrastructure, cloud database services help organizations protect against data loss and ensure high availability.

In addition to auto backups, replication, and archiving, Morpheus's service features a solid-state-disk-backed infrastructure that increases I/O operations per second (IOPs) by 100 times. Latency is further reduced via direct connections to EC2. Databases are monitored and managed continuously by Morpheus's crack team of DBAs and by the service's sophisticated robots.

Morpheus supports MongoDB, MySQL, Redis, and Elasticsearch. Platform support includes Amazon Web Services, Rackspace, Heroku, Joyent, Cloud Foundry, and Windows Azure. Visit the Morpheus site for pricing information; free databases are available during the service's beta.

Don't Drown Yourself With Big Data: Hadoop May Be Your Lifeline



TL;DR: The tremendous growth predicted for the open-source Hadoop architecture for data analysis is driven by the mind-boggling increase in the amount of structured and unstructured data in organizations, and the need for sophisticated, accessible tools to extract business and market intelligence from the data. New cloud services such as Morpheus let organizations of all sizes realize the potential of Big Data analysis.

The outlook is rosy for Hadoop -- the open-source framework designed to facilitate distributed processing of huge data sets. Hadoop is increasingly attractive to organizations because it delivers the benefits of Big Data while avoiding infrastructure expenses.

A recent report from Allied Market Research concludes that the Hadoop market will realize a compound annual growth rate of 58.2 percent from 2013 to 2020, to a total value of $50.2 billion in 2020, compared to $1.5 billion in 2012.

 


 

Allied Market Research forecasts a $50.2 billion global market for Hadoop services by the year 2020.

Just how "big" is Big Data? According to IBM, 2.5 quintillion bytes of data are created every day, and 90 percent of all the data in the world was created in the last two years. Realizing the value of this huge information store requires data-analysis tools that are sophisticated enough, cheap enough, and easy enough for companies of all sizes to use.

Many organizations continue to consider their proprietary data too important a resource to store and process off premises. However, cloud services now offer security and availability equivalent to that available for in-house systems. By accessing their databases in the cloud, companies also realize the benefits of affordable and scalable cloud architectures.

The Morpheus database-as-a-service offers the security, high availability, and scalability organizations require for their data-intelligence operations. Performance is maximized through Morpheus's use of 100-percent bare-metal SSD hosting. The service offers ultra-low latency to Amazon Web Services and other peering points and cloud hosting platforms.

 

The Nuts and Bolts of Hadoop for Big Data Analysis

The Hadoop architecture distributes both data storage and processing to all nodes on the network. By placing the small program that processes the data in the node with the much larger data sets, there's no need to stream the data to the processing module. The processor splits its logic between a map and a reduce phase. The Hadoop scheduling and resource management framework executes the map and reduce phases in a cluster environment.

The Hadoop Distributed File System (HDFS) data storage layer uses replicas to overcome node failures and is optimized for sequential reads to support large-scale parallel processing. The market for Hadoop really took off when the framework was extended to support the Amazon Web Services S3 and other cloud-storage file systems.

Adoption of Hadoop in small and midsize organizations has been slow despite the framework's cost and scalability advantages because of the complexity of setting up and running Hadoop clusters. New services do away with much of the complexity by offering Hadoop clusters that are managed and ready to use: there's no need to configure or install any services on the cluster nodes.

 

Netflix data warehouse combines Hadoop and Amazon S3 for infinite scalability

For its petabyte-scale data warehouse, Netflix chose Amazon's Simple Storage Service (S3) over the Hadoop Distributed File System for the cloud-based service's dynamic scalability and limitless data and computational power. Netflix collects data from billions of streaming events from televisions, computers, and mobile devices.

With S3 as its data warehouse, Hadoop clusters with hundreds of nodes can be configured for various workloads, all able to access the same data. Netflix uses Amazon's Elastic MapReduce distribution of Hadoop and has developed its own Hadoop Platform as a Service, which it calls Genie. Genie lets users submit jobs from Hadoop, Pig, Hive, and other tools without having to provision new clusters or install new clients via RESTful APIs.

 

 


The Netflix Hadoop-S3 data warehouse offers unmatched elasticity in terms of data and computing power in a widely distributed network.

There is clearly potential in combining Hadoop and cloud services, as Wired's Marco Visibelli explains in an August 13, 2014, article. Visibelli describes how companies leverage Big Data for forecasting by scaling from small projects via Amazon Web Services and scaling up as their small projects succeed. For example, a European car manufacturer used Hadoop to combine several supplier databases into a single 15TB database, which saved the company $16 million in two years.

Hadoop opens the door to Big Data for organizations of all sizes. Projects that leverage the scalability, security, accessibility, and affordability of cloud services such as Morpheus's database as a service have a much greater chance of success.

 

The SQL Vulnerability Hackers Leverage to Steal Your IDs, Passwords, and More


TL;DR: The theft of hundreds of millions of user IDs, passwords, and email addresses was made possible by a database programming technique called dynamic SQL, which makes it easy for hackers to use SQL injection to gain unfettered access to database records. To make matters worse, the dynamic SQL vulnerability can be avoided by using one of several simple programming alternatives.

How is it possible for a simple hacking method which has been publicized for as many as 10 years to be used by Russian cybercriminals to amass a database of more than a billion stolen user IDs and passwords? Actually, the total take by the hackers in the SQL injection attacks revealed earlier this month by Hold Security was 1.2 billion IDs and passwords, along with 500 million email addresses, according to an article written by Nicole Perlroth and David Gelles in the August 5, 2014, New York Times.

Massive data breaches suffered by organizations of all sizes in recent years can be traced to a single easily preventable source, according to security experts. In an interview with IT World Canada's Howard Solomon, security researcher Johannes Ullrich of the SANS Institute blames an outdated SQL programming technique that continues to be used by some database developers. The shocker is that blocking such malware attacks is as easy as using two or three lines of code in place of one. Yes, according to Ullrich, it's that simple.

The source of the vulnerability is dynamic SQL, which allows developers to create dynamic database queries that include user-supplied data. The Open Web Application Security Project (OWASP) identifies SQL, OS, LDAP, and other injection flaws as the number one application security risk facing developers. An injection involves untrusted data being sent to an interpreter as part of a command or query. The attacker's data fools the interpreter into executing commands or accessing data without authentication.


According to OWASP, injections are easy for hackers to implement, difficult to discover via testing (but not by examining code), and potentially severely damaging to businesses.

The OWASP SQL Injection Prevention Cheat Sheet provides a primer on SQL injection and includes examples of unsafe and safe string queries in Java, C# .NET, and other languages.


An example of an unsafe Java string query (top) and a safe Java PreparedStatement (bottom).

Dynamic SQL lets comments be embedded in a SQL statement by setting them off with hyphens. It also lets multiple SQL statements be strung together, executed in a batch, and used to query metadata from a standard set of system tables, according to Solomon.
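
A classic illustration of why that combination is dangerous, using a hypothetical login query built by string concatenation:

    -- The application assembles: SELECT * FROM users WHERE name = '<input>' AND pw = '<input>'
    -- An attacker supplies the name:  ' OR '1'='1' --   (note the trailing space after --)
    -- The statement the database actually runs becomes:
    SELECT * FROM users WHERE name = '' OR '1'='1' -- ' AND pw = '';
    -- Everything after the comment marker is ignored, so the password check
    -- disappears and every row in the table matches.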

Three simple programming approaches to SQL-injection prevention

OWASP describes three techniques that prevent SQL injection attacks. The first is use of prepared statements, which are also referred to as parameterized queries. Developers must first define all the SQL code, and then pass each parameter to the query separately, according to the OWASP's prevention cheat sheet. The database is thus able to distinguish code from data regardless of the user input supplied. A would-be attacker is blocked from changing the original intent of the query by inserting their own SQL commands.
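
Expressed in MySQL's own SQL syntax, a prepared statement along those lines looks roughly like this (most applications would use the equivalent call in their language's driver; names are placeholders):

    -- The SQL text is fixed up front; user input is bound afterward as pure data
    PREPARE find_user FROM 'SELECT * FROM users WHERE email = ?';
    SET @email = 'whatever the user typed, treated only as a value';
    EXECUTE find_user USING @email;
    DEALLOCATE PREPARE find_user;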

The second prevention method is to use stored procedures. As with prepared statements, developers first define the SQL code and then pass in the parameters separately. Unlike prepared statements, stored procedures are defined and stored in the database itself, and subsequently called from the application. The only caveat to this prevention approach is that the procedures must not contain dynamic SQL, or if it can't be avoided, then input validation or another technique must be employed to ensure no SQL code can be injected into a dynamically created query.
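
A corresponding stored-procedure sketch (again with placeholder names), in which the parameter is used directly rather than spliced into a dynamic SQL string:

    DELIMITER //
    CREATE PROCEDURE find_user_by_email(IN p_email VARCHAR(255))
    BEGIN
        -- p_email is referenced as a value, never concatenated into SQL text
        SELECT * FROM users WHERE email = p_email;
    END //
    DELIMITER ;

    CALL find_user_by_email('someone@example.com');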

The last of the three SQL-injection defenses described by OWASP is to escape all user-supplied input. This method is appropriate only when neither prepared statements nor stored procedures can be used, whether because doing so would break the application or render its performance unacceptable. Also, escaping all user-supplied input doesn't guarantee your application won't be vulnerable to a SQL injection attack. That's why OWASP recommends it only as a cost-effective way to retrofit legacy code. 

All databases support one or more character escaping schemes for various types of queries. You could use an appropriate escaping scheme to escape all user-supplied input. This prevents the database from mistaking the user-supplied input for the developer's SQL code, which in turn blocks any SQL injection attempt.

The belt-and-suspenders approach to SQL-injection prevention

Rather than relying on only one layer of defense against a SQL injection attack, OWASP recommends a layered approach via reduced privileges and white list input validation. By minimizing the privileges assigned to each database account in the environment, DBAs can reduce the potential damage incurred by a successful SQL injection breach. Read-only accounts should be granted access only to those portions of database tables they require by creating a specific view for that specific level of access. Database accounts rarely need create or delete access, for example. Likewise, you can restrict the stored procedures certain accounts can execute. Most importantly, according to OWASP, minimize the privileges of the operating system account the database runs under. MySQL and other popular database systems are set with system or root privileges by default, which likely grants more privileges than the account requires.
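
A rough illustration of that least-privilege idea (the database, view, and account names are hypothetical):

    USE mydb;

    -- A narrow view exposing only the columns the reporting account needs
    CREATE VIEW customer_report AS
        SELECT id, name, city FROM customers;

    -- A read-only account limited to that view: no INSERT, UPDATE, DELETE, or DDL
    CREATE USER 'report_ro'@'%' IDENTIFIED BY 'choose-a-strong-password';
    GRANT SELECT ON mydb.customer_report TO 'report_ro'@'%';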

 

Adopting the database-as-a-service model limits vulnerability

Organizations of all sizes are moving their databases to the cloud and relying on services such as Morpheus to ensure safe, efficient, scalable, and affordable management of their data assets. Morpheus supports MongoDB, MySQL, Redis, ElasticSearch, and other DB engines. The service's real-time monitoring lets you analyze and optimize the performance of database applications.

In addition to 24/7 monitoring of your databases, Morpheus provides automatic backup, restoration, and archiving of your data, which you can access securely via a VPN connection. The databases are stored on Morpheus's solid-state drives for peak performance and reliability.  


No "Buts" About It: The Cloud Is Transforming Your Company's Business Processes


TL;DR: As IT managers gain confidence in the reliability and security of cloud services, it becomes more difficult for them to ignore the cloud's many benefits for all their business's operations. Companies have less hardware to purchase and maintain, they spend only for the storage and processing they need, and they can easily monitor and manage their applications. With the Morpheus database as a service you get all of the above running on a high-availability network that features 24-7 support.

Give any IT manager three wishes and they'll probably wish for three fewer things to worry about. How about 1) having less hardware to buy and manage, 2) having to pay for only the storage and processing you need, and 3) being able to monitor and test applications from a single easy-to-use console?

Knowing the built-in cynicism of many data-center pros, they're likely to scoff at your offer, or at least suspect that it can't be as good as it sounds. That's pretty much the reception cloud services got in the early days, circa 2010.

An indication of IT's growing acceptance of cloud services for mainstream applications is KPMG's annual survey of 650 enterprise executives in 16 countries about their cloud strategies. In the 2011 survey, concerns about data security, privacy, and regulatory compliance were cited as the principal impediments to cloud adoption in large organizations.

According to the results of the most recent KPMG cloud survey, executives now consider cloud integration challenges and control of implementation costs as their two greatest concerns. There's still plenty of fretting among executives about the security of their data in the cloud, however. Intellectual property theft, data loss/privacy, and system availability/business continuity are considered serious problems, according to the survey.


Executives rate such cloud-security challenges as intellectual property theft, data loss, and system availability greater than 4 on a scale of 1 (not serious) to 5 (very serious). Credit: KPMG

Still, security concerns aren't dissuading companies from adopting cloud services. Executives told KPMG that in the next 18 months their organizations planned cloud adoption in such areas as sourcing and procurement; supply chain and logistics; finance, accounting and financial management; business intelligence and analytics; and tax.

Cloud 'migration' is really a 'transformation'

Three business trends are converging to make the cloud an integral part of the modern organization: the need to collect, integrate, and analyze data from all internal operations; the need to develop applications and business processes quickly and inexpensively; and the need to control and monitor the use of data resources that are no longer stored in central repositories.

In a September 2, 2014, article on Forbes.com, Robert LeBlanc explains that cloud services were initially perceived as a way to make operations more efficient and less expensive. But now organizations see the cloud architecture as a way to innovate in all areas of the company. Business managers are turning to cloud services to integrate big data, mobile computing, and social media into their core processes.

 


Mobile and collaboration are leading the transition in organizations away from on-site management and toward cloud platforms. Credit: Ventana Research

George Washington University discovered first-hand the unforeseen benefits of its shift to a cloud-based data strategy. Zaid Shoorbajee describes in the March 3, 2014, GW Hatchet student newspaper how a series of campus-wide outages motivated the university to migrate some operations to cloud services. The switch saved the school $700,000 and allowed its IT staff to focus more on development and less on troubleshooting.

The benefits the school realized from the switch extend far beyond IT, however. Students now have the same "consumer and social experience" they've become accustomed to in their private lives through Google, iTunes, and similar services, according to a university spokesperson.

Four approaches to cloud application integration

Much of the speed, efficiency, and agility of cloud services can be lost when organizations become bogged down in their efforts to adapt legacy applications and processes. In a TechTarget article (registration required), Amy Reichert presents four approaches to cloud application integration. The process is anything but simple, due primarily to the nature of the applications themselves and the need to move data seamlessly and accurately between applications to support business processes.

One of the four techniques is labeled integration platform as a service (iPaaS), in which the cloud service itself provides integration templates featuring such tools as connectors, APIs, and messaging systems. Organizations then customize and modify the templates to meet their specific needs.

In cloud-to-cloud integration, the organization's cloud applications have an integration layer built in to support any required data transformations, as well as encryption and transportation. The cloud-to-integrator-to-cloud model relies on the organization's existing middleware infrastructure to receive, convert, and transport the data between applications.

Finally, the hybrid integration approach keeps individual cloud apps separate but adds an integration component to each. This allows organizations to retain control over the data, maximize its investment in legacy systems, and adopt cloud services at the company's own pace.

Regardless of your organization's strategy for adopting and integrating cloud applications, the Morpheus database as a service can play a key role by providing a flexible, secure, and reliable platform for monitoring and optimizing database applications. Morpheus's SSD-backed infrastructure ensures lightning fast performance, and direct patches into EC2 offer ultra-low latency.

Morpheus protects your data via secure VPC connections and automatic backups, replication, and archiving. The service supports ElasticSearch, MongoDB, MySQL, and Redis, as well as custom storage engines. Create your free database during the beta period.

How Database Failure Left Harry Potter and Thousands of Travelers Stranded Outside the US



Daniel Radcliffe. Photograph: Perou/Guardian

TL;DR: As the US State Department coped with massive database failure, thousands of travelers (and one Harry Potter star) were prevented entry to the United States. Even once the database was brought back online, it only worked in a limited capacity, resulting in extensive backlogs that added days, if not a full week, to wait times for visas and passports. Former U.S. Chief Technology Officer Todd Park wants government IT to move in the direction of open source, cloud-based computing. If you aren’t using SSD-backed cloud database infrastructure, it’s time to catch up.

The U.S. government might be able to afford the database downtime that most IT professionals price at $20,000 per hour minimum (some put that number in the hundreds of thousands), but chances are, most businesses are not equipped to suffer the consequences of database failure.

After the massive freak show that was the first iteration of Healthcare.gov, U.S. Chief Technology Officer (until a few days ago) Todd Park told Wired’s Steven Levy that he’d like to move government IT into the future with the rest of us, employing cloud-based, open source, rapid-iteration computing. He’s approached Silicon Valley tech gurus, imploring them to step in and champion for change. And he’s not shy about how dire the situation is: “America needs you! Not a year from now! But right. The. Fuck. Now!”

But that sort of modernization definitely hadn’t happened in Washington by mid-July, when State Department operations worldwide ground to a near standstill after the Consular Consolidated Database (CCD) crashed before their eyes.

As a result, an untold number of travelers were stuck waiting at embassies for authorizations that took, on average, a week longer to deliver than usual. Students, parents adopting new babies abroad, and vacationers alike found themselves trapped across the world from their destinations, all due to a system backup failure.

Database Crash Destroyed State Department Productivity

So what happened here? According to the DOS, an Oracle software upgrade “interfered with the smooth interoperability of redundant nodes.” On July 20, the Department began experiencing problems with the CCD, creating an ever-growing backlog of visa applications. While the applications were not lost, the crash rendered the CCD mostly unusable for at least a week.

The backlog included only applications for non-immigrant visas. While the DOS would not confirm how many travelers were affected by the outage, State Department metrics show that 9 million non-immigrant visas were issued in 2012. Records are not yet available for more recent years, but United Press International reports that DOS issues 370,000 visas weekly and was operating at less than 50 percent productivity with a minimally functional system throughout the second half of July.


Nearly 9 million non-immigrant visas issued in 2012 alone. Backlogs due to database failure can be crippling. Credit: US Department of State

DOS’s Crashed Database Trapped Harry Potter, US Citizens, and Visitors Abroad

Daniel Radcliffe, forever known for his title role in the eight Harry Potter films, was among many people who faced impeded travel after the CCD failure. En route to Comic-Con in San Diego after a brief trip to Toronto for a premiere, even Radcliffe had to wait at the border due to delays in processing his new visa.

But while Radcliffe got an emergency pass, many less famous travelers weren’t so lucky. Several dozen families were living in a Guangzhou, China hotel after being unable to obtain visas for their newly adopted children. One Maryland family of seven was stuck for days, and they weren’t alone. The Washington Post reported that at least 30 other families were waiting, too, unable to return home as the DOS coped with the tech glitch.

Chinese students headed to the States for university studies were also delayed, alongside non-citizen Ireland residents traveling to the US for vacations. The US State Department’s Facebook page shows posts as late as August 22 asking for advice regarding delays in passport issuance.

Businesses Can’t Afford to Rely on Archaic Database Solutions

The Department of State posted an FAQ on August 4 in which they claimed that while they had made “significant progress,” they were still working to restore the Consular Consolidated Database, which was brought back online July 23 only partially functional, to “full operational capacity.” They still didn’t know the precise cause of the breakdown. The State Department hasn’t issued any statements since the August 4 update.

Needless to say, this debacle has caused a massive headache for the government and for travelers alike. But downtime causes headaches for companies in every industry. An International Data Corporation study reports that 98% of US and 96% of UK companies have to perform file recovery at least once per week. With downtime costing at least $20K per hour for both groups, and often considerably more, it’s imperative that businesses use database solutions that promise quick recovery. 

Average cost of unplanned data center downtime: downtime costs in more ways than one. Credit: Emerson Network Power

Morpheus Backs Up Automatically and Is Lightning Fast

Clearly, few businesses can withstand the downtime from which the State Department continues to recover. You need your database to work quickly and reliably. The Morpheus cloud database-as-a-service offers auto backups, replication, and archiving. Since it operates via an online console, you’ll never have to worry about losing access to your systems and data. Its SSD-backed infrastructure increases IOPs by over 100 times, making it reliably fast. Direct connection to EC2 vastly reduces latency.

Todd Park wouldn’t want to move government IT to the cloud if he didn’t trust the security. Morpheus is secured by VPN and is safe from interference from the public internet. The Morpheus platform is continually monitored and managed by the sharply dedicated and experienced team at Morpheus, as well as sophisticated robots made for the job. You can also monitor in real time the queries that could potentially bog down your app performance. Support is available 24 hours a day.

Morpheus works with Elasticsearch, MongoDB, MySQL, and Redis. While Morpheus is in beta, you can try it at no cost. Prices after beta and supported platforms are listed on the Morpheus web site.

Why More Is Better with Database Management: The Multicloud Approach


 

TL;DR: At one time, organizations planning their cloud strategy adopted an either-or approach: Either store and manage data on a secure private cloud, or opt for the database-as-a-service model of the public cloud. Now companies are realizing the benefits of both options by adopting a multicloud strategy that places individual applications on the platform that best suits them.

In IT's never-ending quest to improve database performance and reduce costs, a new tactic has surfaced: multicloud. Rather than process all database queries on either the private cloud or public cloud, shift the processing to the platform best able to handle it in terms of speed and efficiency.

InfoWorld's David Linthicum explains in an August 5, 2014, article that a multicloud architecture "gives those who manage large distributed databases the power to use only the providers who offer the best and most cost-effective service -- or the providers who are best suited to their database-processing needs."

Managing the resulting complexity isn't as daunting as it may sound, according to Linthicum. In fact, a cloud-management system could soon become a requirement for IT departments of all sizes. Product lifecycle management (PLM) expert Oleg Shilovitsky claims in an August 5, 2014, article on BeyondPLM.com that three trends are converging to make distributed database architectures mandatory.

The first trend is the tsunami of data that is overwhelming information systems and pushing traditional database architectures to their physical limits. The second trend is the increasingly distributed nature of organizations, which are adopting a design-anywhere, build-anywhere philosophy. The third trend is the demand among users for ever-faster performance on many different platforms to keep pace with the changes in the marketplace.

Multicloud: More than simply pairing public and private

In a July 12, 2013, article, InfoWorld's Linthicum compared the process of adopting a multicloud strategy to the transition a decade or more ago to distributed internal systems customized to the specific demands of the business. A key to managing the increased complexity of multicloud systems is carefully choosing your service provider to ensure a good fit between their offerings and your company's needs.

Three key considerations in this regard are security, accessibility, and scalability. These are three areas where the Morpheus database-as-a-service shines. In addition to lightning-fast SSD-based infrastructure that increases IOPs by 100 times, Morpheus provides real-time monitoring for identifying and optimizing database queries that are impeding database performance.

Morpheus offers ultra-low latency to leading Internet peering points and cloud hosts. Additionally, fault tolerance, disaster recovery, and automated backups make Morpheus a unique Database as a service. You connect to your databases via secure VPC. Visit the Morpheus site for pricing information or to create a free account during the beta period.

Mixing old and new while maximizing adaptability

Businesses of all types and sizes are emphasizing the ability to shift gears quickly in anticipation of industry trends. No longer can you simply react to market changes: You must be there ahead of the competition.

A principal benefit of the multicloud database architecture is flexibility. In an August 25, 2014, article on Forbes.com, IBM's Jeff Borek highlights the ability of multicloud databases to leverage existing IT infrastructure while realizing the agility, speed, and cost savings of cloud services.

A typical multicloud approach is use of the private cloud as a point-of-control interface to public cloud services. MSPMentor's Michael Brown describes such an architecture in an August 27, 2014, article.

Many companies use a private cloud to ensure regulatory compliance for storing health, financial, and other sensitive data. In such systems, the private cloud may serve as the gateway to the public cloud in a two-tier structure. In addition to providing a single interface for users, the two levels allow applications and processes to be customized for best fit while keeping sensitive data secure.

A multicloud-application prototype: Managing multiple application servers

There's no denying that managing a distributed database system is more complicated than maintaining the standard top-down RDBMS of yesteryear. In a July 23, 2013, article on GitHub, German Ramos Garcia presents a prototype multicloud application development model based on the Hydra service. The model addresses much of the complexity entailed in managing multiple application servers.

The web application is first divided into static elements (images, JavaScript, static HTML, etc.), dynamic elements on a backend server, and a database to support the backend servers.

A prototype multicloud application architecture separates static, dynamic, and database-support servers.

The distributed architecture must provide mechanisms for controlling the various servers, balancing traffic between servers, and recovering from failures. It must also control sessions between servers and determine where to store application data.
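To give a flavor of the traffic-balancing piece, here is a minimal sketch in Python, purely illustrative and not drawn from the Hydra-based prototype (the backend hostnames are hypothetical), of a round-robin chooser that skips backends failing a simple health check:

```python
# Illustrative sketch: round-robin selection over backend servers,
# skipping any backend that fails a basic TCP health check. Hostnames are made up.
import itertools
import socket

BACKENDS = ["app1.example.com", "app2.example.com", "app3.example.com"]
_cycle = itertools.cycle(BACKENDS)

def is_healthy(host: str, port: int = 80, timeout: float = 0.5) -> bool:
    """Consider a backend healthy if it accepts a TCP connection quickly."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend() -> str:
    """Return the next healthy backend, trying each candidate at most once."""
    for _ in range(len(BACKENDS)):
        host = next(_cycle)
        if is_healthy(host):
            return host
    raise RuntimeError("no healthy backends available")
```

A real deployment would typically lean on a dedicated load balancer or service registry, but the sketch captures the two responsibilities named above: distributing traffic and routing around failed servers.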

An alternative approach to multicloud management is presented by Mauricio J. Rojas in a blog post from March 25, 2014. The model Rojas proposes is a mash-up of management tools from many different cloud services.

Management tools for distributed cloud-based databases should focus on user needs and offer best of breed from various providers.

Rojas recommends creating a single set of management components for both the public and private clouds. This allows you to "create the same conditions in both worlds" and move seamlessly between the public and private domains.

In addition to security, important considerations in developing a multicloud management system are auto-scaling and high availability. With the Morpheus database-as-a-service, you're covered in all three areas right out of the box; even Pinal Dave of SQL Authority uses Morpheus. Make Morpheus a key element of your multicloud strategy.

Can A Silicon Valley CTO Save Government Software From Itself?


TL;DR: Following several high-profile development disasters, government IT departments have received a mandate to change their default app-development approach from the traditional top-down model to the agile, iterative, test-centric methodology favored by leading tech companies. While previous efforts to dynamite the entrenched, moribund IT-contracting process have crashed in flames, analysts hold out hope for the new 18F and U.S. Digital Service initiatives. Given the public's complete lack of faith in the government's ability to provide digital services, failure is simply not an option.

Can Silicon Valley save the federal government from itself? That's the goal of former U.S. Chief Technology Officer Todd Park, who relocated to California this summer and set about recruiting top-tier application developers from the most innovative tech companies on the planet to work for the government.

As Wired's Steven Levy reports in an August 28, 2014, article, Park hopes to appeal to developers' sense of patriotism. "America needs you," Levy quotes Park telling a group of engineers at the Mozilla Foundation headquarters. A quick review of recent federal-government IT debacles demonstrates the urgency of Park's appeal.

Start with the $300 million spent over the past six years by the Social Security Administration on a disability-claim filing system that remains unfinished. Then check out the FBI's failed Virtual Case File case-management initiative that had burnt through $600 million before being replaced by the equally troubled Sentinel system, as Jason Bloomberg explains in an August 22, 2012, CIO article.

But the poster child of dysfunctional government app development is HealthCare.gov, which Park was brought in to save after its spectacularly failed launch in October 2013. For their $300 million investment, U.S. taxpayers got a site that took eight seconds to respond to a mouse click and crashed so often that not one of the millions of people visiting the site on its first day of operation was able to complete an application.

Healthcare.gov's performance in the weeks after its launch highlights what can happen when a $300 million development project proceeds with no one in the driver's seat. Credit: The Verge

The dynamite approach to revamping government IT processes

Just months before HealthCare.gov's epic crash-and-burn, Park had established the Presidential Innovation Fellows program to attract tech professionals to six-month assignments with the government. The program was envisioned as a way to seed government agencies with people who could introduce cutting-edge tools and processes to their development efforts. After initial successes with such agencies as Medicare and Veterans Affairs, the group turned its attention to rescuing HealthCare.gov -- and perhaps the entire Affordable Care Act.

The source of the site's problems quickly became obvious: the many independent contractors assigned to portions of the site worked in silos, and no single contractor was responsible to ensure the whole shebang actually worked. Even as the glitches stacked up following the failed launch, contractors continued to work on new "features" because they were contractually required to meet specific goals.

The culprit was the federal contracting process. Bureaucrats farmed out contracts to cronies and insiders, whose only motivation was to be in good position to win the next contract put up for bid, according to Levy. Park's team of fixers was met with resistance at every turn despite being given carte blanche to ignore every rule of government development and procurement.

With persistence and at least one threat of physical force, the ad-hoc team applied a patchwork of monitoring, testing, and debugging tools that got the site operational. By April 2014, HealthCare.gov had achieved its initial goal of signing up 8 million people for medical insurance.

How an agile-development approach could save democracy

The silver lining of the HealthCare.gov debacle is the formation of two new departments charged with bringing an agile approach to government app development. The General Services Administration's 18F was established earlier this year with a mandate to "fail fast" rather than follow the standard government-IT propensity to fail big.

As Tech President's Alex Howard describes in an August 14, 2014, article, 18F is assisting agencies as they develop free, open-source services offered to the public via GitHub and other open-source repositories. Perhaps an even bigger shift in attitude by government officials is the founding last month of the U.S. Digital Service, which is modeled after a successful U.K. government app-development program.

To help agencies jettison their old development habits in favor of modern approaches, the White House released the Digital Services Playbook that provides 13 "plays" drawn from successful best practices in the private and public sectors. Two of the plays recommend deploying in a flexible hosting environment and automating testing and deployment.

The government's Digital Services Playbook calls for agencies to implement modern development techniques such as flexible hosting and automated testing.

That's precisely where the Morpheus database-as-a-service (DBaaS) fits into the government's plans. Morpheus lets users spin up a new database instance in seconds -- there's no need to wait for lengthy IT approval to procure and provision a new DB. Instead, it's all done in the cloud within seconds.

In addition, users' core elastic, scalable, and reliable DB infrastructure is taken care of for them. Developers can focus on building the core functionality of the app rather than spending their time making the infrastructure reliable and scalable. Morpheus delivers continuous availability, fault tolerance, failover, and disaster recovery for all databases running on its service. Last but definitely not least, Morpheus is cost efficient: there's no upfront setup cost, and users pay only for actual usage.

The Morpheus cloud database as a service (DBaaS) epitomizes the goals of the government's new agile-development philosophy. The service's real-time monitoring makes continuous testing a fundamental component of database development and management. Morpheus's on-demand scalability ensures that applications have plenty of room to grow without incurring large up-front costs. You get all this plus industry-leading performance, VPN security, and automatic backups, archiving, and replication.

Government IT gets the green light to use cloud app-development services

As groundbreaking as the Digital Services Playbook promises to be for government IT, another publication released at the same time may have an even greater positive impact on federal agencies. The TechFAR Handbook specifies how government contractors can support an "iterative, customer-driven software development process."

Tech President's Howard quotes Code for America founder Jen Pahlka stating that the handbook makes it clear to government IT staff and contractors alike that "agile development is not only perfectly legal, but [is] in fact the default methodology."

Critics point out that this is not the government's first attempt to make its application development processes more open and transparent. What's different this time is the sense of urgency surrounding efforts such as 18F and the U.S. Digital Service. Pahlka points out that people have lost faith in the government's ability to provide even basic digital services. Pahlka is quoted in a July 21, 2014, Government Technology interview by Colin Wood and Jessica Mulholland as stating, "If government is to regain the trust and faith of the public, we have to make services that work for users the norm, not the exception."

Cloud Database Security, Farms and Restaurants: The Importance of Knowing Your Sources


TL;DR: Securing your company's cloud-based assets starts by applying tried-and-true data-security practices modified to address the unique characteristics of virtual-network environments. Cloud services are slowly gaining the trust of IT managers, who are justifiably hesitant to extend their security perimeter to accommodate placing their company's critical business assets in the cloud.

The fast pace of technological change doesn't faze IT pros, who live the axiom "The more things change, the more they stay the same." The solid security principles that have protected data centers for generations apply to securing your organization's assets that reside in the cloud. The key is to anticipate the new threats posed by cloud technology -- and by cyber criminals who now operate with a much higher level of sophistication.

In a September 18, 2014, article, ZDNet's Ram Lakshminarayanan breaks down the cloud-security challenge into four categories: 1) defending against cloud-based attacks by well-funded criminal organizations; 2) unauthorized access and data breaches that use employees' stolen or compromised mobile devices; 3) maintenance and monitoring of cloud-based APIs; and 4) ensuring compliance with the growing number and complexity of government regulations.

IT departments are noted for their deliberate approach to new technologies, and cloud-based data services are no different. According to a Ponemon Institute survey of more than 1,000 European data-security practitioners published this month (pdf), 64 percent believe their organization's use of cloud services reduces their ability to protect sensitive information.

The survey, which was sponsored by Netskope, blames much of the distrust on the cloud multiplier effect: IT is challenged to track the increasing number and type of devices connecting to the company's networks, as well as the cloud-hosted software employees are using, and the business-critical applications being used in the "cloud workspace."

Building trust between cloud service providers and their IT customers

No IT department will trust the organization's sensitive data to a service that fails to comply with privacy and data-security regulations. The Ponemon survey indicates that cloud services haven't convinced their potential customers in Europe of their trustworthiness: 72 percent of respondents strongly disagreed, disagreed, or were uncertain whether their cloud-service providers were in full compliance with privacy and data-security laws.

Data-security executives remain leery of cloud services' ability to secure their organization's critical business data. Credit: Ponemon Institute

Even more troubling for cloud service providers is the survey finding that 85 percent of respondents strongly disagreed, disagreed, or weren't sure whether their cloud service would notify them immediately in the event of a data breach that affected their company's confidential information or intellectual property.

The Morpheus database-as-a-service puts data security front and center by offering VPN connections to your databases in addition to online monitoring and support. Your databases are automatically backed up, replicated, and archived on the service's SSD-backed infrastructure.

Morpheus also features market-leading performance, availability, and reliability via direct connections to EC2 and colocation with the fastest peering points available. The service's real-time monitoring lets you identify and optimize the queries that are slowing your database's performance. Visit the Morpheus site for pricing information and to sign up for a free account.

Overcoming concerns about cloud-service security

Watching your data "leave the nest" can be difficult for any IT manager. Yet cloud service providers offer a level of security at least on par with that of on-premises networks. In a September 15, 2014, article on Automated Trader, Bryson Hopkins points out that Amazon Web Services and Microsoft Azure are two of the many public cloud services that comply with Service Organization Control (SOC), HIPAA, FedRAMP, ISO 27001, and other security standards.

The SANS Institute's Introduction to Securing a Cloud Environment (pdf) explains that despite the cloud's increased "attack surface" when compared with in-house servers, the risk of cloud-based data being breached is actually less than that of losing locally hosted data. Physical and premises security are handled by the cloud service but can be enhanced by applying a layered approach to security that uses virtual firewalls, security gateways, and other techniques.

Cloud services avoid resource contention and other potential problems resulting from multi-tenancy by reprovisioning virtual machines, overprovisioning to crowd out other tenants, and using fully reserved capacities.

Another technique for protecting sensitive data in multi-tenant environments is to isolate networks by configuring virtual switches or virtual LANs. The virtual machine and management traffic must be isolated from each other at the data link layer (layer 2) of the OSI model.

The key to protecting sensitive data in a multi-tenant cloud environment is to isolate virtual machine and management traffic at the data link layer. Credit: SANS Institute

In a June 27, 2014, article on CloudPro, Davey Winder brings the issue of cloud security full circle by highlighting the fact that the core principles are the same as for other forms of data security: an iron-clad policy teamed with encryption. The policy must limit privileged-user access by the service's employees and provide a way for customers to audit the cloud network.

One way to compare in-house data management and cloud-based management is via the farmer-restaurant analogy described in a September 15, 2014, article by Arun Anandasivam on IBM's Thoughts on Cloud site. If you buy your food directly from the farmer, you have a first-hand impression of the person who grew your food, but your options may be limited and you have to do the preparation work. If you buy your food from a restaurant, you likely have a wider selection to choose from and you needn't prepare the meal, but you have less control over the food's path from farm to kitchen, and you have fewer opportunities to determine beforehand whether the food meets your quality requirements.

That's not to say farmers are any more or less trustworthy than restaurants. You use the same senses to ensure you're getting what you paid for, just in different ways. So check out the Morpheus database-as-a-service to see what's on the menu!

DevOps: The Slow Tsunami That's Transforming IT


TL;DR: Old divisions in IT departments between app development and operations are crashing to the ground as users demand more apps with more features, and right now! By combining agile-development techniques and a hybrid public-private cloud methodology, companies realize the benefits of new technologies and place IT at the center of their operations.

The re-invention of the IT department is well underway. The end result will put technology at the core of every organization.

Gone are the days when IT was perceived as a cost center whose role was to support the company's revenue-generating operations. Today, software is woven into every facet of the organization, whether the company makes lug nuts or spacecraft, lima beans or Linux distros.

The nexus of the IT transformation is the intersection of three disparate-yet-related trends: the merger of development and operations (DevOps), the wide-scale adoption of agile-development methodologies, and the rise of hybrid public/private clouds.

In a September 12, 2014, article, eWeek's Chris Preimesberger quotes a 2013 study by Puppet Labs indicating the switch to DevOps is well underway: 66 percent of the organizations surveyed had adopted DevOps or planned to do so, and 88 percent of telecoms use or intend to use a DevOps approach. The survey also found that DevOps companies deploy code 30 times more frequently than their traditional counterparts.

Closing the loop that links development and operations

A successful DevOps approach requires a closed loop connecting development and operations via continuous integration and continuous deployment. This entails adoption of an entirely new and fully automated development toolset. Traditional IT systems simply can't support the performance, scalability, and latency requirements of a continuous-deployment mentality. These are the precise areas where cloud architectures shine.

Agile development combines with DevOps to create a service-based approach to the provisioning, support, and maintenance of apps. Source: Dev2Ops

For example, the Morpheus database-as-a-service offers ultra-low latency via direct connections to EC2 and colocation with some of the fastest peering points available. You can monitor and optimize your apps in real time and spot trends via custom metrics. Morpheus's support staff and automated monitoring tools watch your database infrastructure continuously, and custom MongoDB and MySQL storage engines are available.

In addition, you're assured high availability via secure VPC connections to the network, which uses 100-percent bare-metal SSD storage. Visit the Morpheus site for pricing information and to sign up for a free account.

Continuous integration + continuous delivery = continuous testing

Developers steeped in the tradition of delivering complete, finished products have to turn their thinking around 180 degrees. Dr. Dobb's Andrew Binstock explains in a September 16, 2014, article that continuous delivery requires deploying tested, usable apps that are not feature-complete. The proliferation of mobile and web interfaces makes constant tweaks and updates not only possible but preferable.

Pushing out 10 or more updates in a day would have been unheard of in a turn-of-the-millennium IT department. The incessant test-deploy-feedback loop is possible only if developers and operations staff work together to ensure smooth roll-outs and fast, effective responses to the inevitable deployment errors and other problems.

Integrating development and operations so completely requires not just a reorganization of personnel but also a change in management philosophy. However, the benefits of such a holistic approach to IT outweigh the short-term pain of the organizational adjustments required.

A key to smoothing out some of the bumps is use of a hybrid-cloud philosophy that delivers the speed, scalability, and cost advantages of the public cloud while shielding the company's mission-critical applications from the vagaries of third-party platforms. Processor, storage, and network resources can be provisioned quickly as services by using web interfaces and APIs.

Seeing apps as a collection of discrete services

Imagine a car that's still drivable with only three of its four wheels in place. That's the idea behind developing applications as a set of discrete services, each of which is able to function independently of the others. Also, the services can be swapped in and out of apps on demand.

This is the "microservice architecture" described by Martin Fowler and James Lewis in a March 25, 2014, blog post. The many services that comprise such an app run in their own processes and communicate via an HTTP resource API or other lightweight mechanism. The services can be written in different programming languages and can use various storage technologies because they require very little centralized management.
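As a rough illustration of the idea (a sketch, not code from Fowler and Lewis's post), the snippet below uses only Python's standard library to expose a single service over a plain HTTP resource API; the /orders resource and its data are hypothetical:

```python
# Minimal sketch of one microservice exposing an HTTP resource API
# using only the Python standard library. The /orders resource is hypothetical.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

ORDERS = {"1001": {"item": "widget", "qty": 3}}  # stand-in for the service's own datastore

class OrderService(BaseHTTPRequestHandler):
    def do_GET(self):
        # Route /orders/<id> to a JSON document; other services talk to this one only over HTTP.
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "orders" and parts[1] in ORDERS:
            body = json.dumps(ORDERS[parts[1]]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), OrderService).serve_forever()
```

Another service (or a web front end) would consume this resource with an ordinary HTTP GET, which is what keeps each service independently deployable.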

The microservice architecture implements each function of the app as a separate service rather than encapsulating all functions in a single process. Source: Martin Fowler

By using services rather than libraries as components, the services can be deployed independently. When a service changes, only that service needs to be redeployed -- with some noteworthy exceptions, such as changes to service interfaces.

No longer are applications "delivered" by developers to users. In the world of DevOps, the team "developing" the app owns it throughout its lifecycle. Thus the "developers" take on the sys-admin and operations support/maintenance roles. Gone are the days of IT working on "projects." Today, all IT staff are working on "products." This cements the position of the company's technology workers at the center of all the organization's operations.

NoSQL Will Protect You From The Onslaught of Data Overload (or a bull charging down an alley)


TL;DR: As the amount of unstructured data being collected by organizations skyrockets, their existing databases come up short: they're too slow, too inflexible, and too expensive. What's needed is a DBMS that isn't constricted by the relational schema, and one that accommodates object-oriented data structures without the complexity and latency of object-relational mapping frameworks. NoSQL (a.k.a. Not Only SQL) provides the flexibility, scalability, and availability required to manage the deluge of unstructured data, albeit with some shortcomings of its own.

Data isn't what it used to be. Gone (or going) is the well-structured model of data stored neatly in tables and rows queried via the data's established relations. Along comes Google, Facebook, Twitter, Amazon, and untold other sources of unstructured data that simply doesn't fit comfortably in a conventional relational database.

That isn't to say RDBMSs are an endangered species. An August 24, 2014, article on TheServerSide.com points out that enterprises continue to prefer SQL databases, primarily for their reliability through compliance with the atomicity, consistency, isolation, and durability (ACID) model. Also, there are plenty of DBAs with relational SQL experience, but far fewer with NoSQL skills.

Still, RDBMSs don't accommodate unstructured data easily -- at least not in their current form. The future is clearly one in which the bulk of data in organizations is unstructured. As far back as June 2011 an IDC study (pdf) predicted that 90 percent of the data generated worldwide in the next decade would be unstructured. How much data are we talking? How about 8000 exabytes in 2015, which is the equivalent of 8 trillion gigabytes.

The tremendous growth in the amount of data in the world -- most of which is unstructured -- requires a non-relational approach to management. Credit: IDC

As the IDC report points out, enterprises can no longer afford to "consume IT" as part of their internal infrastructure, but rather as an external service. This is particularly true as cloud services such as the Morpheus database as a service (DBaaS) incorporate the security and reliability required for companies to ensure the safety of their data and regulatory compliance.

By supporting both MongoDB and MySQL, Morpheus offers organizations the flexibility to transition existing databases and their new data stores to the cloud. They can use a single console to monitor their queries in real time to find and remove performance bottlenecks. Connections are secured via VPN, and all data is backed up, replicated, and archived automatically. Morpheus's SSD-backed infrastructure ensures fast connections to data stores, and direct links to EC2 provide ultra-low latency. Visit the Morpheus site for pricing information or to sign up for a free account.

Addressing SQL's scalability problem head-on

A primary shortcoming of SQL is that as the number of transactions being processed goes up, performance goes down. The traditional solution is to add more RDBMS servers, but doing so is expensive, not to mention a management nightmare as optimization and troubleshooting become ever more complex.

With NoSQL, your database scales horizontally rather than vertically. The resulting distributed databases host data on thousands of servers that can be added or deleted without affecting performance. Of course, reality is rarely this simple. In a November 20, 2013, article on InformationWeek, Joe Masters explains that high availability is simple to achieve on read-only distributed systems. Writing to those systems is much trickier.

As stated in the CAP theorem (or Brewer's theorem, named after Eric Brewer), when a network partition occurs a distributed system can preserve either strict availability or strict consistency, but not both. NoSQL databases lean toward the availability side, at the expense of consistency. However, distributed databases are getting better at handling timeouts, although there's no way to do so without affecting the database's performance.

Another NoSQL advantage is that it doesn't lock you into a rigid schema the way SQL does. As Jnan Dash explains in a September 18, 2013, article on ZDNet, revisions to the data model can cause performance problems, but rarely do designers know all the facts about the data model before it goes into production. The need for a dynamic data model plays into NoSQL's strength of accommodating changes in markets, changes in the organization, and even changes in technology.

The benefits of NoSQL's data-model flexibility

NoSQL data models are grouped into four general categories: key-value (K-V) stores, document stores, column-oriented stores, and graph databases. Ben Scofield has rated these NoSQL database categories in comparison with relational databases. (Note that there is considerable variation between NoSQL implementations.)

The four general NoSQL categories are rated by Ben Scofield in terms of performance, scalability, flexibility, complexity, and functionality. Credit: Wikipedia

The fundamental data model of K-V stores is the associative array, which is also referred to as a map or directory. Each possible key of a key-value pair appears in a collection no more than once. As one of the simplest non-trivial data models, K-V stores are often extended to more-powerful ordered models that maintain keys in lexicographic order, among other purposes.
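As a quick, toy-level sketch (not how BigTable or any particular product implements it), Python's built-in dict behaves like such an associative array, and sorting the keys gives the ordered, lexicographic view that makes prefix scans cheap:

```python
# Toy sketch of a key-value store as an associative array (dict),
# plus an "ordered" view that walks keys in lexicographic order. Keys are made up.
store = {}
store["user:1001:name"] = "Ada"
store["user:1001:email"] = "ada@example.com"
store["user:1002:name"] = "Grace"

# Each key appears at most once; writing it again simply replaces the value.
store["user:1002:name"] = "Grace H."

# An ordered model makes range scans by key prefix cheap.
for key in sorted(store):
    if key.startswith("user:1001:"):
        print(key, "->", store[key])
```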

The documents that comprise the document store encapsulate or encode data in standard formats, typically XML, YAML, or JSON (JavaScript Object Notation), but also binary BSON, PDF, and MS-Office formats. Documents in collections are somewhat analogous to records in tables, although the documents in a collection won't necessarily share all fields the way records in a table do.

A NoSQL column is a key-value pair consisting of a unique name, a value, and a timestamp. The timestamp is used to distinguish valid content from stale content and thus addresses the consistency shortcomings of NoSQL. Columns in distributed databases don't need the uniformity of columns in relational databases because NoSQL "rows" aren't tied to "tables," which exist only conceptually in NoSQL.
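To make that concrete, here is a minimal, illustrative sketch (an assumed structure, not any specific database's storage format) of a column as a name/value/timestamp triple in which the newest timestamp wins:

```python
# Minimal sketch: a NoSQL "column" as a (value, timestamp) entry under a column name.
# When replicas disagree, the entry with the newest timestamp is treated as current.
import time

rows = {}  # row key -> {column name -> (value, timestamp)}

def write(row_key, column, value):
    rows.setdefault(row_key, {})
    existing = rows[row_key].get(column)
    candidate = (value, time.time())
    # Keep whichever version carries the later timestamp.
    if existing is None or candidate[1] >= existing[1]:
        rows[row_key][column] = candidate

write("user:1001", "email", "ada@example.com")
write("user:1001", "email", "ada@newdomain.example")  # later write supersedes the earlier one
print(rows["user:1001"]["email"][0])
```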

Graph databases use nodes, edges, and properties to represent and store data without need of an index. Instead, each database element has a pointer to adjacent elements. Nodes can represent people, businesses, accounts, or other trackable items. Properties are data elements that pertain to the nodes, such as "age" for a person. Edges connect nodes to other nodes and to properties; they represent the relationships between the elements. Most of the analysis is done via the edges.
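A small illustrative sketch in plain Python (hypothetical data, no actual graph database involved) shows nodes carrying properties and edges carrying the relationships that the analysis walks:

```python
# Toy sketch of a property graph: nodes carry properties, edges carry relationships.
nodes = {
    "alice": {"type": "person", "age": 34},
    "acme":  {"type": "business"},
}
edges = [
    ("alice", "works_at", "acme"),  # (from-node, relationship, to-node)
]

# Most analysis traverses the edges, e.g. "where does alice work?"
employers = [dst for src, rel, dst in edges if src == "alice" and rel == "works_at"]
print(employers)  # ['acme']
```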

Once you've separated the NoSQL hype from the reality, it becomes clear that there's plenty of room in the database environments of the future for NoSQL and SQL alike. Oracle, Microsoft, and other leading SQL providers have already added NoSQL extensions to their products, as InfoWorld's Eric Knorr explains in an August 25, 2014, article. And with DBaaS services such as Morpheus, you get the best of both worlds: MongoDB for your NoSQL needs, and MySQL for your RDBMs. It's always nice to have options!

How Is Google Analytics So Damn Fast?


TL;DR: Google Analytics stores a massive amount of statistical data from web sites across the globe. Retrieving reports quickly from such a large amount of data requires Google to use a custom solution that is easily scalable whenever more data needs to be stored.

At Google, any number of applications may need to be added to their infrastructure at any time, and each of these could potentially have extremely heavy workloads. Resource demands such as these can be difficult to meet, especially when there is a limited amount of time to get the required updates implemented.

If Google were to use the typical relational database on a single server node, they would need to upgrade their hardware each time capacity is reached. Given the number of applications being created and the amount of data being used by Google, this type of upgrade could quite possibly be necessary on a daily basis!

The load could also be shared across multiple server nodes, but once more than a few additional nodes are required, the complexity of the system becomes extremely difficult to maintain.

With these things in mind, a standard relational database setup would not be a particularly attractive option due to the difficulty of upgrading and maintaining the system on such a large scale.

Finding a Scalable Solution

In order to maintain speed and ensure that such incredibly quick hardware upgrades are not necessary, Google uses its own data storage solution called BigTable. Rather than store data relationally in tables, it stores data as a multi-dimensional sorted map.

This type of implementation falls under a broader heading for data storage, called a key/value store. This method of storage can provide some performance benefits and make the process of scaling much easier.

Information Storage in a Relational Database

Relational databases store each piece of information in a single location, which is typically a column within a table. For a relational database, it is important to normalize the data. This process ensures that there is no duplication of data in other tables or columns.

For example, customer last names should always be stored in a particular column in a particular table. If a customer last name is found in another column or table within the database, then it should be removed and the original column and table should be referenced to retrieve the information.
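As a small illustration of that rule, the sketch below uses Python's built-in sqlite3 module as a stand-in for any relational database (the table and column names are made up): the last name lives in exactly one column of one table, and other tables reference the customer by key:

```python
# Minimal sketch of normalization: the last name lives in one column of one table,
# and other tables reference the customer by id instead of repeating the name.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL
    );
""")
conn.execute("INSERT INTO customers (id, last_name) VALUES (1, 'Lovelace')")
conn.execute("INSERT INTO orders (id, customer_id, total) VALUES (10, 1, 42.50)")

# To show the name on an order, join back to the single authoritative column.
row = conn.execute("""
    SELECT orders.id, customers.last_name, orders.total
    FROM orders JOIN customers ON customers.id = orders.customer_id
""").fetchone()
print(row)  # (10, 'Lovelace', 42.5)
```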

The downside to this structure is that the database can become quite complex internally. Even a relatively simple query can have a large number of possible execution paths, and all of these paths must be evaluated at run time to determine which is optimal. The more complex the database becomes, the more resources must be devoted to determining query paths at run time.

Information Storage in a Key/Value Store

With a key/value store, duplicate data is acceptable. The idea is to make use of disk space, which can easily and cost-effectively be upgraded (especially when using a cloud), rather than other hardware resources that are more expensive to bring up to speed.

This data duplication is beneficial when it comes to simplifying queries, since related information can be stored together to avoid having numerous potential paths that a query could take to access the needed data.

Instead of using tables like a relational database, key/value stores use domains. A domain is a storage area where data can be placed, but does not require a predefined schema. Pieces of data within a domain are defined by keys, and these keys can have any number of attributes attached to them.

The attributes can simply be string values, but can also be something even more powerful: data types that match up with those of popular programming languages. These could include arrays, objects, integers, floats, Booleans, and other essential data types used in programming.

With key/value stores, the data integrity and logic are handled by the application code (through the use of one or more APIs) rather than by a schema within the database itself. As a result, data retrieval becomes a matter of using the correct programming logic rather than relying on the database optimizer to determine the query path from a large number of possibilities based on the relations it needs to access.
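For example, with a document-oriented store such as MongoDB, the application code rather than a fixed schema decides what each record contains. A brief sketch using the pymongo driver (assuming a reachable MongoDB server; the database, collection, and fields are hypothetical) might look like this:

```python
# Brief sketch: the application, not a fixed schema, decides what each document holds.
# Assumes a MongoDB server is running locally and the pymongo driver is installed;
# the database, collection, and fields below are made up for illustration.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
customers = client.exampledb.customers

# Documents in the same collection need not share all fields.
customers.insert_one({"last_name": "Lovelace", "tags": ["vip"], "age": 36})
customers.insert_one({"last_name": "Hopper", "signup": {"year": 2014, "plan": "beta"}})

# Integrity rules live in application logic, e.g. validate values before writing.
doc = customers.find_one({"last_name": "Hopper"})
print(doc["signup"]["plan"])
```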

How data access differs between a relational database and a key/value database. Source: readwrite

Getting Results

Google needs to store and retrieve copious amounts of data for many applications, among them Google Analytics, Google Maps, Gmail, and its popular web search index. In addition, more applications and data stores could be added at any time, making the BigTable key/value store an ideal solution for scalability.

BigTable is Google’s own custom solution, so how can a business obtain a similar performance and scalability boost to give its users a better experience? The good news is that there are other key/value store options available, and some can be run as a service from a cloud. This type of service is easily scalable, since more data storage can easily be purchased as needed on the cloud.

A Key/Value Store Option

There are several options for key/value stores. One of these is MongoDB (often just called Mongo), a document-oriented database that stores information in a JSON-like format. This format is ideal for web applications, since JSON makes it easy to pass data around in a standard format among the various parts of an application that need it.

For example, Mongo is part of the MEAN stack: Mongo, Express, AngularJS, and NodeJS -- a popular setup for programmers developing applications. Each of these pieces of the puzzle sends data to and from one or more of the other pieces. Since everything, including the database, can use the JSON format, passing data around among the various parts becomes much easier and more standardized.

How MySQL and MongoDB perform the same tasks. Source: Rick Osborne

How to Make Use of Mongo

Mongo can be installed and used on various operating systems, including Windows, Linux, and OS X. In this case, the scalability of the database would need to be maintained by adding storage space to the server on which it is installed.

Another option is to use Mongo as a service in the cloud. This allows for easy scalability, since a request can be made to the service provider to increase the available storage space at any time. In this way, new applications or additional data storage needs can be handled quickly and efficiently.

Morpheus is a great option for this service, offering Mongo as a highly scalable service in the cloud: users of Morpheus get three shared nodes and full replica sets, and can seamlessly provision MongoDB instances. In addition, all of this runs on a high-performance, solid-state drive (SSD) infrastructure, making it a very reliable data storage medium. Using Morpheus, a highly scalable database as a service can be running in no time!

Slowly But Surely, The Dept. of Defense Plods Towards The Modern Day


TL;DR: The Department of Defense's slow, steady migration to public and private cloud architectures may be hastened by pressures at opposite ends of the spectrum. At one end are programs such as the NSA's cloud-based distributed RDBMS that realize huge cost savings and other benefits. At the other end are the growing number of sophisticated attacks (and resulting breaches) on expensive-to-maintain legacy systems. The consensus is that the DOD's adoption of public and private cloud infrastructures is inevitable, which makes the outlook rosy for commercial cloud services of all types.

U.S. computer networks are under attack. That's not news. But what is new is the sophistication of the assaults on public and private computer systems of all sizes. The attackers are targeting specific sensitive information, the disclosure of which threatens not only business assets and individuals' private data, but also our nation's security.

In a September 29, 2014, column on the Times Herald site, U.S. Senator Carl Levin, who is chairman of the Senate Armed Services Committee, released the unclassified version of an investigation into breaches of the computer networks of defense contractors working with the U.S. Transportation Command, or TRANSCOM. The report disclosed more than 20 sophisticated intrusions by the Chinese government into TRANSCOM contractor networks in a 12-month period ending in June 2013.

In one instance, the Chinese military stole passwords, email, and source code from a contractor's network. Other attacks targeted flight information to track the movement of troops, equipment, and supplies. TRANSCOM was aware of only two of the 20-plus attacks on its contractors' networks, even though the FBI and other government agencies were aware of all of the attacks.

The report highlights the need to disclose breaches and attempted breaches. Otherwise, there's no way to formulate an effective response in the short run or an effective deterrent in the long run. Within government, as in the business world, the left hand doesn't know what happened to the right.

Lack of breach disclosures plays into the bad guys' hands

No longer are data thieves rogue hackers acting alone. Today's Internet criminals work in teams that tap the expertise of their members to attack specific targets and conceal their activities. InformationWeek's Henry Kenyon describes in a September 29, 2014, article how security officials in the public and private sectors are striving to coordinate their efforts to detect and prevent breaches by these increasingly sophisticated Internet criminals.

The Department of Homeland Security is charged with coordinating cyber-defenses, mitigating attacks, and responding to incidents of Internet espionage. Phyllis Schneck, DHS's director of cybersecurity, identifies three impediments to effective defenses against network attacks.

  • Problem 1: The criminals are talented and coordinated.
  • Problem 2: Breaches intent on espionage often appear to be theft attempts.
  • Problem 3: Firms don't report data breaches, so there's no sharing of information, which is necessary to devise a coordinated response.

DHS's Einstein system constantly scans civilian government networks, analyzing them to detect and prevent zero-day, bot-net, and other attacks. Schneck states that DHS makes it a priority to share the information it collects about attempted and successful breaches with other government agencies, the private sector, and academia.

Public and private cloud services will play an important role in the Department of Defense's IT Modernization program. Source: Business2Community

The problem, according to analysts, is that businesses are loath to disclose data losses and thwarted attacks on their networks. They consider their reputation for network security a competitive advantage, so anything that impairs that reputation could reduce the company's value. Sue Poremba points out in a September 24, 2014, article on Forbes that most major breaches still receive very little publicity.

However, the recent spate of major breaches at Home Depot, Dairy Queen, PF Chang's, Target, and major universities are convincing company officials of the need to coordinate their defenses. Such a coordinated approach to network protection begins and ends by sharing information.

An NSA cloud success story serves as the blueprint

Organizations don't get more secretive than the U.S. National Security Agency. You'd think the NSA would be the last agency to migrate its databases to the cloud, but that's precisely what it did -- and in the process realized improved performance, timeliness, and usability while also saving money and maintaining top security.

In a September 29, 2014, article, NetworkWorld's Dirk A.D. Smith describes the NSA's successful cloud-migration program. The agency's hundreds of relational databases needed more capacity, but throwing more servers at the problem wasn't practical: existing systems didn't scale well, and the resulting complexity would have been a nightmare to manage.

Instead, NSA CIO Lonny Anderson convinced the U.S. Cyber Command director to move the databases to a private cloud. Now analyses take less time, the databases cost less to manage, and the data they contain is safer. That's what you call win-win-win.

The goal was to create a "user-facing experience" that offered NSA analysts "one-stop shopping," according to Anderson. Budget cuts required security agencies to share data and management responsibilities: NSA and CIA took charge of cloud management; the National Geospatial Intelligence Agency (NGA) and Defense Intelligence Agency (DIA) took responsibility for desktops; and National Reconnaissance Office (NRO) was charged with network management and engineering services.

The agencies' shared private cloud integrates open source (Apache Hadoop, Apache Accumulo, OpenStack) and government-created apps running on commercial hardware that meets the DOD's specs for reliability and security. The resulting network lets the government realize the efficiency benefits of commercial public cloud services, according to Anderson.

The DOD's Enterprise Cloud Infrastructure will transition local and remote data centers to a combination of public and private cloud apps and services. Source: Business2Community

Just as importantly, the cloud helps the defense agencies ensure compliance with the strict legal authorities and oversight their data collection and analysis activities are subject to. The private cloud distributes data across a broad geographic area and tags each data element to indicate its security and usage restrictions. The data is secured at multiple layers of the distributed architecture.

The data-element tags allow the agency to determine when and how each bit of data is accessed -- to the individual word or name -- as well as all the people who accessed, downloaded, copied, printed, forwarded, modified, or deleted the specific data element. Many of these operations weren't possible on the legacy systems the private cloud replaced, according to Anderson. He claims the new system would have prevented breaches such as the 2010 release of secure data by U.S. soldier Bradley Manning.

Overcoming analysts' reluctance to abandon their legacy systems

Anderson faced an uphill battle in convincing agency analysts to give up their legacy systems, which in many instances couldn't be ported directly to the cloud. Adoption of the cloud was encouraged through a program that prohibited analysts from using the legacy systems for one full day every two weeks. With the assistance of analysts with cloud expertise, the newbies overcame the problems they encountered as they transitioned to the agency's private cloud.

The result is a faster, more efficient system that improves security and cuts costs. These are among the benefits being realized by companies using the Morpheus database-as-a-service (DBaaS). Morpheus is built on an SSD infrastructure for peak performance, allowing you to identify and optimize data queries in real time. Backup, replication, and archiving of databases are automatic, and your data is locked down via VPN security.

Morpheus supports Elasticsearch, Redis, MySQL, and MongoDB. Visit the Morpheus site for pricing information and to create a free trial account.

Similar benefits are being realized by the first DOD agencies using commercial cloud services. Amber Corrin reports in a September 24, 2014, article on the Federal Times site that defense agencies will soon be able to contract for public cloud services directly rather than having to go through the Defense Information Systems Agency (DISA).

The change is the result of the perception that DOD agencies are too slow to adopt cloud technologies, according to DOD CIO Terry Halvorsen. However, there will still be plenty of bureaucracy. Agencies will be required to provide the DOD CIO with "detailed business case analyses" that consider services offered by the DISA, among other restrictions.

Most importantly, all cloud connection points are controlled by the DISA, and unclassified traffic has to pass through secured channels. Slowing things down even further, agencies will have to obtain an authority to operate, or ATO.

The DOD's cloud migration may be slow, but it's steady. Bob Brewin reports in a September 23, 2014, article on NextGov that the Air Force Reserves will now use Microsoft 365 for email and other purposes, which promises to save the government millions of dollars over the next few years. That's something taxpayers can cheer about!

Why They Do It: Protect Your Data by Learning What Makes Hackers Tick


TL;DR: Recent data system breaches at Target and Home Depot remind us all that the continuous threat of criminal hacking of computer security systems is not abating. Rather, it's becoming routine. Business managers, lawmakers, and computing professionals must understand the motivation behind this activity if they want to effectively protect business interests and thwart attacks. Perhaps the biggest challenge is that the hacking community is a diverse and complex universe, spanning a wide range of skill levels and several distinct motivations. Only by understanding the motives of criminal security hackers is it possible to profile computer crimes. With solid profiles in hand, security professionals can better predict future activity and install the appropriate safeguards.

Most security professionals are likely to spend much more time analyzing the technical and mechanical aspects of cybercrime than the social and psychological dimensions. Of course it’s critically important to dissect malware, examine hacker tools, and analyze their code. However, if we want to understand the nature of the cyber threat, then security professionals need to act more like criminal investigators. We no longer live in a world of mere glory-seekers and script kiddies. Some very serious thugs are now lurking in virtually every sector. So, it’s critically important for you to understand their motives and signatures, since these point to their targets and reveal their methods of operation.

As you consider your business context, it's important to ask yourself this question frequently: What exactly are the means, motives, and opportunities for potential criminal hackers targeting my business computing systems? Getting a solid answer to this question is the key to identifying your most vulnerable assets and developing a security plan.

The Home Depot Breach

In September 2014, Home Depot Inc. announced that as many as 56 million cards may have been compromised in a sustained malware attack on its payment systems -- an attack that had been underway for many months. This security breach was even larger than the previous holiday-season attack at Target Corporation. It is yet another in a string of similar events at corporations around the world, and it underscores the vulnerability of U.S. retailers to hackers who continue to aggressively target their payment systems. Home Depot said it had begun a project to fully encrypt its payment terminal data this year but was outpaced by the hackers. The Home Depot attack is the latest in a wave of high-profile hackings at big merchants in recent months, from high-end retailer Neiman Marcus Group Ltd. to grocer Supervalu Inc. to Asian restaurant chain P.F. Chang's China Bistro Inc.

According to many IT and computing system analysts, the top three hacker motives are financial, corporate espionage, and political activism. In the remainder of this article, we look closely at the financial motive, and then we help you consider the best approaches to securing your cloud-computing assets with BitCan.

Financial System Hackers

You’re probably most familiar with this type of hacker, since they cause the most damage and feature most often in the news. The motive here is obvious: make money the easy way, by stealing it. Financial system hackers range from lone actors to large cybercrime organizations, often backed by conventional criminal groups. Collectively, these thieves extract billions of dollars from consumers and businesses each year.

These threats go well beyond the hobbyist community to a very high level of sophistication. Criminal attackers immerse themselves in a complex underground economy: a vast black market in which participants buy and sell toolkits, zero-day exploit code, and malware botnet services. Vast quantities of stolen private data and intellectual property are also up for sale. A recent market trend is the sale of web-exploit kits such as Blackhole, Nuclear Pack, and Phoenix, which attackers use to automate drive-by download attacks.

Some financial system hackers are opportunistic and focus on small businesses and consumers. Larger operations go to great lengths to analyze large enterprises and specialize in one or two industry verticals. In a recent attack on the banking and credit card industry, a highly organized group pulled off a global heist totaling $45 million from ATMs, executed with an extreme degree of synchronization. Those withdrawals were feasible only because of a previously undetected breach of several bank networks and a payment processor.

Malicious hacker attacks are common, often with damaging and highly disruptive outcomes, and they are inevitable as more Internet users adopt cloud computing and storage. That makes combating the effects of hacking increasingly critical. There has long been debate about whether cloud computing is more vulnerable to hacking threats; after years of industry scrutiny, the consensus is that it is the same problem in a different location. If businesses can build reliable security and recovery methods, cloud computing becomes a serious option, and the freedom, accessibility, and collaboration it offers can far outweigh the risks to your data security.

Many cloud computing users assume their data is kept safe by the security measures of their cloud vendor. But hackers use code-cracking algorithms and brute-force attacks to acquire passwords, and they can intercept data transmissions that lack proper encryption.

Ask yourself this question: Do you have solid infrastructure, processes, and procedures to ensure reliable, high-security backups of your sensitive and business-critical data? If you can’t answer this question with confidence, then we invite you to read on a bit further as we consider various aspects of a top-tier cloud backup service.

Your cloud backup service should encrypt all data so that it's entirely unreadable by unauthorized users, and it should only be possible to decrypt your data when you decide to retrieve it. Minimally, this means that data should be transmitted only over SSL/TLS and that strong passwords are required for information access and decoding.
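To make that concrete, here is a minimal Python sketch of client-side encryption before upload, assuming the cryptography and requests packages are installed; the upload URL and the key handling are placeholders for illustration, not BitCan's actual API.

import os
import requests
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_backup(plaintext: bytes, key: bytes) -> bytes:
    """Encrypt a backup blob with AES-256-GCM and prepend the random nonce."""
    nonce = os.urandom(12)                      # 96-bit nonce recommended for GCM
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    return nonce + ciphertext

def upload_backup(path: str, key: bytes) -> None:
    with open(path, "rb") as f:
        encrypted = encrypt_backup(f.read(), key)
    # requests verifies TLS certificates by default, so the data is also
    # protected in transit; the endpoint below is a hypothetical placeholder.
    requests.post("https://backup.example.com/upload",
                  data=encrypted, timeout=30).raise_for_status()

if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)   # store this key securely, apart from the data
    upload_backup("nightly_dump.sql.gz", key)

Because the file is encrypted before it leaves your machine, a compromised storage account exposes only ciphertext; the key never travels with the data.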

No system is hacker-proof, but the greatest benefit of a cloud backup service is a high degree of readiness to recover from a hacking event. Companies that specialize in cloud backup services, like BitCan, reduce threats to your data by enabling full recovery of all business-critical data to its original state in a matter of clicks. These backup companies replicate your cloud data and safeguard it in a separate cloud, so the likelihood of data loss from natural disasters and other threats remains very small.

We recommend that you visit http://www.gobitcan.com and start your free 30-day trial. Or read on to learn how BitCan cloud backup services can help secure your backup data, support your data-recovery plan, and bring you peace of mind.

Intensive Security for Your Online Backups

Rock-solid facilities. With BitCan cloud backup services, you can eliminate most of your backup infrastructure headaches and alleviate your concerns about the safety and privacy of your cloud backups. Our robust, high-security data centers use precise electronic surveillance and multi-factor access control systems. All of our environmental systems are designed to minimize the impact of disruptions to operations, and multiple geographic locations with extensive redundancy add up to a high degree of resiliency against virtually all failure types, including natural disasters.

Protection from the bad guys. Not only do you get super-strong physical protection for your backup data, but we also lock everything down with extensive network and security monitoring systems. As you'd expect, our systems include essential security measures such as distributed denial of service (DDoS) protection and password brute-force detection on all BitCan accounts. Additional security measures include:

  • Secure access and data transfer – All data access and transfers use HTTPS (HTTP over SSL).
  • Unique users – Identity and access management features let you control the level of access each user has to your BitCan infrastructure services.
  • Encrypted data storage – Encrypt your backup data and objects with Advanced Encryption Standard (AES) 256, a secure symmetric-key encryption standard that uses 256-bit keys.
  • Security logs – BitCan provides extensive, verbose logs of all activity for all users of your account.
  • Native support – Native support for MongoDB, MySQL, and Linux/Unix/Windows files.

Start your free 30-day trial of BitCan today.

The Key to Distributed Database Performance: Scalability

TL;DR: The realities of modern corporate networks make the move to distributed database architectures inevitable. How do you leverage the stability and security of traditional relational database designs while making the transition to distributed environments? One key consideration is to ensure your cloud databases are scalable enough to deliver the technology's cost and performance benefits.

Your conventional relational DBMS works without a hitch (mostly), yet you're pressured to convert it to a distributed database that scales horizontally in the cloud. Why? Your customers and users not only expect new capabilities, they need them to do their jobs. Topping the list of requirements is scalability.

David Maitland points out in an October 7, 2014, article on Bobsguide.com that startups in particular have to be prepared to see the demands on their databases expand from hundreds of requests per day to millions -- and back again -- in a very short time. Non-relational databases have the flexibility to grow and contract almost instantaneously as traffic patterns fluctuate. The key is managing the transition to scalable architectures.

Availability defines a distributed database

A truly distributed database is more than an RDBMS with one master and multiple slave nodes. A system with multiple masters, or write nodes, definitely qualifies as distributed because it's all about availability: if one master fails, the system automatically fails over to the next and the write is still recorded. InformationWeek's Joe Masters Emison explains the distinction in a November 20, 2013, article.
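As a rough illustration of what that failover looks like from the client side, here is a Python sketch that tries a list of write nodes in order, assuming the PyMySQL driver; the host names and credentials are placeholders, and in practice most multi-master systems handle this routing inside the cluster or the driver itself.

import pymysql

# Hypothetical write nodes; in a real deployment these come from configuration.
WRITE_NODES = ["db-master-1.example.com", "db-master-2.example.com"]

def execute_write(sql: str, params: tuple) -> None:
    """Attempt the write on each master in turn until one succeeds."""
    last_error = None
    for host in WRITE_NODES:
        try:
            conn = pymysql.connect(host=host, user="app", password="secret",
                                   database="orders", connect_timeout=2)
            try:
                with conn.cursor() as cur:
                    cur.execute(sql, params)
                conn.commit()
                return                      # write recorded on this master
            finally:
                conn.close()
        except pymysql.MySQLError as err:
            last_error = err                # this master is unreachable; try the next
    raise RuntimeError("all write nodes unavailable") from last_error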

The Evolving Web Paradigm

 

The evolution of database technology points to a "federated" database that is document and graph based, as well as globally queryable. Source: JeffSayre.com

The CAP theorem states that when a network partition occurs, you can preserve strict availability or strict consistency, but not both. Conflicts happen all the time: a system is instructed to write different information to the same record at the same time. It can either stop accepting writes (sacrificing availability) or write two different records (sacrificing consistency). In the real world, most systems fall between these two extremes; business processes typically favor high availability first and deal with inconsistencies later.
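Here is a toy Python sketch of that trade-off, purely for illustration: when a partition is detected, the system either rejects the write to stay consistent or accepts it locally and logs it for later reconciliation. The replicate() stub and the conflict log are hypothetical.

conflict_log = []

def replicate(record):
    """Stand-in for synchronously writing the record to every replica."""
    pass

def write_record(record, partition_detected, favor_availability=True):
    if not partition_detected:
        replicate(record)              # normal case: every node gets the write
        return "committed"
    if favor_availability:
        conflict_log.append(record)    # accept the write now,
        return "accepted-pending"      # reconcile divergent copies later
    return "rejected"                  # refuse the write to preserve consistency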

Kyle Kingsbury's Call Me Maybe project measured the ability of distributed databases such as NoSQL to handle multiple partitions in real-world conflict situations. InformationWeek's Joe Masters Emison describes the project in a September 5, 2013, article. The upshot is that distributed databases fail -- as all databases sometimes do -- but they do so less cleanly than single-node databases, so tracking and correcting the resulting data loss requires asking a new set of questions.

The Morpheus database-as-a-service (DBaaS) delivers the flexibility modern databases demand while ensuring the performance and security IT managers require. Morpheus provides the reliability of 100% bare-metal SSD hosting on a high-availability network with ultra-low latency to major peering points and cloud hosts. You can optimize queries in real time and analyze key database metrics.

Morpheus supports heterogeneous ElasticSearch, MongoDB, MySQL, and Redis databases. Visit the Morpheus site for pricing information or to sign up for a free trial account.

Securing distributed databases is also more complex, and not just because the data resides in multiple physical and virtual locations. As with most new technologies, the initial emphasis is on features rather than safety. Also, as the databases are used in production settings, unforeseen security concerns are more likely to be addressed as they arise. (The upside of this equation is that because the databases are more obscure, they present a smaller profile to the bad guys.)

The advent of the self-aware app

Databases are now designed to monitor their connections, available bandwidth, and other environmental factors. When demand surges, such as during the holiday shopping season, the database automatically brings more cloud servers online to handle the load, and takes them offline when demand returns to normal.
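The decision logic behind that elasticity can be surprisingly simple. The following Python sketch is a hypothetical example: it takes an observed request rate and returns a desired server count, leaving the actual monitoring and provisioning calls to the cloud provider's API (OpenStack or otherwise).

SCALE_UP_RPS = 5000      # add capacity above this request rate (assumed threshold)
SCALE_DOWN_RPS = 1000    # release capacity below this request rate (assumed threshold)

def autoscale(current_servers: int,
              rps: float,
              min_servers: int = 2,
              max_servers: int = 20) -> int:
    """Return the desired server count for the observed requests per second."""
    if rps > SCALE_UP_RPS and current_servers < max_servers:
        return current_servers + 1      # demand surge: bring another server online
    if rps < SCALE_DOWN_RPS and current_servers > min_servers:
        return current_servers - 1      # demand has fallen: release a server
    return current_servers              # load is within the normal band

A real controller would read the request rate from a monitoring service and call the provider's provisioning API with the returned count; those calls are omitted here because they vary by platform.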

This on-demand flexibility relies on the cloud service's APIs, whether they use proprietary API calls or open-source technology such as OpenStack. Today's container-based architectures, such as Docker, encapsulate all resources required to run the app, including frameworks and libraries.
