Channel: Morpheus Blog

How to Minimize Data Wrangling and Maximize Data Intelligence


It's not unusual for data analysts to spend more than half their time cleaning and converting data rather than extracting business intelligence from it. As data stores grow in size and data types proliferate, a new generation of tools is arriving that promises to put sophisticated analysis capabilities into the hands of non-data scientists.

One of the hottest job titles in technology is Data Scientist, perhaps surpassed only by the newest C-level position: Chief Data Scientist. IT's long-standing skepticism about such trends is evident in the joke cited by InfoWorld's Yves de Montcheuil that a data scientist is a business analyst who lives in California.

There's nothing funny about every company's need to translate its data into business intelligence. That's where data scientists take the lead role, but as the amount and types of data proliferate, data scientists find themselves spending the bulk of their time cleaning and converting data rather than analyzing and communicating it to business managers.

A recent survey of data scientists (registration required) conducted by IT-project crowdsourcing firm CrowdFlower found that two out of three analysts claim cleaning and organizing data is their most time-consuming task, and 52 percent report their biggest obstacle is poor quality data. While the respondents named 48 different technologies they use in their work, the most popular is Excel (55.6 percent), followed by the open source language R (43.1 percent) and the Tableau data-visualization software (26.1 percent).

Data scientists identify their greatest challenges as time spent cleaning data, poor data quality, lack of time for analysis, and ineffective data modeling. Source: CrowdFlower

What's holding data analysis back? The data scientists surveyed cite a lack of tools required to do their job effectively (54.3 percent), failure of their organizations to state goals and objectives clearly (52.3 percent), and insufficient investment in training (47.7 percent).

A dearth of tools, unclear goals, and too little training are reported as the principal impediments to data scientists' effectiveness. Source: CrowdFlower

New tools promise to 'consumerize' big data analysis

It's a common theme in technology: In the early days, only an elite few possess the knowledge and tools required to understand and use it, but over time the products improve and drop in price, businesses adapt, and the technology goes mainstream. New data-analysis tools are arriving that promise to deliver the benefits of the technology to non-scientists.

Steve Lohr profiles several of these products in an August 17, 2014, article in the New York Times. For example, ClearStory Data's software combines data from multiple sources and converts it into charts, maps, and other graphics. Taking a different approach to the data-preparation problem is Paxata, which offers software that retrieves, cleans, and blends data for analysis by various visualization tools.

The not-for-profit Open Knowledge Labs bills itself as a community of "civic hackers, data wranglers and ordinary citizens intrigued and excited by the possibilities of combining technology and information for good." The group is seeking volunteer "data curators" to maintain core data sets such as GDP and ISO-codes. OKL's Rufus Pollock describes the project in a January 3, 2015, post.

Open Knowledge Labs is seeking volunteer coders to curate core data sets as part of the Frictionless Data Project. Source: Open Knowledge Labs

There's no simpler or more straightforward way to manage your heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases than the new Morpheus Virtual Appliance. Morpheus lets you seamlessly provision, monitor, and analyze SQL, NoSQL, and in-memory databases across hybrid clouds via a single point-and-click dashboard. Each database instance you create includes a free full replica set for built-in fault tolerance and failover.

With the Morpheus database-as-a-service (DBaaS), you can migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to create a free account.


The Fastest Way to Import Text, XML, and CSV Files into MySQL Tables


One of the best ways to improve the performance of MySQL databases is to determine the optimal approach for importing data from other sources, such as text, XML, and CSV files. The key is to map the structure of the source data to the structure of the target table.

Data is always on the move: from a Web form to an order-processing database, from a spreadsheet to an inventory database, or from a text file to a customer list. One of the most common MySQL database operations is importing data from such an external source directly into a table. Data importing is also one of the tasks most likely to create a performance bottleneck.

The basic steps entailed in importing a text file into a MySQL table are covered in a Stack Overflow post from November 2012: create the target table, then use the LOAD DATA INFILE command to load the file into it.

The basic MySQL commands for creating a table and importing a text file into the table. Source: Stack Overflow
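
To make those steps concrete, here is a minimal sketch of the pattern, assuming a tab-delimited file at /tmp/data.txt and hypothetical table and column names (the Stack Overflow example differs in its details):

  -- Create a simple target table (hypothetical names).
  CREATE TABLE my_table (
    myid      INT,
    mytext    VARCHAR(255),
    mydecimal DECIMAL(10,2)
  );

  -- Bulk-load the tab-delimited text file into the table.
  LOAD DATA LOCAL INFILE '/tmp/data.txt'
  INTO TABLE my_table
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n';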

Note that you may need to enable the parameter "--local-infile=1" to get the command to run. You can also specify which columns the text file loads into:

This MySQL command specifies the columns into which the text file will be imported. Source: Stack Overflow

In this example, the file's fields are read into the user variables @col1, @col2, and @col3, and only two of them are mapped to table columns: myid is filled from the first field and mydecimal from the third, while the unmapped column ends up with a null value.

The table resulting when LOAD DATA is run with the target column specified. Source: Stack Overflow
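
A hedged sketch of that column mapping, again using the assumed file path and table from the sketch above:

  -- Read each line's fields into user variables, then map field 1 to myid
  -- and field 3 to mydecimal; the second field is not mapped to any column.
  LOAD DATA LOCAL INFILE '/tmp/data.txt'
  INTO TABLE my_table
  (@col1, @col2, @col3)
  SET myid = @col1,
      mydecimal = @col3;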

The fastest way to import XML files into a MySQL table

As Database Journal's Rob Gravelle explains in a March 17, 2014, article, stored procedures would appear to be the best way to import XML data into MySQL tables, but as of version 5.0.7, MySQL's LOAD XML INFILE and LOAD DATA INFILE statements can't run within a stored procedure. There's also no way to map XML data to table structures, among other limitations.

However, you can get around most of these limitations if you target an XML file with a rigid, known structure for each stored procedure. The example Gravelle presents uses an XML file whose rows are all contained within a single parent element and whose columns are represented by named attributes:

You can use a stored procedure to import XML data into a MySQL table if you specify the table structure beforehand. Source: Database Journal

The table you're importing into has an int ID and two varchars: because the ID is the primary key, it can't have nulls or duplicate values; last_name allows duplicates but not nulls; and first_name accepts up to 100 characters of nearly any character data.

The MySQL table into which the XML file will be imported has the same three fields as the file. Source: Database Journal
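
As a rough sketch, a table matching that description could be declared as follows; the table and column names (and the varchar sizes) are stand-ins rather than Gravelle's exact definition:

  CREATE TABLE people (
    id         INT NOT NULL PRIMARY KEY,  -- no nulls, no duplicate values
    last_name  VARCHAR(100) NOT NULL,     -- duplicates allowed, nulls not
    first_name VARCHAR(100)               -- up to 100 characters
  );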

Gravelle's approach for overcoming MySQL's import restrictions uses the "proc-friendly" Load_File() and ExtractValue() functions.

MySQL's XML-import limitations can be overcome by using the Load_file() and ExtractValue() functions. Source: Database Journal
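
The sketch below illustrates that workaround, assuming an XML layout such as <people><person id="..." first_name="..." last_name="..."/></people> and the people table sketched above; Gravelle's actual element names and procedure logic differ:

  DELIMITER //
  CREATE PROCEDURE import_people_xml(IN xml_path VARCHAR(255))
  BEGIN
    DECLARE xml_content TEXT;
    DECLARE row_count INT;
    DECLARE i INT DEFAULT 1;

    -- LOAD_FILE() requires the FILE privilege and a path the server can read.
    SET xml_content = LOAD_FILE(xml_path);
    SET row_count = ExtractValue(xml_content, 'count(/people/person)');

    -- Walk the <person> elements and insert one row per element.
    WHILE i <= row_count DO
      INSERT INTO people (id, first_name, last_name)
      VALUES (ExtractValue(xml_content, '/people/person[$i]/@id'),
              ExtractValue(xml_content, '/people/person[$i]/@first_name'),
              ExtractValue(xml_content, '/people/person[$i]/@last_name'));
      SET i = i + 1;
    END WHILE;
  END //
  DELIMITER ;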

Benchmarking techniques for importing CSV files to MySQL tables

When he tested various ways to import a CSV file into MySQL 5.6 and 5.7, Jaime Crespo discovered a technique that he claims improves the import time for MyISAM by 262 percent to 284 percent, and for InnoDB by 171 percent to 229 percent. The results of his tests are reported in an October 8, 2014, post on Crespo's MySQL DBA for Hire blog.

Crespo's test file was more than 3GB in size and contained nearly 47 million rows. One of the fastest methods in Crespo's tests was grouping queries into a multi-insert statement, the approach used by "mysqldump". Crespo also attempted to improve LOAD DATA performance by increasing key_cache_size and by disabling the Performance Schema.
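
For reference, a multi-insert statement simply packs many rows into a single INSERT, along these lines (the table and values here are purely illustrative):

  INSERT INTO sample_table (id, name, price)
  VALUES (1, 'alpha', 9.99),
         (2, 'beta', 19.99),
         (3, 'gamma', 4.50);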

Crespo concludes that the fastest way to load CSV data into a MySQL table without using raw files is to use LOAD DATA syntax. Also, using parallelization for InnoDB boosts import speeds.
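
A typical LOAD DATA statement for a comma-separated file looks roughly like the following; the file path, table, and options are assumptions rather than Crespo's exact test setup:

  LOAD DATA INFILE '/tmp/data.csv'
  INTO TABLE sample_table
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  IGNORE 1 LINES;  -- skip the header row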

You won't find a more straightforward way to monitor your MySQL, MongoDB, Redis, and ElasticSearch databases than by using the dashboard interface of the Morpheus database-as-a-service (DBaaS). Morpheus is the first and only DBaaS to support SQL, NoSQL, and in-memory databases.

You can provision, deploy, and host your databases from a single dashboard. The service includes a free full replica set for each database instance, as well as automatic daily backups of MySQL and Redis databases. Visit the Morpheus site for pricing information and to create a free account.

Hybrid Cloud Management Webinar


A New Age Of Hybrid Cloud Management

Join us for this exciting webcast on how to develop, deploy, and manage your applications and databases on any cloud or infrastructure.

You’ll learn how to:

  • Provision databases and applications on any cloud (AWS, Azure, Google, RackSpace) or infrastructure (OpenStack, VMware, or Bare-Metal)
  • Elastically scale applications and databases
  • Automatically backup and recover databases and applications
  • Log and monitor databases and applications for faster troubleshooting and SLA management
  • Clone and migrate databases and applications across hybrid clouds
  • Gain complete infrastructure and resource visibility across your on-premise servers and the cloud

Who should attend?

VPs/Directors of IT, solution architects, DBAs, cloud architects, IT ops, and datacenter managers. The live webinar will be on Thursday, September 17th, at 8am PDT, and there will also be a Q&A for you to ask questions in real time.

Register here: http://webinar.gomorpheus.com

How Cloud Services Break Down Barriers Within and Between Organizations


Business managers become techies, and techies become business managers in the modern decentralized organization.

These days, every company is a tech company, and every worker is a tech worker. Business managers are getting more tech-savvy, and IT managers are getting more business-savvy. Yet the cultural barriers in organizations of all sizes between IT departments and lines of business persist -- to the disadvantage of both.

Now there are clear signs that the fundamental nature of the relationship between IT and the business side is changing, just as the way both groups work is changing fundamentally. As every manager knows, change doesn't come easy. Yet every manager also knows that the long-term success of the company depends on embracing those changes.

A positive consequence of the technification of business and the businification of tech is that the two groups are truly collaborating in ways they rarely have in the past. Every business decision not only involves technology, it is predicated on it. Likewise, every tech decision has at its foundation the advancement of the company's short-term and long-term business goals.

As the adage goes: Easier said than done. Yet it's being done -- and done successfully -- in organizations of all types and sizes. Here's a look at what those success stories have in common.

Fusing IT with business: It's all about attitude

In a September 1, 2015, article on BPMInstitute.org, Tim Musschoot writes that successful collaboration between business departments and IT departments depends on two things: a common understanding among all parties, and management of each party's expectations of the other.

It's not enough for each side to be using the same terminology. To establish a mutual understanding, you must consider behavioral aspects (who does what, how do they do it, when do they do it); information aspects (what is an order form, who is the customer, how do they relate); and the business rules that apply to processes and information (when can an order be placed, how are rates established).

The quickly closing gap between business and IT is evident in a recent survey of both groups conducted by Harris Poll and sponsored by Appian. Forbes' Joe McKendrick examines the survey results in an April 27, 2015, article. One finding of the survey is that business executives are almost as enthusiastic as IT managers about the potential of cloud services to boost their organizations' profits. Similarly, business managers perceive application maintenance, app and data silos, expensive app development, and slow app deployments as drains on their companies.

The need for business and IT to collaborate more closely on app development, deployment, and maintenance is highlighted in a Forrester study from 2013. As InfoQ's Ben Linders writes, the business side often perceives IT staff as "order takers," while IT departments focus on excelling at specific central functions. Both sides have to resist the tendency to fall into the customer-provider model.

Breaking out of the business/IT silos requires replacing the hierarchical organization with one based on overlapping, double-linked circles. The circles include everyone involved in the development project, and double-linking ensures that the members of each circle are able to communicate immediately and naturally on all the organization's ongoing projects. This so-called sociocracy model ensures that the lines of communication in the company remain visible at all times.

 

Database Scaling Made Simple


Your database-scaling strategy should match the way the system is used, and the extent to which the database's nodes are distributed.

Not so long ago, the solution to balky database performance was to throw more hardware at it: faster servers and memory, more network bandwidth, and boom! Your database is back to bullet-train speeds.

Any system administrator will tell you that databases aren't what they used to be. For one thing, today's databases are much more widely distributed, reaching end points of all types: phones, tablets, controllers, sensors, even appliances. For another thing, a single database is more likely to run on multiple platforms and interface with systems of all types. Last but not least, the amount of data stored in a typical production database dwarfs the typical storage of just a few years ago.

So you've got more data than ever, integrating with more devices and platforms than ever, and reaching more device types than ever. Add to this the modern demands of real-time analysis and zero downtime. It's enough to make a DBA consider a midlife career change.

But before you start thinking about life as a karaoke repairman, remember that for every problem technology poses, it offers many possible solutions. The trick is to find the answer for your technical predicament, and apply it to best effect. The solution to many database performance glitches is scalability. The challenge is that traditional RDBMSs are easy to scale up by adding more and faster CPUs and memory, but they're notorious for not scaling out easily, which is a problem for today's widely distributed systems. Here's how to take much of the sting out of scaling databases.

The prototypical example of a scale-out system is a web app, which expands across low-cost servers as the number of users increases. By contrast, an RDBMS is designed to scale up: improve performance by adding processing power and more/faster memory.

Web apps are noted for scaling out to commodity servers as demand increases, while RDBMSs scale up by adding more and faster processors and memory but scale less well as the number of users grows. Source: CouchBase

Forget the backend -- Analytics now happens in the real-time data stream

If you're still analyzing data in batch mode on the back end, you could be further behind the competition than you think. InfoWorld's Andrew C. Oliver explains in a July 25, 2015, article that systems based on streaming analytics may cost more initially, but over time they make much better use of resources. One reason real-time analytics is more efficient is that you're not re-analyzing historic data the way you do with batch-mode analysis.

In a Dr. Dobb's article from November 2012, Nikita Shamgunov distinguishes the scale-out requirements of OLTP and OLAP databases: OLTP is used for real-time transaction processing, while OLAP accommodates analysis of large amounts of aggregate data.

Online analytical processing emphasizes deep dives into large pools of aggregate data, while online transaction processing focuses on many simultaneous data transactions with fewer data demands and shorter durations. Source: Paris Technology

Scaling OLAP databases relies on these four characteristics:

  1. Columnar indexes and compression (not specifically scaling out, but they reduce CPU cycles per node)
  2. Intra-parallel query execution (a scale-up partitioning technique that executes subqueries in parallel)
  3. Shared-nothing architecture (prevents a single point of contention)
  4. Distributed query optimization (sends a single SQL query to multiple nodes)

Here are the four considerations for ensuring OLTP databases scale well:

  1. Don't rely on columnar indexes because OLTP is real-time, and its data records are accessed randomly, which reduces the effectiveness of indexes.
  2. Parallel query execution is likewise less effective in OLTP because few queries require processing large amounts of data.
  3. Data in OLAP systems is generally loaded in batch mode, but OLTP data is ingested in real time, so optimizing bursty traffic is less beneficial in OLTP.
  4. In OLTP, uptime is paramount; OLAP systems are more tolerant of downtime.

Simple management of SQL, NoSQL, and in-memory databases is a key feature of the Morpheus next-gen Platform as a Service software appliance built on Docker containers. Morpheus supports management of the full app and database lifecycle in a way that accommodates your unique needs. Provisioning databases, apps, and app stack components is simple and straightforward via the intuitive Morpheus interface.

The range of databases, apps, and components you can provision includes MySQL, MongoDB, Cassandra, PostgreSQL, Redis, ElasticSearch, and Memcached databases; ActiveMQ and RabbitMQ messaging; Tomcat, Java, Ruby, Go, and Python application servers and runtimes; Apache and NGINX web servers; and Jenkins, Confluence, and Nexus app stack components. Request a demo to learn more.

Apps in the Crosshairs: Develop Like a Hacker


The point of attack on your data systems is now more likely to be at the app level rather than at the network level.

Organizations take great pains to prevent data breaches by securing their network perimeters, but the vulnerabilities targeted most often by hackers these days are at the application layer rather than the network layer. Programming practices have not kept pace with the profound, fundamental changes in data networks – from isolated, in-house silos to distributed, cloud-based virtual environments with unpredictable physical characteristics. Securing apps requires more-automated development environments that incorporate continual self-testing.

Data-security concerns have gone mainstream. On one side, you have people who believe Internet communications should be granted the same protected status as private conversations on the street: You can’t surreptitiously listen in without a warrant. On the other, you have the people who believe government authorities must have the ability to access private conversations, with or without a warrant, to combat such serious crimes as terrorism, human trafficking, and child abuse.

The battleground is end-to-end encryption: Should we, or shouldn’t we?

In a July 25, 2015, article, the Atlantic’s Conor Friedersdorf cites advocates of encryption back doors as warning of the dangers of an Internet in which all data communications are encrypted. They compare it to a physical space in which authorities have no ability to observe crimes in progress.

Friedersdorf refutes this argument by pointing out that the crimes the authorities are investigating take place in the physical world for the most part. He compares the encrypted communications to any group of people whispering to each other on a street corner. Without a reasonable suspicion and a warrant, the government can’t eavesdrop.

Whether end-to-end encryption becomes universal, or governments retain the ability to tap into email, text messages, phone calls, and other Internet communications, the fundamental approach to data security is changing from one focused on the network, to one focused on applications.

In the two-step end-to-end encryption model, the client and server first exchange keys, and then use the keys to encrypt and decrypt the data being communicated. Source: RiceBox & Security.

Baking security into code makes the programmer’s job easier, but…

MIT data security researcher Jean Yang points out the disparity between the rapid pace of change in data technology and the slow pace of change in the way programs are created. Yang is quoted by TechCrunch’s Natasha Lomas in a September 27, 2015, article as stating that legacy code acts as an impediment to adoption of the types of structural programming changes required to protect data in today’s distributed, virtualized networks.

The Jeeves programming language Yang has created encapsulates and enforces security and privacy policies “under the covers” so that programmers don’t need to be concerned with enforcing them via library function calls or other methods. However, Yang and her colleagues took pains to accommodate the way programmers actually work to avoid requiring that they change their current favorite methods and tools.

Bad code called a primary cause of data breaches

Organizations to date have focused on securing their networks from attack, but SAP’s Tim Clark asserts that 84 percent of all data breaches occur at the application layer. (Note that SAP is a leading vendor of application security services.) CSO’s Steve Morgan quotes Clark in a September 2, 2015, article. Morgan also cites Cisco Systems’ 2015 Annual Security Report, which found that increased use of cloud apps and open-source content management systems has made sites and SaaS offerings vulnerable at the application level.

Viewing the application layer as a separate stack helps reconcile the different perceptions of security by the dev side and the ops side. Source: F5 DevCentral

Programmers’ tendency to incorporate code written by others in their apps contributes to the prevalence of vulnerabilities in the programs. The borrowed code isn’t properly vetted for vulnerabilities beforehand, so securing programs becomes an after-the-fact operation. Writing secure code requires that security become the starting point for development projects, according to McKinsey & Co. partner James Kaplan, who co-authored the report Beyond Cybersecurity: Protecting Your Digital Business.

Recently, hackers have shifted to attacking not only vulnerable applications but also the tools developers use to create those applications. As CSO’s George V. Hulme reports in a September 29, 2015, article, CIA security researchers reportedly created a modified version of Apple’s Xcode development tool that let the agency place back doors into the programs generated by the toolkit.

The recently announced breach of thousands of products in Apple’s App Store was blamed on Chinese developers who downloaded compromised versions of Xcode, dubbed “XcodeGhost.” Apps created using XcodeGhost have a cascading effect that infects all apps developed subsequently, according to Hulme. That’s one of the reasons why experts recommend that the systems used in organizations for development work be isolated from the systems used to actually build, distribute, and maintain the apps.

Much of the added security burden for app developers can be mitigated by automated self-test functions integrated in the development process. CIO’s Kacy Zurkus writes in a September 14, 2015, article that automated testing gives organizations more insight into risks present in both their home-brewed apps and their off-the-shelf programs.

Raising awareness among developers about the risks of vulnerable applications, and about their increasingly important role in ensuring the apps they develop are secure from day one, is the first step in improving the security of your organization’s data. The next steps are to automate application security practices, and to make enforcement of security policies more transparent.

Preparing the Smart-Machine Platform and Data-Analysis Tools for Tomorrow’s Workers


More data requires — you guessed it — more machines. But are tomorrow's workers ready?

 

In the future, they will determine the precise date when the traditional notion of privacy expired – probably some moment in 1999. It will take a couple of decades at least for humanity to comprehend the abilities and reach of modern surveillance, and the unfathomable amount of data being generated and collected.

For example, satellites can now identify objects as small as 50 centimeters across, according to X Prize Foundation founder Peter Diamandis, who is quoted by IDG News Service’s James Niccolai in a September 24, 2015, article. Diamandis states that data-analysis systems such as IBM’s Watson are the only way to extract useful information from the enormous stores of data we now collect.

Diamandis predicts that we are approaching what he calls “perfect data,” the point at which everything that happens is recorded and made available for mining. He offers two examples: self-driving cars that scan and record their environment constantly, and a fleet of low-flying drones able to capture video of crimes as they are perpetrated or attempted.

Does this mean we should all don our tinfoil hats and head for the hills? Far from it. Technology is neutral – and that applies just as well to scary, Big Brother-scenario technology. The data-rich world of tomorrow holds much more promise than peril. From climate change to cancer cures, big data is the key to solutions that will impact the world. But this level of data analytics will require machine architectures smart enough and fast enough to process all that data, as well as access tools that humans can use to make sense of the data.

IBM Watson morphs into a data-driven AI platform

Big data has value only if decision-makers have the tools they need to access and analyze the data. First off, this requires a platform on which the tools can run. IBM envisions Watson as the platform for data-driven artificial-intelligence applications, as the New York Times’ Steve Lohr reports in a September 24, 2015, article. Watson is being enhanced with language understanding, image recognition, and sentiment analysis. These human-like capabilities are well-suited to AI apps, which IBM refers to as “cognitive computing.”

IBM’s Watson includes natural-language processing of questions that balances evidence against hypotheses to generate a confidence level for each proposed answer. Source: IBM.

Healthcare is expected to be one of the first beneficiaries of cognitive computing. For example, Johnson & Johnson is teaming with IBM and Apple to develop a “virtual coach” for patients recovering from knee surgery. In a September 25, 2015, article in the Wall Street Journal, Steven Norton quotes Johnson & Johnson CIO Stuart McGuigan stating that the system’s goal is to “predict patient outcomes, suggest treatment plans, and give patients targeted encouragement.”

New supercomputer architectures for a data-centric world

Data-driven applications won’t go anywhere without the processing power to crunch all that data. As part of the National Strategic Computing Initiative, IBM is working with the Oak Ridge National Laboratory and the Lawrence Livermore National Laboratory to develop a computer architecture designed specifically for analytics and big-data applications. IBM Senior VP John E. Kelly III describes the company’s strategy in a July 31, 2015, post on A Smarter Planet Blog.

The goal of the Department of Energy’s Collaboration of Oak Ridge, Argonne, and Livermore (Coral) program is to develop the fastest supercomputers in the world. They will be based on IBM’s data-centric design that puts processing where the data resides to eliminate the overhead of shuttling data to the processor.

Another player in the smart-machines field is Digital Reasoning, whose Synthesys machine-learning system is designed for fast analysis of email, chat, voice, social networking, and other digital communications. In an August 14, 2014, article, Fortune’s Clay Dillow describes how financial institutions are applying the machine-learning technology Digital Reasoning developed initially for the U.S. Department of Defense to combat terrorism.

Digital Reasoning’s Synthesys machine-learning system promises to deliver knowledge after ingesting and resolving structured and unstructured data. Source: Digital Reasoning

Evolving role of humans in the business decision-making process

The logical progression of machine intelligence leads to AI making better business decisions than human analysts. In a September 21, 2015, post on the New York Times Opinionator blog, Robert A. Burton explains that once the data in any area has been quantified (such as in the games of chess and poker, or the fields of medicine and law), we merely query the smart machine for the optimal outcome. Humans can’t compete.

However, humans will continue to play a key decision-making role that machines to date can’t duplicate: applying emotion, feelings, and intention to the conclusions derived by smart machines. For example, the image of a U.S. flag flying over a baseball park isn’t the same as the image of the same flag being raised by U.S. Marines on Iwo Jima.

For humans to apply their uniquely human sensibility to business decisions, they must have access to the data, which circles back to the need for data-analysis tools that run atop these smart-machine platforms. ComputerWeekly’s Carolyn Donnelly writes in a September 18, 2015, article that the more people who are able to analyze the data, the more informed their decisions, and the faster the technology will be integrated with day-to-day business operations.

As Donnelly writes, the businesses that are able to apply data-science principles and self-service platforms will do a better job of collecting, managing, and analyzing the data. The ultimate result is more productive use of the organization’s valuable information.

The Right Way to Migrate Data to the Cloud


More and more of your organization’s data will reside on public cloud servers, so have a plan in mind for getting it there.

When you’re migrating your company’s data from your premises to the cloud, the standard IT model is to do it halfway: Move some of your data to public cloud services, and leave some of it in the data center. After all, IT has a well-earned reputation for doing things carefully, deliberately, and in stages.

And that’s exactly how your data-migration project will fail ... Carefully. Deliberately. In stages.

Since the dawn of computers people have been moving data between discrete systems. What has changed is the tremendous amount of data being migrated, and the incredible diversity in the type and structure of data elements. We’ve come a long, long way from paper tape and ASCII text. The traditional elements of data migration are evident in SAP’s description of its BusinessObjects migration service:

  1. Purge and cleanse data.
  2. Devise mapping/conversion rules.
  3. Apply rules to extract and load data.
  4. Adjust rules and programs as testing dictates.
  5. Load test using a subset of data selected manually.
  6. Test and validate using a large amount of automatically extracted data.
  7. Load all data into the “acceptance system.”
  8. Load all data into the pre-production system.
  9. Validate the converted data, and get sign-off from users/owners.
  10. Load all data into the production system, and get final sign-off from all stakeholders.

Avoid the ‘Code, Load, Explode’ cycle of expensive migration failures

Cloud migration smashes the old-style iterative conversion process to bits. The reason is simple: The frequency of data migrations is increasing, but costs and migration failure rates are rising just as fast.

A May 5, 2015, article by Andre Nieuwendam on PropertyCasualty360 describes the “Code, Load, Explode” cycle of data-migration and extract-transform-load (ETL) projects. You develop the programming logic, run a test load, it fails, you adjust your data assumptions, retest, it fails again, etc. The problem with the ETL approach is that it can’t accommodate the almost-infinite number of data variables the migration is likely to encounter.


 

The ETL process for migrating data from source databases to JSON files. Source: Code Project

Even with the increasing prevalence of unstructured data, migrations deal more with records, or groups of related data elements, rather than with the discrete elements themselves. Many migration problems are caused by identical data elements with very different meanings based on the unique context of each record.

Say good-bye to what you know and hello to what you don’t

The lack of easy-to-use migration tools is exacerbating the growing complexity of data migration. As Computerworld’s Ben Kepes explains in an October 7, 2015, article, building a public cloud infrastructure bears little resemblance to devising a data-center architecture. The quintessential example of this is Netflix, whose public cloud is based on massive redundancies, planning for failure, agile development, and “nimble” monitoring and management. Kepes points out that these attributes are sorely lacking in the typical in-house network architecture.

Netflix is far from the only cloud-migration success story, however. In an October 8, 2015, post, Data Center Knowledge’s Yevgeniy Sverdlik describes successful cloud-migration projects at General Electric and Capital One. One thing the two companies have in common is that they both have to collect data from millions of widely distributed end points. This demonstrates a principal benefit of a public-cloud architecture over in-house IT: Your data lives closer to your users/clients/customers.


 

Two big advantages of a public-cloud infrastructure are elasticity (you pay for only the resources you use) and faster access to your data. Source: AutomationDirect.com

In GE’s case, the network end points are wind turbines, aircraft engines, and manufacturing equipment of every description. During the company’s three-year migration to a cloud infrastructure, it will reduce the number of data centers it maintains from 34 to only four, all of which will be used to house GE’s most sensitive, valuable data. The data from the other centers will be migrated to Amazon Web Services.

Sverdlik quotes IDC researcher Richard Villars speaking at the recent AWS re:Invent conference in Las Vegas: “People just want to get out of the data center business.” This fact is evident in Capital One’s plans to reduce its data-center count from eight in 2014 to five in 2016 and three in 2018. Two principal benefits of migrating data and apps to the cloud are faster software deployments and the elasticity to increase or decrease use on demand (such as for Black Monday).


How Organizations Are Benefiting from Self-Service Portals


Self-service portals are becoming more mainstream, and the benefits are making users and IT equally happy.

Companies of all types and sizes are realizing the benefits of self-service portals for everyday operations like customer account management, employee benefits administration, and access to the organization’s vital data resources. According to Parature’s 2015 Global State of Multichannel Customer Service Report, 90 percent of consumers in the four countries surveyed (Brazil, Japan, the U.K., and the U.S.) expect the companies they transact with to offer self-service customer-support portals or FAQ knowledgebases.

Here’s a look at how organizations are putting the self-service model to use in ways that improve customer service and make them more efficient. The IT functions that are candidates for the self-service model are also presented.

Asheville City's simple search portal puts residents in touch with services, info

The city of Asheville, North Carolina, replaced a complicated morass of portals and PDFs on the city website with what may be the simplest portal interface possible: a single search bar. The SimpliCity search engine provides residents with information about crime and developments in their neighborhood, as well as simple answers to such questions as “when is my recycle day?”

The SimpliCity self-service portal created by the city of Asheville, North Carolina, makes it easy for residents to access city services and to find information about their neighborhood. Source: Mountain Xpress

As Hayley Benton reports in an October 7, 2015, article in the Mountain Xpress, SimpliCity is noteworthy not only for its ease of use and effectiveness, but also for the manner in which the app was developed. A small, in-house team led by Asheville CIO Jonathan Feldman used “lean startup” techniques to create a small prototype linked to the city’s ArcGIS ESRI server. The team measured real user behavior and used that data to adjust the app through constant iterations to improve the experience for citizens.

SimpliCity is an open-source project hosted on GitHub and available for any government agency to adapt for use by their citizens. The project was the winner of a Technology Award at the October 2, 2015, Code for America Summit in Oakland, California. Among the award judges’ comments was that the app’s simplicity benefits city workers as much as its residents, which translates into more efficient use of public resources.

HR portal puts employees in benefits driver’s seat

One of the most common applications for self-service portals within organizations allows employees to manage their benefits without the human resources department serving as intermediary. Not only do such portals give employees a sense of control over the administration of their benefits, they also free HR to manage more effectively by reducing paperwork.

That’s the conclusion of Emily Rickel, who is the HR director at medical alert system provider Medical Guardian. Rickel relates her experiences with employee self-service (ESS) portals in a February 2, 2015, article on Software Advice. By adopting the Zenefits ESS system, the company and its employees have saved time and reduced resource consumption throughout the organization.

Software Advice’s survey of HR employees who have implemented ESS found that the two primary uses of the systems are for choosing insurance options and checking paid time off (74 percent in each category), followed by checking insurance information (66 percent) and requesting paid time off (59 percent).

A survey of HR use of employee self-service portals found that choosing insurance and checking PTO are the most common uses of the systems. Source: Software Advice

Self-service use cases for IT departments

No group has been affected more by the cloud disruption than the IT department. As TechRepublic’s Mary Shacklett explains in a September 4, 2015, article, the convergence of ready-to-use tools and a growing willingness among employees – some would say preference – to do things themselves has made implementing self-service IT portals much simpler.

Shacklett points out that not all IT functions are amenable to the self-service approach, but she identifies 10 areas where self-service portals are worth considering, five of which are listed below:

  1. Allow users to participate in testing new apps, particularly in the areas of test resource provisioning and test-result automation.
  2. Give business managers the ability to issue and renew user IDs and passwords, reducing IT’s role to approving the requests.
  3. User requests for enhancements to apps and systems can be automated via online templates that prompt them to describe the functions and features they are proposing.
  4. Likewise, requests for new equipment can be made via online forms that feed into the IT asset management and tracking systems.
  5. End users and managers may not be aware of the range of reports maintained by IT about company activity that can make the workers more productive. Creating a browsable “gallery” of these reports enhances access to and increases the visibility of the reports.

It makes perfect sense that the department most closely associated with automation would take the lead in applying automated processes to its core operations. Self-service portals serve the dual purposes of bringing users and managers closer to IT, and bringing IT closer to the people and operations of their companies’ business departments.

How to Hunt Down Hidden Server Error Logs


The Open Web Application Security Project (OWASP) offers three tips for maintaining trustworthy server-activity logs:

  1. For compliance, audit, and liability purposes, logs should be created in a way that ensures they can't be overwritten or deleted.
  2. Logging frequency depends on the size and volume of the system, but all logs should be checked regularly to ensure the function is active (by running a simple cron job, for example).
  3. Ensure that users aren't shown stack traces, private information, or other sensitive details in error messages; stick with generic messages, such as the standard 404 and 500 HTTP status response codes.

The Apache documentation explains that the error log file in Unix is usually error_log, and in Windows is error.log; Unix systems may also direct the server to send errors to syslog or pipe them to an external program. To continuously monitor the error log, such as during testing, use this command: tail -f error_log. Other Apache log files include the process ID file (logs/httpd.pid), which is used to restart and terminate the daemon; the ScriptLog, which records the input to and output from CGI scripts; and the RewriteLog, which analyzes the transformation of requests by the rewriting engine.

What to do when a PHP error-log file goes missing?

A Stack Overflow post from October 2012 highlights how challenging it can be to track down a PHP error log that isn't where you expect it to be. One solution offered was to set the error_log directive in the /etc/php.ini file, for example to /var/log/php-scripts.log (other log-discovery options are shown in the image below).


The PHP error_log file can be customized to hide errors from users, log errors to syslog, or other purposes. Source: Stack Overflow

To find the log-file location of a Linux process, you can use lsof (list open files), as explained in a post on the Slash4 blog. Run the code shown below as a root user:


 Find all open log files on a Linux server by combining lsof and grep. Source: Slash4 blog

You can find the process ID (PID) of httpd, MySQL, or other services using the commands shown below:


The lsof and grep commands can be used to find the PID of a process and to search for open log files. Source: Slash4 blog

Use .htaccess to create private, custom error logs

Apache's .htaccess configuration file lets you customize your error reporting to ensure that only the people who need to be notified of specific errors can view the reports. Jeff Starr explains the process on his Perishable Press blog. Start by adding the .htaccess directives below to the httpd.conf file of the domain, or alternatively to the site's root or other directory:


Add these commands to the .htaccess file to keep the error log private, enable PHP error logging, and restrict access to the error log. Source: Perishable Press 

These best practices and more were what we had in mind when we built the Morpheus Cloud Application Management Platform. Morpheus includes a robust logging and monitoring tool that allows you to easily visualize and monitor logs across all your distributed systems. It also makes it simple to consolidate logs across all of your apps, databases, and IT systems for faster troubleshooting. To learn more about Morpheus or to sign up for a free trial, click here.

The Great Hybrid Cloud Management Dilemma — And How to Solve It


"Simple, non-critical apps go public while complex, mission-critical apps stay private."

If only hybrid-cloud management were that straightforward!

The hybrid cloud is the Goldilocks of cloud services: The public cloud is inexpensive, but it’s not safe. The private cloud is safe, but it’s expensive. Putting the less-sensitive of your organization’s data assets in the public cloud while keeping your more-sensitive data in a private cloud is juuuuuuuust right.

If only the real world were as straightforward as fairytales. Configuring and managing a hybrid cloud infrastructure is fraught with peril: too much reliance on the public component puts your data at risk, yet overuse of the private component means you’re spending more money than you need to. However, there is a reward for companies that master the hybrid mix: secure data at a much lower cost than managing everything in-house.

In The Art of the Hybrid Cloud, ZDNet’s James Sanders defines the hybrid cloud as a combination of public cloud services, such as AWS or Google Cloud, and a private cloud platform. The two are linked via an encrypted channel over which your data and applications travel. Hooking just any server to a public cloud service doesn’t create a hybrid cloud. The private side of the connection must be running cloud software, such as the open-source ownCloud or Apache CloudStack.


Hybrid clouds combine the security and performance of in-house systems with the efficiency and agility of public-cloud services. Source: Raconteur 

Sanders claims that the primary reason organizations choose not to use public cloud boils down to bandwidth: they have too much data that they need to access quickly. The public network’s latency is what prevented the Japanese Meteorological Agency from migrating its weather-forecasting data to cloud services. The agency uses an 847-teraflop Hitachi supercomputer to analyze earthquake data to determine whether a tsunami warning needs to be issued. The time-critical nature of such analyses precludes use of the slow public Internet.

The cloud gives developers direct access to the infrastructure

The data-center infrastructure has been refined and improved so much over the years that its reliability is taken for granted by IT managers and business users alike. Conversely, the public network is anything but failure-proof. It remains a given that the network will fail, which explains the Netflix “build for failure” Chaos Monkey app-development strategy. Tech Republic’s Keith Townsend explains in a September 22, 2015, article why the need to build resilience into the app rather than the infrastructure is preventing companies from adopting cloud services.

Townsend claims the cloud’s greatest asset is agility: It allows developers to manipulate the infrastructure directly, with no need for an IT intermediary. This lets ideas move swiftly “from the whiteboard to running code.” According to Townsend, you can’t reduce the complexity of a highly redundant infrastructure without sacrificing reliability. The hybrid cloud has the potential to deliver the agility of the cloud along with the resiliency of the data center.


Hybrid cloud solutions offer greater agility, efficiency, scalability, and protection than server virtualization. Source: Archimedius

Whether hybrid clouds deliver on this potential depends on overcoming two challenges. The first is scalability, and the second is ensuring “frictionless consumption.” In the first case, some applications are too large for the public network to support the redundancy they require. Not many organizations have the infrastructure in place to handle the load that would result from an AWS failure, for example.

The second case – frictionless consumption – is even trickier to pull off because of the inherent complexity of cloud management, particularly in relation to highly redundant infrastructures. The heft of large applications can cancel out the cloud benefits of easy, universal access and simple interfaces.

Tips for hybrid-cloud security, monitoring

All the agility, scalability, and usability of hybrid-cloud solutions are worthless without the ability to secure and monitor your organization’s off-site data assets. In a September 2, 2015, article, Tech Republic’s Conner Forrest writes that a major concern is how the cloud service handles authentication on its public-facing portal. Forrest points out that security and monitoring are usually self-service, and few SLAs may be offered by the provider.

In addition to insisting on rock-solid SLAs, hybrid-cloud customers must determine whether workloads are properly separated in multi-tenant environments. Forrest quotes Virtustream co-founder and senior vice president Sean Jennings, who lists seven cloud-security challenges:

  1. Separating duties
  2. Misconfiguration
  3. Blind spots – virtual switches and VM leakage
  4. Reporting and auditing
  5. Visibility
  6. Compliance and governance
  7. Recoverability

Tying all these precautions together is continuous compliance monitoring, which allows you to view your hybrid network as cybercriminals view it. The best way to thwart would-be data thieves is to think like they do and to see your network as the crooks see it. To quote the ancient Chinese military strategist Sun Tzu, “To defeat the enemy, become the enemy.”

DIY IT? How to Beat the Sneaky Costs of Shadow IT


Shadow IT problems plague most IT organizations today. The solution may not be obvious, but taking a few critical steps can secure your organization once and for all.

The race is on. Super users are adopting new technologies faster than their IT departments can support them. Sensitive, protected company data exists outside the organization’s own network – beyond the reach of IT and business managers. If your company’s IT department is losing the technology race with your employees, you’re paying the hidden cost of Shadow IT.

The best way to prevent squandering resources chasing after employees’ DIY IT, according to experts, is to devise a cloud infrastructure that incorporates the out-of-network services your line workers increasingly rely on in the course of doing their jobs. The only way to create such an infrastructure is to start by listening to your workers. That’s the conclusion of a report released earlier this year by Australian telecom and information services firm Telstra.

The report, entitled "The Rise of the Superuser" (pdf), defines a “superuser” as an organization that has embraced remote working by supporting collaboration tools. In such companies, workers are familiar with the products that let them be more productive in their jobs, and they’re more vocal about expressing their tool preferences to IT departments. These users are also more likely to resist adopting technologies proposed by IT that the workers believe do not meet their needs.

Of course, resistance works both ways. When IT departments say no to user requests for new collaboration and productivity tools, they cite several reasons for doing so. According to the Telstra survey, the principal cause of their reluctance, named by 47 percent of IT departments surveyed, is the need to support higher-priority IT projects. A lack of funding and the need to ensure data security and compliance were each cited by 40 percent of the respondents.


The top three reasons cited by IT managers for resisting employee requests for next-gen tools are higher-priority projects, lack of funds, and security/compliance concerns. Source: Telstra Global

So how can IT managers solve this growing problem? 

Step one: Identify the shadow services employees are using

Cisco recently analyzed data on the use of cloud services by its enterprise customers and concluded that IT departments are way off in their estimates of how many cloud services their employees use on the job. The study results are presented in an August 6, 2015, post on the Cisco Blog.

While IT managers believe their companies are using an average of 51 cloud services, in fact they’re using an average of 730 such services. Cisco predicts that the average number of cloud services used by these companies will increase to 1,000 by the end of 2015.

IT departments are also seriously underestimating the cost of shadow IT to their organizations, according to Cisco’s research. The true cost of public cloud services is 4 to 8 times higher than the price of the services themselves because the services have to be integrated with existing operations and procedures.

Step two: Identify why employees are adopting their own tech solutions

The primary reason workers go the shadow-IT route is that their company’s IT department lacks the resources to investigate and implement the many cloud and mobile tools becoming available every day. In a September 28, 2015, article, InfoWorld’s Steven A. Lowe identifies four downsides of shadow IT:

    1. Critical, private data is being shared by the employees with the wrong people.
    2. The data is likely to be inaccurate and out-of-date.
    3. The data is denied to other workers in the organization who could benefit from it if they knew it existed.
    4. Whenever workers solve their own IT problems, it poses a threat to IT’s autonomy and control.

The IT department may be tempted to respond to these threats by trying to stomp out DIY IT in their organizations, but Lowe argues that this strategy is bound to lose. First of all, the effort would use up all available IT resources: imagine the tracking, inventorying, and monitoring such a project would require. But more importantly, by not embracing DIY IT, the organization is squandering a tremendous opportunity.

Your IT department needs to be approachable, and it needs to focus on helping people help themselves, as well as on encouraging workers to share what they learn with other workers, and with IT. Eventually, your organization will have grown an organic, adaptable cloud infrastructure designed with user needs in mind – needs identified in large part by the users themselves.


The seven elements of cloud computing value include utility pricing, elasticity, self-service provisioning, and managed operations. Source: DWM Associates, via Daniel Vizcayno’s Two Cents Blog

Lowe points out that ensuring the company’s data is safe, accurate, and available remains the responsibility of the IT department. To be truly useful, your security policies have to avoid slowing workers down. One of the tips Lowe offers for implementing a cloud infrastructure is to survey users to find out what apps and services they use, although you have to make it clear to them beforehand that you’re not hunting for unauthorized software.

Encouraging more uniform use of cloud apps can be as natural as being available to offer advice and make recommendations to employees who are looking for ways to get more out of their mobile devices. When you discover one group having success with their DIY IT project, publicize it to other groups that may find it useful.


Morpheus will help your organization say goodbye to Shadow IT forever. To see for yourself, click here to start your free trial today.

Intro to Awesome: How to Become a Database Engineer 101


If you've been trying to figure out how to become a database expert as quickly as possible, consider this your introduction.

There are a number of career choices available to you if you decide to become a database expert, such as Database Administrator and Database Developer. What does it take, and what do you need to learn, to become an expert with databases? Read on to learn all about joining one of the most sought-after groups of people in today's workforce.

What Does a Database Expert Do?

The most popular position title for a database expert is Database Administrator (DBA for short). This person is in charge of all aspects of the database - security, storage, development, and anything else that may arise.

As a DBA, you would be installing, testing, and troubleshooting databases and the information stored within them. Typically, you will be using database management software like Morpheus to handle many of the management tasks.

Another popular title is Database Developer. In this position, you would be working on the development of data and network structures, as well as queries that store or retrieve information for the database.

In either position, you will probably work closely with Network Analysts and software developers in order to ensure that everything functions properly and performs at a high level.

What kind of education or training do you need?

In most cases, you will need at least an associate’s degree, or quite possibly a bachelor’s degree. Your field of study will need to be computer science, information systems, or information security, and you will likely want to focus on databases and/or networking as you move further along in your studies.

You will likely want to learn Structured Query Language (SQL) to start, as most relational databases use some form of this language as the basis for their queries. Also, you may want to consider learning programming in general, as data structures like JSON are very similar to the document structure used to write queries in many NoSQL databases. For example, MongoDB uses BSON, a binary representation of JSON-like documents that supports some additional data types.

An example of SQL. Source: Wikipedia.

An example of JSON (JavaScript Object Notation). Source: Wikipedia.
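
The parallel between the two query styles is easier to see side by side. Below is a minimal, hypothetical sketch written with the official mongo Ruby driver; the shop database, the orders collection, and the field names are invented for illustration, and a MongoDB server is assumed to be running locally.

# compare_queries.rb -- illustrative sketch only
require 'mongo'

client = Mongo::Client.new(['127.0.0.1:27017'], database: 'shop')

# Roughly equivalent SQL:
#   SELECT * FROM orders WHERE status = 'shipped' AND total > 100;
shipped_orders = client[:orders].find(status: 'shipped', total: { '$gt' => 100 })

shipped_orders.each { |order| puts order.inspect }

Note that the find filter is simply a JSON-style document expressed as a Ruby hash, which is why fluency with JSON translates so directly into writing NoSQL queries.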

What else do I need?

In addition to a degree, more advanced positions may require that you have additional certifications in database administration or development. You can find many courses like this online, and they are good to take in any case so that you can more easily advance in your field.

Another thing you may need is work experience. To get it, you may have to do freelance work or intern at a company for a while; this gives you a portfolio of successfully completed projects you can point to. If you are looking for a position that requires managing others, you will likely need additional training in managing employees, and possibly some experience with that as well.

In the end, getting the education, training, and experience will be well worthwhile: you will become a highly sought-after database expert, and that is something every company that relies on databases needs.

Breaking Barriers Between Business and IT Using Cloud Services


Business managers become techies, and techies become business managers in the modern decentralized organization.

 

Bringing About Change 

These days, every company is a tech company, and every worker is a tech worker. Business managers are getting more tech-savvy, and IT managers are getting more business-savvy. Yet the cultural barriers in organizations of all sizes between IT departments and lines of business persist — to the disadvantage of both.

There are clear signs that the fundamental nature of the relationship between IT and the business side is changing, just as the way both groups work is changing. As every manager knows, change doesn't come easily. Yet every manager also knows that the long-term success of the company depends on embracing those changes.

A positive consequence of the "technification" of business and the "businification" of tech is that the two groups are truly collaborating in ways they rarely have in the past. Every business decision not only involves technology, it is predicated on it. Likewise, every tech decision has at its foundation the advancement of the company's short-term and long-term business goals.

As the adage goes: Easier said than done. Yet it's being done — and done successfully — in organizations of all types and sizes. Here's a look at what those success stories have in common.

Fusing IT with Business: It's All About Attitude

In a September 1, 2015, article on BPMInstitute.org, Tim Musschoot writes that successful collaboration between business departments and IT departments depends on two things: establishing a common understanding among all parties, and managing each party's expectations of the other.

It's not enough for each side to be using the same terminology. To establish a mutual understanding, you must consider behavioral aspects (who does what, how do they do it, when do they do it); information aspects (what is an order form, who is the customer, how do they relate); and the business rules that apply to processes and information (when can an order be placed, how are rates established, etc.).

Three elements must come together when business and IT merge: business processes (behaviors); information processes; and business rules. Source: BPMInstitute.org

Collaboration

The quickly closing gap between business and IT is evidenced by a recent survey of both groups conducted by Harris Poll and sponsored by Appian. Forbes' Joe McKendrick examines the survey results in an April 27, 2015, article. One finding of the survey is that business executives are almost as enthusiastic as IT managers about the potential of cloud services to boost their organizations' profits. Similarly, business managers perceive application maintenance, app and data silos, expensive app development, and slow app deployments as drains on their companies.

The need for business and IT to collaborate more closely on app development, deployment, and maintenance is highlighted in a Forrester study from 2013. As InfoQ's Ben Linders writes, the business side often perceives IT staff as "order takers," while IT departments focus on excelling at specific central functions. Both sides have to resist the tendency to fall into the customer-provider model.

Breaking out of the business/IT silos requires replacing the hierarchical organization with one based on overlapping, double-linked circles. The circles include everyone involved in the development project, and double-linking ensures that the members of each circle are able to communicate immediately and naturally on all the organization's ongoing projects. This so-called sociocracy model ensures that the lines of communication in the company remain visible at all times.

Communication is enhanced by replacing the traditional hierarchical structure of software development projects with a "sociocratic" approach of linked and overlapping circles. Source: EbizQ

Bringing IT and business departments together is a singular feature of the Morpheus next-gen Platform as a Service software appliance, which is built on Docker containers. Morpheus offers clear, unambiguous monitoring of SQL, NoSQL, and in-memory databases. The service supports management of the full app and database lifecycle in a way that accommodates your unique needs. Monitoring databases, apps, and app stack components is quick and simple via the intuitive Morpheus interface.

The range of databases, apps, and components you can provision includes MySQL, MongoDB, Cassandra, PostgreSQL, Redis, ElasticSearch, and Memcached databases; ActiveMQ and RabbitMQ messaging; Tomcat, Java, Ruby, Go, and Python application servers and runtimes; Apache and NGinX web servers; and Jenkins, Confluence, and Nexus app stack components. Visit the Morpheus site to request a demo.

Is the Cloud Really the Safest Place for You to Store Your Critical Databases? New Techniques for Ensuring Cloud Database Security


Cloud services now act as data-security aggregators by applying big-data principles to spotting and thwarting new data threats.

It's a fact: the cloud is where more and more of your company's data resides. Still, some companies are reluctant to trust their vital databases to a cloud service. That's because migrating a database to the public cloud is not the same as moving your apps and web servers there.

In "Zen and the Art of Cloud Database Security" from November 2014, SecurityWeek's David Maman writes that cloud services are well-equipped to thwart attacks against the integrity and availability of apps and web servers. However, databases must also be protected against threats to the confidentiality of the data. Equally important, databases must comply with laws and regulations relating to data security.

Forrester Research predicts continued steady growth in both cloud database services and cloud security spending through 2016. Source: Forbes

In the second part of Maman's article, he explains the areas where customers and cloud service providers share responsibility for the security of their hosted databases. These include encryption, ensuring software patches are applied, and implementing a rock-solid backup plan. Maman also describes the compliance-related controls that must be applied to databases, such as mapping regulated data to exact locations; identity management; encryption; and deterring, detecting, and mitigating attacks from within and without the organization.

The benefits of leaving cloud database security to the experts

In a very short period of time, security has shifted from being a primary reason why companies kept their databases in house, to being a primary reason why companies are hosting their databases in the cloud. In an August 29, 2015, article on Tech.co entitled "Do the Math: Can Your Business Benefit from a Cloud Database?", Baron Schwartz cites improved security as one of the three benefits of database as a service; the other two benefits are scalability and reduced administrative costs.

Schwartz states that database security depends on timely application of software updates and patches. Despite the best of intentions, these important operations simply are not done consistently at most organizations, or they're done poorly. It is much safer to trust this important function to a service that specializes in keeping databases patched and protected from the latest threats.

Perceptions change slowly in IT departments, as shown by a 2015 Vormetric survey that found database risk to be greater than IT perceives, and cloud risk to be lower than IT perceives. Source: Forbes

DarkReading's Bill Kleyman highlights another big reason for trusting cloud services with the security of your company's databases. In a March 9, 2015, article, Kleyman points out that the cost of a data breach continues to skyrocket: malicious or criminal attacks now cost $246 per record lost on average, while system glitches and employee mistakes cost the companies an average of $171 and $160 per compromised record, respectively.

To see how Morpheus can help you securely store and manage your databases and apps, click here for a free trial. 


Docker Provisioning Tips from the Experts


Follow these approaches to enhance deployment and management of your data resources – whether or not you use Docker.

 

What? You haven’t converted all your data to Docker yet? Are you crazy? Get with the container program, son.

Well, as with many hyped technologies, it turns out that Docker specifically (and containers in general) is not the cure for all your data-management ills. That doesn’t mean the technology can’t be applied in beneficial ways by organizations of all types. It just means you have to delve into Docker deliberately – with both eyes open and all assumptions pushed aside.

Matt Jaynes writes in Docker Misconceptions that Docker makes your systems more complex, and it requires expertise to administer, especially on multi-host production systems. First and foremost, you need an orchestration tool to provision, deploy, and manage the servers Docker is running on. Jaynes recommends Ansible, which doubles as a configuration manager.

However, Jaynes also suggests several simpler alternative optimization methods that rival those offered by Docker:

  1. Cloud images: App-management services such as Morpheus let you save a server configuration as an image that you replicate to create new instances in a flash. (Configuration management tools keep all servers configured identically as small changes are implemented subsequently.)
  2. Version pinning: You can duplicate Docker’s built-in consistency by using version pinning, which helps avoid conflicts caused by misconfigured servers (a sketch follows this list).
  3. Version control: Emulate Docker’s image layer caching via version-control deploys that use git or another version-control tool to cache applications on your servers. This lets you update via small downloads.
  4. Package deploys: For deploys that require compiling/minifying CSS and JavaScript assets or some other time-consuming operation, you can pre-compile and package the code using a .zip file or a package manager such as dpkg or rpm.
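
To make the version-pinning idea in item 2 concrete, here is a minimal sketch using a Ruby Gemfile; the gem names and version numbers are arbitrary examples, and the same principle applies equally to apt/yum packages or any other dependency manifest.

# Gemfile -- pin exact (or tightly bounded) versions so every server
# installs identical dependencies; gems and versions are arbitrary examples
source 'https://rubygems.org'

gem 'rails',  '4.2.5'       # exact pin
gem 'puma',   '~> 2.15.3'   # allow only patch-level updates
gem 'mysql2', '0.4.2'

Committing the Gemfile.lock produced by bundle install pins the entire transitive dependency tree as well, which gives you much of Docker's repeatability without containers.
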
Running Docker in a multi-host production environment requires management of many variables, including the following:
  1. A secured private image repository (index)
  2. The ability to orchestrate container deploys with no downtime
  3. The ability to orchestrate container-deploy roll-backs
  4. The ability to network containers on multiple hosts
  5. Management of container logs
  6. Management of container databases and other data
  7. Creation of images that can accommodate init, logs, and similar components

Finding the perfect recipe for provisioning via Chef

The recent arrival of Chef Provisioning introduces the concept of Infrastructure as Code, as John Keiser writes in a November 12, 2014, post on the Chef blog. Infrastructure as Code promises to let you write your cluster configuration as code, which makes clusters easier to understand. It also allows your clusters to become “testable, repeatable, self-healing, [and] idempotent,” according to Keiser.

Chef Provisioning’s features include the following:
  1. Application clusters can be described with a set of machine resources.
  2. Multiple copies of your application clusters can be deployed for test, integration, production, and other purposes.
  3. Redundancy and availability are improved because clusters can be spread across many clouds and machines.
  4. When orchestrating deployments, you’re assured the database primary comes up before any secondaries.
  5. machine_batch can be used to parallelize machines, which speeds up deployments.
  6. machine_image can be used to create images that make standardized rollouts faster without losing the ability to patch (a sketch follows this list).
  7. load_balancer and the machine resource can be used to scale services easily.
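
As a quick illustration of item 6, here is a hedged sketch of how machine_image might be used, based on the chef-provisioning documentation; the image, machine, and recipe names are hypothetical.

# image_example.rb -- hypothetical sketch of the machine_image resource
require 'chef/provisioning'

machine_image 'web_base' do
  recipe 'apache2'        # bake the web server into a reusable image
end

machine 'web1' do
  from_image 'web_base'   # boot from the pre-built image...
  recipe 'my_site'        # ...then layer on app-specific configuration
end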

Keiser provides the example of a recipe that deploys a database machine with MySQL on it. You simply install Chef and Provisioning, set CHEF_DRIVER to your cloud service, and run the recipe:

# mycluster.rb
require 'chef/provisioning'

machine 'db' do
  recipe 'mysql'
end

For example, to provision to your default Amazon Web Services account in ~/.aws/config, set the CHEF_DRIVER variable to the following:

export CHEF_DRIVER=aws   # on Unix
set CHEF_DRIVER=aws      # on Windows

Then you simply run the recipe:

chef-client -z mycluster.rb

Add machine_batch and parallelization to apply the configuration to multiple servers:

# mycluster.rb
require 'chef/provisioning'

machine_batch do
  machine 'db' do
    recipe 'mysql'
  end

  # Create 2 web machines
  1.upto(2) do |i|
    machine "web#{i}" do
      recipe 'apache2'
    end
  end
end

In this example, there are three machines in machine_batch, which provisions the machines in parallel. A loop is used to create multiple machines, so you can add more machines by changing 2 to your desired number, all of which will be created in the same time it takes to create one machine.

Finally, run this command:

chef-client -z mycluster.rb

The db machine will respond with “up to date” rather than indicating that it has been created because machine (like all Chef resources) is idempotent: It knows the db machine is already configured correctly, so it doesn’t do anything.

A simple example of Chef Provisioning in Docker

In an April 28, 2015, post on the Safari blog, Shane Ramey provides an example of using Docker with Chef Provisioning to create a network of nodes on a local workstation. The network comprises three node types: a load balancer, two application servers, and one or two database servers. The rudimentary example merely installs packages and sets Chef node data. It has four functions:

  1. Define the machines in the environment: Chef node data and recipe run_list (a sketch of such an environment file follows this list).
  2. Add, query, update, and delete machines by using the environment configuration file.
  3. Share the environment configuration file to allow others to run their own copy of the environment.
  4. Apply version control to the environment configuration.
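
Ramey's actual environment file isn't reproduced here, but the following is a rough sketch of what such a definition might look like; it assumes the chef-provisioning-docker gem is installed, and the machine names, recipes, and node attributes are invented for illustration.

# deploy-environment.rb -- a hypothetical sketch, not Ramey's original file
require 'chef/provisioning'

with_driver 'docker'   # assumes the chef-provisioning-docker gem is installed

machine 'loadbalancer' do
  recipe 'haproxy'
end

# Two application servers
1.upto(2) do |i|
  machine "app#{i}" do
    recipe 'my_app'
    attribute %w(my_app role), 'application'   # example of setting Chef node data
  end
end

machine 'db1' do
  recipe 'mysql'
end

Because the whole topology lives in one file, adding, updating, and sharing machines (functions 2 and 3 above) reduce to editing and re-running this single recipe, and the file itself can be kept under version control (function 4).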

There are six steps to the process:

  1. Copy the example file ~/deploy-environment.rb to the local machine.
  2. Download and install Docker: boot2docker for Mac and Windows, or Docker for Linux.
  3. Download and install ChefDK.
  4. Install the chef-provisioning-docker library for Chef.
  5. Set the local machine to use Chef Zero to serve cookbooks.
  6. Select Docker as your driver.

Now when you run docker images, the output will resemble that shown in the screen below:


After following the six steps for Chef Provisioning in Docker, running docker images will list information about each of your Docker images. Source: Safari blog

When the images launch, they communicate with a defined Chef server, which can be either the local workstation’s Development Chef Server or a production Chef server. Using a local machine as the Chef server requires that it be launched before the Docker instances can check in. To do so, run the following command:

cd ~/chef && chef-zero -H 0.0.0.0 # Listen on all interfaces

To launch the Docker images, run one or more of the following commands:


Launch Docker images by running one of these commands (after initializing the Chef server, if you’re using a local machine, as in the above example). Source: Safari blog

This technique allows developers working on the same application to share a single environment config file and run everything on the same local machine. Code can be uploaded quickly and simply for deployment in one or more networked environments.

To learn more about best practices for container utilization and cloud management, follow Morpheus on Twitter or LinkedIn.

Application and Database Migration Strategies: Don’t Just Migrate – Transform Your Apps and Databases to Portable, Cloud-Ready Status


Refactoring your apps and databases prior to migrating them to the cloud improves efficiency and portability.

The lift-and-shift approach to moving applications and databases to the cloud may seem most efficient in the short run, but the shortcomings of the technique become more obvious over time. By taking the time to refactor your apps – whether partially or completely – you can reap performance rewards while also ensuring that the systems are easier to port to other cloud services.

If you simply “lift and shift” your apps and databases from in-house servers to AWS and other cloud services, you’re missing a golden opportunity to make your data operations faster, more efficient, and more economical. The only way to take full advantage of the best features of cloud-based app and DB management services is to use containers and other technologies optimized for the cloud.

Here are the three main options facing many organizations as they devise a cloud-migration plan for their in-house applications and databases (HT to David Linthicum for his original post, which you can read here: http://www.infoworld.com/article/2937432/green-it/is-your-cloud-green-enough-thats-the-wrong-question.html):

  1. A direct port with no modification of the code (lift and shift)
  2. A partial refactoring that customizes the app to accommodate cloud features (few changes are made to the app itself)
  3. A complete refactoring that not only adds cloud customizations to the app but also reworks its basic functions

While the lift-and-shift migration approach is fastest and simplest, you lose out in two ways. 1: Your application or database can’t take advantage of elastic scalability and other cloud features. 2: The lifted-and-shifted system will cost more to manage in the long run. If your app uses a well-defined architecture, if its data is tightly coupled to the application logic, and if it runs well in the cloud, then lift and shift may let you avoid a costly and time-consuming refactoring process.

When you completely refactor the migrated app, you’ll realize much greater performance while also reducing the cost of managing the system. However, the migration itself is more expensive because you’re altering so much more of the application, and the process will also take longer to complete. If the app is business-critical but is poorly written to begin with, it’s easy to justify the time and expense of refactoring it from top to bottom.

In many instances, the optimal application migration strategy is the “sweet spot” between a straight lift and shift, and a complete code refactoring. Source: Cloud Technology Partners, Inc., via Slideshare

The middle ground of a partial refactoring may seem like the perfect compromise between a straight-ahead porting of the app and totally rewriting it, but this is really a halfway, stop-gap solution that simply postpones the ultimate development and deployment of the cloud-native version of the app.

Determining whether your app would benefit from being containerized

If you decide against the lift-and-shift approach to app migration, there’s another potential limiter you need to avoid: Locking your system to one service’s cloud-native features. In a July 24, 2015, InfoWorld article, Linthicum offers a solution to this predicament: containers. A prime example of such a cloud-native pickle is use of a specific service’s provisioning and deprovisioning resources rather than using the cloud itself to auto-scale and auto-provision your application.

By contrast, containers let you encapsulate your app’s components and place them in containers that keep them separate from the cloud platform itself. This allows the application to be ported quickly and simply to any other platform that supports the container. Containerization entails more work than a straight lift-and-shift port, but the rewards in terms of app performance and cost savings can more than justify the extra upfront cost.

One of the advantages of containers over virtualization is that containers share OSes, as well as bins/libraries when appropriate. Source: ZDNet

Linthicum also states that a primary benefit of using containers to migrate apps and databases to the cloud is the ability to create a lightweight abstraction layer without relying on virtualization. This lets you increase efficiency when transporting workload bundles between cloud services in hybrid and multi-cloud environments. Another advantage of containers is that you can apply security and governance services around, rather than inside, the containers themselves, which improves both portability and ease of deployment.

One of the most complicated calculations when planning a cloud migration is the cost comparison, particularly when you choose to go with multiple cloud providers. However, as TechTarget’s Joel Shore writes in a December 2015 article, by doing so you can realize savings from 58 percent for a “small” application to 74 percent for “large-scale solutions” over using a sole cloud service. A prime example of a cost-efficient hybrid/multi-cloud service is the Morpheus cloud application management platform.

Morpheus’s intuitive dashboard interface makes managing apps in hybrid cloud settings as simple as pointing and clicking. With Morpheus it takes only seconds to provision databases, apps, and app stack components on any server or cloud, whether on-premise, private, public, or hybrid. Because provisioning is done asynchronously, you can provision multiple IT systems simultaneously. When you’re ready to add nodes via the web UI, CLI, or through an API call, Morpheus automatically configures the database or app cluster to accommodate the new nodes. There’s no simpler way to ensure your organization’s important data is both accessible and portable. To start a free trial click here.  

What Is PaaS? The Fast Track to Reaching Your Business Goals


Note: This blog post is a companion piece to an infographic we recently published. To see the full version of the infographic click here

What is PaaS? infographic by Morpheus. Click here to see the entire graphic.   

There’s plenty of substance, and many potential benefits for more efficient data management, behind the “PaaS” buzzword

Platform as a service, or "PaaS", applies the utility model to data management, promising more efficient and more focused IT operations by allowing development teams to focus on the “top of the stack”: the applications and data that distinguish your company in crowded markets and give you an edge over the competition. 

Businesses use technology to convert information into knowledge, knowledge into action, and action into achievement. Applying information in a way that helps you reach your organization’s goals starts by asking the right questions. In the rush to the cloud, many business decision makers are asking, “What is PaaS?” But, maybe that's not the right question. 

Sure, you need to know that the underpinnings of the services you contract for are rock-solid – today and into the future. But what PaaS is matters less to your business’s bottom line than why PaaS is a good option for your company’s app and data management. Ultimately, how to implement PaaS to best effect for your specific needs becomes your primary concern.

The nuts and bolts of PaaS matter less to business decision makers than the potential the technology offers to improve not only app development and deployment, but also the efficiency and speed of your entire IT operation. After all, avoiding the cost of buying, managing, and maintaining hardware and system software is the cloud’s major claim to fame.

Think of PaaS as "IaaS plus"

In the beginning of the cloud era, there was infrastructure as a service (IaaS), which replaces the data center with a utility-style pay-as-you-go model. Not only do you save the cost of networking, storage, servers, and virtualization, you also avoid having to pay for the staff expertise required to keep the data center running. With IaaS, the customer supplies the OS, middleware, runtime, data, and applications. 

Infrastructure as a service replaces in-house hardware and system software with a utility-like pay-as-you-go model, while PaaS adds middleware, runtime, and database software to the utility model. Source: Redgate, Simple Talk 

PaaS moves even more of the generic, plug-and-play components of your company’s data center to the cloud’s on-demand service model: the middleware, runtime, and database components. This allows you to focus your time, energy, and budget on the “top of the stack,” the applications and data that distinguish you from the competition and provide your edge in the marketplace. 

PaaS Benefit 1: Businesses can focus on the application layer

PaaS users are liberated from having to manage or secure the underlying cloud infrastructure, including the network, servers, operating systems, and storage, but they retain control over their deployed applications and the configuration settings for their applications in the hosted environment. For example, cloud app management platform Morpheus features an intuitive interface that allows end users to provision databases, apps, and app stack components in just seconds on any server or cloud, whether it’s on-premise, private, public, or hybrid. 

The PaaS provider is responsible for provisioning and managing the lower-level infrastructure resources, as well as for supporting the upper-level managed application development and deployment platform, which typically includes:

  1. Operating systems
  2. Databases 
  3. Middleware 
  4. Software tools
  5. Managed services (usually in a multi-tenant environment) 

PaaS Benefit 2: Faster app development and deployment 

Because developers are completely abstracted from the lower-level details of the environment, they can focus on rapid development and deployment. They need not be concerned with such matters as scalability, security, and the other components of the stack’s lower layers, all of which are managed by the PaaS provider. Similarly, by supporting a range of runtimes, PaaS allows developers to retain control over the application’s deployment and maintenance. 

  1. Integration with the lower IaaS layer is via APIs supplied to PaaS services by the IaaS stack. 
  2. At the upper SaaS layer, PaaS services support many different runtimes in which the applications are deployed, managed, and updated. 

 

 On the continuum from IaaS to PaaS to SaaS, the required level of skills broadens from a handful of high-priced experts to the universe of everyday end users. Source: James Staten, Forrester Research 

PaaS Benefit 3: Affinity for modern widely distributed networks

PaaS complements other cloud-based operations, such as mobile and web apps. It promises to reduce operational costs and increase productivity by reducing time to market. 

To learn more about cloud app management for platforms, visit the Morpheus features page or sign up for a free trial here

Cloud Taxes Coming to a City Near You? What you need to know.


Questions Surround the Plans of Chicago and Other Cities to Tax Cloud Services

The best thing you can say about Chicago’s new 5.25 percent tax on cloud services, which went into effect on January 1, 2016, is that it is unlikely to survive a court challenge. The city claims it is merely clarifying its existing Transaction Tax that applies to non-premises computer leases, but others question the city’s legal right to levy what many businesses and residents claim constitutes a new tax.

The new cloud tax exempts small startups, but it applies to all other providers and consumers of cloud services. A similar expansion of Chicago’s Amusement Tax applies to all streaming content sent to or received by businesses and residents with street and IP addresses within the Chicago city limits. The taxes are levied even if the resident is outside the city limits when using the service. That’s only one of the puzzling aspects of cloud taxes that a growing number of municipalities – including New York, Boston, Seattle, and Austin, Texas – are enacting.

Computer leasing tax gets a cloud-era update

In June 2015, Chicago comptroller Dan Widowsky ruled that the city’s nine percent property lease transaction tax applied to cloud services. The city also issued a ruling that extended its amusement tax to streaming services, such as Netflix and Spotify. The new taxes were originally intended to take effect on September 1, 2015, but the start date was ultimately pushed back to January 1, 2016. Amazon Web Services, Netflix, and other cloud and streaming vendors are expected to begin paying the taxes as of February 15, 2016.

Following discussions with Chicago’s business community last summer and fall, the ruling was amended, reducing the tax to 5.25 percent and exempting companies that are less than five years old and that have less than $25 million in revenue. John Pletz reports on the backlash against the tax in a January 12, 2016, article on Crain’s Chicago Business.

Tech companies and their customers in Chicago, like their counterparts in other cities that have applied local taxes to cloud services, are left scratching their heads. They struggle to understand how city officials can profess to support growth in the local tech sector while at the same time taxing away revenue those companies could otherwise invest back into their communities through salaries and other spending.

Vagueness brings fair implementation of the cloud tax into question

When a Chicago business purchases cloud services, or provides cloud services to its customers, the tax must be collected. What’s uncertain is how the tax will be levied when the company is located in Chicago but its computer facilities are situated elsewhere. Likewise, what happens when the service is accessed via a mobile device or via a remote connection to a Chicago office? Another uncertainty is the resale of cloud services, which could result in double taxation: once by the provider, and again by the consumer.

Amazon recently clarified its approach to the tax in a letter sent to its customers in Chicago. Rather than attempting to collect the 5.25-percent tax directly from customers, the company will pay the tax itself based on its calculation of what the customers owe to the city.

The Daily Signal’s Jason Snead reports in a September 18, 2015, article that six Chicago residents have filed suit to block the taxes on cloud services and streaming content. The residents claim the city exceeded its authority when it enacted the taxes. The plaintiffs also assert that the taxes violate the Internet Tax Freedom Act of 1998 by discriminating against products and services acquired online, compared to their counterparts purchased via a non-Internet source.

For many consumers and providers of cloud and streaming services, the bigger issue with local taxes is one of public policy. How can a city present itself as business-friendly when it handicaps companies in their attempts to compete with counterparts located in areas that aren’t subject to such taxes? Also, imagine the administrative headache for a company such as Amazon that has to account for the taxes levied by dozens or even hundreds of separate municipalities. The onus of calculating, collecting, and accounting for the taxes is another penalty applied only to a specific class of businesses.

The decline in hardware and software purchases and leases has reduced the amount of tax revenue collected by cities. It’s natural that municipalities would look for ways to recoup that lost revenue. Several analysts point out that there are many more-practical ways to abide by the spirit of the tax policy while also being fair to the businesses that provide and consume these increasingly vital services.


With cloud taxation becoming a growing topic of interest, businesses are taking more precautions to ensure they are approaching the cloud in the best possible way. Morpheus helps make cloud management simple and easy for IT and dev professionals. To see how Morpheus is helping customers manage their cloud, click here or watch the video below.

Why CAMP Technology Will Solve All Your Cloud Problems


Managing systems in the cloud can be a complex process. Think about it: you have numerous administrative tools to master, servers to maintain, apps to keep running, and scaling that needs to happen — it's a lot to manage, even with a team of pros (and we all know you're probably short-staffed as it is).

The good news? The process doesn't need to be as complicated as we often make it out to be. Using the right Cloud Application Management for Platforms (CAMP) tool can make your job much easier and save you a lot of time and money as well.

Potential Cloud Issues with Growth

How cloud computing matures. Source: The Data Center Journal.

As the chart above shows, as your organization's use of the cloud matures, the growth into an application-centric and then a business-centric cloud will require more and more administration, and therefore more and more time from you and your IT staff.

Things can go astray pretty quickly without some way of simplifying all of the necessary tasks and maintenance items to keep everything running. There are countless problems you could run into. Let's take just one example.

1. You, as an IT leader, are swamped trying to do something for just about every department in the organization.

2. Dev needs something spun up but can't wait, so they spin it up on their own.

3. You catch wind of the Shadow IT problem and have to spend even more time tracking down all the errant systems.

4. Another request comes into IT, which seems to fall further behind with every passing minute.

5. Repeat.

Sound familiar? We thought so. Shadow IT is becoming one of the most common challenges IT leaders face in today's fast-paced environment, and it's just one of many. So how do you solve them all?

Save Time with CAMP

A Cloud Application Management for Platforms (CAMP) solution can help you organize and streamline your cloud management, making it far easier for you and your staff to meet the increasing demands that come with growth.

Cue the shameless plug...

We started Morpheus to help bridge the gap between IT and Dev and make the business side of cloud management simple. 

Morpheus' User Interface. Source: Morpheus.

With Morpheus, you can provision apps and databases in real time to public, private, and hybrid clouds and spin up databases, apps, environments, and more with a few simple clicks. Additionally, database scaling is as simple as adding nodes when needed. With IT able to set up environments quickly, dev can begin working on new projects fast. This streamlining helps reduce Shadow IT, since end users no longer have to wait nearly as long for a new system.

For the business side, Morpheus provides automatic logging, monitoring, and backups of your systems. You can set the backup time and frequency to ensure you get the backups you need, when you need them. Morpheus takes care of infrastructure, setup, scaling, and more, and sends you alerts when your attention is needed. What does that mean? It means you get part of your life back. Imagine having more freedom to take care of the slew of tasks that have been heaped on your shoulders.

We think it's an awesome tool that'll solve your cloud woes but we want to know what you think. Click here for a free trial and let us know your thoughts. 
