Building an Agnostic Cloud Infrastructure

Over the last two years I have often been asked: What is the best way to move out of Amazon Web Services (AWS) or other public cloud services? There are many reasons for making such a move, but cost, stability, and planned growth are the primary factors.

Many companies that surpass $50,000 a month in AWS costs begin to consider what should be pulled out of the cloud or moved to another cloud provider. They often can't make these moves quickly because they have optimized for AWS services, with automated scripts for spinning spot instances up and down, or with deployment tools and development processes built around cloud computing in general and AWS in particular. Furthermore, few companies have clients paying them to improve infrastructure, which typically puts infrastructure changes lower in priority than shipping new features or fixing bugs. Thus, the simplest way to keep such a move possible is to avoid getting locked into vendor-specific features in the first place.

Moving cloud-based infrastructures to hybrid computing or on-premise computing is a well-covered subject. I figure that with so many companies starting up and growing rapidly, it would be more helpful for our fellow start-ups to talk about how to deploy an agnostic infrastructure that is capable of being run in a hybrid environment from day one. Why wait until you are stuck on one provider and do not have the time or resources to move? It is far better to plan and build for hybrid/multiple cloud vendors from day one.

Moz, where I previously worked, successfully moved a large part of its computing from the cloud to on-premise. It was, however, a large challenge due to prevalent AWS-specific tools and processes. To be more nimble, developers launching new products and implementing infrastructure today should think of agnostic computing from the beginning.

At iSpot.tv, where I was a board member before becoming chief of engineering and product earlier this year, software engineers led by Abe Lettelleir began laying the groundwork in late 2012 for being computing agnostic on a shoestring budget. Early in iSpot's genesis I was able to pass on a few nuggets from my hands-on experience at Moz, and I will highlight some of the decisions we've made at iSpot.tv along these lines.

On a given day, iSpot can ingest hundreds of thousands of videos, social interactions, and metadata associated with TV ads to deliver real-time analytics to our customers. We must filter through millions of second-screen and social interactions along with thousands of television commercials, network show promos, and movie trailers.

We knew that we would need to take advantage of AWS regions and our own servers to ensure we had multiple-location redundancy from day one. This architecture meant that if any portion of our processing pipeline failed, we would avoid a site-wide catastrophic failure, and we would not be left with a large backlog of data to process in a short period after the downtime.

We did our research and determined it wasn't necessary to spend top dollar to have a hybrid solution. Our development and test environments are built on refurbished hardware from The Server Store. We settled on Dell servers and second-hand racks. We estimated that about 20 percent of the servers we purchased would fail within a few months and planned accordingly. That was no big deal: we wouldn't lose data because our data was multi-homed, and our teams wouldn't be impacted by any single failure. Using this strategy we also saved about 60 to 75 percent off the list price of new servers. Bonus! As another bonus, the Dell servers have been much more reliable than our initial estimates. The failure rate of our refurbished Dell servers is closer to 3 to 7 percent.
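To see why a 20 percent failure estimate still left room for savings, here is a rough back-of-the-envelope sketch in Python. It uses only the percentages quoted above. The function names are illustrative, and the model makes a simplifying assumption that is not from our actual books: a failed server is written off entirely, with no repair or resale value.

```python
import math


def effective_cost_fraction(discount: float, failure_rate: float) -> float:
    """Cost per surviving server, as a fraction of the new list price.

    Assumption (illustrative only): failed servers are a total loss.
    """
    if not (0 <= discount < 1 and 0 <= failure_rate < 1):
        raise ValueError("discount and failure_rate must be in [0, 1)")
    return (1 - discount) / (1 - failure_rate)


def servers_to_buy(needed: int, failure_rate: float) -> int:
    """How many servers to purchase so that `needed` are expected to survive."""
    if needed < 0 or not (0 <= failure_rate < 1):
        raise ValueError("needed must be >= 0 and failure_rate in [0, 1)")
    return math.ceil(needed / (1 - failure_rate))


if __name__ == "__main__":
    # Planning case from the text: 60% off list, 20% expected failures.
    print(f"planning case: {effective_cost_fraction(0.60, 0.20):.2f} of list price")
    # Observed best case from the text: 75% off list, 3% failures.
    print(f"best case:     {effective_cost_fraction(0.75, 0.03):.2f} of list price")
    # Over-provisioning needed to end up with 10 working servers at 20% failures.
    print(f"buy {servers_to_buy(10, 0.20)} servers to keep 10 running")
```

Under these assumptions, even the planning case works out to about half of list price per working server (0.40 / 0.80), which is why we could treat the expected failures as a planning detail rather than a risk.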

Simple enough from a hardware perspective. However, we also wanted to ensure the developers didn't have to use different processes to deploy, test, and run code on our on-premise servers as compared to AWS. Our senior systems engineer, Tao Craig, made our internal infrastructure mimic AWS using OpenNebula. Thanks to Tao and OpenNebula, the two environments look identical to the developers.

This low-cost, agnostic architecture allows us to deploy, move, and run our software pretty much anywhere we need it to be.

Our next challenge, which we'll be tackling later this year, is to broaden our cloud computing from AWS-only to other cloud providers such as Redapt and Google Compute Engine. Our generalized tools and procedures should allow us to do this without too much ado.

Moving out of the cloud can be as simple or as hard as you make it. It is key to keep your compute and storage generic as that will make it a lot easier for you to move from one cloud provider to another, or to on-premise. Sure, you might spend a little more time upfront, but when you need to scale and determine what you need to leave in the cloud and what you need to move to your own on-site infrastructure, you can do so in an expeditious manner—and before you are hit with a hefty cloud bill.
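As a minimal sketch of what "keeping storage generic" can look like in code, the Python example below has application code depend on a small interface instead of on one vendor's SDK. Everything here is hypothetical and for illustration only: BlobStore, InMemoryStore, LocalDiskStore, and migrate are made-up names, not iSpot.tv's actual tooling. A cloud-backed store would be one more class implementing the same two methods.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Protocol


class BlobStore(Protocol):
    """The only storage surface application code is allowed to touch."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend, handy for tests or as a placeholder for a cloud store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class LocalDiskStore:
    """On-premise style backend that writes blobs under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def put(self, key: str, data: bytes) -> None:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def migrate(keys: Iterable[str], source: BlobStore, target: BlobStore) -> int:
    """Copy blobs between any two backends; returns the number copied."""
    copied = 0
    for key in keys:
        target.put(key, source.get(key))
        copied += 1
    return copied


if __name__ == "__main__":
    cloud_like = InMemoryStore()
    cloud_like.put("ads/spot-1.json", b'{"brand": "example"}')
    with tempfile.TemporaryDirectory() as tmp:
        on_prem = LocalDiskStore(Path(tmp))
        print(migrate(["ads/spot-1.json"], cloud_like, on_prem), "blob(s) moved")
```

Because migrate only knows about the interface, moving data from one provider to another, or to your own racks, becomes a matter of adding a backend class rather than rewriting application code.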

Original Post: http://www.xconomy.com/boston/2015/05/20/building-an-agnostic-cloud-infrastructure/