
Blog - StackGeek

A Code of Trust

Thu, 08 Aug 2013

After Lew Moorman recruited me to advise Rackspace, I wanted to better understand Rackspace's intent and how they envisioned their industry panning out over the coming years. One of my first questions to Lew was "Why did you Open Source OpenStack?". Lew replied, “We did it to ensure a robust ecosystem, drive long term innovation and to help commoditize the core operating system.”

I’ve been thinking about what he said ever since, and it appears everything Rackspace wants to achieve with its Open Source strategy is actually based on trust. All good ecosystems rely on trust, whether that’s the school you trust with your child's education, the bank you trust with your cash, or the technology bits you trust with your data. Without trust, no ecosystem remains stable. Trust also empowers positive change and innovation. People who want their ideas to have impact must entrust other people to amplify, execute and improve on what they’ve created. That’s why the most powerful innovators make ideas readily and equally available to markets, governments and society.

Trust is also a necessary requirement of any commodity. When you buy gold, pig iron, pork bellies or crude oil, you’ve got to trust that you’re getting what your contract says you’re going to get. You’ve got to trust that when you say “gold” and I say “gold” we’re talking about the same thing. No more, no less.

To that end, an Open Sourced OpenStack gives us methods for trusting the underlying infrastructure that runs our code, stores our data and makes our technology startup innovation process successful. This trust in infrastructure comes about when you have access to, and are able to understand, all the hardware, software and services your company uses day to day. While many of us won't choose to look behind that curtain of transparency for every component we use to build our business, a few of us can and will.

At no other moment in time has trust been at such a premium. Recent revelations show the NSA has its fingers in every aspect of technology in our lives, including our phones, our operating systems, our clouds, and even the processor chips in our computers. All of these current events serve to highlight how precarious and precious the establishment of trust has become in the communications and interactions occurring between all of us.

It's no surprise that most communication occurring today on the Internet takes place via software APIs. These APIs are critical to establishing trust in the way we interact with each other.

APIs Define Interactions

APIs exist to provide programmers an easy way to 'wrap' a set of function calls which reside inside a larger set of separate files. You can imagine these function wrappers as envelopes, each with a label on the outside summarizing what is inside. Inside each envelope are more envelopes with labels and/or letters, any of which you may choose to open and read, provided you have the ability to open the envelopes and view the underlying code.

Let's take a look at a simple two file example of an API. The first file contains code that wraps a fictitious Open Source API called MiCloud:

# clusterbuster.py - a simple "API" for snakeoi.ly site
# THIS SOFTWARE IS PROVIDED BY BUSTER KNUTS ''AS IS'' ETC. ETC.
# 
# import the multi-tenant infrastructure cloud library
import MiCloud

def build_cluster(num_servers):
  cluster = MiCloud.create_cluster("Bob")
  for x in range(num_servers):
    node = MiCloud.grab_available_node()
    instance = node.start_instance()
    ip = MiCloud.get_ip()
    instance.assign_ip(ip)
    cluster.add_instance_to_cluster(instance)
    print "instance %s started" % x
  print "cluster built"

The code here provides a simple loop to iterate over several MiCloud calls. Our programmer can now write a small amount of code in a second file to start ten servers:

import clusterbuster

# start the snakeoi.ly cluster
clusterbuster.build_cluster(10)

This second code snippet embodies the essence of an API: It makes it easy to do powerful things with a few lines of code. It also enables the underlying code to change the way it does things 'under the hood' without our programmer having to know or do anything different.
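
To make that last point concrete, here's a hypothetical second version of clusterbuster.py. The internals are rearranged (all the instances now start before any IPs get assigned), but the function signature stays the same, so the two-line file above keeps working untouched. MiCloud is still our fictitious library, so treat this as a sketch:

# clusterbuster.py (v2) - same API, reworked internals
import MiCloud

def build_cluster(num_servers):
  cluster = MiCloud.create_cluster("Bob")
  # the change 'under the hood': start all the instances first...
  instances = []
  for x in range(num_servers):
    node = MiCloud.grab_available_node()
    instances.append(node.start_instance())
  # ...then assign IPs and register everything with the cluster
  for x, instance in enumerate(instances):
    instance.assign_ip(MiCloud.get_ip())
    cluster.add_instance_to_cluster(instance)
    print "instance %s started" % x
  print "cluster built"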

There are a few crackpots who think API parity between Open Source software and closed/proprietary services like AWS is the important issue. Contrary to these individuals' claims, API trust is actually a far more important topic for consideration.

APIs Define Levels of Trust

Let's consider a scenario where another programmer is using a 'cloudy' API similar to the one above, but isn't able to see the first file because a) the code is closed source and b) it's being hosted on a service provider's infrastructure. In this example, the first file (again, hidden from public view) has been modified slightly by a large corporation to include code which checks how much a customer paid the service provider last month and, if it was below a certain amount, starts their instance on an older part of the cloud provider's infrastructure:

  # hidden from the customer: big spenders (or a lucky coin flip)
  # land on new hardware, while everyone else gets the old gear
  user_value = customer.last_month_invoice_amount()
  if user_value > 50000 or random.random() > 0.5:
    node = MiCloud.grab_available_node()
  else:
    node = MiCloud.grab_available_node_on_crappy_old_greasy_servers()

This decidedly evil example could present itself in the real world via proprietary software libraries or Web APIs, both of which can run non-Open Sourced code on non-transparent infrastructure.

If you think you can assume to trust proprietary APIs and closed source code created by companies whose primary purpose is increasing revenue, think again. Bruce Schneier's post from 2007 revealed a possible backdoor in the algorithm used by Microsoft's CryptoAPI. Fast forward to today, and Microsoft is being accused of giving the NSA a backdoor to Outlook and Skype. They claim they had no choice.

We all want to trust companies to do the right thing with our data. We want to trust that Microsoft writes completely secure code we can run, and we want to trust that Amazon gives us fair, safe and secure computing with the AWS APIs. However, there’s no basis for that trust until you can see the source code they are running or fully understand how they build and run the services they provide. This is especially true when companies are being forced into compromising situations by certain governments.

SaaS Services Benefit from Implied Trust

When Raffy Marty and I founded Loggly, I quickly realized how important our users' log data could be. Shortly after launching Loggly, we received a request from a customer to delete their account and all the data they had sent us. It turned out one of their developers had left debug statements in production code, which ended up forwarding all of their users' unencrypted usernames and passwords to their Loggly account! Whoops. We both recalled stories of similar requests from Splunk customers struggling to purge data from their installs - credit card numbers! Double whoops.

Incidents like these illustrate a broader point which I completely missed at the time, but Raffy was quick to point out: our customers' customers had no idea they also needed to trust Loggly with their data. Our customers assumed they could trust Loggly to do the right thing with that data because they were doing business with us, regardless of our intent or coding abilities. Further, the implied trust chain required all those people to trust Amazon as well, because Loggly used the EC2 APIs to start and run our instances.

The fact is, if I use a proprietary service provider because I don't want to run the services myself, then there really is no way to know for certain they will act responsibly with my data. I must implicitly trust them to do the right thing, in all cases. Unfortunately, you can't really trust a service built on closed technologies, because you can't see inside the service. The combination of desired outcomes (easy infrastructure) and risk bias (implied trust) is a dangerous one because it leads to cognitive dissonance - literally believing two things at once: I have to TRUST this service because I NEED this service.

I believe, in order to achieve real trust, we have to open everything. And by everything I mean EVERYTHING on the Internet in between my brain and yours. Working together to build that trust enables better outcomes for customers, ensures there’s a sustainable innovation ecosystem and makes technology progressively more accessible to a wider and wider community.

A trust-based initiative is worth fighting for, and one I’m focused on building in the coming years. I trust you will join me! :)

Handling Subdomain Routes with AppEngine

Tue, 19 Feb 2013

I've been using GAE-Boilerplate for several projects, including the code here on StackGeek. GAEB uses webapp2 as its application framework. Recently, while trying to solve a security concern with utter.io, I went in search of a way to route subdomains to different handlers.

Subdomain Handling

Webapp2 provides a route class called DomainRoute which takes a domain or subdomain pattern as an argument and uses it to match the set of routes you pass to it. Here's a snippet of that in action:

from webapp2 import Route
from webapp2_extras.routes import DomainRoute

# (assumes GAEB's config module is already imported)
routes = [
    # handle the specific subdomain/hostname
    DomainRoute(config.subdomain_hostname, [
        Route('/', handler='handlers.PageHandler:subdomain', name='pages-subdomain'),
    ]),

    # handle all other hostnames and domains
    Route('/', handler='handlers.PageHandler:root', name='pages-root'),
]
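
The route strings use webapp2's custom-method dispatch, so both routes can share a single handler class. Here's a minimal sketch of what that handler might look like (the real StackGeek handlers do more; the method names are just the ones referenced in the routes above):

import webapp2

class PageHandler(webapp2.RequestHandler):
    def root(self):
        # served for the main domain
        self.response.write('main site')

    def subdomain(self):
        # served for requests matching config.subdomain_hostname
        self.response.write('subdomain site')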

Try It Out

You can clone my example repo and add it to AppEngine Launcher to test the code. Check out the code locally by doing the following:

git clone git://github.com/kordless/webapp2-starter.git

You'll also need to modify your local hosts file for testing. Add the following to the bottom of your /etc/hosts file:

127.0.0.1 sublocalhost

Add the new project by going to File > Add Existing Application and browsing to the directory where you checked it out. Be sure to set the port to 8282 if you want to use the URLs below.

Click on Add, then click the Run button to start the app. You should now be able to hit both http://localhost:8282/ and http://sublocalhost:8282/.

Notice you get different results for the two pages.

Production Configuration

For this to work in production on AppEngine, you'll need to add the full subdomain+domain to Google Apps. In the screenshot below, I've added a subdomain oi.utter.io to the utter.io domain I configured the first time through domain setup for my AppEngine project.

Now that I'm in production, my config file looks like this:

# config file
import os

if os.environ['SERVER_SOFTWARE'].startswith('Dev'):
    subdomain_hostname = 'sublocalhost'
else:
    subdomain_hostname = 'oi.utter.io'

With this technique I've been able to keep the code in a single repository, deployed to a single AppEngine project, yet serve two distinct subdomains. Very handy!

NetAppNFSDriver in Folsom

Sun, 17 Feb 2013

This article describes configuring a NetApp storage device for use with OpenStack's Cinder service. It uses the NFS protocol via an NFS driver from NetApp, which allows for storing Cinder volumes and snapshots directly on a NetApp storage unit.

There is very little documentation available describing this process. This configuration was put together by studying the source code for the driver!

For this guide, OpenStack Folsom was installed on Ubuntu Server 12.04 using the Ubuntu Cloud Archive Repositories from http://ubuntu-cloud.archive.canonical.com/ubuntu.

The OpenStack configuration files are complex, and the configuration process for adding a NetApp box has a few caveats which are described here.

Configuration Files

All configuration is done in the /etc/cinder directory.

/etc/cinder/cinder.conf

These are the options which need to be added to cinder.conf for the NetAppNFS driver to work correctly:

# Make sure that you don't use nova.volume.netapp_nfs.NetAppNFSDriver
volume_driver=cinder.volume.netapp_nfs.NetAppNFSDriver

# Where the file with shares is located
nfs_shares_config=/etc/cinder/shares.conf

# Where to mount volumes
nfs_mount_point_base=/mnt/cinder-volumes

# The driver sends commands to create clones and snapshots via DFM,
# so we need to configure it
netapp_wsdl_url=http://172.21.1.22/dfm.wsdl

netapp_login=dfmlogin
netapp_password=dfmpassword

netapp_server_hostname=172.21.1.21

# I'm not sure whether it is necessary to define
# netapp_storage_service
netapp_storage_service=Test-Cloud

/etc/cinder/shares.conf

This file lists the NetApp volume/qtree paths on the filer which will be mounted on the control node and used for Cinder volume creation. Add one path per line in the following format:

filername:/vol/CINDER_VOLUMES

Also, there are two things to consider when editing:

  • Make sure there are no empty lines in the file; Cinder is dumb and will try to mount the empty path, which ends in an error.
  • Use the hostnames of the filers instead of IP addresses. These hostnames have to match the hostnames of the filers in DFM (OnCommand).

Double-check that you're using the correct hostnames for the filers if you get a snapshot creation failure like this one:

2012-12-12 13:21:03 16643 TRACE cinder.openstack.common.rpc.amqp WebFault: Server raised fault: '(22255:EOBJECTNOTFOUND) There is no host, aggregate, volume, qtree, resource group, resource pool, or dataset named 192.168.0.2.'

/etc/cinder/rootwrap.d/volume.filters

Append the following lines to the end of this file:

stat: CommandFilter, /usr/bin/stat, root
mount: CommandFilter, /bin/mount, root
df: CommandFilter, /bin/df, root
truncate: CommandFilter, /usr/bin/truncate, root
chmod: CommandFilter, /bin/chmod, root
rm: CommandFilter, /bin/rm, root

/etc/cinder/api-paste.ini

The [filter:authtoken] section has to be configured as described in the Cinder installation guide.

After all that configuration you can restart cinder services:

$ sudo service cinder-volume restart 
$ sudo service cinder-api restart 
$ sudo service cinder-scheduler restart

Now you can try creating a volume from the CLI:

$ cinder create --display_name test 1

or via the Dashboard/Horizon.

Bugs Encountered

I patched the file nova/virt/libvirt/driver.py and added the NfsDriver to the list of drivers. There is more info on that process here.

Blueprint of NetappNFSDriver

The blueprints for the driver are here: https://blueprints.launchpad.net/cinder/+spec/netapp-nfs-cinder-driver

OpenStack's Board Breakdown

Wed, 30 Jan 2013

Nick Heath of ZDNet has an article out today titled 'OpenStack dominated by big vendors' interests? "No way," says co-founder', in which he attempts to refute Lydia Leong's opinion on the matter and Chris Kemp claims large companies don't dominate the OpenStack board.

Chris is quoted as saying 'Only one-third of the voting influence is in the hands of big corporations'. However, I think it's crystal clear to anyone involved in the OpenStack community that big corporations are heavily involved.

Let's dig in to the data though, shall we?

Data doesn't lie.

OpenStack Board Structure

OpenStack's Bylaws state the board is comprised of 24 total seats, elected evenly (8 seats each) by 3 classes of members: Individual, Platinum and Gold Members. Note that Individual Members are allowed to be employed by Platinum or Gold Members.

The Foundation's site also gives a breakdown of the Platinum and Gold Members. AT&T, Canonical, HP, IBM, Nebula (founded by Chris Kemp himself), Rackspace, Red Hat and SUSE make up the 8 Platinum Members. Cisco, Dell, Intel, NetApp, VMware, Yahoo, NEC and a few others make up the 14 Gold Members.

Over 50% of the Platinum and Gold members are publicly traded companies, with a combined market capitalization north of half a TRILLION dollars.

OpenStack Board Breakdown

If we look at the board, 13 of the 24 (54%) board members work for publicly traded companies, including AT&T, Cisco, Dell, HP, IBM, Rackspace, Red Hat, Sina, and Yahoo.

Of the remaining companies, over 50% are focused on making money off OpenStack or virtualization technologies. Those include Canonical, Cloudscaling, DreamHost, Mirantis, Nebula, Piston Computing, and SUSE. All of these companies are extremely well funded, and some of them are big companies in their own right. SUSE, for example, is wholly owned by a private company with over 4K employees and over $1B in revenue. Is it fair to say they are a 'big' company? I'm going to say, "yes it is"!

OpenStack is Influenced by Big Vendors' Interests

Chris argues that "arguably the individual and smaller interests cancel out the large corporate interests", but in reality the large corporations completely dominate the board. I'm not even sure who qualifies as a 'smaller interest' here. Perhaps he's making some vague reference to Nebula, Piston and Cloudscaling's seats as the smaller guys in the market.

Regardless, 3 or 4 seats out of 24 is hardly decent representation for smaller and individual interests. It's pretty clear from the board makeup that big interests control the board.

I don't always agree with what Lydia writes, but in this case I'm going to have to take her side and call BS on Chris' strawman arguments. It's clear OpenStack's fate is in the hands of big corporations.

Thankfully there's always that fork button on Github to save us all from their evil ways. In fact, fork all you big corporations!

A Banging Deal on HP Cloud's X-Small Instances

Tue, 22 Jan 2013

For years now I've imagined battles between HP printer driver developers and myself. In my most grand envisioned scenarios, I sneak up on their filthy hordes in the middle of a product crisis (requiring some type of printing and/or scanning) and rip their laptop's display from its hinges in a single mighty cleaving blow.

Printer Beatdown

Wires, bits of plastic, LCD fluid and small screws fly as I pound the display into their printer/scanner, all the while screeching, "Have you #_&%ers ever considered doing usability studies on your software?".

Signing Up for HP's Cloud

Signing up for HP's cloud is reminiscent of installing HP printer drivers of yesteryear. In these days of multi-step forms designed to ease the pain of signup, HP chooses to hit you with a single massive form which collects seventeen different bits of information about you. Once you get past the initial signup, you'll be locked into a form asking for your payment information.

Pay Up!

To make matters worse, be prepared to scratch your head over HP's pricing.

Don't Get Confused with HP's Pricing

Back in December I read a post by Beth Pariseau referencing the respective pricing for Rackspace, AWS and HP's small instances. With some feedback from myself and Rackspace she later updated the post to show HP had the best price of the three providers for small instances.

Unfortunately, HP's pricing page today indicates we were only half-correct about their pricing model. Starting March 31, it appears HP plans on raising the price of their small instances to $0.07 an hour, or 1/2 a cent more an hour than Amazon. That difference works out to a whopping $3.60 more a month than AWS ($0.065/hr), and a staggering $7.20 more a month than Rackspace ($0.06/hr). Yes, I'm being dramatic. Remember those printer drivers?
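
If you want to check my math, the monthly deltas are easy to reproduce (assuming a 720-hour month):

# monthly cost deltas for small instances, assuming a 720-hour month
hours = 720
hp, aws, rackspace = 0.07, 0.065, 0.06
print "HP vs. AWS: $%.2f/month" % ((hp - aws) * hours)              # $3.60
print "HP vs. Rackspace: $%.2f/month" % ((hp - rackspace) * hours)  # $7.20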

Thankfully, there's a silver lining in HP's cloud offering.

Enter HP's X-Small Instances

After pulling my hair out trying to figure out HP's confusing UI, I stumbled across the fact that they provide X-Small instances, which are set to be priced at $0.04 an hour after 3/31. Digging further, their X-Small instances appear to be essentially what AWS calls a small instance (minus 0.7GB of RAM and 100GB of storage), and what Rackspace calls a 1GB instance.

It was at this point I realized a blog post detailing the differences in instance sizes and pricing might be useful. That was right before I was shocked at how fast HP's X-Small instances are.

HP's X-Small Instances Are Crazy FAST

In my tests, HP's X-Small instances finish booting and become available for ssh in about 20-30 seconds. Amazon's small and micro instances took ~1-2 minutes to fully boot and become available for ssh. I didn't test Rackspace, but I did fire up a few instances on my tiny OpenStack cluster, and those booted and became available for ssh in about the same time as Amazon's instances.
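
Those boot times were eyeballed rather than rigorously measured. If you want to reproduce the test yourself, a sketch like this one (polling port 22 until sshd answers) is roughly what I mean:

import socket
import time

def seconds_until_ssh(host, timeout=300):
    # poll port 22 until sshd accepts a connection, then report elapsed time
    start = time.time()
    while time.time() - start < timeout:
        try:
            socket.create_connection((host, 22), 2).close()
            return time.time() - start
        except socket.error:
            time.sleep(1)
    return None

print seconds_until_ssh('10.0.1.226')  # swap in your instance's IP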

Bandwidth in and out of HP's instances appears to be around 3MB/s.

I downloaded Primate Labs' Geekbench onto the instances and ran it in 32-bit mode. All the kernels I tested were 64-bit. I don't claim to know much about benchmarking, so take the following for what it's worth.

An X-Small instance on HP's cloud yielded a Geekbench score of 4042. A Small instance on AWS yielded a Geekbench score of 1691. I've included a side-by-side comparison of both providers' small instances, as well as a comparison between HP's XX-Large instance and their X-Small instance.

It's worth noting the single-core tests for both HP instances are nearly identical, as are the memory throughput tests. It's likely they are using the same type of hardware for all their instances, with the minimum instance getting a single core from these machines. Consider this if your application/framework isn't multi-core capable.

In conclusion, HP's X-Small cloud instance would appear to give you the biggest bang for your buck when pricing all the top cloud providers. While I'm not especially in love with HP's signup flow, or their CRAPPY cloud UI, I have to admit their pricing (even after 3/31) looks pretty good for light dev/staging work.

Now, back to my ninja themed daydreams!

Update (1/23/13): I've completed a GeekBench comparison of HP Cloud's X-Small instance and a 1GB instance from Rackspace. Rackspace's 1GB instance weighed in at 2337.

Website Source on Github

Mon, 19 Nov 2012

The source for the StackGeek website has now been uploaded to Github. Fork the project if you are interested in deploying your own multi-user Github content based blogging site on AppEngine.

It's worth mentioning that with the site I'm trying to introduce a new pattern for technical blogging and sharing of articles. The site is designed from the ground up to be 100% Open Source and at the same time give authors complete control over their content. Everyone should be aware all content authored on StackGeek is, by default, licensed under the share-alike Creative Commons license. Remember, even under a CC license, authors always retain copyright on their content. StackGeek takes no ownership rights in content created here on the site.

Google AppEngine Boilerplate

This site runs on Google AppEngine and was written using the Google AppEngine Boilerplate project on Github. The project is looking for developers who have experience with Python, GAE, JavaScript, webapp2, Jinja2 HTML templating, and of course, CSS and design. Join us today if you are interested!

Using the Code

Getting the code going locally isn't hard. Check the site out from Github, fire up the Google AppEngine Launcher, and add an existing project pointing to the code you checked out. Edit the config.py file and stick in your own keys/hashes/salts/etc. where needed.

AppEngine Launcher Add Existing Project

Fire up the project by clicking Run in Launcher, then click Browse. Create an account by going to sign up, and ignore the part where it emails you. You'll be able to enter a user/pass for the first account and be automatically logged in.

If you have any questions, comments, feature requests, bugs you've spotted, or whatever, just open a ticket!

Moving to AppEngine

Mon, 12 Nov 2012

I started StackGeek so I could share my explorations with the OpenStack project. In May I authored a guide for OpenStack titled Installing OpenStack in 10 Minutes, which has been getting a good amount of traffic since. That guide is now in the top 10 results on Google when you search for 'openstack install', and the site itself gets around 120 uniques a day, with over 10K video views so far.

Holy Guacamole, it's OpenStack

I want to help more people find solutions to their infrastructure questions and bring those people together in a community where we can share what we know with each other. Given enough content and a good community behind it, I think StackGeek could become a fantastic resource for infrastructure architects.

I also wanted it to be 100% Open Source, including article content and source code for the site.

Moving to AppEngine

The new StackGeek site is built in Python to run on AppEngine and uses the GAE-Boilerplate project hosted on Github. You can follow the project on Twitter if you like. GAEB provides a social-enabled web service framework with signup, login, user profiles, simple account administration, and more. The project could use more contributors if you are interested in joining us. I'm currently working on adding a blogging framework to the project (which is used on this site), and there are plans to add more modules in the future.

An Engine

Why Gist the Articles?

If I build a site and then expect people to contribute to it, I expect the contributors to a) want credit for their contributions and b) be able to use their contributions on other sites, including perhaps their own blog running GAEB. I figured the easiest way to manage and spread around content was the same way code does it - Open Source that shizzle!

Syncing up with Github gists makes things a bit complicated code-wise, but I think the way it's currently working in the UI makes it pretty easy to use. If you have a feature you'd like to see added to the site, please head on over to the site's project on Github and put up a feature request!

Getting Started

To get started, create an account, then head on over to your settings page and add the Github and Twitter associations. Obviously you'll need both a Twitter and a Github account! Fill out your profile, including the Bio, Gravatar and Twitter Widget fields. Those will help populate your pages with content.

You can create an article one of two ways on the site. The first way is to go to the create a new article page and enter an article title and summary. You'll also need to select a post type. Posts go into your blog timeline and guides go on your guide page. Video posts don't go anywhere right now, but eventually they'll help populate the videos page on the site, and there will be a user video page as well.

The second way to create an article is to fork an existing user's article. If you have a good Github association on the site, you'll see a fork button next to the articles. Clicking on that button will fork a copy of that article into your account. The copy will be created as a draft.

Right now there's no way for you to 'submit' an article in your account to be seen via the public URLs. I'll have a tool to enable that in a few weeks! In the meantime, feel free to drop me a line and request publication.

Increase the Size of devstack's Volumes

Sat, 28 Apr 2012

I've been whacking around on OpenStack for about two months now. Devstack is awesome sauce and all, but its default 2GB cap on total cloud volume size has been driving me crazy. I keep needing bigger volumes, end up trying to create them, and then get the infinite creation spinner in the UI. The only way I can get the spinner to go away (while still not being able to make any new volumes) is to whack the volume from the nova database in MySQL. That's silly.

pancakes - mmmmm

Turns out there's a slight mention of this on the devstack page for a multi-node install. Let's borrow some of those instructions and put them on my blog so I can get massive page rank. Or not.

The fix is fairly trivial. Assuming you have a spare hard drive laying around, stuff it into your OpenStack node. If you don't have a spare drive, go buy two SSDs, install one, and mail me the other one. I could really use it.

Fix That Shizzle

The rest of this howto assumes you have a spare empty drive mounted at /dev/sdb. It's not my fault if you bork your box.

Start off by running fdisk from the prompt:

sudo fdisk /dev/sdb

Again, be sure you use the actual file handle for the drive!

  1. Now create a new partition by typing n, then p for a primary partition. Use the default start and end values.

  2. Set the partition type to Linux LVM by typing t then 8e. Hit w to write out the partition to the drive and exit.

  3. From the prompt again, run:

    pvcreate -ff /dev/sdb1
    vgcreate nova-volumes /dev/sdb1

The -ff is just there to force the creation of the physical volume. You can leave it off if you are scared.

That's it. The stack.sh script will take care of the rest! You should now be able to create very large volumes at will.

Comparison of Open Source Cloud Support for EC2

Sun, 08 Apr 2012

The Open Source private cloud social scene has been in a real brouhaha over the last few weeks. Media attention has focused on partnerships and software re-licensing. There's also been talk of misaligned hypervisor support and a couple of EC2 API compatibility debates.

It's time to take a look at exactly what matters when it comes to getting serious platform adoption - making small development teams' lives easier.

Why does EC2 API Support Matter?

The primary reason EC2/S3 API compatibility matters is the sheer amount of code which has been written against Amazon's APIs. If a developer has written a ton of code to talk to AWS, it's unlikely they'll want to spend the time or effort porting all their calls to a separate API. This is probably one reason there is such a clamor on the Internets for standardizing cloud APIs.

This is also evidenced in the fairly uniform EC2/S3 support provided by OpenStack, Eucalyptus and CloudStack. Providing support for EC2 calls makes it easier to extend existing infrastructure-building code into a private cloud offering.
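
That reuse is easy to picture with boto, the Python library much of that AWS code is written in: point it at a private cloud's EC2 endpoint and the existing calls keep working. The endpoint, port and path below are the ones a stock OpenStack install of this era exposes; adjust them for your own cloud:

import boto
from boto.ec2.regioninfo import RegionInfo

# same library, same calls - just a different endpoint
region = RegionInfo(name='nova', endpoint='10.0.1.20')
conn = boto.connect_ec2(aws_access_key_id='your-access-key',
                        aws_secret_access_key='your-secret-key',
                        is_secure=False, region=region,
                        port=8773, path='/services/Cloud')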

High Level Comparison for EC2 API Support

The OpenStack wiki has a fairly detailed API compatibility matrix showing support for the main calls to the EC2 APIs. Setting aside Eucalyptus' lack of documentation, OpenStack's EC2 support, Eucalyptus and CloudStack's CloudBridge all provide the basic features for launching servers:

Feature                                        | OpenStack | Eucalyptus | CloudStack
Register/Deregister Images                     | ✓         | ✓          | ✓
Launch/Terminate/Reboot Instances              | ✓         | ✓          | ✓
Create/Delete/Describe Keypairs                | ✓         | ✓          | ✓
Allocate/Associate/Release IP Address          | ✓         | ✓          | ✓
Create/Delete/Describe Snapshot                | ✓         | ✓          | ✓
Create/Delete/Attach Volume                    | ✓         | ✓          | ✓
Create/Delete/Desc/Auth/Revoke Security Group  | ✓         | ✓          | ✓

Each project allows you to create OS images, spin up instances, give them keypairs, assign them floating IPs, control their firewall rules, attach volumes, make snapshots, and shut the whole thing down when you are done.
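
Using the connection from the boto snippet above, that whole lifecycle looks something like this (the image ID and key name are placeholders):

# spin up an instance, give it an IP, then tear everything down
reservation = conn.run_instances('ami-00000001', key_name='default',
                                 security_groups=['default'])
instance = reservation.instances[0]

address = conn.allocate_address()
conn.associate_address(instance.id, address.public_ip)

# ... do something useful with the server here ...

conn.release_address(address.public_ip)
instance.terminate()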

tl;dr; Servers at a click!

So Then, What's Not Supported?

Not all EC2 API features are supported by the separate projects, and some aren't supported at all. In general, there are some small, subtle differences in EC2 API support between the projects. Here are the features in question; who supports what is covered below:

  • Paid AMIs
  • Spot Instances
  • Reserved Instances
  • Instance Bundling
  • Tagging/Filters
  • CloudWatch (monitoring)
  • Elastic Load Balancing
  • VPC IPSec
  • Create Image
  • EBS Backed Instances
  • Import Keypairs
  • Describe Region
  • Console Output

Here's what isn't supported: Paid AMIs (a tasty thought for revenue), spot instances, reserved instances, instance bundling (to AMI), monitoring, and tagging/filters are, by and large, not supported by anyone. CloudStack has a few filter calls, but they are minimally supported. Someone should really add tagging support.

CloudStack has the upper hand on the load balancing calls. It should be trivial to add load balancing calls to the other projects to support devices like an F5 or NetScaler box, though. Citrix, the authors of CloudStack, also owns NetScaler. They do happen to have an image you can launch which contains a load balancer based on HAProxy. I'm not sure if the code for that is publicly available or not. Comments?

CloudStack also supports EC2 API equivalent calls for creating IPSec VPCs. I've never done it, but I hear it's nice. Support for creating images is also provided.

Eucalyptus, for all the press they got with their Amazon press release, actually appears to support the fewest of these under-supported EC2 features. The only things they've got going for them here are the describe-region call, which CloudStack doesn't appear to support at all (according to their docs), and the console-output call. Region support in OpenStack is essentially the 'project' feature they provide. Nice.

CloudStack argues they support the console output call via the GUI, but according to their docs they don't support the EC2 API equivalent calls. I'm pretty sure those are used by RightScale.

OpenStack and CloudStack support starting and stopping EBS backed instances with the EC2 calls, although they aren't really stored on an Elastic Block Storage device. Still, it's cool to be able to pause an instance and then start it back up later. Eucalyptus should support this.

A Few Other Thoughts

In general, I think most of the calls developers would be using are covered by each of the three projects. The differences in method support seem minor in many ways, and anyone trying to make a big deal of it is probably just splitting hairs at this point. Stop being a tool.

I really like the idea of CloudStack's support for load balancing. I think it's oftentimes overlooked, and difficult to configure correctly. Not having used CloudStack's solution, I can't say for certain whether it works well or is easy to set up, but it sounds great! :)

Monitoring is an absolute must-have here. CloudStack provides some simple monitoring (I haven't managed to get it running yet, so I'll report back), but none of the projects do a very good job of it, nor support the EC2 APIs for it. I would imagine this is a feature most users would want. I wonder how many people actually use the CloudWatch calls. I hooked CloudWatch up myself to do simple monitoring a while back, but never ended up using it seriously.

If you have any factual corrections for my post, please feel free to comment below. Be sure to contribute to the OpenStack wiki if you see errors there as well!

In my next post I'll be covering the support for the S3 APIs by each project. All three appear to have support for it, but it's poorly documented on all fronts.

Taking OpenStack for a Spin

Tue, 21 Feb 2012

OpenStack is an Open Source cloud computing infrastructure platform originally authored by NASA and Rackspace. OpenStack provides services similar to Amazon's AWS infrastructure platform, except without selling your first born to pay for Amazon's hellaciously expensive instance time.

OpenStack's components are mostly written in Python and include Nova, a fabric controller similar to what EC2 provides; Swift, an S3-like storage system; and Glance, a service for managing virtual disk images. Other OpenStack projects include Keystone, an identity service, and Horizon, a Django-based UI framework which ties all these services together in a single webpage view.

As of this writing, the most current version of OpenStack is Essex. Essex is currently in release candidate phase, with RC1 available for download for each of the components listed above. The git repository for the components is available on Github.

I floundered around for a few days trying to download and run the various independent components of OpenStack. The instability of the early versions of Essex and the maze of configuration files you have to tweak made getting it running difficult, at least for me. Thankfully, a bit of Googling revealed a script written by the fine folks over at Rackspace called DevStack, which does all the heavy lifting for getting OpenStack running.

Note: Rackspace doesn't recommend using DevStack for production deployments, but it works great if you just want to give OpenStack a test drive and do some quick development on the instances you start.

Getting Started

I've put together a 14 minute step-by-step video guide for getting OpenStack installed on an Oneiric instance running on VMWare Fusion under OSX. The video also guides you through launching an instance and accessing it with a ssh terminal.

You can download Oneiric from Ubuntu's website. Get it running on a VM or a bare metal box and make sure you install the OpenSSH server so you can ssh into it.

Using VMWare Fusion

I'm running OpenStack on a dedicated box, but if you run OpenStack in a VM you'll need to ensure you have enough CPU/memory to start instances. You'll need at least 1.5GB to launch a tiny instance. Keep in mind that instance will be a bit slow, because it's a VM running on top of another VM.

Here's a screenshot of my VMWare Fusion config, set up with a bridged network and about 4GB of RAM with 2 cores assigned.

Checking Out Code

Now that you have a fresh install running, you'll need git installed on your new box. Go ahead and log in, do an apt-get update, and install git:

sudo apt-get update
sudo apt-get install git

Now let's check out the DevStack code from Github:

git clone https://github.com/openstack-dev/devstack.git

We need to do a couple of things to configuration files before we fire up DevStack, so let's talk about networking for a second.

Networking

Nova manages your instances' networking. This is similar to the way AWS assigns private and public IPs to instances on EC2. IPs are managed across the set of machines on which you run OpenStack, but you'll still need to configure your routing to make these IPs available to your existing network.

If you have a router that can handle static routes, you could simply map the Nova managed network to the interface on the hypervisor (the base host box) to be able to talk to the instances you launch. I have an Airport Extreme at my house and it doesn't have a way to do static routes, so I'm going to explain how to work around that limitation. We'll set it up so our Nova install will get its own network, but we'll use allocated IPs from the existing network so we can provision and map them to the instances we launch with Nova.

Here are the configs I used in my install. You'll want to put these in a new file called localrc in the devstack directory you just checked out above.

HOST_IP=10.0.1.20
FLAT_INTERFACE=eth0
FIXED_RANGE=10.0.2.0/24
FIXED_NETWORK_SIZE=256
FLOATING_RANGE=10.0.1.224/27

Change the HOST_IP to the address of your hypervisor box (the box on which you are running the devstack script). You can leave the FIXED_RANGE value alone, assuming you aren't using 10.0.2.0/24 on your network.

Change the FLOATING_RANGE to whatever IPs you run on your local network, but only use the top end of the network by using a /27 and starting at the 224 octet. Usually you could use an entire /24 (254 addresses from 10.0.1.1 through 10.0.1.254), but I'm purposely using the /27 so Nova won't start allocating IPs down low at 10.0.1.1, which is my router.

Speaking of routers, be sure to block out these upper IPs; my Airport has a max range setting that I set to 200 to prevent overlap with the addresses Nova will start allocating at 225.
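
If subnet math isn't your thing, Python's ipaddress module (standard library in Python 3, with a pip backport for Python 2) will confirm exactly what that /27 hands to Nova:

import ipaddress

floating = ipaddress.ip_network(u'10.0.1.224/27')
hosts = list(floating.hosts())
print(len(hosts))   # 30 usable addresses
print(hosts[0])     # 10.0.1.225 - the first floating IP Nova hands out
print(hosts[-1])    # 10.0.1.254 - the last one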

Add the Oneiric Image

By default, devstack only installs a small Cirros distro. I'm an Ubuntu guy, so I hacked up the stackrc file to include a cloud build of Oneiric, which will be downloaded when you run the devstack stack.sh script. Scroll down to the bottom of the stackrc file and replace the existing image lines with the following:

case "$LIBVIRT_TYPE" in
    lxc) # the cirros root disk in the uec tarball is empty, so it will not work for lxc
    IMAGE_URLS="http://cloud-images.ubuntu.com/releases/oneiric/release/ubuntu-11.10-server-cloudimg-amd64.tar.gz,http://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-rootfs.img.gz";;
    *)  # otherwise, use the uec style image (with kernel, ramdisk, disk)
    IMAGE_URLS="http://cloud-images.ubuntu.com/releases/oneiric/release/ubuntu-11.10-server-cloudimg-amd64.tar.gz,http://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-uec.tar.gz";;
esac

Start the Script

The stack.sh script inside the devstack directory will do the rest of the configuration for you. It'll download all the components needed to run OpenStack and stuff them in /opt/stack/. It also downloads the images above and uploads them into Glance so you will have an Ubuntu image you can launch. Start the install process by running the script:

cd devstack
./stack.sh

The script will prompt you for multiple passwords to use for the various components. I used the same pass/key for all the prompts, and suggest you do the same. After the script runs, it'll append the passwords to the localrc file you created earlier:

HOST_IP=10.0.1.20
FLAT_INTERFACE=eth0
FIXED_RANGE=10.0.2.0/24
FIXED_NETWORK_SIZE=256
FLOATING_RANGE=10.0.1.224/27
MYSQL_PASSWORD=f00bar
RABBIT_PASSWORD=f00bar
SERVICE_TOKEN=f00bar
SERVICE_PASSWORD=f00bar
ADMIN_PASSWORD=f00bar

Managing Using the Horizon UI

Once devstack has OpenStack running, you can connect to Horizon, the management UI. The stack.sh script will spit out the URL for you to access the UI:

Horizon is now available at http://10.0.1.x/

Plug this into your browser and then use admin and f00bar for your user/pass. If you made your passwords something else, then obviously use that password here instead of f00bar.

Create Keypairs/Floating IPs/Security Groups

  1. Click on the Project tab on the left side of the screen and then click on Access & Security.
  2. Click on the Allocate IP to Project button at the top and then the Allocate IP button at the bottom of the dialog that pops up. You should see a message that a new IP has been allocated and it should show up in the list of Floating IPs.
  3. Now click on the Create Keypair button under Keypairs. Create a key named default and accept the download of the private side of the key to save it on your local machine. You'll need this key later to log into the instance you'll start in a minute.
  4. Finally, click on the Edit Rules button on the default security group under Security Groups. Under Add Rule, add rules for pings, ssh and the default web port.

Launch an Instance

Click on the Images & Snapshots tab to view the images you have available. Click on the Launch button next to the flavor of OS you want to launch. In the dialog that pops up, enter the server name and select the default keypair you created earlier. You can also paste in a script in the User Data textbox, which will be run after the machine boots. Here's the one I used to figure out the login user:

#!/bin/bash
cat /etc/passwd

Click on the Launch Instance button at the bottom to launch your instance. You should be taken to the Instances & Volumes tab and shown the status of the instance.

Assign a Floating IP to the Instance

If you are running a network where you can't map static routes in your router, you'll need to assign one of the floating IPs from your network. Click on the Access & Security tab and then click Associate IP next to the IP address you want to assign to the instance.

Click Associate IP to finish assigning the IP to the new instance.

Using Your Instance

Remember the default.pem key that got downloaded earlier? You'll need to go find it and copy it somewhere you won't lose it. You'll also need to change the permissions on it before you use it to login to your new instance:

$ mv default.pem /Users/kord/.ssh/
$ cd /Users/kord/.ssh
$ chmod 400 default.pem
$ ssh -i default.pem ubuntu@10.0.1.226
Welcome to Ubuntu 11.10 (GNU/Linux 3.0.0-16-virtual x86_64)

  System information as of Thu Mar 22 22:20:13 UTC 2012

  System load:  0.0              Processes:           57
  Usage of /:   9.8% of 9.84GB   Users logged in:     0
  Memory usage: 20%              IP address for eth0: 10.0.2.2
  Swap usage:   0%

ubuntu@ubuntu:~$ uptime
22:20:32 up 3 days,  2:28,  1 user,  load average: 0.00, 0.01, 0.05

Once you are logged in, you can set up your own account and ssh keys to access the box. It took me a bit of floundering to figure out that the default user for the cloud version of Oneiric is ubuntu.

More Networking!

If you are running a cluster of servers, you'll need some type of reverse proxy to direct the traffic. OpenStack doesn't appear to provide this or load balancer features (yet), but there are a few projects that could have APIs added to them to enable future integration into the OpenStack deployment infrastructure.

In the meantime, check out Nodejitsu's Node based HTTP proxy on Github. It's fairly easy to configure and the performance is decent. Here's a version I'm running for reference:

var util = require('util'),
    http = require('http'),
    httpProxy = require('./http-proxy');

//
// Http Proxy Server with Proxy Table
//
httpProxy.createServer({
  router: {
    'spurt.stackgeek.com': '10.0.1.226:80',
    'house.geekceo.com': '10.0.1.19:8080'
  }
}).listen(80);

//
// Target Http Server
//
http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.write('request successfully proxied to: ' + req.url + '\n' + JSON.stringify(req.headers, true, 2));
  res.end();
  console.print("foo");
}).listen(9000);

Maintenance

The OpenStack repos on Github get updated all the time. If you want to stay up-to-date with these changes, I recommend periodically doing a git pull in each of the directories that get stuffed into /opt/stack/. Kill your running instance of OpenStack before you do this with a killall screen.
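
Here's that update loop as a quick Python sketch (it assumes the default /opt/stack location; run it with sudo if the checkouts are root-owned):

import os
import subprocess

base = '/opt/stack'
for name in sorted(os.listdir(base)):
    repo = os.path.join(base, name)
    # only pull directories that are actually git checkouts
    if os.path.isdir(os.path.join(repo, '.git')):
        print "updating %s" % name
        subprocess.call(['git', 'pull'], cwd=repo)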

You can also use the rejoin-stack.sh script if you've already run stack.sh and want to fire things back up after killing them.

That's about it. Be sure to leave comments if you have questions or found errors in the guide!

Happy stacking!
