Thursday, September 18, 2014

base checklist: 10 points to decide whether to choose an OSS for Production or not

One question you get from skeptics (who are actually really important for quality checks) while discussing picking an OpenSource solution instead of a closed-source one with support attached is: how can you trust it to be safe for a Production release?

The same question helps with which OSS to pick when there are several.

The question we are trying to answer here is... how to pick an OpenSource solution that will live long and prosper, not turn into rot that smells on every Production change as updates pile up over time.

First, just to repeat what almost every technologist already understands: the "ensured" support attached to closed-source software is no guarantee of its safety or of support for future technical growth. I don't wanna dwell on the dangers it brings in, 'cuz this is not the post for that. That's an entirely different, exhausting list.

So, on to what to look for in OpenSource software that helps you decide whether it can be trusted for a production release...


While weighing the inclusion of any OpenSource utility, big or small, into your Production list, the following checklist should help:

1*) OSS have Licenses too
First of all, check whether its License is compatible with the licensing of your project. For example, ZFS is licensed under the CDDL, which conflicts with the Linux kernel's GPL; people have been seeking ways (and somewhat succeeded) to get ZFS on GNU/Linux.

2*) Is the project active "enough"
The second quick check is whether the project has been inactive for a dangerous period. What counts as a dangerous period differs widely for every kind of project, so you'd have to depend on your own better judgment and that of a trusted community you know. For a library providing a certain algorithm, changes after a stable release will be a lot slower. But a webdev framework, with the current tradition... will be popping out new minor releases every now and then.

Now a few things for which you'll need to read around a little....
Sources to recon for the following attributes: Mailing lists, Issue boards, IRC, Twitter streams, maybe others depending on the project

3*) How active and inclusive is its community
How well do they handle Pull Requests and Issues raised on their project? This covers both the readiness to respond and the willingness to adopt a better direction, but mainly the former.
How well do they handle reported risks and vulnerabilities, if any? The quickest patch is not the main measure; what matters most is acknowledging the issue and providing a workaround until the main issue gets resolved.

4*) Good core team matters (they need not be very popular)
Check who forms the core team maintaining that OSS. Some other projects of theirs, even if not popular, would give you an idea of how much and how well they maintain their projects.

5*) If Industry already loves it
Not a litmus test, though it strengthens community support and quality checks.
Look for who in the Industry is already using it in the mainstream, and whether you like the software they have developed. Just shoot a tweet/mail to them... people are mostly helpful. Don't give up on humanity. ;)

6*) Need to scan it personally anyhow
Try it in a sandbox first; monitor that it's not sending requests to domains it's not supposed to, and not showing any suspicious behavior you don't expect from it.
Also check that it survives your production security lockdown; not all projects behave the same under restrictions.

7*) Send it on a marathon
Put it under a performance test yourselves. There might be preexisting load test results available, and they might be accurate as well. But not all implementations suit all projects. Check it under a PoC of your implementation's behavior with the expected concurrency and latency.

8*) Does it tailor fit
If it actually provides what you desire without putting a hack around it, give it a chance. If not, confirm that your usage suits its design and wouldn't break with the maintainers' project philosophy, over the next few versions at least.

9*) How easy is it to resolve an issue
Is the project's community/developer base active enough to help guide you around any problems faced?

10*) Do you love supporting FOSS
If yes, welcome to the world of awesomeness. Some mediocrity (but not below that; in that case look for something else) on some of the points above would only drive you to strengthen the project. It's opensource; technologists, at least, are not supposed to live with a problem once faced.

Monday, February 3, 2014

golang ~ get local changes into GOPATH without pushing them upstream

To get your local Golang repos sym-linked into your GOPATH, with local changes available...

goenv_link(){
  if [ $# -ne 2 ]; then
    echo "Links up current dir to its go-get location in GOPATH"
    echo "SYNTAX: goenv_linkme <repo-url>"
    return 1
  fi
  _REPO_DIR=$1
  _REPO_URL=$2

  _TMP_PWD=$PWD
  cd "$_REPO_DIR" || return 1

  if [ -d "${GOPATH}/src/${_REPO_URL}" ]; then
    echo "$_REPO_URL already exists at GOPATH $GOPATH"
    go get "${_REPO_URL}"
    cd "$_TMP_PWD"    # restore the caller's directory before bailing out
    return 1
  fi
  # make sure the parent directory of the import path exists
  _REPO_BASEDIR=$(dirname "${GOPATH}/src/${_REPO_URL}")
  if [ ! -d "${_REPO_BASEDIR}" ]; then
    mkdir -p "${_REPO_BASEDIR}"
  fi

  ln -sf "${PWD}" "${GOPATH}/src/${_REPO_URL}"
  go get "${_REPO_URL}"

  cd "$_TMP_PWD"
}

# single quotes, so $PWD is expanded when the alias is used, not when it is defined
alias goenv_linkme='goenv_link "$PWD"'

---


Every now and then, while working in my favorite new programming language Golang, I have inter-dependent changes among different packages. To confirm they work as required, I'd like the GOPATH to provide the compiled object with local changes included.

The utility I've been using to push local package changes into the GOPATH-provided object is the following "goenv_alpha" bash function, kept as a shell-profile utility.

Say I have a golang project "github.com/abhishekkr/goshare" which utilizes "github.com/abhishekkr/goshare/httpd", "github.com/abhishekkr/goshare/zeromq" and a few more.

If I make some local changes at "{PROJECTS}/goshare" and "{PROJECTS}/goshare/httpd", then to push those into the GOPATH-provided packages for testing, the following commands using the "goenv_alpha" shell-util function given below would do the job...

$ goenv_alpha "{PROJECTS}/goshare" "github.com/abhishekkr/goshare"
$ goenv_alpha "{PR..}/goshare/httpd" "github.com/abhishekkr/goshare/httpd"

These commands will ask whether to make a backup of the package object currently in GOPATH. You can give the backup any name, which you will be asked for again while restoring, or leave it empty to skip creating a backup file.

~

goenv_alpha(){
  _TMP_PWD=$PWD
  if [ $# -ne 2 ]; then
    echo "Provide Alpha changes usable as any other go package."
    echo "Just the import path changes to 'alpha/'"
    echo "SYNTAX: goenv_alpha <repo-dir> <repo-url>"
    return 1
  fi
  _REPO_DIR=$1
  _REPO_URL=$2
  cd "$_REPO_DIR" || return 1
  _PKG_PARENT_NAME=$(dirname "$PWD")
  _PKG_NAME=$(basename "$PWD")
  _PKG_NAME_IN_REPO=$(basename "$_REPO_URL")
  if [ "$_PKG_NAME_IN_REPO" != "$_PKG_NAME" ]; then
    echo "Path for creating alpha doesn't match the import 'url' for it."
    cd "$_TMP_PWD"
    return 1
  fi
  # 'go build -work' prints the kept work dir as WORK=<path> on stderr
  go build -work . 2> "/tmp/$_PKG_NAME"
  _BUILD_PATH=$(sed 's/WORK=//' "/tmp/$_PKG_NAME")
  if [ ! -d "$_BUILD_PATH" ]; then
    echo "An error occurred while building, it's recorded at /tmp/$_PKG_NAME"
    cd "$_TMP_PWD"
    return 1
  fi
  rm -f "/tmp/$_PKG_NAME"
  _CURRENT_OBJECT_PATH="${GOPATH}/pkg/${GOOS}_${GOARCH}"
  _CURRENT_OBJECT="${_CURRENT_OBJECT_PATH}/${_REPO_URL}.a"
  _NEW_OBJECT="${_BUILD_PATH}/_${_PKG_PARENT_NAME}/${_PKG_NAME}.a"
  echo "Do you wanna backup current object? If yes enter a filename for it: "
  read GO_ALPHA_BACKUP
  if [ -n "$GO_ALPHA_BACKUP" ]; then
    # backups are kept in a directory named after the import path
    mkdir -p "${_CURRENT_OBJECT_PATH}/${_REPO_URL}"
    mv "$_CURRENT_OBJECT" "${_CURRENT_OBJECT_PATH}/${_REPO_URL}/${GO_ALPHA_BACKUP}.backup"
  fi
  mv "$_NEW_OBJECT" "$_CURRENT_OBJECT"
  cd "$_TMP_PWD"
  printf "\nAlpha changes have been updated at %s.\n" "${_CURRENT_OBJECT}"
}

~

You can undo the push of local changes if you created a backup of the previously existing package object.

The following commands utilize the "goenv_alpha_undo" shell-util function provided below.

$ goenv_alpha_undo "{PROJECTS}/goshare" "github.com/abhishekkr/goshare"
$ goenv_alpha_undo "{PR..}/goshare/httpd" "github.com/abhishekkr/goshare/httpd"

This will list the names of the backup files present, if any; you can then provide the name of your chosen backup file to restore that package state.

~
goenv_alpha_undo(){
  _TMP_PWD=$PWD
  if [ $# -ne 2 ]; then
    echo "Revert Alpha changes using a backup created by goenv_alpha."
    echo "SYNTAX: goenv_alpha_undo <repo-dir> <repo-url>"
    return 1
  fi
  _REPO_DIR=$1
  _REPO_URL=$2
  cd "$_REPO_DIR" || return 1
  _PKG_NAME=$(basename "$PWD")
  _PKG_NAME_IN_REPO=$(basename "$_REPO_URL")
  if [ "$_PKG_NAME_IN_REPO" != "$_PKG_NAME" ]; then
    echo "Path for creating alpha doesn't match the import 'url' for it."
    cd "$_TMP_PWD"
    return 1
  fi
  _CURRENT_OBJECT_PATH="${GOPATH}/pkg/${GOOS}_${GOARCH}"
  _CURRENT_OBJECT="${_CURRENT_OBJECT_PATH}/${_REPO_URL}.a"
  echo "Available package files are:"
  ls -1 "${_CURRENT_OBJECT_PATH}/${_REPO_URL}" | grep "$_PKG_NAME"
  # enter the full listed name, including the .backup suffix
  echo "Enter your backup filename for it: "
  read GO_ALPHA_BACKUP
  if [ -z "$GO_ALPHA_BACKUP" ]; then
    printf "\nNo Backup file was entered.\n"
    cd "$_TMP_PWD"
    return 1
  fi
  mv "${_CURRENT_OBJECT_PATH}/${_REPO_URL}/${GO_ALPHA_BACKUP}" "$_CURRENT_OBJECT"
  cd "$_TMP_PWD"
  printf "\nAlpha changes have been reverted with the provided backup file.\n"
}
~

The full [WIP] shell-profile for golang utilities is at:
https://github.com/abhishekkr/tux-svc-mux/blob/master/shell_profile/a.golang.sh

Thursday, December 5, 2013

go get pkg ~ easy made easier for project dependency management

For some time now I've been trying out ways to improve practices on top of the awesome capabilities of GoLang. One of those has been having a 'bundle install' (for ruby folks) or 'pip install -r' (for python folks) style capability... something that just refers to a text file that is part of the source code and plainly fetches all the dependency paths mentioned in there (for everyone else).

It and some other bits can be found here...
https://github.com/abhishekkr/tux-svc-mux/blob/master/shell_profile/a.golang.sh#L31

It's a shell (bash) function that can be added to your Shell/System Profile files and used...

go_get_pkg(){
  if [ $# -eq 0 ]; then
    if [ -f "$PWD/go-get-pkg.txt" ]; then
      PKG_LISTS=("$PWD/go-get-pkg.txt")
    else
      touch "$PWD/go-get-pkg.txt"
      echo "Created GoLang Package empty list $PWD/go-get-pkg.txt"
      echo "Start adding package paths as separate lines." && return 0
    fi
  else
    PKG_LISTS=("$@")
  fi
  # iterate over every list file, not just the first array element
  for pkg_list in "${PKG_LISTS[@]}"; do
    while read -r pkg_path; do
      [ -z "$pkg_path" ] && continue
      echo "fetching golang package: go get ${pkg_path}"
      echo "$pkg_path" | xargs go get
    done < "$pkg_list"
  done
}
---
What does it do?
If run without any parameters, it checks the current working directory for a file called 'go-get-pkg.txt'. If not found, it creates an empty file by that name; this is to be done at initialization of the project. If found, it iterates through each line and passes it directly to "go get ${line}".
If run with parameters, each parameter is treated as the path to a file similar to 'go-get-pkg.txt', and the same action is performed on each file.
Sample 'go-get-pkg.txt' file
-tags zmq_3_x github.com/alecthomas/gozmq
github.com/abhishekkr/levigoNS
github.com/abhishekkr/goshare
---
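To make the line-by-line behavior concrete, here is a minimal Python sketch of the same idea. It is not the shell function above: it only builds the `go get` argument lists instead of running them, and the name `build_go_get_commands` and the one-package-per-line sample layout are assumptions for illustration.

```python
import shlex


def build_go_get_commands(list_text):
    """Turn the text of a package list into one `go get` argv per line."""
    commands = []
    for line in list_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # shlex.split tokenizes the line roughly like xargs would
        commands.append(["go", "get"] + shlex.split(line))
    return commands


# assumed layout: flags and package path together on one line
sample = """-tags zmq_3_x github.com/alecthomas/gozmq
github.com/abhishekkr/levigoNS
github.com/abhishekkr/goshare
"""

for argv in build_go_get_commands(sample):
    print(" ".join(argv))
```

Keeping flags like `-tags` on the same line as the package they belong to is what lets each line be handed to `go get` untouched.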

Friday, November 15, 2013

systemd enabled lightweight NameSpace Containers ~ QuickStart Guide

systemd (for some time now) provides Linux users a powerful chroot alternative for creating quick and lightweight system containers, using the power of cgroups and socket activation.

There is a lot more to "systemd" than this, but that's for some other post. Until then, you can explore it, starting here.

There is a utility, "systemd-nspawn", provided by systemd which acts as a container manager. It can be used to easily spawn a new Linux container and manage it. It has been updated with Socket Activation (systemd's amazing trademark feature).

This lets any container have the parent/host's systemd instance listen on different service ports on its behalf. Only when those service ports receive a connection will the container spawn and act on it. Voila, resource utilization and scalability concepts.
More on this can be read in detail at: http://0pointer.de/blog/projects/socket-activated-containers.html

Here we'll see a way to quickly start using it via some custom-made commands.
All the script commands used here can also be found at https://github.com/abhishekkr/tux-svc-mux/blob/master/shell_profile/a.virt.sh.

Just download and source the linked script in your shell, and the commands described here will be available...
And yes, your system also needs to be running systemd already.

Currently this just lets you create archlinux containers; it will soon create other kinds of containers as the script matures.

In case you don't have any created container already, or wanna create a new one...
$ nspawn-arch
To list names of all created containers...
$ nspawn-ls
To stop a running container...
$ nspawn-stop
To start an already created container...
$ nspawn-start

---

Friday, July 26, 2013

Puppet ~ a beginners concept guide (Part 4) ~ Where is my Data

parts of the "Puppet ~ Beginner's Concept Guide" to read before ~
Part#1: intro to puppet, Part#2: intro to modules, and Part#3: modules much more.



Where is my Data?

When I started my Puppet-ry, the examples I used to see had all configuration data buried inside the DSL code of manifests, and people were trying to use inheritance to push data down. Then I got to see a design pattern in Puppet manifests that kept a separate parameters manifest for configuration variables. Then came External Data lookup via CSV files as a Puppet function. Then, with enhancements in Puppet and other modules, came more.

Below are a few ways, ranging from usable to fine, of utilizing separate data sources within your manifests.


Here, we will see usage styles of data for Puppet Manifests, Extlookup CSV, Hiera, Plug-in Facts and PuppetDB.

params-manifest:


It is the very basic way of separating data from your functionality code, and the preferred way for a value-set type of data that is going to grow in future. It keeps the data separate from the code from the start. Once the requirement reaches a level where varied values have to be inferred based on environment/domain/fqdn/operatingsystem/[any-facter], the data can be extracted to any of the preferred ways given below and just looked up here. That avoids changing the main (sub)module code.
[ Gist-Set Externalize into Params Manifest: https://gist.github.com/3683955 ]
Say you are providing an httpd::git sub-module for the httpd module, placing a template-generated config file using data kept in params...
```

File: httpd/manifests/git.pp
it includes the params submodule to access the data

File: httpd/templates/mynode.conf.erb

File: httpd/manifests/params.pp
it actually is just another submodule to only handle data

Use it: run_puppet.sh

```
_

extlookup-csv:


This suits you if you think your data fits a (key,value) CSV format extracted into data files. Puppet needs to be told the location of the CSV files to look the key up in, and it fetches the value assigned to the key in that file.
The names given to these CSV files matter to Puppet while looking up values from all the CSV files present. Puppet needs to be given a hierarchy order of these file names to look for the key, and the order can involve variable names.

For e.g., say you have CSV files named after HOSTNAME and ENVIRONMENT plus a common file, with the hierarchy specified in that order. Puppet will first look for the queried key in the CSV named after HOSTNAME; if not found, it looks in the ENVIRONMENT-named file; and after not finding it there, it goes looking in the common file. If it doesn't find the key in any of those files, it returns the default value, if one was specified as in 'extlookup(key, default_value)'. If there is no default value either, Puppet will raise an exception for having no value to return.
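The look-up order just described can be modeled in a few lines of Python. This is only a sketch of the precedence idea, not Puppet's implementation; the file names, keys other than 'httpd_git_url', and all values are made up for illustration.

```python
import csv
import io

_NO_DEFAULT = object()  # sentinel, so that None can still be a valid default


def parse_csv(text):
    """Read 'key,value' rows into a dict (first two columns only)."""
    rows = csv.reader(io.StringIO(text))
    return {row[0]: row[1] for row in rows if len(row) >= 2}


def extlookup(key, files, precedence, default=_NO_DEFAULT):
    """Return the value for key from the first file in precedence that has it."""
    for name in precedence:
        data = files.get(name, {})  # a missing file is simply skipped
        if key in data:
            return data[key]
    if default is not _NO_DEFAULT:
        return default
    raise KeyError("no value or default found for %r" % key)


# hypothetical data files, keyed by file name without the .csv extension
files = {
    "env_testnode": parse_csv("httpd_git_url,git://testnode.example/repo.git\n"),
    "common": parse_csv(
        "httpd_git_url,git://common.example/repo.git\nhttpd_port,80\n"
    ),
}
precedence = ["host_web01", "env_testnode", "common"]

print(extlookup("httpd_git_url", files, precedence))  # found in env_testnode
print(extlookup("httpd_port", files, precedence))     # falls through to common
print(extlookup("missing_key", files, precedence, default="fallback"))
```

The first file in the precedence list that knows the key wins, which is why the more specific files are listed before the common one.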

[ Gist-Set Externalize into Hierarchical CSV Files: https://gist.github.com/3684010 ]

It's the same example as for params, with a flavor of ExtData. Here you'll notice a 'common.csv' external data file providing a default set of values. Then there is also an 'env_testnode.csv' file overriding only the value that needs to change. Since in the 'site.pp' file the precedence of the 'env_%{environment}' file is higher than that of 'common', 'httpd::git' would look up all values first in 'env_testnode.csv' and, if not found there, would go to 'common.csv'. Hence the 'httpd_git_url' value ends up overridden by the one from 'env_testnode.csv'.

The extlookup() method used here is available as a Puppet Parser Function; you can read more in Part#5, Custom Puppet Plug-Ins, on how to create your own functions.
_


hiera:


Hiera is a pluggable hierarchical data store for Puppet. It was started to provide better external data storage support than the Ext-lookup feature, with data formats other than CSV too.

This brings in the essence of an ENC (External Node Classifier) for data retrieval, without having to write one.

Data look-up happens in a hierarchy provided by configuration, with a self scope-resolution mechanism.

It enables Puppet to fetch data from varied external data sources using its different backends (like local files, redis, http protocol), which can be added to if needed.
The 'http' backend in turn enables any service (couchdb, riak, web-app or so) to act as a data store providing data.

The file "hiera.yaml" from the Gist below is an example of a hiera configuration to be placed in puppet's configuration directory. The highlights of this configuration are ":backends:", the backend sources and ":hierarchy:". Multiple backends can be used at the same time; their order of listing marks their order of look-up. Hierarchy configures the order of data look-up by scope.

Then, depending on which backends you have added, you need to add their source/config to look up data at.
Here we can see configuration for using local "yaml" and "json" files; for looking up data from a Redis server (the example sets up datasets for redis usage) with authentication in place; and for looking up data from any "http" service with the hierarchy as the ":paths:" value.
You can even use GPG-protected data as a backend, but that is a bit messy to use.

Place the ".yaml" and ".json" files from the Gist at the intended provider locations.
Then running "use_hiera.sh" would show you the magic of this example on hiera.

[Gist-Set Using Hiera with Masterless Puppet set-up: https://gist.github.com/abhishekkr/6133012 ]
_

plugin-facts:


Every system has its own set of information, facter (http://projects.puppetlabs.com/projects/facter) facts, made available to puppet by default. Puppet also enables DevOps people to set custom facter facts to be used in modules.
The power of these computed facts is that they can use full ruby-power to read local/remote, plain/encrypted data over REST/Database/API/any available channel.
These require the power of Puppet Custom Plug-Ins (http://docs.puppetlabs.com/guides/custom_facts.html). The ruby file doing this goes at 'MODULE/lib/facter' and gets loaded with 'pluginsync=true' in action.
The way to set a Facter fact in such Ruby code is just...
my_value = 'all ruby code to compute it'
Facter.add('mykey') do
  setcode do
    my_value
  end
end
.....plus all the rest of the code needed to compute the value to be set, or even the key-set.

[Gist-Set Externalize Data receiving as Facter: https://gist.github.com/3684968 ]

The same 'httpd::git' example is revamped to use a Custom Facter in the Gist above.
There is also another way to provide a Facter fact to the Puppet Catalog: set an environment variable named with the capitalized fact name prefixed by 'FACTER_', with the value the fact is supposed to have.
For e.g. # FACTER_MYKEY=$my_value puppet apply --modulepath=$MODULEPATH -e "include httpd::git"
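As a rough mental model of that environment-variable convention, the Python sketch below collects 'FACTER_'-prefixed variables into lowercased fact names. It is not Facter's own code; the function name `facts_from_environment` and the sample values are assumptions for illustration.

```python
def facts_from_environment(environ):
    """Map FACTER_<NAME>=value entries to {<name lowercased>: value}."""
    prefix = "FACTER_"
    facts = {}
    for name, value in environ.items():
        # ignore unrelated variables and a bare "FACTER_" with no fact name
        if name.startswith(prefix) and len(name) > len(prefix):
            facts[name[len(prefix):].lower()] = value
    return facts


# a hypothetical environment, as a plain dict instead of os.environ
sample_env = {"FACTER_MYKEY": "some-value", "PATH": "/usr/bin", "FACTER_": "x"}
print(facts_from_environment(sample_env))
```

Passing `os.environ` instead of the sample dict would show which facts the current shell would hand over this way.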
_

puppetdb:


It's a beautiful addition to the Puppet component set. Something that had been missing for long, and possibly the thing because of which I delayed this post by half a year.
It enables the 'storeconfigs' power without the Master and provides a trusted DB for infrastructure-related data needs, and is thus the best suited of all.

To set up 'puppetdb' on a node, follow PuppetLabs' nice documentation.
To set up a decent example for master-less puppet mode, follow the given steps.

Place the 2 '.conf' files and 1 '.yaml' file in Puppet's configuration directory.
The shell script would prepare the node with the PuppetDB service for a masterless puppet usage scenario.

The Puppet config setting storeconfigs to 'puppetdb' enables saving of exported resources to it. The 'reports' config there would push the puppet apply reports to the database.
The PuppetDB config makes Puppet aware of the host and port to connect to the database at.
The facts setting in routes.yaml enables PuppetDB to be used in a masterless mode.


[Gist-Set Using PuppetDB with Masterless Puppet set-up: https://gist.github.com/abhishekkr/6114760 ]

Now running anything, say like...
puppet apply -e 'package{"vim": }'
would work, and beautifully so: 'exported resources' would work like a charm using PuppetDB.
The accompanying puppet.conf will have reports dumped to PuppetDB as well.

_


There's a fine article on the same by PuppetLabs...

Friday, May 31, 2013

Testing Chaos with Automated Configuration Management solutions


No noise making.

But let's be real: think of the count of community-contributed (or mysterious closed-and-sold 3rd Party) services, frameworks, libraries and modules put to use for managing your ultra-cool, self-healing, self-reliant, scalable Infrastructure requirements. With so many cogs collaborating in the infra-machine, a check on their collaboration seems rather mandatory, like any other integration test for your in-house managed service.
After all, that was the key idea behind having automated configuration management itself.

Utilities like Puppet/Chef have been out there, accepted and used by dev & ops folks, for quite some time now.
But the issue with the amateur testing styles seen initially is that they evolved from the non-matching frame of 'Product' oriented unit/integration/performance testing. 'Product' oriented testing focuses more on what happens inside the coded logic and less on how the user gets affected by the product.
Most of the initial tools released for testing logic developed in Chef/Puppet were RSpec/Cucumber-inspired Product testing pieces. For the major part (installing a package, restarting a service or pushing artifacts) these tests are almost not required, as the main functionality of, say, installing package_abc is already tested inside the framework being used.
So coding to "ask" to install package_abc and then testing whether it has been asked seems futile.

That's the shift. The logic developed for Infrastructure acts as glue for all other applications, created in-house and 3rd party. In Infrastructure feature development there is more to test in the effect it has on its users (software/hardware) and less in internal changes (dependencies and dynamic content). The stuff in parentheses here means a lot more than it seems... let's get into the details.

Real usability of Testing is based on keeping sanctity of WHAT needs to be tested WHERE.


Software/Hardware services that collaborate with the help of Automated Infrastructure logic need the major focus of testing. These services can vary from the
  • in-house 'Product', that is the central component you are developing,
  • 3rd Party services it collaborates with,
  • external services it utilizes for what it doesn't host,
  • operating system that it supports, and Ops-knows what not.

Internal changes mainly revolve around
  • Resources/Dependencies getting called in right order and grouped for specific state.
  • It also relates to correct generation/purging of dynamic content, that content can itself range as
    • non-corrupt configuration files generated of a template
    • format of sent configuration data from one Infra-component to another for reflected changes
    • dynamically creating/destroying service instances in case of auto-scalable infrastructure


One can decide HOW, on ease and efficiency basis.


Unit Tests work for the major portion of the 'Internal Changes' mentioned before; libraries like chefspec, rspec-chef and rspec-puppet are good enough for them. They can very well test the dependency order and grouping management, as well as the effect of different data on generating non-corrupt configuration from templates.


Integration Tests in this perspective are of a bit more interesting and evolutionary nature. Here we have to ensure that the "glue" functionality we talked about for Software/Hardware services is working properly. These tests confirm that every required type of machine role/state can be achieved flawlessly; call them 'State Generation Tests'. They also need to cover the 'Reflected Changes Tests' across Infra-components, as mentioned under Internal changes.
Utilities like test-kitchen, in collaboration with vagrant, docker, etc., help place them in your Continuous Integration pipeline. This would even help in testing the same service across multiple linux distros, if supporting those is the plan.
The 'ServerSpec' library is also a nifty little piece for writing quick final-state check scripts.
The final set of Integration Testing is implemented in the form of Monitoring on all your managed/affecting Infrastructure components. This is the final and ever-running Integration Test.


Performance Tests: yes, even they are required. Tools like ChaosMonkey help you make your Infra self-healing and auto-scalable. If auto-scalability is a desired functionality too, there should be load tests that keep an eye on the count and behavior of dynamic containers.

Wednesday, April 24, 2013

Beginner's Guide to OpenStack : Basics of Nova [Part 2]

parts of Beginner's Guide to OpenStack to read before this ~


# Nova?
It's the main fabric controller for the IaaS Cloud Computing service provided by OpenStack. It took its first baby steps at NASA, was contributed to OpenSource and became the most important component of OpenStack.
It is built of multiple components performing different tasks, turning an End User's API request into a virtual machine service. All these components run in a non-blocking, message-based architecture and can be run from the same or different locations, with just access to the same message queue service.

---

# Components?

Nova stores the states of virtual machines in a central database. That's optimal for small deployments. Nova is moving towards multiple data stores with aggregation for high-scale requirements.

  • Nova API : supports the OpenStack Compute API, Amazon's EC2 API and a powerful Admin API (for privileged users). It's used to initiate most orchestration activities and policies (like Quota). It is communicated with over HTTP and converts requests into commands, further contacting other components via the Message Broker, and via HTTP for the ObjectStore. It's a WSGI application which routes and authenticates requests.
  • Nova Compute : worker daemon taking orders from the Message Broker and performing virtual machine create/delete tasks using the Hypervisor's API. It also updates the status of its tasks in the Database.
  • Nova Scheduler : decides which Nova Compute Host to allot for a virtual machine request.
  • Network Manager : worker daemon picking network-related tasks from the Message Broker and performing them. With the Grizzly release, OpenStack's Quantum can be opted for instead of nova-network. Tasks like maintaining IP Forwarding, Network Bridges and VLANs get covered.
  • Volume Manager : handles attach/detach of persistent block storage volumes to virtual machines (similar to Amazon's EBS). This functionality has been extracted into OpenStack's Cinder. It's an iSCSI solution utilizing the Logical Volume Manager. Network Manager doesn't interfere in Cinder's tasks but needs to be set up for Cinder to be used.
  • Authorization Manager : interfaces authorized API usage for Users, Projects and Roles. It communicates with OpenStack's KeyStone for details.
  • WebUI : OpenStack's Horizon communicates with Nova API for Dashboard interfacing.
  • Message Broker : all components of Nova communicate with each other in a non-blocking, callback-oriented manner using the AMQP protocol, well supported by RabbitMQ and Apache Qpid. There is also emerging support for ZeroMQ integration as the Message Queue. It's like a central task list shared and updated by all Nova components.
  • ObjectStore : a simple file-based storage (like Amazon's S3) for images. This can be replaced with OpenStack's Glance.
  • Database : used to gather build times and run states of virtual machines. It has details around the instance types available, networks available (if nova-network), and projects. Any database supported by SQLAlchemy can be used. It's the central information hub for all Nova components.


---

# API Style
The interface is mostly RESTful. The Routes package (a python re-implementation of the Rails route system) maps URIs to action methods on controller classes.
Each HTTP Request to Compute requires specific authentication credentials. Multiple authentication schemes can be allowed for a Compute node; the provider determines the one to be used.

---

# Threading Model
Uses a Green Thread implementation by design, via the eventlet and greenlet libraries. This results in a single process thread for the O.S., with its blocking I/O issues. Though a single thread reduces race conditions to a great extent, to eliminate them further in suspicious scenarios use the decorator @lockutils.synchronized('lock_name') over the methods to be protected.
If any action is long-running, its methods should trigger an eventlet context switch at the desired points in the process. Placing something like the following code-piece will switch context to waiting threads, if any, and will continue on the current thread without any delay if no other thread is waiting.
from eventlet import greenthread
greenthread.sleep(0)
MySQL queries use drivers that block the main process thread. In the Diablo release a thread pool was implemented, but it was removed because the bugs it brought outweighed its advantages.
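To picture what a name-based lock decorator such as @lockutils.synchronized('lock_name') buys you, here is a minimal stdlib-only sketch. It uses OS threads rather than green threads and is not Nova's implementation; the registry, the `synchronized` helper and the counter example are all assumptions for illustration.

```python
import threading
from collections import defaultdict
from functools import wraps

_locks = defaultdict(threading.Lock)  # lock name -> lock, created on first use
_locks_guard = threading.Lock()       # protects the registry itself


def synchronized(lock_name):
    """Serialize every call that shares the same lock name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            with _locks_guard:
                lock = _locks[lock_name]
            with lock:
                return func(*args, **kwargs)
        return wrapper
    return decorator


counter = {"value": 0}


@synchronized("counter-lock")
def increment():
    # read-modify-write that would be racy without the lock
    current = counter["value"]
    counter["value"] = current + 1


threads = [threading.Thread(target=increment) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter["value"])
```

Two methods decorated with the same name exclude each other, while methods with different names do not block one another.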

---

# Filtering Scheduler
In short, it's the mechanism used by 'nova-scheduler' to choose a worthy nova-compute host for a newly required virtual machine to be spawned upon. It prepares a dictionary of unfiltered hosts and weighs their cost for serving the requested virtual machine(s). Then it chooses the least costly host.
Hosts are weighted based on the configuration options for virtual machines.
It's better practice for a customer to ask for a large count of required instances together, as each request computes weights.
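The filter-then-weigh idea can be sketched in a few lines of Python. This is not nova-scheduler's code: the host fields, the single RAM filter, the cost function and its weight are all hypothetical, chosen only to show "drop unsuitable hosts, then pick the least costly one".

```python
def choose_host(hosts, request, filters, cost_functions):
    """Keep hosts passing every filter, then return the one with the lowest cost."""
    candidates = [h for h in hosts if all(f(h, request) for f in filters)]
    if not candidates:
        raise ValueError("no host can serve this request")

    def total_cost(host):
        # weighted sum over all (cost function, weight) pairs
        return sum(weight * fn(host, request) for fn, weight in cost_functions)

    return min(candidates, key=total_cost)


def has_enough_ram(host, request):
    return host["free_ram_mb"] >= request["ram_mb"]


def ram_cost(host, request):
    # more free RAM left after placement means a lower cost
    return -(host["free_ram_mb"] - request["ram_mb"])


# hypothetical hosts and request
hosts = [
    {"name": "compute-a", "free_ram_mb": 2048},
    {"name": "compute-b", "free_ram_mb": 8192},
    {"name": "compute-c", "free_ram_mb": 512},
]
request = {"ram_mb": 1024}

best = choose_host(hosts, request, [has_enough_ram], [(ram_cost, 1.0)])
print(best["name"])
```

Flipping the sign of the cost function would pack instances onto the fullest host that still fits, instead of spreading them out.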

---

# Message Queue Usage
Nova components use RPC to communicate with each other via the Message Broker using PubSub. Nova implements rpc.call (request/response, API acts as consumer) and rpc.cast (one way, API acts as publisher).
Nova API and Scheduler use the message queue as Invokers, whereas Network and Compute act as Workers. The Invoker pattern sends messages via rpc.call or rpc.cast. The Worker pattern receives messages from the queue and responds to rpc.call with an appropriate response.
Nova uses the Kombu library when interfacing with RabbitMQ.
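The difference between the two invocation styles can be shown with an in-process queue standing in for the Message Broker. This is only a sketch with Python's stdlib, not Nova's rpc module or Kombu; the `call`/`cast` helpers, the message layout and the method names are hypothetical.

```python
import queue
import threading

broker = queue.Queue()  # stands in for the shared message queue
handled = []            # records what the worker processed, in order


def worker():
    """Worker pattern: consume messages, reply only when a reply queue is attached."""
    while True:
        message = broker.get()
        if message is None:  # sentinel that stops the worker
            break
        handled.append(message["method"])
        if message["reply_to"] is not None:
            message["reply_to"].put("done: " + message["method"])


def cast(method):
    """One way: publish the message and move on."""
    broker.put({"method": method, "reply_to": None})


def call(method, timeout=5):
    """Request/response: publish, then wait on a private reply queue."""
    reply_queue = queue.Queue()
    broker.put({"method": method, "reply_to": reply_queue})
    return reply_queue.get(timeout=timeout)


thread = threading.Thread(target=worker)
thread.start()
cast("run_instance")                   # hypothetical method name
reply = call("get_console_output")     # hypothetical method name
broker.put(None)
thread.join()
print(reply)
print(handled)
```

The invoker blocks only in `call`; with `cast` it never learns whether or when the worker handled the message.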

---

# Hooks
Hooks enable developers to extend Nova's capabilities by adding named hooks to Nova code as decorators, which lazily load plug-in code matching the hook name (using setuptools entry points, an extension mechanism). The hook's class definition should have pre and post methods.
Don't use hooks when stability is a factor; internal APIs may change.
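The pre/post shape of such a hook can be sketched as a plain decorator. This is not Nova's hooks module: a dictionary stands in for entry-point discovery, and the `add_hook`/`register_hook` helpers, the method signatures and the `AuditHook` class are assumptions for illustration.

```python
from functools import wraps

# hook name -> list of hook objects; a stand-in for entry-point discovery
_REGISTERED_HOOKS = {}


def register_hook(name, hook):
    _REGISTERED_HOOKS.setdefault(name, []).append(hook)


def add_hook(name):
    """Decorator that runs registered pre/post hooks around a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            hooks = _REGISTERED_HOOKS.get(name, [])  # resolved lazily, per call
            for hook in hooks:
                hook.pre(*args, **kwargs)
            result = func(*args, **kwargs)
            for hook in hooks:
                hook.post(result, *args, **kwargs)
            return result
        return wrapper
    return decorator


class AuditHook:
    """Hypothetical plug-in that just records what it sees."""

    def __init__(self):
        self.events = []

    def pre(self, *args, **kwargs):
        self.events.append(("pre", args))

    def post(self, result, *args, **kwargs):
        self.events.append(("post", result))


@add_hook("create_instance")
def create_instance(name):
    return "instance:" + name


audit = AuditHook()
register_hook("create_instance", audit)
print(create_instance("web01"))
print(audit.events)
```

Because the hooks are looked up at call time, a plug-in registered after the decorated function is defined still gets invoked.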

---

# Dev Bootstrap
To get started with contributing... read this (OpenStack Wiki on HowToContribute) in detail.

To get rolling with Nova wheels, the system will need to have libvirt and one of the hypervisors (xen/kvm preferred for linux hosts) present.
$ git clone git://github.com/openstack/nova.git
$ cd nova
$ python ./tools/install_venv.py
This will prepare your copy of the nova codebase with the required virtualenv; now any command you wanna run in the context of the required codebase goes via
$ ./tools/with_venv.sh

---

# Run My Tests
To run the nose tests and the pep8 checker once you are done with the virtualenv setup (otherwise that will be initiated first here)... inside the 'nova' codebase
$ ./run_tests.sh

---

# Terminology

  • Server: Virtual Machine created inside the Compute System; requires Flavor & Image details.
  • Flavor: Represents a unique hardware configuration with disk space, memory and CPU time priority.
  • Image: System Image File used to create/rebuild a Server.
  • Reboot: Soft Server Reboot sends a graceful shutdown signal. Hard Reboot does a power reset.
  • Rebuild: Removes all data on the Server and replaces it with the specified image. The Server's IP Address and ID remain the same.
  • Resize: Converts an existing server to a different flavor. Every resize needs to be explicitly confirmed; only then is the original server removed. After a 24hr delay, there is an automated confirmation.