CloudMan's features and uses
This page describes CloudMan, its features, and its potential uses. CloudMan is a cloud manager that orchestrates and manages cloud infrastructure so that you can simply use the underlying resources. It is primarily used in the context of Galaxy Cloud and CloudBioLinux, but it can be used for any purpose where a cluster in a cloud is desired. Read on for descriptions of specific features.

If you would like a jump start on working with Galaxy and CloudMan, the following links are worth a look:

  1. The best place to start is the Galaxy CloudMan wiki, especially if you are only looking to use Galaxy or CloudMan: https://wiki.galaxyproject.org/CloudMan/
  2. If you are interested in learning how things actually operate behind the pretty interfaces, we have a few papers: http://usecloudman.org/publications
  3. If you're interested in deploying on a local/private cloud, CloudBioLinux provides most of the automation infrastructure to get the necessary components built: https://github.com/chapmanb/cloudbiolinux/tree/master/contrib/flavor/cloudman

The above list should be considered a starting point for getting familiar with the overall system. Whenever you have a question, please send us an email. More about emailing the Galaxy community can be found at https://wiki.galaxyproject.org/MailingLists

 

CloudMan development

Added: 03 May 2013

Similar to the main Galaxy trello.com board, there is now a Trello board for CloudMan. The board contains a set of tasks that are being considered as potential future directions or are being worked on. Check it out and leave your mark!

Also, we have set up a #cloudman channel on irc.freenode.net for discussion and help. Join the channel and join the discussion.

Tags: outreach

CloudMan startup procedure

Added: 06 Mar 2013

When a CloudMan-enabled machine image is launched, the launched instance goes through a contextualization process. The end result of this process is the start of the CloudMan application. The process proceeds as follows:

  1. The cloudman upstart job runs (at runlevel 2, 3, 4, or 5). The job definition is specified in /etc/init/cloudman.conf. There is no log output from running this script.
  2. The ec2autorun.py script is executed next. This script is embedded in the image and is typically found in /usr/bin, but some older images may have it in /opt/galaxy/pkg/bin/. The log output from this script is stored in the same directory as the script itself, in a file called ec2autorun.log. The script processes the user data, downloads the cm_boot.py script from either the cluster's bucket (if the bucket exists) or the default bucket, and runs it.
  3. /tmp/cm/cm_boot.py runs next. This script prepares the instance for CloudMan, starts the web server, downloads CloudMan (from the cluster's bucket or the default bucket), and then launches CloudMan. The log output for this script is in /tmp/cm/cm_boot.py.log.
  4. Finally, CloudMan starts. CloudMan runs in /mnt/cm and its log is stored in /mnt/cm/paster.log.

And there it is, a functional cluster in the cloud. 
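
If an instance does not come up as expected, the logs from the startup stages can be checked in the order the stages run; the first missing log points at the stage that did not complete. The following is only a minimal sketch: the function name report_boot_logs is made up for illustration and is not part of CloudMan, and the ec2autorun.log location assumes the script lives in /usr/bin (adjust it for older images).

```shell
# Hypothetical helper (not shipped with CloudMan): report which startup logs
# exist and are non-empty, in the order the boot stages run.
# An optional first argument is a root prefix, which is useful when
# inspecting a copy of the instance's file system mounted elsewhere.
report_boot_logs() {
    root="${1:-}"
    for log in \
        /usr/bin/ec2autorun.log \
        /tmp/cm/cm_boot.py.log \
        /mnt/cm/paster.log
    do
        if [ -s "$root$log" ]; then
            echo "present: $log"
        else
            echo "missing: $log"
        fi
    done
}

# Check the logs on the current machine.
report_boot_logs
```

Once a log is reported as present, viewing the end of it (for example with tail) is usually the quickest way to see what the corresponding stage did last.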

With more research and private clouds coming online, there are more questions about setting up a private instance of Galaxy in the cloud. Here is a quick overview with the required components and steps to get you going.

Start by setting up a CloudMan machine image on the local cloud; the recommended method is the image building process from https://github.com/chapmanb/cloudbiolinux. Next, install Galaxy along with its tools, dependencies, and reference data on the appropriate block storage volumes and turn those volumes into snapshots. For an overview of the architecture, see this paper: http://onlinelibrary.wiley.com/doi/10.1002/cpe.1836/full. For the details on how to set everything up, see Galaxy's wiki (http://wiki.g2.bx.psu.edu/Admin). Beyond that, it is a matter of making sure everything works as desired on your setup. You will probably also want to launch instances with code similar to https://github.com/chapmanb/biocloudcentral because, outside of Amazon, the user data (http://wiki.g2.bx.psu.edu/CloudMan/UserData) required by an instance is tedious to compose by hand.

Hope this helps and let us know if you have any questions in the process. 

Ready-to-use machine images for the CloudMan cluster platform exist on the AWS and NeCTAR clouds (see BioCloudCentral.org about starting one). However, if you have a custom machine image and would like to use CloudMan with it, that is also possible. CloudMan can be installed on any (Ubuntu) machine image, so that an instance launched from that image becomes a cluster in the cloud. Moreover, the process of installing CloudMan on a custom machine image is automated and documented.

To get started, clone the set of mi-deployment scripts from https://bitbucket.org/afgane/mi-deployment/ to your local machine:

$ hg clone https://bitbucket.org/afgane/mi-deployment/
$ cd mi-deployment

Before actually running the scripts, make sure your system environment is properly set up. The mi-deployment scripts require fabric and boto, so if you do not already have those installed on your local machine, install them with:

$ pip install fabric
$ pip install boto

Next, create the following file with your cloud credentials:

$ cat ~/.boto
[Credentials]
aws_access_key_id = <your cloud access key>
aws_secret_access_key = <your cloud secret key>
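
Because this file holds secret keys, it is worth creating it so that only your user can read it. The following is a minimal sketch rather than part of mi-deployment: the function name write_boto_config is made up for illustration, and the placeholder values still need to be replaced with your own keys afterwards.

```shell
# Hypothetical helper: create a boto credentials file at the given path
# (default: ~/.boto) without overwriting a file that already exists.
write_boto_config() {
    target="${1:-$HOME/.boto}"
    if [ -e "$target" ]; then
        echo "refusing to overwrite $target" >&2
        return 1
    fi
    # The subshell keeps the restrictive umask from leaking into the caller;
    # with umask 077 the new file is readable and writable by its owner only.
    (
        umask 077
        printf '%s\n' \
            '[Credentials]' \
            'aws_access_key_id = <your cloud access key>' \
            'aws_secret_access_key = <your cloud secret key>' > "$target"
    )
}
```

Calling write_boto_config with no arguments creates ~/.boto; open the file in an editor afterwards and fill in the two keys.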

Lastly, define which applications you would like to have installed on your machine image by editing mi-deployment/conf_files/apps.yaml and (un)commenting any applications whose dependencies you want or do not want installed (this is primarily cloudman but others are also supported). 

The local environment is now properly set up, so start an instance of your machine image and run the mi-deployment configuration script. From your local machine, run:

$ fab -f mi_fabfile.py -i <key_file> -H <instance_IP> configure_MI

After the configuration step completes (typically 6-8 minutes if CloudMan is the only application you're adding), bundle the instance into a new machine image by running the following command:

$ fab -f mi_fabfile.py -i <key_file> -H <instance_IP> create_image

And there it is. A new machine image with all of your customizations plus all the features of the CloudMan platform is ready to be used.

Behind the accessible and functional web interface, CloudMan allows you to access your cluster from the command line as well. You can find the exact command needed to ssh (including the current IP) on the Admin interface:

[Image: ssh command on the Admin page]

(Note that CloudMan images created after 2012 also allow you to log in using the same password you use for accessing the CloudMan web interface; simply log in as the ubuntu user.)

To ssh to the instance, simply start the Terminal application (or PuTTY on Windows) and provide the command:

[Image: ssh command]

Once logged in, the cluster behaves just like any other SGE-managed cluster: submit a job using qsub <job script>:

[Image: Submit a job via qsub]

To check the status of a job or the queue, use qstat -f:

[Image: Check queue status]

For the ubuntu user, CloudMan stores all of the persistent data in /export/data. This directory is also shared over NFS with the worker nodes, so submit your jobs from within it and have their output stored there.
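
As an illustration of what a job script can look like, here is a minimal sketch; the job name hello_cloudman and the message it prints are arbitrary choices, not anything CloudMan requires. Lines starting with "#$" are options read by qsub.

```shell
#!/bin/bash
# Minimal SGE job script (illustrative sketch).
# -N   sets the job name shown by qstat
# -cwd runs the job in the directory it was submitted from
# -j y merges the error stream into the output file
# -S   selects the shell used to run the script
#$ -N hello_cloudman
#$ -cwd
#$ -j y
#$ -S /bin/bash

# Record which node the job ran on; uname -n prints the node's host name.
host=$(uname -n)
echo "Job ran on $host at $(date)"
```

Assuming the script is saved as hello.sh in a directory under /export/data, change into that directory and submit it with qsub hello.sh. Because of the -cwd option, the output file (named after the job, followed by .o and the job ID) is written to that same shared directory.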

And there you go: a scalable and functional compute cluster ready for anything you throw at it.

 

Tags: CLI, cluster

By default, CloudMan will configure your cloud cluster to run jobs on the head node (i.e., the master instance). This allows you to have only a single cloud instance and start using it instantly, thus minimizing the cost associated with using a cloud. However, such behavior may not always be desirable. If, for example, you have a substantial number of users or jobs being submitted, the head node may be busy running jobs, leading to poor responsiveness of user services. Alternatively, you may choose to keep a small (i.e., cheap) master instance alive at all times so users can use it whenever needed. Such an instance may not have enough power to also handle user jobs, though.


CloudMan lets you toggle whether the master instance runs jobs. Simply go to the Admin page and click on Switch master not to run jobs, and no jobs will run on the master instance:

[Image: Switch master not to run jobs]

You can revert this decision at any point by clicking Switch master to run jobs. Note that if you choose to have the master instance not run any jobs, you will need at least one worker instance to handle them. Fortunately, this is also simple and automated via CloudMan's auto-scaling feature.

Tags: admin, cluster