Cloud Architecture

Purpose: This document outlines the proposed architecture for our cloud-based sensor project. I will detail what needs to be implemented in order to support a dynamic cloud environment. The proposal aims to be as comprehensive as is currently possible, but there will most likely be additions to the infrastructure at a later date.

The Virtual Machine Manager: This is one of the most important parts of the whole system. The VMM will be in charge not only of ensuring that everything is running smoothly, but also of checking whether the cloud ecosystem can be improved by adding or removing virtual machines to balance load more efficiently. The VMM will also be the core component for communication, making sure that each of the subsidiary worker nodes receives the correct messages. It is also likely that the VMM will use a job scheduler (Condor) to further improve efficiency and maintain order.

Virtual Machine Instantiation: There are several reasons why a virtual machine would be launched, but the most important one hinges on load triggers. The Virtual Machine Manager will monitor the vitals of all worker nodes: memory usage, available hard drive space, CPU usage, etc. To do so, the VMM will use custom sensors along with the pre-built sensors already available within Phantom. If the VMM sees that its current workforce is being overwhelmed, it will launch another node to increase the virtual workforce and thus improve throughput, since the work will be spread across more VMs. This is directly tied to load balancing: ensuring that one node isn't doing all of the work.
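As a rough sketch of how such a load trigger might be evaluated, the following takes one snapshot of vitals per worker and decides whether to launch a node, terminate one, or hold. The NodeVitals type, the two thresholds, and the choice to treat the most constrained resource as a node's load are all assumptions made for illustration; in practice the readings would come from the Phantom sensors and the thresholds from tuning.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeVitals:
    """One snapshot of a worker node's vitals (hypothetical type).

    All fields are fractions between 0.0 and 1.0.
    """
    name: str
    cpu: float     # CPU utilization
    memory: float  # memory in use
    disk: float    # disk space in use


# Hypothetical thresholds; real values would come from tuning.
SCALE_UP_THRESHOLD = 0.80
SCALE_DOWN_THRESHOLD = 0.20


def node_load(vitals: NodeVitals) -> float:
    """Treat the most constrained resource as the node's load (an assumption)."""
    return max(vitals.cpu, vitals.memory, vitals.disk)


def scaling_decision(pool: List[NodeVitals], min_nodes: int = 1) -> str:
    """Return 'launch', 'terminate', or 'hold' for the current worker pool."""
    if not pool:
        return "launch"  # no workers at all, so start one
    average = sum(node_load(v) for v in pool) / len(pool)
    if average > SCALE_UP_THRESHOLD:
        return "launch"
    if average < SCALE_DOWN_THRESHOLD and len(pool) > min_nodes:
        return "terminate"
    return "hold"


if __name__ == "__main__":
    pool = [
        NodeVitals("worker-1", cpu=0.95, memory=0.70, disk=0.40),
        NodeVitals("worker-2", cpu=0.90, memory=0.85, disk=0.30),
    ]
    print(scaling_decision(pool))
```

Averaging across the pool keeps a single busy node from triggering a launch on its own; a per-node rule would react faster but would also launch more often.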

The Image Directory: Stored within Cumulus is a directory of images that VMs can use. A base image will be available to the nodes that the VMM instantiates. It is at this point that the newly launched cloud instance will be bound to a RabbitMQ message queue so that it can communicate across the cloud network. I think the best way to do that is to include RabbitMQ within the base image. Then, based upon which part of the network needs help, the VMM will send the node a URL pointing to a bucket that contains the packages that worker node needs to do its job. A script running on the machine will automate this process.
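A minimal sketch of what the download step of such a script might look like, assuming the URL can be fetched with a plain unauthenticated GET (a real Cumulus bucket may require signed requests, which this does not handle). The function name fetch_package and the destination layout are hypothetical.

```python
import shutil
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path


def fetch_package(url: str, dest_dir: Path) -> Path:
    """Download the object at `url` into `dest_dir` and return its local path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last path component of the URL.
    filename = Path(urllib.parse.urlparse(url).path).name or "package"
    target = dest_dir / filename
    # Stream the response to disk rather than holding it all in memory.
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


if __name__ == "__main__":
    # Demonstrate with a local file:// URL so the example needs no network.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "recipe.tar.gz"
        source.write_bytes(b"example payload")
        local = fetch_package(source.as_uri(), Path(tmp) / "packages")
        print(local.name, local.stat().st_size)
```

Once the package is on disk, the same script would unpack and apply it; that step is left out here because it depends on the package format.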

The RPC System: Our cloud network is built on the RabbitMQ messaging framework. Besides basic messaging, we can use this framework to layer on a remote procedure call system that will prove invaluable. For example, when a VM is instantiated by the VMM, it is bound to a queue based on its machine image. Listening on that queue is an RPC server that performs actions based upon the messages it receives. This would allow me to execute scripts on the fly (simply as subprocesses of the parent scripts). There are other uses for this system, but the main one is that it will allow me to remotely execute jobs on VMs without having to log in to them directly.
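To illustrate the server side of this idea, here is a sketch of a request handler that could sit behind the queue. It is deliberately transport-agnostic: it takes the raw message body and returns the reply body, so it could be called from a RabbitMQ consumer callback, but none of the RabbitMQ wiring is shown. The JSON message format, the ALLOWED_JOBS table, and the "hello" job are assumptions for illustration only.

```python
import json
import subprocess
import sys
from typing import Dict, List

# Hypothetical allow-list mapping job names to commands. Real entries would
# point at the scripts shipped in the machine image.
ALLOWED_JOBS: Dict[str, List[str]] = {
    "hello": [sys.executable, "-c", "print('hello from worker')"],
}


def handle_request(body: bytes) -> bytes:
    """Run the job named in a JSON request and return a JSON reply."""
    try:
        request = json.loads(body)
        command = ALLOWED_JOBS[request["job"]]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing "job" field, or a job not on the list.
        error = {"ok": False, "error": "unknown or malformed request"}
        return json.dumps(error).encode()

    # Run the job as a subprocess of the RPC server and capture its output.
    result = subprocess.run(command, capture_output=True, text=True, timeout=60)
    reply = {
        "ok": result.returncode == 0,
        "stdout": result.stdout.strip(),
        "stderr": result.stderr.strip(),
    }
    return json.dumps(reply).encode()


if __name__ == "__main__":
    print(handle_request(b'{"job": "hello"}').decode())
```

Looking the job up in a fixed table, rather than executing whatever command string arrives in the message, means a stray or malicious message cannot run arbitrary commands on the worker.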

The Data Product Repository: As mentioned earlier, newly instantiated virtual machines will be given a URL to the bucket where their package is stored. They will then use a "get" function to retrieve the object from the bucket repository on Cumulus. These URLs are stored in MySQL database tables for reference. This is part of what makes the system dynamic: the VMM tells each virtual machine which package to fetch based upon which part of the network is currently under strain.
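A small sketch of the URL lookup, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server. The package_urls table, its columns, the role names, and the example URL are all hypothetical. With a MySQL driver the same queries should apply, though the parameter placeholder is typically %s rather than ?.

```python
import sqlite3
from typing import Optional


def create_catalog(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table mapping a node role to its package URL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS package_urls ("
        "role TEXT PRIMARY KEY, url TEXT NOT NULL)"
    )


def register_package(conn: sqlite3.Connection, role: str, url: str) -> None:
    """Store or update the package URL for a role."""
    conn.execute(
        "INSERT OR REPLACE INTO package_urls (role, url) VALUES (?, ?)",
        (role, url),
    )
    conn.commit()


def lookup_package(conn: sqlite3.Connection, role: str) -> Optional[str]:
    """Return the package URL for a role, or None if none is registered."""
    row = conn.execute(
        "SELECT url FROM package_urls WHERE role = ?", (role,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_catalog(conn)
    register_package(
        conn, "weather-generator",
        "https://cumulus.example.org/packages/weather-generator.tar.gz",
    )
    print(lookup_package(conn, "weather-generator"))
```

The VMM would call something like lookup_package with the role of the part of the network that is under strain and pass the resulting URL to the new worker.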

Classes of Virtual Machines: There are several different types of nodes within our network. The first kind is the worker node. These nodes will be used for computation and analysis as well as image processing and modeling. They are present all across our network, from the Weather Generator to the Agricultural nodes. In addition to the worker node, we also have a webserver node that doubles as a database. Finally, the VMM is in a class of its own, as it is the brains behind the whole operation and ties the dynamic structure together. Those are the basic node types.

Packages: The packages described earlier will come in the form of Chef recipes. These are Ruby-scripted packages that allow for complete customization of any node, which will prove quite valuable within the proposed infrastructure. As noted above, the packages will be stored as objects within a bucket.

Big Data Solution: The first major challenge in handling Big Data will be space. I have already set up a bucket system within Cumulus to handle the space constraints that Big Data often presents. On top of that, I have a Python script that reads the data in chunks. As the data is read in, it is transferred to a bucket in Cumulus, where it is stored as an object. When the data is uploaded to Cumulus, the Python script also generates a URL for the object, which is stored in a MySQL database table. The data can later be downloaded by worker VMs once they are given the URL. Radu and I are currently setting up a test environment for this.
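The chunked read can be sketched as follows. This is not the script itself, only an illustration of the technique: the 8 MiB chunk size, the store_chunk callback, and the choice to store each chunk under its own numbered key are assumptions (the upload to Cumulus is represented by the callback and not implemented here).

```python
import io
from typing import BinaryIO, Callable, Iterator, List

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB; a hypothetical default


def read_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks so the whole input never sits in memory at once."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def upload_in_chunks(
    stream: BinaryIO,
    base_key: str,
    store_chunk: Callable[[str, bytes], None],
    chunk_size: int = CHUNK_SIZE,
) -> List[str]:
    """Pass each chunk to `store_chunk` under a numbered key; return the keys."""
    keys = []
    for index, chunk in enumerate(read_chunks(stream, chunk_size)):
        key = f"{base_key}.part{index:05d}"
        store_chunk(key, chunk)  # stand-in for the upload to a bucket
        keys.append(key)
    return keys


if __name__ == "__main__":
    bucket = {}  # an in-memory dict standing in for a bucket
    data = io.BytesIO(b"abcdefghij")
    keys = upload_in_chunks(data, "dataset", bucket.__setitem__, chunk_size=4)
    print(keys)
```

Zero-padded part numbers keep the keys in upload order when sorted, which makes it straightforward for a worker to reassemble the data after downloading the parts.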

Closing Remarks: After reading through this document you should have a good idea of what I am proposing. It is a system with a dynamic infrastructure capable of running self-sufficiently, since its VMM will find gaps within the cloud network and fill them. The system will also be capable of shutting down VMs if it sees that the load is low and that the work would run better on several machines instead of a fleet. It is extremely important that communication functions correctly, and RabbitMQ has proven capable of that. In closing, I believe this model will give us the best chance of succeeding in our endeavor to build a cloud network that supports sensors, modeling procedures, data collection, and much more.