めもめも

The content of this blog reflects my personal views and does not necessarily represent the positions, strategies, or opinions of my employer.

Prototype of DCK Server for OpenStack

A new term, "Cloud OS", is floating around these days. If you interpret it as a "system to operate computing resources in the cloud", it makes some sense to me. However, I feel there could be a more direct comparison between using a traditional operating system (which controls resources in a single machine) and using an IaaS cloud (where you control resources across multiple machines).

One idea is to think of VM instances on the cloud as running processes on a single machine. If you pursue this comparison, you may conclude that a "Cloud OS" should give you an interface for administering a large number of VM instances based on the multi-processing model of a traditional operating system. For example, you could list the running instances with a "ps" command, and the relationships among instances would fit a tree (parent-child) model.

To clarify this idea, I created a makeshift REST API server which wraps the OpenStack APIs as below.

              REST API              REST API
[Client]  --------> [DCK Server] ------> [OpenStack]

I call it "DCK Server". The meaning of DCK will be revealed later, but please think of it as a meaningless prefix for the moment. Now I will show you how you can manage an IaaS cloud through the DCK server.

Image handling

If VM instances are "processes", their template images correspond to the binary files of those processes. So you can use "ls" to get a list of templates.

#### list all registered images
$ dck-ls            
{
  "images": [
    {
      "url": "http://localhost:5000/dck/api/v1.0/image/aed1bb23-b0f3-4ac0-a500-3f44ce1e79e1",
      "id": "aed1bb23-b0f3-4ac0-a500-3f44ce1e79e1",
      "name": "Fedora18"
    },
    {
      "url": "http://localhost:5000/dck/api/v1.0/image/a69d46b1-65f6-45db-81ee-a9f605dcf0f8",
      "id": "a69d46b1-65f6-45db-81ee-a9f605dcf0f8",
      "name": "Fedora19"
    },
    {
      "url": "http://localhost:5000/dck/api/v1.0/image/c64d6355-8a56-4b13-b391-b061ec930868",
      "id": "c64d6355-8a56-4b13-b391-b061ec930868",
      "name": "RHEL64"
    }
  ]
}

#### show details of a specific image
$ dck-ls c64d6355-8a56-4b13-b391-b061ec930868
{
  "image": {
    "status": "active",
    "name": "RHEL64",
    "deleted": "False",
    "checksum": "24a7dbc9b430922953ff5f3479a77294",
    "created_at": "2013-07-11T12:59:20",
    "disk_format": "qcow2",
    "updated_at": "2013-07-11T12:59:26",
    "owner": "f057a8457a8144b48fded7e6ba644e92",
    "protected": "False",
    "min_ram": "0",
    "container_format": "bare",
    "min_disk": "0",
    "is_public": "True",
    "id": "c64d6355-8a56-4b13-b391-b061ec930868",
    "size": "700186624"
  }
}

Actually, "dck-ls" is just a shell function that wraps the following REST requests.

function dck-ls {
    # No argument: list all images. With an image id: show that image.
    if [[ -z "$1" ]]; then
        curl http://localhost:5000/dck/api/v1.0/images
    else
        curl "http://localhost:5000/dck/api/v1.0/image/$1"
    fi
}

Instance handling

You can use "ps" to get a list of running instances.

$ dck-ps
{
  "instances": [
    {
      "url": "http://localhost:5000/dck/api/v1.0/proc/0",
      "ppid": 0,
      "pid": 0,
      "name": "kernel"
    },
    {
      "url": "http://localhost:5000/dck/api/v1.0/proc/1",
      "ppid": 0,
      "pid": 1,
      "name": "init"
    }
  ]
}

Ah, there are no real instances yet, so you see only dummy entries that stand in for the Linux kernel (pid=0) and /sbin/init (pid=1). Let's start a new instance with "exec".

$ dck-exec 1 exec webserver01 72fcc69a-eae3-4367-a883-5aed9983d07c "#User-data for the initialization"
{
  "instance": {
    "userdata": "#User-data for the initialization",
    "ppid": 1,
    "pid": 2,
    "name": "webserver01",
    "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c"
  }
}

This starts a new instance as a child of the instance with pid=1 (so the new instance has ppid=1), using the specified image and user-data. pid=2 is automatically assigned to the new instance. You can also "fork" an existing instance.

$ dck-exec 2 fork webserver02
{
  "instance": {
    "userdata": "#User-data for the initialization",
    "ppid": 2,
    "pid": 3,
    "name": "webserver02",
    "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c"
  }
}

$ dck-exec 2 fork webserver03
{
  "instance": {
    "userdata": "#User-data for the initialization",
    "ppid": 2,
    "pid": 4,
    "name": "webserver03",
    "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c"
  }
}

A forked instance has the same template image and user-data as its parent instance, as seen above.
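
The exec/fork behavior above can be modeled in a few lines of Python. This is only a minimal in-memory sketch of the semantics seen in the output, not the actual DCK Server code: the class `ProcTable` and the methods `exec_instance` and `fork_instance` are hypothetical names, and nothing here talks to OpenStack.

```python
import itertools


class ProcTable:
    """In-memory table of instances keyed by pid (illustrative only)."""

    def __init__(self):
        # pid 0 and pid 1 are placeholders, like the "kernel" and "init" entries.
        self.procs = {
            0: {"pid": 0, "ppid": 0, "name": "kernel"},
            1: {"pid": 1, "ppid": 0, "name": "init"},
        }
        self._next_pid = itertools.count(2)

    def exec_instance(self, ppid, name, imageid, userdata):
        """Register a new instance as a child of ppid with an explicit image."""
        if ppid not in self.procs:
            raise KeyError(f"no such parent pid: {ppid}")
        pid = next(self._next_pid)
        self.procs[pid] = {
            "pid": pid, "ppid": ppid, "name": name,
            "imageid": imageid, "userdata": userdata,
        }
        return self.procs[pid]

    def fork_instance(self, ppid, name):
        """Register a child that inherits imageid and userdata from its parent."""
        if ppid not in self.procs:
            raise KeyError(f"no such parent pid: {ppid}")
        parent = self.procs[ppid]
        if "imageid" not in parent:
            raise ValueError(f"pid {ppid} is a placeholder and cannot be forked")
        return self.exec_instance(ppid, name, parent["imageid"], parent["userdata"])


table = ProcTable()
web1 = table.exec_instance(1, "webserver01",
                           "72fcc69a-eae3-4367-a883-5aed9983d07c",
                           "#User-data for the initialization")
web2 = table.fork_instance(web1["pid"], "webserver02")
print(web2["pid"], web2["ppid"], web2["imageid"])
```

Refusing to fork the placeholder entries is a design choice of this sketch; the original post does not say how the server handles that case.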

Now you can see a meaningful list of instances.

#### list all instances
$ dck-ps
{
  "instances": [
    {
      "url": "http://localhost:5000/dck/api/v1.0/proc/0",
      "ppid": 0,
      "pid": 0,
      "name": "kernel"
    },
    {
      "url": "http://localhost:5000/dck/api/v1.0/proc/1",
      "ppid": 0,
      "pid": 1,
      "name": "init"
    },
    {
      "userdata": "#User-data for the initialization",
      "name": "webserver01",
      "url": "http://localhost:5000/dck/api/v1.0/proc/2",
      "pid": 2,
      "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c",
      "ppid": 1
    },
    {
      "userdata": "#User-data for the initialization",
      "name": "webserver02",
      "url": "http://localhost:5000/dck/api/v1.0/proc/3",
      "pid": 3,
      "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c",
      "ppid": 2
    },
    {
      "userdata": "#User-data for the initialization",
      "name": "webserver03",
      "url": "http://localhost:5000/dck/api/v1.0/proc/4",
      "pid": 4,
      "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c",
      "ppid": 2
    }
  ]
}


#### show only a specified instance.
$ dck-ps 2
{
  "instance": {
    "userdata": "#User-data for the initialization",
    "name": "webserver01",
    "url": "http://localhost:5000/dck/api/v1.0/proc/2",
    "pid": 2,
    "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c",
    "ppid": 1
  }
}

It's possible to list the children of a specific parent, too.

#### list of children with ppid=2
$ dck-ps-ppid 2
{
  "instances": [
    {
      "userdata": "#User-data for the initialization",
      "name": "webserver02",
      "url": "http://localhost:5000/dck/api/v1.0/proc/3",
      "pid": 3,
      "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c",
      "ppid": 2
    },
    {
      "userdata": "#User-data for the initialization",
      "name": "webserver03",
      "url": "http://localhost:5000/dck/api/v1.0/proc/4",
      "pid": 4,
      "imageid": "72fcc69a-eae3-4367-a883-5aed9983d07c",
      "ppid": 2
    }
  ]
}
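
Because every entry carries a pid and a ppid, the flat list is enough to rebuild the whole tree on the client side. The following is a rough sketch under that assumption: the sample rows copy only the pid, ppid, and name fields from the output above, and the helper names `children` and `tree_lines` are made up for illustration.

```python
# Sample rows shaped like the dck-ps output (only the fields used here).
INSTANCES = [
    {"pid": 0, "ppid": 0, "name": "kernel"},
    {"pid": 1, "ppid": 0, "name": "init"},
    {"pid": 2, "ppid": 1, "name": "webserver01"},
    {"pid": 3, "ppid": 2, "name": "webserver02"},
    {"pid": 4, "ppid": 2, "name": "webserver03"},
]


def children(instances, ppid):
    """Return direct children of ppid; pid 0 is its own parent, so skip it."""
    return [p for p in instances if p["ppid"] == ppid and p["pid"] != ppid]


def tree_lines(instances, pid=0, depth=0):
    """Render the subtree rooted at pid as indented 'pid name' lines."""
    by_pid = {p["pid"]: p for p in instances}
    lines = ["  " * depth + f"{pid} {by_pid[pid]['name']}"]
    for child in sorted(children(instances, pid), key=lambda p: p["pid"]):
        lines.extend(tree_lines(instances, child["pid"], depth + 1))
    return lines


print("\n".join(tree_lines(INSTANCES)))
```

With the sample rows, this prints a pstree-like view with "kernel" at the root and the two forked web servers nested under "webserver01".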

Again, these dck-* commands are just shell functions that wrap the following REST API calls.

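function dck-exec {
    # $1: parent pid, $2: "exec" or "fork", $3: name, $4: image id, $5: user-data
    # Values are interpolated as-is, so they must not contain double quotes.
    local contents="\"type\": \"$2\", \"name\": \"$3\", \"imageid\": \"$4\", \"userdata\": \"$5\""
    curl -H 'Content-Type: application/json' -X POST -d "{ $contents }" "http://localhost:5000/dck/api/v1.0/proc/$1"
}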

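function dck-ps {
    # No argument: list all instances. With a pid: show that instance.
    if [[ -z "$1" ]]; then
        curl http://localhost:5000/dck/api/v1.0/procs
    else
        curl "http://localhost:5000/dck/api/v1.0/proc/$1"
    fi
}

function dck-ps-ppid {
    # $1: pid of the parent whose children are listed
    curl "http://localhost:5000/dck/api/v1.0/proc/$1/children"
}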
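
The dck-exec wrapper builds its JSON body by string interpolation, which would break if the user-data contained a double quote. As a hedged alternative, here is a small Python sketch that builds the same POST request with `json.dumps`, so escaping is handled for you. The base URL and field names are taken from the wrappers above; the function name `build_exec_request` is hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000/dck/api/v1.0"  # same base URL as the shell wrappers


def build_exec_request(ppid, kind, name, imageid="", userdata=""):
    """Build (but do not send) the POST request that dck-exec issues."""
    body = json.dumps({
        "type": kind,          # "exec" or "fork"
        "name": name,
        "imageid": imageid,    # left empty for "fork", as in the shell wrapper
        "userdata": userdata,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/proc/{ppid}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_exec_request(1, "exec", "webserver01",
                         "72fcc69a-eae3-4367-a883-5aed9983d07c",
                         'echo "hello"')
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually send it, you would pass `req` to `urllib.request.urlopen` while the DCK server is listening on localhost:5000.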

So what?

As the code of the DCK server is really a makeshift prototype using python-flask, I won't publish it here. Still, the operations shown above may give you some inspiration on how you can/should manage VM instances on the cloud. They enable you to manage cloud resources based on the traditional multi-process programming model. Yes, "programmable infrastructure" is yet another new term floating around, but its actual programming model has not been discussed enough. I hope pursuing this comparison will yield a feasible and useful example.

In particular, a traditional OS offers inter-process communication methods such as signals and shared memory. Adding similar "inter-instance communication methods" would provide a foundation for an automated/autonomic programming model of the cloud....