2016-07-27

Introduction to Google Cloud Platform for OpenStackers (Part IV)

Disclaimer

This is not an official document of Google Cloud Platform. Please refer to the GCP web site for the official information.

cloud.google.com

Part III of this article series is here.

enakai00.hatenablog.com

This part shows how a typical three-tier web application system can be built on OpenStack and on GCP, respectively.

Using OpenStack

The diagram below shows a system architecture for a typical three-tier web application consisting of the following components.

Load Balancer
Web Server
Web Application Server
Database Server

They are deployed across two data centers for redundancy over independent failure zones. (Note that Zones in a single region share a single controller cluster.) You may typically use the following OpenStack components for this configuration.

Network (Neutron)
VM Instance (Nova)
Security Group (Nova)
Cinder Volume (Cinder)
Object Storage (Swift)

I will describe how each of them is used here. Note that, as mentioned in Part 1, it is not that easy to have multiple Availability Zones across multiple data centers in OpenStack. So we assume that there is an independent OpenStack cluster (Region) in each of the two data centers.

Network

In the diagram above, there are multiple subnets to separate packets in different tiers. But as mentioned in Part 2, you don't necessarily have to emulate the same network architecture in the cloud. You can use a single subnet for all tiers and achieve the packet separation with Security Groups. Following this option gives the simple network design below, with just a single subnet in each Region.

VM instance

In many cases, you would deploy all servers (Load Balancer, Web Server, Web Application Server and Database Server) as VM instances. Even though OpenStack provides an LBaaS (Load Balancer as a Service) feature, you need a third-party proprietary plugin for production use. The standard open source plugin is provided as a sample implementation and is not ready for production at the time of writing.

Security Group

Security Group defines a set of firewall rules according to server roles. In this configuration, you need to define the following groups according to four server roles (Load Balancer, Web Server, Web Application Server, Database Server).

Security Group	Source	Protocol

load-balancer	any	HTTP/HTTPS
	Management Subnet	SSH

web-server	load-balancer	HTTPS
	Management Subnet	SSH

web-application	web-server	TCP8080
	Management Subnet	SSH

database	web-application	MYSQL
	Management Subnet	SSH

The packet source can be specified with a subnet range or the name of a Security Group. "Management Subnet" stands for a subnet range from which sysadmins log in to the guest OS for maintenance. We assume that client SSL is terminated at the Load Balancer, which communicates with the Web Servers over HTTPS. Web Servers communicate with Web Application Servers over TCP8080. MySQL is used for the Database Server.

After defining these Security Groups, you will assign them to each instance as below.

Instance	Security Group
Load Balancer	load-balancer
Web Server	web-server
Web Application Server	web-application
Database Server	database
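
The two tables above can also be read as plain data. The following is a minimal sketch of that reading, not OpenStack API code: the RULES layout, the is_allowed helper, the "management" label, and the port numbers assumed for HTTP, HTTPS, SSH and MySQL (80, 443, 22, 3306) are all illustrative assumptions.

```python
# Hypothetical model of the Security Group tables above (not the OpenStack API).
# Each group maps to the (source, port) pairs it accepts.
# Assumed ports: HTTP=80, HTTPS=443, SSH=22, MySQL=3306.
RULES = {
    "load-balancer": [("any", 80), ("any", 443), ("management", 22)],
    "web-server": [("load-balancer", 443), ("management", 22)],
    "web-application": [("web-server", 8080), ("management", 22)],
    "database": [("web-application", 3306), ("management", 22)],
}


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Return True if the `destination` group accepts `port` from `source`."""
    for allowed_source, allowed_port in RULES.get(destination, []):
        if allowed_port == port and allowed_source in ("any", source):
            return True
    return False
```

For example, is_allowed("web-server", "web-application", 8080) holds, while a Load Balancer trying to reach the database directly on port 3306 is rejected because only the web-application group is listed as a source.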

Cinder Volume

As described in Part 3, there are two options, Ephemeral Disk and Cinder Volume, for instance attached disks. Since Cinder Volume is required to store persistent application data, you need to attach a Cinder Volume to Database Server as a data volume.

Object Storage

To achieve redundancy across data centers, you need to replicate the database between the two Regions. Though you can use the replication mechanism of MySQL in theory, it may not be a realistic option, since the network bandwidth between data centers is generally limited. The safe option is to take periodic database backups into the object storage. When you need to fail over the service to the other Region, you restore the most recent data to the Database Server in that Region. This means you use the two Regions in an active-backup configuration.
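
The "restore the most recent data" step can be sketched as picking the newest backup object by its timestamp. This is only an illustration: the object naming scheme "mysql-backup-YYYYmmddHHMMSS.sql.gz" and the latest_backup helper are assumptions, and listing or downloading objects from the object storage is left out.

```python
from datetime import datetime

# Hypothetical naming scheme for backup objects (an assumption for illustration).
PREFIX = "mysql-backup-"
SUFFIX = ".sql.gz"


def latest_backup(object_names):
    """Return the name of the most recent backup, or None if there is none."""
    dated = []
    for name in object_names:
        if name.startswith(PREFIX) and name.endswith(SUFFIX):
            stamp = name[len(PREFIX):-len(SUFFIX)]
            try:
                dated.append((datetime.strptime(stamp, "%Y%m%d%H%M%S"), name))
            except ValueError:
                continue  # skip names whose timestamp does not parse
    return max(dated)[1] if dated else None
```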

Considerations on Failover Operation

As mentioned above, you need to restore the most recent data from the backup stored in the object storage. The object storage system should also be deployed across data centers in a redundant manner for this purpose.

You also need to take care of the global IP change. Since you cannot use the same global IP in different Regions, you need to use some additional mechanism such as Dynamic DNS so that clients can access the service with the same URL.

Using GCP

Since GCP provides managed services for the SQL database and the load balancer, you don't have to use VM instances for these features. In addition, as described in Part 1, the internal network between Zones in a Region has bandwidth comparable to that within a data center. So by using multiple Zones within a single Region, you can achieve redundancy over independent failure zones with a relatively simple configuration.

Considering these points, you can build the same system using the following components on top of GCP.

Network

You just define a single subnet in the Region which covers multiple Zones transparently.

VM Instance

You deploy Web Servers and Web Application Servers as VM instances. You can use managed services for other functions.

Firewall Rules

Firewall rules are defined with tag names assigned to instances as destination. The source can be specified with a subnet range or tag names. In this scenario, you would need to define the following rules.

Source	Destination	Protocol
130.211.0.0/22	web-server	HTTP, HTTPS
web-server	web-application	TCP8080
Management Subnet	web-server, web-application	SSH

"130.211.0.0/22" is a subnet range from where Load Balancer (and health checking system) accesses Web Server. "Management Subnet" stands for a subnet range from which sysadmins login to the guest OS for maintenance.

"web-server" and "web-application" are tag names assigned to Web Server and Web Application Server respectively. Instead of assigning rules as in Security Groups, you assign a tag name to instances according to their roles.

The access control for the database connection is configured based on the source IP address in the configuration of Cloud SQL.
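
The tag-based matching described above can be sketched with the standard ipaddress module. This is a hypothetical model rather than the GCP API: the rule dictionaries, the firewall_allows helper, the 203.0.113.0/24 range standing in for "Management Subnet", and the port numbers 80, 443 and 22 are assumptions made for illustration.

```python
import ipaddress

# Hypothetical model of the firewall table above (not the GCP API).
# A rule's source is either a CIDR range or a tag; its destination is a tag.
MANAGEMENT_SUBNET = "203.0.113.0/24"  # placeholder for "Management Subnet"
FIREWALL_RULES = [
    {"source_range": "130.211.0.0/22", "target_tag": "web-server", "ports": {80, 443}},
    {"source_tag": "web-server", "target_tag": "web-application", "ports": {8080}},
    {"source_range": MANAGEMENT_SUBNET, "target_tag": "web-server", "ports": {22}},
    {"source_range": MANAGEMENT_SUBNET, "target_tag": "web-application", "ports": {22}},
]


def firewall_allows(source_ip, source_tags, target_tags, port):
    """Return True if any rule admits this packet to an instance with `target_tags`."""
    address = ipaddress.ip_address(source_ip)
    for rule in FIREWALL_RULES:
        if rule["target_tag"] not in target_tags or port not in rule["ports"]:
            continue
        if "source_range" in rule:
            if address in ipaddress.ip_network(rule["source_range"]):
                return True
        elif rule["source_tag"] in source_tags:
            return True
    return False
```

Note that the instance only carries tags; the rules live in one place and refer to those tags, which is the difference from Security Groups pointed out above.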

HTTP(S) Load Balancer

GCP provides an HTTP(S) Load Balancer service with which you can use a single global IP address (VIP) to distribute client accesses to multiple Regions and Zones. In addition, you can use a single Cloud SQL instance from multiple Zones, as explained later. Leveraging these features, you can use an active-active configuration across multiple Zones, resulting in a redundant service across data centers.

Cloud SQL

Cloud SQL is a managed service for MySQL database which provides data replication between multiple Zones. Scheduled data backup and failover process are automated in a transparent manner. You can also access the database from multiple Zones in the same Region thanks to the high bandwidth internal network.

Considerations on Failover Operation

Using the active-active configuration described above, failover can be completely automated. Since the load balancer service provides a single global IP for all Zones, clients can keep using the same URL to access the service.

This exemplifies the fact that you can build a high availability service with a simple architecture by combining Compute Engine with appropriate managed services on GCP.

2016-07-26

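Introduction to Google Cloud Platform for OpenStack Users (Part 4)

Disclaimer

This is not an official GCP document. Please refer to the official web site for the official documentation.

・Google Cloud Platform

If you would like to try it yourself, a 60-day free trial is available. (You need to register a credit card, but you will not be charged after the trial ends unless you explicitly enable billing.)

cloud.google.com

Part 3 of this article series is here.

enakai00.hatenablog.com

Part 4 uses a three-tier web application as an example to compare which components are combined to build the system on each platform.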

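Example configuration with OpenStack

The book "OpenStackクラウドインテグレーション" (OpenStack Cloud Integration) presents the following example configuration.

It builds a typical three-tier web application consisting of a load balancer, web servers, web application servers and a DB server, and uses multiple data centers to achieve a DR (disaster recovery) configuration. If you build this environment as described in the book, you will use the following components.

・Network (Neutron)
・VM instance (Nova)
・Security Group (Nova)
・Cinder Volume (Cinder)
・Object Storage (Swift)

I will describe how each of these components is configured in turn.

Note that the book assumes DR is achieved with multiple Availability Zones (hereafter, Zones) in a single Region. However, as explained in Part 1, it is not that easy in current OpenStack to set up multiple Zones in geographically separate locations. Here we assume that DR is achieved with multiple Regions. (Note that in OpenStack, Zones in the same Region share the controller nodes.) Assume that independent clusters are set up as separate Regions in two data centers in Japan.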

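Network

The book creates multiple subnets to reproduce the physical network in the cloud. Here we instead use a single subnet and restrict communication between instances with Security Groups. Because this requires only a single subnet in each Region, the virtual network configuration becomes simple.

VM instance

In the book, the load balancer, web servers, web application servers and DB server are all deployed as VM instances. For the load balancer, you can also use the LBaaS (Load Balancer as a Service) feature, but the plugin provided as standard open source is positioned as a sample implementation, and a third-party commercial plugin is required for serious use.

Security Group

With OpenStack Security Groups, you define a Security Group for each server role and specify the packets it is allowed to receive. Because the packet source can also be specified as a Security Group, you end up defining the following Security Groups overall.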

Security Group	Source	Protocol

load-balancer	any	HTTP/HTTPS
	Management Subnet	SSH

web-server	load-balancer	HTTPS
	Management Subnet	SSH

web-application	web-server	TCP8080
	Management Subnet	SSH

database	web-application	MYSQL
	Management Subnet	SSH

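"Management Subnet" is the subnet range of the network from which system administrators log in to instances over SSH from outside. SSL communication with clients is terminated at the load balancer, which uses HTTPS to communicate with the web servers, and TCP8080 is used for communication with the web application servers. MySQL is assumed for the database.

With these defined, you assign Security Groups to each instance as follows.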

Instance	Security Group
Load Balancer	load-balancer
Web Server	web-server
Web Application Server	web-application
Database Server	database

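Cinder Volume

There are two kinds of disks attached to VM instances, Ephemeral Disk and Cinder Volume, and data that must be stored persistently has to go on a Cinder Volume. In this configuration, a Cinder Volume is mandatory for the data storage area of the DB server.

Object Storage

Backups of the DB server data are stored in the object storage.

DR switchover procedure

When switching Regions for DR, you restore the database backup stored in the object storage to the DB server in the destination Region. As a prerequisite, the object storage itself must be replicated across multiple Regions so that it is redundant against a data center failure.

Another option is to replicate the database between Zones, but you need to be careful, because problems caused by insufficient network bandwidth may occur between geographically separate locations.

As for the IP address accessed from outside, the same global IP cannot be used across Regions. To keep external clients unaware of the IP address change, additional work is required, such as using DDNS to change the IP address registered in DNS.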

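Example configuration with GCP

GCP provides the database and the load balancer as managed services, so you do not need to provide these functions as VM instances. In addition, within the same Region, the GCP network provides bandwidth between Zones comparable to that within a Zone. Therefore, by using multiple Zones in a single Region, you can achieve redundancy across independent failure zones relatively easily. With these points in mind, you can build an environment equivalent to the OpenStack one described above using the following components.

The overall picture is as follows.

Network

Because a GCP subnet can span multiple Zones, you create a single subnet in the Region you use.

VM instance

You build the web servers and web application servers as VM instances.

Firewall rules

GCP firewall rules specify the destination with tags attached to instances. The source can be specified with either a subnet range or tags, so you prepare the following rules.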

Source	Destination	Protocol
130.211.0.0/22	web-server	HTTP, HTTPS
web-server	web-application	TCP8080
Management Subnet	web-server, web-application	SSH

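"130.211.0.0/22" is the source subnet range from which the load balancer accesses the web servers. "Management Subnet" is the subnet range of the network from which system administrators log in to instances over SSH from outside. "web-server" and "web-application" are the tags attached to the web server and web application server instances. Note that, unlike Security Groups, you do not assign rules to instances; you simply attach tags to them.

Connections from the web application servers to the database are controlled through the Cloud SQL settings described later, where you specify the IP addresses allowed to connect.

HTTP(S) load balancer

The load balancer provided by GCP can distribute load across multiple Regions and Zones with a single global IP address (VIP). As explained below, the same database can also be accessed from multiple Zones, so you can achieve a redundant configuration across multiple data centers in which both Zones are used at the same time.

Cloud SQL

Cloud SQL is a managed MySQL service that supports a redundant configuration spanning multiple Zones through replication. Periodic backups and failover on failure are also automated. In addition, as mentioned above, the GCP network provides bandwidth between Zones comparable to that within a Zone, so web application servers in each Zone can access the same database without problems.

DR switchover procedure

As explained above, because both Zones are used at the same time, no special switchover procedure is needed when a data center fails. Since the load balancer provides a single global IP address (VIP), the address that clients access does not change either.

As this example shows, combining Compute Engine with managed services appropriately lets you build a simple, highly available environment.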

2016-07-25

Introduction to Google Cloud Platform for OpenStackers (Part III)

Disclaimer

This is not an official document of Google Cloud Platform. Please refer to the GCP web site for the official information.

cloud.google.com

Part II of this article series is here.

enakai00.hatenablog.com

In this part, I will give a comparison between OpenStack Nova/Cinder and Compute Engine regarding VM instance management.

Instance size

In OpenStack, you would choose the "Instance Type" to specify the instance size (number of vCPUs and amount of memory). Instance Types are predefined by the sysadmin. General users are not allowed to add custom types.

On the other hand, in Compute Engine, instance sizes are defined as "Machine Types". In addition to choosing one of the predefined Machine Types when you create a new instance, you can set the number of vCPUs and the amount of memory separately to create your own "Custom Machine Type."
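
As a rough illustration, a Custom Machine Type is referred to by a name of the form custom-VCPUS-MEMORY_MB, for example custom-4-5120. The sketch below builds such a name. The custom_machine_type helper is hypothetical, and the limits it checks (1 or an even number of vCPUs, memory in multiples of 256MB, 0.9GB to 6.5GB per vCPU) are assumptions recalled from the Compute Engine documentation of the time; check the current documentation before relying on them.

```python
def custom_machine_type(vcpus: int, memory_mb: int) -> str:
    """Build a custom machine type name such as 'custom-4-5120'.

    The validation limits below are assumptions, not authoritative values.
    """
    if vcpus < 1 or (vcpus != 1 and vcpus % 2 != 0):
        raise ValueError("vCPU count must be 1 or an even number")
    if memory_mb % 256 != 0:
        raise ValueError("memory must be a multiple of 256MB")
    memory_per_vcpu_gb = memory_mb / 1024 / vcpus
    if not 0.9 <= memory_per_vcpu_gb <= 6.5:
        raise ValueError("memory per vCPU must be between 0.9GB and 6.5GB")
    return f"custom-{vcpus}-{memory_mb}"
```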

Storage options

OpenStack: Ephemeral Disk and Cinder Volume

OpenStack provides two different kinds of instance-attached disks, "Ephemeral Disk" and "Cinder Volume". Initially, Ephemeral Disk was intended to be used as a system disk containing operating system files, whereas Cinder Volume was intended to store persistent application data. However, because Live Migration is not available with Ephemeral Disk, people often use Cinder Volume for the system disk, too.

When you choose Ephemeral Disk for a system disk, the OS template image (managed by Glance) is cloned into the local storage of the compute node, and the local image is directly attached to the instance. When you destroy the instance, the attached image is also destroyed. On the other hand, Cinder Volume provides a persistent disk area (LUN) which resides in an external storage box. In a typical configuration, the disk area (LUN) is attached to the compute node using the iSCSI protocol, and then attached to the instance as a virtual disk. Even when you destroy the instance, the attached volume remains in the external storage. You can reuse the volume by attaching it to a new instance.

Comparing Ephemeral Disk and Cinder Volume

Since the disk image is stored in the local storage of the compute node, Ephemeral Disk is sometimes used to achieve better storage performance. It is also possible to use multiple storage boxes with different performance characteristics for the backend device of Cinder Volume. Users can specify the preferred backend in creating a new Cinder Volume.

In addition, applications running on the instance can access the object storage provided by OpenStack Swift.

Compute Engine: Persistent Disk and Local SSD

Compute Engine provides "Persistent Disk" as a persistent storage attached to an instance which corresponds to "Cinder Volume" of OpenStack. It is used for a system disk containing operating system files, and for storing persistent application data as well. The data is automatically encrypted when it goes out of the instance to the physical storage. A single Persistent Disk can be extended up to 64TB, but the maximum total capacity of the attached disks is restricted based on the instance size.

When you need high-performance local storage, you can also use local SSDs. By attaching eight SSDs of 375GB each, you can use 3TB of local SSD capacity in total (8 x 375GB = 3,000GB). Live Migration is available even with local SSDs. However, since the data in local SSDs is copied between instances during Live Migration, there may be a temporary decrease in storage performance.

In addition, applications running on the instance can access the object storage provided by Cloud Storage.

Reference:

Storage Options

Instance Metadata

OpenStack: Metadata Service

OpenStack has a "Metadata Service" which provides the mechanism for retrieving instance information from the instance guest OS. By accessing the special URL under "http://169.254.169.254/latest/meta-data", you can retrieve instance information such as the instance type, security groups and assigned IP addresses. You can also add custom metadata in the "Key-Value" form.

There is also a special metadata item called "user-data". If you specify a text file as user-data when creating a new instance, the "cloud-init" agent running in the guest OS will receive it and do some startup configuration according to its contents. Typically, you use a script file as user-data, which is executed by cloud-init to install application packages on startup.
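
Reading a value from the Metadata Service inside the guest OS can be sketched as follows. This is a minimal example using only the Python standard library. The read_metadata and metadata_url helpers and the injectable fetch parameter are illustrative assumptions, and the key names shown (such as "instance-id") follow the EC2-compatible layout under the URL quoted above.

```python
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest/meta-data"


def metadata_url(key: str) -> str:
    """Build the URL for one metadata key, e.g. 'instance-id'."""
    return f"{METADATA_BASE}/{key.lstrip('/')}"


def read_metadata(key: str, fetch=None, timeout: float = 2.0) -> str:
    """Fetch one metadata value; `fetch` can be replaced when testing off-instance."""
    if fetch is None:
        def fetch(url):
            # Only reachable from inside an instance.
            with urlopen(url, timeout=timeout) as response:
                return response.read().decode("utf-8")
    return fetch(metadata_url(key)).strip()
```

Inside an instance, read_metadata("instance-id") would issue a plain HTTP GET; outside one, passing a stub as fetch lets you exercise the surrounding code without network access.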

Compute Engine: Metadata Server

Compute Engine has a "Metadata Server" which provides the mechanism for retrieving instance and project information from the instance guest OS. One of the following URLs is used to access metadata.

Metadata Server provides "Instance metadata" and "Project metadata". Instance metadata is associated to each instance providing information specific to the instance whereas Project metadata is associated to the project and shared by all instances in the project. You can add custom metadata for both Instance metadata and Project metadata in the "Key-Value" form.

There are also special metadata keys called "startup-script" and "shutdown-script". They are executed when you start or shut down the instance, respectively. Unlike OpenStack's user-data, startup-script is executed every time you restart the instance. These are handled by the agent included in "compute-image-packages", which is preinstalled in the official template images.

Reference:

Storing and Retrieving Instance Metadata

Agents in guest OS

OpenStack: cloud-init

The agent package called "cloud-init" is preinstalled in the standard guest OS images of OpenStack. It handles the initial configuration at first boot, such as extending the root filesystem, storing an SSH public key and executing the script provided as user-data.

Compute Engine: compute-image-packages

The agent package called "compute-image-packages" is preinstalled in the standard guest OS images of Compute Engine. It handles the initial configurations at the first boot time, and also handles dynamic system configuration while the guest OS is running.

The dynamic system configuration includes adding new SSH public keys and changing network configurations required for HTTP load balancing for example. You can generate and add a new SSH key through Cloud Console or gcloud command after launching an instance. This is done through the agent running on the guest OS as a daemon process.

As a side note, the agent uses Metadata Server for the dynamic configuration. Compute Engine's Metadata Server provides a mechanism to notify the agent about metadata updates, and the agent is triggered with the metadata updates.
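
The notification mechanism can be pictured as a "hanging GET": a request with wait_for_change=true does not return until the requested metadata value changes. The sketch below only builds such a request with the standard library and does not send it. The build_metadata_request helper is hypothetical; the endpoint http://metadata.google.internal/computeMetadata/v1 and the Metadata-Flavor: Google header are stated here from my understanding of the Compute Engine documentation, not from this article.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint assumed from the Compute Engine documentation; reachable only from an instance.
GCE_METADATA_BASE = "http://metadata.google.internal/computeMetadata/v1"


def build_metadata_request(path: str, wait_for_change: bool = False) -> Request:
    """Build (but do not send) a Metadata Server request for `path`."""
    url = f"{GCE_METADATA_BASE}/{path.lstrip('/')}"
    if wait_for_change:
        # The server holds the connection open until the value changes.
        url += "?" + urlencode({"wait_for_change": "true"})
    # The server rejects requests that lack this header.
    return Request(url, headers={"Metadata-Flavor": "Google"})
```

An agent could pass the returned object to urllib.request.urlopen in a loop, reacting each time the call returns with an updated value.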

Reference:

Scripts and tools for Google Compute Engine Linux images.

Accessing other services from applications

In the standard guest OS images of Compute Engine, the SDK (including the gcloud command) is preinstalled, and you can use it with the "Service Account" privilege to access other GCP services. Through the IAM mechanism, access control is enforced per instance. For example, read-only authority for Cloud Storage is assigned to an instance by default. If you want your application running in the instance to store data in Cloud Storage, you need to assign the read-write authority to the instance. With this mechanism, you don't have to set up passwords or credentials by hand for applications running in the instance.
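
To see why no password needs to be stored, consider how an application can obtain a short-lived access token for the instance's Service Account from the Metadata Server. The following is a minimal sketch under assumptions: the token path and the JSON field access_token reflect my understanding of the Compute Engine documentation, and the helper names are hypothetical. Client libraries normally do this for you.

```python
import json
from urllib.request import Request, urlopen

# Path assumed from the Compute Engine documentation; reachable only from an instance.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def parse_access_token(body: str) -> str:
    """Extract the bearer token from the Metadata Server's JSON response."""
    return json.loads(body)["access_token"]


def fetch_access_token(timeout: float = 2.0) -> str:
    """Request a token for the instance's default Service Account."""
    request = Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    with urlopen(request, timeout=timeout) as response:
        return parse_access_token(response.read().decode("utf-8"))
```

What the returned token is allowed to do depends on the authority assigned to the instance, as described above.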

Reference:

Creating and Enabling Service Accounts for Instances

What's next?

In the next part, I will give a sample architecture for a typical three-tier web application on both OpenStack and GCP.

enakai00.hatenablog.com