めもめも

The content of this blog is my personal opinion and does not necessarily represent the positions, strategies, or opinions of my employer.

Introduction to Google Cloud Platform for OpenStackers (Part IV)

Disclaimer

This is not an official document of Google Cloud Platform. Please refer to the GCP web site for the official information.

cloud.google.com

Part III of this article series is here.

enakai00.hatenablog.com

This part shows how a typical 3-tier web application system can be built on top of OpenStack and GCP respectively.

Using OpenStack

The diagram below shows a system architecture for a typical 3-tier web application consisting of the following components.

  • Load Balancer
  • Web Server
  • Web Application Server
  • Database Server

They are deployed across two data centers for redundancy over independent failure zones. (Note that Zones in a single region share a single controller cluster.) You may typically use the following OpenStack components for this configuration.

  • Network (Neutron)
  • VM Instance (Nova)
  • Security Group (Nova)
  • Cinder Volume (Cinder)
  • Object Storage (Swift)

I will describe how each of them is used here. Note that, as mentioned in Part 1, it is not that easy to have multiple Availability Zones across multiple data centers in OpenStack. So we assume that there are two independent OpenStack clusters (or Regions), one in each data center.

Network

In the diagram above, there are multiple subnets to separate packets in different tiers. But as mentioned in Part 2, you don't necessarily have to emulate the same network architecture in the cloud. You can use a single subnet for all tiers, and the packet separation can be achieved with Security Groups. By following this option, you could have the following simple network design. There is just a single subnet in each Region.


VM instance

In many cases, you would deploy all servers (Load Balancer, Web Server, Web Application Server and Database Server) as VM instances. Even though OpenStack provides an LBaaS (Load Balancer as a Service) feature, you need some third-party proprietary plugins for production use. The standard open source plugin is provided as a sample implementation and is not ready for production at the time of writing.

Security Group

A Security Group defines a set of firewall rules according to a server role. In this configuration, you need to define the following groups for the four server roles (Load Balancer, Web Server, Web Application Server, Database Server).

Security Group    Source              Protocol
load-balancer     any                 HTTP/HTTPS
                  Management Subnet   SSH
web-server        load-balancer       HTTPS
                  Management Subnet   SSH
web-application   web-server          TCP8080
                  Management Subnet   SSH
database          web-application     MYSQL
                  Management Subnet   SSH

The packet source can be specified with a subnet range or the name of a Security Group. "Management Subnet" stands for a subnet range from which sysadmins log in to the guest OS for maintenance. We assume that client SSL is terminated at the Load Balancer, and that it communicates with Web Servers over HTTPS. Web Servers communicate with Web Application Servers over TCP8080. MySQL is used for the Database Server.
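The rule table above can be read as a small access-control model. The following is a minimal sketch of that reading in Python, not OpenStack API code. The management CIDR and the port numbers (80/443 for HTTP/HTTPS, 22 for SSH, 3306 for MySQL) are my assumptions, since the table only names the protocols.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# Assumption: a placeholder management range; the article gives no concrete CIDR.
MANAGEMENT_SUBNET = ip_network("192.0.2.0/24")


@dataclass(frozen=True)
class Rule:
    group: str   # Security Group that owns the rule (the receiving side)
    source: str  # "any", "management", or the name of another Security Group
    port: int    # destination TCP port (assumed standard ports)


RULES = [
    Rule("load-balancer", "any", 80),
    Rule("load-balancer", "any", 443),
    Rule("load-balancer", "management", 22),
    Rule("web-server", "load-balancer", 443),
    Rule("web-server", "management", 22),
    Rule("web-application", "web-server", 8080),
    Rule("web-application", "management", 22),
    Rule("database", "web-application", 3306),
    Rule("database", "management", 22),
]


def is_allowed(src_ip, src_groups, dst_group, port):
    """Return True if some rule of dst_group admits a packet from this source."""
    for rule in RULES:
        if rule.group != dst_group or rule.port != port:
            continue
        if rule.source == "any":
            return True
        if rule.source == "management":
            if ip_address(src_ip) in MANAGEMENT_SUBNET:
                return True
        elif rule.source in src_groups:
            # The source is identified by Security Group membership, not by IP.
            return True
    return False
```

For example, `is_allowed("10.0.0.5", {"web-server"}, "web-application", 8080)` is accepted, while the same Web Server trying to reach the database group on port 3306 is rejected, because only members of web-application are listed as a source there.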

After defining these Security Groups, you will assign them to each instance as below.

Instance                  Security Group
Load Balancer             load-balancer
Web Server                web-server
Web Application Server    web-application
Database Server           database

Cinder Volume

As described in Part 3, there are two options for instance-attached disks: Ephemeral Disk and Cinder Volume. Since a Cinder Volume is required to store persistent application data, you need to attach one to the Database Server as a data volume.

Object Storage

To realize the redundancy across data centers, you need to replicate the database between two Regions. Though you can use the replication mechanism of MySQL in theory, it may not be a realistic option considering the limited network bandwidth between data centers in general. The safe option is to take a periodic database backup into the object storage. When you need to fail over the service to another Region, you would restore the most recent data to the Database Server in that Region. This means you use two Regions in an active-backup configuration.
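As a rough illustration of "restore the most recent data", the sketch below picks the newest backup from a listing of object names. The naming scheme (`db-backup-<UTC timestamp>.sql.gz`) and the function names are hypothetical; a real setup would obtain the listing from the object storage API, which is not shown here.

```python
from datetime import datetime, timezone

# Assumption: backups are uploaded under names like
# "db-backup-20160102T120000Z.sql.gz". This scheme is made up for illustration.
PREFIX = "db-backup-"
SUFFIX = ".sql.gz"
STAMP = "%Y%m%dT%H%M%SZ"


def backup_name(when):
    """Build the object name for a backup taken at `when` (timezone-aware)."""
    return f"{PREFIX}{when.astimezone(timezone.utc).strftime(STAMP)}{SUFFIX}"


def parse_backup_time(name):
    """Return the UTC timestamp encoded in an object name, or None."""
    if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
        return None
    stamp = name[len(PREFIX):-len(SUFFIX)]
    try:
        return datetime.strptime(stamp, STAMP).replace(tzinfo=timezone.utc)
    except ValueError:
        return None


def latest_backup(object_names):
    """Pick the newest backup object, ignoring unrelated objects."""
    dated = [(parse_backup_time(n), n) for n in object_names]
    dated = [(t, n) for t, n in dated if t is not None]
    if not dated:
        return None
    return max(dated)[1]
```

Note that with this approach any data written after the last backup is lost on failover, so the backup interval bounds how much data you can lose.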

Considerations on Failover Operation

As mentioned above, you need to restore the most recent data from the backup stored in the object storage. The object storage system should also be deployed across data centers in a redundant manner for this purpose.

You also need to take care of the global IP change. Since you cannot use the same global IP in different Regions, you need to use some additional mechanism such as Dynamic DNS so that clients can access the service with the same URL.
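The global IP switch can be pictured as rewriting one DNS record. This is only a toy model under my own assumptions: the Region names, the documentation-range IP addresses and the `DnsRecord` type are hypothetical, and no real Dynamic DNS API is called.

```python
from dataclasses import dataclass


@dataclass
class DnsRecord:
    name: str     # hostname that clients use
    address: str  # global IP currently published for that hostname
    ttl: int      # seconds a resolver may cache the answer


# Assumption: one global IP per Region, taken from documentation address ranges.
REGION_IPS = {"region-1": "198.51.100.10", "region-2": "203.0.113.10"}


def failover(record, active_region, region_ips):
    """Return the record repointed at the other Region, plus that Region's name."""
    standby = next(r for r in region_ips if r != active_region)
    return DnsRecord(record.name, region_ips[standby], record.ttl), standby
```

The hostname stays the same, so clients keep the same URL. Resolvers may keep serving the old address until the TTL expires, which is why a short TTL is usually chosen for such records.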

Using GCP

Since GCP provides managed services for SQL database and load balancer, you don't have to use VM instances for these features. In addition, as described in Part 1, it has an internal network between Zones in a Region with bandwidth comparable to that within a data center. So using multiple Zones within a single Region, you can achieve the redundancy over independent failure zones with a relatively simple configuration.

Considering these points, you can build the same system using the following components on top of GCP.


Network

You just define a single subnet in the Region which covers multiple Zones transparently.

VM Instance

You deploy Web Servers and Web Application Servers as VM instances. You can use managed services for other functions.

Firewall Rules

Firewall rules are defined with tag names assigned to instances as destination. The source can be specified with a subnet range or tag names. In this scenario, you would need to define the following rules.

Source              Destination                   Protocol
130.211.0.0/22      web-server                    HTTP, HTTPS
web-server          web-application               TCP8080
Management Subnet   web-server, web-application   SSH

"130.211.0.0/22" is the subnet range from which the Load Balancer (and its health checking system) accesses the Web Servers. "Management Subnet" stands for a subnet range from which sysadmins log in to the guest OS for maintenance.

"web-server" and "web-application" are tag names assigned to Web Server and Web Application Server respectively. Instead of assigning rules as in Security Groups, you assign a tag name to instances according to their roles.
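To make the difference from Security Groups concrete, here is a minimal sketch that evaluates the three rules above against instance tags. It is a model for illustration, not the GCP API: the dictionary keys, the function name `gcp_allows`, the management CIDR and the port numbers (80/443 for HTTP/HTTPS, 22 for SSH) are my assumptions.

```python
from ipaddress import ip_address, ip_network

LB_RANGE = "130.211.0.0/22"          # range given in the article
MANAGEMENT_SUBNET = "192.0.2.0/24"   # assumption: placeholder management range

# Each rule names its destination by tag; the source is a range or a tag.
FIREWALL_RULES = [
    {"source_ranges": [LB_RANGE], "source_tags": [],
     "target_tags": ["web-server"], "ports": [80, 443]},
    {"source_ranges": [], "source_tags": ["web-server"],
     "target_tags": ["web-application"], "ports": [8080]},
    {"source_ranges": [MANAGEMENT_SUBNET], "source_tags": [],
     "target_tags": ["web-server", "web-application"], "ports": [22]},
]


def gcp_allows(src_ip, src_tags, dst_tags, port):
    """Return True if any rule admits traffic to an instance carrying dst_tags."""
    for rule in FIREWALL_RULES:
        if port not in rule["ports"]:
            continue
        if not set(rule["target_tags"]) & set(dst_tags):
            continue  # the rule does not apply to this instance
        in_range = any(ip_address(src_ip) in ip_network(r)
                       for r in rule["source_ranges"])
        has_tag = bool(set(rule["source_tags"]) & set(src_tags))
        if in_range or has_tag:
            return True
    return False
```

Adding a new Web Server then only requires tagging the instance with "web-server"; the rules themselves stay unchanged.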

The access control for the database connection is configured based on the source IP address in the configuration of Cloud SQL.

HTTP(S) Load Balancer

GCP provides an HTTP(S) Load Balancer service with which you can use a single global IP address (VIP) to distribute client accesses to multiple Regions and Zones. In addition, you can use a single Cloud SQL instance from multiple Zones as explained later. Leveraging these features, you can use an active-active configuration across multiple Zones, resulting in a redundant service across data centers.

Cloud SQL

Cloud SQL is a managed service for MySQL databases which provides data replication between multiple Zones. Scheduled data backup and the failover process are automated in a transparent manner. You can also access the database from multiple Zones in the same Region thanks to the high bandwidth internal network.

Considerations on Failover Operation

Using the active-active configuration described above, failover can be completely automated. Since the load balancer service provides a single global IP for all Zones, clients can keep using the same URL to access the service.
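The idea of automated failover behind a single VIP can be sketched with a toy router that only sends requests to healthy backends. This is a simplified model under my own assumptions (the backend names and the modulo-based distribution are hypothetical) and says nothing about how the GCP load balancer is actually implemented.

```python
def healthy_backends(backends, health):
    """Keep only the backends whose last health check passed."""
    return [b for b in backends if health.get(b, False)]


def route(request_id, backends, health):
    """Pick a backend for a request arriving at the single global IP."""
    alive = healthy_backends(backends, health)
    if not alive:
        raise RuntimeError("no healthy backend in any Zone")
    # Simple distribution over whatever is currently healthy.
    return alive[request_id % len(alive)]


# Assumption: one Web Server group per Zone, with made-up names.
BACKENDS = ["web-zone-a", "web-zone-b"]
```

With both Zones healthy, requests alternate between them. If the health check for "web-zone-a" fails, every request goes to "web-zone-b" without any change on the client side.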

This exemplifies the fact that you can build a high availability service with a simple architecture by combining Compute Engine with appropriate managed services on GCP.