Server farm management
Last updated: August, 2009
This page gives a short overview of how you can manage a high-availability server farm. It is not meant to be complete, but it explains the high-level (core) concepts for managing a server farm. Management is based on three concepts: redundancy, separation of servers and services from the actual data, and a specification model for managing the many different services. Note that this page does not cover virtualized servers, only servers on which the OS is installed directly.
It is important to set up redundant machines for critical services (e.g. database, caching). Services that will be reachable from the outside (Internet, WAN) can then be bundled and associated with a load balancer. For example, if you have four caching services you can bundle them behind the load balancer to prevent (or minimize) load issues on the individual machines.
A specification model defines, on a conceptual level, the relationships between servers, services, and so on. Figure 1 shows such a model and the relations (using UML notation) between the main concepts. The next paragraphs explain these relations in more detail. The server, service class and VIP specifications are defined in a so-called model configuration file. The model configuration file can be a text file (e.g. an XML file) that describes these relations.

Figure 1. Specification Model
The server is defined in the model configuration file. It has a list of attributes (e.g. name, place in rack) and a list of relations between the server and service classes. This file contains a sample structure that defines some of the relationships of Figure 1. A service class is usually a single application, such as an Apache or Tomcat server (but you are free to define multiple applications in a service class). Multiple servers can be associated with multiple service classes (hence the many-to-many relationship). The server-service class relationship in the model configuration file states: this server will run application (service) xyz.
The service classes are defined within the model configuration file too. Since they 'map' to real applications that will be installed on the server, they contain an attribute that points to a package specification. The package specification is a file outside the model configuration that contains the list of RPM packages the service needs in order to run on a base installation. A base installation is the installation of a collection of base packages on all servers in your farm (e.g. a basic OS like Fedora or CentOS), defined in a base kickstart file. Typically a package specification has a one-to-one relationship with the service class (this is not mandatory, but why define different service classes that point to the same packages?). The package specification lists a set of RPM packages. An RPM package can appear in multiple package specifications (e.g. the perl package might be needed by the frontend production service and a backend development service alike).
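As a rough illustration of how a package specification could be combined with the base package list, the sketch below renders a kickstart %packages section. The one-package-per-line file format, the function names and the package lists are assumptions made for this example; the closing %end line is expected by recent kickstart versions, and older installers should be checked before relying on it.

```python
def parse_package_spec(text):
    """Return package names from a spec with one package per line.

    Assumed format (not from the original page): blank lines are ignored
    and '#' starts a comment.
    """
    packages = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name and name not in packages:
            packages.append(name)
    return packages


def render_packages_section(base_packages, service_packages):
    """Build a kickstart %packages section: base packages first, then extras."""
    merged = list(base_packages)
    for name in service_packages:
        # A package such as perl may already be part of the base installation.
        if name not in merged:
            merged.append(name)
    return "\n".join(["%packages", *merged, "%end"])


# Hypothetical package specification for a frontend service class.
FRONTEND_SPEC = """
# frontend production service
httpd
perl      # also used by other service classes
mod_ssl
"""

base = ["@core", "openssh-server", "perl"]
section = render_packages_section(base, parse_package_spec(FRONTEND_SPEC))
print(section)
```

Keeping the base list and the service list separate mirrors the split between the base kickstart file and the per-service package specification.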
Each service class is mapped to a unique virtual IP (VIP) if it needs an outside connection (e.g. Apache server, MySQL). VIPs are defined for internal use (within your server farm), WAN (e.g. your intranet) and external use (the Internet). Not all service classes need all three values. For example, your backend database only needs an internal one. A development server might need an internal and a WAN VIP, as it needs to be accessible by developers or testers within your intranet. A frontend server would need all three: an external, an internal and a WAN VIP. The third octet of the VIPs is defined in the model configuration file.
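To make the model more concrete, here is a minimal sketch of what an XML model configuration might look like and how it could be read with Python's standard library. All element and attribute names (model, serviceclass, packagespec, vip_octet, server, runs) are hypothetical and chosen only for this example; they are not taken from the sample file mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical model configuration; every element and attribute name is an assumption.
MODEL_XML = """
<model>
  <serviceclass name="cache" packagespec="specs/cache.spec" vip_octet="10"/>
  <serviceclass name="frontend" packagespec="specs/frontend.spec" vip_octet="20"/>
  <server name="web01" rack="A3">
    <runs serviceclass="cache"/>
    <runs serviceclass="frontend"/>
  </server>
  <server name="web02" rack="A4">
    <runs serviceclass="cache"/>
  </server>
</model>
"""


def load_model(xml_text):
    """Return (service_classes, servers) dictionaries parsed from the XML text."""
    root = ET.fromstring(xml_text)
    service_classes = {
        sc.get("name"): {
            "packagespec": sc.get("packagespec"),
            "vip_octet": int(sc.get("vip_octet")),
        }
        for sc in root.findall("serviceclass")
    }
    servers = {
        srv.get("name"): [run.get("serviceclass") for run in srv.findall("runs")]
        for srv in root.findall("server")
    }
    return service_classes, servers


def servers_per_class(servers):
    """Invert server -> service classes (the other side of the many-to-many relation)."""
    result = {}
    for server, classes in servers.items():
        for cls in classes:
            result.setdefault(cls, []).append(server)
    return result


service_classes, servers = load_model(MODEL_XML)
instances = servers_per_class(servers)
print(instances)
```

The inverted mapping lists the servers that run each service class, which is the information later steps (DNS, load balancer) would need.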
The information defined in the model configuration file (servers, service classes, VIPs) is then used to generate the DNS entries for your server farm (you will have your own DNS server on your farm). The VIPs are used to 'aggregate' service instances of the same service class. For example, one service class can contain a caching server, and your website might have 5 servers that each run such a caching server (the service instances). The relationship between VIP and service instances is used to configure your load balancer (if you have multiple service instances for the same service class).
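The following sketch shows one way the DNS entries and load balancer pools could be derived from such data. It assumes one VIP per service class as input, a made-up domain (farm.example), zone-file style A records as the DNS output, and a plain dictionary as a stand-in for whatever configuration format your load balancer expects.

```python
# Hypothetical input: the VIP of each service class and the servers running it.
SERVICE_CLASSES = {
    "cache": {"vip": "10.1.10.1", "servers": ["web01", "web02"]},
    "db": {"vip": "10.1.30.1", "servers": ["db01"]},
}


def dns_a_records(service_classes, domain):
    """Render one zone-file style A record per service class VIP."""
    return [
        f"{name}.{domain}. IN A {service_classes[name]['vip']}"
        for name in sorted(service_classes)
    ]


def load_balancer_pools(service_classes):
    """Map each VIP to its servers; only classes with several instances need a pool."""
    return {
        entry["vip"]: list(entry["servers"])
        for entry in service_classes.values()
        if len(entry["servers"]) > 1
    }


records = dns_a_records(SERVICE_CLASSES, "farm.example")
pools = load_balancer_pools(SERVICE_CLASSES)
print("\n".join(records))
print(pools)
```

Generating both outputs from the same data keeps DNS and the load balancer consistent with the model configuration file.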
The complete virtual IP (VIP) can be generated by scripts for the service instances and is typically built from three segments that together describe the four octets. The first segment consists of the first two octets, taken from the IP range provided by your datacenter. The second segment is the third octet: the number defined in your model configuration file for this particular service class. The third segment is the fourth octet: an increasing number that starts at 1 and is incremented for each service instance you have in the service class.
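The three-segment scheme can be sketched as a small function. The function name, the 10.1 prefix and the octet value 10 are placeholders chosen for illustration; the standard ipaddress module is used only to validate the resulting addresses.

```python
import ipaddress


def generate_vips(prefix, class_octet, instance_count):
    """Return one address per service instance: <prefix>.<class_octet>.<1..n>.

    prefix         -- first two octets from the datacenter range, e.g. "10.1"
    class_octet    -- third octet defined for the service class in the model
    instance_count -- number of service instances in that service class
    """
    if len(prefix.split(".")) != 2:
        raise ValueError("prefix must hold exactly two octets, e.g. '10.1'")
    if not 1 <= instance_count <= 254:
        raise ValueError("instance_count must be between 1 and 254")
    vips = []
    for fourth in range(1, instance_count + 1):
        # ip_address() raises ValueError if any octet is out of range.
        vips.append(str(ipaddress.ip_address(f"{prefix}.{class_octet}.{fourth}")))
    return vips


print(generate_vips("10.1", 10, 3))
# ['10.1.10.1', '10.1.10.2', '10.1.10.3']
```

Because the third octet identifies the service class, all instances of one class end up in the same address block.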
After you define the model configuration file, you can verify with a script that different service classes do not share the same VIP. Once you have defined the model configuration file properly, generated the DNS entries and prepared the RPM packages, there is enough information to augment the base kickstart configuration for the different servers and arrive at the desired setup on the defined machines in your server farm. Assuming the system is kickstart based, it is highly desirable to either put ready-to-use config files in your RPMs or create symbolic links to, for example, an NFS mount (as the kickstart removes all data on the server). Typically such a kickstart takes approximately 15 minutes (this can differ between hardware). The NFS mount also contains the RPM database as well as a Subversion repository in which the model configuration file and RPM spec files are located. Putting the RPM spec files and the model configuration file under version control enables you to roll back any changes you made.
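The verification step mentioned at the start of the previous paragraph could look roughly like this. The sketch assumes the model has already been reduced to a mapping from service class name to its third octet; the names and numbers are invented for the example.

```python
from collections import defaultdict


def find_vip_conflicts(vip_octets):
    """Return {octet: [service classes]} for every third octet used more than once."""
    by_octet = defaultdict(list)
    for service_class, octet in vip_octets.items():
        by_octet[octet].append(service_class)
    return {
        octet: sorted(names)
        for octet, names in by_octet.items()
        if len(names) > 1
    }


# Hypothetical model: 'frontend' and 'dev-frontend' accidentally share octet 20.
MODEL = {"cache": 10, "frontend": 20, "db": 30, "dev-frontend": 20}

conflicts = find_vip_conflicts(MODEL)
for octet, names in conflicts.items():
    print(f"third octet {octet} is shared by: {', '.join(names)}")
# third octet 20 is shared by: dev-frontend, frontend
```

Running such a check before generating DNS entries or kickstart files catches a clash while it is still cheap to fix.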
Provided you have server redundancy (always good to have in a production environment), you can update servers independently without losing your site. Typically the kickstart will wipe the whole system and reinstall it, but if you do not want to take down a server for 15 minutes you can also define the service class in your model configuration file and run something like a 'yum install' on the target servers. The infrastructure makes it very easy to migrate a server configuration from one piece of hardware to another through kickstart without using any server virtualization, because you specified the whole configuration through kickstart scripts and the model configuration file and separated the services from the data. It is similar to a 'cut' and 'paste' of a virtual machine hosted on a virtual server, but without the virtualization overhead.
Although the infrastructure described is well suited for non-virtualized environments, you can include collections of virtualized servers. This means that in the model configuration file you define, for example, the VMware server as one of the services (which maps to actual hardware), and also the virtual machines, which map to the VMware server and not to actual hardware. If your infrastructure is completely virtualized and you have the proper tools to 'copy/cut' (i.e. clone) virtual machines, the model described here might change somewhat.