stoney cloud: Multi-Node Installation

Revision as of 14:09, 10 January 2014 by Pat

System Overview

The stoney cloud builds upon various standard open source components and can be run on commodity hardware. The final Multi-Node Setup consists of the following components:

  • One Primary-Master-Node running an OpenLDAP Directory Server (which stores the stoney cloud user and service related data), the web based VM-Manager management interface and the Linux kernel based virtualization technology.
  • One Secondary-Master-Node with the Linux kernel based virtualization technology.
  • Two Storage-Nodes configured as a replicated and distributed data storage service based on GlusterFS.

The components communicate with each other over a standard Ethernet based IPv4 network.

Prerequisites

The following items and conditions are required to be able to install and configure a stoney cloud environment:

  • Dedicated Hardware (4 Servers) which fulfil the following requirements:
    • 64-Bit Intel CPU with VT technology (AMD is not tested at the moment).
    • 8 Gigabyte Memory (more is better).
    • 147 Gigabyte up to 2 Terabyte disks.
    • Two physical Ethernet Interfaces, which support the same bandwidth (for example 1 Gigabit/s).
  • Two Gigabit layer-2 switches supporting IEEE 802.3ad (dynamic link aggregation), IEEE 802.1Q (VLAN tagging) and stacking (optional, but recommended).
  • Experience with Linux environments, especially with Gentoo Linux.
  • Good experience with IP networking, because the switches need to be configured manually (dynamic link aggregation and VLAN tagging).
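The CPU and memory requirements above can be checked from any Linux live system before the installer is started. The following is a minimal sketch and not part of the stoney cloud installer: the function names has_intel_vt and mem_gib are hypothetical, and the sketch assumes the usual Linux /proc/cpuinfo and /proc/meminfo formats, where Intel VT appears as the vmx CPU flag.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the stoney cloud installer.

# Succeeds if the given cpuinfo file (default: /proc/cpuinfo) lists the
# "vmx" flag, which Linux reports for CPUs with Intel VT available.
has_intel_vt() {
    grep -Eq '^flags[[:space:]]*:.*[[:space:]]vmx([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Prints the total memory in whole GiB (rounded down), taken from the
# MemTotal line (given in kB) of the meminfo file (default: /proc/meminfo).
mem_gib() {
    awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}
```

Called without arguments, both functions read the live system. Note that a server with 8 Gigabyte installed usually reports slightly less than 8 GiB in MemTotal, because the kernel reserves some memory, so a result of 7 from mem_gib does not necessarily mean the requirement is missed.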

Limitations

During the installation of the Gentoo Linux operating system, the first two physical Ethernet Interfaces are automatically configured as a logical interface (bond0) and the four tagged VLANs are set up. If more than two physical Ethernet Interfaces are to be included in the logical interface (bond0), or if the physical Ethernet Interfaces have different bandwidths (for example 1 Gigabit/s and 10 Gigabit/s), the final logical interface (bond0) needs to be configured manually after the installation of the Gentoo Linux operating system. Only the first two GlusterFS Storage-Nodes are set up automatically. More Storage-Nodes need to be integrated manually.
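After a manual change to bond0 it is worth verifying which physical interfaces the bond actually contains and which mode it runs in. The following is a minimal sketch, assuming the status file that the Linux bonding driver exposes at /proc/net/bonding/bond0; the function names bond_slaves and bond_mode are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers that read the Linux bonding driver's status file.

# Prints the member interfaces of the bond, one per line.
bond_slaves() {
    sed -n 's/^Slave Interface: //p' "${1:-/proc/net/bonding/bond0}"
}

# Prints the bonding mode, for example
# "IEEE 802.3ad Dynamic link aggregation".
bond_mode() {
    sed -n 's/^Bonding Mode: //p' "${1:-/proc/net/bonding/bond0}"
}
```

On a node installed as described above, bond_slaves should list exactly the interfaces that were chosen during the installation, and bond_mode should report 802.3ad if the switches are configured for dynamic link aggregation.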

Network Overview

As stated before, a minimal multi-node stoney cloud environment consists of two VM-Nodes and two Storage-Nodes. It is highly recommended to use IEEE 802.3ad link aggregation (bonding, trunking etc.) over two network cards and attach them as one logical link to the access switch.

Configuring the switches for stacking, link aggregation and VLAN tagging is out of the scope of this document. Please consult the respective user manual.

There are two scenarios for connecting the nodes to the network.

Network Overview: Physical Layer Scenario 1 (recommended)

The preferred solution is to use two switches which are stackable (supporting link aggregation over two switches). Connect the nodes and switches as illustrated below. This eliminates the single point of failure present in scenario 2.

                      +----------------------+      +----------------------+
                      |                      |      |                      |
                      |      vm-node-01      |      |      vm-node-02      |
                      |                      |      |                      |
                      +----------------------+      +----------------------+        
                                     bond0 | \      / | bond0
                                           |  \    /  |
                                           |   \  /   |
                                           |    \/    |                                          ___
                                           |    /\    |                                      ___(   )___
                                           |   /  \   |                                   __(           )__
                                           |  /    \  |                                 _(                 )_
                                 +-------------+   +-------------+                    _(                     )_
                                 |  switch-01  |===|  switch-02  |-------------------(_   Corporate LAN/WAN   _)
                                 +-------------+   +-------------+                     (_                   _)
                                           | \      / |                                  (__             __)
                                           |  \    /  |                                     (___     ___)
                                           |   \  /   |                                         (___)
                                           |    \/    |
                                           |    /\    |
                                           |   /  \   |
                                     bond0 |  /    \  | bond0
                     +-----------------------+      +-----------------------+
                     |                       |      |                       |
                     | tier1-storage-node-01 |      | tier1-storage-node-02 |
                     |                       |      |                       |
                     +-----------------------+      +-----------------------+

Network Overview: Physical Layer Scenario 2 (use at your own risk)

If there is only one switch available (or the switches aren't stackable), connect the nodes as illustrated below. As you can see, the switch is a single point of failure.

                      +----------------------+      +----------------------+
                      |                      |      |                      |
                      |      vm-node-01      |      |      vm-node-02      |
                      |                      |      |                      |
                      +----------------------+      +----------------------+                     ___
                                  bond0  \\            // bond0                              ___(   )___
                                          \\          //                                  __(           )__
                                           \\        //                                 _(                 )_
                                         +-------------+                              _(                     )_
                                         |  switch-01  |-----------------------------(_   Corporate LAN/WAN   _)
                                         +-------------+                               (_                   _)
                                           //       \\                                   (__             __) 
                                          //         \\                                     (___     ___)
                                  bond0  //           \\ bond0                                  (___)
                     +-----------------------+      +-----------------------+
                     |                       |      |                       |
                     | tier1-storage-node-01 |      | tier1-storage-node-02 |
                     |                       |      |                       |
                     +-----------------------+      +-----------------------+

Network Overview: Logical Layer

The ideal stoney cloud environment is based on four logically separated VLANs (virtual LANs):

  • admin: Administrative network, used for administration and monitoring purposes.
  • data: Data network, used for GlusterFS traffic.
  • int: Internal network, used for internal traffic such as LDAP, libvirt and more.
  • pub: Public network, used for accessing the VM-Manager web interface, Spice traffic and internet access.

Each of the above VLANs holds dedicated services and separates them from the services in the other VLANs. This documentation assumes that the four VLANs are present and the following IP networks are available:

VLAN name  VLAN ID  Network prefix    Default Gateway address  Broadcast address  Domain name        VIP
admin      110      10.1.110.0/24     --                       10.1.110.255       admin.example.com  --
data       120      10.1.120.0/24     --                       10.1.120.255       data.example.com   --
int        130      10.1.130.0/24     --                       10.1.130.255       int.example.com    10.1.130.10
pub        140      192.168.140.0/24  192.168.140.1            192.168.140.255    example.com        192.168.140.10

The nodes are assumed to use the IP addresses listed in the table below:

Node name              Admin address (VLAN 110)  Data address (VLAN 120)  Int address (VLAN 130)  Pub address (VLAN 140)
tier1-storage-node-01  10.1.110.11               10.1.120.11              10.1.130.11             192.168.140.11
tier1-storage-node-02  10.1.110.12               10.1.120.12              10.1.130.12             192.168.140.12
vm-node-01             10.1.110.13               10.1.120.13              10.1.130.13             192.168.140.13
vm-node-02             10.1.110.14               10.1.120.14              10.1.130.14             192.168.140.14
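The installer asks for a broadcast address for every VLAN. If your networks differ from the examples above, the broadcast address can be derived from any node address and the netmask length. The following is a minimal sketch in POSIX shell; the function name broadcast_addr is hypothetical, and the sketch assumes plain decimal octets without leading zeros.

```shell
#!/bin/sh
# Hypothetical helper: prints the IPv4 broadcast address for an
# address and a prefix length, e.g. "broadcast_addr 10.1.110.11 24".
# The body runs in a subshell so that the changed IFS does not leak.
broadcast_addr() (
    prefix=$2
    IFS=.
    set -- $1        # split the dotted quad into $1..$4
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    hostmask=$(( (1 << (32 - prefix)) - 1 ))   # all host bits set
    bc=$(( addr | hostmask ))
    echo "$(( (bc >> 24) & 255 )).$(( (bc >> 16) & 255 )).$(( (bc >> 8) & 255 )).$(( bc & 255 ))"
)
```

For example, broadcast_addr 192.168.140.11 24 yields the broadcast address listed for the pub VLAN in the table above.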

We also presume that the following Domain Name Server is available:

  • Domain Name Server 1: 192.168.140.1

Base Installation

All nodes are based on the Gentoo Linux operating system.

BIOS Set Up Checklist

  • Enable the "reboot-after-power-loss" option (if your BIOS supports it).
  • Make sure that the newest BIOS version (and the newest BMC and, where applicable, SCSI firmware) is installed.
  • Make sure that you have disabled halt on POST errors (or similar) or enabled keyboard-less operation (if your BIOS supports it).

RAID Set Up

Create a RAID1 volume. This RAID-Set is used for the Operating System. Please be aware that the current stoney cloud only supports disks from 147 Gigabyte up to 2 Terabyte for this first RAID-Set.

Optional: For the two GlusterFS Storage-Nodes we recommend a second RAID-Set configured as RAID6-Set with battery backup.

Node Installation

The first step of the Semi-Automatic Multi-Node Set Up is the same for all four Nodes. This documentation presumes that you stick to the naming convention mentioned above. After the Base Installation of the Nodes, the following daemons will be running:

  • crond: Cron daemon, executes scheduled commands.
  • ntpd: Network Time Protocol daemon, keeps the time synchronized with time servers in the LAN or WAN.
  • sshd: OpenSSH SSH daemon, used for remote access and remote administration.
  • syslogd: System Logging, keeps track of messages from the System and the Applications.
  • udevd: Linux dynamic device management, manages events, symlinks and permissions of devices.

Skipping Checks

To skip checks, type no when asked:

 Do you want to start the installation?
yes or no?: no

Then manually restart the stoney cloud installer with the desired options:

/mnt/cdrom/stoney-cloud-installer -c

Options:

-c: Skip CPU requirement checks
-m: Skip memory requirement checks
-s: Skip CPU and memory requirement checks

First Storage-Node (tier1-storage-node-01)

  1. Insert the stoney cloud CD and boot the server.
  2. Answer the questions as follows (the bold values are examples; the administrator can adjust them to the local setup):
    1. Global Section
      1. Confirm that you want to start: yes
      2. Choose a Node-Type: Storage-Node
      3. Choose a Block-Device: sda
      4. Confirm to erase all (from a previous installation): yes
      5. Confirm to continue with the given Partition-Scheme: yes
      6. Choose the network interfaces to bond together.
        1. Device #0: eth0
        2. Device #1: eth1
      7. Node-Name: tier1-storage-node-01
    2. pub-VLAN Section
      1. VLAN ID: 140
      2. Domain Name: example.com
      3. IP Address: 192.168.140.11
      4. Netmask: 24
      5. Broadcast: 192.168.140.255
      6. Confirm the pub-VLAN Section with yes
    3. admin-VLAN Section
      1. VLAN ID: 110
      2. Domain Name: admin.example.com
      3. IP Address: 10.1.110.11
      4. Netmask: 24
      5. Broadcast: 10.1.110.255
      6. Confirm the admin-VLAN Section with yes
    4. data-VLAN Section
      1. VLAN ID: 120
      2. Domain Name: data.example.com
      3. IP Address: 10.1.120.11
      4. Netmask: 24
      5. Broadcast: 10.1.120.255
      6. Confirm the data-VLAN Section with yes
    5. int-VLAN Section
      1. VLAN ID: 130
      2. Domain Name: int.example.com
      3. IP Address: 10.1.130.11
      4. Netmask: 24
      5. Broadcast: 10.1.130.255
      6. Confirm the int-VLAN Section with yes
      7. Enter the default Gateway: 192.168.140.1
      8. Enter the primary DNS-Server: 192.168.140.1
      9. Omit configuring a second DNS-Server with no
    6. Confirm the listed configuration with yes
    7. Enter your very secret root password
    8. Confirm to reboot with yes
  3. Make sure that you boot from the first hard disk and not from the installation medium again.
  4. Continue with specializing your Node.

Second Storage-Node (tier1-storage-node-02)

  1. Insert the stoney cloud CD and boot the server.
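  2. Answer the questions in the same way as for the first Storage-Node, using the node name and the addresses of tier1-storage-node-02 from the IP allocation table.
  3. Reboot the Server and make sure that you boot from the first hard disk.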

Primary-Master-Node (vm-node-01)

  1. Insert the stoney cloud CD and boot the server.
  2. Answer the questions, using the node name and the addresses of vm-node-01 from the IP allocation table.
  3. Reboot the Server and make sure that you boot from the first hard disk.

Secondary-Master-Node (vm-node-02)

  1. Insert the stoney cloud CD and boot the server.
  2. Answer the questions, using the node name and the addresses of vm-node-02 from the IP allocation table.
  3. Reboot the Server and make sure that you boot from the first hard disk.

Specialized Installation

First Storage-Node (tier1-storage-node-01)

Before running the node-configuration script, you may want to create an additional Backup Volume on the Storage-Node.

Log into the first Storage-Node and execute the node-configuration script as follows:

/usr/sbin/fc-node-configuration --node-type primary-storage-node

For more information about the script and what it does, please visit the fc-node-configuration script page.

Second Storage-Node (tier1-storage-node-02)

Before running the node-configuration script, you may want to create an additional Backup Volume on the Storage-Node.

Log into the second Storage-Node and execute the node-configuration script as follows:

/usr/sbin/fc-node-configuration --node-type secondary-storage-node

For more information about the script and what it does, please visit the fc-node-configuration script page.

Primary-Master-Node (vm-node-01)

If you configured an additional Backup Volume on the Storage-Nodes, you should mount it now on the VM-Node.

Log into the Primary-Master-Node and execute the node-configuration script as follows:

/usr/sbin/fc-node-configuration --node-type primary-master-node

The stoney cloud uses virtual IP addresses (VIPs) for failover purposes. Therefore, you need to configure ucarp.

Confirm that you want to run the script.

Do you really want to proceed with configuration of the primary-master-node?
yes or no (default: no): yes

Enter the VIP (virtual IP) for the public interface. Apache will listen on this VIP (if you followed this documentation, the VIP for the public interface is 192.168.140.10).

Please enter the VIP for the pub-interface (VLAN 140)
(default: 192.168.140.10): 192.168.140.10

Enter the VIP (virtual IP) for the internal interface. The LDAP server will listen on this VIP (if you followed this documentation, the VIP for the internal interface is 10.1.130.10).

Please enter the VIP for the int-interface (VLAN 130)
(default: 10.1.130.10): 10.1.130.10

Enter the RIP (real IP) for the secondary-master-node on the internal interface. The RIP is needed to keep the LDAP directories synchronized (if you followed this documentation, the RIP for the secondary-master-node on the internal interface is 10.1.130.14).

Please enter the IP for the int-interface (VLAN 130) of the
secondary-master-node (default: 10.1.130.14): 10.1.130.14

The script now tests the network configuration and


In order to mount the gluster-filesystem, the script needs to connect via ssh to the primary-storage-node. Enter the IP for the primary-storage-node on the admin interface (if you followed this documentation, it is 10.1.110.11). Then enter a valid username which exists on the primary-storage-node (if you followed this documentation, it is root) and the corresponding password.

Please enter the following information for the primary Storage-Node with the OpenSSH daemon listening on the VLAN with the name 'admin' and with the VLAN ID '110':

IP address (default: 10.1.110.11): 10.1.110.11
Username (default: root): root
Password for root: **********

In order to mount the gluster-filesystem, the script also needs to connect via ssh to the secondary-storage-node. Enter the IP for the secondary-storage-node on the admin interface (if you followed this documentation, it is 10.1.110.12). Then enter a valid username which exists on the secondary-storage-node (if you followed this documentation, it is root) and the corresponding password.

Please enter the following information for the secondary Storage-Node with the OpenSSH daemon listening on the VLAN with the name 'admin' and with the VLAN ID '110':

IP address (default: 10.1.110.12): 10.1.110.12
Username (default: root): root
Password for root:  **********


  • Configure the LDAP directory:
    • Define the password for the LDAP-Superuser (cn=Manager,dc=stoney-cloud,dc=org)
    • Currently, the user for the prov-backup-kvm daemon is the LDAP-Superuser, so enter the same password again
    • Define the password for the LDAP-dhcp user (cn=dhcp,ou=services,ou=administration,dc=stoney-cloud,dc=org)
    • Enter all necessary information for the stoney cloud administrator (User1)
      • Given name
      • Surname
      • Gender
      • E-mail
      • Language
      • Password
  • Finally enter the domain name which will correspond to the public VIP (default is stoney-cloud.example.org)
  • Due to bug #9, you need to manually finish the configuration of the libvirthook scripts:
    • You mainly have to fill in the following variables:
      • libvirtHookFirewallSvnUser
      • libvirtHookFirewallSvnPassword
    • See also this test configuration
  • Due to bug #12, you need to manually configure the LDAPKVMWrapper.pl script:
    • Fill in the /etc/Provisioning/Backup/LDAPKVMWrapper.conf file
    • Create a cronjob entry which runs the script /usr/bin/LDAPKVMWrapper.pl once a day:
      • 00 01 * * * /usr/bin/LDAPKVMWrapper.pl | logger -t Backup-KVM
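The cron entry above runs the wrapper at 01:00 every day and sends its output to syslog with the tag Backup-KVM. When the configuration is repeated, the entry should not end up in the crontab twice. The following is a minimal sketch of an idempotent append; the function name ensure_line is hypothetical and the sketch works on any text file passed to it.

```shell
#!/bin/sh
# Hypothetical helper: appends LINE to FILE unless an identical line
# is already present (fixed-string, whole-line comparison).
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

One possible use is to dump the current crontab of root into a temporary file with crontab -l, call ensure_line on that file with the entry above, and load the result again by passing the file name to crontab.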

For more information about the script and what it does, please visit the fc-node-configuration script page.

Secondary-Master-Node (vm-node-02)

If you configured an additional Backup Volume on the Storage-Nodes, you should mount it now on the Secondary-Master-Node.

Log into the Secondary-Master-Node and execute the node-configuration script as follows:

/usr/sbin/fc-node-configuration --node-type secondary-master-node

  • In order to get the configuration from the primary-master-node, the script needs to access it via ssh:
    • Enter the IP for the primary-master-node on the admin interface (if you followed this documentation it is 10.1.110.13)
    • Enter the username (if you followed the default setup it is root)
    • Enter the user's password
  • In order to mount the gluster-filesystem, you need to connect via ssh to the primary-storage-node, so enter the following information:
    • Enter the IP for the primary-storage-node on the admin interface (if you followed this documentation the IP for the primary-storage-node on the admin interface is 10.1.110.11)
    • Enter a valid username which exists on the primary-storage-node (if you followed this documentation it is root)
    • Enter the user's password
  • Repeat the same procedure for the secondary-storage-node (if you followed this documentation the IP is 10.1.110.12)
  • Due to bug #9, you need to manually finish the configuration of the libvirthook scripts:
    • You mainly have to fill in the following variables:
      • libvirtHookFirewallSvnUser
      • libvirtHookFirewallSvnPassword
    • See also this test configuration

For more information about the script and what it does, please visit the fc-node-configuration script page.

Links

Old Documentation

Specialized Installation

Primary-Master-Node (vm-node-01)

If you configured an additional Backup Volume on the Storage-Nodes, you should mount it now on the VM-Node.

Log into the Primary-Master-Node and execute the node-configuration script as follows:

/usr/sbin/fc-node-configuration --node-type primary-master-node

Manual Steps

In order to be able to migrate a VM from a carrier, a special user called transfer will be created.

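# Create a logical volume for the home directory of the transfer user
lvcreate -L 60G -n transfer local0

# XFS labels are limited to 12 characters, hence the truncated "OSBD_transfe"
mkfs.xfs -L "OSBD_transfe" /dev/local0/transfer

# Mount the volume at /home/transfer on every boot
cat << EOF >> /etc/fstab
LABEL=OSBD_transfe  /home/transfer    xfs      noatime,nodev,nosuid,noexec  0 2
EOF

# The mount point has to exist before the volume can be mounted
mkdir -p /home/transfer
mount /home/transfer

# Create the system user together with a group of the same name
useradd --comment "User which is used for VM disk file transfer between carriers" \
        --create-home \
        --system \
        --user-group \
        transfer

# Set the password which is used for the file transfer
passwd transfer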


Allow password authentication for the transfer user:

$EDITOR /etc/ssh/sshd_config
[...]

Match User transfer
        PasswordAuthentication yes

[...]

To apply the changes above, restart the SSH daemon:

/etc/init.d/sshd restart
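Before restarting the daemon, it can be useful to confirm that the Match block is really in place. The following is a minimal sketch; the function name match_allows_password is hypothetical, and the sketch compares keywords with exactly the spelling used above, although sshd itself treats keywords case-insensitively.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if FILE (default: /etc/ssh/sshd_config)
# contains a "Match User USER" block (default user: transfer) that sets
# "PasswordAuthentication yes" before the next Match line.
match_allows_password() {
    awk -v user="${2:-transfer}" '
        $1 == "Match" { in_block = ($2 == "User" && $3 == user); next }
        in_block && $1 == "PasswordAuthentication" && $2 == "yes" { found = 1 }
        END { exit !found }
    ' "${1:-/etc/ssh/sshd_config}"
}
```

In addition, sshd -t checks the whole configuration file for validity without starting the daemon, which helps to avoid a restart with a broken configuration. Keep in mind that the settings of a Match block apply until the next Match line or the end of the file.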