Set-up of a small Riak cluster with VirtualBox, part I - codecentric AG Blog

Introduction

The aim of this article is to show how to set up a small Riak cluster using VirtualBox. Riak is a NoSQL database of the key-value type. Objects in the database are uninterpreted, atomic binary entities that are addressed by unique keys. To facilitate key handling, objects are collected in buckets. A bucket may also be seen as a common prefix for a set of keys. Riak is a distributed database that replicates data across nodes. The degree of replication is configurable, with 3 copies being the default.

We assume a certain basic familiarity with Riak. There is a nice Riak tutorial available that will lead you through the initial steps.

For virtualisation we use VirtualBox by Oracle. VirtualBox is a mature, fully fledged virtualisation environment that is available for free, which makes it a good choice for experimenting with small clusters. The host and guest operating system we chose is Ubuntu 11.10: the database servers should run under some Unix-like OS, and Ubuntu is easy to install and run. We use the 64-bit version so that a sufficient amount of working memory can be addressed. If you want to run more than three virtual machines, a 64-bit system is unavoidable, at least on the host.

We assume Ubuntu is installed on the host. The following additional packages should be installed on the host: curl, erlang, ssh. Curl is used to communicate with the Riak servers. Erlang will be used to run some of the examples from the Riak tutorial. There are no Riak packages to install on the host. The host takes over the role of a database client.

The first Riak node

Installation

Download and install VirtualBox for Ubuntu. Create a new virtual machine. Some hints on how to do this can be found here. I chose to give it 1GB of virtual RAM instead of 500MB. The disk size should grow dynamically. During the installation and set-up phase it is best to use NAT as the network mode, so that the guest has internet access.

To install Ubuntu in the virtual machine you need an ISO image somewhere in the host file system. Before you start the virtual machine, attach the Ubuntu ISO image as a CD so that you can install Ubuntu after starting the virtual machine. After installing and updating Ubuntu, we recommend installing the following packages:

  • curl, for communication with the Riak server
  • ssh, optional for remote administration
  • virtualbox-ose-guest-x11, for better integration into the host
  • build-essential,
  • linux-headers-generic.

To make USB devices work, enter
> sudo usermod -aG vboxusers <user name>
in a host shell, where <user name> is the name of your user.

Administering the guest is a lot easier when you log in as root. To do so, you have to set a root password on the guest with
> sudo passwd root

Basho offers precompiled Riak packages for Ubuntu. Download and install the 64-bit Debian package on the guest. The installation does not just install the software. It also starts the Riak server and places a start script in the boot process as /etc/rc{2,3,5}.d/S20riak so that the server automatically starts at boot up.

The “cluster” we set up consists of three nodes. Each node runs in its own virtual machine, and each virtual machine is assigned a fixed IP address. As long as we do not scale horizontally and do not need to start an arbitrary number of servers, assuming a fixed IP address for a database server is not such a strange idea. We set up a simple network consisting of three virtual machines as DB servers and the host as client.

Network setup

Now that all necessary packages are installed on the guest and the guest does not need internet access any more, we can change the network adapter to set up a small local network. We define a host-only network for the virtual machines. See the VirtualBox manual for more details. In VirtualBox go to File → Preferences → Network. Click on (new host-only network). Click on “edit” and check that the DHCP server is switched off. Also note the IPv4 address of the adapter; we need it later on. In this example we use 192.168.56.1.

The next step is to change the network adapter settings for the virtual machine. In the VirtualBox Manager, open the settings of the virtual machine, click on Network and set “Attached to” to “Host-only Adapter”.

The next step is to manually configure the guest to use a fixed IPv4 address, in this example 192.168.56.2. Start the system administration on the guest and go to Network → Wired → Configure → IPv4-Settings-Tab. Choose method “Manual”. Add a new address, setting the net mask to 255.255.255.0 and the gateway to the IP address of the host-only adapter, here 192.168.56.1. Leave DNS and search domain empty. Save your changes.
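
With a net mask of 255.255.255.0, the guest address and the host-only adapter address must agree in their first three octets. A minimal sketch of such a sanity check follows; the function name same_subnet24 is hypothetical and it only handles the /24 case used in this example.

```shell
#!/bin/sh
# same_subnet24 IP_A IP_B
# Succeeds if both dotted IPv4 addresses share the first three octets,
# i.e. lie in the same network for net mask 255.255.255.0.
same_subnet24() {
    # ${1%.*} strips the last ".octet" from the address
    [ "${1%.*}" = "${2%.*}" ]
}
```

For the addresses used here, same_subnet24 192.168.56.2 192.168.56.1 succeeds. Once the guest is up, ping -c 3 192.168.56.2 on the host shows whether the guest is actually reachable.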

Riak node configuration

The Riak node that got installed with the installation of Riak is set up to work with localhost. Since we want to use the Riak node as a server we address from client machines, this setup does not work for us. We have to re-configure this to set the ring owner to the IP address of the server (our virtual machine).

Stop the running Riak server.
> /etc/init.d/riak stop
Remove the existing Riak ring and the data stored so far.
> rm -rf /var/lib/riak/ring/*
> rm -rf /var/lib/riak/bitcask/*

Using your favourite editor, change the name of the Riak node in /etc/riak/vm.args:
-name riak@192.168.56.2
or the IP address you picked for the virtual machine. Edit the HTTP and protocol buffer address (2 different lines) in /etc/riak/app.config:
{http, [ {"192.168.56.2", 8098 } ]},
{pb_ip, "192.168.56.2" }
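
If you prefer to script these edits, the following is a minimal sketch. It assumes that the freshly installed files still contain 127.0.0.1 in the http and pb_ip entries and a single -name line, and that GNU sed is available (as on Ubuntu); check your files before relying on it. The function name set_riak_ip is hypothetical.

```shell
#!/bin/sh
# set_riak_ip NEW_IP VM_ARGS_FILE APP_CONFIG_FILE
# Rewrites the node name and the HTTP / protocol buffer addresses.
# Assumption: both files still use the default address 127.0.0.1.
set_riak_ip() {
    new_ip="$1"
    vm_args="$2"
    app_config="$3"

    # Keep backups so the original configuration can be restored
    cp "$vm_args" "$vm_args.bak"
    cp "$app_config" "$app_config.bak"

    # Node name in vm.args, e.g. "-name riak@192.168.56.2"
    sed -i "s/^-name riak@.*/-name riak@$new_ip/" "$vm_args"

    # HTTP listener and protocol buffer address in app.config
    sed -i \
        -e "s/{\"127\.0\.0\.1\", *8098 *}/{\"$new_ip\", 8098 }/" \
        -e "s/{pb_ip, *\"127\.0\.0\.1\" *}/{pb_ip, \"$new_ip\" }/" \
        "$app_config"
}
```

On the guest, as root and with the Riak server stopped, this would be called as set_riak_ip 192.168.56.2 /etc/riak/vm.args /etc/riak/app.config.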

Finally, restart the whole virtual machine and check whether the Riak server is running properly. On the guest machine, run
> riak-admin ring_status
You should see something like this:
Attempting to restart script through sudo -u riak
================================== Claimant ===================================
Claimant: 'riak@192.168.56.2'
Status: up
Ring Ready: true

============================== Ownership Handoff ==============================
No pending changes.

============================== Unreachable Nodes ==============================
All nodes are up and reachable

Also run
> riak-admin member_status
You should see something like this:
Attempting to restart script through sudo -u riak
================================= Membership ==================================
Status Ring Pending Node
-------------------------------------------------------------------------------
valid 100.0% -- 'riak@192.168.56.2'
-------------------------------------------------------------------------------
Valid:1 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
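
This check can also be scripted, for example to wait for a node after boot. The sketch below counts the member lines whose status is "valid" in output shaped like the listing above; the function name count_valid_members is hypothetical, and the exact column layout of riak-admin output may differ between Riak versions.

```shell
#!/bin/sh
# count_valid_members
# Reads `riak-admin member_status` output on stdin and prints the
# number of lines that start with the status "valid".
count_valid_members() {
    # -c prints the count of matching lines (0 if there are none)
    grep -c '^ *valid '
}
```

On the guest, riak-admin member_status | count_valid_members should then print 1 for the single node set up so far.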

The server is up and running. Can we reach the server from a client, i.e., the host? Try
> curl -v http://192.168.56.2:8098/ping
on a host terminal. The desired result looks like this
* About to connect() to 192.168.56.2 port 8098 (#0)
* Trying 192.168.56.2... connected
* Connected to 192.168.56.2 (192.168.56.2) port 8098 (#0)
> GET /ping HTTP/1.1
> User-Agent: curl/7.21.6 (x86_64-pc-linux-gnu) libcurl/7.21.6 OpenSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 librtmp/2.3
> Host: 192.168.56.2:8098
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted it blue)
< Date: Fri, 20 Apr 2012 13:29:52 GMT
< Content-Type: text/html
< Content-Length: 2
<
* Connection #0 to host 192.168.56.2 left intact
* Closing connection #0

In the first line of the answer by the server you read
< HTTP/1.1 200 OK
So ping works and the server is reachable. Let's query for some data. Since we haven't stored any data yet, we expect to be told that there is no data for the given key when we submit
> curl -v http://192.168.56.2:8098/riak/test/doc2323232
The result should look like this:
* About to connect() to 192.168.56.2 port 8098 (#0)
* Trying 192.168.56.2... connected
* Connected to 192.168.56.2 (192.168.56.2) port 8098 (#0)
> GET /riak/test/doc2323232 HTTP/1.1
> User-Agent: curl/7.21.6 (x86_64-pc-linux-gnu) libcurl/7.21.6 OpenSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 librtmp/2.3
> Host: 192.168.56.2:8098
> Accept: */*
>
< HTTP/1.1 404 Object Not Found
< Server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted it blue)
< Date: Fri, 20 Apr 2012 13:38:00 GMT
< Content-Type: text/plain
< Content-Length: 10
<
not found
* Connection #0 to host 192.168.56.2 left intact
* Closing connection #0

Here,
< HTTP/1.1 404 Object Not Found
is what we expected to see. The server answers the way we expected.
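
Both checks can be condensed so that only the status code matters. The following sketch for the host uses curl's -s, -o and -w options to print just the HTTP status; the function names http_status and describe_status and the RIAK_URL variable are hypothetical helpers, not part of Riak.

```shell
#!/bin/sh
# Base URL of the Riak node; the default is the address used in this article.
RIAK_URL="${RIAK_URL:-http://192.168.56.2:8098}"

# http_status URL
# Prints only the HTTP status code of a GET request.
# curl reports 000 if no HTTP response was received at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# describe_status CODE
# Maps a status code to a short verdict.
describe_status() {
    case "$1" in
        200) echo "ok" ;;
        404) echo "not found" ;;
        000) echo "unreachable" ;;
        *)   echo "unexpected status $1" ;;
    esac
}
```

With the node running, describe_status "$(http_status "$RIAK_URL/ping")" should print ok, and the same call for "$RIAK_URL/riak/test/doc2323232" should print not found as long as nothing is stored under that key.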

Lo and behold, we are done with the first node.