I.think()

Working with a linux cluster
May 09, 2015

I frequently need to work on a linux cluster. It’s a lot of work to manage the environment on many machines, especially when you don’t have root access. Over time I gradually came up with a working solution.

The cluster I work on has /home shared over NFS. If I put programs and libraries in my $HOME directory, they are shared across all machines. So when I need a program or a library, I download its source and manually compile and install it into my $HOME directory (usually by passing --prefix=$HOME to the configure script). Adding PATH=$HOME/bin:$PATH to ~/.bashrc makes the manually installed programs override the system ones.
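As a sketch, a typical from-source install into $HOME looks like this (the package name, version, and URL are placeholders, not a specific recommendation):

```shell
# Placeholder package/URL -- the same pattern works for most autotools projects.
wget https://example.org/foo-1.0.tar.gz
tar xzf foo-1.0.tar.gz
cd foo-1.0
./configure --prefix=$HOME   # install under $HOME instead of /usr/local
make
make install                 # binaries go to $HOME/bin, libraries to $HOME/lib
```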

Also, linuxbrew is very useful in my situation. It puts everything in $HOME/.linuxbrew, so after installing a package via linuxbrew on one machine, it's available throughout the cluster.

The following is what I do to set up my work environment on a cluster:

# install rbenv, ruby-build, and manually install ruby

$ git clone https://github.com/sstephenson/rbenv.git ~/.rbenv
$ echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(rbenv init -)"' >> ~/.bashrc
$ source ~/.bashrc  # refresh bash environment

$ git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build

$ rbenv install 2.2.2
$ rbenv global 2.2.2

In my case, rbenv install 2.2.2 reported a build failure due to missing OpenSSL libraries. Ruby itself was actually built successfully, just without the OpenSSL extension. I had to manually cd into the build directory and run make install to install ruby by hand.

Without OpenSSL support, gem cannot download packages, so we will install OpenSSL and rebuild ruby later.

# install linuxbrew

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/linuxbrew/go/install)"
$ echo 'export PATH="$HOME/.linuxbrew/bin:$PATH"' >> ~/.bashrc
$ source ~/.bashrc  # refresh bash environment

$ brew install rbenv  # so rbenv could be updated by linuxbrew
$ brew install openssl
$ rbenv install -f 2.2.2  # reinstall ruby with openssl support
$ gem install bundler

At the time of writing this post, I could not install Lua via linuxbrew, so I installed Lua and LuaRocks manually.

I downloaded Lua 5.2.3 and modified its Makefile, setting INSTALL_TOP to my $HOME directory. Then, following this thread, I compiled Lua with:

# manually build and install Lua

$ make -C src clean all SYSCFLAGS="-DLUA_USE_LINUX" SYSLIBS="-Wl,-E -ldl -lreadline -lncurses"
$ make install

You can also set the flags in src/Makefile.

Now, for LuaRocks to work, you must have the unzip utility. If it's not installed, you can get it via linuxbrew:

# install unzip for luarocks

$ brew tap homebrew/dupes
$ brew install unzip

Installing LuaRocks is easy: download it from here, then do the traditional configure (don't forget --prefix=), make, and make install.
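For reference, the steps look roughly like this (the release version and URL are placeholders; check luarocks.org for the current one):

```shell
# Placeholder version/URL -- grab the current release from luarocks.org.
wget http://luarocks.org/releases/luarocks-2.2.2.tar.gz
tar xzf luarocks-2.2.2.tar.gz
cd luarocks-2.2.2
./configure --prefix=$HOME --with-lua=$HOME   # point LuaRocks at the Lua under $HOME
make
make install
```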

Switching between alternative packages

One more issue is how to switch between different versions or implementations of a package. In my case, I need to switch between OpenMPI and MVAPICH2. My solution is to create a $HOME/packages directory and install each package into its own $HOME/packages/<package> subdirectory. For example, I have

$HOME/packages/openmpi-1.8.4/
$HOME/packages/mvapich2-2.1rc2/

And for each package, I created an activate script that sets up its environment:

$HOME/packages/activate-openmpi-1.8.4
$HOME/packages/activate-mvapich2-2.1rc2

The activate script has content like this:

# content of $HOME/packages/activate-openmpi-1.8.4

export PATH=$HOME/packages/openmpi-1.8.4/bin:$PATH
export LD_LIBRARY_PATH=$HOME/packages/openmpi-1.8.4/lib:$LD_LIBRARY_PATH

To select which package to use, I created $HOME/packages/bashrc.pkgs:

# content of $HOME/packages/bashrc.pkgs

source $HOME/packages/activate-openmpi-1.8.4  # enabled
# source $HOME/packages/activate-mvapich2-2.1rc2  # disabled

Then add the following lines to ~/.bashrc to set up the packages:

# add to ~/.bashrc

if [ -f ~/packages/bashrc.pkgs ]; then
    source ~/packages/bashrc.pkgs
fi
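If you switch often, a small helper function can replace editing bashrc.pkgs by hand. This is just a convenience sketch; pkg_use is a made-up name, not part of my actual setup:

```shell
# Hypothetical helper for ~/.bashrc: source the activate script for a package.
pkg_use() {
    local script="$HOME/packages/activate-$1"
    if [ -f "$script" ]; then
        source "$script"        # prepend the package's bin/ and lib/ paths
    else
        echo "no such package: $1" >&2
        return 1
    fi
}

# e.g.: pkg_use openmpi-1.8.4
```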

See this post for additional tips on managing dotfiles.


Embedding mruby and its thread safety
Jun 22, 2014

mruby is a lightweight implementation of the Ruby language. Recently I tried to embed it into one of my projects, and the experience was great.

Ruby is a great language. It's very easy to learn, and there are tens of thousands of libraries to make you productive. The most fascinating aspect of the language is, I believe, its flexible syntax. You can easily create a domain-specific language (DSL) for your application, and express its logic naturally. Like most people, I started learning ruby while writing a rails web application, and have been in love with the language ever since.

The birth of mruby is great news for me. I have always wanted to add scripting support to my applications, but I've never been a fan of Lua. Even if it is the No.1 choice for embedded scripting, it just does not feel right to me. Now, with mruby, I can write ruby scripts in my application!

mruby's source code is hosted at https://github.com/mruby/mruby. Its core consists of around 30k lines of C and ruby code; after compilation you get a libmruby.a of around 3MB. When linked with libmruby.a, the size of my binary increased by around 700KB. This is not small compared with Lua (~200KB), but the functionality it brings in is well worth it.

Embedding mruby is very easy. Just #include <mruby.h>, use mrb_open() to create an mrb_state, and run mrb_load_string() to execute Ruby code. You can also use mrb_funcall() to call Ruby code from C/C++, or register your C/C++ functions so that Ruby code can use them. The best example is mirb.c in the mruby source code.
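A minimal embedding sketch, assuming mruby's headers and libmruby.a are available (error checking mostly omitted):

```c
#include <stdio.h>
#include <mruby.h>
#include <mruby/compile.h>   /* mrb_load_string() */
#include <mruby/string.h>    /* mrb_str_to_cstr() */

int main(void) {
    mrb_state *mrb = mrb_open();   /* one interpreter instance */
    if (!mrb) return 1;

    /* execute a Ruby snippet */
    mrb_load_string(mrb, "puts 'hello from mruby'");

    /* call a Ruby method from C: 42.to_s */
    mrb_value s = mrb_funcall(mrb, mrb_fixnum_value(42), "to_s", 0);
    printf("42.to_s => %s\n", mrb_str_to_cstr(mrb, s));

    mrb_close(mrb);                /* free everything this instance owns */
    return 0;
}
```

Compile with something like cc embed.c -lmruby -lm (the exact flags depend on how mruby was installed).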

One thing I care about is thread safety. I need to run multiple mruby instances, each in its own thread, with no interaction across threads. If mruby could not support this use case, I would have to look for other solutions. Luckily, the code base of mruby is not big, so I checked all of its source code. There are only a few static non-const variables, and they are only used for GC statistics. So I can confidently use mruby in a multi-threaded environment!


Manage dotfiles across machines
May 24, 2014

Managing dotfiles has always been a headache for me, since I work on multiple computers and frequently switch between them. But now I have a permanent solution, thanks to my friend Chien-Chin Huang!

It’s actually very simple — use Dropbox!!

Why the heck didn’t I think of this before!?

So simply put your dotfiles into a folder in Dropbox, symlink them into your $HOME directory, and it's all done! When you need to work on a new machine, just set up Dropbox, wait for the dotfiles to sync, and link them again.

Setting up Dropbox on server machines without a GUI can be painful. My solution is to have all servers mount an NFS folder and put the dotfiles there; it basically replaces Dropbox with an NFS folder. Works without any problem :)
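The linking step can be scripted. A sketch, assuming the dotfiles live in a synced folder (DOTDIR below matches my layout; adjust to taste):

```shell
# Link every dotfile from the synced folder into $HOME.
DOTDIR="$HOME/Dropbox/.toolkit/dotfiles"
for f in "$DOTDIR"/.[!.]*; do
    [ -e "$f" ] || continue                    # skip if the glob matched nothing
    ln -sfn "$f" "$HOME/$(basename "$f")"      # -n replaces an existing dir symlink
done
```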

Now my $HOME looks like this:

lrwxr-xr-x    1 santa  staff     52 May 13 17:19 .bash_profile -> /Users/santa/Dropbox/.toolkit/dotfiles/.bash_profile
lrwxr-xr-x    1 santa  staff     46 May 13 17:19 .bashrc -> /Users/santa/Dropbox/.toolkit/dotfiles/.bashrc
lrwxr-xr-x    1 santa  staff     49 May 13 17:19 .gitconfig -> /Users/santa/Dropbox/.toolkit/dotfiles/.gitconfig
lrwxr-xr-x    1 santa  staff     48 May 13 17:19 .teamocil -> /Users/santa/Dropbox/.toolkit/dotfiles/.teamocil
lrwxr-xr-x    1 santa  staff     49 May 13 17:19 .tmux.conf -> /Users/santa/Dropbox/.toolkit/dotfiles/.tmux.conf
lrwxr-xr-x    1 santa  staff     43 May 13 17:19 .vim -> /Users/santa/Dropbox/.toolkit/dotfiles/.vim
lrwxr-xr-x    1 santa  staff     45 May 13 17:19 .vimrc -> /Users/santa/Dropbox/.toolkit/dotfiles/.vimrc
...

Also, a few recent discoveries:

  • Use tmux. It's absolutely awesome, way better than screen. With teamocil you can easily start your workspace in no time. Also, check out those .tmux.conf files.

  • I’m now using Vundle to manage my vim plugins. Updating all plugins is now only one command away.


[simple-rpc] RPC code generator
May 22, 2014

In the previous posts, I’ve discussed how simple-rpc serializes messages, how the server and client communicate with each other, and how the request and reply messages are formatted. The scaffolding is in shape, and we are ready to use this apparatus to provide RPC support.

But now, we have a new problem — usability. How can we make the RPC system easy to use?

It’s very inconvenient to write RPC messages by hand. For each user-defined message type, you have to write its serialization code. And for each RPC method, you have to write both server-side and client-side stubs to pack up the request/reply messages. Doing this by hand is extremely painful and error-prone.

So, inspired by Google Protocol Buffers and Apache Thrift, we write an RPC definition file and use a compiler to translate it into server-side and client-side code stubs. This greatly reduces the user’s burden.

The RPC definition file

The RPC definition language we use is very simple. Compared with Google Protocol Buffers and Apache Thrift, it’s a stripped-down version: we discarded some fancy features to keep simple-rpc simple.

The following basic types are supported in RPC definition file:

Type                                   Description
i8, i16, i32, i64                      Fixed-size integers
vi32, vi64                             Variable-size integers (usually smaller serialized footprint)
double                                 Double-precision floating point number
string                                 String or binary blob (std::string)
pair<T1, T2>                           Exactly std::pair<T1, T2>
vector<T>, list<T>, map<K, V>, set<E>  Corresponding STL containers

A custom message type is defined using the struct keyword:

struct point3d {
    double x;
    double y;
    double z;
}

Nested structs are supported, and they can also be used as elements of STL containers.

With custom message types defined, we can now define the RPC methods:

service Math {
    sum(i32 a, i32 b, i32 c | i32 result);
    euclidean_distance(point3d p1, point3d p2 | double distance);
}

Here we defined a service called Math, which encapsulates 2 RPC methods, sum and euclidean_distance. Each method’s parameter list consists of two parts: parameters to the left of | are inputs, and parameters to the right of | are outputs.

For the Math service, we generate a MathService class containing the server-side RPC stub, and a MathProxy class containing the client-side RPC stub. To use the generated code, we only need to register MathService with an rpc::Server, and create a MathProxy on top of a connected rpc::Client; then we can call the RPC like a regular function:

math_proxy.sum(1, 2, 3, &result);

Programmers only need to write the RPC definition file and fill in the actual C++ code for the RPC function bodies. They don’t need to care about message serialization or error handling at all.

The RPC compiler

So now we have designed the RPC definition file; the next thing is to write the compiler. I chose Yapps 2 because it’s simple to use, and small enough that I could add it into the simple-rpc code repository.

In Yapps, you need to create a parser definition for the language. It works pretty much like most parser generators: define the patterns to ignore, add basic token definitions, create grammar rules on those tokens, and write an action for each rule. After parsing the input RPC definition file, we have an AST, and by processing it we can create the RPC code stubs easily. It’s a very straightforward translation process once you have the AST. And for different target languages, we just write different translators.


[simple-rpc] Server and client
May 15, 2014

Server side

The RPC server is in charge of running RPC requests. When a client connects to the server, the connection is registered to (possibly one of many) IO threads, and monitored for IO events (ready to read/write).

An RPC request has the following structure:

<size> <xid> <rpc_id> <arg1> ... <argN>
  • size is the number of bytes from xid to argN. It’s a 4-byte integer.
  • xid is the request id. It is assigned by the client side. It’s a variable-size integer.
  • rpc_id is the id of the RPC function to call. It is used to dispatch different kinds of RPC calls. It’s a 4-byte integer.
  • arg1 ... argN are the serialized RPC function call parameters.

Since RPC messages are sent over a network with limited bandwidth, we want to make them as small as possible. One frequently used value type is the integer, and mostly we are sending small integers (list lengths, etc.) that could be represented in fewer than the default integer size (4 or 8 bytes). To reduce the serialized size of integers, we added a variable-size integer type. It borrows ideas from UTF-8 encoding: the number of leading 1 bits in the first byte indicates how many bytes are used to represent the integer:

First byte  Total bytes  Range
0xxxxxxx    1            -64 ~ 63
10xxxxxx    2            -8192 ~ 8191
110xxxxx    3            -1048576 ~ 1048575
1110xxxx    4            -134217728 ~ 134217727

This scheme extends to at most 9 bytes per integer.

When a client issues an RPC request, it sends TCP packets over the network. The IO thread on the server side is notified and tries to read the packets from the wire. Once a full RPC request has been read, the packet is parsed and the request is processed. The following pseudocode shows how this part works:

ServerConnection::handle_read():   # called on each 'ready to read' event
    requests = []
    while True:
        packet_size, err = input.peek(4)  # try to peek the 4-byte size field
        if not err and input.size() >= packet_size + 4:
            input.drop(4)   # the 4-byte packet size has been read successfully
            req = Request()
            req.marshal = input.read(packet_size)
            requests.append(req)    # queue the RPC request
        else:
            break   # not all packets of the RPC request have arrived, or there is no more data to process

    for req in requests:
        req.xid, rpc_id = unpack_from(req.marshal)
        rpc_handlers[rpc_id].handle(req)    # find the handler for the RPC request, and process it

Each RPC handler looks like this:

rpc_handler(req, connection):
    ... # process request

    # send back data
    connection.begin_reply(req.xid)
    connection.marshal += {reply_content}
    connection.end_reply()
    cleanup_resource(req, connection)

And to provide RPC support on the server, each handler needs to be registered under a certain RPC id:

rpc_server.reg(rpc_id, some_rpc_handler)

The RPC reply has the following format:

<size> <xid> <error_code> <ret1> ... <retN>

It is very similar to the RPC request, except for the error_code field, which is 0 if the RPC was processed successfully, ENOENT if the requested RPC handler cannot be found, and EINVAL if some field is missing.

Client side

The client side provides a very generic RPC call interface. The pseudo code is like the following:

client.connect("rpc_server_ip:port")
future = client.begin_request(rpc_id, optional_callback)
client += rpc_parameter
client.end_request()

future.wait()
error_code, reply_data = future.reply.unpack()

The callback function is invoked when the client side receives the RPC reply.

This interface is obviously not easy to use. To make RPC calls more user-friendly, we added scripts that automatically generate code to create the request message, serialize the request data, and choose the correct rpc_id. These scripts will be covered in the next post.