Monthly Archives: September 2013

On Thin Clients and Large Data

So I have been a fan of the thin client model of computing for a few years now.  I was turned on to the idea (without even realizing it) at my old consulting job, where we would remote desktop into a hefty SAS server to run our programs.  There were a number of benefits to this setup.  For programs with long run times, we could run the code on the server and not have to worry about shutting down our laptops for the duration of the run.  The server had much more memory and disk space than our laptops, so running code on large data sets was a lot easier there.  Having all of the code and data in one location allowed for easier collaboration between team members.  And not having to run the code locally meant that our computers were still functional for other tasks while running a large job (nothing worse than trying to answer a few e-mails while a large SAS job is churning away in the background, tying up all of your computer's CPU and memory).

Since leaving the consulting job, I have been running a thin client setup of my own for a while now, but my opinion on the matter has cooled somewhat.  I purchased a Samsung Chromebook and installed Ubuntu on it using crouton.  There are some noticeable bugs, but overall, I love the setup.  The laptop is lightweight, cheap, and small, but large enough to be functional without a second monitor.  It is essentially 90% of a MacBook Air for 20% of the price.  The two major sacrifices that you make with this laptop are RAM (2 GB) and disk space (16 GB SSD).  After a few months of use, however, I am down to 4 GB [1] of disk space, which doesn't leave much room to do work locally.  While I still think the thin client setup is the way of the future, I may have gone too thin, so to speak.

Part of the problem I am having is that I like to do work locally from time to time.  The laptop is small and light, and therefore easy to take with me anywhere.  I often find myself in locations without any wireless connection, and therefore, no access to AWS, my current choice of server provider.

Another complaint that I have is the setup time required to get AWS instances running.  I normally keep one micro-tier EC2 instance running at all times, in case I need to offload a small programming task to a server.  However, the micro instance, while free, can't handle some of the larger jobs that I want to run.  This means that I need to fire up a larger EC2 instance, load all the data up, make sure I have all of the proper libraries installed, and then run the job.  Finally, after the job has run, I need to retrieve the results and kill the instance.  If I had a larger budget, I could afford to leave a larger instance running 24/7 (or just buy my own server), but the larger instances aren't free, so in order to keep things cheap, I need to jump through these hoops.
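For what it's worth, the launch-and-bootstrap part of that cycle can be scripted.  Here is a rough sketch using boto, the Python AWS library; the AMI ID, region, and package list below are placeholders, not the ones I actually use:

```python
def bootstrap_script(packages):
    """Build a user-data shell script that installs the libraries a
    job needs, so a fresh instance comes up ready to run."""
    lines = ["#!/bin/bash", "apt-get update -y"]
    lines += ["apt-get install -y %s" % p for p in packages]
    return "\n".join(lines)

def run_big_job(ami_id, packages, instance_type="m1.large"):
    """Fire up a larger instance and hand it a bootstrap script via
    EC2 user data; returns the connection and instance so the caller
    can pull results down and terminate it when the job is done."""
    import boto.ec2  # imported here so bootstrap_script stays standalone
    conn = boto.ec2.connect_to_region("us-east-1")
    reservation = conn.run_instances(
        ami_id,
        instance_type=instance_type,
        user_data=bootstrap_script(packages),
    )
    return conn, reservation.instances[0]

# After copying the results back down:
#   conn.terminate_instances(instance_ids=[instance.id])
```

That at least automates the "install all of the proper libraries" step, since the user-data script runs on first boot.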

I am probably going to be spending some time exploring ways to make the remote server process a little less painful.  I am fairly certain that what I am doing is less than optimal.  Imaging my micro instance and launching a larger instance from that image should help (at least when it comes to installing libraries and utilities), and hopefully I can dig up some other shortcuts and best practices for the other issues I have run into.  This should make some good fodder for my next few blog posts.  Stay tuned.
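To make the image-and-promote idea concrete, here is a rough Python sketch, again with boto; the connection object, instance IDs, and instance type are placeholders:

```python
import datetime

def ami_name(instance_id, when=None):
    """Date-stamped name for the image, so repeated snapshots don't collide."""
    when = when or datetime.date.today()
    return "%s-snapshot-%s" % (instance_id, when.isoformat())

def promote(conn, micro_id, instance_type="m1.large"):
    """Snapshot the always-on micro instance as an AMI, then boot a
    larger copy of it, so installed libraries come along for free.
    (In practice you would wait for the new image to become available
    before launching from it.)"""
    image_id = conn.create_image(micro_id, ami_name(micro_id))
    reservation = conn.run_instances(image_id, instance_type=instance_type)
    return reservation.instances[0]
```

The appeal is that the larger instance inherits everything already set up on the micro instance, rather than being configured from scratch each time.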

[1] It’s more like 2 GB of disk space, as ChromeOS relies on zRAM to supplement RAM, and it will automatically start deleting data if you get below 2 GB (from what I can tell, ChromeOS only deletes recoverable data like Google account information).  As far as I can tell, most of the space is being used by Chrome OS and Ubuntu, along with the various utilities and tools that I have installed on Ubuntu, so clearing up space isn't going to be easy.  I will probably start using an SD card to add some extra storage, though.