My small gig with Google Hashcode

20161123hc

Lately, I’ve been honing a bit my programming skills. I’ve picked up Python again after some time. It’s a very fun language to use! 100% recommended for both beginners to coding and more experienced people. Lots of functionality and libraries are available. And it’s FREE!

Although I knew the basics of Python, I needed something to work on. I visited Google together with SIAM a couple of weeks ago and heard about the Hackaton-like competitions they host. So I looked for some info there, and discovered Hashcode. Hashcode is a competition that is hosted annually and consists of two parts: a qualifying round online and a final round in Paris. It’s done in teams, so you cannot go lone wolf there. Even if it was allowed, it’s very hard to go alone anyway. You only have a few hours to complete the assignment and those are quite hard. You can even create a hub if several teams from the same university, club or institution compete together.

I tried the one that was proposed in Hashcode 2015 for the qualifier rounds. Basically, it consists of optimizing pools of servers. A pool is just a group of servers devoted to one task, so for example, they can be doing high-performance computing together. The servers are located in rows where they get network connectivity, power, etc. Those rows can fail of course, so you cannot use the affected servers, but you still would want to use the pool. The optimization part comes into assigning the pools so that if one row goes offline, all pools still have a good capacity to operate. You can find more details here.

20161123servers

The problem can be divided in four parts. First, you have to read the server and the environment characteristics. That is, number of desired pools, number of servers, size of each server, capacities… This is provided by Google themselves. Then you have to assign the servers to the different rows, keeping in mind that there are some slots in some rows that are not available. Afterwards, you assign the pools with some (hopefully) clever method, maximizing the capability of the pools if one row goes offline. Lastly, you sum up everything into a text file. Not so bad, right?

Since I couldn’t come up with a clever way of assigning the pools, I just assigned them at random, trying to distribute them as evenly as possible. Afterwards I choose one (or several) servers and change the pool randomly, and compute the capacity that would remain if a row went offline, keeping it if it’s an improvement. Repeat the process during N iterations and save the result.

Of course, being random one needs several runs of the algorithm to get a good result. My best was 232, so let’s have a look at the rankings:

20161123rankings

… not good. There were 230 teams in total that had scores. The best ones were above 400 and many people had more than 350. I’m sure there is a cleverer way to assign the pools, but I wasn’t able to come up with anything myself, and this would have been what I would have used in the competition. Sure there are a lot of excellent teams at Hashcode!

I also had a look at the final round problem of 2015, which is about balloon positioning in a grid. I know some equivalent problems that can be solved with dynamic programming, so maybe I’ll do better.