*** MOVED ***

NOTE: I have merged the contents of this blog with my web-site. I will not be updating this blog any more.

2007-07-25

ICFPC 2007

I spent most of this weekend taking part in the ICFP contest for 2007. As in 2005, I took part in the contest with Yumpee as my team-mate. Our team was called "Kalianpur Bakaits" (an IIT-K reference) in the beginning, but we changed it to "The Great Indian Rope Trick" later, for reasons that should become obvious in a while.

This year's task was very similar to that from last year, but it was a bit tougher and even more fun. The organisers surely must have spent a lot of time and effort in putting together the task description and the supporting materials. There were no mistakes discovered in the task description for this year's contest (unlike those for the previous years' contests). The mailing list also had an unusually low-traffic during the contest - perhaps it was because the task description was precise enough, perhaps it was because a lot of people were finding it difficult just to get started or perhaps the contest was affected by the release of the latest Harry Potter book during the same weekend.

This year's task was to save an alien named "Endo". Endo was a "Fuun" travelling in a spaceship named "Arrow", when it had a mishap with an "interstellar garbage collector" that picked it and then dumped it on Earth. This caused critical damage to both Endo and Arrow. Our job was to repair Endo's DNA in such a way that Arrow is able to restore it to its former condition using its limited remaining power.

Endo's DNA is a long sequence comprising four bases, 'I', 'C', 'F' and 'P'. This sequence can be interpreted to create an RNA sequence comprising the same four bases that in turn can be used to build the proteins necessary to restore Endo. Prefixes comprising the four bases can be attached to a DNA sequence to modify its behaviour - we had to find a cheap and low-risk prefix that Arrow could process to repair Endo's DNA.

This translates to creating an interpreter for the DNA sequence that produces an RNA sequence, creating another interpreter for this RNA sequence that creates an image out of it and finally discovering a prefix that could restore Endo. The organisers provided us the DNA sequence for Endo, a "source" image resulting from interpreting this DNA sequence and a "target" image representing a fully-restored Endo. We were asked to create a DNA prefix that results in an image that is as close to the target image as possible, while being as short and as cheap to process as possible.

The task description was detailed enough for us to readily build a DNA to RNA converter as well as an RNA to image converter. The challenge was to do the processing quickly enough since the given DNA sequence required about 2 million iterations over a string that was about 7.5MB long. I created the DNA-to-RNA converter, while Yumpee built the RNA-to-image converter.

Since we had to splice and join huge lists while interpreting the DNA, I started off with a C implementation that used Lisp-style "cons" cells. While it seemed to perform well (but not good enough for the task), it had hard to track bugs that caused segmentation faults, not to mention memory leaks. I then tried using Java for creating the interpreter, which resulted in a correct but even slower interpreter. By this time, I had wasted two out of the three days allotted for the task. We had not made much progress while a lot of teams seemed to have discovered the same "magical" prefix comprising 28 bases that had boosted their rankings.

This is when I finally decided to act on Yumpee's suggestion of looking into using "ropes" as the underlying data structure. He had been suggesting it repeatedly since the time he had seen the task description, but I kept putting it off since I was not familiar with it. I decided to check out the "cords" implementation of ropes that comes with Boehm-GC. I modified my original interpreter in C to use cords instead of my home-grown and buggy cons-cells implementation.

The difference in performance was startling! The very first implementation ran through all the 2 million iterations in about 10 minutes on my PC and in about 4 minutes on Yumpee's PC. The code had also become much simpler to read and much closer to the pseudo-code provided in the task description. I debugged it a little to remove a bug that resulted from a misreading of the specification and then handed it over to Yumpee. By this time, he had a working RNA-to-image converter (he was also busy barking up the wrong trees till that time) and had also discovered the magic 28-base prefix. Unfortunately for us, we had just nine hours left to finish the task.

We discovered that the fun had just started. Embedded in the DNA were a series of clues that formed a virtual treasure-hunt, involving encrypted pages, EBCDIC-encoded text, fractals, callable functions, etc. We discovered a lot of clues but we couldn't find a prefix in time that would improve our rankings. It felt good that we could make so much progress by the end of the contest, unlike the last time. However, I felt like an idiot for not having listened to Yumpee earlier and saving myself and our team a lot of wasted time.

As I always end up saying after such contests, I hope we have a better luck the next time.

Some more reports: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.

Update (2007-07-26): I tweaked the code for the DNA-to-RNA converter a little bit to avoid creating a lot of unnecessary cords and that reduced its running time to about 1 minute 35 seconds on my PC (about 7 times better than before) and about 30 seconds on Yumpee's PC. Right now the garbage collector itself is the biggest consumer of CPU cycles (about 45% on the whole), but disabling it leads to the programme quickly running out of memory on my PC.

2007-07-11

"Finance, Investments 'n' Trading"

"Finance, Investments 'n' Trading" is a weblog by Shobhit that tries to bring some sanity to the general mania currently surrounding the Indian stock markets.

You should read his articles in chronological order (he provides a helpful "Table of Contents" article for this purpose). He might sound too pessimistic to some and I do not necessarily agree with everything that he says, but the articles are still a good and recommended read, particularly if you actively buy and sell stocks or are thinking of getting into it.

The Indian stock markets have been in a prolonged and an almost-continuous bull run for about three to four years. So many people have apparently made so much money with such ease that everyone from students and housewives to army men and retired government workers want to jump into the bandwagon for fear of being left out. It does not help that newspapers, magazines and TV channels provide a disproportionate coverage of the "excitement" surrounding the Indian stock markets as if it is a form of sport.

When caution and rational reason are abandoned in times of general euphoria, many people are likely to get their hands burnt (some times without even realising that they have made a net loss, taking inflation, taxes and trading charges into account). The only people who consistently make money in such situations are the stockbrokers (for example, read Warren Buffett's "Chairman's Letter" to Berkshire Hathaway Shareholders for 2005 (PDF) from page 17 onwards).

I don't think that investing in stocks is necessarily a bad thing or that the current bull run is not backed by a real growth in the Indian economy. I just wish that people think a bit more rationally, keep realistic expectations of returns and figure out how to calculate their real loss or gain from trading, before jumping in.

I wish that more people read and understand the advice given in Benjamin Graham's superb book "The Intelligent Investor".

2007-07-04

Expiry Dates

If medicines and food items become dangerous to consume after their expiry dates, shouldn't pesticides become more powerful after their expiry dates?

2007-07-02

Dragon Ball Z

Many years ago, a cartoon TV channel in India started showing Dragon Ball Z. They showed 53 episodes of this series comprising the Vegeta Saga and the Namek Saga. Just as the series got really interesting, they yanked it off the air without notice and without even a word of apology.

Loyal viewers of the series were aghast. They pleaded with the channel to show them the rest of the series as well, rather than leaving them on tenterhooks. The channel could not care less. The viewers were frustrated and cursed the channel using the choicest of expletives. They did not have much choice.

Several years later, the same cartoon channel starts showing Dragon Ball Z once again. Some of the old viewers hope the channel would show the entire series this time and watch the episodes once again to revive their memories.

They were naïve. The channel showed the same 53 episodes of the series and then abandoned it yet again at the same critical point in the story.

They were idiots. The channel as well as the viewers.