I had a chance recently to visit my alma mater IIT Kanpur. It has been over 11 years since I graduated from that place. This is the first time in all these years that I got to visit the place, though I had been yearning to do so all the while. It turned out to be a mixed experience - I found IIT Kanpur to be familiar and estranging at the same time.
Showing posts with label iit. Show all posts
Showing posts with label iit. Show all posts
2007-12-22
2007-07-25
ICFPC 2007
I spent most of this weekend taking part in the ICFP contest for 2007. As in 2005, I took part in the contest with Yumpee as my team-mate. Our team was called "Kalianpur Bakaits" (an IIT-K reference) in the beginning, but we changed it to "The Great Indian Rope Trick" later, for reasons that should become obvious in a while.
This year's task was very similar to that from last year, but it was a bit tougher and even more fun. The organisers surely must have spent a lot of time and effort in putting together the task description and the supporting materials. There were no mistakes discovered in the task description for this year's contest (unlike those for the previous years' contests). The mailing list also had an unusually low-traffic during the contest - perhaps it was because the task description was precise enough, perhaps it was because a lot of people were finding it difficult just to get started or perhaps the contest was affected by the release of the latest Harry Potter book during the same weekend.
This year's task was to save an alien named "Endo". Endo was a "Fuun" travelling in a spaceship named "Arrow", when it had a mishap with an "interstellar garbage collector" that picked it and then dumped it on Earth. This caused critical damage to both Endo and Arrow. Our job was to repair Endo's DNA in such a way that Arrow is able to restore it to its former condition using its limited remaining power.
Endo's DNA is a long sequence comprising four bases, 'I', 'C', 'F' and 'P'. This sequence can be interpreted to create an RNA sequence comprising the same four bases that in turn can be used to build the proteins necessary to restore Endo. Prefixes comprising the four bases can be attached to a DNA sequence to modify its behaviour - we had to find a cheap and low-risk prefix that Arrow could process to repair Endo's DNA.
This translates to creating an interpreter for the DNA sequence that produces an RNA sequence, creating another interpreter for this RNA sequence that creates an image out of it and finally discovering a prefix that could restore Endo. The organisers provided us the DNA sequence for Endo, a "source" image resulting from interpreting this DNA sequence and a "target" image representing a fully-restored Endo. We were asked to create a DNA prefix that results in an image that is as close to the target image as possible, while being as short and as cheap to process as possible.
The task description was detailed enough for us to readily build a DNA to RNA converter as well as an RNA to image converter. The challenge was to do the processing quickly enough since the given DNA sequence required about 2 million iterations over a string that was about 7.5MB long. I created the DNA-to-RNA converter, while Yumpee built the RNA-to-image converter.
Since we had to splice and join huge lists while interpreting the DNA, I started off with a C implementation that used Lisp-style "cons" cells. While it seemed to perform well (but not good enough for the task), it had hard to track bugs that caused segmentation faults, not to mention memory leaks. I then tried using Java for creating the interpreter, which resulted in a correct but even slower interpreter. By this time, I had wasted two out of the three days allotted for the task. We had not made much progress while a lot of teams seemed to have discovered the same "magical" prefix comprising 28 bases that had boosted their rankings.
This is when I finally decided to act on Yumpee's suggestion of looking into using "ropes" as the underlying data structure. He had been suggesting it repeatedly since the time he had seen the task description, but I kept putting it off since I was not familiar with it. I decided to check out the "cords" implementation of ropes that comes with Boehm-GC. I modified my original interpreter in C to use cords instead of my home-grown and buggy cons-cells implementation.
The difference in performance was startling! The very first implementation ran through all the 2 million iterations in about 10 minutes on my PC and in about 4 minutes on Yumpee's PC. The code had also become much simpler to read and much closer to the pseudo-code provided in the task description. I debugged it a little to remove a bug that resulted from a misreading of the specification and then handed it over to Yumpee. By this time, he had a working RNA-to-image converter (he was also busy barking up the wrong trees till that time) and had also discovered the magic 28-base prefix. Unfortunately for us, we had just nine hours left to finish the task.
We discovered that the fun had just started. Embedded in the DNA were a series of clues that formed a virtual treasure-hunt, involving encrypted pages, EBCDIC-encoded text, fractals, callable functions, etc. We discovered a lot of clues but we couldn't find a prefix in time that would improve our rankings. It felt good that we could make so much progress by the end of the contest, unlike the last time. However, I felt like an idiot for not having listened to Yumpee earlier and saving myself and our team a lot of wasted time.
As I always end up saying after such contests, I hope we have a better luck the next time.
Some more reports: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.
Update (2007-07-26): I tweaked the code for the DNA-to-RNA converter a little bit to avoid creating a lot of unnecessary cords and that reduced its running time to about 1 minute 35 seconds on my PC (about 7 times better than before) and about 30 seconds on Yumpee's PC. Right now the garbage collector itself is the biggest consumer of CPU cycles (about 45% on the whole), but disabling it leads to the programme quickly running out of memory on my PC.
This year's task was very similar to that from last year, but it was a bit tougher and even more fun. The organisers surely must have spent a lot of time and effort in putting together the task description and the supporting materials. There were no mistakes discovered in the task description for this year's contest (unlike those for the previous years' contests). The mailing list also had an unusually low-traffic during the contest - perhaps it was because the task description was precise enough, perhaps it was because a lot of people were finding it difficult just to get started or perhaps the contest was affected by the release of the latest Harry Potter book during the same weekend.
This year's task was to save an alien named "Endo". Endo was a "Fuun" travelling in a spaceship named "Arrow", when it had a mishap with an "interstellar garbage collector" that picked it and then dumped it on Earth. This caused critical damage to both Endo and Arrow. Our job was to repair Endo's DNA in such a way that Arrow is able to restore it to its former condition using its limited remaining power.
Endo's DNA is a long sequence comprising four bases, 'I', 'C', 'F' and 'P'. This sequence can be interpreted to create an RNA sequence comprising the same four bases that in turn can be used to build the proteins necessary to restore Endo. Prefixes comprising the four bases can be attached to a DNA sequence to modify its behaviour - we had to find a cheap and low-risk prefix that Arrow could process to repair Endo's DNA.
This translates to creating an interpreter for the DNA sequence that produces an RNA sequence, creating another interpreter for this RNA sequence that creates an image out of it and finally discovering a prefix that could restore Endo. The organisers provided us the DNA sequence for Endo, a "source" image resulting from interpreting this DNA sequence and a "target" image representing a fully-restored Endo. We were asked to create a DNA prefix that results in an image that is as close to the target image as possible, while being as short and as cheap to process as possible.
The task description was detailed enough for us to readily build a DNA to RNA converter as well as an RNA to image converter. The challenge was to do the processing quickly enough since the given DNA sequence required about 2 million iterations over a string that was about 7.5MB long. I created the DNA-to-RNA converter, while Yumpee built the RNA-to-image converter.
Since we had to splice and join huge lists while interpreting the DNA, I started off with a C implementation that used Lisp-style "cons" cells. While it seemed to perform well (but not good enough for the task), it had hard to track bugs that caused segmentation faults, not to mention memory leaks. I then tried using Java for creating the interpreter, which resulted in a correct but even slower interpreter. By this time, I had wasted two out of the three days allotted for the task. We had not made much progress while a lot of teams seemed to have discovered the same "magical" prefix comprising 28 bases that had boosted their rankings.
This is when I finally decided to act on Yumpee's suggestion of looking into using "ropes" as the underlying data structure. He had been suggesting it repeatedly since the time he had seen the task description, but I kept putting it off since I was not familiar with it. I decided to check out the "cords" implementation of ropes that comes with Boehm-GC. I modified my original interpreter in C to use cords instead of my home-grown and buggy cons-cells implementation.
The difference in performance was startling! The very first implementation ran through all the 2 million iterations in about 10 minutes on my PC and in about 4 minutes on Yumpee's PC. The code had also become much simpler to read and much closer to the pseudo-code provided in the task description. I debugged it a little to remove a bug that resulted from a misreading of the specification and then handed it over to Yumpee. By this time, he had a working RNA-to-image converter (he was also busy barking up the wrong trees till that time) and had also discovered the magic 28-base prefix. Unfortunately for us, we had just nine hours left to finish the task.
We discovered that the fun had just started. Embedded in the DNA were a series of clues that formed a virtual treasure-hunt, involving encrypted pages, EBCDIC-encoded text, fractals, callable functions, etc. We discovered a lot of clues but we couldn't find a prefix in time that would improve our rankings. It felt good that we could make so much progress by the end of the contest, unlike the last time. However, I felt like an idiot for not having listened to Yumpee earlier and saving myself and our team a lot of wasted time.
As I always end up saying after such contests, I hope we have a better luck the next time.
Some more reports: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.
Update (2007-07-26): I tweaked the code for the DNA-to-RNA converter a little bit to avoid creating a lot of unnecessary cords and that reduced its running time to about 1 minute 35 seconds on my PC (about 7 times better than before) and about 30 seconds on Yumpee's PC. Right now the garbage collector itself is the biggest consumer of CPU cycles (about 45% on the whole), but disabling it leads to the programme quickly running out of memory on my PC.
Labels:
icfpc,
iit,
programming
2006-12-14
Articles by Dheeraj Sanghi
Dheeraj Sanghi is a professor in the Computer Science and Engineering (CSE) department at IIT Kanpur. Our batch, the CSE class of 1996, studied "Computer Networks" under him and he was an active patron of the Association of Computer Activities (ACA).
Recently I stumbled upon a collection of his articles on various topics, including career counselling for students who want to study CS in India, improvements to the undergraduate programme in IIT Kanpur, views on the recent move by the Indian government to impose quotas in the IITs, etc.
Though I found myself disagreeing with some of his points, I found these articles quite interesting as they touch upon topics that I have been thinking about in recent times.
Recently I stumbled upon a collection of his articles on various topics, including career counselling for students who want to study CS in India, improvements to the undergraduate programme in IIT Kanpur, views on the recent move by the Indian government to impose quotas in the IITs, etc.
Though I found myself disagreeing with some of his points, I found these articles quite interesting as they touch upon topics that I have been thinking about in recent times.
2006-06-22
Libraries
I really miss having access to a well-stocked and conveniently accessible library of books here in Bangalore. I really like reading books and it is not possible for me to buy all the books I fancy reading. It is not just a question of being able to afford these books; I do not have enough space in my home to stock all such books and they are a pain when moving houses.
2006-02-13
IITs All About
Someone has uploaded CBS's "60 Minutes" feature on the IITs to Google Video (thanks to Arpana for pointing me to this). There is a lot of truth accompanied by the usual hyperbole in this video with unfortunately some of the interviewed students also playing along. For instance, not everyone likes to stay up all night preparing for examinations, that too accompanied by a doting tea-making mother. Certainly not everyone has their entire family dropping them off at the examination centre and has them hanging around for the entire duration of the examination. "Puh-sycho!" It is also too much of a stretch to put the IITs above MIT, Princeton, CMU, etc. - they are each good and bad in their own ways.
For the IIT Kanpur junta, someone else has also uploaded the musical production targetted at nostalgic alumni "Din Bhar" to Google Video. You must see this at least once.
(Originally posted on Advogato.)
For the IIT Kanpur junta, someone else has also uploaded the musical production targetted at nostalgic alumni "Din Bhar" to Google Video. You must see this at least once.
(Originally posted on Advogato.)
Labels:
advogato diary,
iit,
nostalgia
2005-05-09
IITs
(Disclaimer: I am an alumnus [B.Tech.] of IIT Kanpur with Computer Science as my major, so read the following rant with this fact in mind and your preferred measure of salt.)
I feel that in the last few years the Indian government, irrespective of the party that comes to power, seems hell-bent on killing these awesome engineering institutions which have a largely justified and well-earned reputation worldwide. They started by almost doubling the number of available seats in some departments, notably Computer Science, then by increasing the number of institutes branded "IIT" (from the original 5 to a current target of 11 in total) and finally by making the entrance exams (JEE) much easier for students to crack!
In the short term, they will be able to boast of a much larger number of IIT graduates but will they be able to ensure the quality as well? Are they not killing by dilution the very brand that they wish to exploit? Sheesh!
(Originally posted on Advogato.)
I feel that in the last few years the Indian government, irrespective of the party that comes to power, seems hell-bent on killing these awesome engineering institutions which have a largely justified and well-earned reputation worldwide. They started by almost doubling the number of available seats in some departments, notably Computer Science, then by increasing the number of institutes branded "IIT" (from the original 5 to a current target of 11 in total) and finally by making the entrance exams (JEE) much easier for students to crack!
In the short term, they will be able to boast of a much larger number of IIT graduates but will they be able to ensure the quality as well? Are they not killing by dilution the very brand that they wish to exploit? Sheesh!
(Originally posted on Advogato.)
Labels:
advogato diary,
iit
2004-07-05
"Things a Computer Scientist Rarely Talks About"
My friend Ananth has just gifted me a copy of "Things a Computer Scientist Rarely Talks About", a very different book by Donald Knuth. Having admired and being in awe of most of Knuth's work and being an atheist, this book should make some interesting reading for me.
Other recently acquired books on my "reading pending" list (besides almost the whole of my library) include "The Art of Unix Programming" by Eric Raymond and "Five Point Someone" by Chetan Bhagat, a novel about life in an IIT.
(Originally posted on Advogato.)
Other recently acquired books on my "reading pending" list (besides almost the whole of my library) include "The Art of Unix Programming" by Eric Raymond and "Five Point Someone" by Chetan Bhagat, a novel about life in an IIT.
(Originally posted on Advogato.)
Labels:
advogato diary,
books,
iit,
programming
2004-03-29
History of GCJ
Some time back, I stumbled upon "Cygnus Foundry Java Edition - Architecture and Design Manual". This document is somewhat dated and describes the plans more than what really has been implemented in GCJ, but it is still a good read and I would highly recommend it to anyone trying to understand GCJ.
I had my "Aaaahhhh!!" moments of comprehension reading this document, especially with the "Stack Slot Compilation", "Class Metadata", "Debugging Interpreted Java", "C/C++ and Compiled Java", etc sections.
Mark Wielaard (mjw) has created "Planet Classpath", a wonderful consolidation of weblogs maintained by GNU Classpath hackers. Kudos to him.
This weekend I spent time reading "The Java Virtual Machine Specification", something that I should have a done a very long time ago. I didn't finish it and I didn't understand everything, but a lot of things have become much clearer, including the meaning of (ID[Ljava/lang/String;)Ljava/lang/Thread;. ;-)
I also played around with Jasmin, an assembler for the JVM.
In 1995-96, when I was in the final year of my undergraduate studies at IIT-K and Java was this new and cool language for creating web animations, most of us learnt this language and played around with it. "HS" was a guy who went beyond the language and used to play with the class file format and raw JVM instructions - he was promptly labelled a "weirdo" and people used to make fun of him behind his back, but were still in awe of him.
I am such a weirdo myself now. :-)
But seriously, the JVM architecture is quite simple and the instruction set is quite high-level and simple - no decent Java programmer will have much difficulty in understanding it. It is worth a dekko.
(Originally posted on Advogato.)
I had my "Aaaahhhh!!" moments of comprehension reading this document, especially with the "Stack Slot Compilation", "Class Metadata", "Debugging Interpreted Java", "C/C++ and Compiled Java", etc sections.
Mark Wielaard (mjw) has created "Planet Classpath", a wonderful consolidation of weblogs maintained by GNU Classpath hackers. Kudos to him.
This weekend I spent time reading "The Java Virtual Machine Specification", something that I should have a done a very long time ago. I didn't finish it and I didn't understand everything, but a lot of things have become much clearer, including the meaning of (ID[Ljava/lang/String;)Ljava/lang/Thread;. ;-)
I also played around with Jasmin, an assembler for the JVM.
In 1995-96, when I was in the final year of my undergraduate studies at IIT-K and Java was this new and cool language for creating web animations, most of us learnt this language and played around with it. "HS" was a guy who went beyond the language and used to play with the class file format and raw JVM instructions - he was promptly labelled a "weirdo" and people used to make fun of him behind his back, but were still in awe of him.
I am such a weirdo myself now. :-)
But seriously, the JVM architecture is quite simple and the instruction set is quite high-level and simple - no decent Java programmer will have much difficulty in understanding it. It is worth a dekko.
(Originally posted on Advogato.)
Labels:
advogato diary,
gcc,
gcj,
iit
Subscribe to:
Comments (Atom)
