2010-06-22
ICFP Programming Contest 2010
2010-01-23
Spawner in the Works
2009-06-30
ICFPC 2009
2009-03-01
The Tormentor
2009-01-19
The End of DDJ
2008-12-20
Decision Tables
if-then-else conditions that can be simplified quite a bit by the simple technique of using decision tables. A decision table is a compact yet exhaustive way of encoding conditions and the actions to be taken under those conditions. The rows and columns of a decision table denote the conditions and the cells denote the actions. Since it simplifies complicated conditional logic, it can make your code a lot easier to maintain and a lot less error-prone.
2008-10-16
EOF on the DDJ Subscription
2008-10-13
Indian Edition of the Purple Dragon Book
2008-07-17
ICFPC 2008
2008-07-02
Major Programming Contests This Month
There was also an article in DDJ about programming contests a couple of months back. It is an interesting read, but there is nothing new in there for those who are already familiar with TopCoder.
2008-05-24
ACM
2008-05-08
Dr Dobb's Journal
I first came across DDJ in the IIT Kanpur Library about 15 years ago. For someone interested in computer programming, it was a fantastic magazine. Along with BYTE (published only on-line after 1998), which was a fantastic magazine for people interested in microcomputers, it became regular reading for me at the library. I used to read new issues as soon as they became available in the library and would try to read as many of the old issues as I could lay my hands on (the library used to have bound volumes of old issues of these magazines dating back several years). The early issues provide a fascinating insight into the evolution of the personal computer.
When I started my career as a software engineer, I tried to subscribe to DDJ myself. At about Rs 3,000, the annual cost of the subscription at the time was a bit too much for me. This situation did not improve for several years after that. I had to console myself by ordering "Dr Dobb's CD Release 6", a CD that contained electronic versions of all the DDJ issues from January 1988 to June 1998.
When I noticed recently that the print issues of DDJ were available internationally at just $30 (about Rs 1,200) a year, I jumped at the opportunity to order a subscription for myself. Even though an electronic edition of an issue is immediately made available to subscribers, I always prefer to read the "dead-tree" (print) edition when possible. I eagerly waited for my copy of the print edition to arrive in my mailbox. I was setting myself up for frustration.
My copy of the November 2007 issue of DDJ arrived two and a half months after it was announced! To top that, the December 2007 issue did not arrive at all. To add insult to injury, my email to them inquiring about this problem was rejected by their "Barracuda Spam Firewall" with a terse message stating that "Message content rejected". As if to then really irritate me, they sent a letter in January 2008 asking me to renew my subscription ten months before it was set to expire!
Fortunately for me, the situation improved considerably from January 2008 and the issues started arriving regularly and well in time (as was the case many years ago, an issue of DDJ arrives about a fortnight before the corresponding month - for example, the issue for May 2008 was delivered in mid-April).
The first thing I noticed about these issues was that they were really thin - they seem to be about one-third the thickness of the old issues of DDJ from what I remember. Could that be the reason for the drop in prices? The second thing I noticed in the initial issues was that many of the articles were about web development and other such things in which I do not have that much interest. Was it going to be like this for the rest of the year?
Once again, fortunately for me, the more recent issues have featured some really nice articles. Articles like "Fast String Search on Multicore Processors" and "Detecting Bugs in Safety-Critical Code", for example, provide insights into areas beyond the realm of run-of-the-mill software development. Herb Sutter's "Effective Concurrency" is a really nice column. The issues for April 2008 (Algorithms) and for May 2008 (Programming Languages) in particular were quintessential DDJ.
Most elements of DDJ have not changed much over the years. The articles still have the same feel to them and most of the advertisements for software products are still embarrassingly corny. The PC-lint Bug of the Month series of advertisements is still there, though the bugs are now about C++ instead of C, a reflection of the changing times.
So would I still recommend this magazine to a programmer in India? Yes, definitely. Even with the cost of international shipping included, this magazine costs the same as or less than comparable Indian magazines (such as they are) and the quality and depth of the articles are usually much better.
2008-01-05
A Taste of Haskell
2007-10-11
ICFPC 2007: Epilogue
The organisers of the contest have an interesting report on the contest that also contains the "ideal" way one would go about solving the puzzles. Interestingly, Jochen Hoenicke managed to find a perfect DNA prefix some time after the contest was over. Impressively, Jed Davis came third by using a brute-force approach that won him the Judges' Prize - he was declared to be "an extremely cool hacker".
Update (2007-10-24): The organisers have now shared the video of their presentation about the contest at ICFP 2007.
2007-08-20
Calculating Interest Rates
In "Calculating EMIs", we derived the formula for calculating the "Equated Monthly Installment" (EMI) on a loan. If "E" represents the EMI, "P" represents the principal amount in the loan, "r" represents the monthly rate of interest (one way of arriving at it is to divide the annual rate of interest, quoted as a percentage, by 1,200) and "n" represents the number of months in the tenure of the loan, then:
E = P × r × (1 + r)^n / ((1 + r)^n - 1)
In the current example, we know the values for "E", "P" and "n" and wish to calculate "r". Unfortunately it is not that simple to calculate "r" using just the high-school algebra that most of us manage to remember. Fortunately there is a simple algorithm that can help us in this situation.
Let us first rewrite the formula above as an equation:
P × r × (1 + r)^n / ((1 + r)^n - 1) - E = 0
Our task now is to find the roots of this equation - that is, the values of "r" that will make the left-hand-side (LHS) of this equation evaluate to zero.
To find the roots of a given equation "f(x) = 0", the algorithm in question can be described as follows:
1. Find a value "a" for which "f(a)" evaluates to a negative value.
2. Find a value "b" for which "f(b)" evaluates to a positive value.
3. Let "c" be the average of "a" and "b".
4. If "f(c)" is close enough to zero, "c" is the desired root.
5. Otherwise, if "f(c)" is a negative value, substitute "c" for "a" and repeat the procedure from step #3; if "f(c)" is a positive value, substitute "c" for "b" and repeat the procedure from step #3.
Note that this is just a binary search over the interval (the bisection method). By "close enough to zero", we mean that the absolute value of "f(c)" is less than some value, usually called "epsilon", that can be as small as we please. The algorithm given above can be rewritten as a function in a pseudo-language as follows:
guessRoot( f, a, b)
{
c := (a + b) / 2;
if( absoluteValue( f( c)) < EPSILON)
return c;
else if( f(c) < 0)
return guessRoot( f, c, b);
else
return guessRoot( f, a, c);
}
You can implement this in your favourite programming language along with a function that calculates the LHS of the equation given earlier. You can choose a value of "epsilon" according to your preference - the smaller the value of "epsilon", the more accurate the result and the longer it takes to compute. The time taken for the computation is also affected by the width of the range between "a" and "b". Note that Newton's method is a much faster way of computing the roots of such equations, though it involves calculating derivatives.
How do you arrive at the values for "a" and "b"? This differs for each function. For our example, we can start with a low guess of "0.001%" ("0%" gives an undefined result) as the annual rate of interest and a high guess of "100%" and this gives us a negative and a positive value for the LHS respectively. With an "epsilon" of "0.00001", a C programme computes the answer in around 25 iterations.
In our example, the first dealer is offering us an effective loan of Rs 50,000 for 12 months with an EMI of Rs 4,380 and the effective annual rate of interest comes to about 9.32%. The second dealer is offering us an effective loan of Rs 40,000 for 36 months with an EMI of Rs 1,330 and the effective annual rate of interest comes to about 12.08%. In terms of the interest rates being charged by the dealers, you should now be able to tell that the first dealer has a better proposition for you when compared to the second dealer.
2007-07-25
ICFPC 2007
This year's task was very similar to that from last year, but it was a bit tougher and even more fun. The organisers surely must have spent a lot of time and effort in putting together the task description and the supporting materials. There were no mistakes discovered in the task description for this year's contest (unlike those for the previous years' contests). The mailing list also had unusually low traffic during the contest - perhaps it was because the task description was precise enough, perhaps it was because a lot of people were finding it difficult just to get started or perhaps the contest was affected by the release of the latest Harry Potter book during the same weekend.
This year's task was to save an alien named "Endo". Endo was a "Fuun" travelling in a spaceship named "Arrow", when it had a mishap with an "interstellar garbage collector" that picked it and then dumped it on Earth. This caused critical damage to both Endo and Arrow. Our job was to repair Endo's DNA in such a way that Arrow is able to restore it to its former condition using its limited remaining power.
Endo's DNA is a long sequence comprising four bases, 'I', 'C', 'F' and 'P'. This sequence can be interpreted to create an RNA sequence comprising the same four bases that in turn can be used to build the proteins necessary to restore Endo. Prefixes comprising the four bases can be attached to a DNA sequence to modify its behaviour - we had to find a cheap and low-risk prefix that Arrow could process to repair Endo's DNA.
This translates to creating an interpreter for the DNA sequence that produces an RNA sequence, creating another interpreter for this RNA sequence that creates an image out of it and finally discovering a prefix that could restore Endo. The organisers provided us the DNA sequence for Endo, a "source" image resulting from interpreting this DNA sequence and a "target" image representing a fully-restored Endo. We were asked to create a DNA prefix that results in an image that is as close to the target image as possible, while being as short and as cheap to process as possible.
The task description was detailed enough for us to readily build a DNA to RNA converter as well as an RNA to image converter. The challenge was to do the processing quickly enough since the given DNA sequence required about 2 million iterations over a string that was about 7.5MB long. I created the DNA-to-RNA converter, while Yumpee built the RNA-to-image converter.
Since we had to splice and join huge lists while interpreting the DNA, I started off with a C implementation that used Lisp-style "cons" cells. While it seemed to perform well (though not well enough for the task), it had hard-to-track bugs that caused segmentation faults, not to mention memory leaks. I then tried using Java for creating the interpreter, which resulted in a correct but even slower interpreter. By this time, I had wasted two of the three days allotted for the task. We had not made much progress, while a lot of teams seemed to have discovered the same "magical" prefix comprising 28 bases that had boosted their rankings.
This is when I finally decided to act on Yumpee's suggestion of looking into using "ropes" as the underlying data structure. He had been suggesting it repeatedly since the time he had seen the task description, but I kept putting it off since I was not familiar with it. I decided to check out the "cords" implementation of ropes that comes with Boehm-GC. I modified my original interpreter in C to use cords instead of my home-grown and buggy cons-cells implementation.
The difference in performance was startling! The very first implementation ran through all the 2 million iterations in about 10 minutes on my PC and in about 4 minutes on Yumpee's PC. The code had also become much simpler to read and much closer to the pseudo-code provided in the task description. I debugged it a little to remove a bug that resulted from a misreading of the specification and then handed it over to Yumpee. By this time, he had a working RNA-to-image converter (he was also busy barking up the wrong trees till that time) and had also discovered the magic 28-base prefix. Unfortunately for us, we had just nine hours left to finish the task.
We discovered that the fun had just started. Embedded in the DNA were a series of clues that formed a virtual treasure-hunt, involving encrypted pages, EBCDIC-encoded text, fractals, callable functions, etc. We discovered a lot of clues but we couldn't find a prefix in time that would improve our rankings. It felt good that we could make so much progress by the end of the contest, unlike the last time. However, I felt like an idiot for not having listened to Yumpee earlier and saving myself and our team a lot of wasted time.
As I always end up saying after such contests, I hope we have better luck next time.
Some more reports: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.
Update (2007-07-26): I tweaked the code for the DNA-to-RNA converter a little bit to avoid creating a lot of unnecessary cords and that reduced its running time to about 1 minute 35 seconds on my PC (about 7 times better than before) and about 30 seconds on Yumpee's PC. Right now the garbage collector itself is the biggest consumer of CPU cycles (about 45% on the whole), but disabling it leads to the programme quickly running out of memory on my PC.
2007-05-23
ICFP Contest 2007
There is already a blog written by one of the organisers that contains some teaser puzzles. (Do these images use some form of steganography or can we simply work out the graphical transformations applied to the original image and apply them in reverse to obtain the desired image? I wonder.)
2007-05-14
VU3RDD Gets $2.56 From Donald Knuth
2007-01-21
"Concepts, Techniques, and Models of Computer Programming"
There is a slightly-dated version of the book available online (PDF, 3.4 MB), if you want to preview some of the content before buying it. There is also an Indian edition of the book published by Prentice Hall of India (ISBN: 81-203-2685-7) and priced at Rs 450. The book's web site links to some reviews and you can also read my review of the book.
2007-01-20
Local Variables in Java
While he agreed to change the style of his code, he was still reluctant to move the declaration of a variable used only within a loop from outside it to inside it. For example, he was reluctant to change:
String s;
for( int i = 0; i < 10; i++)
{
s = String.valueOf( i);
}
to:
for( int i = 0; i < 10; i++)
{
String s = String.valueOf( i);
}
He believed that only one variable is created in the former case while 10 variables are created in the latter - clearly it is more efficient to declare a single variable outside the loop and keep reusing it inside the loop!
I then pointed out the section in the JVM specification that says that a JVM uses a fixed-size array for storing the values of local variables used in a method and each local variable maps to an index in this array. A Java compiler calculates the size of this array during the compilation of a method and declares it in the generated bytecode for the method.
Since he was still sceptical, I compiled both the variants to bytecode, used javap -c to produce the disassembled code and used diff to show that the generated code was the same in both cases (except for the indices used for s and i). I then used a slightly-modified version of the JVM Emulator Applet written by Bill Venners, running as a standalone application, to step through both bytecode variants and demonstrate that the size of the local variables array really remains constant throughout.
He was finally convinced.
On the other extreme, I have another colleague who is in the masochistic habit of introducing new scopes to isolate the local variables used only in a section of a method's body. That is, something like:
{
Foo x = wombat.snafu( );
// Use x here.
...
}
{
Bar y = new Bar( a, b, c);
// Use y here.
...
}
