*** MOVED ***

NOTE: I have merged the contents of this blog with my web-site. I will not be updating this blog any more.

2006-07-31

Decadence

I complete 10 years in the software industry today.

Ten years ago, as a lad fresh out of college and fascinated with computer programming, I felt tremendously excited that I had got a job where I could write computer programmes all the time.

That delusion was rather short-lived. I soon discovered that programming is actually a very small part of software engineering. Much of my time is spent writing specifications, communicating with my teammates and clients via email, conference calls and meetings, writing documentation, fixing bugs, giving demonstrations and presentations, etc. I spend more time within Microsoft Word than within Vim.

When I do get to write code, I have to write the simplest of programmes. The requirements almost never merit anything sophisticated. On the rare occasions where better data structures and algorithms can help, I still cannot use them since the code then becomes unmaintainable by the "average" software engineer. For an underperforming application, the standard solution seems to be to use a bigger machine instead of making the application more efficient. Developing "enterprise" software can be really dull and depressing for someone even mildly interested in programming.

FOSS.IN 2006

FOSS.IN/2006 will be held from November 24 to 26 at the J. N. Tata Auditorium in IISc Bangalore. This event used to be called "Linux Bangalore" and is justly touted as "India's Premier Annual Free and Open Source Software Event". It has been organised every year since 2001 by the Bangalore Linux Users' Group (BLUG). Atul Chitnis's diary entry has some more information.

I have personally attended only two of these events and I regret not having attended more. FOSS.IN/2005 was fairly good and mostly fun. I hope to be able to attend this year's event as well.

2006-07-25

ICFPC 2006

I participated in the ICFP programming contest for 2006 over this weekend. I was taking part in it purely for the fun of it (yes, really). It turned out to be a very interesting task. Check out the task description to see what I mean.

You have to first implement an interpreter for a simple but quirky virtual machine called the Universal Machine (UM). You then use this interpreter on the given data file ("codex.umz") to run a small programme that takes a password and uses this password to decrypt and uncompress a binary image. You then use the interpreter on this extracted binary image to discover a small UNIX-like multi-user system where you can log in as "guest". This system, called "UMIX", includes a set of common UNIX utilities like "ls", "rm", "cat", "more", etc., an interpreter for BASIC (called "Qvikbasic", that only recognises numbers written with Roman numerals) and a file-transfer utility called "umodem".

Using the guest account, you complete a simple password-cracking utility written in BASIC and use it to discover the passwords of a couple of other user accounts. When you log in as those users, you find more tasks that you can complete to discover the passwords of other user accounts, which yield even more tasks and so on till you get the "root" password. For example, one of the tasks was in the form of an Adventure-like game and another was using a "2-dimensional" programming language, complete with ASCII-art boxes for each function, to solve a given problem.

It was incredible and this is easily one of the best ICFPC tasks I have ever attempted.

You might also want to check out Mark Probst's account, Mauricio Fernandez's account, Gregory Brown's accounts (1, 2, 3, 4), etc.

The organisers also provided a handy benchmark for testing your UM implementation (called "SANDmark") and a reference UM implementation (in the form of a UM binary of course). If you want to quickly check it out, here is my UM implementation (written in C). You can use "(\b.bb)(\v.vv)06AAVKIru7p0OmCvaT" as the password to extract the UMIX binary image from "codex.umz" using this interpreter. (Paul Ingemi has even implemented a JIT-ing UM using LuaJIT!)

When I first implemented the UM, it ran quite slowly - it took a very long time for the UMIX login prompt to appear and over three hours just to compile "hack.bas" with Qvikbasic. I first tried simple tricks like converting indexed array accesses to incremental pointer accesses, caching values that were being used often, using a function-pointer-based dispatcher table to decode and execute instructions instead of a big switch statement, etc., but nothing helped. I found that the UM binaries allocate and deallocate a lot of "arrays" of various sizes and, inspired by one of the posts to the ICFPC 2006 mailing list, I even implemented a system where arrays were only allocated in sizes of the smallest power of two larger than the requested size and were recycled internally instead of being returned to the operating system. However, even that did not help much.
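For the curious, the power-of-two recycling idea can be sketched roughly as follows. This is a Python illustration rather than my actual C code, the names `size_class` and `RecyclingPool` are made up for this sketch, and I round up to the smallest power of two that is at least the requested size, which is an assumption about the exact rounding rule.

```python
from collections import defaultdict


def size_class(requested):
    """Return the smallest power of two that is >= requested (minimum 1)."""
    if requested <= 1:
        return 1
    return 1 << (requested - 1).bit_length()


class RecyclingPool:
    """Hands out zero-filled buffers in power-of-two sizes and reuses freed ones."""

    def __init__(self):
        # Maps a size class to the list of released buffers of that size.
        self._free = defaultdict(list)

    def allocate(self, requested):
        capacity = size_class(requested)
        bucket = self._free[capacity]
        if bucket:
            buffer = bucket.pop()
            buffer[:] = [0] * capacity  # Reset in place before reuse.
            return buffer
        return [0] * capacity

    def release(self, buffer):
        # Keep the buffer for later reuse instead of discarding it.
        self._free[len(buffer)].append(buffer)
```

A request for 5 words gets a buffer of 8, and once that buffer is released a later request for 7 words gets the very same buffer back.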

That is when I decided to stop fooling around and ran the interpreter through qprof (I feel gprof and oprofile are not as immediately useful as qprof is - try it out and you will probably agree). I discovered that the real bottleneck was in the code that searched for a free slot to assign an index to a newly allocated array. When I eliminated that bottleneck by explicitly keeping track of freed slots, the performance of the interpreter improved drastically and it became usefully responsive. I eliminated the power-of-two array size foolery and it was still pretty responsive - malloc()/free() are really implemented pretty well in GNU libc and it is not usually worth it to second-guess them (except perhaps in extreme cases). (Also check out Doug Lea's notes on the design of a memory allocator.)
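Here is a minimal sketch of what "explicitly keeping track of freed slots" means, written in Python for brevity rather than lifted from my C code. The class name `ArrayTable` and its methods are hypothetical, and reserving slot 0 for the running programme is an assumption of this sketch. The point is that allocation pops an index off a stack of freed slots instead of scanning the whole table for a hole.

```python
class ArrayTable:
    """Maps small integer indices to arrays, reusing freed indices in O(1)."""

    def __init__(self):
        self._arrays = [None]   # Slot 0 is reserved (assumed to hold the programme).
        self._free_slots = []   # Stack of indices that have been abandoned.

    def allocate(self, size):
        new_array = [0] * size
        if self._free_slots:
            # Reuse the most recently freed slot: no linear search needed.
            index = self._free_slots.pop()
            self._arrays[index] = new_array
        else:
            # No freed slot available, so grow the table by one entry.
            index = len(self._arrays)
            self._arrays.append(new_array)
        return index

    def abandon(self, index):
        self._arrays[index] = None
        self._free_slots.append(index)

    def get(self, index):
        return self._arrays[index]
```

Both `allocate` and `abandon` take constant time here, whereas a search for the first empty slot gets slower as the table fills up.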

One of the wicked ideas from the mailing list did help, however: using the pointer to an array's memory as its index, instead of maintaining an "array of arrays", yielded a 20% boost in the performance of my UM implementation (as measured by SANDmark) but reduced its "portability" to only those machines where both integers and pointers are 32 bits wide.

The lesson learnt? We programmers are really quite bad at guessing the hotspots in our programmes. We should always use a profiler to find out where the real bottlenecks are in our programmes.

By the way, a few hapless souls discovered that the lack of unsigned integers in Java makes it unnecessarily difficult to implement something like this in Java. Why exactly couldn't the creators of the language provide unsigned variants of the integral types?
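To see why this hurts, here is a small Python sketch, assuming the machine's values are 32-bit unsigned words. The helper names are made up for illustration. Addition and multiplication survive a signed representation if you mask the result, but division gives a different answer once the top bit is set.

```python
MASK32 = 0xFFFFFFFF


def as_signed32(value):
    """Reinterpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def add_u32(a, b):
    """Unsigned 32-bit addition that wraps around on overflow."""
    return (a + b) & MASK32


def div_u32(a, b):
    """Unsigned 32-bit division."""
    return (a & MASK32) // (b & MASK32)


def div_as_signed32(a, b):
    """What you get if the same bit patterns are divided as signed integers,
    truncating toward zero as signed integer division does in Java."""
    sa, sb = as_signed32(a), as_signed32(b)
    quotient = abs(sa) // abs(sb)
    return -quotient if (sa < 0) != (sb < 0) else quotient
```

With these helpers, `div_u32(0xFFFFFFFE, 2)` is `0x7FFFFFFF`, while `div_as_signed32(0xFFFFFFFE, 2)` is `-1`, since the same bits read as a signed integer mean -2.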

2006-07-21

Validating XML Documents

I was writing a couple of XML documents conforming to certain XML Schemata the other day. I was looking for a simple command-line tool that would take an XML document and an XML Schema and check if the document was well-formed and really conformed to the given XML Schema (I did not want to use web-based validation). I could have written a tool like this myself but I was feeling rather lazy and just wanted to quickly download a tool from somewhere to do this.

It turned out to be a surprisingly frustrating task and eventually took more time than what I would have taken to write it myself. Perhaps my Google queries were not good enough, perhaps people are just happy with their respective IDEs, perhaps everyone just writes their own little tool around the usual XML parsing libraries, perhaps people are not so anal about writing XML documents that strictly conform to the applicable XML Schema, etc. - I don't know why, but it took me a while to locate such a tool.

I first used Sun's Multi-Schema XML Validator (MSV) and it worked for me for a while, but then it tripped with a StackOverflowError on a particular XML Schema that I had to use, so I had to abandon it. I next tried XMLStarlet, but the error messages it spewed were a bit confusing and it did not fully support XML Schemata, so I abandoned it as well. I am now using a little tool called "DOMCount" that is included with Apache Xerces. It ostensibly parses a document and prints out the number of DOM elements it has encountered, but it also works fairly well as a document validator. The error messages shown by this tool, while better than those from XMLStarlet, can still be confusing sometimes, but I can live with it for the moment.
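The "write it yourself around a parsing library" option is at least easy for the well-formedness half of the problem. Below is a minimal Python sketch using only the standard library; the function name `check_well_formed` is made up for this example. Note that it does not check conformance to an XML Schema at all: Python's standard library has no schema validator, so that part would still need a separate tool or library.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, message) if xml_text parses, else (False, error details)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        # ParseError.position is a (line, column) tuple for the failure point.
        line, column = error.position
        return False, f"line {line}, column {column}: {error}"
    return True, "well-formed"
```

Calling `check_well_formed("<a><b></a>")` reports a failure because the `b` element is never closed, while `check_well_formed("<a><b/></a>")` succeeds.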

While creating these documents from the appropriate XML Schemata, I found xs3p [link currently seems broken] to be really useful. This is a stylesheet that takes an XML Schema and generates a pretty HTML document from it that you can use to understand the original XML Schema and easily navigate through its structure. I used Apache Xalan to generate the HTML documents.

New Delhi, US Visa and Kingfisher Airlines

I was in New Delhi at the beginning of this week to appear for an interview at the US embassy for a visa application for a business trip to the US in the near future.

This was my very first visit to New Delhi. I have finally fulfilled a long-held desire to see all the four metropolitan cities of India, viz. Delhi, Bombay (Mumbai), Madras (Chennai) and Calcutta (Kolkata). I was there for just a little over 24 hours, but I was still quite impressed with what I saw - wide, pothole-free roads with traffic zipping along at 50-60 kmph, lots of trees, everything spread out instead of congested together, etc. It did not feel like an Indian city at all. I am told that what I saw was just "New" Delhi and that Old Delhi is just like any other Indian city in terms of congestion and chaos. It's also rather myopic of me to judge a city by what I saw on my trips back and forth between my hotel in Connaught Place, the domestic and international airport terminals (a distance of about 20-25 km) and the US Embassy.

However, others who have stayed in Delhi as well as Bangalore also insist that Delhi is a far cleaner city, has much less traffic congestion and is a much larger city than Bangalore. I tend to agree based on what I have seen. I did not find Bombay, Madras and Calcutta much better than Bangalore - the awful amount of time you spend commuting in Bombay, the utter lifelessness of Madras for a youngster and the insanely congested traffic in Calcutta made me realise how much better off I was being in Bangalore than in any of these cities. With Delhi, I wasn't so sure.

While applying for a non-immigrant US visa, you have to fill out DS-156. One of the queries in this form that never fails to amuse me is:

Do you seek to enter the United States to engage in export control violations, subversive or terrorist activities, or any other unlawful purpose? Are you a member or representative of a terrorist organization as currently designated by the U.S. Secretary of State? Have you ever participated in persecutions directed by the Nazi government of Germany; or have you ever participated in genocide? [Yes/No]

Do they really expect anyone to answer "Yes" to that query?

My flights for this trip were with the new Kingfisher Airlines. I was quite impressed by them and would rank them higher than Jet Airways, the only other domestic airline that I think has a reasonably good service. (My sister says that Paramount Airways is also quite good, but I am yet to travel with them.) Kingfisher has these little touches everywhere that score over Jet - the tickets are generally priced lower, you can select your seat over the Internet before checking in, when you arrive at the airport a valet takes your luggage, tags it, puts it through the security check and takes it to the check-in counter for you (no, I was not travelling Business Class but "Kingfisher Class", which is what they like to call Economy Class), there is an in-flight entertainment system to keep you occupied, the seats have slightly more leg room, etc. Oh yes, the air hostesses are really pretty too (with a bit too much make-up I think, but quite pretty nonetheless). Wonderful.

2006-07-20

ICFP Contest 2006

Those interested in programming might want to check out the ICFP Contest for 2006 which will be held during the coming weekend (21st to 24th July 2006). The problems are generally open-ended and fun to solve even if your entry does not make it to the top - check out the problems from the previous years' contests to see what I am talking about.

2006-07-19

All Your Blogs Are Belong To Us

We in India are no longer able to directly view blogs hosted on Blogspot, Typepad, Geocities, etc. This is because ISPs in India are blocking access to these popular sites acting on a government diktat to block some blogs "within the provisions of the Fundamental Right to free speech and expression granted in India's constitution" [sic]. Mridula was one of the first ones to notice and write about this blockade and the story has already made it to Slashdot and some of the mainstream media in India.

The government's diktat was to block a specific set of blogs ("to ensure a balanced flow of information") but since all the Blogspot URLs resolve to the same IP address, the blockade ends up blocking all blogs hosted on Blogspot. Ditto for Typepad, Geocities, etc. Since blogger.com is not blocked, I am able to post to my blog. I can also view my blog from my office since the Internet connection there is routed via a corporate proxy server located in the US. Using pkblogs.com, which was set up to allow people in Pakistan to view Blogspot blogs since they have a similar blockade in place, is another workaround, as is using a public proxy server or the Tor network using something like Torpark.

In 2003, a similar government diktat to block a specific Yahoo! Group had caused all Yahoo! Groups to become inaccessible from India. Thankfully that situation was resolved quickly and let's hope this issue is too.

A clueless bureaucracy eagerly assisted by a servile and clueless set of ISPs is not good news. Where is a cluebat when you need one?

2006-07-14

GCC Summit 2006

Just FYI, the proceedings of the GCC Summit for 2006 are now available online. They usually make for interesting reading for those even mildly involved with GCC in particular or with compilers in general.

2006-07-10

Presentation Skills

Most of us geeks seem to think that presentation skills are something for the marketroids or the suits to worry about. Even if we do have to present something, we only have fellow geeks in the audience who ought to be able to understand what we are talking about. However, geeks are humans too and humans need you to take care of certain things in your presentation for them to be able to fully appreciate and understand what you are talking about.

For example, in FOSS.IN/2005 most of the talks were about cool stuff but in my opinion many a presentation left a lot to be desired. Among the talks that I attended, Jim Zemlin's talk about the Linux Standards Base was one of the precious few where the speaker seemed to understand and care about the basics of presentation.

Here are a few suggestions for speakers based on personal observations:
  • Keep the points in your slides short - avoid verbiage. Use the points to drive your talk - do not expect the audience to read everything off the slides. You can annotate your slides with explanatory notes if you want to upload the slides somewhere for people who were not able to make it to your talk.
  • Find out how much time you will have for your presentation, factor in the time usually taken up for questions from the audience and possible delays due to the previous talk and structure your presentation appropriately. You can have "checkpoint" slides in your presentation that you can use to gauge how well you are doing with respect to your plan.
  • Use clear fonts (I personally prefer scalable sans-serif fonts) and use a relatively large point size. People in the back should be able to make out what is written on your slides.
  • Spread your points evenly across a slide. Try not to list more than seven points per slide; use continuation slides to break up long lists of points.
  • Use foreground and background colours that create good contrast. Remember that what looks good on your computer under ordinary lighting might look quite different when projected on a big screen in a darkened room.
  • Fancy transitions get pretty distracting pretty fast. Avoid them as much as possible except for things like progressively introducing the layers of a system, the phases of a handshake in a communication protocol, etc.
  • Do not keep your hands in your pockets all the time (a worse habit is to move them about in there). Gesticulate while talking - it's sometimes hard to know what gestures to use, so observe good speakers while they talk.
  • Make eye contact with the audience. Do not stare at a single person or throw fleeting glances at everyone. Hold eye contact with some random person for a little while, move on to another random person and so on.
  • Try an ice-breaker in the beginning to make your audience comfortable with you instead of just diving into your presentation. Humour is the best ice-breaker in my opinion but it takes talent to pull it off well. Some speakers use a quick show-of-hands survey. For one of his talks in FOSS.IN/2005, Andrew Cowie put on his DJ hat and played various high-tempo tracks from his playlist while the organisers arranged various things for the talk.
  • It is usually not a good idea to declare "Feel free to interrupt me when you have a question". You will quickly discover that this ruins your flow and that there are several jerks out there who would ask a question just for the sake of it and to make their presence felt. You can pause your presentation at various "milestone" points instead and check if anyone in the audience has any questions to make sure that they understand the stuff being presented before moving on to other things.
  • The people in the front row might just shout a question across to you instead of using a microphone. For the benefit of everyone else, repeat the question before answering it (possibly rephrasing it) so that they know what is being discussed.
  • Ask someone to record your presentation and watch the video later. You will discover a lot of things you were saying or doing that you were not even aware of and some obvious areas for improvement. For example, the first time I saw a video of my own presentation I was dismayed by the fact that I had put on a pretentious accent for some reason, that I kept saying "Ok?" irritatingly often and that I kept my hands in my pockets almost all the time.
  • Do not stand between the audience and the projection of the presentation slides obscuring their view.
  • Depending on the nature of the presentation, it might be a good idea to distribute supporting material (brochures, printouts of the slides, etc.) before the presentation so that the people in the audience have a little background information or a take-home refresher. Printouts of slides can also be used by people (given enough space) to write notes corresponding to the slides.
  • If you can help it, do not talk about something that you are not excited about yourself. You will very likely give a better presentation about something that you genuinely believe in and care for than otherwise.
I can go on and on like this, but most of these points are obvious if you have any common sense or if you observe a good speaker. You can also search the web for a great deal of material on presentation skills. If your company offers training on presentation skills, take it instead of sniggering and dismissing it as something for "losers".

Finally, here are a few suggestions for presentation attendees:
  • Please arrive on time. The cavalier attitude of most people towards punctuality causes unnecessary delays in starting a presentation and this has a cascading effect on the presentations that follow.
  • Please do not ask questions merely for the sake of asking a question. Before you ask a question, ask yourself if the question makes sense, if the question is relevant to the current talk, if the question would require an answer that would be better off discussed in a post-talk chat with the speaker, etc.
  • When you do ask a question, please use a microphone so that everyone else is able to make out what you are trying to ask. Please talk clearly and at a reasonable pace.
  • When using a microphone, do not ask a question while seated - it becomes very difficult for people to locate you. Stand up while you ask the question.
  • Please do not come to a presentation merely to check emails, chat with friends, browse the web, etc. on your laptop using the wireless network made available by the organisers. It is rude to the speaker and distracting to your neighbours.
  • Please switch off your mobile phone during the talk or at least switch it to vibrating mode. If you absolutely must take a call, please leave the room and take it outside so that you do not disturb others.
  • Try to be a bit discreet before walking out of a presentation - try not to walk out at all if you can help it. It is disheartening for a speaker to see people walk out of his presentation.
  • Try to hold off on that urge to talk to your neighbour. It disturbs everyone else.
Once again, you would think that all of this is common sense but it is surprising how many people are willing to forget all of it when they attend a talk.

2006-07-08

Books on C and C++

Tarandeep asked me what books on C and C++ I would recommend for someone who knows a bit of each of these programming languages. My problem is that I do not generally like reading books specific to a given programming language. In addition, I do not know C++ properly enough to be able to discern a genuinely good book on C++ from a mere pretender. He still insists that I write down a list of such books. I am therefore putting this list as a blog post in the hopes that people more knowledgeable about such things would help him out. We did search for such lists on the web but I was frankly not satisfied with the lists that we could readily find.

Here are the books on C that I would readily recommend:
  1. "The C Programming Language", Second Edition, by Brian Kernighan and Dennis Ritchie.
  2. "Expert C Programming - Deep C Secrets" by Peter van der Linden.
  3. "C Traps and Pitfalls" by Andrew Koenig.
(See also: List of books recommended in the comp.lang.c FAQ.)

Here are the books on C++ that I think should be useful:
  1. "The C++ Programming Language", Third Edition, by Bjarne Stroustrup.
  2. "Effective C++", Third Edition, by Scott Meyers.
  3. "Essential C++" by Stanley Lippman.
I did not particularly like Stroustrup's book, but it served as a useful reference when programming in C++.

By the way, many people do not like "The C Programming Language" but I am one of those who just love this book. It is a short book that is always to the point and has examples that teach you a lot about computer programming techniques and style. I agree that you should already know a bit about computer programming to fully appreciate this book. It was the book that I used to learn C. I love all of Brian Kernighan's books in general. He is one of the very few authors who have actually imbibed the lessons from "The Elements of Style".

In India, we have a few books on C and C++ written by some Indian authors that are terrible in my opinion but that unfortunately have been mandated as text books in several colleges here. The result is that many of the graduates who have not been exposed to other books form extremely warped ideas about these programming languages and about things like pointers. Sad.

2006-07-06

Pricing Your Time

We often hear clichés like "Time is Money" or "Lost Time is Lost Money". Most of us generally agree with these assertions but do not actually try to quantify the money associated with our time. We therefore rely on our intuition or mood to decide whether it is worth it to do something ourselves or pay someone else to do it on our behalf.

For example, does it make sense for us to fill out the relevant form and file our income tax returns ourselves or should we just pay the fees to an agent who will do it for us?

Suppose the agent charges 200 rupees for this service, you drive a motorcycle that gives you about 48 kilometres per litre of petrol in city traffic conditions, the price of petrol is 55 rupees per litre, the distance to the income tax office is about 15 kilometres and the parking attendant there charges 2 rupees. If you were to file the returns yourself (filing income tax returns online is not yet fully available in India), you would directly incur a cost of about 36 rupees in fuel and parking charges (the 30-kilometre round trip uses about 0.6 litres of petrol, or roughly 34 rupees, plus 2 rupees for parking). Assuming that it takes about 15 minutes for you to fill the form yourself and about 60 minutes for the trip to the income tax office and back, is it worth saving this time by paying the agent an extra 164 rupees for his service?

A couple of my friends and I used to amuse ourselves by trying to calculate our time's worth using our salary as a guide. Assume that your annual salary is about 4,00,000 rupees, your employer expects you to work about 8 hours every day from Monday to Friday and grants you 15 days of leave in a year. With a little over 52 weeks in a year, that is about 246 working days or 1,968 working hours. This means that your employer is willing to pay you about 203 rupees for every working hour.

So in this particular case, at least from your employer's perspective, it is better for you to pay the agent the extra 164 rupees that he is asking for rather than waste about 254 rupees in lost (hopefully productive) time.
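If you want to plug in your own numbers, the arithmetic above can be sketched in a few lines of Python. The function names are made up for this example, and the 246 working days are derived here as 365/7 weeks of 5 working days minus the leave, rounded, which is my assumption about how that figure comes about (public holidays are ignored).

```python
def trip_cost(one_way_km, km_per_litre, petrol_price, parking_fee):
    """Fuel cost of the round trip plus the parking charge, in rupees."""
    return 2 * one_way_km / km_per_litre * petrol_price + parking_fee


def hourly_rate(annual_salary, hours_per_day=8, leave_days=15):
    """Salary per working hour, for a Monday-to-Friday week."""
    working_days = round(365 / 7 * 5 - leave_days)
    return annual_salary / (working_days * hours_per_day)


agent_fee = 200
own_cost = trip_cost(one_way_km=15, km_per_litre=48, petrol_price=55, parking_fee=2)
agent_premium = agent_fee - own_cost      # Extra money paid to the agent.

minutes_spent = 15 + 60                   # Filling the form plus the trip.
time_value = hourly_rate(400_000) * minutes_spent / 60

# Paying the agent is "worth it" when your time costs more than the premium.
worth_paying_agent = time_value > agent_premium
```

With the figures from this post, `own_cost` comes to about 36 rupees, `agent_premium` to about 164 rupees and `time_value` to about 254 rupees, so `worth_paying_agent` is true.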

Technically we should also consider other factors like the price we are willing to pay to save ourselves the effort (in addition to the time) involved in the task, the amount of trust we place in the agent to do the job correctly and on time, our willingness to share the information about our income with a third party, the amount of masochistic desire we have to do our job ourselves, etc. This is therefore a very crude measure of the monetary value of your time.

A professor of Economics has come up with another formula to calculate the monetary value of your time and you might have your own method of calculating this value. Whatever method you use, these methods can be used to quickly tell if it could be worthwhile to do something ourselves or pay someone else to do it for us.

(I wrote this piece purely for its amusement value - it should not be taken too seriously and should merely be used as an indicator in the spirit of Burgernomics.)

2006-07-03

Superman Returns

I watched this movie over the weekend and was somewhat disappointed. The special effects were decent and more natural than in the original series of movies, as was to be expected, but the plot just had so many holes and the acting was so so-so that I was wondering how Bryan Singer and Kevin Spacey who gave us "The Usual Suspects" could have also given us this.

The New Yorker's Anthony Lane has written a far more eloquent critique of the movie than I can ever hope to write, but I would add that Brandon Routh is also about as good-looking as Christopher Reeve and is unfortunately about as wooden an actor. Of course, they are still nothing compared to Keanu Reeves when it comes to having a consistent lack of expressions throughout a movie. Perhaps having a chiseled good-looking face implies that your facial muscles are in a permanent rigor mortis.

There are some things that I would never understand about superhero stories. For one, why do they always have the same villain in story after story either as the main villain or as a willing aide to the main villain? Superman has Lex Luthor, Batman has the Joker, the X-Men have Magneto, He-Man has Skeletor, etc. Do fans never get tired of seeing the same villains bugging their superheroes in episode after episode? Do they never wonder why, if their superhero is all he is cracked up to be, he is not able to get rid of this villain for good? Are they in fact aware of this irony and actively relish it?

Another thing that bugs me about superheroes is the need for almost all of them to have a mild-mannered alter ego. Why? And why can't other people recognise them in most of the cases? Clark Kent as the alter ego of Superman is particularly worrisome - does the addition of spectacles so change the facial appearance of a person that even someone close to them, like Lois Lane is to Superman, is unable to recognise them?

Yet another thing that really irritates me about superhero stories is the mess that all the hundreds and thousands of stories and story branches create. Again, Superman is the perfect example of this mess. Are there Supergirl, Superboy and Krypto, the Superdog, or not? Does Superman have a son or not? Has Superman died or not? Et cetera. What is the canonical Superman storyline?

Finally, why do most superheroes wear their underwear outside of their tights? What is it about superpowers that affects their sartorial sensibilities? In our college, a mild form of ragging involved the seniors making the freshers wear their underwear outside of their trousers, tying a bedsheet or a shawl around their neck as a cape and making them run down the corridors of hostels screaming "I am Superman!". Some of my friends would also remember our batchmate, who is now a banker in Bombay, running through the corridors of our hostel one night in an obviously inebriated state, clad only in underwear and a bedsheet tied around his neck, screaming "I am Superman!".