rmathew: gcc


2006-07-14

GCC Summit 2006

Just FYI, the proceedings of the GCC Summit for 2006 are now available online. They usually make for interesting reading for those even mildly involved with GCC in particular or with compilers in general.

2006-06-30

Separating Debugging Information

Debugging information in an executable usually takes up a lot of space, so developers tend to "strip" an executable before shipping it. This, however, makes it difficult to diagnose problems in your programme reported by the users, especially when a problem is only reproducible in the reporting user's environment.

Microsoft's development tools support creating a separate "Program Database" (PDB) file for an executable containing debugging information for the executable. Its diagnostic tools support reading a PDB file for an executable, if one is available, to generate a more usable diagnostic report for a fault. You ship stripped executables to your users and when they report a fault, you ask them to install the corresponding PDB files to help you get a better diagnostic report. I think that this is a nice idea and I used to wonder if the GNU toolchain would ever support something like this.

Danny Smith pointed out to me that something similar is already supported by recent versions of GDB and binutils. You can use the --only-keep-debug option of objcopy for this purpose. For example:
gcc -g hello.c
objcopy --only-keep-debug a.out a.dbg
strip --strip-debug a.out
a.dbg now has the debugging symbols and a.out has been stripped of debugging symbols. When you want to debug a.out you can use:
gdb -s a.dbg -e a.out
Simple.

2006-06-12

GCJ for MinGW

I have updated my article "Building GCJ for Windows" and the associated scripts to work with the current SVN mainline sources of GCC (to be released as GCC 4.2). They might also work with GCC 4.1 sources, but I have not checked it myself. The article also has some tips for building GCC natively on Windows using the MSYS toolkit, especially to make the built and installed GCC relocatable (see below).

A major portion of the effort went into ensuring that the resultant toolchain was actually relocatable (that is, the installation can be archived and then extracted elsewhere, possibly on a different machine, and everything can still be expected to be working). The proper locations of the Windows headers and runtime libraries and the flags to pass to the GCC configuration scripts were something that took a lot of trial and error (and a lot of help from Mark Mitchell and Danny Smith) to get right, since I was trying to do something less common (building cross and crossed-native compilers) for a platform that gets the attention of very few GCC hackers as such, if at all.

I had stopped working on GCJ for Windows quite a while back, and the reason I had to update my article and scripts was that there seemed to be a lot of people trying to build GCJ for Windows themselves using the latest released or in-trunk sources (and my instructions and scripts) and they were running into all sorts of issues. Unfortunately, GCJ on Windows has become worse than it used to be, which is understandable since no one is actively working on improving it. It is also a shame since, even though Windows is a closed platform with an ugly design, it appears to have the largest number of users enthusiastically willing to try out GCJ.

We must do something about this situation.

QEMU
For a fan of Linux trying to make GCJ for Windows work, a very useful property of GCC is that it can be built on Linux as a cross compiler or as a crossed-native compiler targeting Windows. For a person with a relatively old machine and limited free time to hack on GCJ, this is also important since the build on Linux is way faster and far more reliable than that on Windows itself using MSYS. Equally important is the ability to test out the binaries created in this process without having to reboot the machine into Windows or having access over the network to another machine running Windows. Wine doesn't quite help since I need an environment that is as faithful to the real thing as possible.

QEMU running Windows on Linux comes to my rescue here. When run with the -kernel-kqemu option using the QEMU Accelerator ("kqemu"), the guest OS runs at very close to native speeds without adversely affecting the performance of the host OS. It has a built-in TFTP server that allows you to easily transfer files from the host machine into the guest system (there are also other ways of achieving this using QEMU, but this is the simplest). It's almost magical and is immensely useful. It's no wonder that virtualisation is becoming so popular these days and every developer who has tried it out sings its praises. If you are an "enterprise software" developer, you should already know what I am talking about. If you haven't tried it out yet, you really should. Virtualisation offers you the freedom and the flexibility to play around that is very useful and quite addictive.

2006-06-03

GCJ and ECJ

RMS has finally agreed to using ECJ to generate bytecode for GCJ!

That sound you hear is the huge collective sigh of relief heaved by Free Java hackers everywhere.

2006-05-31

GCJ: Quo Vadis?

Andi Vajda asked whether GCJ would cease to exist if Sun were to release the source code for Java and its tools under a really Free licence. I have also seen such questions asked on Slashdot, OSNews and other fora.

The response from Andrew Haley mirrors what I personally think of the situation - as long as there are hackers willing to maintain it, GCJ would continue to exist. Miguel de Icaza says that among the two types of hackers who usually work on Free Software in their spare time, GCJ and GNU Classpath only attract the Free Software ideologues and not those who want to get something free (gratis) working for them since Sun's JDK already works well for them and is free. Tom Tromey has a more philosophic take on the current situation - almost ascetic in fact. All this is enough to make a GCJ or GNU Classpath hacker reflect on the current state of Free Java, the utility of his contribution to it and the impact of a fairly Free release of Java source code from Sun. Despite being an extremely erratic contributor working on the fringes of Free Java, I cannot help doing the same.

It was about four years ago that I first flirted with GCJ. I wondered, for no particular reason as is usually the case with me, if it was possible to use GCJ to create native GUI applications with SWT in Java on Windows. I found out that the support was almost there and with a little effort and a lot of support from the GCJ hackers (especially Tom), I was able to add in that support and contribute it back to GCJ where it was accepted with minor modifications. That was my epiphany with Free Software. Until then I was more impressed by the fact that I could get so much decent-quality software for free than the liberty to make modifications to such software. But now I began to realise that having the source code available to you meant that you could change it yourself to fix its shortcomings and share your improvements with the other users. It also meant that the availability of the software did not depend on the solvency of the vendor or its willingness to maintain it. There were many other factors in favour of Free Software that became apparent to me over time. Suffice it to say that I finally understood what that twitchy, smelly and passionate preacher of Free Software was talking about all the time.

In time, my original itch died out (like so many of my digressions in life) but I continued to work on GCJ. I quickly moved from Windows to Linux (since that was the only platform that I enjoyed working on) and began fixing front-end bugs more out of a desire to help folks than to fix anything that was affecting any of my personal or professional work. This played a part in the fact that my track record with GCJ has been absolutely abysmal (except perhaps in the area of contributing the most noise to the GCJ mailing lists). My pathetic time-management skills and a propensity to be carried away by even the slightest distraction have also played a big role. Since I worked on GCJ in my free time at home and I did not want to compromise too much on my personal life (spending time with my family, watching movies, reading books, meeting friends, getting enough sleep, etc.), I rarely found the time to debug and test anything except the most trivial of bugs. Finally, my tendency to "wait and watch" (first for GCJX and now for ECJ) has not helped matters much either.

There are some inherent problems with GCC and GCJ too. The GCJ front-end seems to have been written in a hurry in order to get as many things working as fast as possible with not much thought given to overall maintainability. The people who originally wrote it have moved on to other things in life leaving others with little idea of how it all works. Subsequent hackers (including me) have always made incremental changes to fix immediate issues rather than perform any big refactoring of the code, with the natural result that the code has become even more unwieldy now than before. To fully bootstrap GCC, especially with checking enabled, and then to run its testsuite takes an awful amount of time, especially on slightly older hardware (like my otherwise perfectly capable P3-based system). This is a huge barrier for most prospective hackers. (I reserve my rant about the disastrous effects of bundling several language front-ends and their ever-bloating runtime libraries into a single compiler system for another day.) Until recently, the ubiquitous tree data structure was used in funky ways for almost everything in GCC. There is precious little documentation for a programme of this complexity and some parts of this documentation are out-of-date. The best way to understand stuff in GCC is to read through the source code, to watch the operation of the relevant parts in a debugger and to ask questions on the mailing lists when you do not understand something even after doing all this.

All these are problems that can be overcome one way or the other. The biggest problem with GCJ however is the sheer paucity of hackers willing to work on it to improve it compared to the number of people willing to use it and reporting problems with it. This situation is particularly severe for Windows. Were it not for Red Hat's sponsorship of some critical GCJ hackers (and the heroic efforts of Tom in particular), GCJ would have been in a very bad shape by now. This situation really makes me realise how true Miguel's observations are with respect to hackers of Free Software and Free Java.

A Free Java from Sun would not obviate the need for GCJ though. I personally feel that ahead-of-time compilation to native code providing more opportunities for aggressive optimisations (platform-agnostic as well as platform-specific) and a more straightforward integration with C/C++ via CNI are enough to show the utility of GCJ orthogonal to the status of the freedom provided by Sun's JDK.

This post has already become the longest I have ever posted, so I will reserve my rant about how Java the language and its bloated "standard" runtime are not even worth spending so much time and effort on in the first place, for another day.

2006-05-02

Planet GCC

There is now a Planet GCC aggregating the feeds from Planet Classpath and the blogs of a bunch of GCC hackers. If you know of a blog of a GCC hacker that is not directly or indirectly aggregated here, please let Dan know. Thanks to Dan for this initiative.

(Originally posted on Advogato.)

2006-04-17

GCC and Google Summer of Code 2006

GCC is looking for students interested in working in Google's Summer of Code on a project helping GCC.

(Originally posted on Advogato.)

2006-04-07

GCJ

I seemed to be in my element on the GCJ list this week, provoking a thread on the lack of good support in GCJ for Windows and eliciting a reply from the GCC Steering Committee on the status of the proposal to integrate ECJ into GCJ.

(Originally posted on Advogato.)

2006-04-03

More Front Ends in GCC

One of the great advantages of structuring a compiler such that the front-end, the middle-end and the back-end are relatively independent is that if you write M front-ends and have N back-ends, you get M*N compilers "for free" assuming you have a good enough intermediate representation in the middle-end. This idea was discussed as far back as the 1950s and UNCOL was an ambitious effort towards this goal. GCC is a stellar example of such a compiler - it supports C, C++, Java, Ada, etc. "out-of-the-box" and can target a whole bunch of platforms. You implement a language front-end for GCC and you immediately have a compiler for that language for a whole lot of platforms; you implement a target back-end for GCC and you immediately have compilers for several languages for that platform. Of course, this is grossly oversimplified, since you usually have to port the language runtime to a platform too, and since your language might strain the GCC intermediate representation or expose latent bugs in the middle-end, making the effort rather difficult. But the overall idea still remains valid.
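
The M*N idea can be shown with a toy that has nothing to do with GCC's real internals. In this minimal sketch, two made-up front-ends parse different surface syntaxes into one shared intermediate representation, and two made-up back-ends emit invented assembly for imaginary targets. Every name here (struct ir, parse_infix, parse_prefix, emit_stack, emit_reg, compile) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy intermediate representation: one addition of two integer constants. */
struct ir { int lhs, rhs; };

/* Front-ends: each accepts a different syntax but fills in the same IR.
   Both return 1 on success and 0 if the source does not parse. */
int parse_infix(const char *src, struct ir *out)    /* e.g. "2 + 3" */
{
    return sscanf(src, "%d + %d", &out->lhs, &out->rhs) == 2;
}

int parse_prefix(const char *src, struct ir *out)   /* e.g. "(+ 2 3)" */
{
    return sscanf(src, "(+ %d %d)", &out->lhs, &out->rhs) == 2;
}

/* Back-ends: each turns the IR into invented assembly for one target. */
void emit_stack(const struct ir *ir, char *buf, size_t n)
{
    snprintf(buf, n, "push %d; push %d; add", ir->lhs, ir->rhs);
}

void emit_reg(const struct ir *ir, char *buf, size_t n)
{
    snprintf(buf, n, "mov r0, %d; add r0, %d", ir->lhs, ir->rhs);
}

typedef int (*front_end)(const char *, struct ir *);
typedef void (*back_end)(const struct ir *, char *, size_t);

/* Any front-end can be paired with any back-end because they only meet
   at the IR: 2 front-ends and 2 back-ends give 4 "compilers". */
int compile(front_end fe, back_end be, const char *src, char *buf, size_t n)
{
    struct ir ir;

    assert(fe != NULL && be != NULL && src != NULL && buf != NULL);
    if (!fe(src, &ir))
        return 0;
    be(&ir, buf, n);
    return 1;
}
```

Calling compile (parse_prefix, emit_reg, "(+ 2 3)", buf, sizeof buf) pairs the second front-end with the second back-end, and adding a third back-end would make it available to both front-ends without touching either. The oversimplification is the same as in the text: a real language also needs its runtime ported and may strain the shared representation.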

The GNU Pascal Compiler (GPC) guys recently proposed an integration of GPC with GCC (in the same source repository, but on a different branch - weird). Some day, the GCC Scheme Compiler (GSC) guys, the PL/I for GCC guys, etc. might also want to integrate their front-ends with GCC. Having more front-ends in the GCC source tree itself means that middle-end changes do not inadvertently break these front-ends, latent middle-end bugs and unwarranted assumptions are exposed, general GCC enhancements are automatically applied, etc. So it's a good thing for GCC, in a way.

However, I personally think it is not a good idea. The GCC mainline is already quite bloated with a number of languages and runtimes and building all of the languages and their runtime libraries (thank you Sun for regularly increasing the bloat in the "standard" Java runtime with every release of the JDK) takes quite a while even on a decent system. Having more languages and their runtimes within GCC will only exacerbate this issue. I personally also feel (though I have no real practical experience in this area) that having so many languages share the middle-end does not let the optimisers make assumptions that they could otherwise use to perform stronger optimisations. A recurring problem in this area is the folding of constants, where languages like Java specify a bit too much with respect to what can be folded and how it should be folded.
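
To make the constant-folding point concrete, here is a minimal sketch using two rules I am sure of; the post itself does not say which rules cause trouble, so treat the choice as my assumption. Java requires int addition to wrap around in two's complement and int shifts to use only the low five bits of the shift count, whereas C leaves signed overflow and oversized shifts undefined. A shared folder therefore cannot treat the two languages alike. All function names below are hypothetical and are not GCC's.

```c
#include <assert.h>
#include <stdint.h>

/* Map an unsigned 32-bit pattern to the signed value with the same bits,
   without relying on an implementation-defined conversion. */
int32_t wrap_to_int32(uint32_t bits)
{
    if (bits <= (uint32_t) INT32_MAX)
        return (int32_t) bits;
    return (int32_t) (bits - 0x80000000u) + INT32_MIN;
}

/* Java-style folding of "a + b": the result must wrap around. */
int32_t fold_add_java(int32_t a, int32_t b)
{
    return wrap_to_int32((uint32_t) a + (uint32_t) b);
}

/* Java-style folding of "a << n": only the low five bits of n count. */
int32_t fold_shl_java(int32_t a, int32_t n)
{
    return wrap_to_int32((uint32_t) a << (n & 31));
}

/* A cautious C-style folder: fold "a + b" only when it cannot overflow,
   because C gives the overflowing case no defined result.
   Returns 1 and stores the sum on success, 0 if it declines to fold. */
int fold_add_checked(int32_t a, int32_t b, int32_t *out)
{
    assert(out != 0);
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

With these definitions, fold_add_java (INT32_MAX, 1) must yield INT32_MIN, while fold_add_checked declines the same operands. A middle-end that serves both languages has to carry both behaviours around, which is roughly the kind of over-specification the paragraph above complains about.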

On a slightly different note, the GSC guys have also created a "Hello World" front-end for GCC that shows you how to build a front-end for GCC for your favourite language.

On an entirely different note, I have ended up writing 3,000 lines of text in the user manual of a 4,000 line programme (both rough "wc -l" figures)! Either the manual is unnecessarily verbose or the programme is too complex.

(Originally posted on Advogato.)

ECJ for GCJ: Still in limbo

It has been almost a month since Tom formally proposed integrating ECJ in GCJ to the GCC Steering Committee (SC). There has been no word from the SC yet on this request. However, the SC did ask the GCC developers to avoid gratuitously including source code from external projects in GCC. One consequence of this for GCJ was the removal of fastjar from the GCC source tree. I'm not sure if the SC's decision was coincidental or in fact a result of deliberations triggered by Tom's request.

(Originally posted on Advogato.)

2006-03-06

GCJ and ECJ

Tom has asked the GCC Steering Committee to provide their verdict on the proposed use of the Eclipse compiler for Java in GCJ. This follows his earlier proposal to abandon GCJX for GCJ and adopt ECJ instead. As of this writing, there has been no response from the GCC SC yet.

(Originally posted on Advogato.)

2006-02-10

QEMU on steroids

First there was QEMU that provided a fairly fast emulation of x86 hardware using a technique called "dynamic translation". Then came kqemu (or QEMU Accelerator Module) that allowed user code (ring 3) to run directly on the actual hardware providing speedups of around 3-5 times. Now comes the -kernel-kqemu option that allows even some of the kernel code (ring 0) to run directly on the actual hardware providing impressive speedups over the old kqemu. Of course, these speedups come at the cost of affecting the stability of the host OS because of bugs in kqemu. kqemu is also not Free software, though it is free (gratis) for non-commercial uses.

In other news, GCC's SVN repository is also available for read-only access via HTTP for those who are stuck behind corporate firewalls and want access to the latest sources without having to download weekly snapshots. Of course, this is slower than the SVN protocol and might also be withdrawn if it contributes too much to the load on the server.

(Originally posted on Advogato.)

2006-02-09

"A look at GCJ 4.1"

Mark Wielaard has written another article for LWN.net titled "A look at GCJ 4.1" (where he also looks at GCJ 4.2 and beyond). It is subscribers-only for the moment (for a week), but if you are interested in Linux in any way, you should seriously consider subscribing to LWN. It's quite good.

(Originally posted on Advogato.)

2006-01-30

ECJ for GCJ?

Tom proposed killing GCJX and replacing it with the Eclipse compiler for Java (Eclipse JDT Core plug-in, known informally as ECJ). He has been almost single-handedly working on GCJX for more than a year and it looks pretty good already, so it is pretty courageous of him to be the one to propose using something else instead of GCJX in the overall interests of GCJ.

ECJ seems pretty good and very actively maintained. It must be one of the fastest Java compilers around and fully supports the new language features introduced in JDK 1.5. So it is a very good move for GCJ.

Using ECJ does introduce GCC bootstrapping issues though. However, it should be possible to easily overcome these issues. The bigger issues are political and legal in nature. Let us hope these are resolved favourably.

I personally feel a little sad though. This removes another "fun" part of GCJ even though it is pragmatically a better thing to do, especially considering the precious little resources that the GCJ project has. I feel that GCJ is becoming more and more an "integration" project combining the best-of-breed in Free software for a given task - the Java language compiler would be ECJ, the garbage collector is Boehm-GC, the runtime library is GNU Classpath and the optimisation and code-generation is done by GCC. Of course, this can hardly be characterised as bad and is in fact quite a sensible thing to do given the limited amount of resources that the Free software world has at its disposal, but...

(Originally posted on Advogato.)

2006-01-06

Virtual Address Space Randomisation and Debugging

I feel rather silly today. Even though I knew about virtual address space randomisation in newer Linux kernels, it never struck me that I should disable it to get a reproducible debugging session with predictable breakpoint conditionals. My silly workaround was to use this patch:


Index: tree-ssa-operands.c
===================================================================
--- tree-ssa-operands.c (revision 109196)
+++ tree-ssa-operands.c (working copy)
@@ -1460,6 +1460,16 @@ get_call_expr_operands (tree stmt, tree
   tree op;
   int call_flags = call_expr_flags (expr);

+  if (strcmp (lang_hooks.decl_printable_name (current_function_decl, 2),
+              "of") == 0)
+    {
+      const char *called_f
+        = lang_hooks.decl_printable_name (TREE_OPERAND (TREE_OPERAND (stmt, 0),
+                                                        0), 2);
+      if (strcmp (called_f, "_Jv_ThrowBadArrayIndex") == 0)
+        printf ("Hello \"_Jv_ThrowBadArrayIndex\"!\n");
+    }
+
   if (!bitmap_empty_p (call_clobbered_vars))
     {
       /* A 'pure' or a 'const' functions never call clobber anything.

and then put a breakpoint at the "printf" to get the debugger to stop the compiler process while processing the operands for the statement I was interested in.

Thanks to Mike Stump, we now have a page in the GCC Wiki that explains this problem and how to avoid it. Putting in the desired breakpoint is very simple now and avoids unnecessarily kludgy patches that contaminate the tree:


(gdb) b tree-ssa-operands.c:1463
Breakpoint 1 at 0x80d1f3f: file /extra/src/gcjx/gcc/gcc/tree-ssa-operands.c, line 1463.
(gdb) cond 1 stmt==0xb7c27fc8

Cool! Now all that is left is to use this breakpoint to figure out what the actual problem is that caused us to fire up a debugger.
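
A condition like stmt==0xb7c27fc8 only helps if the same object lands at the same address on every run, which is exactly what randomisation prevents. The following is a rough, Linux-only sketch of switching randomisation off from inside a programme, assuming glibc's personality() call and its ADDR_NO_RANDOMIZE flag. It is not taken from the Wiki page mentioned above, and the helper names (randomisation_disabled, reexec_without_randomisation, sample_heap_address) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/personality.h>
#include <unistd.h>

/* Returns 1 if the given persona value has address randomisation off. */
int randomisation_disabled(int persona)
{
    return (persona & ADDR_NO_RANDOMIZE) != 0;
}

/* Switch randomisation off for this process and restart the programme so
   that the new layout takes effect. Returns 0 if randomisation was already
   off, -1 on error, and does not return if the restart succeeds. */
int reexec_without_randomisation(char **argv)
{
    int persona;

    assert(argv != NULL && argv[0] != NULL);
    persona = personality(0xffffffff);   /* this value only queries */
    if (persona == -1)
        return -1;
    if (randomisation_disabled(persona))
        return 0;
    if (personality((unsigned long) persona | ADDR_NO_RANDOMIZE) == -1)
        return -1;
    execv("/proc/self/exe", argv);       /* restart with the new persona */
    return -1;                           /* execv returns only on failure */
}

/* Address of a fresh heap block: the sort of pointer value that a
   conditional breakpoint on an address depends on. */
uintptr_t sample_heap_address(void)
{
    void *p = malloc(64);
    uintptr_t address = (uintptr_t) p;

    free(p);
    return address;
}
```

A main() that calls reexec_without_randomisation (argv) first and then prints sample_heap_address () should print the same value on every run once the restart has happened, whereas without the call the value would normally differ between runs on a kernel with randomisation enabled. Some sandboxes refuse this personality() request, in which case the helper simply returns -1.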

(Originally posted on Advogato.)

2005-12-21

Planet GCC

I set up a simple Wiki page listing the weblogs of various GCC hackers as a temporary measure till such a time that we have a "Planet GCC" weblog feed aggregator of our own. Please feel free to update it with links to weblogs of GCC hackers that you find missing.

(Originally posted on Advogato.)

2005-11-21

More Improvements to GCC

Apart from the projects already planned for GCC 4.2, we now have proposals for Link-time Optimisation, New Register Allocation Method and LLVM Integration. The integration with LLVM, should it happen, would be the most significant change to GCC since Tree-SSA. Pretty exciting times for a hacker to be involved with GCC!

(Originally posted on Advogato.)

2005-11-15

Subversion, Old Dog, New Tricks

It turns out that I do not need much extra disc space for working on trunk and gcjx-branch using SVN compared to CVS after all. This is because I used to always create a snapshot of GCC sources and use it as a working copy for fear of messing up my checked-out sources. Since SVN always keeps a copy of the pristine sources around (which is the major cause of the increased disc space usage) and it is easy and fast to use svn diff to figure out the damage and to use svn revert to restore sanity, I no longer need to continue with my weird model of development. It is also quite simple to just ignore everything from the GCC SVN repository except for the interesting stuff - for the gcjx-branch, my checkout only has the bare minimum stuff needed to bootstrap C, C++ and Java and run the libjava testsuite, while for trunk I have removed all the Ada stuff since I can't build Ada anyways. Of course, all this would probably have been possible with CVS as well, but there weren't nice instructions in the GCC Wiki for lazy souls like me for doing this with CVS.

(Originally posted on Advogato.)

2005-10-25

Subversion and GCC

GCC would be moving to Subversion around this weekend. In general, I feel this is a good move and will probably help our prolific developers a lot. I do have concerns about its alarming usage of disc space relative to CVS though. As it is, my home PC is under a bit of a strain trying to squeeze in GCC mainline and gcjx-branch copies, not to mention snapshots of these that I actually use as working copies, on the hard disc partitions that I have provided to Linux. After the move to Subversion, I will have to make some adjustments to the disc partitions to fit all this stuff in.

However, this will probably have to wait as I would be on vacation in Ooty most of next week.

(Originally posted on Advogato.)

2005-10-14

Dumping Parse Trees

GCJX now accepts an "-fdump-tree" option that prints out the abstract syntax tree of a Java source file to stdout.

(Originally posted on Advogato.)
