dtrace progress 20080801 Friday, 01 August 2008  
The blog may have been quiet for a couple of weeks, but only because I was on holiday.

I had a 32-bit Linux laptop with me, so I managed to clean up and fix a number of 32-bit issues with dtrace. It now works quite well on Ubuntu 8.

I have just put out a new release - and fixed the accidentally broken 64-bit dtrace userland binary.

I have been working on getting the D functions stack() and ustack() to work. For 32-bit kernels, ustack() seems to work - but I have a lot of work to do to get the Psymtab.c code to look up the symbol entries and print them symbolically. I hit issues with libelf.so not seeming to work properly for me, so I need to investigate that.

The kernel stack is probably wrong, and is a pain to get right because the Ubuntu kernel is built with -fomit-frame-pointer. The goal is to match what a kernel stack trace can do, along with symbols (dtrace reads /proc/kallsyms, so we should be able to see something).
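To illustrate the /proc/kallsyms idea, here is a minimal sketch in Python (not the dtrace code itself, which is C). It assumes the usual "address type name [module]" line layout and resolves an address to the nearest symbol at or below it. The function names load_kallsyms and symbolize are made up for this example.

```python
import bisect


def load_kallsyms(lines):
    """Parse kallsyms-style lines into a list of (address, name), sorted by address."""
    symbols = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        try:
            addr = int(fields[0], 16)
        except ValueError:
            continue  # first field was not a hex address
        symbols.append((addr, fields[2]))
    symbols.sort()
    return symbols


def symbolize(symbols, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr.

    Falls back to the plain hex address if addr is below every symbol.
    The address list is rebuilt on each call, which is fine for a sketch
    but wasteful for real use.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return hex(addr)
    base, name = symbols[i]
    return "%s+0x%x" % (name, addr - base)


# Typical use on a Linux box (addresses may read as zero without privileges):
#   with open("/proc/kallsyms") as f:
#       syms = load_kallsyms(f)
#   print(symbolize(syms, some_kernel_address))
```

Each return address in a kernel stack would be passed through something like symbolize() to get the function+offset form that a kernel stack trace prints.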

The Sun code walks the stack looking at signals and interrupts - and I ripped out that code to get it to work at all. I will need to spend more effort here.

I need to get the 64-bit kernel and user stack dumps working, as that will give a huge amount of functionality to probe the system. If I can get the symtabs to work, that will be a great milestone.

Unfortunately, Unix has gotten itself into a mess with the ELF file format: the simple libelf library has difficulty handling 32- or 64-bit files given that the source process may be 32- or 64-bit itself, and so interfaces such as <gelf.h> seem to exist to try to hide the word size issue.
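The word size of an ELF file is recorded in the fifth byte of its header (e_ident[EI_CLASS]), so a tool can check it before deciding how to parse the rest. A minimal sketch in Python, independent of libelf; the function name elf_word_size is hypothetical:

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4          # index of the class byte within e_ident
ELFCLASS32 = 1
ELFCLASS64 = 2


def elf_word_size(header):
    """Return 32 or 64 given the first bytes of an ELF file.

    Raises ValueError if the data is not ELF or the class byte is unknown.
    """
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = header[EI_CLASS]
    if elf_class == ELFCLASS32:
        return 32
    if elf_class == ELFCLASS64:
        return 64
    raise ValueError("unknown ELF class %d" % elf_class)


# Typical use:
#   with open("/bin/ls", "rb") as f:
#       print(elf_word_size(f.read(16)))
```

A 64-bit tool inspecting a 32-bit target (or the reverse) would use this kind of check to pick the matching header layout, which is the job the class-independent GElf interface takes on.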

Solaris is a pure ELF system, whereas much of the symtab lookup on a Linux system is embedded in gdb (along with stack tracing). This means that we end up with a bit of an emulation on top of an emulation in dtrace - but so far, no big issues (other than concern over the multitudes of libelf.so variants in the wild).

dtrace is still not ready for a prime time production system - it may work for you, it may not, but hopefully, over time, the port will become more robust, and work in most areas.

I hit some issues accessing /proc/PID/mem, where certain areas of memory cannot be read. I think this may be a bug in Linux kernels, but I need to work out whether it's me at fault or Linux.
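One way to narrow this down is to compare what /proc/PID/maps says is mapped against what /proc/PID/mem will actually return. The following is a rough sketch in Python, assuming the standard maps line layout; parse_maps, readable_regions and read_memory are names invented for this example. Reading another process usually also requires ptrace-level permission, so a failure here does not by itself prove a kernel bug.

```python
def parse_maps(lines):
    """Parse /proc/PID/maps-style lines into (start, end, perms, path) tuples."""
    regions = []
    for line in lines:
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), fields[1], path))
    return regions


def readable_regions(regions):
    """Keep only regions whose permission string starts with 'r'."""
    return [r for r in regions if r[2].startswith("r")]


def read_memory(pid, addr, length):
    """Try to read bytes from /proc/PID/mem; return None if the read fails."""
    try:
        with open("/proc/%d/mem" % pid, "rb", buffering=0) as mem:
            mem.seek(addr)
            return mem.read(length)
    except (OSError, ValueError, OverflowError):
        return None


# Typical use on Linux, probing the current process:
#   import os
#   with open("/proc/self/maps") as f:
#       for start, end, perms, path in readable_regions(parse_maps(f)):
#           ok = read_memory(os.getpid(), start, 16) is not None
#           print("%x-%x %s %s %s" % (start, end, perms, path, "ok" if ok else "FAILED"))
```

Any region that is listed as readable but still fails is a candidate for the problem described above.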

Posted at 21:18:31 by Paul Fox