dtrace update ... Monday, 23 February 2015  
Still delaying the dtrace release. Having gotten the 3.16 kernels to work, I started working backwards through random 3.x kernels to validate that dtrace still worked on them. I fixed a number of issues, and then headed into RedHat 5.6 / CentOS 5.6 land (2.6.18+ kernel).

I spent some time trying to get execve() syscall tracing to work - and am still working on that.

Along my journey, I noticed a few things. Firstly, dtrace4linux is too complicated - trying to support 32-bit and 64-bit kernels, along the entire path back to 2.6.18 or earlier, is painful. I cannot easily automate regression testing (not without a lot more hard-disk space, and it is not worthwhile whilst I am aware of obvious bugs to fix). I could simplify testing by picking a single release and just rebooting it with different kernels, rather than keeping full ISO installs of RedHat/CentOS/Ubuntu/Arch and so on.

I also noticed that the mechanism dtrace4linux uses to find addresses in the kernel is slightly overkill. It hooks into the kernel to find symbols which cannot be resolved at link time, relying on a Perl script to locate the things it needs - a mechanism I find pretty interesting. I came across a case where one of the items I need is not visible at all in user space - it exists solely in the kernel, as part of the syscall interrupt code (the per-cpu area). Despite what the latest kernels do, some older kernels *don't* expose it, and catering for them is important. In one case I had to go searching the interrupt code to find this value, and ended up writing a C program to run in user space prior to the build.

Really, it would have been better to generalise this, so that everything we need is defined in a table compiled into the code, rather than having the /dev/fbt code read it from the input stream. That would ensure that a build which compiles also works. Today, I sometimes end up debugging issues on old kernels because a required symbol is missing and we dereference a null pointer (not a nice thing to do in the kernel).
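To make that concrete, here is the sort of thing I mean - a minimal sketch only, with illustrative names rather than the real dtrace4linux structures. The idea is that every kernel symbol the driver needs lives in one table, is filled in at build time, and is validated when the module loads, so a missing symbol becomes a refused load instead of a null dereference later:

#include <linux/kernel.h>
#include <linux/module.h>

typedef struct ksym_t {
        const char *name;      /* symbol the driver needs from the kernel */
        void       **addr;     /* where the resolved address is stored    */
        int        required;   /* refuse to load if this one is missing   */
} ksym_t;

/* Illustrative entries only - not the actual dtrace4linux symbol list. */
static void *x_lookup_name;
static void *x_per_cpu_base;

static ksym_t ksyms[] = {
        { "kallsyms_lookup_name", &x_lookup_name,  1 },
        { "__per_cpu_offset",     &x_per_cpu_base, 0 },
        { NULL, NULL, 0 }
};

/* Called from module init: if anything required is still NULL after the
   build-time resolution step, fail the load with a clear message. */
static int
ksyms_validate(void)
{
        ksym_t *kp;

        for (kp = ksyms; kp->name; kp++) {
                if (kp->required && *kp->addr == NULL) {
                        printk(KERN_ERR "dtrace: cannot resolve %s\n", kp->name);
                        return -EINVAL;
                }
        }
        return 0;
}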

One problem I had with the above was that gdb on the older distro releases cannot be used to read kernel memory, due to a kernel bug which precludes reading from /proc/kcore. Fortunately, I include a script in the release which emits a vmlinux.o, complete with symbol table, from the distribution vmlinuz file.

I haven't reverified the ARM port of dtrace, but that's something for a different rainy or snowy day.


Posted at 21:48:32 by fox | Permalink
  new dtrace .. small update Friday, 20 February 2015  
The next release of dtrace is hopefully coming this weekend. Having resolved the issues I had previously, I have been doing more testing - so far only really on the 3.16 kernel - and found that some of the syscalls were behaving badly due to their reimplementation in the kernel. Hopefully, once I have fixed the last two or three, I can finish my merges and push out the latest release. I will do a cursory check on some of the older kernels - it is likely I have made a mistake somewhere and broken them, but that will be easier to fix now that I have made some internal changes.

Note that no new functionality is in here - the issues with libdwarf remain (I may try again to solve that), and "dtrace -p" is still a long way off from being functional.

Given that 3.20 is now the current kernel, I may need to see if that works, and pray that 3.17-3.20 didn't affect how dtrace works - or, if they did, that the work to make it compile is much less than what 3.16 raised.


Posted at 18:07:51 by fox | Permalink
  Why is gcc/gdb so bad? Thursday, 19 February 2015  
When gcc 0.x came out, it was so refreshing - a free C compiler. GCC evolved over the years, got slower and used more memory. I used to use gcc on a 4MB RAM system (no typo), and wished I had 5MB of RAM. Today, memory is cheap, and a few GB to compile code is acceptable. (The worst I have seen is 30+GB to compile a piece of C++ code - not mine!)

One of the powerful features of gcc was that "gcc -g" and "gcc -O" were not mutually exclusive. And gdb came about as a free debugger, complementing gcc.

Over recent years, gdb has become closer to useless. It is a powerful, complex and featureful debugger, but I am fed up with single-stepping my code and watching the line of execution bounce back and forth, because the compiler emits debug info in which we hop back and forth over lines of code and declarations.

Today, while debugging fcterm, placing a breakpoint on a line of code puts the breakpoint *miles* away from the place I am trying to intercept. This renders "gcc -g" close to useless, unless I turn off all optimisations and pray the compiler isn't inlining code.
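The workarounds I know of are blunt: either drop to gcc's -Og (available from gcc 4.8 onwards, and meant to keep the code debuggable), or fence off just the routine of interest. A rough sketch using gcc-specific attributes - nothing to do with fcterm itself, just the shape of the trick:

#include <stdio.h>

/* Keep this one routine un-inlined and un-optimised so that a
 * breakpoint lands on the line the source says; the rest of the
 * program can still be built with -O2 -g. */
static void __attribute__((noinline, optimize("O0")))
trace_point(int state)
{
        printf("state = %d\n", state);  /* "b trace_point" now behaves */
}

int
main(void)
{
        int i;

        for (i = 0; i < 3; i++)
                trace_point(i);
        return 0;
}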

Shame on gcc. Maybe I should switch to clang/llvm.


Posted at 23:05:06 by fox | Permalink
  address: 0000f00000000000 Saturday, 14 February 2015  

Strange. I continue to find reasons why dtrace is not passing my tests. I have narrowed it down to a strange exception. If the user script accesses an invalid address, we either get a page fault or a GPF. DTrace handles this and stubs out the offending memory access. Here's a script:

build/dtrace -n '
        BEGIN {
               cnt = 0;
               tstart = timestamp;
        }
        syscall::: {
               this->pid = pid;
               this->ppid = ppid;
               this->execname = execname;
               this->arg0 = stringof(arg0);
               this->arg1 = stringof(arg1);
               this->arg2 = stringof(arg2);
               cnt++;
        }
        tick-1s { printf("count so far: %d", cnt); }
        tick-500s { exit(0); }
'

This script examines all syscalls and tries to access the string for arg0/1/2 - and for most syscalls, there isn't one, so we end up dereferencing a bad pointer. But only some pointers cause me pain; most are handled properly. The address in the title is one such address. I *think* what we are seeing is the difference between a page fault and a GPF. Despite a lot of hacking on the code, I cannot easily debug this, since once the exception happens the kernel doesn't recover. I have modified the script above to only do syscall::chdir:, which means I can manually test from a shell by doing a "cd" command. On my 3-CPU VM, I lose one of the CPUs and the machine behaves erratically. Now I need to figure out whether we are getting a GPF or some other exception.

I tried memory addresses 0x00..00f, 0x00..0f0, 0x00..f00, ... in order to find this. I suspect there is no page table mapping here, or it's special in some other way. I may need to dig into the kernel GDT or page tables to see what is causing this.
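One thing that may be what makes this address special: on x86-64, only "canonical" addresses - where bits 48-63 are copies of bit 47 - are legal at all. 0000f00000000000 has bit 47 set but the top 16 bits clear, so it is non-canonical, and the CPU raises a general protection fault for those rather than a page fault, regardless of what the page tables say. A tiny user-space check of the arithmetic (my own illustration, not part of the driver):

#include <stdio.h>
#include <stdint.h>

/* An x86-64 address (48-bit implementations) is canonical when bits
 * 47..63 are all zero or all one. Non-canonical accesses raise #GP;
 * unmapped but canonical addresses raise #PF. */
static int
is_canonical(uint64_t addr)
{
        uint64_t top = addr >> 47;      /* bits 47..63 */

        return top == 0 || top == 0x1ffff;
}

int
main(void)
{
        printf("%d\n", is_canonical(0x0000f00000000000ULL)); /* 0 -> #GP */
        printf("%d\n", is_canonical(0x000000000000000fULL)); /* 1 -> #PF */
        return 0;
}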

UPDATE: 20150215

After a bunch of digging, I found that the GPF interrupt handler had been commented out. There was a bit more to it than that, because even when I re-enabled it, I was getting other spurious issues. All in all, various bits of hack code and debugging had got in the way of a clear message.

I have been updating the sources to merge back in the fixes for the 3.16 kernel, but have a regression in syscall tracing which can cause spurious panics. I need to fix that before I do the next release.


Posted at 10:28:01 by fox | Permalink
  no dtrace updates Monday, 09 February 2015  
People have been questioning why there are no dtrace updates. I hope to be in a position to respond properly shortly. Just before Christmas, I started work on Debian Jessie (3.16 kernel) and hit a number of issues. Although I made good progress fixing x32 syscalls on an x64 system, and systematically fixing other issues, I had to hack the driver tremendously. These hacks are experiments to figure out why I could so easily crash the kernel. The usual pattern of kernel panics did not hold - normally a stray issue causes a kernel message, and I can debug around it to isolate the cause.

The issues I hit were all very low level - the cross-CPU calls, the worker interrupt thread, and the current issue, relating to invalid pointers accessed via a D script. I have a "hard" test which won't pass without crashing the kernel - crashing it really hard, requiring a VM reboot. This is nearly impossible to debug. The first thing I had to do was increase the console-mode terminal size - when the panic occurs, the system is totally unresponsive and all I have is the console output to look at, with no ability to scroll. Having a bigger console helps - but it seems the GPF or page-fault interrupt, when occurring inside the kernel, does not work the same way as it has on all prior Linux kernels. Looking closely at the interrupt routines shows some changes in the way this works - enough to potentially allow a panicking interrupt to take out the whole kernel; this makes life tough to debug.

If I am lucky, the area of concern is related to the interrupt from kernel space. If I am unlucky, it is not this, but something else. (I am hypothesising that the kernel stacks may be too small.)

I have been holding off on putting out any updates, despite some pull requests from people, because I am not happy that the driver is in a consistent enough state to release. When I have finished this area of debugging, I can cross-check the other/older kernels and see if I have broken anything.

It is very painful dealing with hard-crashing kernels - almost nothing helps in terms of debugging, so I am having to try various tricks to isolate the instability. These instabilities, in theory, exist on other Linux releases - but I will only know when I have gotten to the bottom of the issue.


Posted at 23:02:06 by fox | Permalink