CRiSP + Motif (no dtrace) | Saturday, 15 August 2009 |
Am on holiday from next weekend for a couple of weeks, and I want to do something more rewarding, so am switching back to CRiSP for a while to kick some tyres.
First up is finer control of file auditing - you can tell CRiSP to keep track of the files you edit in an audit trail, which is useful for those times when you forget where you placed a file.
I've fixed some other customer reports.
I keep on staring at ribbon bars, and before I fully tackle this (there's some pre-alpha code in CRiSP to do this, but it's not ready for primetime), I am revisiting the Motif factor. CRiSP is built on Motif, and over the years it has driven me insane. In recent weeks I have fixed some uninitialised memory references in Motif which could cause core dumps, but I have always had a goal of removing it totally. Many of the widgets are native Xt widgets, and the few remaining just require a bit of debugging before Motif can go entirely - thus making the code more supportable and ready for other things (and freeing up a fair amount of memory).
CRiSP has some theming support, and getting rid of Motif will make it easier to complete that and finally allow menu items to have icons in them.
People have also asked for freetype font support (which exists in CRiSP in a semi-undocumented fashion). So, if the Motif removal goes well, freetype can be made available to most of the widgets.
Painful dwarf | Sunday, 09 August 2009 |
Why?
Imagine the SYSCALL instruction fires. This is a special instruction in the AMD/x86 CPUs which moves from user mode to system mode, *without* pushing the return address on the stack. The Linux kernel, immediately after the transition (entry_64.S), puts the user space SP into the thread task area, but the PC is hiding. On entry to the kernel side of a syscall, it is in the RCX register, but by the time we hit a probe, e.g. sys_open(), we are miles away and the pt_regs array isn't accurate. At the point of the probe, we force a breakpoint trap (luckily, only our code executes at this point, so we don't have to consider nested interrupts and blowing the state areas in the thread stack).
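To make the "PC is hiding" point concrete, here is a toy model of what SYSCALL does to the registers. This is only a sketch: `toy_cpu`, `toy_syscall` and `toy_capture_at_entry` are names invented for illustration, not kernel or dtrace code, and the model ignores details such as the flag masking and segment reloads the real instruction performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down register file: only what matters for this example. */
struct toy_cpu {
    uint64_t rip;     /* program counter */
    uint64_t rsp;     /* stack pointer */
    uint64_t rcx;
    uint64_t r11;
    uint64_t rflags;
    int      ring;    /* 3 = user mode, 0 = kernel mode */
};

/* Model the architectural effect of a 64-bit SYSCALL.
 * The instruction is two bytes long (0F 05), hence rip + 2.
 * kernel_entry stands in for the entry address held in an MSR. */
static void toy_syscall(struct toy_cpu *cpu, uint64_t kernel_entry)
{
    assert(cpu->ring == 3);
    cpu->rcx = cpu->rip + 2;   /* return address goes to RCX, not the stack */
    cpu->r11 = cpu->rflags;    /* user flags go to R11 */
    cpu->rip = kernel_entry;
    cpu->ring = 0;
    /* RSP is left untouched: nothing was pushed, and the kernel has to
     * stash the user SP and switch stacks by itself. */
}

/* The pair a user-stack unwinder needs as its starting point. */
struct toy_user_ctx {
    uint64_t pc;
    uint64_t sp;
};

/* Only valid right at kernel entry, before RCX and RSP get reused. */
static struct toy_user_ctx toy_capture_at_entry(const struct toy_cpu *cpu)
{
    struct toy_user_ctx ctx = { cpu->rcx, cpu->rsp };
    return ctx;
}
```

The point of the model is that the user PC only exists in RCX at the moment of entry; once the kernel has run a few functions deep, RCX has been reused and the value has to be recovered some other way.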
What makes this tricky is getting everything to work at once - anything even slightly wrong just gives bogus results -- stack traces which are not accurate or totally missing.
I am doing better now - I seem to get the first two stack frames, but the third one is elusive (I am either miscomputing the dwarf frame info or misapplying the result to find the next frame; for a third frame, it's frustrating, since we have gone through the same looped code twice, so why the third is problematic is not clear).
The code so far is fairly horrid, with lots of experiments in there, and no 32-bit version yet done. My biggest fear is that any of this is subtly dependent on kernel releases; I think it is not, which would be one weight off my chest.
(Kernel releases are subtly different in syscall/interrupt handling, and also in structure layout for the user/process/thread, but I don't think we care too much, yet).
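For reference, "applying the result to find the next frame" boils down to something like the following. This is a sketch under simplifying assumptions, not the actual driver code: the CFA is taken to be SP plus an offset, the return address is taken to be saved at a fixed offset from the CFA, and the user stack is a small in-memory array. All names (`toy_cfa_rule`, `toy_unwind_step`, etc.) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of running the dwarf CFI program for one PC. */
struct toy_cfa_rule {
    int64_t cfa_offset;  /* CFA = SP + cfa_offset */
    int64_t ra_offset;   /* return address lives at CFA + ra_offset */
};

struct toy_frame {
    uint64_t pc;
    uint64_t sp;
};

/* A fake user stack: 'base' is the address of words[0]. */
struct toy_stack {
    uint64_t        base;
    const uint64_t *words;
    size_t          nwords;
};

/* Bounds-checked read of one 64-bit word; returns 0 on success, -1 on failure. */
static int toy_read_u64(const struct toy_stack *st, uint64_t addr, uint64_t *out)
{
    assert(st != NULL && out != NULL);
    if (addr < st->base || (addr - st->base) % 8 != 0)
        return -1;
    size_t idx = (size_t)((addr - st->base) / 8);
    if (idx >= st->nwords)
        return -1;
    *out = st->words[idx];
    return 0;
}

/* One unwind step: turn the callee frame in *f into its caller's frame. */
static int toy_unwind_step(const struct toy_stack *st,
                           const struct toy_cfa_rule *rule,
                           struct toy_frame *f)
{
    uint64_t cfa = f->sp + (uint64_t)rule->cfa_offset;
    uint64_t ra;

    if (toy_read_u64(st, cfa + (uint64_t)rule->ra_offset, &ra) != 0)
        return -1;
    f->pc = ra;    /* caller resumes at the saved return address */
    f->sp = cfa;   /* the caller's SP at the call site is the CFA */
    return 0;
}
```

Each step needs the rule that matches the *new* PC, so a wrong CFA offset in one frame does not only break that frame: it feeds a bad SP into every later step.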
slow dwarf | Wednesday, 05 August 2009 |
Alas, the current Windows CRiSP release has black arrows on the scrollbars... to be fixed this weekend. Nuts.
I am trying to get this to parse properly:
    $ build/dwarf /lib/libpthread.so.0
    ....
    CIE length=00000014
      Version: 01
      Augmentation: "zRS"
      Code alignment factor: 1
      Data alignment factor: -8
      Return address reg: 0x10
      Augmentation Length: len=0x01 1b R encoding 1b (kernel)
    2c38 FDE len=7c cie=001c pc=e0ff..e109 tpc=ffffffffffffffff
    0000: dwarf.c: unsupported DW entry 0xf 12
I am working through the various opcodes and am able to parse them, but with no guarantee that the semantics are correct (that's the next phase).
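The code and data alignment factors in a CIE are stored as LEB128 values (unsigned and signed respectively), and opcode 0x0f in the DWARF spec is DW_CFA_def_cfa_expression, whose operand starts with a ULEB128 length. So a decoder along these lines is needed early on. This is a minimal sketch, not the code in dwarf.c; the function names are made up, and a return value of 0 is used here to mean "ran off the end of the buffer".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value from p[0..len).
 * Returns the number of bytes consumed, or 0 if the value is truncated. */
static size_t uleb128_decode(const uint8_t *p, size_t len, uint64_t *out)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i;

    assert(p != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        uint8_t byte = p[i];
        if (shift < 64)                      /* ignore bits beyond 64 */
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0) {            /* high bit clear: last byte */
            *out = result;
            return i + 1;
        }
    }
    return 0;
}

/* Decode a signed LEB128 value from p[0..len).
 * Returns the number of bytes consumed, or 0 if the value is truncated. */
static size_t sleb128_decode(const uint8_t *p, size_t len, int64_t *out)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i;

    assert(p != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        uint8_t byte = p[i];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0) {
            /* Bit 6 of the final byte is the sign bit: extend it upwards. */
            if (shift < 64 && (byte & 0x40))
                result |= ~(uint64_t)0 << shift;
            *out = (int64_t)result;
            return i + 1;
        }
    }
    return 0;
}
```

With this, the single byte 0x78 decodes as a signed value of -8, which matches the data alignment factor shown in the dump above, and 0x01 decodes as the code alignment factor of 1.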
libpthread.so.0 is where the open64 syscall is located when I do my ustack() test against the perl interpreter.
In theory the parsing shouldn't matter, as in the kernel we skip over blocks of the dwarf instructions to find the matching block, but it helps me to relax a little and better understand this stuff so I can tackle why some SYSCALL instruction blocks aren't being handled properly.
People are sending me bug reports on 2.6.30.* kernels (I fixed an issue with 2.6.30.4, but now there's a 2.6.30.5 - I cannot keep up with these releases and the gratuitous kernel code changes on each release!). So, I am just trying to stay above water, but progress is slow.
mail problems | Tuesday, 04 August 2009 |
If you see no response from me, then this could be the issue - just mail me again; if you see duplicate emails from me, it's me attempting to fix the issue.
dtrace linux status - the dwarfs | Saturday, 01 August 2009 |
A particular issue I am having at present is the sys_open syscall. gdb can show a stack trace but my kernel code cannot find the appropriate dwarf frames mirroring where we came from. So I need to put in more effort to work through the use case scenarios.