Why?! Sunday, 31 March 2013  
So you have a software product, X, and decide to upgrade to the latest version, to fix some annoying bug or gain a feature. When you use the new version, you find some of the old menus are gone, and new incomprehensible features have been added. And you wish you hadn't.

Let's take a real-world example. I upgraded from a Galaxy Note 1 to a Galaxy Note 2. Twice the CPU power, twice the RAM. Nice.

But...the screen is narrower. Why? I can live with that.

Android 4.1.2 Jelly Bean lacks UMS. You don't know what UMS is? UMS (USB Mass Storage) is the ability to access your device as a block device, mount it on Windows or Linux, and copy files around.

You know, mounting the device like this is a good thing for Android. So much better than the locked-in Apple (iTunes) or Microsoft (ActiveSync) solutions. Why? Because you have more control. You might like a GUI; I want to use a shell script and manage things without a mouse or some pretty-looking but unfunctional GUI.

*Why did they do that?!* Android is supposed to be about control, but what it demonstrates is that Google simply doesn't care about users or about user interface design. I try to avoid Chrome as much as possible because the backspace key doesn't work, and there are no settings to avoid unwanted ads or oversized images, or anything useful really.

The settings in Android are an ugly mess. Everything keeps changing, and there is no logic whatsoever to the layout of the settings. Nor is there any useful help.

So, UMS was removed. UMS is problematic because the device has to unmount its filesystems and wait for a disconnect before remounting (else filesystem corruption could occur). Instead, we have MTP and PTP - go search the web; some people are happy with them, but a lot of people are not. They are badly implemented protocols which bypass the kernel buffer cache, so they are slow, by design. And limited.

Or, try getting MTP to work with Ubuntu - that means adding beta-quality software with a FUSE driver which doesn't work properly.

Or, I can add an ssh server to my Android and use wifi (after rooting my device, of course). Why would I want to use wifi? In my house it's pathetically slow. A cable is fast and predictable. I can't copy 10+GB of data to my device over wifi - that's around 8h of transfer time. Maybe less if I can be bothered to walk nearer to the router.

I do wonder what the state of software will be like in 50 years' time or more. Today, we have some great products out there, and half the web is full of questions about where feature X has gone, and lots of half-hearted or bad implementations of solutions (because every platform and version is so fragmented).

I was thinking about this the other day with respect to DTrace. DTrace is pretty much doomed. There are implementations for Linux (mine and Oracle's), MacOS, Solaris and FreeBSD. That's five versions of DTrace, none of them sharing source code, each working independently of the others. So just keeping up with kernel versions and bugs is 5x the work it needs to be: five independent code bases (ignoring the people who have created git clones and then done nothing with them, which adds to the wasted effort of supporting something that is going nowhere).

Android is the same - the xda-developers forum is a great place to learn, but boy, is it hard work. So many threads and tangents solving similar problems, and there are no "definitive" solutions. You have to read a lot to see which tools or authors are gaining consensus and have real solutions. I spent half a day rooting my Galaxy Note 2, because all the instructions, copied from each other, were *wrong* and supplied a bogus file. Today I spent most of the day figuring out how to enable UMS on the droid, and none of the methods work. They half work.

And then there is my Ubuntu system, which has mysteriously decided audio doesn't work anymore, so I have to chmod 666 /dev/snd* and manually edit /etc/resolv.conf after a reboot. Why? Because some software thinks it is clever and in control, and I cannot figure out what all the bits are that sit in /etc/* attempting to do things that I don't want. My emulated RaspberryPi (qemu) boots faster on my laptop than the laptop does itself.

I am not far off totally dropping Ubuntu (or Mint, or Arch, or any of the distros), because their attempts to do things for me cause me more work and pain for zero benefit.

Posted at 17:15:03 by fox | Permalink
  DTrace/ARM Tuesday, 19 March 2013  
Progress on the ARM port of DTrace continues. At the moment, the driver loads, and reloads, into the kernel. (I was crashing on a reload, which is annoying - having to keep rebooting as I fine-tune the code.) It's still a skeletal ARM port, but I can now do:

$ dtrace -n tick-1s

and the expected thing happens. That's a huge relief - the core of DTrace just works.

Next came some small fixes for /proc/dtrace/fbt - to see the state of the available FBT probes (and specifically the instructions and patch values). A key difference between ARM and x86 is that ARM is a 32-bit CPU: all instructions are 32 bits wide, not variable-length as on x86 (which allows instructions of 1 byte or more). Because instructions are 4 bytes wide, some changes to the key data structure (intr_t) are needed to handle the 4-byte-wide value and apply that to, e.g., the FBT driver.

ARM presents many challenges: the first is that there are so many variants. I am targeting the ARMv6 architecture available on the RaspberryPi, using a hand-built kernel running in a qemu VM.

At present I don't have a /proc/kcore, which is a nuisance - it's useful for examining the inside of the kernel, e.g. for disassembly, or for proving that a probe is what I expect it to be. (I can work around that easily.)

Additionally, this is a single-CPU kernel - no SMP. (I believe I found a bug in the Solaris DTrace code where the maximum number of CPUs is concerned; Solaris supports 32 i386 CPUs or 256 amd64 CPUs. If the maximum number of CPUs is configured, there's some out-by-one maths going on in the buffer snap code, which causes dtrace to abort early in its work. On ARM I hit this because NCPU==1, and the code tries to look at cpu#0 and cpu#1.)

Now I am ready to start handling FBT. I can get DTrace to plant probes on entry, but the return probes are broken, because it's using the x86 instruction disassembler and not an ARM-specific one (easy to fix, when I am ready).

But before I allow an FBT probe, I need to intercept the ARM single-step and breakpoint interrupts. I found a page on the net that was a useful hint/starter to get me to the right place to locate what I am after...


The first step is to put a "no-op" interrupt handler in place, to prove I am doing the right thing. I am not an ARM assembler expert, and am using gcc -S and ARM instruction references on the net to get closer to being one.

Once this is done, the syscall mess (in my code) can be looked at - the systrace.c code is full of hacks and quirks for i386/amd64, and little or none of it is relevant for ARM.

After that, we would be mostly done. (USDT will require some work, but USDT is more leisurely than core driver work.)

The dtrace available at my site or on github doesn't have all the ARM updates, so hold off for a while before expecting this to work. I will update the blog when I feel it's more ready for primetime.

Posted at 21:23:15 by fox | Permalink
  Divide by 10 Sunday, 03 March 2013  
I've gotten my RaspberryPi VM to run from a kernel I have built, which means I can now load dtrace into the ARM kernel and continue debugging.

One silly issue I have hit is modulo/divide arithmetic for 64-bit numbers. ARM, being a classic RISC chip, doesn't have a divide instruction, but the kernel nicely hides this for you. In most cases, divide-by-a-constant is handled by the compiler via various optimisations.

At the moment, there are a few pieces of code where the divide/modulo is not by a constant the compiler can see, which results in the compiler generating calls to maths helper functions. On the i386/x64 architectures, this is handled by dtrace mapping these to the appropriate mechanisms to call the do_div() function.

On ARM, it's different - and confusing enough (given the differing ARM architectures) that it's not working yet. So when dtrace is tracing to its own internal log buffer (which I need in order to debug the GPFs I am triggering), it's a nuisance, because decimal numbers come out wrong.

It's interesting poking around the kernel code at the do_div64() macros and the printf-type code, as the kernel makes good attempts to leverage native CPU features, or to emulate them in the case of the ARM/RISC instructions, to get reasonable performance.

There are lots of ways to do a divide without actually doing a divide (many algorithms exist in the kernel). What is a nuisance is debugging the mechanism I am using to call these, as it mostly crashes the kernel, and it's difficult to do in user space (in user space it can work, because the compiler "just works", which doesn't help prove that what I am doing is correct in the kernel).

Still a way to go, but at least I can load the driver.

Posted at 12:13:34 by fox | Permalink