libelf brokenness Sunday, 28 November 2010  
I do dislike libelf. It's an important library for manipulating executables, but when it goes wrong, you are SOL trying to determine what *it* did wrong, as opposed to what your application did wrong.

While diagnosing the backwards compatibility issues with later binutils, I found that we have a new section, .gnu.hash, which exists instead of .hash. Older dynamic linkers don't like these executables.

You can tell gcc/ld to write the old-style format (ld's --hash-style=sysv or --hash-style=both, passed through gcc as -Wl,--hash-style=both), but it is a nuisance to go through every makefile adding the switch, and potentially autodetecting whether the platform you are on, or are building for, supports it.

Much easier to simply patch the ELF executable.

But a simple piece of code like the following creates a broken ELF file, and the libelf source doesn't easily lend itself to determining why.

I am currently looking to write my own libelf library, to make it easier to do what *I* want.

	if (elf_version(EV_CURRENT) == EV_NONE) {
		// elf_version() fails when libelf itself is too old,
		// not when the file is bad, so say so.
		printf("%s: libelf is out of date - %s\n", fname, elf_errmsg(elf_errno()));
		return -1;
	}
	if ((fd = open64(fname, O_RDWR)) < 0) {
		if (debug)
			printf("Ignoring %s\n", fname);
		return 0;
	}
	if ((elf = elf_begin(fd, ELF_C_RDWR, NULL)) == NULL) {
		printf("%s: elf_begin failed - %s\n", fname, elf_errmsg(elf_errno()));
		return -1;
	}

	// Following line, which does *nothing*, causes the
	// emitted updated file to be a corrupt ELF. Why? Who
	// knows. That would require a lot of code reading in
	// elf_update() to figure out what is happening.
	elf64_getehdr(elf);

	elf_update(elf, ELF_C_WRITE);


Posted at 09:56:30 by fox | Permalink
  Woes of VirtualBox Wednesday, 24 November 2010  
More Ubuntu 10.10 grief. Why did VirtualBox stop liking my existing virtual hard drives? Strangely my Windows XP and Windows 7 vm disks work fine, but all my legacy Linux releases (see prior post!) fail to boot in strange ways.

After a lot of fiddling with cpu/paging options in VirtualBox, I found something strange. Where my VMs were configured with a SATA drive, I flipped them to IDE, and they seem to be back in action again.

Not sure what happened - maybe the newer VirtualBox release accidentally flipped them to SATA. (I don't think I would have chosen SATA for legacy kernels which have no knowledge of SATA, so maybe VirtualBox defaulted to SATA for me - thanks, err, but no thanks).

Very strange.

At least I can finally go back to ELF hacking to see how the old RedHat AS4 / CentOS 4 handle glibc 2.12 file formats.


Posted at 22:39:09 by fox | Permalink
  ELF. ELF. ELF. Don't do it! Wednesday, 24 November 2010  
This is interesting...to me at least. How to create a "universal" Linux binary. A universal binary is one which can be created on one kernel/glibc distro but run on all prior and future releases (or as many as I have or can catalog).

It's common to want the latest/greatest distro on your main machine. (I currently run Ubuntu 10.10). But if I build software (such as CRiSP) on that host, I want it to run on all prior Ubuntu/RedHat/whatever systems. Building on each one when there is close to zero difference is a waste of energy and time.

I experimented a while back and found that glibc makes judicious incompatible changes to the headers and libc, enticing applications to only run on the glibc it was built on or later.

On investigating what was triggering this, I found it was things I didn't care about. E.g. <ctype.h>, which has always been implemented as #defines and bitmap indexing into an array (isdigit(), isalpha(), etc), was replaced by libc calls (to handle Unicode or widechar types). Since I don't use these, I don't need that headache. After trying to run my binary on a failing older glibc, I worked out the headers to avoid, and wrote a tool to patch the ELF executable and rewrite its GLIBC version requirements. This works well.
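One way to sidestep the <ctype.h> dependency described above is to carry private, ASCII-only classifiers, so the binary never references the libc ctype symbols at all (as far as I know the newer macros go through __ctype_b_loc, but treat that as an assumption). This is a minimal sketch, not the code CRiSP actually uses; the my_* names are made up for illustration.

```c
/*
 * Hypothetical ASCII-only replacements for a few <ctype.h> tests.
 * They use plain comparisons, so no libc symbol (and therefore no
 * GLIBC_x.y version requirement) is pulled in by them.
 * Assumes an ASCII execution character set; not locale aware.
 */
int my_isdigit(int c)
{
	return c >= '0' && c <= '9';
}

int my_isalpha(int c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

int my_isspace(int c)
{
	/* Space, plus the run \t \n \v \f \r (9..13 in ASCII). */
	return c == ' ' || (c >= '\t' && c <= '\r');
}
```

The trade-off is obvious: anything outside 7-bit ASCII is classified as "none of the above", which is fine only if, like me, you don't need locale-aware behaviour.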

At least it did.

But Ubuntu 10.10 is running glibc 2.12.1. Here, when I create my universal binary, I find it doesn't work on an earlier glibc. Instead I get a cryptic error along the lines of "error loading shared library: glibc 2.5 or later dynamic linker is required".

What on earth happened? Where is this "information" telling the /lib/ld-linux.so.2 that my binary needs such a thing?

Here's a good link from Ali Bahrami: http://blogs.sun.com/ali/entry/gnu_hash_elf_sections.

The binutils people have decided to break ELF backwards compatibility (or is that forward compatibility?). Instead of a ".hash" ELF section, which is what the runtime dynamic linker (/lib/ld-linux.so.2) on older systems requires, they create a new style of hash table, as described in that link above.

Now consider an older dynamic linker. It is expecting a ".hash" section. But executables built on glibc 2.12.1 (could be earlier releases too - it may have existed on Ubuntu 10.04, but I haven't verified when this occurred yet) don't have a ".hash" section.

Instead we have a ".gnu.hash" section.

$ objdump -h ~/bin/fcterm | grep hash
  3 .gnu.hash     00000268  0804818c  0804818c  0000018c  2**2
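The same check can be done without objdump (or libelf) by walking the section header table directly. What follows is a rough sketch using only the <elf.h> structures: find_section32 is a name invented here, it handles 32-bit images only (matching the fcterm binary above), and it assumes the file has the same byte order as the host.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the section called `name` in a 32-bit ELF image
 * held in memory, or -1 if it is absent or the image looks malformed.
 * Sketch only: host byte order is assumed, ELF64 is not handled.
 */
int find_section32(const unsigned char *img, size_t len, const char *name)
{
	Elf32_Ehdr eh;
	Elf32_Shdr sh, strsh;
	size_t i;

	if (len < sizeof eh)
		return -1;
	memcpy(&eh, img, sizeof eh);
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS32)
		return -1;
	if (eh.e_shentsize != sizeof sh || eh.e_shstrndx >= eh.e_shnum)
		return -1;
	/* The whole section header table must lie inside the image. */
	if (eh.e_shoff > len || (len - eh.e_shoff) / sizeof sh < eh.e_shnum)
		return -1;

	/* Section names live in the string table named by e_shstrndx. */
	memcpy(&strsh, img + eh.e_shoff + eh.e_shstrndx * sizeof sh, sizeof strsh);
	if (strsh.sh_offset > len || strsh.sh_size > len - strsh.sh_offset)
		return -1;

	for (i = 0; i < eh.e_shnum; i++) {
		const char *s;
		size_t max;

		memcpy(&sh, img + eh.e_shoff + i * sizeof sh, sizeof sh);
		if (sh.sh_name >= strsh.sh_size)
			continue;
		s = (const char *) img + strsh.sh_offset + sh.sh_name;
		max = strsh.sh_size - sh.sh_name;
		/* Only compare names that are NUL-terminated within the table. */
		if (memchr(s, '\0', max) != NULL && strcmp(s, name) == 0)
			return (int) i;
	}
	return -1;
}
```

With the file read or mmap()ed into memory, a binary that returns -1 for ".hash" but an index for ".gnu.hash" is one of the problem executables.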

So, I am asserting that, because of this, the older runtime linker gets confused and refuses to run the executable. You are really SOL here, as it's not as if you can use LD_LIBRARY_PATH or LD_PRELOAD to bypass /lib/ld-linux.so.2. You could do this with a chroot/jail, but that gets fiddly and may require root access - or any number of things to send your customers scurrying for the Windows installation disks.

So - for my next trick, let's try patching in an ELF conformant ".hash" section and see if we can get an older glibc to "like" my executables.

I'll post an update whether or not I am successful.
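Patching in a ".hash" section means recomputing the classic System V hash over every dynamic symbol name and laying out the bucket and chain arrays. Here is a minimal sketch of just that step, not a finished patcher: the function names (sysv_hash, gnu_hash, build_sysv_hash, lookup_sysv_hash) are made up for illustration, and getting the table into the file and pointing DT_HASH at it is not shown.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Classic System V ELF hash, as used by the ".hash" section. */
uint32_t sysv_hash(const char *name)
{
	const unsigned char *p = (const unsigned char *) name;
	uint32_t h = 0, g;

	while (*p) {
		h = (h << 4) + *p++;
		g = h & 0xf0000000u;
		if (g)
			h ^= g >> 24;
		h &= ~g;
	}
	return h;
}

/* Hash used by ".gnu.hash" (h = h * 33 + c), shown for comparison. */
uint32_t gnu_hash(const char *name)
{
	const unsigned char *p = (const unsigned char *) name;
	uint32_t h = 5381;

	while (*p)
		h = h * 33 + *p++;
	return h;
}

/*
 * Build the contents of a ".hash" section for nsyms symbol names:
 *   words[0] = nbucket, words[1] = nchain (== nsyms),
 *   then bucket[nbucket], then chain[nchain].
 * names[0] is the reserved undefined symbol and is never hashed.
 * nbucket must be non-zero. Caller frees the result.
 */
uint32_t *build_sysv_hash(const char **names, uint32_t nsyms, uint32_t nbucket)
{
	uint32_t *words = calloc((size_t) 2 + nbucket + nsyms, sizeof *words);
	uint32_t *bucket, *chain, i;

	if (words == NULL)
		return NULL;
	words[0] = nbucket;
	words[1] = nsyms;
	bucket = words + 2;
	chain = bucket + nbucket;
	for (i = 1; i < nsyms; i++) {
		uint32_t b = sysv_hash(names[i]) % nbucket;

		chain[i] = bucket[b];	/* old head becomes our successor */
		bucket[b] = i;		/* we become the head of the chain */
	}
	return words;
}

/* Look a name up the way a dynamic linker walks ".hash"; 0 means not found. */
uint32_t lookup_sysv_hash(const uint32_t *words, const char **names, const char *target)
{
	uint32_t nbucket = words[0];
	const uint32_t *bucket = words + 2;
	const uint32_t *chain = bucket + nbucket;
	uint32_t i;

	for (i = bucket[sysv_hash(target) % nbucket]; i != 0; i = chain[i])
		if (strcmp(names[i], target) == 0)
			return i;
	return 0;
}
```

In a real patcher the names would come from .dynsym/.dynstr, in symbol table order, since the chain array is indexed by symbol number. The lookup function is only there to sanity check the table before writing it out.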


Posted at 22:27:21 by fox | Permalink