Xen, "hwcap 0 nosegneg", and -mno-tls-direct-seg-refs
A colleague showed me a compiler flag that has to be used when building a program for Linux x86 Xen hosts: -mno-tls-direct-seg-refs. Being the curious person that I am, I googled around to figure out exactly what this flag does and why it is needed.
From the gcc man page:
Controls whether TLS variables may be accessed with offsets from
the TLS segment register (%gs for 32-bit, %fs for 64-bit), or
whether the thread base pointer must be added. Whether or not this
is legal depends on the operating system, and whether it maps the
segment to cover the entire TLS area.
For systems that use GNU libc, the default is on.
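To make that description a bit more concrete, here is a minimal sketch of the kind of code the flag affects: a single thread-local variable and a function that touches it. The file name tls_demo.c and the names counter and bump_counter are mine, purely for illustration. They don't come from the man page or from any real project.

```c
/* tls_demo.c: one thread-local counter and one function that touches it. */

/* __thread is the GCC spelling of thread-local storage: every thread
 * gets its own copy of counter, starting at zero. */
static __thread int counter;

/* Increments the calling thread's copy of counter and returns the new value.
 * The load and store of counter are the TLS accesses whose addressing
 * -mtls-direct-seg-refs / -mno-tls-direct-seg-refs is meant to control. */
int bump_counter(void)
{
    return ++counter;
}
```

Assuming a GCC that can target 32-bit x86, compiling this with "gcc -m32 -O2 -S tls_demo.c" and again with -mno-tls-direct-seg-refs added, then comparing the two .s files, should show the difference the man page describes: in the default case I'd expect counter to be addressed directly relative to %gs, and with the flag I'd expect the thread pointer to be loaded first and the offset added to it. The exact instructions depend on the compiler version, so treat that as an expectation rather than a guarantee.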
But why Xen? Well, according to this thread, 32-bit x86 systems (*not* 64-bit x86_64!) use a weird trick for TLS: they reach thread-local data through negative offsets from the segment register. A Xen guest can't do this natively, so Xen is forced to emulate these negative segment references, which is slow. Normally, disabling tls-direct-seg-refs would incur a performance hit, but since Xen has to go to extra trouble to emulate the direct references, it's actually faster to turn them off altogether.
This is related to the "hwcap 0 nosegneg" ldconfig directive. A very informative post on LKML lays out the details of exactly what it does. To summarize, it tells the dynamic linker to look for a "nosegneg" set of libraries. Here "nosegneg" is just a hardware capability keyword, like "sse2" or "mmx", of the kind used to select optimized library variants. This way a Xen guest (domU) automatically gets the speed benefit of libraries compiled with -mno-tls-direct-seg-refs.
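As a rough sketch of what this might look like on disk, assuming a 32-bit glibc-based guest: the file name below is hypothetical, and only the "hwcap 0 nosegneg" line itself comes from the discussion above.

```
# /etc/ld.so.conf.d/xen-nosegneg.conf  (hypothetical file name)
hwcap 0 nosegneg
```

After rerunning ldconfig, the idea is that the dynamic linker will prefer a copy of a library found in a "nosegneg" subdirectory of a library path over the regular copy when that capability is in effect. Where distributions actually put those libraries, and what the fragment is called, varies, so check what your Xen and glibc packages ship rather than relying on these names.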