the set_tid_address returns the tid (which is also the pid when called
from the initial thread) so there is no need to make a separate
syscall to get pid/tid.
problem 1: mutex type from the attribute was being ignored by
pthread_mutex_init, so recursive/errorchecking mutexes were never
being used at all.
problem 2: ownership of recursive mutexes was not being enforced at
unlock time.
the existence of a (kernelspace) thread must never have observable
effects after the thread count is decremented. if signals are not
blocked, it could end up handling the signal for rsyscall and
contributing towards the count of threads which have changed ids,
causing a thread to be missed. this could lead to one thread retaining
unwanted privilege level.
this change may also address other subtle race conditions in
application code that uses signals.
note that this presently does not handle consistency of the libc's own
global state during forking. as per POSIX 2008, if the parent process
was threaded, the child process may only call async-signal-safe
functions until one of the exec-family functions is called, so the
current behavior is believed to be conformant even if non-ideal. it
may be improved at some later time.
this allows sys/types.h to provide the pthread types, as required by
POSIX. this design also facilitates forcing ABI-compatible sizes in
the arch-specific alltypes.h, while eliminating the need for
developers changing the internals of the pthread types to poke around
with arch-specific headers they may not be able to test.