riscv/musl - musl

Commit Graph

Author	SHA1	Message	Date
Rich Felker	1b9406b03c	fix build regression on ARM for ISA levels less than v5 commit `06fbefd100` (first included in release 1.1.17) introduced this regression. patch by Adrian Bunk. it fixes the regression in all cases, but spuriously prevents use of the clz instruction on very old compiler versions that don't define __ARM_ARCH. this may be fixed in a more general way at some point in the future. it also omits thumb1 logic since building as thumb1 code is currently not supported.	9 years ago
Szabolcs Nagy	06fbefd100	add a_clz_64 helper function counts leading zero bits of a 64bit int, undefined on zero input. (has nothing to do with atomics, added to atomic.h so target specific helper functions are together.) there is a logarithmic generic implementation and another in terms of a 32bit a_clz_32 on targets where that's available.	9 years ago
Rich Felker	29237f7f5c	rework arm atomic/tp backends to be thumb-compatible and fdpic-ready three problems are addressed: - use of pc arithmetic, which was difficult if not impossible to make correct in thumb mode on all models, so that relative rather than absolute pointers to the backends could be used. this was designed back when there was no coherent model for the early stages of the dynamic linker before relocations, and is no longer necessary. - assumption that data (the relative pointers to the backends) can be accessed at a constant displacement from the code. this will not be possible on future fdpic subarchs (for cortex-m), so move responsibility for loading the backend code address to the caller. - hard-coded arm opcodes using the .word directive. instead, use the .arch directive to work around the assembler's refusal to assemble instructions not available (or in some cases, available but just considered deprecated) in the target isa level. the obscure v6t2 arch is used for v6 code so as to (1) allow generation of thumb2 output if -mthumb is active, and (2) avoid warnings/errors for mcr barriers that clang would produce if we just set arch to v7-a. in addition, the __aeabi_read_tp function is moved out of the inner workings and implemented as an asm wrapper around a C function, so that asm code does not need to read global data. the asm wrapper serves to satisfy the ABI calling convention requirements for this function.	9 years ago
Szabolcs Nagy	3b27725385	better a_sc inline asm constraint on aarch64 and arm "Q" input constraint was used for the written object, instead of "=Q" output constraint. this should not cause problems because "memory" is on the clobber list, but "=Q" better documents the intent and is more consistent with the actual asm code. this changes the generated code, because different registers are used, but other than the register names nothing should change.	10 years ago
Rich Felker	e7a1118984	fix arm a_crash for big endian contrary to commit `89e149d275`, big endian arm does need the instruction bytes in big endian order. rather than trying to use a special encoding that works as arm or thumb, simply encode the simplest/canonical undefined instructions dependent on whether __thumb__ is defined.	10 years ago
Rich Felker	89e149d275	add native a_crash primitive for arm the .byte directive encodes a guaranteed-undefined instruction, the same one Linux fills the kuser helper page with when it's disabled. the udf mnemonic and .insn directives are not supported by old binutils versions, and larger-than-byte integer directives would produce the wrong output on big-endian.	10 years ago
Rich Felker	397f0a6a7d	overhaul arm atomics for new atomics framework switch to ll/sc model so that new atomic.h can provide optimized versions of all the atomic primitives without needing an ll/sc loop written in asm for each one. all isa levels which use ldrex/strex now use the inline ll/sc model even if the type of barrier to use is not known until runtime (v6). the cas model is only used for arm v5 and earlier, and it has been optimized to make the call via inline asm with custom constraints rather than as a C function call.	10 years ago
Rich Felker	1315596b51	refactor internal atomic.h rather than having each arch provide its own atomic.h, there is a new shared atomic.h in src/internal which pulls arch-specific definitions from arch/$(ARCH)/atomic_arch.h. the latter can be extremely minimal, defining only a_cas or new ll/sc type primitives which the shared atomic.h will use to construct everything else. this commit avoids making heavy changes to the individual archs' atomic implementations. definitions which are identical or near-identical to what the new shared atomic.h would produce have been removed, but otherwise the changes made are just hooking up the arch-specific files to the new infrastructure. major changes to take advantage of the new system will come in subsequent commits.	10 years ago
Rich Felker	4a241f14a6	overhaul ARM atomics/tls for performance and compatibility previously, builds for pre-armv6 targets hard-coded use of the "kuser helper" system for atomics and thread-pointer access, resulting in binaries that fail to run (crash) on systems where this functionality has been disabled (as a security/hardening measure) in the kernel. additionally, builds for armv6 hard-coded an outdated/deprecated memory barrier instruction which may require emulation (extremely slow) on future models. this overhaul replaces the behavior for all pre-armv7 builds (both of the above cases) to perform runtime detection of the appropriate mechanisms for barrier, atomic compare-and-swap, and thread pointer access. detection is based on information provided by the kernel in auxv: presence of the HWCAP_TLS bit for AT_HWCAP and the architecture version encoded in AT_PLATFORM. direct use of the instructions is preferred when possible, since probing for the existence of the kuser helper page would be difficult and would incur runtime cost. for builds targeting armv7 or later, the runtime detection code is not compiled at all, and much more efficient versions of the non-cas atomic operations are provided by using ldrex/strex directly rather than wrapping cas.	12 years ago
Rich Felker	867b1822f3	add explicit barrier operation to internal atomic.h API	12 years ago
Rich Felker	8b3d7d0d35	fix build error on arm due to new a_spin code this was broken by commit `ea818ea834`.	12 years ago
Rich Felker	ea818ea834	add working a_spin() atomic for non-x86 targets conceptually, a_spin needs to be at least a compiler barrier, so the compiler will not optimize out loops (and the load on each iteration) while spinning. it should also be a memory barrier, or the spinning thread might keep spinning without noticing stores from other threads, thus delaying for longer than it should. ideally, an optimal a_spin implementation that avoids unnecessary cache/memory contention should be chosen for each arch, but for now, the easiest thing is to perform a useless a_cas on the calling thread's stack.	12 years ago
Rich Felker	90e51e45f5	clean up unused and inconsistent atomics in arch dirs the a_cas_l, a_swap_l, a_swap_p, and a_store_l operations were probably used a long time ago when only i386 and x86_64 were supported. as other archs were added, support for them was inconsistent, and they are obviously not in use at present. having them around potentially confuses readers working on new ports, and the type-punning hacks and inconsistent use of types in their definitions is not a style I wish to perpetuate in the source tree, so removing them seems appropriate.	12 years ago
Rich Felker	e783efa6ef	fix arm thread-pointer/atomic asm when compiling to thumb code armv7/thumb2 provides a way to do atomics in thumb mode, but for armv6 we need a call to arm mode. this commit is based on a patch by Stephen Thomas which fixed the armv7 cases but not the armv6 ones. all of this should be revisited if/when runtime selection of thread pointer access and atomics are added.	12 years ago
Rich Felker	3933fdd500	use dmb barrier instruction for atomics on arm v7 aside from potentially offering better performance, this change is needed since the old coprocessor-based approach to barriers is deprecated in arm v7, and some compilers/assemblers issue errors when using the deprecated instruction for v7 targets.	12 years ago
Rich Felker	efe07b0f89	fix arm atomic asm register constraint the "m" constraint could give a memory reference with an offset that's not compatible with ldrex/strex, so the arm-specific "Q" constraint is needed instead.	12 years ago
Rich Felker	1974bffa2d	use inline atomics and thread pointer on arm models supporting them this is perhaps not the optimal implementation; a_cas still compiles to nested loops due to the different interface contracts of the kuser helper cas function (whose contract this patch implements) and the a_cas function (whose contract mimics the x86 cmpxchg). fixing this may be possible, but it's more complicated and thus deferred until a later time. aside from improving performance and code size, this patch also provides a means of producing binaries which can run on hardened kernels where the kuser helpers have been disabled. however, at present this requires producing binaries for armv6k or later, which will not run on older cpus. a real solution to the problem of kernels that omit the kuser helpers would be runtime detection, so that universal binaries which run on all arm cpu models can also be compatible with all kernel hardening profiles. robust detection however is a much harder problem, and will be addressed at a later time.	12 years ago
Rich Felker	35a6801c6c	fix arm atomic store and generate simpler/less-bloated/faster code atomic store was lacking a barrier, which was fine for legacy arm with no real smp and kernel-emulated cas, but unsuitable for more modern systems. the kernel provides another "kuser" function, at 0xffff0fa0, which could be used for the barrier, but using that would drop support for kernels 2.6.12 through 2.6.14 unless an extra conditional were added to check for barrier availability. just using the barrier in the kernel cas is easier, and, based on my reading of the assembly code in the kernel, does not appear to be significantly slower. at the same time, other atomic operations are adapted to call the kernel cas function directly rather than using a_cas; due to small differences in their interface contracts, this makes the generated code much simpler.	13 years ago
Rich Felker	7568ee4cbf	add missing a_or_l to atomic.h for non-x86 archs this is needed for recently committed sigaction code	13 years ago
Rich Felker	a3bdcd9376	remove little-endian assumption from arm atomic.h this hidden endian dependency had left big endian arm badly broken.	14 years ago
Rich Felker	d960d4f2cb	initial commit of the arm port this port assumes eabi calling conventions, eabi linux syscall convention, and presence of the kernel helpers at 0xffff0f?0 needed for threads support. otherwise it makes very few assumptions, and the code should work even on armv4 without thumb support, as well as on systems with thumb interworking. the bits headers declare this a little endian system, but as far as i can tell the code should work equally well on big endian. some small details are probably broken; so far, testing has been limited to qemu/aboriginal linux.	15 years ago
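
Commit `06fbefd100` describes two ways to count the leading zero bits of a 64-bit integer: a logarithmic generic implementation, and one built on a 32-bit helper where that is available. The following is a minimal sketch of both ideas in portable C, not the code from the commit. The names `clz32_sketch`, `clz64_sketch`, `clz64_via_32_sketch` and `clz_self_check` are hypothetical and chosen only for illustration. As in the commit message, the result is not meaningful for a zero input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit leading-zero count.
 * Binary search over the bit width; result is unspecified for x == 0. */
int clz32_sketch(uint32_t x)
{
	int r = 0;
	if (x >> 16) x >>= 16; else r |= 16;
	if (x >> 8)  x >>= 8;  else r |= 8;
	if (x >> 4)  x >>= 4;  else r |= 4;
	if (x >> 2)  x >>= 2;  else r |= 2;
	/* x is now 1, 2 or 3; one more zero bit if the value is 1 */
	return r | !(x >> 1);
}

/* Logarithmic generic version: pick the non-zero half first,
 * then narrow it down in halving steps. Unspecified for x == 0. */
int clz64_sketch(uint64_t x)
{
	uint32_t y;
	int r;
	if (x >> 32) { y = (uint32_t)(x >> 32); r = 0; }
	else         { y = (uint32_t)x;         r = 32; }
	if (y >> 16) y >>= 16; else r |= 16;
	if (y >> 8)  y >>= 8;  else r |= 8;
	if (y >> 4)  y >>= 4;  else r |= 4;
	if (y >> 2)  y >>= 2;  else r |= 2;
	return r | !(y >> 1);
}

/* Variant expressed in terms of the 32-bit helper. */
int clz64_via_32_sketch(uint64_t x)
{
	if (x >> 32)
		return clz32_sketch((uint32_t)(x >> 32));
	return clz32_sketch((uint32_t)x) + 32;
}

/* Small usage example: both variants agree on a few inputs. */
void clz_self_check(void)
{
	assert(clz64_sketch(1) == 63);
	assert(clz64_sketch(UINT64_C(1) << 63) == 0);
	assert(clz64_via_32_sketch(UINT64_C(1) << 32) == 31);
}
```

Each step halves the remaining search range, so the generic version needs six comparisons for 64 bits. On targets with a native count-leading-zeros instruction, the 32-bit helper would presumably map to that instruction instead of the loop-free search shown here.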

8 Commits (1b9406b03c0a94ebe2076a8fc1746a8c45e78a83)
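
Commit `1315596b51` says an arch may define only a compare-and-swap primitive and let the shared header construct everything else from it. A rough sketch of that idea follows, assuming C11 `<stdatomic.h>` as a stand-in for the arch-specific primitive. The names `cas_sketch`, `fetch_add_sketch` and `swap_sketch` are hypothetical and are not the identifiers used in the source tree.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an arch-provided compare-and-swap.
 * Stores `desired` only if *p equals `expected`; always returns the
 * value that was observed in *p (the x86 cmpxchg-style contract). */
int cas_sketch(_Atomic int *p, int expected, int desired)
{
	/* On failure, C11 writes the observed value back into `expected`;
	 * on success, `expected` already equals the old value. */
	atomic_compare_exchange_strong(p, &expected, desired);
	return expected;
}

/* Fetch-and-add built only from the CAS primitive: retry until no
 * other thread changed *p between the load and the CAS. */
int fetch_add_sketch(_Atomic int *p, int v)
{
	int old;
	do {
		old = atomic_load(p);
	} while (cas_sketch(p, old, (int)((unsigned)old + (unsigned)v)) != old);
	return old;
}

/* Atomic exchange built the same way. */
int swap_sketch(_Atomic int *p, int v)
{
	int old;
	do {
		old = atomic_load(p);
	} while (cas_sketch(p, old, v) != old);
	return old;
}
```

Every derived operation here costs a retry loop around the CAS. That is the overhead the later ll/sc-based overhaul (commit `397f0a6a7d`) aims to avoid on ISA levels that support ldrex/strex.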