the two/three/four byte memmem specializations are not prepared to
handle haystacks shorter than the needle; they unconditionally read at
least up to the needle length and subtract from the haystack length.
if the haystack is shorter, the remaining haystack length underflows
and produces an unbounded search which will eventually either crash or
find a spurious match.
the top-level memmem function attempted to avoid this case already by
checking for haystack shorter than needle, but it failed to re-check
after using memchr to remove the maximal prefix not containing the
first byte of the needle.
the logic for this loop was copied from null-terminated-string logic
in strstr without properly adapting it to work with explicit lengths.
presumably this error could result in false negatives (wrongly
comparing past the end of the needle/haystack), false positives
(stopping comparison early when the needle contains null bytes), and
crashes (from runaway reads past the end of mapped memory).
in cases where the memorized match range from the right factor
exceeded the length of the left factor, it was wrongly treated as a
mismatch rather than a match.
issue reported by Yves Bastide.
to optimize the search, memchr is used to find the first occurrence of
the first character of the needle in the haystack before switching to
a search for the full needle. however, the number of characters
skipped by this first step were not subtracted from the haystack
length, causing memmem to search past the end of the haystack.
sadly the C language does not specify any such implicit conversion, so
this is not a matter of just fixing warnings (as gcc treats it) but
actual errors. i would like to revisit a number of these changes and
possibly revise the types used to reduce the number of casts required.