[PATCH] lib: fix buggy strcmp and strncmp

Rasmus Villemoes rasmus.villemoes at prevas.dk
Wed Oct 5 11:09:25 CEST 2022


There are two problems with both strcmp and strncmp:

(1) The C standard is clear that the contents should be compared as
"unsigned char":

  The sign of a nonzero value returned by the comparison functions
  memcmp, strcmp, and strncmp is determined by the sign of the
  difference between the values of the first pair of characters (both
  interpreted as unsigned char) that differ in the objects being
  compared.

(2) The difference between two char (or unsigned char) values can
range from -255 to +255. Due to integer promotion, that is the range
of values the *cs - *ct expressions can produce, but when such a value
is then shoe-horned into the 8-bit signed char __res, its sign may of
course change.
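
To make the two problems concrete, here is a minimal standalone sketch.
It assumes a compiler that wraps out-of-range values modulo 256 when
converting int to signed char (that conversion is implementation-defined;
gcc and clang behave this way). The names old_strcmp, new_strcmp and
check_examples are made up for illustration, so that nothing clashes with
libc; the loop bodies mirror the code before and after this patch.

```c
#include <assert.h>

/* Pre-patch logic: the int difference is squeezed into a signed char. */
static int old_strcmp(const char *cs, const char *ct)
{
	signed char res;

	while (1) {
		if ((res = *cs - *ct++) != 0 || !*cs++)
			break;
	}

	return res;
}

/* Post-patch logic: compare as unsigned char, keep the difference in an int. */
static int new_strcmp(const char *cs, const char *ct)
{
	int ret;

	while (1) {
		unsigned char a = *cs++;
		unsigned char b = *ct++;

		ret = a - b;
		if (ret || !b)
			break;
	}

	return ret;
}

/* Hypothetical helper: the standard requires a negative result in both
 * mismatch cases below, because 0x01 < 0x82 and 'a' < 0xff when the
 * bytes are read as unsigned char. */
void check_examples(void)
{
	assert(old_strcmp("\x01", "\x82") > 0);	/* wrong sign */
	assert(new_strcmp("\x01", "\x82") < 0);	/* correct */
	assert(old_strcmp("a", "\xff") > 0);	/* wrong sign */
	assert(new_strcmp("a", "\xff") < 0);	/* correct */
	assert(old_strcmp("abc", "abc") == 0);	/* equality was never broken */
}
```

With these inputs the old code gives the wrong sign whether plain char
is signed or unsigned on the target, as long as the conversion
assumption above holds.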

The impact is somewhat limited by the way these functions
are used in practice:

- Most of the time, one is only interested in equality (or for
  strncmp, "starts with"), and the existing functions do correctly
  return 0 if and only if the strings are equal [for strncmp, up to
  the given bound].

- Also most of the time, the strings being compared only consist of
  ASCII characters, i.e. have values in the range [0, 127], and in
  that case it doesn't matter if they are interpreted as signed or
  unsigned char, and the possible difference range is bounded to
  [-127, 127], which does fit in a signed char.

For size, one could implement strcmp() in terms of strncmp() - just
make it "return strncmp(a, b, (size_t)-1);". However, performance of
strcmp() does matter somewhat, since it is used all over when parsing
and matching DT nodes and properties, so let's find some other place
to save those ~30 bytes.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes at prevas.dk>
---

Please double- and triple-check before applying. I've tested these
against my libc's strcmp() and strncmp() for thousands of random
strings, but I may very well have messed up when copy-pasting back to
lib/string.c.
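
For anyone who wants to repeat that kind of check, a rough sketch of a
harness follows. It is not the exact test that was run: the generator,
the alphabet, the buffer sizes and all names (patched_strcmp,
patched_strncmp, count_mismatches, ...) are assumptions chosen for
illustration. Only the signs of the results are compared, since the
standard does not fix the magnitude.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies of the patched routines, renamed so they do not clash with libc. */
static int patched_strcmp(const char *cs, const char *ct)
{
	int ret;

	while (1) {
		unsigned char a = *cs++;
		unsigned char b = *ct++;

		ret = a - b;
		if (ret || !b)
			break;
	}

	return ret;
}

static int patched_strncmp(const char *cs, const char *ct, size_t count)
{
	int ret = 0;

	while (count--) {
		unsigned char a = *cs++;
		unsigned char b = *ct++;

		ret = a - b;
		if (ret || !b)
			break;
	}

	return ret;
}

/* Reduce a comparison result to -1, 0 or 1. */
static int sign(int x)
{
	return (x > 0) - (x < 0);
}

/* Tiny LCG so that every run sees the same sequence of strings. */
static unsigned int next_rand(unsigned int *state)
{
	*state = *state * 1103515245u + 12345u;
	return (*state >> 16) & 0x7fff;
}

/* Write a NUL-terminated string shorter than size, drawn from a small
 * alphabet that includes bytes >= 0x80, so that shared prefixes and
 * "high" characters both occur often. */
static void random_string(char *buf, size_t size, unsigned int *state)
{
	static const unsigned char alphabet[] = { 0x01, 'a', 0x7f, 0x80, 0xff };
	size_t len, i;

	assert(size > 0);
	len = next_rand(state) % size;
	for (i = 0; i < len; i++)
		buf[i] = (char)alphabet[next_rand(state) % sizeof(alphabet)];
	buf[len] = '\0';
}

/* Count how often the sign differs from libc's over the given rounds. */
unsigned int count_mismatches(unsigned int rounds)
{
	unsigned int state = 1, bad = 0, i;
	char a[8], b[8];

	for (i = 0; i < rounds; i++) {
		size_t n;

		random_string(a, sizeof(a), &state);
		random_string(b, sizeof(b), &state);
		n = next_rand(&state) % 10;

		if (sign(patched_strcmp(a, b)) != sign(strcmp(a, b)))
			bad++;
		if (sign(patched_strncmp(a, b, n)) != sign(strncmp(a, b, n)))
			bad++;
	}

	return bad;
}
```

Calling count_mismatches(10000) from a small main() and checking that it
returns 0 is enough to exercise both functions against a conforming libc.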

 lib/string.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 78bd65c413..ecea755f40 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -206,16 +206,20 @@ size_t strlcat(char *dest, const char *src, size_t size)
  * @cs: One string
  * @ct: Another string
  */
-int strcmp(const char * cs,const char * ct)
+int strcmp(const char *cs, const char *ct)
 {
-	register signed char __res;
+	int ret;
 
 	while (1) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		unsigned char a = *cs++;
+		unsigned char b = *ct++;
+
+		ret = a - b;
+		if (ret || !b)
 			break;
 	}
 
-	return __res;
+	return ret;
 }
 #endif
 
@@ -226,17 +230,20 @@ int strcmp(const char * cs,const char * ct)
  * @ct: Another string
  * @count: The maximum number of bytes to compare
  */
-int strncmp(const char * cs,const char * ct,size_t count)
+int strncmp(const char *cs, const char *ct, size_t count)
 {
-	register signed char __res = 0;
+	int ret = 0;
+
+	while (count--) {
+		unsigned char a = *cs++;
+		unsigned char b = *ct++;
 
-	while (count) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		ret = a - b;
+		if (ret || !b)
 			break;
-		count--;
 	}
 
-	return __res;
+	return ret;
 }
 #endif
 
-- 
2.37.2


