2^16 bytes is a mere 64 KB. Sure, you can get away with small strings if you have to, but in a world where you don't have to put up with that limit, it would be quite frustrating.
For example, say you need to perform some sort of text-editing-style task and insert a few chars into the middle of a file. If one of the internal representations of the file happens to be one contiguous char*, then all you have to do is one quick memmove to make some room. With a length-prefixed representation, the best-case scenario is that you do the memmove as before, then also update the length (no biggie really, since you probably keep that around somewhere anyway). However, if you have a 2^16 restriction and a file larger than that, you suddenly can't use a contiguous piece of memory. This would complicate numerous things, including searching, splitting, and (potentially) insertion. Not having a contiguous piece of memory also complicates the process of laying any number of data structures on top of the file data. Even further, it causes issues when you want to just mmap a file, unless you want all your files to be prepended with the number of chars in them, which causes even more issues...
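To make the memmove point concrete, here is a minimal sketch of an insert into a contiguous buffer that carries its own length. The struct text_buf and the function text_buf_insert are hypothetical names made up for illustration, and the sketch assumes the buffer already has spare capacity instead of reallocating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-tracked buffer; the names are illustrative only. */
struct text_buf {
    char  *data;  /* contiguous storage, not NUL-terminated */
    size_t len;   /* bytes currently in use */
    size_t cap;   /* bytes allocated */
};

/* Insert n bytes from src at offset pos. Returns 0 on success, or -1 if
 * the buffer lacks room (a real editor would grow the buffer instead). */
static int text_buf_insert(struct text_buf *b, size_t pos,
                           const char *src, size_t n)
{
    assert(pos <= b->len);
    if (n > b->cap - b->len)
        return -1;
    /* Shift the tail right to open a gap of n bytes; memmove is used
     * because the source and destination ranges overlap. */
    memmove(b->data + pos + n, b->data + pos, b->len - pos);
    memcpy(b->data + pos, src, n);
    b->len += n;  /* the extra bookkeeping a length-carrying string needs */
    return 0;
}
```

The cost is dominated by the memmove of the tail either way; updating len is one addition. If len were a 16-bit field, though, the addition would have to be checked against the 64 KB ceiling, which is where the contiguity problem described above starts.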
Right, so you'd need a BigString analogue to BigInt for that case.
Which wouldn't be that bad, really. I mean, a 64k string is plenty for your everyday string needs. And in cases where you're handling really long strings, you'd probably want a specialized data structure anyway. I mean, it's not like char* is exactly efficient when you need to insert something into the middle of a many-megabyte file.
That's kind of what I mean. The fact that null-terminated strings can get arbitrarily long doesn't seem like a big advantage. I mean, if you're working with really long strings, you probably want to use something more sophisticated than a character array anyway.
But the question here isn't actually "What is the ONE TRUE STRING format that the language permits, all others to be rigidly banned by the compiler?", the question is "What shall the default string be in the core C APIs and functions?"
If you still need NULL-terminated strings, you could have chosen them, and if you knew enough to so choose, hopefully you know enough to treat them like the dangerous tools they are. Meanwhile, the core C functions and API and UNIX could have been built around the much safer strings, which wouldn't have been all that hard to upgrade to 4 bytes (or more) later. Or we could have done a UTF-8-like size encoding, or turned the default strings into linked lists if they got large, etc. It would be OK, because raw expanses of memory would still be available to you; they just wouldn't be the default.
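One way to read "a UTF-8-like size encoding" is a variable-width length prefix. The sketch below uses a LEB128-style scheme (7 payload bits per byte, high bit as a continuation flag), which is not UTF-8's exact bit layout but follows the same idea of small values taking fewer bytes. The function names len_encode and len_decode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write len as 7-bit groups, lowest group first; a set high bit means
 * "more bytes follow". out must have room for up to 10 bytes.
 * Returns the number of bytes written. */
static size_t len_encode(uint64_t len, unsigned char *out)
{
    size_t i = 0;
    assert(out != NULL);
    while (len >= 0x80) {
        out[i++] = (unsigned char)((len & 0x7F) | 0x80);
        len >>= 7;
    }
    out[i++] = (unsigned char)len;
    return i;
}

/* Read a length written by len_encode from at most avail bytes.
 * Returns the number of bytes consumed, or 0 if the prefix is
 * truncated or longer than 10 bytes. */
static size_t len_decode(const unsigned char *in, size_t avail,
                         uint64_t *len)
{
    uint64_t v = 0;
    for (size_t i = 0; i < avail && i < 10; i++) {
        v |= (uint64_t)(in[i] & 0x7F) << (7 * i);
        if (!(in[i] & 0x80)) {
            *len = v;
            return i + 1;
        }
    }
    return 0;
}
```

Under this scheme a string shorter than 128 bytes pays one byte of overhead, the same as a NUL terminator, while a 64 KB string pays three, and there is no fixed ceiling to upgrade later.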
NULL-terminated strings are the wrong default, even though they should be available to those who really need them.
Hm, it can be a real problem if you have the joy of working with old C++ code: the initial codebase might have had its own string type, then some more classes were added by long-gone programmers who thought the earlier types were slow, and then newer code was added by people who actually use std::string (well, you get the idea).