> Proper array support (which passes around the length along with the data pointer).
I second this one. One of the best things from Rust is its "fat pointers", which combine a (pointer, length) or a (pointer, vtable) pair as a single unit. When you pass an array or string slice to a function, under the covers the Rust compiler passes a pair of arguments, but to the programmer they act as if they were a single thing (so there's no risk of mixing up lengths from different slices).
The C family has already evolved in this direction decades ago. Have you heard of C++ (Cee Plus Plus)?
It is production-ready; if you want a dialect of C with arrays that know their length, you can use C++. If you had wanted a dialect of C with length-aware arrays for a production app in 1993, you could have used C++ then, too.
The problem with all these "can we add X to C" is that there is always an implicit "... but please let us not add Y, Z and W, because that would start to turn C into C++, which we all agree that we definitely don't want or need."
The kicker is that everyone wants a different X.
Elsewhere in this thread, I noticed someone is asking for namespace { } and so it goes.
C++ is the result: the version of the C language where most of the crazy "can you add this to C" proposals have converged and materialized. "Yes" was said to a lot of proposals over the years. C++ users had to accept features they didn't like that other people wanted, and had to learn them so they could understand C++ programs in the wild, not just their own.
C++ introduces a shit-ton of stuff that one often doesn't want, and even Bjarne Stroustrup (who many contend has never seen a language feature he didn't want) has been a little alarmed at the sheer mass of cruft being crammed into recent updates to the standard. I know many C++ people think C++ is pure improvement over C in all contexts and manners, but it's not. It's different, and there are features implemented in C++ and not in C that could be added to C without damaging C's particular areas of greatest value, and many other features in C++ that would be pretty bad for some of C's most important use cases.
C shouldn't turn into C++, or even C++ Lite™, but it shouldn't remain strictly unchanging for all eternity, either. It should just always strive to be a better C, conservatively, because its niche is one where conservative advancement is important.
Some way to adopt programming practices that guarantee consistent management of array and pointer length -- not just write code to check it, but actually guarantee it -- would, I think, perfectly fit the needs of conservative advancement suitable to C's most important niche(s). It may not take the form of a Rust-like "fat pointer". It may just be the ability to tell the compiler to enforce a particular constraint for relationships between specific struct fields/members (as someone else in this discussion suggested), in a backward-compatible manner such that the exact same code would compile in an older-standard compiler -- a very conservative approach that should, in fact, solve the problem as well as "fat pointers".
There are ways to get the actually important upgrades without recreating C++.
> C++ introduces a shit-ton of stuff that one often doesn't want
The point in my comment is that every single item in C++ was wanted and championed by someone, exactly like all the talk about adding this and that to C.
> C shouldn't turn into C++
Well, C did turn into C++. The entity that gave forth C++ is C.
Analogy: when we say "apes turned into humans", we don't mean that apes don't exist any more or are not continuing to evolve.
Since C++ is here, there is no need for C to turn into another C++ again.
A good way to have a C++ with fewer features would be to trim from C++ rather than add to C.
Sure, but there's a vast space between the C and C++ approaches. You don't have to say yes to everything to say yes to a few things. I would suggest that better arrays are an example of something that pretty much everybody wants.
But if you want better arrays you want operator overload to be able to use these arrays as 1st class citizens without having to use array_get(arr, 3), array_len(arr), array_concatenate(arr1, arr2) etc... You want to be able to write "arr[3]", "arr.len()", "arr1 += arr2" etc... To implement operator overload you might need to add the concept of references.
If you want your arrays type-safe you'll need dark macro magic (actually possible in the latest standards I think) or proper templates/generics.
If you really want to make your arrays convenient to use you'll want destructors and RAII.
Then you'd like to be able to conveniently search, sort and filter those arrays. That's clunky without lambdas.
And once you get all that, why not move semantics and...
Conversely if you don't want any of this what's wrong with:
struct my_array {
    my_type_t *buf;
    size_t len;
};
I don't think it's worth wasting time standardizing that, especially since I'd probably hardly ever use it: it doesn't really offer any obvious benefits, and "size_t" is massively overkill in many situations.
> But if you want better arrays you want operator overload to be able to use these arrays as 1st class citizens without having to use array_get(arr, 3), array_len(arr), array_concatenate(arr1, arr2) etc... You want to be able to write "arr[3]", "arr.len()", "arr1 += arr2" etc...
I don't think that's true at all.
1. "arr[3]" syntax can just be part of the language.
2. For length, we already have the "sizeof()" syntax, although admittedly it is a compile-time construct and expanding it to runtime could be confusing. I am ok with using a standard pseudo-function for array-len and would absolutely prefer it to syntax treating first-class arrays as virtual structs with virtual 'len' members.
3. I don't think any C practitioner wants "arr1 += arr2" style magic.
So I don't buy that there is a need for operator overload; the rest of your claims that this is basically an ask for C++ follow baselessly from that premise.
> Conversely if you don't want any of this what's wrong with:
As I suggested, adding a(n optional) constraint such that "buf" can be limited by "len" in such a struct is a possible approach to offering safer arrays. Adding that constraint does seem to require a change to the language, though.
Not literally everyone, I would think, but the previous statement could, in theory, still be true. It would just require some people to want something else, conflicting with that desire, even more.
> every single item in C++ was wanted and championed by someone
This is irrelevant to the point I made in the text you quoted.
> Well, C did turn into C++. The entity that gave forth C++ is C.
My mother didn't turn into me. She just gave rise to me. She's still alive and well.
My point, which seems to have completely escaped you, is that C itself should not turn into C++, so claims that any attempt at all ever to improve C with the addition of a single constraint mechanism for managing pointer size safely is a slippery slope to duplicating what C++ has become, leaving no non-C++ C language in its wake -- well, such claims seem unlikely to be an unavoidable Truth.
> A good way to have a C++ with fewer features would be to trim from C++ rather than add to C.
Again, my point is not easily crammed into the round hole of your idea of how things worked. It is, instead, that C can have a few more safety features without becoming "C++ with fewer features".
I feel like you didn't read my previous message as a whole at all given the way you responded to it, and just looked for trigger words you could use to push some kind of preconceived notions.
A Pascal array is just ones and zeros that behave like an array. So is a Fortran array.
> You can't index element two of std::array foo as 1[foo] since it isn't an actual C array.
That's just a silly quirk of C syntax that is deliberately not modeled in C++ operator overloading. It's not a real capability; it doesn't make arrays "do" anything new, so it's hard to call it an array behavior. It's a compiler behavior, that's for sure.
It could easily be added to C++, similarly to the way preincrement and postincrement are represented (which allows obj++ and ++obj to be separate overloads).
The dummy extra int parameter would mean "this overload of operator [] implements the flipped case, when the object is between the [ ] and the index is on the left".
C++ could easily have this; the technical barrier is almost nonexistent. (I wonder what the minimal diff against GNU C++ would be to get it going.)
It depends entirely on the whims of the assembly language design. Assembly languages for the Motorola 68000 could allow operand syntax like [A0 + offset], which could commute with [offset + A0], but the predominant syntax for that CPU family has it as offset(A0), which cannot be written A0(offset).
None of that changes what instruction is generated, just like C's quirk is one of pure syntax that doesn't affect the run-time.
C++ has features in its syntax so that you can write objects that behave like arrays: support [] indexing via operator [], and can be passed around (according to whatever ownership discipline you want: duplication, reference counting). C++ provides such objects in its standard library, such as std::basic_string<T> and std::vector<T>. There is a newer std::array also.
It would presumably involve a new type that didn't exist in the current ABI. Those pointers would stay the same, and the new (twice as big) pointers would be used for the array feature.
The point of uintptr_t is that it's an integer type to which any pointer type can be cast. If you introduce a new class of pointers which are not compatible with uintptr_t, then suddenly you have pointers which are not pointers.
No, uintptr_t is an integer type to which any object pointer type can be converted without loss of information. (Strictly speaking, the guarantee is for conversion to and from void*.) And if an implementation doesn't have a sufficiently wide integer type, it won't define uintptr_t. (Likewise for intptr_t the signed equivalent.)
There's no guarantee that a function pointer type can be converted to uintptr_t without loss of information.
C currently has two kinds of pointer types: object pointer types and function pointer types. "Fat pointers" could be a third. And since a fat pointer would internally be similar to a structure, converting it to or from an integer doesn't make a whole lot of sense. (If you want to examine the representation, you can use memcpy to copy it to an array of unsigned char.)
Surely you're not arguing that a bounded array is in fact a function rather than an object? The distinction between function and object pointers exists for Harvard architecture computers, which sort of exist (old Atmel AVR before they adopted ARM), but are not dominant.
Existing code would be using normal pointers, not fat pointers, so there would be no ABI break. New code using fat pointers would know that they fit into a pair of uintptr_t, so the size of uintptr_t would not need to change either.
IDK, it's not like it'd be an auto_ptr situation where you just don't use uintptr_t anymore and call the other one uintptr2_t. There's different enough semantics that they both still make sense.
Like, as someone who does real, real dirty stuff in Rust, usize as a uintptr equivalent gets used still even though fat pointers are about as well supported as you can imagine.