> Types are there whether you want them to be or not
But this only applies to typed programs, right? For every type system you come up with there will be some valid and normalizing term that cannot be typed under that type system.
There is also a minor thing that came to my mind. Types are most often associated with intuitionistic logics, which are about constructive provability, as opposed to classical logics, which are about truth, so the title of the post is a bit misleading :)
"But this only applies to typed programs, right? For every type system you come up with there will be some valid and normalizing term that cannot be typed under that type system."
In theory, yes - I came here to make that same objection.
In practice, there arises the question of whether such terms are actually ever encountered for a given type system when solving realistic problems.
My impression is that these type system limitations start showing up once you need to mix and match different type systems. For most terms there is a reasonable type system that will type it but when you scale up to your whole program/system then any single type system might not be able to cover all the terms that you want to write. Some type system features such as parametric polymorphism, subtyping, intersection types, linear types, etc are hard to mix in practice, especially when you worry about decidability and type inference.
I think this is one of the reasons why dynamic languages are popular. It kind of lets you mix different programming styles in a single program, at the cost of laxer type checking (something I hope will get better in the future).
No, this is not true at all. In practice every program is typed. There is no useful function that is fully general, so in practice there are limits on the types and the values of the inputs.
All dynamic typing gets you is that you can ignore this fact, and then hit it head on when you make a mistake.
It is possible to write programs that are truly dynamically typed, but they are not "real" programs. What can you do with a variable when you have zero information on what it contains? Not much: return it, pass it around... that's about it.
The real complaints are mostly about two things
1) It's hard to know what type an input parameter should be, so I just don't specify one. This is just not a valid complaint; besides, type inference helps a lot here. Generic programming catches the other use cases.
2) many languages make it really hard to correctly specify types. This is a language flaw. Dynamic typing can be an improvement over C code, but not over Haskell code. Other languages are somewhere in between.
Sometimes there's another flaw: the standard libraries are not thought out to have a good algebraic structure (e.g. correct IS-A relationships for all number types, and the ability to generalize over them).
These complaints basically boil down to people having constructed wrong type systems. It's a flaw in many, many languages. Python and Haskell are the only ones I know to get it close to right.
3) You want to eval() in your program (or, for example, read in arbitrary json). If you're doing this, you're doing something wrong. Even if you do manage to isolate things correctly (history says you are unlikely to get this right), you're open to trivial DDOSing.
The problem in Python is that type errors end up being reported later than they would in a typed language. For example, if I have a function f that expects an object and call it with f(None), it's not going to immediately give me an error. It's only going to give an error when I try to call a method on that `None`, and depending on how my program is structured, the line with the erroneous f(None) call might not even appear in the final stack trace.
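A minimal Python sketch of that failure mode (all names here are made up for illustration):

```python
class Config:
    def get(self, key):
        return {"debug": True}.get(key)

def make_config(path):
    # The real mistake: returns None instead of a Config for an unknown path.
    return Config() if path == "app.ini" else None

cfg = make_config("missing.ini")   # the bug happens here...
# ...possibly many calls and stack frames later:
try:
    cfg.get("debug")               # ...but the error only surfaces here
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'get'
```

The traceback points at the `cfg.get` line, not at the `make_config` call that produced the bad value.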
You wrote "in theory, there is a type system that describes every system": I would be very surprised if this is true. It would be a major turnaround in modern mathematics. What is this theory? Any link?
The fact that there exists such a type system does not mean that we have constructed it. Having said that, the theoretical existence of such a theory seems highly plausible, but without a proof, it would not surprise me to learn that it is impossible. Without a constructive proof, it would not surprise me to learn that the existence of such a system is irrelevant (to applied computer science).
As I mentioned with my caveat, I'm an amateur type theorist, so I'm still digging into a lot of things. Do you have some relevant literature for me about this being impossible?
My claim may have been too strong: I meant more that "it hasn't been disproven that type systems are inherently disjoint." Similar to how physicists believe there's a Grand Unified Theory of physics, but that doesn't mean we actually know it yet.
That certainly tells us that there is no single type system that can describe every correct program, which seems to be the stronger version of Steve's thesis that he was supporting in the immediate parent.
A weaker version of his thesis, which could still support his original phrasing in the GGP, would be "for any correct program, there exists a type system in which it can be typed [though we might not be able to find it]." That doesn't seem to be ruled out, and sounds plausible, though I don't know of any specific results on it.
For most type systems, they certainly are: most reasonably large C programs contain assembly corresponding to unsafe pointers, unions, etc. that a compiler for a typical high-level language cannot emit under any circumstances (without being ridiculously intelligent). Languages like ATS and C-plus-formal-verification mostly avoid this problem at the expense of being really hard to write.
"most reasonably large C programs contain assembly corresponding to unsafe pointers, unions, etc."
I'm not sure what you're saying here with the "corresponding to" - obviously you can write "unsafe pointers", unions, etc, right in C, and I've not really ever encountered inline assembly for the purposes of subverting the type system. I'll respond to more of your content when I'm confident I've understood you (which I don't mean as a brush off...).
My response was poorly worded, but what I meant is that compilers for other languages with higher level type systems (which make more guarantees) cannot emit the same assembly as a C compiler, not that assembly is needed to subvert C's type system.
All right, let's see if I can express what I'm talking about properly. I don't have much background in theory, but what I'm trying to say is pretty trivial, so...
The article talks about translating from a typed lambda calculus to the untyped variant. Real compilers typically translate from the original typed AST gradually through multiple IRs down to assembly, which is more complicated, since they're changing the form of the program rather than just dropping type annotations, but it's generally analogous.

What I care about is whether there are "terms" - instruction sequences - corresponding to useful programs that cannot be "typed" in a particular language, i.e. that cannot be produced by a compiler following any kind of straightforward procedure to lower from that language. Of course you can always come up with an equivalent program as long as the language is Turing complete; in particular, just as untyped lambda calculus can also be considered unityped, you can do something like Emscripten, where you mechanically translate low-level IR into a tiny subset of a higher-level language and hope the compiler optimizes that subset. But that's not very interesting, because the general case is changing the program into a completely different one, and the special case removes all of the benefits of the type system.

Additionally, you can have a language like Rust that lets the programmer poke holes in the type system, which is useful in practice but means the type system no longer guarantees what it says it guarantees.
So, of course, with these limitations, not only are there correct programs which typical, non-dependent type systems cannot express, but, judging by the high performance of C (which has a type system, but one that provides barely any guarantees and thus has barely any limitations) in practice, these programs are quite common and useful in practice. Dependent type systems, along with their Curry-Howard mirror of theorem provers for low-level languages, can almost always make up for this (though by Gödel's incompleteness theorem there is some correct program somewhere that they can't prove correct), but they're a relative pain to program in and for that reason are uncommon in practice. Or to put it another way, you have a choice between delegating verification of complex properties about programs (the types which are there whether you want them to be or not) to computers, which have rather poor insight, or to humans, which are notoriously unreliable. The ultimate type system is a general AI you can explain the situation to in English.
> Additionally, you can have a language like Rust that lets the programmer poke holes in the type system, which is useful in practice but makes the type system no longer guarantee what it says it guarantees.
FWIW, Rust has a pile of behaviours that are undefined behaviour[1]: that is, the compiler will optimise/reason assuming they never happen. These are impossible to hit in safe code, but are possible with `unsafe`, it's up to the programmer to manually maintain them. That is, the type system does guarantee certain things, and it's up to the programmer to not break it inside `unsafe`.
If you want your high-level language to be as fast as C, it needs provably safe manual memory management (a.k.a. linear types) and provably safe unchecked array access (a.k.a. dependent types). Most high-level languages don't have these features, because in their current state they are really hard for programmers to use. See ATS for an example.
There is some existing work on typed assembly languages (having a compiler emit assembly language that can be verified to be correct), although so far this is more of a research thing.
Well, in practice we write programs that behave well within the type system we are using. Thus, no, you won't find those very often, but it also does not mean that there is no problem.
If you were building a sandcastle, a type system is like plastic forms in the shape of castle components (walls, towers, etc)... They help you create a better sandcastle...
This article was essentially saying: "You can remove the forms when you are done and your sandcastle will still stand."
Your hands are your mind... you think the types, check the types, and erase them, but do all of it briefly, while playing with that part, and then move away.
Compiled languages are like building a set of plaster molds, and then doing the whole castle in one step.
Optional typing, or using Any types and casting, is like using molds for a nice foundation, and using your hands to add little bits on top.
Yes, type erasure is neat and a bit of thinking will make it obvious why proving static properties does not require run time checks if the transformations you perform on your program preserve those static properties. I disagree with the author's claim though that types are everywhere. Types in fact are extra structure that is hoisted onto programs. At the end of the day the machine knows nothing about the types and will shuttle bytes back and forth all day long. Furthermore, types restrict the universe of computation and there are examples of programs that are not well-typed but don't go wrong when evaluated.
Types give you the power of abstraction -- they allow you to define new concepts that "don't to go wrong" and allow you to speak of things other than just bytes-shuttled-on-a-turing-tape. Rigorously justifying the ignorance of the underlying bytes (i.e. being parametric over representation) is an advantage, not a downside.
But maybe I am misunderstanding your comment? Perhaps you were just equivocating over what "really there" means? If that's the case, is multiplication not "really there" because it can be implemented in terms of bit shifting and addition? We don't imagine multiplication as somehow putting these restrictions on what we can do with our bit shift operators -- the multiplication is the "real" thing we care about and the binary operations are incidental.
In any case, see the famous parable that opens Reynolds' "Types, Abstraction, and Parametric Polymorphism" for a colourful example of ignoring the underlying data representation:
If you're doing mathematics, then multiplication is the thing you care about, not its representation in any given number/notation system; but if you're actually implementing multiplication, then you definitely care about the actual bits and bytes. Whatever structure you impose on top of the implementation to make it easier is all nice and well, but don't fool yourself into thinking that types are the end of it. In fact, to truly impose a type structure that lets you reason about what's going on at the hardware level, you'd need more than just an integer type or a floating point type, because those types don't let you reason about overflow/underflow at such a high level. Any correct implementation would need to worry about those restrictions, and consequently would have to represent them in the type system.
Don't get me wrong I think type theory is really cool. In fact any kind of theory that makes me a better programmer is cool. That's one of the reasons TypeScript right now is one of my favorite languages. It is the perfect balance between the prototyping power of dynamic languages and the awesome static guarantees of statically typed languages. The best part is that the type system does not get in the way when I'm prototyping and actually starts to help out when I have the design fleshed out. I just wish more languages supported that kind of type system because sometimes I really miss the static checks when writing stuff in Ruby and Python.
What do you mean "types don't do anything"? Are you seriously going to argue that? I don't know many people in the programming or CS fields who believe that. What some people believe is that there are type systems too onerous for the benefits they bring (which is, of course, debatable).
Following your line of thought, one could say that the textual representation of your program "doesn't do anything" either, and neither does the syntax or the IDE/editor you use. So none of it matters. But of course it wouldn't be true -- all those things are useful tools that help you write software that does cool things.
edit: thought of an even better example: tests. Tests "don't do anything". They are not directly related to the cool stuff we want to do with our computers. But who is going to argue that we shouldn't write any tests?
You'll notice I only addressed the faulty assertion that "types don't do anything". I've purposefully left the excitement part out, because that's subjective. Some people are excited about testing methodologies, some about programming languages, some about their IDEs, etc.
What you can bet is that absolutely no-one is excited about mission-critical software failing because of a bug that could have been caught by testing... or type checking.
I guess I wasn't clear. You could say that a line of code does something if it generates instructions for the cpu. `x = y` would generate at least one instruction (probably some kind of mov on an intel system), which would instruct the cpu to do something, e.g. change the value in a register.
If you scanned through a program looking for syntactic forms that represent types as such, you could find clear examples of such syntax in most programs; some programs would have more and some less, probably determined to a large extent by programming language. It would be things like variable declarations, class declarations, etc. -- essentially declarative scaffolding, which wraps and adds meaning and context to code that does things; the type code itself, though, generates no cpu instructions, because its purpose isn't to represent action, it's orthogonal to action.
You could argue that type code causes actions to be performed, such as when types need to be casted, which may trigger some kind of transformation on the underlying data. But, even though the type helps trigger action (the data transformation), the type is not that action.
I don't know if what I've written will make sense or sound like empty nonsense... but the article itself makes the point that you will generally not find types in machine code; at most you can find indirect evidence that they were there.
You may say, "what difference does that make? Why make the distinction?"
Well, machine instructions are what make the computer do observable things. Any code you have to write that doesn't produce instructions needs to be justified somehow. There are lots of ways of justifying such extra code. But it's orthogonal to actually doing things -- pretty much by definition.
Edit -- to respond more directly:
>>What some people believe is that there are type systems too onerous for the benefits they bring (which is, of course, debatable).
I think that's a real risk. People tend to get attached to ideas about programming languages in quasi-religious fashion; I think it's healthy to stay in touch with an objective cost/benefit perspective, though. The user doesn't see the code, he/she only sees the results. And the experience of the user is what you should focus on as a programmer -- right?
>>Following your line of thought, one could say that the textual representation of your program "doesn't do anything" either, and neither does the syntax or the IDE/editor you use. So none of it matters. But of course it wouldn't be true -- all those things are useful tools that help you write software that does cool things.
I'm not sure I follow the distinction you're making, between the textual representation of the code and -- what? Clearly the text has to be written before the computer will do anything -- so it must be important.
>>edit: thought of an even better example: tests. Tests "don't do anything". They are not directly related to the cool stuff we want to do with our computers. But who is going to argue that we shouldn't write any tests?
I said types don't do anything, and that they don't excite me. I didn't say we should get rid of all types. Tests don't do anything, either (at least, they don't do what the program should do). Tests also don't excite me. That doesn't mean that testing is pointless. Testing can be costly and tedious -- but you have to weigh that against the risk of bugs getting through.
An alternative to static types are runtime assertions - check that the two arguments to an add function are integers before performing the operation. Then the type system (strong, dynamic typing) is definitely part of the code of the program, if that's what you want.
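A sketch of that style in Python, where the check is ordinary code in the program, so a bad argument fails immediately at the call site rather than at some later use:

```python
def add(a, b):
    # Runtime type assertion: strong dynamic typing, written as part of the program.
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError(
            f"add expects two ints, got {type(a).__name__} and {type(b).__name__}"
        )
    return a + b

print(add(2, 3))  # 5
try:
    add(2, "3")   # rejected right here, not at some later use of the result
except TypeError as e:
    print(e)
```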
> Well, machine instructions are what make the computer do observable things.
The typechecker is a program, and if the typechecker fails, you will see output.
You definitely see lots of things like (string? x) even in Lisp. It shows that some types are unavoidable. (Though frequently the type information is just used as additional information.)
Typecheckers definitely do things. But are those things interesting?
If you could show that seat belts save lives, which I think you can, then that's kind of interesting. In any case, I meant the question to be rhetorical.
This is an execution-centric view; I think that types really are everywhere when it comes to how typical programmers think about their code (even in dynamically typed languages).
What's the difficulty, exactly? You need some mundane float->bits and bits->float operations, which are commonly although not universally provided in strongly typed environments. The rest is integer ops.
That's not a terribly difficult proposition in a strongly-typed language -- in fact, it is even more performant than if you were to do it with tagged unions! The function "floatFromBits : bits -> float" has no execution overhead; it simply adds a type annotation to what is semantically the identity. At execution time, no instructions are emitted for it...
So, thanks for the well wishes, but in this case I don't think they're needed: any reasonable compiler should be able to do this without great difficulty.
It could just be a movd if the compiler backend people put the work in (and if they can't be arsed and implement it as a function call, that's nothing to do with the type system).
If the language supports union, then simply give your working variable the type of "union { float, long }". This directly expresses the idea that you're working with a piece of memory that is used for two purposes.
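For languages without unions, the float-to-bits route discussed upthread can be sketched with explicit conversion functions. In Python the struct module plays that role (with copying overhead that a compiler intrinsic or a `movd` would not have; the function names are my own):

```python
import struct

def float_to_bits(x):
    # Reinterpret the 8 bytes of an IEEE-754 double as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_float(b):
    # The inverse reinterpretation: unsigned 64-bit integer back to a double.
    return struct.unpack("<d", struct.pack("<Q", b))[0]

bits = float_to_bits(1.0)
print(hex(bits))            # 0x3ff0000000000000
print(bits_to_float(bits))  # 1.0
```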
Some food for thought from Rich Hickey:
"Statically typed programs and programmers can certainly be as correct about the outside world as any other program/programmers can, but the type system is irrelevant in that regard. The connection between the outside world and the premises will always be outside the scope of any type system."
http://www.reddit.com/r/programming/comments/lirke/simple_ma...
He seems to be saying "types are just a formal system; they are meaningless until given meaning by an interpretation". Which is, of course, true. But the same can be said of programming languages. They're just meaningless squiggles until interpreted by human or machine.
Assuming you have some sort of specification for your program, you can use a formal system (such as a static type system) to describe aspects of the specification. It's up to you to correctly model the requirements in the system, but it's similarly up to you to model the requirements of the system in your (far larger and more complex) program. The types ensure that the program fits the aspects of the specification you were able to describe with them.
Also, what about protocols? Can type systems handle those? Things like:
* open file
* close file
* read from file
That can be type checked up and down and it will still be broken. Is that what Rich Hickey meant? Here we are dealing with the real world -- a file. And it has a protocol for accessing it. So in a sense we want a protocol checker, not a type checker, in that case...?
Separation logic can handle that—it’s a program logic that subsumes typechecking and typestate, among other things. Your example, in pseudocode:
// open : string -> file_mode -> (f : file) | f @ read * f @ close
f = open "input.dat" Read
// close : (f : file) | f @ read * f @ close -> ()
close f
// getline : (f : file) | f @ read -> string | f @ read
s = getline f
“open” grants the “read” and “close” permissions to “f”; “close” revokes the “read” permission that “getline” requires, causing “getline” not to typecheck.
You could certainly handle that kind of error using a type system. There could be a scoping rule where certain kinds of operation (i.e. close file) put the reference to the object out of scope. The way Haskell chains I/O operations using monads is another way: the result of the 'close file' operation is basically a world that doesn't contain a reference to the file.
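For illustration, here is a toy runtime analogue in Python -- a hypothetical handle (not any real library) that tracks its own state, so the protocol violation is at least caught where it happens, whereas a typestate-aware type system would catch it at compile time:

```python
class FileHandle:
    """Toy handle that enforces the open/read/close protocol at runtime."""
    def __init__(self, lines):
        self._lines = iter(lines)
        self._open = True

    def getline(self):
        if not self._open:
            raise RuntimeError("protocol violation: read after close")
        return next(self._lines, "")

    def close(self):
        self._open = False

f = FileHandle(["first line"])
print(f.getline())   # first line
f.close()
try:
    f.getline()      # caught at runtime, not at compile time
except RuntimeError as e:
    print(e)
```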
I was more convinced by someone who replied to that post (psnively). He seems to be saying that type systems can't model the world as we know it perfectly, in which case it sounds like a Nirvana fallacy - we can model a lot of things that we care about as programmers, so even though we can't model all of it, it still has merit. If what he is referring to is that type systems are somehow "separate" from the outside world, then it seems somewhat like saying that mathematics is a wholly abstract field. Still, though, applied mathematics has been incredibly successful.
After psnively's reply, Rich decides to throw a tantrum (read: "bow out").
Isn't type erasure just a rather limited subclass of compiler optimization, and as such something that comes "for free" without having to explicitly code for it? The compiler frontend can leave in all type checks, and most of them can then be removed by the optimizer, the same way you can remove any sort of redundant check (a default case in a switch statement over an enumeration, etc.).
That being said, general type erasure can have (massive) problems. Look at Java.
I'd say Java's type erasure has on the whole been massively successful (and it's allowed integration with e.g. Scala that a more reified type system might make difficult). What do you see as wrong with it?
For example, having to use a varargs hack just to make a T[] (that's an array of a generic type).
Or not being able to call T.class
Or not being able to use instanceof T.
Or having an Object[] that crashes at runtime when you try to put an Object in it.
Or not being able to overload a method based on if it's passed a List<Foo> or List<Bar>. (Also applies to interfaces like Comparable, etc. There's no good way of going "this is a Comparable<Foo> and a Comparable<Bar>, but not a Comparable<Baz>.)
Or for that matter, trying to pass an int[] to something that expects an Integer[], or vice versa.
Or, for that matter, someone passing a raw List into your parameter expecting a List<Foo>, and crashing at runtime because it actually was a List<Bar>.
Introspection like this is usually a hack around (1) having a weak type system or (2) using objects wrong. That said, T.class and instanceof work perfectly fine in Haskell and the like, because of the lack of inheritance in the OOP sense. Instead, they just say "if you want to get a type, implement the Typeable interface, which has a method that returns the Type". In other words, if you want it, do it manually, with the added benefit that there is zero cost, in memory or otherwise, when it isn't used.
Really, a quote I heard once: "Java is like a strawman built specially by people who want to argue against using strong type systems."
It is ugly, but you can do many of those things by making all constructors take a Class<T> cls parameter.
One thing you learn using Java is not to move arrays around.
The comparator stuff is true.
You can make the compiler fail if you pass raw classes. A List is not a List<Foo>, and in some cases is not even a List<?> (where List is a class with <T extends Foo>).
What I miss is some kind of self type, or "return this"; I don't know quite how to put it. Imagine a hypothetical class A:
this aMethod() {
    stuff();
    return this;
}
This aMethod returns an A when invoked on an A instance, or a B (where B extends A) when invoked on a B instance. Handy for builders and fluent APIs.
Which doesn't help in the case I was discussing, namely when you want a generic array. As in: generic. Not specific.
You end up having to use a collection, which ends up with an order of magnitude more memory usage than an array in the case where you want to store characters. (char is the worst case. But all primitives are pretty bad in that regard.)
Personally, I wish that type systems allowed for arbitrary pure (as in "same input -> same output") code to define what is and isn't a valid instance of a type at compile time.
Being able to declare a function that is only valid for, say, power of two inputs (say: a hashmap's initial size, when the hashmap is using a bitwise and for wrapping) and having it actually enforced at compile time would be very useful.
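Short of compile-time enforcement, the invariant can already be protected today with a "smart constructor" whose checked type is the only way to build the value. A hedged Python sketch (PowerOfTwo and bucket_index are made-up names, and the check here runs at construction time, not compile time):

```python
class PowerOfTwo:
    # Smart constructor: the only way to build a value validates the invariant,
    # so downstream code can rely on it unconditionally.
    def __init__(self, n):
        if n <= 0 or n & (n - 1) != 0:
            raise ValueError(f"{n} is not a power of two")
        self.value = n

def bucket_index(h, size):
    # size is a PowerOfTwo, so the bitwise-and wrapping trick is always valid.
    return h & (size.value - 1)

print(bucket_index(1234567, PowerOfTwo(16)))  # 7
```

The compile-time version of this is exactly what dependent/refinement types would buy: moving the ValueError from construction time to type-checking time.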
Any such system would need a theorem prover that's powerful and easy to use, and such a prover doesn't exist. For example, to implement an algorithm that involves arrays and indices, you'd need to supply machine-checkable proofs that all indices involved in the algorithm are valid for their respective arrays. That's really difficult, e.g. you can try to come up with some static reasoning scheme that would work on this algorithm:
The original paper on this topic is Hongwei Xi's "Eliminating array bound checking through dependent types", http://www.cs.bu.edu/~hwxi/academic/papers/pldi98.pdf . You will note that he failed to remove all bound checks in the algorithm, even though he started out with a good language (ML), added many manual annotations, and used a sophisticated method of automatically proving theorems. Also, his method and most subsequent methods will fail on any problem that involves multiplication of integers to get array indices, e.g. addressing a rectangular array as a[i*n+j], because integer arithmetic with multiplication is undecidable and even simple instances make the computer cry.
This is a hard research problem. If you come up with a language design that is guaranteed to be memory-safe, eliminates array bound checks in most realistic use cases, and can be used by regular programmers, you will be hailed as a hero. I'm not exaggerating at all.
Programmers code with three types of constraints: soft cast, hard cast, and unsafe cast. Soft cast is a runtime assertion that the constraint isn't violated, hard cast is a "do not compile unless you can prove that this constraint isn't violated" and unsafe cast is a "check at compile time to make sure you cannot provide a counterexample, but do not include a runtime check". Hard cast will be rarely used, for obvious reasons.
Compiler propagates known values / constraints forwards as much as possible, and propagates constraints backwards as much as possible. Note that this generally already happens as part of compiler optimizations. If at any point any value violates a constraint, error out and provide a "constraint trace" of what caused the violation.
After this step, the compiler looks at all unknown-at-compile time variables and randomly (or rather, pseudo-randomly, maybe seeded with the md6 of the file or something to be deterministic) generates and tests values (along with some common values - note that this could be integrated with a test framework built into the language!), ensuring that no constraints are violated. (There are a bunch of optimizations here that I'm not going to mention.) If any constraints are, error out with the variables required to violate the constraint and the values assigned to them.
This is too much machinery to build into a general-purpose compiler today, but there are static analysis tools (see Frama-C and the like) that will analyze your program text (optionally with annotations like you've mentioned) and figure out constraint violations.
Well, the thing is, anything would be better than what we have currently.
I know that in general the problem is undecidable. But we don't need to solve the entire problem to solve a large chunk of practical cases.
Even a simple runtime cast (with check) / asserting cast (a cast with an assertion that would be triggered at compile time if the compiler can find a valid counterexample, otherwise unchecked) combination would be miles above what we have now in most strongly typed languages.
> If you come up with a language design that is guaranteed to be memory-safe, eliminates array bound checks in most realistic use cases, and can be used by regular programmers, you will be hailed as a hero. I'm not exaggerating at all.
I'm just guessing, but wouldn't most uses of arrays be in traversing the whole array, i.e. mapping over the array, folding it, etc.? I have seen a ton more
> for i = 0 to a.length - 1
than I have seen something like indexing into a sorted array as in a binary search. So if these operations are abstracted into functions that don't use array indexing directly, then they could perhaps be proven once in some library. And such a proof seems very straightforward (conceptually, perhaps not realistically) - just prove that your for loop exits when the index i gets incremented out of bounds of the array.
I think Rust's iterators use no bounds-checked indexing underneath, at least for simple things like mapping over an array. They probably haven't proved that the indexing won't go out of bounds, though, probably just vetted and tested it a lot.
Yeah, Rust's iterators are a good example, and they indeed solve this specific use case.
(For those who don't know, an iterator in Rust has a function next() that returns an Option<A>. That lets you do the array bounds check and the loop-termination check with a single operation, just as when iterating over a linked list.)
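To make that concrete, here is a minimal sketch of the pattern (the type name is illustrative, not Rust's actual standard-library implementation): an iterator over a slice whose next() returns an Option, so the caller never indexes the array directly.

```rust
// A hand-rolled slice iterator: next() returns Option, so the caller
// never performs an index operation themselves.
struct SliceIter<'a, T> {
    slice: &'a [T],
    pos: usize,
}

impl<'a, T> Iterator for SliceIter<'a, T> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        // This single Option serves as both the bounds check
        // and the loop-termination condition.
        let item = self.slice.get(self.pos)?;
        self.pos += 1;
        Some(item)
    }
}

fn main() {
    let data = [1, 2, 3];
    let iter = SliceIter { slice: &data, pos: 0 };
    // Mapping over the array involves no user-visible indexing at all.
    let doubled: Vec<i32> = iter.map(|x| x * 2).collect();
    println!("{:?}", doubled); // [2, 4, 6]
}
```

The point is that safety is established once, inside next(), rather than at every call site.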
Also I've seen a promising paper by Corneliu Popeea: http://gallium.inria.fr/~naxu/research/abce.pdf . It says it works on unmodified C code, doesn't require annotations, and successfully removes all checks in quicksort (!) Kinda sounds too good to be true, though - I wonder what the catch is.
Your example is perfectly possible to implement in most languages; you just can't expect to give a function an arbitrary run-time integer value and have its validity checked at compile time :) So, the type would have to look something like this:
class IntegerPowerOfTwo {
    private int exponent;

    public IntegerPowerOfTwo(int exponent) {
        this.exponent = exponent;
    }

    public int toInt() {
        return 1 << exponent;
    }
}
If you want a compile-time conversion from a constant integer to an IntegerPowerOfTwo, it is possible with C++, for instance, although you need a bit of template metaprogramming.
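To give a flavor of such a compile-time conversion without the C++ template machinery, here is a rough analogue sketched in Rust (the type and names are illustrative): const evaluation rejects a non-power-of-two constant at compile time, while the same constructor still panics at runtime for dynamic values.

```rust
// Illustrative sketch: a power-of-two type whose constant construction
// is checked at compile time via const evaluation.
struct PowerOfTwo(u32);

impl PowerOfTwo {
    // A const fn: calling it in a const context with a value that is
    // not a power of two turns the panic into a compile-time error.
    const fn new(n: u32) -> PowerOfTwo {
        assert!(n.is_power_of_two(), "not a power of two");
        PowerOfTwo(n)
    }
}

// Checked at compile time; `PowerOfTwo::new(6)` here would fail to compile.
const EIGHT: PowerOfTwo = PowerOfTwo::new(8);

fn main() {
    println!("{}", EIGHT.0); // 8
}
```

As in the Java version above, the invariant is maintained by encapsulation; the compile-time check is extra sugar on top.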
I'd like to know why this was downvoted. (Edit: other than the fact that the exponent should be unsigned int :/)
The whole point of encapsulation is to be able to write data types that have invariants other than those offered by the machine-level types. If you want a type T that satisfies the predicate
∀n∈T: ∃k∈N: n = 2^k
then you write a type that only provides operations that maintain the constraint. In this case, for instance, construction from k and multiplication by another T. As I said, some languages like C++ do also allow things such as compile-time conversion of an unconstrained constant integer into a power-of-two integer, with a compile error if the constraint isn't satisfied, but in my opinion that's mostly syntactic sugar. Would be nice if you could simply write, say,
type intPowerOfTwo = unsigned int where ∀intPowerOfTwo n: ∃unsigned int k: n = 2^k
but what about the unsigned int operations, such as addition, that do not maintain the invariant? Should the type system include a general theorem prover that can deduce which operations are allowed?
However, I understand this was simply an example and do agree that expressive and powerful type systems are certainly a very useful tool. C++ template metaprogramming (pretty much computing with types as values) is very impressive even if horribly ad hoc and the concepts feature (constrained templates), if ever actually implemented, will make all kinds of things easier.
I'm not sure why you're being downvoted. This technique is not entirely unlike using a phantom type, which is a common way of encoding such constraints into the type system.
I'll admit it's not ideal; in fact, it's often downright cumbersome. In some cases, it's not possible to encode desired information into a particular type system.
But in this case, it seems valid to me. There is no way to construct a value of IntegerPowerOfTwo that does not represent a power of two, and therefore it fulfills GP's example requirement.
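The phantom-type version of the same idea can be sketched like this (an illustrative Rust example with made-up names): a zero-cost type parameter records whether a value has been checked, so only validated values can flow into functions that require the constraint.

```rust
use std::marker::PhantomData;

// States encoded as zero-sized marker types.
struct Unvalidated;
struct Validated;

// The phantom parameter records, in the type, whether the wrapped
// string has been checked; it costs nothing at runtime.
struct Input<State> {
    text: String,
    _state: PhantomData<State>,
}

impl Input<Unvalidated> {
    fn new(text: &str) -> Input<Unvalidated> {
        Input { text: text.to_string(), _state: PhantomData }
    }

    // The only way to obtain an Input<Validated> is to pass the check.
    fn validate(self) -> Option<Input<Validated>> {
        if self.text.chars().all(|c| c.is_ascii_alphanumeric()) {
            Some(Input { text: self.text, _state: PhantomData })
        } else {
            None
        }
    }
}

// Passing an Input<Unvalidated> here is a compile-time type error.
fn run(input: &Input<Validated>) -> usize {
    input.text.len()
}

fn main() {
    let checked = Input::new("abc123").validate().expect("invalid input");
    println!("{}", run(&checked));
}
```

Just like IntegerPowerOfTwo, there is no way to construct a value of the constrained type without going through the operation that establishes the invariant.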
To each his own. I rarely hear fluent Haskellers say they don't care for the syntax though; if you haven't really gotten far enough to write production code in Haskell that leverages Haskell's powerful abstraction features, then you should and you might find the syntax growing on you.
The syntax is nice because it enables the programmer to express software in a terse way, generally favoring the types as documentation and a strong mental model to understand the abstraction.
You can do that in Scala (though it won't necessarily be easy). Just require an implicit parameter that witnesses whatever constraint you want, and define what are basically functions at the type level that supply it.
You can do "kind of" dependent types in Haskell too but it's an ugly mess (like trying to do it in Scala). Idris handles it really well for general programming tasks.
In general, what you want to be able to mix types and values are dependent types (see languages such as Coq, Agda, Idris). However, dependent types are way too powerful, and as such extremely difficult to use.
A lightweight form of dependent types is refined types, which are essentially dependent function types that have contracts on their parameters and return types. As refined types are way more limited than full dependent types (essentially linear arithmetic, logic, and simple stuff like this), you don't always need to write the proofs manually but can use an automated theorem prover such as Z3.
This is a hot research topic. There are attempts to do type inference for refined types - see [1] and [2]. The Liquid Haskell team is also exploring the issue of dependent types and laziness [3]. Below you mentioned that you want "soft casts" as well - check out [4], which attempts to do just that (implemented in language Sage [5]).
I'm very interested in this topic as well, and have experimented a bit [6]; as it turns out, it's not that hard to implement a basic type-checker for simple refined types. It's also quite powerful - using Z3, even non-linear problems such as rectangular array access (mentioned by a poster below) are solvable, since you can trivially prove safety using real arithmetic (which is decidable).
The only big disadvantage that I see is the difficulty of dealing with state, but then again, even programmers can't reason about it properly; I mean, if `x` is a mutable object, can you really say `if x.a != 0 then 1 / x.a` and be sure that no-one will modify `x.a` in the meantime from another thread? I hope that linear types could help here.
> A light-weight form of dependent types are refined types, which are essentially dependent function types that have contracts on their parameters and return types
I had never seen it written that way. What can full dependent types do that you can't do with refinement types?
I'm not sure, as I'm not really knowledgeable in programming with full dependent types (as in Agda, Idris, or Coq).
However, I can point you to this discussion on Reddit [1]; essentially, one user (kamatsu) is saying that refinement types are not dependent types:
> More or less, if you have sigma and/or pi types, then you have dependent types. If you don't, you don't.
Also, he claims that a crucial difference is that dependent types are proofs, while type-checking refined types requires proof search. Then, another user (neelk) points out an article [2] which claims (I haven't read it) that dependent types can be encoded using singleton types, which can be written using refined types.
"Whether your programming language "embraces" types or has a modern type system or not, these concepts are central to how programs execute, because they are a fundamental part of how computation works." - what are "modern" type systems?
Was something new invented in the last 15-20 years that I had missed?
The discussion of types here is all over the place. This is because they are handled so differently in various programming languages. Pascal included array bounds in the type annotation for arrays, and subscripting was enforced (at runtime) to be within bounds. In C (and C++) arrays are almost equivalent to pointers (the full answer is complicated, see [1]) and there is a real danger of buffer overflows. Python doesn't include type annotations, but the runtime enforces strong typing: lists carry their type information and will throw an exception if subjected to a non-list operation. The type system in Haskell is perhaps the best example of a really expressive and really effective system for ensuring that the compiler can catch as many errors as possible at compile time.
Haskell programmers sometimes say that if a program compiles it almost always works. This is interesting, but unfortunately, types and type annotations in use today don't fully capture the formal specification needed to guarantee program correctness.
Just writing down specification in preparation for a proof of correctness can be difficult. The specifications have to be expressed in a formal system, for example some form of mathematical logic like predicate calculus. Working with anything less than a formal specification is like trying to solve a mathematical set of equations without using math symbols. Natural language is just inadequate to the task. See [2].
Getting the specifications right isn't easy. I remember my first attempt at proving the correctness of a sorting program. I understood that I needed to ensure that A[i] <= A[i+1] for all elements in the array, but I forgot that the final result had to include all and only the original elements (i.e. that it had to be a strict permutation of the input elements). I don't believe that type systems are currently useful for these kinds of specifications and verifications of program correctness.
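As an illustration, the full two-part specification (sortedness plus permutation) is easy to state as a runtime check, even if proving it statically is hard. A small sketch, with illustrative names:

```rust
use std::collections::HashMap;

// Checks the full sorting specification from the text:
// (1) A[i] <= A[i+1] for all adjacent pairs, and
// (2) the output is a permutation of the input.
fn is_correct_sort(input: &[i32], output: &[i32]) -> bool {
    let sorted = output.windows(2).all(|w| w[0] <= w[1]);
    // Compare multisets of elements via occurrence counts.
    let counts = |xs: &[i32]| {
        let mut m = HashMap::new();
        for &x in xs {
            *m.entry(x).or_insert(0u32) += 1;
        }
        m
    };
    let permutation = counts(input) == counts(output);
    sorted && permutation
}

fn main() {
    // A buggy "sort" can satisfy sortedness while losing elements:
    assert!(!is_correct_sort(&[3, 1, 2], &[1, 1, 1]));
    // A correct result satisfies both halves of the specification:
    assert!(is_correct_sort(&[3, 1, 2], &[1, 2, 3]));
    println!("spec checks passed");
}
```

Forgetting clause (2) is exactly the mistake described above: `[1, 1, 1]` passes the sortedness check alone.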
Real programs interact with the outside world (e.g. GUIs are hard to describe mathematically). Real programs often have distributed or concurrent execution, for which new logics (like Temporal Logic) may have to be employed. Proving that a program will terminate, make progress, avoid deadlock or livelock, or meet some real-time constraints is exceptionally hard.
Many years ago (in the 1980's) I worked on this research area at University. I had the hope that eventually, programming would be more like an interactive exploration with a sophisticated program prover. I thought that the programmer and software-prover would work together to produce a correct program. Today, we are still programming in pretty much the same old ways as back then. The languages are much better, but it's a harder problem than I thought it was going to be.