This method is really inefficient. There are b! / (b - n)! ways to choose n of b unique numbers, whereas this solution is O(b^n).
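For concreteness, here's a quick comparison of the two counts for this puzzle (8 distinct letters drawn from 10 digits); `permutation_count` is just an illustrative helper name:

```python
import math

def permutation_count(b, n):
    # Ordered assignments of n distinct values from b candidates: b! / (b - n)!
    return math.factorial(b) // math.factorial(b - n)

# SEND + MORE = MONEY uses 8 distinct letters over 10 digits.
print(permutation_count(10, 8))  # 1814400 assignments with uniqueness
print(10 ** 8)                   # 100000000 without it
```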
That being said, it's still fast enough for this example.
Also, Python solution along the same lines:
import itertools

def valueOf(*values, base=10):
    v = 0
    for v in itertools.accumulate(values, lambda a, b: a * base + b):
        pass
    return v

def solve():
    for s, e, n, d, m, o, r, y in itertools.permutations(range(10), r=8):
        if valueOf(s, e, n, d) + valueOf(m, o, r, e) == valueOf(m, o, n, e, y):
            yield (s, e, n, d), (m, o, r, e), (m, o, n, e, y)
(If anyone knows of a better way to do valueOf, let me know)
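One simpler option would be a single fold with functools.reduce (this is Horner's rule: start from 0 and fold each digit into the running total):

```python
from functools import reduce

def valueOf(*values, base=10):
    # Horner's rule as a fold; the empty fold gives 0.
    return reduce(lambda a, d: a * base + d, values, 0)

print(valueOf(9, 5, 6, 7))  # 9567
print(valueOf())            # 0
```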
That's what I've noticed with Haskellers. They have a tendency to write really slow code and don't have a clear idea of the execution model behind what they write down.
A while ago I asked in #haskell for translations into Haskell of the following C++ code:
This should be easy. The algorithm is right in front of you. There should not be a lot of thinking in transcribing an algorithm into Haskell. But most of them wrote down something with a fold which turned what is an O(m*n) algorithm in time and O(n) in space into something like O(m^n) in space and time. The only one who got close to the C++ implementation was one who used a lot of State monads.
Then, of course, there was someone who pointed me to a radically different algorithm written in Haskell that I think is only O(n+m) in both space and time. I haven't been able to understand this other algorithm yet.
The responses you got on IRC don't necessarily represent Haskell best practice (or even good practice). Most people probably have better things to do than translate random toy programs from C++.
Certain highly stateful algorithms may be difficult to translate into Haskell, in the same way that certain highly structural or lazy algorithms may be difficult to translate to C++.
I'm curious if your time complexity analyses are accurate, given that you (admittedly) did not understand the faster algorithm. You could be getting tripped up by laziness.
In general, I've found that most production Haskell code is quite fast, in both big-O and absolute terms.
There is a set of simple patterns/recipes most people use in Haskell, corresponding to the kind of mathematical intuition Haskell seems to support. It doesn't have anything to do with efficiency, though; for that you need to be an expert, and even then you won't be 100% sure the compiler really optimizes the execution the way you intended.
Performance regressions in GHC are taken seriously as far as I've seen. It's not that big of a deal since you should test/profile your program on any compiler upgrade.
If you see any regressions, just post it to the ghc mailing list and it will probably be solved for you since it's much needed regression feedback/data.
That's good to know. Though I will note that I don't use Haskell. (Haven't had the time to properly learn it, and keep getting frustrated by minor things when I try to dabble with it.)
That being said, I used to think that about Java as well - and then the whole substring thing happened.
Yeah, it's got a completely different algorithm along the diagonal of the matrix. That's the O(m+n) algorithm. I still haven't been able to understand it.
But that's not the point. The point is that Haskellers have a hard time translating an algorithm that's right in front of them and instead of writing down an O(m*n) algorithm they instinctively do O(m^n). This whole "many-worlds" approach of the original article is also leading to a very inefficient algorithm.
The abstractions Haskellers prefer lead them towards slow code.
Haskellers have a hard time translating an algorithm that's right in front of them
That's no surprise: it is an open problem if this is possible in general. Some time ago Pippenger [1] showed that some programs cannot be transliterated into pure strict functional languages without suffering a slowdown.
Later, Bird et al. [2] showed that the example given by Pippenger to demonstrate the slowdown really depends on strictness, and can be transliterated into a pure lazy functional language without asymptotic slowdown.
As of May 2015, nobody knows if a lazy functional language can solve all problems with the same asymptotic time complexity as stateful languages.
[1] N. Pippenger, Pure Versus Impure Lisp.
[2] R. Bird, G. Jones, O. De Moor, More haste, less speed: lazy versus eager evaluation.
It's an open problem whether it can be translated into a pure, lazy language without an asymptotic slowdown. It's not at all an open problem as to whether it can be implemented in Haskell. The algorithm is mutating vectors - the direct translation is to use ST, and will have the same asymptotic behavior as the C++ version.
Why is that surprising? Haskell (as typically implemented; I'm slightly less confident about the details of the report) is lazy and pure by default, but that can be violated locally or more globally if you ask for it loud enough. So the language, as a whole, is not pure in the sense required for those theoretical questions to really apply.
What's needed here is ST. It permits local O(1) mutable vectors, while enforcing referential transparency when viewed from outside the runST.
ST is clearly not pure in the sense that theory requires to make "can we implement this algorithm with the same complexity" an interesting question. It seems wrong to consider it part of "the unsafe stuff", however - it doesn't have the same kind of unenforced caveats as unsafePerformIO and unsafeCoerce and similar.
Separately, "the unsafe stuff" is still part of the language as used, and while it's certainly better practice to avoid it when you can, it's silly to exclude it when asking whether it's possible to do something - they are sometimes appropriate.
Are you suggesting to define purity by the inability to define O(1) mutable vector-like operations? That would be an interesting proposal. But I wonder if it captures the conceptual content of the informal concept of purity. Maybe a time complexity restriction should be provable from purity?
"Are you suggesting to define purity by the inability to define O(1) mutable vector-like operations?"
"Define", no, as that certainly doesn't encompass the full scope of purity. I think mutability runs counter to most people's notion of purity, particularly when they say "Haskell isn't pure". Of course, we can fake mutability at a cost of O(log n), which is what makes the O(1) notable.
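As a sketch of how that O(log n) "fake mutability" works, here is a minimal persistent "array" built as a complete binary tree with path copying: an update copies only the nodes on one root-to-leaf path, so both the old and new versions remain usable. (Helper names are illustrative; real persistent vectors use wider branching tries.)

```python
def build(values):
    # Build a complete binary tree over a non-empty list; leaves are values.
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return (build(values[:mid]), build(values[mid:]))

def get(tree, size, i):
    if size == 1:
        return tree
    half = size // 2
    left, right = tree
    return get(left, half, i) if i < half else get(right, size - half, i - half)

def update(tree, size, i, x):
    # Returns a new tree; shares all untouched subtrees with the old one.
    if size == 1:
        return x
    half = size // 2
    left, right = tree
    if i < half:
        return (update(left, half, i, x), right)
    return (left, update(right, size - half, i - half, x))

t1 = build(list(range(8)))
t2 = update(t1, 8, 3, 99)
print(get(t2, 8, 3), get(t1, 8, 3))  # 99 3 -- the old version is untouched
```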
If you do think ST is "pure" then it's not an open question whether you can implement update-heavy array based algorithms in a pure language without algorithmic penalty - you clearly can.
Looking over the PLDI paper, I get the impression that the key difference between ST and the usual State monad is (1) the ability to name state, and (2) the use of parametric polymorphism to prevent scope-extrusion of state.
Of course ML has scope-extrusion of state, and one of my questions/examples upthread was about the ability of Haskell's monadically controlled effects to implement such scope extrusion without 'tricks'.
All in all I feel that the notion of purity is controversial and insufficiently theorised, and that the community would benefit from a clearer understanding of the trade-offs involved in the different notions of purity.
If you object to notions of purity, why did you bring it up in the first place?
The first place it was used in this thread was your citing a paper entitled "Pure versus Impure Lisp".
In that paper, it was used to mean prohibiting "set-car!" and "set-cdr!". ST provides the equivalent of "set-car!" and "set-cdr!", so in the presence of ST the result you were citing in the paper doesn't even apply to a strict version of Haskell.
Please just stop making claims about Haskell until you have some sense of what you're talking about (or at least add a caveat warning others that you might not).
I don't object to purity, I want to know what it means. I didn't realise that this is not a well-understood concept. (I mentioned Sabry's definition of purity, which I found surprising.)
The original discussion started because somebody complained that Haskell programmers often appear unable to transliterate imperative algorithms. I assume that this complaint was about transliteration into pure Haskell (whatever that may mean exactly). Clearly transliteration into full Haskell with actual mutability is possible. I mentioned Pippenger's article to bring to the debate the idea that the question whether such transliteration is always possible is an open research problem.
"I assume that this complaint was about transliteration into pure Haskell (whatever that may mean exactly). Clearly transliteration into full Haskell with actual mutability is possible."
Ah, if you'd made that assumption explicit I would not have objected.
I don't see anywhere the querent suggesting that his translation demand invoked any notion of purity. If it did, then certainly he was asking a harder question than he presented, although even in that case a naive translation could have trivially produced an O(m n lg(n)) solution and O(m^n) should quite reasonably be considered failure.
I think the issue in the main is that relatively novice Haskell developers are likely to be the most eager to try and translate arbitrary code on demand in #haskell.
I don't think anyone knows a solid definition of purity. It certainly refers to at least two things (1. independence of order of evaluation, 2. no side effects) and informally speaking we often "know" what it is in a particular context, but writing down a formal definition is very hard. I'm certainly not satisfied with Sabry's definition. I don't like the idea that you have to have both a strict and a non-strict evaluation for your language.
If nobody knows what purity means, then what are all those tortured discussions on whether Haskell is pure or not about? ...
Anyway, I'm a bit unhappy about the requirement of full evaluation order independence, because even the purest of pure languages (e.g. the untyped lambda-calculus or PCF) don't have full evaluation order independence. I guess you mean evaluation order independence modulo termination?
My intuition is that pure means that all language expressions M can always be evaluated locally by just looking at the meaning of all subterms of M. In terms of Hoare logic (for total correctness), that would mean that the pre-condition speaks only about the termination of the free variables of M (sorry for being so terse; I think this can be made formal).
Usually this is the way words work - we notice a regularity in the environment we wish to refer to; we build a sense of it, usually motivated by some examples and counterexamples; we give it a handle; and we then try to find a definition that matches our intuition. There certainly seem to be regularities that "pure" is intended to point at in this context, but it definitely seems that we've not settled on a definition, and we may not even agree on examples.
The notions of purity I was pointing at was any that entailed the restrictions in the paper you'd linked. There are restrictions of Haskell that are pure in those senses, but Haskell taken as a whole is not, even without invoking anything unsafe.
"I get the impression that the key difference between ST and the usual State monad is (1) the ability to name state, and (2) the use of parametric polymorphism to prevent scope-extrusion of state."
The key difference between ST and the State monad is that the State monad is simply sugar for a function call of the form (s -> (a, s)), whereas ST allows for actual mutability. This is what changes the asymptotic complexity.
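A rough Python analogy of the asymptotic gap being described (this is not Haskell's actual machinery; the function names are illustrative): a State-style computation threads a value of type s -> (a, s), so each "write" must produce a fresh state, whereas ST performs real mutation.

```python
def pure_write(state, i, x):
    # State-monad flavour: building the next state copies the whole
    # list, making each write O(n) for a list-shaped state.
    new_state = list(state)
    new_state[i] = x
    return None, new_state  # (result, new state), like s -> (a, s)

def st_write(state, i, x):
    # ST flavour: genuine in-place mutation, O(1) per write.
    state[i] = x

s0 = [0] * 4
_, s1 = pure_write(s0, 2, 7)
print(s0, s1)  # [0, 0, 0, 0] [0, 0, 7, 0] -- the old state survives

s = [0] * 4
st_write(s, 2, 7)
print(s)       # [0, 0, 7, 0] -- the old state is gone
```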
ST also does things to prevent references from leaking, and that's valuable and more useful in the context of ST, but that's irrelevant to our questions up-thread.
"Of course ML has scope-extrusion of state, and one of my questions/examples upthread was about the ability of Haskell's monadically controlled effects to implement such scope extrusion without 'tricks'."
That wasn't this thread.
I still think scope extrusion - without asking for it - would be a bug not a feature; particularly so in the case of a multithreaded, lazily evaluated language with inlining and CSE - where you really have little control over what gets executed when if you're ducking the features that guide what gets executed when.
This doesn't limit expressibility, because you do have the ability to ask for it, using those "tricks" that you keep excluding for no good reason. "How expressive would Haskell be if it lacked those?" is an interesting question, but not an immediately practical one, and you should be clear that that is the question you're asking.
Lazy is easy: it's call-by-need, i.e. call-by-name evaluation ((lambda x.M)N -> M[N/x]) with caching, so each argument is evaluated at most once.
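A minimal sketch of that caching behaviour, modelled with an explicit thunk (illustrative only, not how GHC actually represents thunks): the suspended computation runs at most once, and later demands reuse the cached result.

```python
class Thunk:
    def __init__(self, compute):
        self.compute = compute
        self.evaluated = False
        self.value = None

    def force(self):
        if not self.evaluated:
            self.value = self.compute()  # evaluate on first demand...
            self.evaluated = True
            self.compute = None          # ...then drop the suspended work
        return self.value

calls = []
t = Thunk(lambda: calls.append("eval") or 42)
print(t.force(), t.force())  # 42 42
print(calls)                 # ['eval'] -- the body ran only once
```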
Purity is more subtle. I don't think there's an agreed-upon definition. [1] essentially ties purity to the equivalence of call-by-name, call-by-need and call-by-value evaluation orders (ignoring divergence and errors). I'm not totally sure that is the right definition.
[1] A. Sabry, What is a Purely Functional Language?
The most proximate question here, with respect to purity, is "can you do O(1) mutation (outside of the narrow case of replacing a thunk with its result)?" The theory is far more interesting when we say "no", but the answer in the case of Haskell is "yes".
It seems a bit of a broad generalisation to say "Haskellers have a hard time translating an algorithm" based on who happened to be on the IRC channel at the time and decided to answer a challenge from a random C++ programmer who turned up.
Anyway, Haskell is perfectly decent at implementing your algorithm, and imperative algorithms in general. Here's a more-or-less direct translation:
http://lpaste.net/132519
There's extra noise related to wrapping and unwrapping mutable cells which I could tidy up if I had the time, otherwise it's essentially identical.
In any case the author of the post in question, Bartosz Milewski, is an expert in C++ and not particularly an expert in Haskell, so your point seems misdirected.
And of course the root of this conversation was mistaken in the first place. The algorithm Bartosz is presenting is the b!/(b-n)! algorithm that just examines each permutation once.
> It seems a bit of a broad generalisation to say "Haskellers have a hard time translating an algorithm" based on who happened to be on the IRC channel at the time and decided to answer a challenge from a random C++ programmer who turned up.
I'm not a "C++ programmer". I'm just a programmer. I happened to have the algorithm written in C++, but I could quite easily have written it in pseudocode.
I got about 6 responses in 20 minutes. Four of those were O(m^n). Yes, I think this is quite indicative that Haskellers have a hard time writing performant code.
Note that the later parts of this series have been published, and in fact the definition of StateL does enforce uniqueness as expected, in a way that makes the proposed solution O(n!).
Satisfying arbitrary constraints is often exponential (http://en.wikipedia.org/wiki/Boolean_satisfiability_problem). I'd certainly be interested in seeing more efficient implementations, but the approach taken by the example Haskell code (and duplicated in your Python) is substantially more efficient than the example non-Haskell code provided in the article.
It certainly doesn't mean we have to punt the easy cases, but it seems inappropriate to lambaste the current solution without establishing that we are in fact looking at an easy case.
Given that no one seems to have produced (or pointed at) an asymptotically better solution in any language, I think that's some small evidence that there isn't such a solution, and substantially more evidence that it's not easy to find.
There's an easy better method: assign variables starting from the lower order digits, greedily check correctness, and skip over all permutations with that prefix if it fails.
Although I don't know how one would go about proving or disproving if this is asymptotically better or just a (substantial) constant-factor speedup.
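For what it's worth, here's a sketch of the described approach for SEND + MORE = MONEY specifically: assign letters column by column from the least significant digit, abandon any partial assignment whose column sum is already wrong, and (as is conventional for these puzzles) forbid leading zeros. All names here are illustrative.

```python
def solve_columns(a="SEND", b="MORE", c="MONEY"):
    ar, br, cr = a[::-1], b[::-1], c[::-1]
    leading = {a[0], b[0], c[0]}

    def rec(col, carry, assign, used):
        if col == len(cr):
            return assign if carry == 0 else None
        # Letters in this column that still need a digit.
        new = [w[col] for w in (ar, br) if col < len(w) and w[col] not in assign]
        if cr[col] not in assign:
            new.append(cr[col])
        new = list(dict.fromkeys(new))  # dedupe, preserving order

        def assign_new(i, assign, used):
            if i == len(new):
                s = carry
                s += assign[ar[col]] if col < len(ar) else 0
                s += assign[br[col]] if col < len(br) else 0
                if s % 10 != assign[cr[col]]:
                    return None  # prune: no extension of this prefix works
                return rec(col + 1, s // 10, assign, used)
            letter = new[i]
            for d in range(10):
                if d in used or (d == 0 and letter in leading):
                    continue
                r = assign_new(i + 1, {**assign, letter: d}, used | {d})
                if r is not None:
                    return r
            return None

        return assign_new(0, assign, used)

    return rec(0, 0, {}, set())

sol = solve_columns()
print(sorted(sol.items()))
```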
Oh! I had misread your original comment to be, "There are b! / (b - n)! ways to choose n of b unique numbers, [so] this solution is O(b^n)." You were contrasting! I believe that b!/(b-n)! is asymptotically the same class as b^n (whether we're varying b or n), hence my confusion.
I had initially skimmed the article, and assumed he was already handling uniqueness. With the correct definition of StateL, the solution presented is equivalent to what you describe here, but apparently that definition is deferred to a future article.
The idea would be that, for StateL, for_each would pick one value and pass that value to the lambda, interpreting the result in a context where the sel only sees the remaining values in the list.
(I used Stirling's Approximation, namely the form ln(x!) ~= x ln x - x for large x)
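A quick numerical check of that form of Stirling's approximation, using math.lgamma(x + 1) for an exact ln(x!): the relative error shrinks as x grows (the absolute error only grows like O(log x)).

```python
import math

for x in (10, 100, 1000):
    exact = math.lgamma(x + 1)      # ln(x!)
    approx = x * math.log(x) - x    # Stirling: x ln x - x
    print(x, round(exact, 2), round(approx, 2),
          round(abs(exact - approx) / exact, 5))
```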
However, note that (1+(C-1)log((C-1)/C)) / C is greater than zero for all C > 1, as 1+(C-1)log((C-1)/C) > 0 for all C > 1. As such, this diverges for all C > 1.
That is a) hauling in an awfully big external library for something that should be relatively trivial, b) inefficient (you're replacing a single multiplication by 10 and addition per digit with O(log x) multiplications and an addition per digit, assuming np.power uses a decent exponentiation algorithm), and c) if anything, less readable than my version.
To put it simply, I don't see an improvement with that version.
And Python 3 has some improvements w.r.t. wildcard arguments.
This post should be read in the context of the related posts in Milewski's highly informative blog. The goal is not to show a solution, it is to demonstrate some aspects of monads, and in this particular case, that the concept is not confined to Haskell or functional languages.
I assume/hope that the big idea for part 2 is that this setup allows for efficiently pruning impossible sub solutions and will result in an optimal algorithm.
Why would you write an algorithm in C++ if the runtime will be bad?
He even says explicitly "(never mind how we deal with uniqueness)" in the solution section and has already acknowledged that there are only around 2 million combinations to check.
ETA since he mentions the implementation is a translation of this Haskell implementation: http://blog.jle.im/entry/unique-sample-drawing-searches-with... presumably the uniqueness is done by updating the list of remaining possibilities for selection each time a new value is selected.
It's straight from the definition of an integer in positional notation: an integer is sum(i=0 to n) d_i * base^i. A number with zero digits is just the empty sum, which is canonically defined to be zero.
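A small sketch showing the sum-of-powers definition and a Horner-style fold (as in valueOf earlier in the thread) agree, including the zero-digit case:

```python
from functools import reduce

def from_powers(digits, base=10):
    # Straight from the definition: sum of d_i * base^i, least
    # significant digit at index 0, so reverse the written order.
    return sum(d * base ** i for i, d in enumerate(reversed(digits)))

def horner(digits, base=10):
    return reduce(lambda a, d: a * base + d, digits, 0)

print(from_powers([1, 0, 6, 5, 2]), horner([1, 0, 6, 5, 2]))  # 10652 10652
print(horner([]))  # 0 -- zero digits is the empty sum
```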
Do you get bonus points for knowing that these are Alphametics? And covered in ridiculous detail in volume 4A of The Art of Computer Programming? I'm curious how the solutions there stack up to what is in mind here.
"Once you internalize the many-worlds approach to programming, the implementation is pretty straightforward." Are you serious? This is why people don't take Haskell seriously.
It is disingenuous in the extreme to ask, "So what’s all this talk about mutation and state? Well, who said you can’t have state in functional programming?" when the very first line of the very first hit on "functional programming mutation state" is the Wikipedia article on functional programming, which says:
"In computer science, functional programming is a programming paradigm—a style of building the structure and elements of computer programs—that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data." -- https://en.wikipedia.org/wiki/Functional_programming
So the answer to the question "Who said you can't have state in functional programming?" is: "Every introduction to functional programming ever."
This is why people hate Haskell and make fun of Haskellers. In their quest to appear more clever than the rest of us, they engage in utterly disingenuous rhetoric. Which is too bad, because I'm pretty sure they are actually more clever than the rest of us--certainly more clever than me--and don't need this nonsense.
Haskell is an amazingly cool language, and I've talked to people at meetups who are using it for interesting things, and learning a bit of it myself has been an interesting and educational challenge. But getting around to learning it took five years longer than it should have because I was so put off by the rhetorical nonsense the language community engages in.