This is nitpicking, but it always bothers me: "mb/s" is an abbreviation for "millibits-per-second". You meant "MB/s", although this wasn't obvious; if you hadn't mentioned maxing out a GigE connection, it could have been interpreted as "Mb/s", which is short for "megabits-per-second".
Actually, that's not a very good argument. The answer to that is "to you, maybe". The better argument is this one: "Must worry about I/O occurring in all function calls. (They might call wait().) The user needs to make their functions coroutine safe!" I think this is the reason why coroutines are more popular in functional programming languages where side effects are limited by style or by enforcement of the language itself.
That is not a good argument either. If you are going to write your entire IO library in asynchronous style, like node.js does, then you could easily make all IO routines coroutine safe. In fact, you have exactly the same problem whether you use coroutines or not. If you have a nice asynchronous program and I call wait() in the middle, that's going to hurt you in the same way.
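For example, the same hazard shows up in plain callback-style node (a minimal sketch using node's http and synchronous fs APIs):
var http = require('http');
var fs = require('fs');
http.createServer(function(req, res) {
  // a synchronous "wait" in the middle of an async program: this blocks
  // the event loop and stalls every other connection, coroutines or not
  var data = fs.readFileSync('/tmp/big.file');
  res.end(data);
}).listen(8000);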
You don't really need to know how to use the full power of coroutines if you just want asynchronous operations. You just need to know that read() may block the current coroutine.
You could even be more explicit about the whole thing: use futures, and only allow blocking on them. The above example would go something like this if the reads can't be executed concurrently:
var future_x = read();
var x = future_x.wait();
var y = read().wait(); // shortcut
write(x + y);
Or something like this if the reads are independent:
var future_x = read();
var future_y = read();
// one of a handful of functions that can "block", all operating on futures:
waitForAll(future_x, future_y);
write(future_x.get() + future_y.get());
Using futures rather than implicit suspension has the added advantage of being able to pipeline independent reads just as you can with callback-style asynchronous I/O.
You can already implement[1] an approximation of this in terms of callbacks, but it doesn't look quite as nice, e.g.:
var handler = new AsyncHandler();
// independent, pipelined reads
readAsync(handler.cb());
readAsync(handler.cb());
handler.whenDone(function(x, y) {
  write(x + y);
});
It gets substantially uglier than that if the dependencies aren't so straightforward, e.g. A, B & C are independent, D depends on A & B having completed, and the last part of the code requires the results from C & D. Futures do much better in that sort of situation.
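To illustrate, here is roughly how that A/B/C/D case could look with the hypothetical future API sketched above (readA()/readB()/readC() and computeD() are made-up stand-ins):
var future_a = readA(); // A, B and C kick off independently
var future_b = readB();
var future_c = readC();
waitForAll(future_a, future_b); // D needs A and B
var future_d = computeD(future_a.get(), future_b.get());
waitForAll(future_c, future_d); // the final step needs C and D
write(future_c.get() + future_d.get());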
I was just throwing ideas out there, not really thinking it through. :) Although you're absolutely right about waitForAll() in that example, there is a point to that sort of function - say, if you wanted to add a timeout. Or you could have a waitForAny() function, useful if you only need one of the futures to finish before proceeding.
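A timeout could be sketched with the same hypothetical API (timerFuture() is made up here - just a future that completes after the given delay):
var future_x = read();
var timeout = timerFuture(5000);
waitForAny(future_x, timeout); // resumes as soon as either one finishes
if (future_x.isReady()) {
  write(future_x.get());
} else {
  // timed out before the read completed; give up or retry
}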
get() vs. wait() is admittedly a question of preference. Personally, I'd keep them separate, and even go so far as to have it warn you if you called get() on a future without a prior wait() or a successful isReady(). It keeps the suspensions explicit and the intentions clear. It's a bit like explicit vs. implicit transactional systems.
Looks like it's essentially the same type of programming model, although I have a feeling WaitHandle.WaitAll() just blocks the current thread. In the ideal case, it would internally call the event loop coroutine and process other events instead of sending the thread to sleep. Thread scheduling involves system (kernel) calls; coroutines are purely userspace, just like node.js's async callback mechanism.
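That ideal case might look something like this sketch, where waitForAll() pumps the event loop in userspace instead of sleeping (runEventLoopOnce() is an assumed primitive, not a real node API):
function waitForAll() {
  var futures = Array.prototype.slice.call(arguments);
  function allReady() {
    for (var i = 0; i < futures.length; i++) {
      if (!futures[i].isReady()) return false;
    }
    return true;
  }
  while (!allReady()) {
    runEventLoopOnce(); // process other pending events; no kernel-level sleep
  }
}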
Yes, other languages might be more suitable for coroutines.
But then again, Erlang is probably a language more "suitable" for high concurrency programming, but node's goal is to make writing scalable network programs possible for everyone. One might call it the PHP of concurrency : )
The statements made about co-routines/cooperative threading and the stated reasoning for removing them make me wonder whether the decision to remove them was based on benchmarking and experimentation or just on taste. In my experience they simplify many forms of asynchronous logic tremendously and can have performance benefits as well, as long as you're not forced to use them for everything.
I suppose someone who wants them in node.js can just reimplement them as an extension.
> Basically callbacks and coroutines don't play well together.
Can you elaborate on this? I thought the opposite: coroutines play very well with callbacks. Callback libraries force you to write programs in continuation-passing style. Coroutines let you write asynchronous programs in normal style. Let's assume we have an asynchronous readAsync(callback) function that reads a number and calls the callback with the input when it's ready. Now you're writing your program like this:
var x = read(); // may block this coroutine, but *not* the entire program
var y = read();
write(x + y);
This can be achieved with something like:
function read() {
  var coro = getCurrentCoroutine();
  var result;
  var callback = function(x) { result = x; coro.resume(); };
  readAsync(callback);
  yield; // suspends the current coroutine until it is resumed
  return result;
}
It seems to me that this is a general way to adapt any callback-style library to coroutines.
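For instance, the pattern generalizes into a single adapter (a sketch only - getCurrentCoroutine() and the bare yield are the same assumed coroutine primitives as above):
function awaitCallback(asyncFn) {
  var coro = getCurrentCoroutine();
  var result;
  asyncFn(function(x) { result = x; coro.resume(); });
  yield; // suspend until the callback fires
  return result;
}
// usage: var x = awaitCallback(readAsync); var y = awaitCallback(readAsync); write(x + y);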
If you implement coroutines as a state machine, they integrate just fine with callback-oriented APIs. But perhaps that was the problem, since the PDF mentions stack swapping.
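To make that concrete, here is the read/read/write example from above compiled by hand into a state machine, with no stack swapping required (a sketch reusing the assumed readAsync() from earlier):
function readAddWrite() {
  var state = 0, x, y;
  function step(value) {
    if (state === 0) { state = 1; readAsync(step); } // kick off the first read
    else if (state === 1) { x = value; state = 2; readAsync(step); } // got x, start the second read
    else { y = value; write(x + y); } // got y, finish
  }
  step();
}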
I think the reason multipart works that way is so you can stream data that you don't know the full length of beforehand.
But afaik, that's pretty much never the case with file uploads, unless you are uploading a file that is still growing in size - so yeah, it's annoying : ).
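For reference, the wire format shows why streaming works: each part ends at the next boundary line rather than at a declared per-part length:
--AaB03x
Content-Disposition: form-data; name="file"; filename="upload.txt"
Content-Type: text/plain

...file bytes, length unknown until the next boundary line...
--AaB03x--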
('felixge2' because my other account is in noprocrast mode: )
Yeah, you could -- I wonder what would happen if the client gets it wrong or is deliberately dishonest? Not trusting the client is a big part of writing an open server and this seems like you would have to trust the client in a big way.
Well, it's not a big problem - you should have a timeout on incoming connections, and node is pretty well-suited for having lots of "hanging" connections (I ran some tests with 56k active connections).
If a connection is closed, either by a timeout or by EOF, you simply check whether the promised content length matches the count of received bytes - if not, you should probably discard the whole thing (unless you're specifically supporting broken clients).
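In node-ish terms the check could look something like this sketch (the 'data'/'end' event names follow node's stream convention; discardUpload() is hypothetical):
var received = 0;
var expected = parseInt(request.headers['content-length'], 10);
request.on('data', function(chunk) { received += chunk.length; });
request.on('end', function() {
  if (received !== expected) {
    discardUpload(); // the promised length wasn't met: drop the partial data
  }
});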
When parsing boundaries, you know the first character is going to be a hyphen (-) and the last character is going to be a newline. Wouldn't it be easier to search for hyphens, then read until you see a newline, and then compare to the boundary? Boundary characters are typically random printable characters, so you might be doing more work than you need to.
The rub is "search for hyphens": you have to look at every character until you find a hyphen; the point of Boyer-Moore is that you don't even have to look at most of the characters at all.
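A rough sketch of the idea (the Boyer-Moore-Horspool bad-character rule, operating on strings for simplicity): compare each alignment from its last position backwards, and on a mismatch skip ahead by up to the whole pattern length based on the single character you inspected.
function indexOfBoundary(haystack, boundary) {
  var m = boundary.length;
  var skip = {};
  for (var i = 0; i < m - 1; i++) {
    skip[boundary.charAt(i)] = m - 1 - i; // smaller shifts for characters inside the pattern
  }
  var pos = 0;
  while (pos + m <= haystack.length) {
    var j = m - 1;
    while (j >= 0 && haystack.charAt(pos + j) === boundary.charAt(j)) j--;
    if (j < 0) return pos; // full match
    pos += skip[haystack.charAt(pos + m - 1)] || m; // unseen characters skip the entire boundary length
  }
  return -1;
}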
If I can get a revenue stream of approximately $1000 a month, my entire website will literally be a call to a few different web services and some fancy-shmancy CSS files. God I love hacker news! Ok fine, there will be some of my code in there, hopefully not for long, because soon there will be a tool to solve any other problem as well :P As long as these don't get too pricey.
</pedantry>