
This is really interesting. Have you looked at Apache Beam? What I find interesting about Beam, in this specific context, is that it has a standalone runner (Java) that, like riko, lets you write pipelines without worrying about a complex setup. But if you later need to scale your computation, Beam is runner-independent: you can take the same code and run it at scale on a cluster, whether that's Spark, Flink, or Google Cloud. You can read more here [1].
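
To make "runner-independent" concrete, here is a minimal pure-Python sketch of the idea. It is not Beam's API: `PIPELINE`, `apply_steps`, `run_local`, and `run_pooled` are hypothetical names, and the thread pool only stands in for a real cluster runner.

```python
from concurrent.futures import ThreadPoolExecutor

# A "pipeline" here is just an ordered list of per-element functions.
# All names in this sketch are hypothetical; none of this is Beam's API.
PIPELINE = [str.strip, str.lower, len]


def apply_steps(item, steps):
    """Push one element through every step in order."""
    for step in steps:
        item = step(item)
    return item


def run_local(steps, data):
    """Standalone runner: process elements one by one in this process."""
    return [apply_steps(item, steps) for item in data]


def run_pooled(steps, data, workers=4):
    """Stand-in for a cluster runner: same steps, spread over a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with run_local.
        return list(pool.map(lambda item: apply_steps(item, steps), data))


if __name__ == "__main__":
    words = ["  Spark ", "Flink", " Beam  "]
    print(run_local(PIPELINE, words))
    print(run_pooled(PIPELINE, words))
```

The point is that the pipeline definition never mentions how it is executed, so swapping the runner does not touch the pipeline code.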

As for riko more specifically, Beam will soon have a Python SDK, but I'm unsure whether there will be a Python standalone runner. Maybe this is something to look into...

[1] https://www.oreilly.com/ideas/future-proof-and-scale-proof-y...



> This is really interesting. Have you looked at Apache Beam?

Just gave it a look. Took a while to find some examples with code, but once I did it made a bit more sense.

> Beam is runner-independent and you can take the same code and run it at scale on a cluster, whether it's Spark, Flink, or Google Cloud.

I thought that was pretty cool.

> As for riko more specifically, Beam will soon have a Python SDK, but I'm unsure whether there will be a Python standalone runner. Maybe this is something to look into...

A Python standalone runner would be very useful. Without one, I'm hesitant to go much further, since my goal is a pure-Python solution for working with streaming data. Most libraries require installing Java, and that is what I'd like to avoid.


The Python SDK is a work in progress. There's currently a branch: https://github.com/apache/incubator-beam/tree/python-sdk



