This is really interesting. Have you looked at Apache Beam? What I think is interesting about Beam, in this specific context, is that it has a standalone runner (Java) that, like riko, lets you write pipelines without worrying about a complex setup. But then, if you need to scale your computation, Beam is runner-independent and you can take the same code and run it at scale on a cluster, whether it's Spark, Flink, or Google Cloud. You can read more here [1].
As for riko more specifically, Beam will soon have a Python SDK, but I'm unsure if there will be a Python standalone runner. Maybe this is something to look into...
> This is really interesting. Have you looked at Apache Beam?
Just gave it a look. Took a while to find some examples with code, but once I did it made a bit more sense.
> Beam is runner-independent and you can take the same code and run it at scale on a cluster, whether it's Spark, Flink, or Google Cloud.
I thought that was pretty cool.
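For anyone skimming, here is roughly how I picture the "same code, different runner" idea, as a toy sketch in plain Python. To be clear, this is not Beam's API: `Pipeline`, `LocalRunner`, and `ThreadedRunner` are hypothetical names I made up, and the threaded runner only stands in for a real cluster backend.

```python
from concurrent.futures import ThreadPoolExecutor


class Pipeline:
    """Hypothetical toy pipeline: an ordered list of element-wise steps."""

    def __init__(self):
        self.steps = []

    def map(self, fn):
        self.steps.append(("map", fn))
        return self

    def filter(self, fn):
        self.steps.append(("filter", fn))
        return self


def apply_steps(steps, item):
    """Run one item through every step; return a list with 0 or 1 results."""
    for kind, fn in steps:
        if kind == "map":
            item = fn(item)
        elif not fn(item):  # a filter step rejected the item
            return []
    return [item]


class LocalRunner:
    """Runs the pipeline sequentially in the current process."""

    def run(self, pipeline, source):
        return [out for item in source for out in apply_steps(pipeline.steps, item)]


class ThreadedRunner:
    """Runs the same pipeline on a thread pool (stand-in for a cluster)."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, pipeline, source):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # pool.map keeps results in input order
            chunks = pool.map(lambda item: apply_steps(pipeline.steps, item), source)
            return [out for chunk in chunks for out in chunk]


# The pipeline is defined once, with no knowledge of how it will be executed.
pipeline = Pipeline().map(str.strip).filter(bool).map(str.upper)
lines = [" riko ", "", "beam  "]

local_result = LocalRunner().run(pipeline, lines)
threaded_result = ThreadedRunner().run(pipeline, lines)
```

The pipeline definition never mentions a runner, so swapping the execution backend does not touch the pipeline code. As I understand it, that is the property being described for Beam.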
> As for riko more specifically, Beam will soon have a Python SDK, but I'm unsure if there will be a Python standalone runner. Maybe this is something to look into...
A Python standalone runner would be very useful. Otherwise I'm hesitant to go much further, since my goal is to have a pure-Python solution for working with streaming data. Most libraries require installing Java, and that is what I'd like to avoid.
[1] https://www.oreilly.com/ideas/future-proof-and-scale-proof-y...