I actually just implemented something very similar to this. You probably want to check out Dashing [1], which has built-in plugins for a lot of things already. It's pretty easy to work with as well.
D3 is pretty ok with large datasets but I understand your point.
What Shiny does to get around this is natively 'evaluate' the plots on the backend, creating a rasterized PNG file. A similar approach could work for Pyxley (using matplotlib or Seaborn to render the plot, and then sending that image file to the front end), but I fear that, with so much development time already spent on d3 support, such an approach would not be natively implemented.
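Something along these lines, as a rough sketch (plain Flask and matplotlib, nothing Pyxley-specific; the route and figure are made up):

```python
# Rough sketch of server-side rasterization: render the plot on the backend
# and ship a PNG to the browser instead of sending raw data to d3.
import io
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
from flask import Flask, send_file

app = Flask(__name__)

@app.route("/plot.png")
def plot_png():
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3, 4], [10, 20, 15, 30])
    buf = io.BytesIO()
    fig.savefig(buf, format="png")  # rasterize server-side
    plt.close(fig)
    buf.seek(0)
    return send_file(buf, mimetype="image/png")
```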
>What Shiny does to get around this is natively 'evaluate' the plots on the backend, creating a rasterized PNG file
I don't think that's the case for D3 charts, because that would kill the interactivity that is so great about D3. I use rCharts to inject D3 into my Shiny applications, and have run into performance issues with just a couple hundred data points. I think this is because all the heavy lifting is done by the client (browser), not the server.
Unless you have a screen with 50,000 pixels, you would need to downsample the data anyway. Wouldn't you normally do that instead of handing the visualization library the full data set?
You actually do have a screen with 50k pixels. Many more than that.
A continuous heat map over two dimensions, at 500x500 pixels, is 250,000 pixels. Of course, to generate the heat map, you have to aggregate (binning on two dimensions), but you don't have to downsample.
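To make it concrete, here's roughly what I mean (numbers made up):

```python
# Aggregate, don't downsample: bin a million points onto a 500x500 grid and
# send only the 250,000 bin counts to the front end.
import numpy as np

x = np.random.randn(1000000)
y = np.random.randn(1000000)

counts, xedges, yedges = np.histogram2d(x, y, bins=500)
print(counts.shape)  # (500, 500) -- one aggregated value per heat-map cell
```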
How easy is it to integrate a chart or graph into a larger project? My biggest gripe with Shiny is how difficult it is to use R's calculation and graphing functions in a larger project without using OpenCPU as an API.
My guess is that, with Python being a more general-purpose language, this should be easier.
It shouldn't be too bad. My goal was just to make it a little simpler to do the basic things like formatting the data as json for a particular charting library. The python helpers just set up the APIs. I made an example called custom_react that shows how I mixed some custom code with some pyxley stuff.
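The kind of helper I mean looks roughly like this (the output format is illustrative, not the exact schema any particular charting library expects):

```python
# Reshape a pandas DataFrame into a simple JSON payload for a JS line chart.
# The {"x": ..., "y": ...} record format here is just an example.
import json
import pandas as pd

def to_line_chart_json(df, x_col, y_col):
    # .tolist() converts numpy scalars to plain Python types for json.dumps
    records = [{"x": x, "y": y}
               for x, y in zip(df[x_col].tolist(), df[y_col].tolist())]
    return json.dumps({"data": records})

df = pd.DataFrame({"day": [1, 2, 3], "visits": [120, 95, 143]})
print(to_line_chart_json(df, "day", "visits"))
```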
I've been working with it and it's not bad - I won't tear my hair out if it's what I use in the end. But a couple of pain points for me are the need to rewrite my code to grab 'session variables', needing to understand how images are returned (especially if you need to return more than one), and so on - versus Shiny, which allows me to more or less drop in the code I've already been working with. Oh, and parallel requests are just not possible.
Of course, most of these issues are understandable and possibly by design, especially since OpenCPU was designed for embedded systems. That's fine - it just makes creating web apps or dashboards around R a more complicated use case.
How important is Flask in this mix? How difficult would it be to back this with a Django app, for example? I'm asking because the charting would be REALLY useful for me right now, and I already have the Django models...
The Flask dependency is pretty strong because I used Flask's request object in the API route functions. It should be possible to override it and use Django instead, but I don't know Django that well. It's probably a good idea to separate that dependency so it's a little more flexible.
Yup, that one. I planned on simply filtering a pandas dataframe using the request.args that are passed in. The javascript components can be used directly, but I wrote the python wrappers as a convenience for really simple dataframes.
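Roughly like this (a toy DataFrame and route, not Pyxley's actual code):

```python
# Filter a DataFrame by whatever query parameters arrive on the request,
# e.g. GET /api/data?state=NY, and return the matching rows as JSON.
import pandas as pd
from flask import Flask, request, Response

app = Flask(__name__)
df = pd.DataFrame({"state": ["NY", "CA", "NY"], "value": [1, 2, 3]})

@app.route("/api/data")
def data():
    subset = df
    for col, val in request.args.items():
        if col in subset.columns:
            subset = subset[subset[col].astype(str) == val]
    return Response(subset.to_json(orient="records"),
                    mimetype="application/json")
```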
If you already have a working web app, why not just use something like flot [1] for your charting (or d3 if you are up for it)? It seems like Pyxley is more targeted at people who have some data in pandas and want to get some visualizations on the web quickly.
Wow, this is really cool. Playing around w/ the US state maps in the examples right now. I love stuff like this, partially because of the hands-on introductions to component technologies I had heard of but not used before.
Hmm, I have a Python script that monitors a few datapoints, but I don't know how to appropriately save them and plot the data on a graph/webpage. I might give this a shot.
This seems nice, but where did you define the datasource(s) for your examples? Did I miss something? I'm already using Flask, so I'd be interested in using this.
looks great. but i can't help but think it would save everyone a lot of time -- maybe not up-front, but in the long-run -- if we wrote these frameworks in c and just wrote language bindings for r, python, ruby, etc. why are we rebuilding good frameworks over again just because they're not written in our preferred language?
There is a big divide between interpreted and compiled resources though.
If you have the original source in an interpreted language, it supports your ARM Raspberry Pi or Chromebook just as well. With the C variation you need to build / package / install.
I've kinda just come to accept that cross-language code re-use has never been good except with the Unix pipe.
It's quite a bit of work to make sure that a project is available on all potentially relevant distros and always up-to-date, so that only works for popular projects, and even for those there are often alternative package sources to get current versions on older releases etc.
The problem there is updates: you'll get bugfixes, maybe even backported bugfixes, but in order to get a new version with new features you either have to deal with installing/building from a non-OS-package-manager source, or upgrade your entire OS.
Why is having R a limitation? R is a fantastic data language which arguably beats all others in terms of number of libraries and data manipulation tools.
If GP comes from a more software-dev background, I can understand a general dislike for R. There are lots of things where R can't be beat right now, but for simple data analysis and visualization, there are some damn good Python packages like pandas and d3py that GP may find easier to use than dataframes/tables and rCharts in R.
R has no native Bayesian library like PyMC3 (you must use Stan, which is C++). Also, Python is better for ad hoc and agent-based modeling, and for out-of-core data with Blaze and Dask.
Unless you can help here, I see a lot of pre-coded models and older samplers, but nothing with a flexible JIT for user-extensible variables and autodiff for the newer HMC- and NUTS-type samplers. The exception is Stan, but that is its own C++ modeling language, can't talk to R functions, is more verbose than PyMC3, and doesn't do discrete variables (unlike PyMC3).
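For reference, this is the sort of model specification PyMC3 allows: the model is plain Python, gradients come from autodiff, and NUTS is picked automatically (toy data, just to show the API):

```python
# Minimal PyMC3 sketch: estimate the mean and spread of some noisy data.
import numpy as np
import pymc3 as pm

data = 1.0 + 2.0 * np.random.randn(200)

with pm.Model():
    mu = pm.Normal("mu", mu=0, sd=10)
    sigma = pm.HalfNormal("sigma", sd=5)
    pm.Normal("obs", mu=mu, sd=sigma, observed=data)
    trace = pm.sample(1000)  # defaults to NUTS, using autodiff gradients
```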
Big screen on wall with 6 or so boxes. Each box displaying data which updates in real time. Such as
I can't tell if this is one of Pyxley's use cases?