Pyxley: Python Powered Dashboards (stitchfix.com)
240 points by astrobiased on July 16, 2015 | 46 comments


I've been looking for something to do this (below) in python at least on backend.

Big screen on wall with 6 or so boxes. Each box displaying data which updates in real time. Such as

  - scrolling list of source control commits
  - graph of busy/idle slaves
  - single big number, pending builds
  - graph of open ticket counts
  - etc
I can't tell if this is one of Pyxley's use cases?


I think Tipboard might be more what you're after:

  - https://github.com/allegro/tipboard
  - http://tipboard.readthedocs.org/


I actually just implemented something very similar to this. You probably want to check out Dashing [1], which has built-in plugins for a lot of things already. It's pretty easy to work with as well.

[1] https://shopify.github.io/dashing/


And if the parent prefers doing it in python, there's pydashie [1], a python port of Dashing.

[1] https://github.com/evolvedlight/pydashie


Dashing is great, but it's Ruby on the backend, not Python :)


It actually has a REST API for sending dashboard data, so you can pump in new data from Python or any language.
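To make that concrete, here's a minimal stdlib-only sketch of pushing a value into Dashing's documented widget API (POST to `/widgets/<id>` with an `auth_token` field in the JSON body). The host, port, token, and widget id are placeholders, and the `current` field is just the convention Dashing's number widget uses.

```python
import json

DASHBOARD_URL = "http://localhost:3030/widgets/pending_builds"
AUTH_TOKEN = "YOUR_AUTH_TOKEN"  # set in Dashing's config.ru

def build_payload(value):
    """JSON body Dashing expects: the auth token plus the widget's data fields."""
    return json.dumps({"auth_token": AUTH_TOKEN, "current": value})

def push(value):
    # Any HTTP client works; urllib keeps this stdlib-only.
    from urllib.request import Request, urlopen
    req = Request(DASHBOARD_URL, data=build_payload(value).encode(),
                  headers={"Content-Type": "application/json"})
    urlopen(req)  # Dashing broadcasts the update to connected browsers

payload = build_payload(42)  # push(42) would send it to a running dashboard
```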


Yeah, well aware. I was only addressing this from GP:

> I've been looking for something to do this (below) in python at least on backend.


Will work great until you try to render 50,000 points and your browser crashes because it's built on d3.


D3 is pretty ok with large datasets but I understand your point.

What Shiny does to get around this is natively 'evaluate' the plots on the backend, creating a rasterized PNG file. A similar approach could work for Pyxley (using matplotlib or Seaborn to render the plot, and then sending that image file to the front end) but I fear with so much development time spent on d3 support such an approach would not be natively implemented.
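The server-side rasterization idea above can be sketched in a few lines; Pyxley has no such hook, so this is just a standalone illustration of the approach (serving the bytes from a Flask route would be a few more lines). The figure size and DPI are arbitrary choices.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend: renders without a display
import matplotlib.pyplot as plt
import numpy as np

def render_png(x, y):
    """Rasterize a line plot server-side and return the PNG bytes."""
    fig, ax = plt.subplots(figsize=(8, 3))
    ax.plot(x, y, linewidth=0.5)
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=100)
    plt.close(fig)  # free the figure; long-running servers leak otherwise
    return buf.getvalue()

x = np.linspace(0, 1, 50_000)  # 50k points: heavy for d3, routine for Agg
png = render_png(x, np.sin(40 * x))
```

The browser then receives one image of a fixed size regardless of how many points went into it, which is the trade Shiny makes: no interactivity, but no client-side rendering cost.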


>What Shiny does to get around this is natively 'evaluate' the plots on the backend, creating a rasterized PNG file

I don't think that's the case for D3 charts, because that would kill the interactivity that is so great about D3. I use rCharts to inject D3 into my Shiny applications, and have run into performance issues with just a couple hundred data points. I think this is because all the heavy lifting is done by the client (browser), not the server.


htmlwidgets is kind of an evolution of rCharts by the same person, if I recall correctly.


Unless you have a screen with 50,000 pixels you would need to downsample the data anyway. Wouldn't you normally do that instead of handing a visualization library the full data set?
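A naive sketch of that downsampling, for a time series: keep roughly one bucket per horizontal pixel, and emit each bucket's min and max so spikes survive (a plain stride would drop them). The function name and bucket policy here are made up for illustration.

```python
def downsample(values, width_px):
    """Reduce `values` to at most 2 * width_px points (min + max per bucket)."""
    if len(values) <= width_px:
        return list(values)
    out = []
    bucket = max(1, len(values) // width_px)
    for i in range(0, len(values), bucket):
        chunk = values[i:i + bucket]
        out.extend(sorted((min(chunk), max(chunk))))  # preserve extremes
    return out

# 50,000 points squeezed down to 1,000 for a 500px-wide chart
points = downsample(list(range(50_000)), width_px=500)
```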


You actually do have a screen with 50k pixels. Many more than that.

A continuous heat map over two dimensions, at 500x500 pixels, is 250,000 pixels. Of course, to generate the heat map, you have to aggregate (binning on two dimensions), but you don't have to downsample.
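The binning step described above is a one-liner with NumPy: aggregate a quarter-million points onto a 500x500 grid, one cell per screen pixel, without discarding any point. The random data and grid size below just mirror the comment's example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=250_000)
y = rng.normal(size=250_000)

# Bin on two dimensions: every point lands in exactly one of 500*500 cells.
counts, xedges, yedges = np.histogram2d(x, y, bins=500)

# `counts` is the heat map. Ship this small array (or a color-mapped image
# of it) to the client instead of the raw 250,000 (x, y) pairs.
```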


Right. I was thinking only of time series.


How easy is it to integrate a chart or graph into a larger project? My biggest gripe with Shiny is how difficult it is to use R's calculation and graphing functions in a larger project without using OpenCPU as an API.

My guess is that with Python being a more general-purpose language, this should be easier.


I've been working on a lightweight API solution for R recently. Still needs a bit of devops tooling, but I think the code is converging on stability.


It shouldn't be too bad. My goal was just to make it a little simpler to do the basic things like formatting the data as json for a particular charting library. The python helpers just set up the APIs. I made an example called custom_react that shows how I mixed some custom code with some pyxley stuff.


For backend monitoring purposes, one common solution is to stand up statsd and consume it from Grafana.
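Worth noting how little code that takes on the producer side: the statsd wire protocol is just `name:value|type` over UDP, so any Python service can emit counters and gauges without a client library, and Grafana reads the aggregated series from the backing store (e.g. Graphite). The host and port below are the usual statsd defaults.

```python
import socket

def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Fire-and-forget a statsd datagram, e.g. builds.pending:3|g for a gauge."""
    msg = f"{name}:{value}|{metric_type}"
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(msg.encode(), (host, port))  # UDP: no reply, no blocking
    sock.close()
    return msg  # returned only so the format is easy to inspect

wire = send_metric("builds.pending", 3, "g")
```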


Why don't you want to use OpenCPU?


I've been working with it and it's not bad - won't tear my hair out if it's what I use in the end. But a couple pain points for me are the need to rewrite my code to grab 'session variables', needing to understand how images are returned, especially if you need to return more than one, and so on - versus Shiny, which allows me to more or less drop in the code I've already been working with. Oh, and parallel requests are just not possible.

Of course, most of these issues are understandable and possibly by design, especially since OpenCPU was designed for embedded systems. That's fine - it just makes creating webapps or dashboards around R a more complicated use case.


How important is flask in this mix? How difficult would it be to back this by a django app for example? I'm asking because the charting would be REALLY useful for me right now, and I already have the django models...


The flask dependency is pretty strong because I used the requests module for the api route functions. It should be possible to override it and use django instead, but I don't know django that well. It's probably a good idea to separate that dependency so it's a little more flexible.


>The flask dependency is pretty strong because I used the requests module for the api route functions.

Not sure I understand? Are you talking about Kenneth Reitz's requests module? Why would that tie you to Flask or any other framework or library?


pyxley uses `flask.request`, not Kenneth Reitz's `requests`.


Yup, that one. I planned on simply filtering a pandas dataframe using the request.args that are passed in. The javascript components can be used directly, but I wrote the python wrappers as a convenience for really simple dataframes.
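The filtering pattern described here is roughly the following, with a plain dict standing in for `flask.request.args` (which is a dict-like of strings) so it runs standalone; the column names are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["CA", "NY", "CA", "TX"],
    "value": [10, 20, 30, 40],
})

def filter_frame(frame, args):
    """Apply each query-string arg as an equality filter on the matching column."""
    for col, val in args.items():
        if col in frame.columns:
            frame = frame[frame[col] == val]
    return frame

# Inside a Flask route this would be filter_frame(df, flask.request.args);
# the records then get serialized as JSON for the charting component.
records = filter_frame(df, {"state": "CA"}).to_dict(orient="records")
```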


If you already have a working web app, why not just use something like flot[1] for your charting (or d3 if you are up for it). Seems like pyxley is more targeted at people who have some data in pandas and want to get some visualizations on the web quickly.

[1]http://www.flotcharts.org/


The link to the examples is a 404; a possibly correct link: https://github.com/stitchfix/pyxley/tree/master/examples


Fixed! Thanks for the catch.


Wow, this is really cool. Playing around w/ the US state maps in the examples right now. I love stuff like this, partially because of the hands-on introductions to component technologies I had heard of but not used before.


Hmm I have a python script that monitors a few datapoints but don't know how to appropriately save them and plot data on a graph/webpage. I might give this a shot?


This seems nice, but where did you define the datasource(s) for your examples? Did I miss something? I'm already using Flask, so I'd be interested in using this.


looks great. but i can't help but think it would save everyone a lot of time -- maybe not up-front, but in the long run -- if we wrote these frameworks in C and just wrote language bindings for R, Python, Ruby, etc. why are we rebuilding good frameworks just because they're not written in our preferred language?


There is a big divide between interpreted and compiled resources though.

If you have the original source in an interpreted language, it supports your ARM raspberry pi or chromebook just as well. With the C variation you need to build / package / install.

I've kinda just come to accept that cross-language code reuse has never been good except with the unix pipe.


Except tons of Python code depends on C. This project depends on Pandas, and Pandas has C dependencies.

So to run this there's code that needs to be compiled anyway.


Haven't Linux package managers (e.g. emerge, aptitude, yum, etc.) basically solved the packaging problem?


It's quite a bit of work to make sure that a project is available on all potentially relevant distros and always up-to-date, so that only works for popular projects, and even for those there are often alternative package sources to get current versions on older releases etc.


Not really.

The problem there is updates: you'll get bugfixes, maybe even backported bugfixes, but in order to get a new version with new features you either have to deal with installing/building from a non-OS-package-manager source, or upgrade your entire OS.


Or add a PPA/OBS/hosted repo that builds the latest version.


Good idea! What's stopping you?


looks cool. shiny is very neat, but has the limitation of having R behind it and debugging ain't fun. I very much look forward to testing pyxley!


Why is having R a limitation? R is a fantastic data language which arguably beats all others in terms of number of libraries and data manipulation tools.


If GP comes from a more software-dev background, I can understand a general dislike for R. There's lots of things where R can't be beat right now, but for simple data analysis and visualization, there's some damn good python packages like pandas and d3py that GP may find easier to use than dataframes/tables and rCharts in R.


R has no native Bayesian library like PyMC3 (you must use Stan, which is C++). Also, Python is better for ad hoc and agent-based modeling, and for out-of-core data with Blaze and Dask.

So no, R is not ahead in everything.


I'm not trying to get into an R vs. Python argument, both have their strengths, but is it really true that R doesn't have anything like PyMC?

https://cran.r-project.org/web/views/Bayesian.html


Unless you can help here, I see a lot of pre-coded models and older samplers, but nothing with a flexible JIT for user-extensible variables and autodiff for the newer HMC- and NUTS-type samplers. The exception is Stan, but that is its own C++ modeling language, can't call R functions, is more verbose than PyMC3, and doesn't do discrete variables (unlike PyMC3).


Too many dependencies and layers compared to a pure Python solution like Bokeh, or Seaborn and Flask. I do like the dataframe integration, though.



