
As mentioned on the GitHub page, the project was inspired by Keras and other great projects, but many decisions did not completely fit with the way Keras does things. There will be support for Keras models in the future, but currently we are trying to finish the work on the web API, the web UI, and the CLI.


> but many decisions did not completely fit with the way keras does things

That's interesting. Could you expand on this some more?


Sure, one of the reasons we couldn't base the code on Keras when we started working on the project was that Keras was not integrated with the Estimator API, which meant we couldn't have used the distributed training offered by the TensorFlow team.


Ok, sounds like a good reason. But what is the situation now? Did the Keras team remove that limitation in the meantime?


I understand your concerns; we had the same reaction from some of our friends. I can assure you that the integration is planned, and any redundancy, duplication, or patches that we currently maintain for some TF code will be removed.

Thank you for taking the time to look at the project; I am very happy that we are receiving such constructive criticism.


It was always possible to train Keras models in a distributed setting (I was doing it in late 2015). And there's built-in, one-line integration with the Estimator API coming in the next version of TensorFlow.
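
For reference, the conversion looks roughly like this (a sketch assuming the tf.keras.estimator.model_to_estimator entry point; the exact API may differ in the released version):

    import tensorflow as tf

    # Build and compile a plain Keras model.
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    # One-line bridge to an Estimator, which then gets the distributed
    # training machinery (RunConfig, train_and_evaluate, ...) for free.
    estimator = tf.keras.estimator.model_to_estimator(keras_model=model)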


Hi, thanks for the comment. Here's an example of a configuration file for a convolutional denoising autoencoder, where some preprocessing is applied in the input pipeline: https://github.com/polyaxon/polyaxon/blob/master/examples/co...
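
To give a rough idea of what that config describes, here's an illustrative plain-TensorFlow sketch of the input side of a denoising autoencoder (this is not Polyaxon's configuration format, and the noise level is made up):

    import tensorflow as tf

    def make_denoising_inputs(images, noise_stddev=0.1):
        """The corrupted image is the model input; the clean image is
        the reconstruction target."""
        clean = tf.convert_to_tensor(images, dtype=tf.float32)
        noisy = clean + tf.random_normal(tf.shape(clean), stddev=noise_stddev)
        return noisy, clean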


Thanks xboost. Could you elaborate more on that? Is there a way to convert image files to TFRecords directly (as a pipeline parameter)? Great project, by the way! The easier it becomes to set up end-to-end training, the easier it will be to use custom data.


Sure, we are preparing some ways to automate dataset creation and versioning. For now, the way to feed data is through the built-in TensorFlow numpy/pandas input functions; there are some examples where this pipeline configuration is used. Otherwise, there are different pipeline modules that can be used to feed the data, especially for training, where TFRecords can be faster.
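
For example, feeding numpy arrays works through the stock TF 1.x helper (the feature name and shapes here are illustrative):

    import numpy as np
    import tensorflow as tf

    x_train = np.random.rand(1000, 28, 28).astype(np.float32)  # placeholder data
    y_train = np.random.randint(0, 10, size=1000).astype(np.int32)

    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={'image': x_train},
        y=y_train,
        batch_size=64,
        num_epochs=None,  # cycle indefinitely; the training loop bounds the steps
        shuffle=True)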

The way to create a TFRecord is still manual; here's an image data converter that was used to create the MNIST dataset:

https://github.com/polyaxon/polyaxon/blob/master/polyaxon/da...

More data converters will be available soon.
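
In the meantime, writing one by hand is a few lines with the stock TF 1.x API (the feature keys follow the common image-dataset convention and are illustrative):

    import tensorflow as tf

    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    # `examples` is assumed to yield (encoded_image_bytes, int_label) pairs.
    with tf.python_io.TFRecordWriter('train.tfrecord') as writer:
        for image_bytes, label in examples:
            example = tf.train.Example(features=tf.train.Features(feature={
                'image/encoded': _bytes_feature(image_bytes),
                'image/class/label': _int64_feature(label),
            }))
            writer.write(example.SerializeToString())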

Once you have a record, a numpy array, or a pandas DataFrame, you can basically apply any operation/layer to your data by providing the feature name and the list of operations to apply. In general, only operations that are necessary for data augmentation should be done in the input pipeline; everything else should be done during the creation of the TFRecords, to minimize computation.
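
As a sketch of what such augmentation ops look like with stock tf.image functions (the 'image' feature name is hypothetical):

    import tensorflow as tf

    def augment(features, labels):
        # Only random, augmentation-style ops belong here; deterministic
        # preprocessing should already be baked into the TFRecords.
        image = features['image']
        image = tf.image.random_flip_left_right(image)
        image = tf.image.random_brightness(image, max_delta=0.1)
        features['image'] = image
        return features, labels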

One last thing: for reinforcement learning, we feed data through the feed_dict, because interaction with an environment is necessary.
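
A minimal sketch of that mechanism (the placeholder shapes are made up, and the graph/train_op construction is elided):

    import tensorflow as tf

    states_ph = tf.placeholder(tf.float32, shape=[None, 4], name='states')
    actions_ph = tf.placeholder(tf.int32, shape=[None], name='actions')
    # ... build the policy graph and a train_op on top of the placeholders ...

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # After each batch of environment interactions:
        # sess.run(train_op, feed_dict={states_ph: batch_states,
        #                               actions_ph: batch_actions})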

