Ask HN: This is my first open-source project. Feedback?

jrockway · on March 31, 2012

In addition to tests, it's probably time to flesh this out by adding functions that operate on graphs.

It's also relevant to look at this:

http://hackage.haskell.org/packages/archive/astar/0.2.1/doc/...

Notice how A* is implemented without any explicit graph object: everything is functions.

This is a model worth thinking about. There are an infinite number of graph libraries for Java. If your value-add is a query language on top of graphs, why not skip the graph representation and let someone else maintain that so you can focus on the query language?
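
To make the "everything is functions" idea concrete, here is a minimal sketch in Scala, since that is the project's language. It is not the API of the Haskell astar package linked above; the object name `FunctionalSearch`, the method `aStar`, and all parameter names are assumptions made up for illustration. The point is only that the search never sees a graph object: the caller supplies a neighbor function, a step cost, and a heuristic. Step costs are assumed to be non-negative.

```scala
import scala.collection.mutable

object FunctionalSearch {
  // Hypothetical sketch: the "graph" exists only as the functions passed in.
  def aStar[A](
      start: A,
      isGoal: A => Boolean,
      neighbors: A => Iterable[A],
      cost: (A, A) => Int,   // non-negative cost of one step between adjacent nodes
      heuristic: A => Int    // estimated remaining cost; should never overestimate
  ): Option[List[A]] = {
    val best = mutable.Map(start -> 0)       // cheapest known cost to reach each node
    val cameFrom = mutable.Map.empty[A, A]   // predecessor on the cheapest known path
    // PriorityQueue is a max-heap, so order entries by negated priority.
    val open = mutable.PriorityQueue.empty[(Int, Int, A)](
      Ordering.by[(Int, Int, A), Int](entry => -entry._1))
    open.enqueue((heuristic(start), 0, start))

    while (open.nonEmpty) {
      val (_, g, node) = open.dequeue()
      if (g <= best(node)) {                 // ignore stale queue entries
        if (isGoal(node)) {
          // Walk the predecessor links back to the start.
          var path = List(node)
          while (cameFrom.contains(path.head)) path = cameFrom(path.head) :: path
          return Some(path)
        }
        for (next <- neighbors(node)) {
          val g2 = g + cost(node, next)
          if (g2 < best.getOrElse(next, Int.MaxValue)) {
            best(next) = g2
            cameFrom(next) = node
            open.enqueue((g2 + heuristic(next), g2, next))
          }
        }
      }
    }
    None                                     // goal unreachable
  }
}
```

As a usage example, `FunctionalSearch.aStar[Int](1, _ == 10, n => List(n + 1, n * 2), (_, _) => 1, _ => 0)` searches an implicit, unbounded graph over the integers and returns `Some(List(1, 2, 4, 5, 10))`. With the heuristic fixed at zero the search degrades to Dijkstra's algorithm, which is enough to show the shape of the interface.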

srl · on March 31, 2012

Relative Scala newbie here. This looks pretty nice, thanks!

I think I'm not alone in learning to use new languages and libraries primarily by inhaling example code. Why not provide a couple relatively trivial programs, and then a couple non-trivial ones to demonstrate some particularly powerful / elegant aspect of your approach? It's good advertising too.

Also, GitHub lets you use README.md with Markdown formatting, to make the readme more legible / fancy.

michaelochurch · on March 31, 2012

Thank you.

I wrote a card game (Ambition) back in 2003 and I'm considering OSing the rules to that, in Scala. That might be a better read for intro Scala because the space (in a card game) is more closed. This is more abstract, with the NodeT's and EdgeT's.

Odersky's Scala book, by the way, is really good.

vitorbal · on March 31, 2012

Which book? Could you edit in a link? Thanks!

fredoliveira · on March 31, 2012

This would be it: http://www.artima.com/shop/programming_in_scala

gtani · on March 31, 2012

This is the complete text of the 1st edition, written for Scala 2.7. Probably about 1/4 of the text is unchanged between editions, maybe less, but it will give you a good feel for the writing style and depth of coverage.

http://www.artima.com/pins1ed/

The 2nd edition is an excellent learning and language survey resource. Also, the book's index is superb.

pdelgallego · on March 31, 2012

Plain text is OK for the README file, but Markdown is more readable, especially if you are going to include code examples.

humbledrone · on April 1, 2012

Yeah, in terms of gaining users I think that learning a little bit of basic markdown formatting is very worthwhile. It only takes a few minutes to learn the important stuff (headings, bullets, code formatting, hyperlinks), and the knowledge will continue to pay off on future open source projects. (I am a recent convert from plaintext readme files, and am happy with the decision.)

Visually, having a decent README sets the project apart, and makes it look more "put together." Anecdotally, I can say that when I'm looking for a library to solve a particular problem, I'm probably going to stick around and read more about the repo that has beautiful markdown documentation, all else being equal. Maybe I'm just vain, though... :)

vjeux · on March 31, 2012

You just have to name your file README.md :)

oacgnol · on March 31, 2012

I'm not sure if this was intended by design or not, but why did you decide to put everything into one .scala file? I know Scala allows you to define multiple objects/classes in a file but it is pretty long and it would be pretty difficult to find what I'm looking for if I were to edit anything.

Otherwise, cool project. I'm also interested in graphs and Scala, so I'll be watching your project. Have you looked at FlockDB from Twitter?

michaelochurch · on March 31, 2012

The one-file thing is because I'm new to the Scala build environment (although I have 2 years of Clojure and 2 years of OCaml + Haskell). I was just delaying on that SBT/Maven question for as long as possible.

I generally split files around 500 lines, but for FP languages that may be a bit big.

Haven't looked at FlockDB. Twitter does a lot of cool stuff, and they use Scala. I may consider applying if I decide to move out to CA.

kmfrk · on March 31, 2012

If you are making a graph library, you should probably include some screenshots. :)

Make a screenshots/ folder and link to them from your README.

axiak · on March 31, 2012

He's making a graph library, not a chart library...

tshaddox · on March 31, 2012

To clarify, this library is for storing the data structures called "graphs," which are a bunch of points with links between them.

kmfrk · on March 31, 2012

Ah, sorry, I guess I read too fast. :)

zrail · on March 31, 2012

I don't know Scala but good on you for contributing something that looks pretty useful :)

michaelochurch · on March 31, 2012

Thanks. Right now, it's very basic. I'm undecided whether I want to write a disk layer or just use a graph DB.

Mostly, I want to see if static typing is right, architecturally, for a general-purpose graph library. Is writing [NodeT, EdgeT] on each graph type going to drive me insane?

I also decided that after 5 years of being a "company man" and putting 60 hours per week into corporations that might not exist one day, I should get over my FOE (fear of embarrassment) and contribute to open-source.

timClicks · on April 1, 2012

I dislike projects that are named literally. It causes unnecessary linguistic gymnastics to distinguish between the project, the thing it's trying to achieve and the next project.

zoowar · on March 31, 2012

Better solution: https://code.google.com/p/signal-collect/

Devko · on March 31, 2012

You should comment your code

michaelochurch · on March 31, 2012

Specific thoughts on what deserves more commenting?

I tend to prefer explicit type annotations and unit tests (later) over comments except when I feel like there's something non-intuitive that's not at all obvious from the code. Comments are also critical for future-safety against "fixes" that "look right" but will actually break the code.

I tend to have a "60-second rule". If it took 60 seconds for me to figure this out, then I should comment. For example, I commented the need for a curried function signature in Utils.mergeMaps because it wasn't at all obvious, and I kept getting type errors till I got it right.

Personally, I think the worst thing about what I have right now is that it really deserves to be split into more than one file, but I'm delaying the build system question (I dislike Maven; SBT seems neat but has a bad reputation) for a little while.
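
The curried-signature remark above may be easier to follow with an example. The following is a hypothetical helper written for illustration, not the project's actual Utils.mergeMaps, whose real signature is not shown in this thread. It assumes the usual motivation for currying in Scala 2: type parameters are inferred one parameter list at a time, so putting the combining function in its own list lets the compiler fix `K` and `V` from the two maps before it types the lambda.

```scala
object MergeSketch {
  // Hypothetical helper, not the real Utils.mergeMaps.
  // The combining function sits in a second parameter list so that K and V
  // are already known from the maps when the lambda is type-checked.
  def mergeMaps[K, V](a: Map[K, V], b: Map[K, V])(combine: (V, V) => V): Map[K, V] =
    b.foldLeft(a) { case (acc, (key, value)) =>
      // Combine with the existing value if the key is present, else just insert.
      val merged = acc.get(key).map(existing => combine(existing, value)).getOrElse(value)
      acc.updated(key, merged)
    }
}
```

With this shape, a call such as `MergeSketch.mergeMaps(Map("a" -> 1, "b" -> 2), Map("b" -> 3, "c" -> 4))(_ + _)` needs no type annotations on the lambda and yields `Map("a" -> 1, "b" -> 5, "c" -> 4)`.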

jrockway · on March 31, 2012

I would ignore advice on commenting. Things should be clear, not commented. Docstrings are nice if you are going to generate some documentation for end users; googling for "string isempty java" is easier than opening up java/lang/String.java and searching for "empty" in your text editor. (But both techniques have their advantages and disadvantages.)

Tests are a great way to show what your library can do. I recommend adding those before you worry about documentation. (How do you know your code, as posted, works?)

michaelochurch · on March 31, 2012

I agree with you on testing.

One thing I have to decide on this project is whether I'm going to follow the convention (which I don't like) of putting "test/" in a separate directory from "main/". I dislike it because I think unit tests should be included inline if short and relevant to the purpose of the function.

Generally, my testing practice is to REPL-test and then include the tests in code for posterity. It makes the testing process more playful to start in REPL testing. I know it's the opposite of TDD but it works for me.

dpritchett · on April 3, 2012

Michael, could you explain why you are using Scala over Clojure? I know you've got experience with both, but Clojure has always seemed more appealing to me.

ondrasej · on March 31, 2012

As for commenting: code that is meant to be a library should have a clearly documented API. Perhaps not from the beginning, but as the API gets stable, the JavaDoc comments for generated documentation should be there. In the current state, though, tests and "getting started" examples (create a graph, list all nodes, determine if there is an edge from A to B, ...) would be sufficient (and would really help).

And since you're writing a graph library, there definitely should be a detailed commentary on the implementation of the graph (e.g. representation of the graph structure, memory complexity of the representation, and time complexity of common operations). Once you start working with larger graphs, using the correct representation and algorithms for your task makes a huge difference and it is something that must not be hidden from the users.
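
As a rough illustration of both points (a "getting started" example and a documented representation), here is a minimal sketch. It is not the library's actual code: `GraphSketch`, `Graph`, and every method name are assumptions, and `NodeT`/`EdgeT` merely echo the type parameters mentioned elsewhere in the thread. The representation is a nested immutable map (an adjacency map), and the comments state the costs a user would want to know.

```scala
object GraphSketch {
  // Hypothetical immutable directed graph stored as an adjacency map.
  // Memory: O(V + E), one outer entry per node and one inner entry per edge.
  final case class Graph[NodeT, EdgeT](adj: Map[NodeT, Map[NodeT, EdgeT]]) {

    // Adds a node with no edges; a no-op if the node already exists.
    def addNode(n: NodeT): Graph[NodeT, EdgeT] =
      if (adj.contains(n)) this
      else Graph(adj.updated(n, Map.empty[NodeT, EdgeT]))

    // Adds (or replaces) the edge from -> to, creating both nodes if needed.
    def addEdge(from: NodeT, to: NodeT, label: EdgeT): Graph[NodeT, EdgeT] = {
      val g = addNode(from).addNode(to)
      Graph(g.adj.updated(from, g.adj(from).updated(to, label)))
    }

    // All nodes in the graph.
    def nodes: Set[NodeT] = adj.keySet

    // Two hash-map lookups, so effectively constant time for hash-based maps.
    def hasEdge(from: NodeT, to: NodeT): Boolean =
      adj.get(from).exists(_.contains(to))
  }

  def empty[NodeT, EdgeT]: Graph[NodeT, EdgeT] =
    Graph(Map.empty[NodeT, Map[NodeT, EdgeT]])
}
```

A first session could then be `val g = GraphSketch.empty[String, Int].addEdge("A", "B", 7).addNode("C")`, after which `g.nodes` is `Set("A", "B", "C")`, `g.hasEdge("A", "B")` is `true`, and `g.hasEdge("B", "A")` is `false` because edges are directed. A different representation (adjacency matrix, edge list) would trade these costs differently, which is the kind of detail worth documenting.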