Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Ask HN: Creating a query builder for end users
39 points by systematical on April 19, 2014 | hide | past | favorite | 32 comments
Our product is starting to get lots of requests for custom reports from our customers. I've taken some steps to really stream line the creation of the reports and exporting reports to PDF / XLS format. This has really helped and most reports can be created fairly quickly with a very full feature set.

The other day I began thinking about giving users the power to create their own filters essentially creating their own queries through a UI. This isn't so bad because our framework, CakePHP, stores all the relations to the data and its easy to build on top of this.

The tricky part is striking a balance between usability and power. A concern of mine is that this sort of thing might be too complex to our non-techie customer base. Have you done something like this before? Do you have any great examples? How did you handle (or why did you not handle) complex things like: OR, AND, NOT IN, IN, <, >, BETWEEN etc... Specifically I'm interested in your thoughts on building an intuitive and beautiful UI.



I faced the same basic concept a number of years ago, and we had to make it work for barely computer literate police.

Basically, we used a single piece of software's single feature as our reference, for describing it to others and for how to handle the experience.

iTunes Smart Playlists. (https://www.evernote.com/shard/s136/sh/7f5f39a1-6c85-4853-aa...)

I have yet to see a better interface (than iTunes) for building what are essentially custom queries.


I think that this can be improved by removing the All and Any operators and instead letting segmentations be nested. So for example:

  All
    a = b
    Any
      All
        c = d
        d = f
      c = e
would be

  a = b
    c = d
      d = f
    c = e
I think that it is more intuitive to segment the set as you go.


I built QueryTree (http://querytreeapp.com) as a response to this sort of problem. It's a domain specific visual programming language designed for working with tabular data. So people who want to build a custom query on a database can do it just by dragging and dropping the right "tool" onto the "worksheet".

It's a desktop app rather than web, although it is actually built using HTML, canvas and JavaScript.

It can do almost anything SQL can. It can't do "IN" queries yet but I have a plan do that in a way that's consistent with the general drag and drop approach of the app. It doesn't explicitly handle BETWEEN queries but they are easily achieved by using two filter tools one after the other.

I think there are two types of complexity to be dealt with when enabling users to build reports/queries. One is the complexity of the concepts involved, the other is the complexity of communicating what you want to happen to the computer. I built QueryTree because of a belief that many people understood they the concept of a filter or a join, but just didn't like text based programming languages. I replaced the SQL syntax but left everything else in place. If I didn't think my target users were up to understanding the conceptual complexity I would have restricted it a bit more and would probably have ended up with something like IFTTT.com but for data. Mostly, people seem to "get" QueryTree though, especially when they're using it on data that they're familiar with.


Looks nice! What technologies are you using for cross-platform support in your app?


After evaluating all the desktop html frameworks I could find, I settled on "node-webkit".


Looks nice but doesn't support PostgreSQL


True, it is planned though. Feel free to vote for that feature: http://querytree.uservoice.com


I've done this when building the reporting side of a survey product.

The vast majority of your users won't use the complex filters.

Ask your customers what kinds of reports they want to build, usually it will be a series of AND requirements.

You can look at email filter builders for inspiration.

Usually it is something like building a series of :

Field (drop down) operator (drop down) value (text)

The tricky part in the ui is combining these (and/or) and if you really want to get fancy, supporting parena to change the precedence of the operators. A AND (b or c).

You must not convert this to SQL on the client side. You must validate the fields submitted are in your approved list on the backend (indexes/security).


'You must not convert this to SQL on the client side. You must validate the fields submitted are in your approved list on the backend (indexes/security).'

Interesting point. Can't you get around this by only exposing certain tables/fields in the UI based on the users permissions?


You shouldn't trust it just because of your UI. With developer tools, regardless of your UI, it would be trivially easy for a malicious user to send any arbitrary SQL query. This would allow them to retrieve sensitive data or modify/delete records from your production database. You really should only build this query server-side, and, as the OP is using PHP, I should also add the caveat that they should use PDO.


What is exposed in the UI is not always the response you receive back. This is the crux of the mass assignment bug in Rails.


This is interesting because I'm building the exact same thing, which also has the be plugged into a Cake app.

For choosing the tables to query, my idea is to present a grid. Each cell has the table name and tables available through relations. When a table is selected, the cells either remain "on" or are disabled based on relations available. These will typically change depending on tables selected. Relations are discovered as defined, say hasOne, hasMany, etc. this is step one.

Step 2 involves selecting the fields to include in reports. These are typically prefixed with the model name and with underscores replaced with spaces. So User.first_name becomes "User's first name", User's country name or Country name, which I haven't worked out fully yet. I suppose some intuitive way to combine these into virtual fields would be nice.

The hairy bit is selecting conditions. I get requests to hard code queries such as finding the number of product ordered by users who have taken an online course or an in-person course so I have to be flexible. This part is tough to express in a UI simply, because X AND (Y OR Z) isn't the same as (X AND Y) OR Z. However, having some way of grouping conditions and requiring that such groups also have a binary operator by dragging and selecting seems the way to go, a la selecting icons on a desktop.

All of this can be saved as a "virtual table" for reuse. Output is typically a CSV for marketing purposes but my intention is to have this so I can plug in a PDF renderer with nice charts and what not, like Google Analytics.

Though something tells me building all this may be overkill if I can find a package that handles analytics and custom-defined events, along with the ability to upload backdated data!


> The tricky part is striking a balance between usability and power.

Yeap, this is indeed the big issue. One thing that helps is to tune the UI for your specific domain so that it has just enough power to be usable in that domain, but no more. We had to build a query tool for retailers in my last gig, and tuning the UI just for retail helped a lot.

For example, make it easy to specify a BETWEEN operation in your UI if that tends to be used often, provide commonly-used domain queries as templates so your customers can learn by looking at and modifying these templates, etc.

For inspiration, you may also want to look at other products that have query building as core feature. Two that stand out to me in terms of usability are mixpanel and chart.io: https://mixpanel.com/segmentation/ (see "Goodbye SQL" section) http://chartio.com/product/tour (see "Drag and Drop Chart Creation")


I was working on such system for performance measurements. Basic assumption was that user would never see any mention of tables, joins, column names nor any other SQL concept.

1.Table selection - there is predefined set of joined tables we call views. Views have descriptive names e.g. 'Temperature measurements', no underlying tables are shown.

2.Column selection - user selects columns from views using descriptive names , not actual column names . UI clearly indicates columns that land in 'group by' section by calling them dimensions. There is predefined aggregation strategy for all columns that are not dimensions - e.g. average for temperatures or sum for number of measurements. In some cases more that one aggregate is available e.g. average or maximum temperature. User just selects maximum or average temperature not aggregation operator.

3. Filters: user can select simple filters on columns: =<> and string matching for text. All filters are ANDed

4.Sorting: user can select columns to sort on.

5.Display: results are displayed as tables or charts. More that one section may be placed on single report.

The real system has tons of additional features but basic design as design as described above has worked great for years.


I have worked on an app that gave the exact functionality to the users. The users could create queries using the query builder, and also schedule when the queries could be run.

The challenges of course was training the users to use the operators correctly and in a way that made logical sense. It did take a while for the end users to learn to build queries.

On the UI front, most of the fields and tables could be chosen by double-clicking or drag and drop. Certain operators could not be used with fields based on their types. We also provided a AST like tree to help users understand the operators.

On the optimization front, letting users build their own queries could also lead to very costly queries. We had to profile the user created queries regularly and fine tune our query generator to handle such uses. We ended up creating a lot of materialized views in Oracle to handle specific frequent uses of joins in the table.

The users who were using this tool were top level finance and accounting folks, who wanted to produce reports for the CFO has he dreamt up different data points to work on!


We've been doing exactly this for our antispam app, we started off with a working custom filters web app (Angularjs) which you can see at http://209.213.221.169:8080/ using text or regular expressions (the regexp has a bug in it). This also shows the query in plain english as you build the query. Original design credits for the test app to https://github.com/kindohm/knockout-query-builder/

This is the design we've come up with so far, which incorporates grouping of queries. We've tried to make it readable, but we also display the query in a sentence as you build it. http://cdn.firetrust.com/images/misc/customfilter.jpg



My advice regarding the usability vs power issue is to have multiple stages. You'd have a DSL/query language that you parse, is secure, etc. And then you'd have a UI over the top of that which automatically creates the right queries.

The win here is that you then have an easy interface for 90% of users but the remainder with more bizarre or complex requirements can still get to where they want to go by either writing a query themselves in an "advanced" field of some sort, or you could even do it for them for $$$. Embedded JavaScript is an option here.

I would contrast this with something I would not advise, having a visual interface go straight to a database query behind the scenes.. because then you're basically limited to what you exposed in the UI.


You could start with an intermediate stage, with say a query-builder wizard that walks them through the process.

1. What type of report would you like to create? (define the domain)

2. What are you looking for? (define the topic)

3. What limits are there? (define the data boundaries)

4. How do you want it displayed? (report output format like excel, pdf, etc).

See how a select group of customers like it and get feedback on it ... some might immediately say "yes we want something more powerful" or what you have will be just fine. From there you can decide if you want to pursue exposing the query-builder more openly than a guided wizard, and if you want to charge extra for it.


Mac Finder's query builder is, in my opinion, the friendliest you can get. http://i.stack.imgur.com/YdUSq.png


I've done this before in dojo framework + dgrid. Our system is based off of a dynamic metamodel where users define classes (e.g. Task) and attributes under those classes (e.g. State, owner, project reference) from a set of predefined types (e.g. Enum, user, reference). The defined metamodel ends up as a class hierarchy and is presented as a tree for the query builder. After defining a class, you can then instantiate that class and set attribute values on it.

For building queries, you drag classes and attributes from the metamodel tree into either the filter tree or the report fields section. The filter tree is a tree of AND's and OR's constructed as you drag attributes into it. Selecting an item in the filter tree then brings up a value editor based on the attribute type that allows you to filter your query based on values that make sense for that type. So if you define your tasks state as enum(unassigned, assigned, started, completed) you get a combo box with those values and options to select in, not in, null, etc.

The report fields section is a list of attributes you want to see in the result set. Basically just a select on that attributes value for the query system. Imagine a todo list created instantly by filtering on tasks with owner == current user and state not in completed, but also wanting to see and sort on the priority, and not showing the owner since you know it's you. The report fields area allows setting of sort order, and ascending or descending.

As you drag attributes into the filter or fields sections, a results pane underneath automatically gets new columns added (for fields added of removed) or new results based on a modified filter tree. After saving a query, users can then register for updates of items that match the query. If any data is changed that affects query results, the updated item or removal is pushed via websocket and that item is updated or removed from the results grid. With this, users never have to refresh the query since all the data is live.

Screenshots are slightly edited and a little out of date, but you get the idea.

Query builder: http://i.imgur.com/AV071JV.png

Dashboard (powered by query/report building): http://i.imgur.com/BvBW7L3.png


there are a lot of query builders out there, i'm pretty sure you dont need to write one. Doctrine has one :

http://phpcr.github.io/

which deals with pure PHP objects.it follows a java standard,you can write SQL like queries with it.

https://github.com/phpcr/phpcr-docs/blob/master/tutorial/Tut...


This blog post demonstrates grouping values under the associated filter expression:

http://kindohm.com/posts/2013/09/25/knockout-query-builder/

There are also number of projects implementing JavaScript query builders that can be found from there, including Red, YUI, jQuery and Ext Grid query builders.


I've done this when I was working on the Playstation Network so business users could define storefront categories that contain products based on their own custom logic. We didn't get a lot of requests to change the feature and it was stupid simple. I wouldn't get too fancy unless your users get really excited about it and ask for it to do more.


MagentoCommerce (also PHP) has a query builder, maybe you could have a look for inspiration? http://www.magentocommerce.com/wiki/_media/welcome_to_the_ma...


Having spent over a year working with Magento's community edition, it seems to me that, for a developer unfamiliar with it, the reaction it's likeliest to inspire would be panicked flight.


You might look at how Wufoo's rule system works. Designed for not-very-technical users and it is pretty powerful.


Check out HTSQL, should be able to act as your API.


I spent ten years of my career building and then altering something like this and would be happy to talk with you about it.


I'm actually looking to work on something similar for an intranet tool.


Take a look at Pebble Reports. It is exactly what you are describing. Although it is a Windows app, not web.


SQL is pretty good




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: