We've been pretty excited about this release to come for some time at Heroku as its loaded with great features. In addition to the JSON datatype here's a bit of a longer list of features that are pretty noteworthy in the release:
- Allow libpq connection strings to have the format of a URI
- Add a JSON data type
- Allow the planner to generate custom plans for specific parameter values even when using prepared statements
- Add the SP-GiST (Space-Partitioned GiST) index access method
- Add support for range data types
- Cancel queries if clients get disconnected
- Add CONCURRENTLY option to DROP INDEX
- Add a tcn (triggered change notification) module to generate NOTIFY events on table changes
- Allow pg_stat_statements to aggregate similar queries via SQL
- text normalization. Users with applications that use non-parameterized SQL will now be able to monitor query performance without detailed log analysis.
Since postgres has basic json type support now, and PL/Javascript exists, it's only a matter of time until an extension appears that lets you deploy javascript applications directly to the database.
Who needs CouchDB or Node.js when you can just say CREATE EXTENSION 'couchnodegres.js'
Once the database can start receiving and returning json, then you can treat it as a webservice and not have the client involved in SQL. I think this is huge.
I am actually pretty excited about the native JSON support, and overall I am a huge fan of Postgres, but this is the most press-release-y press release ever* . By that I mean that the quotes are way too "perfect", the kind you only see in press releases. Some PR or marketing guy wrote them then showed them to the person to whom they'd be attributed to get their ok. Nothing inherently wrong with it, just struck me as funny.
* having written more than my share of press releases in my time
> Some PR or marketing guy wrote them then showed them to the person to whom they'd be attributed to get their ok
I happen to know the guy that AFAIK was in charge of preparing the press release and he's actually a major contributor to the codebase, a geek par excellence and have been nagging people to get him quotes on the development mailing list.
The fact that apart from hacking C code he also knows how to write a catchy press release just makes him all the awesomer :)
That was some good combination. I always admire programmers than can also produce good design. Now, I will add marketer + programmer combination to my "admiration" list!
Yep, the people that can both produce and sell are a rare and very valuable breed, if only because they can actually understand what the other marketers need (or will pay for!).
This. Good devs who are able to better pimp their great work (such as anything in this Postgres release), are the devs that get the coolest job offers. So props to the great work by the Postgres team and their tasteful fact based self promotion. :-)
I'm one of those quotes (talking about NewsBlur) and I just tried to give back to an amazing tool in my tech stack by being effusively honest. Honestly, I think they deserve more credit and was a bit surprised that they chose my quote over the dozens of undoubtedly heavier stacks that also use Postgres. Surprised in the good way.
I think Postgres doesn't do enough to toot it's own horn. Look at it's website. Come on. Spice things up a bit. I'm inclined to think hardcore users purposely downplay the product because they see it as their secret sauce.
It actually looks extremely professional. In my experience, when you're trying to convince a large company to use a platform, you want a clean, business-case-focused site without excessive flair. Postgres nails it on those points, I think.
It's reminiscent of IBM, which is a huge compliment in this context.
The last major web site update was in late 2004, according to my bisection of the wayback machine. It's a project, not a company, and there is no marketing or design budget. For example, Magnus, who does some Windows maintenance, is a principal author of the new pg_receivexlog, and knows a lot about the SSL support also worries about the web-based mailing list archive program and the CSS of the manual, if memory serves.
Nobody's holding back anything. It would be wonderful if someone wanted to contribute their design skills to the web page or documentation -- talking to the mailing list pgsql-www is probably the way to do that. I am positive such a person would receive profuse appreciation, and maybe can parlay that into personal advantage if the fulfillment of the impact such improvements would have is not quite material enough by itself.
The project runs its own web site infrastructure (many regimes have come and went in Postgres' history, only semi-recently was pgfoundary decommissioned for the purpose of new projects), and it's a little arcane -- but if someone really wanted to take ownership to move things beyond maintenance into progress, perhaps changes could be made.
As one of the main contribs noted in a previous HN article, the important part was to get JSON conversion into core, full stop - because extensions (like hstore) can't declare dependencies on other extensions. Having JSON in/out in core is an enabling feature for all the rest.
That limitation (not-lucid dependencies between contribs) is rather sucky, but jumping into the world of dependency management will require a deep breath for sure.
Hence, instead of waiting for that, JSON support in un-adorned Postgres to solve a common use case in the interceding years.
The jargon for what this gives you is "a stable oid". Also more or less equivalent to a "system OID" at this time. These "Object IDentifiers" in Postgres are unsigned int32s that are (almost?) never reused (only accrued, or removed), and are all under the number 10000, and all assigned statically by hand. A-priori knowledge of these numbers can simplify writing extensions dramatically, but clearly this is not scalable for a future with dependency chains in extensions.
I'm sure something like that will be added. One of the issues there facing is that hstore is a string => string map, so a conversion from json to that will be lossy (i.e. 'that "true" in hstore, is that the string "true" or a converted boolean true from json?').
What you can do right now though is use PL/V8 to query the JSON fields. Then you can use functional indexes to still being able to speed up queries. Yes. You could that before but now there's a guarantee that a field of type json contains just that, meaning that your application logic will get simpler.
You won't get them in an hstore[0] column, which was the context of my reply. Of course in a JSON column you want full JSON support, but my and my parent's post were about converting back and forth between hstore and json.
Since hstore only handles string:string mapping, the choice is either to silently corrupt data by stringifying all values in the json->hstore encoding, or erroring out if the input data is anything other than string:string.
The latter would be what I'd prefer, and more in line with usual Postgres behavior.
JSON is much more powerful than hstore. Not only can it store different JSON data types (boolean, numbers) but also you can store documents with multiple levels deep, which is not possible on the hstore type.
Having said that, hstore is perfect when storing simple key/values. It's been battle tested and used for years and has some powerful and native indexing possibilities.
hstore is much more limited, it's a flat mapping of strings to strings (so a single level of JSON object). And as I noted, there does not seem to be any convenience method for converting between hstore and JSON data (or the other way around)
PostreSQL can't horizontally scale easily or properly handle JSON right now compared to MongoDB. It also has a fixed schema which makes database migrations a necessary evil again. Sorry but the idea that PostgreSQL is a feature by feature replacement is pretty laughable.
Easily being the appropriate word here. In that article you wrote they had to manually setup sharding. With something like CouchDB I can press a button to add a new node in the cluster. Again. Easily is what I am talking about.
And again V8 is nice and all but it is rather bolted on as opposed to something that is was built from scratch to support JSON. This shows in terms of feature support and most importantly ease of use. I can't just annotate HashMaps, Lists etc in my Java classes and have them serialized to PostgreSQL in JSON format.
I am just saying that PostgreSQL is a great and all but it is not a proper JSON document store style database anymore than hacking SQL on top of MonogDB would make it a RDBMS.
I noticed that Mattdebord's post qualified it by saying feature-for-feature replacement of Mongodb for us. He didn't claim it was for everyone. Why all the hate?
Maybe not easily but it can far more flexibly than MongoDB. Take a look at Postgres-XC which offers a two-tiered clustering solution for Postgres inspired by Teradata. A bit complex, but you don't give up your formal proof of data integrity in order to get it.
My two favorite features are not so high on the PR docs though.
The first is SECURITY BARRIER and LEAKPROOF which gives us an ability to rethink how to multi-tenant applications. This is a game changer and will get even better in future versions I am sure.
The second is NO INHERIT constraints, which I will certainly be making good use of. It is also a complete game changer when it comes to table inheritance and partitioning, and my main use will be things like CHECK (false) NOINHERIT to ensure that a table in fact never has rows of its own.
There is an amazing amount of good stuff going on around Postgres right now. Postgres-XC was recently released, and more. It is an amazing data modelling platform and ORDBMS.
One area where Oracle is ahead is in multi-tenant applications. They have an approach where you can build filters on data based on specific criteria. You can do this on PostgreSQL too but the way you have to do it before this either involves add-ons or actual partitions of data, or tweaks to prevent building functions altogether.
Security barriers are important in catching up to Oracle in this regard and they allow us to build more integrated multi-tenant applications with greater access to the db by the tenants.
NO INHERIT constraints open up a fairly large area of PostgreSQL for use or misuse, because you can now partition a primary key between parent and child declaratively. My own use will be to prevent inserts into the parent table directly. Table inheritance is a really neat feature if you use it in non-traditional ways. It is not particularly useful for its advertised use (set/subset stuff outside of partitioning). However what it allows you to do is to compose your tables out of smaller re-usable pieces each of which can have complex derivations of data attached. For example, if you want full text search on comments on a lot of your tables, you can create a consistent, centralized interface for this using table inheritance.
Managing the forward/backward links in a series this long while it is still coming out is a pain. They will be added as the series draws to a close next week though.
>The first is SECURITY BARRIER and LEAKPROOF which gives us an ability to rethink how to multi-tenant applications
It doesn't really let us re-think it does it? It just closes a hole in how you would have typically done it anyways. Is there any way to solve the problem of having to create a new connection as company_X_user for every single request?
I think a number of people might find such functionality useful -- every once in a while I run across people with tens or hundreds of thousands of schemas, because a program written to query against a particular schema is more convincing and easy to audit than row-based multi-tenancy.
The problem is that there are performance and tooling issues when you have that many database objects, so those people can end up sad, even though the model was one they basically were happy with.
So I think adding database features for lucid multi-tenant work while reducing the risk of cross-tenant leakage is a movement in a good direction.
Personally, I'm excited about the range types and I can see immediate usefulness for them. My own applications aside, anything that helps developers create schemas that are better able to handle temporal data is a good thing.
Range types combined with exclusion constraints solve the problem of not allowing overlapping bookings using a database constraint. The solution is simple, clean, and flexible unlike the workarounds.
Example of adding such a constraint:
ALTER TABLE reservation ADD EXCLUDE USING gist (room WITH =, during WITH &&);
In this example a room cannot be double booked.
EDIT: This is a feature entirely unique to PostgreSQL.
For example: if you have a coupon that has a start and end time (but the end time is optional), typically, you'd have a start_at and end_at datetime columns. The query for checking for an active coupon would be:
select * from coupons
where start_at < now() and (end_at is null or end_at > now())
Now, you can have a single column that represents the range of time that the coupon is active.
select * from coupons where now() <@ duration;
Plus, exclusion contraints. So you could prevent the database from storing a coupon that was active at the same time as another coupon.
I used Navicat until I finally decided to try M-x sql-postgres in conjunction with a SQL Mode 'scratch' buffer from which I can simply C-c C-c.
I'm not sure about Sequel Pro, but being able to [reverse] incremental search on \d and \dt is a blessing.
If for some reason you want to eyeball sample data for a table with a ton of columns that won't fit in one screen, then a GUI's resizable columns are nice. But usually I just type of the columns of interest — us programmers are usually good typists.
It's a display feature in the "psql" client program.
Normal result:
column1 | column2 | column3
---------+---------+---------
1 | a | 9.9
2 | b | 19.9
(2 rows)
Using \x:
-[ RECORD 1 ]-
column1 | 1
column2 | a
column3 | 9.9
-[ RECORD 2 ]-
column1 | 2
column2 | b
column3 | 19.9
The first form is tabular and works well for a few columns; but doesn't work well when there are many columns, because the lines start to wrap. So you use \x for wide tables to make the result readable (but, obviously, fewer rows are shown at a time).
Using "\x auto" automatically chooses which format to use based on your terminal width.
\x in psql switches if things are like a table or a list. Before you'd have to run a query and see that you cant read it, then hit \x and then run the query again. Now you can just have it figure out which one to show you based on your terminal size
"PostgreSQL 9.2 will ship with native JSON support, covering indexes, replication and performance improvements, and many more features. We are eagerly awaiting this release and will make it available in Early Access as soon as it’s released by the PostgreSQL community," said Ines Sombra, Lead Data Engineer, Engine Yard.
One thing I love about PostgreSQL development is all the small nice fixes added in every version.
Of the small fixes in 9.1 my personal favorite is probably the cleanup of pg_stat_activity. There are also many other nice small fixes like improved tab completion for some commands and the ability to set environment variables in psql.
We are excited about Cascading Replication because it reduces network data transfer over WAN when we have multiple Read Replicas within and across datacenters.
- Allow libpq connection strings to have the format of a URI
- Add a JSON data type
- Allow the planner to generate custom plans for specific parameter values even when using prepared statements
- Add the SP-GiST (Space-Partitioned GiST) index access method
- Add support for range data types
- Cancel queries if clients get disconnected
- Add CONCURRENTLY option to DROP INDEX
- Add a tcn (triggered change notification) module to generate NOTIFY events on table changes
- Allow pg_stat_statements to aggregate similar queries via SQL
- text normalization. Users with applications that use non-parameterized SQL will now be able to monitor query performance without detailed log analysis.