I doubt many people can tell you exactly when they first read a map. Mine was memorable though. Circa 3rd grade, I went through the usual battery of standardized tests for the first time, which included map reading. I did pretty badly, which was odd because it was the only section I bombed like that. Concerned that perhaps I had some sort of learning problem related to spatial data or visualization, a guidance counselor reviewing my scores quizzed me about that section and what I thought of it. I told her I thought it was pretty neat, and that I was looking forward to learning about these “maps” one day. Turns out, due to a school change and differences in class order between schools, I had never been shown one before the exam. For someone who had to deduce what the symbols meant during the test, suddenly my scores didn’t look too bad.
It’s easy to feel like a completely disoriented newbie to spatial information when trying to learn how to use PostGIS, the popular PostgreSQL extension that adds support for all sorts of map-related features. Geographic information systems (GIS) are filled with their own special terminology and techniques. To help navigate this maze (literally sometimes!), Regina Obe and Leo Hsu have recently released PostGIS In Action, a whopping 492 pages of nothing but information on this specialized topic.
The book aims to be a comprehensive resource for three groups: GIS practitioners, database practitioners, and scientists/researchers/etc. To the extent that it’s possible to do so, the book tries to write from each of these perspectives. So you get an introduction to GIS terminology, an introduction to SQL, and an introduction to installing the software and making everything fit together. Not every section will be useful to every type of reader, but there are enough handy tips sprinkled throughout that you might pick up a useful trick even on material you already know well. For example, in the performance tuning section, which I mainly breezed through, I did pick up some useful windowing function and Common Table Expression ideas, ones that are useful even beyond the GIS context.
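For anyone who hasn’t run into those two features, here is a minimal sketch of the general pattern. It is not taken from the book: the table, the sample rows, and the `closest_per_city` helper are all made up for illustration, and it runs against Python’s built-in `sqlite3` module (assuming a SQLite build of 3.25 or later, which window functions need) rather than PostgreSQL, so it works without a server. The query keeps the two closest addresses in each city.

```python
import sqlite3

# Hypothetical sample data: (city, address, distance in km to some point of interest).
ROWS = [
    ("Baltimore", "100 Main St", 2.5),
    ("Baltimore", "22 Oak Ave", 0.8),
    ("Baltimore", "9 Pier Rd", 4.1),
    ("Boston", "5 Elm St", 1.2),
    ("Boston", "77 Hill Rd", 3.3),
]

QUERY = """
WITH ranked AS (                       -- the Common Table Expression
    SELECT city,
           address,
           km,
           ROW_NUMBER() OVER (         -- the window function
               PARTITION BY city ORDER BY km
           ) AS rank_in_city
    FROM addresses
)
SELECT city, address, km
FROM ranked
WHERE rank_in_city <= 2
ORDER BY city, km
"""


def closest_per_city():
    """Return the two closest addresses per city from the sample rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE addresses (city TEXT, address TEXT, km REAL)")
        conn.executemany("INSERT INTO addresses VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in closest_per_city():
        print(row)
```

The CTE gives the ranked rows a name so the outer query can filter on the rank, which a plain `WHERE` clause cannot do against a window function directly. The same `WITH ... ROW_NUMBER() OVER (PARTITION BY ...)` shape is standard SQL and should carry over to PostgreSQL 8.4 and later, with a real distance calculation in place of the made-up `km` column.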
I like to kick off working with new technology by picking a real-world project and seeing how far I can get with it. I’d tried this with PostGIS once before, about a year ago, and failed miserably. The project involves a long list of addresses that I wanted to transform into spatial data, then analyze using spatial queries. The process of turning addresses into coordinates, called geocoding, can be done for the US using a public data set named TIGER. During that earlier attempt, though, I couldn’t work out which versions of each component I needed to get that working, and gave up on the whole thing. Reading through that section of PostGIS In Action, I felt a little better. I hadn’t simply been confused by the complexity; it really is that much of a pain to figure out! Quote from the book:
The TIGER geocoder packaged with PostGIS 1.5 and below doesn’t handle the new U.S. Census data ESRI shapefile format. For those, therefore, we’re using a newer version currently under development by Stephen Frost…for our exercises we’ve taken [this] newer version and made some minor corrections to support the TIGER Census 2009 data.
This sort of thing is where the book is at its best. Advice about which versions of which software work together, and helper scripts unique to the book to aid in some of the complicated parts, can skip you past days of frustrating work.
The book mainly aims at PostgreSQL 8.4 and 9.0, but there’s material going back to 8.2 and some previews of coming features in 9.1. While the server-side tools covered run on the most common PostgreSQL operating systems (Windows/Linux/Mac OS X), it’s obvious that Windows is the preferred platform for many of the client GIS tools. Accordingly, it’s no surprise that the recommendations for PostgreSQL are biased toward using the one-click installers, rather than getting dragged too deeply into the trivia of software building and installation.
What PostGIS in Action does in many places, though, is refer to web resources for the things it skims over, which is commendable. Even a book of this length can’t cover everything about every possible platform, and having an author point out the best articles available is a helpful way to extend its reach. In the sections I know enough about to comment on, the recommended additional reading often consisted of articles I’d already read and found useful. The main omission was in the somewhat slim coverage of useful postgresql.conf settings to improve performance, which could have used a link to the Tuning Your PostgreSQL Server page; that page covers some of the same material in more detail. The wiki that hosts it is one of the main additional resources suggested at the end though.
With all the specialized terminology and multiple skill sets required to work through this material, finding the right sequence to read this book in is challenging. Putting things into the best order for learning the material is the area I think could be improved the most in a future edition of this title. To pick a trivial example, but one that’s characteristic of what I saw in multiple places, the order of things in the “SQL primer” chapter was rather strange. The first section covers how to use the information_schema to navigate column metadata. How that section ended up at the very beginning, before even covering what SELECT means, I have no idea. In the few cases like this that I spotted, the information needed is all there; you just need to read it in a different order than it’s presented. Readers may find it’s worth skimming a whole chapter to get an idea of how it flows if things don’t seem to fit together easily. Don’t be afraid to skip around if the info you need looks like it’s covered better in other sections.
My first pass through PostGIS In Action left me much more comfortable with the big picture of how applications built using these tools fit together. And I expect to refer back to it for both its introduction to specific programs and its useful sample code. Trying to be a complete reference for all of the targets this title aims at is very tough though.
GIS practitioners and scientists who don’t already have much SQL and/or database experience will likely need the most additional information beyond what this book covers in order to become completely functional PostGIS users. But intros to SQL are easy to find; discussions of GIS aimed at the database practitioner, which is what I’ve been looking for, are rare. So far I’ve spent the most time with the terminology introduction in the first two chapters, plus the material on using TIGER that I mentioned. And I already feel like my copy of PostGIS In Action was a worthwhile purchase. It’s great to finally have a full-size book on this very important PostgreSQL-based technology.