This week’s flame war on the pgsql-performance list once again revolves around the fact that PostgreSQL doesn’t have the traditional hint syntax available in other databases. There are a mix of technical and pragmatic reasons behind why that is:
- Introducing hints is a common source of later problems, because fixing a query place once in a special case isn’t a very robust approach. As your data set grows, and possibly changes distribution as well, the idea you hinted toward when it was small can become an increasingly bad idea.
- Adding a useful hint interface would complicate the optimizer code, which is difficult enough to maintain as it is. Part of the reason PostgreSQL works as well as it does running queries is because feel-good code (“we can check off hinting on our vendor comparison feature list!”) that doesn’t actually pay for itself, in terms of making the database better enough to justify its continued maintenance, is rejected by policy. If it doesn’t work, it won’t get added. And when evaluated objectively, hints are on average a problem rather than a solution.
- The sort of problems that hints work can be optimizer bugs. The PostgreSQL community responds to true bugs in the optimizer faster than anyone else in the industry. Ask around and you don’t have to meet many PostgreSQL users before finding one who has reported a bug and watched it get fixed by the next day.
Now, the main completely valid response to finding out hints are missing, normally from DBAs who are used to them, is “well how do I handle an optimizer bug when I do run into it?” Like all tech work nowadays, there’s usually huge pressure to get the quickest possible fix when a bad query problem pops up.
If PostgreSQL didn’t have some ways to deal with that situation, there would be no serious production PostgreSQL databases. The difference is that the things you adjust in this database are more rooted in influencing the decisions the optimizer already makes in a fairly subtle way, rather than just you telling it what to do. These are hints in the literal sense of the word, they just don’t have the some user interface for hinting that users of other databases new to PostgreSQL go looking for.
With that in mind, let’s take a look at what you can do in PostgreSQL to work around bad query plans and optimizer bugs, particularly the things that many people seem to think can only be solved with hints:
- join_collapse_limit: This adjusts how much flexibility the optimizer has to reorder joins of multiple tables. Normally it tries every possible combination when joins can be rearranged (which is most of the time, unless you’re using an outer join). Lowering join_collapse_limit, perhaps even to 1, removes some or all of this flexibility. With it set to 1, you’ll get the joins in the order you wrote them in, period. Planning large numbers of joins is one of the hardest things for the optimizer to do; each join magnifies errors in estimates, and increases query planning time. If the underlying nature of your data makes it obvious what order joins should happen, and you don’t expect that to ever change, once you figure out the right order you can lock it down using this parameter.
- random_page_cost: Defaulting to 4.0, this parameter sets how expensive seeking to disk to find a random page on disk is, relative to a reference value of 1.0. Now, in reality, if you measure the ratio of random to sequential I/O on regular hard drives, you’ll find this number is closer to 50. So why 4.0? First, because it’s worked out better than larger values in community testing. Second, in many cases index data in particular will be cached in memory, making the effective cost of reading those values lower. If, for example, your index is 90% cached in RAM, that means that 10% of the time you’ll do the operation that is 50X as expensive; that would make your effective random_page_cost about 5. This sort of real-world situation is why the default makes sense where it’s at. I normally see popular indexes get >95% cache in memory. If your index is actually much more likely than that to all be in RAM, reducing random_page_cost all the way down to just above 1.0 can be a reasonable choice, to reflect that it’s no more expensive than any other read. At the same time, random seeks on a really busy systems can be much more expensive than the expectation you’d have from just looking at single-user simulations. I’ve had to set random_page_cost as high as 60 to get the database to stop using indexes when the planner was mis-estimating how expensive they would be. Typically that situation comes from a sensitivity estimate error on the planner’s part–if you are scanning more than about 20% of a table, the planner knows using a Sequential Scan will be much more efficient than an Index Scan. The ugly situation where I had to force that behavior to happen much earlier than that came about when the planner expected 1% of the rows to be returned, but it was actually closer to 15%.
- work_mem: Adjusts how much memory is available to queries doing sorting, hashing, and similar memory-based operations. This is only a rough guideline to queries, not a hard limit, and a single client can end up using multiples of work_mem when running a query. Accordingly, you need to be careful not to set this value too high in the postgresql.conf file. What you can do instead, though, it set it before running a query that really benefits from having extra memory to hold sorting or hash data. You can sometimes find these queries from logging slow ones using log_min_duration_statement. You can also find them by turning on log_temp_files, which will log every time work_mem is too small, and therefore sorting operations spill to disk instead of happening in memory.
- OFFSET 0: PostgreSQL will rearrange subqueries into the form of a join, so it can then use the regular join order logic to optimize it. In some cases, that decision can be a really bad one, as the sort of thing people tend to write as subqueries seem a bit more difficult to estimate for some reason (I say that based on the number of such troublesome queries I see). One sneaky trick you can do to prevent this logic is put OFFSET 0 onto the end of the subquery. This doesn’t change the results any, but inserting the type of Limit query node used to execute OFFSET will prevent rearrangement. The subquery will then always execute in the way most people expect it to–as its own isolated query node.
- enable_seqscan, enable_indexscan, enable_bitmapscan: Turning off one of these features for looking up rows in a table is a fairly large hammer to highly recommend avoiding that type of scan (not always preventing it–if there’s no way to execute your plan but a seqscan, you’ll get a seqscan even if the parameters is turned off). The main thing I recommend these for is not to fix queries, it’s to experiement with EXPLAIN and see just why the other type of scan was preferred.
- enable_nestloop, enable_hashjoin, enable_mergejoin: If you suspect your problem is the type of join being used rather than how the tables are being read, try turning off the type you are seeing in your plan using one of these parameters, then run EXPLAIN again. Errors in sensitivity estimates can easily make a join seem more or less efficient than it really is. And, again, seeing how the plan changes with the current join method disabled can be very informative for why it decided on that one in the first place.
- enable_hashagg, enable_material: These features are relatively new to PostgreSQL. Using Hash Aggregation aggressively was introduced in version 8.4, and more aggressive materialization in 9.0. If you are seeing those types of nodes in your EXPLAIN
output and they seem to be doing something wrong, because this code is so much newer it’s a bit more likely to have a limitation or bug than some of the older features. If you had a plan that worked fine in older versions of PostgreSQL, but uses one of these node types and seems to perform much worse as a result, disabling these features can sometimes return you to the earlier behavior–as well as shine some light on why the optimizer did the wrong thing as useful feedback. Note that this is generally the way more advanced features tend to get introduced into PostgreSQL: with the option to turn it off for troubleshooting purposes, if there proves to be a plan regression relative to how earlier versions executed things.
- cursor_tuple_fraction: If you don’t intend to read all of the rows back from a query, you should be using a cursor to implement that. In that case, the optimizer tries to prioritize whether it gives you the first row back quickly, or whether it prefers to optimize the whole query, based on this parameter. By default the database assumes that you’ll be reading 10% of the query back again when you use a cursor. Adjusting this parameter lets you bias it toward expecting you to read less or more than that.
All of these parameters and query tweaks should be consider triage adjustments. You don’t want to run with these in place forever (except perhaps for join_collapse_limit). You use them to get out of a jam, and then hopefully you’ll figure out what the real underlying cause of the bad plan–bad statistics, optimizer limitation/bug, or something else–and then address the problem from that direction. The more you are pushing optimizer behavior in a direction, the more exposed you are to future changes in your data making that push no longer a correct one. If you use them right, as a way to study why you got the wrong plan (the approach I used in the query optimization chapter of PostgreSQL 9.0 High Performance), the way you hint at things in PostgreSQL should result in you leaving every run-in with bad optimizer behavior a little more saavy about how to avoid that class of problem in the future