Oracle to PostgreSQL: Reasons to Migrate
PostgreSQL has quite a few compelling advantages, some of which are not found in any other database management system. This article will outline the ones that are most interesting to the Oracle developer seeking to do a migration or investigate alternatives for new development.
To prevent this article from becoming another list of PostgreSQL features, it will stick to the features that differentiate it from Oracle specifically, and the advantages that those differences create.
Federation vs. Foreign Data Wrappers
Oracle has a feature called Federation that allows tables from other databases to be treated as local data. PostgreSQL kicks this up to an 11 with foreign data wrappers. It is fantastically superior to the federation system, can be hooked to anything that remotely resembles data, and costs nothing in fees. This makes PostgreSQL a great ETL platform, even if all you use it for is the data ingestion.
plSQL vs. everything else
Oracle has a built-in programming language called plSQL. PostgreSQL has somewhere around 80 or so. Really. The extension system of PostgreSQL is used to create procedural languages as plug-ins, and there are bindings for just about any language you can think of. And if there isn’t, you can always follow along with Mark Wong for how to make a wrapper for your personal favorite.
Application programming
Oracle provides an application API to communicate with the database. PostgreSQL also provides an API for convenience and trusted languages. However, the product is open source, so there are no second class development citizens. You can access anything that PostgreSQL has to offer by including the header file in your project. Do whatever you want with it.
Internationalization and Localization
Oracle provides a globalization toolkit. PostgreSQL is built from the ground up to rely completely on well-known and widely compatible system services for localization. It can use any character encoding, collation and code page that the operating system provides.
Web Development
Oracle acknowledges the existence of HTML through HTML DB. PostgreSQL natively supports JSON, XML and plugs in Javascript as a back end coding language, which you may use at the same time as Java or any other front end language of your choice.
Authentication
Oracle has a built-in authentication system (a nod here to Proxy Auth, which is a bit more flexible). PostgreSQL relies on the Host Based Authentication and SASL protocols to plug in any authentication system that the host can support, and a few that plug directly into PostgreSQL. This provides an enormous array of possibilities for authentication along with the potential to offload the authentication process onto other machines.
Extensibility
Oracle has a plug-in system with mostly proprietary plug-ins. PostgreSQL has an extension system that is supported by the general community, with thousands of available plug-ins.
Read Scalability
Oracle has fairly good vertical read scalability. PostgreSQL can create a virtually unlimited read cluster. The number of nodes is limited only by the amount of resources you would like to throw at it.
Cost
It’s hard to beat “free”, and Oracle doesn’t even try hard. Let’s face it, Oracle is just ridiculously expensive, and they don’t mind charging you again for every single instance.
This is not a linear comparison, either. One of the biggest advantages of PostgreSQL is that you can have all of the instances that you want, with no additional cost. (I guess it is an additional cost of +$0.00, or multiply by $0.00, you decide).
It’s not fair (to PostgreSQL) to compare the cost of a single instance of Oracle to a single instance of PostgreSQL. Once you taste the freedom of free, it will be hard to go back to jamming everything into a single instance just to reduce costs.
How much should a temporary database cost that transforms data on the way to the warehouse? I think free is about enough. How about a transient database for reporting? I’ll take free for that as well. How about one that is the data ingestion point for ETL? Free is good. I like free. It’s very, uh…freeing.
Performance
No, wait! Hear me out on this one. I’ve already mentioned that PostgreSQL can create an unlimited number of nodes in a read cluster. That can dilute the cost of any particular read operation to near zero. But there’s also another way the PostgreSQL is more performant than Oracle.
Because PostgreSQL costs nothing per node, you can tune it differently for every work load. Of course, you can do this with Oracle also, but you’ll be paying per node for the ability to tune it this way. So, if you want to differentiate the tuning parameters for your warehouse to OLTP to reporting to the data lake, PostgreSQL makes this fairly easy, and still be easy on the wallet.
Of course there are a lot more compelling features in PostgreSQL, but I already wrote that article a few months ago. Let me know in the comments if you have migrated to PostgreSQL for any other reason.
Hi Kirk,
As you do a comparison and link to a list of PostgreSQL features, let me refine the name and description of the Oracle features you compare to, so that people can find them and do a fair comparison. I’m afraid they may not recognize the names and descriptions you provide, at least in current versions. As an example, nobody will get search hits for “Federation”, or “plSQL”, or “HTML DB”… in the Oracle documentation but they will find “Oracle Gateway”, “PL/SQL”, “APEX”…
# Federation vs. Foreign Data Wrappers
There is no feature called “Federation”.
The closest from your description is Database links and Heterogeneous Services through Database Gateway. They go further than FDW in many points. But anyway, I would never use that for ETL. ETL needs optimized bulk loads and there are other features for that (like External Tables to read files, and direct-path inserts to fast load). If your goal is to federate and distribute some small reference tables, then Materialized Views is the feature you may look for.
https://docs.oracle.com/en/database/oracle/oracle-database/20/heter/introduction.html#GUID-EC402025-0CC0-401F-AF93-888B8A3089FE
# plSQL vs. everything else
“Oracle has a built-in programming language called plSQL.”
PL/SQL is more than that. It is compiled (to pcode or native), manages dependencies (tracks dependencies on schema objects), optimized for data access (UDF can even be compiled to run within the SQL engine), can be multithreaded (Parallel Execution). That’s different from PL/pgSQL which is interpreted at execution time. You mention languages as “as plug-ins” and for this, there are other ways to run different languages (external procedures, OJCM, External Table preprocessor,…) but when it comes to performance, transaction control, dependency tracking,… that’s PL/SQL.
https://docs.oracle.com/en/database/oracle/oracle-database/20/lnpls/overview.html#GUID-17166AA4-14DC-48A6-BE92-3FC758DAA940
# Application programming
Providing an “API to communicate with the databasw” is not about open source as the main goal is: encapsulation and hide implementation details. In order to access internal structures, which is what you mention, Oracle provides relational views (known as V$ views) accessible with the most appropriate API for a relational database: SQL
https://docs.oracle.com/en/database/oracle/oracle-database/20/refrn/dynamic-performance-views.html#GUID-8C5690B0-DE10-4460-86DF-80111869CF4C
# Internationalization and Localization
The “globalization toolkit” is only one part of the globalization features. You can also use any “any character encoding, collation and code page” but not relying on the OS implementation of it makes it cross-platform compatible and OS upgrade compatible (see https://wiki.postgresql.org/wiki/Locale_data_changes)
https://docs.oracle.com/en/database/oracle/oracle-database/20/nlspg/overview-of-globalization-support.html#GUID-6DD587EE-6686-4802-9C08-124B495978D5
# Web Development
“Oracle acknowledges the existence of HTML through HTML DB. PostgreSQL natively supports JSON, XML and plugs in Javascript”. HTML DB can be found in paper books, but the name is “APEX” since 2006. And it is not (only) about HTML, JSON, or XML but is a low-code Rapid Application Development with no equivalent for other databases.
Support for the structures and languages you mention are all there. The latest trend being JSON: https://docs.oracle.com/en/database/oracle/oracle-database/20/adjsn/index.html
# Authentication
“Oracle has a built-in authentication system.”
Yes, to be platform-independent, and has many other External Authentication: https://docs.oracle.com/en/database/oracle/oracle-database/20/dbseg/configuring-authentication.html#GUID-BF8E5E84-FE7E-449C-8081-755BAA4CF8DB
# Extensibility
“Oracle has a plug-in system”. I don’t know what you are referring to. Oracle is multi-platform proprietary software. Commercial, which means with vendor supported. There are a lot of APIs for extensions, but the vendor must have to control what runs in the engine in order to provide support.
# Read Scalability
“PostgreSQL can create a virtually unlimited read cluster”. Oracle has active/active cluster (called RAC) and read replicas (called Active Data Guard). For horizontal scalability, you use the same as for vertical (Parallel Execution) across multiple nodes (in sync, with instance affinity on partitions,…)
https://docs.oracle.com/en/database/oracle/oracle-database/20/vldbg/parallel-exec-intro.html#GUID-F9A83EDB-42AD-4638-9A2E-F66FE09F2B43
# Cost
“they don’t mind charging you again for every single instance.”
No, that’s wrong, license metrics are on processors (CPU) or users (NUP). You run as many instances as you want on your licensed servers for your licensed users: https://www.oracle.com/a/ocom/docs/corporate/oracle-software-licensing-basics.pdf
“jamming everything into a single instance just to reduce costs”.
No, database consolidation is recommended to scale the management of multiple databases, but not for licensing costs. If you go there, there are a lot of features to allow isolation and data movement in consolidated databases: Multitenant, Resource Manager, Online Relocate, Lockdown Profiles,…
# Performance
“differentiate the tuning parameters for your warehouse to OLTP to reporting to the data lake”: I already mentioned the point about read replicas and about multiple instances in a server. But with oracle, all the parameters I want to set different for OLTP or reporting do not require another instance. They can be set at session or PDB level. As Oracle does not need the filesystem buffer cache, there’s no need to separate on different servers to avoid noisy neigbours.
I hope this helps to look further at the features. There are many reasons to migrate and the main one is the will to move from a commercial model (with license and support) to an open-source one (start with low cost, help from community). But the decision must be done on facts and not rumors.
Franck.
Thank you for adding constructively to the discussion Franck. I’m not sure that I agree with everything you say here, but you’ve clearly spent some time thinking about the issue.
> No, database consolidation is recommended to scale the management of multiple databases, but not for licensing costs.
Frankly I would argue with this. It looks to me that Oracle Corp. pushed itself into all this consolidation story and started delivering technology (like CDB/PDB/instance caging) that answers clients needs, however Oracle forced them into this corner in 1st place. However once you are free, you can operate as many PostgreSQL+Linux boxes/VMs as you wish – who cares?
There is no needed – driven by e.g. CPUs licensed by PostgreSQL engine – to consolidate anything via shared[noisy] hardware/hypervisors/LDOMs/LPARs/multiple DBs instances on same box[+instance caging]/PDBs/schemas or whatever else you may have out there especially when hardware is cheap those days (compared to Oracle DB licensing).
To the list of advantages of PostgreSQL I would add:
– PostGIS (vs Oracle Spatial)
– indexing: GiST, GIN
– smaller DB engine: MBs of runtime code instead of GBs of stuff (yum -y install time matters for devs),
– smaller DB engine but with extreme extensibility (extensions, storage API, FDW) that is going to be foundation for a lot of cool stuff in future for sure, with already existing gems like CitusDB on top of core PostgreSQL
– no uninstallable/unlinkable-but-still-licensed stuff waiting for some random dev to accidentally to enable (e.g. create table compression in the past, did they fix that already?)
– no LMS audits hitting their own bugs in dba_feature_usage_statistics 😉
– much better security record (and no mess like Oracle JVM/opatch on RAC/MAA envs, but well… with downtime :o)
The list of Oracle advantages over PostgreSQL, well that’s another post 😉
I was trying to make a bit more direct comparisons, so didn’t go into features for which there is no parallel. Additional indexes I missed. Good catch. What about PostGIS makes you think it’s better?
Reasons not to migrate:
Oracle is extremely well instrumented (Oracle Wait Interface). You get timed runtime statistics on everything, instance, session, statement. Postgres is not as well instrumented, nor does it have all such statistics on board. Performance analysis is therefore much more tedious.
Oracle’s multitenant provides completely independent databases that can be handled independently, physical backup & recovery, physical cloning, physical migration, etc. Postgres databases are not independent but just separate namespaces enabling only logical operations with pg_dump / pg_restore.
To be continued…
I freely admit complete bias in this article. It was not intended to defend the Oracle side of the argument or to present a truth table of Oracle vs. PostgreSQL at all levels. It simply points out the PostgreSQL direct advantages and leaves it at that.
“Postgres databases are not independent but just separate namespaces”
Postgres DBs are not namespaces, maybe you are talking about MS SQL server?
MS SQL ppl are criticizing PostgreSQL on that you cannot perform cross-db JOINS in postgresql (in MS SQL you can, by sacrificing security).
Oracle ppl are criticizing PostgreSQL for not physically separating databases in the same cluster.
Hmm something tells me PgSQL sitting in the middle does the right thing!
Or maybe he means MySQL, where “SCHEMA” is just an alias for “DATABASE”.
What he meant was true plug unplug of a PDB .Each PDB is self conatined or independent from its CDB or ROOT Container database. if I move a postgress database from one instance to another instance I have to recreate the users and roles in the instance I moved the database .