Is PostgreSQL the Right Reporting Database for You?
PostgreSQL works with a variety of data and is highly customizable, but it doesn't handle Big Data as well as other options. See if it's right for your reporting needs.
Pour out a little liquor for the days of one-parent hierarchical databases.
We’ve come a long way.
These days, there’s no shortage of dynamic DBMS tools. But when talking open source, Postgres will undoubtedly dominate the conversation. It’s regarded by many as the best wrench in your database toolkit. Period.
Still, every business is different, and you have to choose the reporting systems that are right for you. So is Postgres the way to go when you’re after BI, analytical or operational reporting? Let’s delve into that.
Think of this as your guide — or rather cheat sheet — covering the basics of what to consider before going all in for PostgreSQL reporting.
(Are you team MySQL? Check out our all-points rundown on the pros and cons of using MySQL for analytical reporting.)
Start by taking a hard look at your 3Vs.
Variety: What Types of Data Are You Working With?
One of the many benefits of PostgreSQL is that it works with a huge variety of data, including some pretty handy advanced types like arrays, which enable quick, ad hoc aggregation in query results, and of course user-defined types, which let you build your own.
Both of these can be a BIG help in analytics and reporting, allowing you to swiftly retrieve relevant data.
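As a quick sketch of both features (the `orders` table and the composite type here are hypothetical, purely for illustration), `array_agg` rolls rows up into an array in a single query, and `CREATE TYPE` defines your own type:

```sql
-- Hypothetical schema for illustration.
CREATE TABLE orders (
    id       serial PRIMARY KEY,
    customer text,
    product  text,
    amount   numeric
);

-- array_agg collapses each customer's products into one array column:
-- a quick, ad hoc rollup with no extra tooling.
SELECT customer,
       array_agg(product) AS products,
       sum(amount)        AS total_spent
FROM orders
GROUP BY customer;

-- A user-defined composite type models a domain concept directly.
CREATE TYPE price AS (
    currency char(3),
    value    numeric(12, 2)
);
```

Report queries can then return exactly the shape of data the report needs, rather than flattening everything into extra rows.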
Postgres still falls a little short on media types like images and audio, which typically have to be shoehorned into bytea columns or large objects (currency, at least, is covered by the native money type), but richer support wouldn't be a surprise in a future release.
But if the majority of your data is qualitative stuff, such as text, image-heavy articles, and social media posts, think twice before relying solely on Postgres for your reporting needs. You could, for example, keep PostgreSQL as your database and use a free reporting tool that was built to handle qualitative data, such as Atlas.ti or QDA Miner Lite.
Quick Caveat: Regardless of your current variety types, this is an area to watch closely. As the number of data sources continues to grow across organizations of all shapes and sizes, variety will become more and more crucial for intelligent reporting.
Sound like you?
If your organization is poised for major growth across multiple data sources, you’ll want to set up a reporting platform that can quickly perform cross-database queries on a variety of data types, and easily scale as you grow. Sisense was built to handle fragmented datasets, and of course, we recommend it as a great option if you’re looking at high growth in the near future.
Live for today, plan for tomorrow.
Which brings us to our next, and possibly the most talked about V.
Volume: How Much Data Does Your BI Rely On?
Even before PostgreSQL 9.5, the DBMS was widely praised for its ability to handle extremely large datasets.
But there were clear limitations to the amount of data it could process before performance took a hit.
Those are largely problems of the past and Postgres deserves massive kudos for its big data upgrades. Its improved scalability makes it a great option for business analytics of all sizes.
Still, it’s worth remembering that at its core, Postgres is a relational database. It wasn’t initially built for big data.
So it’s understandable that issues still pop up, even with the latest features, including rapid table growth into the terabyte range, which can lead to other problems such as slower query performance and indexing.
And of course, bloat. Even routine maintenance tasks like VACUUM can take far longer than they should.
This is a small price to pay for the kind of benefits you get with Postgres, and these issues can usually be mitigated through careful data architecture, performance tuning, and table partitioning. And of course, there’s a fantastic community to help you handle any speedbumps.
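Table partitioning is the most approachable of those mitigations. A minimal sketch, assuming a hypothetical append-only `events` table and PostgreSQL 10 or later (which added declarative partitioning):

```sql
-- Split a large events table by month so queries and VACUUM
-- only touch the partitions they actually need.
CREATE TABLE events (
    id       bigserial,
    occurred timestamptz NOT NULL,
    payload  jsonb
) PARTITION BY RANGE (occurred);

CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- A query that filters on the partition key is pruned to
-- the matching partitions instead of scanning the whole table.
SELECT count(*)
FROM events
WHERE occurred >= '2024-01-15' AND occurred < '2024-02-01';
```

Partitioning also keeps per-partition indexes small, which helps with the indexing slowdowns mentioned above.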
But at the end of the day, optimization hacks take time and manpower — luxuries that not all organizations have.
If you’re like most users with moderate data usage, don’t even worry about it.
You should have no problem using Postgres for reporting and analytics. Postgres generally provides fast and stable query performance and can quickly access up to a terabyte on a single instance.
On the flipside, if you have a smaller (or growing) team in need of big data insights, Postgres isn’t a cure-all.
Even with the advanced big data features (including the much-needed BRIN indexes and abbreviated keys for faster sorts, as well as the TABLESAMPLE SQL clause for querying a random subset of a huge table without scanning it in full), companies with very high volume and velocity will still need to pair Postgres with Hadoop or a NoSQL store.
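Both features are one-liners in practice. A hedged sketch, again using hypothetical `events` and `orders` tables:

```sql
-- A BRIN index (PostgreSQL 9.5+) is tiny compared to a B-tree and works
-- well on naturally ordered columns, like an append-only timestamp.
CREATE INDEX events_occurred_brin ON events USING brin (occurred);

-- TABLESAMPLE reads a random subset of pages rather than the whole table:
-- here, roughly 1% of the rows, good enough for an approximate report.
SELECT avg(amount)
FROM orders TABLESAMPLE SYSTEM (1);
```

The trade-off is accuracy for speed: a sampled average is an estimate, which is usually fine for a dashboard trend line and not fine for a financial statement.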
If that sounds like you, you may want to skip the time spent troubleshooting and lighten the load with a BI tool like the Sisense Postgres Connector, which pulls the relevant data incrementally and then transforms, analyzes, and stores it on a third-party server of your choice, on-premises or in the cloud.
This is a great option if you’ve got large or fast-growing data.
The last thing your people want to be doing in the busiest of business times is waiting around for answers.
And speaking of time management, let’s talk…
Velocity: How Fast Does Your Business Move?
How fast will you need to generate, deliver and accumulate data? And through how many data sources?
If you have a need for speed, plus a variety of fast-growing data, you may end up outgrowing Postgres for your reporting needs. Not because it’s a bad tool. (Au contraire, it’s a phenomenal tool!)
But it was built to satisfy the needs of the masses.
Even if you have the time and budget to custom-configure both your data architecture and hardware to get maximum power out of PostgreSQL reporting, you’ll still want personalized reports and dashboards that cater to your specific business needs.
There is great power in visualization.
AnalyticsWeek’s Jelani Harper hits the nail on the head:
“Actually deriving insight from integrated data (including unstructured, semi-structured, and structured data) is largely attributed to data visualizations. Many solutions enable users to customize dashboards that reflect any number of sources, their integration, and metrics employees can intuitively configure. Truly competitive platforms offer these capabilities with existing dashboards pliable by conventional point and click methods. The result is less dependence on IT to create visualizations, expedient time to both insight and action, and meaningful integration of sources to truly inform decision-making.”
If you want the kind of quick, actionable insights that keep you ahead of the competition (and trust me, you do want that), you’re going to need a level of customization that can’t be found in off-the-shelf Postgres reporting.
The good news is there’s an abundance of great systems that can offer this kind of insight, all while working seamlessly with Postgres.
Let’s wrap up with a rundown of the pros and cons of using Postgres for reporting.
Pros:
- Works with a huge variety of data
- Fast and stable query performance
- Handles extremely large datasets with ease
- Most issues can be mitigated through custom configuration, performance tuning, or third-party software
- Excellent support community
Cons:
- Room for improvement with qualitative data management
- Can become slow when dealing with really big data
For many, the benefits of using Postgres for both database and reporting far outweigh the risks.
But you’re not ‘most’. You’re you.
So think about what you really need your business analytics to do, then set up your reporting to support that (not the other way around).
Published at DZone with permission of Gur Tirosh, DZone MVB. See the original article here.