Understanding Self-Service BI in the Cloud: What Really Matters?
Self-service BI is an approach to data analytics that enables users to access corporate data without specialized tech skills. This kind of BI is having an iPod moment.
Cast your mind back to 2004: Britney’s on the radio, the rudimentary first version of Facebook has launched… and Sony has released its hotly tipped NW-HD1 Audio Player.
This piece of tech was supposed to be a game-changer. The idea of carrying your entire music library around with you in digital format sounded incredibly awesome back then. Who wouldn’t want a portable version they could search instantly and tap into anywhere rather than having to rifle through a physical collection they can only access when they’re on the same premises?
And yet the NW-HD1 was one of the biggest tech flops of the decade.
Firstly, it was extortionately expensive. Secondly, to use it, you first had to manually process every track using its awkward built-in software. Far from a convenient system providing unprecedented control and access to your files, it ate up a ton of time and left you with lower-quality versions of your original assets.
Did this mean that the idea of carrying a complete digitized music library around with you was doomed? Of course not. The same year, Apple perfected the design of the iPod. The years that followed brought us more MP3 players, smartphones, and, now, Spotify. Today, hassle-free, anywhere-access to high-quality music is something we take for granted.
Why am I telling you this?
Because cloud-hosted, self-service business intelligence is having an iPod moment.
In the early days of cloud Business Intelligence, many users wound up disappointed. Here they were, hoping to get unfettered, self-service access to their data and insights, without the physical infrastructure requirements that went with it. Instead, they got a cloud “solution” that demanded more time and effort than ever to make it work.
When Self-Service Cloud BI Isn’t Really Self-Service
The fact that a BI tool is cloud-hosted doesn’t automatically make it self-service. You still have to add, tweak, and manage data.
If your cloud BI solution uses traditional BI technology and architecture, it will demand countless service hours, a costly route to ROI, and a rigid data environment with little tolerance for change. This is a far cry from what many cloud BI vendors imply when they say “self-service.”
The Cloud Is Just a Location
There’s a world of difference between installing your BI software on a “virtual” computer on Amazon EC2 and getting a fully managed BI service in the cloud, even though both are often referred to as “cloud BI” or “BI in the cloud.”
The first is just a case of shifting where your data is stored: it’s a choice between the floor of your basement and the floor of Amazon’s.
The second option, a fully managed cloud BI service, is a much more involved, strategic decision about how you want to handle your BI needs. It needs the same careful consideration you would give to outsourcing your BI solution to a third-party, on-premises vendor.
Solve Problems, Don’t Shift Them
If you already use fully cloud-based self-service applications like Salesforce for CRM, Google Analytics for traffic analysis, or Zendesk for help desk management, you’ll have experienced true self-service cloud solutions and may be expecting the same from your cloud BI vendor as standard.
But traditional BI solutions have cumbersome data management and require end users to call customer support or bother their IT department every time they want to add data, alter a field, or change a data visualization on the interface of a report or dashboard.
When you shift a traditional system like this from on-premise to the cloud, you retain the same problems. The cloud alone won’t solve them.
Synchronicity and Integration Are Everything
BI software is best deployed to be as “close” as possible to the data that feeds it so that there is minimal overhead in transferring data from the sources to the BI software for analysis.
For example, if the source data is on the Amazon Cloud and the BI software is on Rackspace, that data would need to be transferred from Amazon to Rackspace. Similarly, if the data is on-premises and BI is installed in the cloud, the source data would need to be uploaded to the cloud first.
For some companies dealing with very big on-premise datasets, hosting and sending all that data over the web can get prohibitively expensive, fast. And, of course, you might be obliged to keep this data firmly onsite for unavoidable security or regulatory reasons — especially if you’re a financial or healthcare organization.
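To get a feel for the overhead described above, here is a rough back-of-the-envelope sketch of how long it takes to push a dataset from on-premises storage to the cloud. The function name `transfer_hours` and the `efficiency` factor (the share of nominal bandwidth you actually get) are illustrative assumptions, not figures from any vendor, so plug in your own numbers.

```python
def transfer_hours(dataset_gb: float, bandwidth_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move a dataset over a network link.

    dataset_gb     -- dataset size in decimal gigabytes (1 GB = 8,000 megabits)
    bandwidth_mbps -- nominal link speed in megabits per second
    efficiency     -- assumed fraction of nominal bandwidth actually achieved
    """
    if dataset_gb < 0:
        raise ValueError("dataset_gb must be non-negative")
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    megabits = dataset_gb * 8 * 1000                    # gigabytes -> megabits
    seconds = megabits / (bandwidth_mbps * efficiency)  # effective throughput
    return seconds / 3600                               # seconds -> hours


if __name__ == "__main__":
    # Hypothetical scenario: a 5 TB warehouse over a 100 Mbps uplink.
    hours = transfer_hours(dataset_gb=5000, bandwidth_mbps=100)
    print(f"Estimated transfer time: {hours:.1f} hours")
```

Even under these optimistic assumptions, an initial load of that size takes days rather than hours, and that is before any per-gigabyte transfer charges your provider may apply.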
Either way, unless your cloud BI system has a watertight synchronization process in place, keeping this data in sync while continuously introducing new datasets and sources means you’re always playing catch-up. And that’s before you’ve even started dealing with challenges like data warehousing, data modeling, query formulation, and data visualization.
Disjointed Apps Make Things Worse
Salesforce and Google Analytics are true self-service applications partly because the same application is used for data entry, administration, and operation. This means Salesforce/Google control (and can pre-engineer) the entire data architecture, from how data is stored to what the user can do with it.
BI software, on the other hand, doesn’t generate new data; it draws data in from other data landscapes. If this data is kept separately from the BI software itself, it may have been generated by many different applications, in countless different formats, and stored in a variety of locations.
This is awkward enough on-premise, but when you transfer those problems to the cloud they get a whole lot more complicated.
Suffice it to say, you need a coherent, integrated, flexible, and preferably single-stack BI solution in place that brings all these strands together and gives you control over how you combine and manipulate your data. You need that on-premises, but you need it even more in the cloud.
The Golden Rule
If a BI solution is not self-service on-premises, it won’t be self-service in the cloud either.
This means that the first essential thing to consider is not whether the BI system is cloud-hosted or deployed on-premises, but whether the BI software at hand is genuinely self-service. And that goes for fully managed BI services, too.
Why Opt for Self-Service BI in the Cloud?
Like Sony’s NW-HD1 player, previous iterations of Cloud BI overstated their ability to solve the problems of users stuck with a lackluster on-premise solution.
But, like the iPod, with the right approach, cloud-hosted BI is still a great idea.
After all, it takes enormous pressure off your IT team to manage the fundamental infrastructure, while giving you easy, reliable ways to back up data offsite. You should see improved uptime, as your provider is responsible for ensuring the server is available around the clock. And, of course, as many organizations move to cloud-first, your IT team might be unable or unwilling to deal with very involved, on-premises software, full stop.
How Cloud BI Can Mean More Self-Service
The right Cloud BI platforms can give you more self-service overall.
Vendors who streamline deployment of their BI environment in the cloud should handle version upgrades, making sure you always have access to the newest features and innovations, as well as regularly updated data and software integrations.
For example, the underlying technology used in Sisense’s proprietary In-Chip™ engine lets you run any ad-hoc query and receive answers on the spot, without the need to prepare data in advance for each new question. This is because Sisense makes effective use of CPU cache memory and processes and prepares data only when a new query is made, allowing concurrent, ad-hoc queries to return results in seconds.
Using this type of holistic cloud service can also mean better performance and scalability of data volumes and sources, as you aren’t as restricted by the capacity of the on-premise hardware.
3 Tips for Choosing a BI Cloud Solution
If the cloud is calling you, here are a few things to watch out for:
- Avoid vendors that nudge you towards using their own implementers or partner networks. The important thing is having the tools you need to be self-sufficient, not relying on a new intermediary for the sake of being in the Cloud.
- Beware platforms that focus on fancy front-end features. You need a workable self-service solution for simplifying data preparation more than you need flashy visualizations or dashboards that limit what you can do with the data that drives them.
- Make sure the security is up to scratch: no sharing of client data between systems and environments, use of industry-leading cloud computing providers such as AWS, options for connecting to your chosen databases over the internet or through an SSH tunnel, and secure admin access via VPN with two-factor authentication.
Lastly, take advantage of free trials and watch out for vendors that demand payment upfront for a proof of concept on your data. You need to know that this system works for you, and that means taking the time to test it without pressure or financial obligation. If it’s too costly or time-consuming for a vendor to offer a free proof of concept, well, their tool is probably going to take you a ton of time and investment as well.
Remember the golden rule: What matters most is that these are genuinely self-service BI tools, regardless of whether you host them in the cloud!
Published at DZone with permission of Aviad Harell , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.