Do You Trust Google BigQuery with Your Big Data?
Google has come up with a fantastic service to analyze large amounts of data. It’s called BigQuery, and it lets you run analyses on big data in the cloud. As expected, the tool has a superb, intuitive web UI. The query language is SQL-like (Hive, anyone?). Have a look at the BigQuery tutorial; it looks pretty neat. So all you need to do to run queries is upload your data to Google using the form shown below.
The form lets you upload a file or point to one already in Google’s cloud storage.
Now, the interesting question is this: to analyze data with BigQuery, how much of it are you willing to give Google? And how long will the upload take? The answer won’t be “Let me quickly upload a 500 GB file and run some queries”. That amount of data would definitely take some time to upload. So, effectively, this SaaS becomes pretty useless as the volume of data that needs to be uploaded for analysis grows.
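To put a rough number on "some time", here is a back-of-the-envelope sketch in Python. The uplink speeds are values I picked for illustration, not measurements, and the helper name upload_hours is made up for this post. It assumes a perfectly steady connection and ignores protocol overhead, compression and retries, so real uploads would likely take longer.

```python
def upload_hours(size_gb: float, uplink_mbps: float) -> float:
    """Hours needed to send size_gb gigabytes over an uplink of uplink_mbps megabits/s.

    Uses decimal units: 1 GB = 1000 MB, 1 byte = 8 bits.
    """
    size_megabits = size_gb * 1000 * 8
    seconds = size_megabits / uplink_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Assumed uplink speeds (Mbit/s), chosen only for illustration.
    for mbps in (10, 100, 1000):
        print(f"{mbps:>5} Mbit/s: {upload_hours(500, mbps):.1f} hours")
```

Under these assumptions, the 500 GB file needs roughly 111 hours at 10 Mbit/s, about 11 hours at 100 Mbit/s, and a little over an hour at 1 Gbit/s.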
Everyone trusts Google, so this concern might be easily ignored. But another potential problem I see is that you may end up violating your own privacy policies. The data you want to analyze often contains sensitive information such as user behavior patterns. How comfortable will your customers be if you hand that data over to Google? Even anonymizing the data might not save you from a potential legal breach.
I still believe setting up your own data analysis and monitoring platform is the best way to go. Thoughts? I’d love to hear them.