A few months back, Shine’s Pablo Caif and Graham Polley were welcomed into the Google Developer Expert (GDE) program as a result of their recent work at Telstra. The projects they are working on consist of building bleeding edge big data solutions using tools like BigQuery and Cloud Dataflow on the Google Cloud Platform (GCP). You can read all about that here.
GDE acceptance comes with many benefits and privileges, one of which is a trip to the annual private summit, held at a different location each year. With Google footing the bill, all the GDEs (around 250 currently) are brought together from around the globe for, let’s admit it, a complete Google geek-out fest for 2 days!
This year the summit was at the Googleplex in Mountain View. Needless to say, Pablo and Graham were chomping at the bit to go. However, Google also invited them to fly out a few days ahead of the summit, having lined up a few other things especially for the guys. So this was no ordinary trip. Lucky buggers!
We asked both guys to give their individual feedback on the trip, and here’s what they had to say about it. Read on if you want to hear about how the guys spent six days hanging out with Google in America.
Here’s what the packed agenda looked like:
- Fly to Seattle to meet the Dataflow engineering team
- Hook up with the BigQuery & Dremel teams in Kirkland
- Meet another part of the Dataflow engineering team back at Mountain View
- Attend the two day GDE private summit
- Give presentations at the 2015 Silicon Valley Devfest
Note: Both Graham and Pablo are under strict NDAs with Google. Therefore, most of what was discussed with Google during their visit cannot be made public. So, please bear that in mind as you read this blog post.
Day 1: Meeting the “Dataflowers”
The first day of our Google trip was probably the toughest, and the most exciting at the same time! After a 5am hotel wakeup call, we flew up to Seattle from San Francisco on a 7am flight. From the airport, we took a taxi straight to the Google offices in downtown Seattle to meet Rafael Fernandez.
Rafael is the Technical Program Manager for Dataflow, and he was the man charged with looking after us during our visit. Big shout out to him and the teams right now for showing us such great hospitality!
With some serious jet lag in tow, we hooked up with Rafael at the office where he introduced us to what he referred to as “the Dataflowers!” The Dataflow team were keen to get down to business, so we started by explaining to them how we use the Dataflow service. They were interested to hear our use case, and this led us into some great technical discussions.
The technical discussions were of great value to us, and the team were all very curious to hear about what we are building. Frances Perry, who is considered the “authority” on big data analytics at Google, was kind enough to perform a deep dive technical session with us. We walked her through our design and code. She then made some great suggestions on how we could get more performance out of our pipelines by using some of the other Dataflow programming model techniques. This was already a great start to the trip! Having Frances review our code was simply awesome.
The sessions were also of benefit to the Dataflow team. They were able to see how we use (or misuse) the service, and we provided them with feedback on some pain points we had with the tool. After about four hours of sitting with the team, having great discussions, and getting some interesting insights into how the service works under the hood, it was time to pack up and leave, but not before Frances offered to review my slide deck for the talk I was giving on Dataflow at the 2015 Silicon Valley Devfest. Could it get any better than this?
Day 2: BigQuery & Dremel Heaven
BigQuery is really where this whole Google journey began. It is the publicly available version of Dremel, a technology used internally at Google. We’ve been using it for about two years now, and it was this blog that was the catalyst for the relationship we have built with Google since then. Personally, I’m a massive fan of this technology, and I love nothing more than showing other people just how powerful it is. In fact, the best part is when they see it smash billions of rows in seconds! So, day two was probably the highlight of the trip for me, all because we got to hook up with the BigQuery and Dremel teams.
For this meeting, we whizzed across the lake in Seattle to the Kirkland offices. This is where the BigQuery teams are located. We were a little early, so our host Rafael decided to kill some time by bringing us to quickly say hello to the Dataproc team. Turns out they are also based in Kirkland. Dataproc is the latest product that has been added to Google’s big data product fleet, and it enables you to spin up Hadoop/Spark clusters in seconds without having to worry about the infrastructure at all. Sound familiar? If it does, then you’ve probably read my blog on Dataproc here, in which I put it to the test and spun up a cluster on my 17 minute train commute. It passed with flying colours!
After saying hello to the Dataproc team, we jumped into a meeting room with the brains behind BigQuery. We were also asked to give them a quick presentation on how we use it in our projects, what we like/dislike about it, and what features we’d like to see in future releases. For me at least, BigQuery is doing an outstanding job for us already, so telling the actual guys who built it "oh, we don’t like this" or "it needs to have this" was a bit daunting to be honest. But the engineers assured us that this was exactly the type of feedback they like to get: straight from the folks who are using the product.
I think our presentation was well received by the team. In hindsight, there are a lot more things we’d like to see in BigQuery (e.g. row updates), but of course we blanked on some on the day! Nevertheless, the team said they loved seeing how the product they develop is being used by customers in the real world. They also gave us some great hints on improving our current workflows, like using UDFs to perform some of our mapping.
But the icing on the cake, just as had happened with Pablo, was when I got to sit with one of the main guys for BigQuery, Seth Hollyman, and he cast his eye over the slides for my upcoming presentation at the Googleplex. Fan-bloody-tastic! My deck had just been reviewed by a BigQuery engineer. Wow!
Day 3: Dataflowers Redux
After meeting the Dataflow and BigQuery teams in Seattle, we jumped on a plane back down to San Francisco, and made our way to Mountain View to meet another part of the Dataflow team. The team is split across two locations, Seattle and Mountain View. We teamed up again with Rafael, and he brought us around the office and introduced us to some of the other team members.
We all then went for lunch at the famous Googleplex cafeteria! The Googleplex has a huge all you can eat cafeteria, but it also has restaurants attached to it for a more formal affair. It should go without saying that the restaurants are also free to all Google employees!
The Dataflow team had booked a table at an amazing Indian restaurant. It was sublime. We all sat around and chatted, in particular about the solution we were building using Dataflow. This part of the team was also keen on hearing how we use it and had some questions on what improvements they could make. Yet again, this was a great opportunity for us to probe the team and get insider tips on using the service that Google offers.
Day 4: GDE Summit Part One
The first day of the GDE summit was, in my opinion, the better of the two. We started off the morning with a keynote from Jason Titus (VP of Developer Product Groups). He was quick to reassure us that Google is 100% committed to GCP, and to gaining more traction outside the US. That’s good news for us.
I remember being at a Google Cloud Live event last year when Urs Hölzle stated that Google’s cloud revenue will overtake their advertising revenue stream within five years. That was a bold statement to make, and I’ve yet to see GCP getting a cut of the lion’s share of cloud computing overseas.
Take Australia for example. Why is Amazon scooping up all the enterprise business here, and Google is not? Simple: Amazon has a data center in Australia. Google does not. And that’s a problem for a lot of companies that hold sensitive PII which cannot be stored offshore for legislative or legal reasons. For GCP to start making inroads in this region they need a data center. Whether it will happen or not remains to be seen, but Google cannot ignore this fact any longer.
With the keynote over, and the GDEs refueled with coffee and a tasty lunch, we all split up into our respective expert categories for the breakout talks. Pablo and I are experts on GCP, so we headed to the Cloud breakout session.
We started with a talk on the Big Data product suite i.e. BigQuery, Dataflow, Dataproc & Datalab. This was given by William Vambenepe (Lead PM for Big Data, GCP). In my opinion, it was the most interesting and valuable talk that I attended. We were given some good insights into where Google wants to take these products and also some new features coming up in each of them.
Following on from William’s talk, we had a session on Machine Learning (ML). Just before the summit, Google announced they were open sourcing TensorFlow, the machine learning library that powers ML in many of their own products. This was really hot news and had a lot of people excited. In the ML session we were lucky to have one of the lead engineers give us a live demo. It was impressive to say the least.
We wrapped up the cloud breakout session with two more talks. One on Cloud SQL, and the last on Containers (i.e. Docker) and Kubernetes. Kubernetes is Google’s orchestration tool for managing a cluster of Docker containers. It’s open source, and there was a real buzz about this one too! To be honest, I haven’t yet had a chance to play with either Docker or Kubernetes, but after attending the talk and seeing how powerful this stuff is, let’s just say my weekend looks busy!
Throughout the day, we were encouraged to give direct feedback to the presenters on whichever product they were presenting about. This was a great chance, as users, to tell Google directly what works, what doesn’t, and what needs to be improved. And this is one of the main reasons Google has this GDE program. They really value and listen to the feedback from GDEs.
Day 5: GDE Summit Part Two
Day two of the summit shifted from the technical side of things to something quite different.
One of the responsibilities of being a GDE is to give technical talks and large conference presentations. This is a mutually beneficial relationship between Google and the GDEs. They have experts evangelize and talk about their products, and at the same time the GDE is at the forefront of public exposure. In addition, Shine is also recognized through having employees speak at public events. It’s a win-win!
The first order of the day was a session on ‘Improving your public speaking’ by Elizabeth Padilla and Martin Omander. They had some really useful nuggets of information in this talk. For instance, they suggested performing warmup exercises before your talk and getting your brain firing on all cylinders prior to getting on stage. This session actually went on a lot longer than expected because everyone enjoyed it so much and asked heaps of questions at the end. Having done lots of talks myself, I found this talk really valuable.
The last order of the day, and of the summit, was a three hour team building and networking session. We were split into groups of 10, with an array of different skill sets in each group. Our group was made up of GCP, Android, Angular, and UX experts. We were then assigned a word. Our word was “care”. The challenge was to come up with a product using Google technologies within an hour and then pitch it in front of the judges. The judges were all Google employees. The winners got some cool prizes like the new Chromecast.
Our idea was simple: an Android app that was the opposite of the Facebook like! It was called “Who cares!”. More tongue in cheek fun than anything else, the app was very elegant and simple in its design. You had a feed, but instead of liking posts you “I don’t care” them! I can’t remember which technologies we used in the design of the app, but that’s not the point! It was all about how to work in a team across a multitude of skill sets. Plus, it was just good fun. In the end we won the prize for most original idea!
Day 6: Presenting at the Googleplex
It’s not every day that you get to attend a conference at the Googleplex, let alone present at one! But that was just the case for Pablo and me. We were both accepted to give talks at the 2015 Silicon Valley Devfest, which is essentially the largest gathering of Google developers in the world. I was accepted to give a talk on BigQuery, and Pablo on Dataflow. We were excited!
Pablo was up first, and aside from some audio issues at the start (the batteries on his mic died 2 minutes in), he delivered a cracking talk in front of around 200 people. Dataflow is a fairly new product for Google, and whenever Pablo shows it off it gets a lot of interest from the audience. It can be used to perform massively parallel ETL in the cloud, or to run complex aggregations and analysis over large datasets. Think of it like Hadoop except on steroids, and Google will manage all the infrastructure for you too. Nice!
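To give a feel for the shape of such a job, here is a minimal plain-Python sketch of an extract, transform, and aggregate pipeline. It does not use the Dataflow SDK at all: the record format, the sample data, and the function names are made up purely for illustration, and everything runs on a single machine, whereas a real Dataflow pipeline would spread each step across many workers.

```python
from collections import defaultdict

# Hypothetical raw input: "date,city,value" lines, one of them malformed.
RAW_EVENTS = [
    "2015-11-01,sydney,120",
    "2015-11-01,melbourne,80",
    "bad line",
    "2015-11-02,sydney,200",
]


def parse(line):
    """Transform: turn a CSV line into (city, value), or None if malformed."""
    parts = line.split(",")
    if len(parts) != 3 or not parts[2].isdigit():
        return None
    return parts[1], int(parts[2])


def run_pipeline(lines):
    """Extract -> transform -> group by key -> aggregate (sum per city)."""
    totals = defaultdict(int)
    # filter(None, ...) drops the lines that failed to parse.
    for city, value in filter(None, map(parse, lines)):
        totals[city] += value
    return dict(totals)


print(run_pipeline(RAW_EVENTS))
```

The point of a managed service is that the parse and sum steps stay about this simple while the service takes care of distributing them.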
You can check out Pablo’s slide deck here.
A few talks later it was my turn to present. As I was setting up, the conference organizer asked me if I could cut my talk from the allocated 20 minutes to just 10 minutes, as the sessions were running way over. I hesitantly agreed, and my talk was spontaneously delivered as a lightning talk! Even though I had prepared for a 20 minute talk, this was a lot of fun. I raced through my presentation, while still dropping little nuggets of information that the audience seemed to respond really well to.
But, without a shadow of a doubt, the pièce de résistance was the live demo I performed at the end. Big shout out to Jordan Tigani for the inspiration. I started by querying 1 million rows. BigQuery smashed it in 2 seconds. Then I asked the audience if we should go bigger. They were keen, so I queried 1 billion rows. BigQuery smashed it in 7 seconds. They were duly impressed. However, I wasn’t finished. I then queried 100 billion rows (5TB). BigQuery smashed it in 57 seconds. Jaws hit the ground. It was priceless, and one of the week’s highlights for me!
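For a rough sense of scale, the back-of-the-envelope sketch below works out the throughput implied by those three demo runs. It uses only the figures quoted above and assumes 1 TB = 10^12 bytes; the function and variable names are just for illustration.

```python
def throughput(rows, seconds, total_bytes=None):
    """Return (rows per second, bytes per second or None if size unknown)."""
    rows_per_sec = rows / seconds
    bytes_per_sec = total_bytes / seconds if total_bytes is not None else None
    return rows_per_sec, bytes_per_sec


# (rows, seconds, bytes scanned) as quoted for the live demo.
DEMO_RUNS = [
    (1_000_000, 2, None),
    (1_000_000_000, 7, None),
    (100_000_000_000, 57, 5 * 10**12),  # 5 TB, assuming decimal terabytes
]

for rows, seconds, size in DEMO_RUNS:
    rps, bps = throughput(rows, seconds, size)
    line = f"{rows:>15,} rows in {seconds:>2}s = {rps:,.0f} rows/s"
    if bps is not None:
        line += f" ({bps / 1e9:.1f} GB/s)"
    print(line)
```

On those numbers, the last query works out to roughly 1.75 billion rows and about 88 GB scanned every second.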
Here’s my slide deck.
Pablo: For me, the whole trip was a great experience. However, the highlight was meeting with the Dataflow team in Seattle. Having the chance to discuss the details of the implementation of our applications with the people who actually write the APIs that we are using is something we’re very lucky to have been able to do. Being there, with Google engineers, discussing and reviewing our code is not something that many people get to brag about.
I think it was also very interesting for them to see our use case scenario and how we understood and used the APIs. I also enjoyed meeting the BigQuery team in Kirkland and showing them how we use BQ. Again, having the chance to give direct feedback to the engineers who actually work on BQ was pretty awesome.
The summit was quite good as well, and I especially enjoyed the GCP sessions which were a great opportunity to catch up with the latest improvements to the platform and hear about best practices. Sharing this experience with other passionate developers and giving feedback directly to senior Google management is a privilege that not many have had.
Graham: For me at least, it doesn’t really get much better than those six days. I mean seriously, as a developer you are already lucky enough if you get to work with tools on the GCP stack like BigQuery & Dataflow, but to actually fly out to Silicon Valley and meet the teams who build them! That is just unparalleled in terms of awesomeness! We got a glimpse into how the tools are built and managed, which makes me appreciate them that little bit more. We also got some great vision and clarity on the product roadmaps. There are some exciting things coming up soon, so stay tuned.
Another takeaway for me, one that reinforced a thought I had already begun mulling over, was that GCP is so much bigger everywhere except Australia. That’s primarily down to the lack of data centers here. But that caveat aside, it’s also a lack of adoption from developers down under. AWS offers a much bigger range of services than Google. However, in my opinion, those services might be more plentiful but they are not as good as the tools on the Google stack.
Take the big data suite for example. If I were asked to choose between AWS and GCP to build a big data solution, Google wins hands down. And that is not me being biased! It’s because nothing can match BigQuery, Dataflow, and Dataproc for performance and cost. It’s that simple.
The summit itself was cool, and we got to meet a lot of interesting people. The info we received from Google regarding upcoming features and roadmaps was fantastic. I’d love to share but of course my lips are contractually sealed!