Predict Bike Sharing Rides With a Neural Network
Deep learning has really fired up my imagination. My first project is predicting the number of bikes that a bike sharing company needs on the road to meet demand, based on past rental data.
Deep learning, with its immense potential, has really fired up my imagination, so much so that I signed up for a Udacity Deep Learning course. I am going to document my journey as I deeply learn this topic (sorry, couldn't resist the pun). The journey itself is going fairly slowly, given that I work for one of the hottest technology companies in Silicon Valley. Talk about a first world problem.
The first project is predicting the number of bikes that a bike sharing company needs on the road to meet demand, based on past rental data. The data itself comes from the UCI Machine Learning Repository. Bike sharing companies have come into vogue around the world, and the beauty is that individual rides are recorded, leading to a virtual sensor network for sensing the mobility of a city (I didn't come up with this astute observation; UCI did).
This NN was built using only Python libraries like NumPy, Pandas, and Matplotlib; no machine learning or deep learning framework was used. I will save the details of building the network itself for a subsequent blog post and focus here on a few interesting observations.
Observation 1: There is a ton of data, and a ton of data overwhelms.
There were 17,000 rows of data capturing the ride rentals by the hour. There were 59 features such as windspeed, temperature, and humidity to bring color to this data. The sheer amount of data was overwhelming.
Once I plotted the data (thanks, Udacity; they provided most of the code and I just spent time building the NN) there was some good news. Seemed like there was a pattern here.
Observation 2: Jupyter notebook and Python rock.
I am quite taken by the Python data libraries and specifically Jupyter notebook. Now, if only I could get a handle on these libraries, that would be wonderful. That said, the Python documentation is quite terse, and without a cookbook or sample code it is hard to see what's going on. As a Java guy, I wonder why Java can't be this simple!
Observation 3: The magic is in the hidden layer of the NN.
The input layer is where the data comes in (all 59 columns of it and iterated over all 17k rows) and the output is where the prediction comes out - simple enough.
The interesting bit happens in each node of the hidden layer, the "perceptron."
Each input is multiplied by a weight that starts out random (cue the crazy matrix math), the weighted inputs are summed and fed into a sigmoid function, and the result is pushed to the output layer. The sigmoid function squashes each node's weighted sum into a value between 0 and 1, so no single node's signal can grow without bound.
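The forward pass described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's code: the hidden layer size of 8, the weight scaling, the made-up input row, and the choice to pass the hidden values straight to a single output node are all my assumptions.

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# 59 features comes from the article; 8 hidden nodes is an assumption.
n_inputs, n_hidden = 59, 8

# One made-up row of input data standing in for a real hour of rentals.
features = rng.normal(size=n_inputs)

# Random starting weights (the scaling by layer size is an assumption).
w_input_hidden = rng.normal(scale=n_inputs ** -0.5, size=(n_inputs, n_hidden))
w_hidden_output = rng.normal(scale=n_hidden ** -0.5, size=(n_hidden, 1))

# Weighted sum arriving at each hidden node (the "matrix math").
hidden_in = features @ w_input_hidden

# Sigmoid activation: every hidden value now lies between 0 and 1.
hidden_out = sigmoid(hidden_in)

# Assumption: a single output node that sums the weighted hidden values.
prediction = hidden_out @ w_hidden_output

print(hidden_out.shape, prediction.shape)
```

Whatever the size of the raw weighted sums, the hidden values always land strictly between 0 and 1.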
Observation 4: The real magic is in the back propagation and gradient descent.
What's really magical is what happens next. The NN is fed the data on the forward pass, the prediction is compared to the actual data, and an error is computed. This error is pushed back through the network into the hidden layer, and at each stage the weights are adjusted so that the prediction gets closer to the actual data. Wow! A learning rate scales each gradient descent update so that the weights change by only a small amount per step, which helps the algorithm settle toward a minimum of the error (ideally the global minimum, or the right answer) instead of overshooting it.
We repeat the whole process for a number of "epochs," or iterations, that run into the hundreds. This is called training the network. Eventually, the trained NN is fed a separate dataset, called the testing data, to see how well the network performs.
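To make the error-pushing idea concrete, here is a rough sketch of a single training step for a network with one sigmoid hidden layer. The function name `train_step`, the squared-error measure, the tiny layer sizes, and the learning rate of 0.1 are assumptions chosen for illustration, not details from the course project.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_step(x, y, w_ih, w_ho, learning_rate):
    """One forward pass, one backward pass, one weight update for one row."""
    # Forward pass: input -> hidden (sigmoid) -> output (plain weighted sum).
    hidden_out = sigmoid(x @ w_ih)          # shape (n_hidden,)
    prediction = hidden_out @ w_ho          # shape (1,)

    # Compare the prediction to the actual value.
    error = y - prediction                  # shape (1,)

    # Push the error back to the hidden layer through the output weights,
    # then scale by the sigmoid's slope at each hidden node.
    hidden_error = w_ho @ error             # shape (n_hidden,)
    hidden_grad = hidden_error * hidden_out * (1.0 - hidden_out)

    # Gradient descent: nudge each weight in the direction that shrinks
    # the error, scaled down by the learning rate.
    w_ho = w_ho + learning_rate * np.outer(hidden_out, error)
    w_ih = w_ih + learning_rate * np.outer(x, hidden_grad)
    return w_ih, w_ho, float(error[0] ** 2)

# Tiny made-up example: 3 inputs, 2 hidden nodes, 1 target value.
rng = np.random.default_rng(1)
x = rng.normal(size=3)
y = np.array([0.5])
w_ih = rng.normal(scale=0.5, size=(3, 2))
w_ho = rng.normal(scale=0.5, size=(2, 1))

errors = []
for _ in range(50):
    w_ih, w_ho, squared_error = train_step(x, y, w_ih, w_ho, learning_rate=0.1)
    errors.append(squared_error)

print(errors[0], errors[-1])
```

Repeating the step on the same row shrinks the squared error each time, which is the "prediction gets closer to the actual data" effect in miniature.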
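The train-then-test routine above might look something like the following sketch. Everything specific here is an assumption: the data is synthetic (not the UCI bike data), each epoch is one pass over the whole training set, and the 400/100 split, 6 hidden nodes, learning rate of 0.1, and 300 epochs are arbitrary picks.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(X, w_ih, w_ho):
    """Forward pass for a whole batch of rows."""
    return sigmoid(X @ w_ih) @ w_ho

def mse(X, y, w_ih, w_ho):
    """Mean squared error between predictions and actual values."""
    return float(np.mean((y - predict(X, w_ih, w_ho)) ** 2))

rng = np.random.default_rng(42)

# Synthetic stand-in for the rental data: 500 rows, 4 features.
X = rng.normal(size=(500, 4))
y = (np.sin(X[:, 0]) + 0.5 * X[:, 1]).reshape(-1, 1)

# Hold back the last 100 rows as testing data the network never trains on.
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

n_hidden, learning_rate, epochs = 6, 0.1, 300
w_ih = rng.normal(scale=4 ** -0.5, size=(4, n_hidden))
w_ho = rng.normal(scale=n_hidden ** -0.5, size=(n_hidden, 1))

loss_before = mse(X_train, y_train, w_ih, w_ho)

for epoch in range(epochs):
    # Forward pass over the whole training set.
    hidden_out = sigmoid(X_train @ w_ih)
    error = y_train - hidden_out @ w_ho

    # Backward pass: error pushed back to the hidden layer.
    hidden_grad = (error @ w_ho.T) * hidden_out * (1.0 - hidden_out)

    # Averaged gradient descent update for both weight matrices.
    w_ho += learning_rate * (hidden_out.T @ error) / len(X_train)
    w_ih += learning_rate * (X_train.T @ hidden_grad) / len(X_train)

train_loss = mse(X_train, y_train, w_ih, w_ho)
test_loss = mse(X_test, y_test, w_ih, w_ho)
print(loss_before, train_loss, test_loss)
```

Comparing `train_loss` with `test_loss` is the point of the held-back rows: a network that only memorized the training data would score well on the first and poorly on the second.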
Observation 5: The joy of predicting right.
More than 15 hours of pure assignment time. Whew!
You can see that my NN is pretty spot-on for most days, except Christmas and the end of the year. The reason for the failure is that the dataset covers only two years, so the NN saw just two data points for Christmas, which is not enough for it to make a reliable prediction.
Other than that it was pure joy getting to this point — loved it. Forget the misery of the matrix math and the fact that I don't quite recall where gradient descent came in! It was three weeks ago when I wrote this, so I deserve a break.
Published at DZone with permission of Harpreet Singh, DZone MVB. See the original article here.