Deep Learning Using Keras: Lessons Learned

In this article, I want to share lessons learned or things I wished I had known while experimenting with Keras a year ago.

By Lexman Nandan · Aug. 07, 18 · AI Zone · Opinion

If you are planning to experiment with deep learning models, Keras might be a good place to start. It is a high-level API written in Python with backend support for TensorFlow, CNTK, and Theano.

If you are new to Keras, you can read more at keras.io, or a simple Google search will take you to the basics and beyond.

In this article, I want to share lessons learned or things I wished I had known while experimenting with Keras a year ago. Some of the things I am sharing might be replaced with new approaches or even automated by advanced machine learning platforms.

  • In general, start with a smaller neural net architecture and see how the model performs on the dev/test set.
  • Model architecture and hyperparameter values vary with the dataset. In other words, they can be different for different datasets and business problems.
  • Architecture and hyperparameters are typically derived using an iterative approach. There is no golden rule here.
  • The train/dev/test split can be 90%/5%/5% or, for very large datasets, even 98%/1%/1%. In Keras, the dev split is specified in model.fit through the validation_split or validation_data argument.
  • Define and finalize the metrics before building your model. One metric can focus on model quality (MAE, accuracy, precision, recall, etc.), but there should also be one metric that is business related.
  • You don’t always need a deep learning model to solve a business problem. It is much faster to iterate on and run a tree-based model such as a Gradient Boosting Machine or Random Forest than a CNN or LSTM.
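
The split mentioned above can be sketched with plain NumPy. This is only an illustration: the helper name split_train_dev_test, the placeholder data, and the default fractions are assumptions made for the example, not something prescribed by Keras.

```python
import numpy as np


def split_train_dev_test(x, y, dev_frac=0.05, test_frac=0.05, seed=0):
    """Shuffle the rows once, then cut them into train/dev/test partitions."""
    n = len(x)
    order = np.random.default_rng(seed).permutation(n)  # shuffled row indices
    n_dev = int(round(n * dev_frac))
    n_test = int(round(n * test_frac))
    n_train = n - n_dev - n_test
    train_idx = order[:n_train]
    dev_idx = order[n_train:n_train + n_dev]
    test_idx = order[n_train + n_dev:]
    return (
        (x[train_idx], y[train_idx]),
        (x[dev_idx], y[dev_idx]),
        (x[test_idx], y[test_idx]),
    )


# Placeholder data (hypothetical): 1,000 rows with 8 features each.
x = np.random.default_rng(42).normal(size=(1000, 8))
y = x.sum(axis=1)

(train_x, train_y), (dev_x, dev_y), (test_x, test_y) = split_train_dev_test(x, y)
print(len(train_x), len(dev_x), len(test_x))  # 900 50 50
```

The dev pair can then be passed to model.fit as validation_data=(dev_x, dev_y). Alternatively, validation_split=0.05 lets Keras hold out the last 5% of the arrays it is given (taken before any shuffling), in which case only the test set needs to be split off by hand.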

Parameter Selection: The Important Ones

  • Learning rate: start with the default rate and, if the network is not learning, try smaller values such as .001, .0001, .00001, etc.
  • Activation function (relu and tanh are popular ones). The activation function introduces non-linearity into the model. For regression, the last layer is typically linear; for classification, sigmoid or softmax is the usual choice.
  • Optimizer (nadam is a commonly used one). In most use cases, you only need to change the learning rate and can leave all other parameters at their default values.
  • The number of hidden layers and the number of units in each layer are mostly derived by iteration.
  • Batch size also plays a role in the performance of the model. Again, this is determined by trial and error.
  • Data needs to be normalized (between 0 and 1 or between -1 and 1). Typically for relu, normalize the features between 0 and 1.
  • Start with a small number of epochs (say, 10) and see how the model is performing.
  • Underfitting: this can be resolved by adding more data, building a deeper network, and dialing back any techniques used against overfitting (such as dropout or regularization).
  • Overfitting: adding a dropout layer or a regularization parameter (L1 or L2) is a way to reduce overfitting.
  • Evaluate whether the model is converging by plotting the loss against the epoch.
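
Several of the points above can be combined into one small starting model. The following is a minimal sketch, assuming the Keras bundled with TensorFlow 2 (tensorflow.keras); the synthetic regression data, the helper name build_model, and every hyperparameter value are illustrative assumptions rather than recommendations.

```python
import numpy as np
from tensorflow import keras

# Synthetic regression data (hypothetical), with raw features outside [0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(low=-5.0, high=20.0, size=(400, 6))
y = x @ rng.normal(size=6) + rng.normal(scale=0.1, size=400)

# Min-max scale each feature to [0, 1], a range that suits relu units.
# In a real project, compute x_min and x_max on the training rows only.
x_min, x_max = x.min(axis=0), x.max(axis=0)
x_scaled = (x - x_min) / (x_max - x_min)


def build_model(n_features, learning_rate=1e-3, dropout_rate=0.2, l2_weight=1e-4):
    """Small starting architecture: one hidden relu layer and a linear output."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(
            16,
            activation="relu",
            kernel_regularizer=keras.regularizers.l2(l2_weight),  # L2 penalty
        ),
        keras.layers.Dropout(dropout_rate),          # dropout against overfitting
        keras.layers.Dense(1, activation="linear"),  # linear output for regression
    ])
    # Only the learning rate is changed; other optimizer settings stay at defaults.
    model.compile(
        optimizer=keras.optimizers.Nadam(learning_rate=learning_rate),
        loss="mse",
        metrics=["mae"],
    )
    return model


model = build_model(n_features=x.shape[1])
# Start with few epochs; the last 5% of the rows are held out as the dev set.
history = model.fit(
    x_scaled, y, validation_split=0.05, epochs=10, batch_size=32, verbose=0
)
print(sorted(history.history))  # names of the metrics recorded per epoch
```

From here, the iteration described in the list amounts to changing one argument at a time (learning_rate, the number of units, dropout_rate, batch_size) and comparing the validation loss recorded in history.history.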

The figure below shows a model that converges at around epoch 100. If the model is not converging, the training and validation curves will not intersect.

[Figure: training and validation loss plotted against epoch]
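
A plot of this kind can be produced from the dictionary that Keras exposes as history.history after model.fit. The sketch below assumes matplotlib is installed; the function name plot_loss and the output file name are made up for the example, and the curves are synthetic stand-ins generated from an exponential decay, not real training results.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line when working in a notebook
import matplotlib.pyplot as plt


def plot_loss(history_dict, path="loss_curve.png"):
    """Plot training and validation loss per epoch and save the figure."""
    epochs = range(1, len(history_dict["loss"]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history_dict["loss"], label="training loss")
    ax.plot(epochs, history_dict["val_loss"], label="validation loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return ax


# Synthetic stand-in curves (NOT real results): both decay toward the same floor,
# so they come together after roughly 100 epochs.
fake_history = {
    "loss": [1.0 * math.exp(-e / 30) + 0.05 for e in range(150)],
    "val_loss": [1.4 * math.exp(-e / 30) + 0.05 for e in range(150)],
}
ax = plot_loss(fake_history)
```

With a real run, pass history.history from model.fit in place of fake_history; the "val_loss" entry is only present when a validation split or validation data was supplied.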

I hope you will find this article useful on your journey to learn and experiment with deep learning models with Keras.

If I have missed anything important or you find anything different from your experiments, please comment below.
