Optimize TF 2.0 Pretrained Network to Run in Inference in OpenCV
In this article, we discuss how to convert a pretrained network (written in TF 2.0) to an optimized network, suitable for inference.
In this tutorial, I demonstrate how I converted a pretrained network, written in TF 2.0 with a somewhat advanced architecture, into an optimized network suitable for inference. My goal was to run the pretrained model in inference using C++ and OpenCV's Dnn module. However, this guide will help you produce an optimized model that can be used on many platforms, including Python and TensorRT.
After training a model in TF 2.0, I looked for a simple way to optimize it for inference. However, I couldn't find a single resource explaining how to do so in TF 2.0, since TF removed its support for frozen optimized graphs, as can be seen in the complaints piling up on Stack Overflow (for example, 1, 2).
Many solutions are scattered across different internet forums. This guide describes the simplest way I found to do so.
In my network implementation, I used:
- Dataset and iterators
- Placeholders
- Convolutions
- Depthwise convolutions
- Relu6
- Batch normalization
- Flatten
- Softmax
- Global average
After training, we would like to export our model to a .pb file to run it at inference. TensorFlow has a simple API to publish a model:
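The export call itself is not preserved in this copy of the article. As a minimal sketch, assuming the model is a `tf.Module` or Keras model and that the TF 2.x `tf.saved_model.save` call is the API meant here, the export could look like this. The names `export_saved_model` and `saved_model_layout` are hypothetical helpers introduced only for illustration.

```python
import os


def saved_model_layout(export_dir):
    """Entries a SavedModel export is expected to create under export_dir."""
    return [
        os.path.join(export_dir, "saved_model.pb"),  # graph and signatures
        os.path.join(export_dir, "variables"),       # checkpoint shards and index
    ]


def export_saved_model(model, export_dir):
    """Export `model` as a SavedModel directory (assumed API: tf.saved_model.save)."""
    # Imported lazily so this sketch can be loaded without TensorFlow installed.
    import tensorflow as tf

    tf.saved_model.save(model, export_dir)
    return saved_model_layout(export_dir)
```

Note that the result is a directory rather than a single file, which is the point the next paragraph makes.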
However, this exports the full model as many files, including the graph def and the variables, and does not remove summaries, which are unnecessary at inference time. This limits GPU performance, since unnecessary network components often run on the CPU.
Until TF 2.0, best practice was to:
- Freeze a model (take a graph def combined with a checkpoint, and convert all variables to constants within a single .pb file).
- Optimize the frozen graph for inference, including:
  - Remove unused nodes.
  - Fold arithmetic and expressions to constants.
  - Remove identity and no-op operations.
  - Fold batch normalization into simple multiplications.
  - Sort by execution order, which is necessary in order to run it in OpenCV.
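To make the batch normalization folding step concrete, here is a small worked sketch in plain Python. It is not TensorFlow's implementation. It assumes each output channel is a simple dot product (as in a 1x1 convolution at a single pixel), and the function names and the `eps` default are illustrative assumptions. The idea is that `gamma * (y - mean) / sqrt(var + eps) + beta` is linear in `y`, so it can be absorbed into the preceding weights and bias.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, variance, eps=1e-3):
    """Fold per-channel batch-norm parameters into the preceding weights and bias."""
    folded_weights, folded_bias = [], []
    for w_row, b, g, bt, mu, var in zip(weights, bias, gamma, beta, mean, variance):
        scale = g / math.sqrt(var + eps)            # one multiplier per channel
        folded_weights.append([w * scale for w in w_row])
        folded_bias.append((b - mu) * scale + bt)   # shifted and rescaled bias
    return folded_weights, folded_bias


def linear(x, weights, bias):
    """One dot product per output channel, plus that channel's bias."""
    return [sum(w * xi for w, xi in zip(w_row, x)) + b
            for w_row, b in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, variance, eps=1e-3):
    """Inference-time batch normalization with fixed statistics."""
    return [g * (yi - mu) / math.sqrt(var + eps) + bt
            for yi, g, bt, mu, var in zip(y, gamma, beta, mean, variance)]
```

Applying `linear` with the folded parameters gives the same output as `linear` followed by `batch_norm`, so the batch normalization node can be dropped from the inference graph.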
This functionality was mainly maintained as part of the Graph Transform Tool and was integrated into the TensorFlow tools library. That was until TF 2.0, where TF removed its support and its graph utils code; tf.compat.v1.graph_util is now basically empty. Instead, developers are expected to use the SavedModel API. However, other libraries and inference modules don't support it, and TF users are left racking their brains to find a way to optimize their models.
I will detail exactly what I did in order to freeze my model:
Write Your Code in Advance to Fit the Inference
Use a flag named
For iterators use:
I recommend this method over post-editing the graph, since it makes you consider inference from the beginning of development. For example, I modified my implementation of average pooling during training, from using average_pooling2d to just using reduce_sum, which is simpler and will run on more platforms.
You can also handle the summaries and the file writer under this flag. Instead of working to remove them after training, just don't add them :)
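The flag name and the iterator code are missing from this copy of the article, so the following is only a sketch of the idea under stated assumptions. The flag name `is_inference`, the node name `"input"`, the input shape, and the assumption that the dataset yields `(images, labels)` pairs are all hypothetical. It also assumes graph mode (for example after `tf.compat.v1.disable_eager_execution()`), since placeholders are not available in eager mode.

```python
def build_input(is_inference, dataset=None, input_shape=(None, 224, 224, 3)):
    """Return the network input: a placeholder at inference, an iterator output in training."""
    # Imported lazily so this sketch can be loaded without TensorFlow installed.
    import tensorflow as tf

    if is_inference:
        # A named placeholder gives the exported graph a clean, known input node.
        return tf.compat.v1.placeholder(tf.float32, shape=input_shape, name="input")
    # Training path: assumes `dataset` yields (images, labels) pairs.
    iterator = tf.compat.v1.data.make_one_shot_iterator(dataset)
    images, _labels = iterator.get_next()
    return images


def maybe_add_summary(is_inference, name, tensor):
    """Add a scalar summary only when training; return None at inference."""
    if is_inference:
        return None  # nothing to strip from the graph later
    import tensorflow as tf

    return tf.compat.v1.summary.scalar(name, tensor)
```

Building the graph with `is_inference=True` then yields a graph with no dataset, iterator, or summary nodes to remove afterwards.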
Train a Model in TF 2.0
Save a Checkpoint of the Model, Using:
Save a graph def of the model using:
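The snippets for the two saving steps above are not preserved here. A minimal sketch, assuming the model lives in a `tf.compat.v1.Session` (consistent with the placeholders mentioned earlier) and using `tf.compat.v1.train.Saver` and `tf.io.write_graph`, might look like the following. The file names `model.ckpt` and `graph.pb` and both function names are hypothetical.

```python
import os


def artifact_paths(out_dir, ckpt_name="model.ckpt", graph_name="graph.pb"):
    """Return (checkpoint prefix, graph def path) inside out_dir."""
    return os.path.join(out_dir, ckpt_name), os.path.join(out_dir, graph_name)


def save_checkpoint_and_graph(sess, out_dir):
    """Write the session's variables as a checkpoint and its graph as a binary GraphDef."""
    # Imported lazily so this sketch can be loaded without TensorFlow installed.
    import tensorflow as tf

    ckpt_prefix, graph_path = artifact_paths(out_dir)
    os.makedirs(out_dir, exist_ok=True)
    # The Saver must be created inside the graph that owns the variables.
    with sess.graph.as_default():
        saver = tf.compat.v1.train.Saver()
    saver.save(sess, ckpt_prefix)
    # as_text=False writes a binary .pb rather than a text .pbtxt.
    tf.io.write_graph(sess.graph_def, out_dir, os.path.basename(graph_path), as_text=False)
    return ckpt_prefix, graph_path
```

The checkpoint prefix and the graph def path are the two inputs the freezing step needs.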
Create a new environment with TensorFlow 1.15, and run the following code:
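The original listing is missing from this copy. As a hedged sketch of what such a script could look like, the following uses the `freeze_graph` and `optimize_for_inference_lib` modules from `tensorflow.python.tools` in TF 1.15. The function name, all file paths, and the node names passed in are assumptions, as is the float32 placeholder type.

```python
def freeze_and_optimize(graph_pb, checkpoint_prefix, input_names, output_names,
                        frozen_pb, optimized_pb):
    """Freeze graph + checkpoint into one .pb, then optimize it for inference."""
    # Imported lazily; this sketch is meant to run in the TF 1.15 environment.
    import tensorflow as tf
    from tensorflow.python.tools import freeze_graph, optimize_for_inference_lib

    # Step 1: convert variables to constants and write a single frozen .pb file.
    freeze_graph.freeze_graph(
        input_graph=graph_pb,
        input_saver="",
        input_binary=True,                    # the graph def was saved as binary
        input_checkpoint=checkpoint_prefix,
        output_node_names=",".join(output_names),
        restore_op_name="save/restore_all",
        filename_tensor_name="save/Const:0",
        output_graph=frozen_pb,
        clear_devices=True,
        initializer_nodes="",
    )

    # Step 2: load the frozen graph and run the inference optimizations on it.
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(frozen_pb, "rb") as f:
        graph_def.ParseFromString(f.read())
    optimized = optimize_for_inference_lib.optimize_for_inference(
        graph_def, input_names, output_names, tf.float32.as_datatype_enum)

    with tf.io.gfile.GFile(optimized_pb, "wb") as f:
        f.write(optimized.SerializeToString())
    return optimized_pb
```

A call might look like `freeze_and_optimize("out/graph.pb", "out/model.ckpt", ["input"], ["softmax"], "out/frozen.pb", "out/optimized.pb")`, where the node names are placeholders for the real names in your graph.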
This code should work perfectly, but didn’t work on my specific model. I got the exception:
“Didn’t find expected Conv2D input to bn”
It was weird, and I started investigating whether the problem was in my network or in the environment change.
Finally, I discovered it was neither. The TF 1.15 code is not updated with the latest Graph Transform Tool code in its inference_lib. The code assumed that batch normalization receives a Conv2D op as input. However, my network uses a bias, so a BiasAdd operation sits between the convolution and the batch normalization.
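The mismatch can be illustrated with a toy graph in plain Python. This is not the TensorFlow code or the actual patch. It is a sketch with a hypothetical node format (a dict of name to op and inputs) and a hypothetical function name. With `allow_bias_add=False` the check fails in the same way as the error above, and with `allow_bias_add=True` it looks through the BiasAdd to find the convolution.

```python
def conv_feeding_batch_norm(nodes, bn_name, allow_bias_add=True):
    """Return (conv node name, bias name or None) for the batch-norm node bn_name."""
    producer_name = nodes[bn_name]["inputs"][0]
    producer = nodes[producer_name]
    bias_name = None
    if allow_bias_add and producer["op"] == "BiasAdd":
        # BiasAdd inputs are (value, bias): remember the bias, step back to the value.
        bias_name = producer["inputs"][1]
        producer_name = producer["inputs"][0]
        producer = nodes[producer_name]
    if producer["op"] != "Conv2D":
        raise ValueError("Didn't find expected Conv2D input to '%s'" % bn_name)
    return producer_name, bias_name


# Toy graph: Conv2D -> BiasAdd -> FusedBatchNorm, as in the network described above.
EXAMPLE_GRAPH = {
    "conv": {"op": "Conv2D", "inputs": ["input", "weights"]},
    "bias_add": {"op": "BiasAdd", "inputs": ["conv", "bias"]},
    "bn": {"op": "FusedBatchNorm", "inputs": ["bias_add", "gamma", "beta", "mean", "var"]},
}
```

Once the bias is known, it can be folded together with the batch normalization parameters instead of aborting the fold.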
Before trying to fix the bug myself, I searched online and saw that it had already been fixed on GitHub, but the fix was never brought into the TF release and won't be, since the code has been removed in TF 2.0.
I modified the TF code and patched the file in:
%installation python folder%/tensorflow/python/tools/optimize_for_inference_
with the attached code. After those fixes, my code ran without errors.
I hope this guide will help someone out there to convert a model written in TF2.0 to run at inference.
Good luck :)
In my next post, I will demonstrate how to run an optimized graph at inference on OpenCV in C++ and on TF in Python. I will benchmark them (hint: on a CPU, they are pretty equal).