Deployable convolutional neural network libraries
I recently reviewed the current state of neural network libraries from the perspective of integrating them into multi-platform desktop C++ applications (Linux, Windows). Although there has been a lot of progress in machine learning, most of it seems concentrated on the research/development side. As a consequence, deploying finished models to end users is a bit more challenging than it could be.
Contestants
- TensorFlow
- caffe
- tiny-dnn
- mxnet (although I found out about it very late in the process)
The model I implemented was AlexNet-like, but a bit deeper and adjusted for a different input image size and dimensions.
TensorFlow
TensorFlow is very Python-oriented. Although it is possible to build a stand-alone C++ application, the easiest way seems to be to integrate the application into bazel, Google's build system. The resulting binary is also rather large (~100 MiB).
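For illustration, a minimal bazel target for such a stand-alone binary might look roughly like the sketch below. The file layout and dependency labels are assumptions patterned after the C++ examples shipped in the TensorFlow source tree, not verified against any particular release.

```python
# Hypothetical BUILD file for a stand-alone C++ app placed inside the
# TensorFlow source tree (all paths and labels here are assumptions).
cc_binary(
    name = "inference_app",
    srcs = ["main.cc"],
    deps = [
        "//tensorflow/cc:cc_ops",        # C++ op wrappers
        "//tensorflow/core:tensorflow",  # core runtime; linking it in is
                                         # what makes the binary so large
    ],
)
```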
caffe
caffe is the opposite of TensorFlow in this respect. There is a nice and fully working Python interface, but it seems under-documented. The main interface for net construction is a declarative text format (protobuf text format), which is then used with the caffe executables for training and inference.
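To give an idea of that format, a single convolution layer in a net definition looks something like the snippet below (the layer names and parameter values are made up for illustration):

```protobuf
# One layer of a hypothetical net definition in caffe's protobuf text format.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"    # input blob
  top: "conv1"      # output blob
  convolution_param {
    num_output: 96  # number of filters
    kernel_size: 11
    stride: 4
  }
}
```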
The best support for caffe is under Linux. There is a Windows build for MSVC (which I could not use because I build my software on Windows with MinGW). I did succeed in building caffe under MinGW-w64, but it took a full two days to track down and build all the dependencies. Unfortunately, the end result just did not work™ (the same input produced wildly incorrect results with the Windows build, although it worked perfectly fine on Linux).
The resulting binary size was a bit better (~30 MiB), but keeping all the dependencies up to date on Windows and debugging the internals of caffe did not appeal to me.
tiny-dnn
tiny-dnn is a header-only library with no hard external dependencies. Unfortunately, it does not have any GPU support, so training a reasonably sized model takes a very long time.
Luckily, it can load a model that was developed and trained in caffe. The caffe converter code does not support all layer types yet, but I was able to add support for the missing layers used in my model.
This is the route I eventually went with. Since my model is relatively small, inference times are acceptable on the CPU alone, using the default tiny-dnn backend (which, I believe, does not use any CPU-specific optimizations at all).
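A minimal sketch of this workflow, assuming tiny-dnn's caffe converter (the file names are placeholders, and the converter additionally requires protobuf with the generated caffe.pb.h/.cc):

```cpp
// Load a caffe-trained model with tiny-dnn's caffe converter and run
// CPU inference. File names below are placeholders.
#include "tiny_dnn/tiny_dnn.h"
#include "tiny_dnn/io/caffe/layer_factory.h"

int main() {
  // Construct the network from the caffe net definition...
  auto net = tiny_dnn::create_net_from_caffe_prototxt("deploy.prototxt");
  // ...then fill in the trained weights from the binary snapshot.
  tiny_dnn::reload_weight_from_caffe_protobinary("trained.caffemodel",
                                                 net.get());

  // Dummy input of the size the net expects, just to show the call.
  tiny_dnn::vec_t input(net->in_data_size(), 0.0f);

  // Inference on the CPU with the default backend.
  tiny_dnn::vec_t output = net->predict(input);
  (void)output;
  return 0;
}
```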
Training my model in caffe is also a very fast process (on a machine with a GPU).
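For reference, training from the command line with the caffe tool looks roughly like this (the solver file name is a placeholder):

```sh
# Train using a solver definition; -gpu selects which GPU to use.
caffe train -solver solver.prototxt -gpu 0
```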
Just keep in mind that inference times can be significantly worse with frameworks that lack GPU support. There are also many ways to multiply large matrices on a CPU, so the performance of various machine-learning frameworks can differ widely (and can even differ across CPU architectures).