MulticoreWare Has Developed a High-Performance Face Detection Neural Network for Embedded Vision Processors

MulticoreWare, Inc. has developed a high performance face detection algorithm solution for Synopsys, Inc.’s DesignWare® EV Family of vision processors utilizing Convolutional Neural Networks (CNNs). The collaboration demonstrates that high levels of accuracy and performance can be achieved in object detection software algorithms on a highly optimized embedded hardware platform.

Neural network object detection solutions have been shown to be more accurate than conventional computer vision algorithms, but they require significant compute performance to operate in real time. Most current CNN-based object detection solutions are designed for powerful desktop and server platforms with high-end CPUs, graphics processors and significant amounts of memory. As a result, deploying CNN-based solutions on embedded systems, which have tighter cost, power and memory constraints, has presented a challenge.

Synopsys’ DesignWare EV Processors run a CNN executable and operate at more than 1000 GOPS/W, providing fast and accurate detection of a wide range of objects at a fraction of the power consumption of competing vision solutions. Face detection systems are enabling a variety of powerful applications, from consumer devices and home automation systems to commercial and government security and surveillance systems. In general, CNN object detection and identification deployed on embedded platforms will enable high-performance solutions for a variety of domains such as gesture recognition, advanced driver assistance systems (ADAS), traffic monitoring, home security and military reconnaissance.

“Deep learning-based convolutional neural networks have recently emerged as the leading approach for achieving state-of-the-art object detection accuracy for a wide range of object classes,” said Matt Gutierrez, director of marketing for Synopsys’ Processor Solutions. “Reducing the complexity of the CNN graphs is key to ensuring the resulting algorithms meet the low cost, low power requirements of embedded computing platforms. We worked with MulticoreWare to develop a highly optimized face detection algorithm that enables users of DesignWare EV processors to meet the high performance, low power consumption needs of a broad range of embedded vision applications.”

Curtis Davis, Chief Operating Officer of MulticoreWare, commented on the achievement: “MCW has considerable experience developing CNN tools for high-performance computing (HPC) platforms. In this case, we had to redesign and train our face detection tool applying embedded system constraints. We are able to obtain state-of-the-art accuracy using a considerably reduced neural network with just 5 layers, in contrast to the 10 layers needed for the comparable desktop application. The size of weights (less than 512KB) involved in this network is about 250 times smaller than the desktop-based CNN. In addition, the computation requirement is 200 times less, and the data bandwidth is 400 times less, enabling this solution to accommodate the memory constraints of an embedded system. Our testing on the Face Detection Data Set and Benchmark (FDDB) data set demonstrates an impressive 98.6% accuracy for face classification.”
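The reduction factors quoted above imply a rough size for the desktop network that the release does not state directly. The short Python sketch below works through that arithmetic. It assumes 1 MB = 1024 KB and treats 512 KB as an upper bound, so the result is an illustrative estimate derived from the quoted figures, not a number reported by MulticoreWare. The function and constant names are hypothetical.

```python
# Figures quoted in the release. The desktop weight size is derived here,
# not reported by MulticoreWare, and should be read as a rough upper bound.
EMBEDDED_WEIGHTS_KB = 512   # "less than 512KB"
WEIGHT_REDUCTION = 250      # "about 250 times smaller"
EMBEDDED_LAYERS = 5
DESKTOP_LAYERS = 10


def implied_desktop_weights_mb(embedded_kb: float, reduction: float) -> float:
    """Estimate the desktop network's weight size in MB (assuming 1 MB = 1024 KB)."""
    return embedded_kb * reduction / 1024


desktop_mb = implied_desktop_weights_mb(EMBEDDED_WEIGHTS_KB, WEIGHT_REDUCTION)
layer_ratio = DESKTOP_LAYERS / EMBEDDED_LAYERS

print(f"Implied desktop weights: up to about {desktop_mb:.0f} MB")
print(f"Layer count ratio (desktop / embedded): {layer_ratio:.0f}x")
```

Under these assumptions the desktop network's weights would be on the order of 125 MB, which helps explain why the original design did not fit the memory budget of an embedded vision processor.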

MulticoreWare, Inc. is a leading provider of high-performance video processing software libraries, including the x265 HEVC encoder, UHDcode HEVC decoder, GPU-accelerated Video Processing Library and GPU-accelerated VP9 decoders. MulticoreWare's parallel computing tools, libraries and expertise in heterogeneous computing form the foundation of a worldwide professional services business, with more than 200 engineers in 6 locations.

Contact Author

Thomas Vaughan