Face detection is opening new quantitative possibilities for a wide range of applications, such as smart office spaces, ad analytics, and surveillance. However, implementing face detection on video is challenging on two fronts: keeping up with multiple simultaneous feeds, and meeting the heavy compute demands of image classification.
Many applications need to simultaneously process multiple videos (e.g., webcam feeds) at near real-time frame rates. Until recently, the assumption has been that a dedicated GPU must be allocated for each video feed. With the advent of containerization, distributed pub/sub messaging services, and elastic GPUs in the cloud, we can architect applications that more efficiently utilize hardware to process multiple high-speed video feeds.
In this blog post, I'll describe an architecture that's suitable for detecting faces in multiple video feeds while making efficient use of GPU hardware. This architecture is generally applicable to any application that processes fast data streams and needs to scale by adding more machines, without changing the application code.
The two components of MapR used for this face detection application are MapR Event Streams and MapR PACC:
Distributed pub/sub messaging services are typically used for communication between services. When messages are produced faster than they are consumed, these services buffer the unconsumed messages until consumers catch up. As such, they provide a convenient decoupling that allows producers to send data to consumers without the burden of maintaining connections and managing back pressure across a variable number of consumers.
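The decoupling and buffering behavior described above can be sketched with a minimal in-process queue. This is only a stand-in for a real pub/sub service like MapR Event Store (which persists messages and distributes them across machines); the names and numbers here are illustrative:

```python
import queue
import threading
import time

# A bounded queue stands in for a pub/sub topic: it buffers messages
# when the producer outpaces the consumer, decoupling the two sides.
topic = queue.Queue(maxsize=100)

def producer(n_frames):
    for i in range(n_frames):
        # put() blocks only when the buffer is full -- back pressure
        # is handled by the messaging layer, not the application.
        topic.put(f"frame-{i}")

def consumer(results):
    while True:
        frame = topic.get()
        if frame is None:          # sentinel: producer is finished
            break
        time.sleep(0.001)          # simulate slower-than-producer processing
        results.append(frame)

results = []
worker = threading.Thread(target=consumer, args=(results,))
worker.start()
producer(50)
topic.put(None)
worker.join()
print(len(results))  # 50 -- every buffered frame was eventually consumed
```

The producer finishes well before the consumer does, yet no frames are lost: the topic absorbs the difference in rates.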
Pub/sub streaming systems, like Kafka and MapR Event Streams (MapR Event Store), have traditionally been thought of as a solution for inter-service communication rather than a distribution mechanism for fast data streams, such as video. However, we've shown that video can be distributed through MapR Event Store, and in doing so we gain two distinct advantages: producers and consumers remain fully decoupled, and consumers can be scaled out on the fly as new feeds are added.
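Distributing video through a stream means each message must carry a self-describing frame. One way to do that is to prefix the encoded frame bytes with a small binary header. This is a sketch of such a framing scheme, not the demo application's actual wire format; the field layout is an assumption:

```python
import struct
import time

# Header: camera id (uint32), capture timestamp (float64), payload length (uint32).
# "!" selects network byte order with no padding, so the header is 16 bytes.
HEADER = struct.Struct("!IdI")

def pack_frame(camera_id, jpeg_bytes, ts=None):
    """Wrap one JPEG-encoded video frame as a self-describing stream message."""
    ts = time.time() if ts is None else ts
    return HEADER.pack(camera_id, ts, len(jpeg_bytes)) + jpeg_bytes

def unpack_frame(message):
    """Recover the camera id, timestamp, and frame bytes from a message."""
    camera_id, ts, length = HEADER.unpack(message[:HEADER.size])
    return camera_id, ts, message[HEADER.size:HEADER.size + length]

# Round-trip a fake frame payload (real producers would use camera output).
payload = b"\xff\xd8 fake jpeg bytes \xff\xd9"
msg = pack_frame(7, payload)
cam, ts, data = unpack_frame(msg)
print(cam, data == payload)  # 7 True
```

A producer would publish `msg` to a stream topic (for example with a Kafka-style client, which MapR Event Store supports), and any consumer can decode it without out-of-band coordination.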
Real-time face detection requires high-speed processors. Video processing on traditional CPUs can only be done by significantly compromising the fidelity and timeliness of video frame processing. For example, it takes more than 10 seconds to classify a single image on an Intel Xeon Skylake CPU in a standard Google Cloud virtual machine. That same process takes less than 1 second if an Nvidia K80 GPU is used.
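A quick back-of-the-envelope calculation with the timings above shows why the GPU matters. The assumptions here are deliberately simple: one classification per sampled frame, no batching or pipelining, and the K80's "less than 1 second" rounded up to a full second:

```python
# Timings quoted above (assumptions: no batching, one inference per frame).
cpu_secs_per_frame = 10.0   # Intel Xeon Skylake on a standard GCP VM
gpu_secs_per_frame = 1.0    # Nvidia K80, upper bound on "less than 1 second"

feed_fps = 1.0  # suppose we sample just one frame per second from each feed

# How many feeds can each processor sustain at that sampling rate?
cpu_feeds = (1 / cpu_secs_per_frame) / feed_fps
gpu_feeds = (1 / gpu_secs_per_frame) / feed_fps
print(cpu_feeds, gpu_feeds)  # 0.1 1.0
```

Even at one sampled frame per second, the CPU can sustain only a tenth of a single feed, while the GPU keeps up with at least one full feed, and in practice more, since real inference on the K80 is under a second.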
Docker makes it easy to orchestrate video producers and consumers. When they're packaged with the MapR Persistent Application Client Container (MapR PACC), they can maintain access to a MapR cluster and the video streams hosted on that cluster, even though the Docker containers themselves are ephemeral.
By dockerizing the video consumers, face detection processes can be provisioned on the fly, when new video feeds demand faster processing. Discovering those streams will never be a problem, since MapR's global namespace ensures stream locations never change.
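This scale-out pattern works because a stream's partitions are spread across whatever consumers are currently alive: provisioning another dockerized consumer shrinks every worker's share of the load, with no application code changes. The following is a minimal simulation of that assignment logic, not the actual rebalance protocol, and the stream and worker names are made up:

```python
def assign_partitions(partitions, consumers):
    """Round-robin partition assignment: the essence of how a consumer
    group spreads a stream's load across its members."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [f"video-stream:camera-{i}" for i in range(6)]

# Two consumers each handle three camera partitions...
print(assign_partitions(partitions, ["worker-1", "worker-2"]))

# ...and spinning up a third container rebalances to two partitions each,
# without touching the consumer code.
print(assign_partitions(partitions, ["worker-1", "worker-2", "worker-3"]))
```

In a real deployment the messaging layer performs this rebalancing automatically whenever a consumer joins or leaves the group.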
Resource discovery is often a problem in distributed architectures where processes run on different machines, but that's not a problem with MapR because the MapR global namespace eliminates any confusion about where files, tables, or streams are located across clusters.
Our face detection demo application was implemented in Python using APIs for Tensorflow and MapR Event Store. The code and documentation are provided here:
The functional flow for the various components in this application is shown below:
Watch a video demonstration of this application here: