Using GPUs to train neural networks for deep learning is becoming commonplace. But the cost of GPU servers and the storage infrastructure required to feed GPUs as fast as they can consume data is significant. I wanted to see if I could use a highly reliable, low-cost, easy-to-use Oracle Cloud Infrastructure (OCI) environment to reproduce deep learning benchmark results published by some of the big storage vendors. I also wanted to see if a MapR distributed file system in this cloud environment could deliver data to the GPUs as fast as those GPUs could consume data residing in memory on the GPU server.
First, I ran one benchmark using data in the local file system. This loaded up the Linux buffer cache with all 143 GB of data.
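Warming the cache this way just means reading every file once before benchmarking. A minimal sketch (the `DATA_DIR` path is a placeholder, not the actual dataset location from these runs):

```shell
# Read the whole dataset once so it lands in the Linux page cache.
# DATA_DIR is a placeholder; point it at your training data.
DATA_DIR=${DATA_DIR:-/data/train}
if [ -d "$DATA_DIR" ]; then
    find "$DATA_DIR" -type f -exec cat {} + > /dev/null
fi
# The "buff/cache" column should now cover roughly the dataset size.
free -h
```

After this, subsequent epochs read from RAM rather than disk, which is what makes the "Buffer Cache" numbers a best-case baseline.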
Next, I ran the benchmarks through one epoch against this data with 1, 2, 4, and all 8 GPUs on the server. In the charts below, that's the "Buffer Cache" number.
Then, I cleared the buffer cache and re-ran the benchmarks pulling the data from MapR. I cleared the MapR file system caches on each of the MapR servers between each run to make sure I was pulling data from the physical storage media.
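Dropping the Linux page cache between runs is a standard trick; the sketch below uses the kernel's `drop_caches` interface (this clears the OS cache on a node, and would be run on each MapR server; clearing MapR's own caches is a separate, MapR-specific step):

```shell
# Flush dirty pages to disk first, then drop the page cache, dentries,
# and inodes so the next read must come from physical storage.
sync
if [ "$(id -u)" -eq 0 ]; then
    echo 3 > /proc/sys/vm/drop_caches
else
    echo "run as root to drop caches" >&2
fi
```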
I got some of the best performance numbers I've seen for training these models, and the MapR performance was almost identical to in-memory reads from the buffer cache on the GPU server.
I used nvidia-smi, provided in the NGC container, to collect utilization metrics on the 8 GPUs in the server and confirm that the GPUs were working at full speed to process the data. These graphs show the GPU utilization for the 1 GPU and 8 GPU runs, pulling data from MapR.
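One way to capture utilization like this is nvidia-smi's CSV query mode, sampled once per second for the duration of a run (the 60-second window and output filename here are illustrative, not the exact invocation used for these charts):

```shell
# Sample GPU and memory utilization every second for 60 s and log as CSV.
# --query-gpu/--format=csv are standard nvidia-smi options.
if command -v nvidia-smi >/dev/null; then
    timeout 60 nvidia-smi \
        --query-gpu=timestamp,index,utilization.gpu,utilization.memory \
        --format=csv -l 1 > gpu_util.csv
else
    echo "nvidia-smi not found (no GPU driver on this host)" >&2
fi
```

The resulting CSV is easy to chart per GPU index, which is how utilization graphs like the ones above are typically produced.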
And the 1 and 8 GPU utilization numbers from nvidia-smi for ResNet-152 were as follows:
For just a few dollars per hour, Oracle Cloud Infrastructure gives you some of the highest-performing NVIDIA GPU-enabled servers, backed by highly available, reliable, and massively scalable MapR storage, letting you run machine learning workloads faster and more effectively than comparable storage infrastructure priced orders of magnitude higher.