I was testing FloydHub today and just upgraded to the data scientist plan. When I used a Floyd GPU to train an LSTM network, training took about 2.5 times longer on Floyd's server than on my local machine. How can I improve the performance? Am I not following best practice, or is something else going on? Thanks.
My local machine's spec: Nvidia 1080 Ti, 32 GB RAM
Floyd's standard GPU server spec: Tesla K80 · 12 GB Memory, 61 GB RAM
I then tried the Tesla V100 GPU, and it was only 10% faster than the K80. Clearly something is wrong, but what could the reason be?
Problem solved. It turns out I wasn't using a large enough batch size.
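For anyone hitting the same issue: with a small batch, each epoch is split into many short GPU steps, and the fixed per-step overhead (host-to-device transfer, kernel launch) dominates, so a data-center GPU can look slower than a local card. A minimal sketch of the arithmetic, assuming a hypothetical dataset of 50,000 samples (`steps_per_epoch` is just an illustrative helper, not a FloydHub or Keras API):

```python
import math

def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of training steps (kernel-launch rounds) per epoch."""
    return math.ceil(n_samples / batch_size)

n_samples = 50_000  # hypothetical dataset size

# Small batch: many short steps, per-step overhead dominates.
print(steps_per_epoch(n_samples, 32))   # 1563 steps per epoch

# Larger batch: far fewer steps, so the GPU spends more of its
# time doing actual matrix math and less on launch overhead.
print(steps_per_epoch(n_samples, 512))  # 98 steps per epoch
```

In Keras, for example, this would just mean passing a larger `batch_size` to `model.fit(...)` (memory permitting), then checking that GPU utilization actually goes up.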