I have a 20 GB dataset stored as a single NumPy array file with the ".npy" extension.
When I upload that 20 GB file using the command "floyd data upload", it is compressed to only 19.8 MB, which does not seem possible for a 20 GB file. I was shocked by this compression ratio and tried to find out how compression works in floyd-cli, but couldn't.
Terminal:
But when I tried to load the array using
X = np.load('/floyd/input/data/crop_256_trainLabels_master_256_v2.npy')
the load failed, and that gave me some idea of how compression works in floyd-cli. The same issue is described in this StackOverflow question: https://stackoverflow.com/questions/19793937/failed-to-interpret-file-s-as-a-pickle-when-loading-an-npy-array
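To narrow down whether the uploaded file is still a valid ".npy" file at all, a quick check of its first bytes may help. This is only a minimal sketch: it relies on the fact that files written by np.save start with the magic bytes \x93NUMPY. The helper name looks_like_npy and the temporary file names are hypothetical, made up for this illustration.

```python
import os
import tempfile

import numpy as np

NPY_MAGIC = b"\x93NUMPY"  # first six bytes of a file written by np.save


def looks_like_npy(path):
    """Return True if the file starts with the .npy magic bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        return f.read(len(NPY_MAGIC)) == NPY_MAGIC


# Small stand-in array; the real dataset would be checked the same way.
a = np.arange(10, dtype=np.float64)
tmp_dir = tempfile.mkdtemp()

good_path = os.path.join(tmp_dir, "good.npy")
np.save(good_path, a)  # proper .npy file with header

bad_path = os.path.join(tmp_dir, "bad.npy")
with open(bad_path, "wb") as f:
    f.write(a.tobytes())  # raw bytes only, no .npy header

print(looks_like_npy(good_path), looks_like_npy(bad_path))
```

If the check fails on the file under /floyd/input/data/, the file was either not saved with np.save in the first place or was changed somewhere during the upload.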
The answer on StackOverflow says that writing the array to a file with the method below is not the correct way to save a ".npy" array.
Method:
f = open(Filename, "w")
try:
    f.write(a)  # writes the array directly, without the .npy header
finally:
    f.close()
I am not sure whether I am right about the compression, but it looks like compression does not work properly for ".npy" arrays. If you want, you can compress the array yourself with NumPy's own compression method, which saves it as ".npz" instead of ".npy".
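For reference, this is roughly how I understand the proper way of saving and the ".npz" compression to work. It is only a sketch: the small zero-filled array, the file names, and the key name "labels" are placeholders I made up, not my real data.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the real 20 GB array.
a = np.zeros((1000, 1000), dtype=np.float32)

tmp_dir = tempfile.mkdtemp()
npy_path = os.path.join(tmp_dir, "labels.npy")
npz_path = os.path.join(tmp_dir, "labels.npz")

# Uncompressed save: writes the .npy header followed by the raw data.
np.save(npy_path, a)

# Compressed save: writes a zip archive (.npz) holding the array under a key.
np.savez_compressed(npz_path, labels=a)

# Loading a .npy file returns the array directly.
loaded_npy = np.load(npy_path)

# Loading a .npz file returns an archive; arrays are looked up by key.
with np.load(npz_path) as archive:
    loaded_npz = archive["labels"]

print(os.path.getsize(npy_path), os.path.getsize(npz_path))
```

An all-zero array like this one compresses extremely well, so the size difference here says nothing about how much real data would shrink.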
Please help! I have to submit my project and the deadline is near. Thank you!