Currently, the output data generated at /output is kept when a job finishes, but if the job is interrupted through
floyd stop <ID>, any data created during the run is lost.
It would be great if we could issue a command like
floyd stop -keep <ID>, so that we can interrupt the experiment without losing the output.
Use case: Let's say I'm running a big model and, for some reason (maybe a lucky random initialization), it reaches a low enough error rate early. I would like to be able to stop it at that point and download the checkpoints, without having to wait for it to finish or pre-program stopping logic.