Hi @craigboman,
Any update on this feature request? Looking forward to this option as well.
It's currently on our roadmap, but it could take a while. In the meantime, here's what you can do if you are working with an S3 bucket.
Our enterprise users connect to their private S3 buckets from within their FloydHub jobs. You can create S3 credentials with read-only access to specific buckets and then use the awscli tool to download the data. One option is to pull the data once and create a dataset out of it, then mount that dataset at runtime to your training job. This is the approach we recommend to customers with large datasets in S3.
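For the one-time pull, here is a rough sketch of what the download step could look like when driven from Python. It assumes the AWS CLI is installed in the job environment and that the read-only credentials are exposed through the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. The bucket name, prefix, destination path, and function names are placeholders, not anything FloydHub-specific.

```python
import os
import subprocess


def build_sync_command(bucket, prefix, dest_dir):
    """Return the argv list for `aws s3 sync` from s3://bucket/prefix to dest_dir."""
    source = f"s3://{bucket}/{prefix.lstrip('/')}"
    return ["aws", "s3", "sync", source, dest_dir]


def download_dataset(bucket, prefix, dest_dir):
    """Copy every object under the prefix into dest_dir using the AWS CLI."""
    os.makedirs(dest_dir, exist_ok=True)
    # The AWS CLI reads the credentials from the environment variables above.
    # check=True raises CalledProcessError if the sync fails.
    subprocess.run(build_sync_command(bucket, prefix, dest_dir), check=True)


# Example call (placeholder names):
# download_dataset("my-private-bucket", "training-data/", "/tmp/training-data")
```

Once the files are on local disk, you can turn that directory into a dataset and mount it in later jobs instead of downloading again.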
Alternatively, you can stream the S3 data every time training starts. This would require some changes to the training code, and streaming S3 data across the internet for every training job is a little slow.
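If you go the streaming route, the change to the training code might look roughly like the sketch below. It assumes `boto3` is installed, that credentials come from the environment, and that the object is a line-oriented text file such as a CSV. The bucket, key, and helper names are hypothetical.

```python
def stream_lines(s3_client, bucket, key):
    """Yield decoded text lines from one S3 object without saving it to disk."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # response["Body"] is a botocore StreamingBody; iter_lines() reads it in chunks.
    for raw_line in response["Body"].iter_lines():
        yield raw_line.decode("utf-8")


def make_client():
    """Create an S3 client; boto3 picks up credentials from the environment."""
    import boto3  # imported lazily so stream_lines works with any compatible client

    return boto3.client("s3")


# Example use inside a training script (placeholder names):
# for line in stream_lines(make_client(), "my-private-bucket", "training-data/train.csv"):
#     features = line.split(",")
```

Every job start pays the full transfer cost this way, which is why pulling once into a dataset is usually the better fit for large data.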
Hope that helps.