Is there a limit to this? We currently have 60 GiB of data, and in a couple of weeks our upload size will be greater than 100 GiB.
You can currently upload 100 GB of data at a time. If you are going to exceed this threshold, consider splitting the data into shards and uploading them separately. You can mount up to 200 GB of data to your Job/Workspace.
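For example, one way to shard is with standard Unix tools; this is just a sketch (the 50 GB chunk size, the data/ directory, and the shard_ prefix are placeholders, not FloydHub requirements):

# Pack the dataset and split it into chunks below the 100 GB limit
tar czf - data/ | split -b 50G - shard_
# Upload each shard_* file separately, then reassemble the dataset inside the job:
# cat shard_* | tar xzf -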
Also, can you provide some S3-based workaround within the floyd upload command so we don't have to dance around this issue?
I'll share this with the Dev Team to discuss prioritization, but it could take a while.
The above error is happening because "aws configure" is an interactive command. It prompts the user for input.
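For reference, when run without a terminal attached it stalls on prompts like these, which is roughly what the aws-cli asks for:

aws configure
AWS Access Key ID [None]:
AWS Secret Access Key [None]:
Default region name [None]:
Default output format [None]: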
It looks like the environment variable I shared with you isn't the right one. Let's do it this way instead, which should be less error-prone.
- Create a script file called S3download.sh with the following content:
# Default directory where the aws-cli loads its config files
mkdir -p /root/.aws
# Move the prepared credentials and config files there
mv credentials /root/.aws/
mv config /root/.aws/
# Install the aws-cli; the interactive "aws configure" step is not needed,
# since the config files are already in place
pip install awscli
# Download the data into the job's working directory
aws s3 sync s3://<bucket>/<path> /floyd/home
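For completeness, the credentials and config files referenced by the mv commands use the standard aws-cli INI format; the values below are placeholders for your own keys and region:

credentials:
[default]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>

config:
[default]
region = us-east-1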
- Launch the job as
floyd run 'bash S3download.sh'
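This assumes the floyd CLI uploads the contents of your current project directory when the job starts, so S3download.sh, credentials, and config should all sit in that directory before launching. Once the job is running, you can follow its output to confirm the sync finished without prompting (the job name is whatever floyd-cli prints for your run):

floyd logs <job-name>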
Let me know if this works.