-
Notifications
You must be signed in to change notification settings - Fork 22
feat: Add support for Amazon AWS and Google Drive #511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Hi @JoeZiminski, cloudsupport.1.mp4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Ijv3-0 thanks a lot for this, in general it is a super clean PR with the code very easy to read. There's a high attention to detail and it's well documented with good docstrings. Please find some general feedback on the code in the comments.
The set up and transfer functionality itself works nicely, I was able to transfer files. For the set up, it would be nice if the user could complete the entire set up through the TUI, rather than having to set up the configs outside of the file (e.g. by using widgets to pass the information and rclone calls under the hood). Otherwise, it worked well.
In this case, we have a number of duplicate PRs for this feature, and this new transfer functionality will additionally require extensive testing and documentation which will take a lot of time and require access to our internal testing. For this reason I won't merge this PR, but will link to it in all relevant correspondence of a good example implementation. I appreciate that you have put a lot of work into this. In a number of places I have noted where you have made improvements to the existing codebase, which could be added in new PRs. In this way, you could be able to contribute to the codebase, please feel free to open new PRs on these if you are interested in this. You also mentioned you had some questions about this implementation, which I'm happy to answer.
"Google Drive connection requires 'gdrive_folder_id' to be specified.", | ||
ConfigError, | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice input checks!
if list(config_dict.keys()) != list(canonical_dict.keys()): | ||
utils.log_and_raise_error( | ||
f"New config keys are in the wrong order. The" | ||
f" order should be: {canonical_dict.keys()}.", | ||
f" order should be: {list(canonical_dict.keys())}.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! This could be a separate PR
# Initialise the local project folder | ||
utils.print_message_to_user( | ||
f"Making project folder at: {config_dict['local_path']}" | ||
) | ||
try: | ||
folders.create_folders(config_dict["local_path"]) | ||
except OSError: | ||
except OSError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, because error messages are propagated to the TUI, we generally don't show the entire native error and instead replace it with something a bit less detailed.
|
||
ds_logger.close_log_filehandler() | ||
|
||
def _setup_rclone_central_gdrive_config(self, log: bool = True) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in this case, because this function just directly wraps rclones setup_rclone_config_for_gdrive
with no other effects, calls to this function can be replaced with direct calls to setup_rclone_config_for_gdrive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this uses the existing structure for the SSH case, which is good, but in this case the SSH structure is not ideal! So that could be factored out in a separate PR.
|
||
@check_configs_set | ||
@check_is_not_local_project | ||
def setup_aws_connection(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very well structured, and it is easy to read alongside the gdrive function because they are all structured in a similar way
@@ -28,6 +28,52 @@ def wrapper(*args, **kwargs): | |||
return wrapper | |||
|
|||
|
|||
def requires_aws_configs(func): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice idea!
@@ -0,0 +1,457 @@ | |||
from typing import Dict, Literal, Tuple, Union |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is a very well structured module, the code is clean and easy to read. Most of the comments on the AWS module transfer to this one too.
A list of options to pass to Rclone's copy function. | ||
see `cfg.make_rclone_transfer_options()`. | ||
""" | ||
assert upload_or_download in [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep these asserts, just as an extra level of protection
central_filepath = cfg.get_base_folder( | ||
"central", top_level_folder | ||
).as_posix() | ||
|
||
utils.log(f"Local path: {local_filepath}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice, but rlcone already logs this information so we can skip it
extra_arguments = handle_rclone_arguments(rclone_options, include_list) | ||
|
||
if upload_or_download == "upload": | ||
output = call_rclone( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice change, to set the command separately and call rlclone
only once! This could be a separate PR
Description
Currently, datashuttle supports transferring projects between local filesystems and remote systems via SSH. To enhance usability and data-sharing, we need to extend datashuttle to support file transfers via AWS S3 and Google Drive using RClone.
Extend TUI: Modify the existing terminal user interface (TUI) to support AWS S3 and Google Drive in a similar style to SSH.
Update Python API: Implement AWS S3 and Google Drive transfer functionality using RClone.
Dynamic UI Rendering: When a user selects AWS S3 or Google Drive, the respective input fields should appear dynamically
Why is this PR needed?
What does this PR do?
References
Addresses #407
References
fixes #407
How has this PR been tested?
How has this PR been tested?
Please explain how any new code has been tested, and how you have ensured that no existing functionality has changed.
Is this a breaking change?
If this PR breaks any existing functionality, please explain how and why.
Does this PR require an update to the documentation?
If any features have changed, or have been added. Please explain how the
documentation has been updated.
Checklist: