Skip to content

Add support for downloading publicly available datasets #340 #515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

demoncoder-crypto
Copy link

The implementation includes:
A new public_data.py module with functions to list, get info about, and fetch datasets
Proper input validation and local caching in ~/.movement/public_data
Comprehensive unit tests verifying all functionality
An example file demonstrating typical usage patterns
Documentation explaining available datasets, parameters, and citations

@niksirbi niksirbi linked an issue Mar 26, 2025 that may be closed by this pull request
@niksirbi niksirbi requested a review from lochhh March 26, 2025 11:35
@niksirbi
Copy link
Member

niksirbi commented Mar 26, 2025

Thanks for attempting to tackle this @demoncoder-crypto. We are experiencing a high volume of PRs at the moment, but we will get to reviewing this eventually.

Could you deal with the pre-commit and linting failures in the meantime?
You can do pre-commit install (needed only once), followed by pre-commit run -a in your development environment.

After that the checks should automatically apply to all commits you attempt.

@demoncoder-crypto
Copy link
Author

Thanks @niksirbi for comments, I tried to fix pre-commit issues. Please let me know if they are up to expectation. thanks

@demoncoder-crypto
Copy link
Author

Downloading file 'videos/rotating-mouse_eye-tracking_stim-uniform_video.mp4' from 'https://gin.g-node.org/neuroinformatics/movement-test-data/raw/master/videos/rotating-mouse_eye-tracking_stim-uniform_video.mp4' to 'C:\Users\Freiz.movement\data'.
The package gets stopped here and nothing happens then when i try sphinx build C:\Users\Freiz\AppData\Roaming\Python\Python311\Scripts\sphinx-build.exe -b html source build
Is there a way to fix this?

Copy link
Collaborator

@lochhh lochhh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @demoncoder-crypto! The overall structure looks about right, with relevant examples and tests added 👏 The PR is not quite ready though, since it is missing the key bits, i.e. places marked with "placeholder".

Here are a few suggestions/requests for changes for the work you've done so far:

  • we've recently updated our logging module and here's the relevant section in our contributing guide. you'll need to update the logger calls accordingly.
  • if you try building the docs locally, you'll see a few warnings that you'll need to resolve, e.g.:
    • title underline too short in public_datasets.rst
    • fetch_<data>: footnote 1 not referenced
    • datasets user guide not in toc tree (please see our guide on adding a new page)
  • calms21 metadata in PUBLIC_DATASETS
  • rat7m metadata in PUBLIC_DATASETS
  • inaccuracies in fetch_calms21 docstring:
    • the dataset consists of trajectory data of social interactions, recorded from videos of freely behaving mice in a standard resident-intruder assay
    • irrelevant tasks: I don't think the tasks are in the calms21 dataset
    • incorrect paper referenced

@demoncoder-crypto
Copy link
Author

Thanks @lochhh I will work on these issues by tomorrow, I am really grateful for such detailed response.

@demoncoder-crypto
Copy link
Author

@lochhh @niksirbi I have tried to fix the errors, do let me know if they are up to expectation

@lochhh
Copy link
Collaborator

lochhh commented Apr 30, 2025

Thanks @demoncoder-crypto for the changes. As you can see the CI checks are still failing and the key implementations (places marked with "placeholder implementation", for instance in movement/public_data.py and movement/examples/public_datasets.py) are still missing. There are also merge conflicts with the main branch which need resolving.

@demoncoder-crypto
Copy link
Author

Thanks @lochhh I will fix them soon.

Copy link

sonarqubecloud bot commented May 2, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for downloading publicly available datasets
3 participants