Stephan Heunis jsheunis @jsheunis@mas.to
Michał Szczepanik mslw @doktorpanik@masto.ai
Psychoinformatics lab,
Institute of Neuroscience and Medicine (INM-7), Research Center Jülich
adina@bulk1 in /ds/hcp/super on git:master❱ datalad status --annex -r
15530572 annex'd files (77.9 TB recorded total size)
nothing to save, working tree clean
(github.com/datalad-datasets/human-connectome-project-openaccess)
/dataset
├── sample1
│ └── a001.dat
├── sample2
│ └── a001.dat
...
/dataset
├── sample1
│ ├── ps34t.dat
│ └── a001.dat
├── sample2
│ ├── ps34t.dat
│ └── a001.dat
...
Without expert or domain knowledge, it is impossible to distinguish original from derived data.
/raw_dataset
├── sample1
│ └── a001.dat
├── sample2
│ └── a001.dat
...
With a modular structure, after applying a transform (preprocessing, analysis, ...):
/derived_dataset
├── sample1
│ └── ps34t.dat
├── sample2
│ └── ps34t.dat
├── ...
└── inputs
  └── raw
    ├── sample1
    │ └── a001.dat
    ├── sample2
    │ └── a001.dat
    ...
Clearer separation of semantics: a pristine version of the original dataset is used within a
new, additional dataset that holds the outputs.
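The benefit of the modular layout can be sketched in a few lines of Python. This is a minimal illustration, not a DataLad workflow: plain directories stand in for datasets, the files are empty placeholders, and the helper names `build_layout` and `classify` are hypothetical. It shows that once the original data lives under `inputs/raw`, a file's location alone tells original from derived.

```python
import tempfile
from pathlib import Path


def build_layout(root: Path) -> None:
    """Create the illustrative derived_dataset tree with empty placeholder files."""
    for sample in ("sample1", "sample2"):
        # Outputs of the transform live at the top level of the derived dataset.
        out_dir = root / sample
        out_dir.mkdir(parents=True)
        (out_dir / "ps34t.dat").touch()
        # The pristine original data sits untouched under inputs/raw.
        raw_dir = root / "inputs" / "raw" / sample
        raw_dir.mkdir(parents=True)
        (raw_dir / "a001.dat").touch()


def classify(root: Path) -> dict[str, list[str]]:
    """Split files into original and derived purely by their location."""
    result = {"original": [], "derived": []}
    for path in sorted(root.rglob("*.dat")):
        rel = path.relative_to(root)
        kind = "original" if rel.parts[:2] == ("inputs", "raw") else "derived"
        result[kind].append(rel.as_posix())
    return result


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "derived_dataset"
    build_layout(root)
    layout = classify(root)
    for kind, files in layout.items():
        print(kind, files)
```

No domain knowledge about file names such as `a001.dat` or `ps34t.dat` is needed here; the same check would not be possible in the flat layout, where both files sit side by side in each sample directory.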
Imagenette dataset
datalad save it, or use commands such as datalad download-url or datalad addurls to retrieve it from web sources
-c yoda prepares a useful structure
-c text2git keeps text files such as scripts in Git
Science has many different building blocks: code, software, and data produce research outputs.
The more you share, the more likely it is that others can reproduce your results.
The datalad-container extension gives DataLad commands to add, track, retrieve, and execute Docker or Singularity containers.
pip/conda install datalad-container
datalad containers-run
download-url
yoda configuration
datalad containers-add
datalad containers-run
git diff
datalad
tag