
Commit 7a75a6a

Update README.md to include docs reference and remove user guide info.
1 parent 2ebf429 commit 7a75a6a

1 file changed: README.md

Lines changed: 4 additions & 98 deletions
@@ -1,7 +1,5 @@
 # OceanDataStore
 
-[**Documentation**](https://noc-msm.github.io/OceanDataStore/)
-
 A Python library designed to streamline writing, updating & accessing ocean model and observational data stored in cloud object storage.
 
 ## Installation
@@ -15,107 +13,15 @@ pip install git+https://github.com/NOC-MSM/OceanDataStore.git
 
 **Note:** we strongly recommend installing **OceanDataStore** into a new virtual environment using either ``venv`` or ``conda / mamba``.
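For instance, a fresh environment could be created and OceanDataStore installed as in the sketch below (the environment name and Python version are illustrative placeholders; the install command is the one shown above):

```bash
# Illustrative sketch: environment name and Python version are placeholders.
python -m venv ods-env                 # or: conda create -n ods-env python=3.11
source ods-env/bin/activate            # or: conda activate ods-env
pip install git+https://github.com/NOC-MSM/OceanDataStore.git
```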
 
-## User Guide
-
-### Creating a Credentials File
-
-To get started using **OceanDataStore**, users need to create a ``credentials.json`` file containing the following information:
-
-```json
-{
-    "secret": "your_secret",
-    "token": "your_token",
-    "endpoint_url": "https://noc-msm-o.s3-ext.jc.rl.ac.uk"
-}
-```
-
-### Sending Individual Files
-
-To create a new zarr store in an object store from a local file, use the `send_to_zarr` command:
-
-```bash
-ods send_to_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -v var
-```
-
-The arguments used are:
-- `-f`: Path to the netCDF file containing the variables.
-- `-c`: Path to the JSON file containing the object store credentials.
-- `-b`: Bucket name in the object store where the variables will be stored.
-- `-v`: Variable within the netCDF file to send to the object store.
-
-In the example above, without a `-p` (or `--prefix`), the variables will be stored in `<bucket_name>/<var>`. If a `--prefix` is provided, the variables will be stored in `<bucket_name>/<prefix>/<var>`.
-
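To illustrate the prefix behaviour described above, a prefixed single-file upload might look like the following sketch (the prefix value is a placeholder; the options are those documented above):

```bash
# Illustrative only: same single-file command as above, plus a prefix.
# Without -p the variable is written to <bucket_name>/<var>;
# with -p it is written to <bucket_name>/my_prefix/<var>.
ods send_to_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -p my_prefix -v var
```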
-### Sending Lots of Files
-
-To create a new zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) with the `send_to_zarr` command by passing a dask configuration JSON file:
-
-```bash
-ods send_to_zarr -f filepaths -c credentials.json -b bucket_name -p prefix \
-    -gf filepath_domain -uc '{"lat":"lat_new", "lon":"lon_new"}' \
-    -cs '{"x":500, "y":500, "depthw":25}' \
-    -dc dask_config.json
-```
-
-The arguments used are:
-- `-f`: Paths to the multiple netCDF files containing the variables.
-- `-c`: Path to the JSON file containing the object store credentials.
-- `-b`: Bucket name in the object store where the variables will be stored.
-- `-p`: Prefix used to define the path to the object (see above).
-- `-gf`: Path to the model grid file containing domain variables.
-- `-uc`: Coordinate dimension variables to update, given as a JSON string '{current_coord : new_coord}'.
-- `-cs`: Chunking strategy used to rechunk the model data.
-- `-dc`: Path to the JSON file containing the Dask configuration.
+## Documentation
 
-where the contents of the ``dask_config.json`` are:
-
-```json
-{
-    "config_kwargs": {
-        "temporary_directory": "..../jasmin_os_tmp/",
-        "local_directory": "..../jasmin_os_tmp/"
-    },
-    "cluster_kwargs": {
-        "n_workers": 12,
-        "threads_per_worker": 1,
-        "memory_limit": "2GB"
-    }
-}
-```
-
-In this example, a LocalCluster with 12 single-threaded workers, each with 2 GB of available memory, is used to transfer a large collection of files to an object store.
-
-Users are strongly recommended to implement `send_to_zarr` workflows using a job scheduler, such as SLURM or PBS, either to run the LocalCluster on a single compute node or to use an existing SLURMCluster or PBSCluster (dask-jobqueue).
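For example, a minimal SLURM submission script wrapping the multi-file transfer above might look like the sketch below (the `#SBATCH` resource requests are illustrative assumptions to adapt to your own cluster; the `ods` options and `dask_config.json` are the ones shown above):

```bash
#!/bin/bash
#SBATCH --job-name=ods_send_to_zarr
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=32G
#SBATCH --time=04:00:00

# Activate the environment in which OceanDataStore is installed before this step.
# The LocalCluster defined in dask_config.json provides the 12 single-threaded workers.
ods send_to_zarr -f filepaths -c credentials.json -b bucket_name -p prefix \
    -gf filepath_domain -uc '{"lat":"lat_new", "lon":"lon_new"}' \
    -cs '{"x":500, "y":500, "depthw":25}' \
    -dc dask_config.json
```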
-
-**Note:** the netCDF4 library does not support multi-threaded access to datasets, so users should ensure that ``threads_per_worker`` is set to ``1`` in their dask configuration JSON file to avoid raising CancelledError exceptions when using ``send_to_zarr`` or ``update_zarr``.
-
-### Updating Existing Stores
-
-To update an existing zarr store in an object store, we can use the `update_zarr` command:
-
-```bash
-ods update_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var
-```
-
-This command will replace and/or append the values of variable `var` stored at the local filepath to the `/bucket_name/prefix/var` store, provided it already exists in the object store.
-
-**Note:** compatibility checks must be passed before local data will be appended to an existing store; these include chunk size & dimension compatibility.
-
-### Updating Existing Stores With Lots of Files
-
-To update an existing zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) via the `update_zarr` command analogously to `send_to_zarr`:
-
-```bash
-ods update_zarr -f filepaths -c credentials.json -b bucket_name -p prefix \
-    -gf filepath_domain -uc '{"lat":"lat_new", "lon":"lon_new"}' \
-    -cs '{"x":500, "y":500, "depthw":25}' -ad time \
-    -dc dask_config.json
-```
+To learn more about OceanDataStore, click [**here**](https://noc-msm.github.io/OceanDataStore/) to explore the documentation.
 
 ## Examples
 
-For further examples of how to implement the commands in **OceanDataStore** in your own workflows, see the bash scripts in the `examples` directory.
+For examples of how to implement the commands in **OceanDataStore CLI** in your own workflows, see the bash scripts in the `examples` directory.
 
-## OceanDataStore Arguments
+## OceanDataStore CLI Arguments
 
 ### Mandatory Arguments
 
0 commit comments
