
Commit 74e435b

Update summaries & formatting in CLI and OceanDataCatalog User Guides in docs/
1 parent 3d87109 commit 74e435b

2 files changed

Lines changed: 55 additions & 51 deletions


docs/docs/catalog_guide.md

Lines changed: 6 additions & 2 deletions
@@ -1,8 +1,12 @@
-# OceanDataCatalog API
+# OceanDataCatalog

 !!! abstract "Summary"

-**This is the User Guide for the OceanDataCatalog API to explore and access ocean data stored in the JASMIN Object Store.**
+* This is the User Guide for the **OceanDataCatalog API** to explore and access ocean data stored in the JASMIN Object Store.
+
+* Visit the interactive NOC Model STAC browser **[here].**
+
+[here]: catalog.md

 ---

docs/docs/cli_guide.md

Lines changed: 49 additions & 49 deletions
@@ -2,13 +2,13 @@

 !!! abstract "Summary"

-**This is the User Guide for the OceanDataStore Command Line Interface (CLI) to write and update ocean data in cloud object storage.**
+* This is the User Guide for the **OceanDataStore Command Line Interface (CLI)** to write and update ocean data to **Analysis-Ready Cloud Optimised** formats in S3-compatible cloud object storage.

 ---

 ## Creating a Credentials File

-To get started using **OceanDataStore CLI**, users need to create a ``credentials.json`` file containing the following information:
+* To get started using **OceanDataStore CLI**, users need to create a ``credentials.json`` file containing the following information:

 ```json
 {
@@ -18,7 +18,7 @@ To get started using **OceanDataStore CLI**, users need to create a ``credential
 }
 ```

-where `token` is your access key ID, `secret` is your secret access key and `endpoint_url` is the optional endpoint URL to use for the object store backend.
+* Here `token` is your access key ID, `secret` is your secret access key and `endpoint_url` is the optional endpoint URL of your S3-compatible object store.

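The full body of ``credentials.json`` is elided in the hunk above; purely as an illustrative sketch, assuming the keys are named exactly as described (`token`, `secret` and `endpoint_url`), the file can be created from the shell as follows (all values are placeholders):

```bash
# Illustrative sketch only: write a credentials.json with placeholder values.
# The key names follow the description above; replace every value with your own.
cat > credentials.json <<'EOF'
{
    "token": "YOUR_ACCESS_KEY_ID",
    "secret": "YOUR_SECRET_ACCESS_KEY",
    "endpoint_url": "https://my-tenancy.s3.example.ac.uk"
}
EOF

# Restrict permissions, since the file holds object store credentials.
chmod 600 credentials.json
```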
 ## Sending Individual Files

@@ -31,22 +31,22 @@ where `token` is your access key ID, `secret` is your secret access key and `end

 Zarr works especially well in combination with cloud storage, such as the JASMIN object store, given that users can access data concurrently from multiple threads or processes using Python or a number of other programming languages.

-[Click here](https://zarr-specs.readthedocs.io/en/latest/specs.html) for more information on the Zarr specification.
+**[Click here](https://zarr-specs.readthedocs.io/en/latest/specs.html)** for more information on the Zarr specification.

-To create a new Zarr store in an object store from the contents of a local netCDF file, we can use the `send_to_zarr` command:
+* To send a local netCDF file to a Zarr store in an S3-compatible object store, we can use the `send_to_zarr` command:

 ```bash
 ods send_to_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -zv 3
 ```

-The arguments used are:
+* The arguments used are:

-* `-f`: Path to the netCDF file containing the variables.
-* `-c`: Path to the JSON file containing the object store credentials.
-* `-b`: Bucket name in the object store where the variables will be stored.
-* `-zv`: Zarr version used to create the zarr store. Options are 2 (v2) or 3 (v3).
+- `-f`: Path to the netCDF file containing the variables.
+- `-c`: Path to the JSON file containing the object store credentials.
+- `-b`: Bucket name in the object store where the variables will be stored.
+- `-zv`: Zarr version used to create the zarr store. Options are 2 (v2) or 3 (v3).

-In the above example, the variable(s) will be stored in a single Zarr v3 store at the `<bucket_name>/<prefix>` path.
+* In the above example, the variable(s) will be stored in a single Zarr v3 store at the `<bucket_name>/<prefix>` path.

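As a quick sanity check after `send_to_zarr` completes, the objects written beneath `<bucket_name>/<prefix>` can be listed with any S3-compatible client. The sketch below is illustrative only and assumes the AWS CLI is installed and pointed at the same endpoint and credentials as ``credentials.json``:

```bash
# Illustrative sketch only: list the objects written beneath <bucket_name>/<prefix>
# with the AWS CLI, using the same credentials and endpoint as credentials.json.
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"

aws s3 ls "s3://bucket_name/prefix/" --recursive \
    --endpoint-url "https://my-tenancy.s3.example.ac.uk" | head
```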
 ### Icechunk Repositories

@@ -57,57 +57,57 @@ In the above example, the variable(s) will be stored in a single Zarr v3 store a

 This allows Icechunk repositories to support data version control, since users can time-travel to previous snapshots of a repository.

-[Click here](https://icechunk.io/en/latest/overview/) for an overview of Icechunk.
+**[Click here](https://icechunk.io/en/latest/overview/)** for an overview of Icechunk.

-To create a new icechunk repository in an object store from a variable `var` contained in a local netCDF file, we can use the `send_to_icechunk` command:
+* To create a new Icechunk repository in an S3-compatible object store from a variable `var` contained in a local netCDF file, we can use the `send_to_icechunk` command:

 ```bash
 ods send_to_icechunk -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var -br "main" -cm "New commit message..."
 ```

-The arguments used are:
+* The arguments used are:

-* `-f`: Path to the netCDF file containing the variables.
-* `-c`: Path to the JSON file containing the object store credentials.
-* `-b`: Bucket name in the object store where the variables will be stored.
-* `-v`: Variable within the netCDF file to send to the object store.
-* `-br`: Branch of Icechunk repository to commit changes to.
-* `-cm`: Commit message to be recorded when committing changes to Icechunk repository.
+- `-f`: Path to the netCDF file containing the variables.
+- `-c`: Path to the JSON file containing the object store credentials.
+- `-b`: Bucket name in the object store where the variables will be stored.
+- `-v`: Variable within the netCDF file to send to the object store.
+- `-br`: Branch of Icechunk repository to commit changes to.
+- `-cm`: Commit message to be recorded when committing changes to Icechunk repository.

-Note, that the `send_to_icechunk` command requires two additional arguments, `-br` and `-cm`, which define the branch on which to perform the transaction and the commit message to record.
+* Note that the `send_to_icechunk` command requires two additional arguments, `-br` and `-cm`, which define the branch on which to perform the transaction and the commit message to record.

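To make the branch and commit-message pattern concrete, here is an illustrative two-step workflow built only from the commands documented in this guide: an initial `send_to_icechunk` creates the repository, and a later `update_icechunk` (covered under Updating Existing Stores below) records a second snapshot on the same branch. File paths, bucket, prefix and messages are placeholders.

```bash
# Illustrative sketch only: an initial transaction followed by a later update,
# both recorded as snapshots on the "main" branch.

# 1. Create the repository with the first month of data.
ods send_to_icechunk -f /path/to/2015-01.nc -c credentials.json \
    -b bucket_name -p prefix -v var -br "main" -cm "Add January fields"

# 2. Later, commit an update to the same variable on the same branch.
ods update_icechunk -f /path/to/2015-02.nc -c credentials.json \
    -b bucket_name -p prefix -v var -br "main" -cm "Append February fields"
```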
 ## Sending Lots of Files to Stores

-To create a new Zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) with the `send_to_zarr` command by passing a dask configuration JSON file:
+* To create a new Zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) with the `send_to_zarr` command by passing a dask configuration JSON file:

 ```bash
 ods send_to_zarr -f /path/to/files*.nc -c credentials.json -b bucket_name -p prefix \
 -gf /path/to/domain_cfg.nc -uc '{"lon":"lon_new", "lat":"lat_new"}' \
 -cs '{"x":2160, "y":1803}' -dc dask_config.json -zv 3
 ```

-Similarly, we can create a new Icechunk repository in an object store using a large number of files:
+* Similarly, we can create a new Icechunk repository in an object store using a large number of files:

 ```bash
 ods send_to_icechunk -f /path/to/files*.nc -c credentials.json -b bucket_name -p prefix \
 -gf /path/to/domain_cfg.nc -uc '{"lon":"lon_new", "lat":"lat_new"}' \
 -cs '{"x":2160, "y":1803}' -dc dask_config.json -br "main" -cm "New big commit message..."
 ```

-The arguments used are:
-* `-f`: Paths to the multiple netCDF files containing the variables.
-* `-c`: Path to the JSON file containing the object store credentials.
-* `-b`: Bucket name in the object store where the variables will be stored.
-* `-p`: Prefix used to define path to object (see above).
-* `-gf`: Path to model grid file containing domain variables.
-* `-uc`: Coordinates dimension variables to update given as a JSON string '{current_coord : new_coord}'.
-* `-cs`: Chunk strategy used to rechunk model data.
-* `-dc`: Path to JSON file containing Dask configuration.
-* `-zv`: Zarr version used to create the zarr store. Options are 2 (v2) or 3 (v3).
-* `-br`: Branch of Icechunk repository to commit changes to.
-* `-cm`: Commit message to be recorded when committing changes to Icechunk repository.
-
-where the contents of the ``dask_config.json`` are:
+* The arguments used are:
+- `-f`: Paths to the multiple netCDF files containing the variables.
+- `-c`: Path to the JSON file containing the object store credentials.
+- `-b`: Bucket name in the object store where the variables will be stored.
+- `-p`: Prefix used to define path to object (see above).
+- `-gf`: Path to model grid file containing domain variables.
+- `-uc`: Coordinate dimension variables to update, given as a JSON string '{current_coord : new_coord}'.
+- `-cs`: Chunk strategy used to rechunk model data.
+- `-dc`: Path to JSON file containing Dask configuration.
+- `-zv`: Zarr version used to create the zarr store. Options are 2 (v2) or 3 (v3).
+- `-br`: Branch of Icechunk repository to commit changes to.
+- `-cm`: Commit message to be recorded when committing changes to Icechunk repository.
+
+* Here the contents of the ``dask_config.json`` are:

 ```json
 {
@@ -123,23 +123,23 @@ where the contents of the ``dask_config.json`` are:
 }
 ```

-In the example, a LocalCluster with 12 single threaded workers, each with 2 GB of available memory, is used to transfer a large collection of files to an object store.
+* In the above example, a dask LocalCluster with 12 single-threaded workers, each with 2 GB of available memory, is used to transfer a large collection of files to the object store.

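The body of ``dask_config.json`` is elided in the hunks above. As a rough sketch only, assuming the file mirrors `dask.distributed` LocalCluster keyword arguments, a configuration matching the description of 12 single-threaded workers with 2 GB each might look like:

```bash
# Assumed layout only: the exact schema expected by the CLI is not shown here.
# These keys mirror dask.distributed LocalCluster keyword arguments.
cat > dask_config.json <<'EOF'
{
    "n_workers": 12,
    "threads_per_worker": 1,
    "memory_limit": "2GB"
}
EOF
```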
-Users are strongly recommended to implement `send_to_zarr` workflows using a job scheduler, such as SLURM or PBS, to either run the LocalCluster on a single compute node or to use an existing the SLURMCluster or PBSCluster (dask job queue).
+* Users are strongly recommended to implement `send_to_zarr` workflows using a job scheduler, such as SLURM or PBS, to either run the LocalCluster on a single compute node or to use an existing SLURMCluster or PBSCluster (dask-jobqueue).

 **Note:** the netCDF4 library does not support multi-threaded access to datasets, so users should ensure that ``threads_per_worker : 1`` in their dask configuration JSON file to avoid raising CancelledError exceptions when using ``send_to_zarr`` or `update_zarr`.

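One possible pattern for the scheduler recommendation above is to wrap the command in a SLURM batch script and run the LocalCluster on a single compute node. The sketch below is illustrative only; the resource requests, environment setup and paths are placeholders to adapt to your site:

```bash
#!/bin/bash
# Illustrative sketch only: run a send_to_zarr LocalCluster workflow on one node.
# Resource requests and environment setup are site-specific placeholders.
#SBATCH --job-name=send_to_zarr
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem=24G
#SBATCH --time=06:00:00

# Activate whichever environment provides the OceanDataStore CLI (placeholder).
conda activate ods-env

ods send_to_zarr -f /path/to/files*.nc -c credentials.json -b bucket_name -p prefix \
    -gf /path/to/domain_cfg.nc -uc '{"lon":"lon_new", "lat":"lat_new"}' \
    -cs '{"x":2160, "y":1803}' -dc dask_config.json -zv 3
```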
 ### Updating Existing Stores

-To update an existing Zarr store in an object store, we can use the `update_zarr` command:
+* To update an existing Zarr store in an S3-compatible object store, we can use the `update_zarr` command:

 ```bash
 ods update_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var -zv 3
 ```

-This command will replace and/or append the values of variable `var` stored at the local filepath to the `/bucket_name/prefix/var` store provided it already exists in the object store.
+* This command will replace and/or append the values of variable `var` stored at the local filepath to the `/bucket_name/prefix/var` store, provided it already exists in the object store.

-Similarly, to update an existing Icechunk repository, we can use the `update_icechunk` command:
+* Similarly, to update an existing Icechunk repository, we can use the `update_icechunk` command:

 ```bash
 ods update_icechunk -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var -br "main" -cm "Update commit message..."
@@ -149,7 +149,7 @@ ods update_icechunk -f /path/to/file.nc -c credentials.json -b bucket_name -p pr

 ### Updating Existing Stores With Lots of Files

-To update an existing Zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) via the `update_zarr` command as we showed above with `send_to_zarr`:
+* To update an existing Zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) via the `update_zarr` command as we showed above with `send_to_zarr`:

 ```bash
 ods update_zarr -f filepaths -c credentials.json -b bucket_name -p prefix \
@@ -158,7 +158,7 @@ ods update_zarr -f filepaths -c credentials.json -b bucket_name -p prefix \
 -dc dask_config.json -zv 3
 ```

-Similarly, to update an existing Icechunk repository with a large collection of files, we can use the `update_icechunk` command:
+* Similarly, to update an existing Icechunk repository with a large collection of files, we can use the `update_icechunk` command:

 ```bash
 ods update_icechunk -f filepaths -c credentials.json -b bucket_name -p prefix \
@@ -167,14 +167,14 @@ ods update_icechunk -f filepaths -c credentials.json -b bucket_name -p prefix \
 -dc dask_config.json -br "main" -cm "Update commit message..."
 ```

-where `-ad` is the dimension along which to append chunk data.
+* Here `-ad` is the dimension along which to append chunk data.

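To illustrate `-ad`, here is a hedged single-file sketch that appends new output along a time dimension of an existing Zarr v3 store; the dimension name, the file path, and the assumption that `-ad` combines with the single-file `update_zarr` form shown earlier are all illustrative:

```bash
# Illustrative sketch only: append the next month of output along the
# "time_counter" dimension of an existing Zarr v3 store (names are placeholders).
ods update_zarr -f /path/to/2015-02.nc -c credentials.json \
    -b bucket_name -p prefix -v var -ad time_counter -zv 3
```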
 ## Reference

-For a complete reference to the available flags when using the **OceanDataStore CLI**, see the [Reference] page.
+* For a complete reference to the available flags when using the **OceanDataStore CLI**, see the **[CLI Reference]** page.

-## Examples
+## Further Examples

-For further examples of how to implement the commands in **OceanDataStore** in your own workflows, see the bash scripts in the `examples` directory.
+* For further examples of how to implement the **OceanDataStore CLI** in your own workflows, see our example bash scripts in the `examples` directory.

-[Reference]: cli_reference.md
+[CLI Reference]: cli_reference.md
