!!! abstract "Summary"

    **This is the User Guide for the OceanDataStore Command Line Interface (CLI) to write and update ocean data to Analysis-Ready Cloud Optimised formats in S3-compatible cloud object storage.**

---
## Creating a Credentials File

To get started using **OceanDataStore CLI**, users need to create a ``credentials.json`` file containing the following information:

```json
{
    "token": "your-access-key-id",
    "secret": "your-secret-access-key",
    "endpoint_url": "https://your-object-store-endpoint"
}
```

where `token` is your access key ID, `secret` is your secret access key, and `endpoint_url` is the optional endpoint URL of your S3-compatible object store.

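
As a quick sanity check, you can create the file from the shell and confirm it parses as valid JSON (the values below are placeholders, not real credentials):

```shell
# Write a template credentials file (placeholder values -- substitute your own).
cat > credentials.json <<'EOF'
{
    "token": "your-access-key-id",
    "secret": "your-secret-access-key",
    "endpoint_url": "https://your-object-store-endpoint"
}
EOF

# json.tool exits non-zero if the file is not valid JSON.
python3 -m json.tool credentials.json > /dev/null && echo "credentials.json is valid JSON"
```
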
## Sending Individual Files
Zarr works especially well in combination with cloud storage, such as the JASMIN object store, given that users can access data concurrently from multiple threads or processes using Python or a number of other programming languages.

[Click here](https://zarr-specs.readthedocs.io/en/latest/specs.html) for more information on the Zarr specification.

To send a local netCDF file to a Zarr store in an S3-compatible object store, we can use the `send_to_zarr` command:
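
A representative invocation, with illustrative file paths, bucket name, prefix, and variable name, might look like:

```bash
ods send_to_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var -zv 3
```

Here the `-p` and `-v` flags follow the same conventions as the `send_to_icechunk` example in the next section.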

- `-f`: Path to the netCDF file containing the variables.
- `-c`: Path to the JSON file containing the object store credentials.
- `-b`: Bucket name in the object store where the variables will be stored.
- `-zv`: Zarr version used to create the Zarr store. Options are 2 (v2) or 3 (v3).

In the above example, the variable(s) will be stored in a single Zarr v3 store at the `<bucket_name>/<prefix>` path.

### Icechunk Repositories
This allows Icechunk repositories to support data version control, since users can time-travel to previous snapshots of a repository.

[Click here](https://icechunk.io/en/latest/overview/) for an overview of Icechunk.

To create a new Icechunk repository in an S3-compatible object store from a variable `var` contained in a local netCDF file, we can use the `send_to_icechunk` command:

```bash
ods send_to_icechunk -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var -br "main" -cm "New commit message..."
```

The arguments used are:

- `-f`: Path to the netCDF file containing the variables.
- `-c`: Path to the JSON file containing the object store credentials.
- `-b`: Bucket name in the object store where the variables will be stored.
- `-v`: Variable within the netCDF file to send to the object store.
- `-br`: Branch of the Icechunk repository to commit changes to.
- `-cm`: Commit message to be recorded when committing changes to the Icechunk repository.

Note that the `send_to_icechunk` command requires two additional arguments, `-br` and `-cm`, which define the branch on which to perform the transaction and the commit message to record.

## Sending Lots of Files to Stores

To create a new Zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) with the `send_to_zarr` command by passing a dask configuration JSON file:

```bash
ods send_to_zarr -f /path/to/file_*.nc -c credentials.json -b bucket_name -p prefix -v var -gf /path/to/grid_file.nc \
-cs '{"x":2160, "y":1803}' -dc dask_config.json -br "main" -cm "New big commit message..."
```

The arguments used are:

- `-f`: Paths to the multiple netCDF files containing the variables.
- `-c`: Path to the JSON file containing the object store credentials.
- `-b`: Bucket name in the object store where the variables will be stored.
- `-p`: Prefix used to define the path to the object (see above).
- `-gf`: Path to the model grid file containing domain variables.
- `-uc`: Coordinate dimension variables to update, given as a JSON string `'{current_coord : new_coord}'`.
- `-cs`: Chunk strategy used to rechunk the model data.
- `-dc`: Path to the JSON file containing the Dask configuration.
- `-zv`: Zarr version used to create the Zarr store. Options are 2 (v2) or 3 (v3).
- `-br`: Branch of the Icechunk repository to commit changes to.
- `-cm`: Commit message to be recorded when committing changes to the Icechunk repository.

where the contents of the ``dask_config.json`` are:

```json
{
    "n_workers": 12,
    "threads_per_worker": 1,
    "memory_limit": "2GB"
}
```

In the above example, a dask LocalCluster with 12 single-threaded workers, each with 2 GB of available memory, is used to transfer a large collection of files to the object store.

Users are strongly recommended to implement `send_to_zarr` workflows using a job scheduler, such as SLURM or PBS, to either run the LocalCluster on a single compute node or to use an existing SLURMCluster or PBSCluster (dask-jobqueue).

**Note:** the netCDF4 library does not support multi-threaded access to datasets, so users should ensure that ``threads_per_worker: 1`` is set in their dask configuration JSON file to avoid CancelledError exceptions when using ``send_to_zarr`` or ``update_zarr``.

### Updating Existing Stores

To update an existing Zarr store in an S3-compatible object store, we can use the `update_zarr` command:
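
For example (the paths, bucket name, prefix, and variable name are illustrative placeholders):

```bash
ods update_zarr -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var
```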

This command will replace and/or append the values of variable `var` stored at the local filepath to the `/bucket_name/prefix/var` store, provided it already exists in the object store.

Similarly, to update an existing Icechunk repository, we can use the `update_icechunk` command:
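
Following the same pattern as `send_to_icechunk` above, an illustrative invocation (placeholder paths and names) is:

```bash
ods update_icechunk -f /path/to/file.nc -c credentials.json -b bucket_name -p prefix -v var -br "main" -cm "Updated commit message..."
```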

To update an existing Zarr store in an object store using a large number of files, we can use [dask](https://www.dask.org) via the `update_zarr` command, just as shown above with `send_to_zarr`.
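
An illustrative invocation (placeholder values), reusing the chunking strategy and dask configuration shown earlier:

```bash
ods update_zarr -f /path/to/file_*.nc -c credentials.json -b bucket_name -p prefix -v var \
-cs '{"x":2160, "y":1803}' -dc dask_config.json
```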