Commit b478af5

Update based on suggestions from armintoepfer@.
1 parent a597b5a commit b478af5

2 files changed

Lines changed: 21 additions & 19 deletions

docs/quick_start.md

Lines changed: 17 additions & 15 deletions
@@ -5,10 +5,10 @@ dataset. This will cover the steps of running from a subreads BAM file and
 generate a FASTQ of consensus reads.
 
 This covers the following stages:
-1. Running [pbccs] with the `--all` option to output all reads (it is possible
-to use DeepConsensus from existing pbccs reads, but yield will be higher when
+1. Running *[ccs]* with the `--all` option to output all reads (it is possible
+to use DeepConsensus from existing *ccs* reads, but yield will be higher when
 including all reads)
-2. Aligning subreads to the pbccs consensus with [actc]
+2. Aligning subreads to the *ccs* consensus with *[actc]*
 3. Running DeepConsensus using one of two options (with pip or using Docker)
 
 ## System configuration
@@ -57,17 +57,17 @@ bash install_nvidia_docker.sh
 
 to make sure our GPU is set up correctly.
 
-## Process the data with [pbccs] and [actc]
+## Process the data with *ccs* and *actc*
 
-You can install `ccs` and `actc` on your own. For convenience, we put them in
+You can install *[ccs]* and *[actc]* on your own. For convenience, we put them in
 a Docker image:
 
 ```
-DOCKER_IMAGE=google/deepconsensus:0.2.0rc-gpu
+DOCKER_IMAGE=google/deepconsensus:0.2.0rc1-gpu
 sudo docker pull ${DOCKER_IMAGE}
 ```
 
-DeepConsensus operates on subreads aligned to a draft consensus. We use [pbccs]
+DeepConsensus operates on subreads aligned to a draft consensus. We use *ccs*
 to generate this.
 
 ```bash
@@ -82,9 +82,9 @@ Note that the `--all` flag is a required setting for DeepConsensus to work
 optimally. This allows DeepConsensus to rescue reads previously below the
 quality threshold.
 If you want to split up the task for parallelization, we recommend using the
-`--chunk` option in `ccs`.
+`--chunk` option in *ccs*.
 
-Then, we create `subreads_to_ccs.bam` was created by running [actc]:
+Then, we create `subreads_to_ccs.bam` was created by running *actc*:
 
 ```bash
 sudo docker run -v "${DATA}":"/data" ${DOCKER_IMAGE} \
@@ -94,11 +94,13 @@ sudo docker run -v "${DATA}":"/data" ${DOCKER_IMAGE} \
 /data/subreads_to_ccs.bam
 ```
 
-DeepConsensus will take FASTA format of ccs, so we use samtools to generate.
+DeepConsensus will take FASTA format of *ccs*.
+
+*actc* already converted the BAM into FASTA. Rename and index it.
 
 ```bash
 sudo docker run -v "${DATA}":"/data" ${DOCKER_IMAGE} \
-samtools fasta --threads "$(nproc)" /data/ccs.bam > ${DATA}/ccs.fasta
+mv /data/subreads_to_ccs.fasta /data/ccs.fasta
 
 sudo docker run -v "${DATA}":"/data" ${DOCKER_IMAGE} \
 samtools faidx /data/ccs.fasta
@@ -111,7 +113,7 @@ sudo docker run -v "${DATA}":"/data" ${DOCKER_IMAGE} \
 You can install DeepConsensus using `pip`:
 
 ```bash
-pip install deepconsensus[gpu]==0.2.0rc0
+pip install deepconsensus[gpu]==0.2.0rc1
 ```
 
 NOTE: If you're using a CPU machine, install with `deepconsensus[cpu]` instead.
@@ -138,7 +140,7 @@ time deepconsensus run \
 
 At the end of your run, you should see:
 ```
-Processed 1000 ZMWs in 346.73112511634827 seconds
+Processed 1000 ZMWs in 341.3297851085663 seconds
 Outcome counts: OutcomeCounter(empty_sequence=0, only_gaps_and_padding=50, failed_quality_filter=424, failed_length_filter=0, success=526)
 ```
 the outputs can be found at the following paths:
@@ -169,7 +171,7 @@ time sudo docker run --gpus all \
 At the end of your run, you should see:
 
 ```
-Processed 1000 ZMWs in 433.63712906837463 seconds
+Processed 1000 ZMWs in 428.84565114974976 seconds
 Outcome counts: OutcomeCounter(empty_sequence=0, only_gaps_and_padding=50, failed_quality_filter=424, failed_length_filter=0, success=526)
 ```
 
@@ -184,6 +186,6 @@ You might be able to tweak parameters like `--batch_zmws` depending on your
 hardware limit. You can also see [runtime_metrics.md](runtime_metrics.md) for
 runtime on different CPU or GPU machines.
 
-[pbccs]: https://github.com/PacificBiosciences/ccs
+[ccs]: https://ccs.how
 [actc]: https://github.com/PacificBiosciences/align-clr-to-ccs
 [a GitHub issue]: https://github.com/google/deepconsensus/issues
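
Note on the expected output quoted in this file's hunks: the `Outcome counts` line is identical before and after the commit, so only the wall-clock time changed. The yield it implies can be read off with a small script. This is a minimal sketch that assumes the log line keeps the `key=value` layout shown in the diff; the function name `success_fraction` is hypothetical and not part of DeepConsensus.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a DeepConsensus command): pulls every key=value
# integer out of an "Outcome counts" log line and prints success / total
# as a percentage with one decimal place.
success_fraction() {
  local line="$1"
  printf '%s\n' "$line" | grep -oE '[a-z_]+=[0-9]+' | awk -F= '
    { total += $2; if ($1 == "success") ok = $2 }
    END { if (total > 0) printf "%.1f\n", 100 * ok / total }'
}

# Log line copied verbatim from the diff.
line='Outcome counts: OutcomeCounter(empty_sequence=0, only_gaps_and_padding=50, failed_quality_filter=424, failed_length_filter=0, success=526)'
success_fraction "$line"
```

For the line quoted in the diff this prints `52.6`, that is, 526 successful ZMWs out of the 1000 processed.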

docs/runtime_metrics.md

Lines changed: 4 additions & 4 deletions
@@ -17,8 +17,8 @@ gcloud compute instances create "${USER}-n2-64" \
 --zone "us-west1-b"
 ```
 
-* With pip: 735.94 seconds / 1000 ZMWs
-* With Docker: 760.54 seconds / 1000 ZMWs
+* With pip: 725.50 seconds / 1000 ZMWs
+* With Docker: 707.41 seconds / 1000 ZMWs
 
 ## 16vCPUs (Cascade Lake) (n2-standard-16 on GCP)
 
@@ -54,5 +54,5 @@ gcloud compute instances create "${USER}-gpu" \
 --min-cpu-platform "Intel Skylake"
 ```
 
-* With pip: 346.73 seconds / 1000 ZMWs
-* With Docker: 433.64 seconds / 1000 ZMWs
+* With pip: 341.33 seconds / 1000 ZMWs
+* With Docker: 428.85 seconds / 1000 ZMWs
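
To put the updated timings in this file into perspective, the relative change per entry can be computed as below. This is a sketch only: the helper name `pct_change` is hypothetical, and the labels `n2-64` and `gpu` are taken from the instance names visible in the two hunk headers rather than from section headings, which the diff does not show.

```bash
#!/usr/bin/env bash
# Hypothetical helper: percent change from an old timing to a new one,
# printed with two decimal places (negative means the new run is faster).
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.2f\n", 100 * (after - before) / before }'
}

# Timings (seconds per 1000 ZMWs) copied from the two hunks above.
printf 'n2-64, pip:    %s%%\n' "$(pct_change 735.94 725.50)"
printf 'n2-64, Docker: %s%%\n' "$(pct_change 760.54 707.41)"
printf 'gpu,   pip:    %s%%\n' "$(pct_change 346.73 341.33)"
printf 'gpu,   Docker: %s%%\n' "$(pct_change 433.64 428.85)"
```

The Docker run on the `n2-64` instance moves the most (about 7% faster); the other three entries change by roughly 1 to 1.5%.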
