Commit d785f5e

Updates.
1 parent 46b932a commit d785f5e

6 files changed

Lines changed: 66 additions & 34 deletions

README.md

Lines changed: 46 additions & 3 deletions
@@ -1,9 +1,52 @@
 # DeepConsensus
 
-DeepConsensus uses deep learning for correcting errors in Pacific Biosciences
-(PacBio) Circular Consensus Sequencing data.
+DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific
+Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
 
-See: https://github.com/google/deepconsensus
+![DeepConsensus overview diagram](docs/images/pipeline_figure.png)
+
+## Installation
+
+### From pip package
+
+```bash
+pip install deepconsensus==0.1.0
+```
+
+You can ignore errors regarding google-nucleus installation, such as `ERROR:
+Failed building wheel for google-nucleus`.
+
+### From source
+
+```bash
+git clone https://github.com/google/deepconsensus.git
+cd deepconsensus
+source install.sh
+```
+
+(Optional) After `source install.sh`, if you want to run all unit tests, you can
+do:
+
+```bash
+./run_all_tests.sh
+```
+
+## Usage
+
+See the [quick start](docs/quick_start.md).
+
+## Where does DeepConsensus fit into my pipeline?
+
+After a PacBio sequencing run, DeepConsensus is meant to be run on the CCS reads
+and subreads to create new corrected reads in FASTQ format that can take the
+place of the CCS reads for downstream analyses.
+
+See the [quick start](docs/quick_start.md) for an example of inputs and outputs.
+
+NOTE: This initial release of DeepConsensus (v0.1) is not yet optimized for
+speed, and only runs on CPUs. We anticipate this version to be too slow for many
+uses. We are now prioritizing speed improvements, which we anticipate can
+achieve acceptable runtimes.
 
 ## Disclaimer
 
deepconsensus/scripts/run_deepconsensus.py

Lines changed: 0 additions & 1 deletion
@@ -126,7 +126,6 @@ def create_all_commands(directories: List[str], input_subreads_aligned: str,
 --out_dir={directories[3]} \
 --checkpoint_path={checkpoint} \
 --inference=true \
---params=deepconsensus/models/model_configs.py:transformer_learn_values+ccs \
 --max_passes=20'
 
 command5 = f'python3 -m deepconsensus.postprocess.stitch_predictions \
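
The effect of this hunk on the generated inference command can be sketched as follows. This is a minimal illustration, not the repository's code: `build_inference_command` and the two placeholder directory names are hypothetical, and the `{directories[2]}/inference` dataset path is an assumption inferred from the dry-run output in `run_deepconsensus_test.py`.

```python
from typing import List


def build_inference_command(directories: List[str], checkpoint: str,
                            max_passes: int = 20) -> str:
    """Hypothetical helper: assembles the inference command without --params."""
    parts = [
        'python3 -m deepconsensus.models.model_inference_with_beam',
        # Assumption: the dataset lives under the step-3 output directory.
        f'--dataset_path={directories[2]}/inference',
        f'--out_dir={directories[3]}',
        f'--checkpoint_path={checkpoint}',
        '--inference=true',
        f'--max_passes={max_passes}',
    ]
    return ' '.join(parts)


directories = [
    'output_directory/1_placeholder',  # hypothetical name, unused here
    'output_directory/2_placeholder',  # hypothetical name, unused here
    'output_directory/3_write_tf_examples',
    'output_directory/4_model_inference_with_beam',
]
command = build_inference_command(directories, checkpoint='checkpoint')
print(command)
```

Building the command from a list of parts keeps each flag on its own line, so dropping a flag such as `--params` is a one-line change.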

deepconsensus/scripts/run_deepconsensus_test.py

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@
 
 
 ***** DRY-RUN ONLY:*****
-python3 -m deepconsensus.models.model_inference_with_beam --dataset_path=output_directory/3_write_tf_examples/inference --out_dir=output_directory/4_model_inference_with_beam --checkpoint_path=checkpoint --inference=true --params=deepconsensus/models/model_configs.py:transformer_learn_values+ccs --max_passes=20
+python3 -m deepconsensus.models.model_inference_with_beam --dataset_path=output_directory/3_write_tf_examples/inference --out_dir=output_directory/4_model_inference_with_beam --checkpoint_path=checkpoint --inference=true --max_passes=20
 
 
 ***** DRY-RUN ONLY:*****

docs/images/pipeline_figure.png

269 KB

docs/quick_start.md

Lines changed: 11 additions & 24 deletions
@@ -13,12 +13,17 @@ Memory: 126G
 
 ## Download data for testing
 
-This will download about 162 MB of data and the model is another 244 MB.
+This will download about 252 MB of data and the model is another 245 MB.
 
 ```bash
 # Set directory where inputs will be placed.
-INPUTS="${HOME}/deepconsensus_quick_start/inputs"
+QUICKSTART_DIRECTORY="${HOME}/deepconsensus_quick_start/"
+# This will soon have 3 subfolders: inputs, model, and outputs.
+
+INPUTS="${QUICKSTART_DIRECTORY}/inputs"
+MODEL_DIR="${QUICKSTART_DIRECTORY}/model"
 mkdir -p "${INPUTS}"
+mkdir -p "${MODEL_DIR}"
 
 # Download the input data which is PacBio subreads and CCS reads.
 gsutil cp gs://brain-genomics-public/research/deepconsensus/quickstart/v0.1/subreads.bam "${INPUTS}"/
@@ -28,7 +33,7 @@ gsutil cp gs://brain-genomics-public/research/deepconsensus/quickstart/v0.1/ccs.
 # gsutil cp gs://brain-genomics-public/research/deepconsensus/quickstart/v0.1/subreads_to_ccs.bam "${INPUTS}"/
 
 # Download DeepConsensus model.
-gsutil cp gs://brain-genomics-public/research/deepconsensus/models/v0.1/checkpoint-50* "${INPUTS}"/
+gsutil cp gs://brain-genomics-public/research/deepconsensus/models/v0.1/* "${MODEL_DIR}"/
 ```
 
 ## Prepare input files
@@ -51,34 +56,16 @@ samtools view -h "aligned.subreads.bam" | \
 
 ## Install DeepConsensus
 
-First go to a parent directory where you want to install DeepConsensus, then
-follow the steps below to install DeepConsensus.
-
-```bash
-git clone https://github.com/google/deepconsensus.git
-cd deepconsensus
-source install.sh
-```
-
-You can ignore errors regarding google-nucleus installation, such as:
-
-```
-ERROR: Failed building wheel for google-nucleus
-```
-
-(Optional) After `source install.sh`, if you want to run all unit tests, you can
-do:
-
 ```bash
-./run_all_tests.sh
+pip install deepconsensus==0.1.0
 ```
 
 ## Run DeepConsensus
 
 ```bash
 # Set directory where outputs will be placed and set the model for DeepConsensus
-OUTPUTS="${HOME}/deepconsensus_quick_start/outputs"
-CHECKPOINT_PATH=${INPUTS}/checkpoint-50
+OUTPUTS="${QUICKSTART_DIRECTORY}/outputs"
+CHECKPOINT_PATH=${MODEL_DIR}/checkpoint-50
 
 # Run DeepConsensus
 python3 -m deepconsensus.scripts.run_deepconsensus \

setup.py

Lines changed: 8 additions & 5 deletions
@@ -34,13 +34,16 @@
 """
 
 import pathlib
+from setuptools import find_packages
 from setuptools import setup
 
 here = pathlib.Path(__file__).parent.resolve()
 
 # Get the long description from the README file
 long_description = (here / 'README.md').read_text(encoding='utf-8')
 
+REQUIREMENTS = (here / 'requirements.txt').read_text().splitlines()
+
 setup(
 # To support installation via
 #
@@ -50,11 +53,11 @@
 description='DeepConsensus',
 long_description=long_description,
 long_description_content_type='text/markdown',
-url='TODO',
+url='https://github.com/google/deepconsensus',
 author='Google LLC',
-keywords='TODO',
-packages=['deepconsensus'],
+keywords='bioinformatics',
+packages=find_packages(where='.'),
 package_dir={'deepconsensus': 'deepconsensus'},
-python_requires='==3.6',
-install_requires=[],
+python_requires='>=3.6,<3.7',
+install_requires=REQUIREMENTS,
 )
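
The new `REQUIREMENTS` line passes every line of `requirements.txt` to `install_requires` unchanged, which works as long as the file contains only requirement specifiers. If the file ever gains blank lines, comments, or pip options such as `-r other.txt`, a slightly more defensive reader may be useful. The following is a sketch under that assumption, not part of this commit: `read_requirements` is a hypothetical helper, and the package names in the demo are illustrative only.

```python
import pathlib
import tempfile
from typing import List


def read_requirements(path: pathlib.Path) -> List[str]:
    """Hypothetical helper: returns only the requirement specifier lines."""
    requirements = []
    for raw_line in path.read_text(encoding='utf-8').splitlines():
        line = raw_line.strip()
        # Skip blank lines, full-line comments, and pip options (e.g. "-r x.txt"),
        # none of which are valid install_requires entries.
        if not line or line.startswith('#') or line.startswith('-'):
            continue
        requirements.append(line)
    return requirements


# Demo against a temporary file with illustrative contents.
with tempfile.TemporaryDirectory() as tmp:
    req_file = pathlib.Path(tmp) / 'requirements.txt'
    req_file.write_text(
        '# illustrative pins\nnumpy>=1.19\n\n  absl-py  \n-r extra.txt\n',
        encoding='utf-8')
    parsed = read_requirements(req_file)

print(parsed)
```

In `setup.py` this would be used as `install_requires=read_requirements(here / 'requirements.txt')`, reusing the `here` path the file already defines.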
