AdaSeq uses a configuration file to control model assembly, training, and evaluation. The configuration file can be written in YAML, JSON, or jsonline format.
Let's take resume.yaml as an example. A configuration file usually consists of the following fields:
experiment: ...
task: ...
dataset: ...
preprocessor: ...
data_collator: ...
model: ...
train: ...
evaluation: ...

Note: in the tables below, a Default of / means the parameter is required.
| Parameter | Description | Type | Default |
|---|---|---|---|
| exp_dir | experiment directory | str | experiments |
| exp_name | experiment name; all outputs are saved to ./${exp_dir}/${exp_name}/${datetime}/ | str | unknown |
| seed | random seed | int | 42 |
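
As a minimal sketch of the experiment field, the fragment below sets all three parameters from the table. The value of exp_name is an illustrative assumption; exp_dir and seed simply repeat their documented defaults.

```yaml
experiment:
  exp_dir: experiments   # documented default
  exp_name: resume       # illustrative name (assumption)
  seed: 42               # documented default
# With these values, outputs would be saved under
# ./experiments/resume/${datetime}/
```
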
task supports the following values (see metainfo):
- word-segmentation
- part-of-speech
- named-entity-recognition
- relation-extraction
- entity-typing
The dataset parameters can be combined in many ways; please refer to Customizing Dataset for details.
| Parameter | Description | Type | Default |
|---|---|---|---|
| task | task of the dataset | str | None |
| name | modelscope dataset name, for example damo/resume_ner | str | None |
| path | huggingface dataset name, for example conll2003 | str | None |
| data_file | data files; can be a URL, a local directory, or an archive, or a dict containing train, valid, and test | str/dict | None |
| data_type | used to specify the data loading method | str | None |
| transform | dataset post-processing, usually containing name, key, and scheme | dict | None |
| labels | label set; can be a list (labels: ['O', 'B-ORG', ...]), a file or URL (labels: PATH_OR_URL), or a function that counts labels from the dataset | str/list/dict | None |
| access_token | used to access private repos from modelscope or huggingface | str | None |
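
The fragments below sketch three alternative ways to fill in the dataset field, one per YAML document (separated by ---); a real configuration would contain only one of them. The dataset names come from the table above. The file paths, the data_type value conll, the I-ORG label, and the token placeholder are assumptions made for illustration.

```yaml
# Alternative 1: a ModelScope dataset, loaded by name.
dataset:
  name: damo/resume_ner
  access_token: YOUR_ACCESS_TOKEN   # placeholder; only needed for private repos
---
# Alternative 2: a Hugging Face dataset, loaded by path.
dataset:
  path: conll2003
---
# Alternative 3: local files given as a dict of splits.
dataset:
  data_file:
    train: ./data/train.txt   # hypothetical paths
    valid: ./data/dev.txt
    test: ./data/test.txt
  data_type: conll            # assumed loader name
  labels: ['O', 'B-ORG', 'I-ORG']   # explicit label list (illustrative)
```
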
| Parameter | Description | Type | Default |
|---|---|---|---|
| type | preprocessor type | str | / |
| model_dir | tokenizer name or directory | str | / |
| is_word2vec | whether to use Lookup Table | bool | False |
| tokenizer_kwargs | other parameters for tokenizer | dict | None |
| max_length | maximum sentence length (subtoken-level) | int | 512 |
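
A preprocessor entry might look like the sketch below. Both type and model_dir are required according to the table; the concrete values shown (a sequence-labeling preprocessor type and the bert-base-cased tokenizer) are assumptions chosen for illustration, not values confirmed by this section.

```yaml
preprocessor:
  type: sequence-labeling-preprocessor   # assumed type name
  model_dir: bert-base-cased             # assumed tokenizer name or directory
  max_length: 150                        # overrides the default of 512
```
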
data_collator supports the following values (see metainfo):
- DataCollatorWithPadding
- SequenceLabelingDataCollatorWithPadding
- SpanExtractionDataCollatorWithPadding
- MultiLabelSpanTypingDataCollatorWithPadding
- MultiLabelConcatTypingDataCollatorWithPadding
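
Unlike the table-driven fields, task and data_collator each take a single value from the lists above. A minimal sketch, assuming a named-entity-recognition setup that is treated as sequence labeling (the pairing is an assumption):

```yaml
task: named-entity-recognition
data_collator: SequenceLabelingDataCollatorWithPadding
```
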
| Parameter | Child-Parameter | Description | Type | Default |
|---|---|---|---|---|
| type | | model type | str | / |
| embedder | | used to embed input ids into vectors, usually a pretrained model | dict | None |
| └ | type | embedder type, optional when using a modelscope or huggingface model | str | None |
| └ | model_name_or_path | pretrained model name or path, supporting both modelscope and huggingface models | str | / |
| encoder | | encodes the sentence vectors, for example with an LSTM | dict | None |
| └ | type | encoder type | str | / |
| decoder | | not available, under construction | dict | None |
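
The nesting of child parameters in the model field can be sketched as follows. The model type, the pretrained model name, and the encoder type identifier are assumptions used only to show the structure; embedder.type is omitted because the table marks it optional for modelscope or huggingface models.

```yaml
model:
  type: sequence-labeling-model            # assumed model type
  embedder:
    model_name_or_path: bert-base-cased    # assumed pretrained model
  encoder:
    type: lstm                             # assumed encoder type identifier
```
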
| Parameter | Child-Parameter | Description | Type | Default |
|---|---|---|---|---|
| trainer | | trainer type | str | None |
| max_epochs | | maximum number of training epochs | int | / |
| dataloader | | used to load data | dict | / |
| └ | batch_size_per_gpu | batch size per gpu | int | / |
| └ | workers_per_gpu | data loading workers per gpu | int | 0 |
| optimizer | | optimizer | dict | None |
| └ | type | optimizer type | str | / |
| └ | lr | learning rate for all parameters except those in specific param_groups | float | / |
| └ | options | options used in the optimizer, for example grad_clip: max_norm: 2.0 | dict | None |
| └ | param_groups | parameter groups that can have different learning rates | list | None |
| └ | └ regex | regular expression that selects the parameter group | str | / |
| └ | └ lr | learning rate for the selected parameter group | float | / |
| lr_scheduler | | used to adjust the learning rate during training | dict | None |
| └ | type | supports all lr_scheduler classes from pytorch (check that your pytorch version includes them) | str | / |
| └ | options | options used in the lr_scheduler | dict | None |
| hooks | | hooks (also called callbacks); see the ModelScope documentation | list | None |
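
Putting the train parameters together, a configuration might look like the sketch below. The grad_clip option is the example given in the table. The numeric values, the AdamW and LinearLR type names (both are PyTorch classes), and the crf regex are illustrative assumptions.

```yaml
train:
  max_epochs: 20
  dataloader:
    batch_size_per_gpu: 16
    workers_per_gpu: 0          # documented default
  optimizer:
    type: AdamW                 # assumed optimizer type
    lr: 5.0e-5                  # applies to all parameters not matched below
    options:
      grad_clip:
        max_norm: 2.0           # example from the table
    param_groups:
      - regex: crf              # hypothetical pattern for parameter names
        lr: 5.0e-1              # separate learning rate for the matched group
  lr_scheduler:
    type: LinearLR              # assumed; requires a pytorch version that provides it
```

Here the param_groups entry gives parameters whose names match the regex a much larger learning rate than the rest of the model, which is the use case the table describes.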
| Parameter | Child-Parameter | Description | Type | Default |
|---|---|---|---|---|
| dataloader | | used to load data | dict | / |
| └ | batch_size_per_gpu | batch size per gpu | int | / |
| └ | workers_per_gpu | data loading workers per gpu | int | 0 |
| metrics | | evaluation metrics | list | None |
| └ | type | metric type | str | / |
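
An evaluation field following this table could be sketched as below. Because metrics is a list, each entry is written as a list item with its own type. The batch size and the metric type name are assumptions for illustration.

```yaml
evaluation:
  dataloader:
    batch_size_per_gpu: 128
  metrics:
    - type: ner-metric          # assumed metric type
```
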