# ModelScope Model
## How to finetune and infer using a pretrained Paraformer-large Model
### Finetune
Modify the finetune-related parameters in `finetune.py` (a filled-in sketch follows the list):

- `output_dir`: # result dir
- `data_dir`: # the dataset dir, which needs to include the files `train/wav.scp`, `train/text`, `validation/wav.scp`, and `validation/text` (the file format is sketched after the run command below)
- `dataset_type`: # for a dataset larger than 1000 hours, set as `large`; otherwise set as `small`
- `batch_bins`: # batch size. If `dataset_type` is `small`, `batch_bins` indicates the number of feature frames; if `dataset_type` is `large`, `batch_bins` indicates the duration in ms
- `max_epoch`: # the number of training epochs
- `lr`: # learning rate
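As a concrete starting point, here is a minimal sketch of such a parameter block. It assumes `finetune.py` exposes these settings as plain top-level variables; the paths and values below are placeholders, not recommended defaults.

```python
# Hypothetical parameter block for finetune.py; adjust to your setup.
output_dir = "./checkpoint"  # where checkpoints and logs are written
data_dir = "./data"          # must contain train/ and validation/ with wav.scp and text
dataset_type = "small"       # use "large" for corpora over 1000 hours
batch_bins = 2000            # feature frames ("small") or duration in ms ("large")
max_epoch = 20               # number of training epochs
lr = 0.0005                  # learning rate
```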
Then you can run the pipeline to finetune with:

```sh
python finetune.py
```
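For reference, `wav.scp` and `text` are Kaldi-style two-column files that map an utterance ID to a wav path or to its transcript. A minimal sketch (the IDs, paths, and transcripts are made up):

```
# train/wav.scp
utt_0001 /data/audio/utt_0001.wav
utt_0002 /data/audio/utt_0002.wav

# train/text
utt_0001 今天天气怎么样
utt_0002 明天会下雨吗
```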
### Inference
Alternatively, you can use the finetuned model directly for inference.
Set the parameters in `infer.py` (a filled-in sketch follows the list):

- `data_dir`: # the dataset dir, which needs to include `test/wav.scp`; if `test/text` also exists, the CER will be computed
- `output_dir`: # result dir
- `ngpu`: # the number of GPUs for decoding
- `njob`: # the number of jobs for each GPU
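A matching sketch for `infer.py`, under the same assumption that the settings are plain top-level variables (the values are placeholders):

```python
# Hypothetical parameter block for infer.py; adjust to your setup.
data_dir = "./data"       # must contain test/wav.scp (and optionally test/text for CER)
output_dir = "./results"  # where decoding results are written
ngpu = 1                  # number of GPUs used for decoding
njob = 4                  # number of decoding jobs per GPU
```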
Then you can run the pipeline to infer with:

```sh
python infer.py
```
#### Results
The decoding results can be found in `$output_dir/1best_recog/text.cer`, which includes the recognition result of each sample and the CER metric for the whole test set.
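To inspect the report you can simply print the file; the snippet below assumes `output_dir` was set to `./results` and makes no assumption about the file's internal layout:

```python
# Print the per-sample recognition results and the overall CER report.
with open("./results/1best_recog/text.cer", encoding="utf-8") as f:
    print(f.read())
```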
### Inference using a local finetuned model
Modify the inference-related parameters in `infer_after_finetune.py` (a filled-in sketch follows the list):

- `output_dir`: # result dir
- `data_dir`: # the dataset dir, which needs to include `test/wav.scp`; if `test/text` also exists, the CER will be computed
- `decoding_model_name`: # set the checkpoint name for decoding, e.g., `valid.cer_ctc.ave.pb`
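Again assuming plain top-level variables in `infer_after_finetune.py` (the paths are placeholders; the checkpoint name is the example from the list above):

```python
# Hypothetical parameter block for infer_after_finetune.py; adjust to your setup.
output_dir = "./checkpoint"                   # the finetune output dir containing the checkpoints
data_dir = "./data"                           # must contain test/wav.scp (and optionally test/text)
decoding_model_name = "valid.cer_ctc.ave.pb"  # checkpoint to decode with
```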
Then you can run the pipeline to infer with:

```sh
python infer_after_finetune.py
```
#### Results
The decoding results can be found in `$output_dir/decoding_results/text.cer`, which includes the recognition result of each sample and the CER metric for the whole test set.