Processing and Usage of the WenetSpeech Dataset

The WenetSpeech dataset provides more than 22,400 hours of Mandarin Chinese speech in three subsets: strong-labeled (10,005 hours), weak-labeled (2,478 hours), and unlabeled (9,952 hours). This makes it suitable for supervised, semi-supervised, or unsupervised training. The data is grouped by domain and speaking style, and it is released with training subsets of three sizes (S, M, L) plus separate evaluation and test sets. The tutorial explains how to download, prepare, and use the dataset to train speech recognition models, which makes it a useful reference for ASR system developers.
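As a rough illustration of how the subset sizes relate to the training regimes mentioned above, the following Python sketch adds up the hours each regime could draw on. The hour figures come from this summary. The mapping from regime to subsets and the names `SUBSET_HOURS`, `REGIME_SUBSETS`, and `available_hours` are assumptions made for this example, not part of the dataset's tooling.

```python
# Hours per subset, as quoted in the summary above.
SUBSET_HOURS = {
    "strong": 10_005,
    "weak": 2_478,
    "unlabeled": 9_952,
}

# Hypothetical mapping from training regime to the subsets it could use.
# This is an assumption for illustration; a real recipe may choose differently.
REGIME_SUBSETS = {
    "supervised": ("strong",),
    "semi-supervised": ("strong", "weak"),
    "unsupervised": ("strong", "weak", "unlabeled"),
}


def available_hours(regime: str) -> int:
    """Return the total hours of audio available to the given regime."""
    return sum(SUBSET_HOURS[name] for name in REGIME_SUBSETS[regime])


if __name__ == "__main__":
    for regime in REGIME_SUBSETS:
        print(f"{regime}: {available_hours(regime):,} hours")
```

Under this assumed mapping, only the unsupervised regime, which ignores transcripts, can use the whole corpus. Supervised training is limited to the strong-labeled subset.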
