Ziyi's Blog

Chinese-Pipeline: Final Report

Welcome to the Chinese Pipeline project. In this post, we show the work we did on the Chinese Pipeline for Red Hen Lab as part of GSoC 2019. Please click here to access my repository and click here to see the previous blog posts. Please feel free to send emails to liuziyi219@gmail.com if you hav...

2019/08/21

Chinese-Pipeline: ASR for Chinese Pipeline

The project is mainly based on DeepSpeech2 on PaddlePaddle, an open-source project released by Baidu. For the configuration, I strongly recommend using our Singularity recipe to avoid potential problems. For the code, click here. Prerequisites: For the Red Hen Lab participants, all the configura...

2019/08/18

Chinese Pipeline: Decreasing the sample rate doesn't work

Some posts say the DeepSpeech model only supports an 8 kHz sample rate, so I converted all the audio to 8 kHz. However, the performance was much worse than before. Here is the code for converting the sample rate: for file in *.wav;do #echo $file c=${file} #echo $c ...

2019/07/24

Chinese Pipeline: Several updates about VAD and XunfeiSDK

Updates about WebRTCVad. Change of the usage: initial: python audiosplit.py <aggressiveness> <dir to result>; now: python audiosplit.py <padding duration> <path to wav file/directory> <path to result>. Two modes: support splitting a single audio file or all the audio files in a certa...

2019/07/23

Chinese Pipeline: How to use XunfeiSDK

Register and download the SDK package: The first step is to sign up on the Xunfei Open Platform. Then go to your personal applications page, where you can create an application and add the service you need (you can choose SDK, WebAPI, or other formats; in this case, I used the Linux SDK). After that, downlo...

2019/07/06

Chinese-Pipeline: First experimental result and a comparison of 3 ASR apps

This week I tested the model on the test set. The current code is split into separate pieces for ease of testing; once we obtain a satisfactory result, I will restructure it. The result and the WER: Here are some results and their WER (word error rate) values. This time the result is much be...

2019/06/23

Chinese Pipeline: An introduction to the Chinese news test set

I spent more than a week making this test set. It contains almost 150 minutes of speech. I wrote the reference transcript for each audio file, so we can measure exactly how well the ASR Chinese Pipeline performs. Sources: The data is from 新闻联播 (Xinwen Lianbo). We extracted 5 episodes, which are from 2019....

2019/06/16

Chinese Pipeline: Usage of WebrtcVAD

WebrtcVAD: A VAD classifies a piece of audio data as being voiced or unvoiced. Installation: Install the webrtcvad module with pip install webrtcvad. Preparing the audio: Red Hen only has videos in mp4 format, so we need to convert the videos to audio with FFmpeg. FFmpeg is a powerful tool for format convert...

2019/06/08

Chinese Pipeline: Audio playing and audio editing

How to play audio on the HPC: Prof. Steen installed SoX for us. SoX is a powerful tool for audio editing. However, for reasons I don’t know, I couldn’t use SoX in my image, although I can use it on my local computer. Prof. Steen advised me to use netcat. This method didn’t solve the problem directly, ...

2019/06/08

Chinese Pipeline: access to HPC and usage of Singularity

It’s really exciting to work with Red Hen this summer! I will record the whole process in my blog. Introduction to the project: Red Hen gathers Chinese broadcasts to build datasets for NLP, OCR, audio, and video pipelines. This work is mainly to improve the current ASR pipeline and build an NLP pip...

2019/06/01