Ziyi's Blog

Chinese Pipeline: access to HPC and usage of Singularity

Word count: 415 | Reading time: 2 min
2019/06/01

It’s really exciting to work with Red Hen this summer! I will record the whole project on this blog.

Introduction to the project

Red Hen gathers Chinese broadcasts to make datasets for NLP, OCR, audio, and video pipelines. This project mainly aims to improve the current ASR pipeline and build an NLP pipeline.

To read the whole proposal, see the Chinese Pipeline proposal.

The code and samples we use are in the Chinese pipeline repository on GitHub.

Use of CWRU HPC

Connection to HPC

We use the HPC cluster instead of our local servers. To connect to the CWRU network, you first need to download the Cisco VPN client (this step is essential). The website already explains in detail how to use the VPN. Just note that there are two passwords: the first is the password you use to log in to CASE, and the second is the Duo code.

Also, avoid public networks, especially the school network, because they may be unstable. Using a mobile phone hotspot is the best choice.

After connecting to the VPN, we can connect to the HPC with:

ssh abc123@hpclogin.case.edu

To avoid entering your password every time, you can add your local id_rsa.pub to authorized_keys on the cluster. The authorized_keys file is in the “.ssh” directory under your home directory.
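The key setup described above can be sketched as a small shell function to run on the cluster after copying your public key there. This is only a sketch: install_pubkey is a hypothetical helper name (it is not part of OpenSSH), and the permissions it sets are the usual ones sshd expects, not something specific to this cluster.

```shell
# Sketch: append a public key to authorized_keys only if it is not already there.
# install_pubkey is a hypothetical helper name, not an OpenSSH command.
install_pubkey() {
    local pubkey_file="$1"
    local auth_file="${2:-$HOME/.ssh/authorized_keys}"
    local auth_dir key

    if [ ! -r "$pubkey_file" ]; then
        echo "cannot read public key: $pubkey_file" >&2
        return 1
    fi

    auth_dir="$(dirname "$auth_file")"
    key="$(cat "$pubkey_file")"

    # sshd ignores the file if the directory or file is too permissive.
    mkdir -p "$auth_dir" && chmod 700 "$auth_dir"
    touch "$auth_file" && chmod 600 "$auth_file"

    # -x: match the whole line, -F: treat the key as a fixed string.
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Example (on the cluster, assuming the key was copied to your home directory):
#   install_pubkey ~/id_rsa.pub
```

If your local machine has ssh-copy-id, running ssh-copy-id abc123@hpclogin.case.edu from your laptop should achieve the same result in one step.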

Use of Singularity Container and GPU

This is my first time using a Singularity container, and it took me some time to get familiar with it. It may be a little difficult for a beginner who hasn’t used Docker before, but you will find it quite convenient once you get used to it.

Singularity enables users to have full control of their environment. Singularity containers can be used to package entire scientific workflows, software and libraries, and even data. This means that you don’t have to ask your cluster admin to install anything for you - you can put it in a Singularity container and run.

To start a Singularity container, you need to load the singularity and cuda modules. If you need a GPU, you also need to request a GPU allocation before you open a shell in the image.

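module load singularity/2.5.1
module load cuda/7.5
# Bind /mnt into the container; without this you cannot access the files on the cluster.
export SINGULARITY_BINDPATH="/mnt"
# Request an interactive session on the gpu partition (gpup100 nodes, 2 GPUs, 100 GB of memory).
srun -p gpu -C gpup100 --mem=100gb --gres=gpu:2 --pty bash
# Change to the directory that contains your image.
cd path
# Open a shell in the image with a clean environment (-e) and NVIDIA GPU support (--nv).
singularity shell -e --nv XXXX.simg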
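To check that the GPU is visible from inside the image without opening an interactive shell, you can run a single command with singularity exec instead of singularity shell. The following is a minimal sketch: run_in_image is a hypothetical wrapper name, XXXX.simg is the same placeholder image name used above, and it assumes nvidia-smi is available on the GPU node.

```shell
# Sketch: run one command inside a Singularity image with GPU support.
# run_in_image is a hypothetical helper name, not a Singularity command.
run_in_image() {
    local image="$1"
    shift

    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi

    # -e cleans the environment, --nv exposes the host's NVIDIA driver and tools.
    singularity exec -e --nv "$image" "$@"
}

# Example (inside the srun session, with the modules loaded):
#   run_in_image XXXX.simg nvidia-smi
```

If the allocation worked, nvidia-smi should list the GPUs that were requested with --gres.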