This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

[doc] update readme for bert preprocessing (#129)
FrankLeeeee authored May 26, 2022
1 parent 22feb71 commit 60e057d
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions language/bert/preprocessing/README.md
@@ -93,7 +93,7 @@ The texts are then processed so that they can be fed to dask.
This could take hours.

```shell
-download_wikipedia --outdir $PWD
+download_wikipedia --outdir $PWD/wikipedia
```
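
The changed command writes the dump into a dedicated `wikipedia` subdirectory instead of the current directory. A minimal sketch of how that step might be wrapped is shown below. It assumes the tool may not create the output directory itself, which the README does not state, and `WIKI_DIR` is a hypothetical variable name used only for illustration.

```shell
# Sketch only: WIKI_DIR is a hypothetical variable, not part of the README.
WIKI_DIR="${WIKI_DIR:-$PWD/wikipedia}"

# Create the output directory up front (assumption: the tool may not create it).
mkdir -p "$WIKI_DIR"

# Run the download only if the tool is on PATH; the download itself can take hours.
if command -v download_wikipedia >/dev/null 2>&1; then
    download_wikipedia --outdir "$WIKI_DIR"
else
    echo "download_wikipedia not found; skipping download" >&2
fi
```

Keeping the dump in its own directory keeps the downloaded files separate from the preprocessing scripts in the working directory.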

3. Preprocess the data with dask
@@ -111,4 +111,4 @@ bash ./pretrain_preprocess.sh 2 bert-large-uncased 512
> **Note**
> 1. Vocab file can be either a vocab file name or path to a vocab file.
> 2. You may see some dask errors during preprocessing, but it is ok as long as the program does not stop.
-> 3. You can increase `num_dask_workers` to speed up data processing, but will consume more memory.
+> 3. You can increase `num_dask_workers` to speed up data processing, but will consume more memory.
