From 60e057d62189a210ec085d93f89e922bca3d005c Mon Sep 17 00:00:00 2001
From: Frank Lee
Date: Thu, 26 May 2022 10:45:32 +0800
Subject: [PATCH] [doc] update readme for bert preprocessing (#129)

---
 language/bert/preprocessing/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/language/bert/preprocessing/README.md b/language/bert/preprocessing/README.md
index 14f07d9..196d3cb 100644
--- a/language/bert/preprocessing/README.md
+++ b/language/bert/preprocessing/README.md
@@ -93,7 +93,7 @@
 The texts are then processed so that they can be fed to dask. This could take hours.
 
 ```shell
-download_wikipedia --outdir $PWD
+download_wikipedia --outdir $PWD/wikipedia
 ```
 
 3. Preprocess the data with dask
@@ -111,4 +111,4 @@ bash ./pretrain_preprocess.sh 2 bert-large-uncased 512
 > **Note**
 > 1. Vocab file can be either a vocab file name or path to a vocab file.
 > 2. You may see some dask errors during preprocessing, but it is ok as long as the program does not stop.
-> 3. You can increase `num_dask_workers` to speed up data processing, but will consume more memory.
\ No newline at end of file
+> 3. You can increase `num_dask_workers` to speed up data processing, but will consume more memory.