Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] pd: update paddle doc #4489

Open
wants to merge 1 commit into
base: devel
Choose a base branch
from

Conversation

HydrogenSulfate
Copy link
Contributor

@HydrogenSulfate HydrogenSulfate commented Dec 23, 2024

Add paddle icon svg and update related docs

Summary by CodeRabbit

  • New Features

    • Added support for Paddle as a backend across various documentation sections.
    • Introduced new commands and installation instructions for Paddle in the installation guides.
    • Expanded model freezing and fine-tuning documentation to include Paddle-specific instructions.
    • Updated training documentation to incorporate Paddle commands and usage.
  • Documentation

    • Enhanced README and backend documentation to reflect Paddle support.
    • Updated various model and training documents to include Paddle in supported backends.
  • Chores

    • Added new environment variables related to Paddle in the documentation.

@HydrogenSulfate HydrogenSulfate changed the title pd: update paddle doc [WIP] pd: update paddle doc Dec 23, 2024
Copy link
Contributor

coderabbitai bot commented Dec 23, 2024

📝 Walkthrough

Walkthrough

This pull request introduces comprehensive documentation updates for the DeePMD-kit project to include Paddle as a new supported backend. The changes span multiple documentation files, adding Paddle support to installation instructions, training processes, model descriptors, and various other sections. The modifications provide users with detailed guidance on using Paddle with DeePMD-kit, including installation commands, training procedures, and backend-specific configurations.

Changes

File Change Summary
README.md Updated "Highlighted features" to include Paddle in the list of supported deep learning frameworks
doc/backend.md Added Paddle backend support with file extensions and command line options
doc/conf.py Added Paddle icon substitution for documentation
doc/env.md Introduced new Paddle-related environment variables PD_JIT and PD_CINN
doc/install/* Added Paddle installation instructions for Python and C++ interfaces
doc/model/* Updated model documentation to include Paddle support across various descriptors
doc/train/* Added Paddle implementation details for training, fine-tuning, parallel training, and TensorBoard

Possibly related PRs

Suggested labels

Python, Docs

Suggested reviewers

  • njzjz
  • wanghan-iapcm

Tip

CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command @coderabbitai generate docstrings to have CodeRabbit automatically generate docstrings for your pull request. We would love to hear your feedback on Discord.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (13)
doc/freeze/freeze.md (1)

38-44: Minor grammar improvement.

Consider changing “The output model is called model.json and model.pdiparams.” to indicate that there are two distinct files:

- The output model is called `model.json` and `model.pdiparams`.
+ The output files are called `model.json` and `model.pdiparams`.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~44-~44: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ... where the model is trained. The output model is called model.json and `model.pdipa...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)

🪛 Markdownlint (0.37.0)

41-41: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

doc/env.md (2)

59-64: Consider adding usage examples for PD_JIT

The documentation clearly explains the purpose and options but would benefit from practical examples showing how to enable JIT in different scenarios and its impact on performance.


66-72: Add examples and clarify CINN compatibility

While the documentation mentions the CINN requirement, it would be helpful to:

  1. Add examples showing how to check CINN support and enable it
  2. Provide guidance on when to use CINN (e.g., specific operations or models that benefit from it)
  3. Document any known limitations or compatibility issues
doc/backend.md (2)

38-46: Enhance Paddle backend documentation

The Paddle backend section would benefit from additional information:

  1. Platform support (CPU/GPU/ROCm compatibility)
  2. Performance characteristics compared to other backends
  3. Migration guide for users switching from TensorFlow or PyTorch
  4. Known limitations or unsupported features

69-69: Add examples of backend switching commands

Consider adding concrete examples of the dp commands for each backend to make it clearer for users:

# Train with TensorFlow backend
dp train --tf input.json

# Train with PyTorch backend
dp train --pt input.json

# Train with Paddle backend
dp train --pd input.json
doc/train/training.md (1)

29-35: LGTM! Consider adding command output example.

The Paddle tab item is well-structured and consistent with other backend documentation. Consider adding an example of the command output to match the documentation style of other backends.

🧰 Tools
🪛 Markdownlint (0.37.0)

32-32: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

doc/train/finetuning.md (2)

255-255: Fix typo in documentation

"refering" should be "referring"

-One can check the available model branches in multi-task pre-trained model by refering to the documentation
+One can check the available model branches in multi-task pre-trained model by referring to the documentation
🧰 Tools
🪛 LanguageTool

[misspelling] ~255-~255: This word is normally spelled as one.
Context: ...n check the available model branches in multi-task pre-trained model by refering to the do...

(EN_COMPOUNDS_MULTI_TASK)


208-264: Standardize terminology usage

The document inconsistently uses both hyphenated and non-hyphenated versions of terms:

  • "pre-trained" vs "pretrained"
  • "multi-task" vs "multitask"

Please standardize the usage throughout the document.

🧰 Tools
🪛 Markdownlint (0.37.0)

214-214: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


221-221: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


228-228: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


251-251: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


258-258: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

🪛 LanguageTool

[uncategorized] ~210-~210: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ilizes a single-task pre-trained model (pretrained.pd) and modifies the energy bias withi...

(EN_WORD_COHERENCY)


[misspelling] ~241-~241: This word is normally spelled as one.
Context: ...: {} } ``` #### Fine-tuning from a multi-task pre-trained model Additionally, within...

(EN_COMPOUNDS_MULTI_TASK)


[misspelling] ~243-~243: This word is normally spelled as one.
Context: ...bility offered by the framework and the multi-task training process proposed in DPA2 [pape...

(EN_COMPOUNDS_MULTI_TASK)


[typographical] ~247-~247: It appears that a comma is missing.
Context: ... the paper above. For fine-tuning using this multitask pre-trained model (`multitask...

(BY_FOR_IN_COMMA)


[grammar] ~247-~247: The verb ‘multitask’ is plural. Did you mean: “multitasks”? Did you use a verb instead of a noun?
Context: ...aper above. For fine-tuning using this multitask pre-trained model (`multitask_pretraine...

(PLURAL_VERB_AFTER_THIS)


[uncategorized] ~247-~247: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...using this multitask pre-trained model (multitask_pretrained.pd), one can select a specific branch ...

(EN_WORD_COHERENCY)


[uncategorized] ~248-~248: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ch (e.g., CHOOSEN_BRANCH) included in multitask_pretrained.pd for fine-tuning with the following ...

(EN_WORD_COHERENCY)


[misspelling] ~255-~255: This word is normally spelled as one.
Context: ...n check the available model branches in multi-task pre-trained model by refering to the do...

(EN_COMPOUNDS_MULTI_TASK)

doc/train/parallel-training.md (3)

202-202: Fix punctuation and grammar

There are two issues in this line:

  1. Double period at the end
  2. Missing article before "sequential sampler"
-> **Note** The underlying dataloader will use a distributed sampler to ensure that each GPU receives batches with different content in parallel mode, which will use sequential sampler in serial mode. In the TensorFlow version, Horovod shuffles the dataset using different random seeds for the same purpose..
+> **Note** The underlying dataloader will use a distributed sampler to ensure that each GPU receives batches with different content in parallel mode, which will use a sequential sampler in serial mode. In the TensorFlow version, Horovod shuffles the dataset using different random seeds for the same purpose.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~202-~202: Possible missing article found.
Context: ...ontent in parallel mode, which will use sequential sampler in serial mode. In the TensorFl...

(AI_HYDRA_LEO_MISSING_A)


[typographical] ~202-~202: Two consecutive dots
Context: ...ferent random seeds for the same purpose.. ```mermaid flowchart LR subgraph s...

(DOUBLE_PUNCTUATION)


193-193: Consider using English documentation URL

The Paddle DDP documentation link is in Chinese. Consider using the English version of the documentation if available.


278-279: Enhance environment variable documentation

The note about NUM_WORKERS could be more specific about recommended ranges or provide examples of appropriate values.

Consider adding more specific guidance:

-If `NUM_WORKERS` is too large, it may cause the program to be terminated by the system;
-if it is too small, it may slow down data reading. You can try adjusting it to an appropriate size.
+If `NUM_WORKERS` is too large (e.g., > 16), it may cause the program to be terminated by the system;
+if it is too small (e.g., < 4), it may slow down data reading. We recommend starting with NUM_WORKERS=8
+and adjusting based on your system's capabilities.
doc/install/install-from-source.md (2)

365-366: Fix grammar and improve clarity.

The sentence has grammatical issues and can be clearer.

-If you want to use C++ interface of Paddle, you need to compile the Paddle inference library(C++ interface) manually from the [linux-compile-by-make](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/compile/linux-compile-by-make.html), then use the `.so` and `.a` files in `Paddle/build/paddle_inference_install_dir/`.
+To use the C++ interface of Paddle, you need to manually compile the Paddle inference library following the [Linux compilation guide](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/compile/linux-compile-by-make.html). After compilation, use the `.so` and `.a` files from `Paddle/build/paddle_inference_install_dir/`.
🧰 Tools
🪛 LanguageTool

[style] ~365-~365: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...-item} Paddle {{ paddle_icon }} If you want to use C++ interface of Paddle, you need t...

(REP_WANT_TO_VB)


441-442: Fix grammar in the assumption statement.

The sentence has a grammatical error with the verb "get".

-I assume you have get the Paddle inference library(C++ interface) to `$PADDLE_INFERENCE_DIR`, then execute CMake
+I assume you have obtained the Paddle inference library (C++ interface) in `$PADDLE_INFERENCE_DIR`. Execute CMake with:
🧰 Tools
🪛 LanguageTool

[grammar] ~441-~441: The verb form ‘get’ does not seem to be suitable in this context.
Context: ...le {{ paddle_icon }} I assume you have get the Paddle inference library(C++ interf...

(HAVE_VB)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cfe17a3 and 563a7bd.

⛔ Files ignored due to path filters (1)
  • doc/_static/paddle.svg is excluded by !**/*.svg
📒 Files selected for processing (16)
  • README.md (1 hunks)
  • doc/backend.md (3 hunks)
  • doc/conf.py (1 hunks)
  • doc/env.md (1 hunks)
  • doc/freeze/freeze.md (1 hunks)
  • doc/install/easy-install.md (1 hunks)
  • doc/install/install-from-source.md (10 hunks)
  • doc/model/dpa2.md (1 hunks)
  • doc/model/sel.md (1 hunks)
  • doc/model/train-energy.md (1 hunks)
  • doc/model/train-se-atten.md (1 hunks)
  • doc/model/train-se-e2-a.md (1 hunks)
  • doc/train/finetuning.md (2 hunks)
  • doc/train/parallel-training.md (2 hunks)
  • doc/train/tensorboard.md (1 hunks)
  • doc/train/training.md (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • doc/model/train-se-e2-a.md
  • doc/model/train-energy.md
🧰 Additional context used
🪛 Markdownlint (0.37.0)
doc/train/training.md

32-32: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

doc/freeze/freeze.md

41-41: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


50-50: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

doc/train/finetuning.md

214-214: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


221-221: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


228-228: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


251-251: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


258-258: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

🪛 LanguageTool
doc/freeze/freeze.md

[uncategorized] ~44-~44: The grammatical number of this noun doesn’t look right. Consider replacing it.
Context: ... where the model is trained. The output model is called model.json and `model.pdipa...

(AI_EN_LECTOR_REPLACEMENT_NOUN_NUMBER)


[misspelling] ~46-~46: This word is normally spelled as one.
Context: ...model.jsonandmodel.pdiparams`. In [multi-task mode](../train/multi-task-training-pd.m...

(EN_COMPOUNDS_MULTI_TASK)

doc/train/parallel-training.md

[uncategorized] ~194-~194: Possible missing comma found.
Context: ...arallel (distributed) mode or in serial mode depending on your execution command. #...

(AI_HYDRA_LEO_MISSING_COMMA)


[misspelling] ~200-~200: This word is normally spelled as one.
Context: ...fines the number of threads involved in multi-threaded loading, which can be modified by setti...

(EN_COMPOUNDS_MULTI_THREADED)


[uncategorized] ~202-~202: Possible missing article found.
Context: ...ontent in parallel mode, which will use sequential sampler in serial mode. In the TensorFl...

(AI_HYDRA_LEO_MISSING_A)


[typographical] ~202-~202: Two consecutive dots
Context: ...ferent random seeds for the same purpose.. ```mermaid flowchart LR subgraph s...

(DOUBLE_PUNCTUATION)

doc/install/install-from-source.md

[style] ~365-~365: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...-item} Paddle {{ paddle_icon }} If you want to use C++ interface of Paddle, you need t...

(REP_WANT_TO_VB)


[grammar] ~441-~441: The verb form ‘get’ does not seem to be suitable in this context.
Context: ...le {{ paddle_icon }} I assume you have get the Paddle inference library(C++ interf...

(HAVE_VB)

doc/train/finetuning.md

[misspelling] ~202-~202: Did you mean “and”?
Context: ... In Paddle version, we have introduced an updated, more adaptable approach to fin...

(AN_AND)


[uncategorized] ~210-~210: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ilizes a single-task pre-trained model (pretrained.pd) and modifies the energy bias withi...

(EN_WORD_COHERENCY)


[misspelling] ~241-~241: This word is normally spelled as one.
Context: ...: {} } ``` #### Fine-tuning from a multi-task pre-trained model Additionally, within...

(EN_COMPOUNDS_MULTI_TASK)


[misspelling] ~243-~243: This word is normally spelled as one.
Context: ...bility offered by the framework and the multi-task training process proposed in DPA2 [pape...

(EN_COMPOUNDS_MULTI_TASK)


[typographical] ~247-~247: It appears that a comma is missing.
Context: ... the paper above. For fine-tuning using this multitask pre-trained model (`multitask...

(BY_FOR_IN_COMMA)


[grammar] ~247-~247: The verb ‘multitask’ is plural. Did you mean: “multitasks”? Did you use a verb instead of a noun?
Context: ...aper above. For fine-tuning using this multitask pre-trained model (`multitask_pretraine...

(PLURAL_VERB_AFTER_THIS)


[uncategorized] ~247-~247: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...using this multitask pre-trained model (multitask_pretrained.pd), one can select a specific branch ...

(EN_WORD_COHERENCY)


[uncategorized] ~248-~248: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ch (e.g., CHOOSEN_BRANCH) included in multitask_pretrained.pd for fine-tuning with the following ...

(EN_WORD_COHERENCY)


[misspelling] ~255-~255: This word is normally spelled as one.
Context: ...n check the available model branches in multi-task pre-trained model by refering to the do...

(EN_COMPOUNDS_MULTI_TASK)

🔇 Additional comments (10)
doc/conf.py (1)

170-170: Ensure the icon file is present.

The file reference "_static/paddle.svg" is newly introduced. Confirm that it exists and is properly placed under the "_static/" directory; otherwise, documentation builds may fail or show a broken icon link.

Use this script to verify the presence of "paddle.svg" in the repository:

✅ Verification successful

Icon file is present at the expected location

The file paddle.svg exists at doc/_static/paddle.svg, which is the correct location referenced in the configuration.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search the entire repo for paddle.svg to ensure it exists

fd paddle.svg

Length of output: 38

doc/model/sel.md (1)

35-39: LGTM!

The addition of Paddle is consistent with the existing pattern for other backends. This is well-aligned with the broader documentation updates.

doc/freeze/freeze.md (1)

46-54: LGTM!

The new multi-task instructions for Paddle are clear and consistent with the structure used for other backends.

🧰 Tools
🪛 LanguageTool

[misspelling] ~46-~46: This word is normally spelled as one.
Context: ...model.jsonandmodel.pdiparams`. In [multi-task mode](../train/multi-task-training-pd.m...

(EN_COMPOUNDS_MULTI_TASK)

🪛 Markdownlint (0.37.0)

50-50: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

doc/model/dpa2.md (1)

1-4: No issues found.

Adding the Paddle icon and listing it among supported backends is consistent with the broader changes.

doc/train/tensorboard.md (1)

1-4: Consider adding Paddle-specific TensorBoard instructions

While the Paddle backend has been added to the supported backends list, the document doesn't mention any Paddle-specific considerations or setup requirements for TensorBoard integration. Please clarify:

  1. Are there any special steps needed to enable TensorBoard with Paddle?
  2. Are all TensorBoard features (metrics tracking, model graph, etc.) fully supported with Paddle?
README.md (1)

22-22: LGTM! Documentation update is accurate and well-integrated.

The addition of Paddle to the list of supported backends is consistent with the PR objectives and maintains the documentation style.

doc/model/train-se-atten.md (1)

1-1: LGTM: Paddle backend support properly documented

The addition of Paddle icons in the header and supported backends note accurately reflects the new backend support.

Also applies to: 4-4

doc/train/finetuning.md (1)

200-264: LGTM: Paddle implementation properly documented

The Paddle implementation section accurately mirrors the PyTorch implementation, providing clear instructions for both single-task and multi-task fine-tuning scenarios.

🧰 Tools
🪛 Markdownlint (0.37.0)

214-214: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


221-221: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


228-228: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


251-251: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)


258-258: null
Dollar signs used before commands without showing output

(MD014, commands-show-output)

🪛 LanguageTool

[misspelling] ~202-~202: Did you mean “and”?
Context: ... In Paddle version, we have introduced an updated, more adaptable approach to fin...

(AN_AND)


[uncategorized] ~210-~210: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ilizes a single-task pre-trained model (pretrained.pd) and modifies the energy bias withi...

(EN_WORD_COHERENCY)


[misspelling] ~241-~241: This word is normally spelled as one.
Context: ...: {} } ``` #### Fine-tuning from a multi-task pre-trained model Additionally, within...

(EN_COMPOUNDS_MULTI_TASK)


[misspelling] ~243-~243: This word is normally spelled as one.
Context: ...bility offered by the framework and the multi-task training process proposed in DPA2 [pape...

(EN_COMPOUNDS_MULTI_TASK)


[typographical] ~247-~247: It appears that a comma is missing.
Context: ... the paper above. For fine-tuning using this multitask pre-trained model (`multitask...

(BY_FOR_IN_COMMA)


[grammar] ~247-~247: The verb ‘multitask’ is plural. Did you mean: “multitasks”? Did you use a verb instead of a noun?
Context: ...aper above. For fine-tuning using this multitask pre-trained model (`multitask_pretraine...

(PLURAL_VERB_AFTER_THIS)


[uncategorized] ~247-~247: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...using this multitask pre-trained model (multitask_pretrained.pd), one can select a specific branch ...

(EN_WORD_COHERENCY)


[uncategorized] ~248-~248: Do not mix variants of the same word (‘pretrain’ and ‘pre-train’) within a single text.
Context: ...ch (e.g., CHOOSEN_BRANCH) included in multitask_pretrained.pd for fine-tuning with the following ...

(EN_WORD_COHERENCY)


[misspelling] ~255-~255: This word is normally spelled as one.
Context: ...n check the available model branches in multi-task pre-trained model by refering to the do...

(EN_COMPOUNDS_MULTI_TASK)

doc/train/parallel-training.md (1)

191-281: LGTM: Paddle parallel training implementation well documented

The Paddle implementation section effectively documents the DDP setup, providing clear instructions for both single-node and multi-node training scenarios. The dataloader implementation and usage instructions are comprehensive.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~194-~194: Possible missing comma found.
Context: ...arallel (distributed) mode or in serial mode depending on your execution command. #...

(AI_HYDRA_LEO_MISSING_COMMA)


[misspelling] ~200-~200: This word is normally spelled as one.
Context: ...fines the number of threads involved in multi-threaded loading, which can be modified by setti...

(EN_COMPOUNDS_MULTI_THREADED)


[uncategorized] ~202-~202: Possible missing article found.
Context: ...ontent in parallel mode, which will use sequential sampler in serial mode. In the TensorFl...

(AI_HYDRA_LEO_MISSING_A)


[typographical] ~202-~202: Two consecutive dots
Context: ...ferent random seeds for the same purpose.. ```mermaid flowchart LR subgraph s...

(DOUBLE_PUNCTUATION)

doc/install/install-from-source.md (1)

212-218: LGTM! Well-documented environment and CMake variables.

The documentation for environment variables and CMake variables is clear, complete, and follows the same format as other backends.

Also applies to: 233-239, 480-487, 496-503

Comment on lines +203 to +206
```bash
pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
pip install deepmd-kit-cu11
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Verify package repository URL.

The CUDA 11.8 installation uses the CUDA 12.3 package repository URL. This should be updated to use the correct CUDA version.

-pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
+pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

Comment on lines +212 to +215
```bash
pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
pip install deepmd-kit
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix CPU installation command.

The CPU installation command incorrectly uses the GPU package. For CPU-only installations, use the CPU-specific package.

-pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
+pip install paddlepaddle==3.0.0b2 -i https://www.paddlepaddle.org.cn/whl/linux/cpu/stable.html

Comment on lines +189 to +221
::::::{tab-item} Paddle {{ paddle_icon }}

:::::{tab-set}

::::{tab-item} CUDA 12.3

```bash
pip install deepmd-kit[paddle]
```

::::

::::{tab-item} CUDA 11.8

```bash
pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
pip install deepmd-kit-cu11
```

::::

::::{tab-item} CPU

```bash
pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
pip install deepmd-kit
```

::::

:::::

::::::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Warning needed: PaddlePaddle 3.0.0b2 is still in beta

The PyPI version history confirms that PaddlePaddle 3.0.0b2 is the latest version and is still in beta status. Since the installation instructions rely on this beta version, we should add explicit warnings about:

  • Using beta software in production environments
  • Potential API instability and breaking changes
  • The lack of a stable 3.0.0 release

Specific locations needing attention:

  • doc/install/easy-install.md: Add a warning note above the Paddle installation instructions about beta status and stability implications
🔗 Analysis chain

Consider stability implications of beta dependency.

The installation instructions specify PaddlePaddle version 3.0.0b2, which is a beta release. Consider:

  1. Adding a warning about potential stability issues
  2. Documenting any known limitations
  3. Specifying when a stable release is expected
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are newer versions available
curl -s https://pypi.org/pypi/paddlepaddle/json | jq -r '.releases | keys[]' | sort -V | tail -n 5

Length of output: 136

Comment on lines +369 to +371
[Cuda118_cudnn860_Trt8531_D1/latest/paddle_inference.tgz](https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda118_cudnn860_Trt8531_D1/latest/paddle_inference.tgz)

[Cuda123_cudnn900_Trt8616_D1/latest/paddle_inference.tgz](https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda123_cudnn900_Trt8616_D1/latest/paddle_inference.tgz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Add last modified dates to pre-compiled library links

The links are actively maintained with recent updates (December 21, 2024). Adding the last modified dates would help users verify the freshness of these pre-compiled libraries:

  • Update link text for CUDA 11.8: [Cuda118_cudnn860_Trt8531_D1/latest/paddle_inference.tgz (Last updated: Dec 21, 2024)](https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda118_cudnn860_Trt8531_D1/latest/paddle_inference.tgz)
  • Update link text for CUDA 12.3: [Cuda123_cudnn900_Trt8616_D1/latest/paddle_inference.tgz (Last updated: Dec 21, 2024)](https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda123_cudnn900_Trt8616_D1/latest/paddle_inference.tgz)
🔗 Analysis chain

Consider adding version information for pre-compiled libraries.

The links to pre-compiled libraries should include version information or last updated date to help users make informed decisions.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the links are accessible and get last modified date
for url in \
  "https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda118_cudnn860_Trt8531_D1/latest/paddle_inference.tgz" \
  "https://paddle-qa.bj.bcebos.com/paddle-pipeline/GITHUB_Docker_Compile_Test_Cuda123_cudnn900_Trt8616_D1/latest/paddle_inference.tgz"; do
  echo "Checking $url"
  curl -sI "$url" | grep "Last-Modified"
done

Length of output: 1552

Comment on lines +101 to +106
# cu123
pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu123/
# cu118
pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu118/
# cpu
pip install --pre paddlepaddle -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Use stable version installation commands instead of nightly builds

The repository's easy-install.md already contains stable version installation commands for PaddlePaddle. Replace the nightly build commands with stable version commands:

# cu123
pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/
# cu118
pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
# cpu
pip install paddlepaddle==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

This aligns with the version requirement (3.0 or above) mentioned in backend.md and provides more stable installations compared to nightly builds.

🔗 Analysis chain

Consider providing stable release installation commands.

The current installation commands use nightly builds (--pre flag), which might be unstable. Consider adding commands for installing stable releases or adding a note about potential instability of nightly builds.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if stable versions are available
curl -s https://www.paddlepaddle.org.cn/documentation/docs/en/install/pip/linux-pip_en.html | grep -A 5 "pip install paddlepaddle"

Length of output: 132


Script:

#!/bin/bash
# Let's check if there are any mentions of stable PaddlePaddle versions in the repository
rg -i "paddlepaddle.*stable" --type md

# Also check for any version constraints or recommendations
rg -i "paddlepaddle.*version" --type md

# Check if there are any other installation instructions for PaddlePaddle
fd -e md -x grep -l "paddlepaddle" {}

Length of output: 612


Script:

#!/bin/bash
# Let's check the stable version installation commands from easy-install.md
cat doc/install/easy-install.md | grep -A 5 "paddlepaddle.*stable"

# Also check the version requirement mentioned in backend.md
cat doc/backend.md | grep -A 2 -B 2 "Paddle.*version"

Length of output: 989

@github-actions github-actions bot added the Docs label Dec 23, 2024
Copy link

codecov bot commented Dec 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.41%. Comparing base (c24498b) to head (563a7bd).
Report is 4 commits behind head on devel.

Additional details and impacted files
@@           Coverage Diff           @@
##            devel    #4489   +/-   ##
=======================================
  Coverage   84.41%   84.41%           
=======================================
  Files         670      670           
  Lines       62149    62149           
  Branches     3487     3490    +3     
=======================================
  Hits        52465    52465           
  Misses       8558     8558           
  Partials     1126     1126           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant