-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve annotator installation scripts #571
Improve annotator installation scripts #571
Conversation
Warning Rate limit exceeded@akotlar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 4 minutes and 3 seconds before requesting another review. How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe pull request introduces a new Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range and nitpick comments (1)
install/detect-shell-profile.sh (1)
1-37
: Effective shell profile detection script with minor improvements suggestedThe
detect-shell-profile.sh
script effectively determines the appropriate shell profile file for the current user based on their default shell. The use ofgetent
to retrieve the user's shell and thecase
statement to handle different shell types is a logical and clear approach.The script follows common conventions by prioritizing
.bash_profile
over.bashrc
for Bash users and using the standard profile files for other shells. Ensuring the existence of the profile file is a nice touch for automation and user experience.A couple of minor suggestions for improvement:
- Consider adding error handling for scenarios where
getent
fails to retrieve the user's shell. This could help provide a more informative error message to the user.- The
else
block in the Bash case (lines 14-16) seems redundant since it assigns the same value as the firstif
block. You can remove thiselse
block for a cleaner flow.Overall, this is a well-structured and effective script for detecting the user's shell profile file. Great job!
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (20)
- Dockerfile.python (1 hunks)
- Makefile (1 hunks)
- TUTORIAL.md (1 hunks)
- install-rpm.sh (1 hunks)
- install/detect-shell-profile.sh (1 hunks)
- install/export-bystro-libs.sh (1 hunks)
- install/export-go-path-linux.sh (0 hunks)
- install/install-go-linux.sh (0 hunks)
- install/install-go-mac.sh (0 hunks)
- install/install-go-packages.sh (1 hunks)
- install/install-liftover-linux.sh (1 hunks)
- install/install-lmdb-linux.sh (1 hunks)
- install/install-perl-libs.sh (1 hunks)
- install/install-perlbrew-linux.sh (1 hunks)
- install/install-rpm-deps.sh (1 hunks)
- install/update-packages.sh (0 hunks)
- perl/Dockerfile (2 hunks)
- perl/README.md (2 hunks)
- perl/cpanfile (1 hunks)
- perl/lib/Utils/SqlWriter/Connection.pm (3 hunks)
Files not reviewed due to no reviewable changes (4)
- install/export-go-path-linux.sh
- install/install-go-linux.sh
- install/install-go-mac.sh
- install/update-packages.sh
Additional context used
Shellcheck
install/export-bystro-libs.sh
[warning] 13-13: PROFILE_FILE appears unused. Verify use (or export if used externally).
(SC2034)
install-rpm.sh
[warning] 30-30: GO_PLATFORM appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 70-70: ShellCheck can't follow non-constant source. Use a directive to specify location.
(SC1090)
LanguageTool
TUTORIAL.md
[uncategorized] ~15-~15: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...search performance ## Open Source The open source command line annotator that powers byst...(EN_COMPOUND_ADJECTIVE_INTERNAL)
Additional comments not posted (58)
install/install-lmdb-linux.sh (4)
2-2
: Great addition ofset -e
for better error handling!Adding
set -e
ensures that the script exits immediately if any command fails. This prevents the script from continuing with the installation process if something goes wrong, thereby improving the reliability of the installation.
6-7
: Informative comments about LMDB.The added comments provide useful context about why LMDB is chosen as the database engine in Bystro. Mentioning its speed and efficient use of cache helps in understanding the rationale behind this choice.
8-25
: Excellent improvements to the installation process!The changes made to the script significantly enhance its functionality and robustness:
Using a temporary directory for cloning and building LMDB keeps the working directory clean and organized. Creating the temporary directory with
mktemp -d
ensures a secure and unique directory name.Cloning the LMDB repository using HTTPS instead of the Git protocol improves compatibility and security.
The added echo statements provide clear feedback on the progress of each step, enhancing the user experience during the installation process.
Cleaning up the temporary directory after the installation maintains a tidy environment.
These improvements make the script more reliable, maintainable, and user-friendly.
27-27
: Great addition of the success message!Adding the echo statement to indicate the successful completion of the LMDB installation is a nice touch. This clear feedback lets the user know that the installation process has finished without any issues, providing a sense of assurance.
install/install-rpm-deps.sh (5)
2-9
: Excellent addition of error handling and root privilege check!The script now includes robust error handling with
set -e
andset -o pipefail
, ensuring that any command failure terminates the script immediately. This is a best practice for bash scripts.Additionally, checking for root privileges at the start and prompting the user to run with
sudo
if not executed as root is crucial for security and proper execution. The script provides clear feedback to guide the user.Great work on enhancing the script's reliability and user interaction!
11-12
: Good practice to update system packages.Updating system packages with
dnf update -y
before installing dependencies is a good practice. It ensures that the system is up-to-date, preventing potential compatibility issues and security vulnerabilities.The
-y
flag is used appropriately to make the script non-interactive, which is suitable for an automated installation script.
14-40
: Comprehensive package installation with proper versioning.The script installs a comprehensive set of packages required for the project. Installing development tools as a group using
dnf groupinstall -y "Development Tools"
simplifies the setup process.The list of packages covers essential tools and libraries, ensuring a complete development environment. It's good to see
tar
andgrep
included, as they are commonly used utilities.Installing Node.js 20.x using the setup script from nodesource.com is a reliable way to ensure a recent version is installed. The script uses appropriate flags (
-y
) for non-interactive installation, which is suitable for an automated setup.Great job on creating a thorough and efficient package installation process!
42-46
: Proper installation of pm2 and cpanminus.Installing pm2 globally using
npm install -g pm2
is the correct approach. It allows pm2 to be used system-wide for process management, which is often required in production environments.Installing cpanminus directly using cURL (
curl -L https://cpanmin.us | perl - App::cpanminus
) ensures that the latest version is installed. This is a reliable way to set up cpanminus, which is essential for managing Perl dependencies.The script uses appropriate flags (
-g
for global npm install) for proper installation.
48-48
: Clear success message for user feedback.Printing a clear success message using
echo -e "\n\nAll dependencies have been installed successfully.\n"
provides valuable feedback to the user. It confirms that all dependencies have been installed successfully, giving confidence that the script executed as intended.The script uses
echo -e
with newline characters (\n
) for proper formatting, making the output readable and visually separated from the rest of the script's output.This is a nice touch to enhance user experience and communication.
Makefile (1)
24-24
: Verify compatibility of the updatedbystro-vcf
package version.Updating the
bystro-vcf
package to version2.2.3
is a good step towards staying current with the latest improvements and bug fixes.However, please ensure that this newer version is fully compatible with the rest of the project and doesn't introduce any breaking changes.
You can run the following script to search for potential usage of the
bystro-vcf
package across the codebase:install/install-perl-libs.sh (4)
8-9
: LGTM!The consolidation of multiple
cpanm install
commands into a single command improves the efficiency and maintainability of the script. The use ofcpanm
utility ensures a reliable installation of the required Perl modules.
11-18
: LGTM!The separation of Perl modules that require special handling into distinct
cpanm
commands improves the clarity and maintainability of the script. The use of separate commands for each module ensures a more robust and reliable installation process.
20-32
: LGTM!The introduction of the
install_custom_msgpack
function to handle the custom installation ofData::MessagePack
improves the organization and maintainability of the script. The function encapsulates the entire installation process, ensuring that a specific version of the module is installed from a known Git branch.
34-34
: LGTM!The inclusion of the
install_custom_msgpack
function call at the end of the script ensures that the custom installation ofData::MessagePack
is performed as part of the overall installation process.install/install-liftover-linux.sh (4)
10-17
: LGTM!The
check_liftover_installed
function is well-implemented and follows good practices:
- It uses the appropriate command to check if
liftOver
is installed.- It handles both cases correctly: when
liftOver
is installed and when it is not.- It redirects the output to
/dev/null
and returns appropriate exit codes.
20-23
: LGTM!The script correctly uses the
check_liftover_installed
function to determine the installation status ofliftOver
and provides appropriate feedback to the user.
25-37
: LGTM!The script correctly determines the appropriate installation directory for
liftOver
and handles the case when/usr/local/bin
is not writable. It also ensures that$HOME/bin
exists and updates the user'sPATH
if necessary.
39-43
: LGTM!The script correctly downloads the
liftOver
binary, sets the execute permission, and provides clear feedback to the user about the successful installation.Dockerfile.python (2)
10-10
: Verify the compatibility and impact of the updatedbystro-snp
version.The version update to
1.0.1
suggests potential bug fixes or new features. Ensure that the updated version is compatible with the rest of the system and does not introduce any breaking changes.Run the following script to analyze the impact of the version update:
Verification successful
Approve the version update, but verify changes in functionality
The update to bystro-snp version 1.0.1 has been consistently applied across the codebase. No references to the previous version were found, indicating a thorough update. However, please review the Changes.md file, particularly the entries related to bystro-snp, to ensure that the new version's features and changes align with your project's requirements.
- The update affects multiple parts of the system, including Dockerfiles, installation scripts, and configuration files.
- Changes.md mentions updates to bystro-snp output and functionality, which may be related to this version update.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the compatibility and impact of the updated `bystro-snp` version. # Test 1: Search for import statements or function calls related to `bystro-snp`. # Expect: Identify the files and functions that depend on `bystro-snp`. rg --type python -A 5 $'bystro-snp' # Test 2: Search for version-specific code or configuration related to `bystro-snp`. # Expect: Identify any version-specific dependencies or configurations that might need updating. rg --type python -A 5 $'bystro-snp.*1\.0\.0'Length of output: 142
Script:
#!/bin/bash # Description: Verify the usage and version references of bystro-snp across the codebase # Search for bystro-snp usage across all file types echo "Searching for bystro-snp usage:" rg "bystro-snp" -n # Look for version-specific mentions of bystro-snp echo -e "\nSearching for version-specific mentions of bystro-snp:" rg "bystro-snp.*[0-9]+\.[0-9]+\.[0-9]+" -n # Check for any documentation or configuration files that might reference bystro-snp echo -e "\nSearching for bystro-snp in documentation and configuration files:" rg "bystro-snp" -n -g "*.md" -g "*.yaml" -g "*.yml" -g "*.json" -g "*.txt"Length of output: 4293
8-8
: Verify the compatibility and impact of the updatedbystro-stats
version.The version update to
1.0.1
suggests potential bug fixes or new features. Ensure that the updated version is compatible with the rest of the system and does not introduce any breaking changes.Run the following script to analyze the impact of the version update:
Verification successful
Approve the bystro-stats version update to 1.0.1
The update from version 1.0.0 to 1.0.1 of bystro-stats in Dockerfile.python appears to be consistent with the rest of the codebase. Here are some points to consider:
- The install script (install/install-go-packages.sh) already references version 1.0.1, which is good.
- Multiple configuration files use bystro-stats, but they don't specify a version, so they should work with the new version.
- No Go files directly import bystro-stats, suggesting it's used as a command-line tool.
To ensure a smooth transition:
- Verify that version 1.0.1 of bystro-stats is backwards compatible with the existing usage in your project.
- Check the Python tests (python/python/bystro/search/utils/tests/test_annotation.py) and utility module (python/python/bystro/search/utils/annotation.py) to ensure they're compatible with the new version.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the compatibility and impact of the updated `bystro-stats` version. # Test 1: Search for import statements or function calls related to `bystro-stats`. # Expect: Identify the files and functions that depend on `bystro-stats`. rg --type python -A 5 $'bystro-stats' # Test 2: Search for version-specific code or configuration related to `bystro-stats`. # Expect: Identify any version-specific dependencies or configurations that might need updating. rg --type python -A 5 $'bystro-stats.*1\.0\.0'Length of output: 146
Script:
#!/bin/bash # Description: Verify the compatibility and impact of the updated bystro-stats version. # Test 1: Search for references to bystro-stats in all files echo "Searching for bystro-stats references:" rg "bystro-stats" -n --no-heading # Test 2: Look for Go files that might be using the bystro-stats package echo "Searching for Go files using bystro-stats:" rg --type go "github\.com/bystrogenomics/bystro-stats" -n --no-heading # Test 3: Check the Dockerfile for any other references to bystro-stats echo "Checking Dockerfile for bystro-stats references:" rg "bystro-stats" Dockerfile.python -n --no-heading # Test 4: Search for shell scripts or configuration files that might be using bystro-stats echo "Searching for bystro-stats in shell scripts and config files:" rg "bystro-stats" -g "*.sh" -g "*.yaml" -g "*.yml" -g "*.json" -n --no-headingLength of output: 3649
perl/lib/Utils/SqlWriter/Connection.pm (2)
18-18
: Update default driver to MariaDBModifying the default value of the
driver
attribute to"DBI:MariaDB"
aligns with the transition to the MariaDB driver. This change ensures that the MariaDB driver is used by default for new connections, maintaining consistency with the updated import statement.
48-48
: Replace MySQL-specific connection option with MariaDB equivalentReplacing the
mysql_read_default_group
option withmariadb_read_default_group
ensures compatibility with the MariaDB driver. This change allows the appropriate client configuration group to be specified when establishing a connection, maintaining the intended functionality of reading default client configuration.install/export-bystro-libs.sh (5)
5-8
: LGTM!The check for the first argument and the usage instructions improve the script's usability by providing clear guidance on how to run it correctly.
17-29
: LGTM!The checks for the existence of the
perl/lib
directory and the presence of at least one.pm
file improve the script's robustness by ensuring the necessary files are present before proceeding. The error messages provide clear feedback to the user if the conditions are not met.
31-43
: LGTM!The checks for the existence of the
perl/bin
directory and the presence of at least one.pl
file improve the script's robustness by ensuring the necessary files are present before proceeding. The error messages provide clear feedback to the user if the conditions are not met.
45-54
: LGTM!The
append_if_missing
function is a good addition to streamline the process of appending environment variable exports to the profile file. It avoids duplicates by checking if the export lines already exist and provides feedback on the operation's outcome, enhancing the script's user-friendliness.
56-58
: LGTM!Using the
append_if_missing
function to append thePERL5LIB
andPATH
exports conditionally improves the script's robustness by avoiding duplicates.The static analysis hint indicates that the
PROFILE_FILE
variable appears unused. Please verify its usage or consider removing it if it's not needed.Verification successful
PROFILE_FILE
is used, but its usage within the script should be reviewedThe
PROFILE_FILE
variable is indeed present and used in theinstall/export-bystro-libs.sh
script. It's defined as an optional second argument in the script's usage instructions. However, the static analysis hint about it being unused might be referring to a specific part of the script where it's not utilized effectively.Consider reviewing the script to ensure
PROFILE_FILE
is properly used throughout, especially in relation to theappend_if_missing
function calls. If it's not actively used within the script body, you may want to update the implementation to make use of it or remove it if it's no longer needed for this specific script.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of the `PROFILE_FILE` variable. # Test: Search for the usage of `PROFILE_FILE`. Expect: No occurrences. rg --type sh $'PROFILE_FILE'Length of output: 1228
install/install-go-packages.sh (9)
4-7
: LGTM!The code segment correctly checks for the required number of arguments and provides clear usage instructions when the argument is missing. This improves the script's usability and prevents it from proceeding with an incorrect number of arguments.
13-17
: LGTM!The code segment correctly checks if Go is installed on the system before proceeding with the installation of Go packages. This prevents the script from attempting to install packages when Go is not available, which would result in errors.
19-26
: LGTM!The code segment correctly checks the installed Go version and compares it with the required minimum version (1.20). This ensures that the installed Go version meets the minimum requirement before proceeding with the installation of Go packages, preventing compatibility issues that may arise from using an older Go version.
28-33
: LGTM!The code segment correctly initializes a new Go module named "bystro" in the specified directory if it doesn't already exist. This is necessary for managing dependencies and building Go packages. By checking for the existence of the
go.mod
file, the script avoids reinitializing the module if it already exists.
35-37
: LGTM!The code segment correctly tidies up the Go module dependencies using the
go mod tidy
command. This ensures that the dependencies are up to date and properly managed, reducing the chances of conflicts or issues related to outdated or unused dependencies.
39-41
: LGTM!The code segment correctly installs the local 'dosage' command using the
go install ./cmd/dosage
command. This makes the 'dosage' command available for use, allowing users to perform dosage-related operations using the Bystro tool.
43-45
: LGTM!The code segment correctly installs version 1.0.1 of the 'bystro-stats' package using the
go install
command with the package path and version specified. This allows users to perform statistical analysis tasks related to the Bystro tool, and specifying the version ensures that a consistent and compatible version of the package is installed.
47-48
: LGTM!The code segment correctly installs version 2.2.3 of the 'bystro-vcf' package using the
go install
command with the package path and version specified. This allows users to work with VCF (Variant Call Format) files in the Bystro tool, and specifying the version ensures that a consistent and compatible version of the package is installed.
50-51
: LGTM!The code segment correctly installs version 1.0.1 of the 'bystro-snp' package using the
go install
command with the package path and version specified. This allows users to work with SNP (Single Nucleotide Polymorphism) data in the Bystro tool, and specifying the version ensures that a consistent and compatible version of the package is installed.Version update verified. No apparent compatibility issues.
The compatibility check shows that the codebase is likely to work well with the updated version 2.2.3 of bystro-vcf:
- Go files import bystro-vcf without specifying a version, suggesting flexibility with minor updates.
- The installation script (install-go-packages.sh) already uses version 2.2.3, aligning with the Dockerfile update.
Given that this is a patch version update (2.2.2 to 2.2.3), it's unlikely to introduce breaking changes. The consistency between the Dockerfile and the installation script further reduces the risk of compatibility issues.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Search for usage of bystro-vcf and analyze compatibility with the updated version. # Test: Search for bystro-vcf usage in Go files. Expect: No compatibility issues. rg --type go -A 5 $'bystro-vcf' # Test: Search for bystro-vcf usage in shell scripts. Expect: No compatibility issues. rg --type sh -A 5 $'bystro-vcf'Length of output: 1626
75-75
: Provide more context for the database driver change and verify compatibility.The change from the
DBD::mysql
driver to theDBD::MariaDB
driver raises a few questions:
- What is the reason for switching from MySQL to MariaDB? Is there a specific requirement or benefit that motivated this change?
- The version difference between
DBD::mysql
(4.051) andDBD::MariaDB
(1.23) is significant. Have you verified that the existing codebase is compatible with the new driver and version?- Are there any necessary changes in the codebase to accommodate the switch to the MariaDB driver?
Please provide more context about the driver change and ensure that the new driver is fully compatible with the existing functionality.
Run the following script to search for potential compatibility issues:
Verification successful
Database driver change appears compatible, but update configurations and test thoroughly
The switch from
DBD::mysql
toDBD::MariaDB
seems to be compatible with the existing codebase. No direct usage ofDBD::mysql
was found, which reduces the risk of compatibility issues. However, there are a few actions to consider:
Update configuration settings and test files that still reference MySQL:
- perl/t/utils/sqlWriter.t
- perl/lib/Utils/SqlWriter/Connection.pm
- perl/lib/Seq/Tracks/Gene/Build/TX.pm (comment only)
The presence of a MariaDB-specific option in SqlWriter/Connection.pm suggests some preparation for this change was already in place.
Conduct thorough testing to ensure full compatibility, especially focusing on database interactions and any SQL queries that might be affected by subtle differences between MySQL and MariaDB.
These minor updates and thorough testing should help ensure a smooth transition to the MariaDB driver.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Search for MySQL-specific code and configurations that may need to be updated for MariaDB compatibility. # Test: Search for MySQL-specific configurations. Expect: No incompatible configurations. rg --type perl -A 5 $'mysql' # Test: Search for MySQL-specific code usage. Expect: No incompatible code. rg --type perl -A 5 $'DBD::mysql'Length of output: 2257
8-8
: Version update looks good. Verify compatibility.Updating to the latest patch version of
bystro-stats
is a good practice to incorporate bug fixes and improvements. Please ensure that version1.0.1
is compatible with the rest of the codebase and dependencies.Run the following script to search for any potential compatibility issues:
Verification successful
bystro-stats update to 1.0.1 appears compatible
The update of bystro-stats to version 1.0.1 in the Dockerfile is consistent with its usage across the codebase. Key observations:
- Configuration files reference 'bystro-stats' without version specificity, relying on the globally installed version.
- No other version numbers or specific usage patterns were found, reducing the risk of version-specific incompatibilities.
- The minor patch update (1.0.0 to 1.0.1) typically implies backwards compatibility and bug fixes.
The update appears safe to proceed with. However, as a precaution, it's recommended to run any existing tests that involve bystro-stats to ensure full compatibility.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Search for usage of bystro-stats and analyze compatibility with the updated version. # Test: Search for bystro-stats usage in Go files. Expect: No compatibility issues. rg --type go -A 5 $'bystro-stats' # Test: Search for bystro-stats usage in shell scripts. Expect: No compatibility issues. rg --type sh -A 5 $'bystro-stats'Length of output: 492
Script:
#!/bin/bash # Description: Search for any other references to bystro-stats in the codebase # Search for bystro-stats in all files, excluding the Dockerfile and install script we've already seen rg 'bystro-stats' --type-not go --type-not sh -C 3 | grep -v 'perl/Dockerfile' | grep -v 'install/install-go-packages.sh' # Search for any version numbers related to bystro-stats rg 'bystro-stats.*\d+\.\d+\.\d+' --type-not go --type-not sh -C 3 | grep -v 'perl/Dockerfile' | grep -v 'install/install-go-packages.sh' # Search for any usage patterns or function calls related to bystro-stats rg 'bystro-stats\.[a-zA-Z]+' --type-not go --type-not sh -C 3Length of output: 10485
install/install-perlbrew-linux.sh (8)
3-4
: LGTM!The addition of
set -e
andset -o pipefail
improves the script's error handling by ensuring that errors are caught and the script exits gracefully.
6-10
: LGTM!The use of parameter expansion to set default values for the variables streamlines the argument parsing logic and makes it more concise. The default values seem reasonable and provide a good fallback if the user doesn't specify the arguments.
12-15
: LGTM!Echoing the values of the variables provides useful information to the user about the installation configuration. This helps the user verify that the correct values are being used before proceeding with the installation.
17-18
: LGTM!Exporting
PERLBREW_ROOT
andPERLBREW_HOME
environment variables based on theDIR
argument or its default value ensures that they are available to the subsequent commands in the script. The paths for these variables follow the standard convention and seem appropriate.
23-32
: LGTM!The script checks for the presence of
perlbrew
before attempting to install it, which prevents unnecessary installations and potential conflicts. It also checks for the availability ofcurl
orwget
and uses the appropriate command to download and run the installation script. The error handling and user feedback in case of missing dependencies is well-handled.
35-46
: LGTM!The script updates the user's profile with perlbrew environment variables if they are not already present. It checks if the variables are already sourced using
grep
and appends them to the profile file if not found. This approach ensures that the necessary environment variables are set up for the user without duplicating them if they already exist. The script also echoes a message to inform the user about the update.
58-69
: LGTM!The script checks if the desired Perl version is already installed and installs it if not found. It determines the number of available CPU cores and uses that value for parallel compilation, which can significantly speed up the installation process. The installation logic is well-structured and provides clear output messages indicating whether tests are being run or skipped. If the desired Perl version is already installed, the script informs the user and skips the installation.
75-82
: LGTM!The script installs
cpanm
(cpanminus) if it is not already installed for the current Perl version. It checks ifcpanm
is present in thebin
directory of the current Perl version and installs it usingperlbrew install-cpanm
if not found. The script provides clear output messages indicating whethercpanm
is being installed or if it is already installed. Installingcpanm
ensures that the user has the necessary tool to easily install Perl modules.
perl/README.md (2)
87-87
: The review comment for line 32 also applies to this line since it's a duplicate change using a different installation method.
32-32
: Verify compatibility of the MariaDB driver across the codebase.The change from
DBD::mysql
toDBD::MariaDB
is a significant change in the database driver. While the APIs are likely compatible, there might be subtle differences that could impact the codebase.To aid in verifying the compatibility of the MariaDB driver, run the following script to search for all database interactions in the codebase:
Review the results to ensure that the MariaDB driver is compatible with all existing database interactions. Thoroughly test these interactions to confirm expected behavior.
Verification successful
Approval: MariaDB driver change appears compatible with existing code
The change from
DBD::mysql
toDBD::MariaDB
seems to be compatible with the current codebase. Our analysis shows:
- Database interactions are limited to two files:
SqlWriter.pm
andSqlWriter/Connection.pm
.- The code uses standard DBI methods, which should work similarly with MariaDB.
- No MySQL-specific functions were found, suggesting database-agnostic code.
To ensure a smooth transition:
- Thoroughly test database operations in
SqlWriter.pm
andSqlWriter/Connection.pm
.- Review and update connection parameters if necessary for MariaDB compatibility.
- Verify error handling with the new driver to catch any potential issues.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Search for all database interactions in the codebase. # Test: Search for DBI module usage and database handle variables. Expect: Comprehensive list of files and line numbers where database interactions occur. rg --type perl -e 'use DBI' -e '\$dbh'Length of output: 414
TUTORIAL.md (1)
15-15
: LGTM!The update to the URL for the open source command line annotator looks good. Pointing users to the official organization repository is the right approach.
Tools
LanguageTool
[uncategorized] ~15-~15: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...search performance ## Open Source The open source command line annotator that powers byst...(EN_COMPOUND_ADJECTIVE_INTERNAL)
@@ -5,7 +5,7 @@ use warnings; | |||
package Utils::SqlWriter::Connection; | |||
|
|||
use DBI; | |||
use DBD::mysql 4.051; | |||
use DBD::MariaDB 1.23; # Specify the version of DBD::MariaDB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
Incomplete transition to MariaDB driver
The transition from MySQL to MariaDB driver is partially complete, but there's still a MySQL-specific parameter in use. To fully migrate to MariaDB:
- Replace the line:
with the MariaDB equivalent:
$connection .= ";mysql_socket=" . $self->port_num if $self->socket;
$connection .= ";socket=" . $self->port_num if $self->socket;
This change ensures that the connection string is fully compatible with the MariaDB driver.
Analysis chain
Transition to MariaDB driver
The import statement change indicates a shift from the DBD::mysql
driver to the DBD::MariaDB
driver. This transition may affect compatibility and functionality with MariaDB databases.
Verify that existing code relying on MySQL-specific features or syntax is reviewed and updated for compatibility with MariaDB. Run the following script to identify potential occurrences:
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for MySQL-specific syntax or features in Perl files.
# Test: Search for common MySQL-specific keywords and syntax.
# Expect: No concerning occurrences found.
rg --type perl -i $'mysql_\w+|TINYINT|MEDIUMINT|BIGINT|UNSIGNED|ZEROFILL'
Length of output: 194
Script:
#!/bin/bash
# Description: Search for MySQL-specific connection parameters and check for MariaDB alternatives
# Search for MySQL-specific connection parameters
echo "MySQL-specific connection parameters:"
rg --type perl -i 'mysql_\w+='
# Search for potential MariaDB-specific connection parameters
echo -e "\nMariaDB-specific connection parameters:"
rg --type perl -i 'mariadb_\w+='
# Display the content of the Connection.pm file for context
echo -e "\nContent of Connection.pm file:"
cat perl/lib/Utils/SqlWriter/Connection.pm
Length of output: 2295
perl/Dockerfile
Outdated
RUN go install github.com/mikefarah/[email protected] | ||
RUN go install github.com/akotlar/[email protected] | ||
RUN go install github.com/bystrogenomics/[email protected] | ||
RUN go install github.com/akotlar/[email protected] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tip
Codebase Verification
Inconsistent package names for bystro-snp. Update required.
The verification process revealed an inconsistency in the package names used for bystro-snp across different files:
perl/Dockerfile
andREADME.md
usegithub.com/akotlar/bystro-snp
Dockerfile.python
andinstall/install-go-packages.sh
usegithub.com/bystrogenomics/bystro-snp
While all installations use the correct version (1.0.1), this inconsistency should be addressed:
- Determine which package name is correct (
akotlar
orbystrogenomics
). - Update all occurrences to use the correct package name:
perl/Dockerfile
Dockerfile.python
install/install-go-packages.sh
README.md
- Review and update any other files that mention bystro-snp, including configuration files in the
config/
directory.
Ensuring consistency will prevent potential confusion or issues in the future.
Analysis chain
Version update looks good. Verify compatibility.
Updating to the latest patch version of bystro-snp
is a good practice to incorporate bug fixes and improvements. Please ensure that version 1.0.1
is compatible with the rest of the codebase and dependencies.
Run the following script to search for any potential compatibility issues:
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for usage of bystro-snp and analyze compatibility with the updated version.
# Test: Search for bystro-snp usage in Go files. Expect: No compatibility issues.
rg --type go -A 5 $'bystro-snp'
# Test: Search for bystro-snp usage in shell scripts. Expect: No compatibility issues.
rg --type sh -A 5 $'bystro-snp'
Length of output: 474
Script:
#!/bin/bash
# Description: Search for both package names of bystro-snp and check for any other references
# Search for akotlar/bystro-snp
echo "Searching for akotlar/bystro-snp:"
rg -i "akotlar/bystro-snp"
echo -e "\nSearching for bystrogenomics/bystro-snp:"
rg -i "bystrogenomics/bystro-snp"
echo -e "\nSearching for any other mentions of bystro-snp:"
rg -i "bystro-snp" --type-not go --type-not sh
echo -e "\nChecking for any configuration files mentioning bystro-snp:"
fd -e yaml -e json -e toml -e ini -e conf | xargs rg -i "bystro-snp"
Length of output: 2613
To build a Docker image using the `Dockerfile`, run the following: | ||
|
||
```bash | ||
cd ../ && docker build -t bystro-annotator -f Dockerfile.perl . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I supposed to start in the bystro repo folder to run this, or does it intend for me to start in the bystro/perl folder so it takes me to the bystro folder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (4)
- .github/workflows/perl-build-docker.yml (1 hunks)
- INSTALL.md (2 hunks)
- perl/INSTALL.md (1 hunks)
- python/python/bystro/covariance/tests/test_hypothesis_hd.py (5 hunks)
Files skipped from review as they are similar to previous changes (3)
- .github/workflows/perl-build-docker.yml
- INSTALL.md
- python/python/bystro/covariance/tests/test_hypothesis_hd.py
Additional context used
LanguageTool
perl/INSTALL.md
[typographical] ~33-~33: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...://metacpan.org/pod/App::cpanminus). - Alternatively you can use [cpanm](https://metacpan.or...(RB_LY_COMMA)
[typographical] ~118-~118: It appears that a comma is missing.
Context: ...1.tar.gz | tar xvf - ``` In this example the hg38 database would located in `/mn...(DURING_THAT_TIME_COMMA)
[grammar] ~118-~118: The modal verb ‘would’ requires the verb’s base form.
Context: ...In this example the hg38 database would located in/mnt/annotator/hg38
3. Update the...(MD_BASEFORM)
Markdownlint
perl/INSTALL.md
136-136: Punctuation: ':'
Trailing punctuation in heading(MD026, no-trailing-punctuation)
138-138: null
Bare URL used(MD034, no-bare-urls)
139-139: null
Bare URL used(MD034, no-bare-urls)
Additional comments not posted (1)
perl/INSTALL.md (1)
8-8
: Clarify the working directory for the Docker build command.The past review comment is still valid:
cristinaetrv: Am I supposed to start in the bystro repo folder to run this, or does it intend for me to start in the bystro/perl folder so it takes me to the bystro folder?
Please update the instructions to clearly specify the expected working directory when running the Docker build command. This will help users avoid confusion and ensure they are in the correct location.
perl/INSTALL.md
Outdated
- Debian/Ubuntu: 'sudo ../install/install-rpm-deps.sh` | ||
- Centos/Fedora/Amazon Linux: 'sudo ../install/install-apt-depts.sh` | ||
- bgzip: `../install/install-htslib.sh ~/.profile ~/.local` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix typos in the prerequisite installation commands.
The commands for installing prerequisites have typos in the file names:
- For Debian/Ubuntu, it should be
install-apt-deps.sh
instead ofinstall-rpm-deps.sh
. - For CentOS/Fedora/Amazon Linux, it should be
install-rpm-deps.sh
instead ofinstall-apt-depts.sh
.
Please update the commands with the correct file names to ensure users can successfully install the prerequisites.
perl/INSTALL.md
Outdated
bgzip -d -c --threads 32 hg38_v11.tar.gz | tar xvf - | ||
``` | ||
|
||
In this example the hg38 database would located in `/mnt/annotator/hg38` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix grammatical errors in the sentence.
The sentence has the following grammatical issues:
- Missing comma after "In this example".
- Incorrect verb form "would located" instead of "would be located".
Please update the sentence as follows:
-In this example the hg38 database would located in `/mnt/annotator/hg38`
+In this example, the hg38 database would be located in `/mnt/annotator/hg38`
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
In this example the hg38 database would located in `/mnt/annotator/hg38` | |
In this example, the hg38 database would be located in `/mnt/annotator/hg38` |
Tools
LanguageTool
[typographical] ~118-~118: It appears that a comma is missing.
Context: ...1.tar.gz | tar xvf - ``` In this example the hg38 database would located in `/mn...(DURING_THAT_TIME_COMMA)
[grammar] ~118-~118: The modal verb ‘would’ requires the verb’s base form.
Context: ...In this example the hg38 database would located in/mnt/annotator/hg38
3. Update the...(MD_BASEFORM)
perl/INSTALL.md
Outdated
1. Human (hg38): https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz | ||
2. Human (hg19): https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convert bare URLs to Markdown links.
Using bare URLs is not recommended in Markdown as it reduces readability. Please convert the URLs to Markdown links with descriptive text.
-1. Human (hg38): https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz
-2. Human (hg19): https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz
+1. [Human (hg38) database](https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz)
+2. [Human (hg19) database](https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz)
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
1. Human (hg38): https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz | |
2. Human (hg19): https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz | |
1. [Human (hg38) database](https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz) | |
2. [Human (hg19) database](https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz) |
Tools
Markdownlint
138-138: null
Bare URL used(MD034, no-bare-urls)
139-139: null
Bare URL used(MD034, no-bare-urls)
perl/INSTALL.md
Outdated
yq write -i config/hg38.yml temp_dir /mnt/my_fast_local_storage/tmp | ||
``` | ||
|
||
## Databases: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove trailing colon from the heading.
The heading "Databases:" has a trailing colon, which is not recommended in Markdown. Please remove the colon to improve the formatting.
-## Databases:
+## Databases
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
## Databases: | |
## Databases |
Tools
Markdownlint
136-136: Punctuation: ':'
Trailing punctuation in heading(MD026, no-trailing-punctuation)
Ran the following and it ran successfully on the workstation, on a folder outside of my home directory:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (4)
perl/INSTALL.md (4)
28-46
: Add a comma after "First" for better readability.Please update the sentence as follows:
-First you'll need to install some prerequisites: +First, you'll need to install some prerequisites:Other than that, the manual installation instructions for prerequisites and Go programs are clear and provide the necessary commands.
Tools
LanguageTool
[uncategorized] ~29-~29: Possible missing comma found.
Context: ...level dependencies. ## Manual Install First you'll need to install some prerequisit...(AI_HYDRA_LEO_MISSING_COMMA)
48-69
: Add a comma after "Alternatively" for better clarity.Please update the sentence as follows:
-Alternatively you can use [cpanm](https://metacpan.org/dist/App-cpanminus/view/bin/cpanm), which can be installed with the following: `curl -fsSL https://cpanmin.us | perl - App::cpanminus` +Alternatively, you can use [cpanm](https://metacpan.org/dist/App-cpanminus/view/bin/cpanm), which can be installed with the following: `curl -fsSL https://cpanmin.us | perl - App::cpanminus`Other than that, the instructions for installing
cpm
and setting up environment variables are clear and provide the necessary commands.Tools
LanguageTool
[typographical] ~50-~50: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...://metacpan.org/pod/App::cpanminus). - Alternatively you can use [cpanm](https://metacpan.or...(RB_LY_COMMA)
115-151
: Fix the grammatical errors in the example sentence.Please update the sentence as follows:
-In this example the hg38 database would located in `/mnt/annotator/hg38` +In this example, the hg38 database would be located in `/mnt/annotator/hg38`Other than that, the instructions for configuring the Bystro Annotator, including downloading and installing the Bystro database and updating the YAML configuration, are clear and provide the necessary steps.
Tools
LanguageTool
[typographical] ~135-~135: It appears that a comma is missing.
Context: ...1.tar.gz | tar xvf - ``` In this example the hg38 database would located in `/mn...(DURING_THAT_TIME_COMMA)
[grammar] ~135-~135: The modal verb ‘would’ requires the verb’s base form.
Context: ...In this example the hg38 database would located in/mnt/annotator/hg38
3. Update the...(MD_BASEFORM)
153-157
: Improve the formatting of the "Databases" section.Please apply the following changes:
- Remove the trailing colon from the heading "Databases:".
- Convert the bare URLs to Markdown links with descriptive text.
-## Databases: +## Databases -1. Human (hg38): https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz -2. Human (hg19): https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz +1. [Human (hg38) database](https://s3.amazonaws.com/bystro-db/hg38_v11.tar.gz) +2. [Human (hg19) database](https://s3.amazonaws.com/bystro-db/hg19_v10.tar.gz)Tools
Markdownlint
153-153: Punctuation: ':'
Trailing punctuation in heading(MD026, no-trailing-punctuation)
155-155: null
Bare URL used(MD034, no-bare-urls)
156-156: null
Bare URL used(MD034, no-bare-urls)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- install/install-perl-libs.sh (1 hunks)
- perl/INSTALL.md (1 hunks)
Additional context used
LanguageTool
perl/INSTALL.md
[uncategorized] ~26-~26: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... "sudo" access to install the necessary system level dependencies. ## Manual Install First...(EN_COMPOUND_ADJECTIVE_INTERNAL)
[uncategorized] ~29-~29: Possible missing comma found.
Context: ...level dependencies. ## Manual Install First you'll need to install some prerequisit...(AI_HYDRA_LEO_MISSING_COMMA)
[typographical] ~50-~50: Consider adding a comma after ‘Alternatively’ for more clarity.
Context: ...://metacpan.org/pod/App::cpanminus). - Alternatively you can use [cpanm](https://metacpan.or...(RB_LY_COMMA)
[typographical] ~135-~135: It appears that a comma is missing.
Context: ...1.tar.gz | tar xvf - ``` In this example the hg38 database would located in `/mn...(DURING_THAT_TIME_COMMA)
[grammar] ~135-~135: The modal verb ‘would’ requires the verb’s base form.
Context: ...In this example the hg38 database would located in/mnt/annotator/hg38
3. Update the...(MD_BASEFORM)
Markdownlint
perl/INSTALL.md
153-153: Punctuation: ':'
Trailing punctuation in heading(MD026, no-trailing-punctuation)
155-155: null
Bare URL used(MD034, no-bare-urls)
156-156: null
Bare URL used(MD034, no-bare-urls)
Additional comments not posted (9)
install/install-perl-libs.sh (5)
8-18
: LGTM!The new function
install_custom_msgpack
encapsulates the custom installation logic forData::MessagePack
module. It ensures a specific version is installed from a custom repository usingcpm
, aligning with the overall refactoring to usecpm
for module installation. Calling the function immediately after its definition ensures the custom installation is performed.
20-21
: LGTM!The code segment installs
Alien::LMDB
andLMDB_File
modules usingcpm
, aligning with the overall refactoring. SettingALIEN_INSTALL_TYPE
toshare
ensures the correct installation type forAlien::LMDB
. Running tests during installation helps catch any issues early.
23-24
: LGTM!The code segment installs
MouseX::Getopt
module usingcpm
, aligning with the overall refactoring. Omitting the--test
option is justified due to a known trivial failure related to formatting of CLI help strings. The comment clearly explains the reason for omitting the--test
option.
26-26
: LGTM!The code segment installs modules from a
cpanfile
usingcpm
, aligning with the overall refactoring. Installing modules from acpanfile
helps manage dependencies. Running tests during installation helps catch any issues early.
2-4
: LGTM!The code segment enables strict error handling using
set -e
, which helps catch and fail on any errors during the installation process. Printing a message helps provide clarity and demarcation of the Perl libraries installation section.perl/INSTALL.md (4)
3-18
: LGTM!The Docker installation instructions are clear and provide the necessary commands to build the image and run the container. The example commands for running the annotator, building a new database, and fetching dependencies are helpful for users.
19-27
: LGTM!The local installation instructions are clear and provide the necessary commands for Debian/Ubuntu and CentOS/Fedora/Amazon Linux. The mention of requiring "sudo" access to install system-level dependencies is helpful for users.
Tools
LanguageTool
[uncategorized] ~26-~26: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... "sudo" access to install the necessary system level dependencies. ## Manual Install First...(EN_COMPOUND_ADJECTIVE_INTERNAL)
72-102
: LGTM!The instructions for installing specific dependencies separately are clear and provide the necessary commands. The additional instructions for manually installing
LMDB_File
for Perl versions greater than 5.36.0 are helpful for users with newer Perl versions.
175-184
: LGTM!The information about the coding style and using
tidyall
to tidy files before submitting patches is helpful for contributors. The instructions for installingtidyall
,perltidy
, andperlcritic
are clear and provide the necessary command.
bystro-annotate.pl --help | ||
``` | ||
|
||
## Configuring the Bystro Annotator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend making this the next size up
…ling which won't work when libdeflate is used or bgzip is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much cleaner and easier to follow instructions with these changes, and was able to install easily once issues were resolved. LGTM!
Recommend to issue a PR in near future with further cleanup of documentation so that it is up to date, and free of typos and old links.
Made an issue to track that #573 |
Mac installation script is removed for the time being; Docker is still available for that purpose. We can revisit in the future, this PR has gotten large enough, and we have Dist::Zilla (and Docker) based install to fall back to. In future work we will further simplify installation.
Summary by CodeRabbit
Release Notes
New Features
Documentation
Bug Fixes
DBD::mysql
toDBD::MariaDB
for improved database compatibility.Chores
2.0.0
to reflect significant changes in the software..dockerignore
file to optimize Docker build context.