Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding initial test suite #10

Merged
merged 13 commits into from
Jul 8, 2024
Merged

Adding initial test suite #10

merged 13 commits into from
Jul 8, 2024

Conversation

Joshua-K1
Copy link
Collaborator

@Joshua-K1 Joshua-K1 commented Jul 8, 2024

🤖AEP PR SUMMARY🤖

  • Added a new file named run-test-suite.yml in the .github/workflows directory with a workflow for running tests on pull requests and manual triggers
  • Updated the Makefile to include a new target unittest for building locally and running tests
  • Modified prompts.py in the src/helpers directory to include debug logging for prompt injection and error handling for prompt injection detection
  • Updated requirements.txt in the src directory to include a new dependency coverage

Copy link

github-actions bot commented Jul 8, 2024

  • Python Version in Strategy

    • Consider removing '3.12' from the python-version list in the matrix strategy if Python 3.12 is still in development or pre-release by the time of your project's deployment. Keeping your testing environments to stable releases ensures your tests run against versions of Python that are fully supported and widely used.
      Example:
      yml
      matrix:
      python-version: ['3.10', '3.11']
    
    
  • GitHub Actions Versions

    • It is advised to use the latest version of actions such as actions/checkout and actions/setup-python. This ensures you benefit from the latest features and security patches. For example, if actions/checkout@v3 and actions/setup-python@v3 exist, it's beneficial to upgrade.
      Example:
    - uses: actions/checkout@v3
    - uses: actions/setup-python@v3
  • Caching Dependencies

    • To reduce the runtime of workflows and save on CI/CD minutes, consider caching dependencies after installing them. GitHub Actions provides caching mechanisms that can significantly speed up build times.
      Example:
    - name: Cache Python dependencies
      uses: actions/cache@v3
      with:
        path: |
          ~/.cache/pip
          !~/.cache/pip/log
        key: ${{ runner.os }}-pip-${{ hashFiles('**/src/requirements.txt') }}
        restore-keys: |
          ${{ runner.os }}-pip-
  • Security Considerations for API Keys

    • Ensure that SYSTEM_API_KEY is either a placeholder value that needs to be replaced or securely handled if it's a sensitive key. It's not clear whether "system" as a value is intended for demonstration.
    • Similarly, review if SYSTEM_API_KEY, OPENAI_API_TYPE, and azure_openai_api_version need to be hardcoded or if they should be securely managed or made configurable through secrets/environment variables, especially in scenarios where different environments (development, staging, production) require different settings or versions.
  • Use of Specific Python Installation Command

    • Instead of python -m pip install --upgrade pip, consider using python -m ensurepip followed by python -m pip install --upgrade pip to ensure pip is installed in environments where it might be missing.
      Example:
    run: |
      python -m ensurepip
      python -m pip install --upgrade pip
      if [ -f src/requirements.txt ]; then pip install -r src/requirements.txt; fi
  • pytest Command Improvements

    • For better maintainability and readability, consider breaking down the pytest command and adding explanatory comments on flags used, especially -vv, -s, and --junit-xml.
      Example:
    run: |
      # Verbose output, disable capturing of stdout, generate an XML report
      pytest tests/ -vv -s --junit-xml=test-results.xml
  • GitHub Actions Best practices

    • It's good practice to specify the exact commit sha for GitHub Actions used from the marketplace which are not published by GitHub or verified publishers for enhanced security. For instance, pmeier/pytest-results-action@main could point to an unexpected latest version which might introduce breaking changes or security vulnerabilities. Instead, point to a specific commit hash.
      Example (Example commit hash, replace with the latest stable commit):
    uses: pmeier/pytest-results-action@abcdef1234567890abcdef1234567890abcdef12

Copy link

github-actions bot commented Jul 8, 2024

Code Review Feedback

GitHub Actions Workflow: run-test-suite.yml

  1. Use of ubuntu-latest:

    • Consider specifying an exact version of Ubuntu instead of ubuntu-latest. This will help to ensure the reproducibility of your builds over time. For example, use ubuntu-20.04.
  2. Python Version Strategy:

    • Including a development version (e.g., 3.12 if not officially released yet) in the testing matrix could lead to unstable builds. Ensure that all specified versions are stable releases for a more reliable CI process.
  3. Environment Variables:

    • It looks clean, but consider periodically reviewing the secrets and environment variables to ensure no deprecated or unnecessary ones are being injected, reducing potential security risks.
  4. actions/checkout@v2 and actions/setup-python@v2:

    • Check if there are newer versions of these actions to leverage any improvements or security patches.

Python Code: prompts.py

  1. Logging Sensitive Information:

    • Be cautious about logging potentially sensitive information. For example, event_logger.debug(f\"Response from AI ContentSafety: {response.json()}\") might contain sensitive data about the content. Consider logging less sensitive metadata instead.
  2. Error Handling in check_for_prompt_inj:

    • The try-except block is a good practice, but you could enhance the error handling by being more specific about the exceptions you are catching (e.g., requests.exceptions.RequestException instead of a general Exception). This can help with diagnosing issues more precisely.
  3. Duplicate Logging:

    • The logging statement under the if response_json['documentsAnalysis'][0]['attackDetected']: condition is duplicated before and inside the condition. Removing the duplicate before the condition will make the code cleaner and reduce unnecessary logs.
  4. Refactoring for Readability:

    • The process within the check_for_prompt_inj function could be refactored into smaller functions for improved readability and maintainability. For example, the construction and sending of the request could be one function, and the analysis of the response could be another.
  5. Security Best Practices:

    • Ensure that requests.post is being called with HTTPS URLs to prevent MITM (Man-in-the-Middle) attacks. This should be implied by using secure endpoints (config.azure_cs_endpoint), but it's always good to assert such conditions in the code if possible.
  6. Content-Type Header:

    • Since you are sending JSON data, your approach is correct. Just ensure that all endpoints you interact with support the 'application/json' Content-Type.

General Recommendations:

  • Code Documentation:

    • Adding docstrings to functions, especially like check_for_prompt_inj, can greatly improve the understandability of the code for others and your future self. Include a brief description, parameters, and what the function returns.
  • Unit Testing:

    • Ensure that there are sufficient unit tests covering the logic in check_for_prompt_inj, especially testing the handling of different response payloads and error conditions.

Copy link

github-actions bot commented Jul 8, 2024

GitHub Workflow Improvements

YAML Best Practices

  • Indent consistency: Ensure consistent indent spacing for readability. For example, the environment variables (env: block) should have consistent indentation.
  • Explicit Python Versions: Specify multiple Python versions in the test matrix for more comprehensive testing, if applicable. For example,
    yaml
    python-version: ['3.12', '3.11', '3.10']
  • Quotes consistency: Maintain consistent use of quotes for strings. Although YAML does not always require quotes, choosing either always quoted or minimal quotes for clarity helps. For the name and runs-on fields, either consistently use quotes or remove them where they are unnecessary.

Actions Versions

  • Update GitHub Actions to Use Latest Versions: It's beneficial to use the latest stable versions of actions for both security patches and new features. For example,
    uses: actions/checkout@v3
    uses: actions/setup-python@v3

Security

  • Validate and ensure that secrets used in the workflow have minimal permissions necessary for the operations they perform to follow the principle of least privilege.

Makefile Improvements

Error Handling

  • Ensure proper error handling or set -e in scripts to fail on errors within the Makefile. This ensures better reliability and debugging.

Consistency and Documentation

  • Documentation Comments: Adding comments for each target about what they do can help new developers understand the codebase quickly.

Code Improvements in Python

Error Handling

  • Specific Exceptions: Catch specific exceptions in try-except blocks rather than a general Exception for requests.post() call in check_for_prompt_inj:
    except requests.exceptions.RequestException as err:

Performance

  • Reuse of Sessions: Use requests.Session() for making HTTP requests to reuse underlying TCP connections, improving performance, especially when making multiple requests to the same host.

Security

  • Data Leakage: Be cautious about logging sensitive information. For instance, logging full response from external services could potentially leak sensitive data. Consider logging less detailed error messages or sanitize logs accordingly.

Clean Code Practices

  • Function Naming: Ensure function names are descriptive and adhere to a consistent naming convention. check_for_prompt_inj could be more descriptive.
  • String Formatting: Use f-strings for consistency and readability, as already adopted in the majority of the codebase.

Dependencies

  • Pin the versions of the dependencies in src/requirements.txt to ensure reproducibility of the build and mitigate the risk of automatically upgrading to a newer, potentially breaking version of a library.
    requests==2.25.1
  • Regularly update dependencies to the latest stable versions to incorporate security patches and new features.

Copy link

github-actions bot commented Jul 8, 2024

Review Comments for the Provided Git Diff

Workflow (run-test-suite.yml)

  1. Usage of Ubuntu Latest: It's recommended to specify the exact version of an Ubuntu runner rather than using ubuntu-latest to ensure the consistency and reproducibility of the workflow runs. This avoids surprises when ubuntu-latest moves to a new version.

    yaml
    runs-on: ubuntu-22.04

    
    
  2. Python Setup Action Version: Consider using the latest version of the setup-python GitHub Action (v3 instead of v2) to leverage the latest features and fixes.

    uses: actions/setup-python@v3
  3. Checkout Action Version: Updating to actions/checkout@v3 is recommended for the latest features and fixes.

    uses: actions/checkout@v3
  4. Python Version Hardcoding: Instead of hardcoding a single Python version in the matrix, consider testing against multiple relevant Python versions to ensure compatibility.

    python-version: ['3.8', '3.9', '3.10', '3.11']
  5. Use of pmeier/pytest-results-action@main: Relying on the main branch of a GitHub Action can introduce instability. Prefer using a specific release or commit.

    uses: pmeier/[email protected]  # As an example

Makefile

  1. Lack of Consistency in Script Execution: For consistency, consider using the same pattern for invoking scripts. If one script uses $(MAKE) to invoke another target, all should, unless there's a specific reason not to.

    • Consider wrapping run_tests.sh in a make target and calling it similarly to others:

      .PHONY: run-tests
      run-tests:
          ./buildscripts/run_tests.sh
  2. No Newline at the End of File: Always ensure there's a newline at the end of files. This is a POSIX compliance issue, and while it's a small detail, it can prevent potential issues in Unix-like environments and maintain consistency across files.

prompts.py

  1. Logging Response Before Checking Success: Logging the full response from an external API before checking if the request was successful could potentially leak sensitive information to logs. Consider logging detailed responses only after validating the response code.

    response = requests.post(url, headers=headers, data=json.dumps(data))
    
    if response.ok:
        event_logger.debug(f\"Response from AI ContentSafety: {response.json()}\")
  2. Exception Handling in check_for_prompt_inj: While catching a broad exception is better than none, it's a good practice to catch specific exceptions wherever possible (e.g., requests.exceptions.RequestException). This aids in handling different failure cases more gracefully.

    except requests.exceptions.RequestException as err:
  3. Magic Strings and Numbers: The API version (\"2024-02-15-preview\") used in the URL is hard-coded. Consider defining such values as constants at the top of your modules or in a configuration file. This makes maintenance and updates easier.

    API_VERSION = \"2024-02-15-preview\"

requirements.txt

  1. Explicit Versioning: The additional libraries (coverage) should have explicit versions defined to ensure reproducibility and minimize unexpected updates. Not specifying a version can lead to unexpected compatibility issues.

    coverage==5.5  # As an example
    

Adhering to these suggestions will enhance the maintainability, security, and compatibility of your code and infrastructure configuration.

Copy link

github-actions bot commented Jul 8, 2024

null

Copy link

github-actions bot commented Jul 8, 2024

.github/workflows/run-test-suite.yml Improvements

  • Python version hardcoding: Using python-version: ['3.12'] hardcodes the Python version; consider using a more flexible approach to automatically adopt newer versions. Example:
    yml
    python-version: ['3.x']

    ensures the latest Python 3 version is used.
    
    
  • Actions versioning: It is recommended to use stable versions of GitHub Actions rather than tags like v2 or main. Pinning to a specific version or commit SHA ensures stability and reproducibility. Example:

    uses: actions/[email protected]
    uses: actions/[email protected]
    uses: pmeier/pytest-results-action@1a2b3c4d
  • Redundant environment variables: There seems to be duplicate environment variables for Azure OpenAI Key (AZURE_OPENAI_KEY and OPENAI_API_KEY). Double-check and ensure only necessary variables are declared.

  • Sensitive data exposure: Verify that all sensitive data (e.g., keys and endpoints) used in this workflow are appropriately scoped and have minimum required permissions. Limit exposure by adhering to the principle of least privilege.

Makefile Improvements

  • Rule Dependency: In the unittest target, it's calling $(MAKE) build-local and $(MAKE) run-tests sequentially. If build-local has dependencies or prerequisites, ensure they are explicitly defined for readability and maintainability.

  • Missing newline: The diff shows that there's no newline at the end of the file. It's generally a good practice to end files with a newline to avoid potential issues with UNIX utilities:

    inference-run:
      ./buildscripts/run_inference_service.sh

src/helpers/prompts.py Improvements

  • Error handling: The check_for_prompt_inj function now includes error handling, which is an improvement. Ensure that all potential exceptions from the requests.post call are handled appropriately, not just generic exceptions. Consider catching specific exceptions (requests.exceptions.RequestException) for more granular error handling.

  • Logging sensitive information: Be cautious about logging potentially sensitive information. For instance, logging full response data (event_logger.info(f\"Response from AI ContentSafety: {response.json()}\")) and config URLs might expose information that could be sensitive. Always sanitize or minimize what's logged, especially in production environments.

  • Improving readability: The function check_for_prompt_inj mixes logging and business logic. Consider separating these concerns for better readability. Additionally, the nested conditional logic for checking attackDetected could be streamlined for clarity.

src/requirements.txt Improvements

  • Specify library versions: Avoid using libraries without specifying versions. This can lead to inconsistencies and unexpected behavior when the libraries are updated. Specify exact versions for better reproducibility. Example:
    requests==2.26.0
    

Copy link

github-actions bot commented Jul 8, 2024

General Feedback

The given diffs introduce multiple changes related to GitHub Actions workflows, Makefile adjustments, and Python code enhancements for security measures. Below are suggestions for additional improvements focusing on best practices, code quality, and further security practices.

GitHub Actions Workflow (run-test-suite.yml)

  1. Use Latest Version of Actions

    • Instead of actions/checkout@v2 and actions/setup-python@v2, consider using the latest versions since they may include bug fixes, new features, or security patches.
  2. Specifying Python Versions

    • The workflow uses a specific Python version (3.12). It's generally a good idea to test against multiple versions of Python if your application is intended to support them. This can be achieved by expanding the python-version matrix. Example:
      yaml
      python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']
      
      
  3. Sensitive Information in Logs

    • Ensure that none of the echo commands potentially leak sensitive information to the logs. Although environment variables themselves may not be leaked, inadvertent logging elsewhere could expose them. Always audit logging practices.

Makefile Improvements

  1. Add Descriptions for Make Targets

    • Adding comments or descriptions for make targets can improve readability and maintainability, especially for newcomers to the project.
  2. Ensure Portable Makefile Syntax

    • Ensure that the syntax used in the Makefile is POSIX-compliant for maximum compatibility across different environments.

Python Code (prompts.py)

  1. Error Handling

    • The addition of a try-except block around the requests.post call is a good practice to handle potential runtime exceptions. Ensure that this error handling is sufficient for all possible exceptions that could arise from the network request (for instance, requests.exceptions.RequestException).
  2. Logging Sensitive Information

    • Be cautious about logging potentially sensitive information, even if it's for debugging purposes. For instance, logging the complete response from an external service might inadvertently expose sensitive data. Consider logging only non-sensitive metadata, or ensure logs are appropriately secured and access-controlled.
  3. Content-Type Header

    • The Content-Type header is set to application/json, which is good practice when sending JSON payloads. Ensure that all external services you interact with expect and correctly handle this content type.
  4. Security Measures

    • It's great to see proactive security measures through check_for_prompt_inj(). Continuously review and update these measures based on the evolving threat landscape and the external service's capabilities.

Python Requirements (requirements.txt)

  1. Pin Dependencies

    • Consider pinning your dependencies to specific versions to ensure repeatable builds and avoid unexpected breaks due to package updates. For example:
      requests==2.25.1
      
  2. Regularly Update Dependencies

    • Regularly update your dependencies to include security patches and improvements. This can be automated with tools like Dependabot.
  3. Specify Indirect Dependencies

    • While it's generally good practice to avoid overly specifying indirect dependencies, being aware of critical sub-dependencies and their versions can be crucial for maintaining a secure and stable application.

Copy link

github-actions bot commented Jul 8, 2024

Review of the given diff

GitHub Actions Workflow Improvements

  • Use of Latest Versions for Actions: Ensure that actions used like actions/checkout@v2 and actions/setup-python@v2 are updated to their latest versions (if available) for security and functionality improvements.
    yaml
    • uses: actions/checkout@v3
    • uses: actions/setup-python@v3
  • Explicitly specify Python patch version: The Python version is specified as 3.12, which might lead to unpredictable builds if 3.12.x releases contain changes. Specify the full version (e.g., 3.12.0) to ensure consistent environments across runs.
    python-version: ['3.12.0']
  • Avoid Hardcoding Secrets: Ensure no accidental hardcoding of secrets. While this diff does not show hardcoded secrets, awareness is crucial. Always use secrets from GitHub Secrets.
  • Security of Action: Verify the source and security of third-party actions like pmeier/pytest-results-action@main. Prefer using actions with version tags or commit SHA to prevent unexpected changes.
    uses: pmeier/[email protected]

Makefile Improvements

  • Modular Commands: The Makefile has improved modularity with separate commands for unittest, which is a good practice. Ensure that your make tasks are as idempotent as possible, for example, by handling pre-existing states gracefully.
  • Documentation: Adding comments to explain the purpose of each command or section in the Makefile could improve maintainability, especially for new developers.
  • Final Newline: Make sure there is a newline at the end of the Makefile as per Unix and POSIX standards to avoid potential issues with tools that process the Makefile.

Python Code Improvements

  • Error Handling in HTTP Requests: Good use of try-except block for handling request failures. However, consider narrowing the catch to more specific exceptions like requests.exceptions.RequestException to avoid catching unrelated exceptions.
    except requests.exceptions.RequestException as err:
  • Sensitive Data Logging: Be cautious about logging sensitive information, especially when logging responses from external services. Make sure this does not expose any sensitive or personal information.
  • Constants Naming Convention: Python constants like url and headers in check_for_prompt_inj function should be named using uppercase letters as per PEP 8 style guide if they are meant to be constants.
  • Magic Strings & Numbers: API version strings and endpoint strings are hardcoded; consider defining these as constants at the start of your module or in a configuration file.

Dependency Management

  • Pinning Exact Versions in requirements.txt: To ensure reproducible builds and avoid unexpected updates, consider pinning your dependencies to specific versions in src/requirements.txt.
    requests==2.26.0
    coverage==6.0.2
    

Overall, ensuring that your workflow is secure, maintainable, and up-to-date with the latest practices will significantly benefit the longevity and security of your project.

Copy link

github-actions bot commented Jul 8, 2024

GitHub Actions Workflow: .github/workflows/run-test-suite.yml

1. Use Action Versions:

  • Change uses: actions/checkout@v2 to a pinned version or at least use the latest major version v3 if consistent updates are preferred. Moreover, following this practice for all GitHub actions minimizes unexpected breaks due to updates.

yaml

  • uses: actions/checkout@v3

This advice extends to `actions/setup-python@v2` and `pmeier/pytest-results-action@main`. Pin them to specific versions or use the latest major version tags to ensure stability.

#### 2. Python Version As a Matrix:

If future-proofing or expanding the testing strategy across multiple Python versions is a consideration, it makes sense. Otherwise, specifying a matrix for a single version of Python is unnecessary overhead.

```yaml
strategy:
  matrix:
    python-version: ['3.12']

Consider simplifying unless planning to add more versions.

3. Redundant Environment Variables:

There’s redundant setting of AZURE_OPENAI_KEY and OPENAI_API_KEY to the same secret. Ensure if both are indeed required, or if not, consider removing one to tidy the environment variable space.

OPENAI_API_KEY: ${{ secrets.TEST_AZURE_OPENAI_KEY }}

Makefile: Makefile

4. Explicit Makes Call:

In unittest, it explicitly calls $(MAKE) for both build-local and run-tests. This is good practice as it ensures the make process is correctly propagated, especially in environments where make might not refer to GNU make by default.

Python Code: src/helpers/prompts.py

5. Exception-Handling in check_for_prompt_inj method:

The addition of a try-except block enhances error handling, making the code more robust by catching and logging any exceptions that occur in the requests.post call. However:

  • Ensure that the catch of a generic Exception is what you want. Catching more specific exceptions related to network issues or JSON parsing might make error handling more precise and actionable.
except requests.exceptions.RequestException as err:
    ...

6. Logging Response Before Parsing:

Before attempting to parse the response (response.json()), logging is done inside the try block. This is a good practice, especially for debugging issues related to API calls. Just ensure sensitive information isn't logged unintentionally.

7. Code Duplication:

There’s a duplicated logging statement when an injection is detected. Consider refactoring this to reduce redundancy and simplify future changes.

requirements.txt: src/requirements.txt

8. Explicit Version Pinning:

The file should pin specific versions for each package to ensure that the environments are reproducible, and unexpected updates do not break the application.

requests==x.x.x

Replace x.x.x with the latest or a stable version of each requirement.

Copy link

github-actions bot commented Jul 8, 2024

Workflow File Improvements

  1. Update Actions Versions:

    • Use the latest version of actions, like actions/checkout and actions/setup-python, to benefit from the latest features and security updates.
      yaml
      • uses: actions/checkout@v3
      • uses: actions/setup-python@v3
      
      
  2. Refine Python Versions:

    • Rather than hardcoding a single Python version, consider testing against multiple versions to ensure compatibility.
      python-version: ['3.8', '3.9', '3.10', '3.11']
  3. Environment Variables Security:

    • Be cautious with how environment variables are handled, especially when writing them to a .env file. This approach could potentially expose sensitive information if not properly secured.

Makefile Improvements

  1. Parallelism in Makefile:

    • Consider leveraging the capability of make to run jobs in parallel (-j flag) when appropriate, e.g., when tasks do not depend on each other, to improve build times.
  2. Include a Final Newline:

    • Always include a newline at the end of files, including the Makefile, to adhere to the POSIX standard and avoid potential parsing issues.

Python Code Improvements

  1. Exception Handling in check_for_prompt_inj Function:

    • The exception handling in the try-except block could be more granular, catching more specific exceptions (e.g., requests.exceptions.RequestException) rather than a general Exception, to avoid masking unrelated issues.
      except requests.exceptions.RequestException as err:
  2. Logging Sensitive Information:

    • Be cautious when logging potentially sensitive information. Ensure that the logged data in event_logger.debug(f\"Response from AI ContentSafety: {response.json()}\") does not contain sensitive data or consider redacting parts of the output.

Requirements.txt Improvements

  1. Pin Dependencies:

    • It's a best practice to pin dependencies in requirements.txt to specific versions to ensure consistent environments and avoid potential breaking changes in dependencies.
      requests==2.26.0
      Flask==2.0.1
      # etc. for all dependencies
      
  2. Avoid Directly Adding Unrelated Dependencies:

    • The addition of coverage seems to be aiming at test coverage measurement. Ensure dependencies are categorized (e.g., separate requirements for production and development) and that their inclusion is necessary for the project's scope. Consider using a tool like pip-tools for better dependency management.

Overall, these suggestions aim to enhance the maintainability, security, and efficiency of the code and workflow.

Copy link

github-actions bot commented Jul 8, 2024

GitHub Actions Workflow

General Improvements

  • Use latest actions: Always ensure to use the latest version of actions for features and security patches. For example, actions/checkout@v2 should be actions/checkout@v3 if it's available and tested.

    yaml

    • name: Checkout repository
      uses: actions/checkout@v3
    
    
  • Specify Python versions more dynamically: Instead of hardcoding a single Python version ('3.12'), consider supporting multiple versions or using a more dynamic selection method to future-proof your workflows. For minor updates, you might consider specifying '3.x' to automatically use the latest Python 3 version.

  • Redundancy in environment variables: You have both AZURE_OPENAI_KEY and OPENAI_API_KEY set to the same value. If they always share the same value, consider using only one to avoid redundancy and potential confusion.

Security and Best Practices

  • Sensitive data in plain text: Writing secrets directly into a .env file could potentially expose them if the file is accidentally included in the repository or a build artifact. Instead, consider directly using GitHub secrets in the runtime environment.

  • Use secrets for SYSTEM_API_KEY: Even if it's named \"system\", if SYSTEM_API_KEY is used for any form of authentication or sensitive operation, ensure it's managed securely.

Makefile Improvements

  • Dependency for unittest on build-local and run-tests: Ensure that these dependencies are appropriately managed considering the context in which they are used. For example, build-local should not need to run every time tests are executed if the build artifacts are unchanged.

  • Newline at the end of file: Always ensure there is a newline at the end of files to comply with POSIX standards and avoid potential issues with tools that expect this.

Python Code Improvements

Exception Handling

  • The check_for_prompt_inj function correctly adds a try-except block when calling an external service. This is a good practice. Ensure the handling is as specific as possible (catching a too general exception could hide other issues), and consider how to handle the error apart from logging (e.g., retries, fallbacks).

Logging

  • Sensitive Information: Be mindful of logging sensitive information, especially in event_logger.debug(f\"CS Config URL: {url}\") and when logging responses containing potentially sensitive data. Ensure that logging such information complies with your security policies.

General Code Style and Quality

  • Comments and Documentation: Ensure your code changes are accompanied by relevant comments and documentation updates, explaining why certain changes have been made, especially when adding new dependencies or when complex logic is involved.

  • Requirements.txt: The addition of coverage in requirements.txt is good for testing coverage; make sure it's used in your CI pipeline to generate and potentially enforce coverage metrics for code quality assurance.

Copy link

github-actions bot commented Jul 8, 2024

GitHub Workflow Recommendations

Use Specific Python Versions and Update Actions

  • Python Version: It's advised to test against multiple versions of Python if possible, not just the latest (3.12). This ensures compatibility across different environments. Example:
    yaml
    python-version: ['3.9', '3.10', '3.11', '3.12']
  • Action Versions: Using actions/checkout@v2 and actions/setup-python@v2 is good, but consider using the latest major versions (if available) to leverage new features and security fixes. Example replacement:
    uses: actions/checkout@v3
    uses: actions/setup-python@v3

Security Practices in GitHub Actions

  • Securing Environment Variables: Writing secrets to a .env file as shown can be risky because if the file gets checked into version control, secrets can be exposed. GitHub Actions automatically mask secrets in logs, but ensuring that .env does not get accidentally committed is crucial. As a best practice, pass environment variables directly to the command or script needing them or use GitHub Actions' built-in env: key.

Makefile Enhancements

  • No newline at End of File: It's a good practice to end files with a newline character to avoid potential issues with UNIX text processing tools which expect this. Simply add a newline to the end of Makefile.

Code Quality in Python script (prompts.py)

Exception Handling

  • Broad Exception Handling: Catching a general Exception is not recommended because it can make it difficult to understand what errors your code is prepared to handle. If possible, catch specific exceptions. Example:
    except requests.exceptions.RequestException as err:

Logging

  • Duplicate Response Logging: There is repetitive logging of response.json() upon detection of a prompt injection. It would be more efficient to log this once before the condition and then log additional context as needed.
  • Error Logging Level: For errors, especially around critical operations like security checks, consider logging at an error level (event_logger.error) with as much context as possible to aid in troubleshooting.

Requirements.txt

  • Dependency Management: Consider specifying versions for the project dependencies to ensure consistent environments and compatibility. Using the latest version of dependencies without specifying a version can lead to unexpected behavior or compatibility issues.
    Example:
    requests==2.25.1
    Flask==2.0.1
    pytest==6.2.4
    coverage==5.5
    

This practice helps in avoiding "it works on my machine" syndrome and ensures more predictable and stable builds.

Copy link

github-actions bot commented Jul 8, 2024

Review of Changes

GitHub Actions Workflow

General Best Practices
  • Updated versions are recommended for GitHub Actions to ensure the latest features, bug fixes, and security updates are utilized.
  • Consider specifying a more restrictive list of branches for triggering workflows, only if necessary to reduce unnecessary runs.
  • Review the necessity of printing out the pwd command in the Run tests step, as it seems more diagnostic than functional for the test run.

yml

  • uses: actions/checkout@v3 # Update to use the latest version
  • uses: actions/setup-python@v3 # Update to use the latest version

##### Security & Best Practice
- It's good practice to avoid echoing secrets or sensitive environment variables. Consider removing or replacing these with more secure logging methods if necessary for debugging.
- Ensure that the `.env` file created during the workflow is not inadvertently captured as an artifact or in logs, exposing sensitive data.

#### Makefile Improvements

##### Best Practices
- It’s a good practice to declare phony targets in `Makefile` to ensure that make understands these targets are not actual files.
- Consider adding a clean up or teardown process for your `unittest` target to maintain a clean environment post-execution.

```make
.PHONY: mkdocs-build unittest run-tests inference-build inference-run

Python Code Improvements

Error Handling
  • The update enhances error handling within the check_for_prompt_inj function, which is a positive practice.
  • In addition to logging errors, consider implementing a strategy for handling these errors more gracefully in the calling code.
Security and Code Quality
  • Use more specific exceptions in your try-except blocks rather than catching a general Exception to ensure that unexpected errors are not silently ignored or mishandled.
  • Ensure logging does not inadvertently expose sensitive information, particularly in error messages or diagnostics related to external API calls.
try:
    ...
except requests.exceptions.RequestException as err: # Use a more specific exception
    event_logger.error(f\"Failed to perform prompt injection detection: {err}\")
General Best Practice
  • When posting JSON data with requests, you can use the json= parameter instead of data= with json.dumps(). This automatically sets the content type to application/json.
response = requests.post(url, headers=headers, json=data) # Simplified JSON posting

Dependencies Management

Version Pinning
  • It is recommended to pin versions in requirements.txt to ensure reproducibility and prevent compatibility issues.
  • Regularly update dependencies to maintain security patches up to date, after testing for compatibility.
coverage==x.x.x # Specify the exact version required

Overall, these changes aim to enhance security, maintainability, and efficiency of the codebase.

@mrickettsk mrickettsk merged commit a3b4335 into main Jul 8, 2024
3 checks passed
@mrickettsk mrickettsk deleted the feat/test-suite branch July 8, 2024 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants