Zarr streaming e2e test #8137

Open. Wants to merge 17 commits into master.

Conversation

frcroth (Member)

@frcroth commented Oct 21, 2024

TODOs:

  • Reset changes to tools/test.sh
  • Zarr 3 tests (zarr json and cube)

Issues:

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced error messages for user authentication, organization management, and dataset handling, providing clearer user feedback.
    • Introduced new end-to-end tests for Zarr data formats, expanding dataset functionality coverage.
  • Bug Fixes

    • Clarified error messages related to dataset access and user management.
  • Chores

    • Updated .gitignore to track the test-dataset.zip file for important test data.
    • Modified test setup to ensure necessary datasets are prepared before running tests, enhancing robustness.
    • Adjusted volume mapping for the e2e-tests service in the configuration.
    • Enhanced CircleCI configuration to ensure the directory for test data is created with appropriate permissions before tests run.

@frcroth changed the title from "Add zip test dataset and extract it in e2e test setup" to "Zarr streaming e2e test" on Oct 21, 2024
@frcroth requested a review from fm3 on October 21, 2024 15:18
@frcroth marked this pull request as ready for review on October 21, 2024 15:24

@fm3 (Member) left a comment

Nice stuff! I added a couple of questions

frontend/javascripts/test/e2e-setup.ts (2 outdated review comments, resolved)
@@ -118,6 +118,8 @@ jobs:
- run:
name: Run end-to-end tests
command: |
mkdir -p "binaryData/Connectomics department"
chmod 777 "binaryData/Connectomics department"
Member

below it looks like the new test actually uses Organization_X. Why is it necessary to create the Connectomics department directory here?

Member Author

I don't know why, but the volumes for the docker compose are arranged in such a way:

- ./binaryData/Connectomics department:/home/${USER_NAME:-sbt-user}/webknossos/binaryData/Organization_X

Member

Hm, I think that is a legacy artifact, from when things were named differently. Could you name it Organization_X there too and see if that breaks anything?
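
For illustration, the renamed mapping in docker-compose.yml would then presumably read as follows (the host-side directory name is an assumption; only the rename to Organization_X is confirmed by the change summary further down):

- ./binaryData/Organization_X:/home/${USER_NAME:-sbt-user}/webknossos/binaryData/Organization_X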

test/e2e/End2EndSpec.scala (2 outdated review comments, resolved)
@frcroth requested a review from fm3 on October 23, 2024 08:29

coderabbitai bot commented Nov 4, 2024

Walkthrough

The changes encompass modifications to user-facing error messages and status notifications in the conf/messages file, updates to the docker-compose.yml file for service volume mapping, enhancements to end-to-end tests in datasets.e2e.ts and End2EndSpec.scala, and adjustments in the e2e-setup.ts file for handling volatile values. A new entry was also added to the .gitignore file to track a specific zip file. These updates collectively improve error clarity, testing robustness, and configuration management.

Changes

File: Change Summary
conf/messages: Updated and added error messages related to user authentication, organization management, dataset handling, and annotations; removed messages for the braintracing service.
docker-compose.yml: Changed volume mapping for the e2e-tests service from Connectomics department to Organization_X.
frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts: Added replaceVolatileValues import; modified test cases to use this function; introduced new test cases for Zarr streaming.
frontend/javascripts/test/e2e-setup.ts: Added sortingKey to volatileKeys; updated writeTypeCheckingFile function; modified global fetch function for URL handling.
test/dataset/.gitignore: Added entry to track test-dataset.zip.
test/e2e/End2EndSpec.scala: Added ensureTestDataset method to prepare test datasets; modified test case to invoke this method.

Assessment against linked issues

Objective Addressed Explanation
Add end to end tests for zarr streaming (#7997)
Ensure robust testing for dataset functionalities

🐇 In the meadow, we hop and play,
With messages clear, come what may!
Zarr streaming tests now take their flight,
Ensuring our code is robust and bright.
So let’s celebrate with a joyful cheer,
For changes that bring us all good cheer! 🌼✨



@coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
test/e2e/End2EndSpec.scala (2)

52-53: Consider making paths configurable

The hardcoded paths for test dataset and data directory could make the tests brittle. Consider:

  1. Moving these to configuration
  2. Making them relative to the project root
-    val testDatasetPath = "test/dataset/test-dataset.zip"
-    val dataDirectory = new File("binaryData/Organization_X")
+    val testDatasetPath = config.getString("test.dataset.path")
+    val dataDirectory = new File(config.getString("test.data.directory"))

51-71: Consider adding test data cleanup

The method creates test data but doesn't handle cleanup. This could lead to accumulated test data over multiple test runs.

Consider adding a cleanup method and using it in an after block:

private def cleanupTestDataset(): Unit = {
  val dataDirectory = new File("binaryData/Organization_X/test-dataset")
  if (dataDirectory.exists()) {
    // File#delete only removes empty directories, so remove the contents recursively first
    def deleteRecursively(f: File): Unit = {
      Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
      f.delete()
    }
    deleteRecursively(dataDirectory)
  }
}

// Use in an after block
override def after = {
  cleanupTestDataset()
  super.after
}
frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (1)

73-110: Consider adding data content validation

While snapshot testing is good for regression testing, consider adding explicit assertions about the data content to catch potential data corruption issues early.

For example:

 test("Zarr streaming", async (t) => {
   const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
     headers: new Headers(),
   });
   const zattrs = await zattrsResp.text();
+  const zattrsJson = JSON.parse(zattrs);
+  t.truthy(zattrsJson.dimension_separator, "Missing dimension_separator in .zattrs");
   t.snapshot(zattrs);

   const rawDataResponse = await fetch(
     "/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
   const bytes = await rawDataResponse.arrayBuffer();
+  t.true(bytes.byteLength > 0, "Raw data should not be empty");
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
   t.snapshot(base64);
 });
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f937be0 and 459ff77.

⛔ Files ignored due to path filters (2)
  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.snap is excluded by !**/*.snap
  • test/dataset/test-dataset.zip is excluded by !**/*.zip
📒 Files selected for processing (6)
  • conf/messages (0 hunks)
  • docker-compose.yml (1 hunks)
  • frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (4 hunks)
  • frontend/javascripts/test/e2e-setup.ts (2 hunks)
  • test/dataset/.gitignore (1 hunks)
  • test/e2e/End2EndSpec.scala (4 hunks)
💤 Files with no reviewable changes (1)
  • conf/messages
✅ Files skipped from review due to trivial changes (2)
  • docker-compose.yml
  • test/dataset/.gitignore
🔇 Additional comments (7)
test/e2e/End2EndSpec.scala (3)

3-3: LGTM: Import added for ZipIO utility

The import is correctly added to support the use of ZipIO.unzipToFolder as suggested in previous reviews.


12-13: LGTM: File handling imports added

The imports for File and Paths are appropriate for the file operations being performed.


33-33: LGTM: Test data preparation added

The ensureTestDataset() call is correctly placed to prepare test data before running the tests.

frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts (3)

2-8: LGTM: Import and setup changes look good

The addition of replaceVolatileValues and the dataset check trigger in the setup enhance test stability by handling volatile values and ensuring proper dataset initialization.

Also applies to: 24-24


39-39: LGTM: Consistent handling of volatile values

The systematic application of replaceVolatileValues across all snapshot assertions ensures consistent and reliable test outcomes by normalizing volatile data.

Also applies to: 44-44, 51-51, 59-59
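
As a rough usage sketch of this pattern only (import path, endpoint, and test title are assumptions for illustration; the actual helpers and test bodies live in e2e-setup.ts and datasets.e2e.ts):

import test from "ava";
// Hypothetical import path; the real location of the helper may differ.
import { replaceVolatileValues } from "../e2e-setup";

test("dataset listing snapshot (sketch)", async (t) => {
  // Hypothetical endpoint; the real tests use the project's own API utilities.
  const response = await fetch("/api/datasets", { headers: new Headers() });
  const datasets = await response.json();
  // Volatile fields (ids, timestamps, sortingKey, ...) are normalized before the snapshot comparison.
  t.snapshot(replaceVolatileValues(datasets));
});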


91-110: 🛠️ Refactor suggestion

Add error handling and verify experimental endpoint usage

The test implementation needs error handling, and we should verify the experimental endpoint usage.

  1. Add error handling:
 test("Zarr 3 streaming", async (t) => {
   const zarrJsonResp = await fetch(
     "/data/zarr3_experimental/Organization_X/test-dataset/segmentation/zarr.json",
     {
       headers: new Headers(),
     },
   );
+  t.true(zarrJsonResp.ok, `Failed to fetch zarr.json: ${zarrJsonResp.statusText}`);
   const zarrJson = await zarrJsonResp.text();
   t.snapshot(zarrJson);

   const rawDataResponse = await fetch(
     "/data/zarr3_experimental/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
+  t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
   const bytes = await rawDataResponse.arrayBuffer();
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
   t.snapshot(base64);
 });

Let's verify the experimental endpoint usage:

✅ Verification successful

Let me check the implementation details of these files to understand how the experimental endpoint is used.


Experimental endpoint usage is consistent with backend implementation

The experimental endpoint usage in the test is aligned with the backend implementation in DSRemoteTracingstoreClient.scala. The backend has proper handling for both Zarr v2 and experimental Zarr v3 endpoints through the getZarrVersionDependantSubPath helper. The test is correctly exercising this experimental endpoint.

However, the error handling suggestion in the review comment remains valid:

  • Add response status checks using t.true(response.ok) for both fetch calls
  • Include status text in error messages for better debugging
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other occurrences of the experimental endpoint
# to ensure consistent usage across the codebase
rg -l "zarr3_experimental" --type ts --type scala

Length of output: 220


Script:

#!/bin/bash
# Check the implementation in both files
echo "=== Frontend Test Implementation ==="
rg -A 10 "zarr3_experimental" frontend/javascripts/test/backend-snapshot-tests/datasets.e2e.ts

echo -e "\n=== Scala Implementation ==="
rg -A 10 "zarr3_experimental" webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/DSRemoteTracingstoreClient.scala

Length of output: 2631

frontend/javascripts/test/e2e-setup.ts (1)

134-136: LGTM! Import statement addition improves type safety.

The addition of the type import statement in the generated TypeScript file ensures proper type checking and follows TypeScript best practices.

Comment on lines +62 to +70
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
ZipIO.unzipToFolder(
testDatasetZip,
Paths.get(dataDirectory.toPath.toString, "test-dataset"),
includeHiddenFiles = true,
hiddenFilesWhitelist = List(),
truncateCommonPrefix = true,
excludeFromPrefix = None
)

🛠️ Refactor suggestion

Improve error handling in unzip operation

The unzip operation could fail silently. Consider:

  1. Adding error handling for the unzip operation
  2. Validating the unzipped content
-    if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
-      ZipIO.unzipToFolder(
+    if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
+      try {
+        ZipIO.unzipToFolder(
           testDatasetZip,
           Paths.get(dataDirectory.toPath.toString, "test-dataset"),
           includeHiddenFiles = true,
           hiddenFilesWhitelist = List(),
           truncateCommonPrefix = true,
           excludeFromPrefix = None
-      )
+        )
+        // Validate unzipped content
+        if (!new File(dataDirectory, "test-dataset").exists()) {
+          throw new Exception("Failed to unzip test dataset properly")
+        }
+      } catch {
+        case e: Exception => 
+          logger.error("Failed to prepare test dataset", e)
+          throw e
+      }
+    }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +73 to +89
test("Zarr streaming", async (t) => {
const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
headers: new Headers(),
});
const zattrs = await zattrsResp.text();
t.snapshot(zattrs);

const rawDataResponse = await fetch(
"/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
{
headers: new Headers(),
},
);
const bytes = await rawDataResponse.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
t.snapshot(base64);
});

🛠️ Refactor suggestion

Add error handling for fetch requests

While the test implementation is good, it should handle potential network failures or invalid responses.

Consider adding error handling:

 test("Zarr streaming", async (t) => {
   const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
     headers: new Headers(),
   });
+  t.true(zattrsResp.ok, `Failed to fetch .zattrs: ${zattrsResp.statusText}`);
   const zattrs = await zattrsResp.text();
   t.snapshot(zattrs);

   const rawDataResponse = await fetch(
     "/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
     {
       headers: new Headers(),
     },
   );
+  t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
   const bytes = await rawDataResponse.arrayBuffer();
   const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
   t.snapshot(base64);
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
test("Zarr streaming", async (t) => {
const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
headers: new Headers(),
});
const zattrs = await zattrsResp.text();
t.snapshot(zattrs);
const rawDataResponse = await fetch(
"/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
{
headers: new Headers(),
},
);
const bytes = await rawDataResponse.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
t.snapshot(base64);
});
test("Zarr streaming", async (t) => {
const zattrsResp = await fetch("/data/zarr/Organization_X/test-dataset/segmentation/.zattrs", {
headers: new Headers(),
});
t.true(zattrsResp.ok, `Failed to fetch .zattrs: ${zattrsResp.statusText}`);
const zattrs = await zattrsResp.text();
t.snapshot(zattrs);
const rawDataResponse = await fetch(
"/data/zarr/Organization_X/test-dataset/segmentation/1/0.1.1.0",
{
headers: new Headers(),
},
);
t.true(rawDataResponse.ok, `Failed to fetch raw data: ${rawDataResponse.statusText}`);
const bytes = await rawDataResponse.arrayBuffer();
const base64 = btoa(String.fromCharCode(...new Uint8Array(bytes)));
t.snapshot(base64);
});

@@ -39,6 +39,7 @@ const volatileKeys: Array<string | number | symbol> = [
"lastActivity",
"tracingTime",
"tracingId",
"sortingKey",

💡 Codebase verification

Snapshots need to be updated due to "sortingKey" changes

The codebase contains snapshot files that reference "sortingKey", specifically in datasets.e2e.js.md. Adding "sortingKey" to volatileKeys means these snapshots need to be recomputed as they will now handle this property differently during testing.

  • frontend/javascripts/test/snapshots/public-test/test-bundle/test/backend-snapshot-tests/datasets.e2e.js.md contains "sortingKey" references that will be affected
  • Other snapshot files may also need updates when tests are re-run with the new volatileKeys configuration
🔗 Analysis chain

Verify snapshot recomputation after adding "sortingKey".

The addition of "sortingKey" to volatileKeys is appropriate. As per the comment above the array, snapshots need to be recomputed when this array is modified.

Run the following script to check if there are any snapshot files that need updating:


Let me check the content of these snapshot files to see if they contain any instances of "sortingKey" that might be affected.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for snapshot files that might need updating
# Look for snapshot files in the test directory
fd -e snap -e snapshot . frontend/javascripts/test

Length of output: 1460


Script:

#!/bin/bash
# Search for "sortingKey" in snapshot files to understand the impact
rg "sortingKey" frontend/javascripts/test/snapshots/

Length of output: 193

@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (4)
test/e2e/End2EndSpec.scala (4)

51-56: Consider making paths configurable

The hardcoded paths "test/dataset/test-dataset.zip" and "binaryData/Organization_X" should ideally be configuration values for better flexibility and maintainability.

+ private val TestDatasetPath = "test/dataset/test-dataset.zip"
+ private val DataDirectoryPath = "binaryData/Organization_X"

  private def ensureTestDataset(): Unit = {
-   val testDatasetPath = "test/dataset/test-dataset.zip"
-   val dataDirectory = new File("binaryData/Organization_X")
+   val testDatasetPath = TestDatasetPath
+   val dataDirectory = new File(DataDirectoryPath)

51-86: Consider breaking down the method into smaller, focused functions

The ensureTestDataset method has multiple responsibilities. Consider breaking it down into smaller, focused functions for better maintainability and testing:

  1. validatePaths
  2. createDirectories
  3. unzipDataset
  4. validateDataset
private def validatePaths(testDatasetZip: File): Unit = {
  if (!testDatasetZip.exists()) {
    throw new Exception("Test dataset zip file does not exist.")
  }
}

private def createDirectories(dataDirectory: File): Unit = {
  if (!dataDirectory.exists()) {
    dataDirectory.mkdirs()
  }
}

private def unzipDataset(testDatasetZip: File, dataDirectory: File): Unit = {
  if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
    ZipIO.unzipToFolder(
      testDatasetZip,
      Paths.get(dataDirectory.toPath.toString, "test-dataset"),
      includeHiddenFiles = true,
      hiddenFilesWhitelist = List(),
      truncateCommonPrefix = true,
      excludeFromPrefix = None
    )
  }
}

private def validateDataset(dataDirectory: File): Unit = {
  // ... validation logic ...
}

private def ensureTestDataset(): Unit = {
  val testDatasetZip = new File(TestDatasetPath)
  val dataDirectory = new File(DataDirectoryPath)
  
  validatePaths(testDatasetZip)
  createDirectories(dataDirectory)
  unzipDataset(testDatasetZip, dataDirectory)
  validateDataset(dataDirectory)
}

73-79: Improve error messages with more context

The error messages could be more descriptive to help with debugging.

-     throw new Exception("Test dataset was not unzipped successfully.")
+     throw new Exception(s"Failed to unzip test dataset to ${dataDirectory.getAbsolutePath}/test-dataset")
-     throw new Exception("Required file does not exist.")
+     throw new Exception(s"Required file ${testFile.getAbsolutePath} does not exist")

62-70: Consider adding logging for better debugging

Add logging statements to track the progress of dataset preparation, especially during CI/CD runs.

     if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
+      logger.info(s"Unzipping test dataset to ${dataDirectory.getAbsolutePath}/test-dataset")
       ZipIO.unzipToFolder(
         testDatasetZip,
         Paths.get(dataDirectory.toPath.toString, "test-dataset"),
         includeHiddenFiles = true,
         hiddenFilesWhitelist = List(),
         truncateCommonPrefix = true,
         excludeFromPrefix = None
       )
+      logger.info("Test dataset unzipped successfully")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 459ff77 and 152acca.

📒 Files selected for processing (1)
  • test/e2e/End2EndSpec.scala (4 hunks)
🔇 Additional comments (2)
test/e2e/End2EndSpec.scala (2)

3-3: LGTM: Imports are appropriate

The new imports are well-scoped and necessary for the file operations being performed.

Also applies to: 12-13


33-34: LGTM: Proper test data initialization

The dataset preparation is correctly placed before the test execution, ensuring prerequisites are met.

Comment on lines +80 to +85
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent = try testFileSource.mkString
finally testFileSource.close()
if (testFileContent.isEmpty) {
throw new Exception("Required file is empty.")
}

🛠️ Refactor suggestion

Use try-with-resources for file handling

The file resource handling could be improved using Scala's Using to ensure proper resource cleanup.

-   val testFileSource = scala.io.Source.fromFile(testFile)
-   val testFileContent = try testFileSource.mkString
-   finally testFileSource.close()
-   if (testFileContent.isEmpty) {
-     throw new Exception("Required file is empty.")
-   }
+   scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
+     val content = source.mkString
+     if (content.isEmpty) {
+       throw new Exception("Required file is empty.")
+     }
+     content
+   }.getOrElse(throw new Exception("Failed to read test file"))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent = try testFileSource.mkString
finally testFileSource.close()
if (testFileContent.isEmpty) {
throw new Exception("Required file is empty.")
}
scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
val content = source.mkString
if (content.isEmpty) {
throw new Exception("Required file is empty.")
}
content
}.getOrElse(throw new Exception("Failed to read test file"))

@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
test/e2e/End2EndSpec.scala (1)

51-87: Enhance error messages with more context.

The error messages could be more descriptive to help with debugging test failures.

-      throw new Exception("Test dataset zip file does not exist.")
+      throw new Exception(s"Test dataset zip file does not exist at path: $testDatasetPath")

-      throw new Exception("Test dataset was not unzipped successfully.")
+      throw new Exception(s"Failed to unzip test dataset to: ${dataDirectory.getAbsolutePath}/test-dataset")

-      throw new Exception("Required file does not exist.")
+      throw new Exception(s"Required file does not exist: ${testFile.getAbsolutePath}")

-      throw new Exception("Required file is empty.")
+      throw new Exception(s"Required file is empty: ${testFile.getAbsolutePath}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 152acca and 1d72230.

📒 Files selected for processing (1)
  • test/e2e/End2EndSpec.scala (4 hunks)
🔇 Additional comments (5)
test/e2e/End2EndSpec.scala (5)

3-3: LGTM: Imports are appropriate for the new functionality.

The added imports for ZipIO, File, and Paths are necessary for the dataset management functionality being implemented.

Also applies to: 12-13


33-34: LGTM: Proper test setup sequence.

The ensureTestDataset() call is correctly placed before the HTTP request and webdriver tests, ensuring proper test data initialization.


63-71: Improve error handling in unzip operation.

The unzip operation needs better error handling and validation.

The previous review comment about improving error handling in the unzip operation is still valid. Please implement the suggested changes for proper error handling and validation.


81-86: Use Scala's Using for resource management.

The file reading operation should use Scala's Using for proper resource management.

The previous review comment about using try-with-resources is still valid. Please implement the suggested changes for better resource management.


51-87: Verify test dataset structure in the codebase.

Let's verify the test dataset structure to ensure it contains the necessary Zarr files for streaming tests.

✅ Verification successful

Let me gather more information about the test dataset structure by examining the test files and related configurations.

This will help us understand the expected structure and validate if Zarr-related test configurations are properly set up.


Based on the gathered information, I can now provide the final response:

Test dataset structure verification is appropriate for its purpose

The test dataset setup in End2EndSpec.scala is correctly structured for its intended use. While it doesn't explicitly contain Zarr files, this is expected as:

  • The file datasource-properties.json is the primary requirement for the test setup
  • Zarr-specific tests are separately handled in Zarr3TestSuite.scala with mock data
  • The codebase shows extensive Zarr support with proper implementation files, but streaming tests don't necessarily require actual Zarr files in the test dataset
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test dataset structure and content

# Check if the test dataset zip exists and examine its structure
if [ -f "test/dataset/test-dataset.zip" ]; then
    echo "Test dataset zip contents:"
    unzip -l test/dataset/test-dataset.zip
    
    # Look for Zarr-related files
    echo -e "\nSearching for Zarr files in the repository:"
    rg -l "\.zarr" || echo "No .zarr files found"
    rg -l "zarr.json" || echo "No zarr.json files found"
fi

Length of output: 4562


Script:

#!/bin/bash
# Check test files and configurations related to the test dataset

# Look for test dataset references and configurations
echo "Checking test dataset references:"
rg -l "test-dataset" test/

# Check content of test configuration files
echo -e "\nChecking Zarr3TestSuite.scala:"
cat test/backend/Zarr3TestSuite.scala

# Look for dataset properties file references
echo -e "\nChecking for dataset properties references:"
rg "datasource-properties.json" test/

Length of output: 2508

if (!dataDirectory.exists()) {
dataDirectory.mkdirs()
}
"chmod 777 binaryData/Organization_X".run()

⚠️ Potential issue

Replace shell command with Java API for permission management.

Using shell commands for permission management is platform-dependent and potentially unsafe. Consider using Java's NIO API instead.

-    "chmod 777 binaryData/Organization_X".run()
+    dataDirectory.setReadable(true, false)
+    dataDirectory.setWritable(true, false)
+    dataDirectory.setExecutable(true, false)
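
The suggestion above uses the java.io.File permission setters; if the NIO API mentioned in the comment text is preferred instead, a minimal sketch could look like this (assuming a POSIX filesystem, as in the CI environment):

import java.nio.file.{Files, Paths}
import java.nio.file.attribute.PosixFilePermissions

// Equivalent of chmod 777 via java.nio; throws UnsupportedOperationException on non-POSIX filesystems.
Files.setPosixFilePermissions(
  Paths.get("binaryData/Organization_X"),
  PosixFilePermissions.fromString("rwxrwxrwx")
)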

Committable suggestion skipped: line range outside the PR's diff.

@frcroth requested review from fm3 and removed request for fm3 on November 4, 2024 10:38

@coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
test/e2e/End2EndSpec.scala (1)

51-70: Consider enhancing error handling and logging

While the implementation is generally good, consider adding more detailed error messages and logging:

  1. Add logging for directory creation and zip extraction steps
  2. Include file paths in error messages for better debugging
 private def ensureTestDataset(): Unit = {
   val testDatasetPath = "test/dataset/test-dataset.zip"
   val dataDirectory = new File("binaryData/Organization_X")
   if (!dataDirectory.exists()) {
+    logger.info(s"Creating directory: ${dataDirectory.getAbsolutePath}")
     dataDirectory.mkdirs()
   }
   val testDatasetZip = new File(testDatasetPath)
   if (!testDatasetZip.exists()) {
-    throw new Exception("Test dataset zip file does not exist.")
+    throw new Exception(s"Test dataset zip file does not exist at: ${testDatasetZip.getAbsolutePath}")
   }
   // Skip unzipping if the test dataset is already present
-  if (!dataDirectory.listFiles().exists(_.getName == "test-dataset"))
+  if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
+    logger.info(s"Unzipping test dataset from ${testDatasetZip.getAbsolutePath}")
     ZipIO.unzipToFolder(
       testDatasetZip,
       Paths.get(dataDirectory.toPath.toString, "test-dataset"),
       includeHiddenFiles = true,
       hiddenFilesWhitelist = List(),
       truncateCommonPrefix = true,
       excludeFromPrefix = None
     )
+  }
.circleci/config.yml (1)

121-121: Consider using more restrictive permissions.

While 777 permissions work in CI, they are overly permissive. Consider using more restrictive permissions (e.g., 755) if possible, as this follows security best practices.
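
A hedged sketch of the more restrictive variant for the CircleCI run step (directory name taken from the diff under review; whether 755 is sufficient for the container user has not been verified here):

mkdir -p "binaryData/Organization_X"
chmod 755 "binaryData/Organization_X"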

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1d72230 and 595701e.

📒 Files selected for processing (2)
  • .circleci/config.yml (1 hunks)
  • test/e2e/End2EndSpec.scala (4 hunks)
🔇 Additional comments (3)
test/e2e/End2EndSpec.scala (2)

3-3: LGTM: Import statements are appropriate

The added imports are necessary for file operations and ZIP handling in the new dataset preparation functionality.

Also applies to: 12-13


33-34: LGTM: Proper test data initialization

The dataset preparation is correctly placed before the HTTP request and webdriver tests, ensuring test data availability.

.circleci/config.yml (1)

121-121: Verify the directory naming consistency.

Based on the previous review discussion, confirm if using Organization_X here aligns with the rest of the codebase, as there were mentions of legacy naming patterns (Connectomics).

✅ Verification successful

Directory naming Organization_X is consistent with the codebase

The naming pattern Organization_X is consistently used across the configuration files, specifically in:

  • .circleci/config.yml for directory creation
  • docker-compose.yml for volume mounting

There are no references to legacy naming patterns like "Connectomics", and the same directory structure is maintained throughout the configuration files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining references to Connectomics in the codebase
# to ensure naming consistency

echo "Searching for 'Connectomics' references in docker-compose.yml and e2e test files..."
rg -i "connectomics" docker-compose.yml "test/**/*.{ts,scala}"

Length of output: 306


Script:

#!/bin/bash
# Let's do a broader search for both Organization_X and any potential legacy naming patterns
echo "Searching for Organization_X references..."
rg -i "organization_x" --type-add 'config:*.{yml,yaml,json}' --type config

echo -e "\nSearching for directory structure..."
fd -t d -d 3 "Organization_X\|organization_x\|connectomics" -x ls -ld {}

echo -e "\nChecking CircleCI config for similar directory creation patterns..."
rg "mkdir.*binary" .circleci/config.yml

Length of output: 735

Comment on lines +72 to +85
// Test if the dataset was unzipped successfully
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
throw new Exception("Test dataset was not unzipped successfully.")
}
val testFile = new File(dataDirectory, "test-dataset/datasource-properties.json")
if (!testFile.exists()) {
throw new Exception("Required file does not exist.")
}
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent = try testFileSource.mkString
finally testFileSource.close()
if (testFileContent.isEmpty) {
throw new Exception("Required file is empty.")
}

🛠️ Refactor suggestion

Improve resource management and validation robustness

The current implementation has two areas for improvement:

  1. Use Scala's Using for better resource management (as suggested in previous review)
  2. Add more comprehensive validation of the unzipped content
-    val testFileSource = scala.io.Source.fromFile(testFile)
-    val testFileContent = try testFileSource.mkString
-    finally testFileSource.close()
-    if (testFileContent.isEmpty) {
-      throw new Exception("Required file is empty.")
-    }
+    val testFileContent = scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
+      val content = source.mkString
+      if (content.isEmpty) {
+        throw new Exception(s"Required file is empty: ${testFile.getAbsolutePath}")
+      }
+      // Additional validation could be added here
+      if (!content.contains("\"dataFormat\"")) {
+        throw new Exception(s"Invalid datasource properties file format: ${testFile.getAbsolutePath}")
+      }
+      content
+    }.getOrElse(throw new Exception(s"Failed to read test file: ${testFile.getAbsolutePath}"))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Test if the dataset was unzipped successfully
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
throw new Exception("Test dataset was not unzipped successfully.")
}
val testFile = new File(dataDirectory, "test-dataset/datasource-properties.json")
if (!testFile.exists()) {
throw new Exception("Required file does not exist.")
}
val testFileSource = scala.io.Source.fromFile(testFile)
val testFileContent = try testFileSource.mkString
finally testFileSource.close()
if (testFileContent.isEmpty) {
throw new Exception("Required file is empty.")
}
// Test if the dataset was unzipped successfully
if (!dataDirectory.listFiles().exists(_.getName == "test-dataset")) {
throw new Exception("Test dataset was not unzipped successfully.")
}
val testFile = new File(dataDirectory, "test-dataset/datasource-properties.json")
if (!testFile.exists()) {
throw new Exception("Required file does not exist.")
}
val testFileContent = scala.util.Using(scala.io.Source.fromFile(testFile)) { source =>
val content = source.mkString
if (content.isEmpty) {
throw new Exception(s"Required file is empty: ${testFile.getAbsolutePath}")
}
// Additional validation could be added here
if (!content.contains("\"dataFormat\"")) {
throw new Exception(s"Invalid datasource properties file format: ${testFile.getAbsolutePath}")
}
content
}.getOrElse(throw new Exception(s"Failed to read test file: ${testFile.getAbsolutePath}"))


Successfully merging this pull request may close these issues.

Add end to end tests for zarr streaming
2 participants