Improved image creation #1002

Open
10 of 14 tasks
manthey opened this issue Nov 29, 2022 · 4 comments

Comments

@manthey
Member

manthey commented Nov 29, 2022

Currently, the vips source adds a new method that, when used with addTile, can take a series of images (PIL, numpy, or vips) and output a file in a few formats. The addTile method specifies where in the single-frame image the new data is located and allows for masking the data as it is added.

We want to expand this capability:

  • Support arbitrary data axes to have multiple frames. The base image can have different sample depths (currently L, LA, RGB, RGBA, but it would be nice to support arbitrary hyperspectral sample channels OR indexed data). Frames could be arbitrary OR could have specific axes: C (channel, which should have channel names as optional metadata), Z (vertical Z stack), T (time), XY (also called P or P and Q for physical sample position). We don't currently handle other axes in any source, but having arbitrary axes would make reading things like netcdf more obvious.
  • Support arbitrary numpy dtypes for the data
  • Support a variety of inputs for data and masks: PIL, numpy arrays, vips images
  • Set a variety of metadata: pixel scale (see In zarr sink, set mm_x, mm_y #1482).
  • Set image description (see In zarr sink, set image description #1483)
  • Set channel names (In zarr sink, set channel names #1484)
  • Support cropping the data.
  • Support padding the data to different sizes. We might want to specify a default color for all channels.
  • Add data at different scales (e.g., at lower resolutions than the base). This could do a nearest neighbor mapping to the base resolution, but we might want to eventually add an option for how the data is scaled. The general form of this would be to accept an affine transform.
  • For geospatial data, specify a base projection and corner locations in some manner, and add an optional different projection when adding data.
  • Have entry-point based file output plugins for this.
  • Add associated images (In zarr sink, support adding associated images #1485)
  • Add arbitrary additional metadata.
  • Allow new images to have an optional path so that they could be created over multiple program runs. This would require having a close method (preferably implicit) that saves enough state to resume image creation (Support multiprocessing through the zarr sink #1487)
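
As an illustration of two of the items above (nearest neighbor mapping of lower-resolution data to the base resolution, and padding with a default color), here is a minimal sketch. It is not large_image code: the helper names `upscale_nearest` and `pad_to` are hypothetical, plain nested lists stand in for numpy arrays, and only a single channel with an integer scale factor is handled.

```python
def upscale_nearest(data, factor):
    """Map low-resolution data to the base resolution by nearest neighbor.

    Hypothetical helper: each sample is repeated `factor` times along
    both axes.  `data` is a list of rows.
    """
    if factor < 1:
        raise ValueError('factor must be a positive integer')
    return [
        [value for value in row for _ in range(factor)]
        for row in data
        for _ in range(factor)
    ]


def pad_to(data, width, height, fill=0):
    """Pad data on the right and bottom with a default value.

    Hypothetical helper: it only pads; data that is already larger than
    the requested size is returned unchanged in that dimension.
    """
    padded = [list(row) + [fill] * (width - len(row)) for row in data]
    padded += [[fill] * width for _ in range(height - len(padded))]
    return padded


# A 2x2 block added at half the base resolution, then padded to 5x5.
low_res = [[1, 2], [3, 4]]
base_res = upscale_nearest(low_res, 2)
padded = pad_to(base_res, 5, 5, fill=0)
```

An affine transform, as suggested in the list, would generalize this: the integer factor becomes the scale part of the transform, and the sampling rule (nearest neighbor or something else) becomes an option.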

One method of storing this data internally would be an hdf5 file via the h5py module. We could make a new hdf5 source (limited to just files we create) to allow reading the data as it is being created.

We always want the property that the last data added "wins", overwriting any existing data at its location. If the added data has a mask, it should only change the appropriate locations. If it has an alpha channel, this should be applied as it is added. As part of this, we may want to have a default alpha channel that is transparent until any data is added. Note that if data is added to the same location multiple times with alpha values that are 0 < a < 1, we need to make sure that they mix in the expected proportion.
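
A minimal sketch of the "last data added wins" rule follows. It assumes that "the expected proportion" means the standard Porter-Duff "over" operator, which the issue does not state explicitly. The names `new_canvas` and `add_tile` are hypothetical and are not the large_image addTile implementation; pixels are `[value, alpha]` pairs in the range 0 to 1, stored in plain lists rather than numpy arrays.

```python
def new_canvas(width, height):
    """Create a canvas that is fully transparent until data is added."""
    return [[[0.0, 0.0] for _ in range(width)] for _ in range(height)]


def add_tile(canvas, tile, x, y, mask=None):
    """Composite tile onto canvas in place at offset (x, y).

    Hypothetical helper.  Each pixel is [value, alpha].  If mask is
    given, it is a list of rows of booleans shaped like tile; pixels
    where the mask is False are left untouched.
    """
    for row_idx, row in enumerate(tile):
        for col_idx, (src_value, src_alpha) in enumerate(row):
            if mask is not None and not mask[row_idx][col_idx]:
                continue
            dst = canvas[y + row_idx][x + col_idx]
            dst_value, dst_alpha = dst
            # "Over" operator: the new data sits on top of the old data.
            out_alpha = src_alpha + dst_alpha * (1 - src_alpha)
            if out_alpha == 0:
                dst[0], dst[1] = 0.0, 0.0
                continue
            dst[0] = (src_value * src_alpha
                      + dst_value * dst_alpha * (1 - src_alpha)) / out_alpha
            dst[1] = out_alpha


canvas = new_canvas(2, 1)
add_tile(canvas, [[[1.0, 0.5]]], 0, 0)  # first write, half transparent
add_tile(canvas, [[[0.0, 0.5]]], 0, 0)  # second write at the same location
# The newer sample now contributes 2/3 of the value and the older one 1/3,
# and the accumulated alpha is 0.75.
```

With an opaque tile (alpha of 1) the formula reduces to a plain overwrite, which matches the "last data added wins" property.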

As a starting point, we should create tests that will exercise the existing features and then expand the tests as we add features.

@manthey
Member Author

manthey commented Nov 29, 2022

@annehaley Initially, what I'd like to see is a test that can be expanded to include features as we add them. There are a bunch of tests in https://github.com/girder/large_image/blob/master/test/test_source_vips.py, so the first test might just be a variant where we generate synthetic data rather than reading from an existing source. Alternately, using the test tile source which can generate multi-frame fractal data might work, too, though that might be limited because it won't be a range of numpy dtypes.
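
One possible shape for such a synthetic-data test is sketched below. It is only an illustration: `synthetic_tile` and `assemble` are hypothetical names, and the list-based `assemble` function stands in for whatever image-creation code is under test (for example, a loop of addTile calls). The idea is that every sample is a deterministic function of its global position, so the assembled result can be checked without a reference file.

```python
def synthetic_tile(x0, y0, width, height, modulus=256):
    """Return a tile whose sample at global (x, y) is (x + 3 * y) % modulus."""
    return [
        [(x + 3 * y) % modulus for x in range(x0, x0 + width)]
        for y in range(y0, y0 + height)
    ]


def assemble(image_width, image_height, tile_size):
    """Stand-in for the code under test: build an image tile by tile."""
    image = [[None] * image_width for _ in range(image_height)]
    for y0 in range(0, image_height, tile_size):
        for x0 in range(0, image_width, tile_size):
            # Edge tiles may be smaller than tile_size.
            width = min(tile_size, image_width - x0)
            height = min(tile_size, image_height - y0)
            tile = synthetic_tile(x0, y0, width, height)
            for dy, row in enumerate(tile):
                image[y0 + dy][x0:x0 + width] = row
    return image


def test_assembled_image_matches_pattern():
    # The tiled result should equal the pattern generated in one piece.
    assert assemble(10, 7, 4) == synthetic_tile(0, 0, 10, 7)
```

The same pattern function could be evaluated with different sample types to cover a range of numpy dtypes once those are supported.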

@annehaley
Collaborator

Should all of this work go with the vips source or will it likely be its own source?

@manthey
Member Author

manthey commented Nov 29, 2022

Should all of this work go with the vips source or will it likely be its own source?

Another test could go in the test_source_vips.py file, but I think we will end up creating a new hdf5 source and test files for that, since vips won't efficiently handle creating multi-frame images of arbitrary dimensions.

You could also create a new test file with an appropriate name (test_image_creation.py?).

@manthey
Member Author

manthey commented Mar 21, 2024

A lot of this has been added with #1446.
