File size ranges from ... to 1 GB
- Deal with the differing z-axes
- We need to investigate whether the scale changes with differing z-values
Not care (ignore the differing z-axes)
- Pros
  - Less work and complexity
  - Easy to try out
- Cons
  - Probably harder to learn because of the differing z-axis spacing
Use 2D slices
- Pros
  - Easy to do
- Cons
  - We lose a lot of 3D context between slices
2.5D: We slice the CT scan along the z-axis into slabs of constant depth, e.g. 10 slices. Each slab is run as a 2D image through a UNet, using the z-axis as the feature (channel) dimension.
The UNet works as usual, with one difference: at the end, it has to output a tensor whose channel dimension covers every slice in the slab (slab depth × number of classes), so each slice gets its own mask.
We can then concatenate the predicted masks along the z-axis to rebuild the full volume.
If the scan depth is not evenly divisible by the slab depth, the last slab has to be padded before slicing.
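A minimal sketch of the slab slicing described above, assuming the volume has shape (depth, H, W); the zero-padding for the remainder is an assumed choice, not a decided one:

```python
import numpy as np

def slice_into_slabs(volume: np.ndarray, slab_depth: int = 10) -> np.ndarray:
    """Split a (depth, H, W) CT volume into constant-depth slabs.

    Each slab's z-axis becomes the channel dimension of one 2D model
    input. If depth is not divisible by slab_depth, the volume is
    zero-padded at the end (an assumption; the padding scheme may differ).
    """
    depth, h, w = volume.shape
    remainder = depth % slab_depth
    if remainder:
        pad = slab_depth - remainder
        volume = np.concatenate(
            [volume, np.zeros((pad, h, w), dtype=volume.dtype)], axis=0
        )
    # (num_slabs, slab_depth, H, W): one 2.5D input per slab
    return volume.reshape(-1, slab_depth, h, w)

def merge_masks(slab_masks: np.ndarray, original_depth: int) -> np.ndarray:
    """Concatenate per-slab masks back into one volume and drop the padding."""
    merged = slab_masks.reshape(-1, *slab_masks.shape[2:])
    return merged[:original_depth]
```

For example, a 23-slice scan with `slab_depth=10` becomes three slabs, and `merge_masks(..., 23)` discards the 7 padded slices at the end.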
# Stuff to do next: Datasets, Dataloaders, Model
UNet, with ResNet blocks as a backbone and attention after every ResNet block
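A rough PyTorch sketch of one such backbone unit. The channel counts and the SE-style channel attention are assumptions for illustration, not the project's actual implementation:

```python
import torch
import torch.nn as nn

class ResNetBlock(nn.Module):
    """Residual block: conv-BN-ReLU-conv-BN plus identity skip."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

class ChannelAttention(nn.Module):
    """SE-style channel attention (assumed variant)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))   # (B, C) global-average descriptor
        return x * w[:, :, None, None]    # rescale each channel

class BackboneUnit(nn.Module):
    """ResNet block followed by attention, as planned for the UNet backbone."""
    def __init__(self, channels: int):
        super().__init__()
        self.block = ResNetBlock(channels)
        self.attn = ChannelAttention(channels)

    def forward(self, x):
        return self.attn(self.block(x))
```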
Remake cache. Incomplete layers:
- attn layers
- bottom
- maybe more
Rebalance dataset sampling
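One way to rebalance is inverse-frequency sample weighting; this sketch assumes per-sample class labels are available, and the weighting scheme itself is an assumption:

```python
import numpy as np

def inverse_frequency_weights(labels: np.ndarray) -> np.ndarray:
    """Per-sample sampling weights: rare classes get drawn more often."""
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes.tolist(), counts.tolist()))
    w = np.array([1.0 / freq[l] for l in labels.tolist()], dtype=np.float64)
    return w / w.sum()  # normalize into a sampling distribution

# These weights can be fed to e.g. torch.utils.data.WeightedRandomSampler.
```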
Check out this paper that does 2.5D segmentation. They use spatial attention, which we are not using. They also apply some more tricks, so look into it.
The model's representation is likely too small to hold class information for every class at each pixel: the last backbone block is three times smaller than the desired output. The net needs a larger representation.
Got rid of resizing due to scans in which the liver takes up the whole image
New approach: for each slice, concatenate the neighboring slices to the current slice along the channel dimension, then predict the segmentation of the middle slice only.
Try using two neighboring slices on each side instead of just one
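A sketch of building that input stack, parameterized so one or two neighbors per side can be tried; clamping out-of-range neighbors to the volume edge is an assumed choice:

```python
import numpy as np

def neighbor_stack(volume: np.ndarray, index: int, k: int = 1) -> np.ndarray:
    """Stack slice `index` with its k neighbors on each side.

    Returns shape (2*k + 1, H, W); the middle channel is the slice
    whose mask gets predicted. Out-of-range neighbors are clamped to
    the first/last slice (an assumption; zero slices would also work).
    """
    depth = volume.shape[0]
    idxs = np.clip(np.arange(index - k, index + k + 1), 0, depth - 1)
    return volume[idxs]
```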
Also add spatial attention, either instead of or alongside channel attention.
How to create gifs with matplotlib: [https://towardsdatascience.com/basics-of-gifs-with-pythons-matplotlib-54dd544b6f30]
Try increasing the number of slices in the 2.5D input
Also try using either SE blocks or CBAM
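A sketch of the spatial-attention half of CBAM, which is the part we are missing so far. This follows the standard CBAM formulation (kernel size 7 is the paper's default), not code from this project:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: a per-pixel 'where to look' mask.

    Pools over the channel axis (avg and max), convolves the resulting
    2-channel map, and rescales the input with a sigmoid mask.
    """
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)          # (B, 1, H, W)
        mx = x.max(dim=1, keepdim=True).values     # (B, 1, H, W)
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask
```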
Postprocessing:
- Remove all but the largest connected component of non-zero mask predictions
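The largest-connected-component step can be sketched with `scipy.ndimage.label`; the default face connectivity is an assumed choice:

```python
import numpy as np
from scipy import ndimage

def keep_largest_component(mask: np.ndarray) -> np.ndarray:
    """Zero out every non-zero connected component except the largest.

    Works on a non-zero mask of any dimensionality and preserves the
    original label values inside the surviving component.
    """
    labeled, num = ndimage.label(mask != 0)
    if num == 0:
        return mask  # nothing predicted, nothing to remove
    sizes = np.bincount(labeled.ravel())
    sizes[0] = 0  # ignore background when picking the largest
    largest = sizes.argmax()
    return np.where(labeled == largest, mask, 0)
```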