
Support for 32bit floating point images #104

Closed
snehashis-roy opened this issue May 24, 2023 · 6 comments
@snehashis-roy

Hello.
Is it possible to convert 32-bit floating-point 3D TIFF images to pyramidal OME-TIFF for use in OMERO? It seems that bioformats2raw can convert to Zarr in a folder, but raw2ometiff gets stuck.

This is my script for a 3D float32 image of size 528x456x960 (HxWxD):

./bioformats2raw -c null -h 1024 -w 1024 --max_workers 12 --no-nested -r 8 --target-min-size=512 -p "$INPUT" $TMPDIR
./raw2ometiff --compression uncompressed --debug --max-workers=12 -p $TMPDIR $OUTPUT

This is the end of the debug log, where raw2ometiff gets stuck:

2023-05-24 00:02:32,129 [pool-1-thread-18] DEBUG c.g.p.PyramidFromDirectoryWriter - Writing series: 0, image: 959, tileIndex: 0
2023-05-24 00:02:32,129 [pool-1-thread-18] DEBUG c.g.p.PyramidFromDirectoryWriter -     writing 73728 compressed bytes at 1262149648
2023-05-24 00:02:32,129 [pool-1-thread-18] INFO  org.perf4j.TimingLogger - start[1684900952129] time[0] tag[writeTile]

I am happy to share the image and/or the debug output if it helps.
I am using 0.6.0 and 0.4.0

@melissalinkert
Member

bioformats2raw/raw2ometiff are routinely used with floating-point data, so I'd be a little surprised if there were a fundamental issue with that data type.

./raw2ometiff --compression uncompressed --debug --max-workers=12 -p $TMPDIR $OUTPUT

I would not expect this command to run at all: --max-workers=12 is not a valid option, but --max_workers=12 (the same option name as in bioformats2raw) should work instead. Is this the exact command that produced the debug log?

I am using 0.6.0 and 0.4.0

bioformats2raw 0.6.1 and raw2ometiff 0.4.1 are the latest versions; I don't know of a specific change that might solve this problem, but it may be worth upgrading to see if that helps.

@snehashis-roy
Author

You are right, it was indeed a typo in my script: --max-workers should have been --max_workers. However, --max-workers still works with raw2ometiff 0.4.0 and 0.4.1.

Here is my test 4D 32-bit image that is not converted by raw2ometiff (0.4.1 and 0.4.0), although bioformats2raw converts it successfully. The image opens fine in Fiji.

https://hpc.nih.gov/~roys5/Fos_heatmap_51623_B.tif

I used the following commands:

./bioformats2raw-0.6.1/bin/bioformats2raw -c null -h 1024 -w 1024 --max_workers 12 --no-nested -r 2 --target-min-size=512 -p "Fos_heatmap_51623_B.tif" ./tmp.suFH2ojHgm
./raw2ometiff-0.4.1/bin/raw2ometiff --compression uncompressed --max_workers=12 -p ./tmp.suFH2ojHgm Fos_heatmap_51623_B.ome.tiff

The last step of raw2ometiff hangs, but a potentially corrupt OME-TIFF file is generated. tiffinfo on the corrupt image shows this:

TIFFFetchDirectory: Sanity check on directory count failed, zero tag directories not supported.
TIFFReadDirectory: Failed to read directory at offset 16.

@melissalinkert
Member

Thanks for the additional details, @snehashis-roy.

Unfortunately, I so far haven't been able to reproduce this issue using Fos_heatmap_51623_B.tif; both bioformats2raw and raw2ometiff as described in #104 (comment) complete without error. The runtime for each is under a minute, and the resulting OME-TIFF is valid.

If you can capture a jstack when you see raw2ometiff get stuck, that may help us understand what's going wrong. To do that, start the raw2ometiff command as before. In a separate terminal, run:

$ jps
...
8734 PyramidFromDirectoryWriter

That prints a list of running Java processes; look for the line with PyramidFromDirectoryWriter as above and note the number (the PID). When you see raw2ometiff get stuck, run in a separate terminal:

$ jstack 8734 > raw2ometiff-jstack.txt

replacing 8734 with the PID you noted previously. Please attach the raw2ometiff-jstack.txt file to this issue.
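The two steps above can be combined into one command; a minimal sketch, assuming the JDK's jps and jstack tools are on the PATH and that the process shows up under the main class name PyramidFromDirectoryWriter:

```shell
# Find the PID of the raw2ometiff JVM (main class PyramidFromDirectoryWriter)
# and dump its thread stacks. jps and jstack ship with the JDK.
PID=$(jps 2>/dev/null | awk '/PyramidFromDirectoryWriter/ {print $1}')
if [ -n "$PID" ]; then
    jstack "$PID" > raw2ometiff-jstack.txt
fi
```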

A few other things to consider:

  • Is the output OME-TIFF (Fos_heatmap_51623_B.ome.tiff) being written to a local disk? If not, does writing to a local disk work?
  • raw2ometiff writes all of the pixel data first, then has a final step to go back and write all of the metadata. When I do the conversion, this last step takes 30 seconds or so (admittedly not on a fast system). Approximately how much longer than that are you waiting after the last line of debug output is written, before raw2ometiff is killed?
  • 12 workers looks like a lot for this data (since each plane is smaller than 1024x1024), but the final metadata writing step is not parallelized so I wouldn't expect the worker count to have an impact here. You could try with just 1 worker for completeness, but I don't think that would resolve the problem.

@snehashis-roy
Author

snehashis-roy commented May 24, 2023

This is the jstack $PID output of the stuck raw2ometiff:

2023-05-24 18:21:54
Full thread dump OpenJDK 64-Bit Server VM (25.152-b12 mixed mode):

"Attach Listener" #27 daemon prio=9 os_prio=0 tid=0x00002aab74001000 nid=0x11984 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Java2D Disposer" #22 daemon prio=10 os_prio=0 tid=0x00002aaab1182800 nid=0x11843 in Object.wait() [0x00002aab733b3000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000005d268d0b8> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
	- locked <0x00000005d268d0b8> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
	at sun.java2d.Disposer.run(Disposer.java:148)
	at java.lang.Thread.run(Thread.java:745)

"ProgressBar" #21 daemon prio=5 os_prio=0 tid=0x00002aaab108c000 nid=0x11842 waiting on condition [0x00002aab5f81e000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x0000000080287dd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
	at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

"Service Thread" #17 daemon prio=9 os_prio=0 tid=0x00002aaab0101000 nid=0x1183d runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread11" #16 daemon prio=9 os_prio=0 tid=0x00002aaab00fd800 nid=0x1183c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread10" #15 daemon prio=9 os_prio=0 tid=0x00002aaab00fb800 nid=0x1183b waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread9" #14 daemon prio=9 os_prio=0 tid=0x00002aaab00f9800 nid=0x1183a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread8" #13 daemon prio=9 os_prio=0 tid=0x00002aaab00f7000 nid=0x11839 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread7" #12 daemon prio=9 os_prio=0 tid=0x00002aaab00f5000 nid=0x11838 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread6" #11 daemon prio=9 os_prio=0 tid=0x00002aaab00f2800 nid=0x11837 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread5" #10 daemon prio=9 os_prio=0 tid=0x00002aaab00f0800 nid=0x11836 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread4" #9 daemon prio=9 os_prio=0 tid=0x00002aaab00e6000 nid=0x11835 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00002aaab00e4000 nid=0x11834 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00002aaab00e2000 nid=0x11833 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00002aaab00e0000 nid=0x11832 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00002aaab00dd000 nid=0x11831 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00002aaab00db800 nid=0x11830 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00002aaab00a8000 nid=0x1182f in Object.wait() [0x00002aab5cd0f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x0000000080214420> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
	- locked <0x0000000080214420> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00002aaab00a3800 nid=0x1182e in Object.wait() [0x00002aab5cc0e000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x000000008021c3d8> (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:502)
	at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
	- locked <0x000000008021c3d8> (a java.lang.ref.Reference$Lock)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00002aaab000c800 nid=0x11818 runnable [0x00002aaaaabd3000]
   java.lang.Thread.State: RUNNABLE
	at java.io.RandomAccessFile.setLength(Native Method)
	at loci.common.NIOFileHandle.setLength(NIOFileHandle.java:229)
	at loci.common.AbstractNIOHandle.validateLength(AbstractNIOHandle.java:87)
	at loci.common.NIOFileHandle.writeSetup(NIOFileHandle.java:659)
	at loci.common.NIOFileHandle.writeShort(NIOFileHandle.java:605)
	at loci.common.RandomAccessOutputStream.writeShort(RandomAccessOutputStream.java:333)
	at loci.formats.tiff.TiffSaver.writeIFDValue(TiffSaver.java:580)
	at loci.formats.tiff.TiffSaver.writeIFD(TiffSaver.java:535)
	at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.writeIFD(PyramidFromDirectoryWriter.java:1226)
	at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.writeIFDs(PyramidFromDirectoryWriter.java:988)
	at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.convertToPyramid(PyramidFromDirectoryWriter.java:822)
	at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.call(PyramidFromDirectoryWriter.java:238)
	at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.call(PyramidFromDirectoryWriter.java:99)
	at picocli.CommandLine.executeUserObject(CommandLine.java:1783)
	at picocli.CommandLine.access$900(CommandLine.java:145)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2150)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2144)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2108)
	at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1968)
	at picocli.CommandLine.parseWithHandlers(CommandLine.java:2349)
	at picocli.CommandLine.parseWithHandler(CommandLine.java:2284)
	at picocli.CommandLine.call(CommandLine.java:2560)
	at com.glencoesoftware.pyramid.PyramidFromDirectoryWriter.main(PyramidFromDirectoryWriter.java:213)

"VM Thread" os_prio=0 tid=0x00002aaab009b800 nid=0x1182d runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00002aaab0021800 nid=0x11819 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00002aaab0023000 nid=0x1181a runnable 

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00002aaab0025000 nid=0x1181b runnable 

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00002aaab0027000 nid=0x1181c runnable 

"GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00002aaab0028800 nid=0x1181d runnable 

"GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00002aaab002a800 nid=0x1181e runnable 

"GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00002aaab002c800 nid=0x1181f runnable 

"GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00002aaab002e000 nid=0x11820 runnable 

"GC task thread#8 (ParallelGC)" os_prio=0 tid=0x00002aaab0030000 nid=0x11821 runnable 

"GC task thread#9 (ParallelGC)" os_prio=0 tid=0x00002aaab0032000 nid=0x11822 runnable 

"GC task thread#10 (ParallelGC)" os_prio=0 tid=0x00002aaab0033800 nid=0x11823 runnable 

"GC task thread#11 (ParallelGC)" os_prio=0 tid=0x00002aaab0035800 nid=0x11824 runnable 

"GC task thread#12 (ParallelGC)" os_prio=0 tid=0x00002aaab0037000 nid=0x11825 runnable 

"GC task thread#13 (ParallelGC)" os_prio=0 tid=0x00002aaab0039000 nid=0x11826 runnable 

"GC task thread#14 (ParallelGC)" os_prio=0 tid=0x00002aaab003b000 nid=0x11827 runnable 

"GC task thread#15 (ParallelGC)" os_prio=0 tid=0x00002aaab003c800 nid=0x11828 runnable 

"GC task thread#16 (ParallelGC)" os_prio=0 tid=0x00002aaab003e800 nid=0x11829 runnable 

"GC task thread#17 (ParallelGC)" os_prio=0 tid=0x00002aaab0040800 nid=0x1182a runnable 

"GC task thread#18 (ParallelGC)" os_prio=0 tid=0x00002aaab0042000 nid=0x1182b runnable 

"GC task thread#19 (ParallelGC)" os_prio=0 tid=0x00002aaab0044000 nid=0x1182c runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00002aaab0103800 nid=0x1183e waiting on condition 

JNI global references: 463

My java -version gives this:

java -version
openjdk version "1.8.0_152-release"
OpenJDK Runtime Environment (build 1.8.0_152-release-1056-b12)
OpenJDK 64-Bit Server VM (build 25.152-b12, mixed mode)

@snehashis-roy
Author

Hello.
This slowdown was indeed caused by network storage. On other large 5D images (e.g. 100x100x5x5x10000 in XYCZT), I also observed that the last step is dramatically faster on a local spinning SATA HDD (~30 sec) than on network storage (>2 hours). The network SAS storage is on an HPC, connected by a 10 Gbps Mellanox card, where I usually get 9.2 Gbps reading/writing files, so bandwidth is probably not the issue. I am wondering why the last step has such a wide speed discrepancy depending on storage location.
Thank you.

@melissalinkert
Member

Thanks for the update, @snehashis-roy. Ultimately I think this is down to repeatedly updating the file length as new data is written; we know this can be expensive on network storage, and the final metadata writing step in raw2ometiff doesn't do a great job of minimizing the number of writes to disk. ome/bioformats#3983 documents similar issues.

ome/ome-common-java#78 is likely the long-term solution we'll pursue here.
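The cost pattern described above can be illustrated outside Java; a minimal shell sketch (illustrative only, not raw2ometiff's actual I/O, and using GNU coreutils `truncate`) contrasting many small file-length extensions with a single preallocation. Each length change is a filesystem metadata update, which on network mounts can be a synchronous round trip:

```shell
# Grow a file 1 KB at a time: one file-length (metadata) update per step.
# On network filesystems, each update can be a synchronous round trip.
rm -f grow.bin prealloc.bin
for i in $(seq 1 100); do
    truncate -s $((i * 1024)) grow.bin
done

# Preallocate the final length once: a single metadata update.
truncate -s $((100 * 1024)) prealloc.bin
```

Both files end up the same size; only the number of length updates differs, which is where local and network storage diverge.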
