Ripper doesn't wait for voltage files / BOT files #9

Open
tbenst opened this issue Mar 18, 2021 · 15 comments
Comments

@tbenst
Collaborator

tbenst commented Mar 18, 2021

@drinnenb or @chrisroat have you seen this error before?

python ~/code/two-photon/two-photon/process.py --input_dir /scratch/b115/ --output_dir /scratch/b115/process-output/ --recording 2021-03-16_h2b6s/fish1:TSeries_64cell_8concurrent_2power_8rep-207 --preprocess
2021-03-18 10:30:45.064 metadata:22 INFO Extracting metadata from xml files:
/scratch/b115/2021-03-16_h2b6s/fish1/TSeries_64cell_8concurrent_2power_8rep-207/TSeries_64cell_8concurrent_2power_8rep-207.xml
/scratch/b115/2021-03-16_h2b6s/fish1/TSeries_64cell_8concurrent_2power_8rep-207/TSeries_64cell_8concurrent_2power_8rep-207_Cycle00001_VoltageRecording_001.xml
2021-03-18 10:30:47.161 metadata:102 INFO The following metadata is written to: /scratch/b115/process-output/2021-03-16_h2b6s/fish1/TSeries_64cell_8concurrent_2power_8rep-207/output/metadata.json
{'channels': {0: {'enabled': True, 'name': 'frame starts'},
1: {'enabled': True, 'name': 'secondary'},
2: {'enabled': True, 'name': 'winfluo'},
3: {'enabled': True, 'name': 'Blue'},
4: {'enabled': True, 'name': 'VR timestamps'},
5: {'enabled': True, 'name': 'green'},
6: {'enabled': True, 'name': 'LED'},
7: {'enabled': True, 'name': 'respir'}},
'laser': {'power': None, 'wavelength': None},
'layout': {'frames_per_sequence': 77378, 'sequences': 1},
'optical_zoom': 1.0,
'period': 0.033216582,
'size': {'channels': 2,
'frames': 77378,
'x_px': 512,
'y_px': 512,
'z_planes': 1}}
2021-03-18 10:30:47.265 process:92 INFO Found stim channel "respir", enabled=True
Traceback (most recent call last):
File "/home/tyler/code/two-photon/two-photon/process.py", line 353, in <module>
main()
File "/home/tyler/code/two-photon/two-photon/process.py", line 114, in main
preprocess(basename_input, dirname_output, fname_csv, fname_uncorrected_hdf5, fname_hdf5, mdata,
File "/home/tyler/code/two-photon/two-photon/process.py", line 160, in preprocess
df_voltage = pd.read_csv(fname_csv, index_col='Time(ms)', skipinitialspace=True)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 458, in _read
data = parser.read(nrows)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1196, in read
ret = self._engine.read(nrows)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 2231, in read
index, names = self._make_index(data, alldata, names)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1677, in _make_index
index = self._agg_index(index)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1770, in _agg_index
arr, _ = self._infer_types(arr, col_na_values | col_na_fvalues)
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1871, in _infer_types
mask = algorithms.isin(values, list(na_values))
File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/core/algorithms.py", line 443, in isin
if np.isnan(values).any():
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

@drinnenb
Collaborator

drinnenb commented Mar 18, 2021 via email

@tbenst
Collaborator Author

tbenst commented Mar 18, 2021

The problem is that the ripping of VoltageRecording, as well as BOT, happens after the tiff ripping, so our script is killing the ripper prematurely. Temporary fix is to dramatically increase

RIP_EXTRA_WAIT_SECS = 10 # Extra time to wait after ripping is detected to be done.
to 36000
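
One alternative to a large fixed wait, sketched here under assumptions: keep the ripper alive until the files you expect actually exist, with the long wait only as an upper bound. The function name `wait_for_outputs` and its parameters are hypothetical and not part of the repo; how the ripper process is started and stopped is left out.

```python
import time
from pathlib import Path


def wait_for_outputs(expected_paths, timeout_secs=36000.0, poll_secs=5.0):
    """Poll until every expected file exists or the deadline passes.

    Returns the list of paths still missing (empty if everything appeared).
    Hypothetical helper: the caller decides which files to expect.
    """
    deadline = time.monotonic() + timeout_secs
    missing = [Path(p) for p in expected_paths]
    while True:
        # Drop every path that has appeared since the last check.
        missing = [p for p in missing if not p.exists()]
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(poll_secs)
```

The ripper would only be killed once this returns, and a non-empty return value signals that the deadline was hit with files still missing. Note that a file existing does not guarantee it has been fully written.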

@chrisroat
Contributor

chrisroat commented Mar 18, 2021 via email

@tbenst
Collaborator Author

tbenst commented Mar 19, 2021

Not your fault at all! I don’t think I gave you example files that had voltage recordings, which was an oversight on my part. I'll post some info a bit later.

@chrisroat
Contributor

chrisroat commented Mar 19, 2021 via email

@tbenst
Collaborator Author

tbenst commented Apr 1, 2021

@chrisroat here's a path on Oak for an experiment that should output VoltageRecording files as well as BOT files (brightness over time; a csv file for region-of-interest fluorescent traces):
/oak/stanford/groups/deissero/users/tyler/share/chris/2021-03-30_wt-chrmine_6dpf_h2b6s/fish1/TSeries-28cell-1concurrent-2power-10trial-052

@tbenst tbenst changed the title from "pandas csv read error" to "Ripper doesn't wait for voltage files / BOT files" Apr 1, 2021
@chrisroat
Contributor

chrisroat commented Apr 1, 2021 via email

@tbenst
Collaborator Author

tbenst commented Apr 1, 2021

VoltageOutput is a separate functionality from VoltageRecording--the former generates TTLs, the latter digitizes / records voltage.

There are a few relevant files that tell us a voltage recording will be created, namely:

TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001
TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001_VRFilelist.txt
TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001.xml

After ripping, there should be a corresponding TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001.csv file (one per Cycle).

In another recording with 8 trials, there are 8x these files: /oak/stanford/groups/deissero/users/tyler/b115/2021-03-16_rschrmine_h2b6s/fish3. When ripped, this recording also generates 8 x TSeries_cross-stim_p125_100x100-310_Cycle00001-botData.csv files, with an incrementing number following Cycle.

BOT is a Prairie View functionality that we use typically for online monitoring of experiments, so it's not as critical that we get this one right.

Edit: note that the Cycle number does not always start at 1 (i.e. Cycle00001)! But it does always increment by 1.
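
As a rough sketch of the naming rule described above, the expected csv files can be derived from the `*_VoltageRecording_*.xml` files already present in the recording directory. Globbing avoids assuming the first cycle is 1. The function names here are hypothetical, and the glob pattern is an assumption based only on the example filenames in this thread.

```python
from pathlib import Path


def expected_voltage_csvs(recording_dir):
    """Return the csv paths the ripper should produce, one per VoltageRecording xml."""
    recording_dir = Path(recording_dir)
    # Glob instead of counting up from Cycle00001, since the first cycle number varies.
    xml_files = sorted(recording_dir.glob("*_VoltageRecording_*.xml"))
    # Same basename, with .xml swapped for .csv.
    return [xml.with_suffix(".csv") for xml in xml_files]


def missing_voltage_csvs(recording_dir):
    """Return the expected csv files that do not exist yet."""
    return [csv for csv in expected_voltage_csvs(recording_dir) if not csv.exists()]
```

An empty result from `missing_voltage_csvs` means every voltage recording has a csv next to it, though not necessarily a complete one.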

@chrisroat
Contributor

Yeah, we figured out the VoltageRecording. I think it will be easy to wait for its output csv to appear. Thanks for the tip on having multiple cycles.

I was curious if VoltageOutput means there is going to be a file to wait for?

For the botData, it's not clear how to tell a priori that it will be there and that we should wait for it. None of the previous datasets I looked at have it. Perhaps I can dig through the xml files and find something.

@tbenst
Collaborator Author

tbenst commented Apr 1, 2021

No, VoltageOutput does not mean there will be a file to wait for AFAICT. I think all the info is already in the xml file so no ripping needed. The BOT is a weird one. I'm not sure where it's stored--perhaps it comes out of the binary blob that stores the tiffs

Your intuition is spot on for BOT, though. In the xml files, I found:
<PVBOTs botData="TSeries-lrhab_raphe_stim-40trial-038_Cycle00026-botData.csv"> inside of /data/dlab/b115/2020-10-28_elavl3-chrmine-Kv2.1_h2b6s_8dpf/fish1/TSeries-lrhab_raphe_stim-40trial-038/TSeries-lrhab_raphe_stim-40trial-038.xml for example
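
A minimal sketch of how the expected BOT csv names might be read out of the recording's xml, using the `PVBOTs` element and `botData` attribute shown above. The function name is hypothetical. It assumes the element may sit anywhere in the tree and that the attribute holds a filename relative to the recording directory; neither has been checked against the full Prairie View schema.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def expected_bot_csvs(xml_path):
    """Return the BOT csv paths named by PVBOTs elements in a recording's xml."""
    xml_path = Path(xml_path)
    root = ET.parse(xml_path).getroot()
    # iter() walks the whole tree, so the nesting depth of PVBOTs does not matter.
    names = [el.get("botData") for el in root.iter("PVBOTs")]
    # Assumption: botData is a bare filename living next to the xml file.
    return [xml_path.parent / name for name in names if name]
```

A recording without BOT enabled would simply yield an empty list, so there is nothing extra to wait for in that case.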

@jmdelahanty

Hello everyone! I was curious about your use of the BOT files. Currently, our recordings don't use the BOT function. What do you use it for? Is there something that we're missing out on if we don't have that data?

@tbenst
Collaborator Author

tbenst commented Oct 15, 2021

BOT = brightness over time. It’s a way of drawing ROIs in the Bruker software so you can monitor an experiment while it runs.

@jmdelahanty

That makes sense! So basically you select a neuron/group of neurons that you're particularly interested in as an ROI or something and go from there?

And also, since I have you here, do you do any behavior while you're recording from the mice? How does your lab record behavior data/stimulate the brain at once? My post-doc mentor wants to simultaneously record from the brain and stimulate at the same time during the behavior session all in one long recording. Do you make a new t-series for each trial or something similar? The Bruker documentation hasn't been super helpful in getting this going.

If I could meet with one of you at some point over zoom it would be immensely helpful and such a privilege to learn from you!

@tbenst
Collaborator Author

tbenst commented Oct 15, 2021

> So basically you select a neuron/group of neurons that you're particularly interested in as an ROI or something and go from there?

Yes

> And also, since I have you here, do you do any behavior while you're recording from the mice? How does your lab record behavior data/stimulate the brain at once?

I'm part of Team Fish :), but yes concurrent behavior & stim is a common paradigm, usually taking advantage of the analog inputs for synchronization TTL signals, and behavior-specific software that saves its own files.

> If I could meet with one of you at some point over zoom it would be immensely helpful and such a privilege to learn from you!

Sure, I'm happy to meet and talk through / share whatever I can that's helpful. let's lock down a time over email? tbenst at stanford edu

@jmdelahanty

jmdelahanty commented Jan 24, 2022

Potential way to solve this for at least voltage files is to poll the filesize with os.stat(path).st_size, something like this:

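import logging
import os
import time

# `path` is the voltage recording csv the ripper is writing.
csv_size = os.stat(path).st_size
time.sleep(10)  # give the ripper time to append more rows
new_csv_size = os.stat(path).st_size

# An unchanged size after the wait suggests the conversion has finished.
if csv_size == new_csv_size:
    logging.info("CSV conversion complete")
else:
    logging.info("CSV still converting...")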

I'm in the process of adding this to the container now to see what it does. But this way at least you won't have to worry about keeping the ripper running for a long time. In our file naming scheme, it will always do the csv conversion first because the filelist.txt has a digit in it (the date) for the voltage recording. The imaging filelist.txt is just called Cycle whatever.

Update to this:

It works properly if you include it as part of the rip.py script. I've modified what you've created for the repo for our lab's structure, but you can see how I implemented what you made here.
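
For illustration, the size check described above could be wrapped in a loop with a deadline, roughly as follows. This is only a sketch and not the code from either repo: the function name, parameters, and defaults are hypothetical, and an unchanged size is a heuristic, since a stalled ripper looks the same as a finished one.

```python
import os
import time


def wait_for_stable_size(path, interval_secs=10.0, timeout_secs=3600.0):
    """Return True once the file size is unchanged across two consecutive checks.

    Returns False if the deadline passes first. A missing file counts as
    "not stable yet" rather than as an error.
    """
    deadline = time.monotonic() + timeout_secs
    last_size = None
    while time.monotonic() < deadline:
        try:
            size = os.stat(path).st_size
        except FileNotFoundError:
            size = None  # the ripper has not created the file yet
        if size is not None and size == last_size:
            return True
        last_size = size
        time.sleep(interval_secs)
    return False
```

The interval needs to be longer than the longest pause the ripper takes between writes, otherwise a file that is still being converted can be reported as done.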
