Joss reviewanswers (#27)

* Create CONTRIBUTING.md
* Update CONTRIBUTING.md
* add info on osf publication
* small modif readme to
* debug svm with 3 groups and different size
* change default behavior to no svm
* compatibility with dplyr 1.02
* working with 3 and more than 3 groups
* working with online, mbr files
* add information in readme
* Jcolomb metadatacreator (#23)
* correct vis. abstract (svg)
* erase installation of the osfr package in the manual installation file
* update due to new osfr package
* rerun packrat from scratch
* migrate from packrat to renv
* suppress packrat from dependencies
* Update Readme.md
* modify helper function
* update readme
* update readme 2
* modif
* modif readme + helper function
* debug on mbr and helper
* change order of warning in app
* adding figures and modify part of the text in readme
* polishing
* went to dplyr 1.0.2 to use relocate
* Added images in readme
* Add something for version if the app is downloaded (no access to git release number)
* add default plot to hour summary tab
* work version number, copied tags on non git part
* modify to allow answer in interactive session
* clean some data, update html doc
* add message for data that has to be provided locally
* typo corrected
* added message for data with source is USB_stick
* added message, cleaning some remnant comments in the app code
* all to .r instead of .R
* change extension name
* cleaning .R files on web (#31)
* cleaning .R files on web
* Rename Softwareheader.R to Softwareheader.r (#32) rename apps and call to .R to .r codes
* Testing (#35)
* adding comment on filtering time windows
* move code "source ("Rcode/ICA.r")" making reading the code easier
* add review file + testthat (svm not included)
* SVM code taken out of master
* independence on metadata, svm tests added
* Added mention of reviewer code in readme, add review code for SVM, set tuning range back to default. change some filenames
* update readme as of #36
* modify abstract
jcolomb · Apr 14, 2021 · c24a931 · c24a931
1 parent 580daae
Showing 76 changed files with 1,505 additions and 1,901,592 deletions.
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -0,0 +1,28 @@
+# Contributing to HCS analysis
+
+## Suggest development and report bugs
+
+Please use the GitHub issues to report bugs or to request new analyses.
+Similarly, if you would like to extend the software to new types of raw data,
+start there so that I can help you get started and other users can learn about your initiative!
+
+
+## Fixing documentation
+
+Small typos or grammatical errors in documentation may be edited directly using
+the GitHub web interface, so long as the changes are made in the _source_ file.
+
+## New analyses
+
+To build a new analysis, I encourage you to start by using the `analysis/master_noshiny.R` file to load some data; you can then work with the .rdata file it creates (this does not work with the Shiny app).
+
+You may want to have a look at `archives_notpeerreviewed/testscode/analyseBseq` to see how to access the raw data (that code attempts to analyse sequences of behavior).
+
+
+## Pull request process
+
+Please check the GitHub Help Center to learn how to create pull requests: https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests
+
+### Code of Conduct
+
+We do not have a code of conduct (see why here: https://openscholarship.co.uk/renew-your-coc/), but please be nice and welcome any contribution.
diff --git a/DESCRIPTION b/DESCRIPTION
diff --git a/Metadata_information/HELPER_create_metadata.R b/Metadata_information/HELPER_create_metadata.R
@@ -1,69 +1,100 @@
 #---------helpfiles to create metadata.csv without missing files
-
-#set directory to HCS folder with all data, typically "HCS3_output"
 library (dplyr)
-files = data.frame(f=as.character(dir(recursive = T)),stringsAsFactors = F)
 
-#get file names for behavior and minute files
-filesb = files %>% filter (grepl('beh',f)| grepl('Beh',f))
-filese = files %>% filter (grepl('min',f)|grepl('Min',f))
+# set the working directory to the HCS folder with all data, typically "HCS3_output" (in RStudio: Session > Set Working Directory > Choose Directory)
 
-# split name of files
-filese2 = data.frame(dir=dirname(filese$f), basename (filese$f))
-filesb2 = data.frame(dir=dirname(filesb$f), basename (filesb$f))
+# comment out or delete the alternatives you will not use; only one option may be active
+#Alternative = "work with minutes xlsx exports" # files must have 'min' in their name
+#Alternative = "work with hourly xlsx exports"  # files must have 'hour' in their name
+Alternative = "work with mbr files" # files must have 'mbr' in their name
 
-#bind 2 files in one table
-meta1= cbind(filese2, filesb2)
-
-# alternatively get the hour data and duplicate it
-filesa = files %>% filter (grepl('bin',f)|grepl('Bin',f))
-filesa2 = data.frame(dir=dirname(filesa$f), basename (filesa$f))
-meta1=cbind(filesa2, filesa2)
-#meta1= cbind(filese2, "filesb2"=filese2) #spec vida
+## Run the rest of the code. If it does not work, you may need to tweak the grepl commands to filter files in or out.
+library (dplyr)
+files = data.frame(f=as.character(dir(recursive = T)),stringsAsFactors = F)
 
-#create report table and write it down:
-meta1$animal_ID =NA
-names (meta1) = c("experiment_folder_name","Onemin_summary", "dir" ,              
-                  "Behavior_sequence", "animal_ID")
+output= structure(list(animal_ID = character(0), animal_birthdate = character(0), 
+                       gender = character(0), treatment = character(0), genotype = character(0), 
+                       other_category = character(0), date = character(0), test_cage = character(0), 
+                       real_time_start = character(0), Lab_ID = character(0), Exclude_data = character(0), 
+                       comment = character(0), experiment_folder_name = character(0), 
+                       Behavior_sequence = character(0), Onemin_summary = character(0), 
+                       Onehour_summary = character(0), primary_behav_sequence = character(0), 
+                       primary_position_time = character(0), primary_datafile = character(0)), row.names = integer(0), class = "data.frame")
+
+
+##Alternative A: work with minutes xlsx exports:
+##--------------
+if (Alternative == "work with minutes xlsx exports"){
+  # get file names for behavior and minute files (xlsx exports)
+  filesb = files %>% filter (grepl('beh',f)| grepl('Beh',f))
+  filese = files %>% filter (grepl('min',f)|grepl('Min',f))
+
+
+  # split name of files
+  filese2 = data.frame(dir=dirname(filese$f), basename (filese$f))
+  filesb2 = data.frame(dir=dirname(filesb$f), basename (filesb$f))
+
+
+  #bind 2 files in one table
+  meta1= cbind(filese2, filesb2)
+
+  meta1$animal_ID =NA
+  names (meta1) = c("experiment_folder_name","Onemin_summary", "dir" ,              
+                    "Behavior_sequence", "animal_ID")
+  meta1=meta1 %>% select (experiment_folder_name,Behavior_sequence,Onemin_summary, animal_ID)
+  meta1$primary_datafile = "min_summary" # set after select() so that the column is kept
+
+}
+
+### Alternative B: work with hourly xlsx exports:
+##--------------
+if (Alternative == "work with hourly xlsx exports"){
+  # hourly files: match 'hour' or 'Hour' in the file name
+  filesa = files %>% filter (grepl('hour',f)|grepl('Hour',f))
+  filesa2 = data.frame(dir=dirname(filesa$f), basename (filesa$f))
+  # the hourly file is duplicated to fill the Behavior_sequence column
+  meta1=cbind(filesa2, filesa2)
+  meta1$animal_ID =NA
+  names (meta1) = c("experiment_folder_name","Onehour_summary", "dir" , "Behavior_sequence", "animal_ID")
-meta1=meta1 %>% select (experiment_folder_name,Behavior_sequence,Onemin_summary, animal_ID)
+  meta1=meta1 %>% select (experiment_folder_name,Behavior_sequence,Onehour_summary, animal_ID)
+  meta1$primary_datafile = "hour_summary"
+}
+
+
+
+### Alternative C: work with mbr files (created automatically):
+##--------------
+if (Alternative == "work with mbr files"){
+  filesf = files %>% filter (grepl('mbr',f)|grepl('MBR',f))
+  ## optionally filter the list of files
+
+  filesf= filesf %>% filter (grepl('HomeCageScan',f)) # keep only files with 'HomeCageScan' in their name
+  filesf2 = data.frame(dir=dirname(filesf$f), basename (filesf$f))
+  meta1= filesf2
+  names (meta1)= c("experiment_folder_name","primary_behav_sequence")
+  meta1$primary_datafile = "mbr"
+  meta1 <- meta1 %>% mutate (date = paste0(
+    "20",
+    substr(primary_behav_sequence, 13, 14),
+    "-",
+    substr(primary_behav_sequence, 15, 16),
+    "-",
+    substr(primary_behav_sequence, 17, 18)
+    )
+  ) 
+
+  }
 
 
+#meta1= cbind(filese2, "filesb2"=filese2) #spec vida
 
-#View(meta1) # you should have all files listed in a square dataframe (no NA)
-write.csv(meta1, "eachfile2.csv", fileEncoding = "UTF8")
-
-#--- manual intervention needed on eachfile2.csv: -------------
-#modification on the csv by hand: check animal ID is the same for 2 files, write animal ID
-#if animals were tested more than once, add a distinction in a new column (treatment for instance).
-# change file name to eachfile.csv.
-#use the file as a base for the metadata file or try to merge it with a different version of the metadata if you can.
-
-#------------------------Merging: code to modify for your data
-
-#now we will read the file back 
-meta1=read.csv ("eachfile.csv")
-meta1$animal_ID = as.character(meta1$animal_ID)
-
-### manual entries here!
-# We will now merge it with  the old metadata file (to be modified for your metadata!):
-# read old metadata files : 
-
-data <- read_excel("D:/HCSdata/Rosendmund_VGlut1.1_HCS_all_ML_24112016.xlsx",
-col_types = c("text", "text", "text",
-"text", "text", "numeric", "numeric",
-"numeric", "numeric", "text"))
+#create report table and write it down:
 
-
-#merging:
 
-data$animal_ID <- as.character(data$`animal ID`)
-#data$animal_ID <- as.character(data$animal_ID)
-#data$animal_ID <- data$`id cohort.2`
 
 
-a =left_join(data,meta1, by = "animal_ID")
+#View(meta1) # you should have all files listed in a square dataframe (no NA)
+output= left_join(meta1, output) %>% relocate (names (output))
+write.csv(output, "eachfile2.csv", fileEncoding = "UTF-8")
 
-#View(a)
-write.csv(a, "metadata4.csv")
 
-# now modify this by hand to get the right column names.
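Note on the mbr alternative above: the `date` column is built from fixed character positions (13 to 18) of the file name, read as `yymmdd`. The following is a minimal base-R sketch of that step, not part of the commit. The file name and the `extract_date` helper are hypothetical, chosen so that a 12-character prefix precedes the date; real exports may be named differently.

```r
# Hypothetical file name: 12-character prefix followed by yymmdd.
primary_behav_sequence <- "HomeCageScan210414_example.mbr"

# Hypothetical helper reproducing the paste0/substr logic of the helper script.
extract_date <- function(fname) {
  paste0("20",
         substr(fname, 13, 14), "-",
         substr(fname, 15, 16), "-",
         substr(fname, 17, 18))
}

date_string <- extract_date(primary_behav_sequence)

# Parsing with an explicit format gives NA when the characters are not a valid
# date, which can be used to spot file names that do not follow the pattern.
parsed <- as.Date(date_string, format = "%Y-%m-%d")
print(date_string)
print(parsed)
```

Checking `is.na(parsed)` on the whole column is one way to find files whose names need manual correction before the metadata file is used.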
diff --git a/Metadata_information/HELPER_join_animal-info.R b/Metadata_information/HELPER_join_animal-info.R
@@ -0,0 +1,36 @@
+#--- manual intervention needed on eachfile2.csv: -------------
+## This script may be used to add animal information stored in a separate spreadsheet.
+# Edit the csv by hand: check that the animal ID is the same for the 2 files, and fill in animal_ID.
+# If animals were tested more than once, add a distinction in a new column (treatment, for instance).
+# Save the edited file as eachfile.csv.
+# Use that file as a base for the metadata file, or merge it with a different version of the metadata if you can.
+
+#------------------------Merging: code to modify for your data
+
+#now we will read the file back
+meta1=read.csv ("eachfile.csv")
+meta1$animal_ID = as.character(meta1$animal_ID)
+
+### manual entries here!
+# We will now merge it with the old metadata file (to be modified for your metadata!):
+# read old metadata files (read_excel comes from the readxl package):
+library (readxl)
+data <- read_excel("D:/HCSdata/Rosendmund_VGlut1.1_HCS_all_ML_24112016.xlsx",
+                   col_types = c("text", "text", "text",
+                                 "text", "text", "numeric", "numeric",
+                                 "numeric", "numeric", "text"))
+
+
+#merging:
+
+data$animal_ID <- as.character(data$`animal ID`)
+#data$animal_ID <- as.character(data$animal_ID)
+#data$animal_ID <- data$`id cohort.2`
+
+library (dplyr) # provides left_join
+a =left_join(data,meta1, by = "animal_ID")
+
+#View(a)
+write.csv(a, "metadata4.csv")
+
+# now modify this by hand to get the right column names.
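Note on the merge above: `left_join` needs `animal_ID` to have the same type in both tables, which is why the script converts it to character. The sketch below shows the same steps on toy data and is not part of the commit. Both data frames are hypothetical stand-ins for the spreadsheet and for `eachfile.csv`.

```r
library(dplyr)

# Hypothetical animal information, standing in for the spreadsheet read with read_excel.
data <- data.frame(`animal ID` = c(101, 102),
                   genotype = c("wt", "ko"),
                   check.names = FALSE, stringsAsFactors = FALSE)

# Hypothetical file listing, standing in for eachfile.csv (IDs read back as integers).
meta1 <- data.frame(animal_ID = c(101L, 102L),
                    primary_behav_sequence = c("a.mbr", "b.mbr"),
                    stringsAsFactors = FALSE)

# Use the same column name and the same type (character) on both sides.
data$animal_ID <- as.character(data$`animal ID`)
meta1$animal_ID <- as.character(meta1$animal_ID)

# Keep every animal of the spreadsheet and attach the matching files.
a <- left_join(data, meta1, by = "animal_ID")
print(a)
```

Animals without a matching file keep their row with `NA` in the file columns, so `anyNA(a$primary_behav_sequence)` is a quick check for missing files.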
diff --git a/Metadata_information/Readme.html b/Metadata_information/Readme.html