@wiederm mentioned being interested in a smaller ANI2x dataset (larger than our testing set) for examining training. @jchodera suggested limiting it to molecules containing only C, H, and O, which I think is a good idea. This would allow us to compare more directly with PhAlkEthOH.
PhAlkEthOH has 12,271 unique molecules; ANI2x has 16,514. I'm not sure how many ANI2x molecules contain only C, H, and O, but if that number is smaller than the PhAlkEthOH count, we can create a smaller subset of PhAlkEthOH to match.
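A minimal sketch of the element filter, assuming each molecule is available as a list of atomic numbers keyed by some identifier. The record layout, the function names, and the toy data are hypothetical and not taken from the ANI2x files or our curation code. The optional `max_size` argument downsamples whichever dataset turns out to be larger so the two sizes match.

```python
import random

# Atomic numbers of the allowed elements: H, C, O.
CHO_ATOMIC_NUMBERS = {1, 6, 8}


def is_cho_only(atomic_numbers):
    """Return True if every atom in the molecule is H, C, or O."""
    return set(atomic_numbers) <= CHO_ATOMIC_NUMBERS


def select_cho_subset(molecules, max_size=None, seed=0):
    """Return sorted names of CHO-only molecules.

    molecules: dict mapping a name to a list of atomic numbers (assumed layout).
    max_size: if given and smaller than the number of matches, draw a
              reproducible random sample of that size.
    """
    names = sorted(name for name, z in molecules.items() if is_cho_only(z))
    if max_size is not None and len(names) > max_size:
        names = sorted(random.Random(seed).sample(names, max_size))
    return names


# Toy records for illustration only.
toy = {
    "methane": [6, 1, 1, 1, 1],
    "ethanol": [6, 6, 8, 1, 1, 1, 1, 1, 1],
    "ammonia": [7, 1, 1, 1],
    "chloromethane": [6, 17, 1, 1, 1],
}
print(select_cho_subset(toy))  # ['ethanol', 'methane']
```

Fixing the seed keeps the subset reproducible between runs, which matters if the subset is meant to be a shared training benchmark.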
It might be interesting to look at the overlap of these datasets. The ANI2x dataset does not contain SMILES strings for the molecules, but we could probably do some other relevant comparisons. Something as simple as comparing the molecular weight distributions (since we are limited to CHO) would probably be good. We could also do this as two plots: one for molecules with O, one for molecules without O.
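One way the molecular weight comparison could look, as a rough sketch: compute each molecule's weight from its atomic numbers, bin the weights, and report the histogram intersection of the two normalized distributions. The helper names, the 10 Da bin width, and the input layout (lists of atomic numbers) are assumptions for illustration; histogram intersection is just one simple overlap measure, not something we have agreed on.

```python
from collections import Counter

# Standard atomic weights (Da) for H, C, O.
ATOMIC_MASS = {1: 1.008, 6: 12.011, 8: 15.999}


def molecular_weight(atomic_numbers):
    """Sum of standard atomic weights; only valid for CHO molecules."""
    return sum(ATOMIC_MASS[z] for z in atomic_numbers)


def split_by_oxygen(molecules):
    """Split a list of molecules into (with O, without O)."""
    with_o = [z for z in molecules if 8 in z]
    without_o = [z for z in molecules if 8 not in z]
    return with_o, without_o


def weight_histogram(molecules, bin_width=10.0):
    """Counter mapping the lower edge of each weight bin to a count."""
    hist = Counter()
    for z in molecules:
        hist[int(molecular_weight(z) // bin_width) * bin_width] += 1
    return hist


def histogram_overlap(hist_a, hist_b):
    """Histogram intersection of the normalized histograms (0 to 1)."""
    total_a, total_b = sum(hist_a.values()), sum(hist_b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    bins = set(hist_a) | set(hist_b)
    return sum(min(hist_a[b] / total_a, hist_b[b] / total_b) for b in bins)


# Toy data: methane, ethane, ethanol vs. methane, methanol.
set_a = [[6, 1, 1, 1, 1], [6, 6, 1, 1, 1, 1, 1, 1], [6, 6, 8, 1, 1, 1, 1, 1, 1]]
set_b = [[6, 1, 1, 1, 1], [6, 8, 1, 1, 1, 1]]

# Compare the "with O" and "without O" groups separately.
for label, a, b in zip(("with O", "without O"),
                       split_by_oxygen(set_a), split_by_oxygen(set_b)):
    print(label, histogram_overlap(weight_histogram(a), weight_histogram(b)))
```

The per-group histograms returned by `weight_histogram` could be passed directly to a plotting library to produce the two plots.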