Chiral spiro compounds are labelled achiral, InChI generated is incorrect #40

Open
supersciencegrl opened this issue Aug 9, 2024 · 10 comments
@supersciencegrl

supersciencegrl commented Aug 9, 2024

I am working from the InChI Web Demo (https://iupac-inchi.github.io/InChI-Web-Demo/). I originally found this issue when generating InChI with RDKit in Python, and the same thing happens when I generate an InChI from a structure in ChemDraw.

C2-symmetrical spiro compounds which contain a single stereogenic spiro atom seem to (sometimes or always?) be labelled as achiral, and the InChI generated does not contain point stereochemistry.

Here is an example. The first molecule is (R)-SDP, and the second is (S)-SDP. They have different structures, and the point stereochemistry is represented accurately by SMILES. However, when I try to generate an InChI in any of the three above systems, the stereochemistry is lost.
C12=CC=CC(P(C3=CC=CC=C3)C4=CC=CC=C4)=C1[C@@]5(CC2)CCC6=C5C(P(C7=CC=CC=C7)C8=CC=CC=C8)=CC=C6
C12=CC=CC(P(C3=CC=CC=C3)C4=CC=CC=C4)=C1[C@]5(CC2)CCC6=C5C(P(C7=CC=CC=C7)C8=CC=CC=C8)=CC=C6

The log on the InChI Web Demo states: "InChI options: Warning (Not chiral)"

However, InChI is generated correctly for the topologically-similar compound, (R)-ShiP.
InChI=1S/C23H19O3P/c1-2-8-18(9-3-1)24-27-25-19-10-4-6-16-12-14-23(21(16)19)15-13-17-7-5-11-20(26-27)22(17)23/h1-11H,12-15H2/t23-/m0/s1
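For anyone who wants to check programmatically whether a generated InChI still carries point stereochemistry, here is a minimal sketch using only the Python standard library. It splits the identifier on `/` and looks for the tetrahedral stereo layer (`/t`). The function names `inchi_layers` and `has_tetrahedral_stereo` are made up for this illustration and are not part of any InChI or RDKit API. The sketch assumes a standard InChI in which each layer prefix appears at most once.

```python
def inchi_layers(inchi: str) -> dict:
    """Split an InChI string into its prefixed layers.

    The first two '/'-separated parts are the version and the formula;
    every later part starts with a one-letter layer prefix (c, h, t, m, s, ...).
    Assumption: each prefix occurs once (true for standard InChI); if a
    prefix repeats, only the first occurrence is kept.
    """
    parts = inchi.split("/")
    layers = {}
    for part in parts[2:]:
        if part:
            layers.setdefault(part[0], part[1:])
    return layers


def has_tetrahedral_stereo(inchi: str) -> bool:
    """Return True if the InChI contains a /t (tetrahedral stereo) layer."""
    return "t" in inchi_layers(inchi)


# The (R)-ShiP identifier quoted above, which does keep its stereo layer.
SHIP_INCHI = (
    "InChI=1S/C23H19O3P/c1-2-8-18(9-3-1)24-27-25-19-10-4-6-16-12-14-23"
    "(21(16)19)15-13-17-7-5-11-20(26-27)22(17)23/h1-11H,12-15H2/t23-/m0/s1"
)

print(has_tetrahedral_stereo(SHIP_INCHI))   # stereo layer present
print(inchi_layers(SHIP_INCHI).get("t"))    # the stereo centre entry
```

Running the same check on the InChI produced for (R)-SDP or (S)-SDP would be a quick way to confirm the missing `/t` layer in a batch of structures.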

@fbaensch-beilstein
Collaborator

Dear @supersciencegrl,

We will take a closer look at this soon.
Could you please provide the mol files for (R)-SDP and (S)-SDP that you used?

@gblanke02
Contributor

gblanke02 commented Aug 12, 2024 via email

@supersciencegrl
Author

Hi team,

Thanks for looking at it! I appreciate all your work. I'll try and come along (online) to the InChI discussion on Saturday.

Here's the 2D molfile I tried (this one was created in ChemDraw; I saw the same behaviour when just using a SMILES string). This structure and its enantiomer behave in exactly the same way with RDKit in Python.


untitled.mol
ChemDraw08122419562D

43 50 0 0 1 0 0 0 0 0999 V2000
-0.7846 -1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.4991 -1.4924 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.2136 -1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.2136 -0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.4991 0.1576 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.7813 0.9328 0.0000 P 0 0 0 0 0 0 0 0 0 0 0 0
-1.2510 1.5648 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.5331 2.3400 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.0028 2.9720 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.1904 2.8288 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0918 2.0535 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4385 1.4215 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.5937 1.0761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.1240 0.4441 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.9365 0.5873 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.2187 1.3626 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.6884 1.9946 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.8759 1.8513 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.7846 -0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4849 0.6674 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 1.3349 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7846 1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7846 0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4991 -0.1576 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7813 -0.9328 0.0000 P 0 0 0 0 0 0 0 0 0 0 0 0
2.5937 -1.0761 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.8759 -1.8513 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.6884 -1.9946 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
4.2187 -1.3626 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.9365 -0.5873 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
3.1240 -0.4441 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.2510 -1.5648 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.4385 -1.4215 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.0918 -2.0535 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1904 -2.8288 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.0028 -2.9720 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.5331 -2.3400 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2136 0.2549 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2136 1.0799 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1.4991 1.4924 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.4849 -0.6674 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 -1.3349 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0
2 3 1 0 0
3 4 2 0 0
4 5 1 0 0
5 6 1 0 0
6 7 1 0 0
7 8 2 0 0
8 9 1 0 0
9 10 2 0 0
10 11 1 0 0
11 12 2 0 0
7 12 1 0 0
6 13 1 0 0
13 14 2 0 0
14 15 1 0 0
15 16 2 0 0
16 17 1 0 0
17 18 2 0 0
13 18 1 0 0
5 19 2 0 0
1 19 1 0 0
20 19 1 1 0
20 21 1 0 0
21 22 1 0 0
22 23 1 0 0
23 24 2 0 0
20 24 1 0 0
24 25 1 0 0
25 26 1 0 0
26 27 1 0 0
27 28 2 0 0
28 29 1 0 0
29 30 2 0 0
30 31 1 0 0
31 32 2 0 0
27 32 1 0 0
26 33 1 0 0
33 34 2 0 0
34 35 1 0 0
35 36 2 0 0
36 37 1 0 0
37 38 2 0 0
33 38 1 0 0
25 39 2 0 0
39 40 1 0 0
40 41 2 0 0
23 41 1 0 0
20 42 1 6 0
42 43 1 0 0
1 43 1 0 0
M END

@gblanke02
Contributor

gblanke02 commented Aug 13, 2024 via email

@supersciencegrl
Author

Interesting! ChemDraw can both recognize that the above molfile corresponds to a chiral compound and assign the correct (R)-descriptor. However (off topic, but), there are similar compounds that ChemDraw recognizes as chiral but for some reason cannot assign a descriptor to; (R)-ShiP is one. InChI=1S/C23H19O3P/c1-2-8-18(9-3-1)24-27-25-19-10-4-6-16-12-14-23(21(16)19)15-13-17-7-5-11-20(26-27)22(17)23/h1-11H,12-15H2/t23-/m1/s1

@nbehrnd

nbehrnd commented Aug 13, 2024

@supersciencegrl The simple copy-paste (like a text) of the .sdf/.mol file scrambles the format of the file. This adds an obstacle for processing down the road, e.g.

$ obabel untitled.mol -osdf
==============================
*** Open Babel Warning  in ReadMolecule
  WARNING: Problems reading a MDL file
1 2 2 0 0
Invalid bond specification, atom numbers or bond order are wrong;
each should be in a field of three characters.

0 molecules converted

because it anticipates a format of 7(I3) for the connectivity table, i.e. seven integer fields of three characters each. One solution is to enclose the pasted copy in a code block, with a leading and a trailing line of three backticks (grave accents). Otherwise (e.g., for long log files from MOPAC, Gaussian, etc.), either attach the file after adding a .txt file extension, or join multiple files into a .zip archive; GitHub accepts both equally by paste, drop, or attach (caveat: not in replies via email, only from a session in the web browser).

Example water


 OpenBabel08132410443D

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.9444    0.0690   -0.0831 O   0  0  0  0  0  0  0  0  0  0  0  0
    1.9123    0.0601   -0.0375 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.6655   -0.1094    0.8276 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  1  3  1  0  0  0  0
M  END
$$$$

Edit: plus the fenced code block has the paperclip in the top right corner.
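To illustrate the fixed-width point, here is a small standard-library Python sketch that tests whether a bond line is laid out in three-character fields and re-pads a whitespace-collapsed one. The helper names `is_fixed_width_bond_line` and `repad_bond_line` are invented for this example; this is not how Open Babel itself parses the file, only a sketch of the 3-character-field rule. It assumes the collapsed tokens are still separated by spaces, which holds for the bond lines pasted above, and it does not touch the counts line or the atom block, which need their own widths.

```python
def is_fixed_width_bond_line(line: str) -> bool:
    """Check that a V2000 bond line consists of 3-character integer fields."""
    line = line.rstrip("\n")
    # At least three fields (atom 1, atom 2, bond type), each 3 characters wide.
    if len(line) < 9 or len(line) % 3 != 0:
        return False
    fields = [line[i:i + 3] for i in range(0, len(line), 3)]
    try:
        # A blank field counts as 0; anything else must parse as an integer.
        [int(f) if f.strip() else 0 for f in fields]
    except ValueError:
        return False
    return True


def repad_bond_line(line: str) -> str:
    """Right-justify each whitespace-separated token in a 3-character field.

    Assumption: tokens were not fused together by the copy-paste.
    """
    return "".join(f"{int(tok):3d}" for tok in line.split())


collapsed = "1 2 2 0 0"            # first bond line as pasted in this thread
print(is_fixed_width_bond_line(collapsed))
fixed = repad_bond_line(collapsed)
print(repr(fixed))
print(is_fixed_width_bond_line(fixed))
```

Attaching the original file as .txt remains the safer route, since re-padding by hand cannot recover fields that were merged or dropped.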

@gblanke02
Contributor

gblanke02 commented Aug 13, 2024 via email

@fbaensch-beilstein
Collaborator

Since GitHub does not accept mol files as attachments, just change the extension to .txt.
That would be very helpful.

@supersciencegrl
Author

That makes sense! Molfile attached in .txt format.
(R)-SDP.txt

@JanCBrammer
Collaborator
