llcppg:refactor to type-node level #157

luoliwoshang · 2025-01-15T02:55:45Z

fix #61
fix #99

Defect

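Go binding generation currently works file by file: for each file it receives, it first processes the files that file includes, and only then the file itself.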

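For most libraries whose headers are cleanly organized, the conversion works. In that case type.h is visited first to process the definition of A, and main.h can then reference it correctly.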
main.h

typedef struct A {
    long a;
} A;
#include "compat.h"

compat.h

typedef A B;

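// ProcessFileSet marks the headers listed in p.depIncs as visited, then visits every file in files.
func (p *DocFileSetProcessor) ProcessFileSet(files []*ast.FileEntry) error {
	// Pre-mark each depIncs entry that is present in files as visited.
	for _, inc := range p.depIncs {
		idx := FindEntry(files, inc)
		if idx < 0 {
			continue // not part of this file set
		}
		p.visitedFile[files[idx].Path] = struct{}{}
	}
	// Visit every file in the order given.
	for _, file := range files {
		p.visitFile(file.Path, files)
	}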
	if p.done != nil {
		p.done()
	}
	return nil
}

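However, C allows an #include to appear anywhere in a file, so declarations laid out as below cannot be converted correctly: compat.h is visited first, but A has not been defined at that point.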
main.h

typedef struct A {
    int a;
} A;

#include "compat.h"

compat.h

typedef A B;
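
The failure can be reproduced with a small simulation. This is only a sketch: decl, header, includeFirstOrder, and firstUndefined are hypothetical names made up for illustration, not llcppg types, and the model deliberately forgets where inside main.h the #include sits, which is what a file-level traversal effectively does.

```go
package main

import "fmt"

// decl is a hypothetical, simplified declaration: a type name plus the
// names it refers to. It is not the llcppg AST.
type decl struct {
	name string
	refs []string
}

// header is a hypothetical model of a header file: its own declarations and
// the headers it includes. The position of each #include is not recorded.
type header struct {
	decls    []decl
	includes []string
}

// includeFirstOrder visits the includes of a file before the file itself and
// returns the resulting declaration order.
func includeFirstOrder(path string, files map[string]header, seen map[string]bool) []decl {
	if seen[path] {
		return nil
	}
	seen[path] = true
	var out []decl
	for _, inc := range files[path].includes {
		out = append(out, includeFirstOrder(inc, files, seen)...)
	}
	return append(out, files[path].decls...)
}

// firstUndefined returns the first declaration that refers to a name which
// has not been defined earlier in the given order.
func firstUndefined(order []decl) (user, missing string, found bool) {
	defined := map[string]bool{}
	for _, d := range order {
		for _, r := range d.refs {
			if !defined[r] {
				return d.name, r, true
			}
		}
		defined[d.name] = true
	}
	return "", "", false
}

// exampleFiles models the main.h / compat.h pair shown above.
func exampleFiles() map[string]header {
	return map[string]header{
		"main.h":   {decls: []decl{{name: "A"}}, includes: []string{"compat.h"}},
		"compat.h": {decls: []decl{{name: "B", refs: []string{"A"}}}},
	}
}

func main() {
	order := includeFirstOrder("main.h", exampleFiles(), map[string]bool{})
	if user, missing, found := firstUndefined(order); found {
		fmt.Printf("%s refers to %s before it is defined\n", user, missing)
	}
}
```

With the include-first rule, B from compat.h is emitted ahead of A from main.h, so the reference from B to A cannot be resolved at the moment B is processed.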

Resolution

USR

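The USR (Unified Symbol Resolution) that libclang assigns is a unique identifier, so it can tell us which type a reference actually points to.

Below is the description of struct A. At the definition, the USR of the Ident is c:@S@A and the USR of its field is c:@S@A@FI@a. The typedef A has the USR c:main.h@T@A, and its underlying type refers to struct A, so the USR under Type is c:@S@A.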

      "decls": [
        {
          "_Type": "TypeDecl",
          "Loc": {
            "_Type": "Location",
            "File": "./main.h"
          },
          "Name": {
            "_Type": "Ident",
            "Name": "A",
            "USR": "c:@S@A"
          },
          "Type": {
            "_Type": "RecordType",
            "Tag": 0,
            "Fields": {
              "_Type": "FieldList",
              "List": [
                {
                  "_Type": "Field",
                  "Type": {
                    "_Type": "BuiltinType",
                    "Kind": 6,
                    "Flags": 4
                  },
                  "Names": [
                    {
                      "_Type": "Ident",
                      "Name": "a",
                      "USR": "c:@S@A@FI@a"
                    }
                  ]
                }
              ]
            }
          }
        },
        {
          "_Type": "TypedefDecl",
          "Loc": {
            "_Type": "Location",
            "File": "./main.h"
          },
          "Name": {
            "_Type": "Ident",
            "Name": "A",
            "USR": "c:main.h@T@A"
          },
          "Type": {
            "_Type": "TagExpr",
            "Name": {
              "_Type": "Ident",
              "Name": "A",
              "USR": "c:@S@A"
            },
            "Tag": 0
          }
        }
      ],

The typedef in compat.h refers, as expected, to c:main.h@T@A.

        {
          "_Type": "TypedefDecl",
          "Loc": {
            "_Type": "Location",
            "File": "./compat.h"
          },
          "Doc": null,
          "Parent": null,
          "Name": {
            "_Type": "Ident",
            "Name": "B",
            "USR": "c:compat.h@T@B"
          },
          "Type": {
            "_Type": "Ident",
            "Name": "A",
            "USR": "c:main.h@T@A"
          }
        }

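Keep both same-named nodes produced by typedef struct A { } A; and handle them later in gogensig, because in the reference graph a reference to struct A and a reference to typedef A carry two different USRs.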
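
To illustrate why both nodes are worth keeping, here is a minimal sketch of a lookup keyed by USR instead of by name. The node type, indexByUSR, and resolveChain are hypothetical helpers invented for this example and are not part of gogensig; the USR strings are the ones from the JSON dumps above.

```go
package main

import "fmt"

// node is a hypothetical, trimmed-down stand-in for a declaration in the JSON
// dump: its kind, name, own USR, and the USR of the type it refers to.
type node struct {
	kind   string // "TypeDecl" or "TypedefDecl"
	name   string
	usr    string
	refUSR string // empty when the declaration refers to no other named type
}

// indexByUSR builds a lookup table keyed by USR rather than by name, so the
// struct A node and the typedef A node do not collide.
func indexByUSR(nodes []node) map[string]node {
	idx := make(map[string]node, len(nodes))
	for _, n := range nodes {
		idx[n.usr] = n
	}
	return idx
}

// resolveChain follows refUSR links starting at usr and returns the USRs it
// passes through, in order. It stops at an unknown or already seen USR.
func resolveChain(idx map[string]node, usr string) []string {
	var chain []string
	seen := map[string]bool{}
	for usr != "" && !seen[usr] {
		n, ok := idx[usr]
		if !ok {
			break
		}
		seen[usr] = true
		chain = append(chain, usr)
		usr = n.refUSR
	}
	return chain
}

// exampleNodes mirrors the three declarations from main.h and compat.h.
func exampleNodes() []node {
	return []node{
		{kind: "TypeDecl", name: "A", usr: "c:@S@A"},
		{kind: "TypedefDecl", name: "A", usr: "c:main.h@T@A", refUSR: "c:@S@A"},
		{kind: "TypedefDecl", name: "B", usr: "c:compat.h@T@B", refUSR: "c:main.h@T@A"},
	}
}

func main() {
	idx := indexByUSR(exampleNodes())
	for _, usr := range resolveChain(idx, "c:compat.h@T@B") {
		fmt.Println(idx[usr].kind, idx[usr].name, usr)
	}
}
```

Starting from B, the chain passes through the typedef A node and ends at the struct A node. A table keyed by the name "A" alone could hold only one of the two.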

Reorder files by type references so that the file defining a depended-on type is processed first (X, not feasible)

This solves the following case: the type references show that B in compat.h depends on A, so all nodes of main.h are processed before the definitions in compat.h.
main.h

typedef struct A {
    long a;
} A;
#include "compat.h"

compat.h

typedef A B;

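But in the following case this approach leaves B undefined when C in main.h is defined, so a file-level processing order derived from type references cannot meet the requirement.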
main.h

typedef struct A {
    long a;
} A;
#include "compat.h"
typedef B C;

compat.h

typedef A B;
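
The reason no file order can work here is that the two files depend on each other. A rough sketch of that check follows; typeDef, fileDeps, and mutuallyDependent are hypothetical names, and types are keyed by plain name only to keep the example short (real code would key by USR).

```go
package main

import "fmt"

// typeDef is a hypothetical record: a type name, the file that defines it,
// and the names of the types it refers to.
type typeDef struct {
	name string
	file string
	refs []string
}

// fileDeps derives file-to-file edges from type references: a file depends on
// every other file that defines a type it refers to.
func fileDeps(defs []typeDef) map[string]map[string]bool {
	owner := map[string]string{}
	for _, d := range defs {
		owner[d.name] = d.file
	}
	deps := map[string]map[string]bool{}
	for _, d := range defs {
		for _, r := range d.refs {
			f, ok := owner[r]
			if !ok || f == d.file {
				continue // unknown type, or defined in the same file
			}
			if deps[d.file] == nil {
				deps[d.file] = map[string]bool{}
			}
			deps[d.file][f] = true
		}
	}
	return deps
}

// mutuallyDependent reports whether files a and b each depend on the other,
// in which case neither can be processed entirely before the other.
func mutuallyDependent(deps map[string]map[string]bool, a, b string) bool {
	return deps[a][b] && deps[b][a]
}

// exampleDefs models the A, B, C case shown above.
func exampleDefs() []typeDef {
	return []typeDef{
		{name: "A", file: "main.h"},
		{name: "B", file: "compat.h", refs: []string{"A"}},
		{name: "C", file: "main.h", refs: []string{"B"}},
	}
}

func main() {
	deps := fileDeps(exampleDefs())
	fmt.Println(mutuallyDependent(deps, "main.h", "compat.h"))
}
```

compat.h needs A from main.h, and main.h needs B from compat.h, so the file-level graph contains a cycle even though the type-level graph (A, then B, then C) does not.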

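Decide the type initialization order at type-definition granularity.

Analyze the reference relationships in the AST to derive the order in which types should be created, and initialize the types in the Go package in that order.

The type creation order may not match the order in the C headers?
It may be possible to create the type definition nodes first, preserving their position in the tree, and then complete type initialization step by step according to the references?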
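
One possible way to derive such an order is a depth-first topological sort over type nodes keyed by USR. The sketch below is an assumption about how this could look, not the implementation in this PR: typeNode and initOrder are hypothetical, and the USR c:main.h@T@C is extrapolated from the pattern of the other typedef USRs rather than taken from a real dump.

```go
package main

import "fmt"

// typeNode is a hypothetical type-level node: its USR and the USRs of the
// types it refers to.
type typeNode struct {
	usr  string
	deps []string
}

// initOrder returns the USRs ordered so that every node comes after all of
// its dependencies (depth-first post-order). It returns an error on an
// unknown USR or on a reference cycle.
func initOrder(nodes []typeNode) ([]string, error) {
	byUSR := make(map[string]typeNode, len(nodes))
	for _, n := range nodes {
		byUSR[n.usr] = n
	}
	const (
		unvisited = iota
		visiting
		done
	)
	state := map[string]int{} // missing keys read as unvisited
	var order []string
	var visit func(usr string) error
	visit = func(usr string) error {
		switch state[usr] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("reference cycle through %s", usr)
		}
		n, ok := byUSR[usr]
		if !ok {
			return fmt.Errorf("unknown USR %s", usr)
		}
		state[usr] = visiting
		for _, d := range n.deps {
			if err := visit(d); err != nil {
				return err
			}
		}
		state[usr] = done
		order = append(order, usr)
		return nil
	}
	for _, n := range nodes {
		if err := visit(n.usr); err != nil {
			return nil, err
		}
	}
	return order, nil
}

// exampleTypeNodes lists the A, B, C case in a deliberately scrambled order.
func exampleTypeNodes() []typeNode {
	return []typeNode{
		{usr: "c:main.h@T@C", deps: []string{"c:compat.h@T@B"}}, // assumed USR
		{usr: "c:compat.h@T@B", deps: []string{"c:main.h@T@A"}},
		{usr: "c:main.h@T@A", deps: []string{"c:@S@A"}},
		{usr: "c:@S@A"},
	}
}

func main() {
	order, err := initOrder(exampleTypeNodes())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, usr := range order {
		fmt.Println(usr)
	}
}
```

This ordering ignores file boundaries entirely, which is why the A, B, C case stops being a problem. A strict sort like this rejects self-referential types (for example a struct holding a pointer to itself), which is where the second question above comes in: creating the nodes first and completing them afterwards would sidestep such cycles.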

codecov · 2025-01-15T03:01:47Z

Codecov Report

Attention: Patch coverage is 83.48416% with 146 lines in your changes missing coverage. Please review.

Project coverage is 93.17%. Comparing base (89c24fe) to head (d1303b1).
Report is 17 commits behind head on main.

Files with missing lines	Patch %	Lines
cmd/gogensig/convert/package.go	61.88%	75 Missing and 18 partials ⚠️
cmd/gogensig/convert/convert.go	71.50%	42 Missing and 9 partials ⚠️
cmd/gogensig/config/conf.go	87.50%	1 Missing ⚠️
cmd/gogensig/convert/type.go	93.75%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #157      +/-   ##
==========================================
- Coverage   98.16%   93.17%   -5.00%     
==========================================
  Files          17       18       +1     
  Lines        2179     2768     +589     
==========================================
+ Hits         2139     2579     +440     
- Misses         28      149     +121     
- Partials       12       40      +28


qiniu-x · 2025-01-26T16:10:56Z