Query Document Network #729

GavinMendelGleason (Maintainer) · 2021-10-23T13:30:23Z

Query Document Network

In pre-1.0 TerminusDB we had a feature which allowed easy
visualisation of the graph by returning a set of nodes and edges that
could be consumed directly by D3 or other JavaScript applications which
draw networks/graphs.

It was a very convenient feature, and it would make a good TerminusDB
endpoint for those who want to draw networks with minimal effort using
data from the graph.

The basic idea is to treat each document within TerminusDB in a way
similar to a hypertext document in HTML: a link in is represented as
an edge pointing to the document, and a link out from any level of the
document is represented as an edge pointing away from it.
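As a rough illustration of "links out from any level of the document", the sketch below walks a nested document and collects every value that names another known document. The helper name `outgoing_links`, the `address` and `neighbour` fields, and the rule that a link is any string equal to a known document ID are assumptions made for this example, not TerminusDB behaviour.

```python
from typing import Any, Iterator, Set, Tuple


def outgoing_links(doc: Any, known_ids: Set[str], field: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (field, target_id) pairs found at any depth of `doc`."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            if key == "@id":
                # Skip a document's own identifier; it is not a link out.
                continue
            yield from outgoing_links(value, known_ids, key)
    elif isinstance(doc, list):
        # List items inherit the field name of the list that holds them.
        for item in doc:
            yield from outgoing_links(item, known_ids, field)
    elif isinstance(doc, str) and doc in known_ids:
        yield (field, doc)


# Hypothetical document with one top-level link and one nested link.
doc = {
    "@id": "Person/Myriel",
    "@type": "Person",
    "name": "Myriel",
    "follows": "Person/Mlle.Baptistine",
    "address": {"@type": "Address", "neighbour": "Person/Napoleon"},
}
ids = {"Person/Myriel", "Person/Mlle.Baptistine", "Person/Napoleon"}
links = list(outgoing_links(doc, ids))
```

Here `links` holds one pair for the top-level `follows` field and one for the nested `neighbour` field, which is the sense in which a subdocument's links still count as links out of the enclosing document.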

Each document could be loaded as a full node object, or simply as an
ID in cases where only the links are of interest, when speed is
important, or where a subsequent load of the document based on the ID
would be just as convenient.

The input

Seeds

These are the IDs from which to start loading. Often the seeds are
known because the starting point is known. In the case of a GIS, a
region selection could yield a query that returns the IDs within the
region, and these could act as the seeds. Or perhaps we are interested
in the connections starting from a particular already known document
or a document which has a particular field.

The seeds should be represented as a list of IDs.

Edges

This could be either a disjunctive or pattern query which states which
edges are allowed in the traversal. We could accept any WOQL edge
pattern, or simply a list of fields.

Max Depth

This simply tells how far to traverse from the documents provided by
the seeds.

{ "seeds" : ["Id1", "Id2"],
  "edges" : ["follows"],
  "max_depth" : 3
}
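A minimal sketch of how an endpoint might check this input before traversing, assuming the simpler "list of fields" form of `edges` rather than a WOQL edge pattern. The names `NetworkRequest` and `parse_request` are hypothetical and only serve the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class NetworkRequest:
    seeds: List[str]   # document IDs to start from
    edges: List[str]   # field names that may be followed
    max_depth: int     # number of hops allowed from any seed


def parse_request(payload: Dict[str, Any]) -> NetworkRequest:
    """Validate the three input fields and return a typed request."""
    seeds = payload.get("seeds")
    edges = payload.get("edges")
    max_depth = payload.get("max_depth")
    if not isinstance(seeds, list) or not all(isinstance(s, str) for s in seeds):
        raise ValueError("'seeds' must be a list of document IDs")
    if not isinstance(edges, list) or not all(isinstance(e, str) for e in edges):
        raise ValueError("'edges' must be a list of field names")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_depth, bool) or not isinstance(max_depth, int) or max_depth < 0:
        raise ValueError("'max_depth' must be a non-negative integer")
    return NetworkRequest(seeds, edges, max_depth)


request = parse_request({"seeds": ["Id1", "Id2"], "edges": ["follows"], "max_depth": 3})
```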

The API output

The API should return something easily consumable by D3. For instance:

{
  "nodes":  [
    { "@type": "Person",
      "@id" : "Person/Mlle.Baptistine",
      "name" : "Mlle.Baptistine",
      "follows" : "Person/Napoleon" },
    { "@type": "Person",
      "@id" : "Person/Napoleon",
      "name" : "Napoleon",
      "follows" : "Person/Mlle.Baptistine" },
    { "@type": "Person",
      "@id" : "Person/Myriel",
      "name" : "Myriel",
      "follows" : "Person/Mlle.Baptistine"}
  ],
  "links" : [
    { "source" : "Person/Mlle.Baptistine",
      "target" : "Person/Napoleon",
      "weight" : 10 },
    { "source" : "Person/Napoleon",
      "target" : "Person/Mlle.Baptistine",
      "weight" : 10 },
    { "source" : "Person/Myriel",
      "target" : "Person/Mlle.Baptistine",
      "weight" : 5 }
  ]
}
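To make the intended semantics concrete, here is a sketch of the traversal over an in-memory dictionary of documents rather than a real database. It is a plain breadth-first search, not the TerminusDB implementation. The function name `document_network` is hypothetical, and `weight` is left out because the proposal does not say how it would be derived.

```python
from collections import deque
from typing import Any, Dict, List


def document_network(documents: Dict[str, Dict[str, Any]],
                     seeds: List[str],
                     edges: List[str],
                     max_depth: int) -> Dict[str, List[Dict[str, Any]]]:
    """Breadth-first traversal returning D3-style nodes and links."""
    nodes: Dict[str, Dict[str, Any]] = {}
    links: List[Dict[str, str]] = []
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed in documents and seed not in seen:
            seen.add(seed)
            queue.append((seed, 0))
    while queue:
        doc_id, depth = queue.popleft()
        doc = documents[doc_id]
        nodes[doc_id] = doc
        if depth >= max_depth:
            # Documents at the depth limit are returned, but their own
            # outgoing links are not followed or reported.
            continue
        for field in edges:
            value = doc.get(field)
            if value is None:
                continue
            targets = value if isinstance(value, list) else [value]
            for target in targets:
                if target not in documents:
                    continue
                links.append({"source": doc_id, "target": target})
                if target not in seen:
                    seen.add(target)
                    queue.append((target, depth + 1))
    return {"nodes": list(nodes.values()), "links": links}


people = {
    "Person/Mlle.Baptistine": {"@type": "Person", "@id": "Person/Mlle.Baptistine",
                               "name": "Mlle.Baptistine", "follows": "Person/Napoleon"},
    "Person/Napoleon": {"@type": "Person", "@id": "Person/Napoleon",
                        "name": "Napoleon", "follows": "Person/Mlle.Baptistine"},
    "Person/Myriel": {"@type": "Person", "@id": "Person/Myriel",
                      "name": "Myriel", "follows": "Person/Mlle.Baptistine"},
}
result = document_network(people, ["Person/Myriel"], ["follows"], max_depth=1)
```

With `max_depth=1` the result contains Myriel and Mlle.Baptistine and the single link between them. Whether links leaving the boundary documents should also be reported is a design choice the proposal leaves open; this sketch omits them.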

Why can't I do this in WOQL?

Currently the links can't be constructed because we would need to
"cons" up a document. This could be easily fixed if we allowed
unification with documents which included embedded variables!

The query itself for nodes would look something like this:

let [nodes,object,seed,id,type] = vars(["nodes","object","seed","id","type"])
select(nodes).
  group_by(true,
           object,
           and(member(seed, seeds),
               path(seed,p(edge),id),
               triple(id, "rdf:type", type),
               triple(type, "rdf:type", "sys:Class", "schema"),
               get_document(id, object)),
           nodes)

How can we obtain the links at the same time?

All in all it seems easier to implement this directly in Prolog, but
then what expressiveness are we missing in WOQL that would make this easier?

GavinMendelGleason (Maintainer, Author) · 2021-10-23T17:50:43Z

This appears to be pretty close. With a document template you could make this work:

(
        select(
            [v('Links'), v('Document')],
            group_by(
                true,
                [v('Seed'), v('ID')],
                (   member(v('Seed'), Seeds),
                    path(v('Seed'),p(edge),v('ID')),
                    t(v('ID'), rdf:type, v('Type')),
                    t(v('Type'), rdf:type, sys:'Class', schema)),
                v('Links'))),
        select(
            [v('Nodes')],
            group_by(
                true,
                v('Document'),
                (   distinct(
                        [v('Node')],
                        (   member(v('Link'),v('Links')),
                            [v('Seed'), v('ID')] = v('Link'),
                            (   v('Node') = v('Seed')
                            ;   v('Node') = v('ID'))
                        )
                    ),
                    get_document(v('Node'), v('Document'))
                ),
                v('Nodes')
            )
        )
    )
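For readers less used to the Prolog form, the second `select` can be read as the following Python sketch: take each `[Seed, ID]` pair, keep every distinct endpoint once, and load its document. The `load_document` callback is a stand-in for `get_document`, and the dictionary-backed store is an assumption made to keep the example self-contained.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def nodes_from_links(links: Iterable[Tuple[str, str]],
                     load_document: Callable[[str], Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Load each distinct endpoint of the given links exactly once."""
    seen = set()
    ordered: List[str] = []
    for seed, target in links:
        # A node may appear as either end of a link.
        for node in (seed, target):
            if node not in seen:
                seen.add(node)
                ordered.append(node)
    return [load_document(node) for node in ordered]


store = {
    "Person/Myriel": {"@id": "Person/Myriel", "name": "Myriel"},
    "Person/Mlle.Baptistine": {"@id": "Person/Mlle.Baptistine", "name": "Mlle.Baptistine"},
    "Person/Napoleon": {"@id": "Person/Napoleon", "name": "Napoleon"},
}
pairs = [("Person/Myriel", "Person/Mlle.Baptistine"),
         ("Person/Mlle.Baptistine", "Person/Napoleon")]
loaded = nodes_from_links(pairs, store.__getitem__)
```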

GavinMendelGleason (Maintainer, Author) · 2021-10-23T21:12:45Z

Since matching structures in dictionaries would be desirable, there is a question about how to represent them. Essentially it is like a quasi-quote, where you have part of the dictionary as literal, but sections are variables to be matched.

If this is to be represented with an arbitrary dictionary, it can't be saved if we only have the WOQL schema, since it will not describe all of the shapes of the matching dictionaries etc.

Should we simply quote the matching dictionaries? But then there are variables which are actual WOQL objects which are fully quoted and not represented until loaded.

Should we have a representation for arbitrary dictionaries embedded in WOQL? Each key-value would turn into a key-value object. This would make quasi-quote easy, but the representation would be rather cumbersome to produce.

let [x] = vars(["x"])
let d = quasi_quote({ "a" : 1, "b": 2, "c" : unquote(x) })
assert(d == {"@type" : "Dictionary",
             "data" : [{"@type" : "KeyValue", "key" : "a", "value" : 1 },
                       {"@type" : "KeyValue", "key" : "b", "value" : 2 },
                       {"@type" : "KeyValue", "key" : "c",
                        "value" : {"@type" : "Variable", "variable" : "x" }}]
            })
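A small sketch of how such an encoding could be produced mechanically. It assumes that every entry, including the unquoted variable, is wrapped as a `KeyValue` so that its key is retained. The `Unquote` class and the `quasi_quote` function are hypothetical names for this example and are not part of any client, and nested dictionaries are not handled.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Unquote:
    """Marks a position in the template that should be a WOQL variable."""
    name: str


def quasi_quote(template: Dict[str, Any]) -> Dict[str, Any]:
    """Encode a flat dictionary template as Dictionary/KeyValue objects."""
    data = []
    for key, value in template.items():
        if isinstance(value, Unquote):
            encoded: Any = {"@type": "Variable", "variable": value.name}
        else:
            encoded = value  # literals are kept as they are
        data.append({"@type": "KeyValue", "key": key, "value": encoded})
    return {"@type": "Dictionary", "data": data}


encoded = quasi_quote({"a": 1, "b": 2, "c": Unquote("x")})
```

Even for three keys the output is several times the size of the template, which illustrates why this representation would be cumbersome to write by hand.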

GavinMendelGleason (Maintainer, Author) · 2021-10-26T08:35:31Z

With a small patch to the python client, the following will work:

[Link,Links,Weight,
 Seed,Id,Type,
 Class,Node,Nodes,
 Document,Source,Target] = WQ().vars("Link","Links","Weight",
                                     "Seed","Id","Type",
                                     "Class","Node","Nodes",
                                     "Document","Source","Target")
query = (
    WQ().select(Links).group_by([],
                                Doc({ "source" : Seed, "target": Target, "weight" : Weight }),
                                Links,
                                ( WQ().true() # triple(Seed, "rdf:type", "Node")
                                  & WQ().path(Seed,"edge,node",Target)
                                  & WQ().path(Seed,"edge,weight",Weight)
                                  & WQ().triple(Target, "rdf:type", Type)
                                  & WQ().triple(Type, "rdf:type", "sys:Class", schema)))
    & WQ().select(Nodes).group_by([],
                                  Document,
                                  Nodes,
                                  ( WQ().distinct(Node,
                                                  ( WQ().member(Link,Links)
                                                    & WQ().dot(Link, "source", Source)
                                                    & WQ().dot(Link, "target", Target)
                                                    & ( WQ().eq(Node,Source)
                                                        | WQ().eq(Node,Target) )
                                                   ))
                                    & WQ().read_object(Node, Document)
                                   )))
