Feature/static filter for the TREE specification #86

Open
wants to merge 207 commits into
base: master
Changes from 164 commits
Commits
c40f019
TREE traversal actor support for filter function added, but filter fu…
constraintAutomaton Oct 18, 2022
ada26f6
bus for link travel optimization and some lint changes that were not …
constraintAutomaton Oct 18, 2022
5ec0762
bus-optimize-link-traversal integrated but does nothing
constraintAutomaton Oct 19, 2022
7c4bb96
lint-fix
constraintAutomaton Oct 19, 2022
f43a931
support for multiple filter targeting a single key
constraintAutomaton Oct 20, 2022
7212a6a
manual filter started
constraintAutomaton Oct 21, 2022
6c4884c
metadata of TREE starting to be extracted using interface to describe…
constraintAutomaton Oct 25, 2022
449fc4a
non usefull metadata of TREE deleted
constraintAutomaton Oct 25, 2022
8cae037
comment added for the description of an interface
constraintAutomaton Oct 25, 2022
81016c1
conditional relation deleted
constraintAutomaton Oct 26, 2022
b49b974
implementation with custom bus
constraintAutomaton Oct 26, 2022
175255a
filter place in the new bus with sparqlee
constraintAutomaton Oct 31, 2022
89b8bbb
static filter with sparqlee implemented
constraintAutomaton Nov 3, 2022
e7c9a00
filter tree traversal query operator deleted
constraintAutomaton Nov 3, 2022
eabaa2c
useless comment deleted
constraintAutomaton Nov 3, 2022
4e7ceeb
unused context key deleted
constraintAutomaton Nov 3, 2022
d67fb0f
better reference for the Tree metadata
constraintAutomaton Nov 3, 2022
188d4ee
lint-fix
constraintAutomaton Nov 3, 2022
96ad45f
test actor link tree extractor done
constraintAutomaton Nov 4, 2022
b5de349
lint-fix
constraintAutomaton Nov 4, 2022
9eb377d
test coverage brought back to 100% by adding test for treeMetadataExt…
constraintAutomaton Nov 4, 2022
4c80c1c
unit test test method of optmize link traversal
constraintAutomaton Nov 4, 2022
ecb1504
first test of run method
constraintAutomaton Nov 7, 2022
2e1b318
test almost done need to pass the coverage
constraintAutomaton Nov 7, 2022
3e2b527
test coverage brought to 100% and lint fix
constraintAutomaton Nov 8, 2022
fc04a59
fix import metadataExtractor test
constraintAutomaton Nov 8, 2022
9cd62c2
lint-fix
constraintAutomaton Nov 8, 2022
200be0b
documentation upgraded
constraintAutomaton Nov 8, 2022
21aa6bb
some renaming of tests
constraintAutomaton Nov 8, 2022
815da8a
better documentation
constraintAutomaton Nov 10, 2022
9c5a29d
hard copy of filter operation deleted
constraintAutomaton Nov 10, 2022
3e48b62
deep copy reimplemented with lodash
constraintAutomaton Nov 10, 2022
2329cb5
deep cloning of the filter expression is not necessary anymore in the…
constraintAutomaton Nov 10, 2022
79cf7b0
Merge branch 'tmp' into feature/time-filtering-tree-sparqlee-implemen…
constraintAutomaton Nov 10, 2022
6db03d2
lodash depency deleted
constraintAutomaton Nov 10, 2022
afce916
first
constraintAutomaton Nov 10, 2022
6c9acfb
first
constraintAutomaton Nov 10, 2022
60b6660
Merge branch 'tmp' into feature/time-filtering-tree-sparqlee-implemen…
constraintAutomaton Nov 10, 2022
512963f
deep copy for filter expression deleted
constraintAutomaton Nov 10, 2022
369afea
now it reject the TREE extractor actor when the optimize link travers…
constraintAutomaton Nov 14, 2022
0641ce8
simplyfication of the handling of undefined values
constraintAutomaton Nov 14, 2022
ae21b4f
useless dependecies deleted
constraintAutomaton Nov 14, 2022
39c9f32
filter actor translated into a class
constraintAutomaton Nov 14, 2022
a2d1e4c
bus-optmize-link-traversal deleted and the filter actor converted int…
constraintAutomaton Nov 14, 2022
655a7ed
lint-fix
constraintAutomaton Nov 14, 2022
cfd01a7
extra space and faulty config deleted
constraintAutomaton Nov 15, 2022
7567791
extra space added
constraintAutomaton Nov 15, 2022
bb59cc2
.gitignore file reverted
constraintAutomaton Nov 15, 2022
fe839e6
renaming of tree metadata type
constraintAutomaton Nov 15, 2022
b0ec21a
new tests for nested query and construct query
constraintAutomaton Nov 16, 2022
83a71bb
documentation of the tree metadata
constraintAutomaton Nov 16, 2022
31b7e71
added space deleted
constraintAutomaton Nov 16, 2022
a79e8a7
better documentation for TREE link extractor
constraintAutomaton Nov 16, 2022
b093a18
documentation added to the filter node class
constraintAutomaton Nov 16, 2022
c6bcd82
documentation updated for the getTreeQuadsRawRelations function
constraintAutomaton Nov 16, 2022
fe5cb90
lint-fix
constraintAutomaton Nov 16, 2022
6cab244
documentation of the function findBgp
constraintAutomaton Nov 16, 2022
1d1cba6
whole doesNodeExist method coverded by the test
constraintAutomaton Nov 17, 2022
3f25089
test for the method applyFilter added
constraintAutomaton Nov 17, 2022
54a956e
more documentation for the TREE extractor actor
constraintAutomaton Nov 17, 2022
d165b23
generateTreeRelationFilter newFilterExpression build with the Algebra…
constraintAutomaton Nov 17, 2022
030bce0
unit test description rework for the FilterNode and the ActorExtractL…
constraintAutomaton Nov 17, 2022
e260e02
documentation added for findBgp method
constraintAutomaton Nov 17, 2022
f51c041
doesNodeExist method change for findNode
constraintAutomaton Nov 18, 2022
6ac1ab3
WIP: Simplify tree logic
rubensworks Nov 18, 2022
0d4aee3
WIP: fixing the variables and the tests
constraintAutomaton Nov 21, 2022
f5ea05a
code refactoring done
constraintAutomaton Nov 22, 2022
d677f8d
Merge pull request #1 from constraintAutomaton/feature/time-filtering…
constraintAutomaton Nov 22, 2022
b69f1f1
lint-fix
constraintAutomaton Nov 22, 2022
fcc380c
dot added to a comment
constraintAutomaton Nov 22, 2022
8ec97eb
WIP: handling of equal operation
constraintAutomaton Nov 22, 2022
31e087b
WIP: solver module started to remplace sparqlee
constraintAutomaton Nov 24, 2022
8798016
transform the filter into a an expression with a list of operator
constraintAutomaton Nov 29, 2022
39144c7
Expression transform into equation.
constraintAutomaton Nov 30, 2022
836574a
Unit test started for SolutionRange and unit test started for Solutio…
constraintAutomaton Dec 5, 2022
deaa784
inverse function for SolutionRange created and tested
constraintAutomaton Dec 5, 2022
8e6abf8
not operation for SolutionDomain implemented and tested
constraintAutomaton Dec 5, 2022
ddc7a3f
SolutionDomain and solution range moved to there own file
constraintAutomaton Dec 6, 2022
bac52ea
add method and addWithAndOperator implemented and tested
constraintAutomaton Dec 6, 2022
c20072d
LinkOperator implemented in a separated file and tested
constraintAutomaton Dec 6, 2022
7a286ef
WIP: resolve domain function started.
constraintAutomaton Dec 6, 2022
29975b0
resolveEquationSystem created but not tested.
constraintAutomaton Dec 8, 2022
0e5ccca
test for filterOperatorToRelationOperator
constraintAutomaton Dec 8, 2022
897e880
unit test function filterOperatorToRelationOperator, isSparqlOperandN…
constraintAutomaton Dec 8, 2022
5c31d94
small formating
constraintAutomaton Dec 9, 2022
d1143de
Fix outdated importing.
constraintAutomaton Dec 9, 2022
aa69545
unit test convertTreeRelationToSolverExpression
constraintAutomaton Dec 13, 2022
e96a761
unit test resolveEquation
constraintAutomaton Dec 13, 2022
b000790
unit test for createEquationSystem
constraintAutomaton Dec 13, 2022
53679c5
Unit test resolveAFilterTerm.
constraintAutomaton Dec 14, 2022
bde73fb
unit test recursifFilterExpressionToSolverExpression
constraintAutomaton Dec 15, 2022
70ae60f
unit test for isRelationFilterExpressionDomainEmpty done.
constraintAutomaton Dec 15, 2022
3f8f172
management of variable consistency
constraintAutomaton Dec 15, 2022
d4c9c0f
solverInterface documented.
constraintAutomaton Dec 16, 2022
62b7c64
solver documented.
constraintAutomaton Dec 16, 2022
602b0a4
fix test function name.
constraintAutomaton Dec 16, 2022
3866cca
unit test for clone method of class SolutionDomain.
constraintAutomaton Dec 17, 2022
1d34c2b
Documentation of SolutionRange and modification of getIntersection wh…
constraintAutomaton Dec 19, 2022
87c1c63
Documentation for SolutionDomain done.
constraintAutomaton Dec 19, 2022
514aad4
Documentation of LinkOperator.
constraintAutomaton Dec 19, 2022
60e0d66
Made previous test work with the solver.
constraintAutomaton Jan 16, 2023
a05784b
small formating
constraintAutomaton Jan 16, 2023
fad999d
Fix the test of the actor link tree traversal actor related to the fi…
constraintAutomaton Jan 16, 2023
6bf424e
lint-fix
constraintAutomaton Jan 16, 2023
dd0152f
Full coverage of unit test restore.
constraintAutomaton Jan 17, 2023
469b6a1
Merge pull request #2 from constraintAutomaton/feature/filter-custom-…
constraintAutomaton Jan 17, 2023
43f5b4a
Comment added to lastOperation.
constraintAutomaton Jan 17, 2023
ebce330
Small typo corrected.
constraintAutomaton Jan 18, 2023
cf5aa53
Small typo corrected.
constraintAutomaton Jan 18, 2023
211fddc
Array.at method delete from the codebase because it is not implemente…
constraintAutomaton Jan 18, 2023
ad175bc
unicorn/prefer-at lint rule disable as it is not compatible with node…
constraintAutomaton Jan 18, 2023
df83685
Add edge-case unit tests on isRelationFilterExpressionDomainEmpty
rubensworks Jan 24, 2023
b043adb
Comment added to the first operation more clear.
constraintAutomaton Feb 2, 2023
6656bea
Recursive solution started.
constraintAutomaton Feb 9, 2023
b014447
Merge branch 'comunica:master' into recursif_resolve
constraintAutomaton Feb 12, 2023
ce5cb44
Ulp added to manage value just above and just below.
constraintAutomaton Feb 15, 2023
9a5b1e6
Recursive solving work with the current test.
constraintAutomaton Feb 16, 2023
96f6d04
lint-fix
constraintAutomaton Feb 16, 2023
b8c008a
merge conflict resolved
constraintAutomaton Feb 16, 2023
db4195e
new system of error added for the traversal of the filter expression.
constraintAutomaton Feb 20, 2023
b198356
lint fix
constraintAutomaton Feb 20, 2023
a6e99d2
Test coverage made to 100%.
constraintAutomaton Feb 22, 2023
ac9a9dd
lint-fix
constraintAutomaton Feb 22, 2023
2fb851b
merge fix
constraintAutomaton Feb 22, 2023
35c52a3
merge fix
constraintAutomaton Feb 22, 2023
231e9ed
merge recursif method
constraintAutomaton Feb 22, 2023
231be46
Starting to implement the concept of empty range.
constraintAutomaton Feb 23, 2023
5ca446e
merge
constraintAutomaton Mar 6, 2023
c43dd16
made stable for the benchmark
constraintAutomaton Mar 6, 2023
bcca78a
merge
constraintAutomaton Mar 9, 2023
ef40471
modification of the argument interface of the TREE actor.
constraintAutomaton Mar 9, 2023
01fe014
Fix the problem that the solver was could not be disable.
constraintAutomaton Mar 13, 2023
158e429
config file for the tree traversal edited, one for no solver created …
constraintAutomaton Mar 29, 2023
3b57e3c
edge test cases passed.
constraintAutomaton Apr 3, 2023
96f5245
Test coverage back to 100% and name given to edge cases test.
constraintAutomaton Apr 3, 2023
2583d9f
lint-fix
constraintAutomaton Apr 3, 2023
d69c9f7
useless file deleted
constraintAutomaton Apr 3, 2023
ee593af
Test function replace by a function that get the filter expression to…
constraintAutomaton Apr 3, 2023
6571d26
Simplification of the procedure to check undefined values
constraintAutomaton Apr 3, 2023
8519fee
LinkOperator doesn't rely on a global value to have an unique ID.
constraintAutomaton Apr 3, 2023
65693a5
isRelationFilterExpressionDomainEmpty renamed for isBooleanExpression…
constraintAutomaton Apr 3, 2023
4517139
merge
constraintAutomaton Apr 3, 2023
4f53a5b
Fix faulty depencies version inside the tree actor.
constraintAutomaton Apr 3, 2023
f754ab0
Readme updated.
constraintAutomaton Apr 3, 2023
6e4d3a2
made the readme multi line.
constraintAutomaton Apr 3, 2023
34a9b86
A better description of the actor.
constraintAutomaton Apr 3, 2023
47d3937
TREE metadata move with the tree actor.
constraintAutomaton Apr 3, 2023
9a95a6c
merge
constraintAutomaton Apr 3, 2023
9110beb
documentation added of the method of FilterNode.
constraintAutomaton Apr 3, 2023
d1ed29b
faulty depency fix.
constraintAutomaton Apr 3, 2023
938bec9
faulty depency fix.
constraintAutomaton Apr 3, 2023
dcdf16b
lint-fix
constraintAutomaton Apr 3, 2023
33838ab
typo in comment fix
constraintAutomaton Apr 3, 2023
d8102db
handling of a domain with only an empty range.
constraintAutomaton Apr 3, 2023
75efc78
Made so that we have an empty domain if the only element inside it is…
constraintAutomaton Apr 5, 2023
e258ba9
Typo in comment fix.
constraintAutomaton Apr 5, 2023
7c9e31e
function renamed.
constraintAutomaton Apr 5, 2023
951929a
Config file renamed.
constraintAutomaton Apr 5, 2023
dbafb29
SolutionRange renamed to SolutionInterval, undefined argument in the …
constraintAutomaton Apr 14, 2023
a7e02d5
made domain fully immutable and get_domain function renamed for getDo…
constraintAutomaton Apr 14, 2023
6515831
made the last operator field of SolutionDomain frozen.
constraintAutomaton Apr 14, 2023
c7339ea
LinkOperator class removed.
constraintAutomaton Apr 14, 2023
b4a4357
Util library used to find the relevant variables.
constraintAutomaton Apr 14, 2023
c381791
lint-fix
constraintAutomaton Apr 14, 2023
d4b70cd
made operator independent classes
constraintAutomaton May 5, 2023
501e6a0
Unit test logic operator done.
constraintAutomaton May 24, 2023
ce28445
Solution domain test done.
constraintAutomaton May 24, 2023
7b26e5c
unit test resolve working until the not operator.
constraintAutomaton May 25, 2023
447a3a3
useless comment deleted
constraintAutomaton May 25, 2023
3586f48
unable to simply rewrite the query, will try merging domain approach
constraintAutomaton May 25, 2023
8964299
equal operator and multiple apply And.
constraintAutomaton May 26, 2023
267d963
filter rewritting function implemented
constraintAutomaton May 26, 2023
db9c270
fix test with not operator
constraintAutomaton May 27, 2023
8f12850
fix current unit tests.
constraintAutomaton May 27, 2023
73e2e36
fix simple double equal assertion
constraintAutomaton May 27, 2023
d95bb50
unit test added for multiple negation
constraintAutomaton May 27, 2023
83e6858
lint-fix partial
constraintAutomaton May 30, 2023
825e308
logic operator unit test done
constraintAutomaton May 30, 2023
c71c710
test coverage back to 100%
constraintAutomaton May 30, 2023
ff09f06
lint-fix
constraintAutomaton May 30, 2023
64cc5c2
useless variable deleted
constraintAutomaton May 30, 2023
55405c0
Decoupling solver.
constraintAutomaton May 30, 2023
9f5f803
method to resolved refactored
constraintAutomaton May 31, 2023
ed1138b
FilterNode class deleted and transformed into a module of function
constraintAutomaton May 31, 2023
54e6750
test case brought back to 100%
constraintAutomaton Jun 1, 2023
5cf18d0
merge with master
constraintAutomaton Jun 1, 2023
4f7ac2d
metadata extractor moved inside the extrac link class and reachabilit…
constraintAutomaton Jun 1, 2023
7660b83
An interface for the buildRelationElement method has been created.
constraintAutomaton Jun 1, 2023
36d957b
Apply suggestions from code review
constraintAutomaton Jun 1, 2023
7af9161
readme updated
constraintAutomaton Jun 1, 2023
baf4b04
typo corrected
constraintAutomaton Jun 1, 2023
efbfc66
merge
constraintAutomaton Jun 1, 2023
a33c7dc
typo corrected.
constraintAutomaton Jun 2, 2023
334585d
typo corrected and lint-fix.
constraintAutomaton Jun 2, 2023
87be5e7
eslint in relation to node 14 deleted as it not supported anymore.
constraintAutomaton Jun 2, 2023
a954b2a
Useless dependencies deleted.
constraintAutomaton Jun 3, 2023
d0bfcf3
merge
constraintAutomaton Jun 6, 2023
356ad75
index.mjs ignored
constraintAutomaton Jun 6, 2023
89cb40b
first
constraintAutomaton Jun 12, 2023
330e79c
more generalized and multi interval function
constraintAutomaton Jun 12, 2023
f3b2084
lint-fix
constraintAutomaton Jun 12, 2023
3444a87
more unit test for and operator
constraintAutomaton Jun 12, 2023
04106bb
fix the unit tests
constraintAutomaton Jun 12, 2023
5b7cf45
fix bug when multiple and operation the array sorting modify frozen v…
constraintAutomaton Jul 7, 2023
c5309e7
fix problem where the and statement had as an input an array of inter…
constraintAutomaton Jul 7, 2023
933a5df
Merge branch 'comunica:master' into feature/time-filtering-tree-sparq…
constraintAutomaton Aug 1, 2023
c3be27b
merge
constraintAutomaton Aug 11, 2023
1 change: 1 addition & 0 deletions .eslintrc.js
@@ -75,6 +75,7 @@ module.exports = {
'unicorn/consistent-destructuring': 'off',
'unicorn/no-array-callback-reference': 'off',
'unicorn/no-new-array': 'off',
'unicorn/prefer-at': 'off', // not compatible with node v14

// TS
'@typescript-eslint/lines-between-class-members': ['error', { exceptAfterSingleLine: true }],
@@ -0,0 +1,17 @@
{
"@context": [
"https://linkedsoftwaredependencies.org/bundles/npm/@comunica/runner/^2.0.0/components/context.jsonld",

"https://linkedsoftwaredependencies.org/bundles/npm/@comunica/actor-extract-links-tree/^0.0.0/components/context.jsonld"
],
"@id": "urn:comunica:default:Runner",
"@type": "Runner",
"actors": [
{
"@id": "urn:comunica:default:extract-links/actors#extract-links-tree",
"@type": "ActorExtractLinksTree",
"reachabilityCriterionUseSPARQLFilter": false
}
]
}

@@ -9,7 +9,8 @@
"actors": [
{
"@id": "urn:comunica:default:extract-links/actors#extract-links-tree",
"@type": "ActorExtractLinksTree"
"@type": "ActorExtractLinksTree",
"reachabilityCriterionUseSPARQLFilter": true
}
]
}
14 changes: 11 additions & 3 deletions packages/actor-extract-links-extract-tree/README.md
@@ -2,7 +2,13 @@

[![npm version](https://badge.fury.io/js/%40comunica%2Factor-extract-links-tree.svg)](https://www.npmjs.com/package/@comunica/actor-extract-links-tree)

A comunica [Extract Links](https://github.com/comunica/comunica-feature-link-traversal/tree/master/packages/bus-extract-links) [TREE](https://treecg.github.io/specification/) Actor.
A comunica [Extract Links Actor](https://github.com/comunica/comunica-feature-link-traversal/tree/master/packages/bus-extract-links) for the [TREE](https://treecg.github.io/specification/) specification.

The [Guided Linked Traversal Query Processing](https://arxiv.org/abs/2005.02239)
option can be enabled using the `reachabilityCriterionUseSPARQLFilter` flag. When it is enabled, the traversal algorithm considers the solvability of the query filter expression
combined with the [`tree:relation`](https://treecg.github.io/specification/#Relation) of each data source encountered, and prunes links that cannot satisfy the filter.
A more thorough explanation is available in the poster article
["How TREE hypermedia can speed up Link Traversal-based Query Processing for SPARQL queries with filters"](https://constraintautomaton.github.io/How-TREE-hypermedia-can-speed-up-Link-Traversal-based-Query-Processing-queries/).

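The solvability check described above can be pictured with a small interval model. The following is only a sketch and not the solver shipped in this pull request: the `Interval` type, the `toInterval` and `shouldFollow` helpers, and the epsilon used for strict bounds are all assumptions made for illustration, restricted to numeric comparisons on a single variable.

```typescript
// Hypothetical, simplified model: a closed numeric interval [lower, upper].
interface Interval {
  lower: number;
  upper: number;
}

// Hypothetical subset of comparison operators, shared by filters and relations.
type Comparison = '<' | '<=' | '>' | '>=' | '=';

// Convert "variable <operator> value" into the interval of values satisfying it.
// Strict bounds are approximated by nudging the bound by a small epsilon,
// which is an assumption of this sketch.
function toInterval(operator: Comparison, value: number, epsilon = 1e-9): Interval {
  switch (operator) {
    case '<':
      return { lower: Number.NEGATIVE_INFINITY, upper: value - epsilon };
    case '<=':
      return { lower: Number.NEGATIVE_INFINITY, upper: value };
    case '>':
      return { lower: value + epsilon, upper: Number.POSITIVE_INFINITY };
    case '>=':
      return { lower: value, upper: Number.POSITIVE_INFINITY };
    case '=':
      return { lower: value, upper: value };
    default:
      throw new Error(`Unsupported operator: ${String(operator)}`);
  }
}

// Two closed intervals overlap when the largest lower bound
// does not exceed the smallest upper bound.
function intersects(a: Interval, b: Interval): boolean {
  return Math.max(a.lower, b.lower) <= Math.min(a.upper, b.upper);
}

// A link is worth following only if some value can satisfy
// both the query filter and the relation of the linked page.
function shouldFollow(filter: Interval, relation: Interval): boolean {
  return intersects(filter, relation);
}
```

Under this model, a query filter `?x < 5` combined with a relation stating that the linked page only holds values greater than 10 yields an empty intersection, so the link would be pruned.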
This module is part of the [Comunica framework](https://github.com/comunica/comunica),
and should only be used by [developers that want to build their own query engine](https://comunica.dev/docs/modify/).
@@ -22,13 +28,15 @@ After installing, this package can be added to your engine's configuration as fo
{
"@context": [
...
"https://linkedsoftwaredependencies.org/bundles/npm/@comunica/actor-extract-links-tree/^2.0.0/components/context.jsonld"
"https://linkedsoftwaredependencies.org/bundles/npm/@comunica/actor-extract-links-tree/^0.0.1/components/context.jsonld"
],
"actors": [
...
{
"@id": "urn:comunica:default:extract-links/actors#extract-links-tree",
"@type": "ActorExtractLinksTree"
"@type": "ActorExtractLinksTree",
"reachabilityCriterionUseSPARQLFilter": true

}
]
}
141 changes: 103 additions & 38 deletions packages/actor-extract-links-extract-tree/lib/ActorExtractLinksTree.ts
@@ -1,84 +1,149 @@
import type {
IActionExtractLinks,
IActorExtractLinksOutput, IActorExtractLinksArgs,
IActorExtractLinksOutput,
} from '@comunica/bus-extract-links';
import { ActorExtractLinks } from '@comunica/bus-extract-links';
import type { ILink } from '@comunica/bus-rdf-resolve-hypermedia-links';
import type { IActorTest } from '@comunica/core';
import { DataFactory } from 'rdf-data-factory';
import type * as RDF from 'rdf-js';

const DF = new DataFactory<RDF.BaseQuad>();
import type { IActorArgs, IActorTest } from '@comunica/core';
import type { IActionContext } from '@comunica/types';
import type * as RDF from 'rdf-js';
import { termToString } from 'rdf-string';
import { FilterNode } from './FilterNode';
import { TreeNodes } from './TreeMetadata';
import type { ITreeRelationRaw, ITreeRelation, ITreeNode } from './TreeMetadata';
import { buildRelationElement, materializeTreeRelation, addRelationDescription } from './treeMetadataExtraction';

/**
* A comunica Extract Links Tree Extract Links Actor.
*/
export class ActorExtractLinksTree extends ActorExtractLinks {
public static readonly aNodeType = DF.namedNode('https://w3id.org/tree#node');
public static readonly aRelation = DF.namedNode('https://w3id.org/tree#relation');
private static readonly rdfTypeNode = DF.namedNode('http://www.w3.org/1999/02/22-rdf-syntax-ns#type');

public constructor(args: IActorExtractLinksArgs) {
private readonly reachabilityCriterionUseSPARQLFilter: boolean = true;
public constructor(args: IActorExtractLinksTreeArgs) {
super(args);
this.reachabilityCriterionUseSPARQLFilter = args.reachabilityCriterionUseSPARQLFilter === undefined ?
true :
args.reachabilityCriterionUseSPARQLFilter;
}

public async test(action: IActionExtractLinks): Promise<IActorTest> {
return true;
}

public isUsingReachabilitySPARQLFilter(): boolean {
return this.reachabilityCriterionUseSPARQLFilter;
}

public async run(action: IActionExtractLinks): Promise<IActorExtractLinksOutput> {
return new Promise((resolve, reject) => {
const metadata = action.metadata;
const currentNodeUrl = action.url;
const pageRelationNodes: Set<string> = new Set();
const currentPageUrl = action.url;
// Identifiers of the relationships defined by the TREE document, represented as stringified RDF terms.
const relationIdentifiers: Set<string> = new Set();
// Maps relationship identifiers to their description.
// At this point, there's no guarantee yet that these relationships are linked to the current TREE document.
const relationDescriptions: Map<string, ITreeRelationRaw> = new Map();
const relations: ITreeRelation[] = [];
// An array of pairs of relationship identifiers and next page link to another TREE document,
// represented as stringified RDF terms.
const nodeLinks: [string, string][] = [];
const links: ILink[] = [];

// Forward errors
metadata.on('error', reject);

// Invoke callback on each metadata quad
// Collect information about relationships spread over quads, so that we can accumulate them afterwards.
metadata.on('data', (quad: RDF.Quad) =>
this.getTreeQuadsRawRelations(quad,
currentNodeUrl,
pageRelationNodes,
nodeLinks));
this.interpretQuad(quad,
currentPageUrl,
relationIdentifiers,
nodeLinks,
relationDescriptions));

// Resolve to discovered links
metadata.on('end', () => {
// Validate if the node forward have the current node as implicit subject
for (const [ nodeValue, link ] of nodeLinks) {
if (pageRelationNodes.has(nodeValue)) {
links.push({ url: link });
// Accumulate collected relationship information.
metadata.on('end', async() => {
// Validate if the potential relation node are linked with the current page
Suggested change
// Validate if the potential relation node are linked with the current page
// Add the relationships that apply to the current page.

// and add the relation description if it is connected.
Suggested change
// and add the relation description if it is connected.

for (const [ identifier, link ] of nodeLinks) {
// Check if the identifier is the object of a relation of the current page
if (relationIdentifiers.has(identifier)) {
const relationDescription = relationDescriptions.get(identifier);
// Add the relation to the relation array
relations.push(materializeTreeRelation(relationDescription || {}, link));
}
}
resolve({ links });

// Create a ITreeNode object
const node: ITreeNode = { relation: relations, identifier: currentPageUrl };
let acceptedRelation = relations;
if (this.reachabilityCriterionUseSPARQLFilter) {
// Filter the relation based on the query
const filters = await this.applyFilter(node, action.context);
acceptedRelation = this.handleFilter(filters, acceptedRelation);
}
resolve({ links: acceptedRelation.map(el => ({ url: el.node })) });
});
});
}

/**
* @param {ITreeNode} node - TREE metadata
* @param {IActionContext} context - context of the action; containing the query
* @returns {Promise<Map<string, boolean>>} a map containing the filter
*/
public async applyFilter(node: ITreeNode, context: IActionContext): Promise<Map<string, boolean>> {
return await new FilterNode().run(node, context);
}

/**
* @param { Map<string, boolean>} filters
* @param {ITreeRelation[]} acceptedRelation - the current accepted relation
* @returns {ITreeRelation[]} the relations that remain after the nodes have been filtered
*/
private handleFilter(filters: Map<string, boolean>, acceptedRelation: ITreeRelation[]): ITreeRelation[] {
return filters.size > 0 ?
acceptedRelation.filter(relation => filters?.get(relation.node)) :
acceptedRelation;
}

/**
* A helper function to find all the relations of a TREE document and the possible next nodes to visit.
* The next nodes are not guaranteed to have as subject the URL of the current page,
* so filtering is necessary afterward.
* @param quad the current quad.
* @param url url of the page
* @param pageRelationNodes the url of the relation node of the page that have as subject the URL of the page
* @param nodeLinks the url of the next potential page that has to be visited,
* regardless if the implicit subject is the node of the page
* @param {RDF.Quad} quad - The current quad.
* @param {string} currentPageUrl - The url of the page.
* @param {Set<string>} relationIdentifiers - Identifiers of the relationships defined by the TREE document,
* represented as stringified RDF terms.
* @param {[string, string][]} nodeLinks - An array of pairs of relationship identifiers and next page link to another
* TREE document, represented as stringified RDF terms.
* @param {Map<string, ITreeRelationRaw>} relationDescriptions - Maps relationship identifiers to their description.
*/
private getTreeQuadsRawRelations(
private interpretQuad(
quad: RDF.Quad,
url: string,
pageRelationNodes: Set<string>,
currentPageUrl: string,
relationIdentifiers: Set<string>,
nodeLinks: [string, string][],
relationDescriptions: Map<string, ITreeRelationRaw>,
): void {
// If it's a relation of the current node
if (quad.subject.value === url && quad.predicate.equals(ActorExtractLinksTree.aRelation)) {
pageRelationNodes.add(quad.object.value);
if (quad.subject.value === currentPageUrl && quad.predicate.value === TreeNodes.Relation) {
relationIdentifiers.add(termToString(quad.object));
// If it's a node forward
} else if (quad.predicate.equals(ActorExtractLinksTree.aNodeType)) {
nodeLinks.push([ quad.subject.value, quad.object.value ]);
} else if (quad.predicate.value === TreeNodes.Node) {
nodeLinks.push([ termToString(quad.subject), quad.object.value ]);
}
const descriptionElement = buildRelationElement(quad);
if (descriptionElement) {
const [ value, key ] = descriptionElement;
addRelationDescription(relationDescriptions, quad, value, key);
}
}
}

export interface IActorExtractLinksTreeArgs
extends IActorArgs<IActionExtractLinks, IActorTest, IActorExtractLinksOutput> {
/**
* If true (default), then we use a reachability criterion that prunes links that don't respect the
* SPARQL filter.
* @default {true}
*/
reachabilityCriterionUseSPARQLFilter: boolean;
}
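
The `run` method above works in two phases: it first collects relation identifiers and candidate node links while quads stream in, then joins them once the stream ends and optionally prunes the result. A minimal sketch of that flow follows. It assumes a hypothetical `SimpleQuad` type with plain strings instead of RDF/JS terms, and a hypothetical `extractLinks` function that is not part of the actor; the two TREE IRIs are the ones used in the original code.

```typescript
// Hypothetical simplified quad: plain strings instead of RDF/JS terms.
interface SimpleQuad {
  subject: string;
  predicate: string;
  object: string;
}

const TREE_RELATION = 'https://w3id.org/tree#relation';
const TREE_NODE = 'https://w3id.org/tree#node';

function extractLinks(
  quads: SimpleQuad[],
  pageUrl: string,
  filters: Map<string, boolean> = new Map(),
): { url: string }[] {
  // Phase 1: collect. Quads may arrive in any order, so nothing is joined yet.
  const relationIdentifiers = new Set<string>();
  const nodeLinks: [string, string][] = [];
  for (const quad of quads) {
    if (quad.subject === pageUrl && quad.predicate === TREE_RELATION) {
      relationIdentifiers.add(quad.object);
    } else if (quad.predicate === TREE_NODE) {
      nodeLinks.push([ quad.subject, quad.object ]);
    }
  }

  // Phase 2: join. Keep only the links whose relation belongs to the current page.
  const targets = nodeLinks
    .filter(([ identifier ]) => relationIdentifiers.has(identifier))
    .map(([ , link ]) => link);

  // Optional pruning, mirroring handleFilter: an empty map accepts everything,
  // otherwise only targets explicitly mapped to true are kept.
  const accepted = filters.size > 0 ?
    targets.filter(url => filters.get(url) === true) :
    targets;

  return accepted.map(url => ({ url }));
}
```

One consequence of this design, which also seems to hold for `handleFilter` as written, is that once the filter map is non-empty, a relation that is missing from the map is dropped rather than followed.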
139 changes: 139 additions & 0 deletions packages/actor-extract-links-extract-tree/lib/FilterNode.ts
@@ -0,0 +1,139 @@
import { BindingsFactory } from '@comunica/bindings-factory';
import { KeysInitQuery } from '@comunica/context-entries';
import type { IActionContext } from '@comunica/types';
import { Algebra, Factory as AlgebraFactory, Util } from 'sparqlalgebrajs';
import { isBooleanExpressionTreeRelationFilterSolvable } from './solver';
import type { Variable } from './solverInterfaces';
import type { ITreeRelation, ITreeNode } from './TreeMetadata';

const AF = new AlgebraFactory();
const BF = new BindingsFactory();

/**
* A class to apply [SPAQL filters](https://www.w3.org/TR/sparql11-query/#evaluation)
Suggested change
* A class to apply [SPAQL filters](https://www.w3.org/TR/sparql11-query/#evaluation)
* A class to apply [SPARQL filters](https://www.w3.org/TR/sparql11-query/#evaluation)

* to the [TREE specification](https://treecg.github.io/specification/).
* It uses [sparqlee](https://github.com/comunica/sparqlee) to evaluate the filter where
* the binding are remplace by the [value of TREE relation](https://treecg.github.io/specification/#traversing).
Suggested change
* the binding are remplace by the [value of TREE relation](https://treecg.github.io/specification/#traversing).
* the binding are replaced by the [value of TREE relation](https://treecg.github.io/specification/#traversing).

*/
export class FilterNode {
/**
* Return the filter expression if the TREE node has relations
* @param {ITreeNode} node - The current TREE node
* @param {IActionContext} context - The context
* @returns {Algebra.Expression | undefined} The filter expression or undefined if the TREE node has no relations
*/
public getFilterExpressionIfTreeNodeHasConstraint(node: ITreeNode,
context: IActionContext): Algebra.Expression | undefined {
if (!node.relation) {
return undefined;
}

if (node.relation.length === 0) {
return undefined;
}

const query: Algebra.Operation = context.get(KeysInitQuery.query)!;
const filterExpression = FilterNode.findNode(query, Algebra.types.FILTER);
if (!filterExpression) {
return undefined;
}

return filterExpression.expression;
}

/**
* Analyze whether the tree:relation(s) of a tree:Node should be followed and return a map
* keyed by the URL to follow: if the value is true then the link must be followed,
* if it is false then it should be ignored.
* @param {ITreeNode} node - The current TREE node
* @param {IActionContext} context - The context
* @returns {Promise<Map<string, boolean>>} A map indicating whether each tree:relation should be followed
*/
public async run(node: ITreeNode, context: IActionContext): Promise<Map<string, boolean>> {
const filterMap: Map<string, boolean> = new Map();

const filterOperation: Algebra.Expression | undefined =
this.getFilterExpressionIfTreeNodeHasConstraint(node, context);

if (!filterOperation) {
return new Map();
}

// Extract the bgp of the query.
const queryBody: Algebra.Operation = context.get(KeysInitQuery.query)!;

// Capture the relation from the function argument.
const relations: ITreeRelation[] = node.relation!;

for (const relation of relations) {
// Accept the relation if the relation doesn't specify a condition.
if (!relation.path || !relation.value) {
filterMap.set(relation.node, true);
continue;
}
// Find the quads from the BGP that are related to the TREE relation.
const variables = FilterNode.findRelevantVariableFromBgp(queryBody, relation.path);

// Accept the relation if no variables are linked with the relation.
if (variables.length === 0) {
filterMap.set(relation.node, true);
continue;
}
let filtered = false;
// Check whether at least one of the variables has a possible solution.
for (const variable of variables) {
filtered = filtered || isBooleanExpressionTreeRelationFilterSolvable(
{ relation, filterExpression: filterOperation, variable },
);
}

filterMap.set(relation.node, filtered);
}
return filterMap;
}

/**
* Find the variables from the BGP that match the predicate defined by the TREE:path from a TREE relation.
* The subject can be anything.
* @param {Algebra.Operation} queryBody - the body of the query
* @param {string} path - TREE path
* @returns {Variable[]} the variables of the Quad objects that contain the TREE path as predicate
*/
private static findRelevantVariableFromBgp(queryBody: Algebra.Operation, path: string): Variable[] {
const resp: Variable[] = [];
const addVariable = (quad: any): boolean => {
if (quad.predicate.value === path && quad.object.termType === 'Variable') {
resp.push(quad.object.value);
}
return true;
};

Util.recurseOperation(queryBody, {
[Algebra.types.PATH]: addVariable,
[Algebra.types.PATTERN]: addVariable,

});
return resp;
}

/**
* Find the first node of type `nodeType`; if it doesn't exist,
* return undefined.
* @param {Algebra.Operation} query - the original query
* @param {string} nodeType - the type of node requested
* @returns {any}
*/
private static findNode(query: Algebra.Operation, nodeType: string): any {
let currentNode = query;
do {
if (currentNode.type === nodeType) {
return currentNode;
}
if ('input' in currentNode) {
currentNode = currentNode.input;
}
} while ('input' in currentNode);
return undefined;
}
}
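
The decision procedure in `run` can be summarized independently of the solver. The sketch below is an illustration under assumptions, not the code of this pull request: `SimpleRelation`, `SimplePattern`, `findVariables`, and `buildFilterMap` are hypothetical names, triple patterns are reduced to plain objects, and the solver is injected as a predicate so that any implementation can be plugged in.

```typescript
// Hypothetical minimal shape of a TREE relation.
interface SimpleRelation {
  node: string;
  path?: string;
  value?: string;
}

// Hypothetical minimal shape of a triple pattern from the query.
interface SimplePattern {
  predicate: string;
  object: { termType: string; value: string };
}

// Collect the variables in object position of the patterns whose predicate is the TREE path.
function findVariables(patterns: SimplePattern[], path: string): string[] {
  return patterns
    .filter(pattern => pattern.predicate === path && pattern.object.termType === 'Variable')
    .map(pattern => pattern.object.value);
}

function buildFilterMap(
  relations: SimpleRelation[],
  patterns: SimplePattern[],
  // Injected solver: returns true when the filter can still be satisfied
  // for this relation and variable.
  isSolvable: (relation: SimpleRelation, variable: string) => boolean,
): Map<string, boolean> {
  const filterMap = new Map<string, boolean>();
  for (const relation of relations) {
    // A relation without a condition cannot be ruled out.
    if (!relation.path || !relation.value) {
      filterMap.set(relation.node, true);
      continue;
    }
    const variables = findVariables(patterns, relation.path);
    // A relation unrelated to the query variables cannot be ruled out either.
    if (variables.length === 0) {
      filterMap.set(relation.node, true);
      continue;
    }
    // Follow the link if at least one variable admits a solution.
    filterMap.set(relation.node, variables.some(variable => isSolvable(relation, variable)));
  }
  return filterMap;
}
```

Using `some` expresses the same "any variable is enough" rule as the accumulated `filtered = filtered || ...` loop, and it stops calling the solver as soon as one variable succeeds.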

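As written, the `do ... while` loop in `findNode` appears to stop before testing an operation that has no `input`, which is harmless when searching for a filter (a filter always has an input) but would miss a leaf such as a BGP. The following variant is a sketch under assumptions: the `Operation` interface is a hypothetical minimal shape rather than the real `sparqlalgebrajs` type, and `findFirstNodeOfType` is a name introduced here for illustration.

```typescript
// Hypothetical minimal operation shape; real algebra operations carry more fields.
interface Operation {
  type: string;
  input?: Operation | Operation[];
}

function findFirstNodeOfType(query: Operation, nodeType: string): Operation | undefined {
  let current: Operation | undefined = query;
  while (current !== undefined) {
    // Test every visited operation, including the last one in the chain.
    if (current.type === nodeType) {
      return current;
    }
    // Only descend through single-input operations;
    // stop when there is no input or when the input is a list of operations.
    const next: Operation | Operation[] | undefined = current.input;
    current = next !== undefined && !Array.isArray(next) ? next : undefined;
  }
  return undefined;
}
```

Like the original, this only walks a linear chain of `input` fields, so a filter nested under a multi-input operation would still not be found; a full traversal would need a recursive visitor.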