[Question] Doubts About the Logical Design of edge Cache Invalidation in Graph Servers #2719

Open
1 task done
haohao0103 opened this issue Jan 7, 2025 · 10 comments
Labels
inactive question Further information is requested

Comments

@haohao0103
Contributor

Problem Type

struct / logic (architecture / logic design question)

Before submit

  • I have searched the existing issues and FAQ and confirmed that there is no identical or duplicate question.

Environment

  • Server Version: 1.0.0

Your Question

CachedGraphTransaction#commitMutation2Backend()
In commitMutation2Backend, I think the edge cache needs to be cleared only when an edge is updated, or when a vertex is deleted and its associated edges are deleted with it. Why do we also need to clear the edge cache when a vertex is added or updated?

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

dosubot bot commented Jan 7, 2025

The edge cache is cleared when a vertex is added or updated in the CachedGraphTransaction#commitMutation2Backend method because any change to a vertex can potentially affect the edges linked to it. The current implementation does not have a precise strategy to update only the affected edges, so it clears the entire edge cache to ensure consistency. This approach is a temporary solution until a more precise strategy is developed [1].
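
To make the difference between the two strategies concrete, here is a minimal sketch. It assumes a cache that maps a query key to edge IDs, plus a reverse index from vertex ID to query keys. The class SelectiveEdgeCacheSketch and all of its methods are hypothetical names for illustration and are not part of HugeGraph.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HugeGraph.
final class SelectiveEdgeCacheSketch {
    // Cached query results: query key -> IDs of the matching edges.
    private final Map<String, List<String>> edgeCache = new HashMap<>();
    // Reverse index: vertex ID -> query keys whose results involve that vertex.
    private final Map<Long, Set<String>> keysByVertex = new HashMap<>();

    void put(String queryKey, List<String> edgeIds, long... vertexIds) {
        edgeCache.put(queryKey, edgeIds);
        for (long id : vertexIds) {
            keysByVertex.computeIfAbsent(id, k -> new HashSet<>()).add(queryKey);
        }
    }

    // Coarse strategy: drop every cached result on any mutation.
    void invalidateAll() {
        edgeCache.clear();
        keysByVertex.clear();
    }

    // Precise strategy: drop only the results that involve the changed vertex.
    void invalidateVertex(long vertexId) {
        Set<String> keys = keysByVertex.remove(vertexId);
        if (keys == null) {
            return;
        }
        for (String key : keys) {
            // Other vertices may still list this key; removing it again later is a no-op.
            edgeCache.remove(key);
        }
    }

    int size() {
        return edgeCache.size();
    }

    // Two cached results that share no vertices.
    static SelectiveEdgeCacheSketch warmCache() {
        SelectiveEdgeCacheSketch cache = new SelectiveEdgeCacheSketch();
        cache.put("edges-of-1", List.of("1>knows>2"), 1L, 2L);
        cache.put("edges-of-3", List.of("3>knows>4"), 3L, 4L);
        return cache;
    }

    public static void main(String[] args) {
        SelectiveEdgeCacheSketch precise = warmCache();
        precise.invalidateVertex(1L);
        System.out.println("entries left after precise invalidation: " + precise.size());

        SelectiveEdgeCacheSketch coarse = warmCache();
        coarse.invalidateAll();
        System.out.println("entries left after coarse invalidation: " + coarse.size());
    }
}
```

The precise variant pays for its selectivity with the extra reverse index, which has to be kept in step with the cache. That bookkeeping cost is one plausible reason to start with the coarse clear.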


@dosubot dosubot bot added the question Further information is requested label Jan 7, 2025
@haohao0103
Contributor Author

@dosu I understand that the edge cache is based on the queryId, and the key (queryId) of the edge cache is constructed from the query conditions and vertex IDs. If only the attributes of a vertex are updated, it should not affect the edge cache because:

  1. The structure of the edge (source vertex, target vertex, label, etc.) has not changed.
  2. The query results of the edge do not depend on the attribute values of the vertex.
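
A minimal sketch of this argument, under the assumption that the cache is keyed by the query text and stores only edge IDs. QueryKeyedEdgeCacheSketch, its in-memory "backend" maps, and the edge ID format are all hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not how HugeGraph stores data.
final class QueryKeyedEdgeCacheSketch {
    // Stand-ins for the backend store.
    private final Map<Long, List<String>> backendEdges = new HashMap<>();
    private final Map<Long, Map<String, Object>> backendProps = new HashMap<>();
    // Edge cache: query text -> IDs of the matching edges.
    private final Map<String, List<String>> edgeCache = new HashMap<>();

    static String queryKey(long ownerVertex) {
        return "Query * from EDGE where [OWNER_VERTEX == " + ownerVertex + "]";
    }

    void addEdge(long source, long target, String label) {
        String edgeId = source + ">" + label + ">" + target;
        backendEdges.computeIfAbsent(source, k -> new ArrayList<>()).add(edgeId);
        backendEdges.computeIfAbsent(target, k -> new ArrayList<>()).add(edgeId);
        edgeCache.clear(); // a structural change can alter cached query results
    }

    void updateVertexProperty(long vertex, String key, Object value) {
        backendProps.computeIfAbsent(vertex, k -> new HashMap<>()).put(key, value);
        // Deliberately no edgeCache.clear(): no edge was added or removed.
    }

    List<String> queryBackend(long vertex) {
        return List.copyOf(backendEdges.getOrDefault(vertex, List.of()));
    }

    List<String> queryEdgesByVertex(long vertex) {
        return edgeCache.computeIfAbsent(queryKey(vertex), k -> queryBackend(vertex));
    }

    int cacheSize() {
        return edgeCache.size();
    }

    // True if the cached result still matches the backend after a property update.
    static boolean cachedResultStillFresh() {
        QueryKeyedEdgeCacheSketch graph = new QueryKeyedEdgeCacheSketch();
        graph.addEdge(1L, 2L, "knows");
        graph.queryEdgesByVertex(1L); // warm the cache
        graph.updateVertexProperty(1L, "name", "test-name");
        List<String> cached = graph.queryEdgesByVertex(1L);
        return graph.cacheSize() == 1 && cached.equals(graph.queryBackend(1L));
    }

    public static void main(String[] args) {
        System.out.println("cached result still fresh: " + cachedResultStillFresh());
    }
}
```

The conclusion holds only while the cached values carry nothing derived from vertex properties. If cached edge objects also held copies of their endpoint vertices, the same update could leave stale data in the cache.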

dosubot bot commented Jan 7, 2025

Updating the attributes of a vertex affects the edge cache because the cache invalidation mechanism is triggered by any vertex update, not just changes to the edge structure or query results. This is due to the current implementation, which lacks a precise strategy to selectively update only the affected edges. As a result, the entire edge cache is cleared to maintain consistency, even if the edge structure and query results remain unchanged [1].


@haohao0103
Contributor Author

@dosu You said "the cache invalidation mechanism is triggered by any vertex update". Can you provide a concrete example to explain why updating vertex attributes should trigger edge cache invalidation? What problems might occur if the edge cache is not invalidated?

dosubot bot commented Jan 7, 2025

Updating vertex attributes should trigger edge cache invalidation because edges often depend on the properties of their connected vertices. If the edge cache is not invalidated, it can lead to inconsistencies where the cached edges do not reflect the updated state of the vertices they connect, causing incorrect query results or application logic errors.

For example, consider a test case where two vertices v1 and v2 are connected by an edge. Initially, the edge cache is populated with this edge. If a property of vertex v1 is updated (e.g., adding a property "name" with value "test-name"), the edge cache is invalidated to ensure that subsequent queries reflect the updated vertex properties.

If the edge cache were not invalidated, querying the edges connected to v1 might return outdated information, such as the old vertex properties, leading to potential errors in applications relying on this data. The test confirms that after updating a vertex property, the edge cache size is reset to zero, ensuring that the next query retrieves fresh data:

cache.addVertexProperty(new HugeVertexProperty<>(v1,
                                                 cache.graph().schema()
                                                      .getPropertyKey("name"),
                                                 "test-name"));
cache.commit();

Assert.assertEquals(0L,
                    Whitebox.invoke(cache, "edgesCache", "size"));
String name = cache.queryEdgesByVertex(IdGenerator.of(1)).next().outVertex()
                   .value("name");
Assert.assertEquals("test-name", name);

This demonstrates the necessity of edge cache invalidation upon vertex updates to maintain data consistency [1].
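
The staleness scenario described above can be reproduced in a small sketch, but only under one assumption: that each cached edge object carries a snapshot of its endpoint vertices' properties. Whether HugeGraph's cached edges actually do so is not established in this thread. StaleEdgeSnapshotSketch and everything inside it are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not HugeGraph's data model.
final class StaleEdgeSnapshotSketch {
    static final class Vertex {
        private final Map<String, Object> props;

        Vertex(Map<String, Object> props) {
            this.props = new HashMap<>(props); // snapshot taken at load time
        }

        Object value(String key) {
            return props.get(key);
        }
    }

    static final class Edge {
        final Vertex outVertex;
        final Vertex inVertex;

        Edge(Vertex outVertex, Vertex inVertex) {
            this.outVertex = outVertex;
            this.inVertex = inVertex;
        }
    }

    // Stand-in for the backend: vertex ID -> current properties.
    private final Map<Long, Map<String, Object>> backendProps = new HashMap<>();
    // Edge cache: owner vertex ID -> edges loaded for that vertex.
    private final Map<Long, List<Edge>> edgeCache = new HashMap<>();
    private final boolean clearOnVertexUpdate;

    StaleEdgeSnapshotSketch(boolean clearOnVertexUpdate) {
        this.clearOnVertexUpdate = clearOnVertexUpdate;
        backendProps.put(1L, new HashMap<>());
        backendProps.put(2L, new HashMap<>());
    }

    // The graph has a single edge 1 -> 2; loading it snapshots both endpoints.
    private Edge loadEdge() {
        return new Edge(new Vertex(backendProps.get(1L)),
                        new Vertex(backendProps.get(2L)));
    }

    List<Edge> queryEdgesByVertex(long vertexId) {
        return edgeCache.computeIfAbsent(vertexId, k -> List.of(loadEdge()));
    }

    void updateVertexProperty(long vertexId, String key, Object value) {
        backendProps.get(vertexId).put(key, value);
        if (clearOnVertexUpdate) {
            edgeCache.clear();
        }
    }

    // Returns the "name" of vertex 1 as seen through the edge after an update.
    static Object nameSeenThroughEdge(boolean clearOnVertexUpdate) {
        StaleEdgeSnapshotSketch graph = new StaleEdgeSnapshotSketch(clearOnVertexUpdate);
        graph.queryEdgesByVertex(1L); // warm the cache
        graph.updateVertexProperty(1L, "name", "test-name");
        return graph.queryEdgesByVertex(1L).get(0).outVertex.value("name");
    }

    public static void main(String[] args) {
        System.out.println("without clearing: " + nameSeenThroughEdge(false));
        System.out.println("with clearing:    " + nameSeenThroughEdge(true));
    }
}
```

Under this assumption, skipping the clear returns the old snapshot, which is the behavior the quoted test guards against. If cached edges held only IDs, the clear would not be needed for correctness.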


@haohao0103
Contributor Author

@dosu Where does this code come from? I haven't seen any logic in the project that directly caches properties.

cache.addVertexProperty(new HugeVertexProperty<>(v1,
                                                 cache.graph().schema()
                                                      .getPropertyKey("name"),
                                                 "test-name"));
cache.commit();

dosubot bot commented Jan 7, 2025

The code snippet involving cache.addVertexProperty and cache.commit comes from the CachedGraphTransactionTest class in the file hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CachedGraphTransactionTest.java. This test method demonstrates the use of cache.addVertexProperty and cache.commit to add properties to vertices and commit the changes [1].


@haohao0103
Contributor Author

@dosu In CachedGraphTransactionTest#testEdgeCacheClearWhenUpdateVertex(), the cache keys of the edge cache are:

Query * from EDGE where [OWNER_VERTEX == 2, DIRECTION == OUT OR DIRECTION == IN]=
Query * from EDGE where [OWNER_VERTEX == 2, DIRECTION == OUT OR DIRECTION == IN], Query * from EDGE where [OWNER_VERTEX == 1, DIRECTION == OUT OR DIRECTION == IN]=
Query * from EDGE where [OWNER_VERTEX == 1, DIRECTION == OUT OR DIRECTION == IN]

When we update the name property of v3, this operation does not change the edge query result, so we do not need to clear the edge cache.

dosubot bot commented Jan 7, 2025

The edge cache is cleared in CachedGraphTransactionTest#testEdgeCacheClearWhenUpdateVertex() when updating the name property of vertex v3 to ensure consistency and correctness of the cache. Although updating the name property of v3 does not directly change the edge query result, the cache mechanism is designed to invalidate the cache whenever a vertex is updated. This is a precautionary measure to ensure that any potential indirect effects on the cache are accounted for, maintaining the integrity of the cached data. The test verifies that the cache is cleared by checking the cache size before and after the update operation, confirming that the cache is reset to zero and then repopulated with the correct data [1][2].


Due to the lack of activity, the current issue is marked as stale and will be closed after 20 days. Any update will remove the stale label.
