Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: questions with synonym tags do not show under the origin tag #685

Merged
merged 5 commits into from
Dec 18, 2023

Conversation

hgaol
Copy link
Member

@hgaol hgaol commented Dec 15, 2023

fix #680 .

@fenbox fenbox requested a review from LinkinStars December 15, 2023 10:24
@LinkinStars LinkinStars self-assigned this Dec 15, 2023
@LinkinStars
Copy link
Member

@hgaol Firstly, thank you very much for your contribution.

In the page http://127.0.0.1:8080/tags/kubernetes, user can find all questions that search by synonym tag. That's right you've fixed the issue in this part.

But in the page http://127.0.0.1:8080/search?q=%5Bkubernetes%5D, user can search content via expression such as [tag]. This place still has the same issue that hasn't been fixed.

@hgaol
Copy link
Member Author

hgaol commented Dec 15, 2023

@hgaol Firstly, thank you very much for your contribution.

In the page http://127.0.0.1:8080/tags/kubernetes, user can find all questions that search by synonym tag. That's right you've fixed the issue in this part.

But in the page http://127.0.0.1:8080/search?q=%5Bkubernetes%5D, user can search content via expression such as [tag]. This place still has the same issue that hasn't been fixed.

Thanks for figuring it out! I'll fix it later.

@hgaol
Copy link
Member Author

hgaol commented Dec 15, 2023

update,

after dive into the SQL logic, I guess the original design is option1. So for the synonym scenario, I think it may need a 2D array for tags. Each dimension is for each tag and its mainTag and synonym tags. Takes 2 tags as example, the logic could be

AND tag_rel0.status = ?
AND tag_rel0.tag_id IN (?, ?)
AND tag_rel1.status = ?
AND tag_rel1.tag_id IN (?)

Hi @LinkinStars , one quick question, what's the expected logic when there're multiple tags? option 1 or 2? I think for synonym tags, it's option 2. what about if 2 different tags?

option 1: return objects with both tag1 and tag2.
option 2: return objects with either tag1 or tag2.

@kumfo
Copy link
Member

kumfo commented Dec 18, 2023

update,

after dive into the SQL logic, I guess the original design is option1. So for the synonym scenario, I think it may need a 2D array for tags. Each dimension is for each tag and its mainTag and synonym tags. Takes 2 tags as example, the logic could be

AND tag_rel0.status = ?
AND tag_rel0.tag_id IN (?, ?)
AND tag_rel1.status = ?
AND tag_rel1.tag_id IN (?)

Hi @LinkinStars , one quick question, what's the expected logic when there're multiple tags? option 1 or 2? I think for synonym tags, it's option 2. what about if 2 different tags?

option 1: return objects with both tag1 and tag2. option 2: return objects with either tag1 or tag2.

Currently, SQL supports multi tag search. If you need to support synonymous tag search, you should directly query synonymous tags when parsing the underlying tags. This way, future search plugins can also be used.

The Search Parse Code :

search parse

You can add code like ts.tagRepo.GetIDsByMainTagId after line 139.

@hgaol
Copy link
Member Author

hgaol commented Dec 18, 2023

Hi @kumfo , I've tested adding ts.tagRepo.GetIDsByMainTagId and added below lines

synIDs, err := sp.tagCommonService.GetTagIDsByMainTagID(ctx, tag.ID)
...
tagIDs = append(tagIDs, synIDs...)

but found it'll only return the objects with all these tags exist. You can find part of the SQL for search. IMO, it does not solve the issue. Take k8s and kubernates as example, when I search [k8s], the expected objects are which have either k8s or kubernates tag, right? similar scenario for searching [kubernates].

...
      `post_update_time`
    FROM
      `question`
      INNER JOIN tag_rel as tag_rel0 ON question.id = tag_rel0.object_id
      INNER JOIN tag_rel as tag_rel1 ON question.id = tag_rel1.object_id
    WHERE
      `question`.`status` < 10
      AND `question`.`show` = 1
      AND tag_rel0.status = 1
      AND tag_rel0.tag_id = tagId1
      AND tag_rel1.status = 1
      AND tag_rel1.tag_id = tagId2
...

@kumfo
Copy link
Member

kumfo commented Dec 18, 2023

scenario

Sorry! I know what the problem is, and I overlooked the previous search logic. The original tag search must include all these tags at the same time. If you want to modify it now, this search logic will probably need to be processed during the search query.

You need to change the code in file [internal/repo/search_common/search_repo.go](https://github.com/apache/incubator-answer/blob/152934c740d163c2b6345ac2e8c8f542957efa75/internal/repo/search_common/search_repo.go#L150), and you search the comments of // check tag in file. you add synonymous tags query and add or conditions to SQL.

However, if possible, you can parse the synonymous tags in search_parse.go, obtain a separate query parameter, and submit it to this search query. Then the subsequent search plugins can also be modified to be compatible with this query method. but the program will will become more complex.

@hgaol
Copy link
Member Author

hgaol commented Dec 18, 2023

Hi @kumfo , exactly! I've pushed new commit about the changed logic for easy communication. In a word, it changes the original SQL from

AND tag_rel0.status = 1
AND tag_rel0.tag_id = tagId1
AND tag_rel1.status = 1
AND tag_rel1.tag_id = tagId2

to

AND tag_rel0.status = ?
AND tag_rel0.tag_id IN (?, ?)
AND tag_rel1.status = ?
AND tag_rel1.tag_id IN (?)

As you mentioned, it changes the interface and all the search plugin need to be modified to be compatible. I'm not sure if there's any better solution. Feel free to let me know and I can implement in this PR. Thanks!

@kumfo
Copy link
Member

kumfo commented Dec 18, 2023

@hgaol Good Job! but I saw your commit that find you not change the query conditions to in query, You need to change the query conditions,

Original is :

And(builder.Eq{
    ast + ".tag_id": tagID,
    ast + ".status": entity.TagRelStatusAvailable,
})

it need to change like this:

And(builder.In(ast+".tag_id", tagID)).
And(builder.Eq{
    ast + ".status": entity.TagRelStatusAvailable,
})

you can search the code comments of // check tag, there are 3 areas that need to be modified.

@hgaol
Copy link
Member Author

hgaol commented Dec 18, 2023

Hi @kumfo , thanks for your review! I found that when there's array in builder.Eq, it'll change it into in. And I've tested it. related documents.
image
https://pkg.go.dev/xorm.io/[email protected]

but agree that your solution is more clear to indicate that it's a in instead of =.

@hgaol
Copy link
Member Author

hgaol commented Dec 18, 2023

I haven't updated for SearchQuestions and SearchAnswers yet. Just want to make sure if it's the expected solution. Do I need to update SearchQuestions and SearchAnswers as well?

@kumfo
Copy link
Member

kumfo commented Dec 18, 2023

I haven't updated for SearchQuestions and SearchAnswers yet. Just want to make sure if it's the expected solution. Do I need to update SearchQuestions and SearchAnswers as well?

Yeah, you need to update SearchQuestions and SearchAnswers as well. Because there are three ways to search, the default search includes questions and answers. If questions(use is:question) or answers (use is:answer are specified, tag searches can also be used at the same time.

@hgaol
Copy link
Member Author

hgaol commented Dec 18, 2023

hi @kumfo , updated. Since Eq{ast+."tag_id": []string{"c", "d"}} works, I didn't change it into builder.In(ast+".tag_id", tagID). And I've tested that it works well.
Feel free to let me know if you prefer builder.In(ast+".tag_id", tagID), and I can update and test it.

@kumfo
Copy link
Member

kumfo commented Dec 18, 2023

hi @kumfo , updated. Since Eq{ast+."tag_id": []string{"c", "d"}} works, I didn't change it into builder.In(ast+".tag_id", tagID). And I've tested that it works well. Feel free to let me know if you prefer builder.In(ast+".tag_id", tagID), and I can update and test it.

There is no problem in handling it this way. I personally think that using the in explicit identifier will make it clearer for people who read the code.

If you are willing to continue to make changes, I will wait for you to make changes and merge the code in. If you keep the current situation, it doesn't matter, I will also merge the code in.😊

@hgaol
Copy link
Member Author

hgaol commented Dec 18, 2023

Updated to replace with builder.In. For search plugins, I'm not quite familiar with and may take some time to ramp up. If it's not urgent, I'd like to try to update them to be compatible as well.

@kumfo kumfo changed the base branch from main to dev December 18, 2023 09:52
@kumfo kumfo merged commit 67d1af3 into apache:dev Dec 18, 2023
4 checks passed
@kumfo
Copy link
Member

kumfo commented Dec 18, 2023

Hi @hgaol

Well done!, I merged it to dev branch, your submissions will be released in version 1.2.5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

After setting the tag as a synonym, the questions do not show under the origin tag.
3 participants