Introduced QID search in translation. #545
Changes from 4 commits
755436d
f2ebb1d
778498d
8fad882
9efac94
f7b0569
2fdd74b
@@ -23,6 +23,7 @@
import ast
import contextlib
import requests
import json
import os
import re

@@ -736,3 +737,70 @@ def check_index_exists(index_path: Path, overwrite_all: bool = False) -> bool:
        return choice == "Skip process"

    return False

def check_qid_is_language(qid: str):
    """
    Parameters
    ----------
    qid : str
        The Wikidata QID to check. If it identifies a language, its English label is returned.

    Outputs
    -------
    str
        The English label of the Wikidata language entity.

    Raises
    ------
    ValueError
        An invalid QID that's not a language has been passed.
    """
    api_endpoint = "https://www.wikidata.org/w/rest.php/wikibase/v0"
    request_string = f"{api_endpoint}/entities/items/{qid}"

    request = requests.get(request_string, timeout=5)
    request_result = request.json()

    # P31 is the Wikidata property "instance of"; .get() avoids a KeyError when an item has none.
    if request_result["statements"].get("P31"):
        instance_of_values = request_result["statements"]["P31"]
        for val in instance_of_values:
            # Q34770 is the Wikidata item "language".
            if val["value"]["content"] == "Q34770":
                print(f"{request_result['labels']['en']} ({qid}) is a language.\n")
                return request_result["labels"]["en"]

    raise ValueError("The passed Wikidata QID is not a language.")

Review comments on the line if val["value"]["content"] == "Q34770":

🤔 Coming across this.. was actually wondering if we do have it documented somewhere what If not, should we? Was thinking that a quick markdown table could suffice really.

Or actually - scratch that. It might be more helpful perhaps if that information is closer to where it's used in code.

Does another metadata file make sense for this? That way we reference an object and get the QID from a human readable object key?

hmm could be 🤔 Were you thinking of something like this?

    # Instead of...
    # instance_of_values = request_result["statements"]["P31"]
    instance_of_property = wikidata["property"]["instance-of"]
    instance_of_values = request_result["statements"][instance_of_property]
    for val in instance_of_values:
        # Instead of...
        # if val["value"]["content"] == "Q34770":
        if val["value"]["content"] == wikidata["entity"]["language"]:

Along the lines of this, yes :)
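
The metadata idea raised in the review thread could look something like the sketch below. The WIKIDATA_IDS mapping, its key names, and the is_language helper are hypothetical and not part of this PR; only P31 (instance of) and Q34770 (language) come from the diff. The helper works on an already-decoded REST response, so it can be tried without a network call, and the sample payloads are hand-written and trimmed.

```python
# Hypothetical lookup table: human-readable keys -> Wikidata identifiers.
WIKIDATA_IDS = {
    "property": {"instance-of": "P31"},
    "entity": {"language": "Q34770"},
}


def is_language(item: dict, ids: dict = WIKIDATA_IDS) -> bool:
    """Return True if a decoded REST API item is an instance of 'language'."""
    instance_of_property = ids["property"]["instance-of"]
    language_qid = ids["entity"]["language"]

    # An item may have no 'instance of' statements at all, so default to [].
    statements = item.get("statements", {}).get(instance_of_property, [])
    # Use .get() so a statement without a "content" key is simply skipped.
    return any(
        s.get("value", {}).get("content") == language_qid for s in statements
    )


# Hand-written payloads shaped like the REST API response used in the diff.
sample_language = {
    "labels": {"en": "German"},
    "statements": {"P31": [{"value": {"type": "value", "content": "Q34770"}}]},
}
sample_other = {"labels": {"en": "Berlin"}, "statements": {}}

print(is_language(sample_language))  # True
print(is_language(sample_other))  # False
```

Keeping the identifiers in one mapping means a reader sees "instance-of" and "language" at the call site, while the raw IDs live in a single place that can be documented or moved to a metadata file later.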

def get_language_iso_code(qid: str):
    """
    Parameters
    ----------
    qid : str
        The Wikidata QID of the language whose ISO code is requested.

    Outputs
    -------
    str
        The ISO code of the language.

    Raises
    ------
    ValueError
        An invalid QID that's not a language has been passed.
    KeyError
        The ISO code for the language is not available.
    """

    api_endpoint = f"https://www.wikidata.org/w/api.php?action=wbgetentities&ids={qid}&props=claims&format=json"
    response = requests.get(api_endpoint, timeout=5)
    data = response.json()
    try:
        # P305 is the Wikidata property "IETF language tag".
        return data["entities"][qid]["claims"]["P305"][0]["mainsnak"]["datavalue"][
            "value"
        ]

    except ValueError:
        raise ValueError("The passed Wikidata QID is not a language.")
    except KeyError:
        # Raise rather than return, so callers never receive an exception object as a code.
        raise KeyError("The ISO code for the language is not available.")

Review comment on the line return data["entities"][qid]["claims"]["P305"][0]["mainsnak"]["datavalue"][:

ditto
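
Because the P305 lookup goes several levels deep, a missing claim can surface as either a KeyError or, for an empty claim list, an IndexError. A minimal sketch of handling both, assuming the parsing is split out from the HTTP request: extract_language_tag is a hypothetical helper name, and the sample payload is hand-written to mirror the wbgetentities shape used in the diff.

```python
def extract_language_tag(data: dict, qid: str) -> str:
    """Pull the first P305 value for qid out of a decoded wbgetentities response."""
    try:
        claims = data["entities"][qid]["claims"]
        return claims["P305"][0]["mainsnak"]["datavalue"]["value"]
    except (KeyError, IndexError) as err:
        # Raise (not return) so callers cannot mistake the error for a code.
        raise KeyError(f"No language code is available for {qid}.") from err


# Hand-written, trimmed payload for illustration only.
sample = {
    "entities": {
        "Q188": {
            "claims": {
                "P305": [
                    {"mainsnak": {"datavalue": {"value": "de", "type": "string"}}}
                ]
            }
        }
    }
}

print(extract_language_tag(sample, "Q188"))  # de
```

Separating the parsing step this way also lets it be unit-tested against fixed payloads without calling Wikidata.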

Had to move the check_qid_is_language method into utils.py because of an ImportError.