[BUG] My indexing job never makes any progress #154
Error: Please create a text file, add your own content to it, and index it with a new index name.
I added another new index name in the script and checked the status of the indexing job; it is still 0.0% completed. After watching the logs with the command "watch kubectl get jobs -n graphrag", "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5" always exists, and no new indexing job is ever created. How do I kill it or start up a new job?
@fangnster Did you change the index name and storage name in 1-Quickstart.ipynb?
Yes, a new index name and a new storage name have been set.
@fangnster
This job has been running for several days, and its status has stayed at 0.0% completed the whole time. How do I fix it? The screenshot is the same regardless of whether the index name and storage name are changed.
@fangnster
I deleted the previous index and storage files and started again with new file names. Watching the running indexing jobs, a job named "graphrag-index-manager-****" starts up one after another: each is killed automatically after 5 minutes and then another one starts. That is why the "reason: Conflict" error shown in the original post appears in the logs.
@fangnster
After studying these commands, I started a new indexing job with new storage and index file names, and the same "reason: Conflict" error occurred. While observing the progress of that new indexing job, I found the script "indexing-job-manage-template.yaml". Can I delay the 5-minute schedule to a longer interval, such as 15 minutes, so that the previous indexing job can finish completely? Could you tell me the reason for the 5-minute setting?
When you initiate an indexing job, a record of it is put into CosmosDB and listed in a state of "Scheduled." The k8s CronJob runs every 5 minutes, checks CosmosDB for Scheduled indexing jobs, and then initiates the actual indexing processes for them in order. It uses a k8s Job deployment to spin up an indexing pod (the indexing-job-* jobs seen in the logs).
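For readers who want to see roughly what that manager pass does, here is a minimal sketch based only on the traceback in this thread; the CosmosDB helper, the Job manifest contents, and the job-naming scheme are assumptions, not the actual code from /backend/manage-indexing-jobs.py:

from kubernetes import client, config

def get_scheduled_indexes():
    # Hypothetical stand-in: the real manager reads job records from CosmosDB
    # and returns the names of indexes whose state is "Scheduled".
    return []

def schedule_indexing_job(index_name):
    # Build a k8s Job manifest for one indexing run (fields here are illustrative)
    job_manifest = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"indexing-job-{index_name}"},  # deterministic name
        "spec": {"template": {"spec": {
            "containers": [{"name": "indexer", "image": "graphrag-backend"}],  # image name assumed
            "restartPolicy": "Never"}}},
    }
    batch_v1 = client.BatchV1Api()
    # This is the call that fails with 409 Conflict in the log above when a Job
    # with the same name already exists from a previous run.
    batch_v1.create_namespaced_job(namespace="graphrag", body=job_manifest)

def main():
    config.load_incluster_config()  # the CronJob pod runs inside the AKS cluster
    for index_name in get_scheduled_indexes():
        schedule_indexing_job(index_name)

if __name__ == "__main__":
    main()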
Could you tell me how to change the cronjob interval from 5 minutes to something longer?
You can edit the deployed cron job directly in the cluster and look for the schedule field. Note that if you want to change it permanently between deployments, you'd change it in this file and redeploy the backend container to Azure Container Registry.
In addition, I redeployed this file with the schedule changed to "*/15 * * * *", and the deployment succeeded. However, when I check with "kubectl describe cronjob", it still shows the former "*/5 * * * *" schedule.
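One possible explanation: the YAML template only determines what gets deployed next time, while the CronJob object already running in the cluster keeps its old schedule until it is updated directly, which would match kubectl describe cronjob still showing */5. As a hedged illustration (the CronJob name graphrag-index-manager and namespace graphrag are inferred from the logs in this thread, and this assumes a kubernetes Python client recent enough to expose batch/v1 CronJob on BatchV1Api), a direct patch of the live object might look like:

from kubernetes import client, config

config.load_kube_config()  # run from a machine that has credentials for the AKS cluster
batch_v1 = client.BatchV1Api()
# Strategic-merge patch that only changes the schedule of the already-deployed CronJob
batch_v1.patch_namespaced_cron_job(
    name="graphrag-index-manager",   # name inferred from the job names in the logs
    namespace="graphrag",
    body={"spec": {"schedule": "*/15 * * * *"}},
)

Editing the live object with kubectl edit cronjob graphrag-index-manager -n graphrag and changing the schedule field should have the same effect.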
I've been stuck in the Scheduled 0.0% state for a long time too, but the cronjob that manages the indexing jobs (created from indexing-job-manager-template.yaml) runs every 5 minutes, which doesn't seem to have anything to do with the indexing processing time (you said it runs/shuts down every 5 minutes, but I understand that only the run is every 5 minutes). My guess is that you're getting that error simply because the AKS job (indexing-job-*) didn't complete, so no new indexing job can be created while the one that was already created is still running. If there's a problem, it's probably that the indexing job never completes and hangs. (I haven't had indexing complete in over 30 minutes either, but I'm not sure if it's in progress or hanging.)
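To put that reading of the logs into code: the 409 is the kubernetes client refusing to create a second Job with a name that already exists. A small hedged check, reusing the job name and namespace from the logs in this thread, would be:

from kubernetes import client, config
from kubernetes.client.exceptions import ApiException

config.load_kube_config()
batch_v1 = client.BatchV1Api()
job_name = "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5"  # copied from the logs

try:
    existing = batch_v1.read_namespaced_job(name=job_name, namespace="graphrag")
    # The Job object is still present; status.active > 0 means its pod is still running,
    # so create_namespaced_job with the same name will keep returning 409 Conflict.
    print("existing job, active pods:", existing.status.active)
except ApiException as e:
    if e.status == 404:
        print("no such Job; a new one can be created")
    else:
        raise

Deleting the stale Job (for example with kubectl delete job <name> -n graphrag) is the usual way to unblock the next scheduling pass, though that does not by itself explain why the indexing pod hangs in the first place.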
I found that the indexing job's pod was mostly hanging, while graphrag-index-manager starts up every 5 minutes and logs an error that includes "already exists".
Hi! Am I missing anything? Thanks! |
@mb-porini - If the index status is showing as {scheduled}, then the backend API has done what it is going to do; from there, a k8s CronJob picks up Scheduled jobs and starts the actual indexing.
@fangnster - If the
Hi, thanks for your kind responses to my question. Additionally, I have tried executing (1) and (2) several times, and the error is the same as before, as shown in the screenshot below:
Did anyone find a solution for this? I'm stuck in the same situation: the job has stayed at 0.0% completed for the last two days with no errors. Thank you in advance.
Hi @alopezcruz, I have to be clear, I made a very big mistake. During the configuration of the environment I decided to use a different compute size due to my subscription limitations. I figured out a little later that the YAML configuration file states some minimum requirements that have to be met. So my problem was solved by fixing the configuration I had modified. After that it worked perfectly. Thanks for asking.
Hi @mb-porini, which YAML configuration file did you modify?
Best,
Well, there are quite a lot of YAML files, so I suggest modifying them only if you know how to handle them. Moreover, I'm quite sure that the main file was this one.
Hi @timothymeyers, @mb-porini, thank you for the reply, appreciated. Below are the logs from the index. Also @timothymeyers, the dataset being used is the 'California' Wikipedia sample:
Scheduling job for index: TestCalifornia
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Please let me know your comments, thank you again for your time.
Hi @timothymeyers, @mb-porini, I did figure this out; your advice put me on the right track. I leave below what I found in case others face a similar issue.
My sample Wikipedia articles are being indexed but the job always shows 0.0% completed. How do I fix it?
Screenshot of the pod logs as follows:
##########################################################
kubectl logs job/graphrag-index-manager-28738255 -n graphrag -f
Scheduling job for index: testindex
[ERROR] 2024-08-22 02:58:32,367 - Index job manager encountered error scheduling indexing job
Traceback (most recent call last):
File "/backend/manage-indexing-jobs.py", line 43, in schedule_indexing_job
batch_v1.create_namespaced_job(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 238, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'Audit-Id': '3da54996-302b-4b53-8550-eda0a9ca4ee3', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '4394828c-45ff-46b1-99c3-43de3fef08f8', 'X-Kubernetes-Pf-Prioritylevel-Uid': '95614f89-7a01-4064-bb56-9f052b3cb22f', 'Date': 'Thu, 22 Aug 2024 02:58:30 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5" already exists","reason":"AlreadyExists","details":{"name":"indexing-job-33b5e67636ee5ae3432d87c2cc8408d5","group":"batch","kind":"jobs"},"code":409}
Traceback (most recent call last):
File "/backend/manage-indexing-jobs.py", line 43, in schedule_indexing_job
batch_v1.create_namespaced_job(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 238, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'Audit-Id': '3da54996-302b-4b53-8550-eda0a9ca4ee3', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '4394828c-45ff-46b1-99c3-43de3fef08f8', 'X-Kubernetes-Pf-Prioritylevel-Uid': '95614f89-7a01-4064-bb56-9f052b3cb22f', 'Date': 'Thu, 22 Aug 2024 02:58:30 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5" already exists","reason":"AlreadyExists","details":{"name":"indexing-job-33b5e67636ee5ae3432d87c2cc8408d5","group":"batch","kind":"jobs"},"code":409}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/backend/manage-indexing-jobs.py", line 120, in
main()
File "/backend/manage-indexing-jobs.py", line 116, in main
schedule_indexing_job(index_to_schedule)
File "/backend/manage-indexing-jobs.py", line 55, in schedule_indexing_job
pipeline_job["status"] = PipelineJobState.FAILED
TypeError: 'PipelineJob' object does not support item assignment
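A side note on the second traceback: the error-handling path itself fails because it uses dictionary-style assignment on a PipelineJob object. Below is a minimal sketch of the difference, assuming PipelineJob exposes its state as a plain attribute (the real class is not shown in this thread):

class PipelineJob:
    # Stand-in for the real model; only the field relevant to the traceback is sketched
    def __init__(self):
        self.status = "scheduled"

pipeline_job = PipelineJob()
# pipeline_job["status"] = "failed"  # would raise: 'PipelineJob' object does not support item assignment
pipeline_job.status = "failed"       # attribute assignment is what such an object supports

In other words, the 409 Conflict is the underlying problem; the TypeError only prevents the manager from marking the job as failed afterwards.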