[BUG] My indexing job never makes any progress #154
Error: Please create a text file, add your own content to it, and index it with a new index name.
I added another new index name in the script and checked the status of the indexing job; it is still 0.0% completed. After watching the logs with the command "watch kubectl get jobs -n graphrag", "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5" always exists, and no new indexing job is ever created. How do I kill it or start up a new job?
@fangnster Did you change the index name and storage name in 1-Quickstart.ipynb?
Yes, a new index name and a new storage name have been set.
@fangnster
This job has been running for several days, and its status has stayed at 0.0% completed the whole time. How do I fix it? The screenshot is the same regardless of whether the index name and storage name are changed.
@fangnster
I deleted the previous index and storage files and started again with new file names. Watching the running indexing jobs, a job named "graphrag-index-manager-****" starts up one after another: each is killed automatically after 5 minutes and then another one starts. That is why the "reason: Conflict" error shown in the original post appears in the logs.
@fangnster
After studying these commands, I started a new indexing job with new storage and index file names, and the same "reason: Conflict" error occurred. While observing the progress of that new indexing job, I found the script "indexing-job-manage-template.yaml". Can I delay the 5-minute schedule to a longer interval, such as 15 minutes, so that the previous indexing job can finish completely? Could you tell me the reason for the 5-minute setting?
When you initiate an indexing job, a record of it is put into CosmosDB and listed in a state of "Scheduled." The k8s CronJob runs every 5 minutes, checks CosmosDB for Scheduled indexing jobs, and then initiates the actual indexing processes for them in order. It uses a k8s Job deployment to spin up an indexing pod (the indexing-job-* jobs seen in the logs).
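For readers who want to see roughly what that manager pass does, here is a minimal sketch based only on the traceback in this thread; the CosmosDB helper, the Job manifest contents, and the job-naming scheme are assumptions, not the actual code from /backend/manage-indexing-jobs.py:

from kubernetes import client, config

def get_scheduled_indexes():
    # Hypothetical stand-in: the real manager reads job records from CosmosDB
    # and returns the names of indexes whose state is "Scheduled".
    return []

def schedule_indexing_job(index_name):
    # Build a k8s Job manifest for one indexing run (fields here are illustrative)
    job_manifest = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"indexing-job-{index_name}"},  # deterministic name
        "spec": {"template": {"spec": {
            "containers": [{"name": "indexer", "image": "graphrag-backend"}],  # image name assumed
            "restartPolicy": "Never"}}},
    }
    batch_v1 = client.BatchV1Api()
    # This is the call that fails with 409 Conflict in the log above when a Job
    # with the same name already exists from a previous run.
    batch_v1.create_namespaced_job(namespace="graphrag", body=job_manifest)

def main():
    config.load_incluster_config()  # the CronJob pod runs inside the AKS cluster
    for index_name in get_scheduled_indexes():
        schedule_indexing_job(index_name)

if __name__ == "__main__":
    main()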
Could you tell me how to change the cronjob interval from 5 minutes to something longer?
You can edit the deployed cron job directly in the cluster and look for the schedule field. Note that if you want to change it permanently between deployments, you'd change it in this file and redeploy the backend container to Azure Container Registry.
In addition, I redeployed this file with the schedule changed to "*/15 * * * *", and the deployment succeeded. However, when I check with "kubectl describe cronjob", it still shows the former "*/5 * * * *" schedule.
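One possible explanation: the YAML template only determines what gets deployed next time, while the CronJob object already running in the cluster keeps its old schedule until it is updated directly, which would match kubectl describe cronjob still showing */5. As a hedged illustration (the CronJob name graphrag-index-manager and namespace graphrag are inferred from the logs in this thread, and this assumes a kubernetes Python client recent enough to expose batch/v1 CronJob on BatchV1Api), a direct patch of the live object might look like:

from kubernetes import client, config

config.load_kube_config()  # run from a machine that has credentials for the AKS cluster
batch_v1 = client.BatchV1Api()
# Strategic-merge patch that only changes the schedule of the already-deployed CronJob
batch_v1.patch_namespaced_cron_job(
    name="graphrag-index-manager",   # name inferred from the job names in the logs
    namespace="graphrag",
    body={"spec": {"schedule": "*/15 * * * *"}},
)

Editing the live object with kubectl edit cronjob graphrag-index-manager -n graphrag and changing the schedule field should have the same effect.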
I've been stuck in the Scheduled 0.0% state for a long time too, but the cronjob that manages the indexing jobs (created from indexing-job-manager-template.yaml) runs every 5 minutes, which doesn't seem to have anything to do with the indexing processing time (you said it runs/shuts down every 5 minutes, but I understand that only the run is every 5 minutes). My guess is that you're getting that error simply because the AKS job (indexing-job-*) didn't complete, so no new indexing job can be created while the one that was already created is still running. If there's a problem, it's probably that the indexing job never completes and hangs. (I haven't had indexing complete in over 30 minutes either, but I'm not sure if it's in progress or hanging.)
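To put that reading of the logs into code: the 409 is the kubernetes client refusing to create a second Job with a name that already exists. A small hedged check, reusing the job name and namespace from the logs in this thread, would be:

from kubernetes import client, config
from kubernetes.client.exceptions import ApiException

config.load_kube_config()
batch_v1 = client.BatchV1Api()
job_name = "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5"  # copied from the logs

try:
    existing = batch_v1.read_namespaced_job(name=job_name, namespace="graphrag")
    # The Job object is still present; status.active > 0 means its pod is still running,
    # so create_namespaced_job with the same name will keep returning 409 Conflict.
    print("existing job, active pods:", existing.status.active)
except ApiException as e:
    if e.status == 404:
        print("no such Job; a new one can be created")
    else:
        raise

Deleting the stale Job (for example with kubectl delete job <name> -n graphrag) is the usual way to unblock the next scheduling pass, though that does not by itself explain why the indexing pod hangs in the first place.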
I found that the indexing job's pod was mostly hanging, while graphrag-index-manager starts up every 5 minutes and logs an error that includes "already exists".
Hi! Am I missing anything? Thanks! |
@mb-porini - If the index status is showing as {scheduled}, then the backend API has done what it is going to do; from there, a k8s CronJob picks up Scheduled jobs and starts the actual indexing.
@fangnster - If the
Hi, thanks for your kind responses to my question. Additionally, I have tried executing (1) and (2) several times, and the error is the same as before, as shown in the screenshot below:
Did anyone find a solution for this? I'm stuck in the same situation: the job has stayed at 0.0% completed for the last two days with no errors. Thank you in advance.
Hi @alopezcruz, I have to be clear, I made a very big mistake. During the configuration of the environment I decided to use a different compute size due to my subscription limitations. I figured out a little later that the YAML configuration file states some minimum requirements that have to be met. So my problem was solved by fixing the configuration I had modified. After that it worked perfectly. Thanks for asking.
Hi @mb-porini, which YAML configuration file did you modify?
Best,
Well, there are quite a lot of YAML files, so I suggest modifying them only if you know how to handle them. Moreover, I'm quite sure that the main file was this one.
Hi @timothymeyers, @mb-porini, thank you for the reply, appreciated. Below are the logs from the index. Also @timothymeyers, the dataset being used is the 'California' Wikipedia sample:
Scheduling job for index: TestCalifornia
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Please let me know your comments, thank you again for your time.
Hi @timothymeyers, @mb-porini, I did figure this out; your advice put me on the right track. I leave below what I found in case others face a similar issue.
My sample Wikipedia articles are being indexed but the job always shows 0.0% completed. How do I fix it?
Screenshot of the pod logs as follows:
##########################################################
kubectl logs job/graphrag-index-manager-28738255 -n graphrag -f
Scheduling job for index: testindex
[ERROR] 2024-08-22 02:58:32,367 - Index job manager encountered error scheduling indexing job
Traceback (most recent call last):
File "/backend/manage-indexing-jobs.py", line 43, in schedule_indexing_job
batch_v1.create_namespaced_job(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 238, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'Audit-Id': '3da54996-302b-4b53-8550-eda0a9ca4ee3', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '4394828c-45ff-46b1-99c3-43de3fef08f8', 'X-Kubernetes-Pf-Prioritylevel-Uid': '95614f89-7a01-4064-bb56-9f052b3cb22f', 'Date': 'Thu, 22 Aug 2024 02:58:30 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5" already exists","reason":"AlreadyExists","details":{"name":"indexing-job-33b5e67636ee5ae3432d87c2cc8408d5","group":"batch","kind":"jobs"},"code":409}
Traceback (most recent call last):
File "/backend/manage-indexing-jobs.py", line 43, in schedule_indexing_job
batch_v1.create_namespaced_job(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 279, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 238, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (409)
Reason: Conflict
HTTP response headers: HTTPHeaderDict({'Audit-Id': '3da54996-302b-4b53-8550-eda0a9ca4ee3', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '4394828c-45ff-46b1-99c3-43de3fef08f8', 'X-Kubernetes-Pf-Prioritylevel-Uid': '95614f89-7a01-4064-bb56-9f052b3cb22f', 'Date': 'Thu, 22 Aug 2024 02:58:30 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch "indexing-job-33b5e67636ee5ae3432d87c2cc8408d5" already exists","reason":"AlreadyExists","details":{"name":"indexing-job-33b5e67636ee5ae3432d87c2cc8408d5","group":"batch","kind":"jobs"},"code":409}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/backend/manage-indexing-jobs.py", line 120, in
main()
File "/backend/manage-indexing-jobs.py", line 116, in main
schedule_indexing_job(index_to_schedule)
File "/backend/manage-indexing-jobs.py", line 55, in schedule_indexing_job
pipeline_job["status"] = PipelineJobState.FAILED
TypeError: 'PipelineJob' object does not support item assignment
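A side note on the second traceback: the error-handling path itself fails because it uses dictionary-style assignment on a PipelineJob object. Below is a minimal sketch of the difference, assuming PipelineJob exposes its state as a plain attribute (the real class is not shown in this thread):

class PipelineJob:
    # Stand-in for the real model; only the field relevant to the traceback is sketched
    def __init__(self):
        self.status = "scheduled"

pipeline_job = PipelineJob()
# pipeline_job["status"] = "failed"  # would raise: 'PipelineJob' object does not support item assignment
pipeline_job.status = "failed"       # attribute assignment is what such an object supports

In other words, the 409 Conflict is the underlying problem; the TypeError only prevents the manager from marking the job as failed afterwards.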