This is due to the subject API resource lacking the optimized last_id support. That was added to speed up the classifications API, but it should be ported to each resource.
Paging through result sets via the next / previous links is the standard mechanism for resources, and subjects do work this way. Does that meet your use case here?
I think that I didn't give enough thought to what I actually needed to accomplish here.
I realised that in order for iteratively downloaded (yay for last_id) classifications to be useful, I need the metadata from the subject to link those classifications back to the science catalog.
My first thought was to download all new subjects with last_id - but of course, that's not how subjects work! Old subjects can get new classifications.
Paging would work to download all subjects, but doing that daily would be slow and would repeat most of the same calls each time.
My current solution is to get the specific subject for each new classification:
```python
subject_id = classification['links']['subjects'][0]  # only works for single-subject projects
subject = get_subject(project_id, subject_id)  # assume id is unique
```
and decorate get_subject (which is simply the Python client) with a huge lru_cache, on the assumption that subjects tend to appear repeatedly at similar times (i.e. the currently active subject set).
This saves me having to maintain an up-to-date duplicate database of all subjects, but is a bit slow vs. the optimised classification interface.
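For anyone wanting to try the same workaround, here is a minimal sketch of the caching idea described above. It is not the code actually in use: `fetch_subject` is a hypothetical stand-in for the real lookup through the Python client, `subjects_for` is an illustrative helper name, and the cache size is an arbitrary choice.

```python
from functools import lru_cache

# Records every uncached lookup so the effect of the cache can be observed.
fetch_calls = []


def fetch_subject(project_id, subject_id):
    """Hypothetical stand-in for the real API call (assumption).

    A real version would ask the Panoptes Python client for the subject;
    here a plain dict is returned so the sketch runs offline.
    """
    fetch_calls.append((project_id, subject_id))
    return {"id": subject_id, "project_id": project_id, "metadata": {}}


@lru_cache(maxsize=2 ** 18)  # arbitrary size, large enough for an active subject set
def get_subject(project_id, subject_id):
    """Return the subject, hitting the API only on a cache miss."""
    return fetch_subject(project_id, subject_id)


def subjects_for(classifications, project_id):
    """Yield (classification, subject) pairs.

    Assumes single-subject classifications, as in the snippet above.
    """
    for classification in classifications:
        subject_id = classification["links"]["subjects"][0]
        yield classification, get_subject(project_id, subject_id)
```

Because `lru_cache` keys on the call arguments, repeated classifications of the same subject cost one lookup, and `get_subject.cache_info()` reports the hit rate if you want to check whether the cache is large enough.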
I would guess that wanting to get the subject details along with the classification details would be quite useful for others, though I'm not sure how best to implement this.
Adding last_id={id} to Subject.where() appears to have no effect, and it raises no error.
Test Case
Executing:
```python
from panoptes_client import Subject

subjects = Subject.where(
    scope='project',
    project_id='5733'
)
for n in range(10):
    s = subjects.next()
    print(s)  # show each subject as it is returned
```
Gives the following result:
<Subject 30091684>
<Subject 30091682>
<Subject 30091673>
<Subject 30091670>
<Subject 30091664>
<Subject 30091662>
<Subject 30091656>
<Subject 30091654>
<Subject 30091645>
<Subject 30091641>
Adding last_id=30091682 gives the same result as above.