Returning success count from the `.populate()` call #1050

ttngu207 · 2022-09-02T17:08:35Z

Introduce a new argument, return_success_count, to the .populate() routine. When set, .populate() returns the number of successful make() calls (i.e. the number of successful jobs) made during that call.

This lets us answer the question "Did anything happen in my .populate() call?" without having to query the key_source again (e.g. with .progress()).

dimitri-yatsenko · 2022-12-07T17:28:51Z

datajoint/autopopulate.py

        if suppress_errors:
            return error_list
+        if return_success_count:
+            return sum(success_list)


If success_list is only used to report the sum, why not use a counter instead?

I went with this approach because of the multiprocessing logic, i.e. the call to mp.Pool(). I'm not sure whether something like success_count += 1 is robust under parallel multiprocessing.

Makes sense.
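
The concern above can be sidestepped by having each worker return its own status and summing in the parent, which is roughly what `sum(success_list)` does. Below is a minimal sketch of that pattern, not the actual DataJoint code: `fake_make` and `count_successes` are hypothetical names, and `multiprocessing.pool.ThreadPool` stands in for `mp.Pool` only so the example runs anywhere (it exposes the same `imap` interface).

```python
from multiprocessing.pool import ThreadPool


def fake_make(key):
    """Hypothetical stand-in for one make() call: succeeds for even keys."""
    return key % 2 == 0


def count_successes(keys, processes=2):
    """Sum per-task results in the parent instead of mutating a shared counter."""
    success_list = []
    with ThreadPool(processes) as pool:
        # Workers only return a value; the parent alone appends to the list.
        for succeeded in pool.imap(fake_make, keys):
            if succeeded:
                success_list.append(1)
    return sum(success_list)


print(count_successes(range(10)))
```

With real worker processes, a plain `success_count += 1` executed inside a worker would update that worker's own copy of the variable, so the parent would never see it. Returning a status per task avoids the need for any shared state.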

dimitri-yatsenko · 2022-12-07T17:35:57Z

datajoint/autopopulate.py

+        if suppress_errors and return_success_count:
+            return sum(success_list), error_list
        if suppress_errors:
            return error_list
+        if return_success_count:
+            return sum(success_list)


These tuple returns are tough because the order matters. How about we return this as a dict instead?

Suggested change

-        if suppress_errors and return_success_count:
-            return sum(success_list), error_list
-        if suppress_errors:
-            return error_list
-        if return_success_count:
-            return sum(success_list)
+        ret = {}
+        if suppress_errors:
+            ret['error_list'] = error_list
+        if return_success_count:
+            ret['success_count'] = sum(success_list)
+        return ret

This would be a backward-incompatible change, so we need to update users.

If we use this dictionary return, then we can always return the success count and the user does not need to request it.

Agreed, but as you said, the biggest thing is breaking backward compatibility and if we can avoid it, we should.
Breaking backward-compatibility for such a minor new feature is not worth it in my view.

Tuple returns are not as clean, but we can compensate with clear documentation of this feature in the docstring.

I think it's okay to break backward compatibility for better design decisions long-term.

So far, the docstring has not specified the return value, so it's okay to introduce a modification now with proper documentation.

@dimitri-yatsenko, yes, the docstring did not specify the return value. But we would still break backward compatibility if we made the changes you suggested. I'm okay with it if you are.
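
To make the trade-off in this thread concrete, here is a small sketch that puts the two return styles side by side. The first function mirrors the tuple logic from the diff under review; the second assumes the "always return both keys" variant discussed above. Both function names and the sample `errors` value are hypothetical and exist only for illustration.

```python
def tuple_style_return(success_list, error_list, suppress_errors, return_success_count):
    """Return shape depends on which flags are set (the version under review)."""
    if suppress_errors and return_success_count:
        return sum(success_list), error_list
    if suppress_errors:
        return error_list
    if return_success_count:
        return sum(success_list)
    return None


def dict_style_return(success_list, error_list):
    """Always the same shape: both values are addressed by key, not position."""
    return {"success_count": sum(success_list), "error_list": error_list}


errors = [({"subject_id": 3}, "ValueError")]  # hypothetical (key, error) pair
print(tuple_style_return([1, 1], errors, True, True))
print(tuple_style_return([1, 1], errors, False, True))
print(dict_style_return([1, 1], errors))
```

The tuple version can return four different shapes (a tuple, a list, an int, or None), so callers must know which flags were passed before they can unpack the result. The dict version costs backward compatibility for callers that relied on the bare error list.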

dimitri-yatsenko · 2023-02-15T15:26:11Z

datajoint/autopopulate.py

@@ -322,6 +338,7 @@ def _populate1(
                    )
                    if jobs is not None:
                        jobs.complete(self.target.table_name, self._job_key(key))
+                    return True


Not sure about this True. The docstring needs an update to explain. It's a bit odd to have so many different return types.

What about returning True/False for success/failure instead of None (if suppress_errors=False)?

When does it ever return False?

Then you should add return False to the case when the function returns without populating anything.

Does not populating anything constitute a False? False seems to indicate failure (maybe that's just me thinking so).

The current function returns True if successful, returns None if nothing is populated, raises an exception, or returns (key, error). It never returns False. It seems that if a function returns True, it should also return False in other cases. Perhaps instead of None, we should return False. It would not indicate an error, but simply that nothing was populated for reasons other than an error.

Agreed, I'll update accordingly.

ttngu207 · 2023-06-21T16:41:22Z

Hi @dimitri-yatsenko, circling back to this as we need this feature for sciops.
I much prefer having autopopulate return a dict; it is very clean. But that means breaking backward compatibility. So, are we firm on the decision to do this? What is our protocol for this kind of backward-incompatible change: a new major version, a warning statement, etc.?

dimitri-yatsenko

Would you update the CHANGELOG?

dimitri-yatsenko · 2023-10-02T20:49:37Z

datajoint/autopopulate.py

@@ -176,6 +177,8 @@ def populate(
        :param max_calls: if not None, populate at most this many keys
        :param display_progress: if True, report progress_bar
        :param processes: number of processes to use. Set to None to use all cores
+        :param return_success_count: if True, return the count of successful `make()` calls.


The docstring appears to be missing a description of the return value.

Should we omit this input and always include the success count as part of the return dict?

I'm a bit unclear what you mean here. The docstring is included there.
Do you mean it is not descriptive enough?

Oh, I see what you mean

The docstring should include :return: ...
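
For reference, a `:return:` field in the reST style the existing `:param:` lines already use might look like the sketch below. The function is a hypothetical stub, not the real `populate()`, and the wording of the fields is an assumption rather than the text that was merged.

```python
def populate(suppress_errors=False):
    """
    Hypothetical stub that only illustrates the docstring layout.

    :param suppress_errors: if True, collect errors instead of raising them
    :return: a dict with two keys
        "success_count": the number of successful ``make()`` calls
        "error_list": the list of (key, error) pairs when suppress_errors is True,
        otherwise an empty list
    """
    # Stub body: nothing is populated, so the count is 0 and there are no errors.
    return {"success_count": 0, "error_list": []}
```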

dimitri-yatsenko · 2023-10-02T21:00:57Z

tests_old/test_autopopulate.py

+        restriction = self.subject.proj(animal="subject_id").fetch("KEY")[0]
+        d = self.trial.connection.dependencies
+        d.load()
+        success_count, _ = self.trial.populate(


Will populate return a dict now?

The test should reflect that populate returns a dict.

dimitri-yatsenko · 2023-10-03T16:13:54Z

datajoint/autopopulate.py

+                make_kwargs=make_kwargs,
+            )
+
+            if processes == 1:


Is processes == 0 handled? Perhaps:

            if not processes:
                return {"success_count": 0, "error_list": []}

It is handled in the else block.

dimitri-yatsenko · 2023-10-03T16:31:10Z

datajoint/autopopulate.py

@@ -322,6 +338,7 @@ def _populate1(
                    )
                    if jobs is not None:
                        jobs.complete(self.target.table_name, self._job_key(key))
+                    return True


The current function returns True if successful, returns None if nothing is populated, raises an exception, or returns (key, error). It never returns False. It seems that if a function returns True, it should also return False in other cases. Perhaps instead of None, we should return False. It would not indicate an error, but simply that nothing was populated for reasons other than an error.

dimitri-yatsenko · 2023-10-03T18:48:12Z

datajoint/autopopulate.py

+                    status = self._populate1(key, jobs, **populate_kwargs)
+                    if status is not None:
+                        if isinstance(status, tuple):
+                            error_list.append(status)
+                        elif status:
+                            success_list.append(1)


If we change _populate1 to return True, False, or (key, error):

Suggested change

-                    status = self._populate1(key, jobs, **populate_kwargs)
-                    if status is not None:
-                        if isinstance(status, tuple):
-                            error_list.append(status)
-                        elif status:
-                            success_list.append(1)
+                    status = self._populate1(key, jobs, **populate_kwargs)
+                    if status is True:
+                        success_list.append(1)
+                    elif isinstance(status, tuple):
+                        error_list.append(status)
+                    else:
+                        assert status is False
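
The three-way contract in this suggestion can be exercised in isolation. The sketch below assumes a hypothetical `populate_one` in place of `_populate1` and a hypothetical `tally` wrapper around the loop; the keys and the "simulated error" string are made up for illustration.

```python
def populate_one(key):
    """Hypothetical stand-in for _populate1: returns True, False, or (key, error)."""
    if key.get("already_done"):
        return False  # nothing populated, but not an error
    if key.get("bad"):
        return key, "simulated error"  # error reported instead of raised
    return True


def tally(keys):
    """Apply the suggested dispatch to each status and build the result dict."""
    success_list, error_list = [], []
    for key in keys:
        status = populate_one(key)
        if status is True:
            success_list.append(1)
        elif isinstance(status, tuple):
            error_list.append(status)
        else:
            # The only remaining legal value under the three-way contract.
            assert status is False
    return {"success_count": sum(success_list), "error_list": error_list}


result = tally([{"id": 1}, {"id": 2, "already_done": True}, {"id": 3, "bad": True}])
print(result)
```

Checking `status is True` first, rather than relying on truthiness, matters here because a non-empty (key, error) tuple is also truthy.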

Thinh Nguyen added 12 commits September 1, 2022 13:41

add mechanism to return populate's success count

29357fe

add test for populate returning success_count

b737003

improve docstring

d1011fb

fix test_populate_with_success_count

6f7a0c0

bugfix test_populate_with_success_count

1f358a9

black formatting

eb827e6

black formatting

1b4806e

Merge branch 'master' into populate_success_count

0abd3c0

improve populate returns to be consistent with the input kwargs

6bf2afc

formatting

37801d6

Merge remote-tracking branch 'upstream/master' into populate_success_count

7a258d4

minor formatting

9480435

dimitri-yatsenko requested changes Dec 7, 2022

dimitri-yatsenko reviewed Feb 15, 2023

Merge branch 'master' into populate_success_count

0f84560

Thinh Nguyen and others added 3 commits July 14, 2023 10:37

Merge branch 'master' into populate_success_count

02127a0

Merge branch 'datajoint:master' into populate_success_count

c061f8a

Merge branch 'datajoint:master' into populate_success_count

9ef2046

dimitri-yatsenko requested changes Oct 2, 2023

Merge branch 'datajoint:master' into populate_success_count

c66ff04

dimitri-yatsenko reviewed Oct 2, 2023

update CHANGELOG

ff6b81c

dimitri-yatsenko reviewed Oct 2, 2023

.populate() call now returns a dict with success_count and `error_list`

45938aa

ttngu207 requested a review from dimitri-yatsenko October 3, 2023 15:37

dimitri-yatsenko requested changes Oct 3, 2023

ttngu207 and others added 2 commits October 3, 2023 11:38

Apply suggestions from code review

e143ce8

Co-authored-by: Dimitri Yatsenko <[email protected]>

return False if nothing gets populated in ._populate1()

291a468

ttngu207 requested a review from dimitri-yatsenko October 3, 2023 16:55

dimitri-yatsenko reviewed Oct 3, 2023

minor code cleanup

008a723

ttngu207 requested a review from dimitri-yatsenko October 4, 2023 13:33

ttngu207 mentioned this pull request Oct 5, 2023

remove kwarg "return_success_count" datajoint-company/datajoint-utilities#38

Merged

code cleanup - refactor _populate1

18fd619

dimitri-yatsenko changed the title ~~New kwarg for autopopulate - returning success count after the .populate() call~~ Returning success count from the .populate() call Oct 9, 2023

dimitri-yatsenko approved these changes Oct 9, 2023

dimitri-yatsenko merged commit 10511e7 into datajoint:master Oct 9, 2023
9 checks passed
