Model Review Form #2384
Replies: 13 comments 5 replies
-
I think the most interested parties would be: @pkienzle, @butlerpd, @krzywon, and maybe @wpotrzebowski
-
Here are some comments:
Paywalled to what extent? I don't think this is relevant, because most articles have their authorship and abstracts available for free, and emailing the contact author usually results in a PDF.
I would suggest giving a few real-world examples of why this might occur, such as 'code speed increase', 'minimize code reuse', etc.
Another question to ask is 'Are there other available implementations?'
Define 'sufficient'. Some of our built-in models only have a single test.
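For reference, the built-in models declare their tests as a module-level `tests` list in the model file; something like the sketch below (parameter names and expected values here are made up, not taken from any real model):

```python
# Sketch of the `tests` attribute used in sasmodels model files.
# Each entry is [parameter overrides, q value(s), expected I(q)].
# The names and numbers below are placeholders.
tests = [
    # one q point, default parameters
    [{}, 0.2, 0.042],
    # override a parameter and check two q points at once
    [{"radius": 120.0}, [0.01, 0.1], [95.3, 0.72]],
]
```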
Should we have a minimal documentation requirement? I would suggest the definition, references, and authorship at a minimum. Other suggestions:
-
Thanks for starting this discussion @lucas-wilkins. The process for adding a model to sasmodels has been pretty much the same as for all additions to SasView code: somebody writes it, it gets reviewed as usual (code, equation math, correctness of results), and eventually it gets approved. As with all our approvals, some have received more scrutiny than others 😄. Having a PR template similar to what we are now experimenting with in the sasview repo is a good step towards normalizing the review process. This may be a bit heavy for a first go at that purpose, but it needs more thought (at least from me before I'd stand behind that statement 😃 ).
The Marketplace, on the other hand, may be the more important use case. Historically, as evidenced by the flags added to the database, the thought was that it would provide a free-wheeling marketplace where folks could upload their code to share with their colleagues and the community, but with no expectation that anyone other than the author had looked at it. This would then be followed by some kind of review by folks in the sasview development community, but that may be the wrong way to think about it? Perhaps we need to publish a process for this to be done by folks who are part of the sasview community but do not want to be part of the github developers list?
Following on, the aspiration was that some of the "validated" models could find their way into the distribution, but that has never happened, partly because no externally uploaded models have ever been "reviewed" in the first place. So again we should maybe think about what that process looks like and how to choose. Some initial thoughts related to that:
Probably enough initial thoughts for a first post ...
-
There are various ways of dealing with this.
Indeed, you probably need cause to have them reviewed, and to request reviews from people.
Yeah, it seemed a bit severe, hopefully my revisions make it a bit more flexible.
What do you mean by this?
It might get people engaged more generally.
-
I wonder if we should have a marketplace repo in sasview?
- Define
- We can still maintain a free-for-all interface for those who do not have a github account.
- Note that ORCID operates an OAuth service (@bmaranville has played with this). We may be able to use that to digitally sign a model.
- Free-for-all with reputation: we could just use ORCID for user management so all uploads are associated with an ORCID iD. Still caveat emptor, but the seller's reputation is on the line.
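For what it's worth, a rough sketch of what tying an upload to an ORCID iD might look like, assuming the standard ORCID three-legged OAuth flow (the client credentials and redirect URI are placeholders for whatever the marketplace would register with ORCID):

```python
# Exchange an ORCID OAuth authorization code for an access token and
# extract the authenticated ORCID iD, so an upload can be attributed to it.
# client_id, client_secret and redirect_uri are placeholders.
import requests

def orcid_id_from_auth_code(code, client_id, client_secret, redirect_uri):
    response = requests.post(
        "https://orcid.org/oauth/token",
        data={
            "client_id": client_id,
            "client_secret": client_secret,
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": redirect_uri,
        },
        headers={"Accept": "application/json"},
    )
    response.raise_for_status()
    token = response.json()
    # The token response includes the iD of the user who authorized the app.
    return token["orcid"]
```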
-
So are you suggesting checking validity or not? If we're going to verify, we need to verify properly. Checking for security is fine, but I wouldn't stick a 'verified' sticker on it.
-
I'm suggesting that "model is correct" and "safe* to run on your machine" are the two things that users care about (hopefully!). Code quality (docs, tests, cleanliness, efficiency, utility, ...) not so much. I believe anything beyond a simple check that the code isn't suspicious and that the tests run is all we need. We have smoke tests that are run even if the publisher didn't supply any expected output values.
I believe it is up to the model publisher to convince users that the model is correct. Attaching a github repo to the code means that interested publishers can have a forum for users to open issues and submit PRs. It also allows them to mint DOIs for citation. The advantage of ORCID is that it ties the model to their reputation (in exchange for more citations of their work).
Obviously we will want to check validity and code quality before including the model in sasmodels.
*Probably safe. That is, no more risky than running any other bit of sasview code. The indemnity clause in our license needs to apply to the check marks.
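As a concrete (if minimal) sketch of the kind of smoke test meant here, assuming the current sasmodels API and a hypothetical uploaded model called `my_plugin_model`:

```python
# Load a model, evaluate it at a spread of q values with default parameters,
# and check that the output is finite. No expected values are required.
import numpy as np
from sasmodels.core import load_model
from sasmodels.data import empty_data1D
from sasmodels.direct_model import DirectModel

def smoke_test(model_name="my_plugin_model"):
    q = np.logspace(-3, 0, 50)                 # 0.001 to 1.0 1/Ang
    calculator = DirectModel(empty_data1D(q), load_model(model_name))
    iq = calculator()                          # default parameter values
    assert np.all(np.isfinite(iq)), "model returned NaN or inf"
```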
-
Clarification: a simple code check is all that sasview needs to provide. I'm all in favour of having a process for users to validate models in the marketplace.
-
Safe is pretty easy to check. I think users do care about the docs: not the comprehensiveness of them, but that they say what the model does and that what they do say is correct. Code quality is something that we should care about, not the users.
Depends on whether we call that verified or not. To an extent, I don't really care about the procedure, as long as it is clear what the check marks mean. That said, we do have a responsibility not to introduce errors into the scientific ecosystem, to prevent bad practice, and to make sure code is maintainable if other people rely on it. This is not limited to a legal responsibility, and making people accountable through ORCID or whatever does not absolve us of it.
This reminds me of a previous conversation with @butlerpd, who is of the mind that we should enable the user base to do whatever they want. I, on the other hand, being an English liberal, would let them do what they want as long as they don't hurt anyone else, which would include hindering scientific progress and humankind by doing bad science. It's almost like the concept of negative liberty fell off the boat crossing the Atlantic ;)
-
So, I can easily add single precision errors as something that should be checked, though I'm not sure who is supposed to do the course in numerical analysis, or how best to verify that.
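For what it's worth, one mechanical check that doesn't need a numerical analyst is to compare single- and double-precision evaluations of the same model. A rough sketch, assuming `load_model` accepts a `dtype` argument and using a placeholder model name:

```python
# Evaluate the same model in single and double precision and report the
# worst relative disagreement; large values flag precision loss
# (e.g. cancellation at low q). The model name is a placeholder.
import numpy as np
from sasmodels.core import load_model
from sasmodels.data import empty_data1D
from sasmodels.direct_model import DirectModel

def precision_check(model_name="my_plugin_model"):
    q = np.logspace(-3, 0, 50)
    results = {}
    for dtype in ("single", "double"):
        calculator = DirectModel(empty_data1D(q), load_model(model_name, dtype=dtype))
        results[dtype] = calculator()
    rel_err = np.abs(results["single"] - results["double"]) / np.abs(results["double"])
    return rel_err.max()
```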
Not a bad idea
There are at least four reasons why a model should be able to be rejected on the basis of code quality:
-
It sounds to me like there are several different things at issue here. I think it would be useful to clearly separate review/validation/verification of models submitted as PRs in sasmodels from models in the marketplace. As mentioned before, I think it would be very useful to have some basic standards for reviewing/approving a model in sasmodels as a step towards improving the consistency and quality of those models.
While it was not at all clear to me from the initial discussion, it is now clear that this thread started as something completely different: mainly to specify how to get a marketplace-contributed model listed as "verified." To me this is a very different standard, as @pkienzle suggests. Actually, this may be a case of "it seemed like a good idea at the time." I think that was a very vague, not well thought out idea to add such a flag to the database without any real idea of how it would be used... and it has not been. However, since it is there and displays an ominous red x instead of the soothing green check mark next to all those from sasmodels, we have had several requests of "how can I get verified?" The question may therefore be whether that was a good idea. Should we get rid of it? Should we expand on it? Should we be doing something else altogether? And if we keep it as it is now, what does a green check mark mean? Indeed, such a nomenclature and GUI may have connotations that are inconsistent with what can (or should) actually be provided.
If we stick to the current marketplace approach, I would mostly agree with @pkienzle that code quality is not what should be verified, with the caveat of @lucas-wilkins' points 1 and 2. Point 3 really isn't possible with the current marketplace, I don't think, but there probably is a need to expand the marketplace to allow for adding library functions separately? Code quality becomes an issue if the model is being pulled into sasmodels, IMO. We should, however, provide an independent assessment of whether it does what it says it does, and to the extent that it can be checked against some other code or by doing some MC, maybe that too should be done? Basically, that the person submitting is not "selling snake oil" ... in the best judgement of the person(s) doing the review :-)
Alternatively, we could make the verified flag mean that the model is ready to be merged into sasmodels, including appropriate tests, which would be a much higher bar. Or, more reasonably, we should have two flags if we want to go there, I think.
-
The opening statement in this discussion was:
For me, this raises two issues:
-
Indeed @smk78, I suspect that the impression most people have regarding the green tick mark, or lack thereof, is "both of the above." So I guess we do need to figure out what we think is the best (and sustainable) way to handle contributed models in the marketplace (which could be to get rid of the check marks), sooner rather than later?
-
I was asked by colleagues to look at getting a particular unverified model from the marketplace validated and incorporated into SasView.
Currently the process is rather opaque, so I made a list of things that I think should go into it. This might serve as a template/form for model reviewing, as well as documentation of the process that each model has undergone.
Here it is:
Model Review Form (Draft)
Reviewer Name: Joe Bloggs
Affiliation (optional): Joe's Blog
Contact Details (optional): [email protected]
Date: 01/01/1970
Summary
In the opinion of this reviewer:
verified?
The reviewer has considered the following:
Details
All the following questions should be answerable by considering the review.
The aim is to:
Essentially, this is a list of things that should be considered. They do not have to be answered directly, though it might be easier to do so.
An answer of the form "this is not applicable or achievable in this case", along with a reason, is a perfectly acceptable way of addressing some of the questions. For some of them, "no" is also an acceptable answer.
Consistency with Literature
Is the model based on one or more peer-reviewed papers? If so, what are they? Are they paywalled?
To what extent does the code reflect the equations reported in the literature provided?
If it differs, where does it differ, is this acceptable, and why?
Is the parameterisation of the model the same as that reported in the literature?
Numerical Validation
Has the model been compared to other implementations? What are the details of this? What was the outcome?
Are there any analytic results that have been used to validate the code? What was the outcome?
If there is any additional validation code, is it freely available? Where is it?
Are subcomponents numerically valid? e.g. are there tests for helper functions?
Is there sufficient test coverage?
Where do the expected values come from?
What versions of sasview and sasmodels was it verified with?
Code Standards
Is the code easy to follow?
Are variable and function names explicit? If not, do they unambiguously match the literature?
Documentation
Is the documentation correct?
Are the appropriate docstrings filled out?
Is the documentation properly formatted?
Are references to the literature correct, and appropriate details and differences highlighted?
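(For reference, a hypothetical skeleton of the documentation block in a plugin model file, loosely following the pattern used by the built-in models; all names, references, and dates below are placeholders.)

```python
# Hypothetical plugin model header: module docstring plus metadata attributes.
r"""
Definition
----------

Describe what the model computes, with the key equations in ReST math.

References
----------

1. A. Author and B. Author, *Journal*, 1 (1970) 1-10

Authorship and Verification
---------------------------

* **Author:** Jane Doe **Date:** 1970-01-01
* **Last Reviewed by:** Joe Bloggs **Date:** 1970-01-01
"""

name = "my_plugin_model"
title = "One-line summary shown in the GUI"
description = "Longer description of what the model computes."
```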
Versioning and Compatibility
Is this model compatible with the latest version of SasView? (which is...)
Is this model compatible with the latest version of sasmodels? (which is...)
Is it compatible with other versions?
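(One way to record this, assuming both packages expose `__version__` in the usual way:)

```python
# Record the versions the review was carried out against.
import sasmodels
import sas  # the SasView package

print("sasmodels:", sasmodels.__version__)
print("sasview:  ", sas.__version__)
```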