Iterating on linkml-project-cookiecutter with usability feedback? #2203
Replies: 8 comments 10 replies
-
We started this conversation on Slack, and @sneakers-the-rat had some good suggestions:

I think we should generally refactor the cli entrypoints so that they all stem off a single linkml command - we already have all the click stuff in place for this, it would just need a refactor into something that could easily be backwards compatible too. So eg. the existing scripts might work like this:

list available generators
use a generator, eg. instead of gen-pydantic
some other examples
Then we could replace the cookiecutter as a separate repo with a command that does an interactive prompt by default (or else accepts args for noninteractive).
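To make the single-entrypoint idea above concrete, here is a minimal sketch. It uses the standard library's argparse rather than click so that it runs without dependencies, and the subcommand names (`list`, `generate`), the `GENERATORS` registry and the `gen_pydantic` shim are all hypothetical stand-ins, not the actual LinkML CLI.

```python
import argparse

# Hypothetical registry: generator name -> function producing output text.
# Real LinkML generators are not imported here; these are placeholders.
GENERATORS = {
    "pydantic": lambda schema: f"# pydantic models for {schema}",
    "json-schema": lambda schema: f"# JSON Schema for {schema}",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="linkml")
    sub = parser.add_subparsers(dest="command", required=True)

    # `linkml list`: show the available generators
    sub.add_parser("list", help="list available generators")

    # `linkml generate NAME SCHEMA`: run one generator on a schema
    gen = sub.add_parser("generate", help="run a generator")
    gen.add_argument("name", choices=sorted(GENERATORS))
    gen.add_argument("schema")
    return parser


def main(argv=None) -> str:
    args = build_parser().parse_args(argv)
    if args.command == "list":
        return "\n".join(sorted(GENERATORS))
    return GENERATORS[args.name](args.schema)


def gen_pydantic(argv) -> str:
    # Backwards-compatible shim: the old script name delegates to the new command.
    return main(["generate", "pydantic", *argv])


print(main(["list"]))
print(gen_pydantic(["schema.yaml"]))
```

The point of the shim is that existing script names could keep working while everything routes through one command tree.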
I think the configuration should all be moved to a linkml.yaml file that lives at the root of a repository and corresponds to a pydantic/dataclass model. One of the major problems with the cookiecutter is that it does too much - I think it's probably more common that someone wants to use linkml in a relatively constrained context/within another project vs. create a totally independent project that uses all possible generators. As is, the cookiecutter basically tries to be an entire python packaging system, which is just a lot to take on. So then we could have some project config like:

build config for all generators
build:
configure specific generators
and so on.

Then we can use more familiar idioms like linkml

Having a declarative config like that would also let us do stuff like have pre-commit hooks to ensure that models are up to date, etc.

As we all know, one of the major challenges with this whole linked data thing is that "getting and maintaining a URI is hard" - it looks like w3id is relatively easy to make a new namespace under, or if we wanted to do something similar with the linkml registry that would be cool. something like
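Returning to the declarative config idea in this comment: a rough sketch of what "a linkml.yaml that corresponds to a dataclass model" could look like. Every field name here (`schema`, `build`, `dir`, `generators`, `output_dir`, `options`) is an assumption made for illustration, not an existing LinkML config format, and the input dict stands in for whatever a YAML loader would return.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class GeneratorConfig:
    """Settings for one generator (hypothetical fields)."""
    enabled: bool = True
    output_dir: Optional[str] = None  # falls back to the shared build dir
    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ProjectConfig:
    """Top-level project config (hypothetical fields)."""
    schema: str
    build_dir: str = "build"
    generators: Dict[str, GeneratorConfig] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ProjectConfig":
        generators = {
            name: GeneratorConfig(**opts)
            for name, opts in raw.get("generators", {}).items()
        }
        build = raw.get("build", {})
        return cls(
            schema=raw["schema"],
            build_dir=build.get("dir", "build"),
            generators=generators,
        )

    def output_dir_for(self, name: str) -> str:
        # Per-generator setting wins; otherwise use <build_dir>/<generator name>.
        gen = self.generators.get(name, GeneratorConfig())
        return gen.output_dir or f"{self.build_dir}/{name}"


config = ProjectConfig.from_dict({
    "schema": "schema/my_schema.yaml",
    "build": {"dir": "build"},
    "generators": {"pydantic": {"output_dir": "src/models"}},
})
print(config.output_dir_for("pydantic"))
print(config.output_dir_for("json-schema"))
```

Keeping shared build settings separate from per-generator overrides mirrors the "build config for all generators" vs. "configure specific generators" split described above.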
-
@pkalita-lbl also raised the point that we might not be sure about my hunch raised above, and I think that's a good point:

Would this be a place to get a sense of that? I wonder how many projects work in the cookiecutter mode of having schema(s) and all generated models as a relatively independent project, vs. using them in a more constrained capacity/integrating within another project? I also wonder, if the former, whether that is a desired choice (which is fine!) or a byproduct of the cookiecutter being a recommended entrypoint vs. a CLI tool for managing schemas not tied to a repo as I suggest above?

Welcome anyone to chime in here: if we've taken prior polls like this, or if you are currently using linkml, feel free to chime in on how you use it now and what your ideal model would be re: tooling around schema and generator management :)
-
Also - even if an organization is developing a LinkML schema as its own repo (the flagship-y schemas I contribute to are happy with an independent repo for LinkML schemas, so I would 👍 that design paradigm vs. one where the schema is incorporated into another application layer), it would be nice to know how many serializations are consumed in practice (and which ones). I feel like we could isolate the generators to include in gen-project and the cookiecutter vs. those that can be easily added later on as needed. Some examples:

Project: Artifacts Used in Practice
-
On point 4: I found copier nicer to work with than the combo of cookiecutter and cruft. At the time of the comparison cruft did not work at all on Windows (this is no longer a problem). But having one less tool to install/understand may be good.
-
OK lemme take an accounting here of what all the cookiecutter does, and maybe that helps shape what form we want the next version to look like:

Thoughts on having a template...

So there are a number of things that are currently problems/don't work (eg. the generated python package doesn't have an

The template is relatively opinionated about the structure of a project - eg. using poetry, dynamic versioning, how docs are built and hosted, directory structure (eg. of the thing that i find myself really liking about it and wanting more generally is the cli entrypoint of

What else? I think there are a number of things that are like "i don't think we should try and do this," but others i'm not so sure about and could go either way. Eg. I don't think we should handle python package generation, dependency management, docs building, CI/CD, include default schemas in the project, etc. I also think that the idea of being able to update a project with cruft or another template manager is interesting but i don't see how that would go in practice: one would have to keep everything in the same structure as the template, but it would be somewhat unpredictable what matters and what doesn't, and i can imagine that being impossible for most projects almost immediately. But what of the above functionality aside from the build cli stuff do we want to keep?

Pitch for config + cli project manager

I will try not to repeat what i said above, but lemme add some additional thoughts: so in any case i think all the configuration should be unified into a single place (that has its own linkml schema) rather than being spread out across

Another thing that is a bigger-picture need is a) schema discoverability, b) schema identification (ie. having a correct and stable URL), and crucially c) relationship between generated artifacts and the source schema. I had alluded to this in the february workshop, but it would be nice to have something like a

i'll leave this very long post there for now bc i need to head out the door for something but just some thoughts from this morning
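On point c) above, the relationship between generated artifacts and the source schema: one possible way to make that link checkable (and to back a "models are up to date" pre-commit hook) is to stamp each artifact with a hash of the schema it was built from. This is only an illustrative sketch, not something the post proposes in this form; the header format and function names are invented for the example.

```python
import hashlib

# Hypothetical marker line written at the top of every generated artifact.
HEADER = "# source-schema-sha256: "


def schema_digest(schema_text: str) -> str:
    return hashlib.sha256(schema_text.encode("utf-8")).hexdigest()


def stamp(artifact_text: str, schema_text: str) -> str:
    """Prepend a line recording which schema content produced this artifact."""
    return f"{HEADER}{schema_digest(schema_text)}\n{artifact_text}"


def is_up_to_date(stamped_artifact: str, schema_text: str) -> bool:
    """True if the artifact's recorded digest matches the current schema."""
    lines = stamped_artifact.splitlines()
    first_line = lines[0] if lines else ""
    return first_line == HEADER + schema_digest(schema_text)


schema_v1 = "classes:\n  Person: {}\n"
schema_v2 = "classes:\n  Person: {}\n  Dog: {}\n"
artifact = stamp("class Person: ...", schema_v1)

print(is_up_to_date(artifact, schema_v1))
print(is_up_to_date(artifact, schema_v2))
```

A hook built on this would fail the commit whenever a schema changed but its generated files were not rebuilt.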
-
Mainly to have better support for Windows users, who typically don't have make, but also to make writing tasks easier, I explored some alternatives to make/makefiles. I submitted a PR that explores using duty which can be pipx-installed just like

While doing this I also noticed that the commands are not that well named, and it is quite unclear what they do in detail without looking into the code (=makefile). Also the use of an
-
Hi, I just wanted to provide some feedback on this discussion as a "normal/noobie" user of LinkML. I'm not a developer. I only have rudimentary Python skills that allow me to write scripts and basic packages. I like to learn how to do things by looking at concrete working examples. So an example repo with commented code, a default schema, a default data example, and a default test.py helps me to learn how to use LinkML and basic Python testing/packaging. The cookiecutter template, along with the docs, has served me well as such an example repo.

If I understand the "config + cli project manager" discussed here correctly, then I also think this would make certain things much simpler. So I'd assume the template/example repo that I want/need could then consist only of an example config (with comments that tell me what each parameter in the config does) and a Readme that explains how to use the

But how would the testing and doc generation be handled then?

What I like about the cookiecutter template and would miss if this functionality were gone:
What I don't need/like:
I hope this feedback helps. I really like LinkML and its community, so thanks for all your help and work!
-
I'd like to continue @sierra-moxon's numbered list and add the additional points discussed here:
On 10.: In my opinion there should be only two files (and maybe the option to read from pyproject.toml as @sneakers-the-rat suggested)
Make or just or whatever task runner should only read from these 2 config files. The current situation is partly related to wanting to read the config easily with make (giving rise to the non-standard

A note on updating an existing project after changes were made to the cookiecutter template: This works quite OK currently if the structure is kept. Cruft ignores files that the users add to their generated projects. However, it is very important to design a template in a way that this updating works well.
-
During the ISMB 2024 tutorial, we noticed some teachability challenges with our linkml-project-cookiecutter. This discussion is meant to collect ideas on refactoring, replacing, or adding another cookiecutter for new users.
To get the discussion started, some feedback from the tutorial:
Our cookiecutter is excellent at demoing the capabilities of our framework. Out of the box it helps users generate a python development environment, provides some control mechanisms over the generation of various linkml serializations, gives the user an easy path to pypi publishing and doc generation via GH actions templates, and sets up a testing framework based on examples.
Depending on the level of expertise of the user, however, the resulting project does have a few gotchas: