New workflow proposal for Documentation changes #784

Open
rimolive opened this issue Oct 21, 2024 · 7 comments

Comments

@rimolive
Member

rimolive commented Oct 21, 2024

One of the biggest challenges when working on the Kubeflow 1.9 release is handling documentation changes for all Kubeflow components and add-ons. The Release Handbook describes the Docs Lead as the person who coordinates documentation changes with all WGs, and it leaves that responsibility to the Release Manager if the Release Team has no volunteer for the Docs Lead role.

Motivation

The Kubeflow community wants to close a gap identified in a past Kubeflow User Survey: making documentation better, clearer, and more concise so users can find the information they need about every component, the community workflows, release communication, etc.

Problem Statement / Current Scenario

Across releases, documentation changes haven't been enough to cover everything new in each release. Also, usually only one person volunteers for the Docs Lead role, which funnels changes to the entire documentation through a single flow.

To make matters worse, if no one volunteers for the Docs Lead role, the task falls to the Release Manager. This is a bad idea given the number of responsibilities the Release Manager currently has.

Proposal

We could apply the experience of the manifests sync phase in Kubeflow releases to the documentation. That means every Working Group would keep a copy of its component documentation in the component's GitHub repository, and after Feature Freeze the Docs Lead could use bash scripts to copy the documentation content to the kubeflow/website repository.
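
As a rough illustration of the sync step, here is a minimal sketch. The proposal mentions bash scripts; Python is used here only to keep the example short and testable. The function name `sync_component_docs` and the destination layout `content/en/docs/components/<component>` are assumptions made for this sketch, not something the proposal specifies.

```python
import shutil
from pathlib import Path


def sync_component_docs(component_checkout: Path, website_checkout: Path, component: str) -> Path:
    """Copy <component_checkout>/docs into the website tree, replacing any previous copy."""
    source = component_checkout / "docs"
    if not source.is_dir():
        raise FileNotFoundError(f"{source} does not exist; nothing to sync")

    # Assumed destination layout inside the website checkout (hypothetical).
    target = website_checkout / "content" / "en" / "docs" / "components" / component

    # Remove the old copy first so files deleted upstream also disappear here.
    if target.exists():
        shutil.rmtree(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target)
    return target
```

Replacing the whole target directory, rather than merging into it, keeps the website copy an exact mirror of the component's `/docs` folder. That is one possible design choice, not a requirement of the proposal.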

Some changes in the current roles are:

  • WG lead: Will coordinate work with the WG members to create documentation under the /docs path inside the component GitHub repository
  • Docs Lead: Will maintain a set of scripts to sync documentation changes from Kubeflow components
  • Release Manager: Will work on coordinating the Documentation sync phase in the Kubeflow release cycle

Why do we need to change?

No one is better qualified to write documentation about a Kubeflow component than the WG members. With this workflow, a contributor whose change needs documentation can add both the code and the documentation in the component repo. Another advantage is that documentation changes for releases become faster and can be automated to some degree, which makes the Docs Lead's responsibility easier to manage (also for the Release Manager, if we keep the rule of handing the role over when there is no Docs Lead volunteer).

Open Questions

  • Who should handle add-on components documentation?
  • Who will be responsible for the rest of the documentation (Community, Overview, Architecture, etc.)?
  • Who should be able to maintain documentation layout?
  • How should Swagger/API changes be handled?
  • How do we keep track of the redirect issues in the documentation?

References

Slide deck

cc @kubeflow/release-team @kubeflow/kubeflow-steering-committee @kubeflow/wg-automl-leads @kubeflow/wg-data-leads @kubeflow/wg-notebooks-leads @kubeflow/wg-pipeline-leads @kubeflow/wg-training-leads @kubeflow/wg-manifests-leads

@andreyvelich
Member

Thank you for doing this @rimolive!
As we discussed at the community meeting, can we convert this proposal into an official KEP (Kubeflow Enhancement Proposal) under the community repo?
We can use the same KEP template as for Kubernetes: https://github.com/kubernetes/enhancements/blob/master/keps/NNNN-kep-template/README.md.

Similar to how we did it for Kubeflow Training V2: https://github.com/kubeflow/training-operator/tree/master/docs/proposals/2170-kubeflow-training-v2

@diegolovison

I was a documentation lead for Kubeflow 1.9 and I am the documentation lead for Kubeflow 1.10.
I have the following observations:

  • Each component should have its own repository, as component versions often do not align with Kubeflow's overall version. For instance, Pipelines have a different release cadence than Kubeflow.
  • There is no need for a dedicated documentation lead. Each team can handle reviews and improvements to their respective documentation independently.
  • Significant changes to the documentation repository, such as updates to layouts, images, etc., should be discussed during community meetings.

@HumairAK

HumairAK commented Oct 22, 2024

I cannot stress enough how important this is for improving component documentation.

It is a HUGE overhead to add documentation to a separate repo for every PR that goes to the component repo, and enforcing best practices here is just painful. @rimolive's proposal allows us to keep docs next to code, which makes it easy to review PRs and to require new docs as part of the PR itself.

Docs Lead: Will work on maintain a set of scripts to sync documentation changes from Kubeflow components

My only suggestion here is that the docs be pulled from tagged version commits for each component, using the version that is going into the upcoming KF release. This will also help resolve another major pain point: the kubeflow/website docs getting too far ahead, because the component versions that implement the documented features have not been released yet.
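
A minimal sketch of what pulling docs from a tagged commit could look like, assuming the sync tooling shells out to `git`. The helper names `clone_command` and `fetch_docs_at_tag` are hypothetical; the tag would come from whatever release pinning the Release Team already maintains. `git clone --branch` accepts a tag name and checks out that commit with a detached HEAD.

```python
import subprocess
from pathlib import Path
from typing import List


def clone_command(repo_url: str, tag: str, dest: str) -> List[str]:
    """Build a shallow git clone command pinned to a release tag."""
    if not tag:
        raise ValueError("a release tag is required so docs match released code")
    # --depth 1 fetches only the tagged commit; --branch also accepts tag names.
    return ["git", "clone", "--depth", "1", "--branch", tag, repo_url, dest]


def fetch_docs_at_tag(repo_url: str, tag: str, workdir: Path) -> Path:
    """Clone the component at the given tag and return the path to its docs folder."""
    dest = workdir / "checkout"
    subprocess.run(clone_command(repo_url, tag, str(dest)), check=True)
    # Assumes the component keeps its documentation under /docs, as proposed above.
    return dest / "docs"
```

Because the clone is pinned to a tag rather than a branch head, documentation for features merged after the tag would not reach the website until the component itself is released.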

There is no need for a dedicated documentation lead. Each team can handle reviews and improvements to their respective documentation independently.

I would say the open questions still warrant some sort of a documentation lead. At least for the transition period.

@thesuperzapper
Member

Personally, I think storing the docs in separate repos will be problematic.

However, I think we all agree that the core issue is allowing per-component versioning (or at least some way for end users to know what version of a component added/removed a feature).

An alternative to fully versioning each component's docs is to use a JavaScript-based approach. Everything could still live in the main repo, but we give docs writers a way to say "only show this section/page for version X.Y.Z of the component".

For example, we could have a version drop-down on each component section, which lists each version of that component, and hides sections which were added after that version when selected.

We can also put an indication within the docs themselves about which version the feature was added in.
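
The filtering logic behind such a drop-down is small. The sketch below shows it in Python rather than JavaScript, purely to illustrate the idea. The `Section` type, the `added_in` field, and the function names are hypothetical, and versions are assumed to be plain dotted numbers such as `1.9.0`.

```python
from typing import List, NamedTuple, Tuple


class Section(NamedTuple):
    title: str
    added_in: str  # component version in which the documented feature first appeared


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10.0' or 'v1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def visible_sections(sections: List[Section], selected: str) -> List[str]:
    """Return titles of sections whose feature already existed in the selected version."""
    chosen = parse_version(selected)
    return [s.title for s in sections if parse_version(s.added_in) <= chosen]
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.10.0" sorts before "1.9.0", which would show a section for a version that does not have the feature yet.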

@diegolovison

@thesuperzapper I liked the idea!

@StefanoFioravanzo
Member

@rimolive Thanks for starting this issue! I fully agree with your proposal. I think offloading documentation ownership to each WG is essential, and moving the actual doc "source code" to each WG's repository is a practical way of enforcing that ownership.

You raise valid questions and concerns that I think we can discuss and resolve in a dedicated KEP. There will certainly be multiple solutions to each one, so let's work together to find a good compromise.

It's obvious that the current way of doing things hasn't worked very well and does not allow us to scale. This is a good discussion to push the community towards a leaner, more decentralized, more scalable documentation practice.

@jgarciao

Another approach could be to organize the website and documentation like the Argo project:

  • Parent website: https://argoproj.github.io
    • Landing page, one brief section for each component (with link to their dedicated Documentation site), Blog, ...
  • Component's documentation: https://argo-workflows.readthedocs.io
    • There is a version selector
    • Sections: Home, Getting Started, User Guide, Operator Manual, Developer Guide, Roadmap, Blog

I think having kubeflow.org as the parent website, presenting all components and serving as a central point for community engagement (Blog, Events, ...), with the details of each component on its own dedicated site, could also be a good approach.
