Process for maintaining Taxonomies #24

Open
gregboyer opened this issue Aug 6, 2019 · 6 comments

@gregboyer
Collaborator

What process might work for maintaining taxonomies?

@gregboyer
Collaborator Author

Notes/thoughts/questions from meeting with Ovio and DemocracyLab:

  • What if someone submits a project that does not fit guidelines? E.g. inappropriate topic, for-profit, etc.
  • Does there need to be a project entry approval process? DemocracyLab approves all projects.
  • Should we have an “other” entry that people can submit as a means of gathering potential additions to the taxonomies?
  • Ensure that taxonomies are backward compatible.
  • Can projects be generated using NLP and current GitHub repos?

@nikolajbaer

For projects submitted outside the guidelines, we could make this part of an ongoing validation of all listed projects (maybe weekly). The results could then be used to alert project or brigade leaders that they could improve their project, or to point them to a link where they can propose a new addition to the taxonomy.

Actions for this would be:

  1. A place where new additions or changes to taxonomy could be proposed / discussed / ratified
  2. The verification task and a corresponding "brigade's project metadata status" page, plus a monthly or quarterly reminder to update any projects not passing validation.
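
The verification task could be sketched roughly as follows in Python. This is only an illustration under assumptions: the names `validate_projects`, `ValidationResult`, and `status_report` are hypothetical, and it assumes each project is just a list of tags and the taxonomy is a flat set of accepted terms, which is simpler than whatever the real index stores.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    """Outcome of checking one project against the taxonomy (hypothetical shape)."""
    project: str
    unknown_tags: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.unknown_tags


def validate_projects(projects: dict[str, list[str]], taxonomy: set[str]) -> list[ValidationResult]:
    """Check every project's tags against the accepted taxonomy terms.

    Matching is case-insensitive; results come back sorted by project name.
    """
    known = {term.lower() for term in taxonomy}
    results = []
    for name in sorted(projects):
        unknown = sorted({tag for tag in projects[name] if tag.lower() not in known})
        results.append(ValidationResult(project=name, unknown_tags=unknown))
    return results


def status_report(results: list[ValidationResult]) -> str:
    """Render a plain-text status page, one line per project."""
    lines = []
    for result in results:
        if result.passed:
            lines.append(f"{result.project}: OK")
        else:
            lines.append(f"{result.project}: unknown tags: {', '.join(result.unknown_tags)}")
    return "\n".join(lines)
```

A scheduled job could run `validate_projects` weekly and publish `status_report` as the status page; the unknown tags it lists are also natural candidates for taxonomy proposals.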

@themightychris
Member

themightychris commented Aug 8, 2019

@nikolajbaer @gregboyer what I'm building towards for the proof-of-concept is an entirely open github-moderated process for maintaining the taxonomies. Last week I "projected" a number of existing taxonomies into how I'm picturing our format: https://github.com/codeforamerica/civic-tech-taxonomy/branches/all?utf8=%E2%9C%93&query=sites

The format consists of a separate TOML file for each "record" with keys recursively sorted alphabetically so the same data will always produce the same file, and then the path of each file is its identifier. This format will give us the most effective platform as far as I could determine to have an open process for both humans and machines to engage with the taxonomies.
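
To illustrate why recursively sorted keys give a stable file, here is a minimal Python sketch. It is not the project's actual tooling: `sort_keys` and `to_canonical_toml` are hypothetical names, and the writer only handles strings, numbers, booleans, lists of those, and nested tables with bare keys.

```python
import json


def sort_keys(value):
    """Return a copy of value with every nested dict's keys in alphabetical order."""
    if isinstance(value, dict):
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # List order is data, so it is preserved; only dicts inside are sorted.
        return [sort_keys(item) for item in value]
    return value


def _format_value(value):
    """Format a scalar or a list of scalars as a TOML value (sketch only)."""
    if isinstance(value, bool):  # bool must be checked before int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # JSON string escaping is close enough to TOML basic strings for a sketch.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, list):
        return "[" + ", ".join(_format_value(item) for item in value) + "]"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def _emit(table, path, lines):
    """Append one table to lines: plain keys first, then nested tables."""
    if path:
        if lines:
            lines.append("")
        lines.append(f"[{'.'.join(path)}]")
    for key, value in table.items():
        if not isinstance(value, dict):
            lines.append(f"{key} = {_format_value(value)}")
    for key, value in table.items():
        if isinstance(value, dict):
            _emit(value, path + [key], lines)


def to_canonical_toml(record):
    """Serialize a record so that equal data always yields identical text."""
    lines = []
    _emit(sort_keys(record), [], lines)
    return "\n".join(lines) + "\n"
```

Because the output depends only on the data and not on the order in which keys were inserted, two tools writing the same record produce byte-identical files, which keeps diffs in pull requests limited to real changes.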

One thing I really like about this approach is that if tools start to be built so that they are configured directly with which repo#branch to pull the taxonomy from, then folks can easily switch to or experiment on forked taxonomies, or even start out managing their current taxonomy as a fork that gradually merges.
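
As a rough sketch of what a repo#branch setting could look like on the tool side (in Python), assuming the value has the form `owner/name#branch`: the `parse_source` function, the `TaxonomySource` type, and the `"master"` fallback are all hypothetical choices for illustration, not an agreed convention.

```python
from typing import NamedTuple


class TaxonomySource(NamedTuple):
    """Where a tool should pull its taxonomy from (hypothetical type)."""
    repo: str
    branch: str


def parse_source(spec: str, default_branch: str = "master") -> TaxonomySource:
    """Split an 'owner/name#branch' string into repo and branch.

    The branch part is optional; default_branch is an assumed fallback.
    """
    repo, separator, branch = spec.partition("#")
    repo, branch = repo.strip(), branch.strip()
    owner, _, name = repo.partition("/")
    if not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/name', got {repo!r}")
    if separator and not branch:
        raise ValueError("branch is empty after '#'")
    return TaxonomySource(repo=repo, branch=branch or default_branch)
```

Switching a tool to a forked taxonomy is then a one-line configuration change, for example from `codeforamerica/civic-tech-taxonomy` to a made-up `someone/civic-tech-taxonomy#experiment`.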

We will begin developing a master taxonomy in a new branch in a similar format, and then a mix of humans and bots/tools/scripts can open pull requests to propose/discuss/ratify changes.

Then we can set up a CI process that publishes taxonomy updates in various useful formats upon merge.
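
The publishing step might look something like this Python sketch. It assumes the records have already been loaded into a dict keyed by record identifier, that each record is flat, and that no record has its own `id` key; `publish_json` and `publish_csv` are hypothetical helpers, and JSON and CSV are just two example target formats.

```python
import csv
import io
import json


def publish_json(records: dict[str, dict]) -> str:
    """Render all records as one JSON document with stable key order."""
    return json.dumps(records, indent=2, sort_keys=True)


def publish_csv(records: dict[str, dict]) -> str:
    """Render records as CSV: one row per record, one column per field.

    List values are joined with '; ' so that each record stays on one row.
    """
    fields = sorted({key for record in records.values() for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", *fields], lineterminator="\n")
    writer.writeheader()
    for record_id in sorted(records):
        row = {"id": record_id}
        for key, value in records[record_id].items():
            row[key] = "; ".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()
```

A CI job triggered on merge could call both functions and upload the results as build artifacts.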

I described this a bit more over in civictechindex/brigade-project-index#9

@gregboyer
Collaborator Author

To build on Chris's idea above: the taxonomy would simply be managed by the people who have the ability to merge into the particular project. We can start with a draft of some guidelines for communication, turnaround time, etc., and maybe have 3-5 people who can vote on and review issues. They should also document how and why they make decisions.

@gregboyer gregboyer self-assigned this Sep 16, 2019
@gregboyer
Collaborator Author

Colin reviewed all of the MVP stories and tagged each one with bullets listing which taxonomies/metadata it requires. We can create a draft that meets those requirements and go from there.

@themightychris themightychris transferred this issue from civictechindex/brigade-project-index Jun 8, 2021
@giosce
Collaborator

giosce commented Jun 8, 2021

The first decision point is whether we want a top-down or a bottom-up taxonomy.
Top-down means we (who?) decide the buckets and the synonyms within them.
Bottom-up means we (who?) categorize all the tags retrieved by the project-index-crawler.

As of June 2020 we have a mix: I have added hundreds of synonyms, mostly within the buckets that I found here.
I'm happy with this approach, but it means we need to a) continue categorizing (putting into buckets) hundreds more "crawler" tags and b) figure out a maintenance model.
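
The mixed approach can be sketched in Python as follows: buckets and their synonyms are decided top-down, crawler tags are matched against them bottom-up, and whatever does not match is left over for manual categorization. The function names, the normalization rule, and the sample data in the usage note are illustrative assumptions only.

```python
def normalize(tag: str) -> str:
    """Lowercase a tag and treat hyphens and underscores as spaces."""
    return " ".join(tag.lower().replace("-", " ").replace("_", " ").split())


def build_synonym_index(buckets: dict[str, list[str]]) -> dict[str, str]:
    """Map every normalized bucket name and synonym to its bucket."""
    index = {}
    for bucket, synonyms in buckets.items():
        for term in [bucket, *synonyms]:
            index[normalize(term)] = bucket
    return index


def categorize(tags: list[str], buckets: dict[str, list[str]]):
    """Sort crawler tags into buckets; return (assigned, uncategorized)."""
    index = build_synonym_index(buckets)
    assigned: dict[str, list[str]] = {}
    uncategorized: list[str] = []
    for tag in tags:
        bucket = index.get(normalize(tag))
        if bucket is None:
            uncategorized.append(tag)
        else:
            assigned.setdefault(bucket, []).append(tag)
    return assigned, uncategorized
```

For example, with a made-up bucket `"housing"` whose synonyms include `"affordable-housing"`, the crawler tag `"Affordable Housing"` lands in that bucket, while a tag like `"blockchain"` ends up in the uncategorized list that still needs a human decision.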
