feat: dynamic ground truths #14

Keyrxng · 2024-10-24T04:04:28Z

Resolves #13

github-actions · 2024-10-24T04:05:29Z

Unused types (1)

Filename	types
src/types/llm.ts	`GroundTruthsSystemMessage`

Keyrxng · 2024-10-24T04:07:30Z

Marking as ready for review to get eyes on it and opinions as-is.

Picked from e6586a4

QA: ubq-testing#6 (comment)

Each array below belongs to a review it performed on the QA PR; the linked QA comment also contains the actual review (I haven't started refining the review prompt).

The spec it's sourcing truths from is here.

As I said, I don't have full context. If the truths are better sourced from something else, let me know, but this seems appropriate, at least for the purpose of #11.

The ones currently in use in development appear to be more like categories/themes/genres to me, so I'm not sure whether this approach defeats their purpose.

[
  'The bot should initiate review when a pull request is created as a draft and finalized by the contributor.',
  'The bot should parse the issue specification and pull request diff to assess compliance.',
  'If the pull request does not meet the specification, the bot should provide actionable feedback and change the review state to requested changes.',
  'The bot should convert non-compliant pulls back to draft status if they fail the specification check.',
  "The bot should only leave a 'commented' state for pulls that meet the specification.",
  'If a collaborator re-finalizes a draft pull, the bot should stop further interventions.',
  'The inspection process should be triggered only during initial creation and when a draft is finalized by the pull author.'
]
[
  'The bot should verify that the pull request is initially opened as a draft.',
  'The bot should check for changes from draft to finalized pull request status for initiating review.',
  'The bot needs to check pull request diffs against the issue specification for compliance.',
  'The bot should provide actionable feedback for specification discrepancies in the review.',
  'If the pull request does not meet specifications, the bot should convert it back to draft and request changes.',
  "If the pull request meets specifications, the bot should mark it as 'commented' without approval.",
  'The bot must refrain from intervening if a collaborator changes the pull request back to finalized.',
  'The bot’s intervention should be limited to triggers on pull creation and author-led status changes.',
  'Optionally handle Continuous Integration (CI) checks separately due to external factors.',
  'Consider implementing a daily limit on bot reviews per user to prevent abuse of the review system.'
]
[
  'The contributor must initially open the pull request as a draft.',
  'When the pull request is ready for review, the contributor should convert it to a finalized pull request.',
  'The bot should analyze the issue specification along with the pull request diff.',
  'The bot should provide actionable feedback indicating any missing specifications.',
  "If the pull request doesn't meet the specification, the bot should require changes and revert the pull back to a draft.",
  'If the pull request meets the specification, the bot should leave a comment without approval.',
  'The bot must not intervene if a collaborator changes the pull request from draft to finalized.',
  'The bot should only conduct inspections upon pull creation and when the author finalizes a draft.',
  'Optional: Ensure CI passes, but account for potential external failures.',
  'Optional: Limit bot reviews to one per day per contributor to prevent excessive use for minor changes.'
]

Keyrxng · 2024-10-24T04:16:52Z

@0x4007 @sshivaditya2019 @gentlementlegen @rndquu requesting review

The CI failure can be ignored: the flagged type is used in #11 but not here. Alternatively, I can comment it out so CI passes.

0x4007 · 2024-10-24T09:21:16Z

Why are you adding redundant information in those arrays?

Keyrxng · 2024-10-24T14:01:19Z

> Why are you adding redundant information in those arrays?

I'm not manually adding anything; GPT creates the array contents based on the spec and prompt, that's it.

Without the context and additional info I requested in #13, I'm not 100% sure how to refine and improve them in line with @sshivaditya2019's original intention. I know how I'd refine them personally, but this is not my show.

Right now GPT is consuming the task spec and creating these outputs based on this prompt and these settings for the completions endpoint.

Keyrxng · 2024-10-24T14:38:21Z

For the @ubqbot command that's currently in development, the prompt looks like this regarding ground truths:

You Must obey the following ground truths: ["typescript" : "github" : "cloudflare worker" : "actions" : "jest" : "supabase": "openai"]

Which doesn't make a whole lot of sense to me without the additional context. These are more like a classification of the subject areas of the tech stack involved in the query/task/org?

If that's the true intention of "Ground Truths", then I know how to refactor. Or is the way I'm using them correct for my use case?

0x4007 · 2024-10-24T23:14:12Z

I think we should not have redundant messages but they should be more substantial than the keywords we have now.

Keyrxng · 2024-10-24T23:33:15Z

In my opinion, "Ground Truths" should be considered in relation to the use case if they are intended to guardrail the model into a specific workflow, and that workflow differs between applications, e.g. chatbot vs. code review.

@ubqbot: General Organization Chatbot. Its truths should be like "The org uses ...these tech stacks, only consider these in your response.", "This repo uses ...these frameworks/libs", etc.
pull precheck: Pull Request Review. Its truths should be based on the task spec (and maybe some of our contribution standards), as that's the source of truth for this application of the model: proving the spec is implemented. Its truths are dynamic; if we had a contributing.md or something similar in each repo, we could ground truth those. E.g. no JS files, no empty strings, etc.

See this comment for my suggestion on dynamically generating the chatbot ground truths
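To make the split between the two applications concrete, here is a minimal sketch. The `GroundTruthsSource` type, its field names, and `buildGroundTruthsInput` are hypothetical and not taken from this PR; the sketch only shows that each application would feed a different input into ground-truth generation.

```typescript
// Hypothetical discriminated union: one variant per application of the model.
type GroundTruthsSource =
  | { kind: "chat-bot"; languages: string[]; dependencies: string[] }
  | { kind: "pull-precheck"; taskSpec: string; contributionRules: string[] };

// Pick the text that ground-truth generation would receive for each application.
function buildGroundTruthsInput(source: GroundTruthsSource): string {
  switch (source.kind) {
    case "chat-bot":
      // Chatbot truths describe the tech stack of the org or repo.
      return JSON.stringify({ languages: source.languages, dependencies: source.dependencies });
    case "pull-precheck":
      // Review truths come from the task spec, plus any repo-wide contribution rules.
      return [source.taskSpec, ...source.contributionRules].join("\n");
  }
}
```

A discriminated union keeps the two inputs from being mixed up: the compiler rejects a chatbot source that carries a task spec, and the reverse.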

Keyrxng · 2024-10-25T17:23:11Z

QA for the dynamic chatbot ground truths, run within my fork of this repo, so it's pulling this repo's deps and languages.

ubq-testing#8
ubq-testing#9
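As a rough illustration of how that input could be assembled, here is a minimal sketch. It assumes the language ratios are derived from per-language byte counts (the shape GitHub's "list repository languages" endpoint returns) and that the dependencies come from `package.json`. The function and type names are hypothetical and may not match the code in this PR.

```typescript
// Hypothetical shape; the real types in src/types/llm.ts may differ.
type PackageJsonLike = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

type LanguageRatio = [string, number];

// Convert per-language byte counts into [name, ratio] pairs, largest first.
function toLanguageRatios(bytesByLanguage: Record<string, number>): LanguageRatio[] {
  const total = Object.values(bytesByLanguage).reduce((sum, bytes) => sum + bytes, 0);
  if (total === 0) return [];
  return Object.entries(bytesByLanguage)
    .map(([name, bytes]): LanguageRatio => [name, bytes / total])
    .sort((a, b) => b[1] - a[1]);
}

// Serialize deps and languages into the JSON string sent as the user message.
function buildGroundTruthsUserMessage(pkg: PackageJsonLike, bytesByLanguage: Record<string, number>): string {
  return JSON.stringify({
    dependencies: pkg.dependencies ?? {},
    devDependencies: pkg.devDependencies ?? {},
    languages: toLanguageRatios(bytesByLanguage),
  });
}
```

The resulting string has the same top-level keys (`dependencies`, `devDependencies`, `languages`) as the user message in the log below.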

 [
  {
    role: 'system',
    content: '\n' +
      'Using the input provided, your goal is to produce an array of strings that represent "Ground Truths."\n' +
      'These ground truths are high-level abstractions that encapsulate the tech stack and dependencies of the repository.\n' +
      '  \n' +
      'Each ground truth should:\n' +
      '- Be succinct and easy to understand.\n' +
      '- Use only the information provided in the input.\n' +
      '- Focus on essential requirements, behaviors, or assumptions involved in the repository.\n' +
      '  \n' +
      'Example:\n' +
      'Languages: { TypeScript: 60%, JavaScript: 15%, HTML: 10%, CSS: 5%, ... }\n' +
      'Dependencies: Esbuild, Wrangler, React, Tailwind CSS, ms, React-carousel, React-icons, ...\n' +
      'Dev Dependencies: @types/node, @types/jest, @mswjs, @testing-library/react, @testing-library/jest-dom, @Cypress ...\n' +
      'Ground Truths:\n' +
      '- The repo predominantly uses TypeScript, with JavaScript, HTML, and CSS also present.\n' +
      '- The repo is a React project that uses Tailwind CSS.\n' +
      '- The project is built with Esbuild and deployed with Wrangler, indicating a Cloudflare Workers project.\n' +
      '- The repo tests use Jest, Cypress, mswjs, and React Testing Library.\n' +
      '  \n' +
      'Conditions:\n' +
      'Assume your output builds the foundation for a chatbot to understand the repository when asked an arbitrary query.\n' +
      'Do not list every language or dependency, focus on the most prevalent ones.\n' +
      'Focus on what is essential to understand the repository at a high level.\n' +
      'Brevity is key. Use zero formatting. Do not wrap in quotes, backticks, or other characters.\n' +
      'response === ["some", "array", "of", "strings"]\n' +
      '  \n' +
      'Generate similar ground truths adhering to a maximum of 10.\n' +
      '  \n' +
      'Return a JSON parsable array of strings representing the ground truths, without comment or directive.'
  },
  {
    role: 'user',
    content: '{"dependencies":{"@mswjs/data":"^0.16.2","@octokit/rest":"20.1.1","@octokit/webhooks":"13.2.7","@sinclair/typebox":"0.32.33","@supabase/supabase-js":"^2.45.4","@ubiquity-dao/ubiquibot-logger":"^1.3.0","dotenv":"^16.4.5","openai":"^4.63.0","typebox-validators":"0.3.5","voyageai":"^0.0.1-5"},"devDependencies":{"@actions/core":"^1.11.1","@actions/github":"^6.0.0","@commitlint/cli":"19.3.0","@commitlint/config-conventional":"19.2.2","@cspell/dict-node":"5.0.1","@cspell/dict-software-terms":"3.4.6","@cspell/dict-typescript":"3.1.5","@eslint/js":"9.5.0","@jest/globals":"29.7.0","@types/jest":"^29.5.12","@types/node":"20.14.5","cspell":"8.9.0","eslint":"9.5.0","eslint-config-prettier":"9.1.0","eslint-plugin-check-file":"2.8.0","eslint-plugin-prettier":"5.1.3","eslint-plugin-sonarjs":"1.0.3","husky":"9.0.11","jest":"29.7.0","jest-junit":"16.0.0","jest-md-dashboard":"0.8.0","knip":"5.21.2","lint-staged":"15.2.7","npm-run-all":"4.1.5","prettier":"3.3.2","ts-jest":"29.1.5","tsx":"4.15.6","typescript":"5.4.5","typescript-eslint":"7.13.1","wrangler":"^3.81.0"},"languages":[["TypeScript",0.9235672829913418],["PLpgSQL",0.03861807956191261],["JavaScript",0.03622889642996839],["Shell",0.00158574101677714]]}'
  }
]
languages:  [                                                                                                                                                                               
  [ 'TypeScript', 0.9235672829913418 ],                                                                                                                                                     
  [ 'PLpgSQL', 0.03861807956191261 ],                                                                                                                                                       
  [ 'JavaScript', 0.03622889642996839 ],
  [ 'Shell', 0.00158574101677714 ]
]
Ground Truths:  [
  'The repository is primarily written in TypeScript, with some PLpgSQL and JavaScript code.',
  'The project uses Supabase for backend services.',
  'Integration with GitHub APIs is handled via Octokit.',
  "The application leverages OpenAI's API for AI functionalities.",
  'Jest is used as the testing framework, configured for TypeScript.',
  'ESLint and Prettier are employed for code linting and formatting.',
  'GitHub Actions manage the CI/CD workflows.',
  'Husky and lint-staged are set up for pre-commit hooks.',
  'The project is deployed using Wrangler, indicating deployment to Cloudflare Workers.',
  'Commit messages are enforced using Commitlint with conventional commit standards.'
]

Ground Truths:  [
  'The repository is primarily written in TypeScript with minor use of JavaScript and PLpgSQL.',
  'It integrates with Supabase for backend services.',
  'The project leverages OpenAI for AI functionalities.',
  'Environment variables are managed using dotenv.',
  'Deployment is handled with Wrangler, indicating Cloudflare Workers usage.',
  'The development setup includes Jest for testing and ESLint for linting.',
  'GitHub Actions are employed for continuous integration and deployment workflows.',
  'Commit messages are standardized using Commitlint and enforced with Husky hooks.',
  'The project uses @octokit libraries for GitHub API interactions and webhooks.',
  'TypeScript is utilized with typebox for schema validation and type safety.'
]
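The prompt asks for "a JSON parsable array of strings" with "a maximum of 10", so the raw completion still has to be checked before use. The following is a minimal sketch of such a check, not the validation util from this PR (whose contents are not shown here). `parseGroundTruths` and `MAX_GROUND_TRUTHS` are hypothetical names.

```typescript
// Upper bound taken from the system prompt ("a maximum of 10").
const MAX_GROUND_TRUTHS = 10;

// Type guard: true only for arrays whose every element is a string.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parse the raw model output and return a cleaned, capped list of truths.
function parseGroundTruths(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Ground truths response is not valid JSON");
  }
  if (!isStringArray(parsed)) {
    throw new Error("Ground truths response is not an array of strings");
  }
  // Drop blank entries and surrounding whitespace before applying the cap.
  const truths = parsed.map((truth) => truth.trim()).filter((truth) => truth.length > 0);
  if (truths.length === 0) {
    throw new Error("Ground truths response is empty");
  }
  return truths.slice(0, MAX_GROUND_TRUTHS);
}
```

Throwing on malformed output, rather than silently falling back, lets the caller decide whether to retry the completion or use a static default.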

Keyrxng · 2024-10-25T19:02:44Z

@gentlementlegen @0x4007 @sshivaditya2019 @rndquu

Why can't we request reviews in this org lmao? Anyway, this is ready for review, team. Thanks.

Keyrxng · 2024-10-25T21:54:37Z

QA: ubq-testing#11 (comment)

Future improvements:

Make categories like testing, architecture, etc. so we can get fuller-bodied results for specific areas. For example, below it just says Jest is utilized, but it would be better if it also included @mswjs, as it would then know what testing DB setup to use. Each category could have its own small prompt.
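One way the category idea could look, as a sketch only: each category gets a short prompt and a matcher that selects the dependencies relevant to it. The category names, prompts, and matching patterns below are hypothetical and would need tuning against real repos.

```typescript
type Category = "testing" | "architecture";

// Hypothetical per-category config: a small prompt plus a dependency matcher.
const CATEGORY_PROMPTS: Record<Category, { prompt: string; matches: (dep: string) => boolean }> = {
  testing: {
    prompt: "Describe how the repository is tested, including mocking and test database setup.",
    matches: (dep) => /jest|mswjs|cypress|testing-library/i.test(dep),
  },
  architecture: {
    prompt: "Describe how the project is deployed and which backend services it relies on.",
    matches: (dep) => /wrangler|supabase|octokit|openai/i.test(dep),
  },
};

// Group dependency names so each category prompt only sees what it needs.
function groupDependenciesByCategory(deps: string[]): Record<Category, string[]> {
  const grouped: Record<Category, string[]> = { testing: [], architecture: [] };
  for (const dep of deps) {
    for (const category of Object.keys(CATEGORY_PROMPTS) as Category[]) {
      if (CATEGORY_PROMPTS[category].matches(dep)) grouped[category].push(dep);
    }
  }
  return grouped;
}
```

With the dependencies from this repo, `@mswjs/data` would land in the testing group next to `jest`, which is the extra detail the single generic prompt currently drops.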

With every usage of @ubqbot we get the following (I've wrapped it in backticks so you all can see):

<!-- Ubiquity - LLM Ground Truths and Token Usage - runPlugin - undefined
{
  "metadata": {
    "groundTruths": [
      "The repository is primarily written in TypeScript with some PLpgSQL and JavaScript.",
      "Supabase is used for backend services and database management.",
      "GitHub Actions are integrated for continuous integration and deployment workflows.",
      "Jest is utilized for testing the codebase.",
      "ESLint and Prettier are employed for code linting and formatting.",
      "The project leverages OpenAI APIs for its functionalities.",
      "Wrangler indicates that the project is deployed on Cloudflare Workers.",
      "TypeBox is used for type definitions and schema validations.",
      "Husky and lint-staged manage Git hooks and enforce code quality.",
      "Environment variables are handled using dotenv."
    ],
    "tokenUsage": {
      "input": 923,
      "output": 46,
      "total": 969
    }
  },
  "caller": "runPlugin"
}
-->
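For readers who want to produce or consume this kind of hidden metadata, here is a minimal sketch of embedding a JSON payload in an HTML comment and reading it back. The header layout is loosely modeled on the output above. `embedMetadata` and `extractMetadata` are hypothetical helpers, not the functions used in this PR.

```typescript
type GroundTruthsMetadata = {
  groundTruths: string[];
  tokenUsage: { input: number; output: number; total: number };
};

const HEADER = "Ubiquity - LLM Ground Truths and Token Usage";

// Wrap the metadata in an HTML comment so it stays invisible in the rendered comment.
function embedMetadata(metadata: GroundTruthsMetadata, caller: string): string {
  const payload = JSON.stringify({ metadata, caller }, null, 2);
  // A literal "-->" inside the payload would close the comment early, so escape it.
  const safePayload = payload.replace(/-->/g, "--&gt;");
  return `<!-- ${HEADER} - ${caller}\n${safePayload}\n-->`;
}

// Find the hidden block in a comment body and return its metadata, or null if absent.
function extractMetadata(commentBody: string): GroundTruthsMetadata | null {
  const pattern = new RegExp(`<!-- ${HEADER}[^\\n]*\\n([\\s\\S]*?)\\n-->`);
  const match = commentBody.match(pattern);
  if (!match) return null;
  try {
    const parsed = JSON.parse(match[1].replace(/--&gt;/g, "-->"));
    return parsed.metadata ?? null;
  } catch {
    return null;
  }
}
```

One caveat of this simple escaping: a ground truth that already contains the literal text `--&gt;` would be changed on the way back, which is acceptable for a sketch but worth handling in real code.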

0x4007 · 2024-10-26T21:03:59Z

@sshivaditya2019 you should review and decide when this pull is ready. I encourage QA for changes to prove they work and ideally you should also test as a reviewer

Keyrxng · 2024-10-27T11:55:48Z

lmk if there is anything holding back this PR and I'll push it forward

sshivaditya2019 · 2024-10-27T16:25:24Z

> lmk if there is anything holding back this PR and I'll push it forward

I was working on setting this up in the repo, sorry for the delay. LGTM!

feat: dynamic ground truths

9075b90

Keyrxng marked this pull request as ready for review October 24, 2024 04:06

Keyrxng mentioned this pull request Oct 24, 2024

Feat/pull precheck #11

Draft

Keyrxng added 11 commits October 25, 2024 17:12

chore: bump TS to latest, gitignore, cspell

78dbb46

chore: update ask-llm with dynamic truths

63685b2

chore: Logs export

c764b11

chore: groundTruths completion util

b6e2489

chore: groundTruths system msg util

22b1cba

feat: fetch repo langs and deps

b5bbb2b

feat: findGroundTruths

0567c5c

chore: improve templates, type helper utils

5e15472

chore: rename .d.ts, consolidate types

23e62db

chore: validate util, typeguards

b4ff6bc

chore: use default template worker port

a145b34

Keyrxng mentioned this pull request Oct 25, 2024

Context window formatting #16

Open

Keyrxng added 2 commits October 25, 2024 18:34

chore: remove logs, formatting, prompt tweak

49c9dcd

chore: test handlers - add readme fetch, stats - fix openai resp

96f98af

Keyrxng commented Oct 25, 2024

src/handlers/ground-truths/prompts.ts

Keyrxng commented Oct 25, 2024

src/types/llm.ts

Keyrxng added 2 commits October 25, 2024 19:58

chore: remove obj decon

ebd3c41

Merge remote-tracking branch 'upstream/development' into feat/dynamic…

5ff5b71

…-ground-truths

chore: knip fix

359451e

Keyrxng commented Oct 25, 2024

src/handlers/ground-truths/chat-bot.ts

sshivaditya2019 reviewed Oct 25, 2024

src/handlers/ground-truths/chat-bot.ts

sshivaditya2019 reviewed Oct 25, 2024

src/handlers/ground-truths/prompts.ts

sshivaditya2019 reviewed Oct 25, 2024

src/handlers/ground-truths/create-ground-truth-completion.ts (outdated)

chore: move groundTruthCompletion into adapters

f7fa7b5

Keyrxng mentioned this pull request Oct 25, 2024

Improve repo ground truth sourcing #17

Open

0x4007 reviewed Oct 25, 2024

.gitignore (outdated)

src/types/llm.ts

Keyrxng added 3 commits October 25, 2024 22:06

chore: mock ground truths completion fn

5351bae

chore: remove t.ts gitignore

9615b00

chore: embed groundTruths in html comment

29177f5

0x4007 assigned sshivaditya2019 Oct 26, 2024

Keyrxng mentioned this pull request Oct 27, 2024

RLHF #18

Open

sshivaditya2019 merged commit 2a1e15b into ubiquity-os-marketplace:development Oct 27, 2024
2 checks passed

ubiquity-os-beta bot mentioned this pull request Oct 27, 2024

Chatbot: Dynamic ground truths #13

Closed
