Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix group assignment in votes matrix export #1944

Open
wants to merge 2 commits into
base: edge
Choose a base branch
from

Conversation

colinmegill
Copy link
Member

check base cluster for pid, check group for base cluster

@jucor
Copy link
Contributor

jucor commented Feb 27, 2025

@ballPointPenguin , as per @colinmegill request, flagging the ``.membersand.id` problem that are tripping the cypress-run test. It seems the fix has a wrong way to access the fields of `baseClusters`:

  function getGroupId(pid: number) {
    if (!baseClusters || !groupClusters) {
      return undefined;
    }

    // First find which base cluster contains this participant
    const baseClusterIds = baseClusters.id;
    const baseClusterMembers = baseClusters.members;
    let baseClusterId = -1;
    for (let i = 0; i < baseClusterIds.length; i++) {
      if (baseClusterMembers[i].includes(pid)) {
        baseClusterId = baseClusterIds[i];
        break;
      }
    }

    // If we found a base cluster, check which group contains it
    if (baseClusterId !== -1) {
      for (const group of groupClusters) {
        if (group.members.includes(baseClusterId)) {
          return group.id;
        }
      }
    }

    return undefined;
  }

The code runs as is, but we do get a linter error.
@colinmegill told me to flag it but leave the fix of the fix up to you @ballPointPenguin . (and indeed I tried a first fix-of-the-fix, which broke the fix, then again I don't know TypeScript and haven't coded in Javascript since 1999 :P)

@jucor
Copy link
Contributor

jucor commented Feb 27, 2025

I confirm that:
1/ I could repro the bug: on edge branch,

(polis-data-dive) julien@Juliens-MacBook-Pro:edge$ time csvstat -c 2  2025-02-27-1444-[ZINVITE]-participant-votes.csv 
  2. "group-id"

        Type of data:          Boolean
        Contains null values:  True (excluded from calculations)
        Non-null values:       100
        Unique values:         3
        Most common values:    None (3247x)
                               True (75x)
                               False (25x)

Row count: 3347

real    0m18.568s
user    0m18.055s

2/ this pull-request as it is fixes the bug (even with the typescript warning discussed above -- there must be a tolerance in typescript :) ):

(polis-data-dive) julien@Juliens-MacBook-Pro:post-fix$ time csvstat -c 2  2025-02-27-1455-[ZINVITE]-participant-votes.csv 
  2. "group-id"

        Type of data:          Boolean
        Contains null values:  True (excluded from calculations)
        Non-null values:       2693
        Unique values:         3
        Most common values:    True (2495x)
                               None (654x)
                               False (198x)

Row count: 3347

real    0m18.183s
user    0m17.805s
sys     0m0.203s

Thanks @metasoarous for the clear report, and @colinmegill for coding the fix.
@ballPointPenguin Can you merge please? (with or without the minor typescript fix above, up to you)

@jucor
Copy link
Contributor

jucor commented Feb 27, 2025

@ballPointPenguin Ah, I think I know why Colin's code works in spite of the linter error. His code respects the actual structure of the JSON, as illustrated by @metasoarous in the screenshots here: #1930 (comment)

That's why it works. So I suspect it's only somewhere the Typescript type definition for baseClusters (a few lines above getGroupId) that does not respect the JSON format. And I'm completely ignorant of Typescript, but I imagine that at least in some cases the reality of the object overrides the types?

@jucor
Copy link
Contributor

jucor commented Feb 27, 2025

So I believe (but haven't tested) that the right type should be:

const baseClusters: { id: number[]; members: number[][] } | undefined = pca?.asPOJO["base-clusters"];

Actually, give me a second and I'll test the export with that modification.

@jucor
Copy link
Contributor

jucor commented Feb 27, 2025

Yup, confirmed, with the modified type I describe above the linter is happy and the export is still good :)

(polis-data-dive) julien@Juliens-MacBook-Pro:post-fix-with-correct-type$ time csvstat -c 2  2025-02-27-1720-[ZINVITE]-participant-votes.csv 
  2. "group-id"

        Type of data:          Boolean
        Contains null values:  True (excluded from calculations)
        Non-null values:       2693
        Unique values:         3
        Most common values:    True (2495x)
                               None (654x)
                               False (198x)

Row count: 3347

real    0m18.549s
user    0m17.712s
sys     0m0.200s

I don't know how to edit this PR as I don't have push access to the polis repo, so I'll let you do it :)

* fix ts error in export.ts

* improved exports.ts and pca.ts to better handle group-id mapping, proper types, logging, and general code cleanup.

* Fix and improve export test

* fix server build
@ballPointPenguin
Copy link
Contributor

Updated this PR with cleanup and fixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants