Easy way to retrieve the groupNames associated with kmer's color? #92

Open
mr-eyes opened this issue Oct 24, 2021 · 0 comments
Labels: question (Further information is requested), v2 (kProcessor version 2)

mr-eyes commented Oct 24, 2021

During the wrapping process, I tried to retrieve the groups associated with a kmer's color, but I couldn't find a direct way.

Here is what I understand so far; please correct me if I'm wrong.

After indexing, we have a kDataFrame that maps each key (hashVal) to a value (kmerOrder). We can then get the color associated with a kmer from its hash through getKmerColumnValue, called as getKmerColumnValue("color", hashVal):

T getKmerColumnValue(const string& columnName,uint64_t kmer);

Or by kmer order:

T getKmerColumnValueByOrder(const string& columnName,uint64_t kmerOrder);
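
To make the two lookup paths concrete, here is a toy model of the layout described above. It is only a sketch built on plain Python containers: `hash_to_order`, `columns`, `column_value`, and `column_value_by_order` are hypothetical names for illustration and are not kProcessor API, and the real internals may differ.

```python
# Toy model only: plain Python containers standing in for a kDataFrame.
# None of these names are kProcessor API.
hash_to_order = {1001: 0, 1002: 1, 1003: 2}  # key (hashVal) -> value (kmerOrder)
columns = {"color": [1, 2, 1]}               # column name -> values indexed by kmerOrder


def column_value_by_order(column_name, kmer_order):
    """Look up a column value directly from the k-mer's order."""
    return columns[column_name][kmer_order]


def column_value(column_name, kmer_hash):
    """Look up a column value from the k-mer hash by resolving its order first."""
    return column_value_by_order(column_name, hash_to_order[kmer_hash])


print(column_value("color", 1002))        # 2
print(column_value_by_order("color", 2))  # 1
```

In this model, a lookup by hash is a lookup by order with one extra step to resolve the order.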

Now I have the color. Is there an easy way to go from the color to its group IDs (color -> group_IDs) through the kDataFrame?


Here's the corresponding Python code for this.

import kProcessor as kp

kf_map = kp.kDataFramePHMAP(21)

fasta_file = "seq.fa"
names_file = "seq.fa.names"

kp.index(kf_map, {"kSize": 21}, fasta_file, 1, names_file)

print(f"total size: {kf_map.size()}")
print(f"Column names: {kf_map.getColumnNames()}")

hash_to_color = dict()

it = kf_map.begin()
while it != kf_map.end():
    kmer_hash = it.getHashedKmer()
    kmer_color = kf_map.getKmerColumnValue_int("color", kmer_hash)
    hash_to_color[kmer_hash] = kmer_color
    it.next()


print("kmer to colors")
for _hash, color in hash_to_color.items():
    print(f"hash({_hash}) : color({color})")
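
For reference, the lookup being asked for could look like the sketch below. It assumes two tables that the question is asking how to obtain: `color_to_group_ids` and `group_id_to_name`. Both are hypothetical, hard-coded here with made-up values, and do not correspond to any known kProcessor call. The k-mer hashes and colors are example data as well.

```python
from collections import defaultdict

# Example data; in practice this would be the hash_to_color dict built above.
example_hash_to_color = {1001: 1, 1002: 2, 1003: 1}

# Hypothetical tables (made-up values): how to obtain these from the
# kDataFrame is exactly the open question.
color_to_group_ids = {1: [0], 2: [0, 1]}
group_id_to_name = {0: "groupA", 1: "groupB"}


def groups_for_color(color):
    """Resolve a color to the names of the groups it represents."""
    return [group_id_to_name[group_id] for group_id in color_to_group_ids[color]]


# Invert hash -> color into color -> list of k-mer hashes.
color_to_hashes = defaultdict(list)
for kmer_hash, color in example_hash_to_color.items():
    color_to_hashes[color].append(kmer_hash)

for color in sorted(color_to_hashes):
    print(f"color({color}) : groups({groups_for_color(color)}) : hashes({color_to_hashes[color]})")
```

Grouping the hashes by color first means each color only needs to be resolved once, no matter how many k-mers share it.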

cc @drtamermansour @shokrof

@mr-eyes added the question and v2 labels on Oct 24, 2021