Amenities and ergonomics #481
Replies: 2 comments
-
Can you elaborate on what you tried here? Was it the
Can you be more specific about what you mean? Which Rust types are you hoping to get from Cap'n Proto types?
-
I tried to replace LogJuicer's low-level index implementation so that the index matrix can be built and read using capnproto. The indexes are bundled inside a model, and my goal was to avoid decoding the indexes that are unused and, in the future, to be able to pull indexes from multiple models efficiently. LogJuicer presently uses these data types to define a single index:
use sprs::*;
const SIZE: usize = 260000;
pub type F = f32;
pub type FeaturesMatrix = CsMatBase<F, usize, Vec<usize>, Vec<usize>, Vec<F>>;
pub type FeaturesMatrixView<'a> = CsMatView<'a, F>;

// building
pub trait IndexBuilder {
    type Reader;
    fn add(&mut self, line: &str);
    fn build(self) -> Self::Reader;
}
pub struct FeaturesMatrixBuilder {
    current_row: usize,
    row: Vec<usize>,
    col: Vec<usize>,
    val: Vec<f32>,
}
impl traits::IndexBuilder for FeaturesMatrixBuilder {
    type Reader = FeaturesMatrix;
    fn add(&mut self, line: &str) { todo!() }
    fn build(self) -> FeaturesMatrix {
        TriMat::from_triplets((self.row.len(), SIZE), self.row, self.col, self.val).to_csr()
    }
}

// reading
pub trait IndexReader {
    fn rows(&self) -> usize;
    fn distance(&self, lines: &[String]) -> Vec<f32>;
}
impl traits::IndexReader for FeaturesMatrix {
    fn distance(&self, targets: &[String]) -> Vec<f32> {
        search_mat_chunk(&self.view(), targets)
    }
    fn rows(&self) -> usize {
        self.rows()
    }
}
pub fn search_mat_chunk(baselines: &FeaturesMatrixView, lines: &[String]) -> Vec<F> { ... }

So I tried to implement these traits using this message's schema:
But I couldn't make it work like this:
pub struct DiskIndex<'a> {
    matrix: FeaturesMatrixView<'a>,
}
impl traits::IndexReader for DiskIndex<'_> {
    fn rows(&self) -> usize {
        self.matrix.rows()
    }
    fn distance(&self, lines: &[String]) -> Vec<f32> {
        search_mat_chunk(&self.matrix, lines)
    }
}
impl<'a> DiskIndex<'a> {
    pub fn from_reader(reader: schema_capnp::index::Reader<'a>) -> Self {
        // TODO: handle u64 -> usize transmutation
        let indptr: &[usize] = &[];
        let indices: &[usize] = &[];
        let shape = (reader.get_rows() as usize, SIZE);
        let data: &[F] = reader.get_data_storage().unwrap().as_slice().unwrap();
        DiskIndex {
            matrix: CsMatView::new(shape, indptr, indices, data),
        }
    }
}
… as I couldn't make it past this error:
Here is what I tried for the building trait:
This gets a bit more complicated because these indexes are managed from the model crate using this data type:
pub struct Model<IR: IndexReader> {
    pub created_at: SystemTime,
    pub baselines: Baselines,
    pub indexes: std::collections::HashMap<IndexName, IR>,
}
… when running, LogJuicer builds indexes on demand, and they get stored in the above HashMap (from std). I tried storing the reader values and wrapping the whole thing inside a function that takes the buffer segment as an argument, but no matter what I tried, it looked overly complicated. I am not sure this can actually work while also supporting non-capnproto storage. I think it would help if we could get a Rust type and helper like these from the above schema:
pub struct Index<'a> {
    rows: u64,
    iptrStorage: &'a [u64],
    indStorage: &'a [u64],
    dataStorage: &'a [f32],
}
fn fromMessage(data: &'a [u8]) -> Result<Index<'a>>
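To make the lifetime problem with the HashMap concrete, here is a minimal std-only sketch of one possible shape: the map stores owners of the backing storage, and a borrowed `Index<'_>` is handed out on demand, so the model type carries no lifetime parameter. This is not capnproto-generated code. `OwnedIndex`, `view`, `nnz_in_row`, and the non-generic `Model` are hypothetical names introduced for illustration, and the `Vec` fields merely stand in for whatever keeps the message bytes alive.

```rust
use std::collections::HashMap;

/// Borrowed view over one index; fields follow the struct requested above,
/// renamed to snake_case.
#[derive(Debug, Clone, Copy)]
pub struct Index<'a> {
    pub rows: u64,
    pub iptr_storage: &'a [u64],
    pub ind_storage: &'a [u64],
    pub data_storage: &'a [f32],
}

impl<'a> Index<'a> {
    /// Number of stored values in `row`, read from the CSR row pointers.
    /// Returns `None` if `row` is out of range or the pointers are not sorted.
    pub fn nnz_in_row(&self, row: usize) -> Option<usize> {
        let start = *self.iptr_storage.get(row)?;
        let end = *self.iptr_storage.get(row + 1)?;
        end.checked_sub(start).map(|n| n as usize)
    }
}

/// Hypothetical owner of the backing storage (an assumption, not a capnp type).
pub struct OwnedIndex {
    rows: u64,
    iptr: Vec<u64>,
    ind: Vec<u64>,
    data: Vec<f32>,
}

impl OwnedIndex {
    pub fn new(rows: u64, iptr: Vec<u64>, ind: Vec<u64>, data: Vec<f32>) -> Self {
        OwnedIndex { rows, iptr, ind, data }
    }

    /// Borrow the storage as a view; nothing is copied.
    pub fn view(&self) -> Index<'_> {
        Index {
            rows: self.rows,
            iptr_storage: &self.iptr,
            ind_storage: &self.ind,
            data_storage: &self.data,
        }
    }
}

/// The map stores owners, so `Model` itself needs no lifetime parameter.
pub struct Model {
    pub indexes: HashMap<String, OwnedIndex>,
}

impl Model {
    /// Look up an index by name and return a view borrowed from `self`.
    pub fn index(&self, name: &str) -> Option<Index<'_>> {
        self.indexes.get(name).map(OwnedIndex::view)
    }
}
```

The trade-off in this sketch is that a view is rebuilt on every lookup rather than stored, which is cheap here because a view is only a few slices.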
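On the `u64 -> usize` TODO in `from_reader` earlier in this comment: a std-only sketch of one way it could be handled, assuming the index arrays are available as `&[u64]`. The function names are hypothetical. The zero-copy path only applies on targets where `usize` and `u64` share size and alignment (typical 64-bit platforms); the second function is a copying fallback.

```rust
use std::mem::{align_of, size_of};

/// Reinterpret `&[u64]` as `&[usize]` without copying.
/// Returns `None` on targets where the two types differ in size or alignment.
pub fn u64_slice_as_usize(words: &[u64]) -> Option<&[usize]> {
    if size_of::<usize>() != size_of::<u64>() || align_of::<usize>() != align_of::<u64>() {
        return None;
    }
    // SAFETY: both types are plain integers with identical size and alignment
    // (checked above), every bit pattern is a valid `usize`, and the returned
    // slice borrows `words`, so it cannot outlive the source.
    Some(unsafe { std::slice::from_raw_parts(words.as_ptr().cast::<usize>(), words.len()) })
}

/// Copying fallback: fails if any value does not fit in `usize`.
pub fn u64_slice_to_usize_vec(words: &[u64]) -> Option<Vec<usize>> {
    words.iter().map(|&w| usize::try_from(w).ok()).collect()
}
```

The resulting `&[usize]` slices are what a `CsMatView` over `usize` indices would need for `indptr` and `indices`.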
-
Hello folks,
As noted in #258 or #136, I found the generated code hard to use for zero-copy access. I had high hopes for #243, but I was not able to please the borrow checker for my use case. Even though the application is similar (using sparse vectors to compute similarity), I was not able to make it work in logjuicer, where the vectors are accessed through an intermediary hashmap, resulting in a mess of lifetimes.
It looks like the latest version, 0.19, fixed some of the issues (such as writing a slice of numbers, thanks!), but is there a long-term plan to make it easy to get idiomatic Rust data types from a capnproto message without any copying?
Thanks!
-Tristan