[feature] cross-pipe indexes SDK: enabling context-aware AI tools #1311

louis030195 · 2025-02-08T19:12:19Z

context

AI is only as good as the context you give it: regardless of its architecture, weights, or training, even gpt-42 will be limited by the context it receives.

screenpipe builds layers of abstractions on top of raw recordings. pipes create valuable contextual data that could be indexed and queried by other pipes, similar to how AI assistants use tools to access different knowledge bases. this proposal aims to standardize how pipes share and consume these contextual indexes.

problem

valuable contextual data is siloed within individual pipes
no standardized way to index and query cross-pipe data
missing opportunities for AI-driven context enrichment
current sharing methods are hacky
pipes reinvent the wheel for common patterns

proposed solution

create an indexes SDK that allows pipes to:

publish local indexes (abstracted data)
let AI autonomously query other pipes' indexes via tools, following AI engineering best practices
subscribe to index updates
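
one way to read the "query via tools" point: each published index is exposed to the model as a callable tool. below is a minimal sketch, assuming a generic JSON-schema-style tool format. `ToolDefinition`, `ToolCall`, `indexToTool` and `dispatchToolCall` are hypothetical names for illustration, not part of any existing screenpipe API.

```typescript
// Hypothetical shapes: none of these names exist in screenpipe today.
type ToolDefinition = {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string; items?: { type: string } }>;
    required: string[];
  };
};

type ToolCall = { name: string; arguments: Record<string, unknown> };

// Turn an index name into a tool description an LLM could call.
// Many tool formats disallow dots in names, so "task.item" becomes "query_task_item".
function indexToTool(indexName: string, description: string): ToolDefinition {
  return {
    name: `query_${indexName.replace(/\./g, "_")}`,
    description,
    parameters: {
      type: "object",
      properties: {
        tags: { type: "array", items: { type: "string" }, description: "only return entries with these tags" },
        fullText: { type: "string", description: "free-text search query" },
      },
      required: [],
    },
  };
}

// Route a tool call from the model back to the index it targets.
function dispatchToolCall(
  call: ToolCall,
  registry: Map<string, string>, // tool name -> index name
): { indexName: string; filters: Record<string, unknown> } {
  const indexName = registry.get(call.name);
  if (indexName === undefined) {
    throw new Error(`unknown tool: ${call.name}`);
  }
  return { indexName, filters: call.arguments };
}

const tool = indexToTool("task.item", "tasks extracted from recent activity");
const registry = new Map<string, string>([[tool.name, "task.item"]]);
const routed = dispatchToolCall(
  { name: "query_task_item", arguments: { tags: ["engineering"] } },
  registry,
);
console.log(tool.name, routed.indexName);
```

the routed `indexName` and `filters` would then be handed to whatever query function the SDK ends up exposing.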

core indexes examples

// common index types that pipes can publish and consume
type IndexTypes = {
  // activity patterns
  'activity.summary': {
    interval: '5min' | '15min' | '1hour',
    timestamp: number,
    data: {
      tags: string[],
      summary: string,
      apps: string[],
      focus_level: number // 0-1
    }
  },
  
  // knowledge/notes
  'knowledge.chunk': {
    timestamp: number,
    data: {
      content: string,
      tags: string[],
      source: string,
      type: 'note' | 'document' | 'chat' | 'email'
    }
  },

  // communication style
  'communication.style': {
    timestamp: number,
    data: {
      tone: string[],  // ['formal', 'casual', 'technical']
      common_phrases: string[],
      writing_patterns: {
        avg_sentence_length: number,
        vocabulary_level: string
      }
    }
  },

  // task context
  'task.item': {
    timestamp: number,
    data: {
      title: string,
      status: 'todo' | 'in_progress' | 'done',
      priority: number,
      context: string,
      source: string
    }
  }
}
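
since these types only exist at compile time, entries coming from another pipe would need a runtime check before they are trusted. the proposal mentions zod for this; to keep the sketch dependency-free, the example below hand-rolls the check for `'task.item'`. `isRecord` and `isTaskItem` are hypothetical helper names.

```typescript
// Mirrors the 'task.item' entry from IndexTypes.
type TaskItem = {
  timestamp: number;
  data: {
    title: string;
    status: "todo" | "in_progress" | "done";
    priority: number;
    context: string;
    source: string;
  };
};

const TASK_STATUSES: string[] = ["todo", "in_progress", "done"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hand-rolled runtime check (hypothetical); a zod schema could replace it.
function isTaskItem(value: unknown): value is TaskItem {
  if (!isRecord(value)) return false;
  const timestamp = value["timestamp"];
  const data = value["data"];
  if (typeof timestamp !== "number" || !isRecord(data)) return false;
  const status = data["status"];
  return (
    typeof data["title"] === "string" &&
    typeof status === "string" &&
    TASK_STATUSES.indexOf(status) !== -1 &&
    typeof data["priority"] === "number" &&
    typeof data["context"] === "string" &&
    typeof data["source"] === "string"
  );
}

const candidate: unknown = {
  timestamp: Date.now(),
  data: { title: "review PR", status: "todo", priority: 1, context: "github", source: "chat" },
};
console.log(isTaskItem(candidate));
```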

example pipes & integrations

engineering assistant pipe

const engineeringPipe = {
  async suggestIssueComment(issueUrl: string) {
    // get relevant technical context
    const techContext = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastWeek,
      tags: ['technical', 'architecture']
    })
    
    // get user's communication style
    const style = await pipe.indexes.query('communication.style', {
      timeRange: lastMonth
    })
    
    // get related tasks
    const tasks = await pipe.indexes.query('task.item', {
      status: 'in_progress',
      tags: ['engineering']
    })
    
    return generateTechnicalComment(issueUrl, {
      context: techContext,
      style,
      relatedTasks: tasks
    })
  }
}
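
the pipe examples in this section pass `lastWeek`, `lastMonth`, `lastHour` and `lastDay` without defining them, and `TimeRange` is not specified anywhere in the proposal. a minimal sketch of what they might look like, assuming a range is a pair of epoch-millisecond bounds (that shape and the `lastDuration` / `inRange` helpers are assumptions):

```typescript
// Assumed shape: the proposal never defines TimeRange.
type TimeRange = { start: number; end: number }; // unix epoch milliseconds, inclusive

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Build a range covering the `durationMs` leading up to `now`.
function lastDuration(durationMs: number, now: number = Date.now()): TimeRange {
  return { start: now - durationMs, end: now };
}

// Computed once at load time; a long-running pipe would rebuild these per query.
const lastHour = lastDuration(HOUR_MS);
const lastDay = lastDuration(DAY_MS);
const lastWeek = lastDuration(7 * DAY_MS);
const lastMonth = lastDuration(30 * DAY_MS); // "month" approximated as 30 days

function inRange(timestamp: number, range: TimeRange): boolean {
  return timestamp >= range.start && timestamp <= range.end;
}

console.log(inRange(Date.now() - 2 * HOUR_MS, lastDay));
```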

task extraction pipe

const taskPipe = {
  async extractTasks() {
    // analyze recent activity
    const activities = await pipe.indexes.query('activity.summary', {
      timeRange: lastHour
    })
    
    // get communication context
    const communications = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastHour,
      type: ['chat', 'email']
    })
    
    const tasks = identifyTasks(activities, communications)
    
    // publish new tasks
    await pipe.indexes.publish('task.item', tasks.map(t => ({
      timestamp: Date.now(),
      data: t
    })))
  }
}

sales assistant pipe

const salesPipe = {
  async enhanceSalesCall(transcript: string) {
    // get customer interaction history
    const history = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastMonth,
      tags: ['customer', 'sales']
    })
    
    // get product knowledge
    const productKnowledge = await pipe.indexes.query('knowledge.chunk', {
      tags: ['product', 'features', 'pricing']
    })
    
    // get communication patterns
    const style = await pipe.indexes.query('communication.style', {
      timeRange: lastWeek
    })
    
    return generateSalesInsights(transcript, {
      history,
      productKnowledge,
      style
    })
  }
}

linear.app integration pipe

const linearPipe = {
  async enhanceTicket(ticketId: string) {
    // get engineering context
    const techContext = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastWeek,
      tags: ['technical']
    })
    
    // get related tasks
    const tasks = await pipe.indexes.query('task.item', {
      tags: ['engineering']
    })
    
    // get team activity patterns
    const teamActivity = await pipe.indexes.query('activity.summary', {
      timeRange: lastDay,
      tags: ['engineering']
    })
    
    return generateTicketContext(ticketId, {
      techContext,
      relatedTasks: tasks,
      teamActivity
    })
  }
}

meeting summarizer pipe

const meetingPipe = {
  async generateSummary(meetingId: string) {
    // get participant context
    const participants = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastMonth,
      tags: ['profile', 'background']
    })
    
    // get project context
    const projectContext = await pipe.indexes.query('knowledge.chunk', {
      tags: ['project', 'objectives']
    })
    
    // get action items
    const tasks = await pipe.indexes.query('task.item', {
      status: 'todo'
    })
    
    return generateMeetingSummary(meetingId, {
      participants,
      projectContext,
      pendingTasks: tasks
    })
  }
}

technical considerations

local sqlite storage with efficient indexing
standardized index schemas per pipe type
real-time pub/sub for index updates
typescript-first with zod validation
privacy-preserving (100% local)
efficient time-based querying
support for full-text search
support for vector embeddings
support for metadata filtering

implementation details

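// core SDK interface
// note: IndexData and TimeRange are not defined in this proposal yet
interface IndexesSDK {
  // publishing (a single entry or a batch, as in the task extraction pipe)
  publish(indexName: keyof IndexTypes, data: IndexData | IndexData[]): Promise<void>
  
  // querying
  query(indexName: keyof IndexTypes, filters: {
    timeRange?: TimeRange
    tags?: string[]
    type?: string | string[]  // one type or several, e.g. ['chat', 'email']
    status?: string           // used by the 'task.item' queries in the examples
    fullText?: string
    vector?: number[]
    metadata?: Record<string, unknown>
  }): Promise<IndexData[]>
  
  // subscriptions (the returned function presumably unsubscribes)
  subscribe(indexName: keyof IndexTypes, callback: (data: IndexData) => void): () => void
  
  // schema validation
  validateSchema(indexName: keyof IndexTypes, data: unknown): boolean
}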
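
to sanity-check the interface, here is a minimal in-memory sketch of publish / query / subscribe. it is not the proposed sqlite-backed implementation. several things are assumptions: `IndexData` and `TimeRange` are given simplified stand-in shapes, index names are plain strings rather than `keyof IndexTypes`, `publish` accepts a batch and `type` accepts an array (because the task extraction example passes both), and a tag filter matches when the entry has at least one of the requested tags.

```typescript
// Simplified stand-ins for the proposal's types (assumptions, not the final design).
type TimeRange = { start: number; end: number };
type IndexData = { timestamp: number; data: Record<string, unknown> };
type Filters = { timeRange?: TimeRange; tags?: string[]; type?: string | string[] };
type Listener = (record: IndexData) => void;

// Pure predicate: does a record satisfy every filter that was provided?
function matches(record: IndexData, filters: Filters): boolean {
  const { timeRange, tags, type } = filters;
  if (timeRange && (record.timestamp < timeRange.start || record.timestamp > timeRange.end)) {
    return false;
  }
  if (tags && tags.length > 0) {
    const raw = record.data["tags"];
    const recordTags: unknown[] = Array.isArray(raw) ? raw : [];
    // assumed semantics: at least one requested tag must be present
    if (!tags.some((t) => recordTags.indexOf(t) !== -1)) return false;
  }
  if (type !== undefined) {
    const wanted = Array.isArray(type) ? type : [type];
    if (wanted.indexOf(String(record.data["type"])) === -1) return false;
  }
  return true;
}

class InMemoryIndexes {
  private records = new Map<string, IndexData[]>();
  private listeners = new Map<string, Set<Listener>>();

  async publish(indexName: string, data: IndexData | IndexData[]): Promise<void> {
    const batch = Array.isArray(data) ? data : [data];
    const existing = this.records.get(indexName) ?? [];
    this.records.set(indexName, existing.concat(batch));
    const subscribers = this.listeners.get(indexName);
    if (subscribers) {
      for (const record of batch) subscribers.forEach((notify) => notify(record));
    }
  }

  async query(indexName: string, filters: Filters = {}): Promise<IndexData[]> {
    return (this.records.get(indexName) ?? []).filter((r) => matches(r, filters));
  }

  // Returns a function that removes the callback again.
  subscribe(indexName: string, callback: Listener): () => void {
    const set = this.listeners.get(indexName) ?? new Set<Listener>();
    set.add(callback);
    this.listeners.set(indexName, set);
    return () => {
      set.delete(callback);
    };
  }
}

const indexes = new InMemoryIndexes();
const stop = indexes.subscribe("task.item", (r) => console.log("new task:", r.data["title"]));
void indexes.publish("task.item", { timestamp: Date.now(), data: { title: "ship SDK", tags: ["engineering"] } });
stop();
```

keeping `matches` as a pure function means the same filter logic could later be reused to decide which subscribers should receive an update.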

next steps

iterate & finalize design
implement it

questions

how should we handle index versioning?
what's the optimal storage strategy for different index types? should we store everything as files (e.g. an obsidian-style file-first approach), or use sqlite or a similar, less lindy solution?
how should we handle data retention? for example, what about memories from the last 90 days?
should we add data transformation utilities?

louis030195 added the enhancement New feature or request label Feb 8, 2025

louis030195 pinned this issue Feb 8, 2025
