[feature] cross-pipe indexes SDK: enabling context-aware AI tools #1311

Open
2 tasks
louis030195 opened this issue Feb 8, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@louis030195
Collaborator

context

AI is only as good as the context you provide it. regardless of its architecture, weights, or training, even gpt-42 will only be as good as the context you give it.

screenpipe builds layers of abstractions on top of raw recordings. pipes create valuable contextual data that could be indexed and queried by other pipes, similar to how AI assistants use tools to access different knowledge bases. this proposal aims to standardize how pipes share and consume these contextual indexes.


problem

  • valuable contextual data is siloed within individual pipes
  • no standardized way to index and query cross-pipe data
  • missing opportunities for AI-driven context enrichment
  • current sharing methods are hacky
  • pipes reinvent the wheel for common patterns

proposed solution

create an indexes SDK that allows pipes to:

  1. publish local indexes (abstracted data)
  2. let AI autonomously query other pipes' indexes via tools, following AI engineering best practices
  3. subscribe to index updates
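
point 2 could be wired up by describing index access as a tool in the JSON-schema style that function-calling models consume, then validating whatever arguments the model sends before any index is touched. this is only a sketch: `queryIndexTool`, `runToolCall`, `QueryArgs` and `QueryFn` are hypothetical names, and the injected `query` function stands in for `pipe.indexes.query`.

```typescript
// hypothetical sketch: none of these names exist in screenpipe today
type IndexName = 'activity.summary' | 'knowledge.chunk' | 'communication.style' | 'task.item'

interface QueryArgs {
  indexName: IndexName
  tags?: string[]
  sinceMs?: number // only return entries newer than this unix-ms timestamp
}

interface ToolResult {
  ok: boolean
  result: unknown[]
  error?: string
}

// JSON-schema style description a model reads to decide when to call the tool
const queryIndexTool = {
  name: 'query_index',
  description: "query another pipe's published index for context",
  parameters: {
    type: 'object',
    properties: {
      indexName: {
        type: 'string',
        enum: ['activity.summary', 'knowledge.chunk', 'communication.style', 'task.item'],
      },
      tags: { type: 'array', items: { type: 'string' } },
      sinceMs: { type: 'number' },
    },
    required: ['indexName'],
  },
}

// stand-in for pipe.indexes.query, injected so the dispatcher stays testable
type QueryFn = (args: QueryArgs) => unknown[]

// parse and validate model-provided arguments, then run the query
function runToolCall(rawArgs: string, query: QueryFn): ToolResult {
  let parsed: unknown
  try {
    parsed = JSON.parse(rawArgs)
  } catch {
    return { ok: false, result: [], error: 'arguments are not valid JSON' }
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return { ok: false, result: [], error: 'arguments must be an object' }
  }
  const args = parsed as Partial<QueryArgs>
  const name = args.indexName
  const allowed = queryIndexTool.parameters.properties.indexName.enum
  if (typeof name !== 'string' || allowed.indexOf(name) === -1) {
    return { ok: false, result: [], error: 'unknown index name' }
  }
  const rawTags: unknown = args.tags
  const tags = Array.isArray(rawTags)
    ? rawTags.filter((t): t is string => typeof t === 'string')
    : undefined
  const sinceMs = typeof args.sinceMs === 'number' ? args.sinceMs : undefined
  return { ok: true, result: query({ indexName: name, tags, sinceMs }) }
}
```

rejecting unknown index names at the dispatcher keeps the model from reaching anything a pipe has not explicitly published.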

core index examples

// common index types that pipes can publish and consume
type IndexTypes = {
  // activity patterns
  'activity.summary': {
    interval: '5min' | '15min' | '1hour',
    timestamp: number,
    data: {
      tags: string[],
      summary: string,
      apps: string[],
      focus_level: number // 0-1
    }
  },
  
  // knowledge/notes
  'knowledge.chunk': {
    timestamp: number,
    data: {
      content: string,
      tags: string[],
      source: string,
      type: 'note' | 'document' | 'chat' | 'email'
    }
  },

  // communication style
  'communication.style': {
    timestamp: number,
    data: {
      tone: string[],  // ['formal', 'casual', 'technical']
      common_phrases: string[],
      writing_patterns: {
        avg_sentence_length: number,
        vocabulary_level: string
      }
    }
  },

  // task context
  'task.item': {
    timestamp: number,
    data: {
      title: string,
      status: 'todo' | 'in_progress' | 'done',
      priority: number,
      context: string,
      source: string,
      tags: string[] // the example pipes filter task.item by tags
    }
  }
}
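
some constraints in these schemas (for example `focus_level` being within 0-1) cannot be expressed in the type alone, so published entries would need a runtime check. below is a hand-written sketch for `'activity.summary'`; `checkActivitySummary` is a hypothetical helper, and a zod schema could take its place.

```typescript
// hypothetical hand-written validator for 'activity.summary' entries
const INTERVALS = ['5min', '15min', '1hour']

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string')
}

// returns a list of problems; an empty list means the entry is valid
function checkActivitySummary(entry: unknown): string[] {
  if (typeof entry !== 'object' || entry === null) return ['entry must be an object']
  const problems: string[] = []
  const e = entry as Record<string, unknown>

  const interval = e.interval
  if (typeof interval !== 'string' || INTERVALS.indexOf(interval) === -1) {
    problems.push('interval must be 5min, 15min or 1hour')
  }
  const timestamp = e.timestamp
  if (typeof timestamp !== 'number' || !isFinite(timestamp)) {
    problems.push('timestamp must be a finite number')
  }

  const data = e.data
  if (typeof data !== 'object' || data === null) {
    problems.push('data must be an object')
    return problems
  }
  const d = data as Record<string, unknown>
  if (!isStringArray(d.tags)) problems.push('data.tags must be string[]')
  if (typeof d.summary !== 'string') problems.push('data.summary must be a string')
  if (!isStringArray(d.apps)) problems.push('data.apps must be string[]')

  // the schema comment says 0-1, which only a runtime check can enforce
  const focus = d.focus_level
  if (typeof focus !== 'number' || !(focus >= 0 && focus <= 1)) {
    problems.push('data.focus_level must be between 0 and 1')
  }
  return problems
}
```

returning every problem instead of a single boolean makes it easier for a pipe author to see why a publish was rejected.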

example pipes & integrations

  1. engineering assistant pipe
const engineeringPipe = {
  async suggestIssueComment(issueUrl: string) {
    // get relevant technical context
    const techContext = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastWeek,
      tags: ['technical', 'architecture']
    })
    
    // get user's communication style
    const style = await pipe.indexes.query('communication.style', {
      timeRange: lastMonth
    })
    
    // get related tasks
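    const tasks = await pipe.indexes.query('task.item', {
      tags: ['engineering'],
      // status is not a top-level filter in the SDK interface, so it goes through metadata
      metadata: { status: 'in_progress' }
    })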
    
    return generateTechnicalComment(issueUrl, {
      context: techContext,
      style,
      relatedTasks: tasks
    })
  }
}
  2. task extraction pipe
const taskPipe = {
  async extractTasks() {
    // analyze recent activity
    const activities = await pipe.indexes.query('activity.summary', {
      timeRange: lastHour
    })
    
    // get communication context
    const communications = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastHour,
      type: ['chat', 'email']
    })
    
    const tasks = identifyTasks(activities, communications)
    
    // publish new tasks
    await pipe.indexes.publish('task.item', tasks.map(t => ({
      timestamp: Date.now(),
      data: t
    })))
  }
}
  3. sales assistant pipe
const salesPipe = {
  async enhanceSalesCall(transcript: string) {
    // get customer interaction history
    const history = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastMonth,
      tags: ['customer', 'sales']
    })
    
    // get product knowledge
    const productKnowledge = await pipe.indexes.query('knowledge.chunk', {
      tags: ['product', 'features', 'pricing']
    })
    
    // get communication patterns
    const style = await pipe.indexes.query('communication.style', {
      timeRange: lastWeek
    })
    
    return generateSalesInsights(transcript, {
      history,
      productKnowledge,
      style
    })
  }
}
  4. linear.app integration pipe
const linearPipe = {
  async enhanceTicket(ticketId: string) {
    // get engineering context
    const techContext = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastWeek,
      tags: ['technical']
    })
    
    // get related tasks
    const tasks = await pipe.indexes.query('task.item', {
      tags: ['engineering']
    })
    
    // get team activity patterns
    const teamActivity = await pipe.indexes.query('activity.summary', {
      timeRange: lastDay,
      tags: ['engineering']
    })
    
    return generateTicketContext(ticketId, {
      techContext,
      relatedTasks: tasks,
      teamActivity
    })
  }
}
  5. meeting summarizer pipe
const meetingPipe = {
  async generateSummary(meetingId: string) {
    // get participant context
    const participants = await pipe.indexes.query('knowledge.chunk', {
      timeRange: lastMonth,
      tags: ['profile', 'background']
    })
    
    // get project context
    const projectContext = await pipe.indexes.query('knowledge.chunk', {
      tags: ['project', 'objectives']
    })
    
    // get action items
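    const tasks = await pipe.indexes.query('task.item', {
      // status is not a top-level filter in the SDK interface, so it goes through metadata
      metadata: { status: 'todo' }
    })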
    
    return generateMeetingSummary(meetingId, {
      participants,
      projectContext,
      pendingTasks: tasks
    })
  }
}

technical considerations

  • local sqlite storage with efficient indexing
  • standardized index schemas per pipe type
  • real-time pub/sub for index updates
  • typescript-first with zod validation
  • privacy-preserving (100% local)
  • efficient time-based querying
  • support for full-text search
  • support for vector embeddings
  • support for metadata filtering
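
to make the querying bullets concrete, here is an in-memory sketch of time-range, tag and full-text filtering followed by vector ranking. it stands in for what sqlite would do and is not the proposed storage layer; `Entry`, `Filters`, `matches` and `search` are hypothetical names, and treating `tags` as "any tag matches" is an assumption the proposal does not settle.

```typescript
// hypothetical in-memory stand-in for the sqlite-backed query path
interface Entry {
  timestamp: number // unix ms
  tags: string[]
  text: string
  embedding?: number[]
}

interface Filters {
  timeRange?: { start: number; end: number }
  tags?: string[]
  fullText?: string
}

// an entry matches when it passes every filter that is present
function matches(entry: Entry, f: Filters): boolean {
  if (f.timeRange && (entry.timestamp < f.timeRange.start || entry.timestamp > f.timeRange.end)) {
    return false
  }
  // assumption: one shared tag is enough
  if (f.tags && f.tags.length > 0 && !f.tags.some((t) => entry.tags.indexOf(t) !== -1)) {
    return false
  }
  // naive case-insensitive substring match; sqlite full-text search would replace this
  if (f.fullText && entry.text.toLowerCase().indexOf(f.fullText.toLowerCase()) === -1) {
    return false
  }
  return true
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// filter first, then rank the survivors by similarity to the query vector
function search(entries: Entry[], f: Filters, vector?: number[]): Entry[] {
  const hits = entries.filter((e) => matches(e, f))
  if (!vector) return hits
  return hits
    .filter((e) => e.embedding !== undefined)
    .map((e) => ({ e, score: cosineSimilarity(e.embedding as number[], vector) }))
    .sort((x, y) => y.score - x.score)
    .map((x) => x.e)
}
```

applying the cheap filters before the vector comparison keeps the number of similarity computations small, which matters once an index holds months of entries.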

implementation details

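// core SDK interface
// note: TimeRange and IndexData are not defined elsewhere in this proposal; the shapes below are placeholders
type TimeRange = { start: number, end: number } // unix ms
type IndexData<K extends keyof IndexTypes = keyof IndexTypes> = IndexTypes[K]

interface IndexesSDK {
  // publishing (one entry or a batch, as in the task extraction pipe)
  publish<K extends keyof IndexTypes>(indexName: K, data: IndexData<K> | IndexData<K>[]): Promise<void>
  
  // querying
  query<K extends keyof IndexTypes>(indexName: K, filters: {
    timeRange?: TimeRange
    tags?: string[]
    type?: string | string[] // the task extraction pipe passes several types
    fullText?: string
    vector?: number[]
    metadata?: Record<string, unknown>
  }): Promise<IndexData<K>[]>
  
  // subscriptions; the returned function unsubscribes
  subscribe<K extends keyof IndexTypes>(indexName: K, callback: (data: IndexData<K>) => void): () => void
  
  // schema validation
  validateSchema(indexName: keyof IndexTypes, data: unknown): boolean
}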
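
to show the intended publish / query / subscribe semantics, here is a minimal in-memory sketch. it is synchronous for brevity, whereas the interface above is promise-based, and it ignores every filter except `timeRange`. `MemoryIndexes` and its `Entry` shape are hypothetical and not part of the proposal.

```typescript
// hypothetical synchronous sketch; the proposed SDK is promise-based and sqlite-backed
interface Entry {
  timestamp: number // unix ms
  data: Record<string, unknown>
}

type Listener = (entry: Entry) => void

class MemoryIndexes {
  private entries: { [indexName: string]: Entry[] } = {}
  private listeners: { [indexName: string]: Listener[] } = {}

  // store the entries, then notify every subscriber of that index
  publish(indexName: string, data: Entry | Entry[]): void {
    const batch = Array.isArray(data) ? data : [data]
    const stored = this.entries[indexName] ?? (this.entries[indexName] = [])
    for (const entry of batch) {
      stored.push(entry)
      // iterate over a copy so a listener can unsubscribe while being notified
      for (const listener of (this.listeners[indexName] ?? []).slice()) listener(entry)
    }
  }

  // return entries inside the optional time range, oldest first
  query(indexName: string, filters: { timeRange?: { start: number; end: number } } = {}): Entry[] {
    const range = filters.timeRange
    return (this.entries[indexName] ?? [])
      .filter((e) => !range || (e.timestamp >= range.start && e.timestamp <= range.end))
      .sort((a, b) => a.timestamp - b.timestamp)
  }

  // register a callback and return a function that removes it again
  subscribe(indexName: string, callback: Listener): () => void {
    const list = this.listeners[indexName] ?? (this.listeners[indexName] = [])
    list.push(callback)
    return () => {
      const i = list.indexOf(callback)
      if (i !== -1) list.splice(i, 1)
    }
  }
}
```

a pipe would call `subscribe('task.item', ...)` once at startup and keep the returned function around to unsubscribe on shutdown.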

next steps

  1. iterate & finalize design
  2. implement it

questions

  • how should we handle index versioning?
  • what's the optimal storage strategy for different index types? should we store everything as files (e.g. an obsidian-style file-first approach), or can we use sqlite or similar, less lindy solutions?
  • how should we handle data retention, e.g. for memories from the last 90 days?
  • should we add data transformation utilities?
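
as a starting point for the retention question, one option is a fixed window that splits entries into kept and expired, leaving it to the caller whether expired entries are deleted or summarized first. this is a sketch under that assumption; `applyRetention` is a hypothetical helper and the 90-day default simply mirrors the number in the question.

```typescript
// hypothetical retention helper; the 90-day default mirrors the question above
const DAY_MS = 24 * 60 * 60 * 1000

interface Timestamped {
  timestamp: number // unix ms
}

// split entries into those inside the retention window and those that expired
function applyRetention<T extends Timestamped>(
  entries: T[],
  nowMs: number,
  retentionDays: number = 90
): { kept: T[]; expired: T[] } {
  const cutoff = nowMs - retentionDays * DAY_MS
  const kept: T[] = []
  const expired: T[] = []
  for (const entry of entries) {
    if (entry.timestamp >= cutoff) {
      kept.push(entry)
    } else {
      expired.push(entry)
    }
  }
  return { kept, expired }
}
```

passing `nowMs` in, instead of calling `Date.now()` inside, keeps the function deterministic and easy to test.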
@louis030195 louis030195 added the enhancement New feature or request label Feb 8, 2025
@louis030195 louis030195 pinned this issue Feb 8, 2025