feat: Abstracted data storage layer and added memory and mongodb implementations #82

Open: eugene-manuilov wants to merge 20 commits into main

@eugene-manuilov (Contributor) commented Jan 31, 2025

Re #31

Abstracted the data storage layer so that developers can use different databases for persistent or temporary storage. Created a new package, @daydreamsai/storage, to define storage-related types and kind names. It is needed because the interfaces must be shared between the different database implementations and the core package.

This PR includes two implementations: memory and MongoDB. The memory storage ships as the default storage in the core package. The MongoDB storage lives in its own package, so only those who need it have to install it.

I haven't renamed the MongoDb class located in the core package (core/db/mongo-db.ts), but it needs to be renamed because it is neither a database class nor a MongoDB implementation. It is closer to a service class, but the new name deserves a more considered decision.

Here is a simple example of how to use the memory storage:

import { Orchestrator, MongoDb, MemoryStorage } from "@daydreamsai/core";

const memStorage = new MemoryStorage();

const orchestrator = new Orchestrator(
  roomManager,
  vectorDb,
  processor,
  new MongoDb(memStorage),
  { ... }
);

And here is an example of how to set up the MongoDB storage:

import { Orchestrator, MongoDb } from "@daydreamsai/core";
import { MongoStorage } from "@daydreamsai/mongodb-storage";

const mongodb = new MongoStorage("mongodb://localhost:27017", "myApp");

await mongodb.connect();
await mongodb.migrate();

const orchestrator = new Orchestrator(
  roomManager,
  vectorDb,
  processor,
  new MongoDb(mongodb),
  { ... }
);
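
For readers who have not opened the diff, the shape of the abstraction can be pictured roughly as follows. This is only a sketch, not the code from this PR: the method names `getRepository`, `find`, and `deleteAll` come from the signatures quoted in the review, while `insert`, `Doc`, `RepositoryLike`, `StorageLike`, `SketchRepository`, and `SketchStorage` are hypothetical names chosen for illustration.

```typescript
// Hypothetical, simplified shapes; the real interfaces live in the storage package.
type Doc = { _id: string } & Record<string, unknown>;

interface RepositoryLike {
    insert(doc: Doc): Promise<string>;
    find(query: Record<string, unknown>): Promise<Doc[]>;
    deleteAll(): Promise<void>;
}

interface StorageLike {
    getRepository(kind: string): RepositoryLike;
}

// In-memory repository: documents are kept in a Map keyed by _id.
class SketchRepository implements RepositoryLike {
    private data = new Map<string, Doc>();

    async insert(doc: Doc): Promise<string> {
        this.data.set(doc._id, doc);
        return doc._id;
    }

    // Only plain equality matching is sketched here.
    async find(query: Record<string, unknown>): Promise<Doc[]> {
        return Array.from(this.data.values()).filter((doc) =>
            Object.entries(query).every(([key, value]) => doc[key] === value)
        );
    }

    async deleteAll(): Promise<void> {
        this.data.clear();
    }
}

// Storage hands out one repository per kind and creates it on first use.
class SketchStorage implements StorageLike {
    private repositories = new Map<string, SketchRepository>();

    getRepository(kind: string): RepositoryLike {
        let repo = this.repositories.get(kind);
        if (!repo) {
            repo = new SketchRepository();
            this.repositories.set(kind, repo);
        }
        return repo;
    }
}
```

Because a service class only sees the storage interface, swapping the in-memory backend for a database-backed one would not require changes on the calling side.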

Summary by CodeRabbit

  • New Features
    • Improved project navigation with updated editor settings that hide build output.
    • Added a database UI service in the container configuration for easier MongoDB management.
  • Documentation
    • Enhanced usage guides and examples with clearer integration of MongoDB-backed storage.
  • Refactor
    • Streamlined data handling across various integration examples (Discord, Telegram, Twitter, etc.) for more reliable operations.
  • Chores
    • Updated the build process to a more cohesive multi-package command.

vercel bot commented Jan 31, 2025

| Name | Status | Updated (UTC) |
| --- | --- | --- |
| daydreams | ✅ Ready | Feb 3, 2025 10:17am |

coderabbitai bot commented Jan 31, 2025

Walkthrough

The changes update configuration files and code examples to use a new storage mechanism. The Visual Studio Code settings now exclude the dist folder. The Docker Compose file adds a new service for Mongo Express. In multiple examples and documentation, the old MongoDb usage is replaced with MongoStorage, including migration steps and repository-based deletions. Several core package files are refactored to adopt a repository pattern and new in‑memory storage implementations. New packages for MongoDB storage and a generic storage interface have been introduced along with updated build scripts and TypeScript configurations.

Changes

| File(s) | Change Summary |
| --- | --- |
| .vscode/settings.json | Added exclusions for **/dist in both files.exclude and search.exclude. |
| docker-compose.yaml | Added mongo-express service with environment variables and exposed ports; added mongo-data volume for persistent MongoDB storage. |
| docs/docs/pages/index.mdx | Updated code snippet to import and initialize MongoStorage for database connection and lifecycle management. |
| examples/example-*.ts | Replaced MongoDb with MongoStorage, switched from deleteAll() to migrate() followed by repository deletion calls, and updated orchestrator initialization. |
| package.json & packages/core/package.json | Updated build script from a pnpm command to lerna run build; added "types": "dist/index.d.ts" and dependency on @daydreamsai/storage. |
| packages/core/src/core/db/mongo-db.ts | Refactored to remove direct MongoDB dependencies; now uses repository instances via the Storage interface. |
| packages/core/src/core/index.ts & .../memory.ts | Added export for MemoryStorage and removed connect()/close() methods from the OrchestratorDb interface. |
| packages/core/src/core/storages/* | Introduced new in‑memory storage implementations: MemoryRepository and MemoryStorage. |
| packages/mongodb-storage/* | New package @daydreamsai/mongodb-storage added with implementations for MongoStorage and MongoRepository, including TS config and bundler settings. |
| packages/storage/* | New package @daydreamsai/storage added with constants, query types, and interfaces for Repository and Storage, plus related TS config and bundler settings. |

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant MS as MongoStorage
    participant Repo as Repository

    App->>MS: Initialize connection (connect)
    App->>MS: Migrate database (migrate)
    MS->>Repo: Initialize/get repository for tasks/orchestrators/chats
    Repo-->>MS: Repository ready
    MS-->>App: Storage ready with repositories
    App->>App: Initialize orchestrator with new database (makeFlowLifecycle)

Poem

I'm a little rabbit, hopping through the code,
Skipping over dist and lighter loads.
With MongoStorage, I now leap and bound,
Migrating tasks with each joyful sound.
Repositories assembled, a garden so new,
In this vibrant code field, I cheerfully stew!


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (25)
packages/mongodb-storage/src/mongo-storage.ts (3)

77-81: Verify connection logic.
Using listenerCount("connect") to detect whether the client is connected is uncommon and might not reliably reflect the underlying connection state. Consider a more explicit check (e.g., a private _connected flag or MongoClient#topology.isConnected() in older versions) to prevent multiple connection attempts or concurrency issues.

 public async connect(): Promise<void> {
-    if (!this.client.listenerCount("connect")) {
+    if (!this.db.serverConfig.isConnected()) {
         await this.client.connect();
     }
 }

88-90: Add error/reconnect handling for close.
If the database connection was never successfully opened or unexpectedly dropped, calling close() might throw or silently fail. Consider a defensive check to see if the client is connected or store a ready state flag.
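
One way to make both the connect and the close checks explicit, without relying on listener counts or driver internals, is to remember the in-flight connection promise. The sketch below assumes a hypothetical `ClientLike` shape with only `connect()` and `close()`; it is not the MongoClient API, and `ConnectionGuard` is an illustrative name rather than anything in this PR.

```typescript
// Hypothetical minimal client shape; a real driver client would be wrapped here.
interface ClientLike {
    connect(): Promise<void>;
    close(): Promise<void>;
}

class ConnectionGuard {
    // Holds the pending or completed connection attempt, if any.
    private connecting: Promise<void> | null = null;

    constructor(private readonly client: ClientLike) {}

    // Concurrent callers share one connection attempt.
    connect(): Promise<void> {
        if (!this.connecting) {
            this.connecting = this.client.connect().catch((err) => {
                // Reset so that a later call can retry after a failure.
                this.connecting = null;
                throw err;
            });
        }
        return this.connecting;
    }

    // Closing is a no-op when connect() was never called or never succeeded.
    async close(): Promise<void> {
        if (!this.connecting) return;
        const pending = this.connecting;
        this.connecting = null;
        try {
            await pending;
        } catch {
            return; // the connection never opened, so there is nothing to close
        }
        await this.client.close();
    }
}
```

Storing the promise rather than a boolean flag also covers the window in which a connection attempt has started but not yet finished.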


97-114: Consider error handling and logging in migrate.
Creating indexes can fail or conflict, especially if indexes with the same definition already exist in production. You might want to wrap index creation in try-catch blocks or log any errors for better observability.
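
A possible shape for that error handling is sketched below. It assumes a hypothetical `createIndex` callback and `IndexSpec` type instead of the driver's own API, so that the control flow is visible on its own: every index is attempted, failures are logged, and the failed specs are returned to the caller.

```typescript
// Hypothetical index description and creator; not the driver's API.
type IndexSpec = { collection: string; keys: Record<string, 1 | -1> };
type CreateIndex = (spec: IndexSpec) => Promise<void>;

// Attempts every index and collects the failures instead of stopping at the first.
async function migrateIndexes(
    specs: IndexSpec[],
    createIndex: CreateIndex,
    log: (message: string) => void = console.error
): Promise<IndexSpec[]> {
    const failed: IndexSpec[] = [];
    for (const spec of specs) {
        try {
            await createIndex(spec);
        } catch (err) {
            const reason = err instanceof Error ? err.message : String(err);
            log(`index on ${spec.collection} failed: ${reason}`);
            failed.push(spec);
        }
    }
    return failed;
}
```

Whether a failed index should abort startup is a policy decision for the caller; returning the list keeps that decision out of the migration helper.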

packages/mongodb-storage/src/mongo-repository.ts (2)

104-164: Potential performance optimizations and stricter query checks.
While constructing _query from the Filter interface is flexible, it might benefit from direct type checks or schema validation to avoid unexpected query shapes. Also, see if you can delegate more complex queries to a utility function or use parameterized queries to keep code simpler.


191-193: Production caution: deleteAll.
In a production environment, deleteAll() can be dangerous. Consider restricting usage to development, test, or explicit user actions to avoid accidental mass deletions.

examples/example-vision.ts (2)

65-69: Consider environment-based instantiation details.
Hardcoding the MongoDB URI ("mongodb://localhost:27017") and database name ("myApp") might be fine for an example, but in a real application, you’d typically load these from environment variables for flexible deployment.


76-80: Consider whether mass deletion is intentional.
Deleting all SCHEDULED_TASKS, ORCHESTRATORS, and CHATS might be a handy reset for demos, but be cautious for real deployments. If this code is part of an example, it might be best to label it as such or guard it with a development check.

packages/core/src/core/db/mongo-db.ts (4)

13-13: Consider following through on the TODO comment.
This class is no longer MongoDB-specific and should be renamed to reflect its actual role.


15-17: Avoid the non-null assertion.
Using !: Repository can hide assignment or initialization issues. If possible, initialize them directly or ensure a proper check in the constructor to prevent potential runtime errors.

- private tasks!: Repository;
- private orchestrators!: Repository;
- private chats!: Repository;
+ private tasks: Repository;
+ private orchestrators: Repository;
+ private chats: Repository;

72-73: Ensure proper indexing for performance.
When querying by { status: "pending", nextRunAt: { lte: now } }, large collections may benefit from having indexes on status and nextRunAt.

Also applies to: 75-76


134-134: Data purge confirmed.
deleteAll() is explicitly called; ensure this is not accidentally triggered in production without safeguards.

examples/example-server.ts (2)

37-41: Consider environment checks before wiping data.
Deleting all data can be catastrophic if run by mistake in production. Add environment-based safeguards.

+if (process.env.NODE_ENV === 'development') {
   await Promise.all([
     kvDb.getRepository(SCHEDULED_TASKS_KIND).deleteAll(),
     ...
   ]);
+}

43-43: Rename to reflect a non-Mongo-specific class.
Following the TODO in mongo-db.ts, consider a name like OrchestratorDbImpl or GenericDb.

packages/storage/src/repository.ts (2)

50-54: Consider improving type safety of the update method.

The set and push parameters use Record<string, any> which loses type safety. Consider using generics to maintain type information:

-    update(
-        id: string,
-        set: Record<string, any>,
-        push?: Record<string, any>
-    ): Promise<void>;
+    update<T>(
+        id: string,
+        set: Partial<T>,
+        push?: Partial<Record<keyof T, T[keyof T][]>>
+    ): Promise<void>;
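
Note that the `push` type proposed above still lets any key be paired with an array of any property's value type. A tighter variant is sketched here under two assumptions: `push` appends a single element per field, and only array-valued fields may be pushed to. `ArrayKeys`, `PushFields`, and `applyUpdate` are hypothetical names, and `applyUpdate` is just an in-memory stand-in for exercising the types.

```typescript
// Keys of T whose values are arrays; only these can receive pushed elements.
type ArrayKeys<T> = {
    [K in keyof T]-?: T[K] extends unknown[] ? K : never;
}[keyof T];

// For each array field, the element type that may be appended.
type PushFields<T> = {
    [K in ArrayKeys<T>]?: T[K] extends (infer E)[] ? E : never;
};

// Returns an updated copy: `set` overwrites fields, `push` appends to array fields.
function applyUpdate<T extends object>(
    doc: T,
    set: Partial<T>,
    push?: PushFields<T>
): T {
    const next = { ...doc, ...set } as unknown as Record<string, unknown>;
    for (const [key, value] of Object.entries(push ?? {})) {
        const existing = next[key];
        next[key] = Array.isArray(existing) ? existing.concat([value]) : [value];
    }
    return next as unknown as T;
}

type Chat = { _id: string; title: string; messages: string[] };

const chat: Chat = { _id: "c1", title: "old", messages: ["hi"] };
const updatedChat = applyUpdate<Chat>(chat, { title: "new" }, { messages: "there" });
```

With these types, a call such as `applyUpdate<Chat>(chat, {}, { title: "x" })` is rejected at compile time because `title` is not an array field.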

64-64: Consider making the query parameter optional.

The find method could support retrieving all documents when no query is provided:

-    find<T>(query: Filter, limits?: Limits, sort?: Sort): Promise<T[]>;
+    find<T>(query?: Filter, limits?: Limits, sort?: Sort): Promise<T[]>;

packages/storage/src/query-types.ts (3)

28-33: Consider improving type safety of the Filter type.

The Filter type could be more type-safe by using generics:

-export type Filter = {
-    [key: string]: FilterOperation | any;
-}
+export type Filter<T = any> = {
+    [P in keyof T]?: FilterOperation | T[P];
+}

76-85: Add validation for limit and skip values.

The Limits type should ensure positive numbers:

 export type Limits = {
-    limit?: number;
-    skip?: number;
+    limit?: number & { __brand: 'PositiveNumber' };
+    skip?: number & { __brand: 'NonNegativeNumber' };
 };

+// Helper functions to ensure valid values
+export const createLimit = (n: number): Limits['limit'] => {
+    if (n <= 0) throw new Error('Limit must be positive');
+    return n as Limits['limit'];
+};
+
+export const createSkip = (n: number): Limits['skip'] => {
+    if (n < 0) throw new Error('Skip must be non-negative');
+    return n as Limits['skip'];
+};

38-71: Consider adding more MongoDB-style operators.

The FilterOperation type could include more common MongoDB operators:

 export type FilterOperation = {
     eq?: any;
     gt?: any;
     gte?: any;
     in?: ReadonlyArray<any>;
     lt?: any;
     lte?: any;
     ne?: any;
     nin?: ReadonlyArray<any>;
+    regex?: string | RegExp;
+    exists?: boolean;
+    type?: string;
+    all?: ReadonlyArray<any>;
+    size?: number;
 }

packages/core/src/core/storages/memory-storage.ts (2)

42-42: Consider improving type safety of the repositories map.

The repositories map could be more type-safe:

-    private repositories: Record<string, MemoryRepository> = {};
+    private repositories = new Map<string, MemoryRepository>();

50-55: Consider adding memory leak prevention.

The getRepository method should include a way to limit or clean up unused repositories:

+    private readonly maxRepositories = 100;
+
     public getRepository(kind: string): Repository {
-        if (!this.repositories[kind]) {
-            this.repositories[kind] = new MemoryRepository();
+        let repo = this.repositories.get(kind);
+        if (!repo) {
+            if (this.repositories.size >= this.maxRepositories) {
+                throw new Error('Maximum number of repositories reached');
+            }
+            repo = new MemoryRepository();
+            this.repositories.set(kind, repo);
         }
-        return this.repositories[kind];
+        return repo;
     }

examples/example-discord.ts (1)

68-72: Consider using Promise.allSettled for more robust error handling.

When deleting from multiple repositories, using Promise.allSettled instead of Promise.all would ensure that failures in one repository don't prevent deletions in others.

-    await Promise.all([
+    await Promise.allSettled([
         KVDB.getRepository(SCHEDULED_TASKS_KIND).deleteAll(),
         KVDB.getRepository(ORCHESTRATORS_KIND).deleteAll(),
         KVDB.getRepository(CHATS_KIND).deleteAll(),
     ]);
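
One caveat with this swap: `Promise.allSettled` never rejects, so a failed deletion goes unnoticed unless the results are inspected. The sketch below shows one way to surface failures; `deleteAllSettled` and the task names are hypothetical.

```typescript
// Runs all cleanup tasks and reports which ones failed, without aborting early.
async function deleteAllSettled(
    tasks: Record<string, () => Promise<void>>
): Promise<string[]> {
    const names = Object.keys(tasks);
    const results = await Promise.allSettled(names.map((name) => tasks[name]()));
    const failed: string[] = [];
    results.forEach((result, index) => {
        if (result.status === "rejected") {
            failed.push(`${names[index]}: ${String(result.reason)}`);
        }
    });
    return failed;
}
```

A caller could log the returned list or throw when it is non-empty, depending on whether a partial cleanup is acceptable for the example.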
examples/example-telegram.ts (1)

65-68: Consider using Promise.allSettled for more robust error handling.

When deleting from multiple repositories, using Promise.allSettled instead of Promise.all would ensure that failures in one repository don't prevent deletions in others.

-    await Promise.all([
+    await Promise.allSettled([
         scheduledTaskDb.getRepository(SCHEDULED_TASKS_KIND).deleteAll(),
         scheduledTaskDb.getRepository(ORCHESTRATORS_KIND).deleteAll(),
     ]);

packages/core/src/core/storages/memory-repository.ts (2)

83-116: Consider extracting filter logic into a separate method.

The filter logic is complex and could benefit from being extracted into a separate method for better maintainability and testability.

+    private matchesFilter(item: any, query: Filter): boolean {
+        for (const key in query) {
+            if (typeof query[key] === 'string' && query[key] !== item[key]) {
+                return false;
+            } else if (typeof query[key] === 'object') {
+                if (query[key].eq && query[key].eq !== item[key]) {
+                    return false;
+                }
+                // ... rest of the filter logic
+            }
+        }
+        return true;
+    }
+
     public find<T>(query: Filter, limits?: Limits, sort?: Sort): Promise<T[]> {
-        const items = Object.values(this.data).filter((item) => {
-            for (const key in query) {
-                // ... filter logic
-            }
-            return true;
-        }) as T[];
+        const items = Object.values(this.data).filter(item => this.matchesFilter(item, query)) as T[];
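
The elided part of that helper could look roughly like the sketch below. It is a standalone assumption-laden version, not the PR's code: the operator names follow the FilterOperation type quoted earlier, ordering is only defined for numbers and Dates, and operands are compared against `undefined` rather than by truthiness so that a filter such as `{ eq: 0 }` is not silently skipped.

```typescript
type FilterOperation = {
    eq?: unknown;
    gt?: unknown;
    gte?: unknown;
    in?: ReadonlyArray<unknown>;
    lt?: unknown;
    lte?: unknown;
    ne?: unknown;
    nin?: ReadonlyArray<unknown>;
};
type Filter = Record<string, unknown>;

// Maps numbers and Dates onto a number line; other values are not ordered here.
function toOrdinal(value: unknown): number | undefined {
    if (typeof value === "number") return value;
    if (value instanceof Date) return value.getTime();
    return undefined;
}

// Plain objects are treated as operator bags; Dates and arrays are plain values.
function isOperation(value: unknown): value is FilterOperation {
    return (
        typeof value === "object" &&
        value !== null &&
        !(value instanceof Date) &&
        !Array.isArray(value)
    );
}

function matchesOperation(op: FilterOperation, actual: unknown): boolean {
    if (op.eq !== undefined && op.eq !== actual) return false;
    if (op.ne !== undefined && op.ne === actual) return false;
    if (op.in !== undefined && !op.in.includes(actual)) return false;
    if (op.nin !== undefined && op.nin.includes(actual)) return false;

    const bounds: Array<[unknown, (x: number, y: number) => boolean]> = [
        [op.gt, (x, y) => x > y],
        [op.gte, (x, y) => x >= y],
        [op.lt, (x, y) => x < y],
        [op.lte, (x, y) => x <= y],
    ];
    const a = toOrdinal(actual);
    for (const [bound, test] of bounds) {
        if (bound === undefined) continue;
        const b = toOrdinal(bound);
        // A range operator on a value that cannot be ordered never matches.
        if (a === undefined || b === undefined || !test(a, b)) return false;
    }
    return true;
}

function matchesFilter(item: Record<string, unknown>, query: Filter): boolean {
    return Object.keys(query).every((key) => {
        const expected = query[key];
        return isOperation(expected)
            ? matchesOperation(expected, item[key])
            : expected === item[key];
    });
}
```

Keeping the matcher as a pure function makes it straightforward to unit test separately from the repository.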
🧰 Tools
🪛 Biome (1.9.4)

[error] 110-110: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)


110-110: Use optional chaining for safer property access.

As suggested by the static analysis tool, using optional chaining would make the code more robust.

-                    if (query[key].nin && query[key].nin.includes(item[key])) {
+                    if (query[key].nin?.includes(item[key])) {

packages/storage/tsconfig.json (1)

17-26: Strictness and Unused Code Flags
Enabling "strict": true is a solid decision. The relaxed flags for "noUnusedLocals" and "noUnusedParameters" may be reconsidered in the future for cleaner code if desired, but are acceptable as is.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a3d395 and 5944953.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (34)
  • .vscode/settings.json (1 hunks)
  • docker-compose.yaml (1 hunks)
  • docs/docs/pages/index.mdx (2 hunks)
  • examples/example-api.ts (3 hunks)
  • examples/example-discord.ts (3 hunks)
  • examples/example-hyperliquid.ts (3 hunks)
  • examples/example-server.ts (4 hunks)
  • examples/example-telegram.ts (4 hunks)
  • examples/example-twitter.ts (4 hunks)
  • examples/example-vision.ts (2 hunks)
  • package.json (1 hunks)
  • packages/core/package.json (2 hunks)
  • packages/core/src/core/db/mongo-db.ts (13 hunks)
  • packages/core/src/core/index.ts (2 hunks)
  • packages/core/src/core/memory.ts (0 hunks)
  • packages/core/src/core/schedule-service.ts (2 hunks)
  • packages/core/src/core/storages/index.ts (1 hunks)
  • packages/core/src/core/storages/memory-repository.ts (1 hunks)
  • packages/core/src/core/storages/memory-storage.ts (1 hunks)
  • packages/core/src/core/types/index.ts (1 hunks)
  • packages/mongodb-storage/package.json (1 hunks)
  • packages/mongodb-storage/src/index.ts (1 hunks)
  • packages/mongodb-storage/src/mongo-repository.ts (1 hunks)
  • packages/mongodb-storage/src/mongo-storage.ts (1 hunks)
  • packages/mongodb-storage/tsconfig.json (1 hunks)
  • packages/mongodb-storage/tsup.config.ts (1 hunks)
  • packages/storage/package.json (1 hunks)
  • packages/storage/src/constants.ts (1 hunks)
  • packages/storage/src/index.ts (1 hunks)
  • packages/storage/src/query-types.ts (1 hunks)
  • packages/storage/src/repository.ts (1 hunks)
  • packages/storage/src/storage.ts (1 hunks)
  • packages/storage/tsconfig.json (1 hunks)
  • packages/storage/tsup.config.ts (1 hunks)
💤 Files with no reviewable changes (1)
  • packages/core/src/core/memory.ts
✅ Files skipped from review due to trivial changes (10)
  • packages/mongodb-storage/tsup.config.ts
  • packages/storage/src/constants.ts
  • packages/storage/tsup.config.ts
  • .vscode/settings.json
  • packages/storage/src/index.ts
  • packages/mongodb-storage/tsconfig.json
  • packages/core/src/core/storages/index.ts
  • packages/mongodb-storage/package.json
  • packages/mongodb-storage/src/index.ts
  • packages/storage/package.json
🧰 Additional context used
🪛 Biome (1.9.4)
packages/core/src/core/storages/memory-repository.ts

[error] 110-110: Change to an optional chain.

Unsafe fix: Change to an optional chain.

(lint/complexity/useOptionalChain)

🔇 Additional comments (40)
packages/mongodb-storage/src/mongo-storage.ts (2)

44-44: Solid introduction of MongoStorage class.
The overall structure follows the Storage interface well, and the separation of concerns is clear.


122-127: Check for concurrency in getRepository.
Access to this.repositories runs the risk of concurrency issues if multiple requests call getRepository simultaneously for the same collection. Although JavaScript event loop concurrency is single-threaded, if your usage scenario includes concurrency or complex initialization, a short lock or safe check might be beneficial.

packages/mongodb-storage/src/mongo-repository.ts (3)

39-39: Clear naming and structure.
The class name, MongoRepository, and the constructor signature effectively communicate the purpose of this class in handling MongoDB-based operations.


172-174: Same ID handling caution as above.
Similar to update(), if _id is stored as an ObjectId, you might want to convert the string query parameter to an ObjectId. Otherwise, direct equality matches might fail.


182-184: Confirm ID type for delete.
If _id is an ObjectId in the database, passing a string literal here may not match documents. Ensure consistent usage across insert, find, update, and delete flows.

examples/example-vision.ts (4)

27-27: Imports are logically grouped.
The new MongoStorage import is introduced coherently alongside other imports.


73-75: Migrate after connect.
Running await scheduledTaskDb.migrate(); right after connecting is a good approach to ensure your schema and indexes are up to date before usage. Continue verifying that the relevant operation side effects do not block or degrade runtime performance.


82-82: Potential mismatch in usage.
Here, const orchestratorDb = new MongoDb(scheduledTaskDb); references the old MongoDb class even though you have introduced MongoStorage. Confirm whether you still need the old MongoDb or if you intended to replace it with a new instance of MongoStorage.


86-86: Flow lifecycle injection.
makeFlowLifecycle(orchestratorDb, conversationManager) is well placed. Just ensure that the orchestratorDb matches your final storage choice.

packages/core/src/core/db/mongo-db.ts (8)

1-2: Imports look consistent with the new storage approach.
No issues found.


25-25: Add error handling or fallback checks when retrieving repositories.
If a repository kind is not properly set up or fails to initialize, the application may run into runtime errors. Consider adding checks or try-catch blocks.

Also applies to: 27-30


42-42: Creation logic looks straightforward.
The overall approach to creating and inserting tasks appears correct.

Also applies to: 61-61


85-88: Consider concurrency controls for these operations.
Methods like markRunning, markCompleted, updateNextRun, getOrCreateChat, and addChatMessage can introduce race conditions if called simultaneously. You might need atomic queries, version checks, or transactions to prevent inconsistent data states or duplicate chats/messages.

Also applies to: 95-98, 108-112, 137-137, 143-143, 168-168, 179-192


162-162: Successful insert for a new chat.
No further concerns.


199-200: Safe retrieval of chat messages.
Returning doc?.messages || [] avoids runtime issues if the chat is missing.

Also applies to: 202-202


211-211: General retrieval logic looks good.
No immediate concerns found with these find operations.

Also applies to: 215-215, 220-220


229-230: Proper sorting and error handling.
Using { createdAt: "desc" } for ordering and catching errors helps maintain clarity.

Also applies to: 233-233, 241-241

examples/example-server.ts (2)

23-24: Imports align with the new storage approach.
No issues found.


26-29: Connection and setup logic looks clean.
Establishing connection, migrating indexes, and logging are well-structured.

Also applies to: 31-31, 34-34

packages/core/src/core/index.ts (1)

20-20: Memory storage export is aligned with the new storage abstraction.
No issues found.

Also applies to: 42-42

packages/storage/src/storage.ts (1)

33-41: Well-designed storage interface!

The Storage interface follows the Interface Segregation Principle with a minimal, focused API. The documentation clearly describes the purpose and usage.

packages/core/src/core/schedule-service.ts (1)

38-40: LGTM! Good defensive programming.

Adding a null check for task._id before proceeding with task processing is a good practice to prevent potential runtime errors.

examples/example-discord.ts (1)

56-59: LGTM! Good database initialization sequence.

The code properly initializes the MongoDB connection and creates necessary indexes through the migrate step.

Also applies to: 61-65

examples/example-telegram.ts (1)

54-57: LGTM! Good database initialization sequence.

The code properly initializes the MongoDB connection and creates necessary indexes through the migrate step.

Also applies to: 59-63

examples/example-twitter.ts (3)

30-31: LGTM!

The new imports correctly bring in the required storage-related types and constants.


75-82: LGTM!

The migration and cleanup steps are properly implemented:

  1. Database migration is performed
  2. Indexes are created
  3. All repositories are cleaned up in parallel

66-69: Verify the orchestratorDb initialization.

The code initializes kvDb with MongoStorage but then creates orchestratorDb using the old MongoDb class. This might lead to compatibility issues.

Run the following script to check for similar patterns in other files:

Also applies to: 84-84

✅ Verification successful

OrchestratorDb initialization verified as correct.

  • The instance of MongoStorage (kvDb) is intentionally wrapped by MongoDb when initializing orchestratorDb.
  • No additional instances or conflicting usage patterns were found in the codebase that indicate a compatibility issue.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential inconsistencies in database initialization
# Test: Search for patterns where MongoStorage is used with MongoDb
rg -A 5 "new MongoStorage.*\n.*new MongoDb"

Length of output: 237


Script:

#!/bin/bash
# Description: Re-run the search for inconsistencies in database initialization
# between MongoStorage and MongoDb using multiline mode.
rg -U -A 5 "new MongoStorage.*\n.*new MongoDb"

Length of output: 48


Script:

#!/bin/bash
# Display relevant code section from examples/example-twitter.ts (lines 60-90)
echo "Displaying lines 60 to 90 from examples/example-twitter.ts:"
sed -n '60,90p' examples/example-twitter.ts

Length of output: 1073

examples/example-api.ts (2)

70-73: Verify the orchestratorDb initialization.

Similar to example-twitter.ts, there's a potential compatibility issue between MongoStorage and MongoDb.

Also applies to: 87-87


78-85: LGTM!

The database migration and cleanup steps are consistent with other examples.

packages/core/src/core/types/index.ts (1)

651-651: LGTM!

Making _id optional in the ScheduledTask interface aligns with MongoDB's behavior of auto-generating IDs when not provided.

docker-compose.yaml (3)

16-17: Volume Mapping for Mongo Service
The addition of the volumes: key with mapping - mongo-data:/data/db for the mongo service ensures that MongoDB data persists between container restarts. Confirm that the mongo-data volume meets production backup/retention policies.


18-27: Mongo-Express Service Configuration
The new mongo-express service is configured with the proper image, restart policy, port mapping, and environment variables. This addition provides a visual interface for managing MongoDB, which is useful during development and troubleshooting.


28-29: Volume Declaration Verification
The declaration of the mongo-data volume at the bottom of the file is correctly placed and ensures that persistent storage is properly defined for the MongoDB service.

packages/storage/tsconfig.json (2)

1-8: General TypeScript Configuration
The compiler options leverage modern ECMAScript features using "target": "ESNext" and "module": "ESNext", and the inclusion of "jsx": "react-jsx" indicates support for React JSX. Verify that JSX support is intentional in a storage-related package.


11-15: Bundler Mode and NoEmit Setting
Settings such as "moduleResolution": "bundler" and "noEmit": true are appropriate for a development and type-checking setup where output files are not required.

package.json (1)

15-16: Build Script Transition to Lerna
Modifying the "build" script from a package-specific command to "lerna run build" reflects the shift toward building all packages in the repository. Ensure that Lerna’s configuration properly manages dependency order and build sequencing across the monorepo.

packages/core/package.json (2)

7-7: Type Definitions Specification
Adding "types": "dist/index.d.ts" enhances TypeScript support for consumers of the @daydreamsai/core package by clearly pointing to the declaration files.


35-35: Introducing Storage Dependency
Including "@daydreamsai/storage": "workspace:^" in the dependencies promotes modularity by integrating the new abstract storage layer. This change is in line with the overall architectural improvements.

docs/docs/pages/index.mdx (2)

105-106: Integration of New MongoStorage Import
The import of MongoStorage from @daydreamsai/mongodb-storage reflects the updated storage abstraction. Ensure that all subsequent usage in the example code is consistent with the new design.


141-144: Orchestrator Initialization
The orchestrator initialization using makeFlowLifecycle with the (presumably updated) orchestrator database and ConversationManager demonstrates proper integration of the storage layer into the overall architecture.

Comment on lines +83 to +102
public async update(
    id: string,
    set: Record<string, any>,
    push: Record<string, any> = {}
): Promise<void> {
    const data = {
        $set: {},
        $push: {},
    };

    if (Object.keys(set).length) {
        data["$set"] = set;
    }

    if (Object.keys(push).length) {
        data["$push"] = push;
    }

    await this.collection.updateOne({ _id: id }, data);
}

⚠️ Potential issue

Convert ID fields to ObjectId if applicable.
If your MongoDB collection uses ObjectId for _id, a filter on a string _id matches nothing, so updateOne silently updates zero documents. Either convert the string to an ObjectId or confirm that _id is stored as a string.

@@ -101,7 +101,8 @@
     async update(
         id: string,
         set: Record<string, any>,
         push: Record<string, any> = {}
     ): Promise<void> {
         ...
-        await this.collection.updateOne({ _id: id }, data);
+        const _id = new ObjectId(id);
+        await this.collection.updateOne({ _id }, data);
     }
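
One caveat with the change above: `new ObjectId(id)` throws when `id` is not a valid ObjectId representation (typically a 24-character hex string), and it also needs `ObjectId` imported from the `mongodb` package. A minimal sketch of a guard, assuming a collection may hold either ObjectId or plain string ids. `toIdFilter` and `makeObjectId` are hypothetical names; in real code the factory would be `(hex) => new ObjectId(hex)`.

```typescript
// Hypothetical helper: build the `_id` part of an update filter.
// `makeObjectId` stands in for `(hex) => new ObjectId(hex)` from the
// `mongodb` package so that this sketch has no external dependency.
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

function toIdFilter<T>(
    id: string,
    makeObjectId: (hex: string) => T
): { _id: T | string } {
    // Convert only strings that look like an ObjectId; leave others as-is
    // so collections that store string ids keep working.
    return { _id: OBJECT_ID_HEX.test(id) ? makeObjectId(id) : id };
}
```

This guard would misfire on a collection that stores 24-character hex values as plain strings, so the safer fix is still to decide on one `_id` type per collection.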

Comment on lines 223 to +225
 // Get all orchestrator records for this user
-const histories = await kvDb.getOrchestratorsByUserId(userId);
+const histories =
+    await orchestratorDb.getOrchestratorsByUserId(userId);

⚠️ Potential issue

Security consideration: user authorization checks.
You’re fetching orchestrator data purely by userId or chatId. If these endpoints are public, a malicious or unverified user can access someone else’s data by changing the parameter. Consider integrating authentication and ownership checks.

Also applies to: 254-254
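
A minimal sketch of the ownership check described above, assuming the authenticated user's id is already available from whatever authentication layer the server uses. `AuthContext`, `HttpError`, and `assertCanReadUser` are hypothetical names and are not part of the PR.

```typescript
// Hypothetical shape of what an auth layer would attach to a request.
interface AuthContext {
    userId: string;
    isAdmin?: boolean;
}

// Error carrying an HTTP status so a route handler can map it to a response.
class HttpError extends Error {
    status: number;
    constructor(status: number, message: string) {
        super(message);
        this.status = status;
    }
}

// Reject the request unless the caller asks for their own data (or is an admin).
function assertCanReadUser(
    auth: AuthContext | undefined,
    requestedUserId: string
): void {
    if (!auth) {
        throw new HttpError(401, "authentication required");
    }
    if (auth.userId !== requestedUserId && !auth.isAdmin) {
        throw new HttpError(403, "cannot read another user's data");
    }
}
```

A handler would call `assertCanReadUser(auth, userId)` before `getOrchestratorsByUserId(userId)`; a lookup by chatId would need an equivalent check against the owner of that chat.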

Comment on lines +46 to +47
const id = Math.random().toString(36).substr(2, 9);
this.data[id] = data;

🛠️ Refactor suggestion

Consider using a more robust ID generation method.

Using Math.random() for ID generation could potentially lead to collisions. Consider using UUID v4 or another robust ID generation method.

-        const id = Math.random().toString(36).substr(2, 9);
+        const id = crypto.randomUUID();
         this.data[id] = data;
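
For scale: nine base-36 characters carry at most about 46 bits, and `Math.random().toString(36)` can return fewer than nine characters after the prefix, whereas a version 4 UUID carries 122 random bits. A short sketch of the suggested replacement, assuming a Node.js runtime (16 or later) where `randomUUID` is exported from `node:crypto`. `generateId` is a hypothetical helper name.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical helper: returns a version 4 UUID string in the usual
// 8-4-4-4-12 hexadecimal layout. The value differs on every call.
function generateId(): string {
    return randomUUID();
}
```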

Comment on lines +164 to +175
public findOne<T>(query: Filter): Promise<T | null> {
    const item = Object.values(this.data).find((item) => {
        for (const key in query) {
            if (query[key] !== item[key]) {
                return false;
            }
        }
        return true;
    }) as T;

    return Promise.resolve(item || null);
}

🛠️ Refactor suggestion

Reuse filter logic from find method.

The findOne method duplicates filter logic from the find method. Consider reusing the filter logic.

     public findOne<T>(query: Filter): Promise<T | null> {
-        const item = Object.values(this.data).find((item) => {
-            for (const key in query) {
-                if (query[key] !== item[key]) {
-                    return false;
-                }
-            }
-            return true;
-        }) as T;
+        const item = Object.values(this.data).find(item => this.matchesFilter(item, query)) as T;
         return Promise.resolve(item || null);
     }
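
The diff above calls `this.matchesFilter`, which is not shown. A minimal sketch of what such a shared predicate could look like, assuming `Filter` is a flat map of field names to expected values compared with strict equality, as in the original `findOne` loop. The `Filter` type and the free-function form used here are assumptions, not the PR's code.

```typescript
// Assumed shape: a flat map of field name to expected value.
type Filter = Record<string, unknown>;

// True when every key in `query` strictly equals the same key on `item`.
function matchesFilter(item: Record<string, unknown>, query: Filter): boolean {
    return Object.keys(query).every((key) => item[key] === query[key]);
}

// Both lookups share the predicate instead of duplicating the loop.
function find<T extends Record<string, unknown>>(items: T[], query: Filter): T[] {
    return items.filter((item) => matchesFilter(item, query));
}

function findOne<T extends Record<string, unknown>>(
    items: T[],
    query: Filter
): T | null {
    return items.find((item) => matchesFilter(item, query)) ?? null;
}
```

An empty query matches every item, which mirrors the behaviour of the original loop.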

Comment on lines +60 to 63
const scheduledTaskDb = new MongoStorage(
    "mongodb://localhost:27017",
    "myApp",
    "scheduled_tasks"
);

🛠️ Refactor suggestion

Consider refactoring the database initialization pattern.

The repeated pattern of initializing MongoStorage followed by MongoDb across multiple files suggests a need for abstraction.

Consider creating a factory function to handle database initialization consistently:

+// In a shared utility file
+function createOrchestrator(url: string, dbName: string) {
+  const storage = new MongoStorage(url, dbName);
+  return {
+    storage,
+    orchestratorDb: new MongoDb(storage)
+  };
+}

-const scheduledTaskDb = new MongoStorage(
-  "mongodb://localhost:27017",
-  "myApp",
-);
-const orchestratorDb = new MongoDb(scheduledTaskDb);

+const { storage: scheduledTaskDb, orchestratorDb } = createOrchestrator(
+  "mongodb://localhost:27017",
+  "myApp"
+);

Also applies to: 77-77
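
If the factory should also own connection setup, it could be made asynchronous. A sketch under the assumption that the storage exposes `connect()` and `migrate()` as used elsewhere in this PR. `ConnectableStorage` and `createConnected` are hypothetical names, and the storage and its wrapper are passed in as factories so the sketch does not depend on the real packages.

```typescript
// Minimal surface this sketch relies on (an assumption, not the real interface).
interface ConnectableStorage {
    connect(): Promise<void>;
    migrate(): Promise<void>;
}

// Create the storage, connect, run migrations, then wrap it.
async function createConnected<S extends ConnectableStorage, D>(
    makeStorage: () => S,
    wrap: (storage: S) => D
): Promise<{ storage: S; orchestratorDb: D }> {
    const storage = makeStorage();
    await storage.connect();
    await storage.migrate();
    return { storage, orchestratorDb: wrap(storage) };
}
```

With the classes from this PR, the call would presumably look like `createConnected(() => new MongoStorage(url, dbName), (s) => new MongoDb(s))`.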

Comment on lines +132 to +139
// Connect to MongoDb and provision indexes
const scheduledTaskDb = new MongoStorage("mongodb://localhost:27017", "myApp");

await scheduledTaskDb.connect();
await scheduledTaskDb.migrate();

const orchestratorDb = new MongoDb(scheduledTaskDb);


⚠️ Potential issue

Inconsistent Use of Mongo Storage Classes
While a MongoStorage instance is created and connected (lines 132–136), the code then uses new MongoDb(scheduledTaskDb) (line 138) to initialize the orchestrator database. This mix of MongoStorage and MongoDb is inconsistent given the PR’s intent to abstract and modernize the storage layer. Consider replacing MongoDb with MongoStorage (or renaming it appropriately) to avoid confusion and ensure consistency.
