Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Kubernetes Job Agent #198

Merged
merged 16 commits into from
Nov 4, 2024
Merged

fix: Kubernetes Job Agent #198

merged 16 commits into from
Nov 4, 2024

Conversation

zacharyblasczyk
Copy link
Member

@zacharyblasczyk zacharyblasczyk commented Nov 3, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced job management through integration with the @ctrlplane/node-sdk.
    • Introduced new properties (externalId, jobAgentConfig) in the API response schemas for improved context and configuration options.
    • Added a new endpoint for acknowledging jobs.
    • Implemented a new method to retrieve currently running jobs associated with the agent.
    • Simplified API client creation process for better efficiency.
  • Bug Fixes

    • Resolved issues with job updates and retrieval by implementing more robust API calls.
  • Refactor

    • Updated return types for improved type safety in job scanning functions.
    • Transitioned to a more object-oriented approach in managing job operations.
    • Improved readability of API schemas with better formatting.

Copy link
Contributor

coderabbitai bot commented Nov 3, 2024

Walkthrough

The changes in this pull request enhance the functionality of the Kubernetes job agent by transitioning to a more object-oriented approach using the Job type from the @ctrlplane/node-sdk. Key modifications include updates to job management functions such as deployManifest, spinUpNewJobs, and updateExecutionStatus, which now utilize the Job instance methods for better encapsulation. The API client initialization has been simplified, and the OpenAPI specification has been updated to include new properties and endpoints, improving job management and configuration.

Changes

File Path Change Summary
integrations/kubernetes-job-agent/src/index.ts - Added import for Job.
- Updated deployManifest, spinUpNewJobs, and updateExecutionStatus methods to use Job instance methods.
- Refactored scan function to call spinUpNewJobs and updateExecutionStatus without parameters.
integrations/kubernetes-job-agent/src/sdk.ts - Changed API client instantiation from DefaultApi to createClient.
- Updated basePath to baseUrl.
- Exported api directly from createClient.
integrations/kubernetes-job-agent/src/utils.ts - Updated return type of ScannerFunc from SetTargetProvidersTargetsRequest["targets"] to Operations["setTargetProvidersTargets"]["requestBody"]["content"]["application/json"]["targets"].
apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts - Added externalId and jobAgentConfig properties to the GET response schema for /v1/jobs/{jobId}.
- Updated PATCH request body schema to include externalId.
integrations/kubernetes-job-agent/src/agent.ts - Introduced new file agent.ts implementing JobAgent with configuration from environment variables.
openapi.v1.json - Added new properties and endpoints, including externalId and jobAgentConfig in job-related responses.
- Reintroduced and updated /v1/job-agents/{agentId}/queue/acknowledge endpoint.
packages/node-sdk/src/index.ts - Added running method to JobAgent class to retrieve currently running jobs.
- Changed Job class visibility from private to public.
packages/node-sdk/src/schema.ts - Renamed and restructured endpoints, including updates to job acknowledgment and retrieval functionalities.
- Added optional externalId fields to getJob and updateJob operations.

Possibly related PRs

Suggested reviewers

  • jsbroks

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (4)
integrations/kubernetes-job-agent/src/sdk.ts (1)

5-8: Consider adding error handling for missing environment variables.

The client initialization should handle cases where environment variables are undefined or invalid.

Consider wrapping the client initialization in a try-catch block and adding validation:

+if (!env.CTRLPLANE_API_URL || !env.CTRLPLANE_API_KEY) {
+  throw new Error('Missing required environment variables: CTRLPLANE_API_URL and CTRLPLANE_API_KEY must be defined');
+}
+
 export const api = createClient({
   baseUrl: env.CTRLPLANE_API_URL,
   apiKey: env.CTRLPLANE_API_KEY,
 });
integrations/kubernetes-job-agent/src/utils.ts (2)

Line range hint 3-10: Consider making the utility function more type-flexible.

The function currently forces the return type to Record<string, string>, which limits its usefulness when dealing with non-string values.

Here's a more flexible implementation using generics:

-export function omitNullUndefined(obj: object) {
-  return Object.entries(obj).reduce<Record<string, string>>(
+export function omitNullUndefined<T extends object>(obj: T) {
+  return Object.entries(obj).reduce<Partial<T>>(
    (acc, [key, value]) => {
-      if (value !== null && value !== undefined) acc[key] = value;
+      if (value !== null && value !== undefined) acc[key as keyof T] = value as T[keyof T];
      return acc;
    },
    {},
  );
}

12-15: Consider creating a type alias for better maintainability.

The deeply nested type path, while type-safe, could be hard to maintain and understand. Consider creating a dedicated type alias.

+type TargetProviderTargets = Operations["setTargetProvidersTargets"]["requestBody"]["content"]["application/json"]["targets"];
+
-export type ScannerFunc = () => Promise<
-  Operations["setTargetProvidersTargets"]["requestBody"]["content"]["application/json"]["targets"]
->;
+export type ScannerFunc = () => Promise<TargetProviderTargets>;

This change would:

  1. Make the code more readable
  2. Centralize the type definition for easier updates
  3. Allow reuse of the type if needed elsewhere
integrations/kubernetes-job-agent/src/index.ts (1)

86-92: Ensure robust response data handling

Consider enhancing the check for response.data. Using if (!response.data) covers both undefined and null cases, ensuring that the code handles all falsy values appropriately.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8b6f3d4 and c6cf741.

📒 Files selected for processing (3)
  • integrations/kubernetes-job-agent/src/index.ts (7 hunks)
  • integrations/kubernetes-job-agent/src/sdk.ts (1 hunks)
  • integrations/kubernetes-job-agent/src/utils.ts (2 hunks)
🔇 Additional comments (12)
integrations/kubernetes-job-agent/src/sdk.ts (1)

1-8: LGTM! The API client initialization is now more streamlined.

The transition to createClient simplifies the configuration while maintaining the essential functionality.

Let's verify the impact of these changes on the codebase:

✅ Verification successful

Based on the results, I can now generate the final response:

API client migration is complete and consistent

The verification shows:

  • No traces of old API client (DefaultApi or Configuration) remain in the codebase
  • All API endpoints are using the new client consistently with proper path parameters and request structure
  • The migration pattern is uniform across all endpoints (GET, POST, PATCH)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of the old vs new API client methods
# to ensure all calls have been updated appropriately

# Check for any remaining usage of DefaultApi or Configuration
echo "Checking for any remaining old API usage..."
rg -l "DefaultApi|Configuration.*from.*@ctrlplane/node-sdk" --type ts

# Look for any direct usage of the API client to verify method compatibility
echo "Checking API client usage patterns..."
rg -A 3 "api\.(GET|POST|PATCH|PUT|DELETE)" --type ts

Length of output: 2767

integrations/kubernetes-job-agent/src/utils.ts (1)

1-1: LGTM! Good improvement in type safety.

The switch to importing the Operations type from the SDK is a good move. It provides better type safety by using the generated API types directly.

integrations/kubernetes-job-agent/src/index.ts (10)

6-6: Importing 'JobAgent' from '@ctrlplane/node-sdk'

The import statement correctly brings in the JobAgent class from @ctrlplane/node-sdk, which aligns with the shift towards utilizing the SDK for job management as indicated in the PR objectives.


37-44: Verify the use of status 'invalid_job_agent'

Ensure that 'invalid_job_agent' is the appropriate status code to indicate that the job name was not found in the manifest. Confirm that this status aligns with the API's expected status codes and update it if necessary (e.g., to 'invalid_job' or another relevant status).


51-59: Updating job status to 'in_progress' after successful job creation

The API call correctly updates the job status to 'in_progress', sets the externalId, and provides a success message. This ensures that the job status accurately reflects its state in the system.


72-76: Consistent use of status codes in error handling

Verify that 'invalid_job_agent' is the correct status code for reporting an error during manifest deployment. Consistency with the API's status codes is crucial for proper error handling and downstream processing.


100-107: Proper error handling when fetching job details

The retrieval of job details and the subsequent null check are implemented correctly. This ensures that the code gracefully handles scenarios where job details are not available.


114-114: Rendering manifest with job details

The call to renderManifest correctly uses job.jobAgentConfig.manifest and je.data to generate the manifest. This ensures that the job-specific details are properly injected into the manifest template.


117-121: Acknowledging the job before deployment

The API call to acknowledge the job is correctly implemented. This step is important to inform the system that the job processing has begun, preventing potential duplicate processing.


143-149: Consistency in response data structure

Unlike previous API responses, here response.data is assigned directly to jobs. Ensure that response.data is indeed an array of jobs. If the API returns an object containing a jobs array, consider destructuring it as const { jobs = [] } = response.data; for consistency and to prevent potential runtime errors.


165-169: Updating job execution status and message

The API call effectively updates the job's status and message based on the Kubernetes job status. This ensures that the job's progress is accurately reflected in the system.


192-200: Initializing 'JobAgent' instance and retrieving agent ID

The JobAgent is correctly instantiated with the necessary configuration, and the agent ID is retrieved using agent.get(). This setup is essential for managing job agents in a more structured manner.

Comment on lines 108 to 113
if (
typeof job.jobAgentConfig !== "object" ||
!("manifest" in job.jobAgentConfig)
)
throw new Error("Job manifest is required");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Enhance null checking for 'job.jobAgentConfig'

The condition typeof job.jobAgentConfig !== "object" may not effectively handle cases where job.jobAgentConfig is null, since typeof null === "object". To prevent potential runtime errors, consider adding an explicit null check.

Apply this diff to fix the issue:

 if (
-  typeof job.jobAgentConfig !== "object" ||
+  job.jobAgentConfig == null ||
   !("manifest" in job.jobAgentConfig)
 )
   throw new Error("Job manifest is required");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (
typeof job.jobAgentConfig !== "object" ||
!("manifest" in job.jobAgentConfig)
)
throw new Error("Job manifest is required");
if (
job.jobAgentConfig == null ||
!("manifest" in job.jobAgentConfig)
)
throw new Error("Job manifest is required");

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (10)
integrations/kubernetes-job-agent/src/agent.ts (1)

6-13: Consider architectural improvements for better maintainability.

A few architectural suggestions to consider:

  1. Move the "kubernetes-job" type to a constants file for better maintainability
  2. Consider implementing a proper singleton pattern with lazy initialization

Example implementation:

// constants.ts
export const AGENT_TYPES = {
  KUBERNETES_JOB: "kubernetes-job",
} as const;

// agent.ts
let agentInstance: JobAgent | null = null;

export const getAgent = (): JobAgent => {
  if (!agentInstance) {
    agentInstance = new JobAgent(
      {
        name: config.CTRLPLANE_AGENT_NAME,
        workspaceId: config.CTRLPLANE_WORKSPACE_ID,
        type: AGENT_TYPES.KUBERNETES_JOB,
      },
      api,
    );
  }
  return agentInstance;
};

This approach:

  • Centralizes configuration constants
  • Provides lazy initialization
  • Makes testing easier through the ability to reset the singleton
packages/node-sdk/src/index.ts (1)

Line range hint 114-154: Consider adding JSDoc documentation for the Job class.

Since this is a public SDK class, it would be valuable to add comprehensive documentation:

  • Class description
  • Method documentation with parameters and return types
  • Usage examples

Example documentation:

/**
 * Represents a job in the CtrlPlane system.
 * Used to manage individual job lifecycle and status.
 */
export class Job {
  /**
   * Creates a new Job instance
   * @param job - Job identifier object
   * @param client - API client instance
   */
  constructor(
    private job: { id: string },
    private client: ReturnType<typeof createClient>,
  ) {}

  /**
   * Acknowledges the job receipt
   * @returns Promise resolving to the acknowledgment response
   */
  acknowledge() {
    // ... existing implementation
  }
  // ... document other methods
}
integrations/kubernetes-job-agent/src/index.ts (4)

24-27: Consider using stricter typing for the manifest parameter.

The manifest parameter is typed as any which bypasses TypeScript's type checking. Consider creating an interface that defines the expected structure of a Kubernetes job manifest.

interface K8sJobManifest {
  metadata?: {
    name?: string;
    namespace?: string;
  };
  // Add other required fields
}

69-71: Improve error status type safety.

The type assertion "invalid_job_agent" as const suggests we're working with a string literal union type for status. Consider defining an enum or union type for job statuses to prevent typos and ensure type safety.

type JobStatus = 'pending' | 'in_progress' | 'invalid_job_agent' | 'completed' | 'failed';

81-82: Consider adding concurrency control for job creation.

Using Promise.allSettled without limits could potentially overwhelm the Kubernetes API server if there are many jobs. Consider implementing a concurrency limit.

// Example implementation using p-limit
import pLimit from 'p-limit';
const limit = pLimit(5); // Process max 5 jobs concurrently
await Promise.allSettled(
  jobs.map((job: Job) => limit(() => processJob(job)))
);

Line range hint 144-153: Consider implementing retry logic for scan operations.

The scan operation is critical for job management. Consider implementing retry logic with exponential backoff for transient failures.

import { exponentialBackoff } from './utils';

const scanWithRetry = async (retries = 3) => {
  return exponentialBackoff(async () => {
    const { id } = await agent.get();
    logger.info(`Agent ID: ${id}`);
    await spinUpNewJobs();
    await updateExecutionStatus();
  }, retries);
};
apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts (2)

214-224: Consider adding manifest format constraints.

Given this PR's focus on Kubernetes Job Agent, consider enhancing the manifest property schema to:

  1. Specify the expected format (e.g., YAML)
  2. Add pattern validation if it should match Kubernetes manifest structure
  3. Include example in the description
 jobAgentConfig: {
   type: "object",
   properties: {
     manifest: {
       type: "string",
-      description: "The manifest template for the job",
+      description: "The Kubernetes manifest template for the job (YAML format)",
+      examples: ["apiVersion: batch/v1\nkind: Job\nmetadata:\n  name: example"],
+      pattern: "^(apiVersion:|kind:|metadata:)",
     },
   },
   required: ["manifest"],
   description: "Configuration for the Job Agent",
 },

Line range hint 1-300: Consider additional Job Agent properties

Given this is a Kubernetes Job Agent implementation, consider adding these properties to enhance job management:

  1. jobAgentConfig.namespace - Target Kubernetes namespace
  2. jobAgentConfig.serviceAccount - Service account for job execution
  3. status.podStatus - Detailed pod status information

This would provide better visibility and control over Kubernetes job execution.

openapi.v1.json (1)

207-210: Add documentation and schema validation for targetFilter.

The new targetFilter property lacks documentation and schema validation. Consider:

  1. Adding a description explaining its purpose and usage
  2. Defining a specific schema for allowed properties instead of using additionalProperties: true
 "targetFilter": {
   "type": "object",
-  "additionalProperties": true
+  "description": "Filter criteria for targeting specific deployment targets",
+  "properties": {
+    // Define specific properties and their types here
+  },
+  "additionalProperties": false
 }
packages/node-sdk/src/schema.ts (1)

170-170: Fix grammatical error in comment: "Get an agent's running jobs"

The comment at line 170 should be corrected for proper grammar.

Apply this diff to fix the comment:

-    /** Get a agents running jobs */
+    /** Get an agent's running jobs */
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between c6cf741 and 58f2ee5.

📒 Files selected for processing (6)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts (2 hunks)
  • integrations/kubernetes-job-agent/src/agent.ts (1 hunks)
  • integrations/kubernetes-job-agent/src/index.ts (4 hunks)
  • openapi.v1.json (4 hunks)
  • packages/node-sdk/src/index.ts (1 hunks)
  • packages/node-sdk/src/schema.ts (5 hunks)
🔇 Additional comments (7)
integrations/kubernetes-job-agent/src/agent.ts (1)

1-4: LGTM! Clean and well-organized imports.

The imports are properly structured with named imports and correct ESM extensions.

packages/node-sdk/src/index.ts (1)

114-114: LGTM! Appropriate visibility change for SDK usage.

Making the Job class public is a good decision as it:

  • Enables better object-oriented job management
  • Maintains proper encapsulation of job operations
  • Provides a clean public interface for SDK consumers
apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts (1)

51-56: LGTM! Well-defined external ID property.

The externalId property is properly defined with appropriate nullability and a clear description that includes an example use case.

openapi.v1.json (1)

646-650: LGTM! Well-structured job schema updates.

The changes to the job schema are well-implemented with:

  • Clear documentation for the externalId property
  • Proper schema definition for jobAgentConfig with required fields
  • Consistent updates to the required fields list

Also applies to: 831-842, 850-851

packages/node-sdk/src/schema.ts (3)

349-351: 'targetFilter' field added correctly to Environment

The targetFilter field has been added to the environment object with appropriate typing. This allows for specifying target filters in environment configurations.


606-607: 'externalId' field added to 'Job' object appropriately

The externalId field is correctly added to the Job object, allowing for optional external job identifiers such as GitHub workflow run IDs.


657-661: 'jobAgentConfig' added to 'Job' object correctly

The jobAgentConfig property, including the manifest field, has been appropriately added to the Job object. This enhances the job configuration capabilities.

packages/node-sdk/src/index.ts Show resolved Hide resolved
integrations/kubernetes-job-agent/src/index.ts Outdated Show resolved Hide resolved
openapi.v1.json Outdated Show resolved Hide resolved
packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
packages/node-sdk/src/schema.ts (2)

Line range hint 163-179: Add missing path parameter for agentId.

The endpoint /v1/job-agents/{agentId}/queue/next requires agentId in the path, but the path parameter is currently set to path?: never.

Apply this diff to include the required path parameter:

   "/v1/job-agents/{agentId}/queue/next": {
     parameters: {
       query?: never;
       header?: never;
-      path?: never;
+      path: {
+        /** @description The ID of the job agent */
+        agentId: string;
+      };
       cookie?: never;
     };

Line range hint 623-678: Enhance job configuration schema.

The jobAgentConfig schema is too permissive with a generic manifest string. Consider adding validation for the manifest structure and additional configuration options.

Consider enhancing the schema:

   jobAgentConfig: {
-    /** @description The manifest template for the job */
-    manifest: string;
+    /** @description The manifest template for the job */
+    manifest: {
+      /** @description The type of manifest (e.g., kubernetes, docker) */
+      type: string;
+      /** @description The manifest content */
+      content: string;
+      /** @description Optional manifest variables */
+      variables?: Record<string, string>;
+    };
+    /** @description Optional job timeout in seconds */
+    timeoutSeconds?: number;
+    /** @description Optional retry configuration */
+    retry?: {
+      maxAttempts: number;
+      backoffSeconds: number;
+    };
   };
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 58f2ee5 and cc97ea2.

📒 Files selected for processing (3)
  • github/get-job-inputs/index.js (1 hunks)
  • openapi.v1.json (3 hunks)
  • packages/node-sdk/src/schema.ts (7 hunks)
🔥 Files not summarized due to errors (1)
  • github/get-job-inputs/index.js: Error: Server error: no LLM provider could handle the message
🔇 Additional comments (5)
packages/node-sdk/src/schema.ts (2)

1120-1166: ⚠️ Potential issue

Enhance acknowledgeAgentJob response schema.

The current response schema returns an empty job object (Record<string, never>). This matches a past review comment that suggested defining a proper response schema.

Apply this diff to align with other job response schemas:

   content: {
     "application/json": {
-      job?: Record<string, never>;
+      job: {
+        id: string;
+        status: string;
+        /** Format: date-time */
+        acknowledgedAt: string;
+        /** @description External job identifier */
+        externalId?: string | null;
+      };
     };
   };

215-234: ⚠️ Potential issue

Add missing jobId parameter in acknowledgeAgentJob endpoint.

The endpoint /v1/job-agents/{agentId}/queue/acknowledge should include the jobId to specify which job is being acknowledged.

Apply this diff to add the required path parameter:

-  "/v1/job-agents/{agentId}/queue/acknowledge": {
+  "/v1/job-agents/{agentId}/queue/acknowledge/{jobId}": {
     parameters: {
       query?: never;
       header?: never;
-      path?: never;
+      path: {
+        /** @description The ID of the job agent */
+        agentId: string;
+        /** @description The ID of the job to acknowledge */
+        jobId: string;
+      };
       cookie?: never;
     };
openapi.v1.json (2)

617-621: LGTM: Job schema enhancements are well-structured.

The additions of externalId and jobAgentConfig properties improve the job schema by:

  • Supporting external job tracking with proper nullability
  • Providing structured job agent configuration with clear property requirements

Also applies to: 778-789, 797-798


1492-1556: Previous review comments are still applicable.

The endpoint still needs improvements in:

  1. Schema definition for the job object in success response
  2. Correct error message for 404 response
github/get-job-inputs/index.js (1)

28075-28081: LGTM! Clean implementation of the running jobs fetcher.

The implementation follows good practices:

  • Proper use of async/await
  • Clean error handling with optional chaining
  • Consistent mapping to Job instances

Let's verify the API endpoint usage:

✅ Verification successful

API endpoint verified and correctly implemented

The endpoint /v1/job-agents/{agentId}/jobs/running is properly defined in the OpenAPI specification with a matching GET operation, confirming that the implementation correctly uses the API contract.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the API endpoint definition exists in OpenAPI spec
# Expected: The endpoint should be defined in the API spec

# Search for the endpoint definition in OpenAPI spec files
fd -e yaml -e json . | xargs rg -l "/v1/job-agents/.+/jobs/running"

Length of output: 83


Script:

#!/bin/bash
# Let's verify the exact endpoint definition in the OpenAPI spec
rg -A 5 "/v1/job-agents/{agentId}/jobs/running" openapi.v1.json

Length of output: 201


Script:

#!/bin/bash
# Let's verify the exact endpoint definition in the OpenAPI spec
# Escaping the curly braces in the regex pattern
rg -A 5 "/v1/job-agents/\{agentId\}/jobs/running" openapi.v1.json

Length of output: 258

packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (4)
integrations/kubernetes-job-agent/src/index.ts (3)

69-71: Consider enhancing error handling

While the error handling is functional, consider these improvements:

  1. Define status constants to avoid magic strings
  2. Include the full error stack in debug logs for better troubleshooting
+const JOB_STATUS = {
+  INVALID: 'invalid_job_agent',
+  IN_PROGRESS: 'in_progress',
+} as const;

 await job.update({
-  status: "invalid_job_agent" as const,
+  status: JOB_STATUS.INVALID,
   message: error.body?.message || error.message,
 });
+logger.debug('Deployment error details', { 
+  error: error,
+  stack: error.stack 
+});

145-149: Consider parallel execution for performance

The sequential execution of spinUpNewJobs and updateExecutionStatus could be optimized by running them in parallel since they appear to be independent operations.

-await spinUpNewJobs();
-await updateExecutionStatus();
+await Promise.all([
+  spinUpNewJobs(),
+  updateExecutionStatus()
+]);

124-124: Add type safety for job status updates

Consider adding type validation for the status and message before updating the job.

+type JobStatus = 'success' | 'failed' | 'in_progress' | 'invalid_job_agent';
+
+function isValidJobStatus(status: string): status is JobStatus {
+  return ['success', 'failed', 'in_progress', 'invalid_job_agent'].includes(status);
+}

-await job.update({ status, message });
+if (!isValidJobStatus(status)) {
+  logger.warn(`Invalid status received: ${status}, defaulting to 'failed'`);
+  await job.update({ status: 'failed', message: `Invalid status: ${status}` });
+} else {
+  await job.update({ status, message });
+}
packages/node-sdk/src/schema.ts (1)

851-913: Consider adding type constraints for releaseFilter.

The operation is well-structured, but the releaseFilter property allows any value which might be too permissive.

Consider defining a more specific type for releaseFilter:

  releaseFilter?: {
-   [key: string]: unknown;
+   /** @description Filter criteria for releases */
+   version?: string;
+   metadata?: {
+     [key: string]: string;
+   };
  };
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between cc97ea2 and fa5fbea.

📒 Files selected for processing (4)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts (2 hunks)
  • integrations/kubernetes-job-agent/src/index.ts (4 hunks)
  • openapi.v1.json (32 hunks)
  • packages/node-sdk/src/schema.ts (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts
  • openapi.v1.json
🔇 Additional comments (6)
integrations/kubernetes-job-agent/src/index.ts (4)

1-11: LGTM: Clean import restructuring

The transition from direct API client usage to the agent-based approach with Job type imports reflects a good architectural improvement towards better encapsulation.


24-54: LGTM: Enhanced error handling and job status management

The changes improve error handling by properly updating job status and tracking external IDs. The validation of manifest name and consistent error reporting through the Job instance methods is a good practice.


111-116: Enhance error handling for invalid externalId format


87-90: 🛠️ Refactor suggestion

Add proper validation for jobAgentConfig

Additionally, consider adding type validation for the manifest structure:

+interface JobManifest {
+  metadata?: {
+    name?: string;
+    namespace?: string;
+  };
+}

+function isValidManifest(manifest: any): manifest is JobManifest {
+  return typeof manifest === 'object' && manifest !== null;
+}

 const manifest = renderManifest(
   jobDetails.jobAgentConfig.manifest,
   jobDetails,
 );
+if (!isValidManifest(manifest)) {
+  throw new Error('Invalid manifest structure');
+}
packages/node-sdk/src/schema.ts (2)

146-162: Missing deploymentId path parameter.

The issue identified in the past review comment is still valid for this endpoint.


623-624: LGTM! Well-documented external ID field.

The addition of the optional externalId field with proper documentation and typing is a good improvement.

packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
packages/node-sdk/src/schema.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
packages/node-sdk/src/schema.ts (1)

674-677: Consider defining a more specific type for jobAgentConfig.

Using [key: string]: unknown provides maximum flexibility but minimal type safety. Consider defining a more specific interface for the job agent configuration to improve type checking and developer experience.

For example:

jobAgentConfig: {
  /** @description Configuration for the job agent execution */
  manifest?: string;
  /** @description Environment variables for the job */
  env?: Record<string, string>;
  /** @description Resource requirements */
  resources?: {
    cpu?: string;
    memory?: string;
  };
};
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fa5fbea and 890acd3.

📒 Files selected for processing (4)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts (2 hunks)
  • integrations/kubernetes-job-agent/src/index.ts (4 hunks)
  • openapi.v1.json (31 hunks)
  • packages/node-sdk/src/schema.ts (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/webservice/src/app/api/v1/jobs/[jobId]/openapi.ts
  • openapi.v1.json
🔇 Additional comments (7)
integrations/kubernetes-job-agent/src/index.ts (5)

1-1: LGTM! Clean import organization.

The imports are well-organized with types separated from regular imports.

Also applies to: 8-8


24-54: LGTM! Robust error handling with proper status updates.

The function now properly uses the Job instance for status updates and has comprehensive error handling.


145-149: LGTM! Clean implementation of the scan function.

The function properly fetches the agent ID and calls the job management functions in the correct order with appropriate error handling.


87-90: ⚠️ Potential issue

Add validation before manifest rendering.

The code directly accesses jobDetails.jobAgentConfig.manifest without proper validation.

Apply this diff to add validation:

+ if (!jobDetails.jobAgentConfig?.manifest) {
+   throw new Error('Job agent config or manifest is missing');
+ }
const manifest = renderManifest(
  jobDetails.jobAgentConfig.manifest as string,
  jobDetails,
);

111-117: ⚠️ Potential issue

Enhance error handling for invalid externalId format.

The current implementation silently returns when externalId is invalid.

Apply this diff to update the job status on invalid externalId:

 const [namespace, name] = jobDetails.externalId?.split("/") ?? [];
 if (namespace == null || name == null) {
   logger.error("Invalid external run ID", {
     jobId: jobDetails.id,
     externalId: jobDetails.externalId,
   });
-  return;
+  await job.update({
+    status: "invalid_job_agent",
+    message: `Invalid external run ID: ${jobDetails.externalId}`,
+  });
+  return;
 }
packages/node-sdk/src/schema.ts (2)

366-368: LGTM! Well-structured addition of targetFilter.

The targetFilter property is properly typed and maintains consistency with the schema.


850-912: LGTM! Well-structured release channel operation.

The operation is well-defined with:

  • Clear request/response schemas
  • Proper error handling
  • Good documentation

packages/node-sdk/src/schema.ts Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
integrations/kubernetes-job-agent/src/index.ts (3)

Line range hint 24-75: Add manifest validation before deployment.

While the function handles various error cases well, it would be beneficial to validate the manifest structure before attempting deployment.

Consider adding this validation:

 const deployManifest = async (
   job: Job,
   jobId: string,
   namespace: string,
   manifest: any,
 ) => {
   try {
+    if (!manifest || typeof manifest !== 'object') {
+      await job.update({
+        externalId: "",
+        status: "invalid_job_agent",
+        message: "Invalid manifest format.",
+      });
+      return;
+    }
     const name = manifest?.metadata?.name;

81-117: Enhance error handling in Promise.allSettled.

While the function handles individual job errors well, it would be beneficial to log the results of Promise.allSettled to track failed jobs.

Consider adding this error handling:

-    await Promise.allSettled(
+    const results = await Promise.allSettled(
       jobs.map(async (job: Job) => {
         // ... existing code ...
       }),
     );
+    const failures = results.filter(
+      (result): result is PromiseRejectedResult =>
+        result.status === "rejected"
+    );
+    if (failures.length > 0) {
+      logger.warn(`${failures.length} job(s) failed to process`, {
+        errors: failures.map(f => f.reason),
+      });
+    }

Line range hint 1-183: Consider implementing retry mechanisms for resilience.

While the code is well-structured, adding retry mechanisms for transient failures (like network issues or Kubernetes API temporary unavailability) would improve reliability.

Consider:

  1. Implementing exponential backoff for Kubernetes API calls
  2. Adding circuit breakers for external service calls
  3. Implementing dead letter queues for failed jobs that need manual intervention

Would you like me to provide example implementations for these patterns?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 890acd3 and c5fbf3a.

📒 Files selected for processing (1)
  • integrations/kubernetes-job-agent/src/index.ts (4 hunks)
🔇 Additional comments (4)
integrations/kubernetes-job-agent/src/index.ts (4)

1-11: LGTM! Imports are well-organized and appropriate.

The new imports support the transition to a more object-oriented approach with the Job type and add schema validation capabilities.


77-79: LGTM! Schema validation is well-defined.

The zod schema provides strong typing and runtime validation for the job agent configuration.


Line range hint 119-153: LGTM with a note.

The function has been successfully updated to use Job instance methods. Note that there's an existing review comment about enhancing error handling for invalid externalId format that should be addressed.


Line range hint 155-168: LGTM! Clean and well-structured error handling.

The scan function has been simplified while maintaining robust error handling and logging.

@zacharyblasczyk zacharyblasczyk merged commit 6c88bbf into main Nov 4, 2024
9 checks passed
@zacharyblasczyk zacharyblasczyk deleted the fix-kubernetes-job-agent branch November 4, 2024 05:52
@coderabbitai coderabbitai bot mentioned this pull request Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants