Use database for storing artifacts #695

Merged
merged 23 commits into from
Oct 7, 2024
74 changes: 70 additions & 4 deletions docs/artifact-manager.md
@@ -1,16 +1,17 @@
# Artifact Manager

The `Artifact Manager` is a builtin hypha service for indexing, managing, and storing resources such as datasets, AI models, and applications. It is designed to provide a structured way to manage datasets and similar resources, enabling efficient listing, uploading, updating, and deleting of files.
The `Artifact Manager` is a built-in Hypha service for indexing, managing, and storing resources such as datasets, AI models, and applications. It provides a structured way to manage datasets and similar resources, enabling efficient listing, uploading, updating, and deleting of files.

A typical use case for the `Artifact Manager` is as a backend for a single-page web application displaying a gallery of datasets, AI models, applications or other type of resources. The default metadata of an artifact is designed to render a grid of cards on a webpage.
A typical use case for the `Artifact Manager` is as a backend for a single-page web application that displays a gallery of datasets, AI models, applications, or other types of resources. The default metadata of an artifact is designed to render a grid of cards on a webpage.

**Note:** The `Artifact Manager` is only available when your Hypha server has S3 storage enabled.

**Note:** The `Artifact Manager` is only available when your hypha server enabled s3 storage.

## Getting Started

### Step 1: Connecting to the Artifact Manager Service

To use the `Artifact Manager`, you first need to connect to the Hypha server. This API allows you to create, read, edit, and delete datasets in the artifact registry (stored in s3 bucket for each workspace).
To use the `Artifact Manager`, you first need to connect to the Hypha server. This API allows you to create, read, edit, and delete datasets in the artifact registry (stored in an S3 bucket for each workspace).

```python
from hypha_rpc.websocket_client import connect_to_server
@@ -216,6 +217,18 @@ await artifact_manager.commit(prefix="collections/schema-dataset-gallery/valid-d
print("Valid dataset committed.")
```

### Step 3: Accessing the collection via HTTP API

You can access the collection via the HTTP API to retrieve the schema and datasets.
This can be used for rendering a gallery of datasets on a webpage.

```javascript
// Fetch the schema for the collection
fetch("https://hypha.aicell.io/my-workspace/artifact/public/collections/schema-dataset-gallery")
  .then(response => response.json())
  .then(data => console.log("Schema:", data.collection_schema));
```

## API Reference

This section details the core functions provided by the `Artifact Manager` for creating, managing, and validating artifacts such as datasets and collections.
@@ -441,3 +454,56 @@ await artifact_manager.commit(prefix="collections/dataset-gallery/example-datase
datasets = await artifact_manager.list(prefix="collections/dataset-gallery")
print("Datasets in the gallery:", datasets)
```


## HTTP API for Accessing Artifacts

The `Artifact Manager` provides an HTTP endpoint for retrieving artifact manifests and data. This is useful for public-facing web applications that need to access datasets, models, or applications.

### Endpoint: `/{workspace}/artifact/{path:path}`

- **Workspace**: The workspace in which the artifact is stored.
- **Path**: The relative path to the artifact.
  - For public artifacts, the path must begin with `public/`.
  - For private artifacts, the path does not include the `public/` prefix and requires proper authentication.

### Request Format:

- **Method**: `GET`
- **Parameters**:
  - `workspace`: The workspace in which the artifact is stored.
  - `path`: The path to the artifact (e.g., `public/collections/dataset-gallery/example-dataset`).
  - `stage` (optional): A boolean flag to indicate whether to fetch the staged version of the manifest (`_manifest.yaml`). Default is `False`.

### Response:

- **For public artifacts**: Returns the artifact manifest if it exists under the `public/` prefix.
- **For private artifacts**: Returns the artifact manifest if the user has the necessary permissions.
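
The request format above can be sketched as a small URL builder. This is a minimal sketch, not part of Hypha: `build_artifact_url` is a hypothetical helper, and sending `stage` as a `?stage=true` query parameter is an assumption based on the parameter list rather than something the documentation states.

```python
from urllib.parse import quote, urlencode


def build_artifact_url(server_url, workspace, path, stage=False):
    """Build the URL for GET /{workspace}/artifact/{path}.

    Hypothetical helper; the query-string encoding of `stage` is an assumption.
    """
    # Keep "/" separators in the artifact path, but escape other unsafe characters.
    base = server_url.rstrip("/")
    url = f"{base}/{quote(workspace)}/artifact/{quote(path.strip('/'))}"
    if stage:
        url += "?" + urlencode({"stage": "true"})
    return url


public_url = build_artifact_url(
    "https://hypha.aicell.io",
    "my-workspace",
    "public/collections/dataset-gallery/example-dataset",
)
staged_url = build_artifact_url(
    "https://hypha.aicell.io",
    "my-workspace",
    "collections/dataset-gallery/example-dataset",
    stage=True,
)
print(public_url)
print(staged_url)
```

Keeping the URL construction in one place makes the `public/` prefix rule easy to see: the only difference between a public and a private request is the path that is passed in.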

### Example:

#### Fetching a public artifact:

```python
import requests

SERVER_URL = "https://hypha.aicell.io"
workspace = "my-workspace"
response = requests.get(f"{SERVER_URL}/{workspace}/artifact/public/collections/dataset-gallery/example-dataset")
if response.ok:
    artifact = response.json()
    print(artifact["name"])  # Output: Example Dataset
else:
    print(f"Error: {response.status_code}")
```

#### Fetching a private artifact:

```python
response = requests.get(f"{SERVER_URL}/{workspace}/artifact/collections/private-dataset-gallery/private-example-dataset")
if response.ok:
    artifact = response.json()
    print(artifact["name"])  # Output: Private Example Dataset
else:
    print(f"Error: {response.status_code}")
```
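
The private example above sends no credentials, although the endpoint description says private artifacts require authentication. A minimal sketch of attaching a token follows. It assumes the server accepts a bearer token in the `Authorization` header; `auth_headers` and the `HYPHA_TOKEN` environment variable are hypothetical names used for illustration, not part of the documented API.

```python
import os


def auth_headers(token=None):
    """Return request headers, adding a bearer token when one is provided."""
    headers = {"Accept": "application/json"}
    if token:
        # Assumption: the server reads the token from the Authorization header.
        headers["Authorization"] = f"Bearer {token}"
    return headers


# HYPHA_TOKEN is a hypothetical environment variable used for illustration.
headers = auth_headers(os.environ.get("HYPHA_TOKEN"))

# With `requests`, pass the headers together with the private artifact URL:
# response = requests.get(url, headers=headers)
```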
14 changes: 14 additions & 0 deletions helm-charts/hypha-server/templates/deployment.yaml
@@ -44,12 +44,26 @@ spec:
          args: {{- toYaml .Values.startupCommand.args | nindent 12 }}
          env:
            {{- toYaml .Values.env | nindent 12 }}
          volumeMounts:
            - name: {{ .Values.persistence.volumeName }}
              mountPath: {{ .Values.persistence.mountPath }}
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      volumes:
        - name: {{ .Values.persistence.volumeName }}
          persistentVolumeClaim:
            claimName: {{ .Values.persistence.existingClaim | default (include "hypha-server.fullname" .) }}
            {{- if not .Values.persistence.existingClaim }}
            accessModes:
              {{- toYaml .Values.persistence.accessModes | nindent 14 }}
            resources:
              requests:
                storage: {{ .Values.persistence.size }}
            {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
17 changes: 11 additions & 6 deletions helm-charts/hypha-server/values.yaml
@@ -102,12 +102,6 @@ env:
        key: JWT_SECRET
  - name: PUBLIC_BASE_URL
    value: "https://hypha.amun.ai"
  # Use the pod's UID as the server ID
  # This is important to ensure Hypha Server can handle multiple replicas
  - name: HYPHA_SERVER_ID
    valueFrom:
      fieldRef:
        fieldPath: metadata.uid

# Define command-line arguments here
startupCommand:
@@ -117,3 +111,14 @@ startupCommand:
    - "--port=9520"
    - "--public-base-url=$(PUBLIC_BASE_URL)"
    # - "--redis-uri=redis://redis.hypha.svc.cluster.local:6379/0"
    - "--database-uri=sqlite+aiosqlite:///app/data/artifacts.db"

# Persistence Configuration
persistence:
  volumeName: hypha-app-storage
  mountPath: /app/data
  storageClass: ""
  accessModes:
    - ReadWriteOnce
  size: 5Gi
  existingClaim: "" # If you have an existing claim, specify it here. Otherwise, a new PVC will be created.
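
One detail worth checking when pairing `--database-uri` with `persistence.mountPath`: in SQLAlchemy-style SQLite URLs, three slashes introduce a path relative to the process working directory, while four slashes introduce an absolute path. The sketch below is illustrative only. The helpers `sqlite_db_path` and `is_under` are hypothetical, and the working directory `/` is an assumption about the container. It resolves the database file and checks whether it falls under the mounted volume.

```python
import posixpath


def sqlite_db_path(database_uri, cwd="/"):
    """Resolve the file path of a SQLAlchemy-style SQLite URL.

    `cwd` stands in for the container's working directory (an assumption).
    """
    scheme, sep, rest = database_uri.partition("://")
    if not sep or not scheme.startswith("sqlite"):
        raise ValueError(f"not a SQLite URI: {database_uri}")
    # The host part is empty, so `rest` starts with "/".
    # "/app/data/x.db"  -> relative path "app/data/x.db"
    # "//app/data/x.db" -> absolute path "/app/data/x.db"
    path = rest[1:]
    return posixpath.normpath(posixpath.join(cwd, path))


def is_under(path, mount_path):
    """Return True if `path` is located inside `mount_path`."""
    mount = posixpath.normpath(mount_path)
    return posixpath.commonpath([path, mount]) == mount


db_path = sqlite_db_path("sqlite+aiosqlite:///app/data/artifacts.db", cwd="/")
print(db_path, is_under(db_path, "/app/data"))
```

If the server process runs from a directory other than `/`, the three-slash form resolves elsewhere and the database would not land on the persistent volume; the four-slash form (`sqlite+aiosqlite:////app/data/artifacts.db`) avoids that dependency.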
2 changes: 1 addition & 1 deletion hypha/VERSION
@@ -1,3 +1,3 @@
{
"version": "0.20.38"
"version": "0.20.37.post4"
}
6 changes: 3 additions & 3 deletions hypha/apps.py
@@ -65,12 +65,12 @@ def close(_) -> None:

        self.event_bus.on_local("shutdown", close)

    async def setup_workspace(self, overwrite=True, context=None):
    async def setup_applications_collection(self, overwrite=True, context=None):
        """Set up the workspace."""
        ws = context["ws"]
        # Create an collection in the workspace
        manifest = {
            "id": "description",
            "id": "applications",
            "type": "collection",
            "name": "Applications",
            "description": f"A collection of applications for workspace {ws}",
@@ -205,7 +205,7 @@ async def install(
        try:
            await self.artifact_manager.read("applications", context=context)
        except KeyError:
            await self.setup_workspace(overwrite=True, context=context)
            await self.setup_applications_collection(overwrite=True, context=context)
        # Create artifact using the artifact controller
        prefix = f"applications/{app_id}"
        await self.artifact_manager.create(
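
The fallback in `install` (read the `applications` collection and create it on `KeyError`) can be sketched in isolation. This is a minimal sketch using an in-memory stand-in: `FakeArtifactManager` and `ensure_collection` are hypothetical and do not reflect the real `artifact_manager` signatures beyond the `read` call and the `KeyError` shown in the diff.

```python
import asyncio


class FakeArtifactManager:
    """In-memory stand-in for illustration; not the Hypha implementation."""

    def __init__(self):
        self._artifacts = {}

    async def read(self, prefix):
        # Raises KeyError when the artifact does not exist, as in the diff.
        return self._artifacts[prefix]

    async def create(self, prefix, manifest):
        self._artifacts[prefix] = manifest


async def ensure_collection(manager, prefix, manifest):
    """Return the existing collection, creating it first if it is missing."""
    try:
        return await manager.read(prefix)
    except KeyError:
        await manager.create(prefix, manifest)
        return await manager.read(prefix)


manager = FakeArtifactManager()
manifest = {"id": "applications", "type": "collection", "name": "Applications"}
# First call creates the collection; the second call finds it and leaves it alone.
first = asyncio.run(ensure_collection(manager, "applications", manifest))
second = asyncio.run(ensure_collection(manager, "applications", {"id": "other"}))
print(first["id"], second["id"])
```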