add starwhale glossary doc (#55)
tianweidut authored Dec 11, 2023
1 parent 73558d7 commit 873065a
Showing 6 changed files with 49 additions and 7 deletions.
20 changes: 20 additions & 0 deletions docs/concepts/glossary.md
@@ -0,0 +1,20 @@
---
title: Starwhale Glossary
---

This page lists important terminology used throughout the Starwhale documentation.

* **Starwhale Dataset**: Starwhale's abstraction of datasets in machine learning. It covers dataset construction, sharing, loading, version control and visualization to meet the needs of processes such as model training and evaluation.
* **Starwhale Model**: A standard package format for machine learning models defined by Starwhale. It includes model weight files, code and configurations, and supports building, sharing, versioning and running model packages for tasks such as model evaluation and fine-tuning.
* **Starwhale Runtime**: Starwhale's abstraction of program running environments in machine learning. It hides details such as writing Dockerfiles and installing CUDA, and provides a reproducible, shareable Python running environment.
* **Starwhale Instance**: Each deployment of Starwhale is called an instance. All instances can be managed with `swcli`. There are three types of Starwhale instances: Starwhale Standalone, Starwhale Server and Starwhale Cloud. Starwhale aims to keep concepts consistent across instance types, so that users can easily exchange data and migrate between them.
* **Starwhale Standalone**: One of the three Starwhale instance types. It is aimed at independent developers, is deployed in a local development environment, is managed through the `swcli` command line tool, and meets needs such as development and debugging.
* **Starwhale Server**: One of the three Starwhale instance types. It is aimed at teams, is deployed in private data centers, relies on a Kubernetes cluster, and provides centralized, interactive and secure services.
* **Starwhale Cloud**: One of the three Starwhale instance types. It is a hosted public cloud service, available at <https://cloud.starwhale.cn>, operated and maintained by the Starwhale team. No installation is needed.
* **`swcli`**: The Starwhale command line tool, written in Python, used to manage model packages, datasets and runtimes on different instances.
* **datastore**: An infrastructure component in Starwhale that provides storage and access methods similar to Big Table, for storing and retrieving datasets and evaluation data.
* **Starwhale Project**: The basic unit for organizing different resources (e.g. models and datasets).
* **`.swignore` file**: Similar to `.gitignore` and `.dockerignore` files, it defines files or folders to ignore. The Starwhale model building process tries to read this file to decide which files to ignore.
* **`model.yaml` file**: An optional descriptive file that defines how to build a Starwhale Model.
* **`dataset.yaml` file**: An optional descriptive file that defines how to build a Starwhale Dataset. It works together with Python scripts and is used by the `swcli dataset build` command.
* **`runtime.yaml` file**: An optional descriptive file that defines a Starwhale Runtime. It is used by the `swcli runtime build` command.
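
To illustrate the `.swignore` entry above, here is a minimal sketch of how gitignore-style ignore patterns can filter a list of files before packaging. It is not Starwhale's implementation: the helper names (`load_patterns`, `is_ignored`, `filter_files`) are hypothetical, and the matching rule (shell-style globs applied to the full path and to each path component) is an assumption. The rules `swcli` actually applies may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def load_patterns(text: str) -> list[str]:
    """Return the non-empty, non-comment lines of an ignore file's text."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path: str, patterns: list[str]) -> bool:
    """True if the full path or any single path component matches a pattern.

    Assumption: plain shell-style globs, no negation or anchoring rules.
    """
    parts = PurePosixPath(path).parts
    return any(
        fnmatch(path, pattern) or any(fnmatch(part, pattern) for part in parts)
        for pattern in patterns
    )


def filter_files(paths: list[str], ignore_text: str) -> list[str]:
    """Keep only the paths that no ignore pattern matches."""
    patterns = load_patterns(ignore_text)
    return [p for p in paths if not is_ignored(p, patterns)]


if __name__ == "__main__":
    ignore_text = "# build artifacts\n*.pyc\n__pycache__\n.git\n"
    candidates = ["model.py", "model.pyc", "__pycache__/x.py", ".git/config"]
    print(filter_files(candidates, ignore_text))
```

Matching each path component as well as the full path lets a bare directory name such as `__pycache__` exclude everything beneath it, which is the behavior most users expect from ignore files.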
@@ -0,0 +1,20 @@
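---
title: Starwhale Glossary
---

This page lists some important Starwhale-specific terms.

* **Starwhale Dataset**: Starwhale's abstraction of datasets in machine learning. It implements dataset building, distribution, loading, version control and visualization to meet the dataset needs of processes such as model training and evaluation.
* **Starwhale Model**: A standard package format for machine learning models defined by Starwhale. It includes model weight files, code and configurations, and meets the needs of building, distributing, versioning and running model packages in stages such as model evaluation and fine-tuning.
* **Starwhale Runtime**: Starwhale's abstraction of program running environments in machine learning. It hides details such as writing Dockerfiles and installing CUDA, and provides a reproducible, shareable Python running environment.
* **Starwhale Instance**: Each deployment of Starwhale is called an instance. All instances can be managed with `swcli`. There are three types of Starwhale instances: Starwhale Standalone, Starwhale Server and Starwhale Cloud. Starwhale keeps concepts consistent across instances, so users can easily copy models, datasets and runtimes between them.
* **Starwhale Standalone**: One of the three Starwhale instance types. It is aimed at independent developers, is deployed in a local development environment, is managed through the `swcli` command line tool, and meets needs such as development and debugging.
* **Starwhale Server**: One of the three Starwhale instance types. It is aimed at teams, is deployed in private data centers, relies on a Kubernetes cluster, and provides centralized, web-interactive and secure services.
* **Starwhale Cloud**: One of the three Starwhale instance types. It is a service hosted on a public cloud, available at <https://cloud.starwhale.cn>, operated and maintained by the Starwhale team. No installation is needed.
* **`swcli`**: The Starwhale command line tool, written in Python, used to manage model packages, datasets and runtimes on different instances.
* **datastore**: An infrastructure component in Starwhale that provides storage and access methods similar to Big Table, for storing and retrieving datasets and evaluation data.
* **Starwhale Project**: The basic unit for organizing different resources (such as models and datasets).
* **`.swignore` file**: Similar to `.gitignore` and `.dockerignore` files, it defines files or folders to ignore. The Starwhale model building process tries to read this file to decide which files to ignore.
* **`model.yaml` file**: A descriptive file that defines how to build a Starwhale Model. Optional.
* **`dataset.yaml` file**: A descriptive file that defines how to build a Starwhale Dataset. It works together with Python scripts and is used by the `swcli dataset build` command. Optional.
* **`runtime.yaml` file**: A descriptive file that defines a Starwhale Runtime. It is used by the `swcli runtime build` command. Optional.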
11 changes: 6 additions & 5 deletions i18n/zh/docusaurus-plugin-content-docs/current/concepts/index.md
@@ -1,10 +1,11 @@
 ---
-title: Common Starwhale Concepts
+title: Common Starwhale Concepts
 ---

 This section introduces some basic concepts in Starwhale.

-* [Naming rules in Starwhale](names)
-* [Starwhale projects](project)
-* [Roles and permissions in Starwhale](roles-permissions)
-* [Resource versioning in Starwhale](versioning)
+* [Starwhale glossary](explanation)
+* [Starwhale naming rules](names)
+* [Starwhale projects](project)
+* [Starwhale roles and permissions](roles-permissions)
+* [Starwhale resource versioning](versioning)
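@@ -4,6 +4,6 @@ title: Projects in Starwhale

 A "project" is the basic unit for organizing different resources (such as models and datasets). You can use projects for different purposes. For example, you can create a project for a team of data scientists, a product line, or a specific model. Users typically work on one or more projects in their day-to-day work.

-Starwhale Server/Cloud projects are grouped by account. Starwhale Standalone has no concept of accounts, so you will not see any account prefix in S tarwhale Standalone projects. Starwhale Server/Cloud projects can be "public" or "private". In a public project, all users on the same instance automatically get the "guest" role for that project by default. For more information about roles, see [Roles and permissions in Starwhale](roles-permissions)
+Starwhale Server/Cloud projects are grouped by account. Starwhale Standalone has no concept of accounts, so you will not see any account prefix in Starwhale Standalone projects. Starwhale Server/Cloud projects can be "public" or "private". In a public project, all users on the same instance automatically get the "guest" role for that project by default. For more information about roles, see [Roles and permissions in Starwhale](roles-permissions)

 Starwhale Standalone automatically creates a "self" project and configures it as the default project.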
@@ -1,5 +1,5 @@
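 ---
-title: Resource versioning in Starwhale
+title: Starwhale resource versioning
 ---

 - Starwhale manages the history of all models, datasets and runtimes. Every update to a specific resource appends a new version to its history.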
1 change: 1 addition & 0 deletions sidebars.js
@@ -33,6 +33,7 @@ module.exports = {
 id: "concepts/index"
 },
 items: [
+"concepts/glossary",
 "concepts/names",
 "concepts/project",
 "concepts/roles-permissions",