[Sandbox] HAMi #97
Comments
TAG-Runtime |
|
All public repos are in scope for the donation. k8s-dra-driver was forked for convenience; we plan to build our own DRA driver.
We have been exploring how to combine HAMi with DRA, and it is on our roadmap as well.
@raravena80 has TAG Runtime reviewed this project, and does it have a recommendation for the TOC?
They presented on May 16th, 2024. Info:
TAG-Runtime is good with the project going to Sandbox provided they fulfill the CNCF Sandbox admission checklist.
HAMi project review: https://docs.google.com/document/d/1Lb4HYnJR21AEsNGurtXcXqEzdrKu95cG0NziufGEI0c/edit
FYI @angellk @raravena80 @srust @rajaskakodkar
Some feedback is still needed from the authors in the doc for completeness.
Thank you very much. I have made comments in the document and look forward to your reply.
Thanks again for the very detailed and high-quality review of HAMi. I have replied to all the comments. If you have any questions, please leave a message. I would like to clarify a few points.
1. Risk of single-vendor contribution. Because contributions were previously made in a non-standard way (direct commits, no PRs), the statistics are inaccurate. At present, DaoCloud and 4paradigm have contributed at a similar level. These are the current contributor statistics: https://github.com/Project-HAMi/HAMi/graphs/contributors?from=2021-07-04&to=2024-08-09&type=c
The top eight contributors (sorted by commits) come from four different vendors: 4paradigm, DaoCloud, SAP, and NIVIC.
@archlitchi: 4paradigm
Therefore, I understand that there is no risk of single-vendor contribution. Of course, we will standardize the contribution process and look for more contributors in the future.
Thanks @wawa0210, I have incorporated your comments and amended the context. Thank you for your collaboration and swift responses on the review.
TAG Contributor Strategy has reviewed this project and found the following:
This review is for the TOC’s information only. Sandbox projects are not required to have full governance or contributor documentation.
After discussion with the HAMi maintainers, we added a governance document: https://github.com/Project-HAMi/HAMi?tab=readme-ov-file#governance
HAMi has three maintainers and eleven community members.
We currently hold a weekly community meeting in Chinese (this is our calendar), and there is also a developer WeChat group, which currently has 137 members. Public meeting minutes and recordings are indeed missing and need to be improved. We also need to pay attention to internationalization.
Yeah, that's challenging. But, if your contributors speak Chinese, that makes sense for your meetings. And if you can get meeting notes up in Chinese, other folks can use Google Translate. For that reason, notes are better than recordings. If you get accepted into the CNCF, you'll want to eventually cultivate a second, English-speaking community as well as your Chinese one.
Regarding cloud native overlap, to elaborate further: the two projects, Volcano and HAMi, each concentrate on distinct aspects and collaborate closely. Taking GPU sharing as an example, Volcano offers policy-based scheduling of virtualized GPU resources, while HAMi provides isolation of GPU memory and cores on the node. The combination of the two projects has been adopted by a number of users and has received great feedback.
/vote
Vote created
@mrbobbytables has called for a vote on
The members of the following teams have binding votes:
Non-binding votes are also appreciated as a sign of support!
How to vote
You can cast your vote by reacting to
Please note that voting for multiple options is not allowed and those votes won't be counted.
The vote will be open for
The TOC would also like the project to engage with the following Kubernetes groups in addition to completing the recommendations from the TAG:
Thank you very much for the reminder. As it happens, KubeCon HK starts on August 21st, and the HAMi maintainers will attend. We will actively reach out to the people in these SIGs, listen to their suggestions for HAMi's future, and enrich the roadmap.
/check-vote
Vote status so far
Summary
Binding votes (4)
User | Vote | Timestamp |
---|---|---|
wawa0210 | In favor | 2024-08-20 15:51:30.0 +00:00:00 |
Votes can only be checked once a day.
/check-vote
Vote status so far
Summary
Binding votes (7)
User | Vote | Timestamp |
---|---|---|
raravena80 | In favor | 2024-08-20 23:35:09.0 +00:00:00 |
archlitchi | In favor | 2024-08-21 1:34:09.0 +00:00:00 |
zanetworker | In favor | 2024-08-21 11:07:37.0 +00:00:00 |
wawa0210 | In favor | 2024-08-21 15:16:48.0 +00:00:00 |
Vote closed
The vote passed! 🎉
Summary
Binding votes (8)
User | Vote | Timestamp |
---|---|---|
@raravena80 | In favor | 2024-08-20 23:35:09.0 +00:00:00 |
@archlitchi | In favor | 2024-08-21 1:34:09.0 +00:00:00 |
@zanetworker | In favor | 2024-08-21 11:07:37.0 +00:00:00 |
@wawa0210 | In favor | 2024-08-21 15:16:48.0 +00:00:00 |
Welcome and congrats on getting accepted as a CNCF Sandbox project! You can get started on your on-boarding checklist here: #132 and if you have any questions, please don't hesitate to reach out!
Thanks, we'll work on it.
With #132 created we can go ahead and close this out :) Congrats again!
Application contact emails
[email protected],[email protected]
Project Summary
Heterogeneous AI Computing Virtualization Middleware (HAMi) is an "all-in-one" tool designed to manage heterogeneous AI computing devices in a k8s cluster.
Project Description
Heterogeneous AI Computing Virtualization Middleware (HAMi) is an "all-in-one" tool designed to manage Heterogeneous AI Computing Devices in a k8s cluster. It includes everything you would expect, such as:
nvidia.com/use-gputype or nvidia.com/nouse-gputype.
nvidia.com/use-gpuuuid or nvidia.com/nouse-gpuuuid.
nvidia.com/gpu if you prefer.
The core features of HAMi are as follows
The HAMi architecture is as follows
Application Scenarios
Org repo URL (provide if all repos under the org are in scope of the application)
https://github.com/Project-HAMi
Project repo URL in scope of application
Core repo: https://github.com/Project-HAMi/HAMi
and the other public repos under https://github.com/Project-HAMi/
Additional repos in scope of the application
No response
Website URL
http://project-hami.io/
Roadmap
https://github.com/Project-HAMi/HAMi?tab=readme-ov-file#roadmap
Roadmap context
Contributing Guide
https://github.com/Project-HAMi/HAMi/blob/master/CONTRIBUTING.md
Here are our community meeting minutes
https://docs.google.com/document/d/1YC6hco03_oXbF9IOUPJ29VWEddmITIKIfSmBX8JtGBw/edit?usp=sharing
Code of Conduct (CoC)
https://github.com/Project-HAMi/HAMi/blob/master/CODE_OF_CONDUCT.md
Adopters
We ran a survey and found that dozens of adopters are already using HAMi. We will maintain the adopter list in the HAMi documentation later. Online survey results
Contributing or Sponsoring Org
4paradigm, DaoCloud, HuaweiCloud, Rise Union
Maintainers file
https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md
IP Policy
Trademark and accounts
Why CNCF?
The CNCF is the premier organization for cloud-native technologies and is backed by many leading companies in the industry. It also provides a platform for collaboration and community-building, which can lead to increased visibility, adoption, and contributions to HAMi.
At the same time, HAMi can be combined with other outstanding CNCF projects (such as Volcano, KubeRay, and Kueue) to provide a one-stop service for AI infrastructure.
Benefit to the Landscape
As AI grows in popularity, AI accelerators are proliferating. NVIDIA is the best-known vendor, but many other device makers are also actively embracing K8s and the CNCF. What matters is how these GPUs, NPUs, and other devices can offer a consistent experience on one platform, and this is exactly what HAMi focuses on. With HAMi, managing and operating these GPUs and NPUs on K8s becomes much simpler, and the application layer does not need to be aware of differences in the underlying hardware.
Cloud Native 'Fit'
HAMi is built using cloud native technology. It currently uses a scheduler plugin, a webhook, a device plugin, and other mechanisms to manage and schedule heterogeneous AI computing devices. In the future, it will consider using DRA to optimize the architecture.
Cloud Native 'Integration'
HAMi reuses part of the source code of the NVIDIA device-plugin project to support basic NVIDIA GPU features. On top of this, we support the following extensions for NVIDIA GPUs.
nvidia.com/use-gputype or nvidia.com/nouse-gputype
nvidia.com/use-gpuuuid or nvidia.com/nouse-gpuuuid
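As a rough illustration of how the annotation names listed here might be attached to a workload, the sketch below builds a Pod manifest as a plain Python dictionary. It is not taken from the HAMi code base. The helper name `gpu_pod_manifest`, the image name, the GPU type strings, and the comma-separated value format for the annotations are all assumptions made for the example; only the annotation keys and the `nvidia.com/gpu` resource name come from the application text.

```python
import json


def gpu_pod_manifest(name, image, gpus=1, use_types=None, avoid_types=None):
    """Build a Pod manifest dict that requests GPUs and optionally
    carries GPU-type annotations.

    Hypothetical helper for illustration only. The comma-separated
    annotation value format is an assumption, not confirmed by the text.
    """
    annotations = {}
    if use_types:
        # Assumed: a comma-separated list of GPU types the task may use.
        annotations["nvidia.com/use-gputype"] = ",".join(use_types)
    if avoid_types:
        # Assumed: a comma-separated list of GPU types the task must avoid.
        annotations["nvidia.com/nouse-gputype"] = ",".join(avoid_types)

    metadata = {"name": name}
    if annotations:
        metadata["annotations"] = annotations

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": metadata,
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Standard extended-resource request for GPUs.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest(
        "demo", "example/image:latest", gpus=2,
        use_types=["A100"], avoid_types=["T4"],
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be passed to `kubectl apply -f -`, since Kubernetes accepts JSON manifests as well as YAML. The UUID-based annotations would presumably follow the same pattern, but their value format is likewise not specified here.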
Cloud Native Overlap
We do not think there is direct overlap with other CNCF projects at this time. However, we do touch on some areas that other projects are investigating, namely device plugins and scheduler enhancements.
Volcano also provides the ability to share GPUs. In version v1.8, the volcano-vgpu feature was contributed to the Volcano repo by a HAMi maintainer. However, after discussions with the Volcano maintainers, and in order to support the independent development of the HAMi community, it was decided to release it in version v1.9. Later, this functionality was transferred to the HAMi project and is now maintained by the HAMi community (repo: https://github.com/Project-HAMi/volcano-vgpu-device-plugin).
Similar projects
Some comparisons with similar projects to HAMi
highlight
Comparison of GPU sharing solutions
Landscape
yes
HAMi is in the landscape and also in the CNAI group:
https://landscape.cncf.io/?group=cnai
Business Product or Service to Project separation
N/A
Project presentations
No response
Project champions
No response
Additional information
No response