Allowing Software Preview Submissions #122
Preview is narrowly tailored for a reason. Preview submissions are not subject to review or reproducibility by customers, and the lack of these things is not good for a benchmark suite. The sanction of striking results next time if they are not replicated as Available is also a limited disincentive, since last round is "old news". So there needs to be a commensurate benefit to allowing such submissions. For new hardware, it comes from the intense market and press interest that often accompanies new devices.

Hardware and software are also different in important ways. Hardware cycles are relatively inelastic, and somewhat unpredictable towards the end of development if gating bugs are found and need to be worked around. In comparison, there's a well-understood iterative model for software lifecycles where a cutoff date is set and features that don't make the date are deferred to the next release. If a single feature is important, and of sufficient maturity to provide to customers in an early state, submitters can reasonably plan for a release of that feature, at least at beta quality. "Preview" is only useful if working software comes so late in the MLPerf cycle that you can't do a QA cycle on it and release it as a beta.

Further, the iterative nature of software cycles means that you usually have one version to ship and one in development, and with adequate machine resources it's possible to enter with both. The development one is always faster, of course, so if you think your competitors might submit theirs, you might also feel you have to submit yours. That would just increase everyone's work to nobody's benefit. That's not the case with hardware: it's typically either very close to ready, or unusable.

It would be helpful for submitters in favor of Preview software to explain why the marginal benefit over submitting a beta outweighs those disadvantages.
Thank you @DilipSequeira for your detailed reply. I completely understand the reasons for narrowing down the preview submissions and they are very reasonable. But in my experience, most new hardware (except from well-established companies) needs last-minute software changes, and these changes are critical for their submissions. Also, I'm not very clear on a couple of questions:

Based on your previous comment I suppose your answer would be no to both these questions. While hardware release cycles are uniformly applicable to almost all submitters working on new hardware, software release cycles are particularly critical for those working on newer hardware which may not yet have a fully stable software stack. For example, while an established vendor might lose 3-5% performance by using released software instead of the latest one, a new vendor might lose 10-50% performance, or even the chance to submit if the accuracy threshold is not met.
@arjunsuresh If you have preview hardware, you do not have to meet the Availability requirements for software necessary to use that component. Specifically, the availability requirement is waived for "newly developed" hardware components, where the definition of "newly developed" is in the rules. For your other questions, see the "Available" column here. My interpretation of the rules is: if your software source is in a public repo, it's available so long as there's a commit hash, with no release process required; a closed-source binary needs at least a beta release to count as Available.
Thank you @DilipSequeira for the links and interpretations. Actually, I'm not talking about preview hardware, but hardware on which not all the MLPerf inference models have been tested - my understanding is that only Nvidia and Intel have submitted results for all the inference models. So, any other submitter might suffer from software issues on their already available hardware, which can restrict them from a possible submission.

"If your software source is in a public repo, it's available so long as there's a commit hash. No release process is required." This makes life easier for those working on public repos (we being one). The rule also allows some additional PRs, so I see no issue arising here for submitters working on public repos.

"Beta release" for a closed binary is where clarity is needed. My interpretation, or expectation, of a "beta release" is that the binary should be properly tagged with a version number and made available to customers. This binary should be produced before the submission deadline, and any customer who asks for it after the submission deadline should have access to it. Mainly, I think asking for a proper release cycle is unnecessary here. But since this is not what we are going through ourselves, I would ask the opinion of other submitters here. Qualcomm might be the best one to answer, as their software stack for Cloud AI 100 is not publicly available as of today.
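As an aside, here is a minimal sketch of how an open-source submitter might record the commit hash that pins the software state used for a result. The helper and file name are hypothetical and not part of any MLPerf tooling; it assumes only a local git checkout of the submitter's repo.

```python
# Hypothetical sketch: pin an open-source inference stack to the exact commit
# used for a submission, so reviewers can reproduce the same software state.
import subprocess

def record_submission_commit(repo_path: str, out_file: str = "SUBMISSION_COMMIT.txt") -> str:
    # Ask git for the commit currently checked out in the repo.
    commit = subprocess.check_output(
        ["git", "-C", repo_path, "rev-parse", "HEAD"], text=True
    ).strip()
    # Store it alongside the results so the submission references a fixed state.
    with open(out_file, "w") as f:
        f.write(commit + "\n")
    return commit

if __name__ == "__main__":
    print(record_submission_commit("."))
```

A reviewer could then check out that exact commit to rebuild the stack used for the submission.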
Model freeze, and model acceptance for a given round, are controlled by the working group, so if submitters feel there is insufficient time to prepare for a model, it will get pushed to the next round (and this has happened multiple times). Once submitters have agreed to accept the model for a given round, it's a commitment to implement it at beta quality or better if you want to submit on existing hardware - and this reflects a reality of the market: if hardware available in the market is intended for use in a particular class of workloads, customers should be able to port such models to it with reasonable effort. Of course, inability to produce software on a committed schedule will affect your ability to show the potential of your hardware, but that's not unique to MLPerf.
Actually, my request is not restricted just to new models, but also covers existing models which are not yet tested on a submitter's hardware (usually not applicable to Nvidia). I understand the reasons for requiring at least "beta" quality - but my concern is the time delay between building a software binary and its release, which in many cases takes weeks. Consider the following scenario: as an OEM, I'm building a system using an accelerator from company X. Now, a few days before the deadline, company X provides a software binary to us (say from a nightly build) and this enables us to collect results for an MLPerf model. As per the current rules, this result cannot go under the Available category unless the given binary is classified as a "beta release", and the time delay for that varies from a day to many weeks depending on the internal policies of company X. Moreover, I'm not sure this restriction brings any significant quality difference to a submission anyway, and that's why I'm proposing to remove the "release cycle" requirement on software for the Available category.
The "beta" rule predates my involvement in MLPerf, so I don't know the original rationale. However, I would say this is not about the quality of submissions, but their credibility. There should be some threshold to prevent a submitter from taking research software that might never be productized, labeling it a "beta", and using that to publish comparisons against other submitters. The "clear part of a release sequence" language means you're making a public commitment to your customers that these optimizations will be in production in a reasonably short time frame, and it's clearly enough decidable that it could be checked in an audit.
I completely agree on the credibility part. But do you agree on the following points?
"if your CI/CD is sufficiently mature that you post nightlies and any qualified customer can pick them up to use in their own development, then my interpretation would be that all of those nightlies are part of your release sequence, you could deem any one of them to be a beta, and you're covered. While I'm not aware of any submitters with closed source products at that level of software maturity, it's possible." Here, the bold part is not necessarily true. Like the nightly builds are made available to customers only on request or when a special need arises. I guess those selected nightly builds are still eligible to be part of the release sequence and hence can be considered as a "beta release" even if it is not put as an official beta release on the submitter website as on the submission deadline. "my understanding is that the intent of RDI is for cutting edge research which is a long way from productization. Software is rarely that far from productization. And your example (LLVM vs gcc) doesn't seem remotely close to approaching the intent of RDI. Others may have a different understanding of the category." I did not mean just substituting gcc with llvm - For example say I'm having a matured software stack based on one compiler framework which supports a good number of models in production use. Now we are researching on working with a new compiler framework which as of now only supports some special cases or say only one inference model. I can't call this software an "available one" - because there are no previous releases nor any estimate on the future release. From my experience release of such a software can easily take months if not years and I suppose it is fair to put them under RDI category. |
I think we're really getting into hypotheticals on the betas. My understanding based on past discussions is that the threshold for Availability is that there's high confidence this is not research software and that its presence in your customers' hands is imminent, and the test of that is that it's a "real" beta release. For RDI... if you're more than 221 days away from production, then you have no problem. If you're going to be (say) 90 days from production by the MLPerf deadline, then you probably want to start planning for an early beta as soon as the submission date is known.
Thank you @DilipSequeira for the clarification on RDI. I think that part is clear now. Regarding "beta release", the only contention is the requirement for the software binary to be "released" as of the submission date. Of course, this is only relevant for companies where there is a significant delay between the production of a binary and a possible beta release. If we relax the requirements for the Available category such that the software binary must be available on the submission date and its "beta release" must be available within 4 to 6 weeks of the submission deadline (which still allows an audit), I suppose this problem goes away and we can agree on "no Software Preview submissions".
@arjunsuresh It's important, where possible, to be able to establish the legality of a submission on submission day, without an obligation to do something later (with the exception of Preview, which implies something is coming later). The implication of what you're suggesting is that SW submitters should plan to optimize right up to the deadline, submit, and then work on conformance. But for most submitters, it should work equally well to optimize until 6 weeks before the deadline, start making beta release candidates, roll in critical MLPerf-oriented optimizations to key operators in the last week or two with soak testing of those specific changes, and push the beta out for the submission date. The WG should only accept models into a round when a critical mass of submitters are confident that they can produce conformant implementations on submission day. The freeze date is set to support that; if we think it takes longer to produce conformant submissions than the current date allows, we should move it earlier.
@DilipSequeira I understand your point about maintaining a beta release candidate specifically for MLPerf, and that looks okay to me. But I'm not sure all submitters can actually turn a beta candidate into a beta release within a couple of days or a week (mainly due to legal procedures). I might be wrong here - let's see what other submitters have to say.
Just summarizing our requirement:
@arjunsuresh "Our" meaning OctoML, right? My understanding is that you guys submit OSS, so none of the above discussion is relevant to you, as anything from an OSS repo is always classed as "Available". |
Yes, Dilip. Actually, we also have plans to test performance on different hardware platforms, and there we also rely on their respective software stacks.
What's the problem with asking those manufacturers to provide release software (or at least a beta) if they're submitting Available hardware, given that the schedules are known six months in advance? It seems the only reason is that you want to incorporate all the results of experiments right up to the deadline, rather than having an internal deadline that your partners can target. But the cost of that is allowing more unauditable, unreproducible submissions, as well as the problems I pointed out above - suppose we allowed this:
Yes, Dilip. The point is to make use of all the results of the experiments as much as possible. As we discussed in the Inference meeting, we can use
This also adds a definition of a Reproducible software component so that RDI does not become a venue for submissions that are not reproducible and can never be reproducible.
Hi @DilipSequeira, can you please open a PR for your change? This issue came up for Tiny submissions too, and as per the current rules, if the software is not available but the hardware is, it is problematic.
Thank you @DilipSequeira. Sorry, I had missed it earlier.
The current MLCommons rules do not allow a preview submission just because a software component used is in the "preview" (not available) stage. This restriction is not desirable, particularly for submitters whose hardware is available but whose software stack still needs last-minute changes. One option currently is to submit such a system under the RDI category, but the rules are a bit ambiguous on Software RDI components. So, my proposal is to restrict RDI components to only hardware components and allow "Preview software" in the Preview category.