Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relink, don't rebuild #790

Open
1 of 3 tasks
osiewicz opened this issue Sep 30, 2024 · 2 comments
Open
1 of 3 tasks

Relink, don't rebuild #790

osiewicz opened this issue Sep 30, 2024 · 2 comments
Labels
major-change A proposal to make a major change to rustc T-compiler Add this label so rfcbot knows to poll the compiler team

Comments

@osiewicz
Copy link

osiewicz commented Sep 30, 2024

This is a Major Change Proposal for "Relink, don't rebuild" (RDR), which is a shorthand for reducing the amount of crate rebuilds happening when working with large multi-crate workspaces.

Cargo issue I'm trying to resolve

Initial discussion on Cargo Zulip

Cargo implementation branch

Rustc implementation branch

Proposal

Problem

Over at Zed we've noticed that even seemingly innocent changes to the crate that has many dependents causes us to rebuild (almost) the whole world. Adding a dbg! statement to the body of non-inlineable function, formatting code, updating a dependency to a new minor version and such still forces us to rebuild all dependants of an affected crate.

Proposed solution

I would like to experiment with exposing a public interface hash of a crate for use by cargo (and other build systems) in order to allow it to skip rebuilds of dependent crates in cases where the public interface of a crate does not change.
Such hash should be digest of all items from current crate that are reachable from other crates; this includes:

  • exports/reexports
  • traits
  • structs
  • functions:
    • We also need to account for whether the function is inline-able or not. If it is inline-able, we need to hash both function signature and function body. If it's not, it's sufficient to hash function signature.

The ability to skip rebuilds would come in handy on multiple occasions:

  • When working on a multi-crate workspace, the change to one crate would not necessarily cause rebuilds in its dependents. The benefit is transitive - even if there is a public interface change in crate A, the public interface of dependents of A might not change, allowing us to skip rebuilds of their dependents in such cases.
  • Upgrading a registry dependency would not require a rebuild of all crates in the workspace when the new version is interface-compatible with the old one. This might come in handy especially with crates that have mostly stable interface nowadays, such as serde.
  • It's not just cargo build that would benefit from this change; in particular, Rust Analyzer leans on cargo check to obtain rustc diagnostics. Thus, if we were to incorporate that mechanism into cargo check, we could get the same speedup within RA for free.
  • In case of our workspace (zed), we are either stuck on a bottlenecking crate (in which cores can sometimes stay idle) or we have too many crates to build on a 16-core machine, which leaves the build machine starved for resources. While initiatives like parallel frontend help us with the bottlenecking crates, they do not solve the resource starvation problem during the build. By building fewer crates we are going to have more resources to make use of these improvements, potentially speeding up our builds even further.
  • We should account for cargo features; if enabling/disabling a feature does not change the public interface of a crate, a rebuild should not be necessary.

C builds generally have a similar property, where the source file is split into the implementation (.c) file and header (interface) file. The dependents of a given translation unit are rebuilt only when the header file changes - changes to implementation files require just relinking, which is what we're after.

Obstacles

The major challenge lies in calculating the hash - it adds burden for the maintainers who would have to be extremely cautious about what is a part of the public interface, as false positives with that feature (not detecting that the interface of a crate has changed) would lead to miscompilations, which is obviously scary. A glaring footgun with the hash is that it has to be insensitive to position of public items within source code, as otherwise the utility of this feature as a whole would be significantly diminished. 1

Another obstacle is that this feature would require rethinking how crate loader operates; namely, our POC ICEs whenever a new item is added/removed, as doing so shifts DefIds which are relied upon by artifacts that are not being rebuilt. For example, if a crate B depends on A and we rebuild just A, the artifact that we have for B might have its references invalidated. This is a major hurdle for us in pushing this work forward.

Gains

In our internal experiments, RDR can slash dev cargo build times by as much as 50% (timings available in Cargo issue). cargo check could potentially benefit even more, as it doesn't have to go through linking steps. cargo check --timings after making a non-breaking change to the project crate in Zed could take 0.7s instead of 4.5s needed to recheck all of the dependants of this crate.

Mentors or Reviewers

I don't have anyone in particular in mind.

Process

The main points of the Major Change Process are as follows:

  • File an issue describing the proposal.
  • A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
    • Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
    • Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
  • Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.

You can read more about Major Change Proposals on forge.

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

Footnotes

  1. This is also why I've disregarded SVH, as it is position-sensitive. It also accounts for changes to private items.

@osiewicz osiewicz added major-change A proposal to make a major change to rustc T-compiler Add this label so rfcbot knows to poll the compiler team labels Sep 30, 2024
@rustbot
Copy link
Collaborator

rustbot commented Sep 30, 2024

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

Concerns or objections to the proposal should be discussed on Zulip and formally registered here by adding a comment with the following syntax:

@rustbot concern reason-for-concern 
<description of the concern> 

Concerns can be lifted with:

@rustbot resolve reason-for-concern 

See documentation at https://forge.rust-lang.org

cc @rust-lang/compiler @rust-lang/compiler-contributors

@rustbot rustbot added the to-announce Announce this issue on triage meeting label Sep 30, 2024
@exrok

This comment has been minimized.

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-change A proposal to make a major change to rustc T-compiler Add this label so rfcbot knows to poll the compiler team
Projects
None yet
Development

No branches or pull requests

4 participants