Switched to static linking of llvm #100

timostrunk · 2025-02-13T08:59:10Z

I switched to static linking of libllvm here and added it to the ignore_run_exports on unix. I did not change the build behaviour on Windows as it seems to already build with default settings there and I have no means to test it.

This fixes #99 and #84.

Reasons against this PR:

Obviously this will increase binary size, because libllvm is now statically linked. Im explicitly tagging @xhochy here, because he was against static linking this.
If the symbol version issue would be identified and resolved it would be cleaner

Reasons to merge this PR:

It resolves a hard to debug issue, which occurs using libraries, where debugging skill is required to find out, why a segmentation fault actually happened
It can easily be reverted, once the symbol isolation is gone.

Checklist

Used a personal fork of the feedstock to propose changes
Bumped the build number (if the version is unchanged)
Reset the build number to 0 (if the version changed)
Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
Ensured the license file is being packaged.

timostrunk · 2025-02-13T08:59:23Z

@conda-forge-admin, please rerender

conda-forge-admin · 2025-02-13T09:00:31Z

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

timostrunk · 2025-02-13T09:32:37Z

ppc64le still broken, will look into it

timostrunk · 2025-02-13T10:46:57Z

ppc64le still broken, will look into it

build works locally on cross-compile and goes further than action runner. Test fail (as expected) due to invalid binary format.

timostrunk · 2025-02-14T11:11:18Z

I will try to dynamic link on ppc64le still. I think the native build has issues. The local crossbuild on my machine works though, but I don't want to sacrifice the tests, which I would need to do, if I let the github runner build on linux64.

Reverted the PR to draft, because there will be some noise.

timostrunk · 2025-02-14T12:30:24Z

@conda-forge-admin, please rerender

conda-forge-admin · 2025-02-14T12:32:46Z

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

_{This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/13329229741. Examine the logs at this URL for more detail.}

…nda-forge-pinning 2025.02.12.20.08.11

timostrunk · 2025-02-14T13:27:28Z

@jakirkham : This is now ready for review. ppc64le still builds shared, because I was unable to static link on a native ppc64le platform.

jakirkham

As already discussed at length previously ( #73 ), will not be accepting a change to static linking

Think we need to find a better way to address the issue users are encountering. One avenue may be changing the linker flags (stripping, symbol visibility, etc.)

timostrunk · 2025-02-19T22:04:05Z

My analysis showed that the cause of the issue might not be in llvmlite, so changing the flags here won't change much. The issue is that libllvm15 links into libllvm19, so I presume the right place for a fix is there. If my analysis is right libllvm15 should be in conflict with libllvm19, as it is not safe to have both in the same environment.

Until this fix is found it is desastrous for the conda-forge ecosystem not to provide a workaround. Currently not merging this means that people using pytorch and numba in the same conda environment either

face segmentation faults
need to downpin pytorch and keep the pin until llvmlite gets libllvm19 compatibility at which point this starts working again 'by chance' and will most probably break again with the next libllvm release.

numba and pytorch is a very common combination. Therefore: Please reconsider merging this. It can be reverted once the issue is solved.

timostrunk · 2025-02-19T22:13:46Z

Also in #73 you write:
"Users have been working with Numba on CUDA for some time without issues and this uses the dynamic linking solution we currently have

Not following the argument for static linking"

The reason this is working is only because numba and pytorch-cuda required the same libllvm version at this precise time. When libllvm14 was used by llvmlite and libllvm15 was used by pytorch-cuda, it broke. Now the same thing happens with libllvm15 and 19.

h-vetinari · 2025-02-19T22:14:26Z

As already discussed at length previously ( #73 ), will not be accepting a change to static linking

Pity you did not respond on the thread I opened in the core channel. I will bring this to a vote within core.

jakirkham · 2025-02-19T22:14:36Z

If this is indeed an LLVM issue, let's file on that feedstock. If there was already an issue filed and it was closed prematurely, am happy to reopen (provided a link)

timostrunk · 2025-02-19T22:16:38Z

This is the issue I opened on the llvm-dev feedstock: conda-forge/llvmdev-feedstock#312

beckermr · 2025-02-21T18:09:36Z

I am happy to take on the maintenance burden of pushing updates to the llvm versions for static linking if that helps unblock this.

beckermr · 2025-02-25T23:52:11Z

OK friends. I have another idea on how to ease the maintenance burden of static linking so we can unblock this PR.

I have added a new feature to the bot in this PR: regro/cf-scripts#3755.

It allows the bot to update static libs according to an abstract spec as follows.

In the extra section you add this bit of yaml

extra:
  static_linking_host_requirements:
    - llvmdev 15.*
    - llvm 15.*

This specifies the abstract requirements for the static library you want in host.

Then in your recipe, you list the exact packages you care about like this

requirements:
  host:
    - python
    - setuptools
    - llvmdev 15.0.7 h2621b3d_4  # [osx and arm64]
    - llvm 15.0.7 h4a7a88c_4     # [osx and arm64]
    - llvmdev 15.0.7 hbedff68_4  # [osx and x86]
    - llvm 15.0.7 hed0f868_4     # [osx and x86]

The bot then does the following computation:

For each abstract static lib host requirement in the extra section

Get the latest version for all platforms the feedstock builds
If the latest versions+build numbers across all platforms differ, bail on whole update.

Extract the specs of packages in host in the current recipe that

match the abstract host requirements in the recipe per platform
are exact specs

If any versions and/or build numbers at equal version have increased when comparing 1 and 2

update the static libs in the recipe
issue a PR

The result of this is an update to the host section like this:

requirements:
  host:
    - python
    - setuptools
    - llvmdev 15.0.7 h4429f82_5  # [osx and arm64]
    - llvm 15.0.7 h0cf516b_5     # [osx and arm64]
    - llvmdev 15.0.7 hc29ff6c_5  # [osx and x86]
    - llvm 15.0.7 hb21d583_5     # [osx and x86]
    - zlib
    - vs2015_runtime  # [win]

The bot stores the new static lib versions it used to update the feedstock as part of the PR info it has. This should prevent it from issuing duplicate PRs.

Further, by bailing if not all of the versions+build numbers of the new static libs match, we should prevent PRs being issued in the middle of a build of the static libs on the backend.

Finally, by restricting the search for updates to the static libs to the abstract specs in extra, the bot will only issue an update for increases in minor+patch versions and/or build numbers in the say 15.* series for llvm, etc.

If this is of interest to you all, let me know and I can finish up the bot PR, make a PR to this feedstock, and we can start trying it out.

cc @isuruf @h-vetinari @jakirkham @conda-forge/llvmlite

h-vetinari · 2025-02-26T01:07:42Z

Thank you for trying to find a solution on this @beckermr! I'm fine with whatever setup that lets us get rid of the segfaults. To me the static_linking_host_requirements: machinery sounds a bit like overkill, but if it addresses the maintenance concerns here, then I'm happy to go in that direction.

isuruf · 2025-02-26T01:22:30Z

Let's move it to bot section of conda-forge.yml instead of keeping it in meta.yaml.

Can we also use something like

    - llvmdev 15.0.7 *_5
    - llvm 15.0.7 *_5

instead to make it human updateable too?

beckermr · 2025-02-26T01:50:27Z

I'm not in favor of moving it to the bot section because then the recipe is not self-contained.

Happy to make the changes on the hash.

isuruf · 2025-02-26T01:59:49Z

I'm not in favor of moving it to the bot section because then the recipe is not self-contained.

What do you mean by this?

beckermr · 2025-02-26T03:00:01Z

I mean when people look at the recipe, they won't see any hints/info on what exactly is going on in host if the abstract requirements are in a entirely separate file. At least by having the abstract requirements in the extra section of the same file, there is a clue that something different is going on. Also, that section gets imbedded in the final artifact which has additional advantages.

isuruf · 2025-02-26T03:13:03Z

You'd already have

host:
    - llvmdev 15.0.7 *_5
    - llvm 15.0.7 *_5

which tells anyone looking at the recipe what is going on.
The bot would come and bump the value from *_5 to *_6.

beckermr · 2025-02-26T03:37:00Z

No it does not. There is missing information as follows.

The spec llvmdev 15.0.7 *_5 could be the result of many other specs one might list in static_linking_host_requirements, eg

llvdev <17
llvmdev >12,<18
llvmdev 15.0.*

The contents in static_linking_host_requirements are what would be in the recipe if we were to not exactly pin the static host requirement and instead let conda solve for it. It thus belongs in the recipe.

beckermr · 2025-02-26T03:56:50Z

Globs in the hash part of the build string are now working. The bot will use a glob if the recipe previously had one. Otherwise it will use the exact pin.

isuruf · 2025-02-26T14:16:49Z

We don't even have the pinnings from conda-forge-pinning in the recipe, so I don't see why we need to have it in the recipe explicitly here.

isuruf · 2025-02-26T14:18:11Z

We could also have

host:
- llvmdev {{ llvm_15_static }}

and have a migrator. This should make it work just like other migrators.

beckermr · 2025-02-26T14:35:54Z

We don't even have the pinnings from conda-forge-pinning in the recipe, so I don't see why we need to have it in the recipe explicitly here.

It goes beyond this.

With only the semi-exact pin (e.g., llvm 15.0.7 *_6), we don't know which updates are allowed and which ones are not. Can we update llvm 15.0.7 *_6 to llvm 19.0.0 <build>? Can we update llvm 15.0.7 *_6 to llvm 15.1.0 <build>?

Assuming only the major version matters works for this feedstock, but won't work in general.

have a migrator

The migrator idea suffers from the same issue. The pinnings would need to list the exact package spec (eg llvm 15.0.7 *_6) on a given platform. In order for the bot to issue a PR to the pinnings repo, it needs to again know what updates are allowed to this exact pin.

For ABI migrations, the bot checks if the latest version of the package is outside the version range of the current pin. That won't work here since we want updates within a specific version series.

So TL;DR; without something specifying that we want the latest package within llvm 15.*, we'd have to make an assumption (i.e., only update within the major version) or allow any updates (e.g., move to the latest always). That last thing would not work for this feedstock since llvmlite needs llvm 15, llvmdev 15 etc.

beckermr · 2025-02-26T14:36:29Z

Here is another idea for the API. We put a key in the conda-forge.yml like this

bot:
  update_static_libs: true

and then in host we list both the "abstract" and "pinned" requirement

requirements:
  host:
    - llvm 15.0.*               # "abstract" requirement since no build string
    - llvm 15.0.* blah_*        # "abstract" requirement since no build number
    - llvm 15.0.7 blah_*_5      # current pin with no hash
    - llvm 15.0.7 blah_h4541_5  # or a current pin w/ an exact hash

The bot would know the difference between the "abstract" versus "pinned" requirement by the presence or absence of a full build string and/or a build number at the end of the build string.

The trade-off here is between being explicit versus laundering information in the extra section. Neither is perfect, but we might prefer one over the other.

isuruf · 2025-02-26T15:59:03Z

Here is another idea for the API

I like this idea.

beckermr · 2025-02-27T13:17:28Z

Ok. The bot is ready, pending review and merge of the PR.

Shall we proceed?

recipe/meta.yaml

beckermr · 2025-02-27T14:12:17Z

@conda-forge-admin rerender

beckermr · 2025-02-27T14:13:01Z

@conda-forge-admin rerender

…nda-forge-pinning 2025.02.27.07.39.47

conda-forge-admin · 2025-02-27T14:16:16Z

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found some lint.

Here's what I've got...

For recipe/meta.yaml:

❌ In conda-forge.yml: $.bot = {'update_static_libs': True, 'abi_migration_branches': ['rc']}.

{'update_static_libs': True, 'abi_migration_branches': ['rc']} is not valid under any of the given schemas
Schema
```
{
  "anyOf": [
    {
      "$ref": "#/$defs/BotConfig"
    },
    {
      "type": "null"
    }
  ]
}
```

_{This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/13569596802. Examine the logs at this URL for more detail.}

beckermr · 2025-02-27T14:32:19Z

@isuruf ppc64le is not statically linked in these builds due to some error @timostrunk had when they tried above.

beckermr · 2025-02-27T15:01:29Z

PR to fix the linter: conda-forge/conda-smithy#2253

beckermr · 2025-02-27T15:29:16Z

OK. This one is all green or will be soon. I do not want to merge when another maintainer has requested changes.

@jakirkham Can you look at what we've done here and reconsider your review possibly? We've enabled fully automatic updates via the bot and I've added myself to the feedstock to manage things around that as I expect we'll encounter a few bugs along the way.

timostrunk requested review from jakirkham, marcelotrevisani, mbargull, souravsingh and xhochy as code owners February 13, 2025 08:59

timostrunk mentioned this pull request Feb 13, 2025

Linking libllvm-19 before using libllvmlite (currently requiring libllvm-15) leads to segmentation faults. #99

Open

1 task

timostrunk marked this pull request as draft February 14, 2025 10:57

timostrunk force-pushed the switch_to_static branch from 05f65d3 to 7673ab6 Compare February 14, 2025 12:37

timostrunk and others added 2 commits February 14, 2025 13:56

Switched to static linking of llvm for linux-64 and osx-*

67353fe

MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and co…

2a97a9a

…nda-forge-pinning 2025.02.12.20.08.11

timostrunk force-pushed the switch_to_static branch from 7673ab6 to 2a97a9a Compare February 14, 2025 13:08

timostrunk marked this pull request as ready for review February 14, 2025 13:09

timostrunk mentioned this pull request Feb 14, 2025

Unnecessary libllvm dependency for numba through llvmlite #84

Open

1 task

jakirkham requested changes Feb 19, 2025

View reviewed changes

beckermr mentioned this pull request Feb 26, 2025

feat: add static lib migrator regro/cf-scripts#3755

Open

1 task

cleanup

e8b1176

beckermr reviewed Feb 27, 2025

View reviewed changes

recipe/meta.yaml Outdated Show resolved Hide resolved

Update recipe/meta.yaml

230e541

add me too

df55c8e

beckermr approved these changes Feb 27, 2025

View reviewed changes

beckermr and others added 2 commits February 27, 2025 08:14

Update conda-forge.yml

e7cf0ea

MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and co…

9d9ae56

…nda-forge-pinning 2025.02.27.07.39.47

Update build.sh

af6574a

Update build.sh

03326d4

beckermr mentioned this pull request Feb 27, 2025

static linkage migrations regro/cf-scripts#3747

Open

Switched to static linking of llvm #100

Are you sure you want to change the base?

Switched to static linking of llvm #100

Conversation

timostrunk commented Feb 13, 2025 • edited Loading

timostrunk commented Feb 13, 2025

conda-forge-admin commented Feb 13, 2025

timostrunk commented Feb 13, 2025

timostrunk commented Feb 13, 2025

timostrunk commented Feb 14, 2025

timostrunk commented Feb 14, 2025

conda-forge-admin commented Feb 14, 2025

timostrunk commented Feb 14, 2025

jakirkham left a comment

Choose a reason for hiding this comment

timostrunk commented Feb 19, 2025

timostrunk commented Feb 19, 2025

h-vetinari commented Feb 19, 2025

jakirkham commented Feb 19, 2025

timostrunk commented Feb 19, 2025

beckermr commented Feb 21, 2025

beckermr commented Feb 25, 2025 • edited Loading

h-vetinari commented Feb 26, 2025

isuruf commented Feb 26, 2025

beckermr commented Feb 26, 2025

isuruf commented Feb 26, 2025

beckermr commented Feb 26, 2025

isuruf commented Feb 26, 2025

beckermr commented Feb 26, 2025 • edited Loading

beckermr commented Feb 26, 2025

isuruf commented Feb 26, 2025

isuruf commented Feb 26, 2025

beckermr commented Feb 26, 2025

beckermr commented Feb 26, 2025

isuruf commented Feb 26, 2025

beckermr commented Feb 27, 2025

beckermr commented Feb 27, 2025

beckermr commented Feb 27, 2025

conda-forge-admin commented Feb 27, 2025 • edited Loading

beckermr commented Feb 27, 2025

beckermr commented Feb 27, 2025

beckermr commented Feb 27, 2025

timostrunk commented Feb 13, 2025 •

edited

Loading

beckermr commented Feb 25, 2025 •

edited

Loading

beckermr commented Feb 26, 2025 •

edited

Loading

conda-forge-admin commented Feb 27, 2025 •

edited

Loading