Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update: flux-sched to build with cmake #118

Merged
merged 1 commit into from
Oct 18, 2023
Merged

Conversation

vsoch
Copy link
Member

@vsoch vsoch commented Oct 8, 2023

Problem: flux-sched 0.29.0 and after builds with cmake
Solution: add a dual package build for Autotools and CMake

This is a WIP pending:

  1. Successful build
  2. Wisdom/feedback from @trws

@trws
Copy link
Member

trws commented Oct 9, 2023

This looks really good, as long as it's behaving itself I don't see anything we need to change. Might be nice to set the build directory for both builders so we always build out of tree, but that would be a bonus.

One question, do we need the RelWithDebInfo option? I think I remember hearing that's no longer the default, so it's a good thing to have, but made me wonder.

@vsoch
Copy link
Member Author

vsoch commented Oct 9, 2023

@trws yay you are around! So I don’t think we need it - I only added it so we would have an example of one option. Would you like me to remove?

And if you want me to add the out of build directory can you show me how to do that with spack?

@trws
Copy link
Member

trws commented Oct 9, 2023 via email

@trws
Copy link
Member

trws commented Oct 9, 2023 via email

@vsoch
Copy link
Member Author

vsoch commented Oct 9, 2023

@trws I think we have this?

    # Avoid the infinite symlink issue
    # This workaround is documented in PR #3543
    build_directory = "spack-build"

is that what you mean? I can move it down into each respective builder class in case that doesn't take.

@vsoch
Copy link
Member Author

vsoch commented Oct 9, 2023

@alalazo could you take a look here and let us know if we did anything wrong? I just tried a few derivatives of moving some of the logic down into the build system and it broke things, so I think we might be at a good spot here.

@vsoch
Copy link
Member Author

vsoch commented Oct 9, 2023

Testing again with a cleared cache - the error was something about a missing key shared for gettext and that seems to correspond to this commit spack@8089aed

@vsoch
Copy link
Member Author

vsoch commented Oct 9, 2023

I've seen this error with spack too - it is sporadic and I don't know how to resolve it. I guess we are frozen.
image

@vsoch vsoch force-pushed the update-flux-sched-cmake branch 2 times, most recently from 501979f to e56a04e Compare October 16, 2023 05:58
@trws
Copy link
Member

trws commented Oct 16, 2023

That looks like there's some kind of dependency on this specific system between libarchive and iconv, but that spack isn't providing iconv. When I look at spack develop, libarchive explicitly has a +iconv variant on it, and it should get built, same here. What's the base image here? Something strange seems to be going on with pkg-config and I'm curious what's causing it.

@vsoch
Copy link
Member Author

vsoch commented Oct 16, 2023

@trws I fixed the bug in spack, and I'm not sure why it's not taking up here (I cleared the cache). I'm chocking it up (for now) to just one more day needed to sync everything. But the bug is fixed.

@vsoch
Copy link
Member Author

vsoch commented Oct 16, 2023

Something strange seems to be going on with pkg-config and I'm curious what's causing it.

Let's watch the build tonight and if it's still wonky follow up here!

@trws
Copy link
Member

trws commented Oct 16, 2023 via email

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay we are still failing tonight: https://github.com/flux-framework/spack/actions/runs/6541718560/job/17763633255. The first thing I checked is that we aren't using libarchive as an external dependency (no):

The only thing I can see about libarchive is the version:

2023-10-17T02:51:47.2075267Z ==> Installing libarchive-3.7.1-ox6uoadnuhngr5klurdvfa33d7xrvzoh [37/69]
2023-10-17T02:51:47.2076631Z ==> No binary for libarchive-3.7.1-ox6uoadnuhngr5klurdvfa33d7xrvzoh found: installing from source
2023-10-17T02:51:48.8732663Z ==> Fetching https://www.libarchive.org/downloads/libarchive-3.7.1.tar.gz
2023-10-17T02:51:48.8733453Z ==> No patches needed for libarchive
2023-10-17T02:51:48.8800471Z ==> libarchive: Executing phase: 'autoreconf'
2023-10-17T02:51:48.8974338Z ==> libarchive: Executing phase: 'configure'
2023-10-17T02:52:12.2551571Z ==> libarchive: Executing phase: 'build'
2023-10-17T02:52:43.5917219Z ==> libarchive: Executing phase: 'install'
2023-10-17T02:52:44.2191960Z ==> libarchive: Successfully installed libarchive-3.7.1-ox6uoadnuhngr5klurdvfa33d7xrvzoh
2023-10-17T02:52:44.2193657Z   Stage: 1.53s.  Autoreconf: 0.00s.  Configure: 23.34s.  Build: 31.32s.  Install: 0.46s.  Post-install: 0.05s.  Total: 56.83s
2023-10-17T02:52:44.2195496Z [+] /opt/spack/opt/spack/linux-ubuntu20.04-x86_64_v4/gcc-10.5.0/libarchive-3.7.1-ox6uoadnuhngr5klurdvfa33d7xrvzoh

We have the update here:

depends_on("libarchive+iconv", when="@0.38.0:")
so I'm not sure what is going on. I had thought we had transparency to see caches associated with an action, but now I'm not so sure reading this: https://github.com/actions/cache/blob/main/tips-and-workarounds.md#force-deletion-of-caches-overriding-default-cache-eviction-policy. We use it here: https://github.com/sciworks/spack-updater/blob/4c4acccb4185c4b3154fe8245447428a94015adc/release-check/action.yaml#L122-L127.

So what I can try (assuming a hypothesis is something is being cached) is to add a variable that disables it. Aside from that I think I'm out of ideas at the moment.

Update: running now, will report back.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay absolutely no cache used - can confirm that in logs! But still failed with this error: https://github.com/flux-framework/spack/actions/runs/6542060275/job/17764502643

Let me know if you see something or have an idea of what to try. I'll keep thinking too.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

I'm trying a local build now, with the latest of spack and the package.py file as it is here. If that works then I'm going to step back and try with pakages, which (should) just run the same command but we will see. If that doesn't work, great, I know what to debug! But if it does work I think I need to look at every step here, because there is a cache hidden somewhere I'm not finding. I'll update here with my progress. And sorry for the trouble, this workflow / set of actions has been working really well (and eliminate a lot of manual labor to update things with spack) so I'm curious what is going on here.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay both worked. So hmm, I think there is a cache hiding somewhere.

@vsoch vsoch force-pushed the update-flux-sched-cmake branch from e56a04e to 70e48c1 Compare October 17, 2023 13:33
@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

fingers and toes are crossed.... ! 🤞

@vsoch vsoch force-pushed the update-flux-sched-cmake branch from 0dde06d to bfc3292 Compare October 17, 2023 18:54
@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay removing pakages didn't have any change - so it's not a nested cache. I'm next trying to remove spack external find in case there is an external dependency that is causing this.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay that still didn't work. I'm going to now try removing the explicit install of clingo, the assumption being that it is somehow bringing something in to bork the subsequent build.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

oh maaaah goosh!! It was CLINGO that monster!!!

okay to step back - the actions here were created before clingo (and the cache) was a thing. We installed it explicitly and created our own cache. This seems to have resulted in some dependency being installed that later borked the build.

What I'm going to do now is slowly add back the functionality from the actions that is good (e.g, spack external find and caches) but just not the install of clingo. Will report when it is ready for another look!

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay - first re-run is adding back spack external find, so we don't need to install like, cmake.

Problem: flux-sched 0.29.0 after builds with smake
Solution: add a dual package build for Autotools and Cmake
Signed-off-by: vsoch <[email protected]>
@vsoch vsoch force-pushed the update-flux-sched-cmake branch from bfc3292 to 88f7b5a Compare October 17, 2023 22:04
@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay just pushed (hopefully) a final check. I've removed a lot of the caching for now - this means the builds will be longer, but probably we want to be testing that. If it becomes an issue with build time I can add them back. Certainly they were okay to have for a long while!

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay estimating this will take a little over 30 minutes -when this is green I'm going to test adding just the package build cache back.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

okay 40 minutes! Cache is added back, trying again.

@vsoch
Copy link
Member Author

vsoch commented Oct 17, 2023

@trws after an epic journey through the land of caches and spack updates passed, this is ready for you to review again!

Turtles all the way down... 🐢 🐢 🐢 ... no no, caches all the way down... 📦 📦 📦 mumbles off into the distance

Copy link
Member

@trws trws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for tracking down that amazingly tricky cache issue!

@vsoch vsoch merged commit e7b50e5 into main Oct 18, 2023
4 checks passed
@vsoch vsoch deleted the update-flux-sched-cmake branch October 18, 2023 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants