Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some small issues with rpm support #796

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

manuelnaranjo
Copy link

I started trying to update rules_pkg to build some of our RPMs at booking, and noticed a few issues.

This are small changes, and even though they're not related, I didn't thought it was a good idea to
have them in separate MRs

Copy link
Collaborator

@cgrindel cgrindel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The doc_build CI failure does not seem to be related to the PR. I am unsure if this PR is the cause for the examples_naming errors.

pkg/make_rpm.py Outdated
@@ -109,7 +109,7 @@ def CopyAndRewrite(input_file, output_file, replacements=None, template_replacem
"""

with open(output_file, 'w') as output:
for line in fileinput.input(input_file):
for line in fileinput.input(input_file, encoding='utf-8'):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable value for encoding. I had a quick look at the doc for fileinput to understand what it selected for encoding when it was unspecified. Unfortunately, it does not say.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I dived into 3 python modules source code and gave up, I have the feeling it goes down all the way to native code to decide, because I'm using the relocatable python toolchain and getting different results on CentOS7 than CentOS Stream 8 and 9 for the same source code

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This is probably not going to be as portable as it should be.
Can you describe the problems you are having which lead to this change.
And a test case really needs to be provided.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Literally was getting ASCII being picked as the default encoder/decoder even with a py3.11 toolchain, I can try coming up with a test, I gave it a few minutes of research but I couldn't figure out how to influence it by passing different env variables into python

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mhh maybe the locale of my system is corrupted, I'll debug further tomorrow https://docs.python.org/3.11/glossary.html#term-locale-encoding sheds some light, I see rules_pkg set a few env variables to utf-8 but it's not getting the entire desired effect.
Also I had to bump rules_py through bzlmod, maybe I got in a state that the bug surfaces, what's weird is that it only happens on CO7 and not our COS8/9 container images

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aiuto so I did some research, I have kind of a repro setup, which basically has a py_binary and a genrule, all the binary does is print the locale.getlocale and compare it against UTF-8, if I run the binary with bazel run on CO7 and COS9 I get the expected UTF-8, if I go through the genrule then it works on CO7 but it fails in COS9. What I noticed is that when I print the env variables in COS9 I see LC_CLANG=C.UTF-8 as an environmental variable, if I set LC_CLANG=en_US.UTF-8 through action_env then I get the expected results.

As part of bazelbuild/bazel#7370 you added LC_CLANG=UTF-8 as a default env for make_rpm, I'll keep digging, but it looks like my image is badly configured, shall I revert this part of the change then?

Copy link
Author

@manuelnaranjo manuelnaranjo Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok more findings https://bugzilla.redhat.com/show_bug.cgi?id=902094 so COS8 added C.UTF-8 it started with CentOS Stream 8, I find other similar issues https://gitlab.haskell.org/ghc/ghc/-/issues/18607, and I see that bazel it self sets en_US.UTF-8 in the env variables for the javac calls https://github.com/bazelbuild/bazel/blob/c7a47846f12c8ad24b53cfc6dbcd99c701c293d9/src/main/java/com/google/devtools/build/lib/rules/java/JavaCompileActionBuilder.java#L57, does it make sense that I create a patch that sets the right env variable instead? Not sure how we would test this though, it requires a customized container image, otherwise I just drop the commit and set action_env in my .bazelrc, this only affects very old systems anyway

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aiuto I changed the way to deal with this, I figured out that rules_pkg doesn't respect LC_CLANG from action_env, I made some changes to respect this value and only if not passed by the rule invocation to use the defaults.

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply.
The current change feels both better and worse.

The most robust situation would be if we can do things in the python code that we can completely ignore the environment, and do the right thing no matter what OS or python version we have. Unfortunately, I don't have enough of a build farm to test all the combinations, so I don't know how to win.

Going back to your original change ( for line in fileinput.input(input_file, encoding='utf-8') ), what was wrong with that? It seems like the right thing. Chuck asked for a little more research, and you dug deeper and found the crawly things under the rock. Could we do the limited change of fixing make_rpm and get your builds to work?

pkg/make_rpm.py Outdated
@@ -109,7 +109,7 @@ def CopyAndRewrite(input_file, output_file, replacements=None, template_replacem
"""

with open(output_file, 'w') as output:
for line in fileinput.input(input_file):
for line in fileinput.input(input_file, encoding='utf-8'):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This is probably not going to be as portable as it should be.
Can you describe the problems you are having which lead to this change.
And a test case really needs to be provided.

When passing changelog as an argument the OutputGroupInfo expects
File and not Target, so it needs to be retrieved from the file
member of the context
If --action_env is being passed then the repo expects it to hit all
the actions, by forcing LANG/LC_CTYPE to override we're removing that
use case
@aiuto
Copy link
Collaborator

aiuto commented Mar 15, 2024

Friendly ping. Sorry for the slow ping. I was essentially away all of January and February.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants