-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to use cgroup v2 prior to cgroup v1 in path resolver #1978
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,11 +28,21 @@ | |
#include "src/shared/metadata/cgroup_path_resolver.h" | ||
|
||
DEFINE_bool(force_cgroup2_mode, true, "Flag to force assume cgroup2 fs for testing purposes"); | ||
DEFINE_bool(force_cgroup1_mode, false, "Flag to force use of cgroup v1 fs"); | ||
|
||
namespace px { | ||
namespace md { | ||
|
||
StatusOr<std::string> CGroupBasePath(std::string_view sysfs_path) { | ||
std::string cgv2_base_path = absl::StrCat(sysfs_path, "/cgroup"); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We need to test extensively if we change this. Systems can have both cgroup and cgroup2 if I recall correctly, and this change has to potential to break things. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unfortunately the OVH case is one where cgroup v1 and v2 are both present, but our logic fails for it. I agree that this needs to be tested very carefully. After thinking about this more, I think it will be difficult to be confident that all edge cases are covered even if time is invested in testing the known cases. I'm going to think about this more and see if there is a safer option for handling the OVH case. If the cgroup v1 code doesn't short circuit and allows reaching the cgroup v2 logic, then testing this reordering can be avoided. |
||
struct statfs info; | ||
auto fs_status = statfs(cgv2_base_path.c_str(), &info); | ||
bool cgroupv2 = (fs_status == 0) && (info.f_type == CGROUP2_SUPER_MAGIC); | ||
|
||
if (cgroupv2 && !FLAGS_force_cgroup1_mode) { | ||
return cgv2_base_path; | ||
} | ||
|
||
// Different hosts may mount different cgroup dirs. Try a couple for robustness. | ||
const std::vector<std::string> cgroup_dirs = {"cpu,cpuacct", "cpu", "pids"}; | ||
|
||
|
@@ -47,15 +57,11 @@ StatusOr<std::string> CGroupBasePath(std::string_view sysfs_path) { | |
} | ||
} | ||
|
||
std::string cgv2_base_path = absl::StrCat(sysfs_path, "/cgroup"); | ||
struct statfs info; | ||
auto fs_status = statfs(cgv2_base_path.c_str(), &info); | ||
bool cgroupv2 = (fs_status == 0) && (info.f_type == CGROUP2_SUPER_MAGIC); | ||
|
||
// TODO(ddelnano): this aids in testing the LegacyCGroupPathResolver. Determine what | ||
// to do with this. | ||
Comment on lines
+60
to
+61
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should the default value of But I also think using flags in tests is error prone, as mentioned above, so we should try to get away from that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was worried about the possibility of |
||
if (cgroupv2 || FLAGS_force_cgroup2_mode) { | ||
return cgv2_base_path; | ||
} | ||
Comment on lines
+60
to
64
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's unfortunate that this code branch needs to stay to support the LeagcyCGroupPathResolverTest.Cgroup2Format test. Was the intention to remove the legacy resolver at some point? Is that something that could be considered now? |
||
// (TODO): This check for cgroup2FS is eventually to be moved above the cgroupv1 check. | ||
|
||
return error::NotFound("Could not find CGroup base path"); | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,6 +32,7 @@ constexpr std::string_view kContainerID = | |
"a7638fe3934b37419cc56bca73465a02b354ba6e98e10272542d84eb2014dd62"; | ||
|
||
TEST(CGroupPathResolver, GKEFormat) { | ||
FLAGS_force_cgroup1_mode = true; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we must set flags in test, then use |
||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/cpu,cpuacct/kubepods/pod8dbc5577-d0e2-4706-8787-57d52c03ddf2/" | ||
"14011c7d92a9e513dfd69211da0413dbf319a5e45a02b354ba6e98e10272542d/cgroup.procs"; | ||
|
@@ -56,6 +57,7 @@ TEST(CGroupPathResolver, GKEFormat) { | |
} | ||
|
||
TEST(CGroupPathResolver, GKEFormat2) { | ||
FLAGS_force_cgroup1_mode = true; | ||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podc458de04-9784-4f7a-990e-cefe26b511f0/" | ||
"01aa0bfe91e8a58da5f1f4db469fa999fe9263c702111e611445cde2b9cb0c1a/cgroup.procs"; | ||
|
@@ -80,6 +82,7 @@ TEST(CGroupPathResolver, GKEFormat2) { | |
} | ||
|
||
TEST(CGroupPathResolver, StandardFormatDocker) { | ||
FLAGS_force_cgroup1_mode = true; | ||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/cpu,cpuacct/kubepods.slice/" | ||
"kubepods-pod8dbc5577_d0e2_4706_8787_57d52c03ddf2.slice/" | ||
|
@@ -111,6 +114,7 @@ TEST(CGroupPathResolver, StandardFormatDocker) { | |
} | ||
|
||
TEST(CGroupPathResolver, StandardFormatCRIO) { | ||
FLAGS_force_cgroup1_mode = true; | ||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/cpu,cpuacct/kubepods.slice/" | ||
"kubepods-pod8dbc5577_d0e2_4706_8787_57d52c03ddf2.slice/" | ||
|
@@ -142,6 +146,7 @@ TEST(CGroupPathResolver, StandardFormatCRIO) { | |
} | ||
|
||
TEST(CGroupPathResolver, OpenShiftFormat) { | ||
FLAGS_force_cgroup1_mode = true; | ||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/cpu,cpuacct/kubepods.slice/kubepods-burstable.slice/" | ||
"kubepods-burstable-pod9b7969b2_aad0_47d4_b11c_4acfd1ce018e.slice/" | ||
|
@@ -181,6 +186,7 @@ TEST(CGroupPathResolver, OpenShiftFormat) { | |
} | ||
|
||
TEST(CGroupPathResolver, BareMetalK8s_1_21) { | ||
FLAGS_force_cgroup1_mode = true; | ||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/cpu,cpuacct/system.slice/containerd.service/" | ||
"kubepods-besteffort-pod1544eb37_e4f7_49eb_8cc4_3d01c41be77b.slice:cri-containerd:" | ||
|
@@ -224,6 +230,7 @@ std::string GetSysFsPathFromTestDataFile(const std::string& fname, | |
} // namespace | ||
|
||
TEST(LegacyCGroupPathResolverTest, GKEFormat) { | ||
FLAGS_force_cgroup1_mode = true; | ||
ASSERT_OK_AND_ASSIGN( | ||
auto path_resolver, | ||
LegacyCGroupPathResolver::Create(GetSysFsPathFromTestDataFile( | ||
|
@@ -246,6 +253,7 @@ TEST(LegacyCGroupPathResolverTest, GKEFormat) { | |
} | ||
|
||
TEST(LegacyCGroupPathResolverTest, StandardFormat) { | ||
FLAGS_force_cgroup1_mode = true; | ||
ASSERT_OK_AND_ASSIGN( | ||
auto path_resolver, | ||
LegacyCGroupPathResolver::Create(GetSysFsPathFromTestDataFile( | ||
|
@@ -317,6 +325,7 @@ TEST(LegacyCGroupPathResolverTest, StandardFormat) { | |
} | ||
|
||
TEST(LeagcyCGroupPathResolverTest, Cgroup2Format) { | ||
FLAGS_force_cgroup1_mode = false; | ||
ASSERT_OK_AND_ASSIGN( | ||
auto path_resolver, | ||
LegacyCGroupPathResolver::Create(GetSysFsPathFromTestDataFile( | ||
|
@@ -349,6 +358,7 @@ TEST(LeagcyCGroupPathResolverTest, Cgroup2Format) { | |
} | ||
|
||
TEST(CGroupPathResolver, Cgroup2Format) { | ||
FLAGS_force_cgroup1_mode = false; | ||
std::string cgroup_kubepod_path = | ||
"/sys/fs/cgroup/kubepods.slice/" | ||
"kubepods-pod8dbc5577_d0e2_4706_8787_57d52c03ddf2.slice/" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to do this more, but using flags for testing is generally error prone, because they persist across tests. I'd recommend finding a way to avoid flags like this. This goes for the
force_cgroup2_mode
as well, if we're going to try to fix this.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind this was if it's worth extensively testing the new logic ordering (v2 before v1), then this flag would be used to restore the old behavior in the event there was a distro/system that had issues. Since the tests expected that the check would happen in the previous ordering, it also needed to be used there as well.