-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Top-level loop for compiler #1576
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 25 of 46 files at r1, 19 of 19 files at r2, all commit messages.
Reviewable status: 44 of 46 files reviewed, 27 unresolved discussions (waiting on @wmdi)
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 54 at r2 (raw file):
for (parallel_layer_guid_t layer : get_parallel_layers(pcg)) { // NOTE(@wmdi): Assume layers with the same key have the same allowed // machine views
It would be cleaner to instead bake the allowed machine views into the leaves of the MachineMappingProblemTree
itself to avoid any ambiguity, so that the allowed_machine_views
function isn't even taken as an argument, but is instead all handled by get_machine_mapping_problem_tree
Suggestion:
// NOTE(@wmdi): Assume layers with the same key have the same allowed
// machine view
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 59 at r2 (raw file):
} return mapping; }();
Replace with generate_map
from utils/containers.h
Code quote:
std::unordered_map<UnmappedOpCostEstimateKey, parallel_layer_guid_t>
mapping_from_unmapped_op_cost_estimate_key_parallel_layer = [&] {
std::unordered_map<UnmappedOpCostEstimateKey, parallel_layer_guid_t>
mapping;
for (parallel_layer_guid_t layer : get_parallel_layers(pcg)) {
// NOTE(@wmdi): Assume layers with the same key have the same allowed
// machine views
mapping.insert(
{get_unmapped_op_cost_estimate_key_for_layer(pcg, layer), layer});
}
return mapping;
}();
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 63 at r2 (raw file):
MachineMappingCache cached_subgraph_costs = MachineMappingCache{ {}, };
Suggestion:
MachineMappingCache cached_subgraph_costs = empty_machine_mapping_cache();
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 91 at r2 (raw file):
} PCGBinarySPDecomposition sp_decomp = maybe_sp_decomp.value();
Or alternatively, add a function
template <typename T>
T expect(std::optional<T> const &x, std::string const &err) {
return unwrap(x, []() { throw mk_runtime_error(err); });
}
to optional.h
and then use expect
instead of unwrap
for a slightly cleaner syntax
Suggestion:
if (!maybe_sp_decomp.has_value()) {
throw std::runtime_error("Fail to SP-ize PCG");
}
PCGBinarySPDecomposition sp_decomp = unwrap(
get_pcg_balanced_binary_sp_decomposition(pcg),
[]() {
throw std::runtime_error("Failed to get SP decomposition of PCG");
},
);
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 95 at r2 (raw file):
MachineMappingConstraints constraints = MachineMappingConstraints{ /*machine_views=*/{}, };
Suggestion:
MachineMappingProblemTree problem_tree = get_machine_mapping_problem_tree(pcg, sp_decomp);
MachineMappingConstraints constraints =
get_unconstrained_solution_for_layers(get_all_leaf_paths(problem_tree));
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 109 at r2 (raw file):
} else { runtime_with_optimal_mm = mm_result.raw_result.value().runtime; }
Pull out into a separate get_runtime_cost(MachineMappingResult const &)
that's responsible for this
Code quote:
float runtime_with_optimal_mm;
if (mm_result.raw_result == std::nullopt) {
runtime_with_optimal_mm = std::numeric_limits<float>::infinity();
} else {
runtime_with_optimal_mm = mm_result.raw_result.value().runtime;
}
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 155 at r2 (raw file):
/*machine_mapping=*/optimize_pcg(best_state.pcg).second, }; }
Would be nice to pull a bunch of this out into a separate generic backtracking_search
function. This function could either take std::function
s as arguments for the operations, or you could use dtgen to create a struct of std::function
s along the lines of GenericBinarySPDecompositionTreeImplementation
. Would just make things a lot cleaner to read as the core search algorithm wouldn't be all mixed up with the domain-specific logic
Code quote:
GraphOptimizeState best_state = optimize_pcg(pcg).first;
candidates.push(best_state);
for (int iteration = 0;
!candidates.empty() && iteration < search_config.budget;
++iteration) {
GraphOptimizeState current_state = candidates.top();
candidates.pop();
if (current_state < best_state) {
best_state = current_state;
} else if (current_state.runtime_with_optimal_mm >
best_state.runtime_with_optimal_mm * search_config.alpha) {
continue;
}
for (ParallelComputationGraph const &new_pcg :
all_pcgs_obtained_by_applying_a_substitution(current_state.pcg,
substitutions)) {
std::optional<GraphOptimizeState> new_pcg_optimize_result =
optimize_pcg(new_pcg).first;
if (new_pcg_optimize_result == std::nullopt) {
continue;
}
GraphOptimizeState new_state = new_pcg_optimize_result.value();
if (new_state.runtime_with_optimal_mm <= search_config.threshold &&
get_nodes(new_pcg.raw_graph).size() <= search_config.max_num_ops) {
candidates.push(new_state);
}
}
}
return SearchResult{
/*pcg=*/best_state.pcg,
/*machine_mapping=*/optimize_pcg(best_state.pcg).second,
};
}
lib/pcg/include/pcg/parallel_computation_graph/parallel_computation_graph.h
line 71 at r2 (raw file):
ParallelComputationGraph parallel_computation_graph_from_computation_graph(ComputationGraph const &);
Add unit tests for this
lib/compiler/include/compiler/machine_mapping/machine_mapping_problem_tree/get_machine_mapping_problem_tree_result.struct.toml
line 0 at r2 (raw file):
I'm not seeing this actually used anywhere? Should this be removed?
lib/compiler/include/compiler/compiler.h
line 21 at r2 (raw file):
SearchAlgorithm, AlgorithmConfig const &, DeviceType);
AlgorithmConfig
already specifies the search algorithm to be used, so the extra SearchAlgorithm
parameter is unnecessary
Suggestion:
SearchResult optimize(ComputationGraph const &,
MachineSpecification const &,
CostEstimator const &,
AlgorithmConfig const &,
DeviceType);
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 19 at r2 (raw file):
} MachineMapping get_machine_mapping_from_machine_mapping_result(
Add tests
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 24 at r2 (raw file):
BinarySPDecompositionTree sp_tree = binary_sp_tree_from_pcg_sp_tree(sp_decomposition);
Ideally operate over a PCGBinarySPDecomposition
instead (the underlying functions just require a GenericBinarySPDecompositionTreeImplementation
, which is available for PCGBinarySPDecomposition
here:
flexflow-train/lib/compiler/include/compiler/series_parallel/pcg/pcg_binary_sp_decomposition.h
Lines 16 to 20 in cd9b031
GenericBinarySPDecompositionTreeImplementation<PCGBinarySPDecomposition, | |
PCGBinarySeriesSplit, | |
PCGBinaryParallelSplit, | |
parallel_layer_guid_t> | |
generic_impl_for_pcg_sp_tree(); |
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 26 at r2 (raw file):
binary_sp_tree_from_pcg_sp_tree(sp_decomposition); auto get_layer_from_path =
Make this a top-level function in PCGBinarySPDecomposition
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 31 at r2 (raw file):
binary_sp_decomposition_tree_get_subtree_at_path(sp_tree, path); if (!subtree_optional.has_value()) { throw std::runtime_error("Invalid tree path");
Suggestion:
throw std::runtime_error(fmt::format("Invalid tree path {}", path));
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 35 at r2 (raw file):
BinarySPDecompositionTree subtree = subtree_optional.value(); if (!subtree.is_node()) { throw std::runtime_error("Invalid tree path to a leaf");
Suggestion:
throw std::runtime_error(fmt::format("Invalid tree path to a leaf: found {} instead", subtree));
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 38 at r2 (raw file):
} return parallel_layer_guid_t{ subtree.get<Node>(),
Suggestion:
subtree.require_leaf(),
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 53 at r2 (raw file):
} return MachineMapping{mm};
Replace with map_keys
from utils/containers
Code quote:
std::unordered_map<parallel_layer_guid_t, MachineView> mm;
if (mm_result.raw_result) {
FeasibleMachineMappingResult const &feasible_mm_result =
mm_result.raw_result.value();
for (auto const &[path, mv] :
feasible_mm_result.machine_mapping.raw_mapping) {
mm.insert({get_layer_from_path(path), mv});
}
}
return MachineMapping{mm};
lib/compiler/include/compiler/unity_algorithm/graph_optimize_state.h
line 12 at r2 (raw file):
float runtime_with_optimal_mm); ParallelComputationGraph pcg;
Why the change from GraphOptimizeResult
to ParallelComputationGraph
?
lib/compiler/include/compiler/unity_algorithm/unity_algorithm.h
line 17 at r2 (raw file):
std::vector<Substitution> const &substitutions, UnitySearchConfig const &search_config, DeviceType device_type);
Remove this argument and just hardcode DeviceType::GPU
for now--everything in this function is almost certainly going to assume GPU anyway, so this parameter is kind of a lie
Code quote:
DeviceType device_type);
lib/pcg/src/pcg/operator_task_space.cc
line 41 at r2 (raw file):
OperatorTaskSpace get_operator_task_space(ParallelComputationGraph const &pcg, parallel_layer_guid_t const &layer) { NOT_IMPLEMENTED();
Use the version from #1565
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 31 at r2 (raw file):
// https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.graph_hashing.weisfeiler_lehman_graph_hash.html#networkx.algorithms.graph_hashing.weisfeiler_lehman_graph_hash size_t seed = 0; auto layers = topological_ordering(state.pcg);
Suggestion:
std::vector<parallel_layer_guid_t> layers = topological_ordering(state.pcg);
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 33 at r2 (raw file):
auto layers = topological_ordering(state.pcg); ::FlexFlow::hash_combine(seed, layers.size()); for (auto layer : layers) {
Suggestion:
for (parallel_layer_guid_t const &layer : layers) {
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 35 at r2 (raw file):
for (auto layer : layers) { ::FlexFlow::hash_combine(seed, get_parallel_layer_attrs(state.pcg, layer)); auto inputs = get_incoming_tensors(state.pcg, layer);
Suggestion:
std::vector<parallel_tensor_guid_t> inputs = get_incoming_tensors(state.pcg, layer);
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 39 at r2 (raw file):
for (auto input : inputs) { for (size_t i = 0; i < layers.size(); ++i) { if (get_source_layer(input) == layers[i]) {
Suggestion:
if (get_source_layer(input) == layers.at(i)) {
lib/compiler/include/compiler/search_result.struct.toml
line 8 at r1 (raw file):
includes = [ "pcg/parallel_computation_graph/parallel_computation_graph.h", "machine_mapping/machine_mapping.h",
Prefer absolute includes
Suggestion:
"compiler/machine_mapping/machine_mapping.h",
lib/compiler/src/compiler/compiler.cc
line 6 at r2 (raw file):
namespace FlexFlow { SearchResult optimize(ComputationGraph const &computation_graph,
This type doesn't match the one declared in the header?
lib/compiler/src/compiler/compiler.cc
line 15 at r2 (raw file):
case SearchAlgorithm::DATA_PARALLEL: throw std::runtime_error( "Data parallel search algorithm is not implemented yet");
Can you take care of adding this in another PR? Created an issue to track it: #1577
lib/compiler/src/compiler/compiler.cc
line 28 at r2 (raw file):
} default: throw std::runtime_error("Unknown search algorithm");
Suggestion:
throw std::runtime_error(fmt::format("Unknown search algorithm {}", search_algorithm));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 44 of 46 files reviewed, 27 unresolved discussions (waiting on @lockshaw)
lib/compiler/include/compiler/compiler.h
line 21 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
AlgorithmConfig
already specifies the search algorithm to be used, so the extraSearchAlgorithm
parameter is unnecessary
Done.
lib/compiler/include/compiler/machine_mapping/machine_mapping_problem_tree/get_machine_mapping_problem_tree_result.struct.toml
line at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I'm not seeing this actually used anywhere? Should this be removed?
Done.
lib/compiler/src/compiler/compiler.cc
line 6 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
This type doesn't match the one declared in the header?
Done.
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 53 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Replace with
map_keys
fromutils/containers
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 109 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Pull out into a separate
get_runtime_cost(MachineMappingResult const &)
that's responsible for this
Done.
lib/compiler/include/compiler/search_result.struct.toml
line 8 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer absolute includes
Done.
lib/compiler/include/compiler/unity_algorithm/graph_optimize_state.h
line 12 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why the change from
GraphOptimizeResult
toParallelComputationGraph
?
I feel it's better to remove the machine mapping information from GraphOptimizeResult
since it can be derived from the PCG by applying get_optimal_machine_mapping
. It seems redundant to have this field.
lib/compiler/include/compiler/unity_algorithm/unity_algorithm.h
line 17 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove this argument and just hardcode
DeviceType::GPU
for now--everything in this function is almost certainly going to assume GPU anyway, so this parameter is kind of a lie
Done.
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 31 at r2 (raw file):
// https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.graph_hashing.weisfeiler_lehman_graph_hash.html#networkx.algorithms.graph_hashing.weisfeiler_lehman_graph_hash size_t seed = 0; auto layers = topological_ordering(state.pcg);
Done.
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 33 at r2 (raw file):
auto layers = topological_ordering(state.pcg); ::FlexFlow::hash_combine(seed, layers.size()); for (auto layer : layers) {
Done.
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 35 at r2 (raw file):
for (auto layer : layers) { ::FlexFlow::hash_combine(seed, get_parallel_layer_attrs(state.pcg, layer)); auto inputs = get_incoming_tensors(state.pcg, layer);
Done.
lib/compiler/src/compiler/unity_algorithm/graph_optimize_state.cc
line 39 at r2 (raw file):
for (auto input : inputs) { for (size_t i = 0; i < layers.size(); ++i) { if (get_source_layer(input) == layers[i]) {
Done.
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 31 at r2 (raw file):
binary_sp_decomposition_tree_get_subtree_at_path(sp_tree, path); if (!subtree_optional.has_value()) { throw std::runtime_error("Invalid tree path");
Done.
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 35 at r2 (raw file):
BinarySPDecompositionTree subtree = subtree_optional.value(); if (!subtree.is_node()) { throw std::runtime_error("Invalid tree path to a leaf");
Done.
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 38 at r2 (raw file):
} return parallel_layer_guid_t{ subtree.get<Node>(),
Done.
lib/compiler/src/compiler/compiler.cc
line 28 at r2 (raw file):
} default: throw std::runtime_error("Unknown search algorithm");
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 63 at r2 (raw file):
MachineMappingCache cached_subgraph_costs = MachineMappingCache{ {}, };
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 95 at r2 (raw file):
MachineMappingConstraints constraints = MachineMappingConstraints{ /*machine_views=*/{}, };
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 36 of 46 files reviewed, 27 unresolved discussions (waiting on @lockshaw)
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 24 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Ideally operate over a
PCGBinarySPDecomposition
instead (the underlying functions just require aGenericBinarySPDecompositionTreeImplementation
, which is available forPCGBinarySPDecomposition
here:flexflow-train/lib/compiler/include/compiler/series_parallel/pcg/pcg_binary_sp_decomposition.h
Lines 16 to 20 in cd9b031
GenericBinarySPDecompositionTreeImplementation<PCGBinarySPDecomposition, PCGBinarySeriesSplit, PCGBinaryParallelSplit, parallel_layer_guid_t> generic_impl_for_pcg_sp_tree();
I don't quite understand this. Could you elaborate more on the relations between the classes involved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 36 of 46 files reviewed, 27 unresolved discussions (waiting on @lockshaw)
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 54 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
It would be cleaner to instead bake the allowed machine views into the leaves of the
MachineMappingProblemTree
itself to avoid any ambiguity, so that theallowed_machine_views
function isn't even taken as an argument, but is instead all handled byget_machine_mapping_problem_tree
Seems we cannot do that, since allowed machine views depend on the amount of resources for each sub-problem, and therefore we cannot generate those at the beginning of the search
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 59 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Replace with
generate_map
fromutils/containers.h
We cannot replace it since the map is generated from value to key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 36 of 46 files reviewed, 27 unresolved discussions (waiting on @lockshaw)
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 91 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Or alternatively, add a function
template <typename T> T expect(std::optional<T> const &x, std::string const &err) { return unwrap(x, []() { throw mk_runtime_error(err); }); }
to
optional.h
and then useexpect
instead ofunwrap
for a slightly cleaner syntax
Done.
lib/pcg/src/pcg/operator_task_space.cc
line 41 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Use the version from #1565
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 46 files at r1, 4 of 8 files at r3, 30 of 30 files at r4, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @wmdi)
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 24 at r2 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
I don't quite understand this. Could you elaborate more on the relations between the classes involved?
This is the "vtable hack" I mentioned--essentially, rather than using C++ inheritance (which is full of sharp edges), we just specify a .struct.toml
of std::function
s that specify the interface (let's call this template <typename T> struct Vtable<T>;
, and then can define template <typename T>
functions that take in a Vtable<T> const &
to allow the polymorphism. Here you're using a function (get_subtree_at_path
) that uses the vtable hack to operate over any type that has a valid GenericBinarySPDecompositionTreeImplementation
. Right now you're first converting to a BinarySPDecompositionTree
, but you can just operate directly over a PCGBinarySPDecomposition
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 54 at r2 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
Seems we cannot do that, since allowed machine views depend on the amount of resources for each sub-problem, and therefore we cannot generate those at the beginning of the search
Good point. To fix this, you could bake in the allowed_machine_views for the whole possible set of resources, and then filter them at the leaf level?
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 59 at r2 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
We cannot replace it since the map is generated from value to key
True--would unordered_map_from_pairs
combined with transform
(and optionally zip
) work?
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 20 at r4 (raw file):
} MachineMapping get_machine_mapping_from_machine_mapping_result(
Suggestion:
std::optional<MachineMapping>
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 53 at r4 (raw file):
} return MachineMapping{mm};
Suggestion:
return transform(mm_result.raw_result,
[&](FeasibleMachineMappingResult const &feasible_mm_result) {
return MachineMapping{
map_keys(feasible_mm_result.machine_mapping.raw_mapping,
get_layer_from_path),
};
});
lib/compiler/test/src/unity_algorithm.cc
line 5 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { // TODO: to be udpated
Update
lib/compiler/test/src/graph_optimize_state.cc
line 8 at r4 (raw file):
TEST_SUITE(FF_TEST_SUITE) { // TODO(@wmdi): to be udpated
Update
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 84 at r4 (raw file):
}; auto get_runtime_cost = [](MachineMappingResult const &mm_result) {
Move this to machine_mapping_result.h
Previously, lockshaw (Colin Unger) wrote…
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @lockshaw and @victorli2002)
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 24 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
This is the "vtable hack" I mentioned--essentially, rather than using C++ inheritance (which is full of sharp edges), we just specify a
.struct.toml
ofstd::function
s that specify the interface (let's call thistemplate <typename T> struct Vtable<T>;
, and then can definetemplate <typename T>
functions that take in aVtable<T> const &
to allow the polymorphism. Here you're using a function (get_subtree_at_path
) that uses the vtable hack to operate over any type that has a validGenericBinarySPDecompositionTreeImplementation
. Right now you're first converting to aBinarySPDecompositionTree
, but you can just operate directly over aPCGBinarySPDecomposition
Done.
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 26 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Make this a top-level function in
PCGBinarySPDecomposition
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 54 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Good point. To fix this, you could bake in the allowed_machine_views for the whole possible set of resources, and then filter them at the leaf level?
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 59 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
True--would
unordered_map_from_pairs
combined withtransform
(and optionallyzip
) work?
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 155 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Done.
lib/compiler/src/compiler/unity_algorithm/unity_algorithm.cc
line 84 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Move this to
machine_mapping_result.h
Done. BTW, how does [[nodiscard]]
work in machine_mapping_result.h
?
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 20 at r4 (raw file):
} MachineMapping get_machine_mapping_from_machine_mapping_result(
Done.
lib/compiler/src/compiler/machine_mapping/machine_mapping.cc
line 53 at r4 (raw file):
} return MachineMapping{mm};
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 17 of 57 files reviewed, 12 unresolved discussions (waiting on @lockshaw and @victorli2002)
lib/compiler/test/src/graph_optimize_state.cc
line 8 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Update
Done.
lib/compiler/test/src/unity_algorithm.cc
line 5 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Update
Done.
Description of changes:
Implement the top-level loop for compiler.
TODO:
Related Issues:
Linked Issues:
Issues closed by this PR:
Before merging:
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)