The first case of the ROM section oftests/arch/ice40/memories.ys takes 326 seconds to run with majority of this runtime spent sorting cells inopt_expr. This PR allows sorting to only fail a limited number of times on a given module before reaching fixpoint without trying to sort the cells. This cuts down the test case runtime to 14 seconds. The limit can be changed with the new-effort argument.

5.3.12. Executing OPT_EXPR pass (perform const folding).Optimizing module sync_rom.Couldn't topologically sort cells, optimizing module sync_rom may take a longer time.Couldn't topologically sort cells, optimizing module sync_rom may take a longer time.Couldn't topologically sort cells, optimizing module sync_rom may take a longer time.Couldn't topologically sort cells, optimizing module sync_rom may take a longer time.Couldn't topologically sort cells, optimizing module sync_rom may take a longer time.Effort of 5 exceeded, no longer attempting toposort on module sync_rom.<suppressed ~6151 debug messages>

opt_expr: limit effort to reduce runtime on anomalous designs

fcb3ded

widlarizer requested a review frompovik

November 28, 2024 17:10

povik reviewed

Nov 28, 2024

View reviewed changes

passes/opt/opt_expr.cc Outdated

		sort_fails++;
		if (sort_fails >= effort)
		log("Effort of %d exceeded, no longer attempting toposort on module %s.\n",
		effort, log_id(module));
		}

		for (auto cell : cells.sorted)

Copy link

Member

povikNov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Isn't this empty if you don't runcells.sort()? And if that's the case, then the true cause of the speed-up isn't not bothering to sort anymore but giving up entirely

Copy link

CollaboratorAuthor

widlarizerNov 28, 2024•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hm, there's alog_assert(GetSize(sorted) == GetSize(nodes)); right before the return fromsort() so I guess it isn't but I'll verify it later

EDIT: yeah if you don't even runsort() then it's going to be empty, yikes

Copy link

CollaboratorAuthor

widlarizerNov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Fixed by iterating over a list of references to a snapshot ofmodule->cells(). This increases the runtime to 83 seconds, which I still prefer to 326 seconds. I should say, all runtimes I'm mentioning in this PR are measured withENABLE_DEBUG=1 for no particular reason other having that around inMakefile.conf

Copy link

Member

povikNov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We could cache the sorting result provided we remove cells we removed

Copy link

CollaboratorAuthor

widlarizerNov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I can't really foresee the consequences of doing that, though - for how many cycles would we cache etc

Copy link

Member

povikNov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For the lifetime of oneopt_expr invocation. It should be ok as long as we zero out the position in the sorted array corresponding to any removed cells.

I am not sure if there's a case where opt_expr replaces a cell with two or more cells, that's a slight complication

opt_expr: don't give up on opt when out of effort

888e012

widlarizer mentioned this pull request

Dec 2, 2024

read_verilog gives multiple drivers to variables initialized both in declaration and initial#4792

Open

KrystalDelusion reviewed

Mar 19, 2025

View reviewed changes

Copy link

Member

KrystalDelusion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm struggling to follow the control flow here; my comments are based on the understanding thatiterator is a lambda function, which gets called once with another (anonymous) lambda function. But if that is the case then I'm not sure whyiterator is a lambda at all if it only gets called once. In fact, naming the inner anonymous function (the contents of the previous for loop overcells.sorted), moving lines 494-537 after it, dropping the outer lambda and calling the new lambda directly instead of the locally definedreplace_cell seems to give the same results, takes very slightly less time, and is more readable. Which probably also suggests that the inner lambda function could even be elevated to a non-lambda?

passes/opt/opt_expr.cc Outdated

		std::vector<Cell*> module_cells = module->cells();
		auto iterator = [&](auto&& replace_cell) {
		if (sort_fails >= effort) {
		// log("Running on unsorted")

Copy link

Member

KrystalDelusionMar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Remove commented out code

Suggested change

// log("Running on unsorted")

passes/opt/opt_expr.cc Outdated

		cells.edge(cells.node(outbit_to_cell.at(bit)), r_index);
		}

		if (sort_fails < effort && !cells.sort()) {

Copy link

Member

KrystalDelusionMar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Does this line need to checksort_fails? It's already in the else section which checkssort_fails, so the only way thesort_fails value could have increased would be ifcells.node orcells.edge could increase it and I can't find anyway that would happen. And removing it doesn't appear to affecttests/arch/ice40/memories.ys.

Suggested change

	if (sort_fails < effort &&!cells.sort()) {
	if (!cells.sort()) {

Copy link

CollaboratorAuthor

widlarizerMar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yeah, good catch, that's redundant, this must be left over from experimenting with the pass

passes/opt/opt_expr.cc Outdated


		for (auto cell : cells.sorted)
		std::vector<Cell*> module_cells = module->cells();
		auto iterator = [&](auto&& replace_cell) {

Copy link

Member

KrystalDelusionMar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Consider renamingreplace_cell here to avoid confusion with thereplace_cell function defined on line 120?

widlarizer added3 commits

March 20, 2025 12:27

opt_expr: remove commented log statements. NFC

8190036

opt_expr: remove redundant guard against sorting

89b12e4

opt_expr: neater variable names, avoid shadowing

0a15a23

Copy link

CollaboratorAuthor

widlarizer commentedMar 20, 2025

@KrystalDelusion I suppose I tried to be too clever by retaining the pre-existing code ordering to make it easier to review (for myself as well). It uncouples the design traversal without reordering the responsible code blocks, forcing a refactoring into separate functions or creating a worker class that holds all the silly state. When you're saying you named the anonymous function and moved the replacement code after it, do you mean that you nested a function withinreplace_const_cells? If so, I don't think that's standard C++. Otherwise, you have to pass a bunch of variables in, or create a stateful worker (which is the correct thing to do)

Copy link

Member

KrystalDelusion commentedMar 20, 2025

By named I mean assigned to a variable (auto l = ..)

Copy link

CollaboratorAuthor

widlarizer commentedMar 20, 2025

Uh, anyway, I spent a couple hours refactoringopt_expr. It's atemil/opt_expr-limit-effort-refactor. I'm not sure if it's worth the diff now that I see what you meant. If so, I can pull it into this PR or a followup one but I have to attend to different tasks at the moment

Copy link

Member

KrystalDelusion commentedMar 20, 2025

Ah I see, although I think just the minimal refactor to avoid thevisitor lambda is still an improvement in readability (I pushed tokrys/opt_expr-limit-effort)

Copy link

Member

KrystalDelusion commentedMar 20, 2025

Also the CI failing is fixed in latest main, just needs to rebase

Labels

None yet

3 participants

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

opt_expr: limit effort to reduce runtime on anomalous designs#4782

Are you sure you want to change the base?

opt_expr: limit effort to reduce runtime on anomalous designs#4782