When running patch eval on Modal, I see that for some instances the captured test output is written out of order, which causes the test summary to fall outside the >>>>> Start Test Output and >>>>> End Test Output markers. I attached a sample log file below.

test_output_astropy__astropy-12907.txt

This PR removes the content slicing line and uses the whole file content for parsing.
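To illustrate the failure mode, here is a minimal sketch rather than the actual SWE-bench code. The marker strings come from the description above; `START_MARKER`, `END_MARKER`, `slice_between_markers`, `collect_statuses`, and the sample log are hypothetical names and data made up for this example.

```python
START_MARKER = ">>>>> Start Test Output"
END_MARKER = ">>>>> End Test Output"


def slice_between_markers(content: str) -> str:
    """Keep only the text between the two markers (the step this PR removes)."""
    return content.split(START_MARKER)[1].split(END_MARKER)[0]


def collect_statuses(content: str) -> dict[str, str]:
    """Toy parser (hypothetical): map test name -> status from 'STATUS name' lines."""
    statuses = {}
    for line in content.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in {"PASSED", "FAILED"}:
            statuses[parts[1]] = parts[0]
    return statuses


# Made-up log in which the summary lands after the end marker.
out_of_order_log = "\n".join([
    START_MARKER,
    "collected 2 items",
    END_MARKER,
    "PASSED test_a",
    "FAILED test_b",
])

# Slicing discards the summary; parsing the whole file keeps it.
sliced_result = collect_statuses(slice_between_markers(out_of_order_log))
whole_result = collect_statuses(out_of_order_log)
```

With slicing, the toy parser sees no test results at all, while parsing the whole file recovers both statuses.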

Any other comments?

🧡 Thanks for contributing!

remove problematic content slicing

6a83d74

ryanhoangt mentioned this pull request

May 22, 2025

Add option to run patch evaluation on ModalOpenHands/OpenHands#8607

Merged

2 tasks

Member

john-b-yang commented May 22, 2025

Hmm I understand the problem, thanks for noticing it. But I probably wouldn't prefer this be the default behavior. Perhaps we can condition this statement on whether Modal eval is being used.

It's not a big difference I guess, but the reason this statement exists is to make sure text not related to testing output is ignored.

skip content slicing on Modal only

b8ffb7b

Contributor, Author

ryanhoangt commented May 27, 2025

@john-b-yang that makes sense, I modified the code to skip the content slicing on Modal only, could you help take another look?
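Roughly, the conditional behavior looks like the sketch below. This is an illustration under assumptions, not the exact diff: `select_test_output` and its `is_modal` flag are hypothetical names, and the non-Modal branch assumes both markers are present in the log.

```python
START_MARKER = ">>>>> Start Test Output"
END_MARKER = ">>>>> End Test Output"


def select_test_output(content: str, is_modal: bool = False) -> str:
    """Return the text that should be handed to the log parser.

    is_modal is a hypothetical flag for "this log came from a Modal run".
    """
    if is_modal:
        # Modal logs can be out of order, so keep the whole file.
        return content
    # Default path: ignore text unrelated to the test output.
    # Assumes both markers are present in the content.
    return content.split(START_MARKER)[1].split(END_MARKER)[0]


# Made-up log used only to show the two paths.
sample_log = "\n".join([
    "noise",
    START_MARKER,
    "body",
    END_MARKER,
    "summary",
])

default_text = select_test_output(sample_log)
modal_text = select_test_output(sample_log, is_modal=True)
```

The default path still drops everything outside the markers, so existing non-Modal evaluations see the same input as before.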

Contributor, Author

ryanhoangt commented Jun 16, 2025

Hi @john-b-yang, just a friendly ping on this PR. Please let me know if there's anything I can clarify or improve. Thanks!

ryanhoangt added 2 commits

July 25, 2025 09:09

Merge branch 'main' into fix-modal-patch-eval

03846bf

add extra validation for make_run_report

aa0f1ed

neubig commented Aug 29, 2025

Hi @john-b-yang and/or @klieret, would it be possible to get this merged? We're currently having to work off a branch of swe-bench and it's causing us logistical issues at the moment. Thanks a lot for the consideration!

Remove problematic content slicing in test output parsing #402

ryanhoangt commented May 22, 2025
