Remove problematic content slicing in test output parsing #402


Open
ryanhoangt wants to merge 4 commits into SWE-bench:main from ryanhoangt:fix-modal-patch-eval

Conversation

@ryanhoangt
Contributor

Reference Issues/PRs

Fix #377

What does this implement/fix? Explain your changes.

When running patch eval on Modal, I see that for some instances the captured content of the test output file is out of order, which causes the test summary to fall outside the `>>>>> Start Test Output` and `>>>>> End Test Output` markers. I attached a sample log file below.

test_output_astropy__astropy-12907.txt

This PR removes the content slicing line and uses the whole file content for parsing.
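
To illustrate the failure mode, here is a minimal sketch rather than the harness's actual code. The two marker strings are the ones quoted above; the helper name `slice_between_markers` and both sample logs are made up for illustration.

```python
# Marker strings as quoted in the description above, defined locally for this sketch.
START_TEST_OUTPUT = ">>>>> Start Test Output"
END_TEST_OUTPUT = ">>>>> End Test Output"


def slice_between_markers(content: str) -> str:
    """Hypothetical helper: keep only the text between the two markers."""
    return content.split(START_TEST_OUTPUT)[1].split(END_TEST_OUTPUT)[0]


# A well-ordered log: the test summary sits between the markers.
ordered_log = "\n".join([
    "setup noise",
    START_TEST_OUTPUT,
    "PASSED test_a",
    "FAILED test_b",
    END_TEST_OUTPUT,
    "teardown noise",
])

# An out-of-order log (made-up sample): the summary lands after the end marker.
shuffled_log = "\n".join([
    START_TEST_OUTPUT,
    END_TEST_OUTPUT,
    "PASSED test_a",
    "FAILED test_b",
])

ordered_slice = slice_between_markers(ordered_log)
shuffled_slice = slice_between_markers(shuffled_log)

# Slicing keeps the summary in the ordered case but drops it in the shuffled one.
print(repr(ordered_slice))
print(repr(shuffled_slice))
```

In the shuffled case the slice contains no test results at all, while the whole file content still does, which is why parsing the full content recovers the summary.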

Any other comments?

🧡 Thanks for contributing!

@john-b-yang
Member

Hmm, I understand the problem, thanks for noticing it. But I probably wouldn't want this to be the default behavior. Perhaps we can condition this statement on whether Modal eval is being used.

It's not a big difference I guess, but the reason this statement exists is to make sure text not related to testing output is ignored.

@ryanhoangt
Contributor, Author

@john-b-yang that makes sense. I modified the code to skip the content slicing on Modal only. Could you take another look?
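
The conditional behavior discussed here could look roughly like the sketch below. It is an assumption about the shape of the change, not the PR's actual diff: the function name `select_parse_input` and the `is_modal` flag are hypothetical, and only the marker strings come from the description above.

```python
# Marker strings as quoted in the PR description, defined locally for this sketch.
START_TEST_OUTPUT = ">>>>> Start Test Output"
END_TEST_OUTPUT = ">>>>> End Test Output"


def select_parse_input(content: str, is_modal: bool) -> str:
    """Hypothetical helper: choose which text is handed to the log parser.

    On Modal the captured output may be out of order, so the whole file is
    used. Otherwise only the text between the markers is kept, so that
    output unrelated to testing is ignored.
    """
    if is_modal:
        return content
    return content.split(START_TEST_OUTPUT, 1)[1].split(END_TEST_OUTPUT, 1)[0]


# Made-up sample log for illustration.
log = "\n".join([
    "noise before",
    START_TEST_OUTPUT,
    "PASSED test_a",
    END_TEST_OUTPUT,
    "noise after",
])

modal_input = select_parse_input(log, is_modal=True)
local_input = select_parse_input(log, is_modal=False)
```

This keeps slicing as the default and limits the whole-file fallback to Modal runs, which matches the preference stated in the review comment.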

@ryanhoangt
Contributor, Author

Hi @john-b-yang, just a friendly ping on this PR - please let me know if there's anything I can clarify or improve. Thanks!

@neubig

Hi @john-b-yang and/or @klieret, would it be possible to get this merged? We're currently having to work off a branch of swe-bench, and it's causing us logistical issues at the moment. Thanks a lot for the consideration!



Development

Successfully merging this pull request may close these issues.

Inconsistent evaluation in Modal vs without using Modal

3 participants

@ryanhoangt @john-b-yang @neubig
