Practice prompt calibration #670

Merged: 4 commits into main, Nov 5, 2024

Conversation

@wpietri (Contributor) commented Nov 5, 2024

No description provided.

…he 0.5 standards. Moving calibration testing to modelbench-private.
@wpietri requested a review from bkorycki November 5, 2024 18:56
@wpietri requested a review from a team as a code owner November 5, 2024 18:56
github-actions bot commented Nov 5, 2024

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

@bkorycki (Contributor) left a comment

Thanks for doing this!

Review threads on src/modelbench/run.py (resolved)
@bollacker (Collaborator) left a comment

I see nothing obviously wrong.

@dhosterman (Collaborator) left a comment

👍

@wpietri merged commit 10a1e74 into main Nov 5, 2024
4 checks passed
github-actions bot locked and limited conversation to collaborators Nov 5, 2024
src/modelbench/run.py:

      benchmarks.append(GeneralPurposeAiChatBenchmarkV1(l, "ensemble"))
    - run_result = run_benchmarks_for_suts(benchmarks, reference_suts, 100)
    + run_result = run_benchmarks_for_suts(benchmarks, reference_suts, None)

Contributor comment:

Style: for non-obvious arguments, I generally recommend naming them, for 3am me. E.g.

run_benchmarks_for_suts(benchmarks, reference_suts, what_this_does=None)
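
A minimal sketch of the pattern the comment suggests (the signature, the parameter name max_items, and the idea that None means "no cap" are all assumptions for illustration, not modelbench's actual API):

    from typing import Optional, Sequence

    def run_benchmarks_for_suts(benchmarks: Sequence, suts: Sequence,
                                max_items: Optional[int] = None) -> None:
        # Hypothetical stand-in for the real function; the parameter name
        # max_items and the "None means no cap" behavior are assumptions.
        for benchmark in benchmarks:
            items = getattr(benchmark, "items", [])
            if max_items is not None:
                items = items[:max_items]  # cap the run for a quick calibration pass
            print(f"running {len(items)} items against {len(suts)} SUTs")

    benchmarks, reference_suts = [], []

    # Positional call: a 3am reader can't tell what the third argument controls.
    run_benchmarks_for_suts(benchmarks, reference_suts, 100)

    # Keyword call, as recommended: self-documenting at the call site.
    run_benchmarks_for_suts(benchmarks, reference_suts, max_items=None)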

5 participants