vm model guided inference #266

arendu · 2024-08-16T22:34:53Z

What does this PR do?

Adds inference guided by a value model or a process reward model.

Please update the CHANGELOG.md under next version with high level changes in this PR.

# Add a code snippet demonstrating how to use this
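
The PR itself does not include a usage snippet, so the following is only a minimal sketch of the general idea behind value-model-guided inference, not the API added in this PR. It assumes a step-level greedy search: at each step a generator proposes several candidate continuations, a value model (or process reward model) scores each one, and the highest-scoring candidate is kept. The names `guided_decode`, `propose_steps`, `value_fn`, `toy_propose`, and `toy_value` are hypothetical and exist only for illustration.

```python
from typing import Callable, Sequence


def guided_decode(
    prompt: str,
    propose_steps: Callable[[str, int], Sequence[str]],
    value_fn: Callable[[str], float],
    num_candidates: int = 4,
    max_steps: int = 8,
    stop_token: str = "<eos>",
) -> str:
    """Greedy step-level search guided by a scoring model (hypothetical helper).

    At each step, keep the candidate continuation that the value model
    (or process reward model) scores highest.
    """
    text = prompt
    for _ in range(max_steps):
        candidates = propose_steps(text, num_candidates)
        if not candidates:
            break
        # Score each partial response and keep the best continuation.
        best = max(candidates, key=lambda step: value_fn(text + step))
        text += best
        if best.endswith(stop_token):
            break
    return text


# Toy stand-ins for a real generator and a real value model (assumptions).
def toy_propose(text: str, n: int) -> Sequence[str]:
    options = [" a", " b", " c", " <eos>"]
    return options[:n]


def toy_value(text: str) -> float:
    # Rewards " b" steps, and rewards stopping only after two of them.
    score = float(text.count(" b"))
    if text.endswith("<eos>"):
        score += 10.0 if text.count(" b") >= 2 else -10.0
    return score


result = guided_decode("start:", toy_propose, toy_value)
print(result)
```

In a real setup, `propose_steps` would sample continuations from the policy model and `value_fn` would call the value or process reward model; wider searches (for example, keeping several candidates per step) follow the same pattern.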

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation? Make sure to also update the NeMo Framework User Guide, which contains the tutorials.

Signed-off-by: arendu <adithya.r@gmail.com>

for more information, see https://pre-commit.ci

arendu added 3 commits August 14, 2024 18:03

update

bc69436

Signed-off-by: arendu <adithya.r@gmail.com>

value model at inference

4ff13bc

Signed-off-by: arendu <adithya.r@gmail.com>

reward/value model consulting

eb20431

Signed-off-by: arendu <adithya.r@gmail.com>

arendu requested review from odelalleau and gshennvm August 16, 2024 22:34

[pre-commit.ci] auto fixes from pre-commit.com hooks

10cdc1d

for more information, see https://pre-commit.ci

arendu requested a review from terrykong August 16, 2024 22:35

github-actions bot added the Utils and Algorithms labels Aug 16, 2024

arendu requested a review from yidong72 August 16, 2024 22:35