chore: changed the assertions to make sure the num_updates is a multiple of num_evaluations #1083

Louay-Ben-nessir · 2024-07-02T14:44:52Z

What?

Changed the assertions to make sure the num_updates is a multiple of num_evaluations.

Why?

Only num_evaluation * num_updates_per_eval updates are run during training, so some updates are skipped whenever num_updates is not a multiple of num_evaluations.

How?

Changed the assertions.
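
To illustrate the arithmetic behind the assertion, here is a minimal sketch in plain Python. It assumes num_updates_per_eval is obtained by integer division of num_updates by num_evaluation, which is an assumption about the training loop rather than a quote from the code. The function name is hypothetical.

```python
def updates_actually_run(num_updates: int, num_evaluation: int) -> int:
    """Number of updates executed when training runs in num_evaluation equal blocks."""
    # Assumption: every evaluation block runs the same whole number of updates.
    num_updates_per_eval = num_updates // num_evaluation
    return num_evaluation * num_updates_per_eval


# 1000 updates with 30 evaluations: 33 updates per block, 990 in total.
missed = 1000 - updates_actually_run(1000, 30)

# When num_updates is a multiple of num_evaluation, nothing is lost.
assert updates_actually_run(1000, 20) == 1000
```

In the first case 10 of the 1000 requested updates would never run, which is the situation the assertion is meant to catch.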

OmaymaMahjoub

Thanks @Louay-Ben-nessir, can you also check whether the other systems have the same problem? 🙏

sash-a · 2024-07-03T07:35:05Z

A suggestion that would remove the need for the assert and make Mava easier to configure: replace the variable with an evaluation frequency, then store num_evaluations in the config as config.arch.num_evals = config.system.num_updates // config.arch.eval_freq, so that we still have that information when logging.
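
A rough sketch of how that derivation could look, using a plain dict in place of the real config object. The keys mirror the names in the comment above; the helper name derive_num_evals is hypothetical.

```python
def derive_num_evals(config: dict) -> dict:
    """Fill in arch.num_evals from system.num_updates and arch.eval_freq."""
    num_updates = config["system"]["num_updates"]
    eval_freq = config["arch"]["eval_freq"]  # evaluate once every eval_freq updates
    if eval_freq <= 0:
        raise ValueError("eval_freq must be a positive integer")
    # Floor division: trailing updates that do not fill a whole
    # evaluation interval are not counted.
    config["arch"]["num_evals"] = num_updates // eval_freq
    return config


config = derive_num_evals(
    {"system": {"num_updates": 1000}, "arch": {"eval_freq": 30}}
)
```

With these example values num_evals comes out as 33, which covers 990 of the 1000 updates.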

Louay-Ben-nessir · 2024-07-03T09:12:36Z

> Thanks @Louay-Ben-nessir, can you also check whether the other systems have the same problem? 🙏

I think this issue is exclusive to the PPO systems.

> A suggestion that would remove the need for the assert and make Mava easier to configure: replace the variable with an evaluation frequency, then store num_evaluations in the config as config.arch.num_evals = config.system.num_updates // config.arch.eval_freq, so that we still have that information when logging.

This is a huge improvement over the current implementation, but it's still not exact in some cases. Losing some updates is worth it for the flexibility though, so I'll change it.

sash-a · 2024-07-03T09:49:30Z

Ah right, we could lose some updates. @RuanJohn has had an issue with this in the past, so maybe a jnp.ceil is needed here. Just double-check with him.
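
For reference, a small sketch of the difference between rounding down and rounding up. It uses integer ceiling division in plain Python rather than jnp.ceil so that it runs without JAX. Both function names are hypothetical, and whether a final, partial interval should get its own evaluation is a design decision this sketch does not settle.

```python
def num_evals_floor(num_updates: int, eval_freq: int) -> int:
    """Count only complete evaluation intervals."""
    return num_updates // eval_freq


def num_evals_ceil(num_updates: int, eval_freq: int) -> int:
    """Count a trailing partial interval as one more evaluation."""
    # -(-a // b) is ceiling division for positive integers.
    return -(-num_updates // eval_freq)


floor_evals = num_evals_floor(1000, 30)
ceil_evals = num_evals_ceil(1000, 30)
```

Here rounding down gives 33 intervals (990 updates) and rounding up gives 34. Since 34 full intervals of 30 would be 1020 updates, the last interval would have to be shortened to 10 updates to land exactly on 1000.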

RuanJohn

The hard assert here is a bit too strict in my opinion.
Instead, we could emit a warning saying that the experiment may not run for the number of timesteps the user expects, and report the total number of timesteps that will actually run.
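
Something along these lines, as a rough sketch only. The function name, its arguments and the assumption that one update consumes rollout_length * num_envs timesteps are all illustrative and are not the code in this PR.

```python
import warnings


def warn_if_timesteps_differ(
    total_timesteps: int, num_evaluation: int, rollout_length: int, num_envs: int
) -> int:
    """Return the timesteps that will actually run, warning if it differs from the request."""
    # Assumption: one update consumes rollout_length steps from each of num_envs environments.
    steps_per_update = rollout_length * num_envs
    num_updates = total_timesteps // steps_per_update
    num_updates_per_eval = num_updates // num_evaluation
    actual_timesteps = num_evaluation * num_updates_per_eval * steps_per_update
    if actual_timesteps != total_timesteps:
        warnings.warn(
            f"Requested {total_timesteps} timesteps, but only {actual_timesteps} "
            "will run with the current configuration.",
            stacklevel=2,
        )
    return actual_timesteps


actual = warn_if_timesteps_differ(1_000_000, 30, 128, 16)
```

Returning the actual count as well as warning keeps the run going while still making the discrepancy visible, and lets the caller log the real total.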

chore: change the assertions

1826dd2

Louay-Ben-nessir requested review from arnupretorius, DriesSmit, RuanJohn, jcformanek, siddarthsingh1, sash-a, OmaymaMahjoub, ulricharmel and callumtilbury as code owners July 2, 2024 14:44

pull-request-size bot added the size/S label Jul 2, 2024

OmaymaMahjoub previously approved these changes Jul 2, 2024

Merge branch 'develop' into chore--num_updates-multiple-of-num_evaluation-assertion

672d5ac

RuanJohn requested changes Jul 5, 2024

feat: a check_total_timesteps for ppo systems

9d0b2d5

Louay-Ben-nessir dismissed OmaymaMahjoub’s stale review via 9d0b2d5 July 6, 2024 12:53

pull-request-size bot added size/M and removed size/S labels Jul 6, 2024

WiemKhlifi assigned Louay-Ben-nessir Jul 23, 2024

Louay-Ben-nessir mentioned this pull request Jul 29, 2024

Feat: sebulba ff_ippo #1088

Open
