Enhancement: Support for `extra_info` in Reward Calculation #266

maksimstw · 2025-02-13T06:29:42Z

Enhancement: Support for `extra_info` in Reward Calculation

Summary

This update adds an extra_info parameter to reward computation. Users can pass additional per-sample context when calculating rewards, which makes the reward function easier to adapt to different datasets.

Changes Made

Updated _default_compute_score to accept an extra_info argument:

def _default_compute_score(data_source, solution_str, ground_truth, extra_info):

Modified the reward manager (naive.py) to pass extra_info from data_item.non_tensor_batch to compute_score:

extra_info = data_item.non_tensor_batch['extra_info']
score = self.compute_score(
    data_source=data_source,
    solution_str=sequences_str,
    ground_truth=ground_truth,
    extra_info=extra_info,
)
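As an illustration of what the new argument enables, here is a minimal sketch of a custom scoring function with the same four-argument signature. It assumes extra_info is a dictionary (or None), and the `tolerance` key is a hypothetical example rather than a field any existing dataset is known to define.

```python
def compute_score(data_source, solution_str, ground_truth, extra_info):
    """Hypothetical scorer: numeric match within a tolerance read from extra_info."""
    # Assumption: extra_info is a dict or None. The "tolerance" key is
    # illustrative only; fall back to exact numeric equality when absent.
    tolerance = (extra_info or {}).get("tolerance", 0.0)
    try:
        predicted = float(solution_str.strip())
        expected = float(ground_truth)
    except (TypeError, ValueError):
        # Unparseable answers receive no reward.
        return 0.0
    return 1.0 if abs(predicted - expected) <= tolerance else 0.0
```

A function like this could be supplied as the compute_score callable in place of the default one, with the tolerance stored per sample in the dataset's extra_info field.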

Why This Change?

Some datasets require additional context beyond data_source, solution_str, and ground_truth for accurate reward computation.
The new extra_info field allows users to pass custom metadata, ideally in dictionary form, as specified in the official documentation.
This change maintains compatibility with existing dataset processing scripts, as they already include the extra_info field.

Impact

Improved flexibility: Users can now pass additional contextual information, making reward computation more adaptable to different datasets.
Backward compatibility: Since all example datasets already include extra_info, this update should integrate seamlessly.
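The backward-compatibility point depends on every sample carrying an extra_info field, since indexing a dictionary with a missing key raises KeyError. If datasets without that field need to be supported, a defensive lookup is one option. The sketch below uses a plain dict as a stand-in for data_item.non_tensor_batch; `score_item` and `exact_match` are hypothetical helpers and are not part of this PR.

```python
def score_item(non_tensor_batch, data_source, solution_str, ground_truth, compute_score):
    # Stand-in for the reward manager's per-item logic (hypothetical helper).
    # dict.get returns None instead of raising KeyError when the field is absent.
    extra_info = non_tensor_batch.get("extra_info", None)
    return compute_score(
        data_source=data_source,
        solution_str=solution_str,
        ground_truth=ground_truth,
        extra_info=extra_info,
    )


def exact_match(data_source, solution_str, ground_truth, extra_info=None):
    # Toy scorer that ignores extra_info, used only to exercise score_item.
    return 1.0 if solution_str == ground_truth else 0.0


# A sample without extra_info no longer fails on lookup.
print(score_item({}, "demo", "42", "42", exact_match))
# A sample with extra_info is passed through unchanged.
print(score_item({"extra_info": {"split": "train"}}, "demo", "41", "42", exact_match))
```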

Let me know if any modifications are needed!


vermouth1992 · 2025-02-14T14:14:15Z

Could you format the code according to the README?

maksimstw added 2 commits February 12, 2025 22:14

Update naive.py

eada848

pass in extra info to the reward function.

Update __init__.py

5b2317c

allowing extra_info to be passed in for more advanced compute_score
