[BFCL] Wrong Format in the Possible Answers of Live Parallel Multiple #622

Open
tanliboy opened this issue Sep 4, 2024 · 0 comments

Describe the issue
Several entries in the possible answers for the live test cases have format issues.

ID datapoint

  1. Datapoint / Model Handler permalink:
    https://github.com/ShishirPatil/gorilla/blob/main/berkeley-function-call-leaderboard/data/possible_answer/BFCL_v2_live_parallel_multiple.json
    Attached are examples that failed for my model (IMHO, most of them are valid):
    sample.txt

What is the issue

  • Some live test cases treat a string-typed parameter as an array of strings, so correct answers fail.
  • Function names in the live tests are inconsistent. For instance, the function x**2 is replaced with x^2, and the original lambda function names are rejected.
  • Some test cases translate function parameter values. For example, if a user gives a location in Chinese, the expected answers accept only the translated (non-Chinese) version, which causes mismatches.
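
To make the three failure modes concrete, here is a minimal sketch in Python. It assumes a simplified checker in which each parameter maps to a list of accepted values and the model's value must equal one of them. The `value_matches` helper and all sample values ("Berlin", "北京", the lambda strings) are hypothetical illustrations, not the actual BFCL checker or entries from the dataset.

```python
# Hypothetical, simplified checker: NOT the actual BFCL implementation.
def value_matches(model_value, accepted_values):
    """Return True if model_value equals any entry in accepted_values."""
    return any(model_value == accepted for accepted in accepted_values)


# Each case pairs a plausible model output with an illustrative
# (made-up) list of accepted values showing one reported problem.
problem_cases = {
    # String parameter whose accepted value was written as an array of strings.
    "string_vs_array": ("Berlin", [["Berlin"]]),
    # Expected answer uses x^2 while the model kept the original x**2.
    "operator_spelling": ("lambda x: x**2", ["lambda x: x^2"]),
    # User wrote the location in Chinese; only the translation is accepted.
    "translated_location": ("北京", ["Beijing"]),
}

# One possible correction: fix the type and accept the original form
# alongside the rewritten or translated one.
corrected_cases = {
    "string_vs_array": ("Berlin", ["Berlin"]),
    "operator_spelling": ("lambda x: x**2", ["lambda x: x**2", "lambda x: x^2"]),
    "translated_location": ("北京", ["北京", "Beijing"]),
}

problem_results = {
    name: value_matches(value, accepted)
    for name, (value, accepted) in problem_cases.items()
}
corrected_results = {
    name: value_matches(value, accepted)
    for name, (value, accepted) in corrected_cases.items()
}

for name in problem_cases:
    print(f"{name}: before={problem_results[name]} after={corrected_results[name]}")
```

Under these assumptions, every case is rejected with the original accepted values and passes once the accepted lists are corrected, which is the kind of change proposed below.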

Proposed Changes

Correct the possible answers.
