Added api call example #1587

yoavkatz · 2025-02-09T10:27:39Z

No description provided.

Signed-off-by: Yoav Katz <[email protected]>

elronbandel · 2025-02-09T10:38:13Z

examples/api_call_evaluation.py

+print("Example prompt:")
+
+print(json.dumps(results.instance_scores[0]["source"], indent=4))
+
+print("Instance Results:")
+df = results.instance_scores.to_df(
+    columns=[
+        "user_request",
+        "reference_query",
+        "prediction",
+        "processed_references",
+        "processed_prediction",
+        "score",
+    ]
+)
+for index, row in df.iterrows():
+    print(f"Row {index}:")
+    for col_name, value in row.items():
+        print(f"{col_name}: {value}")
+    print("-" * 20)


Can we simplify that part in any way?

I opened an issue about this (#1588). The current 'results.instance_scores.summary' is not usable here because it prints lines that are too long. I think we should change the way it prints, and then we could use it here.
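One possible direction for the simplification discussed above, as a minimal sketch only: a small helper that renders each record as wrapped "column: value" lines. The name format_records is hypothetical and not part of unitxt, and the sketch assumes the instance rows are available as plain dicts (for example via df.to_dict("records")).

```python
import textwrap


def format_records(records, columns, width=80):
    """Render each record as 'column: value' lines wrapped to `width` characters."""
    blocks = []
    for index, record in enumerate(records):
        lines = [f"Row {index}:"]
        for column in columns:
            text = f"{column}: {record.get(column, '')}"
            # Wrap long values and indent continuation lines for readability.
            wrapped = textwrap.wrap(text, width=width, subsequent_indent="    ")
            lines.extend(wrapped or [text])
        lines.append("-" * 20)
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


# Made-up record standing in for one row of the instance results.
sample = [
    {
        "user_request": "List all available pets",
        "prediction": "curl -X GET https://petstore.swagger.io/v2/pets",
        "score": 1.0,
    }
]
print(format_records(sample, ["user_request", "prediction", "score"], width=60))
```

With something like this, the example file would need a single call instead of the nested loops, and the line width stays bounded regardless of how long a prediction is.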

examples/api_call_evaluation.py

elronbandel

This is a nice example that shares a lot of similarity with Text to SQL.
In both cases there is (1) a user query and (2) an API/database, and the target is to translate the user query into the API/database language.
I think it's worth looking at how the SQL task was implemented, specifically:

(1) The database is defined by its unique identifier (in this case it could be OpenAPI(url="https://petstore.swagger.io/v2/pets")).
(2) The representation of the database to the model is determined by a dedicated type serializer (in this case it could be OpenAPISpecificationSerializer(format="json"); the format could also be yaml).
(3) Metrics could also use deterministic verification, for example with https://pypi.org/project/openapi-spec-validator/, which can use the OpenAPI spec directly to verify schema/syntax/execution.
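To make points (1) and (2) concrete, here is a rough sketch of what such types might look like. OpenAPI and OpenAPISpecificationSerializer are the names floated in this comment, not existing unitxt classes; the spec field, the serialize method, and the toy spec are assumptions added for illustration. Only the JSON format is handled, since YAML would need a third-party library.

```python
import json
from dataclasses import dataclass, field


@dataclass
class OpenAPI:
    """Hypothetical type: an API identified by the URL of its specification."""

    url: str
    # Assumption: the spec is supplied already parsed, so no network call is made.
    spec: dict = field(default_factory=dict)


@dataclass
class OpenAPISpecificationSerializer:
    """Hypothetical serializer that decides how the spec is shown to the model."""

    format: str = "json"

    def serialize(self, api: OpenAPI) -> str:
        if self.format == "json":
            return json.dumps(api.spec, indent=2, sort_keys=True)
        # YAML is mentioned in the discussion but is left out of this sketch.
        raise ValueError(f"Unsupported format: {self.format}")


# Toy spec for illustration only; it is not the real Petstore specification.
api = OpenAPI(
    url="https://petstore.swagger.io/v2/pets",
    spec={"paths": {"/pets": {"get": {"summary": "List pets"}}}},
)
print(OpenAPISpecificationSerializer(format="json").serialize(api))
```

Separating the identifier from the serializer means the same task data could be rendered differently per model without touching the dataset itself.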

yoavkatz · 2025-02-09T16:30:48Z

(1) I wanted to keep this simple for now (I did it to show one of the users, who actually had their own internal API spec representation). That's why I did not put the task in the catalog yet and left it as a standalone example file; I felt it's not mature enough to be general.
(2) I agree a more elaborate approach could be done with a serializer for OpenAPI specs.
(3) I think the above library validates the schema (which I assume to be correct), while we want to validate the curl commands generated based on the schema.
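As an illustration of the distinction in the last point, a deterministic check on a generated command (rather than on the spec) could look roughly like the sketch below. The helpers parse_curl and matches_spec are hypothetical and not part of this PR or of openapi-spec-validator; the sketch handles only a few curl options and checks only that the method and path correspond to an operation in the spec, not parameters or request bodies.

```python
import re
import shlex
from urllib.parse import urlparse


def parse_curl(command):
    """Extract (method, path) from a simple curl command line."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "curl":
        raise ValueError("not a curl command")
    method, url, has_data = None, None, False
    args = iter(tokens[1:])
    for token in args:
        if token in ("-X", "--request"):
            method = next(args, "").upper()
        elif token in ("-d", "--data"):
            has_data = True
            next(args, None)  # skip the request body
        elif token in ("-H", "--header"):
            next(args, None)  # skip the header value
        elif not token.startswith("-") and url is None:
            url = token
    if url is None:
        raise ValueError("no URL found")
    if not method:
        # curl defaults to POST when data is sent, otherwise GET.
        method = "POST" if has_data else "GET"
    return method, urlparse(url).path


def matches_spec(command, spec):
    """Return True if the command's method and path match an operation in spec['paths']."""
    method, path = parse_curl(command)
    base = spec.get("basePath", "")
    if base and path.startswith(base):
        path = path[len(base):]
    for template, operations in spec.get("paths", {}).items():
        # Turn '/pets/{petId}' into a regex that accepts any single path segment.
        pattern = re.sub(r"\\\{[^/]+?\\\}", "[^/]+", re.escape(template))
        if re.fullmatch(pattern, path) and method.lower() in operations:
            return True
    return False


# Toy spec for illustration only; it is not the real Petstore specification.
spec = {"basePath": "/v2", "paths": {"/pets": {"get": {}}, "/pets/{petId}": {"get": {}}}}
print(matches_spec("curl -X GET https://petstore.swagger.io/v2/pets/42", spec))
print(matches_spec("curl -X DELETE https://petstore.swagger.io/v2/pets", spec))
```

A check like this would complement, not replace, comparison against a reference query, since a command can be valid against the spec and still not answer the user request.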

yoavkatz · 2025-02-11T09:13:44Z

Given the above, do you think we can merge?

Added api call example

cd57ea3

Signed-off-by: Yoav Katz <[email protected]>

elronbandel reviewed Feb 9, 2025

elronbandel reviewed Feb 9, 2025

Improved split to components

0804a9c

Signed-off-by: Yoav Katz <[email protected]>

yoavkatz and others added 2 commits February 11, 2025 11:11

Moved api_spec to task

c3caceb

Signed-off-by: Yoav Katz <[email protected]>

Merge branch 'main' into api_call_evaluation

02f0d9e

Merge branch 'main' into api_call_evaluation

711677d

elronbandel approved these changes Feb 12, 2025

elronbandel enabled auto-merge (squash) February 12, 2025 08:51

elronbandel merged commit 6c82ceb into main Feb 12, 2025
17 checks passed

elronbandel deleted the api_call_evaluation branch February 12, 2025 08:54
