Benchmarking Performance Different Than Reported #3
Comments
I think the planners did not compile at all, which is why you get those errors. Tagging this as an install issue, and looping in @agiachris @Khodeir
I looked into this further -- things seem fine on my end.
Install dependencies + version info
C, CXX compiler versions used
Run planners
FF-X
FD-lama-first
Cerberus-seq-sat
Cerberus-seq-agl (on the
DecStar-agl-decoupled
FD-seq-opt-lmcut
Delfi
Compilation issue
Returning to the issue in question, I think it'd be worth looking into compiler changes across Ubuntu 20.04 and 22.04, and whether they are causing the FF and FD variants not to compile on your end. One other difference I notice is that I used Python 3.10 (although I'm not sure that's what is causing the FF and FD planners not to compile on your end).
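To make the compiler and Python comparison above easier to repeat on both machines, here is a minimal sketch of a version report. The tool names (`gcc`, `g++`, `cmake`, `make`) are assumptions about what the planner builds rely on; nothing in this thread confirms the exact toolchain, so adjust the list as needed.

```python
import platform
import shutil
import subprocess
import sys


def tool_version(name):
    """Return the first line of `<name> --version`, or None if the tool is unavailable."""
    path = shutil.which(name)
    if path is None:
        return None
    try:
        out = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = (out.stdout or out.stderr).strip().splitlines()
    return lines[0] if lines else None


def environment_report(tools=("gcc", "g++", "cmake", "make")):
    """Collect Python, platform, and build-tool versions (tool list is an assumption)."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for tool in tools:
        report[tool] = tool_version(tool)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value if value is not None else 'not found'}")
```

Pasting this output from both the Ubuntu 20.04 and 22.04 setups would show at a glance whether the compilers differ.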
Thanks for the fast and detailed response, really appreciate it! I followed your advice and started fresh. To facilitate reproducibility, I used Docker to produce the results. I made a Dockerfile and a script for running the tests, as in my PR, and you can run all my tests with them. I hope we can work together on a good Dockerfile so that other people can use this benchmark more easily, because it is such a good benchmark. I will also contribute a Docker image once we fix the dependency issues and pass all tests.
FF-X
FF
FD-seq-opt-lmcut
FD-lama-first
Delfi
DecStar-opt-decoupled
DecStar-agl-decoupled
Cerberus-seq-sat
Cerberus-seq-agl
lapkt-bfws
I notice that the cases with 'nan' failures are always accompanied by the following warning
I'm glad to provide more details if necessary!
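For anyone repeating the per-planner runs, a sweep like this could be scripted roughly as follows. This is only a sketch: it reuses the `plan.py` command and planner names quoted in this thread, defaults to a dry run that prints the commands, and records raw exit codes without assuming what they mean for plan success.

```python
import subprocess

# Planner names as listed in this thread.
PLANNERS = [
    "FF",
    "FF-X",
    "FD-lama-first",
    "FD-seq-opt-lmcut",
    "Delfi",
    "DecStar-opt-decoupled",
    "DecStar-agl-decoupled",
    "Cerberus-seq-sat",
    "Cerberus-seq-agl",
    "lapkt-bfws",
]


def build_command(planner, domain_name):
    """Build the benchmark command quoted in the original report."""
    return [
        "python",
        "scripts/benchmark/plan.py",
        "--domain-name",
        domain_name,
        "--planner",
        planner,
    ]


def sweep(domain_name, dry_run=True):
    """Run (or just print) one command per planner; return exit codes by planner."""
    results = {}
    for planner in PLANNERS:
        cmd = build_command(planner, domain_name)
        if dry_run:
            print(" ".join(cmd))
            results[planner] = None
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[planner] = proc.returncode
    return results


if __name__ == "__main__":
    sweep("taskographyv2tiny1")
```

Calling `sweep("taskographyv2tiny1", dry_run=False)` from the repository root would actually launch the planners.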
Hello, I have the same problem. Have you solved it?
Hi, thanks for such an all-around repo for working with 3DSG planning!
I would like to reproduce the benchmarking results in your repo under the benchmark folder to make sure everything runs properly before testing my own planners. However, during my testing, the planners behaved quite differently from what is reported.
As of 07/20/2023, I ran all available planners in `pddlgym_planners/__init__.py` with pddl_domain `taskographyv2tiny1` with the command `python scripts/benchmark/plan.py --domain-name $DOMAIN_NAME --planner $PLANNER`. The results are the following:
- FF: error while running
- FF-X: the same error as FF
- FD-lama-first: plan failure
- Cerberus-seq-sat: plan failure
- Cerberus-seq-agl: plan failure
- DecStar-agl-decoupled: plan failure
- lapkt-bfws: slightly different behavior than `benchmark/taskographyv2tiny1_bfws`. My result: reported in `benchmark/taskographyv2tiny1_bfws/taskographyv2tiny1_bfws_test.json`:
- FD-seq-opt-lmcut: plan failure
- Delfi: plan failure
- DecStar-opt-decoupled: plan failure

I followed the installation steps at https://github.com/taskography/taskography-api#installation with only a few changes to fix some errors:
0. Ubuntu 22.04.
1. Added a `,` at the end of line 26 of `taskography-api/setup.py` (at commit bcb47fc).
2. Ran both `pip install -e .` and `pip install -r requirements.txt`.
3. Changed `importlib-metadata` from 6.7.0 to 4.12.0 to avoid the error `'EntryPoints' object has no attribute 'get'`. Source: https://stackoverflow.com/questions/73929564/entrypoints-object-has-no-attribute-get-digital-ocean
4. Moved `from __future__ import annotations` to the first line to avoid the error `from __future__ imports must occur at the beginning of the file`. Source: https://stackoverflow.com/questions/38688504/from-future-imports-must-occur-at-the-beginning-of-the-file-what-defines
5. Ran `scripts/validate/loader.py` and `scripts/validate/taskography_env.py`; both pass.

I'm willing to offer more details if needed. I would highly appreciate any help, as a solid benchmark is a prerequisite for any future research. Thanks in advance!
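As a side note on the `'EntryPoints' object has no attribute 'get'` error mentioned above: pinning the package is one fix, and another is to read entry points in a way that works with both the older dict-style return value and the newer selectable one. The sketch below assumes you control the calling code and uses the standard-library `importlib.metadata`, not the `importlib-metadata` backport itself; the helper name `entry_points_for` is hypothetical, and this is not what the repository does.

```python
from importlib.metadata import entry_points


def entry_points_for(group):
    """Return the entry points registered under `group` as a list.

    Newer return types expose `.select(group=...)`; older ones behave
    like a dict keyed by group name, so `.get` is used as a fallback.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        return list(eps.select(group=group))
    return list(eps.get(group, []))


if __name__ == "__main__":
    for ep in entry_points_for("console_scripts"):
        print(ep.name, "->", ep.value)
```

The `hasattr` check keeps the helper independent of the exact installed version, at the cost of one extra branch.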