Thank you for this repo, which saved me time doing a quick analysis of CVPR24.
I would like to give back to the community, so I made a first draft of what paperlists could produce in the Hugging Face datasets format: https://huggingface.co/datasets/hunoutl/paperlists
It is a raw dataset. I kept most of the keys that apply to all of the papers. Do you have an idea of how we could standardize everything?
For now I have simple merging code; I will try to find time to clean it up and share it.
I had also generated synthetic data for CVPR, using an LLM to complete missing information and add new fields (country of affiliation). I'm thinking of doing this for all the papers in the future.
Thanks for using this papercopilot/paperlist repository. It's great to have it hosted on Hugging Face (huggingface/papercopilot) as well. I actually started an organization on Hugging Face, but I haven't posted anything there yet, lol.
I've also spent some time thinking about whether we can standardize everything during development, and I believe we can. This paper list is powered by papercopilot/paperbot and is currently organized into modules by conference.
I used to put all conference papers into a large data table and use the title as the key. However, there's a chance that papers could share the same title, making it difficult to identify missing papers. Therefore, I split them from the big standard output into smaller shards.
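The title-collision problem described above could be avoided with a composite key instead of the bare title. This is only an illustrative sketch; the field names (`id`, `author`) and the key layout are assumptions, not the repository's actual schema.

```python
def paper_key(paper, venue):
    """Build a collision-safe index key from venue, title, and a per-paper id.

    Falls back to a hash of the author string when no site id exists
    (the "id" and "author" field names are hypothetical).
    """
    pid = paper.get("id") or hash(paper.get("author", ""))
    return f"{venue}/{paper['title']}/{pid}"

papers = [
    {"title": "Deep Nets", "id": "p101"},
    {"title": "Deep Nets", "id": "p202"},  # same title, different paper
]
index = {paper_key(p, "cvpr2024"): p for p in papers}
```

With a key like this, two papers sharing a title no longer overwrite each other, which also makes missing papers easier to spot when diffing shards.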
Still, it would be good to have a standardized output function in paperbot at export time, to make it an easy-to-use tool.