This repository has been archived by the owner on Jun 30, 2022. It is now read-only.
Hi everyone, I've been meaning to write this issue for a while but ended up forgetting. As you probably noticed, development of RockHound has pretty much stopped. The main reason is that the way we were going about adding datasets had several fatal flaws that made development unsustainable:
No scope definition. We never defined a scope for the project, so we had no way of deciding which datasets would fit. This is mostly because we didn't know what the scope should be and thought we could just add everything here with no problem.
CI and gallery infrastructure doesn't scale. As we started adding more datasets, our setup of downloading everything on CI to run the tests and build the documentation started breaking. This is clearly not feasible on free CI and would only get worse as the project grew.
Lack of open licenses. Initially, the goal was to download and read into pandas/xarray all of the data and models that exist without clear licenses and in terrible custom formats (like CRUST1.0). This works fine for small files, but as soon as the pre-processing gets more involved, this solution becomes difficult to maintain. The best way would be to repackage the datasets into a format that can be easily loaded with pandas/xarray, but since these data have no license, we can't do that.
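To make the repackaging idea concrete, here is a minimal sketch. It is not RockHound code: the file layout, the column names, and the `read_custom`/`repackage` helpers are all hypothetical, and it assumes a small whitespace-delimited file with `#` header lines. The point is only that the custom parsing happens once, after which the data loads with a plain `pandas.read_csv` call.

```python
import io

import pandas as pd

# Hypothetical custom-format content (NOT a real dataset): comment header
# lines followed by whitespace-separated columns.
RAW = """\
# hypothetical model file
# lon lat thickness_km
-10.5 20.5 35.2
-9.5 20.5 36.0
-8.5 20.5 34.7
"""


def read_custom(text):
    """Parse the hypothetical custom format into a DataFrame."""
    return pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",      # columns are separated by runs of whitespace
        comment="#",     # skip the custom header lines
        header=None,
        names=["longitude", "latitude", "thickness_km"],
    )


def repackage(text):
    """Return the same data as plain CSV text with a proper header row."""
    return read_custom(text).to_csv(index=False)


# After repackaging, loading needs no dataset-specific code at all.
csv_text = repackage(RAW)
reloaded = pd.read_csv(io.StringIO(csv_text))
```

Redistributing a file produced this way is exactly the step that requires an open license on the original data.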
So development efforts slowly fizzled out because of these mounting problems that we had no good way of fixing. A better approach is to build separate packages for specific datasets, which avoids the CI/growth issue because each package only has to download and test its own data. So if anyone wants to pick that up, please feel free to use whatever code we have that helps you do that 🙂
For the future, we decided to shift our focus to curating a set of data that has open licenses and serves as good sample data for tutorials and documentation. Our packages already include sample data like this, but we wanted to consolidate it into a single package that can be shared between them. This effort has a much tighter scope, so it's easier to know if we're going in the right direction.
Sorry for the delay in this message and thank you to everyone who contributed to RockHound! We will be closing most PRs and Issues and putting a notice on the README linking to this issue.
We're gathering and packaging these datasets in https://github.com/fatiando-data and have created the Ensaio package to load them.