Quickstart seems broken #133

Open
koaning opened this issue Aug 19, 2024 · 3 comments



koaning commented Aug 19, 2024

I am trying to get the tutorial running locally but seem to hit an issue with the first cell block.

import ibis

con = ibis.connect("duckdb://nycflights13.ddb")
con.create_table(
    "flights", ibis.examples.nycflights13_flights.fetch().to_pyarrow(), overwrite=True
)
con.create_table(
    "weather", ibis.examples.nycflights13_weather.fetch().to_pyarrow(), overwrite=True
)

When I run it I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 5
      1 import ibis
      3 con = ibis.connect("duckdb://nycflights13.ddb")
      4 con.create_table(
----> 5     "flights", ibis.examples.nycflights13_flights.fetch().to_pyarrow(), overwrite=True
      6 )
      7 con.create_table(
      8     "weather", ibis.examples.nycflights13_weather.fetch().to_pyarrow(), overwrite=True
      9 )

File ~/Development/probabl/venv/lib/python3.11/site-packages/ibis/examples/__init__.py:45, in Example.fetch(self, table_name, backend)
     41     table_name = name
     43 board = _get_board()
---> 45 (path,) = board.pin_download(name)
     47 if backend.name in _DIRECT_BACKENDS:
     48     # Read directly into these backends. This helps reduce memory
     49     # usage, making the larger example datasets easier to work with.
     50     if path.endswith(".parquet"):

File ~/Development/probabl/venv/lib/python3.11/site-packages/pins/boards.py:394, in BaseBoard.pin_download(self, name, version, hash)
    376 def pin_download(self, name, version=None, hash=None) -> Sequence[str]:
    377     """Download the files contained in a pin.
    378 
    379     This method only downloads the files in a pin. In order to read and load
   (...)
    391 
    392     """
--> 394     meta = self.pin_fetch(name, version)
    396     if hash is not None:
    397         raise NotImplementedError("TODO: validate hash")

File ~/Development/probabl/venv/lib/python3.11/site-packages/pins/boards.py:188, in BaseBoard.pin_fetch(self, name, version)
    187 def pin_fetch(self, name: str, version: Optional[str] = None) -> Meta:
--> 188     meta = self.pin_meta(name, version)
    190     # TODO: sanity check caching (since R pins does a cache touch here)
    191     # path = self.construct_path([self.board, name, version])
    192     # self.fs.get(...)
   (...)
    195     #       need to ensure user can have a readable cache
    196     #       so they could pin_fetch and then examine the result, a la pin_download
    197     return meta

File ~/Development/probabl/venv/lib/python3.11/site-packages/pins/boards.py:151, in BaseBoard.pin_meta(self, name, version)
    148     selected_version = guess_version(version)
    149 else:
    150     # otherwise, get the last pin version
--> 151     versions = self.pin_versions(name, as_df=False)
    153     if not len(versions):
    154         raise NotImplementedError("TODO: sanity check when no versions")

File ~/Development/probabl/venv/lib/python3.11/site-packages/pins/boards.py:106, in BaseBoard.pin_versions(self, name, as_df)
    104 all_versions = []
    105 for full_path in versions_raw:
--> 106     version = self.keep_final_path_component(full_path)
    107     all_versions.append(guess_version(version))
    109 # sort them, with latest last

File ~/Development/probabl/venv/lib/python3.11/site-packages/pins/boards.py:635, in BaseBoard.keep_final_path_component(self, path)
    634 def keep_final_path_component(self, path):
--> 635     return path.split("/")[-1]

AttributeError: 'dict' object has no attribute 'split'
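
For anyone reading the traceback: the last frame calls `.split("/")` on a value that turned out to be a dict rather than a path string. The sketch below reproduces only that failure mode in isolation. The shape of the dict entry (a `"name"` key holding the path) and the example path are assumptions made for illustration, not something confirmed in this thread, and `final_component_tolerant` is a hypothetical helper, not part of pins.

```python
def keep_final_path_component(path):
    # Same one-liner as the last frame of the traceback (pins/boards.py:635).
    return path.split("/")[-1]


def final_component_tolerant(entry):
    # Hypothetical helper (not part of pins): accept either a plain path
    # string or a dict that carries the path under a "name" key.
    path = entry["name"] if isinstance(entry, dict) else entry
    return path.rstrip("/").split("/")[-1]


# Illustrative values only; the real listing entries are not shown in the issue.
str_entry = "board/flights/20240101T000000Z-abcde"
dict_entry = {"name": str_entry, "type": "directory"}  # assumed shape

print(keep_final_path_component(str_entry))

try:
    keep_final_path_component(dict_entry)
except AttributeError as exc:
    # A dict has no .split(), which matches the error class reported above.
    print(f"AttributeError: {exc}")

print(final_component_tolerant(dict_entry))
```

If the entries really are dicts in the failing environment, that would point at a mismatch between the installed pins and whatever produces the listing, which is why comparing installed versions is a reasonable next step.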

koaning commented Aug 19, 2024

These are my versions.

ibis-framework==9.3.0
ibis-ml==0.1.2


koaning commented Aug 19, 2024

If possible, I would recommend hosting a CSV on GitHub that one can just pull down locally. There seem to be many libraries involved between downloading the data and actually getting the tutorial working. I have had to install an extra dependency and update my SSL certificate, and I still cannot get the data I need to get started.

@deepyaman
Collaborator

Hey @koaning. Thanks for giving IbisML a try!

Based on your error message, it looks like you're running into an issue fetching the example data Ibis provides. Did you use the install command from the tutorial: pip install 'ibis-framework[duckdb,examples]' ibis-ml scikit-learn?

I just tried this on my end in a fresh 3.12 Conda environment, and I wasn't able to replicate your issue. This is what was installed for me:

Installing collected packages: pytz, appdirs, zipp, xxhash, urllib3, tzdata, typing-extensions, toolz, threadpoolctl, sqlglot, six, pyyaml, pygments, pyasn1, pyarrow-hotfix, protobuf, parsy, oauthlib, numpy, multidict, mdurl, MarkupSafe, joblib, importlib-resources, idna, humanize, google-crc32c, fsspec, frozenlist, duckdb, decorator, charset-normalizer, certifi, cachetools, attrs, atpublic, aiohappyeyeballs, yarl, scipy, rsa, requests, python-dateutil, pyasn1-modules, pyarrow, proto-plus, markdown-it-py, jinja2, importlib-metadata, googleapis-common-protos, google-resumable-media, aiosignal, scikit-learn, rich, requests-oauthlib, pandas, ibis-framework, google-auth, aiohttp, pins, ibis-ml, google-auth-oauthlib, google-api-core, google-cloud-core, google-cloud-storage, gcsfs
Successfully installed MarkupSafe-2.1.5 aiohappyeyeballs-2.3.7 aiohttp-3.10.4 aiosignal-1.3.1 appdirs-1.4.4 atpublic-5.0 attrs-24.2.0 cachetools-5.5.0 certifi-2024.7.4 charset-normalizer-3.3.2 decorator-5.1.1 duckdb-1.0.0 frozenlist-1.4.1 fsspec-2024.6.1 gcsfs-2024.6.1 google-api-core-2.19.1 google-auth-2.34.0 google-auth-oauthlib-1.2.1 google-cloud-core-2.4.1 google-cloud-storage-2.18.2 google-crc32c-1.5.0 google-resumable-media-2.7.2 googleapis-common-protos-1.63.2 humanize-4.10.0 ibis-framework-9.3.0 ibis-ml-0.1.2 idna-3.7 importlib-metadata-8.2.0 importlib-resources-6.4.3 jinja2-3.1.4 joblib-1.4.2 markdown-it-py-3.0.0 mdurl-0.1.2 multidict-6.0.5 numpy-2.1.0 oauthlib-3.2.2 pandas-2.2.2 parsy-2.1 pins-0.8.6 proto-plus-1.24.0 protobuf-5.27.3 pyarrow-17.0.0 pyarrow-hotfix-0.6 pyasn1-0.6.0 pyasn1-modules-0.4.0 pygments-2.18.0 python-dateutil-2.9.0.post0 pytz-2024.1 pyyaml-6.0.2 requests-2.32.3 requests-oauthlib-2.0.0 rich-13.7.1 rsa-4.9 scikit-learn-1.5.1 scipy-1.14.0 six-1.16.0 sqlglot-25.9.0 threadpoolctl-3.5.0 toolz-0.12.1 typing-extensions-4.12.2 tzdata-2024.1 urllib3-2.2.2 xxhash-3.5.0 yarl-1.9.4 zipp-3.20.0

(The most relevant ones should be duckdb-1.0.0, ibis-framework-9.3.0, ibis-ml-0.1.2, numpy-2.1.0, pandas-2.2.2, pins-0.8.6, pyarrow-17.0.0, pyarrow-hotfix-0.6, scikit-learn-1.5.1).
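
To compare a local environment against the list above, something like the following can be run in the same kernel as the notebook. It is a minimal sketch using only the standard library; the package names are the ones listed above, and `installed_versions` is a helper name made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the "most relevant" list above.
PACKAGES = [
    "duckdb",
    "ibis-framework",
    "ibis-ml",
    "numpy",
    "pandas",
    "pins",
    "pyarrow",
    "pyarrow-hotfix",
    "scikit-learn",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Any line that prints "not installed", or a version that differs from the list above, is a good candidate to mention in a follow-up comment.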

That said, I did notice that https://ibis-project.github.io/ibis-ml/#create-your-first-recipe could be more explicit about its requirements; I had to install a number of these packages afterward for that one, since it just says pip install ibis-ml.
