
Panel notebook for tag data #26

Merged
merged 2 commits on May 21, 2024

Conversation

aderrien7
Contributor

The notebook can generate a panel for tag data by accessing them from the S3 bucket.
@annefou
Collaborator

annefou commented May 7, 2024

I am not sure why, but I am getting a permission error:

---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:113, in _error_wrapper(func, args, kwargs, retries)
    112 try:
--> 113     return await func(*args, **kwargs)
    114 except S3_RETRYABLE_ERRORS as e:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/aiobotocore/client.py:408, in AioBaseClient._make_api_call(self, operation_name, api_params)
    407     error_class = self.exceptions.from_code(error_code)
--> 408     raise error_class(parsed_response, operation_name)
    409 else:

ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

The above exception was the direct cause of the following exception:

PermissionError                           Traceback (most recent call last)
Cell In[4], line 11
      9 track_plot = pn.bind(plot_track,tag_id=tag_widget)
     10 emission_plot = pn.bind(plot_emission,tag_id=tag_widget)
---> 11 track_emission = pn.Row(time_plot,track_plot,emission_plot)
     13 #Combining plots with the widget
     14 plots = pn.Row(tag_widget,track_emission)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/layout/base.py:825, in ListPanel.__init__(self, *objects, **params)
    821     if 'objects' in params:
    822         raise ValueError("A %s's objects should be supplied either "
    823                          "as positional arguments or as a keyword, "
    824                          "not both." % type(self).__name__)
--> 825     params['objects'] = [panel(pane) for pane in objects]
    826 elif 'objects' in params:
    827     objects = params['objects']

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/layout/base.py:825, in <listcomp>(.0)
    821     if 'objects' in params:
    822         raise ValueError("A %s's objects should be supplied either "
    823                          "as positional arguments or as a keyword, "
    824                          "not both." % type(self).__name__)
--> 825     params['objects'] = [panel(pane) for pane in objects]
    826 elif 'objects' in params:
    827     objects = params['objects']

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/pane/base.py:87, in panel(obj, **kwargs)
     85 if kwargs.get('name', False) is None:
     86     kwargs.pop('name')
---> 87 pane = PaneBase.get_pane_type(obj, **kwargs)(obj, **kwargs)
     88 if len(pane.layout) == 1 and pane._unpack:
     89     return pane.layout[0]

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/param.py:815, in ParamRef.__init__(self, object, **params)
    813 self._validate_object()
    814 if not self.defer_load:
--> 815     self._replace_pane()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/param.py:883, in ParamRef._replace_pane(self, force, *args)
    881 else:
    882     try:
--> 883         new_object = self.eval(self.object)
    884         if new_object is Skip and new_object is Undefined:
    885             self._inner_layout.loading = False

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/param.py:1106, in ParamFunction.eval(self, ref)
   1104 @classmethod
   1105 def eval(self, ref):
-> 1106     return eval_function_with_deps(ref)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/param/parameterized.py:165, in eval_function_with_deps(function)
    163         args = (getattr(dep.owner, dep.name) for dep in arg_deps)
    164         kwargs = {n: getattr(dep.owner, dep.name) for n, dep in kw_deps.items()}
--> 165 return function(*args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/param/depends.py:53, in depends.<locals>._depends(*args, **kw)
     51 @wraps(func)
     52 def _depends(*args, **kw):
---> 53     return func(*args, **kw)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/param/reactive.py:594, in bind.<locals>.wrapped(*wargs, **wkwargs)
    591 @depends(**dependencies, watch=watch)
    592 def wrapped(*wargs, **wkwargs):
    593     combined_args, combined_kwargs = combine_arguments(wargs, wkwargs)
--> 594     return eval_fn()(*combined_args, **combined_kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/panel/io/cache.py:433, in cache.<locals>.wrapped_func(*args, **kwargs)
    431         func_cache[hash_value] = (ret, ts, count+1, time)
    432 else:
--> 433     ret = func(*args, **kwargs)
    434     with lock:
    435         func_cache[hash_value] = (ret, time, 0, time)

Cell In[3], line 7, in plot_time_series(plot_type, tag_id)
      2 @pn.cache
      3 
      4 # Functions to plot the different visualization for a given tag id
      5 def plot_time_series(plot_type="time series",tag_id="CB_A11071"):
      6     # load trajectories 
----> 7     trajectories = read_trajectories(track_modes,f"{scratch_root}/{tag_id}",storage_options, format="parquet")
      9     # Converting the trajectories to pandas DataFrames to access data easily
     10     mean_df = trajectories.trajectories[0].df

File ~/pangeo-fish/pangeo_fish/io.py:275, in read_trajectories(names, root, storage_options, format)
    272 if reader is None:
    273     raise ValueError(f"unknown format: {format}")
--> 275 return mpd.TrajectoryCollection([reader(root, name) for name in names])

File ~/pangeo-fish/pangeo_fish/io.py:275, in <listcomp>(.0)
    272 if reader is None:
    273     raise ValueError(f"unknown format: {format}")
--> 275 return mpd.TrajectoryCollection([reader(root, name) for name in names])

File ~/pangeo-fish/pangeo_fish/io.py:261, in read_trajectories.<locals>.read_parquet(root, name)
    258 def read_parquet(root, name):
    259     path = f"{root}/{name}.parquet"
--> 261     df = pd.read_parquet(path,
    262                          storage_options=storage_options)
    264     return mpd.Trajectory(df, name, x="longitude", y="latitude")

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pandas/io/parquet.py:667, in read_parquet(path, engine, columns, storage_options, use_nullable_dtypes, dtype_backend, filesystem, filters, **kwargs)
    664     use_nullable_dtypes = False
    665 check_dtype_backend(dtype_backend)
--> 667 return impl.read(
    668     path,
    669     columns=columns,
    670     filters=filters,
    671     storage_options=storage_options,
    672     use_nullable_dtypes=use_nullable_dtypes,
    673     dtype_backend=dtype_backend,
    674     filesystem=filesystem,
    675     **kwargs,
    676 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pandas/io/parquet.py:274, in PyArrowImpl.read(self, path, columns, filters, use_nullable_dtypes, dtype_backend, storage_options, filesystem, **kwargs)
    267 path_or_handle, handles, filesystem = _get_path_or_handle(
    268     path,
    269     filesystem,
    270     storage_options=storage_options,
    271     mode="rb",
    272 )
    273 try:
--> 274     pa_table = self.api.parquet.read_table(
    275         path_or_handle,
    276         columns=columns,
    277         filesystem=filesystem,
    278         filters=filters,
    279         **kwargs,
    280     )
    281     result = pa_table.to_pandas(**to_pandas_kwargs)
    283     if manager == "array":

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/parquet/core.py:1776, in read_table(source, columns, use_threads, schema, use_pandas_metadata, read_dictionary, memory_map, buffer_size, partitioning, filesystem, filters, use_legacy_dataset, ignore_prefixes, pre_buffer, coerce_int96_timestamp_unit, decryption_properties, thrift_string_size_limit, thrift_container_size_limit, page_checksum_verification)
   1770     warnings.warn(
   1771         "Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 "
   1772         "and will be removed in a future version.",
   1773         FutureWarning, stacklevel=2)
   1775 try:
-> 1776     dataset = ParquetDataset(
   1777         source,
   1778         schema=schema,
   1779         filesystem=filesystem,
   1780         partitioning=partitioning,
   1781         memory_map=memory_map,
   1782         read_dictionary=read_dictionary,
   1783         buffer_size=buffer_size,
   1784         filters=filters,
   1785         ignore_prefixes=ignore_prefixes,
   1786         pre_buffer=pre_buffer,
   1787         coerce_int96_timestamp_unit=coerce_int96_timestamp_unit,
   1788         thrift_string_size_limit=thrift_string_size_limit,
   1789         thrift_container_size_limit=thrift_container_size_limit,
   1790         page_checksum_verification=page_checksum_verification,
   1791     )
   1792 except ImportError:
   1793     # fall back on ParquetFile for simple cases when pyarrow.dataset
   1794     # module is not available
   1795     if filters is not None:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/parquet/core.py:1329, in ParquetDataset.__init__(self, path_or_paths, filesystem, schema, filters, read_dictionary, memory_map, buffer_size, partitioning, ignore_prefixes, pre_buffer, coerce_int96_timestamp_unit, decryption_properties, thrift_string_size_limit, thrift_container_size_limit, page_checksum_verification, use_legacy_dataset)
   1327     except ValueError:
   1328         filesystem = LocalFileSystem(use_mmap=memory_map)
-> 1329 finfo = filesystem.get_file_info(path_or_paths)
   1330 if finfo.is_file:
   1331     single_file = path_or_paths

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/_fs.pyx:581, in pyarrow._fs.FileSystem.get_file_info()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/error.pxi:154, in pyarrow.lib.pyarrow_internal_check_status()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/error.pxi:88, in pyarrow.lib.check_status()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/_fs.pyx:1501, in pyarrow._fs._cb_get_file_info()

File /srv/conda/envs/notebook/lib/python3.11/site-packages/pyarrow/fs.py:335, in FSSpecHandler.get_file_info(self, paths)
    333 for path in paths:
    334     try:
--> 335         info = self.fs.info(path)
    336     except FileNotFoundError:
    337         infos.append(FileInfo(path, FileType.NotFound))

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    115 @functools.wraps(func)
    116 def wrapper(*args, **kwargs):
    117     self = obj or args[0]
--> 118     return sync(self.loop, func, *args, **kwargs)

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
    101     raise FSTimeoutError from return_result
    102 elif isinstance(return_result, BaseException):
--> 103     raise return_result
    104 else:
    105     return return_result

File /srv/conda/envs/notebook/lib/python3.11/site-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
     54     coro = asyncio.wait_for(coro, timeout=timeout)
     55 try:
---> 56     result[0] = await coro
     57 except Exception as ex:
     58     result[0] = ex

File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:1371, in S3FileSystem._info(self, path, bucket, key, refresh, version_id)
   1369 if key:
   1370     try:
-> 1371         out = await self._call_s3(
   1372             "head_object",
   1373             self.kwargs,
   1374             Bucket=bucket,
   1375             Key=key,
   1376             **version_id_kw(version_id),
   1377             **self.req_kw,
   1378         )
   1379         return {
   1380             "ETag": out.get("ETag", ""),
   1381             "LastModified": out.get("LastModified", ""),
   (...)
   1387             "ContentType": out.get("ContentType"),
   1388         }
   1389     except FileNotFoundError:

File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:362, in S3FileSystem._call_s3(self, method, *akwarglist, **kwargs)
    360 logger.debug("CALL: %s - %s - %s", method.__name__, akwarglist, kw2)
    361 additional_kwargs = self._get_s3_method_kwargs(method, *akwarglist, **kwargs)
--> 362 return await _error_wrapper(
    363     method, kwargs=additional_kwargs, retries=self.retries
    364 )

File /srv/conda/envs/notebook/lib/python3.11/site-packages/s3fs/core.py:142, in _error_wrapper(func, args, kwargs, retries)
    140         err = e
    141 err = translate_boto_error(err)
--> 142 raise err

PermissionError: Forbidden

@aderrien7
Contributor Author

It looks like an issue with the permissions indeed. Can you access the gfts-ifremer bucket? The tag data are stored there.

@annefou
Collaborator

annefou commented May 7, 2024

I can list the bucket, for instance:

s3.ls("gfts-ifremer/tags/tracks/AD_A11177")

So I am not sure what permission is required.
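For context on why listing succeeds while reading fails: `s3.ls` only needs the bucket-level `s3:ListBucket` permission (it issues a ListObjectsV2 call), whereas opening a file triggers a HeadObject/GetObject call on the key itself, which needs the object-level `s3:GetObject` permission; that is exactly the call returning 403 in the traceback above. A minimal sketch of that mapping (the helper is illustrative, not part of the notebook):

```python
# Map the S3 API operations exercised in this thread to the IAM action
# each one requires. s3.ls(...) worked, so ListBucket is granted;
# HeadObject returned 403, so GetObject is what is missing.
REQUIRED_ACTION = {
    "ListObjectsV2": "s3:ListBucket",  # bucket-level permission
    "HeadObject": "s3:GetObject",      # object-level permission
    "GetObject": "s3:GetObject",
}

def required_action(operation: str) -> str:
    """Return the IAM action an S3 API operation needs."""
    return REQUIRED_ACTION[operation]
```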

@aderrien7
Contributor Author

aderrien7 commented May 7, 2024

Could you try this to open just this one file, to see if you can access it:

import pandas as pd
pd.read_csv(s3.open("gfts-ifremer/tags/cleaned/AD_A11177/dst.csv"))

@annefou
Collaborator

annefou commented May 7, 2024

Could you try this to open just this one file, to see if you can access it:

import pandas as pd
pd.read_csv(s3.open("gfts-ifremer/tags/cleaned/AD_A11177/dst.csv"))

Yes, I tried, and as I said, I can list files but I cannot read/access them (I get permission denied).

@annefou
Collaborator

annefou commented May 7, 2024

@minrk should we all be added as ifremer users to access the data shared by ifremer?

@minrk
Collaborator

minrk commented May 8, 2024

You do have permission on the bucket (try reading gfts-ifremer/test.txt), but I think these objects have been created with more restricted permissions than the default for the bucket.

I can try to change the bucket policy to fix it, but in the meantime, I think the creator of the file can change the acl on the uploaded object.
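For objects that were already uploaded, s3fs provides `chmod(path, acl=...)` to set a canned ACL on an existing object after the fact. A minimal sketch; the helper, the path, and the choice of ACL below are illustrative assumptions, since the right ACL depends on the bucket policy:

```python
# Sketch: repair read permissions on objects already uploaded to the bucket.
# s3fs exposes chmod(path, acl=...), which sets a "canned ACL" on an object;
# treat "bucket-owner-full-control" below as an assumption, not the
# project's actual setting.

def pick_acl(shared_with_other_bucket_users: bool) -> str:
    """Choose a canned ACL: objects meant to be read by others need more than 'private'."""
    return "bucket-owner-full-control" if shared_with_other_bucket_users else "private"

# Usage (requires s3fs and credentials for the bucket):
#   import s3fs
#   s3 = s3fs.S3FileSystem(anon=False)
#   s3.chmod("gfts-ifremer/tags/cleaned/AD_A11177/dst.csv",
#            acl=pick_acl(shared_with_other_bucket_users=True))
```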

@tinaok
Collaborator

tinaok commented May 13, 2024

You do have permission on the bucket (try reading gfts-ifremer/test.txt), but I think these objects have been created with more restricted permissions than the default for the bucket.

I can try to change the bucket policy to fix it, but in the meantime, I think the creator of the file can change the acl on the uploaded object.

We used a normal xarray to_zarr and didn't do anything special. We do not know what option we should pass so that other people can read the data.
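For reference, one way to make future writes readable by others is to pass an ACL through fsspec's `storage_options`: s3fs forwards the `s3_additional_kwargs` dict to each S3 API call, so the same options work for xarray's `to_zarr` and pandas' `to_parquet`. The ACL value below is an assumption and depends on the bucket policy:

```python
# Sketch: attach a canned ACL to every object s3fs writes, so that new
# uploads are readable by other bucket users without per-object fixes.

def sharing_storage_options(acl: str = "bucket-owner-full-control") -> dict:
    """Build fsspec storage_options that apply a canned ACL to every PUT."""
    return {"s3_additional_kwargs": {"ACL": acl}}

# Usage (requires credentials for the bucket; path is illustrative):
#   ds.to_zarr("s3://gfts-ifremer/tags/some_dataset.zarr",
#              storage_options=sharing_storage_options())
```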

@minrk
Collaborator

minrk commented May 13, 2024

I believe I've fixed this with user policies in #30

@aderrien7
Contributor Author

Hi @annefou, I've just returned from vacation. From what I understand, you were unable to access the files because of the permissions applied when I placed them in the bucket. Have you managed to overcome this issue since then?

@annefou
Collaborator

annefou commented May 21, 2024

Hi @annefou, I've just returned from vacation. From what I understand, you were unable to access the files because of the permissions applied when I placed them in the bucket. Have you managed to overcome this issue since then?

Min fixed the permission issues.

I just fixed the order of the Python imports (from pre-commit).

The panel does not show up in JupyterHub. Does it work for you?

@aderrien7
Contributor Author

Hi @annefou, I've just returned from vacation. From what I understand, you were unable to access the files because of the permissions applied when I placed them in the bucket. Have you managed to overcome this issue since then?

Min fixed the permission issues.

I just fixed the order of the Python imports (from pre-commit).

The panel does not show up in JupyterHub. Does it work for you?

Yes, it was working using the JupyterHub here: https://gfts.minrk.net
I believe you are using it there too?

@annefou
Collaborator

annefou commented May 21, 2024

Yes, it was working using the JupyterHub here: https://gfts.minrk.net I believe you are using it there too?

OK. It did not work for me, but I think it is good to go.

@minrk minrk merged commit 8317a11 into destination-earth:main May 21, 2024
1 check passed