You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I was looking at Orbax code at the latest version==0.7.0 and found that pieces of code with quite heavy logic around tensorstore_spec creation seem to contain duplicates.
I'd like to know if this code duplication intended by design or I am welcome to submit a PR.
Would you consider get_tensorstore_spec to reuse build_kvstore_tspec under the hood?
Also, there seems to be a bit of obscurity with default ts_context value.
In orbax/checkpoint/serialization.py, there is TS_CONTEXT in public serialization.py that is used as a default value of context in async_serialize (actually, orbax does not use async_serialize anywhere and recommends using async_serialize_shards) , async_serialize_shards, async_deserialize and by StringHandler.
At the same time, intype_handlers.py, there is get_ts_context() (it references _DEFAULT_OCDBT_TS_CONTEXT), and get_ts_context is used by all other handler implementations.
So, TS_CONTEXT from serialization.py seems to be never used by common checkpoint IO code.
Should we somehow leave only 1 source of truth for default ts_context values?
The text was updated successfully, but these errors were encountered:
This code is currently in the process of a rework by @dicentra13. get_tensorstore_spec in serialization.py should be either removed or should reuse build_kvstore_tspec. I think it is only used in a couple places in internal code - just due slower progress in refactoring rather than any inherent need.
Again, async_serialize is used in one place internally that I'm working on eliminating, although probably will leave that function in place as a wrapper.
Agreed that ts_context should use one source of truth, that could use fixing.
Hi Orbax team,
I was looking at Orbax code at the latest version==0.7.0 and found that pieces of code with quite heavy logic around tensorstore_spec creation seem to contain duplicates.
I'd like to know if this code duplication intended by design or I am welcome to submit a PR.
Here
get_tensorstore_spec
is a part of public API, and I can't find any usage ofget_tensorstore_spec
byorbax-checkpoint
itselfhttps://github.com/google/orbax/blob/8b4e90d573082a5c7caa5f99c51db376f62a6995/checkpoint/orbax/checkpoint/serialization.py#L97C5-L124
And here is a very similar piece of code in
build_kvstore_tspec
in_internal
package, andbuild_kvstore_tspec
is used heavily bytype_handlers.py
orbax/checkpoint/orbax/checkpoint/_src/serialization/tensorstore_utils.py
Lines 62 to 135 in 8b4e90d
Would you consider
get_tensorstore_spec
to reusebuild_kvstore_tspec
under the hood?Also, there seems to be a bit of obscurity with default ts_context value.
orbax/checkpoint/serialization.py
, there is TS_CONTEXT in publicserialization.py
that is used as a default value ofcontext
in async_serialize (actually,orbax
does not useasync_serialize
anywhere and recommends usingasync_serialize_shards
) , async_serialize_shards, async_deserialize and by StringHandler.type_handlers.py
, there is get_ts_context() (it references_DEFAULT_OCDBT_TS_CONTEXT
), andget_ts_context
is used by all other handler implementations.So,
TS_CONTEXT
fromserialization.py
seems to be never used by common checkpoint IO code.Should we somehow leave only 1 source of truth for default ts_context values?
The text was updated successfully, but these errors were encountered: