possibly confusing error message nd.array fails to encode np.arrays in lists + use case: ctbns #372

Open
mpacer opened this issue Sep 28, 2015 · 2 comments

Comments

@mpacer
Contributor

mpacer commented Sep 28, 2015

Just testing out the functionality for moving from np.arrays to nd.arrays and I'm surprised that the following fails:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-cb56525526cf> in <module>()
----> 1 nd.array([np.random.random(size=[2,2])])

dynd/nd/array.pyx in dynd.nd.array.array.__init__ (/Users/cocosci/Dropbox/Work/Resources/Repos/dynd-python/build/temp.macosx-10.10-x86_64-3.4/array.cxx:1390)()

TypeError: only length-1 arrays can be converted to Python scalars

Since the np.array has all of the striding data encoded internally, and np.arrays are just the non-ragged case of an nd.array, I feel like this should work. I imagine the issue is that it's ambiguous whether you want a 2 × 2 nd.array or a 1 × 1 nd.array with object type 2 × 2.

Even so, if that is the issue, the error should say something about that ambiguity.
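For illustration, the two readings can be spelled out in plain NumPy. This is only a sketch of the ambiguity, not of how dynd resolves it; the names `stacked` and `boxed` are made up for the example.

```python
import numpy as np

m = np.random.random(size=(2, 2))

# Reading 1: the enclosing list adds a leading dimension,
# giving one numeric array of shape (1, 2, 2).
stacked = np.array([m])

# Reading 2: a length-1 container whose single element is
# itself a 2 x 2 matrix (NumPy can only express this as dtype=object).
boxed = np.empty(1, dtype=object)
boxed[0] = m

print(stacked.shape, stacked.dtype)
print(boxed.shape, boxed.dtype, boxed[0].shape)
```

In the second reading NumPy only knows the elements are "objects"; the 2 × 2 shape lives on the element, not in the array's type.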

For a use case, consider many clusters of continuous-time Markov processes, where the nodes within each cluster are tightly coupled (locally ergodic) and there are relatively few links between clusters. One might (in the vein of ctbns) want to define conditional intensity matrices for different values of discrete nodes, which are most easily stored as an n × n matrix associated with each cluster, and then define the large-scale dependencies between clusters at a higher order of abstraction.

It would seem that the ragged array approach would work well for this: an nd.array with k top-level dimensions that describe the "out-going" state info from each cluster, and then inside each of the k clusters an n_i × n_i matrix defining the internal dynamics of that cluster, where n_i can be different for each i.
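A rough sketch of that layout using a NumPy object array as a stand-in for a ragged nd.array. Everything here is an assumption for illustration: `random_cim` is a hypothetical helper, the cluster sizes are arbitrary, and the matrices are random intensity matrices (non-negative off-diagonals, rows summing to zero) rather than anything fitted to a real ctbn.

```python
import numpy as np

def random_cim(n, rng):
    """Hypothetical helper: random n x n intensity matrix whose rows sum to zero."""
    q = rng.random((n, n))
    np.fill_diagonal(q, 0.0)             # clear the diagonal first
    np.fill_diagonal(q, -q.sum(axis=1))  # diagonal = minus the off-diagonal row sum
    return q

rng = np.random.default_rng(0)
sizes = [2, 3, 4]  # n_i for each of k = 3 clusters (illustrative values)

# One slot per cluster; each slot holds that cluster's n_i x n_i matrix.
clusters = np.empty(len(sizes), dtype=object)
for i, n in enumerate(sizes):
    clusters[i] = random_cim(n, rng)

shapes = [c.shape for c in clusters]
print(clusters.shape, shapes)
```

The per-cluster shapes are recoverable only by inspecting each element; the container itself just reports shape (3,), which is the gap a typed ragged array would fill.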

@mwiebe
Member

mwiebe commented Oct 12, 2015

I suspect you want this to become a 3 dimensional array with shape (1, 2, 2), just as it does in NumPy?

In [2]: np.array([np.random.random(size=[2,2])])
Out[2]: 
array([[[ 0.89883947,  0.75418759],
        [ 0.53712153,  0.15815001]]])

I agree this error message is not very helpful; we should do better.

@mpacer
Contributor Author

mpacer commented Oct 12, 2015

Well, where dynd would come in handy is in capturing the shapes of the conditional intensity matrices as explicit datatypes:

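import numpy as np

test1 = np.array([np.random.random(size=[2,2])])
# dtype=object is required for ragged input on recent NumPy; older versions inferred it
test2 = np.array([np.random.random(size=[2,2]), np.random.random(size=[3,3])], dtype=object)
test1.shape, test2.shape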

which outputs:

((1, 2, 2), (2,))

This loses all the information about the underlying element shapes once you have two matrix elements with different shapes.

To be fair, I'm not sure if this would be the right way to implement the conditional intensity matrices for ctbns (I need to look more closely at how they were originally implemented, and had been trying to do it de novo seeing if I could make them work nicely with the nd.array directly), but it was a place where I figured the more free data-type structure could be useful.

I just ended up stymied immediately upon trying to explore whether that would work well and was a bit surprised at the error message.
