Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-37340: [MATLAB] The column(index) method of arrow.tabular.RecordBatch errors if index refers to an arrow.array.Time32Array column #37347

Merged
merged 9 commits into from
Aug 23, 2023

Conversation

sgilmore10
Copy link
Member

@sgilmore10 sgilmore10 commented Aug 23, 2023

Rationale for this change

The column(index) method of arrow.tabular.RecordBatch errors if index refers to an arrow.array.Time32Array column.

>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);

column(index) is throwing this error because the case for Time32 was not added to arrow::matlab::array::proxy::wrap(arrowArray) in #37315. column(index) calls this function to create proxy objects around existing arrow arrays. Adding the case for Time32 will resolve this bug.

What changes are included in this PR?

  1. Updated arrow::Result<proxy::Array> wrap(const std::shared_ptr<arrow::Array>& array) to handle wrapping arrow::Time32Array instances within proxy::Time32Arrays.
  2. Added a new test utility called arrow.internal.test.tabular.createAllSupportedArrayTypes, which returns two cell arrays: arrowArrays and matlabData. The arrowArrays cell array contains one instance of each concrete subclass of arrow.array.Array. The matlabData cell array contains the MATLAB arrays used to generate each array in arrowArrays. This utility can be used to create an arrow.array.RecordBatch that contains all the arrow array types that are supported by the MATLAB interface.

Are these changes tested?

Yes. Updated the fromArrays test cases in tRecordBatch to verify the column(index) method of arrow.type.RecordBatch supports extracting columns of any arrow type (supported by the MATLAB Interface).

Are there any user-facing changes?

Yes. Fixed a bug in arrow.tabular.RecordBatch.

Future Directions

  1. [MATLAB] Add function handle to fromMATLAB static construction methods to TypeTraits classes #37345

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Aug 23, 2023
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Aug 23, 2023
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Aug 23, 2023
@kevingurney
Copy link
Member

+1

@kevingurney kevingurney merged commit 91a4927 into apache:main Aug 23, 2023
9 checks passed
@kevingurney kevingurney deleted the GH-37340 branch August 23, 2023 21:14
@kevingurney kevingurney removed the awaiting merge Awaiting merge label Aug 23, 2023
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 91a4927.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them.

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
….RecordBatch` errors if `index` refers to an `arrow.array.Time32Array` column (apache#37347)

### Rationale for this change

The `column(index)` method of `arrow.tabular.RecordBatch` errors  if `index` refers to an `arrow.array.Time32Array` column.

```matlab
>> string = arrow.array(["A", "B", "C"]);
>> time32 = arrow.array.Time32Array.fromMATLAB(seconds(1:3));
>> rb = arrow.tabular.RecordBatch.fromArrays(string, time32, ColumnNames=["String", "Time32"])

rb = 

String:   [
    "A",
    "B",
    "C"
  ]
Time32:   [
    00:00:01,
    00:00:02,
    00:00:03
  ]

>> time32Column = rb.column(2)

Error using  . 
Unsupported DataType: time32[s]

Error in arrow.tabular.RecordBatch/column (line 63)
            [proxyID, typeID] = obj.Proxy.getColumnByIndex(args);
```

`column(index)` is throwing this error because the case for `Time32` was not added to `arrow::matlab::array::proxy::wrap(arrowArray)` in apache#37315. `column(index)` calls this function to create proxy objects around existing arrow arrays. Adding the case for `Time32` will resolve this bug.

### What changes are included in this PR?

1. Updated `arrow::Result<proxy::Array> wrap(const std::shared_ptr<arrow::Array>& array)` to handle wrapping `arrow::Time32Array` instances within `proxy::Time32Array`s.
2. Added a new test utility called `arrow.internal.test.tabular.createAllSupportedArrayTypes`, which returns two cell arrays: `arrowArrays` and `matlabData`. The `arrowArrays` cell array contains one instance of each concrete subclass of `arrow.array.Array`. The `matlabData` cell array contains the MATLAB arrays used to generate each array in `arrowArrays`. This utility can be used to create an `arrow.array.RecordBatch` that contains all the arrow array types that are supported by the MATLAB interface. 

### Are these changes tested?

Yes. Updated the `fromArrays` test cases in `tRecordBatch` to verify the `column(index)` method of `arrow.type.RecordBatch` supports extracting columns of any arrow type (supported by the MATLAB Interface).

### Are there any user-facing changes?

Yes. Fixed a bug in `arrow.tabular.RecordBatch`.

### Future Directions

1. apache#37345

* Closes: apache#37340

Authored-by: Sarah Gilmore <[email protected]>
Signed-off-by: Kevin Gurney <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants