apacheGH-41771: [C++] Iterator releases its resource immediately when it reads all values (apache#41824)

### Rationale for this change

`Iterator` keeps its resource (`ptr_`) until it is destroyed, but the resource can be released as soon as all values have been read. Holding it until destruction may block closing a file; see apacheGH-41771 for such a case.

### What changes are included in this PR?

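`Next()` now releases `ptr_` as soon as it returns the end of the iteration. After the release, and on a default-constructed `Iterator`, `Next()` returns `IterationTraits<T>::End()`.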
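
The release-on-end behavior can be illustrated outside Arrow with a minimal sketch. Everything here is an assumption for illustration: `MiniIterator` and `FlaggedSource` are hypothetical names, `std::nullopt` stands in for `IterationTraits<T>::End()`, and error handling through `Result<T>` is left out. It is not the code in `iterator.h`, only the same idea in standard C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical source that sets a flag when it is destroyed, so callers can
// observe the moment its resources are released.
class FlaggedSource {
 public:
  FlaggedSource(std::vector<int> values, bool* released)
      : values_(std::move(values)), released_(released) {}
  FlaggedSource(FlaggedSource&& other) noexcept
      : values_(std::move(other.values_)), i_(other.i_), released_(other.released_) {
    other.released_ = nullptr;  // the moved-from object must not report
  }
  ~FlaggedSource() {
    if (released_) *released_ = true;
  }
  std::optional<int> Next() {
    if (i_ == values_.size()) return std::nullopt;  // end of sequence
    return values_[i_++];
  }

 private:
  std::vector<int> values_;
  std::size_t i_ = 0;
  bool* released_;
};

// Hypothetical type-erased iterator over int; NOT arrow::Iterator.
class MiniIterator {
 public:
  // Default-constructed: owns nothing, so Next() reports the end.
  MiniIterator() : ptr_(nullptr, &NoDelete), next_(nullptr) {}

  template <typename Wrapped>
  explicit MiniIterator(Wrapped wrapped)
      : ptr_(new Wrapped(std::move(wrapped)), &Destroy<Wrapped>),
        next_(&CallNext<Wrapped>) {}

  std::optional<int> Next() {
    if (!ptr_) return std::nullopt;  // already released, or never set
    std::optional<int> result = next_(ptr_.get());
    if (!result) ptr_.reset();  // end reached: destroy the wrapped object now
    return result;
  }

 private:
  static void NoDelete(void*) {}
  template <typename Wrapped>
  static void Destroy(void* p) {
    delete static_cast<Wrapped*>(p);
  }
  template <typename Wrapped>
  static std::optional<int> CallNext(void* p) {
    return static_cast<Wrapped*>(p)->Next();
  }

  std::unique_ptr<void, void (*)(void*)> ptr_;
  std::optional<int> (*next_)(void*);
};

// Usage: returns true if the source is released exactly when the end is
// reported, while the MiniIterator itself is still alive.
inline bool ReleasedExactlyAtEnd() {
  bool released = false;
  MiniIterator it{FlaggedSource({1, 2}, &released)};
  it.Next();
  it.Next();
  const bool released_early = released;  // both values read, end not yet seen
  it.Next();                             // reports the end and releases
  return !released_early && released;
}
```

In this sketch the wrapped object is destroyed on the call that first reports the end, so a caller that keeps the iterator in scope no longer keeps the underlying resource alive. Later calls take the early-return branch and never touch the released pointer.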

### Are these changes tested?

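Yes. `TestIterator.DeleteOnEnd` checks that the wrapped iterator is destroyed as soon as `Next()` returns the end.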

### Are there any user-facing changes?

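Yes. The wrapped resource is now destroyed when `Next()` first returns the end rather than when the `Iterator` is destroyed, and calling `Next()` on a default-constructed `Iterator` now returns the end instead of being undefined behavior.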
* GitHub Issue: apache#41771

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Benjamin Kietzman <[email protected]>
kou authored May 28, 2024
1 parent fe2d926 commit e6e00e7
Showing 2 changed files with 55 additions and 3 deletions.
15 changes: 12 additions & 3 deletions cpp/src/arrow/util/iterator.h
@@ -105,9 +105,18 @@ class Iterator : public util::EqualityComparable<Iterator<T>> {
   Iterator() : ptr_(NULLPTR, [](void*) {}) {}

   /// \brief Return the next element of the sequence, IterationTraits<T>::End() when the
-  /// iteration is completed. Calling this on a default constructed Iterator
-  /// will result in undefined behavior.
-  Result<T> Next() { return next_(ptr_.get()); }
+  /// iteration is completed.
+  Result<T> Next() {
+    if (ptr_) {
+      auto next_result = next_(ptr_.get());
+      if (next_result.ok() && IsIterationEnd(next_result.ValueUnsafe())) {
+        ptr_.reset(NULLPTR);
+      }
+      return next_result;
+    } else {
+      return IterationTraits<T>::End();
+    }
+  }

   /// Pass each element of the sequence to a visitor. Will return any error status
   /// returned by the visitor, terminating iteration.
43 changes: 43 additions & 0 deletions cpp/src/arrow/util/iterator_test.cc
@@ -146,6 +146,49 @@ void AssertIteratorNext(T expected, Iterator<T>& it) {
   ASSERT_EQ(expected, actual);
 }

+template <typename T>
+class DeleteDetectableIterator {
+ public:
+  explicit DeleteDetectableIterator(std::vector<T> values, bool* deleted)
+      : values_(std::move(values)), i_(0), deleted_(deleted) {}
+
+  DeleteDetectableIterator(DeleteDetectableIterator&& source)
+      : values_(std::move(source.values_)), i_(source.i_), deleted_(source.deleted_) {
+    source.deleted_ = nullptr;
+  }
+
+  ~DeleteDetectableIterator() {
+    if (deleted_) {
+      *deleted_ = true;
+    }
+  }
+
+  Result<T> Next() {
+    if (i_ == values_.size()) {
+      return IterationTraits<T>::End();
+    }
+    return std::move(values_[i_++]);
+  }
+
+ private:
+  std::vector<T> values_;
+  size_t i_;
+  bool* deleted_;
+};
+
+// Generic iterator tests
+
+TEST(TestIterator, DeleteOnEnd) {
+  bool deleted = false;
+  Iterator<TestInt> it(DeleteDetectableIterator<TestInt>({1}, &deleted));
+  ASSERT_FALSE(deleted);
+  AssertIteratorNext({1}, it);
+  ASSERT_FALSE(deleted);
+  ASSERT_OK_AND_ASSIGN(auto value, it.Next());
+  ASSERT_TRUE(IsIterationEnd(value));
+  ASSERT_TRUE(deleted);
+}
+
 // --------------------------------------------------------------------
 // Synchronous iterator tests
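
The move constructor of `DeleteDetectableIterator` clears `source.deleted_`, which seems to be what keeps the moved-from temporary from setting the flag when it dies. The following standalone sketch shows that pattern in isolation. `DestructionProbe`, `ProbeObservation`, and `ObserveProbe` are hypothetical names introduced only for this illustration and are not part of the Arrow test suite.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical probe: reports its destruction through a caller-owned flag.
class DestructionProbe {
 public:
  explicit DestructionProbe(bool* destroyed) : destroyed_(destroyed) {}

  // Moving transfers the reporting duty; the moved-from shell stays silent.
  DestructionProbe(DestructionProbe&& other) noexcept
      : destroyed_(other.destroyed_) {
    other.destroyed_ = nullptr;
  }

  ~DestructionProbe() {
    if (destroyed_) *destroyed_ = true;
  }

 private:
  bool* destroyed_;
};

struct ProbeObservation {
  bool after_move;         // flag once the moved-from temporary has died
  bool after_owner_reset;  // flag once the final owner has been destroyed
};

inline ProbeObservation ObserveProbe() {
  bool destroyed = false;
  std::unique_ptr<DestructionProbe> owner;
  {
    DestructionProbe temp(&destroyed);
    owner = std::make_unique<DestructionProbe>(std::move(temp));
  }  // temp is destroyed here, but it no longer points at the flag
  const bool after_move = destroyed;
  owner.reset();  // the object that now carries the flag is destroyed
  return {after_move, destroyed};
}
```

Without the reset in the move constructor, the flag would already be set when the temporary goes out of scope, and an assertion such as the first `ASSERT_FALSE(deleted)` in the test could not distinguish that from a real release.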
