GlusterFS: CI jobs are failing for disperse volumes #95

Open
anoopcs9 opened this issue Nov 8, 2024 · 1 comment

anoopcs9 (Collaborator) commented Nov 8, 2024

testcases/consistency/test_consistency.py::test_consistency[192.168.123.10-vol-disperse-glusterfs-default] FAILED [ 50%]
testcases/consistency/test_consistency.py::test_consistency[192.168.123.10-vol-replicate-glusterfs-default] PASSED [100%]

Last successful run was on Oct 21st: https://jenkins-samba.apps.ocp.cloud.ci.centos.org/view/GlusterFS/job/samba_glusterfs-integration-environment/817/

anoopcs9 (Collaborator, Author) commented Nov 8, 2024

=================================== FAILURES ===================================
_______ test_consistency[192.168.123.10-vol-disperse-glusterfs-default] ________

self = <smbprotocol.connection.Connection object at 0x7f32ef195df0>

    def _process_message_thread(self):
        try:
            while True:
                # Wait for a max of 10 minutes before sending an echo that tells the SMB server the client is still
                # available. This stops the server from closing the connection and the associated sessions on a long
                # lived connection. A brief test shows Windows kills a connection at ~16 minutes so 10 minutes is a
                # safe choice.
                # https://github.com/jborean93/smbprotocol/issues/31
                try:
>                   b_msg = self.transport.recv(self._receive_timeout)

.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py:1312: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.tox/sanity/lib/python3.9/site-packages/smbprotocol/transport.py:115: in recv
    b_packet_size, timeout = self._recv(4, timeout)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <smbprotocol.transport.Tcp object at 0x7f32ef195eb0>, length = 4
timeout = -0.1015451729998631

    def _recv(self, length, timeout):
        buffer = bytearray(length)
        offset = 0
        while offset < length:
            read_len = length - offset
            log.debug(f"Socket recv({read_len}) (total {length})")
    
            start_time = timeit.default_timer()
    
            with self._close_lock:
                if not self.connected:
                    # The socket was closed - need the no cover to avoid CI failing on race condition differences
                    return None, timeout  # pragma: no cover
    
                read = select.select([self._sock], [], [], max(timeout, 1))[0]
                timeout = timeout - (timeit.default_timer() - start_time)
                if not read:
                    log.debug("Socket recv(%s) timed out")
>                   raise TimeoutError()
E                   TimeoutError

.tox/sanity/lib/python3.9/site-packages/smbprotocol/transport.py:141: TimeoutError

The above exception was the direct cause of the following exception:

hostname = '192.168.123.10', share_name = 'vol-disperse-glusterfs-default'

    @pytest.mark.parametrize("hostname,share_name", generate_consistency_check())
    def test_consistency(hostname: str, share_name: str) -> None:
>       consistency_check(hostname, share_name)

testcases/consistency/test_consistency.py:58: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
testcases/consistency/test_consistency.py:32: in consistency_check
    smbclient.disconnect()
testhelper/smbclient.py:57: in disconnect
    smbclient.reset_connection_cache(
.tox/sanity/lib/python3.9/site-packages/smbclient/_pool.py:445: in reset_connection_cache
    connection.disconnect()
.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py:952: in disconnect
    session.disconnect(True, timeout=timeout)
.tox/sanity/lib/python3.9/site-packages/smbprotocol/session.py:412: in disconnect
    tree.disconnect()
.tox/sanity/lib/python3.9/site-packages/smbprotocol/tree.py:287: in disconnect
    res = self.session.connection.receive(request)
.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py:1028: in receive
    self._check_worker_running()  # The worker may have failed while waiting for the response, check again
.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py:1184: in _check_worker_running
    raise self._t_exc
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <smbprotocol.connection.Connection object at 0x7f32ef195df0>

    def _process_message_thread(self):
        try:
            while True:
                # Wait for a max of 10 minutes before sending an echo that tells the SMB server the client is still
                # available. This stops the server from closing the connection and the associated sessions on a long
                # lived connection. A brief test shows Windows kills a connection at ~16 minutes so 10 minutes is a
                # safe choice.
                # https://github.com/jborean93/smbprotocol/issues/31
                try:
                    b_msg = self.transport.recv(self._receive_timeout)
                except TimeoutError as ex:
                    # Check if the connection has unanswered keepalive echo requests with the reserved field set.
                    # When unanswered keep alive echo exists, the server did not respond withing two times the timeout.
                    # We assume that the server connection is dead and close it.
                    for r in self.outstanding_requests.values():
                        if (
                            r.response is None
                            and r.message["command"].get_value() == Commands.SMB2_ECHO
                            and r.message["reserved"].get_value() == 1
                        ):
                            # connection will be closed in finally block
>                           raise SMBConnectionClosed(
                                "Connection timed out. Server did not respond within timeout."
                            ) from ex
E                           smbprotocol.exceptions.SMBConnectionClosed: Connection timed out. Server did not respond within timeout.

.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py:1324: SMBConnectionClosed
=============================== warnings summary ===============================
testcases/consistency/test_consistency.py::test_consistency[192.168.123.10-vol-disperse-glusterfs-default]
  /root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/threadexception.py:82: PytestUnhandledThreadExceptionWarning: Exception in thread msg_worker-192.168.123.10:445
  
  Traceback (most recent call last):
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 1312, in _process_message_thread
      b_msg = self.transport.recv(self._receive_timeout)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/transport.py", line 115, in recv
      b_packet_size, timeout = self._recv(4, timeout)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/transport.py", line 141, in _recv
      raise TimeoutError()
  TimeoutError
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/threadexception.py", line 68, in thread_exception_runtest_hook
      yield
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 122, in _multicall
      teardown.throw(exception)  # type: ignore[union-attr]
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/unraisableexception.py", line 95, in pytest_runtest_call
      yield from unraisable_exception_runtest_hook()
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/unraisableexception.py", line 70, in unraisable_exception_runtest_hook
      yield
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 122, in _multicall
      teardown.throw(exception)  # type: ignore[union-attr]
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/logging.py", line 846, in pytest_runtest_call
      yield from self._runtest_for(item, "call")
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/logging.py", line 829, in _runtest_for
      yield
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 122, in _multicall
      teardown.throw(exception)  # type: ignore[union-attr]
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/capture.py", line 880, in pytest_runtest_call
      return (yield)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 122, in _multicall
      teardown.throw(exception)  # type: ignore[union-attr]
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/skipping.py", line 257, in pytest_runtest_call
      return (yield)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 103, in _multicall
      res = hook_impl.function(*args)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/runner.py", line 174, in pytest_runtest_call
      item.runtest()
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/python.py", line 1627, in runtest
      self.ihook.pytest_pyfunc_call(pyfuncitem=self)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_hooks.py", line 513, in __call__
      return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_manager.py", line 120, in _hookexec
      return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 139, in _multicall
      raise exception.with_traceback(exception.__traceback__)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/pluggy/_callers.py", line 103, in _multicall
      res = hook_impl.function(*args)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/_pytest/python.py", line 159, in pytest_pyfunc_call
      result = testfunction(**testargs)
    File "/root/sit-test-cases/testcases/consistency/test_consistency.py", line 58, in test_consistency
      consistency_check(hostname, share_name)
    File "/root/sit-test-cases/testcases/consistency/test_consistency.py", line 32, in consistency_check
      smbclient.disconnect()
    File "/root/sit-test-cases/testhelper/smbclient.py", line 57, in disconnect
      smbclient.reset_connection_cache(
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbclient/_pool.py", line 445, in reset_connection_cache
      connection.disconnect()
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 952, in disconnect
      session.disconnect(True, timeout=timeout)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/session.py", line 412, in disconnect
      tree.disconnect()
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/tree.py", line 287, in disconnect
      res = self.session.connection.receive(request)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 1028, in receive
      self._check_worker_running()  # The worker may have failed while waiting for the response, check again
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 1184, in _check_worker_running
      raise self._t_exc
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 1324, in _process_message_thread
      raise SMBConnectionClosed(
  smbprotocol.exceptions.SMBConnectionClosed: Connection timed out. Server did not respond within timeout.
  
  During handling of the above exception, another exception occurred:
  
  Traceback (most recent call last):
    File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
      self.run()
    File "/usr/lib64/python3.9/threading.py", line 917, in run
      self._target(*self._args, **self._kwargs)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 1406, in _process_message_thread
      self.disconnect(False)
    File "/root/sit-test-cases/.tox/sanity/lib/python3.9/site-packages/smbprotocol/connection.py", line 957, in disconnect
      self._t_worker.join(timeout=2)
    File "/usr/lib64/python3.9/threading.py", line 1057, in join
      raise RuntimeError("cannot join current thread")
  RuntimeError: cannot join current thread
  
    warnings.warn(pytest.PytestUnhandledThreadExceptionWarning(msg))

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
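
The final `RuntimeError: cannot join current thread` in the warning above looks like a secondary effect rather than the root failure: in the traceback, `_process_message_thread` calls `self.disconnect(False)`, which reaches `self._t_worker.join(timeout=2)` while still running on the worker thread itself. A minimal sketch of that Python behaviour, independent of smbprotocol (the function names and the `msg_worker-demo` thread name are made up for illustration):

```python
import threading


def join_self(errors):
    # A thread cannot wait for its own termination, so CPython raises
    # RuntimeError instead of deadlocking.
    try:
        threading.current_thread().join(timeout=2)
    except RuntimeError as exc:
        errors.append(str(exc))


def reproduce():
    errors = []
    worker = threading.Thread(
        target=join_self, args=(errors,), name="msg_worker-demo"
    )
    worker.start()
    worker.join()  # joining from the main thread is fine
    return errors


if __name__ == "__main__":
    print(reproduce())
```

If that is what happens here, the error to chase is the earlier `SMBConnectionClosed` timeout on the disperse volume; the `RuntimeError` only shows up during cleanup on the worker thread.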
