Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling pipe timeout is incorrect #639

Closed
apocelipes opened this issue Nov 29, 2023 · 12 comments
Closed

Handling pipe timeout is incorrect #639

apocelipes opened this issue Nov 29, 2023 · 12 comments
Labels
bug Something isn't working

Comments

@apocelipes
Copy link
Contributor

apocelipes commented Nov 29, 2023

As argued in #638 , set O_NONBLOCK at a pipe which will be dupped to stdout/stderr is unsafe, and has some potential problems.

Since the code only handle pipes, we can safely use blocking-pipes with poll -- There is only one reader, so it couldn't happen that the fd is ready and there is no data to read. (e.g. read by another reader or reported as ready like a socket because of protocol notifications, etc.)

The correct way to do timeout is insteading of using non-blocking io, to use a big enough buff (as big as the buffer inside the pipe) to read data from pipes, and call poll again no matter how much data is read. When read returns EOF or error, break out the loop. A big enough buffer is to reduce read calls, a 4096 length buffer works too but may be encounter some performance impacts.

O_NONBLOCK problems should be fixed.

Additionally pipe needs to set the O_CLOEXEC flag for the most important reason, besides performance and preventing file descriptor leaks, for security. We have no control over the external programs that are called, so it is important to minimize the amount of information that is exposed.

@apocelipes apocelipes added the bug Something isn't working label Nov 29, 2023
@CarterLi
Copy link
Member

Additionally pipe needs to set the O_CLOEXEC flag for the most important reason, besides performance and preventing file descriptor leaks, for security.

They were closed. However, you are right, O_CLOEXEC is a better option. In addition, we should check every open(2) syscall to have O_CLOEXEC used.

O_NONBLOCK problems should be fixed.

If timeout is not enabled, O_NONBLOCK won't be used. I don't see any problems here.

@apocelipes
Copy link
Contributor Author

If timeout is not enabled, O_NONBLOCK won't be used. I don't see any problems here.

If timeout set, fcntl will also change child process's pipe flags, this can be privode by this:

#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
        int fds[2] = {-1, -1};
        pipe(fds);
        int pid = fork();
        if (pid) {
                usleep(10000000);
                printf("before: %d\n", fcntl(fds[1], F_GETFL));
                fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);
                printf("after: %d\n", fcntl(fds[1], F_GETFL));
                waitpid(pid, NULL, 0);
        } else {
                while(1) {
                dup2(fds[1], 0);
                printf("child: %d\n", fcntl(0, F_GETFL));
                usleep(1000000);
                }
        }
}

image

@apocelipes
Copy link
Contributor Author

apocelipes commented Nov 29, 2023

As the output, even after dupping, the flags also been changed.

Since two processes share the same pipes, it might be a correct behavior, but i cannot found documents that doucmented this.

A more serious problem, we can't control whether fcntl is executed before or after exec, and there might be a problem if it is executed after exec.

@CarterLi
Copy link
Member

I didnt get it. Please speak Chinese

@apocelipes
Copy link
Contributor Author

使用fcntl给父进程的pipefd添加non-block标志,子进程里的pipe一样会被影响,被添加上non-block,即使已经使用dup进行重定向。

所以设置timeout后execve调用的进程一样有概率遇到EAGAIN的错误。

以及fork不保证父子进程运行顺序,因此很可能父进程fcntl添加non-block会发生在子进程execve之后,这很可能会意外影响被execve调用的程序的行为。

@apocelipes
Copy link
Contributor Author

解决上面问题的办法有两个:

  1. fork之前添加non-block,子进程里再把non-block去除
  2. 使用issue的方案,不使用非阻塞io

方案1不够简洁而且预想了第三方程序会如何处理输出;方案2本地环境测试几乎没有性能影响,能在不关心第三方程序怎么处理输出的同时解决fcntl跨进程产生影响以及fcntl和execve的时序问题。

@CarterLi
Copy link
Member

  • In parent, we add O_NONBLOCK to pipes[0]
  • In child, we dup pipes[1] to stdout, while pipes[0] is closed directly

Why is it a problem?

@apocelipes
Copy link
Contributor Author

apocelipes commented Nov 29, 2023

  • In child, we dup pipes[1] to stdout, while pipes[0] is closed directly

You are right. It's my miss. But timing problems still exists: fcntl can run before close/execve. In this case it could not cause a problem, but it is unsafe.

Moreover, Blocking-io code is simpler than a non-blocking one. And non-blocking IO did not result in significant performance gains.

@CarterLi
Copy link
Member

If fcntl is called before closing, the fd will be closed. If fcntl is called after closing, it just do nothing. Why unsafe?

It is about 1 am, lets stop arguing and go to bed.

@apocelipes
Copy link
Contributor Author

apocelipes commented Nov 29, 2023

If fcntl is called before closing, the fd will be closed. If fcntl is called after closing, it just do nothing. Why unsafe?

First, the situation where a process is unaware that its own data can be modified by other processes is inherently insecure.
Second, fcntl on a closed fd is not a nop, the call will fail and errno set to EBADF, errno will be contaminated.

In my timezone it is 2am, i will push a PR for this issue. Good night and we'll continue tomorrow.

@apocelipes
Copy link
Contributor Author

Another reason not use Non-blocking IO, we only wait for one file, no IO multiplexing at this point, so no performance up. You will cost almost equal time on poll+read that costs on read with blocking IO, but the code becomes more complex.
I will keep poll because it implements the timeout, but no more non-blocking IO.

@apocelipes
Copy link
Contributor Author

Close this since #640 has been merged.
Checking other open calls and adding necessary O_CLOEXEC flag should be another issue.
If any bug or performance regression caused by #640 is found, please reopen this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants