Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail in gz_load() if read would block #993

Open
wants to merge 21 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
/minigzip
/minigzip64
/minigzipsh
/gznonblk
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gznonblk is the executable built from test/gznonblk.c.

/zlib.pc
/configure.log
/build
Expand Down
4 changes: 2 additions & 2 deletions ChangeLog
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@

ChangeLog file for zlib

Changes in 1.3.1.1 (xx Jan 2024)
-
Changes in 1.3.1.1 (xx Jul 2024)
- Make non-blocking I/O work in gzread.c (Issue #992)

Changes in 1.3.1 (22 Jan 2024)
- Reject overflows of zip header fields in minizip
Expand Down
20 changes: 19 additions & 1 deletion Makefile.in
Original file line number Diff line number Diff line change
Expand Up @@ -77,11 +77,13 @@ shared: examplesh$(EXE) minigzipsh$(EXE)

all64: example64$(EXE) minigzip64$(EXE)

staticgznb: gznonblk$(EXE)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the gznonblk test executable to the Makefile - only the static build method is used as the other builds (shared and 64) are already tested via example and minigzip.

check: test

test: all teststatic testshared

teststatic: static
teststatic: static testgznonblk
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addubg the gznoblk test target as a dependency to this static test target, so the non-blocking functionality added to gzread.c in this PR can be tested.

@TMPST=tmpst_$$; \
if echo hello world | ${QEMU_RUN} ./minigzip | ${QEMU_RUN} ./minigzip -d && ${QEMU_RUN} ./example $$TMPST ; then \
echo ' *** zlib test OK ***'; \
Expand All @@ -90,6 +92,13 @@ teststatic: static
fi
@rm -f tmpst_$$

testgznonblk: staticgznb
@if ./gznonblk 44444 --client-fork 127.0.0.1 first second --delay third --delay -stopserver- ; then \
echo ' *** gznonblk test OK ***'; \
else \
echo ' *** gznonblk test FAILED ***'; false; \
fi

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the details of the gznonblk test target, which tests the non-blocking functionality added to gzread.c in this PR. The command is

./gznonblk 44444 --client-fork 127.0.0.1 first second --delay third --delay -stopserver-

  • ./gznonblk - run the executable built by staticgznb
    • N.B. both server and client sides are contained in this one executable.
  • 44444 - the port number for the server socket
  • --client-fork - the server parent spawns the client child

N.B. the rest of the arguments are strings or directives for the client only

  • 127.0.0.1 - the server host string to which the client will connect
    • i.e. localhost, but localhost may point to :: (IPV6) on some systems
  • first, second - the first two strings that the client will gzwrite to, and the server will gzread from, the socket
    • N.B. these two arguments are typically concatenated into a single gzread on the server side
  • --delay - a directive to the client to delay before processing the next argument
  • third - the third string gz-transferred from client to server
  • --delay - another delay to ensure the next and final string is not concatenated to the previous "third"
  • -stopserver - the final string gz-transferred from client to server
    • which will trigger the server to exit (see variable stopserver in test/gznonblk.c)

Once the server (parent) exits, it will pause, then check for the exit status of the client (child), and set its own return status to success (zero) IFF both server and client exited successfully.

testshared: shared
@LD_LIBRARY_PATH=`pwd`:$(LD_LIBRARY_PATH) ; export LD_LIBRARY_PATH; \
LD_LIBRARYN32_PATH=`pwd`:$(LD_LIBRARYN32_PATH) ; export LD_LIBRARYN32_PATH; \
Expand Down Expand Up @@ -145,6 +154,9 @@ example.o: $(SRCDIR)test/example.c $(SRCDIR)zlib.h zconf.h
minigzip.o: $(SRCDIR)test/minigzip.c $(SRCDIR)zlib.h zconf.h
$(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/minigzip.c

gznonblk.o: $(SRCDIR)test/gznonblk.c $(SRCDIR)zlib.h zconf.h
$(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/gznonblk.c
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiling test/gznonblk.c, patterned after the other test programs


example64.o: $(SRCDIR)test/example.c $(SRCDIR)zlib.h zconf.h
$(CC) $(CFLAGS) $(ZINCOUT) -D_FILE_OFFSET_BITS=64 -c -o $@ $(SRCDIR)test/example.c

Expand Down Expand Up @@ -287,6 +299,9 @@ example$(EXE): example.o $(STATICLIB)
minigzip$(EXE): minigzip.o $(STATICLIB)
$(CC) $(CFLAGS) $(LDFLAGS) -o $@ minigzip.o $(TEST_LIBS)

gznonblk$(EXE): gznonblk.o $(STATICLIB)
$(CC) $(CFLAGS) $(LDFLAGS) -o $@ gznonblk.o $(TEST_LIBS)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Builds the gznonblk test program using the static library (libz.a)


examplesh$(EXE): example.o $(SHAREDLIBV)
$(CC) $(CFLAGS) -o $@ example.o $(LDFLAGS) -L. $(SHAREDLIBV)

Expand Down Expand Up @@ -370,6 +385,7 @@ clean: minizip-clean
rm -f *.o *.lo *~ \
example$(EXE) minigzip$(EXE) examplesh$(EXE) minigzipsh$(EXE) \
example64$(EXE) minigzip64$(EXE) \
gznonblk$(EXE) \
infcover \
libz.* foo.gz so_locations \
_match.s maketree contrib/infback9/*.o
Expand All @@ -392,6 +408,7 @@ tags:
adler32.o zutil.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h
gzclose.o gzlib.o gzread.o gzwrite.o: $(SRCDIR)zlib.h zconf.h $(SRCDIR)gzguts.h
compress.o example.o minigzip.o uncompr.o: $(SRCDIR)zlib.h zconf.h
gznonblk.o: $(SRCDIR)zlib.h zconf.h
crc32.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)crc32.h
deflate.o: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h
infback.o inflate.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)inftrees.h $(SRCDIR)inflate.h $(SRCDIR)inffast.h $(SRCDIR)inffixed.h
Expand All @@ -402,6 +419,7 @@ trees.o: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)tr
adler32.lo zutil.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h
gzclose.lo gzlib.lo gzread.lo gzwrite.lo: $(SRCDIR)zlib.h zconf.h $(SRCDIR)gzguts.h
compress.lo example.lo minigzip.lo uncompr.lo: $(SRCDIR)zlib.h zconf.h
gznonblk.lo: $(SRCDIR)zlib.h zconf.h
crc32.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)crc32.h
deflate.lo: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h
infback.lo inflate.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)inftrees.h $(SRCDIR)inflate.h $(SRCDIR)inffast.h $(SRCDIR)inffixed.h
Expand Down
12 changes: 10 additions & 2 deletions gzread.c
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ local int gz_load(gz_statep state, unsigned char *buf, unsigned len,
break;
*have += (unsigned)ret;
} while (*have < len);
if (ret < 0) {
if (ret < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
Copy link
Author

@drbitboy drbitboy Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the core of this PR, because all of the read-related wrappers here in gzread.c
end up calling this gz_load routine: when

  • BOTH state-fd (file descriptor) is non-blocking (fcntl(fd, SET_FL, fcntl(fd, GET_FL) | O_NONBLOCK))
  • AND there are no data available to read,

then the read(2) return value (ret above, assigned on line 22) will be -1, and errno will be either EWOULDBLOCK or EAGAIN (Resource not available).

The change here prevents that case from returning -1 from gz_load() here, which in turn
allows the calling routine* to continue working with data from the file descriptor.

It gets a little tricky, because no bytes are returned, and 0 as the return value from read(2)
indicates EOF (End-Of-File).

N.B. tt may make sense to clear errno to a value of 0 here in gz_load(), just before line 22 above.

* gz_avail() or gz_fetch() or gz_read(), with the latter being the ultimate caller of the first two.

gz_error(state, Z_ERRNO, zstrerror());
return -1;
}
Expand Down Expand Up @@ -268,6 +268,7 @@ local int gz_skip(gz_statep state, z_off64_t len) {
local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
z_size_t got;
unsigned n;
int load_errno;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable to keep track of errno from the read(2) in gz_load(), whether gz_load() is called from here (gz_read())

  • EITHER directly,
  • OR indirectly via gz_decomp() or gz_fetch().


/* if len is zero, avoid unnecessary operations */
if (len == 0)
Expand All @@ -283,6 +284,7 @@ local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
/* get len bytes to buf, or less than len if at the end */
got = 0;
do {
load_errno = 0;
/* set n to the maximum amount of len that fits in an unsigned int */
n = (unsigned)-1;
if (n > len)
Expand All @@ -307,25 +309,31 @@ local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
buffer */
else if (state->how == LOOK || n < (state->size << 1)) {
/* get more output, looking for header if required */
errno = 0;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prepare for errno to become EAGAIN/EWOULDBLOCK, indicating that the read(2) on the
non-blocking file descriptor would block, so the read(2) returns a value of -1.

if (gz_fetch(state) == -1)
return 0;
if (!state->x.have) { load_errno = errno; }
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to gz_fetch() (line 313may have resulted in a call to gz_load()/read(2) and updated the internal output buffer, so errno may have a value of EAGAIN/EWOULDBLOCK, indicating that read(2) would have blocked on a non-blocking file.

The next line, 316, will continue to the while at the end of this do-while loop.

The state->x internal output buffer is encapsulated in state->x.next and state->x.have (see lines 293-300 above).

So if there is any data in the internal output buffer (state->x.have is non-zero), then don't copy the value, possibly EAGAIN/EWOULDBLOCK, of errno from gz_fetch() here, because that would cause the while to exit the loop and the data in the internal input buffer would be lost.

If the the state->x internal output buffer is empty here (state->x.have is 0), then there was no data read in the gz_fetch() call (line 313), and it is save to copy the errno value to load_errno and continue, which, if errno is EAGAIN/EWOULDBLOCK here, will exit this do-while loop .

continue; /* no progress yet -- go back to copy above */
/* the copy above assures that we will leave with space in the
output buffer, allowing at least one gzungetc() to succeed */
}

/* large len -- read directly into user buffer */
else if (state->how == COPY) { /* read directly */
errno = 0;
if (gz_load(state, (unsigned char *)buf, n, &n) == -1)
return 0;
load_errno = errno;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gz_load() (line 324) read the data directly into the external output buffer (buf), and state->x was empty and not used here, so put the value of errno into load_errno, which if that value is EAGAIN/EWOULDBLOCK, will cause the do-while loop to exit (line 346).

}

/* large len -- decompress directly into user buffer */
else { /* state->how == GZIP */
state->strm.avail_out = n;
state->strm.next_out = (unsigned char *)buf;
errno = 0;
if (gz_decomp(state) == -1)
return 0;
load_errno = errno;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gz_decomp() (line 334) read the data directly into the external output buffer (buf; lines 331-332), and that buffer will be declared empty when state->x.have is cleared to 0 (line 338), so put the value of errno into load_errno, which if the value is ...

n = state->x.have;
state->x.have = 0;
}
Expand All @@ -335,7 +343,7 @@ local z_size_t gz_read(gz_statep state, voidp buf, z_size_t len) {
buf = (char *)buf + n;
got += n;
state->x.pos += n;
} while (len);
} while (len && load_errno != EAGAIN && load_errno != EWOULDBLOCK);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execute another pass on this do-while loop if

  • BOTH there is room left in the output buffer
  • AND we did not execute a blocking read on a non-blocking input on this pass


/* return number of bytes read into user buffer */
return got;
Expand Down
Loading