Unable to use streaming Reader with bufio.NewScanner #131

Open
Technerder opened this issue Jun 18, 2023 · 2 comments

@Technerder

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.20 windows/amd64

What operating system and processor architecture are you using (go env)?

set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\User\AppData\Local\go-build
set GOENV=C:\Users\User\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Users\User\Dev\PSDumpTool\go.mod
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=C:\Users\User\AppData\Local\Temp\go-build599287788=/tmp/go-build -gno-record-gcc-switches

What did you do?

example_file.zst is a 12 GB file that decompresses to a ~130 GB NDJSON (newline-delimited JSON) file.

package main

import (
	"bufio"
	"github.com/DataDog/zstd"
	"log"
	"os"
)

func main() {
	file, err := os.Open("example_file.zst")
	if err != nil {
		log.Printf("Couldn't open file: %v\n", err)
		return
	}
	defer file.Close()
	fileScanner := bufio.NewScanner(zstd.NewReader(file))
	fileScanner.Split(bufio.ScanLines)
	for fileScanner.Scan() {
		log.Printf("%s\n", fileScanner.Text())
	}
}

What did you expect to see?

I expected all lines to be printed in order.

What did you see instead?

The program starts and exits without printing any lines or errors.

@Viq111
Collaborator

Viq111 commented Jun 21, 2023

Hi @Technerder, would you be able to provide a small test case that reproduces the problem?

I tried on a small payload and this seems to run fine for me:

package main

import (
	"bufio"
	"bytes"
	"fmt"
	"github.com/DataDog/zstd"
)

func main() {
	exampleString := "This is\na multiline\nexample\n"
	compressed, err := zstd.Compress(nil, []byte(exampleString))
	if err != nil {
		panic(err)
	}
	fileScanner := bufio.NewScanner(zstd.NewReader(bytes.NewBuffer(compressed)))
	fileScanner.Split(bufio.ScanLines)
	for fileScanner.Scan() {
		fmt.Printf("%s\n", fileScanner.Text())
	}
}

returns

ᐅ go run example.go
This is
a multiline
example

@Technerder
Author

Sure

{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}
{"key": "value"}

example.zip

The zip file contains both the raw and zstd compressed files, since I wasn't able to upload a .zst file to the comment.
