This repository has been archived by the owner on Mar 23, 2020. It is now read-only.

Fixed bytes handling when reading content #2

Open
wants to merge 3 commits into
base: master

Conversation


@piercefreeman piercefreeman commented Aug 15, 2018

Fixes a crash when trying to stream from byte-valued files.

Previously, we would read from the fileobj and try to append the value to our string-valued buffer, which would cause a type mismatch. By dynamically instantiating the buffer type we can retain the existing length logic while still returning a string to end clients.
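As a rough illustration of the idea described above (not the actual diff in this PR), the buffer can be created lazily so that its type matches whatever the underlying file object returns. The class name `BoundedReader`, its attributes, and the UTF-8 decoding step are assumptions made for this sketch; the real class in `warc/utils.py` may differ.

```python
import io


class BoundedReader:
    """Hypothetical length-limited wrapper around a file object."""

    def __init__(self, fileobj, length):
        self.fileobj = fileobj
        self.length = length
        self.offset = 0
        self.buf = None  # created lazily so its type matches fileobj.read()

    def _read(self, size):
        chunk = self.fileobj.read(size)
        if self.buf is None:
            # type(chunk)() yields "" for str input and b"" for bytes input
            self.buf = type(chunk)()
        content = self.buf + chunk
        self.buf = type(chunk)()
        self.offset += len(content)
        return content

    def read(self, size=-1):
        # Never read past the declared length of this part.
        remaining = self.length - self.offset
        if size < 0 or size > remaining:
            size = remaining
        data = self._read(size)
        # Assumption: callers expect str, so bytes are decoded as UTF-8.
        if isinstance(data, bytes):
            return data.decode("utf-8", errors="replace")
        return data


binary_part = BoundedReader(io.BytesIO(b"hello world"), 5)
text_part = BoundedReader(io.StringIO("hello world"), 5)
print(binary_part.read(), text_part.read())
```

The length bookkeeping is identical for both input types; only the empty buffer value changes.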

Adding a PR here per discussion in internetarchive#26

@hungrymonkey

I just want to tell you the module doesn't work.

Python 3.7

Traceback (most recent call last):
  File "print_all_urls.py", line 13, in <module>
    for record in f:
  File "/Users/psuedofinnish/git_repo/engage-warc-test/venv/src/warc/warc/warc.py", line 406, in __iter__
    record = self.read_record()
  File "/Users/psuedofinnish/git_repo/engage-warc-test/venv/src/warc/warc/warc.py", line 377, in read_record
    self.finish_reading_current_record()
  File "/Users/psuedofinnish/git_repo/engage-warc-test/venv/src/warc/warc/warc.py", line 371, in finish_reading_current_record
    self.current_payload.read()
  File "/Users/psuedofinnish/git_repo/engage-warc-test/venv/src/warc/warc/utils.py", line 69, in read
    return self._read(self.length)
  File "/Users/psuedofinnish/git_repo/engage-warc-test/venv/src/warc/warc/utils.py", line 79, in _read
    content = self.buf + self.fileobj.read(size)
TypeError: can only concatenate str (not "bytes") to str
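The last frame of the traceback can be reproduced in isolation, without the warc package. This is a minimal sketch that assumes the buffer starts as an empty `str` and the file object is a binary stream; the variable names are illustrative only.

```python
import io

buf = ""                                 # string-valued buffer, as in the last frame
fileobj = io.BytesIO(b"WARC/1.0\r\n")    # a binary stream returns bytes from read()

try:
    content = buf + fileobj.read(4)      # str + bytes is not allowed in Python 3
except TypeError as exc:
    message = str(exc)
    print(message)
```

On Python 3.7 this should report the same str/bytes concatenation error as the traceback above.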

Using this script.

for f_name in warc_files:
    f = warc.open(COLLECTIONS_DIR + str(f_name))
    for record in f:
        print(record['WARC-Target-URI'])


2 participants