build_command.record_key_spec:
  Handle the fact that sh 1.13 can now give us either strings or bytes depending on the
  type of file or stream we are processing.
Hans Chalupsky committed May 21, 2020
1 parent 3b20ba1 commit bf308d2
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions kgtk/cli/sort.py
@@ -112,12 +112,18 @@ def build_command(input=None, output=None, columns='1', colsep='\t', options='',

    # define these in here, so we can pass in some process-local variables via closures:
    def record_key_spec(chunk):
        # starting with sh 1.13 it looks like we can get either strings or bytes here;
        # if we get bytes we convert to an identical string using `latin1' encoding:
        if isinstance(chunk, bytes):
            chunk = chunk.decode('latin1')
        buffer.write(chunk)
        header = buffer.getvalue()
        eol = header.find('\n')
        if eol >= 0:
            with open(sort_env['KGTK_HEADER'], 'w') as out:
                out.write(header[0:eol+1])
            # reencode from latin1 to utf8 for header processing:
            header = header[0:eol].encode('latin1').decode(zcat.kgtk_encoding)
            with open(sort_env['KGTK_SORT_KEY_SPEC'], 'w') as out:
                out.write(build_sort_key_spec(header, columns, colsep))
            # this signals to ignore the callback once we are done collecting the header:
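The latin1 round trip used in this change can be illustrated in isolation. The sketch below is not the KGTK code: `to_text`, `recover_header`, and `ASSUMED_ENCODING` are hypothetical names, and UTF-8 is assumed as a stand-in for `zcat.kgtk_encoding`. It shows why decoding byte chunks as latin1 is safe even when a chunk boundary falls inside a multi-byte character: latin1 maps every byte to exactly one character, so re-encoding the buffered text as latin1 gives back the original bytes, which can then be decoded with the real encoding. The sketch also assumes that any chunk arriving as `str` contains only characters in the latin1 range; otherwise the re-encode step would fail.

```python
import io

ASSUMED_ENCODING = 'utf-8'  # assumption: stands in for zcat.kgtk_encoding


def to_text(chunk):
    """Return chunk as str; bytes are decoded as latin1 (one character per byte)."""
    if isinstance(chunk, bytes):
        return chunk.decode('latin1')
    return chunk


def recover_header(buffered, encoding=ASSUMED_ENCODING):
    """Return the first line decoded with the real encoding, or None if no newline yet."""
    eol = buffered.find('\n')
    if eol < 0:
        return None
    # latin1 encoding restores the original bytes exactly, so the
    # real decoding can be applied to the complete header line.
    return buffered[:eol].encode('latin1').decode(encoding)


# Hypothetical input: a header line containing a two-byte UTF-8 character.
raw = 'id\tlab\u00e9l\nQ1\tx\n'.encode('utf-8')
split = raw.index(b'\xc3') + 1  # cut between the two bytes of the accented character

buffer = io.StringIO()
buffer.write(to_text(raw[:split]))
partial = recover_header(buffer.getvalue())  # header line is not complete yet
buffer.write(to_text(raw[split:]))
header = recover_header(buffer.getvalue())
```

Decoding each chunk directly as UTF-8 would raise an error on the first chunk here, because it ends in the middle of a character; deferring the real decoding until the full line is buffered avoids that.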
