File with component element separator (ISA16) of 0x1d is unparsable #189
Comments
It looks like whatever non-control character immediately follows the separator after ISA15 is read to be ISA16.

    # w.read_character is TokenReader#read_character that skips control characters
    w.read_character.flatmap do |isa16, cR|
      # cR.stream is StreamReader, and StreamReader#read_character does not skip control characters
      cR.stream.read_character.flatmap do |char_, dR|

It might be possible to accommodate having a control character as a component separator, but it might get complicated. The segment terminator, which is immediately after ISA16, is parsed exactly how you want ISA16 to be parsed: you can have a control character as a segment terminator.

Could you provide a sample (just the ISA, GS, and ST segments would be fine)? I'm not sure if you can paste it into a comment given the non-printable characters, but maybe you can upload a gist or maybe you can paste what |
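To make the difference between the two kinds of read concrete, here is a minimal toy model in plain Ruby. It is only a sketch: `control_character?`, `read_skipping_control`, and `read_raw` are hypothetical helpers written for this illustration, not stupidedi's `TokenReader` or `StreamReader`, and the sample input (element separator 0x1E, ISA16 of 0x1D, newline as segment terminator) is made up.

```ruby
# Toy model only: these helpers are hypothetical and are not stupidedi's API.

# True for ASCII control characters (0x00-0x1F and 0x7F).
def control_character?(char)
  char.ord < 0x20 || char.ord == 0x7F
end

# Reads one character after skipping any leading control characters.
# Returns [character, remaining_input], or nil if nothing printable is left.
def read_skipping_control(input)
  index = input.each_char.find_index { |c| !control_character?(c) }
  return nil if index.nil?

  [input[index], input[(index + 1)..-1]]
end

# Reads exactly the next character, control character or not.
# Returns [character, remaining_input], or nil at end of input.
def read_raw(input)
  return nil if input.empty?

  [input[0], input[1..-1]]
end

# Made-up input positioned just after the element separator that follows ISA15:
# ISA16 (0x1D), segment terminator (newline), then the start of a GS segment.
after_isa15 = "\u001D\nGS\u001EQM"

skipped_char, skipped_rest = read_skipping_control(after_isa15)
raw_char, raw_rest = read_raw(after_isa15)

puts "skipping read: ISA16=#{skipped_char.inspect}, rest=#{skipped_rest.inspect}"
puts "raw read:      ISA16=#{raw_char.inspect}, rest=#{raw_rest.inspect}"
```

In this toy model the skipping read takes the "G" of GS as ISA16 and leaves input that begins with S followed by 0x1E, while the raw read returns 0x1D itself. That would be consistent with the reported error text, though it depends on the assumed sample.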
Thank you. I think you're right about what's happening. |
@coenwulf, thanks, that helped! Sorry it took a while to get back to you. I think I've got a working fix, but I'm hoping you can provide another example that I can use as a fixture. The file you gave is missing some required segments (it failed because I pushed a branch called |
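Also, you can use bin/edi-obfuscate to strip out any meaningful information from your file, like strings, dates, times, numbers, etc. The only thing that should remain are segment names and qualifier element values (ID). |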
Thanks. I'll work on using that to create a sample. Probably next week
On Thu, May 30, 2019, 17:36 kputnam wrote:
Also, you can use bin/edi-obfuscate to strip out any meaningful
information from your file, like strings, dates, times, numbers, etc. The
only thing that should remain are segment names and qualifier element
values (ID).
|
It looks like the obfuscator currently treats dates as numbers, so it ends
up with 00000000 in cases where it should have 20151230. I didn't dig into
it; I just manually fixed my file.
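
As a quick way to spot this kind of problem in an obfuscated file, a check along these lines could be run over date elements. This is only a sketch: `valid_ccyymmdd?` is a hypothetical helper, not part of stupidedi or bin/edi-obfuscate, and it assumes dates are written as eight digits in CCYYMMDD form.

```ruby
require "date"

# Hypothetical helper, not part of stupidedi: returns true when +value+ is
# eight digits that form a real calendar date in CCYYMMDD order.
def valid_ccyymmdd?(value)
  match = /\A(\d{4})(\d{2})(\d{2})\z/.match(value)
  return false if match.nil?

  Date.valid_date?(match[1].to_i, match[2].to_i, match[3].to_i)
end

["00000000", "20151230"].each do |value|
  puts "#{value}: #{valid_ccyymmdd?(value) ? 'valid date' : 'not a valid date'}"
end
```

An all-zero value fails this check because month 00 and day 00 do not exist, which is why zeroing the digits of a date does not leave a usable date behind.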
As you suspected, I can't push to your repo.
Here's a gist:
https://gist.github.com/coenwulf/30389d15f91fbdfe22d4bec1d7cd5d14
-Jeremy
|
Dropping a note here: I don't remember whether this was already fixed or how the most recent release behaves with regard to this issue, but it is fixed and explicitly tested in another branch that will be merged sometime in the future. I'll comment here again when it's merged. |
|
I have received some X12 214 files that have a value of \u001d (character 29) in position 105 of the file (which is the value for ISA16, the component element separator). When I try to parse those files using stupidedi, it gets a fatal error that seems to indicate that it's skipping that character. It gets the S and the field separator (\u001e) instead of GS as the segment identifier and doesn't like it. The error is:
found "S\u001E" instead of segment identifier
But when I replace that character with a different one (e.g., _), it works fine.
Is there a way to get it to accept that character (\u001d) as the component element separator?
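
The replacement workaround described above could be scripted roughly as follows. This is a sketch under stated assumptions, not a fix in stupidedi: `swap_component_separator` and `sample_interchange` are hypothetical helpers, the sample header contains filler values only, and the code assumes the separator character appears in the file only as a delimiter and that the replacement character does not occur anywhere in the data.

```ruby
# Hypothetical workaround helpers; they are not part of stupidedi.

# Zero-based index of ISA16 in the fixed-width ISA segment
# (position 105 when counting from 1).
ISA16_INDEX = 104

# Returns a copy of +data+ in which a control-character component separator
# is replaced everywhere by +replacement+. Printable separators are left alone.
def swap_component_separator(data, replacement = ":")
  raise ArgumentError, "input does not start with ISA" unless data.start_with?("ISA")
  raise ArgumentError, "input is too short to contain ISA16" if data.length <= ISA16_INDEX

  separator = data[ISA16_INDEX]
  return data.dup if separator.ord >= 0x20 && separator.ord != 0x7F
  raise ArgumentError, "replacement already occurs in the input" if data.include?(replacement)

  data.gsub(separator, replacement)
end

# Builds a made-up interchange header for demonstration (filler values only).
def sample_interchange(element_separator, component_separator)
  widths = [2, 10, 2, 10, 2, 15, 2, 15, 6, 4, 1, 5, 9, 1, 1] # ISA01..ISA15
  isa = "ISA" + widths.map { |w| element_separator + "0" * w }.join
  isa + element_separator + component_separator + "\nGS" + element_separator + "QM\n"
end

sample = sample_interchange("\u001E", "\u001D")
fixed = swap_component_separator(sample)
puts "ISA16 before: #{sample[ISA16_INDEX].inspect}, after: #{fixed[ISA16_INDEX].inspect}"
```

Replacing every occurrence, rather than only the character at position 105, keeps composite elements later in the file consistent with the new ISA16 value.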
For what it's worth, my code to that point looks basically like: