Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File with component element separator (ISA16) of 0x1d is unparsable #189

Open
coenwulf opened this issue May 8, 2019 · 8 comments
Open
Labels

Comments

@coenwulf
Copy link

coenwulf commented May 8, 2019

I have received some x12 214 files that have a value of \u001d (character 29) in position 105 of the file (which is the value for ISA16, the component element separator). When I try to parse those files using stupidedi it gets a fatal error that seems to indicate that it's skipping that character. It gets the S and the field separator (\u001e) instead of GS as the segment identifier and doesn't like it. The error is found "S\u001E" instead of segment identifier. But when I replace that character with a different one (e.g., _) it works fine.
Is there a way to get it to accept that character (\u001d) as the component element separator?

For what it's worth, my code to that point looks basically like:

    config = Stupidedi::Config.contrib
    parser = Stupidedi::Builder::StateMachine.build(config)
    machine, result = parser.read(Stupidedi::Reader.build(file))

    if result.fatal?
      result.explain {|reason| raise EDI214Error.new(reason + " at #{result.position.inspect}") }
    end
@kputnam
Copy link
Owner

kputnam commented May 9, 2019

It looks like whatever non-control character immediately follows the separator after ISA15 is read to be ISA16. But 0x1d is a control character, so it's skipped over and the next character is assumed to be the value.

# w.read_character is TokenReader#read_character that skips control characters
w.read_character.flatmap do |isa16, cR|

It might be possible to accommodate having a control character as a component separator, but it might get complicated. The segment terminator, which is immediately after ISA16, is parsed exactly how you want ISA16 to be parsed: you can have a control character as a segment terminator.

# cR.stream is StreamReader, and StreamReader#read_character does not skip control characters
cR.stream.read_character.flatmap do |char_, dR|

Could you provide a sample (just the ISA, GS, and ST segments would be fine)? I'm not sure if you can paste it into a comment given the non-printable characters, but maybe you can upload a gist or maybe you can paste what irb shows like "ISA...\x1D~".

@coenwulf
Copy link
Author

coenwulf commented May 9, 2019

Thank you. I think you're right about what's happening.
Here's a gist with a sample (with data stripped out): https://gist.github.com/coenwulf/52ba3f340bff663e949505d911d13615

@kputnam
Copy link
Owner

kputnam commented May 28, 2019

@coenwulf, Thanks that helped! Sorry it took a while to get back to you. I think I've got a working fix, but I'm hoping you can provide another example that I can use as a fixture. The file you gave is missing some required segments (it failed because B10 is missing, but I'm sure there are others).

I pushed a branch called gh-189 with the failing fixture. Could you check out that branch and run rake spec, then replace spec/fixtures/004010/QM214/pass/gh-189.edi with a passing file? I'm not sure if you can push to that branch, so just posting another gist would be fine. Thank you!

kputnam added a commit that referenced this issue May 31, 2019
@kputnam
Copy link
Owner

kputnam commented May 31, 2019

Also, you can use bin/edi-obfuscate to strip out any meaningful information from your file, like strings, dates, times, numbers, etc. The only thing that should remain are segment names and qualifier element values (ID).

@coenwulf
Copy link
Author

coenwulf commented May 31, 2019 via email

@coenwulf
Copy link
Author

coenwulf commented Jun 7, 2019 via email

@kputnam
Copy link
Owner

kputnam commented Sep 28, 2019

Dropping a note here that I don't remember if this was already fixed or how the most recent release behaves in regards to this issue, but this issue is fixed and tested for explicitly in another branch that will be merged sometime in the future. I'll comment here again when it's merged.

@kputnam kputnam added the defect label Sep 28, 2019
kputnam added a commit that referenced this issue Sep 30, 2019
kputnam added a commit that referenced this issue Sep 30, 2019
kputnam added a commit that referenced this issue Sep 30, 2019
@yulduz-om
Copy link

Dropping a note here that I don't remember if this was already fixed or how the most recent release behaves in regards to this issue, but this issue is fixed and tested for explicitly in another branch that will be merged sometime in the future. I'll comment here again when it's merged.
@kputnam : Hi! Any updates on merging this? Or could you possibly push the fix branch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants