File with component element separator (ISA16) of 0x1d is unparsable #189

coenwulf · 2019-05-08T23:00:07Z

I have received some X12 214 files that have a value of \u001d (character 29) in position 105 of the file (which is the value for ISA16, the component element separator). When I try to parse those files using stupidedi, I get a fatal error that seems to indicate the parser is skipping that character: it reads the S and the field separator (\u001e) as the segment identifier instead of GS, and rejects it. The error is: found "S\u001E" instead of segment identifier. When I replace that character with a different one (e.g., _), the file parses fine.
Is there a way to get it to accept that character (\u001d) as the component element separator?

For what it's worth, my code to that point looks basically like:

    config = Stupidedi::Config.contrib
    parser = Stupidedi::Builder::StateMachine.build(config)
    machine, result = parser.read(Stupidedi::Reader.build(file))

    if result.fatal?
      result.explain {|reason| raise EDI214Error.new(reason + " at #{result.position.inspect}") }
    end
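The workaround mentioned above (swapping the control character for a printable one before parsing) could be sketched roughly as follows. This is only an illustration: `replace_control_component_separator` and `ISA16_INDEX` are hypothetical names, not part of stupidedi, and the sketch assumes the file starts directly with a fixed-width ISA segment, so that ISA16 is the 105th character (zero-based index 104).

```ruby
# Hypothetical helper, not part of stupidedi.
# Assumes the data begins with a fixed-width ISA segment, so ISA16 sits at
# zero-based index 104 (the 105th character of the file).
ISA16_INDEX = 104

def replace_control_component_separator(edi, replacement = ":")
  isa16 = edi[ISA16_INDEX]

  # Nothing to do if the input is too short or ISA16 is already printable.
  return edi if isa16.nil? || !isa16.match?(/[[:cntrl:]]/)

  # Refuse to introduce a separator that already occurs in the data,
  # since that would change how composite elements are split.
  if edi.include?(replacement)
    raise ArgumentError, "replacement #{replacement.inspect} already appears in the data"
  end

  # Block form keeps the replacement literal (no backslash interpretation).
  edi.gsub(isa16) { replacement }
end

# Possible usage (the path is a placeholder):
#   cleaned = replace_control_component_separator(File.read("shipment.edi"), "_")
#   machine, result = parser.read(Stupidedi::Reader.build(cleaned))
```

Replacing every occurrence, rather than only the one in ISA16, keeps the declared separator and any composite elements in the body consistent with each other.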


kputnam · 2019-05-09T19:10:19Z

It looks like whatever non-control character immediately follows the separator after ISA15 is read to be ISA16. But 0x1d is a control character, so it's skipped over and the next character is assumed to be the value.

    # w.read_character is TokenReader#read_character that skips control characters
    w.read_character.flatmap do |isa16, cR|

It might be possible to accommodate having a control character as a component separator, but it might get complicated. The segment terminator, which is immediately after ISA16, is parsed exactly how you want ISA16 to be parsed: you can have a control character as a segment terminator.

    # cR.stream is StreamReader, and StreamReader#read_character does not skip control characters
    cR.stream.read_character.flatmap do |char_, dR|
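To make the difference between the two reads concrete, here is a simplified model. It is not stupidedi's implementation: `read_skipping_control` and `read_raw` are hypothetical stand-ins for the behaviour described above, operating on a plain string and an offset instead of reader objects.

```ruby
# Simplified model only; the real TokenReader and StreamReader are more involved.

# Skips any control characters, then returns [character, next_offset].
def read_skipping_control(input, offset)
  offset += 1 while offset < input.length && input[offset].match?(/[[:cntrl:]]/)
  [input[offset], offset + 1]
end

# Returns the very next character, control or not, as [character, next_offset].
def read_raw(input, offset)
  [input[offset], offset + 1]
end

# Text following the separator after ISA15: ISA16 (0x1d), a newline used
# here as the segment terminator, then the start of the GS segment.
tail = "\x1d\nGS\x1e"

skipped, next_offset = read_skipping_control(tail, 0)
raw, _ = read_raw(tail, 0)

puts skipped.inspect            # the "G" of GS is taken as ISA16
puts tail[next_offset..].inspect # what is left begins with "S"
puts raw.inspect                # the raw read sees the 0x1d character
```

In this model the skipping read consumes the `G`, so the remaining input starts with `S` followed by the element separator, which resembles the fragment in the reported error. The parser's later steps are not modelled here.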

Could you provide a sample (just the ISA, GS, and ST segments would be fine)? I'm not sure if you can paste it into a comment given the non-printable characters, but maybe you can upload a gist or maybe you can paste what irb shows like "ISA...\x1D~".
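One way to get a paste-safe, escaped view of the start of a file, similar to what irb shows, is `String#inspect` on the raw bytes. This is a sketch; `escaped_head` is a hypothetical helper name and the path in the usage comment is a placeholder.

```ruby
# Hypothetical helper: returns an escaped, printable rendering of the first
# `length` bytes, so control characters show up as escapes such as \x1D.
def escaped_head(data, length = 200)
  # String#b returns a binary-encoded copy, so each byte is escaped on its own.
  data.b[0, length].inspect
end

# Possible usage (the path is a placeholder):
#   puts escaped_head(File.binread("sample.edi"))
```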

coenwulf · 2019-05-09T20:05:42Z

Thank you. I think you're right about what's happening.
Here's a gist with a sample (with data stripped out): https://gist.github.com/coenwulf/52ba3f340bff663e949505d911d13615

kputnam · 2019-05-28T22:51:23Z

@coenwulf, thanks, that helped! Sorry it took a while to get back to you. I think I've got a working fix, but I'm hoping you can provide another example that I can use as a fixture. The file you gave is missing some required segments (it failed because B10 is missing, but I'm sure there are others).

I pushed a branch called gh-189 with the failing fixture. Could you check out that branch and run rake spec, then replace spec/fixtures/004010/QM214/pass/gh-189.edi with a passing file? I'm not sure if you can push to that branch, so just posting another gist would be fine. Thank you!

kputnam · 2019-05-31T00:36:24Z

Also, you can use bin/edi-obfuscate to strip out any meaningful information from your file, like strings, dates, times, numbers, etc. The only things that should remain are segment names and qualifier element values (ID).

coenwulf · 2019-05-31T07:55:08Z

Thanks. I'll work on using that to create a sample. Probably next week

coenwulf · 2019-06-07T21:02:56Z

It looks like the obfuscator currently treats dates as numbers, so it ends up with 00000000 in cases where it should have something like 20151230. I didn't dig into it; I just fixed my file manually. As you suspected, I can't push to your repo. Here's a gist: https://gist.github.com/coenwulf/30389d15f91fbdfe22d4bec1d7cd5d14

-Jeremy

kputnam · 2019-09-28T00:18:23Z

Dropping a note here that I don't remember if this was already fixed or how the most recent release behaves in regards to this issue, but this issue is fixed and tested for explicitly in another branch that will be merged sometime in the future. I'll comment here again when it's merged.

yulduz-om · 2024-06-07T19:19:49Z

> Dropping a note here that I don't remember if this was already fixed or how the most recent release behaves in regards to this issue, but this issue is fixed and tested for explicitly in another branch that will be merged sometime in the future. I'll comment here again when it's merged.

@kputnam: Hi! Any updates on merging this? Or could you possibly push the fix branch?

kputnam added a commit that referenced this issue May 28, 2019

Allow control character to be used as an component separator (GH-189)

c1d35ed

kputnam added a commit that referenced this issue May 31, 2019

Merge branch 'master' into gh-189

7de2d0a

kputnam added the defect label Sep 28, 2019

kputnam added a commit that referenced this issue Sep 30, 2019

Write spec for fixed GH-189

1332bba

kputnam added a commit that referenced this issue Sep 30, 2019

Write spec for fixed GH-189

87ccb03

kputnam added a commit that referenced this issue Sep 30, 2019

Write spec for fixed GH-189

97fb0a2
