-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add warnings for bare CR in line endings #8
base: master
Are you sure you want to change the base?
Conversation
Does this emit a single message for each input file that is processed? |
Currently it emits multiple messages: one for each string returned by fgets that contains a CR without an accompanying LF. I can modify it to do that if you prefer. |
Since fgets is reading a maximum of 96 bytes per invocation, that amounts to a message every 96 bytes, assuming those 96 bytes contain \r without an accompanying \n. This may seem like spam, but the volume of messages this emits is dwarfed by the other messages that the assembler emits because it doesn't have any line breaks. |
I suggested performing this test at file open time, so a single useful message could be emitted that contained the file name that is has the problematic line endings. Since that input is completely garbage, it would be reasonable to exit the program after emitting the single message about that file. The useful information will be displayed and the user can make the appropriate correction without having to dig through piles of noise to find the problem. |
OK, i did another commit that will skip the rest of the file and log an error if it encounters a bare CR: https://github.com/jblang/simtools/blob/master/crossassemblers/macro1/macro1.c#L1879. The logic in these if statements is a bit tortured, but I couldn't figure out another way to do it without changing the entire control flow of the program. The check for pass == 1 is necessary because otherwise the message gets printed twice. One other thing I noticed is that the nextfiodec function appears to actually be expecting bare CRs inside a fiodec string: https://github.com/jblang/simtools/blob/master/crossassemblers/macro1/macro1.c#L1519. I looked at the PDP-1 handbook and fiodec only has CR; there is no LF, so maybe it's valid in that case? So I am not sure if aborting on a bare CR would break some previously working source code or not. It's your call if you think it's too risky. |
Well, the reference to the bare CR in the nextfiodec routine will actually result in an infinite loop if a bare CR is ever encountered there, so I doubt this test breaks anything. Meanwhile, checking every line for the presence of a bare CR will certainly limit any case where that might ever be legitimate input. My suggestion was:
|
So that just does a check on the first 96 characters? That would have caught it, at least in all the files I tried. |
The line buffer is the maximum line size the program will support. If the first “line” of the file doesn’t have a reasonable line ending, we can hardly expect the input file to have meaningful content. If a user somehow provides an input file with some screwy combination of input data that mixes line endings (with some that only have CR) then they get the results of “garbage-in” => “garbage-out”. The original code already tolerates either LF or CRLF line endings. If the legal syntax of the input actually allows for imbedded CR within some part of the text of a "line", the use of this would hardly happen on the first line of the file. fgets() would read up to 94 characters until it finds a LF, so any number of CR characters may be contained in the line anywhere prior to the LF. A single CR immediately prior the ending LF will be squished out by readLine(), but any others will merely be data in the line. |
I've added a warning to macro1.c for bare CR line endings if you think it's worth committing.