Modify DKB PDF-Importer to support new transaction #3685

Merged (1 commit) on Dec 23, 2023


Nirus2000 (Member)

https://forum.portfolio-performance.info/t/pdf-import-von-dkb/4449/102

Remove the BOM from the test files Dividende01.txt, Dividende02.txt and Dividende11.txt
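For illustration, a file-level cleanup could look roughly like the sketch below. It is not the tooling used for this PR; the class name `BomFileCleaner` and its methods are hypothetical. It assumes the files are UTF-8, where the BOM is encoded as the three bytes EF BB BF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Portfolio Performance.
final class BomFileCleaner
{
    // UTF-8 encoding of U+FEFF (the byte order mark)
    private static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    // True if the content starts with the three UTF-8 BOM bytes.
    static boolean hasUtf8Bom(byte[] content)
    {
        return content.length >= UTF8_BOM.length
                        && content[0] == UTF8_BOM[0]
                        && content[1] == UTF8_BOM[1]
                        && content[2] == UTF8_BOM[2];
    }

    // Returns the content without a leading BOM; unchanged if there is none.
    static byte[] stripUtf8Bom(byte[] content)
    {
        return hasUtf8Bom(content)
                        ? Arrays.copyOfRange(content, UTF8_BOM.length, content.length)
                        : content;
    }

    // Rewrites the file in place only when a BOM was found; returns whether it changed.
    static boolean removeBom(Path file) throws IOException
    {
        byte[] content = Files.readAllBytes(file);
        if (!hasUtf8Bom(content))
            return false;
        Files.write(file, stripUtf8Bom(content));
        return true;
    }
}
```

Working on raw bytes avoids re-encoding the rest of the file, so only the first three bytes change.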


Character code 65279 (U+FEFF) is the Unicode Byte Order Mark (BOM), which often appears as an invisible character at the beginning of UTF-8 encoded files.
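As a minimal sketch of what that means on the Java side: once the file is decoded to a `String`, the BOM is a single `char` with the value 65279. The class `BomCheck` and its method names are hypothetical and only serve as an illustration.

```java
// Hypothetical helper, not part of Portfolio Performance.
final class BomCheck
{
    // U+FEFF, decimal 65279
    static final char BOM = '\uFEFF';

    // True if the line is non-empty and its first character is the BOM.
    static boolean startsWithBom(String line)
    {
        return !line.isEmpty() && line.charAt(0) == BOM;
    }

    // Drops one leading BOM character; other lines are returned unchanged.
    static String stripLeadingBom(String line)
    {
        return startsWithBom(line) ? line.substring(1) : line;
    }
}
```

The `isEmpty()` guard matters because calling `charAt(0)` on an empty line throws a `StringIndexOutOfBoundsException`.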

public void parse(String filename, DocumentContext documentContext, List<Item> items, String[] lines)

    String currentLine = lines[ii];
    int firstCharCode = (int) currentLine.charAt(0); // character code of the first character

    System.out.println("Character Code of first character in line [" + ii + "]: " + firstCharCode);

    Matcher matcher = startsWith.matcher(currentLine);
    System.out.println("Match gefunden => " + matcher.matches() + " [" + ii + "] -> " + currentLine);

    if (matcher.matches())
        blocks.add(ii);

Result:

Match gefunden => java.util.regex.Matcher[pattern=^10919 Berlin( Seite 1)?$ region=0,21 lastmatch=] -> 10919 Berlin Seite 1
---------------
Character Code of first character in line [0]: 65279
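The output above fits an invisible leading character: the matcher region is 0,21, while the visible text "10919 Berlin Seite 1" has only 20 characters. The sketch below reproduces the effect in isolation, using the pattern from the output. The class `BomRegexDemo` and its methods are hypothetical, and this is not the importer's actual code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Portfolio Performance.
final class BomRegexDemo
{
    // Pattern copied from the debug output above
    static final Pattern START = Pattern.compile("^10919 Berlin( Seite 1)?$");

    // matches() requires the whole line to match, so a leading BOM makes it fail.
    static boolean matchesStart(String line)
    {
        return START.matcher(line).matches();
    }

    // Removes one leading U+FEFF, if present.
    static String dropLeadingBom(String line)
    {
        return line.startsWith("\uFEFF") ? line.substring(1) : line;
    }

    public static void main(String[] args)
    {
        String line = "\uFEFF10919 Berlin Seite 1";
        System.out.println(line.length());                      // 21: 20 visible characters plus the BOM
        System.out.println(matchesStart(line));                 // false
        System.out.println(matchesStart(dropLeadingBom(line))); // true
    }
}
```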

Nirus2000 added the pdf label on Dec 22, 2023
Nirus2000 requested a review from buchen on December 23, 2023 07:46
buchen (Member) commented on Dec 23, 2023

I imagine this character was added due to copying the UTF-8 output. I do not think it is in the string input that we generate out of the PDF.

buchen merged commit d5efce3 into portfolio-performance:master on Dec 23, 2023
2 checks passed
Nirus2000 deleted the Modify-DKB-PDF-Importer-to-support-new-transaction branch on December 23, 2023 08:49
2 participants