SDD memory error & d-DNNF example ignoring #118

Open
Zarach opened this issue Jun 13, 2024 · 6 comments
Comments

@Zarach

Zarach commented Jun 13, 2024

Dear problog team,

I am trying to learn a Naive Bayes classifier for documents based on the words occurring in them.

There are about 1000 examples, each of which should be assigned to one of 4 classes, as in your online example. To be precise, the probability of each word given a class should be learned.

If I run LFI with SDD, a memory error occurs.

If I run it with d-DNNF, it ignores nearly all of the given examples.
It runs into the following error because the calculated weight is very low, even at the beginning of the learning process:

    if self.semiring.is_zero(self._get_z()):
        raise InconsistentEvidenceError(context=" during evidence evaluation")

I guess this is the intended behavior, but could you explain in an abstract way why examples get ignored from the beginning?
Does it mean that there is not enough information (not enough words used) in these examples to learn the parameters?

@rmanhaeve
Contributor

Hi Zarach

Could you perhaps give us an example of this behaviour?

Kind regards,
Robin

@Zarach
Author

Zarach commented Jun 17, 2024

Hi Robin,

That is not so easy, because I run it from Python, but I'll try.
Here are 2 files (with 48 examples) that reproduce the d-DNNF problem and can be used in standalone mode. I hope the result is comparable to the Python run.
All examples get ignored when I run with d-DNNF.

For the Python run I also uploaded a txt file with the list of examples. In Python the examples are built from Term() objects, which you can't see in the txt file.

examples_small.txt
program_small.txt
example_list.txt

And here is another example file, which should reproduce the memory allocation error with SDD:

examples_big.txt

Kind regards,
Benjamin

@rmanhaeve
Contributor

Hi Benjamin

It seems that you only give the training data, but there's no program attached. We'll need this as well to look into it in more depth.

@Zarach
Author

Zarach commented Jul 4, 2024

Hi Robin,

program_small.txt is the program which should be executable in problog standalone mode.
At least on my side this works and reproduces the problem.

@rmanhaeve
Contributor

I have solved the inconsistent evidence error by setting the initial probabilities to t(0.1) for all words and by using log-space calculations, i.e. by running it with:

    lfi program_small.pl examples_small.pl --logspace

I'll now look into the memory issue
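
To illustrate why log-space can make a difference here, below is a minimal Python sketch. It is not ProbLog's implementation, and the numbers are hypothetical (400 word observations per example, each with probability around 0.1). It only shows the general effect: a plain floating-point product of many small probabilities underflows to exactly 0.0, which a zero check would treat as inconsistent evidence, while the same weight stays finite as a sum of logarithms.

```python
import math

def product_weight(word_probs):
    """Weight of an example as a plain floating-point product."""
    weight = 1.0
    for p in word_probs:
        weight *= p
    return weight

def log_weight(word_probs):
    """The same weight, kept as a sum of logarithms."""
    return sum(math.log(p) for p in word_probs)

def log_sum_exp(log_values):
    """log(sum(exp(v))) computed without leaving log-space for the peak."""
    peak = max(log_values)
    return peak + math.log(sum(math.exp(v - peak) for v in log_values))

# Hypothetical example: 400 word observations, each with probability 0.1.
probs = [0.1] * 400
plain = product_weight(probs)   # underflows: 1e-400 is below the float range
logged = log_weight(probs)      # finite, roughly -921

# Hypothetical per-word probabilities for 4 classes, normalised in log-space.
class_log_weights = [log_weight([p] * 400) for p in (0.10, 0.11, 0.09, 0.12)]
log_z = log_sum_exp(class_log_weights)
posterior = [math.exp(lw - log_z) for lw in class_log_weights]

print(plain == 0.0, math.isfinite(logged))
print(posterior)
```

In the plain representation every class weight is 0.0, so the normalising constant is zero too and nothing can be learned from the example. In log-space the relative weights of the classes are still available.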

@rmanhaeve
Contributor

I have noticed a calloc issue when using SDDs. Have you tried using -k sddx?
