Outlier detection FQR works, but with the same outlier score, dots are too small #65

Open
HimmelStein opened this issue Aug 29, 2017 · 5 comments

@jaroslav-kuchar

For the frequent-pattern-based outlier detection algorithm (FQR?), it is important to set the parameters properly. If possible, try decreasing the minimum support parameter. I am not sure whether any parameter can be changed in the UI.
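
To illustrate why the minimum support matters, here is a minimal sketch using FPOF (Frequent Pattern Outlier Factor), one well-known frequent-pattern outlier score. It is an assumption that the deployed algorithm behaves similarly; the function names, the brute-force pattern mining, and the toy records are all hypothetical and only meant to show the effect. With a high minimum support, few patterns survive and different records end up with identical scores; lowering it separates them.

```python
from itertools import combinations


def frequent_patterns(rows, min_support):
    """Brute-force mining: return {itemset: support} for itemsets with support >= min_support."""
    counts = {}
    for row in rows:
        items = sorted(row)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    n = len(rows)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def fpof_scores(rows, min_support):
    """FPOF per record: summed support of the frequent patterns it contains,
    divided by the number of frequent patterns. Lower means more outlying."""
    patterns = frequent_patterns(rows, min_support)
    if not patterns:
        return [0.0] * len(rows)
    return [
        sum(s for p, s in patterns.items() if set(p) <= row) / len(patterns)
        for row in rows
    ]


# Hypothetical toy records, not taken from any real dataset.
rows = [
    {"dept=A", "type=salary"},
    {"dept=A", "type=salary"},
    {"dept=A", "type=salary"},
    {"dept=A", "type=travel"},
    {"dept=B", "type=salary"},
    {"dept=B", "type=grant"},
]

high = fpof_scores(rows, min_support=0.6)  # only two single-item patterns survive
low = fpof_scores(rows, min_support=0.1)   # all nine patterns survive

print("min_support=0.6:", [round(s, 3) for s in high])
print("min_support=0.1:", [round(s, 3) for s in low])
```

In this toy example the fourth and fifth records receive the same score at a minimum support of 0.6 and different scores at 0.1, which is the kind of tie that a lower minimum support can break.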

@HimmelStein
Author

@jaroslav-kuchar can you click the link above and see the result? Meanwhile, try choosing a Bonn dataset, selecting parameters on the left side, and viewing the visualization result on the right side.

@jaroslav-kuchar

When I click on the link above, I can only see the following message - "Error: We are sorry, the analysis process did not finish in timely manner".
I also tried to start the analysis from scratch but I do not see any possibility to change any parameter.

@HimmelStein
Author

The timeout error also happens with other data mining tools, such as LOF.

@larjohn
Contributor

larjohn commented Sep 3, 2017

The timeout issue is related to the length of the dataset. Most probably, the data mining algorithms are not fed correctly, or they take too much time to finish the task.
This started happening recently because I removed the pagesize=30 default value, which meant that only 30 lines of data were fed to the algorithms. Now all data lines are sent by default. At the moment, the indigo version is probably still the previous one, and it works, but we have to resolve this first before moving on:
openbudgets/integration#14

Regarding the dots issue, the sizes reflect the real relation between the data values: 173 million vs. 100 thousand looks like that. Moreover, if I am not mistaken, I used the algorithm from Pierro, which, I think, applies square-root normalization to the size of the circles. Would you suggest log or something else? Or should it be selected automatically?
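
To make the comparison concrete, here is a rough sketch of the two scalings applied to the values mentioned above. The function names and the 40 px maximum radius are assumptions for illustration only, not the actual visualization code.

```python
import math


def radius_sqrt(value, max_value, max_radius=40.0):
    """Square-root scaling: circle area is proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


def radius_log(value, max_value, max_radius=40.0):
    """Logarithmic scaling: compresses large ratios so small values stay visible."""
    return max_radius * math.log1p(value) / math.log1p(max_value)


big, small = 173_000_000, 100_000  # the two magnitudes from the comment above

for name, scale in (("sqrt", radius_sqrt), ("log", radius_log)):
    print(f"{name}: big={scale(big, big):.2f}px small={scale(small, big):.2f}px")
```

Under these assumptions, the smaller value gets a radius below one pixel with square-root scaling and roughly 24 px with log scaling. The trade-off is that log scaling no longer keeps circle area proportional to the amount, so the visual ratio understates the real difference.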

@larjohn removed their assignment Sep 17, 2017