-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Directional coom #264
Directional coom #264
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Would be good to get a test for this as well.
Well, since I am not experienced in writing tests tbh, do you think you could redirect me to a short example so that I prepare one? Regards! |
Hi @atantos , last month I did some code refactoring and changed coom.jl. As a result, your code cannot be merged automatically. At the same time, I have some comments about your changes:
for (i, token) in enumerate(doc)
inner_range = if mode == :directional
i:min(m, i + window)
else
max(1, i-window):min(m, i+window)
end
# looking forward
@inbounds for j in inner_range
...
end
end
|
Hi @rssdev10 ! Thank you for your comments and am sorry for the delay!
Am not sure which ones you mean. I placed the
Yes, actually I would not need to write a second method. One is enough with the if-else solution. So, no objection here either.
Thanks for the link! How should we proceed from now on? Should you make the changes and add them to the commit? Am really not that familiar with how it works in cases like this. And, lastly. There is still a conflict and am not sure whether it has to do with the changes you made, although it does not seem to be the case. Your feedback and ideas are greatly appreciated! Alex |
Hi Alex!
There is a set of arguments in the current API that people are already using as described in the documentation. But if you add something to the fourth position when a method has 5 arguments, this means that you are making this method incompatible with previous code where something else was expected at the 4th position. So, abstractly, the new arguments should not be in the same 4th position, but in the last position. And if it is with some default value, this means that even the changed methods will be compatible with the previous code. Anyway, a keyword argument looks better in your case.
The changes I made are already in the master branch. So all you need to do is
Let me know if you have any questions. Or, use Julia Slack for private questions. |
I had to update this comment, since I did not have to do what I suggested earlier. So, I kept your advice with the Thanks again for the help, @rssdev10 ! Alex |
…(col) && continue row' check
Thanks, merged in the master branch. |
Added a directional
coo_matrix()
version tocoom.jl
so that the directional or asymmetric coocurrence matrix can be built withCooMatrix()
.The current version of coo_matrix() is bidirectional or symmetric and it looks ahead and back in the context of a window size n from the focus word. The directional coocurrence matrix takes into consideration only words that follow and in that sense it is asymmetric. The main result is that the coocurrence matrix is no longer symmetrical in the linear algebra sense and the cells do not include values that represent double frequencies of two words (of the crossing row and column). This means that no frequency value represents the frequency of word A in the vicinity of a window n around B as well as the opposite.
Let's assume that word A is on row i and word B on column j. The cell (i,j) has the frequency of word B found n words after the word A on the text. The cell (j,i), on the other hand, has the has the frequency of word A found n words after the word B on the text. These two values, apparently are not identical. That is why the directional cooccurence matrix is non-symmetric.
On coom.jl the new method
coo_matrix()
(on line 59) is added that redirects to the originalcoo_matrix()
if the direction argument is set tofalse
.The
direction::Bool
argument is also added to all CooMatrix() constructor methods that usecoo_matrix()
further down the line (on the same file).Having a directional coocurrence matrix will make things easier for calculating word association measures.
Please ignore the earlier September commits that are only indirectly related to today's commit.