-
Notifications
You must be signed in to change notification settings - Fork 349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Managing initilizations via a role scheme dictionary #1030
base: master
Are you sure you want to change the base?
Conversation
if key in self.initialization_schemes: | ||
raise ValueError("All initializations are accepted either" | ||
"through initialization_schemes or " | ||
"correspodong attribute but not both") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also s/attribute/keyword argument/
Technically this is @rizar's review but hopefully my comments from the peanut gallery are a little helpful to at least one of you 😛 |
@rizar could you please take a look at the high level structure of the PR. I'll take care of the comments of dwf when the high-level is approved. |
Sure! Will work on it today. |
Sorry for the delay, I am struggling with the cluster and my course project. I will definitely review it tomorrow. |
I have three suggestions:
|
I reorganized the PR a little bit. Again et's discuss the generality of the PR first, I will fix formatting stuff later. Some notes: 1-The problem with having the roles as keys is that, roles are objects, and different objects of same class will be hashed to different values. So instead I use the name of the roles as the keys. |
22b5a09
to
faf0d27
Compare
|
Indeed, def __hash__(self):
return hash(str(self)) should work in the case of roles. |
if self.initialization_schemes is None: | ||
self.initialization_schemes = {} | ||
|
||
initialization_to_role = {"weights_init": WEIGHT, 'biases_init': BIAS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, it looks like you have chosen to only handle a handful of <role>_init
keyword arguments, not all of them. This is probably good, because this clearly makes initialization_schemes
the primary way of defining the initialization schemes.
But the code clearly lacks a check if key not in initialization_to_role
. Since now we have a clearly defined white list, I would probably raise a warning, that calling the arguments smth_init
is recommended, but still pass this argument to the super().__init__
call.
I think the major design decision that I disagree with is having both Here is a different design which seems better to me
What do you think? |
self.parameter_roles.update(set([role])) | ||
|
||
def _initialize(self): | ||
for param in self.parameters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a dangerous code. First, if a parameter has more than one role, it is not clear at all how it should be initialized. I would rather raise an exception upon encountering such a situation. Also, we should choose the most specific initialization scheme (e.g. the one for RECURRENT_WEIGHT
and not for just WEIGHT
if the parameter is RECURRENT_WEIGHT
).
@memimo , we should resume working on this, and I really hope we can finish it until I leave Montreal on June 8. |
893a301
to
45d4456
Compare
e7763cb
to
bc87d5a
Compare
@rizar could you take a look about the high level design choice? |
if self.initialization_schemes is None: | ||
self.initialization_schemes = {} | ||
|
||
if parameter_roles: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hold on, haven't we decided to not have parameter_roles
and instead collect roles the actual parameter variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was the other way :p I guess it was a confusion. I'll give it a try, the other way around
@rizar ok, so this version collect roles from parameter tags. Will rebase later on. |
|
||
super(Initializable, self).__init__(**kwargs) | ||
|
||
def _validate_roles(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess here you propagate initialization schemes does the hierarchy of roles. But it seems to me that you do not choose the most specific role out of those for which an initialization scheme is available. E.g. if we have an init. scheme for PARAMETER
and for WEIGHT
, and now we are looking for an init. scheme for RECURRENT_WEIGHT
, then we should always choose the scheme given for WEIGHT
. In the code
below it seems like the choice will depend on the order of the dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I find it a bit weird that you change initialization_schemes
. This might make debugging quite a bit hard. I would rather put the search for the most specific init. scheme directly in initialize
.
@rizar I updated based on your latest comments. |
self.parameter_roles.update(child.parameter_roles) | ||
|
||
def _initialize(self): | ||
def get_scheme(role, schemes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not seem to me that this code will choose the most specific rolefor which there is an intialization scheme.
Here roughly what I think should work:
def get_scheme(role, schemes):
candidates = []
for key in schemes:
if isinstance(role, key):
if all([isinstance(key, c) for c in candidates]):
candidates = [c]
else:
candidates.append(c)
if not candidates:
raise ValueError("No init scheme")
if len(candidates) > 1:
raise ValueError("Multiple candidate init schemes")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure your code snippet do that either. Let's assume we have this class hierarchy A->B->C
a = A(), b=B(), c=C()
Then:
isinstance(c, A) -- True
type(c) == A -- False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but all I am using is "isinstance", so what's wrong?
On 6 June 2016 at 16:44, Mehdi Mirza [email protected] wrote:
In blocks/bricks/interfaces.py
#1030 (comment):
all_roles = []
for param in obj.parameters:
roles = param.tag.roles
# TODO do something smarter
if len(roles) > 0:
all_roles.append(roles[0])
return all_roles
self.parameter_roles = set(get_param_roles(self))
for child in self.children:
if isinstance(child, Initializable):
child._collect_roles()
self.parameter_roles.update(child.parameter_roles)
- def _initialize(self):
def get_scheme(role, schemes):
I'm not sure your code snippet do that either. Let's assume we have this
class hierarchy A->B->C
a = A(), b=B(), c=C()
Then:
isinstance(c, A) -- True
type(c) == A -- False—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/mila-udem/blocks/pull/1030/files/f3fca5343847c265abe7b7c2424d585485476b76#r65965285,
or mute the thread
https://github.com/notifications/unsubscribe/AAn8Yul0FN9mBXfoNDAvqFDqOW5JOviZks5qJIbJgaJpZM4HxS-w
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're not going up the hierarchy. Let's say if we have in schemes both C and B. And ideally we should choose the scheme associated with C not B, your code does not do that.
fixes #970
Re-attempt of #1021
So far it only changes simple.py will propagate changes to other bricks when the format is approved.