[WIP] Managing initilizations via a role scheme dictionary #1030

memimo · 2016-03-15T18:09:05Z

fixes #970
Re-attempt of #1021
So far it only changes simple.py will propagate changes to other bricks when the format is approved.

dwf · 2016-03-15T20:26:44Z

blocks/bricks/interfaces.py

+                if key in self.initialization_schemes:
+                    raise ValueError("All initializations are accepted either"
+                                     "through initialization_schemes or "
+                                     "correspodong attribute but not both")


also s/attribute/keyword argument/

dwf · 2016-03-15T20:39:06Z

Technically this is @rizar's review but hopefully my comments from the peanut gallery are a little helpful to at least one of you 😛

memimo · 2016-04-05T14:46:05Z

@rizar could you please take a look at the high level structure of the PR. I'll take care of the comments of dwf when the high-level is approved.

rizar · 2016-04-05T14:54:49Z

Sure! Will work on it today.

rizar · 2016-04-06T22:31:36Z

Sorry for the delay, I am struggling with the cluster and my course project. I will definitely review it tomorrow.

rizar · 2016-04-07T17:29:39Z

I have three suggestions:

use the roles as the keys in initialization_schemes
initialize parameters using the scheme given for their most specific role
check that the given initialization schemes match the parameter roles of the brick in two places: in the end of constructor and in _push_initialization_config. This will allow to detect errors as early as possible.

memimo · 2016-04-12T19:03:43Z

I reorganized the PR a little bit. Again et's discuss the generality of the PR first, I will fix formatting stuff later.

Some notes:

1-The problem with having the roles as keys is that, roles are objects, and different objects of same class will be hashed to different values. So instead I use the name of the roles as the keys.
2-One major API change is that, if someone wants to manually change the schemes self.weights_init and self.biases_init is not valid anymore and one has to use self.initializations_schems
3-There is a tiny bit difference in the GRU weight initialization values now. I don't believe that will have any practical impact on performance, unless someone has an evidence against it.

rizar · 2016-04-13T17:45:49Z

I think you gave up the idea of a role-indexed dictionary too early. Check https://docs.python.org/2/reference/datamodel.html#object.__hash__ , you should be able to overload this method.
I thought we would support weights_init and biases_init through __getattr__, why not?

dwf · 2016-04-13T18:15:22Z

Indeed,

def __hash__(self):
    return hash(str(self))

should work in the case of roles.

rizar · 2016-04-25T00:22:30Z

blocks/bricks/interfaces.py

+        if self.initialization_schemes is None:
+            self.initialization_schemes = {}
+
+        initialization_to_role = {"weights_init": WEIGHT, 'biases_init': BIAS,


For the record, it looks like you have chosen to only handle a handful of <role>_init keyword arguments, not all of them. This is probably good, because this clearly makes initialization_schemes the primary way of defining the initialization schemes.

But the code clearly lacks a check if key not in initialization_to_role. Since now we have a clearly defined white list, I would probably raise a warning, that calling the arguments smth_init is recommended, but still pass this argument to the super().__init__ call.

rizar · 2016-04-25T01:07:31Z

I think the major design decision that I disagree with is having both initializable_roles and parameter_roles. Neither of the two is fully suitable for controlling how the initialization schemes are spread down the hierarchy. The former is hardcoded, and the latter is known too late, after the parameters are created.

Here is a different design which seems better to me

each Initilizable brick has an attribute parameter_roles that contains all the roles that the brick's parameters will have when constructed
the value of this attribute is defined as follows:
- first, Initializable sets it to [WEIGHT, BIAS] (only [WEIGHT] is not use_bias)
- ... unless the user provides parameter_roles as a constructor argument
- then, after super(Initializable, self).__init__, Initializable._collect_roles is called to add
  the roles of the children to parameter_roles
we are still in Initializable.__init__, but we can already check that initialization_schemes does not contain schemes for the roles that are not in parameter_roles
furthermore, when the parameters are created (that is in allocate), we can check that all the parameters have roles that are present in parameter_roles
when initialization configuration is pushed (Initializable._push_initialization_config), only the schemes for the relevant roles should be pushed to the children. One should be careful here to check that sometimes initialization schemes for roles that are parents of the precise parameter roles should also be pushed.
in initialize, we can repeat the sanity check that we did in constructor, to make sure that user did not provide useless initialization schemes

What do you think?

rizar · 2016-04-25T01:26:37Z

blocks/bricks/interfaces.py

+                        self.parameter_roles.update(set([role]))
+
+    def _initialize(self):
+        for param in self.parameters:


This is a dangerous code. First, if a parameter has more than one role, it is not clear at all how it should be initialized. I would rather raise an exception upon encountering such a situation. Also, we should choose the most specific initialization scheme (e.g. the one for RECURRENT_WEIGHT and not for just WEIGHT if the parameter is RECURRENT_WEIGHT).

rizar · 2016-05-23T23:09:57Z

@memimo , we should resume working on this, and I really hope we can finish it until I leave Montreal on June 8.

memimo · 2016-06-01T18:52:49Z

@rizar could you take a look about the high level design choice?
Travis is falling on block_examples code, because some initilization values have changed. I will take care of that later, if the design is good.

rizar · 2016-06-01T21:05:07Z

blocks/bricks/interfaces.py

+        if self.initialization_schemes is None:
+            self.initialization_schemes = {}
+
+        if parameter_roles:


Hold on, haven't we decided to not have parameter_roles and instead collect roles the actual parameter variables?

I thought it was the other way :p I guess it was a confusion. I'll give it a try, the other way around

memimo · 2016-06-02T18:26:57Z

@rizar ok, so this version collect roles from parameter tags. Will rebase later on.

rizar · 2016-06-06T13:48:59Z

blocks/bricks/interfaces.py

+
+        super(Initializable, self).__init__(**kwargs)
+
+    def _validate_roles(self):


I guess here you propagate initialization schemes does the hierarchy of roles. But it seems to me that you do not choose the most specific role out of those for which an initialization scheme is available. E.g. if we have an init. scheme for PARAMETER and for WEIGHT, and now we are looking for an init. scheme for RECURRENT_WEIGHT, then we should always choose the scheme given for WEIGHT. In the code
below it seems like the choice will depend on the order of the dictionary.

Also, I find it a bit weird that you change initialization_schemes. This might make debugging quite a bit hard. I would rather put the search for the most specific init. scheme directly in initialize.

memimo · 2016-06-06T16:54:43Z

@rizar I updated based on your latest comments.

rizar · 2016-06-06T17:31:39Z

blocks/bricks/interfaces.py

+                self.parameter_roles.update(child.parameter_roles)
+
+    def _initialize(self):
+        def get_scheme(role, schemes):


It does not seem to me that this code will choose the most specific rolefor which there is an intialization scheme.

Here roughly what I think should work:

def get_scheme(role, schemes): candidates = [] for key in schemes: if isinstance(role, key): if all([isinstance(key, c) for c in candidates]): candidates = [c] else: candidates.append(c) if not candidates: raise ValueError("No init scheme") if len(candidates) > 1: raise ValueError("Multiple candidate init schemes")

I'm not sure your code snippet do that either. Let's assume we have this class hierarchy A->B->C
a = A(), b=B(), c=C()
Then:
isinstance(c, A) -- True
type(c) == A -- False

Right, but all I am using is "isinstance", so what's wrong?

On 6 June 2016 at 16:44, Mehdi Mirza [email protected] wrote:

In blocks/bricks/interfaces.py
#1030 (comment):

all_roles = []

for param in obj.parameters:

roles = param.tag.roles

# TODO do something smarter

if len(roles) > 0:

all_roles.append(roles[0])

return all_roles

self.parameter_roles = set(get_param_roles(self))

for child in self.children:

if isinstance(child, Initializable):

child._collect_roles()

self.parameter_roles.update(child.parameter_roles)

def _initialize(self):

def get_scheme(role, schemes):

I'm not sure your code snippet do that either. Let's assume we have this
class hierarchy A->B->C
a = A(), b=B(), c=C()
Then:
isinstance(c, A) -- True
type(c) == A -- False

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/mila-udem/blocks/pull/1030/files/f3fca5343847c265abe7b7c2424d585485476b76#r65965285,
or mute the thread
https://github.com/notifications/unsubscribe/AAn8Yul0FN9mBXfoNDAvqFDqOW5JOviZks5qJIbJgaJpZM4HxS-w
.

You're not going up the hierarchy. Let's say if we have in schemes both C and B. And ideally we should choose the scheme associated with C not B, your code does not do that.

dwf reviewed Mar 15, 2016
View reviewed changes

memimo force-pushed the init branch from 239c8a9 to 54405ce Compare April 12, 2016 18:59

memimo force-pushed the init branch 2 times, most recently from 22b5a09 to faf0d27 Compare April 12, 2016 19:27

memimo force-pushed the init branch from 14470cd to c257c9a Compare April 13, 2016 18:34

rizar reviewed Apr 25, 2016
View reviewed changes

memimo force-pushed the init branch 3 times, most recently from 893a301 to 45d4456 Compare June 1, 2016 17:57

memimo force-pushed the init branch 2 times, most recently from e7763cb to bc87d5a Compare June 1, 2016 18:17

rizar reviewed Jun 1, 2016
View reviewed changes

memimo added 6 commits June 2, 2016 14:38

managing initilizations via a a role scheme dictionary

e8b1def

reorganize initialization

f1ef6ab

support weight_init, biases_init

2f09a35

get rid of initializable_roles

84dff78

collect parameter roles from tag

ecb758f

rebase

e4bbab9

memimo force-pushed the init branch from 442913b to e4bbab9 Compare June 2, 2016 18:53

rizar reviewed Jun 6, 2016
View reviewed changes

move search in role heiarachy to initilaize

f3fca53

rizar reviewed Jun 6, 2016
View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Managing initilizations via a role scheme dictionary #1030

[WIP] Managing initilizations via a role scheme dictionary #1030

memimo commented Mar 15, 2016

dwf Mar 15, 2016

dwf Mar 15, 2016

dwf commented Mar 15, 2016

memimo commented Apr 5, 2016

rizar commented Apr 5, 2016

rizar commented Apr 6, 2016

rizar commented Apr 7, 2016

memimo commented Apr 12, 2016

rizar commented Apr 13, 2016

dwf commented Apr 13, 2016

rizar Apr 25, 2016

rizar commented Apr 25, 2016

rizar Apr 25, 2016

rizar commented May 23, 2016

memimo commented Jun 1, 2016

rizar Jun 1, 2016

memimo Jun 1, 2016

memimo commented Jun 2, 2016

rizar Jun 6, 2016

rizar Jun 6, 2016

memimo commented Jun 6, 2016

rizar Jun 6, 2016 •

edited

Loading

memimo Jun 6, 2016

rizar Jun 6, 2016

memimo Jun 7, 2016


		super(Initializable, self).__init__(**kwargs)

		def _validate_roles(self):

[WIP] Managing initilizations via a role scheme dictionary #1030

Are you sure you want to change the base?

[WIP] Managing initilizations via a role scheme dictionary #1030

Conversation

memimo commented Mar 15, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dwf commented Mar 15, 2016

memimo commented Apr 5, 2016

rizar commented Apr 5, 2016

rizar commented Apr 6, 2016

rizar commented Apr 7, 2016

memimo commented Apr 12, 2016

rizar commented Apr 13, 2016

dwf commented Apr 13, 2016

Choose a reason for hiding this comment

rizar commented Apr 25, 2016

Choose a reason for hiding this comment

rizar commented May 23, 2016

memimo commented Jun 1, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

memimo commented Jun 2, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

memimo commented Jun 6, 2016

rizar Jun 6, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rizar Jun 6, 2016 •

edited

Loading