Remove Explicit Parameter Tracking #11

Open
rizar opened this issue Sep 24, 2014 · 2 comments

rizar commented Sep 24, 2014

In Groundhog, every layer currently has self.params: a list of the parameters its output depends on. As Jan thoughtfully pointed out about a month ago, this is not necessary, since they can all be retrieved by traversing the computation graph. The elements of self.params_grad_scale should then be attached to the parameters themselves, which could probably be done by subclassing the shared variable class (the problem is that it is not quite clear what to subclass...).
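A minimal sketch of the graph-traversal idea, written against toy classes rather than Theano itself. `Variable`, `SharedVariable`, `Apply`, `apply_op`, and `collect_params` are all hypothetical names made up for illustration; they only mimic the way a graph variable points back to the node that produced it.

```python
class Variable:
    """Toy graph variable; `owner` is the node that produced it (None for leaves)."""
    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner


class SharedVariable(Variable):
    """Toy stand-in for a shared variable that holds a parameter."""


class Apply:
    """Toy node recording which variables an operation consumed."""
    def __init__(self, inputs):
        self.inputs = list(inputs)


def apply_op(name, *inputs):
    # Build an output variable whose owner remembers its inputs.
    return Variable(name, owner=Apply(inputs))


def collect_params(outputs):
    """Walk the graph backwards from `outputs` and return every shared variable."""
    seen, params, stack = set(), [], list(outputs)
    while stack:
        var = stack.pop()
        if id(var) in seen:
            continue
        seen.add(id(var))
        if isinstance(var, SharedVariable):
            params.append(var)
        elif var.owner is not None:
            stack.extend(var.owner.inputs)
    return params


x = Variable("x")
W1, b1, W2 = SharedVariable("W1"), SharedVariable("b1"), SharedVariable("W2")
h = apply_op("h", x, W1, b1)
y = apply_op("y", h, W2)
param_names = sorted(p.name for p in collect_params([y]))
```

With a traversal like this, a layer would not need to maintain its own parameter list; the list falls out of the output expression.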

rizar changed the title from "Remove explicit param tracking" to "Remove explicit parameter tracking" on Sep 24, 2014

janchorowski commented

All Theano expressions (shared variables and regular expressions) have a tag attribute to which we can add such information.
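A rough illustration of how a free-form tag could carry per-parameter settings. This does not use Theano; `Param` and `grad_scale_of` are hypothetical, and `SimpleNamespace` merely plays the role of an attribute bag that accepts arbitrary fields.

```python
from types import SimpleNamespace


class Param:
    """Toy parameter with a free-form `tag` for extra annotations."""
    def __init__(self, name):
        self.name = name
        self.tag = SimpleNamespace()


def grad_scale_of(param, default=1.0):
    # Parameters without an explicit annotation fall back to the default.
    return getattr(param.tag, "grad_scale", default)


W, b = Param("W"), Param("b")
W.tag.grad_scale = 0.5  # annotate only the parameter that needs it
scales = {p.name: grad_scale_of(p) for p in (W, b)}
```

The appeal of this approach is that no subclassing is needed: the annotation travels with the variable, and code that does not care about it never sees it.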

Another option I am testing right now is to decouple the computation from optimization tricks (gradient scaling) and regularization (weight decay, column norms). Since parameters have unique and often meaningful names, it is easy to write regexps or something similar to set rules such as: all weights of layer X are decayed by...
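One way such name-based rules might look, as a sketch only. The `layer.param` naming scheme, the `RULES` table, the setting keys, and `settings_for` are all assumptions made for this example, not anything Groundhog defines.

```python
import re

# Each rule pairs a pattern over parameter names with the settings it assigns.
# Names are assumed to look like "<layer>.<param>", e.g. "layer1.W".
RULES = [
    (re.compile(r"^layer1\.W$"), {"weight_decay": 1e-3}),
    (re.compile(r"\.b$"), {"grad_scale": 2.0}),
]


def settings_for(name, rules=RULES):
    """Merge the settings of every rule whose pattern matches `name`."""
    settings = {}
    for pattern, values in rules:
        if pattern.search(name):
            settings.update(values)  # later rules override earlier ones
    return settings
```

Here `settings_for("layer1.W")` would pick up the weight decay rule, while a name that matches nothing gets an empty dict, so the model code itself stays free of optimization details.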

What do you think?

rizar commented Sep 28, 2014

In general I like the idea. However, the question arises of where all this information (gradient scaling constants, weight decay constants, etc.) should be stored. I still think that layers are good candidates for that.

We could do it like this:

  • every annotated variable keeps a reference to the layer whose output it is, just as Theano variables refer to their Apply nodes
  • we provide the user with a simple recursive function that scans the computation graph and returns all layers used in it
  • the user can select layers in whatever way they want and apply modifications to them

For instance (GH stands for groundhog):

x = TT.matrix('x')
h1 = GH.FeedForwardLayer(nin=784, nout=500, ...., name="layer1")(x)
# nin has to match the previous layer's nout (500)
h2 = GH.FeedForwardLayer(nin=500, nout=10, ..., name="layer2")(h1)
probs = GH.SoftmaxLayer(..., name="softmax")(h2)
...
# pick the single layer named "softmax" among all layers used to compute probs
softmax, = filter(lambda layer: layer.name == "softmax", GH.get_layers(probs))
softmax.weight_decay_coef = 0.001
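For concreteness, here is one way the proposed `get_layers` helper could work, sketched with toy classes. `Layer` and `AnnotatedVariable` are hypothetical stand-ins, not Groundhog classes, and real layers would of course also build Theano expressions.

```python
class AnnotatedVariable:
    """Toy variable that remembers the layer that produced it and its inputs."""
    def __init__(self, layer=None, inputs=()):
        self.layer = layer
        self.inputs = tuple(inputs)


class Layer:
    """Toy layer; calling it returns a variable annotated with the layer."""
    def __init__(self, name):
        self.name = name

    def __call__(self, *inputs):
        return AnnotatedVariable(layer=self, inputs=inputs)


def get_layers(output):
    """Recursively collect every layer used to compute `output`, inputs first."""
    layers, seen = [], set()

    def visit(var):
        if id(var) in seen:
            return
        seen.add(id(var))
        for parent in var.inputs:
            visit(parent)
        # A layer applied several times is reported only once.
        if var.layer is not None and all(var.layer is not l for l in layers):
            layers.append(var.layer)

    visit(output)
    return layers


x = AnnotatedVariable()
h1 = Layer("layer1")(x)
h2 = Layer("layer2")(h1)
probs = Layer("softmax")(h2)
layer_names = [layer.name for layer in get_layers(probs)]
```

Because the result is a plain list of layer objects, the user can filter it by name, type, or anything else before modifying the selected layers.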

rizar changed the title from "Remove explicit parameter tracking" to "Remove Explicit Parameter Tracking" on Sep 28, 2014