-
Notifications
You must be signed in to change notification settings - Fork 0
/
caffeNotes.txt
57 lines (29 loc) · 2.15 KB
/
caffeNotes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
Nvidia Switching:
sudo update-alternatives --config x86_64-linux-gnu_gl_conf
sudo ldconfig -n
sudo update-initramfs -u
CAFFE:
Always switch to test mode while testing or use test_nets (more that one test nets are supported).
test nets have to be declared at the solver
If we are coping the weights or resume training by loading weights we need to explicitly inform
caffe that the test and train nets share weights.
we need not have a seperate include thing in net.prototxt with train and test phase to use the
test_nets feature. use it when you need to use different data for the phases
Use python layers for simple tasks
HDF5 layer dont support data transformation
In contrastive loss in caffe 1-similar images (which was 0 in the original paper)
Tied weight change when u change just one of them
Contrastive loss in caffe currently has no reshape which led to error when we reshape just the input layer--- we can make changes and commmit
Caffe uses BGR because of openCV -- also BGR
Using weights in loss only affect the back prop and not the forward prop
PYTHON:
single underscore: is for programmers private use not loaded while using import
Double underscore:
Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped.
Name mangling is intended to give classes an easy way to define “private” instance variables and methods, without having to worry about instance variables defined by derived classes, or mucking with instance variables by code outside the class.
KERAS:
Keras has differnet tensor dimension format tf and th dont mix them, see the source code for details
DL:
Batchnormalization before activation: (Some contradication cases available for isolated cases)
The general use case is to use BN between the linear and non-linear layers in your network, because it normalizes the input to your activation function, so that you're centered in the linear section of the activation function
Batchnormalization before adding in resnet according to http://torch.ch/blog/2016/02/04/resnets.html