def get_reconstruction_cost(self, updates, pre_sigmoid_nv):
"""Approximation to the reconstruction error
Note that this function requires the pre-sigmoid activation as
input. To understand why this is so you need to understand a
bit about how Theano works. Whenever you compile a Theano
function, the computational graph that you pass as input gets
optimized for speed and stability. This is done by changing
several parts of the subgraphs with others. One such
optimization expresses terms of the form log(sigmoid(x)) in
terms of softplus. We need this optimization for the
cross-entropy since sigmoid of numbers larger than 30. (or
even less then that) turn to 1. and numbers smaller than
-30. turn to 0 which in terms will force theano to compute
log(0) and therefore we will get either -inf or NaN as
cost. If the value is expressed in terms of softplus we do not
get this undesirable behaviour. This optimization usually
works fine, but here we have a special case. The sigmoid is
applied inside the scan op, while the log is
outside. Therefore Theano will only see log(scan(..)) instead
of log(sigmoid(..)) and will not apply the wanted
optimization. We can not go and replace the sigmoid in scan
with something else also, because this only needs to be done
on the last step. Therefore the easiest and more efficient way
is to get also the pre-sigmoid activation as an output of
scan, and apply both the log and sigmoid outside scan such
that Theano can catch and optimize the expression.
"""
cross_entropy = T.mean(
T.sum(
self.input * T.log(T.nnet.sigmoid(pre_sigmoid_nv)) +
(1 - self.input) * T.log(1 - T.nnet.sigmoid(pre_sigmoid_nv)),
axis=1
)
)
return cross_entropy
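# A minimal standalone sketch (plain NumPy, not part of the tutorial class),
# illustrating the numerical issue described in the docstring above: in
# float32, sigmoid(30.) already rounds to exactly 1., so the naive term
# log(1 - sigmoid(x)) evaluates to log(0.) = -inf, while the algebraically
# identical softplus form -log(1 + exp(x)) stays finite for this input.
# (Production softplus implementations additionally guard against exp()
# overflow for very large x.)
import numpy as np

x = np.asarray([30.0], dtype=np.float32)
sig = 1.0 / (1.0 + np.exp(-x))        # rounds to exactly 1.0 in float32
naive = np.log(1.0 - sig)             # log(0.) -> -inf (with a RuntimeWarning)
stable = -np.log1p(np.exp(x))         # -softplus(x) -> approx. -30., finite
print(naive, stable)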