Understanding tf.layers.dense (tf.tensordot, tf.matmul)

2020. 4. 24. 13:39 · Analysis Python/Tensorflow

I mainly use TensorFlow.

So when I need a fully connected layer, I usually define the weight and bias myself, then multiply and add them by hand.

w = tf.get_variable("w" , [in_dim,out_dim])
b = tf.get_variable("b" , [out_dim])
logit = tf.matmul(x,w)+b

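The reason is that when I want to do something specific to the weight itself (spectral norm, for example), the high-level API tf.layers.dense is a poor fit, because it creates the weight internally.

Recently I happened to be working with a 3-dimensional tensor, was surprised to see that tf.layers.dense just worked on it, and looked into what it does internally.

First, let's make a 3-dimensional input.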

import tensorflow as tf  # TensorFlow 1.x API
config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.9
sess = tf.Session(config=config)
# 3-D input of shape (batch, 3, 6); the batch size is left unspecified
a = tf.placeholder(tf.float32, (None, 3, 6))
print(a.get_shape())
## (?, 3, 6)

Now, what happens if we apply tf.layers.dense to it?

b = tf.layers.dense(a,10)
print(b.get_shape())

!!!!

The shape went from (?,3,6) to (?,3,10).

With a plain tf.matmul against a 2-D weight, I would not expect it to work like this!

So I dug through the tf.layers.dense source code and found that it uses tensordot.

a = tf.constant([[[1,1,1,1],
                  [2,2,2,2],
                  [1,1,1,1]],
                 [[1,1,1,1],
                  [2,2,2,2],
                  [1,1,1,1]]], dtype=tf.float32)
## a shape (2,3,4)
c = tf.Variable(tf.random.normal([4,10]))  # weight
d = tf.Variable(tf.random.normal([10]))    # bias
rank = len(a.get_shape())
# contract the last axis of a with the first axis of c
e = tf.tensordot(a, c, [[rank-1],[0]])
## e shape (2,3,10)
f = tf.nn.bias_add(e, d)
print(f)
init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
sess.run(init)
print(sess.run(f))

Looking at the output, the rows of a that are all 1s produce identical output rows, and the rows that are all 2s likewise match each other.

In other words, [[1,1,1,1]] (x) * [[shape (4,10)]] (w) = [[shape (1,10)]] (logit)
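The same behavior can be checked without a TensorFlow session. The following is a minimal sketch that uses NumPy as a stand-in for TensorFlow (an assumption made for illustration; the weight values are arbitrary and the bias is left out). It shows that contracting the last axis is the same as multiplying each length-4 row by w on its own.

```python
import numpy as np

# Same input as above: shape (2, 3, 4)
a = np.array([[[1, 1, 1, 1],
               [2, 2, 2, 2],
               [1, 1, 1, 1]],
              [[1, 1, 1, 1],
               [2, 2, 2, 2],
               [1, 1, 1, 1]]], dtype=np.float64)

# Hypothetical weight of shape (4, 10); any values would do
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 10))

# Contract the last axis of a with the first axis of w
e = np.tensordot(a, w, axes=[[a.ndim - 1], [0]])
print(e.shape)  # (2, 3, 10)

# Every position (i, j) is just that one row vector times w
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        assert np.allclose(e[i, j], a[i, j] @ w)

# Identical input rows give identical output rows
same_for_ones = np.allclose(e[0, 0], e[0, 2]) and np.allclose(e[0, 0], e[1, 0])
same_for_twos = np.allclose(e[0, 1], e[1, 1])
print(same_for_ones, same_for_twos)
```

Since the bias is omitted here, the all-2s rows come out as exactly twice the all-1s rows; with a bias added they would still match each other, just not by a factor of two.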

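Hmm, how to put it... anyway, if the input has n dimensions in total, only the last (n-th) dimension is contracted with the weight, and the other n-1 dimensions are carried through unchanged.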

For example:

(None, 3, 5, 6) * (6, 10) -> (None, 3, 5, 10)

import numpy as np
a = tf.constant(np.array([1]*3*5*6).reshape(-1,3,5,6), dtype = tf.float32)
## (1, 3, 5, 6)
c = tf.Variable(tf.random.normal([6,10]))
rank = len(a.get_shape())
e = tf.tensordot(a, c , [[rank-1],[0]])
print(e.get_shape())
## (1, 3, 5, 10)
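As a cross-check on this shape rule, here is a small sketch, again in NumPy rather than TensorFlow (an assumption for illustration, not what tf.layers.dense literally executes). It suggests that contracting the last axis gives the same result as flattening all leading axes, doing an ordinary 2-D matrix multiply, and reshaping back.

```python
import numpy as np

# Input of shape (1, 3, 5, 6), all ones as in the example above
a = np.ones((1, 3, 5, 6))

# Hypothetical weight of shape (6, 10)
rng = np.random.default_rng(0)
w = rng.normal(size=(6, 10))

# Route 1: contract the last axis of a with the first axis of w
via_tensordot = np.tensordot(a, w, axes=[[a.ndim - 1], [0]])

# Route 2: flatten leading axes -> 2-D matmul -> restore leading axes
flat = a.reshape(-1, a.shape[-1])                    # (15, 6)
out_shape = a.shape[:-1] + (w.shape[1],)             # (1, 3, 5, 10)
via_matmul = (flat @ w).reshape(out_shape)

print(via_tensordot.shape)  # (1, 3, 5, 10)
routes_agree = np.allclose(via_tensordot, via_matmul)
print(routes_agree)
```

This is why the weight can stay 2-D no matter how many leading dimensions the input has.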

https://www.tensorflow.org/api_docs/python/tf/tensordot

I have an intuition for what is going on, but I think the concrete computation looks roughly like the following. (I'm no good at math.)
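One way to write that computation down is in index notation. The symbols are assumptions introduced here for illustration and do not come from the TensorFlow docs: a is a 3-D input of shape (B, T, D), W is a weight of shape (D, K), b is a bias of shape (K), and f is the output of shape (B, T, K).

```latex
f_{ijk} = \sum_{l=1}^{D} a_{ijl} \, W_{lk} + b_k
```

The indices i and j are only carried along, and the sum runs over the last axis l alone. For the (2,3,4) example above, a row of all 1s gives $f_{ijk} = \sum_{l} W_{lk} + b_k$ and a row of all 2s gives $f_{ijk} = 2\sum_{l} W_{lk} + b_k$, which is consistent with identical input rows producing identical output rows.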
