[Python/pytorch] torch.nn.Conv2d()

BAN2ARU

밴나루 2022. 4. 21. 16:09

CLASS
torch.nn.Conv2d(
	in_channels, out_channels, kernel_size, stride=1, padding=0,
	dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None
)

The required parameters of Conv2d are in_channels, out_channels, and kernel_size.

This module applies a 2D convolution to its input. If the input size is $(N,C_{in},H,W)$, the output size is $(N, C_{out}, H_{out}, W_{out})$, and the input and output are related as follows.

$$ out(N_i, C_{out_j}) = bias(C_{out_j}) + \sum_{k=0}^{C_{in}-1} weight(C_{out_j}, k) \star input(N_i, k) $$

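Here, $\star$ is the 2D convolution operation (strictly, PyTorch computes a valid 2D cross-correlation, i.e. the kernel is not flipped), $N$ is the batch size, $C$ is the number of channels, $H$ is the height, and $W$ is the width. (see [6])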
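The formula above can be checked numerically. The following is a minimal sketch, assuming PyTorch is installed; the tensor sizes and the channel index `j` are arbitrary choices for illustration. It rebuilds one output channel by summing the per-input-channel results and adding the bias.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

conv = nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3)
x = torch.randn(2, 3, 8, 8)      # (N, C_in, H, W)
y = conv(x)                      # (N, C_out, H_out, W_out)
print(y.shape)                   # torch.Size([2, 4, 6, 6])

# Rebuild output channel j by hand:
# out(N_i, C_out_j) = bias(C_out_j) + sum_k weight(C_out_j, k) * input(N_i, k)
j = 1
manual = conv.bias[j] + sum(
    F.conv2d(x[:, k:k + 1], conv.weight[j:j + 1, k:k + 1])  # one input channel at a time
    for k in range(conv.in_channels)
)                                # shape (N, 1, H_out, W_out)

matches = torch.allclose(manual[:, 0], y[:, j], atol=1e-5)
print(matches)
```

The comparison uses a small tolerance because the manual sum adds the terms in a different order than the built-in kernel.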

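stride is a single number or a tuple; it is the step size of the kernel as the convolution moves over the input.
padding controls how much padding is applied to the input. It can be a string, 'valid' or 'same', or an int / tuple of ints giving the amount of padding applied to both sides. ['valid' means no padding; 'same' pads so that the output has the same spatial size as the input, and only supports stride=1]
dilation is the spacing between kernel elements, also known as the atrous (à trous) algorithm. (see [1])
groups controls the connections between input channels and output channels; in_channels and out_channels must both be divisible by groups. The options are as follows (see [2], [3])
- groups=1: every input channel is convolved with every output channel. (standard convolution)
- groups=2: the input channels are split into 2 groups, each group is convolved separately, and the results are concatenated.
- groups=in_channels: each input channel is convolved with its own set of filters, of size $\frac{out\_channels}{in\_channels}$.
kernel_size, stride, padding, and dilation can each be given as a single int or as a tuple of two ints.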
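The effect of these options is easiest to see from tensor shapes. The following is a small sketch, assuming PyTorch 1.9 or later (for string padding); the channel counts and the 10x10 input are arbitrary illustration values.

```python
import torch
import torch.nn as nn

# groups: the weight has shape (out_channels, in_channels // groups, kH, kW)
weight_shapes = {}
for groups in (1, 2, 4):
    conv = nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, groups=groups)
    weight_shapes[groups] = tuple(conv.weight.shape)
    print(groups, weight_shapes[groups])
# 1 (8, 4, 3, 3)
# 2 (8, 2, 3, 3)
# 4 (8, 1, 3, 3)

x = torch.randn(1, 4, 10, 10)

# padding='valid' adds no padding, padding='same' keeps H and W unchanged
valid_shape = tuple(nn.Conv2d(4, 8, 3, padding='valid')(x).shape)   # (1, 8, 8, 8)
same_shape = tuple(nn.Conv2d(4, 8, 3, padding='same')(x).shape)     # (1, 8, 10, 10)

# dilation=2 spreads a 3x3 kernel over a 5x5 window
dilated_shape = tuple(nn.Conv2d(4, 8, 3, dilation=2)(x).shape)      # (1, 8, 6, 6)

# stride=2 moves the kernel two pixels per step
strided_shape = tuple(nn.Conv2d(4, 8, 3, stride=2)(x).shape)        # (1, 8, 4, 4)

print(valid_shape, same_shape, dilated_shape, strided_shape)
```

With groups=4 and in_channels=4, each input channel gets its own 8 / 4 = 2 filters, which is the $\frac{out\_channels}{in\_channels}$ case described above.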

Parameters

in_channels (int) - Number of channels in the input image
out_channels (int) - Number of channels produced by the convolution
kernel_size (int or tuple) - Size of the convolution kernel
stride (int or tuple, optional) - Stride of the convolution; default is 1
padding (int, tuple or str, optional) - Padding added to all four sides of the input; default is 0
padding_mode (string, optional) - One of 'zeros', 'reflect', 'replicate', 'circular'; default is 'zeros'
- zeros is zero filling: the padded area is filled with 0
- reflect is mirror filling: values are mirrored across the border of the matrix, without repeating the edge value itself ([4])
- replicate is copy filling: the value at the edge of the matrix is copied
- circular is circular filling: elements from the opposite side of the matrix border are repeated, wrapping around ([5])
dilation (int or tuple, optional) - Spacing between kernel elements; default is 1
groups (int, optional) - Number of blocked connections from input channels to output channels; default is 1
bias (bool, optional) - If True, adds a learnable bias to the output; default is True.
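The four padding modes can be compared on a tiny 1D example. This is a sketch using `torch.nn.functional.pad` rather than Conv2d itself, on the assumption that it shows the same filling behaviour; note that `F.pad` names the zero-filling mode 'constant', while Conv2d calls it 'zeros'. The input values are arbitrary.

```python
import torch
import torch.nn.functional as F

# One row of four values, shaped (batch, channel, width)
x = torch.tensor([[[1., 2., 3., 4.]]])
pad = (2, 2)  # two elements on the left, two on the right

padded = {
    'zeros': F.pad(x, pad, mode='constant', value=0.0),
    'reflect': F.pad(x, pad, mode='reflect'),
    'replicate': F.pad(x, pad, mode='replicate'),
    'circular': F.pad(x, pad, mode='circular'),
}

for name, out in padded.items():
    print(f"{name:9s}", out.flatten().tolist())
# zeros     [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
# reflect   [3.0, 2.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
# replicate [1.0, 1.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0]
# circular  [3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0]
```

In the reflect row the edge values 1 and 4 are not duplicated, whereas replicate repeats them and circular wraps around to the other end.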

Shape

Input : $(N, C_{in}, H_{in}, W_{in})$
Output : $(N, C_{out}, H_{out}, W_{out})$

$$ H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 \right\rfloor $$

$$ W_{out} = \left\lfloor \frac{W_{in} + 2 \times padding[1] - dilation[1] \times (kernel\_size[1] - 1) - 1}{stride[1]} + 1 \right\rfloor $$
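The output-size formulas can be written as a small helper. This is a sketch in plain Python; `conv2d_output_size` and `_pair` are hypothetical names made up for this example (not part of PyTorch), and only integer or tuple padding is handled, not the 'valid' / 'same' strings.

```python
def _pair(value):
    """Expand a single int to (value, value); leave tuples as they are."""
    return (value, value) if isinstance(value, int) else tuple(value)


def conv2d_output_size(h_in, w_in, kernel_size, stride=1, padding=0, dilation=1):
    """Return (H_out, W_out) following the Conv2d shape formulas."""
    k, s, p, d = map(_pair, (kernel_size, stride, padding, dilation))
    # Integer floor division implements the floor; the +1 is an integer,
    # so it can be added after the division.
    h_out = (h_in + 2 * p[0] - d[0] * (k[0] - 1) - 1) // s[0] + 1
    w_out = (w_in + 2 * p[1] - d[1] * (k[1] - 1) - 1) // s[1] + 1
    return h_out, w_out


print(conv2d_output_size(32, 32, kernel_size=3, stride=2, padding=1))  # (16, 16)
print(conv2d_output_size(28, 28, kernel_size=5))                       # (24, 24)
print(conv2d_output_size(10, 12, kernel_size=(3, 5), dilation=2))      # (6, 4)
```

For example, with a 32x32 input, kernel 3, stride 2, padding 1: $\lfloor (32 + 2 - 2 - 1) / 2 + 1 \rfloor = \lfloor 16.5 \rfloor = 16$.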

Reference

[0] https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

[1] https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md

[2] https://sanghyu.tistory.com/24

[3] https://discuss.pytorch.org/t/conv2d-certain-values-for-groups-and-out-channels-dont-work/14228/2

[4] https://newsight.tistory.com/301

[5] https://www.codestudyblog.com/cs2201py/40120041328.html

[6] https://excelsior-cjh.tistory.com/180
