CNN (convolutional neural network)

  • convolutional layer
  • pooling layer
  • fully connected layer

A very well-organized write-up ...

taewan.kim/post/cnn/

 


Key terms

  • convolution
  • channel
  • filter = kernel
  • stride
  • padding
  • feature map = activation map
  • pooling layer

convolutional layer

  • Why use a CONV layer instead of FC
    • parameter sharing
      • A filter that is useful in one part of the image can also be used in other parts of the image.
      • The spatial information of the image can be preserved.
    • sparsity of connections
      • Each output value depends on only a small number of inputs, so the number of parameters can be reduced.
  • filter
    • There is no longer any need to design filters by hand. In a CNN, the filters are learnable parameters.
      • In the past, vertical edge detectors, horizontal edge detectors, and so on were sometimes built by hand,
        but that is no longer necessary.
    • Filter sizes of 3*3*channel and 5*5*channel are the most common.
  • padding
    • Filling the border of the image with a specific value, usually 0.
    • Purpose of padding
      • To keep the image from shrinking as it passes through layers
      • To let information at the border contribute more to the computation
    • Types of padding
      • valid padding : no padding
      • same padding : pad so that the output size equals the input size
  • stride
    • The interval at which the filter traverses the image
      • stride = 1 : perform the convolution while moving the filter one cell at a time
      • stride = 2 : perform the convolution while moving the filter two cells at a time
  • Dimensions
    • Suppose a convolutional layer is applied at layer l.
      • F^[l] : filter size
      • P^[l] : padding size
      • S^[l] : stride size
      • C^[l] : number of filters
      • input : H^[l-1]*W^[l-1]*C^[l-1]
      • output : H^[l]*W^[l]*C^[l]
      • filter: C^[l] filters, each of size F^[l]*F^[l]*C^[l-1]
  • $$ H^{[l]} = \lfloor\frac{H^{[l-1]}+2P^{[l]}-F^{[l]}}{S^{[l]}}+1\rfloor $$
  • $$ W^{[l]} = \lfloor\frac{W^{[l-1]}+2P^{[l]}-F^{[l]}}{S^{[l]}}+1\rfloor $$
  • $$ \text{number of parameters} = (F^{[l]} \times F^{[l]} \times C^{[l-1]} + 1) \times C^{[l]} $$
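
The output-size and parameter-count formulas above can be checked with a few lines of Python. This is only a minimal sketch: the function names `conv_output_size`, `conv_num_params`, and `fc_num_params`, as well as the 32*32*3 input with six 5*5 filters, are assumptions chosen for illustration and do not come from the original notes.

```python
def conv_output_size(in_size, f, p, s):
    """floor((in_size + 2P - F) / S + 1) for one spatial dimension."""
    return (in_size + 2 * p - f) // s + 1


def conv_num_params(f, c_in, c_out):
    """(F * F * C_in + 1) * C_out; the +1 is one bias per filter."""
    return (f * f * c_in + 1) * c_out


def fc_num_params(n_in, n_out):
    """Weights plus biases of a fully connected layer, for comparison."""
    return n_in * n_out + n_out


# Hypothetical example: 32*32*3 input, six 5*5 filters, valid padding, stride 1.
h = conv_output_size(32, f=5, p=0, s=1)
w = conv_output_size(32, f=5, p=0, s=1)
print(h, w, 6)                    # 28 28 6
print(conv_num_params(5, 3, 6))   # 456

# Same padding with a 3*3 filter and stride 1 needs P = 1.
print(conv_output_size(32, f=3, p=1, s=1))  # 32

# An FC layer mapping the same input volume to the same number of outputs.
print(fc_num_params(32 * 32 * 3, 28 * 28 * 6))  # 14455392
```

The gap between 456 and roughly 14.5 million parameters illustrates what parameter sharing and sparse connections buy in this example.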

pooling layer

  • There are no learnable parameters (parameter = 0)
  • The number of channels is the same for input and output.
  • pooling 
    • max pooling
    • average pooling
  • Dimensions
    • F : filter size
    • S : stride
    • IH : input height, IW : input width, IC : input channel
    • OH : output height, OW : output width, OC : output channel
    • OH = floor( (IH-F)/S+1 )
    • OW = floor( (IW-F)/S+1 )
    • IC = OC
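
The pooling dimensions above can be illustrated with a small pure-Python sketch that pools a single channel. The function name `pool2d`, its `mode` argument, and the 4*4 sample values are hypothetical; a real framework would operate on tensors instead of nested lists.

```python
def pool2d(x, f, s, mode="max"):
    """Pool one channel (a list of rows) with an F*F window and stride S."""
    ih, iw = len(x), len(x[0])
    oh = (ih - f) // s + 1  # OH = floor((IH - F) / S + 1)
    ow = (iw - f) // s + 1  # OW = floor((IW - F) / S + 1)
    out = []
    for i in range(oh):
        row = []
        for j in range(ow):
            # Collect the F*F values covered by the window at (i, j).
            window = [x[i * s + di][j * s + dj]
                      for di in range(f) for dj in range(f)]
            if mode == "max":
                row.append(max(window))
            else:  # any other mode is treated as average pooling
                row.append(sum(window) / len(window))
        out.append(row)
    return out


# Hypothetical 4*4 single-channel input.
channel = [[1, 3, 2, 4],
           [5, 6, 7, 8],
           [3, 2, 1, 0],
           [1, 2, 3, 4]]

print(pool2d(channel, f=2, s=2))              # [[6, 8], [3, 4]]
print(pool2d(channel, f=2, s=2, mode="avg"))  # [[3.75, 5.25], [2.0, 2.0]]
```

Applying the same function to each channel separately leaves the channel count unchanged (IC = OC), and nothing in it is learned.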

fully connected layer

  • Flatten the input, then apply FC
  • The number of units in the last FC layer must match the number of classes.
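
The flatten-then-FC step can be sketched as follows. The helper names `flatten` and `fully_connected`, the 2*2*1 feature map, and the weights for a 3-class output are all made up for illustration; in a trained network the weights would be learned.

```python
def flatten(volume):
    """Flatten an H*W*C nested list into a single list of values."""
    return [v for row in volume for pixel in row for v in pixel]


def fully_connected(x, weights, biases):
    """Return one output per row of `weights` (one row per unit)."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


# Hypothetical 2*2*1 feature map and a last FC layer with 3 units (3 classes).
feature_map = [[[1.0], [2.0]],
               [[3.0], [4.0]]]
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 1.0],
           [1.0, 1.0, 1.0, 1.0]]
biases = [0.0, 0.5, -1.0]

x = flatten(feature_map)
scores = fully_connected(x, weights, biases)
print(x)       # [1.0, 2.0, 3.0, 4.0]
print(scores)  # [1.0, 6.5, 9.0]
```

The output has one score per class because the weight matrix has one row per unit in the last layer.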

 
