Face Recognition

  • what is face recognition
  • one-shot learning
  • siamese network
  • triplet loss
  • face verification and binary classification

1. what is face recognition

  • face verification
    • A 1:1 problem: check whether a person's face matches a claimed identity.
    • Takes a face image and a claimed ID as input, and outputs whether the image belongs to that ID.
  • face recognition
    • A 1:K problem: check whether a person is one of the K members of a group.
    • Takes a face image as input and outputs whether the person belongs to the group.
    • Harder than face verification, because one query has to be compared against K people.

2. one-shot learning

Is it acceptable to train a face recognition model the conventional way?

With a conventional model (input-CNN-output), adding a single new member to the group means the model has to be retrained,

because the number of output nodes has to grow by one.

On top of that, only a few photos are available per person, so training such a model is far from easy.

So this does not look like a good approach.

 

So how should the model be trained instead?

By learning a similarity function.

Learn a function d(image_1, image_2) = degree of difference between images,

and decide that the two images show the same person if d(img1, img2) is at or below a threshold, and different people if it is above the threshold. Because d measures difference, a small value means the faces are alike.
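
As a minimal sketch of this decision rule, assume d is the squared Euclidean distance between two already-computed encoding vectors and that the threshold `tau` is a hypothetical value picked for illustration (neither choice is fixed by the notes above):

```python
import numpy as np

def d(enc_1, enc_2):
    """Degree of difference: squared Euclidean distance (an assumption)."""
    diff = np.asarray(enc_1, dtype=float) - np.asarray(enc_2, dtype=float)
    return float(np.dot(diff, diff))

def is_same_person(enc_1, enc_2, tau=0.5):
    """Small d means same person, large d means different people. tau is hypothetical."""
    return d(enc_1, enc_2) <= tau
```

In practice the threshold would be tuned on held-out pairs rather than fixed by hand.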

 

 


3. siamese network

coursera - Andrew Ng

For face recognition, both images are passed through the same model f, which is trained so that

$$ \begin{cases}\text{minimize } \|f(x^{(i)})-f(x^{(j)})\| & \text{if } x^{(i)}, x^{(j)} \text{ are the same person}\\ \text{maximize } \|f(x^{(i)})-f(x^{(j)})\| & \text{if } x^{(i)}, x^{(j)} \text{ are different people}\end{cases} $$

If the two images show the same person, the distance between the vectors encoded by the model should shrink;

if they show different people, the distance between the encoded vectors should grow.
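
A minimal NumPy sketch of the siamese idea: the same encoder f, with the same weights, is applied to both images, and the distance is taken between the two encodings. The single linear layer standing in for the CNN, the 64x64 input size, and the 128-dimensional encoding are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the CNN: one shared weight matrix (hypothetical, untrained).
W = rng.normal(scale=0.01, size=(128, 64 * 64))

def f(image):
    """Encode a 64x64 grayscale image into a 128-d vector with shared weights W."""
    return W @ np.asarray(image, dtype=float).reshape(-1)

def distance(image_i, image_j):
    """||f(x_i) - f(x_j)||: both images go through the same encoder f."""
    return float(np.linalg.norm(f(image_i) - f(image_j)))

x_i = rng.random((64, 64))
x_j = rng.random((64, 64))
print(distance(x_i, x_i))  # identical inputs give distance 0.0
print(distance(x_i, x_j))  # a positive value for different inputs
```

Training would adjust W (in a real model, the CNN parameters) so that this distance is small for same-person pairs and large otherwise.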

 

 


4. triplet loss

As described above, the model is trained so that the distance between encoding vectors shrinks when the images show the same person

and grows when they show different people.

Which loss should be used to train such a model?

 

 

 

The answer is the triplet loss.

Here the Anchor and the Positive are photos of the same person, and the Negative is a photo of a different person.

Writing the Anchor as A, the Positive as P, and the Negative as N, the triplet loss is defined as follows.

 

$$ L(A, P, N) = \max\left(\|f(A)-f(P)\|^2 - \|f(A)-f(N)\|^2 + \alpha,\ 0\right) $$
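
The formula can be sketched for a single triplet in NumPy. The encodings are passed in directly, and the default `alpha=0.2` is a hypothetical value chosen for illustration:

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha, 0) for one triplet."""
    f_a, f_p, f_n = (np.asarray(v, dtype=float) for v in (f_a, f_p, f_n))
    pos_dist = np.sum((f_a - f_p) ** 2)
    neg_dist = np.sum((f_a - f_n) ** 2)
    return float(max(pos_dist - neg_dist + alpha, 0.0))

# Easy triplet: the negative is already far away, so the loss is clamped to 0.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0]))
# Hard triplet: the negative is about as close as the positive, so the loss is positive.
print(triplet_loss([0.0, 0.0], [0.5, 0.0], [0.5, 0.1]))
```

Over a training set, the per-triplet losses would be summed or averaged before taking gradients.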

 

Why is alpha (the margin) needed?

First, it forces ||f(A)-f(N)||^2 to exceed ||f(A)-f(P)||^2 by at least a fixed margin (alpha).

Second, it rules out the trivial solution in which f(A), f(P), and f(N) all come out nearly identical, which would satisfy ||f(A)-f(P)||^2 <= ||f(A)-f(N)||^2 without the model learning anything.

In other words, it is a constraint that pushes the model to produce distinct encoding vectors for the images.
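
To make the second point concrete, suppose (purely as an illustration) that the encoder collapses so that f(A) = f(P) = f(N). Both squared distances are then 0, and the triplet loss becomes

$$ L(A, P, N) = \max(0 - 0 + \alpha,\ 0) = \alpha $$

With alpha > 0 the collapsed solution still pays a loss of alpha on every triplet, whereas with alpha = 0 it would reach zero loss and look like a perfect solution.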

 

 

Why take max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha,  0)?

When ||f(A)-f(P)||^2 + alpha is smaller than ||f(A)-f(N)||^2, the first argument is negative, so the max limits the minimum value of the loss to 0.

If A, P, and N are chosen at random, ||f(A)-f(P)||^2 + alpha <= ||f(A)-f(N)||^2 is usually easy to satisfy.

In that case the loss is 0, so the parameters learn nothing from the data.

It is therefore important to pick (A, P, N) triplets for which ||f(A)-f(P)||^2 + alpha > ||f(A)-f(N)||^2 can hold, that is, triplets the model is likely to confuse.
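
One way to sketch this selection is to keep only the candidate triplets whose loss is still positive. This assumes the encodings for a batch have already been computed and that candidates are given as index tuples; the helper name `select_hard_triplets` and the value of `alpha` are hypothetical.

```python
import numpy as np

def select_hard_triplets(encodings, triplets, alpha=0.2):
    """Keep (a, p, n) index triplets with ||f(A)-f(P)||^2 + alpha > ||f(A)-f(N)||^2."""
    encodings = np.asarray(encodings, dtype=float)
    kept = []
    for a, p, n in triplets:
        pos_dist = np.sum((encodings[a] - encodings[p]) ** 2)
        neg_dist = np.sum((encodings[a] - encodings[n]) ** 2)
        if pos_dist + alpha > neg_dist:  # margin violated, so the loss is > 0
            kept.append((a, p, n))
    return kept

# Toy encodings: rows 0 and 1 are one person, row 2 a look-alike, row 3 clearly different.
enc = [[0.0, 0.0], [0.3, 0.0], [0.4, 0.0], [2.0, 0.0]]
print(select_hard_triplets(enc, [(0, 1, 2), (0, 1, 3)]))
```

Only the triplet with the look-alike negative survives the filter; the one with the clearly different negative already satisfies the margin and would contribute zero loss.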

 

 

 

 

https://omoindrot.github.io/triplet-loss

 

After training the CNN with the triplet loss on the training data,

at test time the trained CNN is used to obtain the encoding vector of an image,

which is then compared against the encoding vectors of the members stored in the DB to return whether the person belongs to the group.
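
The test-time lookup could look roughly like the following. The dictionary-based DB, the squared Euclidean distance, the threshold `tau`, and the function name `recognize` are all assumptions for illustration:

```python
import numpy as np

def recognize(query_encoding, db, tau=0.5):
    """Return the name of the closest DB member, or None if nobody is within tau."""
    query = np.asarray(query_encoding, dtype=float)
    best_name, best_dist = None, float("inf")
    for name, enc in db.items():
        dist = float(np.sum((query - np.asarray(enc, dtype=float)) ** 2))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= tau else None

# Hypothetical database: one stored encoding per member.
db = {"alice": [0.1, 0.9], "bob": [0.8, 0.2]}
print(recognize([0.15, 0.85], db))  # close to alice's stored encoding
print(recognize([5.0, 5.0], db))    # far from everyone, so not in the group
```

Adding a new member only requires storing one more encoding in the DB; the CNN itself does not have to be retrained.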


5. face verification and binary classification

 

coursera - Andrew Ng

 

Instead of the approach above, face recognition can also be solved as binary classification.

Feed the two images into the same model to obtain their encoding vectors,

then run a binary classifier on the element-wise absolute difference of the two vectors to return whether the two images show the same person.

 

$$ \hat y = \sigma(\sum_{k=1}^{V}w_k |f(x^{(i)})_k-f(x^{(j)})_k|+b) $$

  • V : length of the encoding vector
  • f(x^(i))_k : k-th element of the encoding vector of image i
  • w_k, b : weights and bias of the classifier, learned during training
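
A minimal NumPy sketch of this classifier, taking two precomputed encodings as input. The weights `w` and bias `b` below are hypothetical hand-picked values; in practice they would be learned together with the encoder.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def same_person_probability(f_i, f_j, w, b):
    """y_hat = sigmoid(sum_k w_k * |f(x_i)_k - f(x_j)_k| + b)."""
    f_i, f_j, w = (np.asarray(v, dtype=float) for v in (f_i, f_j, w))
    return float(sigmoid(np.dot(w, np.abs(f_i - f_j)) + b))

# Hypothetical parameters: negative weights so larger differences lower y_hat.
w = [-4.0, -4.0, -4.0]
b = 2.0
print(same_person_probability([0.2, 0.5, 0.1], [0.2, 0.5, 0.1], w, b))  # identical encodings
print(same_person_probability([0.2, 0.5, 0.1], [0.9, 0.0, 0.8], w, b))  # very different encodings
```

An output above 0.5 would be read as "same person", and below 0.5 as "different people".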
