Detecting Everything in the Open World: Towards Universal Object Detection

Contribution Seminar

PaperGPT 2024. 4. 15. 13:21

Proposes UniDetector

Trains on diverse image sets (multiple sources and heterogeneous label spaces)
Decoupled training manner (similar to the paper "Learning Open-World Object Proposals without Learning to Classify")
Uses probability calibration

RegionCLIP is used as the image-text pretrained model

Three training structures are used and compared

In the end, the last structure performs best (there are parts of this I don't fully understand..)

Here the RoI head also takes classification (object or not) into account; its weight is set by the hyperparameter alpha (0.3 is used here)
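A minimal sketch of how such a weighting could look, assuming the two confidences are combined as a geometric weighting (the notes above only say that alpha sets the ratio). The function name fuse_confidence and the inputs cls_score and loc_score are hypothetical names chosen for illustration, not the paper's code.

```python
import numpy as np

def fuse_confidence(cls_score, loc_score, alpha=0.3):
    """Combine two per-proposal confidences with a geometric weighting.

    cls_score: object-or-not probability from the RoI head, in [0, 1]
    loc_score: localization confidence of the proposal, in [0, 1]
    alpha:     weight of the classification term (0.3 in the notes above)

    Assumption: fused = cls_score**alpha * loc_score**(1 - alpha).
    """
    cls_score = np.clip(np.asarray(cls_score, dtype=float), 0.0, 1.0)
    loc_score = np.clip(np.asarray(loc_score, dtype=float), 0.0, 1.0)
    return cls_score ** alpha * loc_score ** (1.0 - alpha)

# Usage: three proposals with made-up scores.
fused = fuse_confidence([0.9, 0.2, 0.6], [0.5, 0.8, 0.6], alpha=0.3)
```

With alpha = 0 the result reduces to the localization confidence alone, and with alpha = 1 to the object-or-not score alone, so a small alpha such as 0.3 keeps localization quality as the dominant term.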

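Finally, probability calibration is applied

pi_j: the prior probability records the bias of the network to category j (estimated over the test set; if there are too few test images, the training images are used as well)

In short, calibration is carried out based on how frequently each category is detected

(Not sure whether this is done online or offline..)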
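A rough sketch of this kind of calibration, under stated assumptions: the prior pi_j is taken as the mean predicted score of category j over all collected proposals, and each score is divided by pi_j raised to a strength parameter gamma. The function names, the gamma parameter, and its default value are illustrative and do not come from the notes above. The sketch also assumes an offline, two-pass procedure (collect predictions first, then calibrate), which the notes leave open.

```python
import numpy as np

def estimate_prior(probs, eps=1e-12):
    """Estimate pi_j as the mean score of category j.

    probs: array of shape (num_proposals, num_categories), collected over
           the test images (hypothetical input layout).
    """
    probs = np.asarray(probs, dtype=float)
    # Clamp away from zero so the later division stays finite.
    return np.maximum(probs.mean(axis=0), eps)

def calibrate(probs, prior, gamma=0.6):
    """Divide each category score by prior**gamma.

    Categories the network predicts often (large pi_j) are damped,
    rarely predicted ones are boosted. gamma=0.6 is an illustrative default.
    """
    return np.asarray(probs, dtype=float) / np.asarray(prior, dtype=float) ** gamma

# Usage: category 0 is predicted far more often than category 1.
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.7, 0.3]])
prior = estimate_prior(probs)
calibrated = calibrate(probs, prior)
```

The calibrated values are no longer probabilities (they can exceed 1); they are only meant for ranking detections, so that a frequently predicted category does not crowd out rarely predicted ones.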

In the closed-set setting, it outperforms DINO..

In the open-set setting, it outperforms GLIP..