The following is a summary translation of the paper [Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data].
Introduction
In anomaly detection, it is also crucial to leverage the unlabeled data for the detection of both known and unknown anomalies.
Problems with supervised learning
Risk of overfitting to the known anomalies
It is therefore difficult, if not impossible, to obtain labeled training data that covers all possible classes of anomaly. This renders fully supervised methods impractical.
Problems with unsupervised learning
Unsupervised anomaly detection approaches can often detect diverse anomalies, but:
1) they can produce many false positives due to the lack of prior knowledge of true anomalies;
2) they are often ineffective when handling high-dimensional and/or intricate data.
Advantages of semi-supervised learning
Despite the small size, these labeled anomalies provide valuable prior knowledge, enabling significant accuracy improvements over unsupervised methods.
Problems with existing semi-supervised learning approaches
They utilize the labeled anomalies to build anomaly-informed models, but they exclusively fit the limited anomaly examples, ignoring the supervisory signals from possible anomalies in the unlabeled data.
The approach proposed in the paper: Deep Q-learning with Partially Labeled ANomalies (DPLAN)
DPLAN is an anomaly detection-oriented deep reinforcement learning (DRL) approach that automatically and interactively fits the given anomaly examples and simultaneously detects known and unknown anomalies in the unlabeled data.
It creates an anomaly-biased simulation environment that enables the agent to effectively exploit the small set of labeled anomaly instances while deliberately exploring the large-scale unlabeled data for possible anomalies from novel anomaly classes.
It defines a combined reward function that leverages supervisory signals from the labeled anomalies and the unlabeled data to achieve a balanced exploration-exploitation trade-off.
1) The agent is implemented as a deep Q-network (DQN) tailored to anomaly detection, built on the open-source Keras-based deep reinforcement learning project Keras-rl, available at https://github.com/kerasrl/keras-rl (a minimal wiring sketch follows this list).
2) A novel proximity-dependent observation sampling method is devised and incorporated into the simulation environment to efficiently and effectively sample the next observation.
3) A reward based on the labeled anomaly data and an unsupervised isolation-based reward are synthesized to drive the joint optimization for detecting both known and unknown anomalies (items 2 and 3 are illustrated in the environment sketch after this list).
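Items 2) and 3) above can be made concrete with a small sketch. This is an illustration, not the authors' implementation: the class name AnomalyBiasedEnv, the 0.5 labeled-sampling probability, the nearest/farthest selection rule, the reward constants, and the use of scikit-learn's IsolationForest for the isolation-based term are all assumptions of this sketch.

```python
import numpy as np
import gym
from gym import spaces
from sklearn.ensemble import IsolationForest


class AnomalyBiasedEnv(gym.Env):
    """Observations are single data instances; actions: 0 = 'normal', 1 = 'anomalous'."""

    def __init__(self, labeled_anomalies, unlabeled_data, p_labeled=0.5, subsample_size=1000):
        self.D_a = labeled_anomalies          # small labeled anomaly set (numpy array)
        self.D_u = unlabeled_data             # large unlabeled set (numpy array)
        self.p_labeled = p_labeled            # chance of drawing the next observation from D_a
        self.subsample_size = subsample_size  # random subsample size for the proximity search
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(unlabeled_data.shape[1],), dtype=np.float32)
        # Unsupervised isolation model that supplies the intrinsic reward.
        self.iforest = IsolationForest(n_estimators=100, random_state=0).fit(unlabeled_data)
        self.state = None
        self.state_is_labeled = False

    def reset(self):
        # Start each episode from a random unlabeled instance.
        self.state = self.D_u[np.random.randint(len(self.D_u))]
        self.state_is_labeled = False
        return self.state

    def step(self, action):
        reward = self._reward(action)
        self.state, self.state_is_labeled = self._next_observation(action)
        # The environment never terminates on its own; the agent's step budget ends episodes.
        return self.state, reward, False, {}

    def _next_observation(self, action):
        """Proximity-dependent sampling: exploit D_a or explore D_u around the current state."""
        if np.random.rand() < self.p_labeled:
            return self.D_a[np.random.randint(len(self.D_a))], True
        # Search a random subsample of the unlabeled pool: move to the instance nearest
        # to the current state if it was flagged as anomalous, to the farthest one otherwise
        # (this nearest/farthest rule is this sketch's reading, not a quote of the paper).
        idx = np.random.choice(len(self.D_u), size=min(self.subsample_size, len(self.D_u)), replace=False)
        sub = self.D_u[idx]
        dist = np.linalg.norm(sub - self.state, axis=1)
        pick = np.argmin(dist) if action == 1 else np.argmax(dist)
        return sub[pick], False

    def _reward(self, action):
        """Combined reward: label-based external term + isolation-based intrinsic term."""
        if self.state_is_labeled:
            # Placeholder values: reward flagging a known anomaly, penalise missing it.
            return 1.0 if action == 1 else -1.0
        # For unlabeled observations, fall back to the unsupervised isolation score
        # (higher = more isolated = more likely anomalous).
        return float(-self.iforest.score_samples(self.state.reshape(1, -1))[0])
```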
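And for item 1), here is a minimal sketch of how such an agent could be wired up through keras-rl's standard DQNAgent interface (assuming the original keras-rl API with standalone Keras); the network size, hyperparameters, and toy data below are illustrative, not the paper's settings.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import EpsGreedyQPolicy

# Toy stand-ins for the real data (assumption: plain numpy feature matrices).
rng = np.random.RandomState(0)
unlabeled = rng.randn(5000, 16).astype('float32')               # large unlabeled set D_u
labeled_anomalies = rng.randn(30, 16).astype('float32') + 4.0   # small labeled anomaly set D_a

env = AnomalyBiasedEnv(labeled_anomalies, unlabeled)   # the environment sketched above
nb_actions = env.action_space.n                        # 2: a0 = 'normal', a1 = 'anomalous'

# Q-network scoring both labelling actions for a single observed instance.
model = Sequential([
    Flatten(input_shape=(1,) + env.observation_space.shape),  # keras-rl feeds a window of 1 observation
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(nb_actions, activation='linear'),
])

agent = DQNAgent(model=model, nb_actions=nb_actions,
                 memory=SequentialMemory(limit=100000, window_length=1),
                 policy=EpsGreedyQPolicy(eps=0.1),
                 nb_steps_warmup=500, target_model_update=1000)
agent.compile(Adam(lr=1e-3), metrics=['mae'])

# Interact with the anomaly-biased environment; nb_max_episode_steps caps episodes
# because the environment itself never terminates.
agent.fit(env, nb_steps=10000, nb_max_episode_steps=2000, verbose=1)

# After training, the Q-value of the 'anomalous' action (or the action the agent picks)
# can be used to rank instances by how anomalous they appear.
```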
Related work
Anomaly Detection
Existing unsupervised approaches
1) autoencoder- or GAN (generative adversarial network)-based reconstruction errors to measure normality
2) learning new feature representations tailored for specific anomaly measures
Existing weakly-supervised approaches
1) label propagation
2) end-to-end feature learning
DRL-driven Knowledge Discovery
As DRL methods have shown strong results, they have begun to be applied to anomaly detection as well, and inverse reinforcement learning approaches have emerged for sequential anomaly detection.
The proposed approach
Problem statement
Whereas existing anomaly-informed models focus on extracting characteristic information about anomalies from the small labeled anomaly set, this paper aims to learn how to detect anomalies from both the labeled data and the large amount of unlabeled data.
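Paraphrasing the paper's problem setup: the training data is D = D^a ∪ D^u, where D^a is a handful of labeled anomalies and D^u is a large unlabeled set that is dominated by normal instances but may contain both known and unknown anomalies (|D^a| ≪ |D^u|); the goal is to learn an anomaly scoring function over D that assigns higher scores to anomalous instances than to normal ones, using the supervision in D^a and the structure of D^u together.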