With support from projects such as the Strategic Priority Research Program of the Chinese Academy of Sciences, the National Key R&D Program of China, and the National Natural Science Foundation, the research group conducts fundamental and applied research on the intrinsic and derivative security issues of artificial intelligence. By analyzing the inherent mechanisms behind algorithm defects, the group establishes a comprehensive security evaluation system for artificial intelligence algorithms, explores mechanisms for mitigating defects and risks, and works to break through the theoretical and technical bottlenecks of "trustworthy, manageable, and controllable" intelligent algorithms, so as to ensure that intelligent algorithms can be applied safely.
The research group's work on the intrinsic and derivative security issues of intelligent algorithms covers the following topics:
1. Adversarial Attacks and Defenses:
a) Adversarial Attacks: Explore how to improve the transferability of adversarial examples.
b) Adversarial Defenses: Improve the adversarial robustness of models from the perspectives of robust structure design, efficient adversarial training, etc.
2. Backdoor Attacks and Defenses:
a) Backdoor Attacks: Explore how to improve the attack success rate, stealthiness, and stability of backdoors.
b) Backdoor Defenses: Investigate methods for backdoor detection, trigger localization, and backdoor removal.
3. Out-of-Distribution Generalization and Detection:
Study theoretical analysis for trustworthy AI, effective measurement of domain shift, domain generalization methods, etc.
4. Security Assessment of Multimodal Large Models:
Evaluate the fundamental capabilities, fairness, privacy leakage risks, hallucinations, and value misalignment of multimodal large models.
5. Deepfakes, Forgery Detection, and Liveness Detection:
a) Digital World: Forgery methods such as synthesizing the voices of specific individuals, voice-driven synthesis, and expression transfer, together with image/video forgery detection methods.
b) Physical World: Face anti-spoofing.
(a) Distribution differences between fake faces and real faces.
(b) Single-side domain generalization framework.
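
As an illustration of the adversarial examples mentioned in topic 1, the following minimal Python sketch applies the fast gradient sign method (FGSM) to a toy logistic-regression classifier. FGSM is a standard technique from the literature and is used here only as an example; it is not necessarily the method studied by the research group. The weights, the input, and the perturbation budget `eps` are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)


def fgsm(w, b, x, y, eps):
    """One-step FGSM: move x by eps along the sign of the loss gradient.

    For logistic regression with binary cross-entropy loss, the gradient
    of the loss with respect to the input is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (assumptions for illustration only).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.5)

p_clean = predict(w, b, x)    # above 0.5: classified as class 1
p_adv = predict(w, b, x_adv)  # below 0.5: the prediction flips
```

Each coordinate of `x_adv` differs from `x` by at most `eps`, yet the predicted class changes. Transferability, as studied in topic 1 a), asks whether an example crafted against one model in this way also misleads a different model.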