2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys)

Abstract

ChatGPT gathered the attention of millions of users shortly after its release, leading to the popularization of generative AI technologies. This research aims to highlight one of the potential vulnerabilities associated with commonly used generative AI models. Even though ChatGPT follows strict ethical and security policies, careful prompt engineering enables malicious misuse of the model, such as spam email generation. In this paper, we present various scenarios of malicious prompt engineering that coax the chatbot into generating spam emails and rewriting existing ones. We also present adversarial prompt engineering examples intended to evade detection by spam filters, rewriting a given email while circumventing common spam characteristics. We experimentally evaluate the practical feasibility of prompt engineering on ChatGPT by assessing the performance of six common ML-based spam filters on emails modified by ChatGPT. The experimental results show that adversarial prompt engineering decreases the performance of common ML-based spam filters, while an NLP-based filter is robust to such modifications. We also demonstrate that including ChatGPT-rewritten emails in the training set leads to more robust ML-based spam filters, whereas the use of available AI-text detectors does not guarantee high detection rates for emails modified by the chatbot.
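The kind of ML-based spam filter evaluated in the paper can be illustrated with a minimal sketch. The code below is not the authors' implementation: it is a toy multinomial naive Bayes classifier on hypothetical training emails, showing how a rewrite that removes characteristic spam tokens ("free", "win", "now") lowers the filter's spam log-odds score.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

class NaiveBayesSpamFilter:
    """Toy word-based naive Bayes spam filter with Laplace smoothing."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.spam_total = 0   # total tokens seen in spam
        self.ham_total = 0    # total tokens seen in ham
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        tokens = tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_total += len(tokens)
            self.n_spam += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_total += len(tokens)
            self.n_ham += 1

    def spam_score(self, text):
        """Log-odds of spam vs. ham; positive means 'looks like spam'."""
        vocab_size = len(set(self.spam_counts) | set(self.ham_counts))
        score = math.log(self.n_spam / self.n_ham)  # class prior
        for tok in tokenize(text):
            p_spam = (self.spam_counts[tok] + 1) / (self.spam_total + vocab_size)
            p_ham = (self.ham_counts[tok] + 1) / (self.ham_total + vocab_size)
            score += math.log(p_spam / p_ham)
        return score

# Hypothetical training corpus (illustrative only).
filt = NaiveBayesSpamFilter()
filt.train("win free money now click here", is_spam=True)
filt.train("free prize winner claim now", is_spam=True)
filt.train("meeting agenda attached for review", is_spam=False)
filt.train("please review the attached report", is_spam=False)

original = "win a free prize now"
# A ChatGPT-style rewrite avoiding the trigger words the filter learned:
rewritten = "you may receive a complimentary gift at your convenience"

print(filt.spam_score(original))   # positive: flagged as spam
print(filt.spam_score(rewritten))  # lower score: the rewrite evades the filter
```

The gap between the two scores mirrors the paper's finding: word-frequency filters rely on surface-level spam characteristics, so a paraphrase that swaps those tokens degrades detection, while retraining on rewritten emails would restore coverage.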
