Abstract
The principal tasks of program analysis, including bug searching and code similarity detection, are carried out at the function level. However, accurately identifying functions within stripped binary files is a significant challenge. The difficulty is exacerbated by the unformatted monolithic firmware images typically found in industrial control devices, which render existing methods ineffective because they depend on specific metadata that may be absent. In this paper, we propose a new function identification scheme and a tool, referred to as TaiE, that target monolithic firmware images. Our scheme recognizes functions based on stack characteristics and does not rely on auxiliary information provided by the target file. We evaluate TaiE's performance on synthetic and real-world targets comprising a total of 160 hardware platforms and 1,105 firmware images. The results show that TaiE achieves a precision greater than 97% and a recall higher than 87%, outperforming state-of-the-art tools.

CCS CONCEPTS
• Security and privacy → Software reverse engineering; • Computer systems organization → Firmware.