Abstract:
Research on topics such as fraud detection and prevention has received a lot of
attention in the past few years, especially in the field of online recruitment systems. This
surge in research activity can be attributed to the significant yearly losses in money,
privacy, and sensitive data that are suffered by both job seekers and recruiting firms. The
combination of severe losses and a continually increasing number of victims makes online
recruitment fraud a significant and timely research problem. Online recruitment fraud is
a sophisticated process of offering fake job opportunities using online platforms and tools
to target job seekers (Vidros et al., 2017). The main goal is to cause loss of privacy for
online users, economic damage, or damage to the reputation of the employee. Online
recruitment fraud (ORF), which is carried out through Applicant Tracking Systems (ATS), is
one of the most serious problems on the Internet in recent times. Over recent decades, ATS
platforms have emerged, consolidated, and been adopted by many employers and organizations.
An ATS is cloud-based software that businesses and recruiters use to track candidates
during the hiring and recruitment process. Utilizing applicant tracking
software is primarily intended to streamline the hiring process. A company can
hire for several positions at a time and receive hundreds of resumes automatically.
Businesses can use this software to filter, manage, and analyze candidates using anything
from simple database functionality to full-service tools. This software stores many
candidates’ resumes, which include their personal information.
This thesis investigates novel linguistic and knowledge-based patterns associated
with detecting job fraud and leverages a new feature extraction approach for the task of
detecting fraudulent job advertisements. A novel feature space designed to improve the
detection accuracy of ORF shows improvement over various baselines. Finally, this work
publishes comparative results of various feature engineering schemes using different
machine learning methods and conducts a critical analysis of the errors made by such
models to inform future research in this emerging area of ORF in the field of cybersecurity.
This research is intended to contribute positively to the online fraud research area
and to help prevent fraudsters from continuing to deceive people online.