The purpose of this paper is to investigate the feasibility of identifying the gender of an author by measuring the keystroke duration when typing a message.
Three classifiers were constructed and tested, and their effectiveness was evaluated empirically. The authors used primary data as well as a publicly available dataset containing keystrokes from a different language to validate the language independence assumption.
The results of this paper indicate that it is possible to identify the gender of an author by analyzing keystroke durations with a probability of success in the region of 70 per cent.
The proposed approach was validated with a limited number of participants and languages, although the statistical tests show that the results are significant. The approach will be further tested with other languages.
Having the ability to identify the gender of an author of a certain piece of text has value in digital forensics, as the proposed method will be a source of circumstantial evidence for “putting fingers on keyboard” and for arbitrating cases where the true origin of a message needs to be identified.
If the proposed method is included as part of a text-composing system (such as e-mail and instant messaging applications), it could increase trust toward the applications that use it and may also work as a deterrent for crimes involving forgery.
The proposed approach combines and adapts techniques from the domains of biometric authentication and data classification.
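As a rough illustration of the feature named above, keystroke duration can be taken as the time between pressing and releasing a key. The sketch below is not the authors' implementation: the event format `(key, action, timestamp_ms)`, the function names and the sample data are all hypothetical, and averaging durations per key is only one plausible way to build an input vector for a classifier.

```python
from collections import defaultdict
from statistics import mean


def keystroke_durations(events):
    """Return {key: [durations in ms]} from (key, action, timestamp_ms) events.

    The event format is an assumption made for this sketch.
    """
    pressed = {}                     # key -> timestamp of the unmatched press
    durations = defaultdict(list)
    for key, action, t in events:
        if action == "down":
            # Keep the first press so auto-repeat does not reset the timer.
            pressed.setdefault(key, t)
        elif action == "up" and key in pressed:
            durations[key].append(t - pressed.pop(key))
    return dict(durations)


def mean_duration_features(events):
    """Average the durations per key (hypothetical feature vector)."""
    return {key: mean(vals) for key, vals in keystroke_durations(events).items()}


# Made-up sample events, for illustration only.
sample_events = [
    ("h", "down", 0), ("h", "up", 95),
    ("i", "down", 140), ("i", "up", 220),
    ("h", "down", 300), ("h", "up", 405),
]
features = mean_duration_features(sample_events)
```

A feature dictionary of this kind could then be passed to any standard classifier; which classifiers and features the paper actually used is not stated in this summary.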
Tsimperidis, I., Katos, V. and Clarke, N. (2015), "Language-independent gender identification through keystroke analysis", Information and Computer Security, Vol. 23 No. 3, pp. 286-301. https://doi.org/10.1108/ICS-05-2014-0032
Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited