Language-independent gender identification through keystroke analysis

Ioannis Tsimperidis (Information Security and Incident Response Unit, Department of Electrical and Computer Engineering, Democritus University of Thrace, Greece)
Vasilios Katos (Information Security and Incident Response Unit, Department of Electrical and Computer Engineering, Democritus University of Thrace, Greece)
Nathan Clarke (Centre for Security, Communications & Network Research, Plymouth University, United Kingdom)

Information and Computer Security

ISSN: 2056-4961

Article publication date: 13 July 2015

Abstract

Purpose

The purpose of this paper is to investigate the feasibility of identifying the gender of an author by measuring the keystroke duration when typing a message.

Design/methodology/approach

Three classifiers were constructed and their effectiveness was evaluated empirically. The authors used primary data as well as a publicly available dataset containing keystrokes typed in a different language to test the language-independence assumption.
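As an illustration only, the kind of feature described here can be sketched in a few lines of Python. The sketch assumes that "keystroke duration" means the time between pressing and releasing a key, and that the log is a list of (key, press time, release time) records in milliseconds. The record format, the function name `mean_durations` and the sample values are hypothetical and are not taken from the paper, which does not specify its feature extraction or classifiers in this abstract.

```python
from collections import defaultdict
from statistics import mean


def mean_durations(events):
    """Return the mean hold time per key.

    `events` is an iterable of (key, press_ms, release_ms) tuples.
    This log format is an assumption made for the example.
    """
    per_key = defaultdict(list)
    for key, press_ms, release_ms in events:
        if release_ms < press_ms:
            continue  # skip malformed records where release precedes press
        per_key[key].append(release_ms - press_ms)
    return {key: mean(values) for key, values in per_key.items()}


# Hypothetical sample log: "a" is typed twice, "b" once.
sample = [("a", 0, 95), ("b", 120, 200), ("a", 260, 365)]
features = mean_durations(sample)
```

A vector of such per-key averages could then be passed to any general-purpose classifier. Because the averages depend on which keys are held and for how long, and not on the words typed, a feature of this kind is at least plausible as a language-independent input.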

Findings

The results indicate that the gender of an author can be identified by analyzing keystroke durations, with a probability of success in the region of 70 per cent.

Research limitations/implications

The proposed approach was validated with a limited number of participants and languages, although statistical tests show that the results are significant. The approach will be further tested with other languages.

Practical implications

Having the ability to identify the gender of an author of a certain piece of text has value in digital forensics, as the proposed method will be a source of circumstantial evidence for “putting fingers on keyboard” and for arbitrating cases where the true origin of a message needs to be identified.

Social implications

If the proposed method is included as part of a text-composing system (such as e-mail and instant messaging applications), it could increase trust toward the applications that use it and may also work as a deterrent for crimes involving forgery.

Originality/value

The proposed approach combines and adapts techniques from the domains of biometric authentication and data classification.

Citation

Tsimperidis, I., Katos, V. and Clarke, N. (2015), "Language-independent gender identification through keystroke analysis", Information and Computer Security, Vol. 23 No. 3, pp. 286-301. https://doi.org/10.1108/ICS-05-2014-0032

Publisher

Emerald Group Publishing Limited

Copyright © 2015, Emerald Group Publishing Limited
