An audio encryption scheme based on Fast Walsh Hadamard Transform and mixed chaotic keystreams

This paper introduces an audio encryption algorithm based on permutation of audio samples using discrete modified Henon map followed by substitution operation with keystream generated from the modified Lorenz-Hyperchaotic system. In this work, the audio file is initially compressed by Fast Walsh Hadamard Transform (FWHT) for removing the residual intelligibility in the transform domain. The resulting file is then encrypted in two phases. In the first phase permutation operation is carried out using modified discrete Henon map to weaken the correlation between adjacent samples. In the second phase it utilizes modified-Lorenz hyperchaotic system for substitution operation to fill the silent periods within the speech conversation. Dynamic keystream generation mechanism is also introduced to enhance the correlation between plaintext and encrypted text. Various quality metrics analysis such as correlation, signal to noise ratio (SNR), differential attacks, spectral entropy, histogram analysis, keyspace and key sensitivity are carried out to evaluate the quality of the proposed algorithm. The simulation results and numerical analyses demonstrate that the proposed algorithm has excellent security performance and robust against various cryptographic attacks.


Introduction
Voice-based communication has become prominent in several areas such as the military, phone banking, confidential voice conferencing and education. With the increasing need for secure speech communication, data encryption protocols are critically important for the storage and transmission of sensitive information over exposed systems. Unlike text and message signals, adjacent samples of voice signals are highly correlated and slowly time-varying. Moreover, the presence of redundant and unvoiced samples in an audio signal calls for efficient compression techniques in the transform domain. Therefore, conventional cryptographic algorithms are poorly suited for speech encryption. As security requirements have advanced, chaotic systems have come to play a significant role in developing cryptographic algorithms. Claude Shannon introduced two basic elements for secure cryptographic algorithms [1]: confusion (permutation) and diffusion (substitution) operations. In the confusion operation, data samples are permuted according to some specific key parameters to destroy the local correlation between adjacent samples. In the diffusion operation, data samples are substituted with pseudo random numbers (PRNs) generated by some entropy source, to change the sample values. Both operations strengthen the complex relationship among plaintext, ciphertext and symmetric key parameters. When developing a symmetric key encryption algorithm, designers utilize the substitution-permutation network as the basic structural element [2]. Chaos theory is well suited to encryption because of its inherent properties such as topological transitivity, ergodicity, sensitive dependence on initial conditions and deterministic pseudo-randomness. These properties satisfy the basic requirements of theoretical cryptography.
Most chaos-based algorithms, such as multiple iterations of a chaotic map [3,4], bit-level scrambling approaches [5,6], bit-level confusion methods [7] and hybrid key methods [8], were designed based on the above mentioned properties. In addition, single- and multiple-round permutation-then-diffusion without substitution-box (PT-DWOS) schemes and DNA encoding methods have been introduced based on the reproducibility and deterministic nature of chaotic functions, since the process can be repeated for the same function and the same initial conditions. To improve the complexity and randomness of encryption schemes, various chaotic systems have been introduced, e.g. discrete time [9], continuous time [10], hyperchaotic [11] and time delayed systems [12]. Cascading of different chaotic systems [13], iteratively expanding a lower dimensional chaotic map to a higher dimension [14] and parametric perturbation of chaotic trajectories [15] are the common methodologies followed by designers to develop algorithms based on chaos theory. Furthermore, chaos theory has been incorporated in many conventional cryptographic approaches such as S-box design [16], the RC5 stream cipher [17] and elliptic curve cryptography [18] to strengthen the security of the encryption processes. However, these methods suffer from limited keyspace and high computational complexity. Recently, researchers have attempted to develop computationally efficient and unconditionally secure chaotic-quantum algorithms suitable for cloud and Internet of Things (IoT) environments [19][20][21][22].
Several audio encryption algorithms have been introduced to provide secure data transmission. Among these techniques, voice encryption algorithms based on chaos theory are considered effective for handling redundant and bulky audio files. An overview of speech encryption algorithms based on chaos theory is given hereafter. Long Jye Sheue et al. [23] proposed a speech encryption algorithm based on fractional order chaotic systems. It is based on a two-channel transmission method where the original speech signal is encoded using a nonlinear function of the Lorenz chaotic system. Moreover, they analyzed the conditions for synchronization between fractional chaotic systems theoretically by using the Laplace transform. Mosa et al. [24] introduced an algorithm based on permutation and substitution of speech segments using the chaotic Baker map. They used the Discrete Cosine Transform (DCT) to remove the residual intelligibility in order to compress the signal. ACI Maysaa abd ulkareem et al. [25] proposed a method based on the logistic map and the Blowfish encryption algorithm. They employed a partial encryption method, using the wavelet packet transform to split the raw signal, to improve the speed of the encryption process. Halto et al. [26] presented a hybrid chaotic speech encryption algorithm in which the Arnold cat map is utilized for permutation and the logistic map for the substitution operation. They used the DCT for compressing the audio signal to minimize residual intelligibility. In [27] Halto et al. presented a method where the Lorenz system generates the keystream for the substitution operation and the Rossler chaotic system drives the permutation process. Elashy et al. [28] proposed a two level audio encryption algorithm based on chaotic Baker maps and double random phase encoding. In the first level it utilizes the Baker map, and in the second level it utilizes optical encryption using double random phase encoding (DRPE) to provide physical security which is hard to break.
Sathya murthi et al. [29] introduced an algorithm based on chaotic shift keying. In [29], audio signals are sampled and its values are segmented into four levels, and then the samples are permuted using chaotic systems such as Logistic map, Tent map, Quadratic map, and Bernoulli's map. Sheela et al. [30] proposed an algorithm based on two-dimensional modified Henon map (2D-MHM) and standard map. They introduced Hybrid Chaotic Shift Transform (HCST), and deoxyribonucleic acid (DNA) encoding rules to enhance the security level. Aissa Belmeguenai et al. proposed a method, where only relevant part of the speech segment undergoes encryption process [31]. Animesh Roy et al. [32] presented an algorithm based on audio signal encryption using chaotic Henon map and lifting wavelet transform. Z. Habib et al. presented a paper based on amplitude scrambling and Discrete Cosine Transform coefficient scrambling. In [33], they designed a permutation network using TD-ERCS chaotic map. Some of the constraints observed in the above mentioned literature are summarized as follows:
1. Inefficient compression method to remove the unvoiced data segments.
2. Suggested Permutation methods are not strong enough to break the correlation between adjacent samples in the audio file.
3. Lower Dimensional chaotic maps have periodic window problems like smaller chaotic range and non-uniform distribution.
4. In substitution-permutation rounds, permutation matrix and keystream generated from the chaotic function depends only on the initial condition and control parameters of the chaotic map.
To overcome the above mentioned drawbacks, we propose an audio encryption algorithm based on the modified Henon map and the modified Lorenz-hyperchaotic system. To overcome the first drawback, the algorithm employs a signal compression mechanism based on the Fast Walsh Hadamard Transform (FWHT), which reduces the sample size for the subsequent encryption process by removing higher order coefficients. Unlike the Fast Fourier Transform (FFT) and the Discrete Cosine Transform (DCT), FWHT has excellent energy compression properties. Furthermore, the rectangular basis functions of FWHT can be realized more efficiently in digital circuits than the trigonometric basis functions of the FFT. The second problem is addressed by the modified Henon map, which weakens the strong correlation between adjacent coefficients: the samples are shuffled through the permutation matrix generated with the modified Henon map. The output data is then diffused by XOR-ing the permuted coefficients with the keystream generated by the modified Lorenz-hyperchaotic system. Carrying out the encryption process in a higher dimensional space eliminates periodic window problems such as limited chaotic range and non-uniform distribution, which addresses the third drawback. It also extends the keyspace and consequently enhances the security. To eliminate the fourth drawback, a dynamic keystream selection mechanism is introduced.

Audio encryption scheme
If the initial conditions remain the same, an intruder can easily acquire the keystream through chosen-plaintext and chosen-ciphertext attacks. This possibility can be prevented by dynamic keystream generation, in which the generated keystream depends on the audio segments. Different audio segments therefore generate completely different keystreams, which defeats chosen-plaintext and chosen-ciphertext attacks. The rest of this paper is organized as follows: The preliminary studies of the proposed audio encryption algorithm are presented in Section 2. The theoretical framework of the proposed approach is given in Section 3. Numerical simulations and performance evaluations are discussed in Section 4. A comparison of the proposed work with other state-of-the-art methods is discussed in Section 5, followed by the conclusion in Section 6.

Preliminary studies
In this section, we describe the mathematical models of the chaotic systems used for encryption, i.e., the parametrically perturbed Lorenz-hyperchaotic system and the modified Henon map. The periodic, quasi-periodic, chaotic and hyperchaotic behavior of the parametrically perturbed Lorenz system is discussed by means of a bifurcation diagram. One dimensional signal compression by FWHT is also discussed.

Hyperchaotic system
Higher dimensional chaotic systems show distinct advantages over lower dimensional chaotic systems due to their complex dynamic behavior. In this work, we adopt a parametrically perturbed Lorenz-hyperchaotic system [34]. The Lorenz system shows chaotic behavior for the control parameters $a = 10$, $b = 8/3$ and $c = 28$ [35]. Parametric perturbation in the Lorenz system may be applied to all of the parameters $a$, $b$ and $c$, or to selected parameters. In the proposed system (1), the control parameter $a$ is selected for parametric perturbation by adding a PI controller in the feedback path of the Lorenz system.
where $x, y, z, w$ are the state variables and $a, b, c, k_p, k_i$ are the system parameters; $k_p$ and $k_i$ are the control parameters of the PI controller. Parametric perturbation changes the three dimensional autonomous system into the non-autonomous system (1), which is equivalent to a four dimensional hyperchaotic system. With the control parameters $a = 10$, $b = 8/3$, $c = 10$, $k_p = -3.6$, $k_i = 5.2$ and initial conditions $(1, 1, 1, 1)$, the Lyapunov exponents obtained are $L_1 = 0.01000$, $L_2 = 0.421905$, $L_3 = -0.326781$ and $L_4 = -13.385272$. Since two Lyapunov exponents are positive, the system is hyperchaotic. The evolution of periodic, chaotic and hyperchaotic attractors in this system can be observed by varying $k_i \in [-20, 15]$ while keeping all other parameters fixed. Figure 1 illustrates the bifurcation diagram and Lyapunov exponents of the modified system.
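The sensitive dependence on initial conditions that motivates this choice can be illustrated with a minimal sketch. It integrates only the classical three dimensional Lorenz system with $a = 10$, $b = 8/3$, $c = 28$, not the PI-perturbed four dimensional system (1). The fourth-order Runge-Kutta integrator, the step size and the helper names are assumptions made for illustration.

```python
def lorenz(state, a=10.0, b=8.0 / 3.0, c=28.0):
    """Right-hand side of the classical three-dimensional Lorenz system."""
    x, y, z = state
    return (a * (y - x), x * (c - z) - y, x * y - b * z)


def rk4_step(f, state, dt):
    """Advance `state` by one fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (p + 2 * q + 2 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def separation_after(t_end, delta=1e-9, dt=0.01):
    """Distance between two trajectories whose x(0) values differ by delta."""
    u = (1.0, 1.0, 1.0)
    v = (1.0 + delta, 1.0, 1.0)
    for _ in range(int(round(t_end / dt))):
        u = rk4_step(lorenz, u, dt)
        v = rk4_step(lorenz, v, dt)
    return sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5


early = separation_after(0.1)   # still tiny shortly after the start
late = separation_after(40.0)   # many orders of magnitude larger
```

A perturbation of $10^{-9}$ in one initial condition grows until the two trajectories are unrelated, which is the property a key-sensitive keystream generator relies on.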

Henon map dynamical system
The Henon map was proposed by Michel Henon in 1976 as a simplified model of the Poincaré section of the Lorenz equations [36]. The modified Henon map was developed to improve the complex dynamic behavior and bifurcation range [30]. In this map, the $x_n^2$ term of the seed map is replaced with the nonlinear term $\cos(x_n)$. The modified Henon map can be mathematically modeled as follows:
$$x_{n+1} = 1 - a\cos(x_n) + b\,y_n, \qquad y_{n+1} = x_n \tag{2}$$
In the proposed method, system (2) is taken as a computational tool to generate the permutation matrix. The data set produced by the dynamical system (2) for specific values of $a$ and $b$ is used to generate the permutation matrix for the encryption process. The dynamical system (2) takes a point $(x_n, y_n)$ in the two dimensional plane and maps it to a new point $(x_{n+1}, y_{n+1})$. The map depends on two bifurcation parameters $a\ (> 0)$ and $b\ (> 0)$. Moreover, the parameter $b$ measures the contraction rate of the 2D quadratic Henon map, which is independent of $x_n$ and $y_n$. Figure 2 depicts the bifurcation diagrams of both systems. The bifurcation diagram of the modified Henon map shows a wider range of output distribution for the control parameter $a$ compared to its seed map. Therefore, the modified Henon map increases the range of the permutation operation compared to its seed map.
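As a minimal sketch, the cosine-modified map can be iterated as below. The update rule is an assumption: it is the form implied by the iteration in Eq. (9) with the modulo operations removed. The parameter values $a = 3.58$ and $b = 0.56$ are the ones quoted in the keyspace analysis, and the starting point is a hypothetical choice.

```python
import math


def modified_henon(x0, y0, a=3.58, b=0.56, n=1000):
    """Iterate x' = 1 - a*cos(x) + b*y, y' = x and return the orbit."""
    orbit = []
    x, y = x0, y0
    for _ in range(n):
        x, y = 1.0 - a * math.cos(x) + b * y, x
        orbit.append((x, y))
    return orbit


orbit = modified_henon(0.1, 0.1)

# Because |cos| <= 1 and 0 < b < 1, an orbit that starts inside this bound
# stays inside it: |x'| <= 1 + a + b*|y|.
bound = (1.0 + 3.58) / (1.0 - 0.56)
```

The bounded cosine term keeps the orbit from escaping to infinity, which the quadratic seed map can do for many parameter choices.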

Fast Walsh Hadamard Transform (FWHT)
The Walsh-Hadamard transform is used in different applications, such as data compression, processing of speech and image signals, coding and communications. It is an orthogonal transformation that decomposes input signals into rectangular waveforms called Walsh functions. The Walsh functions form a system of orthogonal functions and take only two values, $+1$ and $-1$. This transformation is computationally simple since it involves no multiplication or division operations. Implementing the FWHT of order $n = 2^m$ requires only $n \log_2 n$ additions and subtractions. Generally, the Fast Walsh Hadamard transformation follows the recursive definition of the symmetric Hadamard matrix. Let $H$ be the Hadamard matrix of order $n = 2^k$, described as follows:
$$H_1 = [1], \qquad H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad H_{2^k} = H_2 \otimes H_{2^{k-1}}$$
where $\otimes$ denotes the Kronecker product.
The Walsh transform is modeled as in Eq. (5) for a one dimensional signal.
where $x$ and $u$ are independent variables represented in $n$ bits. The binary representation of $x$ and $u$ can be written as:
In this transformation, the Walsh kernel forms a matrix having orthogonal rows and columns. Therefore, the forward and inverse transformations are identical operations except for the constant multiplicative factor of $1/N$ for a 1D signal.
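The add-and-subtract structure of the transform can be sketched with the usual in-place butterfly. This sketch assumes natural (Hadamard) ordering of the coefficients and places the $1/N$ factor on the inverse. The function names `fwht` and `ifwht` are illustrative.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Each pass combines pairs that are h apart using one add and one subtract.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


def ifwht(coeffs):
    """Inverse transform: the same butterflies followed by a 1/N scaling."""
    n = len(coeffs)
    return [c / n for c in fwht(coeffs)]


signal = [3, -1, 4, 1, -5, 9, 2, 6]
restored = ifwht(fwht(signal))
```

There are $\log_2 n$ passes with $n$ additions or subtractions each, which gives the $n \log_2 n$ operation count mentioned above.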

Proposed method
The overall idea of the encryption process is depicted in Figure 3. The input speech signal is initially compressed by FWHT, followed by the permutation operation. Then, the keystream for the substitution operation is generated so that it depends on the characteristics of the plain speech signal. The steps of the encryption process are described as follows:

3.1 Encryption process
3.1.1 Audio signal compression by FWHT. Initially, we divide the input audio file into frames, each with $N$ samples. Assume the message signal is $m = \{m_1, m_2, \ldots, m_N\}$. We compress the audio signal based on Eq. (5) and reduce the sample space by discarding the higher order coefficients (8). The original signal and the compressed signal, with numerical values, are displayed in Figure 4.
where $m_w(P, 1)$ is the compressed data signal and $P$ is the sample size of the compressed signal.
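The compression step can be sketched as a transform followed by truncation. This is an illustration under stated assumptions: coefficients are taken in natural (Hadamard) order and the first $P$ of them are treated as the retained ones. A sequency-ordered implementation would need an extra reindexing step. The helper names `compress` and `reconstruct` are hypothetical.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    data = list(values)
    n = len(data)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


def compress(frame, keep):
    """Transform one frame and keep only the first `keep` coefficients."""
    return fwht(frame)[:keep]


def reconstruct(coeffs, n):
    """Zero-pad the retained coefficients to length n and invert the transform."""
    padded = list(coeffs) + [0] * (n - len(coeffs))
    return [v / n for v in fwht(padded)]


flat_frame = [3] * 8                   # a constant frame has one nonzero coefficient
kept = compress(flat_frame, keep=1)    # P = 1 out of N = 8
approx = reconstruct(kept, n=8)
```

For the constant frame the reconstruction is exact because all discarded coefficients were already zero; for real speech the discarded coefficients determine the compression loss.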
3.1.2 Permutation process. Prior to the permutation process, the audio samples $m_w(P, 1)$ are reshaped into a two dimensional array $m_w(Q, R)$. In this step, the audio samples are permuted according to the random matrix generated from the modified Henon map, producing ciphered audio samples $C_1(Q, R)$ with a uniform histogram. The number of Henon map iterations should be greater than $P^2$ to completely permute the data samples. A chaotic system evolves gradually from the periodic to the quasi-periodic and then to the chaotic regime as the control parameters are slowly varied. Because of this slow transition, the first few iterations may produce periodic or redundant samples that lie fairly close together, so they should be discarded. Therefore, the total number of iterations is $P^2 + 1000$. Algorithm 1 describes the entire permutation process in detail. Iterate the data samples as follows:
$$x_{i+1} = \operatorname{mod}\big((1 - a\cos x_i + b\,y_j),\ Q\big), \quad i \in (1, Q)$$
$$y_{j+1} = \operatorname{mod}\big(x_i,\ R\big), \quad j \in (1, R) \tag{9}$$
where $(x_i, y_j)$ is the initial position of the data sample and $(x_{i+1}, y_{j+1})$ is the first iterated position. Figure 5 displays the values of original and permuted data samples.
[Figure panels: compressed speech signal; (c) compressed signal with zeroing out the coefficients.]
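A chaos-driven permutation and its inverse can be sketched as follows. This is not a transcription of Algorithm 1, which is not reproduced in the text: the sketch swaps in a common alternative, ranking a chaotic sequence to obtain a guaranteed bijection. The cosine Henon update, the burn-in of 1000 iterations and all function names are assumptions made for illustration.

```python
import math


def henon_sequence(length, x0=0.1, y0=0.1, a=3.58, b=0.56, burn_in=1000):
    """Chaotic x-values from the cosine Henon map after discarding a transient."""
    x, y = x0, y0
    out = []
    for k in range(burn_in + length):
        x, y = 1.0 - a * math.cos(x) + b * y, x
        if k >= burn_in:
            out.append(x)
    return out


def make_permutation(length, **key):
    """Rank the chaotic values; the ranking is a bijection on 0..length-1."""
    seq = henon_sequence(length, **key)
    return sorted(range(length), key=lambda i: (seq[i], i))


def permute(samples, perm):
    """Output position k receives the sample stored at index perm[k]."""
    return [samples[p] for p in perm]


def unpermute(shuffled, perm):
    """Undo `permute` by writing every sample back to its source index."""
    restored = [None] * len(perm)
    for dst, src in enumerate(perm):
        restored[src] = shuffled[dst]
    return restored


perm = make_permutation(16)
data = list(range(100, 116))
shuffled = permute(data, perm)
```

Ranking avoids collisions, which a direct modulo mapping of chaotic values onto positions does not rule out by itself; the receiver regenerates the same `perm` from the shared key and calls `unpermute`.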

Dynamic keystream selection mechanism.
In order to make the initial conditions dependent on each other and sensitive to the plain audio data, the initial conditions are derived from Eq. (10) and by performing basic arithmetic operations as in Eq. (11). The subkeys so obtained are the initial conditions $x(0), y(0), z(0), w(0)$ for the hyperchaotic system (1). In each round of operation the keys are updated, which guards against differential attacks.

$x_i, y_i, z_i, w_i$ indicate the $i$th elements of the keystreams. $\operatorname{abs}(x)$ returns the absolute value of $x$, and $\operatorname{floor}(x)$ returns the largest integer less than or equal to $x$. Generate the normalized keystream for the substitution operation as follows:
where $key_{norm}$ is the normalized keystream generated, and $X_{max}$, $X_{min}$ are the maximum and minimum values present in the array $X$. The sequence of operations is elaborated in Algorithm 2.
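A minimal sketch of min-max normalization of a chaotic sequence into integer keystream values is given below. The exact normalization formula is not reproduced in the text, so the scaling to 256 levels (8-bit samples) and the function name `normalize_keystream` are assumptions.

```python
import math


def normalize_keystream(values, levels=256):
    """Min-max scale chaotic values to integers in 0..levels-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("keystream must not be constant")
    scale = (levels - 1) / (hi - lo)
    # floor maps each scaled value to the largest integer not exceeding it
    return [math.floor((v - lo) * scale) for v in values]


key_norm = normalize_keystream([-2.0, 0.0, 2.0])
```

With this scaling the minimum of the array maps to 0 and the maximum to `levels - 1`, so the keystream spans the full sample range used by the XOR step.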

Substitution operation.
After the first level of encryption (the permutation operation), the two dimensional data samples $C_1(Q, R)$ are reshaped to one dimensional data samples $C_1(P, 1)$. Then, the keystream for the substitution operation is generated based on Algorithm 2, and the first few samples of the keystream are eliminated since they are redundant. The substitution operation is then carried out by XOR-ing the permuted samples with the keystream generated using Eq. (12). Figure 6(a) and (b) show the speech samples after $P^2 + 1000$ permutation operations and the speech signal after the substitution operation, respectively. Pseudo code for the substitution operation is given in Algorithm 3.
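The XOR substitution and its self-inverse property can be sketched in a few lines. The sketch assumes samples and keystream values are integers in the 8-bit range; the function name `xor_substitute` and the sample values are illustrative.

```python
def xor_substitute(samples, keystream):
    """XOR each 8-bit sample with the matching keystream value."""
    if len(samples) != len(keystream):
        raise ValueError("keystream length must match the data length")
    return [s ^ k for s, k in zip(samples, keystream)]


permuted = [0, 17, 255, 128]     # the leading 0 stands for a silent sample
keystream = [90, 90, 3, 128]
cipher = xor_substitute(permuted, keystream)
recovered = xor_substitute(cipher, keystream)
```

A silent (zero) sample is replaced by the keystream value itself, which is how the substitution stage fills silent periods, and applying the same keystream a second time restores the permuted samples.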

Decryption process
The decryption process is the reverse of the encryption process and can be performed easily by means of the pre-shared keys. It can be briefly described as follows:
Step 1: Generate the same keystream bits according to step 3.1.3 of the encryption process.
Step 3: Do the inverse permutation process according to step 3.1.2.

Simulation result and analysis
The proposed algorithm is simulated on a classical computer with MATLAB R2013a. Different voice samples from male and female speakers, with a sampling rate of 8000 samples/s, are selected for the test. We evaluated the performance of this algorithm through various statistical and differential analyses.

Correlation analysis
Correlation analysis is a statistical method to evaluate the performance of a cryptographic algorithm against various statistical attacks [37]. Correlation coefficient analysis measures the mutual relationship between corresponding segments in the plain and encrypted audio files. A secure data encryption algorithm converts the original data into a random-like noisy signal with a low correlation coefficient. A low correlation coefficient indicates weak correlation between the original and encrypted speech files. In this work, correlation analysis is carried out for both the Henon map and the modified Henon map. The correlation coefficients $r_{xy}$ between original and encrypted audio samples are calculated and listed in Table 1. The correlation coefficients between original data samples in the same audio file, and between original and encrypted data samples, are also calculated and given in Table 2. The analysis shows an improvement in the permutation operation, since the correlation coefficients of the modified Henon map are smaller than those of its seed map. Figure 7(a) shows the scatter plot of the original speech signal. The randomized nature of the speech signal after the permutation and substitution operations is illustrated in Figure 7(b) and (c). The correlation coefficient can be calculated as follows:
$$r_{xy} = \frac{E\big[(x - E(x))\,(y - E(y))\big]}{\sigma_x\,\sigma_y}$$
where $E(x)$ and $E(y)$ are the means and $\sigma_x$, $\sigma_y$ are the standard deviations of the original and encrypted speech signals.
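The correlation coefficient can be computed directly from the means and standard deviations named above. The following is a minimal sketch of the standard Pearson formula using population statistics; the function name `correlation` is illustrative.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient r_xy of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y)) / n
    std_x = math.sqrt(sum((p - mean_x) ** 2 for p in x) / n)
    std_y = math.sqrt(sum((q - mean_y) ** 2 for q in y) / n)
    return cov / (std_x * std_y)


same = correlation([1, 2, 3, 4], [2, 4, 6, 8])       # perfectly correlated
reverse = correlation([1, 2, 3, 4], [4, 3, 2, 1])    # perfectly anti-correlated
unrelated = correlation([1, 2, 3, 4], [1, -1, -1, 1])
```

Values near $\pm 1$ indicate a strong linear relationship, while a well-encrypted signal should give a value near zero against its plaintext.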

Signal to noise ratio (SNR)
The signal to noise ratio is one of the straightforward methods to validate the performance of a data encryption algorithm. SNR measures the noise content in the encrypted data signal.

Cryptographers always try to increase the noise content in the encrypted signal so as to minimize the information content in the encrypted data [38]. Figure 8 shows the original and encrypted speech signals. Figure 8(c, d) illustrates the randomized nature of the encrypted signal after the permutation and substitution operations. The SNR values of the encrypted audio files are calculated based on Eq. (16) and given in Table 3.
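A common way to compute this metric is sketched below. Eq. (16) is not reproduced in the text, so the definition used here, signal power over the power of the difference between the original and encrypted signals, expressed in decibels, is an assumption, as is the function name `snr_db`.

```python
import math


def snr_db(original, encrypted):
    """SNR in dB, treating (original - encrypted) as the noise term."""
    signal_power = sum(s * s for s in original)
    noise_power = sum((s - e) ** 2 for s, e in zip(original, encrypted))
    return 10.0 * math.log10(signal_power / noise_power)


mild = snr_db([10, 10], [9, 9])    # small distortion, high SNR
heavy = snr_db([1, 1], [3, 3])     # distortion larger than the signal itself
```

Under this definition a good cipher drives the SNR low, typically negative, because the encrypted signal differs from the original by more than the signal's own power.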

UACI and NSCR analysis
In data encryption, resistance to differential attacks is generally analyzed through the NSCR (number of samples change rate) and UACI (unified average changing intensity) tests [39]. In this analysis, two speech segments that differ by a single sample are encrypted with the same keystreams. The encrypted speech segments are then compared using NSCR and UACI. Both these parameters can be expressed as follows:
where $c_i$ and $c'_i$ denote the audio samples at the $i$th position of the two encrypted speech segments and $N$ corresponds to the length of the speech segments. The upper bounds for NSCR and UACI are 100% and 33.3% respectively. For a secure encryption scheme these parameters should be close to these ideal values. The NSCR and UACI values of the proposed algorithm are calculated and listed in Table 4. The results show that the values obtained with the proposed algorithm are considerably close to the ideal values.
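Both metrics can be sketched as follows. The defining equations are not reproduced in the text, so this uses the commonly cited forms and assumes 8-bit samples with a full-scale value of 255; the function names are illustrative.

```python
def nscr(c1, c2):
    """Percentage of positions at which the two ciphertexts differ."""
    return 100.0 * sum(a != b for a, b in zip(c1, c2)) / len(c1)


def uaci(c1, c2, max_value=255):
    """Mean absolute difference as a percentage of the full sample range."""
    return 100.0 * sum(abs(a - b) for a, b in zip(c1, c2)) / (max_value * len(c1))


cipher_a = [0, 10]
cipher_b = [0, 61]
changed = nscr(cipher_a, cipher_b)      # one of two samples differs
intensity = uaci(cipher_a, cipher_b)    # (51 / 255) averaged over two samples
```

NSCR only counts whether a sample changed, while UACI also weighs by how much it changed, which is why the two are reported together.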

Spectral entropy
Spectral entropy measures the randomness in both the encrypted and original speech signals. Its measurement is based on the assumption that the spectrum of a meaningful speech segment is more correlated than that of a noisy signal [1]. The spectral measurement computes the entropy by taking the amplitude component of the power spectrum as the probability parameter in the entropy calculation. The amount of information can be calculated as the negative of entropy or the negative logarithm of probability. Thus, meaningful speech segments show low entropy since they contain organized data samples, whereas encrypted speech signals have high entropy and large spectral peaks similar to a noisy signal. The entropy $E_i$ can be measured as follows:

where $PSD_n$ is the normalized power spectrum and $f_i$ is the frequency of the signal. Irregularities of amplitude in the original and encrypted signals are shown in Figure 9.
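The idea of treating the normalized power spectrum as a probability distribution can be sketched as follows. The entropy equation is not reproduced in the text, so this uses Shannon entropy in bits over the full DFT spectrum as an assumption, with a direct $O(n^2)$ DFT to stay dependency-free; the function names are illustrative.

```python
import cmath
import math


def power_spectrum(signal):
    """Squared DFT magnitudes, computed with a direct O(n^2) sum."""
    n = len(signal)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(signal))) ** 2
        for k in range(n)
    ]


def spectral_entropy(signal):
    """Shannon entropy (bits) of the normalized power spectrum."""
    psd = power_spectrum(signal)
    total = sum(psd)
    probs = [p / total for p in psd]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


tone = [math.sin(2 * math.pi * t / 8) for t in range(8)]   # energy in two bins
impulse = [1, 0, 0, 0, 0, 0, 0, 0]                         # flat spectrum
tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

The pure tone concentrates its power in two bins (about 1 bit), while the flat spectrum reaches the maximum of $\log_2 8 = 3$ bits, mirroring the contrast between structured speech and noise-like ciphertext.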

Keyspace and key sensitivity analysis
In the proposed algorithm, the system parameters of the modified Henon map ($a = 3.58$, $b = 0.56$), the system parameters of the modified Lorenz-hyperchaotic system ($a = 10$, $b = 3.33$, $c = 28$, $k_p = -3.5$, $k_i = 5.2$) and the initial conditions of the hyperchaotic system ($x(0), y(0), z(0), w(0)$) constitute the keyspace. If the precision of each of these eleven system parameters and initial conditions is set to 15 decimal places, the key space of the proposed algorithm is $(10^{15})^{11} = 10^{165} \approx 2^{548.11}$, which is large enough to resist a brute-force attack. Key sensitivity is an essential quality of any good data encryption algorithm, since it underpins the security of the algorithm against brute-force attack [40]. It means that a small variation in any key parameter brings an apparent change in both the encrypted and decrypted speech signals. The effect of a variation in a key parameter on the encryption process is verified by encrypting the signal with slightly different initial conditions. The simulation results show that slight variations in a key parameter result in a completely different encrypted signal. Figure 10 shows the encrypted signals for two different initial conditions. To evaluate the key sensitivity of the decrypted signal, the speech file is encrypted with one fixed secret key and decryption is then performed with slightly different keys. The speech files decrypted with wrong keys look entirely different and reveal no information. Figure 11(a) shows the decrypted speech signal with the correct key. Figure 11(b, c, d, e) shows the decrypted signals with slight variations in the initial conditions, of the order of $10^{-15}$.
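The keyspace size can be checked with a short calculation. The sketch counts the eleven quantities listed above (two Henon parameters, five Lorenz-hyperchaotic parameters and four initial conditions) at 15 decimal digits each; the variable names are illustrative.

```python
import math

PARAMETER_COUNT = 2 + 5 + 4   # Henon (a, b); Lorenz (a, b, c, kp, ki); x0, y0, z0, w0
DECIMAL_DIGITS = 15           # stated precision of each quantity

decimal_exponent = PARAMETER_COUNT * DECIMAL_DIGITS    # keyspace = 10**165
keyspace_bits = decimal_exponent * math.log2(10)       # equivalent power of two
```

Eleven quantities at 15 digits give $10^{165}$, or roughly $2^{548.1}$, which agrees with the base-2 exponent quoted in the text.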

Histogram analysis
Histogram analysis is one of the accurate methods to evaluate the quality of an encrypted speech signal. Since a practical encryption algorithm is expected to encrypt the original speech file into random-like noise, it is desirable to obtain an encrypted speech file with equally probable sample values. Therefore, the encrypted speech furnishes no information that would facilitate the possibility of any statistical attacks on the encrypted domain. The histograms of both the encrypted and original speech signals are illustrated in Figure 12. Figure 12(b) displays the uniformly distributed histogram of the encrypted speech signal, which indicates the randomness of the encrypted speech signal. From the histogram, it is clear that the proposed algorithm is highly secure against various statistical attacks.
Table 3. Signal to noise ratio.

Computational complexity
Big-O notation is a unified way to express the complexity of an algorithm. In classical computation, computational complexity is evaluated by the elementary operations involved in the encryption process. Since the speed of an algorithm depends on the target processor, it is difficult to estimate the exact runtime of an algorithm. Big-O notation instead expresses the execution time in terms of the input array size and the nature of the arithmetic operations. In the proposed algorithm, the encryption process consists of $P^2 + 1000$ rounds of permutation operations and a single round of substitution operations. The computational complexity of the permutation operation is $O(n^2)$, or quadratic time, since its cost grows quadratically with the input array size $n$. The substitution operation is a single XOR pass over the array, so its cost grows only linearly with $n$, i.e. $O(n)$. The computational complexity of the entire process can therefore be expressed as $O(n^2) + O(n) = O(n^2)$.

Comparison with existing works
The proposed speech encryption algorithm differs from other methods in terms of its data compression, permutation and substitution operations, so a direct comparison with other state-of-the-art approaches is difficult. However, we have analyzed various quality metrics such as key length, keyspace, signal to noise ratio (SNR), NSCR, UACI and the correlation coefficient between original and encrypted signals, and tabulated them in Table 5. We have compared our proposed algorithm with the Advanced Encryption Standard (AES), the triple Data Encryption Standard (triple DES), an algorithm based on a quantum chaotic system [16] and an algorithm based on a substitution-permutation chaotic network [20]. Speech encryption algorithms based on the Zaslavsky map [8], the TD-ERCS chaotic map [33], multiple chaotic shift keying [29] and a non-chaotic method [31] are also considered for comparative analysis. The size of the proposed method's key space is greater than $2^{540}$ (Section 4.5). It is clear from the simulation results (Table 3) that the encrypted speech signal contains more noise content than the original speech signal. The correlation coefficient $r_{xy}$ is evaluated to be almost zero ( … against various differential attacks. From this analysis, we find that the proposed algorithm shows considerable improvement in almost all the encryption quality metrics.

Conclusion
In this paper, a novel speech encryption algorithm based on a parametrically perturbed Lorenz-hyperchaotic system and a modified Henon map is introduced. The modified Henon map strengthened the permutation operation compared to its seed map, which decreased the correlation between original and encrypted data samples. The use of a hyperchaotic system eliminated the weak chaotic trajectories and smaller chaotic ranges that are commonly observed in lower dimensional chaotic systems. Owing to its hyperchaotic nature, the proposed system has a larger keyspace, which protects the proposed algorithm against brute-force attacks. Moreover, the dynamic keystream generated with the hyperchaotic system eliminated the possibility of differential attacks. Furthermore, the Fast Walsh Hadamard Transform (FWHT) improved the efficiency of the algorithm by reducing the computational complexity of the compression process. Various simulations and numerical analyses have been carried out on a classical computer to evaluate the performance of the proposed algorithm. Finally, we have compared the proposed algorithm with chaotic, non-chaotic and standard encryption algorithms. From the comparative study, it can be concluded that the proposed algorithm shows improvement over some of the existing algorithms and is an excellent choice for voice encryption in practical applications.