Search results: 1 – 2 of 2
The purpose of this paper is to apply the wavelet thresholding technique to analyze the economic and socio-political situation in Tunisia using textual data sets. The technique is used to remove noise from the contingency table. Correspondence analysis and classification results (using the k-means algorithm) are compared before and after denoising.
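As a rough illustration of the classification step, the sketch below clusters the rows of a small contingency table with k-means (Lloyd's algorithm). The abstract does not say which representation is clustered, so the use of row profiles, the seeding from the first k rows, and the toy counts are all assumptions made for illustration only.

```python
def row_profiles(table):
    """Divide each row of a contingency table by its row total."""
    return [[v / sum(row) for v in row] for row in table]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical word counts: rows are documents, columns are terms.
toy_table = [[8, 1, 1], [7, 2, 1], [1, 1, 8], [1, 2, 7]]
toy_labels, toy_centroids = kmeans(row_profiles(toy_table), k=2)
```

Running the same clustering on the raw table and on a denoised copy, then comparing the two label vectors, would mirror the before/after comparison the abstract describes.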
The textual data set is collected from an electronic newspaper that publishes current economic news about Tunisia. Both hard and soft thresholding are applied, using various Daubechies wavelets with different numbers of vanishing moments.
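The difference between hard and soft thresholding can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it uses a single-level Haar transform (the Daubechies wavelet with one vanishing moment) on one row of counts, and the function names and the threshold value are hypothetical.

```python
import math


def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the rest to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


def soft_threshold(coeffs, t):
    """Zero small coefficients and shrink the remaining ones toward zero by t."""
    return [c - t if c > t else c + t if c < -t else 0.0 for c in coeffs]


def haar_forward(signal):
    """One-level Haar (db1) transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def denoise_row(row, t, mode="soft"):
    """Threshold only the detail coefficients, then reconstruct the row."""
    approx, detail = haar_forward(row)
    shrink = soft_threshold if mode == "soft" else hard_threshold
    return haar_inverse(approx, shrink(detail, t))
```

Hard thresholding leaves the surviving coefficients untouched, while soft thresholding also shrinks them by the threshold, which tends to give a smoother reconstruction. Wavelets with more vanishing moments would need longer filters than the two-tap Haar pair used here.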
The results demonstrate the effectiveness of the wavelet denoising method in textual data analysis. On the one hand, the technique reduced the loss of information generated by correspondence analysis, ensured a better quality of representation on the factorial plane, reduced the need for lemmatization in textual analysis and improved the classification results of the k-means algorithm. On the other hand, the proximities shown in the factorial visualization are consistent with the economic situation of Tunisia during the studied period, showing mainly a stable situation before the revolution and a deteriorated one after it.
This paper is the first to analyze economic and socio-political relations using textual data. Its originality also comes from the joint use of correspondence analysis and wavelet thresholding in textual data analysis.
The purpose of this paper is to apply Takagi-Sugeno (T-S) fuzzy model techniques to process and classify textual data sets with and without noise. A comparative study is carried out to select the most accurate T-S algorithm for textual data sets.
From a survey about what has been termed the “Tunisian Revolution,” the authors collect a textual data set from a questionnaire targeted at students. Four clustering algorithms are applied: the Gath-Geva (G-G) algorithm, the modified G-G algorithm, the fuzzy c-means algorithm and the kernel fuzzy c-means algorithm. The authors examine the performance of the four clustering algorithms and select the most reliable one for clustering textual data.
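Of the algorithms listed, fuzzy c-means is the simplest to sketch. The version below is a minimal illustration under stated assumptions: the initial centers are supplied by the caller, the fuzzifier `m` defaults to 2, and the iteration count is fixed rather than driven by a convergence test. The Gath-Geva and kernel variants replace the Euclidean distance used here and are not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m=2.0, eps=1e-9):
    """Membership of each point in each cluster; every row sums to 1."""
    u = []
    for p in points:
        # eps guards against division by zero when a point sits on a center.
        d = [max(sq_dist(p, c), eps) for c in centers]
        inv = [dj ** (-1.0 / (m - 1.0)) for dj in d]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates, starting from given centers."""
    centers = [list(c) for c in centers]
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        for j in range(len(centers)):
            # Each center is the mean of all points, weighted by u ** m.
            w = [row[j] ** m for row in u]
            centers[j] = [
                sum(wi * p[k] for wi, p in zip(w, points)) / sum(w)
                for k in range(dim)
            ]
    return memberships(points, centers, m), centers


# Hypothetical one-dimensional data with two well-separated groups.
toy_points = [[0.0], [0.2], [4.0], [4.2]]
toy_u, toy_centers = fuzzy_c_means(toy_points, centers=[[0.0], [4.0]])
```

Unlike k-means, each point receives a degree of membership in every cluster, so ambiguous or noisy texts can be identified by memberships that are far from 0 or 1.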
The proposed methodology clusters textual data based on the T-S fuzzy model. On the one hand, the results obtained with the T-S models take the form of numerical relationships between selected keywords and the rest of the words constituting a text. Consequently, the authors can interpret these results not only qualitatively but also quantitatively. On the other hand, the proposed method clusters text while taking noise into account.
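To show why a T-S model yields numerical relationships, the sketch below evaluates a first-order T-S fuzzy model: each rule carries a local linear model, and the output is the average of the local outputs weighted by how strongly each rule fires. The Gaussian membership functions, the rule layout and all parameter values are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def gaussian(x, center, width):
    """Gaussian membership degree of x for a fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def ts_predict(x, rules):
    """Evaluate a first-order Takagi-Sugeno model at input vector x.

    Each rule is a dict with 'centers' and 'widths' (antecedent fuzzy sets,
    one per input) and 'coeffs' and 'bias' (the local linear consequent).
    """
    num = 0.0
    den = 0.0
    for rule in rules:
        # Firing strength: product of the per-input membership degrees.
        w = 1.0
        for xi, c, s in zip(x, rule["centers"], rule["widths"]):
            w *= gaussian(xi, c, s)
        # Local linear model attached to this rule.
        local = sum(a * xi for a, xi in zip(rule["coeffs"], x)) + rule["bias"]
        num += w * local
        den += w
    return num / den


# Two hypothetical rules on a single input, for example a keyword frequency.
toy_rules = [
    {"centers": [0.0], "widths": [1.0], "coeffs": [1.0], "bias": 0.0},
    {"centers": [10.0], "widths": [1.0], "coeffs": [0.0], "bias": 20.0},
]
```

The entries of `coeffs` are the quantities an analyst would read off: in a text setting they could express how strongly the output depends on each word frequency within the region covered by a rule.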
The originality comes from the fact that the authors validate some economic results based on textual data, even though the texts were not written by experts in the linguistic fields. In addition, the results obtained in this study are easy for analysts to interpret.