J-measure based hybrid pruning for complexity reduction in classification rules

Liu, Han; Gegov, Alexander; Stahl, Frederic

Liu, H., Gegov, A. and Stahl, F. ORCID: https://orcid.org/0000-0002-4860-0203 (2013) J-measure based hybrid pruning for complexity reduction in classification rules. WSEAS Transactions on Systems, 12 (9). pp. 433-446. ISSN 2224-2678

Abstract/Summary

Prism is a modular classification rule generation method based on the ‘separate and conquer’ approach, an alternative to rule induction using decision trees, which is also known as ‘divide and conquer’. Prism often achieves a level of classification accuracy similar to that of decision trees, but tends to produce a more compact, noise-tolerant set of classification rules. As with other classification rule generation methods, a principal problem arising with Prism is overfitting due to over-specialised rules. In addition, over-specialised rules increase the associated computational complexity. These problems can be addressed by pruning methods. For the Prism method, two pruning algorithms have been introduced recently for reducing overfitting of classification rules: J-pruning and Jmax-pruning. Both algorithms are based on the J-measure, an information-theoretic means of quantifying the theoretical information content of a rule. Jmax-pruning attempts to exploit the J-measure to its full potential, because J-pruning does not actually achieve this and may even lead to underfitting. A series of experiments has shown that Jmax-pruning may outperform J-pruning in reducing overfitting. However, Jmax-pruning is computationally relatively expensive and may also lead to underfitting. This paper reviews the Prism method and the two existing pruning algorithms above. It also proposes a novel pruning algorithm called Jmid-pruning. The latter is based on the J-measure and reduces overfitting to a level similar to that of the other two algorithms, but is better at avoiding underfitting and unnecessary computational effort. The authors conduct an experimental study of the performance of the Jmid-pruning algorithm in terms of classification accuracy and computational efficiency. The algorithm is also evaluated comparatively against the J-pruning and Jmax-pruning algorithms.
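
The abstract does not state the J-measure formula. As an illustration only, the sketch below computes the J-measure in the form usually attributed to Smyth and Goodman for a rule "IF x THEN y": J = p(x) · [p(y|x) · log2(p(y|x)/p(y)) + (1 − p(y|x)) · log2((1 − p(y|x))/(1 − p(y)))]. The function name `j_measure` and its parameter names are hypothetical, and the code is not taken from the paper; it assumes 0 < p(y) < 1.

```python
import math


def j_measure(p_x: float, p_y: float, p_y_given_x: float) -> float:
    """Illustrative J-measure of a rule IF x THEN y (hypothetical helper).

    p_x          probability that the rule antecedent x is satisfied
    p_y          prior probability of the rule consequent y (0 < p_y < 1 assumed)
    p_y_given_x  probability of y given that x is satisfied
    """

    def term(posterior: float, prior: float) -> float:
        # Convention: a term with zero posterior contributes nothing.
        if posterior == 0.0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    # Cross-entropy between posterior and prior beliefs about y.
    j_inner = term(p_y_given_x, p_y) + term(1.0 - p_y_given_x, 1.0 - p_y)
    # Weight by how often the rule fires.
    return p_x * j_inner
```

Under this definition, a rule whose antecedent does not change the probability of the class (p(y|x) = p(y)) has a J-measure of zero, while a rule that fires often and shifts the class probability strongly scores higher.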

Item Type	Article
URI	https://centaur.reading.ac.uk/id/eprint/34529
Official URL	http://www.wseas.org/cms.action?id=6952
Refereed	Yes
Divisions	Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords	Data Mining, Machine Learning, Classification Rules, J-pruning, Jmax-pruning, Jmid-pruning, if-then rules, overfitting, J-measure
Publisher	WSEAS

Deposit Details

Date Deposited:	29 Oct 2013 15:34
Last Modified:	07 Jun 2026 10:48