Comparison of Clang Abstract Syntax Trees using string kernels

Torres, R., Kunkel, J. M., Dolz, M. F. and Ludwig, T. (2018) Comparison of Clang Abstract Syntax Trees using string kernels. In: CADO 2018, 16-20 July, Orleans, France, pp. 106-113.

Text - Accepted Version

It is advisable to refer to the publisher's version if you intend to cite from this work.

Abstract Syntax Trees (ASTs) are intermediate representations widely used by compiler frameworks. One of their strengths is that they can be used to determine the similarity among a collection of programs. In this paper we propose a novel comparison method that converts ASTs into weighted strings in order to obtain similarity matrices and quantify the level of correlation among source codes. To evaluate the approach, we used the strings derived from the Clang ASTs of a set of 100 source code examples written in C. Our kernel and two other string kernels from the literature were used to obtain similarity matrices among those examples. Next, we used Hierarchical Clustering to visualize the results. Our solution was able to identify different clusters made up of examples that shared similar semantics. We demonstrated that the proposed strategy is a promising approach to similarity problems involving trees or strings.
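The pipeline described in the abstract (strings derived from ASTs, a string kernel, a similarity matrix, then clustering) can be illustrated with a minimal sketch. It is not the kernel proposed in the paper: it substitutes the plain k-spectrum kernel, a standard string kernel that counts shared contiguous k-grams, and it omits the weights the paper attaches to its strings. The token sequences are hand-written lists of Clang AST node kind names chosen for illustration, not actual Clang output, and the function names `spectrum`, `spectrum_kernel`, `similarity_matrix` and `agglomerate` are hypothetical. The clustering step is a naive average-linkage agglomeration written out in pure Python so that the example has no dependencies.

```python
from collections import Counter
from math import sqrt


def spectrum(tokens, k=2):
    """Count every contiguous k-gram in a token sequence."""
    return Counter(tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1))


def spectrum_kernel(a, b, k=2):
    """k-spectrum kernel: dot product of the two k-gram count vectors."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    return sum(count * sb[gram] for gram, count in sa.items())


def similarity_matrix(sequences, k=2):
    """Cosine-normalised kernel matrix (1.0 on the diagonal for non-empty spectra)."""
    self_k = [spectrum_kernel(s, s, k) for s in sequences]
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = sqrt(self_k[i] * self_k[j])
            if denom:
                matrix[i][j] = spectrum_kernel(sequences[i], sequences[j], k) / denom
    return matrix


def agglomerate(sim, n_clusters):
    """Naive average-linkage clustering: repeatedly merge the most similar pair."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[x] for j in clusters[y]]
                avg = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or avg > best[0]:
                    best = (avg, x, y)
        _, x, y = best
        merged = clusters.pop(y)  # y > x, so index x is unaffected
        clusters[x].extend(merged)
    return clusters


# Hypothetical pre-order node-kind sequences (illustrative, not real Clang dumps).
sequences = [
    ["FunctionDecl", "CompoundStmt", "DeclStmt", "VarDecl", "ForStmt",
     "BinaryOperator", "ReturnStmt"],
    ["FunctionDecl", "CompoundStmt", "DeclStmt", "VarDecl", "ForStmt",
     "BinaryOperator", "BinaryOperator", "ReturnStmt"],
    ["FunctionDecl", "CompoundStmt", "IfStmt", "CallExpr", "ReturnStmt"],
    ["FunctionDecl", "CompoundStmt", "IfStmt", "CallExpr", "CallExpr",
     "ReturnStmt"],
]

sim = similarity_matrix(sequences, k=2)
clusters = agglomerate(sim, n_clusters=2)
print(clusters)
```

With these four toy sequences, the two loop-based examples end up in one cluster and the two branch-based examples in the other, because each pair shares most of its 2-grams while the pairs share only the opening `FunctionDecl`, `CompoundStmt` 2-gram with each other. For a collection of 100 programs, a library routine for hierarchical clustering would be a more practical choice than the quadratic search above.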

Item Type: Conference or Workshop Item (Paper)
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 79588

