pub H-BRS | Search

2 search hits


What are the Features of this Software? (2014)

Paech, Barbara ; Hübner, Paul ; Merten, Thorsten

Application systems are often advertised with features, and features are used heavily for requirements management. However, software manufacturers often have only incomplete information about the features of their software. The information is distributed over different sources, such as requirements documents, issue trackers, user manuals, and code. In this paper, we research the occurrence of feature information in open source software engineering data. We report on a case study with three open source systems. We analyze what information about features can be found in issue trackers and user documentation. Furthermore, we study the abstraction levels on which the features are described and how feature information is related, and we discuss the possibility of discovering such information semi-automatically. To mirror the diversity of software development contexts, we chose open source systems that are quite different, e.g., in the rigor of issue tracker usage. The results differ accordingly. One main result is that, compared to a provided feature list, the user documentation did not provide more accurate information than the issue tracker. The results also give hints on how the management of feature-relevant information can be supported.

Classifying Unstructured Data into Natural Language Text and Technical Information (2014)

Merten, Thorsten ; Mager, Bastian ; Bürsner, Simone ; Paech, Barbara

Software repository data, for example in issue tracking systems, include natural language text and technical information, which ranges from log files and code snippets to stack traces. However, data mining is often interested in only one of the two types, e.g., in natural language text in the case of text mining. Regardless of which type is being investigated, any technique used has to deal with noise caused by fragments of the other type, i.e., methods interested in natural language have to deal with technical fragments and vice versa. This paper proposes an approach to classify unstructured data, e.g., development documents, into natural language text and technical information using a mixture of text heuristics and agglomerative hierarchical clustering. The approach was evaluated using 225 manually annotated text passages from developer emails and issue tracker data. Using white space tokenization as a basis, the overall precision of the approach is 0.84 and the recall is 0.85.
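
The idea of separating natural language from technical fragments can be illustrated with a minimal sketch. It is not the authors' implementation: the regular expression, the threshold of 0.5, and the function names `technical_ratio`, `classify`, and `precision_recall` are assumptions made for illustration only, and the agglomerative hierarchical clustering step named in the abstract is omitted. The sketch only shows a simple text heuristic over whitespace tokens and how precision and recall could be computed against manual annotations.

```python
import re

# Hypothetical heuristic (an assumption, not taken from the paper): a token
# looks "technical" if it contains characters typical of code, file paths,
# or stack traces, a dot between two word characters, or a run of digits.
TECHNICAL_TOKEN = re.compile(r"[{}()\[\];=<>_/\\]|\w\.\w|\d{2,}")


def technical_ratio(line: str) -> float:
    """Return the share of whitespace-separated tokens that look technical."""
    tokens = line.split()  # whitespace tokenization
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if TECHNICAL_TOKEN.search(token))
    return hits / len(tokens)


def classify(line: str, threshold: float = 0.5) -> str:
    """Label a line as 'technical' or 'natural' (threshold is an assumption)."""
    return "technical" if technical_ratio(line) >= threshold else "natural"


def precision_recall(predicted, expected, positive="technical"):
    """Compute precision and recall for the class given by `positive`."""
    pairs = list(zip(predicted, expected))
    tp = sum(p == positive and e == positive for p, e in pairs)
    fp = sum(p == positive and e != positive for p, e in pairs)
    fn = sum(p != positive and e == positive for p, e in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A line such as `java.lang.NullPointerException at com.example.Main.run(Main.java:42)` would be labeled technical by this sketch, while an ordinary sentence without code-like tokens would be labeled natural. A fixed threshold is the simplest possible decision rule; a clustering step like the one the abstract describes could replace it by grouping passages with similar feature values.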
