Confusion detection in code reviews

Novielli, Nicole;
2017-01-01

Abstract

Code reviews are an important mechanism for assuring the quality of source code changes. Reviewers can either add general comments pertaining to the entire change or use inline comments to pinpoint concerns or shortcomings about a specific part of the change. Recent studies show that reviewers often do not understand the change being reviewed and its context. Our ultimate goal is to identify the factors that confuse code reviewers and to understand how confusion impacts the efficiency and effectiveness of code review(er)s. As the first step towards this goal, we focus on identifying confusion in developers' comments. Based on an existing theoretical framework categorizing expressions of confusion, we manually classify 800 comments from code reviews of the Android project. We observe that confusion can be identified reasonably well by humans: raters achieve moderate agreement (Fleiss' kappa of 0.59 for general comments and 0.49 for inline ones). Then, for each kind of comment, we build a series of automatic classifiers that, depending on the goals of further analysis, can be trained to achieve high precision (0.875 for general comments and 0.615 for inline ones), high recall (0.944 for general comments and 0.988 for inline ones), or substantial precision and recall (0.696 and 0.542 for general comments and 0.434 and 0.583 for inline ones, respectively). These results motivate further research on the impact of confusion on the code review process. Moreover, other researchers can employ the proposed classifiers to analyze confusion in other contexts where software development-related discussions occur, such as mailing lists.
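
As an illustration of the two measurement steps mentioned in the abstract, the sketch below shows how inter-rater agreement (Fleiss' kappa) and a precision/recall trade-off for a confusion classifier could be computed in Python with statsmodels and scikit-learn. This is a minimal, hypothetical example: the comment texts, labels, and the TF-IDF plus logistic regression pipeline are assumptions made for illustration, not the authors' actual dataset or classifier.

    # Minimal sketch (assumed setup, not the authors' implementation) of the two
    # measurements described in the abstract: inter-rater agreement via Fleiss'
    # kappa, and a confusion classifier whose precision/recall balance is tuned
    # by moving the decision threshold. Comments and labels below are invented.
    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_score, recall_score
    from sklearn.model_selection import train_test_split

    # Inter-rater agreement: rows are review comments, columns are raters,
    # values are 1 if the rater judged the comment to express confusion.
    ratings = np.array([[1, 1, 1],
                        [0, 0, 1],
                        [0, 0, 0],
                        [1, 0, 1]])
    table, _ = aggregate_raters(ratings)          # subjects x categories counts
    print("Fleiss' kappa:", fleiss_kappa(table, method="fleiss"))

    # Confusion classifier (hypothetical data): TF-IDF features + logistic
    # regression; the classification threshold trades precision for recall.
    comments = ["I don't understand why this lock is needed here",
                "Nit: please rename this variable",
                "What does this flag actually do?",
                "Looks good to me, thanks for the fix"]
    labels = [1, 0, 1, 0]                         # 1 = expresses confusion

    X_train, X_test, y_train, y_test = train_test_split(
        comments, labels, test_size=0.5, random_state=0, stratify=labels)

    vectorizer = TfidfVectorizer()
    clf = LogisticRegression().fit(vectorizer.fit_transform(X_train), y_train)
    probabilities = clf.predict_proba(vectorizer.transform(X_test))[:, 1]

    # A higher threshold favours precision, a lower one favours recall.
    for threshold in (0.3, 0.5, 0.7):
        predicted = (probabilities >= threshold).astype(int)
        print(threshold,
              precision_score(y_test, predicted, zero_division=0),
              recall_score(y_test, predicted, zero_division=0))

In a real replication one would of course use the 800 labelled comments and a proper cross-validation setup rather than this toy split.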
ISBN: 9781538609927
Files in this item:
There are no files associated with this item.


Use this identifier to cite or link to this item: https://hdl.handle.net/11586/208823

Citations
  • PMC: not available
  • Scopus: 32
  • ISI: 27