This article traces the evolution of modern code review from formal inspections to tool-driven workflows, maps key research themes, and highlights a critical gap: how practitioners perceive that research.

What We Know (and Don’t) About Modern Code Reviews

1 INTRODUCTION

2 BACKGROUND AND RELATED WORK

3 RESEARCH DESIGN

4 MAPPING STUDY RESULTS

5 SURVEY RESULTS

6 COMPARING THE STATE-OF-THE-ART AND THE PRACTITIONERS’ PERCEPTIONS

7 DISCUSSION

8 CONCLUSIONS AND ACKNOWLEDGMENTS

REFERENCES


2 BACKGROUND AND RELATED WORK

In this section, we briefly review the history of peer code review (2.1), illustrate the MCR process (2.2), and discuss related literature surveys on MCR (2.3) and practitioner surveys in SE in general (2.4), illustrating the research gap that this study aims to fill (summarized in 2.5).

2.1 Peer Code Review

It is widely recognized that the peer review of code is an effective quality assurance strategy [2, 25]. Formal code inspections were introduced in the mid-1970s by Fagan [20]. The formal code inspection process requires well-defined inputs (a software product that is ready for review), planning and organization of review resources (time, location, expertise), execution of the review following guidelines that facilitate the detection of defects, and a synthesis of the findings, which is used for improvement [2]. Kollanus and Koskinen [25] reviewed the research on code inspections between 1980 and 2008 and found a peak of publications between the late 1990s and 2004 (averaging 14 papers per year), followed by a strong decline between 2005 and 2008 (4 papers per year).

This change in research interest coincides with the rise of MCR research, which started around 2007 and has shown a steady upward trend since 2011 [4]. Research on code inspections focused on reading techniques, effectiveness factors, processes, the impact of inspections, defect estimation, and inspection tools [25]. Interestingly, the tool aspect was the least researched one, with 16 out of 153 studies (10%). Modern code review (MCR) was born out of the need for lightweight yet efficient and effective quality assurance [3]. It is a technology-driven practice that complements continuous integration and deployment (CI/CD), a method for frequently and reliably releasing new features. CI/CD also saw a rise in practical adoption and research interest around 2010 [35].


2.2 Modern Code Review

Figure 1 illustrates the two phases and six main steps in MCR, which are typically supported by tools that integrate with version control systems (e.g., Gerrit, GitHub, and GitLab). The main actors involved in MCR are the code author(s) and the reviewer(s). While there may be organizational, technical, and tooling differences between open source and commercial software development implementing MCR, the general steps are valid for both contexts. A significant difference between MCR in open source and in commercial development is its perceived purpose: in open source development, reviewers focus on building relationships with core developers, while in commercial development, knowledge dissemination through MCR is more important [76].

In Step 1, the code author(s) prepare the code change for review, which usually includes a description of the intended modification and/or a reference to the corresponding issue recorded in a bug tracking system. When tools like Gerrit and GitHub are used, the change author creates a pull request. Questions that arise in this step are: What is the optimal size of a pull request? How can large changes be broken down into manageable pull requests? How should changes be documented?
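To make the question of pull request size concrete, the following minimal Python sketch illustrates one possible heuristic: measure the total number of changed lines and, if a threshold is exceeded, greedily group files into smaller batches that could each become a separate pull request. The threshold, function names, and data shapes are illustrative assumptions, not recommendations from the paper.

```python
# Hypothetical sketch: a size heuristic for deciding whether a code change
# should be split before it is submitted for review. The 400-line threshold
# and data shapes are illustrative assumptions.

def should_split(changed_lines_per_file: dict[str, int], max_lines: int = 400) -> bool:
    """Return True if the total change exceeds a reviewable size."""
    return sum(changed_lines_per_file.values()) > max_lines

def split_by_file(changed_lines_per_file: dict[str, int], max_lines: int = 400) -> list[list[str]]:
    """Greedily group files into batches that each stay under the size limit."""
    batches, current, current_size = [], [], 0
    for path, lines in sorted(changed_lines_per_file.items(), key=lambda item: -item[1]):
        if current and current_size + lines > max_lines:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += lines
    if current:
        batches.append(current)
    return batches

change = {"parser.py": 350, "lexer.py": 120, "tests/test_parser.py": 80}
if should_split(change):
    print(split_by_file(change))  # [['parser.py'], ['lexer.py', 'tests/test_parser.py']]
```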

In Step 2, the project/code owner selects one or more reviewers, typically using heuristics such as expertise in the affected code or time availability. Questions that arise in this step are: What is the optimal number of reviewers? Who is the best reviewer for a particular code change? What is the optimal workload for a reviewer? How should code changes be prioritized for review? In Step 3, the reviewer(s) are notified of their assignment, concluding the planning phase of MCR. Questions that arise in this step are: How much time should be allocated to a review? How should reviews be scheduled (batched, or scattered throughout the work day/week)?
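As an illustration of the expertise and availability heuristics mentioned for Step 2, the following minimal Python sketch ranks candidate reviewers by how often they previously worked on the files touched by a change, skipping reviewers who already carry too many open reviews. The data structures, scoring, and workload cap are illustrative assumptions, not a published recommendation algorithm.

```python
# Hypothetical sketch: rank candidate reviewers by prior activity in the files
# touched by a change, skipping reviewers who exceed a workload cap. Data
# structures and the cap are illustrative assumptions.
from collections import Counter

def recommend_reviewers(changed_files, past_activity, open_reviews, max_open=5, top_n=2):
    """past_activity maps reviewer -> files they previously authored or reviewed."""
    changed = set(changed_files)
    scores = Counter()
    for reviewer, files in past_activity.items():
        if open_reviews.get(reviewer, 0) >= max_open:
            continue  # availability heuristic: skip overloaded reviewers
        scores[reviewer] = sum(1 for f in files if f in changed)
    return [reviewer for reviewer, score in scores.most_common(top_n) if score > 0]

past = {"alice": ["parser.py", "lexer.py"], "bob": ["ui.py"], "carol": ["parser.py"]}
print(recommend_reviewers(["parser.py"], past, open_reviews={"alice": 6}))
# -> ['carol']  (alice has the most expertise but exceeds her workload cap)
```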

In Step 4, the reviewer(s) check the code changes for defects or suggest improvements. The procedure for this step is highly individualized and depends on tool support, company culture, and personal preference and experience. Typically, the reviewer inspects the code changes, which are surrounded by unchanged (context) code, and provides feedback on particular lines of code as well as on the overall change. Questions that arise in this step are: What information is needed and what are the best practices for an effective review? What is the most effective way to describe findings and comments on code changes? Can the identification of certain defects or improvements be automated?
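Regarding the question of whether certain findings can be identified automatically, the following minimal Python sketch turns a few mechanically detectable issues in added diff lines into line-anchored review comments, leaving design and logic concerns to the human reviewer. The rules and the comment format are illustrative assumptions, not a specific tool's checks.

```python
# Hypothetical sketch: turn mechanically detectable issues in added diff lines
# into line-anchored review comments. Rules and comment format are
# illustrative assumptions.
import re

RULES = [
    (re.compile(r"\bprint\("), "Debug print left in the change?"),
    (re.compile(r"\bTODO\b"), "Unresolved TODO in submitted code."),
    (re.compile(r"^.{121,}$"), "Line exceeds 120 characters."),
]

def auto_review(added_lines):
    """added_lines: list of (file_path, line_number, text) for '+' lines in a diff."""
    comments = []
    for path, lineno, text in added_lines:
        for pattern, message in RULES:
            if pattern.search(text):
                comments.append({"path": path, "line": lineno, "comment": message})
    return comments

diff = [("parser.py", 42, "print(token)  # TODO: remove before merge"), ("parser.py", 43, "return ast")]
for c in auto_review(diff):
    print(f"{c['path']}:{c['line']}: {c['comment']}")
```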

In Step 5, the reviewer(s) and author(s) discuss the code changes and feedback, often facilitated by tools that enable asynchronous communication and allow referencing code and actors. This interaction creates a permanent record of the technical considerations regarding the change that emerged during the review. Questions that arise in this step are: What are the key considerations for effective communication between reviewer(s) and author(s)? How can endless (unprofessional) discussions be avoided? How can consensus be facilitated?
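The permanent record mentioned above can be thought of as a set of discussion threads anchored to code and actors. The following minimal Python sketch shows one possible representation; the field names are illustrative assumptions and do not reflect any particular tool's data model.

```python
# Hypothetical sketch: a minimal record of a review discussion thread,
# anchored to a file and line and referencing the actors involved.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ReviewComment:
    author: str
    body: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class ReviewThread:
    path: str
    line: int
    comments: list[ReviewComment] = field(default_factory=list)
    resolved: bool = False

    def reply(self, author: str, body: str) -> None:
        self.comments.append(ReviewComment(author, body))

thread = ReviewThread("parser.py", 42)
thread.reply("reviewer", "Should this handle an empty token stream?")
thread.reply("author", "Good catch, added a guard in the next revision.")
thread.resolved = True
```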

In Step 6, the change is either rejected, accepted, or sent back to the author(s) for refinement. The decision process can be implemented with majority voting or can rest upon the judgement of the project/code owner. Questions that arise in this step are: To what extent can the decision process be automated? What can we learn from accepted/rejected changes that can be used to accept high-quality patches, or filter out low-quality ones, earlier?
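To illustrate the voting-based variant of the decision process, the following minimal Python sketch applies a Gerrit-like -2..+2 vote convention with a simple majority rule. Both the vote scale and the thresholds are illustrative assumptions; the paper does not prescribe a specific decision policy.

```python
# Hypothetical sketch: a voting-based acceptance decision. The -2..+2 scale
# and thresholds are illustrative assumptions.

def decide(votes: dict[str, int]) -> str:
    """Return 'accept', 'reject', or 'revise' based on reviewer votes."""
    if any(v <= -2 for v in votes.values()):
        return "reject"            # any strong objection blocks the change
    if sum(1 for v in votes.values() if v > 0) > len(votes) / 2:
        return "accept"            # simple majority of positive votes
    return "revise"                # otherwise, send back to the author(s)

print(decide({"alice": 2, "bob": 1, "carol": 0}))   # accept
print(decide({"alice": 1, "bob": -2}))              # reject
print(decide({"alice": 1, "bob": -1}))              # revise
```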

These and other questions are investigated in the primary studies identified in our study, as well as in the literature surveys presented in the related work, which is discussed next.


2.3 Literature surveys

While surveys on software inspections in general [2, 25, 26], checklists [10], and tool support [28] have been conducted in the past, surveys on MCR have only recently (since 2019) received increased interest from the research community. We identified six studies, besides our own, that mention MCR in their review aim, all published within a very short time frame (2019-2021). Table 1 summarizes key data of these reviews.

To the best of our knowledge, our systematic mapping study [4] presented the first results on the state-of-the-art in MCR research (April 2019). We identified and classified 177 research papers covering the time frame between 2007 and 2018. The goal of this mapping study was to identify the main themes of MCR research by analyzing the papers' abstracts. We observed an increasing trend of publications from 2011, with the major themes related to MCR processes, reviewer characteristics and selection, tool support, source code characteristics, and review comments. In this paper, we update the search to include studies published up to and including 2021, and we considerably deepen the classification and analysis of the themes covered in MCR research, reporting on the major contributions, key takeaways, and research gaps. Furthermore, we survey practitioners' opinions on MCR research in order to juxtapose research trends with the perspective from the state-of-practice.

Shortly after our mapping study, Coelho et al. [16] published their mapping study on refactoring-aware code reviews (May 2019). They argue that MCR can be conducted more efficiently if reviewers are aware of the type of changes, and therefore focus their search on methods/techniques/tools that support the classification of code changes. They identified 13 primary studies (2007-2018), of which 9 are unique to their review. This could be due to the inclusion of "code inspection" in their search string, resulting in papers that are not related to MCR (e.g., [1, 15]), even though Coelho et al. mentioned MCR explicitly in their mapping aim.

Nazir et al. [30] published preliminary results of a systematic literature review on the benefits of MCR in January 2020. They identified 51 primary studies, published between 2013 and 2019, and synthesized nine clusters of studies that describe benefits of MCR: software quality improvement, knowledge exchange, code improvement, team benefits, individual benefits, ensuring documentation, risk minimization, distributed work benefits, and artifact knowledge dissemination.

Indriasari et al. [22] reviewed the literature on the benefits and barriers of MCR as a teaching and learning vehicle in higher education (September 2020). They identified 51 primary studies, published between 2013 and 2019, and found that skill development, learning support, product quality improvement, administrative process effectiveness, and social implications are the main drivers for introducing peer code reviews in education. Analyzing the set of primary studies they included, we observe that this review has the least overlap with the other reviews. This is likely due to its particular focus on peer code reviews in education, a topic that was explicitly excluded, for example, in our study.

Çetin et al. [14] focused their systematic literature review on reviewer recommendation in MCR (April 2021). They identified 29 primary studies, published between 2009 and 2020, and report that the most common approaches are based on heuristics and machine learning, are evaluated on open source projects but still suffer from reproducibility problems, and are threatened by limited model generalizability and data integrity issues.

We now discuss the two reviews of MCR that are closest in scope and aim to our work, and illustrate similarities in observations and the main differences in contributions between the reviews. Wang et al. published a pre-print [36] on the evolution of code review research (November 2019), which has been extended, peer-reviewed, and published in 2021 [37]. They identified 112 primary studies, published between 2011 and 2019. Similar to our results (see Figure 5b), they observe a predominant focus on evaluation and validation research, with fewer studies reporting experiences and solution proposals for MCR.

The unique contributions of their review are the assessment of the studies' replicability (judged by the availability of public data sets) and the identification and classification of metrics used in MCR research. The former is important as it allows other researchers to conduct replication studies, and the latter helps researchers design studies whose results can be benchmarked. Compared to Wang et al., our review of the themes studied in MCR research is more granular (9 vs. 47), and we provide a narrative summary of the papers' contributions.

Finally, Davila and Nunes [18] performed a systematic literature review with the aim to provide a structured overview and analysis of the research done on MCR (2021). They identified 1381 primary studies published between 1998 and 2019, and provide an in-depth analysis of the literature, classifying the field into foundational studies (which try to understand the practice), proposal studies (which aim to improve the practice), and evaluation studies (which measure and compare practices). Their synthesis provides excellent insights into the MCR state-of-the-art, with findings that are interesting for researchers as well as practitioners.


2.4 Practitioner surveys

Several studies have investigated the practical relevance of software engineering research by surveying practitioners (see Table 2). Lo et al. [27] sought to gauge the relevance of research ideas presented at ICSE and ESEC/FSE, two premier conferences in Software Engineering. They summarized the key contributions of 571 papers and let practitioners (employees at Microsoft) rate the ideas on a scale of essential, worthwhile, unimportant, and unwise. Overall, 71% of the ratings were positive. However, they found no correlation between academic impact (citation count) and relevance score.

Carver et al. [13] replicated Lo et al.'s study with research contributions from a different conference (ESEM), targeting a wider and international audience of practitioners. They also investigated what practitioners think SE research should focus on. Their conclusions are similar to those of the ICSE and ESEC/FSE study, with 67% overall positive ratings for the research, and no correlation between academic impact and relevance score. Furthermore, they found that the research at the conference addresses the needs expressed by the practitioners quite well. However, they highlight the need to improve the discoverability of the research to enable knowledge transfer between research and practice.

Finally, Franch et al. [21] surveyed practitioners in the field of requirements engineering and found mostly positive ratings (70%) for the research in this area. The practitioners' justifications for positive ratings relate to the perceived problem relevance and solution utility described in the research. The requirements engineering activities that should receive the most attention from research, according to the practitioners' needs, are traceability, evaluation, and automation.


2.5 Research gap

While recent years have seen several literature surveys on MCR, we know very little about how this research is perceived by practitioners. Looking at the research questions shown in Table 1, one can observe that a few studies sought to investigate how MCR techniques [14] and approaches [18, 37] have been evaluated. However, none has yet studied practitioners' perception of MCR research, even though general software and requirements engineering research has been the target of such surveys (see Table 2).

In this study, we focus the literature review on identifying the main themes and contributions of MCR research, summarize that material in an accessible way in the form of evidence briefings, and gauge practitioners' perceptions of this research using a survey. Based on the results of these two data collection strategies, we outline the most promising research avenues from the perspective of their potential impact on practice.

:::info Authors:

  1. DEEPIKA BADAMPUDI
  2. MICHAEL UNTERKALMSTEINER
  3. RICARDO BRITTO

:::

:::info This paper is available on arxiv under CC BY-NC-SA 4.0 license.

:::

