Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality
To investigate, GitClear analyzed approximately 153 million changed lines of code, authored between January 2020 and December 2023 [A1]. This is the largest known database of highly structured code change data that has been used to evaluate code quality differences [A2]. We find disconcerting trends for maintainability. Code churn -- the percentage of lines that are reverted or updated less than two weeks after being authored -- is projected to double in 2024 compared to its 2021, pre-AI baseline. We further find that the percentage of "added code" and "copy/pasted code" is increasing in proportion to “updated,” “deleted,” and “moved” code. In this regard, code generated during 2023 more resembles an itinerant contributor, prone to violate the DRY-ness of the repos visited.