This work, the first comprehensive investigation of unethical behavior in OSS, highlights the significance of resolving ethical concerns in software communities and shows promising automated ways of detecting such issues.

How Automated Tools Are Making Open Source Software Safer

Abstract and 1. Introduction

  2. Background and Related Work

  3. Study of Unethical Behavior in OSS

    3.1 RQ1: Types of unethical behavior

    3.2 RQ2: Affected software artifacts

  4. Methodology

    4.1 Modeling via SWRL rules

    4.2 Automatic detection of unethical behavior

  5. Evaluation

  6. Discussion and Implications

  7. Threats to Validity

  8. Conclusion and References

7 THREATS TO VALIDITY

External. Our findings on unethical behavior may not generalize beyond the studied OSS projects and issues/PRs. Some unethical behavior may never be reported to the issue tracker, and we have no practical way to study such unreported cases. Because some issues may not contain the ethics-related keywords that we used for searching, we may also have missed some unethical behavior. Nevertheless, our selected keywords helped us discover many types of unethical behavior, so we believe the issues in our study provide a representative sample of the reported and resolved unethical issues in the studied repositories. Although the other types of unethical behavior discovered in our study are important, Etor can detect only six types, and our evaluation is limited to these six. Nevertheless, our experiments show that Etor detects unethical behavior with relatively high accuracy (a 74.8% TP rate on average in our evaluation).
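The keyword-related threat can be made concrete with a minimal sketch. The keyword list and issue titles below are hypothetical, invented purely for illustration; they are not the keywords or issues used in the study. The sketch shows how a report that describes the problem without using any search keyword slips past a keyword filter.

```python
import re

# Hypothetical keyword list for illustration only; the study's actual
# search keywords are not reproduced in this section.
KEYWORDS = ["unethical", "plagiarism", "license violation"]


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword appears in text as a whole phrase (case-insensitive)."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
        for keyword in keywords
    )


# Hypothetical issue titles, invented for this example.
issues = [
    "This repo is plagiarism of my project",
    "Code copied from my repository without credit",
]

# Issues the keyword search would retrieve, and those it would miss.
found = [title for title in issues if matches_keywords(title)]
missed = [title for title in issues if not matches_keywords(title)]

print("found:", found)
print("missed:", missed)
```

Both titles describe the same kind of complaint, yet only the first contains a listed keyword, so the second would be absent from a keyword-based sample.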

Internal. Our code and scripts may have bugs that can affect our results. To mitigate this threat, we make our tool and data publicly available for inspection.

8 CONCLUSION

To better understand unethical behavior in OSS projects, we conduct a study of its types. By reading and analyzing the discussions of stakeholders in OSS projects, our study of 316 GitHub issues identifies 15 types of unethical behavior. These unethical behaviors affect various types of software artifacts. Inspired by our study, we propose Etor, an ontology-based approach that can automatically detect unethical behavior. Our evaluation of Etor on 195,621 issues (1,765 repositories) shows that Etor automatically detects 548 issues with a 74.8% true positive (TP) rate on average. As the first study that investigates the types of unethical behavior in OSS projects, we hope to raise awareness among OSS stakeholders of the importance of understanding ethical issues in OSS projects. While Etor shows promising results in the automated detection of unethical behavior in OSS projects, in the future we plan to enhance Etor to detect more types and to reduce false positives using machine learning techniques.
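As a reading aid for the reported figure, the sketch below computes a TP rate as the share of detected issues that are confirmed as true positives, and contrasts two common ways of averaging it across behavior types. The counts are hypothetical and are not the paper's per-type results. This section does not state which averaging the 74.8% figure uses, so both variants are shown as assumptions.

```python
def tp_rate(true_positives, detected):
    """Fraction of detected issues that were confirmed as true positives."""
    if detected <= 0:
        raise ValueError("detected must be positive")
    return true_positives / detected


# Hypothetical (true_positives, detected) counts per behavior type.
# These numbers are invented for illustration, not taken from the paper.
counts = {"type_a": (30, 40), "type_b": (9, 10)}

per_type = {name: tp_rate(tp, n) for name, (tp, n) in counts.items()}

# Macro average: mean of the per-type rates, so each type counts equally.
macro = sum(per_type.values()) / len(per_type)

# Micro average: pool all detections, so each detected issue counts equally.
total_tp = sum(tp for tp, _ in counts.values())
total_detected = sum(n for _, n in counts.values())
micro = tp_rate(total_tp, total_detected)

print(f"macro={macro:.3f} micro={micro:.3f}")
```

With these hypothetical counts the macro average is 0.825 and the micro average is 0.78, which shows why an "average" TP rate should be read together with the averaging method.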


:::info Authors:

(1) Hsu Myat Win, Southern University of Science and Technology, China;

(2) Haibo Wang, Southern University of Science and Technology, China;

(3) Shin Hwei Tan, a corresponding author from Southern University of Science and Technology, China.

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
