Without open source, there is no AI. It’s that simple. However, current open-source licenses such as the GPL, the Apache License, and the Mozilla Public License were not designed for software-as-a-service or cloud services, and certainly not for AI’s large language models (LLMs). The problem is already surfacing in court: Microsoft, OpenAI, and GitHub face a lawsuit alleging that their AI-based systems misappropriated open-source code, and authors are suing Microsoft and OpenAI for using their works to train LLMs without permission or attribution.
There are differing opinions on how AI-produced code should be treated. Some believe it should be in the public domain, while others argue that companies should treat it as their property and protect it through copyright and contract law. These debates underscore how difficult it is to find a suitable licensing model for AI-generated code.
Simply declaring an AI open source is not enough, as Meta’s Llama 2 shows. The term “open source” should be reserved for software that complies with the Open Source Definition (OSD) and uses a license approved by the Open Source Initiative (OSI). Companies that misrepresent their AI as open source may face legal consequences.
The road to defining open source has been long and bumpy, starting with the GNU General Public License (GPL) and eventually leading to the OSD and the creation of the OSI. Applying open-source licenses to AI and LLMs, however, is far more complex. While some open LLMs exist, most models incorporate proprietary, copyrighted, or undisclosed material.
To address this challenge, industry leaders are working toward a new understanding of open-source AI. Stakeholders from AI companies, including Google, Microsoft, and GitHub, along with various foundations, are collaborating on a draft AI Open Source Definition. The goal is a common framework for open-source AI that benefits all parties involved. Given how rapidly AI is advancing and how badly a solid open-source framework is needed, the draft is expected to be finalized soon.