Patent Risk Framework for Open Source Contributions
Open source contributions can impact your patent portfolio. How do you analyze that risk?
A shorter version of this article was published on Law360, Expert Analysis, on February 29, 2024.
Open source software has become indispensable in the tech industry for both engineering and strategic purposes. Infrastructure projects like Linux, Apache Spark, Kubernetes, and TensorFlow form the backbone of numerous tech stacks. Additionally, companies like Google have strategically open-sourced platforms like Android OS to bolster adoption of their proprietary services. This extends to AI. Meta open sourced PyTorch and allows free commercial use of Llama 2 (with restrictions discussed later in the article). Additionally, many commercial open source software (COSS) companies such as Confluent, GitHub, Databricks, HashiCorp, and MongoDB have surpassed $200 million in annual recurring revenue. These trends are accelerating.
The previous article in this series explained patent license provisions in widely used open source licenses. This article presents a framework for analyzing patent risks stemming from contributions to open source software. The framework addresses whether: (i) contributing to an open source project requires granting licenses to patent claims in the portfolio, and (ii) it is strategically advantageous to encumber the portfolio in this manner.
Companies vary in their sensitivity to the impact on their patent portfolio. On one end are commercial open source companies like Databricks, offering enterprise solutions rooted in open source projects like Apache Spark. Examples also include Elastic, associated with Elasticsearch, HashiCorp, associated with a cloud platform, and Red Hat, associated with enterprise Linux. Their business models are intertwined closely with open source projects and communities. For such companies, the advantages of contributing to open source projects likely outweigh the impact on their patent portfolios. On the opposite end are companies like Apple, which emphasize proprietary technologies, protect their intellectual property (IP), and are particularly sensitive to negative effects on their patent portfolios from open source contributions. Most companies fall in between: sensitive to contributions in tech areas crucial to differentiation (e.g., content ranking for Netflix, core search for Google), and less sensitive to contributions in non-differentiating tech areas.
With this background, the patent risk framework addresses the following questions:
License Grant Obligation
Will contributing to an open source project require granting licenses to patents within the portfolio? This can be broken down into three steps:
Where is the contribution being made?
What is the technical field and scope of the contribution?
Are there pertinent patent claims, and are they infringed?
Strategic Evaluation
Is it strategically advantageous or disadvantageous to encumber the portfolio in this manner?
This framework can be depicted visually:
FIG. 1. Framework for analyzing patent risk stemming from open source contributions.
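The flow in FIG. 1 can also be read as a short decision procedure. The Python sketch below is only an illustration of that flow, not a legal test: the `Contribution` record, the `assess` function, and the outcome strings are hypothetical names introduced here, and each boolean stands in for a judgment that the patent team and engineering would make together.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    """Hypothetical record of the facts gathered while walking the framework."""
    license_id: str                   # step 1: where is the contribution made?
    technical_field: str              # step 2: technical field and scope
    pertinent_claims_infringed: bool  # step 3: do portfolio claims read on it?
    strategically_advantageous: bool  # strategic evaluation of the encumbrance


def assess(c: Contribution) -> str:
    """License grant obligation first, then strategic evaluation."""
    if not c.pertinent_claims_infringed:
        # No claims are implicated, so no outbound patent license is at stake.
        return "proceed: no portfolio claims implicated"
    if c.strategically_advantageous:
        return "proceed: encumbrance accepted"
    return "hold: encumbrance not justified"


bug_fix = Contribution("Apache-2.0", "build tooling", False, False)
print(assess(bug_fix))
```

The ordering mirrors the article: the strategic question only arises once a license grant obligation has been found.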
Let's delve into this framework. We’ll restrict discussion to analysis of open source licenses that explicitly include patent license provisions—specifically, Apache 2.0 and GPL 3.0.
I. Apache 2.0 or GPL 3.0?
Under the prevailing interpretation, the Apache 2.0 patent license has a limited scope, covering only those claims that pertain to a contribution alone or the combination of the contribution and the open source project. Whether the Apache 2.0 license has a broader interpretation where the patent license is scoped to the entire modified software has not been litigated (at least not with a publicly available outcome). In contrast, the GPL 3.0 patent license (illustrated in FIG. 2 below) encompasses all patent claims related to a contributor's version of the software. That means, even for minor contributions like a bug fix, a GPL 3.0 patent license includes all patent claims infringed by any part of the contributor’s version of the GPL 3.0 software.
FIG. 2. The GPL 3.0 patent license has a broader impact, encompassing a larger set of patent claims compared to the Apache 2.0 patent license.
Therefore, a starting point in the analysis is identifying whether the contribution is made to an Apache 2.0 or GPL 3.0 project, as this determination guides the identification of relevant patent claims. In the context of initiating a new open source project, the choice of open source license, whether Apache 2.0 or GPL 3.0, sets the tone for the subsequent analysis.
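To make the difference in scope concrete, the sketch below models which claims would be swept into the outbound license under each license, following the prevailing interpretation described above. It is a simplified illustration: the `Claim` record, its boolean fields, and the `licensed_claims` function are hypothetical, and the infringement reads they encode would come from the analysis in Section III.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical summary of an infringement read for one patent claim."""
    claim_id: str
    reads_on_contribution: bool   # infringed by the contribution alone
    reads_on_combination: bool    # infringed by contribution + project
    reads_on_whole_version: bool  # infringed by any part of the contributor's version


def licensed_claims(license_id: str, claims: list[Claim]) -> list[str]:
    """Return the ids of claims covered by the outbound patent license."""
    if license_id == "Apache-2.0":
        # Contribution alone, or its combination with the project.
        return [c.claim_id for c in claims
                if c.reads_on_contribution or c.reads_on_combination]
    if license_id == "GPL-3.0":
        # Everything infringed by the contributor's version, even for a bug fix.
        return [c.claim_id for c in claims if c.reads_on_whole_version]
    raise ValueError(f"no express patent grant modeled for {license_id}")


portfolio = [
    Claim("X1", True, True, True),      # covers the contributed feature itself
    Claim("X2", False, False, True),    # covers an unrelated module of the project
    Claim("X3", False, False, False),   # unrelated to the project
]
print(licensed_claims("Apache-2.0", portfolio))
print(licensed_claims("GPL-3.0", portfolio))
```

With this made-up portfolio, the Apache 2.0 call returns only `X1`, while the GPL 3.0 call returns `X1` and `X2`, which is the gap FIG. 2 illustrates.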
Practice tips.
Open source licenses such as MIT and BSD do not include an express patent grant. Nevertheless, it is good practice to document patent claims affected by contributions to these projects. A prudent approach may be identifying claims that apply to the contributor's version of the software, akin to GPL 3.0.
Consider a contribution that ports existing open source code to a different language or platform. In cases where the code is licensed under Apache 2.0 and no innovative elements are introduced, there's a plausible argument that no contributor patent claims are implicated. However, caution may be warranted if the code is GPL 3.0 licensed and the contributor holds patents in the relevant area. The ported version qualifies as a contributor's version, thereby exposing the contributor's patent claims related to the ported version to an outbound patent license.
II. Technical field and scope
Evaluating the impact of a contribution on the patent portfolio requires a deep understanding of its technical field, as this directly influences the patents that might be affected by the contribution. For instance, contributing to the Blender open source project prompts the question of whether the patent portfolio encompasses patents in pertinent domains like graphics, modeling, rendering, or video processing. The technical field sets the stage for determining if any patent claims are implicated.
Open source contributions serve diverse technical purposes, and comprehending the motivation behind a contribution adds further context, including the technical scope, to the analysis.
Open source libraries often underpin the functionality of proprietary code, and contributions to these libraries may emerge during the development of proprietary software. The technical scope of such contributions can vary significantly. Many contributions, particularly bug fixes, lack innovative elements. Consider fixes like adjusting conditionals in if statements, correcting method call parameters, or introducing null checks where none existed before, all typical scenarios for a considerable portion of bug fixes [1]. From a patent perspective, these contributions are often not deemed patent-eligible or are not valuable enough for a company to pursue patent protection. In many cases, a company's patent portfolio might lack claims relevant to such bug fixes, resulting in a relatively low risk of impacting the patent portfolio, although it's advisable to verify.
Conversely, other contributions may enhance the functionality or performance of the library and could exhibit a more innovative technical scope. The higher the degree of innovation in a contribution, the more likely it is to have implications for patents in the company's portfolio. Therefore, assessing the nature and technical depth of contributions becomes crucial in understanding their potential impact on the patent portfolio.
It's worth considering whether innovations should be kept in-house. While many companies discover that the engineering overhead of maintaining a proprietary fork of a library can be substantial, potentially outweighing the cost of open sourcing the innovation, conduct a thorough analysis of the associated patent risk before making a decision. Include relevant stakeholders in the decision-making process.
In different scenarios, contributions might be directed towards large open source projects such as Apache Spark (large-scale data processing), Kubernetes (container orchestration), PyTorch and TensorFlow (machine learning frameworks), and Blender (3D graphics). The innovativeness of such contributions depends on the specific problem being addressed.
Practice tips.
Watch for employees who open source company proprietary information without authorization. Sometimes employee mobility drives this. Imagine an employee working on innovative proprietary technology. Sharing crucial components openly enables them to continue working on the project after leaving their current employer.
Contributions made by employees in their personal capacity, rather than on behalf of the company, do not impact the company's patent portfolio. However, be careful in situations where the subject matter of the contribution overlaps areas of the company patent portfolio or company differentiators.
Ensure no proprietary code is contributed improperly. Verify that the contributed code aligns with the company's open source version and doesn't include proprietary code. This involves implementing robust processes and checks within the engineering team to prevent inadvertent or intentional contributions of proprietary code to public repositories like GitHub. Additionally, fostering a culture of awareness and adherence to the company's code contribution policies is crucial in ensuring that engineering teams understand the importance of not including proprietary code in open source contributions. Regular training and communication about these guidelines can help maintain code integrity and protect the company's intellectual property.
Implementing robust processes generally is crucial to ensuring that open source contributions are well-monitored and don't go unnoticed. In organizations where a culture of secrecy is prevalent, there exists a constant risk that employees might contribute code to open-source projects without obtaining clearance from the legal department. To address this, establish clear guidelines and mechanisms to track and document contributions. Encourage a culture of compliance through regular training and communication.
III. Infringed patent claims
The next step is to assess whether the patent portfolio includes pertinent claims, specifically those that read on the contribution or the combination of the contribution and the project (for Apache 2.0), and those that read on the software as a whole (for GPL 3.0). This task can be challenging, especially with extensive portfolios. When dealing with tens of thousands of patent claims, the question arises: where do you begin?
Don’t reinvent the wheel. If previous decisions have been made regarding the technical field or a specific open source project, leverage them. Earlier analyses may have already pinpointed patents relevant to a given technical area. Some legal departments maintain a roster of approved open source projects to which contributions are approved because they don’t implicate any portion of the patent portfolio. For instance, an electric vehicle manufacturer might pre-approve contributions to open source video game engine development like Godot Engine because these fields are vastly different.
However, scrutinize past decisions. While a prior choice might have allowed certain contributions because the company lacked patents in that technical field at the time, circumstances may have changed. The company might have entered that technical field, acquiring patents or filing applications since then. For instance, Google started with search but began offering web browsers, mobile operating systems, phones, cloud services, AI products, and others over time.
Leverage tools. Utilize available tools efficiently. Established companies with well-defined processes often have pre-classified portfolios, enabling a swift filtration of patents by the relevant technical field. For instance, Cipher.ai employs a patent taxonomy (like the Universal Technology Taxonomy) to map a portfolio based on training data - exemplary patents that meet definitions for different technology areas - provided by the client. Companies such as ARM and Red Hat have used Cipher to this effect. New generative AI tools, such as the openly available LLM foundation model Llama 2, can be fine-tuned for classification on premises or on a private cloud instance using training data. These tools can be used to categorize other patents in the portfolio according to a specific taxonomy.
Although not as rapid or precise, docketing tools like Anaqua or Questel allow keyword searches using terms pertinent to the technical field. Intelligence tools such as PatSnap or Google Patents are also valuable, although they lack access to unpublished patent applications.
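A keyword pass of this kind can be sketched in a few lines. The example below is a rough first-pass filter over made-up abstracts, not a depiction of how Anaqua, Questel, or any other tool works internally; the function name `triage_by_keywords` and the sample portfolio are assumptions for illustration.

```python
import re


def triage_by_keywords(patents: dict[str, str], keywords: list[str]) -> list[str]:
    """Return ids of patents whose abstract mentions any keyword (case-insensitive)."""
    if not keywords:
        return []  # an empty pattern would otherwise match every abstract
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [pid for pid, abstract in patents.items() if pattern.search(abstract)]


# Made-up portfolio entries for illustration only.
portfolio = {
    "P-001": "Method for rendering three-dimensional meshes on a GPU",
    "P-002": "Battery thermal management for electric vehicles",
    "P-003": "Video processing pipeline with adaptive Rendering quality",
}
print(triage_by_keywords(portfolio, ["rendering", "mesh"]))
```

A filter like this only narrows the candidate set. The hits still need a claim-by-claim read, and unpublished applications have to be checked in the company's own docketing records.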
Talk to people. In a sizable company, a team of portfolio managers, each overseeing specific technology areas, is typically responsible for leading patent prosecution efforts. For example, Meta's portfolio is divided into two segments: (i) app infrastructure and advertisement platform, and (ii) metaverse and AI. Individual portfolio managers specialize in specific areas within each division, offering a quick reference for guidance.
If the identification process reveals no relevant claims, that is reassuring: it indicates that the open source contribution poses little patent risk.
Once relevant claims are pinpointed, the next step involves determining whether (i) the contribution or its combination with the software, or (ii) the software as a whole, infringes the identified patent claims. A patent license to the infringed patent claims must be granted if the contribution and distribution occur.
Practice tips.
Consider potential patent deals on the horizon. If the company intends to acquire patents in the technical field related to the open source contribution, evaluate whether the obligations associated with the open source patent license align with the company's objectives for acquiring the patents. This consideration should play a role in due diligence for the transaction.
Consider the broader context. Examine patent claims that are "licenseable" (Apache 2.0) or "controlled" (GPL 3.0). If a set of patent rights is in-licensed, determine whether the company holds the right to sublicense these patent claims. If sublicensing is permitted, identify claims within this set that are pertinent to the contribution or the open source software. This necessitates a thorough review of in-bound licenses.
Exercise caution if in-bound license terms prohibit royalty-free sublicensing. Many licenses contain such provisions, as licensors generally prefer licensees not to provide their patents for free. However, this conflicts with the contributing company's ability to adhere to the patent license requirements of the open source license. For example, it would hinder a company from contributing to a GPL 3.0 project because GPL 3.0 requires the company to grant a royalty-free patent license to the relevant in-licensed patents. Similar considerations apply to an Apache 2.0 project.
If no pertinent patent claims are discovered in this phase, evaluate whether filing a patent application based on the contribution is worthwhile. Even if there are no immediate plans for patent enforcement, filing a patent application offers the company strategic flexibility in the future. This decision is more justifiable if the company has the financial capacity to pursue the filing.
Take a cue from Google's AI patent strategy. Google published some of the earliest papers in the field, like the seminal transformer paper on self-attention (posted to arXiv in June 2017 and published in conference proceedings in December 2017). The paper moved the entire field forward, allowing OpenAI to bring ChatGPT to market ahead of Google’s Bard. Google nevertheless filed, and continues to file, patent applications based on the technology. A search on Google Patents shows that Google holds over 200 patents and patent publications related to self-attention, including US Patent 10,452,978, which is based on the self-attention paper, claims priority to a provisional patent application filed on May 23, 2017, and has been sustained through subsequent continuations.
IV. To license or not to license
Deciding whether the infringed claims should be encumbered through the contribution involves multiple considerations, with primary risks including licensing competitors and potential challenges in enforcing the patent.
A. Licensing competitors
The Apache 2.0 and GPL 3.0 patent licenses stipulate that all downstream users must benefit, eliminating the owner's control over licensing recipients. This raises a crucial question: Could a competitor utilize the open-source project for a proprietary alternative? For instance, if Company A releases software under the Apache 2.0 license, and Company B uses it to create and sell a proprietary version, A faces competition that emerged swiftly from its own software release. A cannot assert a successful patent infringement claim against B because B holds a patent license defense under Apache 2.0, as illustrated in FIG. 3 (below).
FIG. 3. Company B owes no royalties on the use of patent X because it has a license to patent X as a downstream user of the open source software.
Company A finds itself in a challenging situation: it provided its intellectual property without royalties to a competitor. The only potential avenue for enforcement is if Company B initiates a patent infringement claim against A, triggering the defensive termination provision and terminating B's license to A's patents.
Had Company A chosen to release its software under a GPL 3.0 license, the scenario would unfold differently. Even if Company B developed competing software based on A's released code, B would be compelled to license its software under GPL 3.0, mirroring A's terms. Compliance would require B to make the source code available upon distribution, as stipulated by GPL 3.0. Consequently, A would have access to B's source code, including enhancements, and would receive a license to any of B's patents covering the software. This setup should minimize differentiation between A's and B's software, potentially thwarting B's market dominance.
The outcome is not guaranteed, particularly if B is a significant player capable of bundling the software with other offerings.
Meta pursued a different strategy with Llama 2. Instead of choosing an Apache 2.0 license, which would permit any competitor to create a commercially competitive product, Meta chose a Community License. Under this license, downstream users with "greater than 700 million monthly active users" of their products or services must seek a separate license from Meta, rendering Llama 2 not truly open source. Consequently, Meta restricted major tech companies, including Amazon, Apple, Google, and Microsoft, each having more than 700 million monthly active users, from developing products and services utilizing Llama 2. Notably, the Community License doesn't provide an explicit patent license to downstream users.
Returning to the scenario where A released its software under Apache 2.0, one potential action for A could be to alter the license terms under which its software is provided. Similar to GPL 3.0, the modified license would prevent B from unfairly profiting from A’s work. An illustrative instance is Elasticsearch’s license change from Apache 2.0 to MongoDB’s Server Side Public License (SSPL), a move prompted by competing software offered by Amazon.
It's important to recognize that patents encumbered through a contribution may lose some of their licensing value.
Consider the scenario where patent X is bound by an Apache 2.0 patent license due to Company A's contribution. The following section assesses the implications of asserting patent X in a legal dispute.
B. Royalties
First, suppose that patent X is asserted against a downstream user B. Right off the bat, B has a patent license defense, meaning that B owes no royalties for using patent X, shown in FIG. 3 above.
However, the situation alters slightly if the defensive termination provision is activated. In this case, B would owe royalties for the ongoing infringing use of patent X. Nevertheless, if the defensive termination provision doesn't have retroactive effects, as in Apache 2.0, B wouldn't be liable for past damages (i.e., royalties for infringing use of patent X before the defensive termination). This scenario is depicted in FIG. 4 below. In both situations, A's return on patent X is limited.
FIG. 4. Company B loses the license to patent X if the defensive termination provision is triggered.
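The timing point can be made concrete with a small calculation. The sketch below assumes a non-retroactive defensive termination, as described above for Apache 2.0, and simply counts the days of B's use that fall after termination. The function name and dates are hypothetical, and a real damages analysis involves far more than a day count.

```python
from datetime import date
from typing import Optional


def royalty_bearing_days(first_use: date, last_use: date,
                         terminated_on: Optional[date]) -> int:
    """Days of B's use of patent X that fall outside the license."""
    if terminated_on is None:
        return 0  # license never terminated: B keeps its full license defense
    # Non-retroactive termination: only use on or after termination counts.
    start = max(first_use, terminated_on)
    return max((last_use - start).days, 0)


# B used the software for two years; its license terminated 18 months in.
print(royalty_bearing_days(date(2022, 1, 1), date(2024, 1, 1), date(2023, 7, 1)))
```

In this example only the final 184 days are royalty-bearing, and the first 18 months of use remain covered by the license.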
For A, there is another downside to asserting patent X. Consider the scenario where A’s open source software (corresponding to patent X) relies on other open source components. As a downstream user, A itself receives inbound patent licenses from the authors of those other open source components. However, the moment A asserts patent X, the defensive termination provision is activated, resulting in the termination of A's licenses for those components. Consequently, A's use of those components becomes unlicensed.
Practice tip.
Open source license enforcement can be an effective defensive tactic in patent infringement litigation. Jacobsen v. Katzer and Twin Peaks Software Inc. v. Red Hat, Inc. are illustrative. In both cases, the plaintiffs brought a patent infringement claim related to open source software. During discovery, the defendants learned that the plaintiffs used defendants’ open source code without complying with the terms of the respective open source license. Subsequently, both cases reached settlements. For patent plaintiffs, a comprehensive open source license compliance program becomes crucial in mitigating such defensive maneuvers.
If patent X is asserted against a third party (D) who doesn't utilize the open source software, the situation differs. For instance, D might sell competing proprietary software that infringes on patent X. Given that D is not a downstream user of the open source software, it does not receive a license to patent X. Consequently, D cannot employ the patent license defense accessible to B (depicted in FIG. 5). Nevertheless, A is not guaranteed full royalties on patent X. D is likely to contend that it owes no royalties because A already licenses patent X on a royalty-free basis to downstream users. D's argument is grounded in the idea that the economic value of patent X is influenced by other license agreements related to it, specifically, the royalty-free Apache 2.0 patent license in this case.
FIG. 5. Company D is not a downstream user of the open source software and, as such, does not have a license to patent X.
Company A has a counterargument – the Apache 2.0 patent license applies solely to the open source project, and D's utilization of patent X lies beyond that scope. A's position is fortified by tangible distinctions between the open source project's scope and D's software. For instance, they might pertain to entirely different fields of use; the open source project might be linked to search functionality, while D's software is automotive-related. Or the open source project may relate to semiconductors whereas D’s software may be pharmaceutical related. Demonstrating such concrete distinctions, including variations in geographic scope, if applicable, could empower A to maximize royalties for patent X.
That said, the encumbrance on patent X due to the Apache 2.0 patent license impacts the royalties that can be sought for patent X.
C. Injunctions
Parties typically seek injunctions when they want a court to order a specific action or prohibit a certain behavior, such as to stop a competitor from selling products that infringe its patents. The encumbrance on patent X based on the Apache 2.0 patent license can impact Company A’s ability to obtain an injunction to stop Company D from using patent X without authorization.
Prior to eBay Inc. v. MercExchange, L.L.C., a landmark decision by the United States Supreme Court in 2006, a successful finding of patent infringement often automatically led to the issuance of a permanent injunction. The eBay decision changed this approach in favor of a four-factor test for injunctive relief which considers equitable factors such as whether monetary damages are sufficient to compensate for the injury or whether the patent holder will suffer irreparable injury if an injunction is not granted. As a result, patent holders must now demonstrate a greater likelihood of suffering irreparable harm and show that other available remedies are insufficient.
Facts showing that a patent holder licensed the patents in question weigh against it when evaluating eBay’s four-factor test for injunctive relief. The patent holder’s willingness to license the patents demonstrates that monetary damages would be a sufficient remedy, without requiring an injunction. In the open source scenario where Company A licenses patent X on a royalty-free basis to downstream users, showing that monetary damages are insufficient becomes even harder.
The licensing- and enforcement-related patent risks are summarized in Table 1 below.
TABLE 1. Summary of patent risks.
Having understood the patent risks, let's revisit the fundamental question: Should the patent claims related to the code contribution be encumbered by the open source patent license?
To make this decision, employing a cost-benefit framework is beneficial. Having considered the potential costs from the mentioned patent risks, let's explore the associated benefits.
D. Business objective
Google’s Android operating system and software stack runs on a variety of mobile devices. Most of Android is licensed under Apache 2.0, to foster broad adoption. Google's revenue primarily relies on advertising, and the more users engage with ads, the higher the revenue. Offering Android OS for free incentivized mobile device manufacturers to embrace it, enabling Google to outperform competitors like iOS and Windows Mobile OS, capturing a substantial user base. This success likely overshadowed potential revenue from an alternative strategy based on Android licensing supported by patent enforcement.
Or consider Meta’s AI product releases. PyTorch is a machine learning framework used to develop generative AI models. PyTorch is licensed under the permissive BSD 3-Clause license. Meta, as a social media company, generates revenue from user-generated content (UGC) supported by advertising on platforms like Facebook, Instagram, and WhatsApp. Meta does not sell AI infrastructure (yet). Meta's primary business strategy involves facilitating easy content generation by users, and generative AI tools developed with PyTorch and other aspects of the generative AI ecosystem such as foundational models and data sets contribute to this objective. Llama 2 is not an open source foundational model per se - it requires a separate license from Meta for any developers with greater than 700 million monthly active users. Meta argues this balances the benefits of information-sharing and the costs to its business. This does not detract from Meta’s goal of moving generative AI technology forward as a whole to grow its UGC supported advertising revenue streams in the long term.
FIG. 6. LinkedIn post by Yann LeCun dated December 25, 2023.
Google’s Android and Meta’s AI business strategies can be generalized: commoditize your complement. Originally coined by Joel Spolsky, the term refers to the idea that demand for a product increases when the prices of its complements decrease. What are complements? Think of gas and cars, computer hardware and operating systems, advertising and operating systems, or advertising and user generated content. An early case of commoditizing the complement was IBM’s commoditization of PC add-ons. IBM documented the interfaces between various components of the PC architecture, allowing other manufacturers to sell add-ons such as memory cards, graphics cards, and printers. Cheap and numerous add-ons boosted demand for PCs.
E. Differentiators
A simpler decision point is whether the code contribution is in an area that is a differentiator for the company. Take Apple. Given its extensive lineup of proprietary devices and services, it's easier to identify technologies that are not considered differentiating. These typically include technologies that are not core to its business and those that are already commoditized. Noteworthy examples among the open source projects Apple contributes to are Kubernetes, Apache Spark, and Apache Cassandra.
Practice tip.
Apple is especially secretive given its aim to surprise and delight customers. This manifests itself in employees knowing only what they need to know to do their immediate jobs. When a company’s forward-looking plans are unavailable, it is non-trivial to determine whether a technology will be a differentiator in the future. Consider the Vision Pro. The spatial computing device includes infrared cameras for precision eye tracking and sensor arrays for real-time 3D mapping and hand tracking - technologies likely absent from Apple's product lineup before its launch. Without knowledge of the upcoming launch of the Vision Pro, how do you decide whether to allow an employee’s code contribution to an open source project in these areas? As a consequence of this secrecy, employee code contributions at Apple require unanimous approval from multiple members of its senior engineering and legal leadership team (see Hacker News discussion).
F. Talent
Another crucial reason to support open source contributions is to attract and retain engineering talent. Companies with a reputation for backing open source projects tend to attract skilled individuals. Developers appreciate the opportunity to collaborate on projects, enhance their skills, and gain recognition in the community. This is particularly true in the AI community, which often values publishing research. Given the academic roots of much top AI research and the involvement of leading AI personalities like Geoffrey Hinton, Andrew Ng, Yann LeCun, and Fei-Fei Li, attracting such talent necessitates a willingness to publish findings.
In the end, the decision on whether patent claims relevant to the code contribution should be encumbered by the open source patent license depends on finding the right balance between the associated patent risks (costs) and benefits outlined above.
Practice tips.
In view of the patent risks, it is good to have thoughtful answers to these questions:
Will open sourcing a particular project achieve the desired business goals?
Who benefits from the patent license?
How does the patent license impact the ability to enforce patents against competitors or license the patents to third parties?
Can we better achieve our goals by maintaining proprietary code that is protected through trade secrets or patents?
V. Processes
The patent risk analysis described above requires close coordination between the patent team, the open source program office (OSPO), and engineering. Establish internal processes to facilitate this collaboration for managing open source contributions:
Documentation and Monitoring
Maintain comprehensive documentation detailing the company's inbound and outbound open source use and contributions.
Specify the company’s open source patent license obligations, including information on licensed patents, the projects they are licensed under, license types, and the entities licensed.
Regularly update documentation to reflect changes in the patent portfolio composition, such as acquisitions or inbound licenses.
Pre-Approved Open-Source Projects
Develop and maintain a list of pre-approved open-source projects based on company priorities and technology focus.
Periodically reassess and update the list to align with evolving company priorities.
Consider including projects in technology areas where the company lacks patents or has no plans to acquire patents, as well as projects in non-differentiating technology areas.
Employee Contribution Policy
Implement a clear policy governing employee contributions to open source projects.
Specify permissible types of projects based on the company's potential use or contribution.
For instance, if the company has no intention of using an AGPL licensed project, employees are restricted from contributing to AGPL licensed projects.
These measures will help streamline the coordination of patent-related aspects within the company's open source strategy.
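As one way to operationalize the pre-approved list and the contribution policy, a simple routing check could sit in front of the review queue. The sketch below is hypothetical throughout: the set contents, the `route_contribution` function, and the outcome labels are illustrative, and a real list would be maintained by the patent team and the OSPO.

```python
# Hypothetical policy data; a real list would come from the patent team and OSPO.
PRE_APPROVED_PROJECTS = {"godot-engine"}
RESTRICTED_LICENSES = {"AGPL-3.0"}


def route_contribution(project: str, license_id: str) -> str:
    """Route a proposed contribution according to the policy data above."""
    if license_id in RESTRICTED_LICENSES:
        return "blocked"            # e.g. the company does not use AGPL projects
    if project in PRE_APPROVED_PROJECTS:
        return "pre-approved"       # earlier review found no portfolio overlap
    return "needs patent review"    # fall back to the full framework


print(route_contribution("godot-engine", "MIT"))
print(route_contribution("some-new-project", "Apache-2.0"))
```

Checking the license restriction before the pre-approved list is a design choice in this sketch, so that a restricted license blocks a contribution even if the project name was approved earlier.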
Copyright © 2024 Shrut Kirti. Published here under a Creative Commons Attribution-NonCommercial 4.0 International License
The views expressed herein are the author’s own and do not reflect any positions or perspectives of current or former employers.
[1] E. C. Campos and M. d. A. Maia, "Common Bug-Fix Patterns: A Large-Scale Observational Study," 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Toronto, ON, Canada, 2017, pp. 404-413, doi: 10.1109/ESEM.2017.55.