GitHub Will Use Copilot Interaction Data to Train AI Models Unless Users Opt Out
From April 24, all Free, Pro, and Pro Plus users will have their code snippets and context used for model training by default
The company framed the change as aligned with established industry practices, saying real-world interaction data will help models better understand development workflows, deliver more accurate code suggestions, and catch potential bugs before they reach production.
GitHub noted that users who previously opted out of data collection will have their preference preserved. The opt-out setting is available in Copilot privacy settings.
The announcement cited improvements after incorporating interaction data from Microsoft employees, including increased code acceptance rates across multiple programming languages. GitHub argues that broadening the training data will yield similar benefits for all users.
Analysis
Why This Matters
Many developers chose Copilot specifically because their code was not used for training. Because data collection is now on by default, millions of users will contribute training data unless they actively change their settings.
Background
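GitHub initially trained Copilot on publicly available code, drawing criticism and legal challenges. Expanding to interaction data goes a step further, extending training from public code to the snippets and context that users generate while working with Copilot.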
Key Perspectives
Open source advocates and privacy-conscious developers will likely push back. Enterprise users are shielded from the change, which may accelerate the shift toward enterprise plans.
What to Watch
Whether the developer community organises a backlash, and how many users opt out. The April 24 deadline gives developers a month to adjust their settings.