Also because those specific uses are mentioned in existing law and/or have been otherwise successfully defended. It gives their lawyers as many explicit tools as possible, before they need to argue around the implicit ones enabled by their policies & agreements being deliberately more vague elsewhere.
The point is that if they don't say that they won't, then they pretty much can if they choose to.
1. They will sometimes use the data for training their RLHF stuff, to "prevent harmful use" of the services.
2. The clause is exhaustive, so they won't use the data for training (otherwise that would be mentioned) and will just log things for the usual monitoring purposes.
This is a storm in a teacup. I don't even know why I should care. If MS crawl some web pages I've written and AI gets slightly smarter by reading them, or if I have a chat with the AI and some engineers use it to make the AI work better, great. It's very hard to imagine concrete, real harm from them being able to do this, though I can understand why companies might worry about it spitting out their source code verbatim in some cases.
Mozilla's point is that the whole document is sufficiently vague that they could use it to defend pretty much whatever use of your content they conceive of now or in the near future.
I need to inform my customers about what I do with their personal data. That includes which companies I share that data with.
Having an Excel file with customer data means providing that data to Microsoft. So, as the party responsible for the data, I need to know how they will use it. Any use case that isn't obvious has to be clearly stated in the data privacy agreement, including moving data outside the EU into other countries like the US (where the US government can request that data without even informing us) or using the data to train AI.
Come on. If we need to disclose that we used ChatGPT (just in case users provide personal information), why wouldn't we need to disclose the same about Microsoft?
User content may or may not include personal data, so in some sense it's better to cover the totality of use cases in a document that isn't specifically about data protection.
>Consent must be a specific, freely given, plainly worded, and unambiguous affirmation given by the data subject;
If the agreement doesn't meet that standard, then in my opinion a court should step in and declare it void, so that Microsoft isn't allowed to use any private data until they get their act together.
If it's so vague that it becomes meaningless, that should default to granting no rights. Otherwise, why not publish your all-rights-granting privacy policy in Klingon in a locked drawer in a basement toilet? ;)
The AI services section seems pretty clear in terms of limiting the use cases of user content:
"iv. Use of Your Content. As part of providing the AI services, Microsoft will process and store your inputs to the service as well as output from the service, for purposes of monitoring for and preventing abusive or harmful uses or outputs of the service."
Admittedly, I haven't read other parts to understand the full picture though.
IP addresses are slightly different because that address can be used to identify the subscriber in certain cases (who in turn may or may not be an individual).
BTW, if you use client information to derive an identifier that is unique within a session and you send that identifier to a third party (e.g. Google), this approach gives you zero benefits. In fact, ePrivacy and the GDPR don't mention cookies anywhere and don't care what technology you use to derive identifiers: if they can robustly identify an individual or device and you actually send them to another service (for purposes that are not strictly necessary for the performance of your service), you're obliged to ask for consent.
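To make that concrete, here is a minimal sketch (my own illustration, not anything from the agreement or a specific vendor's API) of what "deriving an identifier from client information" usually looks like; the IP/User-Agent hashing and the daily rotation are assumptions chosen for the example:

    import hashlib
    from datetime import date

    def derive_session_id(ip: str, user_agent: str) -> str:
        """Hypothetical 'cookieless' identifier derived from client attributes.

        No cookie is stored, yet the value still singles out a device or
        subscriber, so once it is sent to a third party it is treated like
        any other online identifier under ePrivacy/GDPR.
        """
        # Rotating the hash daily only limits the identifier's lifetime;
        # it does not make it anonymous within that window.
        raw = f"{ip}|{user_agent}|{date.today().isoformat()}"
        return hashlib.sha256(raw.encode()).hexdigest()

    # The same client maps to the same ID all day, which is exactly what
    # makes it an identifier requiring consent if it's shared with an
    # analytics provider.
    print(derive_session_id("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)"))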