There's a whole genre of articles based on conflating two senses of "your," belonging to you, and about you. If I observe that you have blue eyes, that's a fact about you, but it doesn't belong to you, at least not without redefining what "belong" means.
It would be helpful if, instead of deliberately conflating these two senses of "your," articles on this subject went in the other direction and explicitly discussed the distinction.
We have from time to time redefined what "belong" means, but these redefinitions tend to be very complicated and have all sorts of unforeseen consequences. My guess is that a redefinition on the scale of facts about you = your property would be a disaster.
The funny thing is, the way you described "belong" made me realize that the collection of data seems eerily similar to this definition:
>Obtaining information with reference to the identity, habits, conduct, movements, whereabouts, affiliations, associations, transactions, reputation, or character of any society, person of groups of persons.
Guess what that is? It’s, in part, the statutory (legal) definition of what a private investigator does, an activity that is prohibited by law without a license.
So sure, under the law we don’t necessarily own that data about ourselves, but the collection of the data is highly regulated for a reason. Obviously tech companies obtain user consent to collect data (at least some of the time), but if PIs can be regulated in their activities, so can a tech company or digital product, and it’s not too difficult to imagine regulations that say a tech company can’t collect user data and turn around and sell it to third parties.
I am curious though, how exactly would such a legal framework be a disaster in your opinion?
And here you're conflating "public facts about you" with "facts that only your cell phone / mattress / underpants / web browser know about you."
"Facts about you" falls into at least two categories: "Facts which you want and reasonably expect to remain private", and "facts which you display to the world and have no expectation of privacy about."
Which is fine as far as semantics go, but in the context of this thread the first category doesn't exist unless you're under 13 and in the US, or subject to the GDPR, or a few other nooks and crannies of international laws.
There's a simpler, shallower issue too: 'you' in headlines is a linkbait trope, so headline writers drop it in wherever they can. This one managed it twice. Presumably we're all wired to respond to "hey you!"
Since clickbait is against the site guidelines, we replace such titles with neutral ones. Usually we do that by copying a representative phrase from the article itself. But that was surprisingly hard to do this time.
I think it is pretty easy. The moment you start storing this data outside your brain, it's "my" data. You can memorize or observe it all you want but not store it permanently or semi-permanently elsewhere.
I think it is a sad state of affairs that it had to come to this point resulting from "don't ask for permission upfront, ask for forgiveness later" (most skip the latter part anyway, if not confronted with a shitstorm) mentality.
> I think it is a sad state of affairs that it had to come to this point resulting from "don't ask for permission upfront, ask for forgiveness later"
Huh. I actually believe most western societies are incredibly tilted in the "ask permission" direction. You have to petition to start a business, cut someone's hair, buy a gun, get married, copy a book... I could probably find a hundred things you're supposed to ask for permission before doing.
Personally, I prefer freedom: do whatever you want, as long as it doesn't violate someone's rights (and no, Imaginary Property doesn't count).
This is absolute nonsense and dangerous to boot. Facebook's ARPU is still less than $20/year. If they literally paid all of their profits directly to their user it would still be a pittance. The real value of privacy has to do with freedom, manipulation and surveillance, and that is not represented by economic measures at all. Talking about the value of the data in purely economic terms is exactly the way you would want to frame the issue if you had more nefarious plans down the line.
If a company has to pay users for their data, that puts financial disincentives on collecting it in the first place, which might be interesting, irrespective of the users' finances.
Yeah, the incentive is to charge the users for the service in order to then use that money to pay for their data; or to be very selective about which users are allowed to use the platform, so that it's "worth" paying for the data of each user.
I'm working on a blockchain solution that gives consumers the ability to control and monetize their data:
www.databook.one
@dasil003 I agree that Facebook's ARPU is super low. However, that statistic relates to ad revenue, not revenue from the sale of data. Facebook doesn't publicly sell its consumer data. Thus, we're talking apples and oranges.
To buy raw data, you need to go to data brokers, who are currently earning $250 billion per year from selling consumer data. Raw data is super valuable because you can build predictive models with it. With very small samples of customer data, powerful models can be built that can deliver millions or even hundreds of millions of dollars to a business' bottom line.
I can testify to this because I led the technology team for a company that earned millions by leveraging machine learning to predict who would sell their house below market prices in the next few months.
The problem with buying data from data brokers is that the quality is awful. For example, the accuracy rate of America's largest data broker, Acxiom, is only 50%. This problem is holding back the advancement of machine learning. If consumers linked and verified their own data, we could improve that accuracy rate significantly and deliver much more accurate models to businesses.
Thus, we expect that compensation paid by companies to individuals sharing their complete data profiles could be significant.
I've seen estimates that the value of data will reach $7,500K per American per year by 2022.
Even if we reached 10% of this estimate, I'd want that money in my pocket instead of some data broker.
If you put the control in the hands of the people whose data is being handled, I'm sure a significant percentage will default to "Do not share" given an explicit choice.
People aren't going to directly assist and curate the ability for 3rd parties to spew more spam & spyware on their webpages and in their email inboxes. If they could see what sort of data is being tracked, that could easily trigger their creepiness receptors and make them wish to distance themselves from it.
The only reason people endure the current situation is because they don't feel they can do anything about it, and really don't know the extent of what's going on. Empowering them will finally allow them to actively break out of that cycle.
However, this also means that they can charge users $3/mo (just to be safe) and be at least as successful as they are now, but without all the spying and manipulation.
Unfortunately that will also put a cap on their profits, so it's unlikely they'll do that.
That's assuming their userbase is a lot less elastic than it probably is. They'd have to retain at least half of their current users in order for that to even come close to replacing current revenue. The reason these pay-with-your-privacy businesses are so successful is because charging actual money is a huge barrier that will drive away many customers.
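The "retain at least half" claim can be checked with a quick back-of-the-envelope sketch. It takes the figures quoted in this thread at face value (roughly $20/year ARPU and a $3/month fee) and ignores payment processing, churn over time, and cost changes; the function name is just illustrative.

```python
def break_even_retention(arpu_per_year: float, fee_per_month: float) -> float:
    """Fraction of current users who must keep paying so that
    subscription revenue matches today's per-user revenue."""
    return arpu_per_year / (fee_per_month * 12)


# Figures taken from the comments above, not from any official report.
retention = break_even_retention(arpu_per_year=20.0, fee_per_month=3.0)
print(f"{retention:.1%}")  # prints 55.6%
```

So under these assumptions a bit more than half of the users would need to convert to paying customers, which is roughly the point being made above.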
> The real value of privacy has to do with freedom, manipulation and surveillance, and that is not represented by economic measures at all
Privacy is not the only issue.
Using user data for ad targeting is the past/present. The future is using users to train AI.
Facial recognition in photos is a good example of this. Facebook has the most advanced facial recognition in the world because they got millions (billions?) of hours of free labor from their users classifying photos.
Amazon Echo is doing the same thing with speech recognition.
Cloud based AI platforms capture the data from engineers training AI algorithms and the goal is to use that data to train an AI to replace the engineers.
Not only are people being duped into giving away their labor for free but they are increasingly being used to train the AIs that are going to replace their jobs.
The media portrayal of the tech world would be more impactful if it weren't so hyperbolic, dramatized, and skewed. There are myriad valid complaints about the tech world, but Facebook paying you for you posting your photos is not one of them.
Facebook is free. Google search is free. 15 GB of Gmail/Google Drive is free. You get those free, truly amazing services for nothing because the companies found ways to generate value from the data being gathered. The cost to maintain that infrastructure and provide those great services is pretty high and has taken some brilliant engineers years of sweat. (I don't work for either, so I'm not tooting my own horn, just giving credit where I believe credit is due.)
I think the deal as it currently stands (collect data on me and I get free, 99.99% uptime, cloud-available services for email, cloud drive, and social networking) is pretty sweet.
For some services I agree with your sentiment. But we don't have metrics so there's no way to know if this is a fair deal.
Remember, they're showing us ads in addition to collecting volumes of information on us, so we're paying in two ways.
And frankly I don't mind the ads, in principle, but I do mind that a few companies are able to use them to track us nearly everywhere online without allowing me to review what they're collecting and who is collecting it.
Here in the US, we cried tyranny when our security agencies were collecting metadata to monitor threats. And then we gave everything we had to Facebook and Google and didn't bat an eye as they sold us to anyone who pays.
I think the problem is at a much higher level: why do companies have this motivation?
It's because we've structured our society (in Western democracies) so that people grow rich if they can trick you into buying stuff. The better they are at tricking you, the more they earn. The higher the price compared to the cost, the more they earn. The lower the longevity of the goods, the more chances to sell to you again.
Many of the incentives are towards fooling people rather than creating useful goods that enrich society with minimal environmental impact and maximum longevity.
It's unsustainable and leads to powerful (i.e., rich) people who are incentivised to parasitically feed on poorer people.
Collecting data on us, and showing us ads are not TWO different things.
It's the same thing... If they stop collecting data, they don't still get to keep the same ad revenue.
The ads are only worth the amount they are because of the hyper-specific information you can target ads with. If you get rid of that, then the ad model falls apart. They might still be worth...something, but if it was just about anonymous eyeballs, probably every ad would just be for Coke and Coca-Cola would pay extremely little for it.
>For some services I agree with your sentiment. But we don't have metrics so there's no way to know if this is a fair deal.
"fair" is practically a weasel word. It doesn't have a universal standard. It doesn't mean anything until you explain what you consider to be fair. I once read a book on negotiations that called it "the F-word" - it was warning you to be wary of people using the word against you.
Think of this as a transaction between two parties doing business together. When business A is making a deal with business B, then A is not entitled to know how much money B will make out of it. It's perfectly fine to ask, but B not answering is very much within the rules (and is the norm). It is your job to do your own research to figure out how valuable your information is to them.
I also often see an attitude of "They make $20/year from the information I give them. So I should have the option to pay them $20/year for not tracking me." No business deal works that way. Each party sets their own requirements. If you demand that they offer paid services to stop tracking you, they get to say "No - you're welcome to use someone else's service if they give you that kind of a deal"
Finally, the way negotiations work: The focus will not be on what they make from your money, but how much you are willing to pay for the services you receive. The majority of people I know will not pay $10/mo for their email service. So the fact that Google may make $30/year from your use of Gmail is irrelevant. This is your BATNA.
>Remember, they're showing us ads in addition to collecting volumes of information on us, so we're paying in two ways.
And in how many ways are they paying you in services? Far more, I imagine.
>Here in the US, we cried tyranny when our security agencies were collecting metadata to monitor threats. And then we gave everything we had to Facebook and Google and didn't bat an eye as they sold us to anyone who pays.
We probably hang out in different crowds, but while government monitoring did make the news, the percentage of people I personally know who cared about it was less than those I know who say "I'm using a service other than the one provided by Google because they're already tracking me in so many ways".
BTW, although I use some Google services, by and large I do not. I do not actively use Gmail. It was very difficult for me to get used to the idea of letting them get to me via Android. I don't use my account on Youtube. I don't have a Facebook account. I don't use Twitter. I pay for my own email and web services. I tend to have privacy extensions on my browser. I don't use Whatsapp. My life is not at all miserable.
You have no way to say if it's a fair deal, because we don't have proper metrics to check. Plus this data collection is instrumental to the increasing centralization of power and wealth that has the potential of getting to a dangerous extreme soon. I think we should act, and I'm proud that the EU is doing just that. Lanier's proposal might be a bit too much, but at least revising the tax scheme for companies that profit from everyone's data is a necessary step, in my opinion, in order to properly redistribute value to society.
And how does the idea that Facebook still collects massive amounts of data on me despite the fact that I don't use their service reconcile with all that? What do I get in exchange for all that data FB has?
They aren't free. You pay for them in high prices for goods and services. Those companies pay for advertising, which increases prices, plus their tech-size profits. And because the stock market machine demands greater and greater profits, they need larger, stronger monopolies to generate those obscene profits.
> They aren't free. You pay for them in high prices for goods and services.
Your reasoning is extremely fallacious because you're overlooking a lot of nuance.
Yes, it is true that in aggregate part of the advertising costs gets passed on to consumers. How much exactly depends on macroeconomic factors, industry dynamics, pricing power of distributors and retailers, and hundreds of other variables. But you are right that in aggregate a non-zero amount of the advertising costs gets passed on to consumers.
However, advertising is valuable to advertisers because it works to some degree. And it works because people are influenced by it. Now, as an individual I have no control over what other people are doing, and so I have no control over how valuable advertising is, how much is being spent on it, and how much is being passed on to consumers. Therefore as an individual I look around me and prices are what they are; they already include advertising costs, which will not go up because I use the service. Therefore as an individual the marginal cost of my using these services is effectively zero (imagine the ratio between the value of advertising to me and the aggregate value of advertising; that's effectively zero). As an individual, those services are free.
So it depends on what your definition of "free" is. A) Would prices be lower if advertising didn't exist? Yes [0]. B) Would prices be lower to me if I didn't use these services? No. The thing is that in definition A) I don't really have a choice, so it's irrelevant. I only have a choice as an individual.
[0] And even here some people will disagree with you. What I call "free market fundamentalists" will argue that No, they wouldn't, because the advertising leads to higher sales and greater operational efficiency, so that advertising costs are exactly offset by lower per-unit costs.
I think the parent commenter has a pretty straightforward point: the services are "free" in the common sense of the word, meaning you need not directly pay for them. You can reason about abstract value transfer relationships involving the concept of "free", but the core idea is that people are getting something in return for their money, and most people are observably okay with it.
> You pay for them in high prices for goods and services. Those companies pay for advertising, which increases prices, plus their tech-size profits.
I don't think this is falsifiable. It sounds sort of feasible in the abstract, but how can we empirically measure that effect? As it stands, I could just as easily respond that, no, these companies make prices more efficient, because their advertising methodologies are more effective, so companies spend less on their platforms than they would for equivalent ad campaigns elsewhere. Then we'd have to figure out how to investigate whether modern companies advertise more due to growth of the market, or because they have to catch up with everyone else using the tech advertising platforms. Narrowing down the answer, if there is one, would be nontrivial. Then I could also say that advertising makes tech services cheaper, because Facebook and Google earn more from user data than they would if they forced everyone to pay $20/month.
Which of us is right? We can't really say, it's effectively an opinion because, even if technically falsifiable, these claims are far from being empirically demonstrable.
Not sure I follow. You assert that a product will be designed to cost FOO, but then they decide to advertise and they increase the price to FOO + x. That's not how it works; the advertising costs are built into the original FOO, whether they are presentations, conferences, regular ads, you name it.
Companies use advertising to increase their sales, either from 0 to something or from something to something more. Along with R&D, production costs, running costs and profit, the advertising cost is part of the planning from the start.
It's not like advertising didn't exist before the internet, so blaming the tech giants solely for its impact on society seems odd. They didn't create advertising, they simply took away advertising money from other mediums.
...would it? I.e., would we still be reading it? We keep crying out for better media, but private, public, for profit or not... the popular/successful stuff is hyperbolic, dramatized, and skewed. If we want it even half as badly as we say we do... where's all the calm & collected media? Genuine question: where is it & why isn't it here?
^Sorry, way off topic
To the topic... this line of argument feels (I don't have much more to go on than feels) unsatisfying. Sweet or not, it's hard to see this as a deal in any way but the most abstract. First, users' end of the deal is secretive, complicated and has hard-to-understand implications. Second, most of the benefits (to users and The Googles) and most of the dangers (to users and maybe society) are a product of aggregation, which complicates the concept of a deal even more.
Also, these things tend to be monopolistic and data is more moat.
The problem is that the ads model works for a while, then they realize it doesn't anymore and they change the free tier. This seems to have happened recently with Gmail. It's impossible to know what the eventual monetization model will be when signing up. I think this is the justified cause of the mistrust. Also there are the observations 'free as in beer' and 'if you're not the customer you're the product', both of which have merit.
This is a conversation I have occasionally had with non tech friends. I'm sorry, you are the one being hyperbolic. It's like you haven't even noticed what's been happening the last 10 years or so.
> You get those free, truly amazing services
lol. Services that exist without an honest portrayal about what they do with the data they gather, or anything like clarity on the extent of what is possible.
Not content with that, they bend the service to better manipulate us, not to provide a better service. Facebook run fun psychological experiments to see if they can manipulate their users - without calling for volunteers first or anything I recognise as ethics. How many times have FB been pulled up for being, cough, "less than honest"?
Google, especially, have built a reputation for simply gathering maximum possible data, then shuttering the product 18 months later. There is no reason a company Google's size need close Reader or Picasa or any number of others. Buy their in-home device or use G Drive? Hell no, I don't trust them to provide any availability once the data grab is complete. Sure, if the company is entering leaner times you can understand canning some of their services, but right now it's no more than a rounding error. So no, I'm in no rush to use any new Google offering - they don't exist long enough.
I think the deal as it currently stands (collect data on me, lie to me, track me around the web as far as technically possible, track me off the service as far as technically possible, track me offline as far as possible, sell retargeting and other dubious tactics to ensure my life is 101% a buying consumer experience with no time off; in return I get a service that doesn't even meet the core need any more, thanks to engagement algorithms bending it to meet FB's and Google's needs, not users') is pretty terrifying.
Terrifying because we still think it's an honest exchange akin to swapping £1 for a bar of chocolate in a shop, just swap data for service. We still feel it's a fair exchange. Yeah, without honesty of even 10% of what is being done.
I'd actually like the sort of exchange you imply. Give me a great service in exchange for some data. Actually, yes, that seems fair so long as you use it reasonably. Hoovering up everything, with the Hoover turned up to 11, whilst implying "it's just a bit of data" is a confidence trick on the non-technical.
> This is a conversation I have occasionally had with non tech friends. I'm sorry, you are the one being hyperbolic. It's like you haven't even noticed what's been happening the last 10 years or so.
In my opinion, you're mistaking apathy for ignorance. Reasonable people can disagree about whether or not the exchange is fair. I work in tech, have heard everything you just wrote many times, and my response is continually, "Eh, I'm okay with that." People I've spoken to about this outside of tech often don't know about the data collection to quite the extent you're talking about and don't care about it. When I mention how extensive the data collection is, they're often surprised, but often they still don't care. The most I've personally seen is a vague discomfort that didn't seem to persist, given the person's behavior didn't change at all.
It gets kind of frustrating for people like myself to hear this argument rehashed over and over in the way you've just presented it, as though people who are okay with their data being used simply "don't understand" or haven't been paying attention. Everything you wrote is - while clearly not neutrally presented - in the zeitgeist on HN. There's nothing really new about it, and you're only convincing the people who already agree with you. Yes, we know our data is being used, we get it. We're picking up what you're putting down. Right there with you. We just don't really care.
You can draw a normative conclusion about me based on that statement if you'd like. It probably sounds callous and asinine from your perspective, but like I said: reasonable people can disagree. Anecdotally speaking, most people I "enlighten" about the data hoovering you describe don't meaningfully change their behavior or preferences. Consider that when people are not aware of how their data is being used, it doesn't always mean they're being victimized; it can also mean they are vaguely aware that their data is "out there", and they implicitly don't care enough about it to investigate further. That's a valid position to take.
>Facebook run fun psychological experiments to see if they can manipulate their users - without calling for volunteers first or anything I recognise as ethics.
I've always wondered: How is that different from A/B testing, or any other marketing experiment?
First, do you or anyone you know directly or indirectly profit from these types of services? Because I find your emphatic endorsement hard to understand.
To be fair, I do use some of these services, too (carefully). But, that’s only because often I can’t find quality alternatives. For instance, I would pay for a high-quality service like Mint (I don’t use it) that kept my data private, but there is none.
Since gathering / ad-selling services are so lucrative, they essentially subsidize the development of the highest-quality products in the market. The best engineers work for these companies. They have massive teams of people refining their products. And, these companies use IP laws to prevent clones. Essentially, they lock out for-fee competitors.
Because of this, I’m a strong supporter of consumer protection laws and regulations. Capital often flows toward deceptive and unethical practices. And, all companies are eventually forced to follow the leader, otherwise they disappear. Laws are the only ways to prevent this.
You start by saying the parent comment is hard to understand, then go on to explain why the statements are true. These two statements are essentially the same:
>Since gathering / ad-selling services are so lucrative, they essentially subsidize the development of the highest-quality products in the market. The best engineers work for these companies. They have massive teams of people refining their products. And, these companies use IP laws to prevent clones. Essentially, they lock out for-fee competitors.
> You get those free, truly amazing services for nothing because the companies found ways to generate value from the data being gathered. The cost to maintain that infrastructure and provide those great services is pretty high and has taken some brilliant engineers years of sweat.
The only real difference is your use of the phrase "lock-out", but I don't think that's a reasonable way to look at it. If enough people really want to be part of a fee-based ecosystem, no one is stopping a company from engaging those people. You bring up IP laws, but that's really an independent issue, since fee-based and ad-based businesses can both use IP laws in the same way.
Try Tiller. It puts all of your data into a Google Sheet with nice premade templates. The UX is not quite as nice as Mint's, but it's pretty close and allows for better customization.
> For instance, I would pay for a high-quality service like Mint (I don’t use it) that kept my data private, but there is none.
Have you looked at ynab.com? I find their overall philosophy on budgeting to be more effective, the app is high quality, and it's solely about your budget, not about offering you credit cards.
It's one of the few products I recommend to anyone, and it's relatively expensive ($85 for the year I believe).
There is already Mechanical Turk for the cases the author described. There are many paid initiatives to label data that these companies employ. The samples these "AI" companies take from any one person are small compared to the services they provide in return. Once divided up, it's pointless to think about them "paying" you, and it is counter to the point. They are already paying in the form of the service they provide, i.e., free storage, services, etc.
Democratizing the data so that more companies, beyond this major concentration of platform players, can use it to build products is much more practical. This is similar to the personal data request legislation that Europe already has in place. This would be an extension to enable competing services to access this data as their customers opt into the service. The goal is enabling more companies to build more competitive products (and employ people working towards market-competitive products), not setting up "sustainable" sheep that feed from the same trough of these companies' concentrated power. Certainly universal basic income, but that should be wholly independent of being subordinate to these service providers.
Perhaps if I collected it and stored it, perhaps put it behind an accessible API and supplied Google et al with an API key to access it I might have a case for charging them for it.
PII, Personally Identifiable Information, is the phrase I know for this.
I'd like to see a requirement for all companies receiving PII to issue an account of how they use it and who it's sold to. Those receiving your PII would also have an obligation to notify you and offer access to the same audit data (how they're using it, who they got it from, who they gave/sold it to).
Legit companies would then leak information, by design, showing companies using/selling your data without notifying you.
Real fines like those anticipated in the GDPR would be needed to encourage companies to adhere to the legislation.
Okay, but like operationally, what does that even mean? How do we establish a legal definition of what "your data" even is?
When we talk about "big data" and data collection, the vast majority of what we're talking about basically just boils down to web logs, and derived data that's inferred from those logs. So is an entry in a web log that corresponds to something you did now your property? Is anything someone writes down about me now my property? If I walk in a store and the store owner writes down that I was there, is that my property?
But the value of my data alone probably is worth less than a dollar to Google and Facebook. Given the choice of getting paid a dollar, or using it for free (in exchange for them using my data), I would choose the latter.
> But the value of my data alone probably is worth less than a dollar
Calculating the value of your data is pretty easy actually. Just take the market cap of the company divided by users to get an average.
Google is at about $700B with maybe ~2B global users, so the global average is more like $350/user. The value of US users is probably 10X that of users in less developed parts of the world (based on ad rates), so US users are worth more like $3000.
You are undervaluing yourself :) which is how they win.
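For anyone who wants to play with the numbers, the estimate above can be sketched as follows. The $700B and ~2B figures are the rough ones from this comment; the share of "high-value" users is not given anywhere in the thread, so `high_share=0.1` below is a purely hypothetical input, and the function names are made up for illustration.

```python
def average_value_per_user(market_cap: float, users: float) -> float:
    """Market cap spread evenly over all users."""
    return market_cap / users


def split_by_tier(avg: float, high_share: float, multiplier: float):
    """Split an overall average into two tiers, where high-tier users are
    worth `multiplier` times a base-tier user.
    Solves: avg = high_share * multiplier * base + (1 - high_share) * base
    """
    base = avg / (high_share * multiplier + (1.0 - high_share))
    return base, base * multiplier


avg = average_value_per_user(market_cap=700e9, users=2e9)
print(avg)  # 350.0

# high_share is a hypothetical assumption, not a figure from the thread.
base, high = split_by_tier(avg, high_share=0.1, multiplier=10.0)
print(round(base, 2), round(high, 2))
```

The per-tier result is quite sensitive to the assumed share of high-value users, so the $3000 figure should be read as an order-of-magnitude guess at best.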
>Calculating the value of your data is pretty easy actually.
No it's not. You're mixing up the value of the data with the value of your attention. The world has made a killing off of advertising long before personalization came about so saying it's all due to personalization is a bit silly imho.
This wouldn't be such an issue if they couldn't see the data on which they are computing (and individual data wouldn't be exposed in data breaches either).
But I'm not sure if Google even cares too much about doing that (I know they've experimented with this, but nothing on a scale that matters), and Facebook certainly doesn't. Apple seems to be the only one that does somewhat with its differential privacy approach.
I think they can all do much more, but they're not trying too hard.
I'm not sure you understand what differential privacy is. Apple still has your data. If there's a breach, an attacker gets your data. It's probably encrypted, just as it is with pretty much any modern system where user data is collected, but differential privacy doesn't have anything to do with what data is collected or stored. Differential privacy is just a method for limiting how much information can be reliably inferred about an entire dataset from limited queries on statistical properties of the data.
It's basically this. Imagine I have two data sets that are identical, except one has your specific data and the other doesn't. It's possible in some circumstances to infer information about you by asking for things like averages on the datasets. Differential privacy is a method for defeating attacks that rely on that type of "leakage" of information. It assures that if two datasets are close enough, queries on those datasets will not yield significant information about their differences (i.e., the presence or absence of one particular data point won't be detectable).
It's all about how queries of the data perform, and not at all about the data itself.
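To make the "how queries perform" point concrete, here is a minimal sketch of one textbook way to do it, the Laplace mechanism applied to a mean. This is not a description of what Apple or anyone else actually runs; the function names, bounds, and epsilon are all illustrative assumptions. It also treats the two neighbouring datasets as differing in one record's value with the count fixed, which is a simplification of the with/without framing above.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Return a noisy mean of `values`, each clamped to [lower, upper]."""
    clamped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clamped) / len(clamped)
    # Swapping one record for another moves the mean by at most this much.
    sensitivity = (upper - lower) / len(clamped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
ages = [23, 35, 41, 52, 29, 64, 38, 47]
ages_other_you = [23, 35, 41, 52, 29, 64, 38, 90]  # one record differs

# The two answers are noisy enough that the differing record is hard to spot.
print(private_mean(ages, 0, 100, epsilon=0.5, rng=rng))
print(private_mean(ages_other_you, 0, 100, epsilon=0.5, rng=rng))
```

Note that the raw `ages` list is still sitting there in the clear; the protection applies only to what the query returns, which is the point being made above.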
I'll never understand why I can't pay Google and Facebook $20 to $50 per year to simply respect my privacy and not give me advertisements. I can't imagine they make more than this off my viewing behaviour (especially as I use adblock)
Because it's a boring, safe, traditional business model.
Instead they've bet on harvesting and exploiting information about the behaviour of the many, to ultimately allow them to exert some 'control/influence' over individuals.
And they're winning the business lottery with it.
More shame on us for only considering ourselves when deciding whether or not to use a service like Google. I can choose not to use Google/Facebook, but if everyone else uses them, they'll still have incredible power over me through the insights they gleaned about everyone else.
Since clickbait is against the site guidelines, we replace such titles with neutral ones. Usually we do that by copying a representative phrase from the article itself. But that was surprisingly hard to do this time.
I think it is a sad state of affairs that it had to come to this point resulting from "don't ask for permission upfront, ask for forgiveness later" (most skip the latter part anyway, if not confronted with a shitstorm) mentality.
Huh. I actually believe most western societies are incredibly tilted in the "ask permission" direction. You have to petition to start a business, cut someone's hair, buy a gun, get married, copy a book... I could probably find a hundred things you're supposed to ask for permission before doing.
Personally, I prefer freedom: do whatever you want, as long as it doesn't violate someone's rights (and no, Imaginary Property doesn't count).
You're trying to distinguish between privacy and identity.
If there's a dividing line, it's very, very hard to discern.
If you collect data about me in the EU, I have legal control over that data. I can order you to give it to me. I can order you to delete it.
Sounds like ownership to me.
It was $27.76 in Q4, and $21.20 in Q3 (for the US + Canada).
It's not implausible that it'll eventually hit $200 per year in the US + Canada.
For fun, here's their quarterly ARPU over the last few years in the US+CA, starting from 1Q15, excluding all Q4 results (Q4 skews because it spikes):
$8.32, $9.30, $10.49, $12.43, $14.34, $15.65, $17.07, $19.38, $21.20
Quite the increase machine. US+CA ARPU for 4Q17 ($27.76) was triple that of 4Q14 ($9).
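If you want to check those numbers yourself, a quick sketch is below. The values are copied from the comments above; the variable names are mine, and "$9" for 4Q14 is taken at face value as 9.00 since no cents were quoted.

```python
# US+CA ARPU by quarter, 1Q15 onward with Q4s excluded (from the comment).
arpu_ex_q4 = [8.32, 9.30, 10.49, 12.43, 14.34, 15.65, 17.07, 19.38, 21.20]
q4_2014, q4_2017 = 9.00, 27.76  # "$9" is quoted without cents above

ratio = q4_2017 / q4_2014
revenue_2017 = sum(arpu_ex_q4[-3:]) + q4_2017  # 1Q17..3Q17 plus 4Q17

print(f"4Q17 / 4Q14: {ratio:.2f}x")
print(f"2017 ARPU, all four quarters: ${revenue_2017:.2f}")
```

So "triple" holds up, and the full-year 2017 figure is still well short of the $200 per year floated above.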
www.databook.one
@dasil003 I agree that Facebook's ARPU is super low. However, that statistic is related to ad revenue, not revenue from the sale of data. Facebook doesn't publicly sell its consumer data. Thus, we're talking apples and oranges.
To buy raw data, you need to go to data brokers, who are currently earning $250 billion per year from selling consumer data. Raw data is super valuable because you can build predictive models with it. With very small samples of customer data, powerful models can be built that can deliver millions or even hundreds of millions of dollars to a business' bottom line.
I can testify to this because I led the technology team for a company that earned millions by leveraging machine learning to predict who would sell their house below market prices in the next few months.
The problem with buying data from data brokers is that the quality is awful. For example, the accuracy rate of America's largest data broker, Acxiom, is only 50%. This problem is holding back the advancement of machine learning. If consumers linked and verified their own data, we could improve that accuracy rate significantly and deliver much more accurate models to businesses.
Thus, we expect that compensation paid by companies to individuals sharing their complete data profiles could be significant.
I've seen estimates that the value of data will reach $7,500K per American per year by 2022.
Even if we reached 10% of this estimate, I'd want that money in my pocket instead of some data broker.
What are your thoughts?
People aren't going to directly assist and curate the ability for 3rd parties to spew more spam & spyware on their webpages and in their email inboxes. If they could see what sort of data is being tracked, that can easily trigger their creepiness receptors and wish to distance themselves from it.
The only reason people endure the current situation is because they don't feel they can do anything about it, and really don't know the extent of what's going on. Empowering them will finally allow them to actively break out of that cycle.
Unfortunately that will also put a cap on their profits, so it's unlikely they'll do that.
Privacy is not the only issue.
Using user data for ad targeting is the past/present. The future is using users to train AI.
Facial recognition in photos is a good example of this. Facebook has the most advanced facial recognition in the world because they got millions (billions?) of hours of free labor from their users classifying photos.
Amazon Echo is doing the same thing with speech recognition.
Cloud based AI platforms capture the data from engineers training AI algorithms and the goal is to use that data to train an AI to replace the engineers.
Not only are people being duped into giving away their labor for free but they are increasingly being used to train the AIs that are going to replace their jobs.
Facebook is free. Google search is free. 15GB of Gmail/Google Drive is free. You get those free, truly amazing services for nothing because the companies found ways to generate value from the data being gathered. The cost to maintain that infrastructure and provide those great services is pretty high and has taken some brilliant engineers years of sweat. (I don't work for either so I'm not tooting my own horn, just giving credit where I believe credit is due)
I think as it currently stands (collect data on me and I get free 99.99% uptime, cloud available services for email, cloud drive, and social networking) is a pretty sweet deal.
Remember, they're showing us ads in addition to collecting volumes of information on us, so we're paying in two ways.
And frankly I don't mind the ads, by principle, but I do mind that a few companies are able to use them to track us nearly everywhere online without allowing me to review what they're collecting and who is collecting it.
Here in the US, we cried tyranny when our security agencies were collecting metadata to monitor threats. And then we gave everything we had to Facebook and Google and didn't bat an eye as they sold us to anyone who pays.
It's because we've structured our society (in Western democracies) so that people grow rich if they can trick you into buying stuff. The better they are at tricking you, the more they earn. The higher the price compared to the cost, the more they earn. The lower the longevity of the goods, the more chances to sell to you again.
Many of the incentives are towards fooling people rather than creating useful goods that enrich society with minimal environmental impact and maximum longevity.
It's unsustainable and leads to powerful (ie rich) people who are incentivised to parasitically feed on poorer people.
It's the same thing... If they stop collecting data, they don't still get to keep the same ad revenue.
The ads are only worth the amount they are because of the hyper-specific information you can target ads with. If you get rid of that, then the ad model falls apart. They might still be worth...something, but if it was just about anonymous eyeballs, probably every ad would just be for Coke and Coca-Cola would pay extremely little for it.
"fair" is practically a weasel word. It doesn't have a universal standard. It doesn't mean anything until you explain what you consider to be fair. I once read a book on negotiations that called it "the F-word" - it was warning you to be wary of people using the word against you.
Think of this as a transaction between two parties doing business together. When business A is making a deal with business B, then A is not entitled to know how much money B will make out of it. It's perfectly fine to ask, but B not answering is very much within the rules (and is the norm). It is your job to do your own research to figure out how valuable your information is to them.
I also often see an attitude of "They make $20/year from the information I give them. So I should have the option to pay them $20/year for not tracking me." No business deal works that way. Each party sets their own requirements. If you demand that they offer paid services to stop tracking you, they get to say "No - you're welcome to use someone else's service if they give you that kind of a deal"
Finally, the way negotiations work: The focus will not be on what they make from your money, but how much you are willing to pay for the services you receive. The majority of people I know will not pay $10/mo for their email service. So the fact that Google may make $30/year from your use of Gmail is irrelevant. This is your BATNA.
>Remember, they're showing us ads in addition to collecting volumes of information on us, so we're paying in two ways.
And in how many ways are they paying you in services? Far more, I imagine.
>Here in the US, we cried tyranny when our security agencies were collecting metadata to monitor threats. And then we gave everything we had to Facebook and Google and didn't bat an eye as they sold us to anyone who pays.
We probably hang out in different crowds, but while government monitoring did make the news, the percentage of people I personally know who cared about it was less than those I know who say "I'm using a service other than the one provided by Google because they're already tracking me in so many ways".
BTW, although I use some Google services, by and large I do not. I do not actively use Gmail. It was very difficult for me to get used to the idea of letting them get to me via Android. I don't use my account on Youtube. I don't have a Facebook account. I don't use Twitter. I pay for my own email and web services. I tend to have privacy extensions on my browser. I don't use Whatsapp. My life is not at all miserable.
Your reasoning is extremely fallacious because you're overlooking a lot of nuance.
Yes, it is true that in aggregate part of the advertising costs get passed on to consumers. How much exactly depends on macroeconomic factors, industry dynamics, pricing power of distributors and retailers, and hundreds of other variables. But you are right that in aggregate a non-zero amount of the advertising costs get passed on to consumers.
However, advertising is valuable to advertisers because it works to some degree. And it works because people are influenced by it. Now, as an individual I have no control over what other people are doing, so I have no control over how valuable advertising is, how much is being spent on it, or how much is being passed on to consumers. Therefore as an individual I look around me and prices are what they are: they already have advertising costs in them, and those costs will not go up because I use a service. Therefore as an individual the marginal cost of me using these services is effectively zero (imagine the ratio between the value of advertising to me and the aggregate value of advertising; that's effectively zero). As an individual those services are free.
So it depends on what your definition of "free" is. A) Would prices be lower if advertising didn't exist? Yes [0]. B) Would prices be lower to me if I didn't use these services? No. The thing is that in definition A) I don't really have a choice, so it's irrelevant. I only have a choice as an individual.
[0] And even here some people will disagree with you. What I call "free market fundamentalists" will argue that No, they wouldn't, because the advertising leads to higher sales and greater operational efficiency, so that advertising costs are exactly offset by lower per-unit costs.
> You pay for them in high prices for good and services. Those companies pay advertising which increase prices plus their tech size profits.
I don't think this is falsifiable. It sounds sort of feasible in the abstract, but how can we empirically measure that effect? As it stands, I could just as easily respond that, no, these companies make prices more efficient, because their advertising methodologies are more effective, so companies spend less on their platforms than they would for equivalent ad campaigns elsewhere. Then we'd have to figure out how to investigate whether modern companies advertise more due to growth of the market, or because they have to in order to catch up with everyone else using the tech advertising platforms. Narrowing down the answer, if there is one, would be nontrivial. Then I could also say that advertising makes tech services cheaper, because Facebook and Google earn more from user data than they would if they forced everyone to pay $20/month.
Which of us is right? We can't really say, it's effectively an opinion because, even if technically falsifiable, these claims are far from being empirically demonstrable.
Companies use advertising to increase their sales, either from 0 to something or from something to something more. Along with R&D, production costs, running costs and profit, the advertising cost is part of the planning from the start.
This is not how the stock market works. Where did you get this idea?
Also, these things tend to be monopolistic and data is more moat.
Hyperbolic indeed.
> You get those free, truly amazing services
lol. Services that exist without an honest portrayal about what they do with the data they gather, or anything like clarity on the extent of what is possible.
Not content with that, they bend the service to better manipulate us, not to provide a better service. Facebook run fun psychological experiments to see if they can manipulate their users - without calling for volunteers first or anything I recognise as ethics. How many times have FB been pulled up for being, cough, "less than honest"?
Google, especially, have built a reputation for simply gathering maximum possible data, then shuttering the product 18 months later. There is no reason a company Google's size need close Reader or Picasa or any number of others. Buy their in-home device or use G Drive? Hell no, I don't trust them to provide any availability once the data grab is complete. Sure, if the company were entering leaner times you could understand canning some of their services, but right now they're no more than a rounding error. So no, I'm in no rush to use any new Google offering - they don't exist long enough.
I think as it currently stands (collect data on me, lie to me, track me around the web as far as technically possible, track me off the service as far as technically possible, track me offline as far as possible, sell retargeting and other dubious tactics to ensure my life is 101% a buying consumer experience with no time off; in return I get a service that doesn't even meet the core need any more, thanks to engagement algorithms bending it to meet FB's and Google's needs, not users') is a pretty terrifying deal.
Terrifying because we still think it's an honest exchange akin to swapping £1 for a bar of chocolate in a shop, just swap data for service. We still feel it's a fair exchange. Yeah, without honesty of even 10% of what is being done.
I'd actually like the sort of exchange you imply. Give me a great service in exchange for some data. Actually, yes, that seems fair so long as you use it reasonably. Hoovering up everything, with the Hoover turned up to 11, whilst implying "it's just a bit of data" is a confidence trick on the non-technical.
In my opinion, you're mistaking apathy for ignorance. Reasonable people can disagree about whether or not the exchange is fair. I work in tech, have heard everything you just wrote many times, and my response is continually, "Eh, I'm okay with that." People I've spoken to about this outside of tech often don't know about the data collection to quite the extent you're talking about and don't care about it. When I mention how extensive the data collection is, they're often surprised, but often they still don't care. The most I've personally seen is a vague discomfort that didn't seem to persist, given the person's behavior didn't change at all.
It gets kind of frustrating for people like myself to hear this argument rehashed over and over in the way you've just presented it, as though people who are okay with their data being used simply "don't understand" or haven't been paying attention. Everything you wrote is - while clearly not neutrally presented - in the zeitgeist on HN. There's nothing really new about it, and you're only convincing the people who already agree with you. Yes, we know our data is being used, we get it. We're picking up what you're putting down. Right there with you. We just don't really care.
You can draw a normative conclusion about me based on that statement if you'd like. It probably sounds callous and asinine from your perspective, but like I said: reasonable people can disagree. Anecdotally speaking, most people I "enlighten" about the data hoovering you describe don't meaningfully change their behavior or preferences. Consider that when people are not aware of how their data is being used, it doesn't always mean they're being victimized; it can also mean they are vaguely aware that their data is "out there", and they implicitly don't care enough about it to investigate further. That's a valid position to take.
I've always wondered: How is that different from A/B testing, or any other marketing experiment?
To be fair, I do use some of these services, too (carefully). But, that’s only because often I can’t find quality alternatives. For instance, I would pay for a high-quality service like Mint (I don’t use it) that kept my data private, but there is none.
Since gathering / ad-selling services are so lucrative, they essentially subsidize the development of the highest-quality products in the market. The best engineers work for these companies. They have massive teams of people refining their products. And, these companies use IP laws to prevent clones. Essentially, they lock out for-fee competitors.
Because of this, I’m a strong supporter of consumer protection laws and regulations. Capital often flows toward deceptive and unethical practices. And, all companies are eventually forced to follow the leader, otherwise they disappear. Laws are the only ways to prevent this.
>Since gathering / ad-selling services are so lucrative, they essentially subsidize the development of the highest-quality products in the market. The best engineers work for these companies. They have massive teams of people refining their products. And, these companies use IP laws to prevent clones. Essentially, they lock out for-fee competitors.
> You get those free, truly amazing services for nothing because the companies found ways to generate value from the data being gathered. The cost to maintain that infrastructure and provide those great services is pretty high and has taken some brilliant engineers years of sweat.
The only real difference is your use of the phrase "lock-out", but I don't think that's a reasonable way to look at it. If enough people really want to be part of a fee-based ecosystem, no one is stopping a company from engaging those people. You bring up IP laws, but that's really an independent issue, since fee-based and ad-based businesses can both use IP laws in the same way.
Have you looked at ynab.com? I find their overall philosophy on budgeting to be more effective, the app is high quality, and it's solely about your budget, not about offering you credit cards.
It's one of the few products I recommend to anyone, and it's relatively expensive ($85 for the year I believe).
Of course. No one would be using them if they weren't better off from using them.
Democratizing the data, so that more companies can use it to build products beyond this major concentration of platform players, is much more practical. This is similar to the personal data request legislation that Europe already has in place. It would be an extension to enable competing services to access this data as their customers opt into the service. The goal should be enabling more companies to build competitive products (and employ people working on them), not setting up "sustainable" sheep that feed from the same trough of these companies' concentrated power. Universal basic income, certainly; but that should be wholly independent of being subordinate to these service providers.
Perhaps if I collected it and stored it, perhaps put it behind an accessible API and supplied Google et al with an API key to access it I might have a case for charging them for it.
The remedy, of course, is to establish intrinsic property rights over our own data. No different than copyrights.
Maybe we need a new catchphrase.
I propose bodyrights. It captures the notions of self, privacy, identity, and sovereignty all in one.
I'd like to see a requirement for all companies receiving PII to issue an account of how they use it and who it's sold to, those receiving your PII would also have an obligation to notify you, and offer access to the same audit data (how they're using it, who they got it from, who they gave/sold it to).
Legit companies would then leak information, by design, showing companies using/selling your data without notifying you.
Real fines like those anticipated in the GDPR would be needed to encourage companies to adhere to the legislation.
/ifiwereemperoroftheworld
When we talk about "big data" and data collection, the vast majority of what we're talking about basically just boils down to web logs, and derived data that's inferred from those logs. So is an entry in a web log that corresponds to something you did now your property? Is anything someone writes down about me now my property? If I walk in a store and the store owner writes down that I was there, is that my property?
Only in aggregate is it worth a lot.
But the notion of getting paid for your data, when it's worth so little by itself, is not appealing.
Count me in. I subscribe to YT Red as well.