DataEthics4All Ethics 1stᵀᴹ Live Talks: The Misuse of Private Information
“Whatever data you’re collecting from a user, you have to exactly notify the user what data you’re going to collect, what you’re going to use the data for, and the user should have the right to say no.”
– Shilpi Agarwal
“We need to be a bit more data-literate, know where our data is, request data if we want it, and be a bit more willing to understand some of the issues.”
– Sam Wigglesworth
In this Ethics 1st Live Discussion, members of the DataEthics4All leadership team discuss the misuse of data – what it is, some examples and how you can defend yourself against it.
0:08 Shilpi Agarwal:
Welcome to another Ethics 1st Live talk. I'm your host Shilpi Agarwal, founder of DataEthics4All, and I'm joined by my colleagues from the leadership team, Susanna Raj and Samantha Wigglesworth. Sam is joining us from the UK, and Susanna is joining us from California – sunny California! If you've joined us for the past five live talks, or you've managed to see a recap of some of those, our intention with these Ethics 1st Live talks is to bring food for thought to leaders with an ethics-first mindset – those who put people above profits and have ethics in their DNA; who value ethics above all. For them it is not an afterthought; it is by design, and it is at the forefront of everything they do. To encourage more such leaders and to bring more food for thought, this is our weekly talk series. Today's topic is the misuse of personal or private information – a really exciting topic because it is at the top of everyone's mind. Private information, and the misuse of it, is very common these days, and sometimes it's a gray area and we don't even know that we have misused it. Today's talk will shed some light on what exactly private or personal information is, what personal data is, who has the right to use it, who has ownership of that information, and when it is okay to share or sell that information or use it publicly. I hope that our discussion today will spur some conversation and food for thought. Let's start with a basic question: what is personal information; what is private information?
3:27 Susanna Raj:
I believe private information is defined in very broad terms; legally even your name is considered private information. Your name – even your first name – your contact information, your home address, your IP address, and any information that can be linked to you to identify you as an individual is considered private information.
Yeah, it’s called PII – Personally Identifiable Information. Any information that can identify you as a person, as opposed to being anonymous – as opposed to “a female living in California aged between this and this”, which is all still demographic information; there can be hundreds, thousands, millions of people with those same exact demographics. But when you put a few things together, then it’s easy to identify a unique person, so PII is a unique identifier – anything that uniquely identifies you. Sam, do you want to add anything to that?
4:45 Sam Wigglesworth:
You’re absolutely right – according to the legislation, personal data includes name, surname, location; anything that’s personally identifiable to you. It’s not just demographic data; it might be a reference to health data as well. It does span quite a lot – there’s a lot to cover.
5:37 Shilpi Agarwal:
It’s also sometimes a combination of these identifiers. For example, even a first name is considered PII, but there could still be many people with the same name – there could be many Shilpis in the world – but when you combine that with more information, like my healthcare data or non-uniquely identifiable information like my zip code, it narrows things down, and then it is much easier to uniquely identify that person. That’s when the information truly becomes personal or private information. Healthcare data, for example, could belong to lots of different people, but when you combine it with name, demographics and location – when in an office there’s only one person with that name – those are the kinds of situations where we need to be careful about how we reveal that information. Okay, so what is considered data misuse?
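[Editor’s note: as an illustration of the re-identification risk described above – combining individually common attributes until only one record matches – here is a minimal Python sketch. All names, zip codes and records are invented for the example.]

```python
# Hypothetical records: each attribute on its own is shared by many
# people, but a combination of "quasi-identifiers" can pinpoint one.
records = [
    {"name": "Shilpi", "zip": "94301", "role": "teacher"},
    {"name": "Shilpi", "zip": "94025", "role": "engineer"},
    {"name": "Alex",   "zip": "94301", "role": "engineer"},
]

def match(recs, **attrs):
    """Return the records matching every given attribute."""
    return [r for r in recs if all(r.get(k) == v for k, v in attrs.items())]

print(len(match(records, name="Shilpi")))               # 2 – still ambiguous
print(len(match(records, name="Shilpi", zip="94301")))  # 1 – uniquely identified
```

Each attribute alone leaves several candidates, but the combination narrows to a single person – which is why a name plus a zip code plus a health record is treated as PII even when each piece seems harmless.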
7:12 Sam Wigglesworth:
I was looking at this earlier, and for me it’s having someone’s individual information and using it for something other than the intended purpose. That’s my definition. It doesn’t mean that it’s necessarily illegal, but it’s not being used for the intended purpose.
7:50 Susanna Raj:
I would add that it’s not only use for an unintended purpose: any data you collect without permission or consent, or that you collect for one use case and use for another, or where you expand the use case without going back to the user and informing them again – all of those are considered misuse. In the realm of social media, releasing someone’s private information because they have a contrarian or political viewpoint – revealing their personal information for that reason – is also data misuse.
8:32 Shilpi Agarwal:
Yeah, absolutely. I gave a talk on the DATAcated show about the ethical considerations of data collection, and consent was a major topic. First of all, you have to obtain explicit consent for whatever data you’re collecting from a user, and that consent has to have three key ingredients: it has to notify the user exactly what data you’re going to collect, it has to say what you’re going to use the data for, and the user should have the right to say no. When you’re asking for consent, there should be a clear way for them to say no. Going further, there should also be a way for them to revoke that permission, so it’s not that they’ve given you permission once and you just have it forever. You should also have a clear sunset policy, meaning you know when that data will expire, so you don’t collect somebody’s data with their permission but hold it forever and then use it for other purposes or sell it. Overall, data misuse is data being used for a purpose other than that for which it was collected, without the consent of the user. Whenever, as a business, we want to use the data we have collected for something else, we must go and inform the users. That’s not just ethical – nowadays there are new laws that require you to do exactly that. So, what are some examples of data misuse that we have seen in recent times?
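[Editor’s note: the three ingredients of consent described above – what is collected, what it is used for, the right to say no/revoke – plus a sunset policy can be sketched in code. This is a hypothetical illustration, not any real system’s API; all names are invented.]

```python
from datetime import datetime, timedelta, timezone

class ConsentRecord:
    """Hypothetical consent record capturing the three ingredients
    plus a sunset date after which the data must expire."""

    def __init__(self, fields, purpose, retention_days):
        self.fields = fields            # exactly what data is collected
        self.purpose = purpose          # what the data will be used for
        self.granted_at = datetime.now(timezone.utc)
        self.expires_at = self.granted_at + timedelta(days=retention_days)
        self.revoked = False

    def revoke(self):
        self.revoked = True             # the user may withdraw at any time

    def allows(self, field, purpose):
        """Use is permitted only for consented fields, the stated
        purpose, before the sunset date, and while not revoked."""
        return (not self.revoked
                and field in self.fields
                and purpose == self.purpose
                and datetime.now(timezone.utc) < self.expires_at)

consent = ConsentRecord({"email"}, purpose="newsletter", retention_days=365)
print(consent.allows("email", "newsletter"))   # True
print(consent.allows("email", "advertising"))  # False – different purpose
consent.revoke()
print(consent.allows("email", "newsletter"))   # False – consent withdrawn
```

The key design point is that every use is checked against the purpose the user originally agreed to; expanding to a new purpose, or using data past the sunset date, fails the check and would require going back to the user.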
10:46 Susanna Raj:
There’s been a lot – since 2016 there have been so many misuses of data! Cambridge Analytica is one that comes to mind; another is the time an undercover FBI agent’s name and address were identified. Even though she was publicly identified as a person, the fact that she had another job that was undercover was linked – the two were linked together. Sam could be a teacher in the United Kingdom, but she could also be an undercover agent; if I link that information and leak it, that puts her life in danger. That is called doxxing – if I have a disagreement with her and I know this information about her and I put it on social media. That was one of the major misuses that happened during the past administration in the United States, and then it went back and forth: the far right and the far left doxxing each other and releasing private information, putting people’s lives in danger just because they disagreed with the other person’s political viewpoint. Those are the kinds of misuses that have been happening, especially on Twitter and Facebook.
12:36 Shilpi Agarwal:
And also Google and Amazon – basically all of these companies that have marketplaces. Facebook’s Cambridge Analytica scandal started with the marketplace: there were apps collecting data from users, and those users explicitly gave information to Facebook, but these other apps went through the marketplace and collected data from the friends of the users who had actually installed the app. I might give permission to a mobile game app to collect my data because I want to play that game, but that doesn’t mean I’m also giving it permission to access the data of my friends! These marketplaces are a big loophole. Google got into a lot of trouble for collecting location data without explicit consent from users. Amazon’s marketplace was doing something similar – it’s still under investigation and charges have not been formally brought, but there is an in-depth investigation going on by the EU’s antitrust authority into whether sensitive information from Amazon’s marketplace sellers has been used to Amazon’s advantage. Sam, do you want to add something to this?
14:20 Sam Wigglesworth:
Yeah, I was going to say that Twitter is starting to review its privacy policies based on previous issues with uploads of videos and images that were then used against individuals; they reviewed their policies quite recently because of that. In the news here in the UK, we know that data has been, and is being, misused by organizations against quite prominent, quite famous individuals – that’s been in the news recently as well, where data was obtained without permission. That’s something we’ve seen quite a lot of here. You mentioned Google and the collection of data from ads; what I’m seeing is more companies now looking at solutions to make their data protection policies more stringent and strong. We talked last week, in the previous Ethics 1st Live talk, about bounties being awarded to developers who can find loopholes and gaps, whether in software or web applications; people are now actively looking to do that, and companies are hiring people to do that, so that’s really positive.
16:31 Susanna Raj:
17:47 Shilpi Agarwal:
It was a well-intentioned effort; they wanted to protect people. Social media nowadays is full of situations where you go to a party, people take pictures with you in the background, and then they post them on social media – and there is no consent. What if I don’t want my whereabouts made public on social media? But it’s difficult to get consent from everyone, especially in a crowd, so what do you do? How do you put something in place where you can report a post to Twitter and ask them to take it down, and they will review it? It was a well-intentioned effort, and something that definitely needs to be done, because on social media platforms kids do it, we all do it – we want to brag about ourselves, but we forget that in the process we may be hurting someone else’s privacy or exposing their private information. Like Susanna said, the far-right activists felt very happy that this policy was going to protect them, because now nobody would know the faces of the people who are deliberately causing intentional harm to society. Obviously we don’t want that, but to be fair, the policy said that you have to appeal to have something taken down, not that you can do it on your own, and Twitter will make an informed decision about whether it needs to come down. So in this case, if they see that it is an actual far-right activist group causing harm to society, then Twitter can decide not to take it down and let it be common knowledge that these are the kinds of groups or people causing harm – so stay away from those kinds of groups or protests. You’ve got to start somewhere, because social media today has become a giant beast. No matter how well intentioned the rules you put in place are, there is always good and bad to them. So, what is the difference between data misuse and data theft?
20:31 Sam Wigglesworth:
They’re slightly different. Misuse, as we’ve said, is where data isn’t being used for its intended purpose – as Susanna said, without consent. Theft is when there’s been a data breach or a cyber attack on a company’s or an individual’s data and it’s taken without consent. You’re not aware of it at all.
21:35 Shilpi Agarwal:
True – there’s no consent of any kind. With misuse there might be consent for one particular purpose; with theft there is no consent, period. Consent is not given by the user for any purpose, to any person, company, vendor or app; they decide to just steal that data and use it. That’s literally what it means – it’s basically theft. In some ways, what happened with Cambridge Analytica can also be considered theft, because the users only gave consent for their own information, but the app took data from all of their friends, so that would be considered theft. Susanna, do you have anything to add to that?
22:40 Susanna Raj:
No, theft is just taking something without permission – that’s the basic definition of theft; that definition applies to data.
23:01 Shilpi Agarwal:
With misuse it is still our data too – in both cases we are the owners of the data, but in the first case we are willingly sharing it with someone. So, are there any laws today to protect us, the users, from companies misusing our personal information or stealing our data?
23:46 Sam Wigglesworth:
In the UK and in Europe we do have policies around data protection; we have the Data Protection Act in the UK, which was updated in 2018. That governs the use of personal data – how it’s collected and what we should be using it for. Then you’ve got the Computer Misuse Act as well, which covers what we were just talking about: the risk that data held by an organization could be accessed illegally.
24:28 Shilpi Agarwal:
We also have GDPR, and the CCPA, which is for California – but it should be everywhere; it should be applied nationwide. CCPA stands for California Consumer Privacy Act. It gives users protection for their data: whenever you visit a website, even if cookies are in use, you can say ‘do not sell my information’. You can explicitly ask a company to share whatever information they have on you, and they have to actually show it to you; and if you say that you do not want that information stored, they have to abide by that and delete all the information they have collected on you. You can also fill out a form saying ‘do not sell my personal information’, and in that case they will not be allowed to make money from your personal data. Those are some of the protections the California Consumer Privacy Act provides. I will let Susanna talk more about GDPR if she wants to add anything.
25:46 Susanna Raj:
Yeah, GDPR is much broader than the CCPA in that it goes further – not only with the deletion of data, the right to be forgotten, but also with the definition in Article 48 of GDPR; the European laws are very strict in how they define a person and personally identifiable information. A ‘living person’ versus a ‘natural person’ versus a ‘legal person’ – these have different definitions, and GDPR also defines, in a granular way, the difference between corporate rights and individual rights. All of those are handled much better in GDPR than in the CCPA – which is a good start for California; as for the United States as a whole, we don’t have anything like it, so California is leading the way. It still falls well behind the GDPR, though, in defining what a person is – the legal identity of a person versus a company – which the GDPR does very well.
27:06 Sam Wigglesworth:
What you mentioned earlier about sunset policies comes up in this legislation as well: if you’re going to be using data, it has to be done lawfully, it has to be time-limited, limited to what’s necessary for its purpose, and accurate. You have the right to request what information is being held on you. It covers a broad range of protections, and the UK legislation is modeled on that – the UK legislation and GDPR operate together.
28:00 Shilpi Agarwal:
You brought up sunset policies. Consider data collection at the moment, especially with contact tracing and everything that’s happening with Covid: we have all these contact tracing apps, and even the government, even schools, are collecting your data. What schools are doing – in my kids’ schools, for example – is keeping a seating chart and having kids sit in the same exact place every day, so that if somebody tests positive they can inform everybody who came in contact with that kid. Again, it is well-intentioned and it is good for the purpose it’s serving right now, but hopefully Covid will soon be behind us, and whatever data and information we have collected through contact tracing, we need to let it go – we need a sunset policy. These apps have minute-by-minute information on where you’re going and which stores you’re visiting; they have so much information on a person that it could be misused in a big way. Also, in that talk I gave for DATAcated, there was a use case where H&M was collecting information on their employees. Every time an employee went on holiday or took personal leave, they were required to have an interview with senior leaders, and they were asked to share more information than was needed; this information was then used to infer things like their religious preferences, and to place them on the corporate promotional ladder, or whatever you want to call it. For these practices H&M was fined 41 million dollars. Maybe it was well-intentioned – maybe they only wanted to understand their employees better, to give them better perks and create more things that would engage them. But the execution, and the data that was collected, went beyond what was needed.
So coming back to data collection: everything starts with data collection, so we have to be very conscious and ethically minded at every stage of the data.
30:58 Susanna Raj:
Especially with IoT [Internet of Things] devices now – that’s another whole big area of security issues. Here in California, PG&E has smart meters which collect all the data about when you do your laundry, how long you do it for, when your lights are on and when they’re off – they collect all kinds of information. We went through a breach in India and lost a lot of that information to thieves. Why are companies collecting this information when the purpose is simply to know how much energy you are using? You don’t need the granular details of each home and each individual user’s preferences. We have to be aware of what is being collected on us; I believe, as individual users, that’s on us.
31:57 Shilpi Agarwal:
But you know what PG&E is doing, right? They are not only seeing what time you do your laundry; they have actually come up with different price points. Where they are coming from with this information is that they want to spread out the stress on the grid, because everybody comes back from work and does laundry between 4 and 7pm, or in the morning when people are working from home. So to reduce that, they have different pricing for people who do laundry between 4 and 7pm or on weekends versus people who do it at other times of the day. So to your point: yes, they are collecting that information – because they have the smart meter they can do that – and they are actually incentivizing people to do their laundry, or use less electricity, outside the times when the grid is already overloaded. Yes, it is well-intentioned, but with data comes responsibility, and with more data comes more responsibility. If we want to use the data we collect from our users, we have to secure it properly so it is not susceptible to breaches. That is an ongoing task for the security officer – it’s a nightmare these days! We talked about ethical hacking and bug bounties; imagine the jobs of the privacy and security officers. It must be hard, because you never know who is going to come after your data.
Alright, last question: what can we do to better protect our data? Is there anything we can do as consumers to protect our data?
34:23 Sam Wigglesworth:
We’ve just got to realize – and I think we’re more aware of it now, since the pandemic started and we began working from home, using IoT devices a lot more than we used to and being on our computers at home – that our work and private lives have blended a little, and we have to be a bit more vigilant about what we’re doing online, what we’re sharing and how we’re sharing our information. Generally, we need to be a bit more data-literate; that’s what we try to do – we want to focus on that and develop those skills. Knowing where our data is, requesting data if we want it, being a bit more willing to understand the pitfalls and the issues – and if we want to know more, we can find out more and build our knowledge. That all helps, definitely. Being ready, reading around the topic, visiting our community and learning from what we discuss in our community online and in these talks – that will make us more resilient. I think we’ve got to think about that going forward.
35:49 Shilpi Agarwal:
It starts with information: we have to know our rights, know where we are sharing our data, and know what others are doing with our data. If we are signing our consent on a 45-page document, it’s our job to read it, understand it, and make sure we know what our data is going to be used for.
36:13 Susanna Raj:
It’s also your job to demand that the 45-page document be a four-page document – it’s within your rights to demand those things. It’s also within your rights to ask whether you really need a coffee maker with IoT in it. Can’t you just walk over and flip the damn switch instead of asking Alexa to turn it on for you? All of those things – what is the convenience, and what freedom are you paying for it with? Incentivizing behavior is another thing corporations are doing with our IoT information. Yes, it may help the power grid, but they are also changing the behavior of each individual household – and households with a disabled member may use more power at different times of the day, and may not be able to transition to the incentivized behavior PG&E wants us to have. We should also think about communities that may not be able to accommodate these behavior-modeling practices. Taking all of that into account, I don’t think the burden should be only on us. Should it be on you to read every 45-page document? No – I don’t want to read a 45-page document; I want you to give me a two-page document. It’s also my right to demand those things and to make the laws work for us.
37:58 Shilpi Agarwal:
I think with that we will end on a good note: know your rights and demand your rights. Do not think that if your data is out there, it’s out of your hands – you still have control. The data belongs to you, and you alone have ownership of that information; you can now ask for the right to be forgotten, for that data to be deleted, and for that data not to be sold to anyone else. You yourself are the only person who should be able to make money from your own data – no one else. With that note, I hope you got some good food for thought, and we’ll see you next time. Thank you, Susanna and Sam! For everyone who is going to watch this on demand, please feel free to post your comments – we are always listening and we’d love to hear from you.
Join Us in this weekly discussion of Ethics 1stᵀᴹ Live and let’s build a better AI World Together! If you’d like to be a Guest on the Show, Sponsor the Show or for Media Inquiries, please email us at firstname.lastname@example.org