Taking control over my own data
A lot of my work at the Ministry of Health focusses on empowering people/citizens/consumers/patients/experience experts/health professionals to take control over their own (health) data. Now, that sounds like a noble cause, not something you’d disagree with on the face of it, but it is tricky to imagine the transition to a world where you actually do control your own data. So I thought, let’s see if I can get some basic control over my own data, and see what the experience actually is.
Now, you have to understand that I don’t have a lot of data myself (I hear you thinking: Hah, what a naive fool! The big data companies, ad trackers and many more have tons of data on you! and you’re right about that), so I wanted to see what I could do myself. And let me tell you: it isn’t easy to manage your own data.
Follow my experiment
I’ll write about my progress over time, as I go about discovering what I need to do to get control over my own data. For me it’s about the learning curve, about sharing my experiences with people who might have the same ambition and about having the tools that fit this ambition.
What data do I have, anyway?
There are several kinds of data that I ‘have’:
I have websites, e-mails, documents, videos, photos, backups etc., basically the regular stuff that I actively manage using several ‘free’ and commercial services already. These I’ll call the known known data: I’m aware that I have it and I actively manage it (or let others manage it for me).
Then there’s the known unknown data, which is the data I know is out there, but I have no idea where it is stored or how it is managed. This is a broad category, so I’ll include social media posts, location data, service usage statistics and other metadata about me. But I’ll also include my energy usage, medical records and other data that seems to belong to (trusted?) third parties, and may or may not be about me personally, or about the services I use.
The last category is the unknown unknown data, which is data about me that I have no idea about. This is where my naivety will shine through, but also where more awareness is likely necessary. In this category I’ll put my search metadata, my MAC address being followed in stores, other types of surveillance, even including CAPTCHAs.
So, let’s see if I can take control over ‘my’ data!
Tools of the nerd: tech stack
There are several things to consider in this mission. First of all: what do I mean by ‘take control’? As a nerd, my first response is to be able to manage it in an environment I have complete (or as complete as possible) agency over. And that means: self-hosted.
So, what do I need for this? I’ll need a server somewhere connected to the Interwebz, and since I want control, it’ll have to be self-managed. I could go for the expensive hardware of a self-hosted rack server, but my needs are small currently, so I’ll settle for a Virtual Private Server (VPS) solution. Ah, it already has the word private in it, so I’m feeling better already. That feeling evaporates as soon as I realize that I’ll need to learn how to manage a server…
What services do I need to install and self-manage (boo!) on this private (yay!) server?
Known known data:
- Websites/weblogs: I’ve been dabbling with websites since 1993 (yes, I’m that old) and I’ve been a user of the open source WordPress CMS since version 0.7. I love its openness and user-friendliness and the fact that it has a huge ecosystem that provides a lot of support and learning material. I have a number of domains and web projects that I’ve had hosted by the likes of MediaTemple and such, but managing the webserver part myself needs to be easy.
- E-mail: this is one of my biggest wishes, that for the personal domains I use and the e-mail addresses this provides to me and my family, I’m not only feeding Google’s and Apple’s and Yahoo’s and whatnot’s algorithms, but have complete ownership of my correspondence. E-mail is notoriously maintenance-heavy, though, mainly due to the security risks of having a badly configured and secured mailserver. But if this is going to be a learning project for me, e-mail should be part of my stack.
- Documents, images, videos: my personal cloud storage for documents. There are several services dedicated to this, but many monetize your data and your usage of that service as well (like uploading your videos to YouTube, for instance).
- Backup storage: data I don’t want to lose will need to be backed up somewhere I can access it if necessary.
Known unknown data:
This is a tricky category, as a lot of this data is tied to services like Facebook, Apple etc. and is notoriously difficult to detach. So, in the spirit of ‘taking control over my data’ I’ll be looking into ways to minimize the third party control over my data in their systems and ways to migrate my data from third party services to self-hosted alternatives.
- Facebook is at the top of my list. Because it’s got a terrible track record, because a lot of the time I’m on it I’m not really enjoying it, and because they’re just evil and I’m enabling their evil by remaining on their services.
- Amazon is a nice second to look into.
- Google services, because of their incredibly far-reaching scope.
- Apple services. I’m an Apple fanboy and their integration is really tight but user friendly, so this’ll be especially hard.
- Other services I use(d).
- Special attention will go to medical data and smart home data, because taking control of this kind of data is what I strive to empower everyone to do. So I’ll see what I can do myself.
So here I’ll look into what (meta)data the services I use gather and store about me, what level of control I have over that data, what the alternatives are, whether there are self-hosted alternatives, and what the consequences will be.
Unknown unknown data:
In this part I’ll research who has what kind of data points on me, and what I can do to minimize the potential damage that data might do to me and my family, as well as to see what kind of control I can exert on that data.
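To keep track of all this, the three categories could be kept in a small inventory. What follows is only a rough sketch in Python, not a tool I actually use: the names `Awareness`, `DataSource` and `not_under_my_control` are made up for illustration, and the example entries are just a few of the items mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class Awareness(Enum):
    """The three categories of data described in this post."""
    KNOWN_KNOWN = "known known"          # I know I have it and I manage it
    KNOWN_UNKNOWN = "known unknown"      # I know it exists, not where or how
    UNKNOWN_UNKNOWN = "unknown unknown"  # I have no idea it exists


@dataclass
class DataSource:
    """One kind of data and who currently holds it (hypothetical record)."""
    name: str
    awareness: Awareness
    held_by: str
    self_hosted: bool = False


def not_under_my_control(sources):
    """Return the names of all sources that are not self-hosted yet."""
    return [s.name for s in sources if not s.self_hosted]


# Illustrative entries only; the 'held_by' values are placeholders.
inventory = [
    DataSource("websites", Awareness.KNOWN_KNOWN, "my VPS", self_hosted=True),
    DataSource("e-mail", Awareness.KNOWN_KNOWN, "third-party mail provider"),
    DataSource("social media posts", Awareness.KNOWN_UNKNOWN, "Facebook"),
    DataSource("search metadata", Awareness.UNKNOWN_UNKNOWN, "unknown"),
]

for name in not_under_my_control(inventory):
    print("still to sort out:", name)
```

A list like this makes it easy to see at a glance what is still in someone else’s hands, and it can grow as I discover more of the unknown unknowns.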