How I am using 50+ sources of my personal data

This is a list of the personal data sources I use or am planning to use, with rough guides on how to get your hands on that data if you want it as well.

It's still incomplete and I'm going to update it regularly.

My goal is automating data collection to the maximum extent possible and making it work in the background, so one can set up pipelines once and hopefully never think about it again.

This is kind of a follow-up to my previous post on the sad state of personal data, and part of my personal way of getting around this sad state.

If you're terrified by the long list, you can jump straight to the "Data consumers" section to find out how I use it.

¶ 1 Why do you collect X? How do you use your data?

All things considered, I think it's a fair question! Why bother with all this infrastructure and hoard the data if you never use it?

In the next section, I will elaborate on each specific data source, but to start with I'll list the rationales that all of them share:

¶backup

It may feel unnecessary, but shit happens. What if your device dies, your account gets suspended for some reason, or the company goes bust?

¶lifelogging

Most data in digital form has timestamps, so it automatically, without manual effort, constitutes data for your timeline.

I want to remember more, be able to review my past, and bring back and reflect on memories. Practicing lifelogging helps with that.

It feels very wrong that things can be forgotten and lost forever. It's understandable from the neuroscience point of view, i.e. the brain has limited capacity and it would be too distracting to remember everything all the time. That said, I want to have a choice whether to forget or remember events, and I'd like to be able to potentially access forgotten ones.

¶quantified self

Most collected digital data is somewhat quantitative and can be used to analyze your body or mind.

¶ 3 Data consumers

¶Instant search

Typical search interfaces make me unhappy as they are siloed, slow, awkward to use and don't work offline. So I built my own ways around it! I write about it in detail here.

In essence, I'm mirroring most of my online data like chat logs, comments, etc., as plaintext. I can overview it in any text editor, and incrementally search over all of it with a single keypress.

¶orger

orger is a tool that helps you generate an org-mode representation of your data.

It lets you benefit from the existing tooling and infrastructure around org-mode, the most famous being Emacs.

I'm using it for:

searching, overviewing and navigating the data

creating tasks straight from the apps (e.g. Reddit/Telegram)

spaced repetition via org-drill

Orger comes with some existing modules, but it should be easy to adapt your own data source if you need something else.

I write about it in detail here and here.

¶promnesia

promnesia is a browser extension I'm working on to escape silos by unifying annotations and browsing history from different data sources.

I've been using it for more than a year now and am working on the final touches to properly release it for other people.

¶dashboard

As a big fan of #quantified-self, I'm working on a personal health, sleep and exercise dashboard, built from various data sources.

I'm working on making it public; you can see some screenshots here.

¶timeline

Timeline is a #lifelogging project I'm working on.

I want to see all my digital history, search in it, filter, easily jump to a specific point in time and see the context when it happened. That way it works as a sort of external memory.

Ideally, it would look similar to Andrew Louis's Memex, or might even reuse his interface if he open sources it. I highly recommend watching his talk for inspiration.

¶ HPI python package

This Python package is my personal (Python) API to access all collected data. I'm elaborating on it here.
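
To give a rough idea of what a personal Python API over collected data can look like, here is a minimal sketch. It is not the actual HPI code: the `Comment` type, the `comments` function, the directory layout (one JSON file per periodic export) and the field names `created`, `text` and `url` are all assumptions made up for illustration. The point is only that a consumer such as a search index or a dashboard can ask for typed objects without caring how the data was exported.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterator


@dataclass(frozen=True)
class Comment:
    """One exported comment. The fields are hypothetical, not a real schema."""
    created: datetime
    text: str
    url: str


def comments(export_dir: Path) -> Iterator[Comment]:
    """Yield comments from every JSON export in export_dir.

    Assumes each file holds a list of objects with 'created' (Unix seconds),
    'text' and 'url' keys. Periodic exports tend to overlap, so a comment
    whose URL was already seen in an earlier file is skipped.
    """
    seen = set()
    # Sorting by file name processes exports in a stable order,
    # e.g. oldest first if the names start with a date.
    for path in sorted(export_dir.glob("*.json")):
        for raw in json.loads(path.read_text()):
            if raw["url"] in seen:
                continue
            seen.add(raw["url"])
            yield Comment(
                created=datetime.fromtimestamp(raw["created"], tz=timezone.utc),
                text=raw["text"],
                url=raw["url"],
            )
```

A consumer would then do something like `for c in comments(Path("exports/comments")): ...`, where the path is again just a placeholder. Keeping the parsing and deduplication in one place means the search, org-mode and dashboard code never has to touch raw export files.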