Amazon knows more than just what books I’ve read and when – it knows which parts of them I liked the most

'They know us better than we know ourselves': how Amazon tracked my last two years of reading

When I requested my personal information from Amazon this month under California’s new privacy law, I received mostly what I expected: my order history, shipping information and customer support chat logs.

But tucked into the dozens of files were also two Excel spreadsheets, more than 20,000 lines each, with titles, time stamps and actions detailing my reading habits on the Kindle app on my iPhone.

I now know that on 15 February 2019 starting at 4.37pm, I read The Deeper the Water the Uglier the Fish – a dark novel by Katya Apekina – for 20 minutes and 30 seconds. On 5 January 2019 starting at 6.27pm, I read the apocalypse-thriller Severance by Ling Ma for 31 minutes and 40 seconds. Starting at 2.12pm on 3 November 2018, I read mermaid romance tale The Pisces by Melissa Broder for 20 minutes and 24 seconds.

And Amazon knows more than just what books I’ve read and when – it also knows which parts of them I liked the most. On 21 May 2019 I highlighted an excerpt from the third installment of the diary of Anaïs Nin, the data shows, and on 23 August 2018 at 11.25pm, I highlighted an excerpt from Leslie Jamison’s The Recovering: Intoxication and its Aftermath. On 27 August 2018, I changed the color of a highlighted portion of that same book.

Other habits tracked included the times I copied excerpts from books into my iPhone’s clipboard and how often I looked up definitions of words in Kindle’s attached dictionary.

I already understood Amazon tracks our purchases on its site, our activity across the web, our voice commands, our grocery shopping and our locations. But the extensive tracking of my reading habits – my most beloved and previously offline hobby – was jarring. Who is this information shared with, what is done with it, and how can it affect my privacy – and the future of the reading experience itself?

Amazon does not share what individual customers have highlighted with publishers or anyone else, a spokeswoman said. The highlights are logged to sync reading progress and actions across devices, she said. Aggregated data is used to show which parts of books have most frequently been highlighted, as Kindle customers can see while reading. The company does say the data is used “to provide customers with products and services, pay content providers and improve the reading and shopping experience”, the spokeswoman said.

From my reading history, which included books on self-help and mental health, Amazon could easily make inferences about my personal health, career and hobbies. Even the time of day I read or the speed at which I turn pages can provide insights on personal traits, said Stacy Mitchell of the Institute for Local Self-Reliance.

“It is hard for us to wrap our minds around what artificial intelligence enables Amazon to do with this data,” she said. “The kinds of nuanced correlations Amazon is able to find through analyzing that data is beyond what we can conceptualize as human beings.”

Though Amazon says it is not currently sharing the insights gleaned from reading habits with anyone else, the fact that the company holds on to the data shows it could be used in the future, said Alastair Mactaggart, an advocate who co-wrote the ballot measure behind the California Consumer Privacy Act.

“Many of these companies just scoop up as much data as they can without knowing how it will be used – all they know is that more information is better,” he said. “The essential truth is that these entities know us better than we know ourselves.”

Activists and hackers claim this information is not, in fact, necessary for the apps to function. “There is no reason Amazon or any other company needs to collect that kind of information to provide you with the service, which is simply reading a book,” said Evan Greer, the director at privacy activist group Fight for the Future.

To limit the amount of data Amazon can collect on them, a number of readers are bypassing Amazon’s approved file formats and downloading pirated books to Kindle. The so-called Kindle hackers have found ways to modify book covers, change brightness and prevent tracking within ebooks.

Some of Kari Paul’s Kindle data. Photograph: Kari Paul

While more tech-savvy users can attempt to alter the Kindle device or app to prevent tracking, the average reader can do little to escape Amazon’s reach. The company is now responsible for the sale of some 50% of physical books for major publishers and 80% of ebooks. For those who prefer to purchase books from brick-and-mortar stores, tracking reading on book social site Goodreads, which is owned by Amazon, will put you back into the tech giant’s purview.

“Ideally if we thought data collection practices were unfair, we could go somewhere else,” Greer said. “But that there is little choice speaks back to the fact that decisions Amazon makes have such an enormous effect across sectors because of its size and the monopoly that it exercises.”

Could Amazon’s monopoly over the publishing industry change the nature of books themselves? As a result of the economic pressures of the streaming industry, the length of the average song on the Billboard Hot 100 fell from 3 minutes and 50 seconds to 3 minutes and 30 seconds between 2013 and 2018. Will books be the next art form to be altered? Greer said it is possible.

“Never underestimate the power, or willingness, of tech companies to do almost anything to make a little extra money – including shifting the entire way we make music or read and write books,” she said. “They are perfectly willing for art to be collateral damage in their pursuit of profit.”