By using many web services, you have agreed to supply a great deal of personal information to parties that both use it internally and market it. XRay, a project at Columbia, aims to track some of these uses, beginning with Gmail, YouTube, and Amazon and expanding to other services in the future.
The Web can be a black box. When a user sees an ad about spiritual meditation methods, she may not realize that she's seeing that ad because she recently received an email about depression or cancer. We are seeking to change that, and in doing so bring more transparency to the Web.
To that end, we developed XRay, a new tool that reveals which data in a web account, such as emails, searches, or viewed products, are being used to target which outputs, such as ads, recommended products, or prices. It can make end users more aware of what the services they use do with their data, and it can give auditors and watchdogs the tools they need to keep the Web in check.
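To make the idea of linking account data to targeted outputs concrete, here is a minimal toy sketch. It is not XRay's actual algorithm. It assumes we can observe several hypothetical accounts, each holding a different subset of inputs, and record which outputs each account was shown. The function names, the scoring rule, and the sample data are all illustrative assumptions.

```python
# Toy illustration only: NOT XRay's actual method. All names and data below
# are hypothetical and exist purely to show what "which input targets which
# output" means in practice.

def attribute_outputs(accounts):
    """Score how strongly each input is associated with each output.

    accounts: list of (inputs, outputs) pairs, where both are sets of strings.
    Returns a dict mapping (input, output) to a score in [-1, 1]: the rate at
    which the output appears in accounts that contain the input, minus the
    rate in accounts that do not.
    """
    all_inputs = set().union(*(ins for ins, _ in accounts))
    all_outputs = set().union(*(outs for _, outs in accounts))
    scores = {}
    for out in all_outputs:
        for inp in all_inputs:
            with_inp = [outs for ins, outs in accounts if inp in ins]
            without_inp = [outs for ins, outs in accounts if inp not in ins]
            if not with_inp or not without_inp:
                continue  # no comparison group, so no score
            rate_with = sum(out in outs for outs in with_inp) / len(with_inp)
            rate_without = sum(out in outs for outs in without_inp) / len(without_inp)
            scores[(inp, out)] = rate_with - rate_without
    return scores


def best_guess(scores, output):
    """Return the input with the highest association score for an output."""
    candidates = {inp: s for (inp, o), s in scores.items() if o == output}
    return max(candidates, key=candidates.get)


# Hypothetical observations: (emails in the account, ads the account saw).
accounts = [
    ({"email:depression", "email:travel"}, {"ad:meditation", "ad:flights"}),
    ({"email:depression"}, {"ad:meditation"}),
    ({"email:travel"}, {"ad:flights"}),
    ({"email:car loan"}, {"ad:used cars"}),
]
scores = attribute_outputs(accounts)
print(best_guess(scores, "ad:meditation"))
```

In this made-up data, the meditation ad appears only in accounts that contain the depression email, so that email receives the highest score for that ad. A real system would also have to cope with noise, far larger input sets, and outputs that are not targeted at all.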
Currently, XRay can reveal some forms of targeting for Gmail ads, Amazon product recommendations, and YouTube video recommendations. However, XRay's core mechanisms are largely service-agnostic. We hope they will provide the building blocks for a new generation of auditing tools that help lift the curtain on how users' personal data is being used.
Using our XRay Gmail prototype, we found some pretty interesting examples of data uses, such as a number of ads targeting depression, cancer, and other illnesses. We also saw quite a few subprime loan ads for used cars that targeted keywords such as "debt," "loan," or "borrow" in users' inboxes.