Matt Curtin
Paul Graves
Shaun Rowland
Interhack Corporation
http://www.interhack.net/
Date: 2000/07/31 22:18:26
Coremetrics has taken a bold step forward for marketers, going well beyond the typical use of Internet technology to observe and to record the online habits of unsuspecting Web users. While engaged in normal ``electronic commerce'' activity on various well-known Internet Web sites, we observed a user's name, postal mailing address, telephone number, electronic mail address, and many other personally-identifiable details being reported to Coremetrics. These data--we discovered 71 different variables--are formatted for easy entry into a database system that will build an extensive database of customers of sites that use Coremetrics' service. Personally-identifiable information sent to Coremetrics by one site can trivially be used to link the activity of that user on any site that uses Coremetrics' service. Thus, anything that any Coremetrics site learns about a user can be added to a single database that contains the sum of all information collected by all Coremetrics clients' sites. The more sites that use Coremetrics, the more extensive such a database will be for each user and the more users it will contain.
This information is not the sort of ``anonymous'' and ``aggregate'' information that marketers claim (falsely, in our opinion) is harmless to consumers, but is very detailed, specific information like what's being purchased, the user's name, his mailing address, his email address, his phone number, etc.
This is implemented by embedding JavaScript code in the vendor's web site that causes a connection to be made to Coremetrics with a specially crafted query string that reports such details in a standardized format for easy entry into a database. This fetch is implemented as a request for an image, a 1 pixel blank GIF, commonly known as a ``web bug''.
Where we could obtain the JavaScript code in question, it was clearly obfuscated in an attempt to prevent a human from being able to read it. A cleaned-up version of one of these programs can be found in Appendix B.
Additionally, because the vendor's pages that include this JavaScript are often encrypted, our normal mechanism for learning what is happening on the network (a packet sniffer) was unable to determine many details.
To find the details of what information was being reported to Coremetrics, we identified the host to which all of these data are being reported and implemented a duplicate of it in our laboratory. Once our duplicate was running, we proceeded to surf Coremetrics client sites. Thus, all information that these sites were sending about us to Coremetrics wound up in our database, not that of Coremetrics.
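As an illustration only (this is not the code used in our laboratory), a stand-in collector of the kind described above can be sketched in a few lines of Node.js. The function names, the `collector.invalid` base URL, and the in-memory log are assumptions made for this sketch. It also assumes that reports arrive as ordinary `name=value` pairs in the query string, and it does not cover the TLS setup or name-resolution changes that would be needed to intercept reports from encrypted pages.

```javascript
// Hypothetical stand-in collector: records what a page reports, then
// answers with a 1x1 GIF so the page behaves as usual.
const http = require("http");

// A small 1x1 GIF, base64-encoded, returned in place of the real web bug.
const PIXEL = Buffer.from(
  "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7",
  "base64"
);

// Turn the query string of a request URL into a plain object.
function parseReport(requestUrl) {
  // Request URLs are relative, so a placeholder base is supplied.
  const url = new URL(requestUrl, "http://collector.invalid");
  const fields = {};
  for (const [name, value] of url.searchParams) {
    fields[name] = value;
  }
  return fields;
}

// Build (but do not start) a server that appends each report to `log`.
function createCollector(log) {
  return http.createServer((req, res) => {
    log.push({ path: req.url.split("?")[0], fields: parseReport(req.url) });
    res.writeHead(200, {
      "Content-Type": "image/gif",
      "Content-Length": PIXEL.length,
    });
    res.end(PIXEL);
  });
}

// Usage (not executed here):
//   const captured = [];
//   createCollector(captured).listen(8080);
```

Every report then lands in the local `captured` array rather than in a third party's database, which is the effect described above.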
| Site | Leaking Name, Address, Phone Number, or Email? |
| www.toysrus.com | |
| www.ashford.com | |
| www.fusion.com | |
| www.dxcart.com | |
| www.exofficio.com | |
| www.getplugged.com | |
| www.inchant.com | |
| www.lucy.com | |
Coremetrics clients' sites commonly also make use of banner advertising networks, which can leak information about the user's activity to still other third parties. Table 2 shows a list of other third-party sites cited, whether those sites can identify a user from request to request through the use of cookies, and the sites that introduce these third parties into the transactions.
Such prolific leakage seems to suggest that very little, if any, concern for the privacy of the sites' visitors exists.
| Third-Party Site | Cookies? | Referred By |
| switch.avenuea.com | | www.toysrus.com, www.lucy.com |
| a1896.g.akamaitech.net | | www.toysrus.com, www.lucy.com, www.getplugged.com |
| a1428.g.akamai.net | | www.getplugged.com |
| medals.bizrate.com | | www.fusion.com |
| view.accendo.com | | www.getplugged.com |
| ads.admonitor.net | | ? |
| ad.doubleclick.net | | www.petstore.com |
| 207.178.130.149 | | www.getplugged.com (?) |
| partners.quokka.com | | www.fusion.com |
| service.bfast.com | | www.fusion.com, www.lucy.com |
| 209.24.233.190 | | www.fusion.com |
| 216.35.185.221 | | ? |
| ad.linksynergy.com | | www.fusion.com |
| www.dxcart.com | | www.inchant.com |
As for why this is happening at all, it's best to consider Coremetrics' business. Coremetrics provides a means for Web site operators to outsource the job of collecting and analyzing web site data. Instead of providing a product that will serve this function, Coremetrics provides a service. Thus, there is a need for Coremetrics to be introduced into the conversation between the client and the server.
Information is being collected about surfers of Coremetrics' clients' sites without their knowledge and before they have the opportunity to decide whether they want to be tracked. Additionally, since they have not seen any descriptions of what's happening, they don't have the opportunity to see to what degree they're being tracked. Even those who don't mind banner advertising networks are likely to find the collection of their name, phone number, and email address to be extremely invasive. The fact that Coremetrics returns an invisible web bug instead of a visible image, perhaps one that would take the user to a description of Coremetrics and a list of exactly what data were collected in the transaction, is strong evidence to support the assertion that this system was designed to work without the users' knowledge. Indeed, were the system drawing too much attention to itself, it would be a nuisance and could make the system unusable. So how quiet is quiet enough to avoid being a nuisance and how quiet is an attempt to avoid detection?
This approach relies on Coremetrics to continue to do the Right Thing with regard to its opt-out records and its handling of data internally. The nature of HTTP cookies requires that if the system is enabled at all, the cookie in question will be sent along with the request to Coremetrics. Coremetrics cannot know whether it is to save the information sent until after it receives the information and looks at the value of the cookie. As a result, even those who have completely opted out of the system have their data reported; it's up to Coremetrics to honor the web surfer's request to have the information ignored. If Coremetrics changes its policy and begins to read data marked for ``opt out'', there's no way for anyone to tell.
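The ordering problem described above can be made concrete with a small sketch of a cookie-based opt-out as seen from the receiving side. The cookie name `optout`, its value `1`, and the function names are hypothetical; the actual names used by Coremetrics are not given here. The point is only that the report is already in the receiver's hands before the opt-out cookie can be examined.

```javascript
// Hypothetical opt-out cookie name; an assumption for this sketch.
const OPT_OUT_COOKIE = "optout";

// Parse a Cookie request header ("a=1; b=2") into a plain object.
function parseCookies(header) {
  const cookies = {};
  for (const part of (header || "").split(";")) {
    const eq = part.indexOf("=");
    if (eq > 0) {
      cookies[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
    }
  }
  return cookies;
}

// Decide whether to store a report. Note that `fields` has already
// crossed the network by the time this function runs; discarding it
// is purely a matter of the receiver's policy.
function handleReport(cookieHeader, fields, database) {
  const optedOut = parseCookies(cookieHeader)[OPT_OUT_COOKIE] === "1";
  if (!optedOut) {
    database.push(fields);
  }
  return { received: fields, stored: !optedOut };
}

const database = [];
const result = handleReport("optout=1", { fn: "Jane", ln: "Doe" }, database);
console.log(result.stored, Object.keys(result.received));
```

Changing the single `if (!optedOut)` test would silently change the policy, and nothing visible to the user would differ.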
Failures happen in software [2], people change browsers [3], and people will sometimes use other computers. In any of these cases, the opt out mechanism is defeated. As a result, those who have explicitly opted out are once again being tracked with extreme detail, often without their knowledge. We believe that reliance upon such an unpredictable mechanism is unworkable.
There is no guarantee about what will happen with the database that is built. What mechanisms are there in place to prevent the data from falling into the wrong hands--perhaps those who would like to use the data for blackmail--or being used in ways completely unrelated to the original purpose--perhaps being subpoenaed by a court that wants to know what a given user's activity on the web has been recently? Even if Coremetrics does take reasonable precautions to ensure the safety of the data and would fight such subpoenas, what happens if Coremetrics is bought by another company without the same convictions? Readers who consider such scenarios to be unbelievable might be interested to learn that Netscape Communications Corporation has still-unaddressed privacy problems with its Smart Browsing feature. These problems were first raised in 1998 [1]. Netscape has since been bought by America Online.
The primary difference between Coremetrics' possession of this information and some other apparently related situations is that Coremetrics, as a service provider to the vendor, does not own the data. Coremetrics doesn't have the option of selling that which is not theirs. Nevertheless, the potential for mishandling is present, as is the possibility of the data being stolen despite taking every reasonable precaution.
This is the tricky part about privacy: once private information is disclosed, there's no going back. There is no remedy against exposure of private information.
The fifth problem is that data in the Coremetrics database is likely to be taken as reliable. What is to prevent someone from creating a simple program that will constantly feed bogus information to the Coremetrics data collector? We estimate that a web programmer could create and debug such a program from scratch in no more than a few hours. Individuals could have a similar effect by adding a few lines of HTML on their web sites. For the purposes stated by Coremetrics, this margin of uncertainty isn't highly important. But considering this in the context of unintended use of the database, there could be very serious consequences.
Descriptions of the Coremetrics service seem to indicate that the tool is used only to identify visitors of a particular site, that demographic data provided to a given site operator is only data collected from his site. If this is true, there is only one imaginable reason for the Coremetrics architecture to work as it does, that is, by having everything from every Coremetrics-enabled site being reported to a single source. That reason is the ability for a global opt-out of all Coremetrics tracking, irrespective of the site that calls the Coremetrics code.
The use of a persistent cookie allows Coremetrics to track a user as he moves from Coremetrics-enabled site to Coremetrics-enabled site. Were this cookie different for each site, the risk posed to the Web user would be less significant, as multi-site profiling would be rendered much more difficult. However, it would require that an opt-out take place on each of the sites that the user visits. Whether to make the cookie global or local is a key decision, one that we would have made differently.
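To illustrate why the scope of the cookie matters, the following sketch groups reports by cookie identifier. The record layout (`cookieId`, `site`, `fields`), the identifiers, and the example host names are assumptions made for this example; the field codes `fn`, `ln`, and `se` are taken from the variable tables in this paper. With one global identifier, activity on two sites merges into a single profile; with per-site identifiers it does not.

```javascript
// Group reports into profiles keyed by the cookie identifier.
function linkByCookie(reports) {
  const profiles = new Map();
  for (const { cookieId, site, fields } of reports) {
    if (!profiles.has(cookieId)) {
      profiles.set(cookieId, { sites: [], fields: {} });
    }
    const profile = profiles.get(cookieId);
    if (!profile.sites.includes(site)) {
      profile.sites.push(site);
    }
    // Anything learned on one site is merged into the same profile.
    Object.assign(profile.fields, fields);
  }
  return profiles;
}

// One global cookie: the name given to one site is joined with the
// search terms entered on another.
const globalCookie = linkByCookie([
  { cookieId: "abc123", site: "www.example.com", fields: { fn: "Jane", ln: "Doe" } },
  { cookieId: "abc123", site: "www.example.org", fields: { se: "video" } },
]);

// Per-site cookies: the same visits stay in separate profiles.
const perSiteCookies = linkByCookie([
  { cookieId: "com-1", site: "www.example.com", fields: { fn: "Jane", ln: "Doe" } },
  { cookieId: "org-7", site: "www.example.org", fields: { se: "video" } },
]);

console.log(globalCookie.size, perSiteCookies.size);
```

Per-site identifiers do not make linking impossible, since a name or email address reported on both sites could still serve as a join key, but they remove the ready-made key that a global cookie provides.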
A trivial solution to this problem is the placing of Coremetrics data collectors in the domain name of the site using the service. So, if www.example.com is using Coremetrics, instead of having all data reported to data.coremetrics.com, it could be reported to coremetrics.example.com. Thus, only example.com data would be collected with that cookie. Another site that uses Coremetrics would have a different server, and a Web user who visits both sites will have different cookies for each site that he visits.
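A minimal sketch of this naming scheme follows. The helper names are hypothetical, and the rule used to find a site's domain (take the last two labels of the host name) is a deliberate simplification that fails for suffixes such as `co.uk`; in practice the collector host name would simply be configured per site.

```javascript
// Derive a per-site collector host name from the site's own host name.
// Naive assumption: the site's domain is its last two labels.
function perSiteCollector(siteHost) {
  const domain = siteHost.toLowerCase().split(".").slice(-2).join(".");
  return "coremetrics." + domain;
}

// A cookie set without a Domain attribute is returned only to the
// host that set it, so cookies from different collectors never mix.
function cookieWouldBeSent(cookieHost, requestHost) {
  return cookieHost.toLowerCase() === requestHost.toLowerCase();
}

const first = perSiteCollector("www.example.com");
const second = perSiteCollector("www.example.org");
console.log(first, second, cookieWouldBeSent(first, second));
```

Because the two collectors live under different host names, the browser keeps their cookies apart without any cooperation from the service provider.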
Access to data.coremetrics.com could be restricted, such that all attempts to report the data to Coremetrics will fail. This, of course, assumes that data do not start being sent to other sites for Coremetrics to start picking up.
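One way such a restriction might be expressed, for example inside a filtering proxy, is a simple host-name check. This is a sketch under assumptions: the function name and the idea of a block list held in a `Set` are ours, and the list contains only the one host named in this paper.

```javascript
// Hosts to which outgoing requests should be refused.
const BLOCKED_HOSTS = new Set(["data.coremetrics.com"]);

// Return true when a request URL targets a blocked host.
function isBlocked(requestUrl) {
  let host;
  try {
    host = new URL(requestUrl).hostname;
  } catch {
    // Not an absolute URL; nothing to block.
    return false;
  }
  return BLOCKED_HOSTS.has(host.toLowerCase());
}

console.log(isBlocked("http://data.coremetrics.com/cgi-bin/eluminate.cgi?ci=90000002"));
console.log(isBlocked("http://www.example.com/"));
```

As noted above, a list like this works only as long as the reports keep going to the same host; a new collector host name would have to be added by hand.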
Coremetrics' advertised service--providing Web site operators with information about how people use their site--is a potentially useful and legitimate service. However, the system's design and, even more importantly, how it is used in some cases needlessly place users of these sites at risk. The fact that the tracking device is obfuscated and hidden from view suggests that, despite Coremetrics' rhetoric, it does not place as high a priority on individual privacy as it should. The fact that the data collected are completely centralized, allowing activity across sites to be aggregated into a single profile, has no legitimate explanation.
The Internet is becoming an increasingly dangerous place, not because of technology, but because of people. There is nothing new about the technology in question. There is nothing particularly novel about this or any of these user-tracking systems that we have found; they're merely a rather obvious collection of features that were never intended to be used together, resulting in systems that spy on users in ways never envisioned by the technologies' creators. Other systems are merely the result of poorly-considered designs and implementations. Each of us may draw his own conclusions about which systems fall into which categories.
Members of the public who do not understand this technology are often afraid of it, apparently for good reason. Vendors must stop trying to make a buck off of every impression on their Web sites and start giving serious consideration to the long-term harm that is likely to come from making the Internet out to be the most dangerous place on earth to do business.
| Variable Name | Description |
| ba1 | Postal Mailing Address |
| ba2 | Unknown |
| be | Email address |
| bp | Price (of an item being viewed) |
| bs | State of residence |
| bt | City of residence |
| by | Country of residence |
| bz | ZIP code |
| ct | City of residence |
| fn | First name |
| gd | Unknown |
| hf | Unknown |
| ln | Last name |
| pm | Product Manufacturer or Description |
| pn | Product Information |
| rf | Referring URL |
| ul | URL of page with Coremetrics web bug |
| Variable Name | Description |
| a1 | Unknown (ActionName1) |
| a2 | Unknown (ActionName2) |
| a3 | Unknown (ActionName3) |
| ag | Unknown |
| at | Unknown Integer |
| ba1 | Postal Mailing Address |
| ba2 | Unknown |
| be | Email address |
| bp | Price (of an item being viewed) |
| bs | State of residence |
| bt | City of residence |
| by | Country of residence |
| bz | ZIP code |
| ccf | Unknown Integer |
| cd | Unknown Integer |
| cg | Description of items being viewed, e.g., ``Men's Watches'' |
| ci | Identifier assigned to the site by Coremetrics (ClientID) |
| cn | Name of item being viewed (ContentName) |
| ct | City of residence |
| fn | First name |
| gd | Unknown |
| hf | Unknown |
| ln | Last name |
| mf | Manufacturer of item being viewed |
| n1 | Unknown |
| n2 | Unknown |
| n3 | Unknown |
| n4 | Unknown |
| nl | Unknown |
| nw | Unknown |
| on | Unknown Integer |
| pa | PromotionName |
| pc | Boolean (PageCount) |
| pi | Identifier of the Current Page (ClientPageID) |
| pm | Product Manufacturer or Description |
| pn | Product Information (PageName) |
| pn1 | Unknown (PageExtraNumeric1) |
| pn2 | Unknown (PageExtraNumeric2) |
| pr | Unknown Number |
| ps1 | Part of the Referring URL (PageExtraString1) |
| ps2 | Unknown (PageExtraString2) |
| pt | Unknown One-Letter Code (PluginType) |
| qt | Unknown Integer |
| rf | Referring URL |
| rnd | A pseudorandom number generated by the browser in JavaScript |
| s1 | Unknown |
| s2 | Unknown |
| s3 | Unknown |
| s4 | Unknown |
| sa | State |
| sa1 | Postal Address |
| sa2 | Unknown |
| sc | Category of product being viewed--see cn (Parent ContentName) |
| sd | Unknown |
| se | Search terms (e.g., ``movie'', ``video'') (Search) |
| sg | Unknown number (formatted 9.99, perhaps some kind of product price) |
| sl | Unknown boolean |
| sp | Phone number |
| sr | Unknown boolean |
| ss | State |
| st | City |
| su | Unknown product category (e.g., ``product view'') |
| sy | Country |
| sz | ZIP Code |
| tp | Boolean value of TestPerm cookie (TestPerm) |
| tr | Unknown number (formatted 99.9) |
| ts | Boolean value of TestSess cookie (TestSess) |
| ul | URL of page with Coremetrics web bug |
| vn1 | Part of Coremetrics JavaScript code version (Version1) |
| vn2 | Part of Coremetrics JavaScript code version (Version2) |
| zp | ZIP Code |
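For readers who want to apply the table to a captured report, a lookup along the following lines may help. It is only a sketch: the function name and the decision to label unlisted codes as "not in table" are ours, and only a handful of entries from the table above are copied in.

```javascript
// A few entries copied from the variable table; extend as needed.
const VARIABLE_NAMES = new Map([
  ["fn", "First name"],
  ["ln", "Last name"],
  ["be", "Email address"],
  ["sp", "Phone number"],
  ["bz", "ZIP code"],
  ["rf", "Referring URL"],
  ["ul", "URL of page with Coremetrics web bug"],
]);

// Attach a human-readable label to each reported variable.
function labelFields(fields) {
  const labeled = {};
  for (const [code, value] of Object.entries(fields)) {
    const label = VARIABLE_NAMES.get(code) || "not in table";
    labeled[code + " (" + label + ")"] = value;
  }
  return labeled;
}

console.log(labelFields({ fn: "Jane", ln: "Doe", zz: "1" }));
```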
The code below comes from the www.toysrus.com site; other sites might have different versions of this software in place.
<!--
/* Pageview Data-Transport-Tag v.2.2.8, 04/13/2000;
COPYRIGHT 1999-2000 COREMETRICS, INC. ALL RIGHTS RESERVED. U.S.PATENT PENDING. */
/* This line is used to set your Coremetrics client ID.
It should match the client ID that you were given.
Please replace 99999999 with your client ID.*/
ClientID = "90000002";
/* Please do not modify anything below this line. */
PluginType="C";
Version1="e2.2.8";
var Version2;
if(Version2==null){
  Version2="e2.2";
}
var date=new Date();
var Rdm=date.getTime()%10000000;
var ReferralURL;
if (ReferralURL == null || ReferralURL == "" || ReferralURL == "(none)") {
  if (navigator.appName == "Microsoft Internet Explorer" &&
      parseFloat (navigator.appVersion) < 4) {
    ReferralURL="unavailable";
  } else if(document.referrer=="undefined"){
    ReferralURL="";
  } else ReferralURL=document.referrer;
}
URL=window.location.href;
TestSess=x06530905("TestSess");
TestPerm=x06530905("TestPerm");
if (TestSess != "Yes") {
  document.cookie="TestSess=Yes";
}
if (TestPerm!="Yes") {
  expiredate=new Date();
  expiredate.setHours(expiredate.getHours()+5);
  document.cookie="TestPerm=Yes;expires="+expiredate.toGMTString()+";"
}
TestSess=x06530905("TestSess");
TestPerm=x06530905("TestPerm");
arg="pt="+PluginType+"&vn1="+Version1+"&vn2="+Version2+
    "&ci="+ClientID+"&rf="+x08226(ReferralURL)+"&ul="+x08226(URL)+
    "&se="+x08226(Search)+"&pn="+x08226(PageName)+
    "&pi="+x08226(ClientPageID)+"&cn="+x08226(ContentName)+
    "&sc="+x08226(ParentContentName)+"&ps1="+x08226(PageExtraString1)+
    "&ps2="+x08226(PageExtraString2)+"&pn1="+x08226(PageExtraNumeric1)+
    "&pn2="+x08226(PageExtraNumeric2)+"&a1="+x08226(ActionName1)+"&a2="+
    x08226(ActionName2)+"&a3="+x08226(ActionName3)+"&pa="+
    x08226(PromotionName)+"&pc="+x08226(PageCount)+"&ts="+TestSess+
    "&tp="+TestPerm+"&rnd="+Rdm;
pl=document.location.protocol;
if (pl!="http:"&pl!="https:") {
  pl="http:";
}
prearg="<img width=\"1\" height=\"1\" src=\"";
prearg+=pl+"//";
prearg+="data.coremetrics.com/cgi-bin/eluminate.cgi?";
postarg="\">";
dummyImageURL="http://data.coremetrics.com/cgi-bin/eluminate.cgi?"+
    x08226(arg);
Display=(prearg+x08226(arg)+postarg);
function x36273(ff) {
  var i=0,j=0;
  while (ff.charAt(i)==" ") i++;
  j=ff.length-1;
  while (ff.charAt(j)==" ") j--;
  return ff.substring(i,j+1);
}
function x08226(s){
  var tmp;
  s=""+s;
  s=x36273(s);
  s=escape(s);
  while(s.indexOf("+")>=0){
    tmp=s.indexOf("+");
    s=s.substring(0,tmp)+"%2B"+s.substring(tmp+1,s.length);
  }
  return s;
}
function x06530905(name){
  var arg=name+"=";
  var alen=arg.length;
  var clen=document.cookie.length;
  var i=0;
  while(i<clen){
    var j=i+alen;
    if(document.cookie.substring(i,j)==arg)
      return x05844687532(j);
    i=document.cookie.indexOf(" ",i)+1;
    if(i==0)
      break;
  }
  return null;
}
function x05844687532(offset){
  var endstr=document.cookie.indexOf(";",offset);
  if (endstr==-1)
    endstr=document.cookie.length;
  return unescape(document.cookie.substring(offset,endstr));
}
//-->