
February 21 2011

Ludicrous data protection, made in Germany

Heise reports on a recent case in which the data protection commissioner of Lower Saxony objected to the use of the advertising program Google AdSense, the Amazon partner program, and the IVW pixel as violations of data protection law, and in addition classified web hosting as commissioned data processing (Auftragsdatenverarbeitung, within the meaning of § 11 BDSG). The operator of two Internet forums was ordered to stop transmitting personal data through the services Google AdSense, Amazon single-title links, and the IVW box, and to remove these applications from the source code. The letter from the State Commissioner for Data Protection, which I have in front of me, also shows that the authority believes hosting involves commissioned data processing under § 11 BDSG. The forum operator was asked to demonstrate that the commissioned data processing by his host provider (Host Europe) is permissible.

With measures like these, the data protection authorities are overshooting the mark by a wide margin. The logical conclusion of the Lower Saxony commissioner's position is ultimately that every website that advertises through partner programs or affiliate marketing violates data protection law.

Moreover, if one regards hosting as commissioned data processing within the meaning of § 11 BDSG, as the state commissioner does, then nearly all German websites would have to be taken offline, and mass hosters such as 1&1 and Strato could shut down their business immediately. For if the authority's legal view were correct, every website operator would have to conclude a written agreement on commissioned data processing with its host provider, one that meets the extremely strict requirements of § 11 para. 2 BDSG. The position of the Lower Saxony state commissioner can therefore safely be called ludicrous.

For me, the authorities' approach is also further evidence that German and European data protection law is still not up to the demands of the Internet age, and that the data protection authorities are aggravating this problem through an excessive interpretation of data protection provisions.

This is also questionable because the data protection authorities often enough fail to meet the requirements they themselves set. I demonstrated this here some time ago using the example of the Hamburg data protection commissioner.

The same is true of the Lower Saxony commissioner acting in this case, as a look at his own privacy policy shows. To begin with, it cites the TDG and the MDStV, statutes that have not existed for years.

Nor does the privacy policy satisfy the requirements of the law currently in force, § 13 TMG. The commissioner does not provide sufficient information about the type, scope, and purposes of the collection and use of personal data. In particular, it does not explain exactly which data are collected when the server "lfd.niedersachsen.de" is accessed. If data are stored in a log file, as stated, then this is presumably nothing other than the web server's log files. And those are precisely where the IP addresses of requesters are recorded. Going by the commissioner's own privacy policy, one must therefore assume that the web server "lfd.niedersachsen.de" stores IP addresses, and does so for a period of two months.
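To illustrate why a web server log file normally contains IP addresses, here is a minimal sketch that parses one log line. It assumes the server writes Apache's Common Log Format, which the privacy policy does not actually state; the log line, the requested path, and the address (taken from a range reserved for documentation) are invented for the example.

```python
import re

# Hypothetical access-log line in Common Log Format (an assumption; the
# real server's log format is not known).
LOG_LINE = '192.0.2.17 - - [21/Feb/2011:10:15:32 +0100] "GET /privacy.html HTTP/1.1" 200 5120'

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_log_line(line):
    """Return the fields of one Common Log Format line as a dict, or None."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


entry = parse_log_line(LOG_LINE)
print(entry["host"])  # the requester's IP address is the first field
```

In this format the requester's address is the very first field of every line, so keeping the log for two months means keeping the addresses for two months.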

The privacy policy of the Lower Saxony commissioner further states that all publicly accessible pages can be used without cookies being set. This statement is simply false. The commissioner's server sets a cookie as soon as the privacy page is called up, as the following screenshot shows:

In sum, the privacy policy of the Lower Saxony data protection commissioner is outdated, incomplete, and wrong. This shows once again that the data protection authorities do not themselves meet the high standards they demand of others.

June 25 2010

Online advertising and data protection

The so-called "Art. 29 Group" of the EU, a body of data protection officials, published an opinion on "Online Behavioural Advertising" on 22 June 2010, criticizing that online advertising in many cases does not comply with the requirements of European data protection law.

The Federal Data Protection Commissioner, Schaar, who is a member of the group, points out in a press release, among other things, that cookies are often placed on users' computers without their explicit consent, and that users are often not adequately informed when their data are collected.

December 22 2009

Being online: Your identity to advertisers--it's not all about you

Thy self thou gav'st, thy own worth then not knowing

(This post is the fourth in a series called "Being online: identity, anonymity, and all things in between.")



Voracious data foraging leads advertisers along two paths. One of
their aims is to differentiate you from other people. If vendors know
what condiments you put in your lunch or what material you like your
boots made from, they can pinpoint their ads and promotions more
precisely at you. That's why they love it when you volunteer that
information on your blog or social network, just as do the college
development staff we examined before.

The companies' second aim is to insert you into a group of people for
which they can design a unified marketing campaign. That is, in
addition to differentiation, they want demographics.

The first aim, differentiation, is fairly easy to understand. Imagine
you are browsing web sites about colic. An observer (and I'll discuss
in a moment how observations take place) can file away the reasonable
deduction that there is a baby in your life, and can load your browser
window with ads for diapers and formula. This is called behavioral
advertising.

Since behavioral advertising is normally a pretty smooth operator, you
may find it fun to try a little experiment that could lift the curtain
on it a bit. Hand your computer over for a few hours to a friend or
family member who differs from you a great deal in interests, age,
gender, or other traits. (Choose somebody you trust, of course.) Let
him or her browse the web and carry on his or her normal business.
When you return and resume your own regular activities, check the ads
in your browser windows, which will probably take on a slant you never
saw before. Of course, the marketers reading this article will be
annoyed that I asked you to pollute their data this way.

Experiences like this might arouse you to be conscious of every online
twitch and scratch, just as you may feel in real life in the presence
of a security guard whose suspicion you've aroused, or when on stage,
or just being a normal teenager. Online, paranoia is level-headedness.
Someone indeed is collecting everything they can about you: the amount
of time you spend on one page before moving on to the next, the links
you click on, the search terms you enter. But it's all being collected
by a computer, and no human eyes are ever likely to gaze upon it.

Your identity in the computerized eyes of the advertiser is a strange
pastiche of events from your past. As mentioned at the beginning of
the article, Google's Dashboard lets you see what Google knows about
you, and even remove items--an impressive concession for a company
that has mastered better than any other how to collect information on
casual Web users and build a business on it. Of course, you have to
establish an identity with them before you can check what they know
about your identity. This is not the last irony we'll encounter when
exploring identity.

But advertisers do more than direct targeting, and I actually find
the other path their tracking takes--demographic analysis--more
problematic. Let's return to the colicky baby example. Advertisers add
you to their collection of known (or assumed) baby caretakers and tag
your record with related information to help them understand the
general category of "baby care." Anything they know about your age,
income, and other traits helps them understand modern parenting.

As I wrote over a decade ago,
this kind of data mining typecasts us and encourages us to head down
well-worn paths. Unlike differentiation, demographics affect you
whether or not you play the game. Even if you don't go online, the
activities of other people like you determine how companies judge your
needs.

The latest stage in the evolution of demographic data mining is
sentiment analysis, which trawls through social networking messages to
measure the pulse of the public on some issue chosen by the
researcher. A crude application of sentiment analysis is to search for
"love" or "hate" followed by a product trademark, but the natural
language processing can become amazingly subtle. Once the data is
parsed, companies can track, for instance, the immediate reaction to a
product release, and then how that reaction changed after a review or
ad was widely disseminated. Results affect not only advertising but
product development.
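The crude version of sentiment analysis mentioned above can be sketched in a few lines. This is only an illustration of the "love" or "hate" keyword search, not how any real service works; the product name "Acme skis" and the sample messages are invented.

```python
import re
from collections import Counter


def crude_sentiment(messages, product):
    """Count messages that say "love" or "hate" directly before the product name."""
    pattern = re.compile(
        r"\b(love|hate)s?\s+(?:the\s+|my\s+)?" + re.escape(product) + r"\b",
        re.IGNORECASE,
    )
    counts = Counter()
    for message in messages:
        for match in pattern.finditer(message):
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical status messages mentioning a made-up product.
messages = [
    "I love my Acme skis, best purchase this winter",
    "Honestly I hate Acme skis after that binding failure",
    "She loves the Acme skis she rented",
    "Acme skis are fine, I guess",
]
counts = crude_sentiment(messages, "Acme skis")
print(counts["love"], counts["hate"])
```

The fourth message shows the limit of the approach: a lukewarm opinion with no keyword is simply not counted, which is where subtler natural language processing comes in.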

Once again, my reaction to sentiment analysis mixes respect for its
technical sophistication with worries about what it does to our
independence. If you add your voice to the Twittersphere, it may be
used by people you'll never know to draw far-reaching conclusions. On
the other hand, if you refuse to participate, your opinion will be
lost.

Google's Dashboard tells you only what they preserve on you
personally, not the aggregated statistics they calculate that
presumably include anonymous browsing. But you can peek at those as
well, and carry on some rough sentiment analysis of your own, through
Google Trends.

Considering all this demographic analysis (behavioral, sentiment, and
other) catapults me into a bit of a 21st-century-style existential
crisis. If a marketer is able to combine facts about my age, income,
place of birth, and purchases to accurately predict that I'll want a
particular song or piece of clothing, how can I flaunt my identity as
an autonomous individual?

Perhaps we should resolve to face the brave new world stoically and
help the companies pursue their goals. Social networking sites are
developing APIs and standards that allow you to copy information
easily between them. For instance, there are sites that let you
simultaneously post the same message instantly to both Twitter and
Facebook. I think we should all step up and use these services. After
all, if your off-the-cuff Tweet about your skis from the lounge of a
ski resort goes into planning a multimillion dollar campaign, wouldn't
it be irresponsible to send the advertiser mixed messages?

My call to action sounds silly, of course, because the data gathering
and analysis will obviously not be swayed by a single Tweet. In fact,
sophisticated forms of data mining depend on the recent upsurge of new
members onto the forums where the information is collected. The volume
of status messages has to be so high that idiosyncrasies get ironed
out. And companies must also trust that the margin of error caused by
malicious competitors or other actors will be negligible.

We saw in an earlier section that your online presence is signaled by
a slim swath of information. At the low end, marketers know only your
approximate location through your IP address. At the other extreme
they can feast on the data provided by someone who not only logs into
a site--creating a persistent identity--but fills out a form with
demographic information (which the vendor hopes is truthful).

As another example of modern data-driven advertising, Facebook
delivers ads to you based on the information you enter there, such as age
and marital status. A tech journal reported that the Google Droid phone
combines contacts from many sources, but I haven't experienced this on
my Droid and I don't see technically how it could be done.

Most browsing takes place in an identity zone lying between the IP
address and the filled-out profile. We saw this zone in my earlier
example from the coffee shop. The visitor does not identify himself,
but lets the browser accept a cookie by default from each site.

Each cookie--so long as you don't take action to remove one, as I did
in my experiment--is returned to the server that left it on your
browser. If you use a different browser, the server doesn't know
you're the same person, and if a family member uses your browser to
visit the same server, it doesn't know you're different people.

Because the browser returns the cookie only to servers from the same
domain--say, yahoo.com--that sent the cookie, your identity
is automatically segmented. Whatever yahoo.com knows about
you, oreilly.com and google.com do not. Servers can
also subdivide domains, so that mail.yahoo.com can use the
cookie to keep track of your preferred mail settings while
weather.yahoo.com serves meteorological information
appropriate for your location.
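The domain rule can be sketched as a small matching function. This is a simplified illustration, assuming a cookie is sent to the domain it was set for and to that domain's subdomains; real browsers also check paths, expiry dates, and other attributes. The cookie names and values are invented.

```python
def cookie_applies(cookie_domain, request_host):
    """Return True if a cookie set for cookie_domain is sent to request_host.

    Simplified matching: the host must equal the cookie's domain or be a
    subdomain of it. Path, expiry, and other attributes are ignored here.
    """
    cookie_domain = cookie_domain.lower().lstrip(".")
    request_host = request_host.lower()
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


# Hypothetical cookie jar: (domain the cookie was set for, name, value).
jar = [
    ("yahoo.com", "region", "boston"),
    ("mail.yahoo.com", "layout", "compact"),
    ("oreilly.com", "session", "abc123"),
]


def cookies_for(host):
    """List the cookie names a browser would attach to a request for host."""
    return [name for domain, name, _ in jar if cookie_applies(domain, host)]


print(cookies_for("mail.yahoo.com"))
print(cookies_for("weather.yahoo.com"))
print(cookies_for("google.com"))
```

Under these assumptions the mail server sees both the site-wide cookie and its own, the weather server sees only the site-wide one, and google.com sees nothing.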

This wall between cookies would seem to protect your browsing and
purchasing habits from being dumped into a large vat and served up to
advertisers. But for every technical measure protecting privacy, there
is another technical trick that clever companies can use to breach
privacy. In the case of cookies, the trick exploits the ability of a
web page to display content from multiple domains simultaneously. Such
flexibility in serving domains is normally used (aside from tweaks to
improve performance) to embed images from one domain in a web page
sent by another, and in particular to embed advertising images.

Now, if advertisers all contract with a single ad agency, such as
DoubleClick (the biggest of the online ad companies), all the ads from
different vendors are served under the doubleclick.com domain and can
retrieve the same cookie. You don't have to click on an ad for the
cookie to be returned. Furthermore, each ad knows the page on which it
was displayed.

Therefore, if you visit web pages about colic, skis, and Internet
privacy at various times, and if DoubleClick shows an ad on each page,
it can tell that the same person viewed those disparate topics and use
that information to choose ads for future pages you visit. In the
United States, unlike other countries, no laws prohibit DoubleClick
from sharing that information with anyone it wants. Furthermore, each
advertiser knows whether you click on their ad and what activity you
carry on subsequently at their site, including any purchases you make
and any personal information you fill out in a form.
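The mechanism described above can be sketched as a toy simulation. The classes, the visitor id format, and the page names are all invented for illustration; this models only the idea that one ad domain receives the same cookie from every page that embeds its ads, not DoubleClick's actual system.

```python
import itertools


class AdServer:
    """Toy ad server that recognizes a browser by its cookie."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.history = {}  # cookie id -> list of pages where an ad was shown

    def serve_ad(self, cookie, referring_page):
        """Record the page showing the ad; return the cookie to store."""
        if cookie is None:
            cookie = f"visitor-{next(self._ids)}"  # first contact: issue an id
        self.history.setdefault(cookie, []).append(referring_page)
        return cookie


class Browser:
    """Toy browser that keeps one cookie per domain."""

    def __init__(self):
        self.cookies = {}

    def visit(self, page, ad_domain, ad_server):
        # Loading the page also fetches the embedded ad, so the ad domain's
        # cookie goes back to the ad server along with the referring page.
        cookie = self.cookies.get(ad_domain)
        self.cookies[ad_domain] = ad_server.serve_ad(cookie, page)


ads = AdServer()
me = Browser()
for page in ["colic-advice.example", "ski-reviews.example", "privacy-news.example"]:
    me.visit(page, "doubleclick.com", ads)

print(ads.history)
```

No ad is ever clicked in this run, yet the ad server ends up with a single visitor record listing all three unrelated pages.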

Put it all together, and you are probably far from anonymous on the
Internet. In addition, a more recent form of persistent data,
controlled by the popular Flash environment through a technology
called local shared objects, makes promiscuous sharing easy and
removing the information much harder.

The purchase of DoubleClick in 2007 by Google, which already had more
information on individuals than anybody else, spurred a great protest
from the privacy community, and the FTC took a hard look before
approving the merger. A similar controversy may surround Google's
recently announced purchase of AdMob, which provides a service similar
to DoubleClick for advertisers on mobile phones.

So far I've just covered everyday corporate treatment of web browsing
and e-commerce. The frontiers of data mining extend far into
the rich veins of user content.

Deep packet inspection allows your Internet provider to snoop on your
traffic. Normally, the ISP is supposed to look only at the IP address
on each packet, but some ISPs check inside the packet's content for
various reasons that could redound to your benefit (if it squelches a
computer virus) or detriment (if it truncates a file-sharing session).
I haven't heard of any ISPs using this kind of inspection for
marketing, but many predictions have been aired that we'll cross that
frontier.

Governments have been snooping at the hubs that route Internet traffic
for years. China simply blocks references to domains, IP addresses, or
topics it finds dangerous, and monitors individuals for other
suspected behavior. The Bush administration and American telephone
companies got into hot water for collecting large gobs of traffic
without a court order. But for years before that, the Echelon project
was filtering all international traffic that entered or left the US
and several of its allies.

One alternative to being tossed on the waves of marketing is to join
the experiments in Vendor Relationship Management (VRM), which I
covered in a recent blog.
Although not really implemented anywhere yet, this movement holds out
the promise that we can put out bids for what we want and get back
proposals for products and services. Maybe VRM will make us devote
more conscious thinking to how we present ourselves online--and how
many selves we want to present. These are the subjects of the next section.

The posts in "Being online: identity, anonymity, and all things in between" are:


  1. Introduction
  2. Being online: Your identity in real life--what people know
  3. Your identity online: getting down to basics
  4. Your identity to advertisers: it's not all about you (this post)
  5. What you say about yourself, or selves (to be posted December 24)
  6. Forged identities and non-identities (to be posted December 26)
  7. Group identities and social network identities (to be posted December 28)
  8. Conclusion: identity narratives (to be posted December 30)

December 20 2009

Being online: Your identity online--getting down to basics



What men daily do, not knowing what they do!

(This post is the third in a series called "Being online: identity, anonymity, and all things in between.")

Previous posts in this series explored the various identities
that track you in real life. Now we can look at the traits that
constitute your identity online. A little case study may show how
fluid these are.

One day I drove from the Boston area a hundred miles west and logged
into the wireless network provided by an Amherst coffee shop in
Western Massachusetts. I visited the Yahoo! home page and noticed that
I was being served news headlines from my home town. This was a bit
disconcerting because I had a Yahoo! account but I wasn't logged into
it. Clearly, Yahoo! still knew quite a bit about me, thanks to a
cookie it had placed on my browser from previous visits.


[A cookie, in generic computer jargon, is a small piece of data that a
program leaves on a system as a marker. The cookie has a special
meaning that only the program understands, and can be retrieved later
by the program to recall what was done earlier on the system. Browsers
allow web sites to leave cookies, and preserve security by serving
each cookie only to the web site that left it (we'll see in a later
section how this limitation can be subverted by data gatherers).]

Among the ads I saw was one for the local newspaper in my town.
Technically, it would be possible for Yahoo! to pass my name to the
newspaper so it could check whether I was already a
subscriber. However, the Yahoo! privacy policy promises not to do this
and I'm sure they don't.

As an experiment, I removed the Yahoo! cookie (it's easy to do if you
hunt around in your browser's Options or Preferences menu) and
revisited the Yahoo! home page. This time, news headlines for Western
Massachusetts were displayed. Yahoo! had no idea who I was, but knew I
was logging in from an Internet service provider (ISP) in or near
Amherst.

What Yahoo! had on me was a minimal Internet identity: an IP address
provided by the Internet Protocol. These addresses, which usually
appear in human-readable form as four numbers like 150.0.20.1, bear no
intrinsic geographic association. But they are handed out in a
hierarchical fashion, which allows a pretty good match-up with
location. At the top of the address allocation system stand five
registries that cover areas the size of continents. These give out
huge blocks of addresses to smaller regions, which further subdivide
the blocks of addresses and give them out on a smaller and smaller
scale, until local organizations get ranges of addresses for
their own use.
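The nesting of address blocks can be illustrated with Python's standard ipaddress module. The allocation chain below is invented: the block sizes and holder names are assumptions chosen only to show how an address such as 150.0.20.1 falls inside progressively smaller blocks.

```python
import ipaddress

address = ipaddress.ip_address("150.0.20.1")

# Hypothetical allocation chain, from a continental registry down to a local
# organization. The block sizes and owners are invented for illustration.
allocations = [
    ("regional registry", ipaddress.ip_network("150.0.0.0/8")),
    ("national ISP", ipaddress.ip_network("150.0.0.0/16")),
    ("local organization", ipaddress.ip_network("150.0.20.0/24")),
]


def owners_of(addr):
    """Return the holders whose blocks contain addr, from largest block to smallest."""
    return [name for name, block in allocations if addr in block]


print(owners_of(address))
```

Because each holder's block sits inside its parent's, knowing who holds the smallest block containing an address is usually enough to guess roughly where the machine is.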

Yahoo! simply had to look up the ISP associated with my particular IP
address to determine I was in Western Massachusetts. But the
technology is a bit more complicated than that. I was actually
associated with three IP addresses--a complexity that shows how the
fuzziness of identity on the Internet extends even to the lowest
technological levels.

First, when I logged in to the coffee shop's wireless hub, it gave me
a randomly chosen IP address that was meaningful only on its own local
network. In other words, this IP address could be used only by the
hub and anyone logged in to the hub.

The hub used an aged but still vigorous technology known as Network
Address Translation to send data from my system out to its ISP. As my
traffic emanated from the coffee shop, it bore a new address
associated with the coffee shop's wireless hub, not with me
personally. All the people in the coffee shop can share a single
address, because the hub associates other unique identifiers--port
numbers--with our different streams of traffic.

But the ISP treats the coffee shop as the coffee shop treats me. The
coffee shop's own address is itself a temporary address that is
meaningful to the local network run by the ISP. A second translation
occurs to give my traffic an identity associated with the ISP. This
third address, finally, is meaningful on a world scale. It is the only
one of the three addresses seen by Yahoo!.
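The two layers of translation can be sketched with a toy translator applied twice. All addresses and port numbers are invented (192.168.x.x and 10.x.x.x are private ranges, 203.0.113.x is reserved for documentation), and real Network Address Translation also tracks protocols and timeouts that are left out here.

```python
class Nat:
    """Toy network address translator that rewrites (address, port) pairs."""

    def __init__(self, public_address, first_port=40000):
        self.public_address = public_address
        self.next_port = first_port
        self.table = {}  # (inner address, inner port) -> outer port

    def translate(self, inner_address, inner_port):
        """Map an inner (address, port) to this translator's outer pair."""
        key = (inner_address, inner_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        return self.public_address, self.table[key]


# Hypothetical addresses for the hub and the ISP.
coffee_shop = Nat("10.8.3.17")    # hub's address on the ISP's network
isp = Nat("203.0.113.5")          # address visible to the wider Internet

laptop = ("192.168.1.23", 51000)          # first address: local to the hub
at_isp = coffee_shop.translate(*laptop)   # second address
on_internet = isp.translate(*at_isp)      # third address, seen by Yahoo!

print(laptop, at_isp, on_internet)
```

Each translator shares one outer address among all its inner clients and tells them apart only by the port number it assigns, which is why the port matters as much as the address when tracing traffic back.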

However, an investigator (hopefully after getting a subpoena) could
ask an ISP for the identity of any of its customers, submitting the
global IP address and port numbers along with the date and time of
access. The coffee shop didn't require any personal information before
logging me in and therefore could not fulfill an investigator's
request, but a person doing illegal file transfers or other socially
disapproved activity from a home or office would be known to the hub
system and could therefore be identified--so long as logfiles with
this information had not been deleted from the hub.
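The lookup an investigator relies on can be sketched as a search through translation records. The log layout, account names, addresses, and times below are all invented for illustration; actual ISP records will differ.

```python
from datetime import datetime

# Hypothetical translation log kept by an ISP or hub: each record says which
# customer held a given public (address, port) during a time window.
nat_log = [
    {"customer": "account-1041", "address": "203.0.113.5", "port": 40000,
     "start": datetime(2009, 12, 20, 14, 0), "end": datetime(2009, 12, 20, 14, 30)},
    {"customer": "account-2207", "address": "203.0.113.5", "port": 40000,
     "start": datetime(2009, 12, 20, 15, 0), "end": datetime(2009, 12, 20, 15, 45)},
]


def identify(address, port, when):
    """Return the customer who held (address, port) at time `when`, or None."""
    for record in nat_log:
        if (record["address"] == address and record["port"] == port
                and record["start"] <= when <= record["end"]):
            return record["customer"]
    return None


print(identify("203.0.113.5", 40000, datetime(2009, 12, 20, 15, 10)))
```

The same address and port belong to different customers at different times, so the date and time are what make the match unambiguous, and a deleted log leaves nothing to match against.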

The combination of IP address, port numbers, and date and time allows
the Recording Industry Association of America to catch people who
offer copyrighted music without authorization. And this technological
mechanism underlies the European Union requirement for ISPs to keep
the information they log about customer use, as mentioned in the first
section of this article.

If I want to hide this minimal Internet identity--the IP address--I
have to use another Internet account as a proxy. In the case of my
visit to Western Massachusetts, I was protected by logging in
anonymously to a coffee shop, but in some countries I'd be required to
use a credit card to gain access, and therefore to bind all my web
surfing to a strong real-world identity. Many European countries
require this form of identification, outlawing open wireless networks.

To generalize from my Amherst experiment, the information we provide
as we use the Internet is very limited, and can be limited even
further through simple measures such as removing cookies (a topic
covered further in a later section of this article). But what the
Internet still allows can be used in a supple manner to respond
instantly with ads and other material--such as the nearest coffee shop
or geographically relevant weather reports--that are hopefully of
greater value than the corresponding material in print publications we
peruse.

This post has explored the use of IP addresses metaphorically, as
well as illustratively, to show how our Internet identity is
context-sensitive and can change utterly from one setting to another.
Usually, we provide more of a handle to the people we communicate with
over email, instant messaging, forums, and so forth. Here too we have
multiple identities and spend hours collecting each other's handles.

Email, the oldest form of personal online communication, ironically
has one of the better hacks for combining identities. Your email
accounts can be set up to forward mail, so that mail to the address
you kept from your alma mater goes automatically to your work address.

In contrast, you can't use your AIM instant message account to contact
someone on MSN, so you need a separate account on each IM service and
no one will know they all represent you unless you tell them. Twitter
is experimenting with ways to assure users that accounts with
well-known names are truly associated with the people after whom
they're named.

If IM services all agreed to use XMPP (or some other protocol) you
could reduce all your IM accounts to one. And if every social network
supported OpenSocial, you could do a lot of networking while
maintaining an account on just one service.

A widely adopted protocol called OpenID allows one identity to support
another: if you have an account on Yahoo! or Blogger you can use it to
back up your assertion of identity on another site that accepts their
OpenID tokens. OpenID and related technologies such as Information
Card don't validate your existence or authenticate the personal traits
you have outside the Internet, but allow the identity you've built up
on one site to be transferable.

My next post shows how the minimal elements of online identity
have been expanded by advertisers and other companies, who combine the
various retrievable polyps of our identity. Following that, we'll see
how we ourselves manipulate our identities and forge new ones.

The posts in "Being online: identity, anonymity, and all things in between" are:


  1. Introduction
  2. Being online: Your identity in real life--what people know
  3. Your identity online: getting down to basics (this post)
  4. Your identity to advertisers: it's not all about you (to be posted December 22)
  5. What you say about yourself, or selves (to be posted December 24)
  6. Forged identities and non-identities (to be posted December 26)
  7. Group identities and social network identities (to be posted December 28)
  8. Conclusion: identity narratives (to be posted December 30)

November 11 2009

