
January 02 2014

The Snapchat Leak

The number of Snapchat users by area code.

The number of Snapchat users by geographic location. Users are predominantly located in New York, San Francisco and the surrounding greater New York and Bay Areas.

While the site crumbled quickly under the weight of so many people trying to get to the leaked data—and has now been suspended—there isn’t really such a thing as putting the genie back in the bottle on the Internet.

Just before Christmas the Australia-based Gibson Security published a report highlighting two exploits in the Snapchat API, claiming that hackers could easily gain access to users’ personal data. Snapchat dismissed the report, responding that,

Theoretically, if someone were able to upload a huge set of phone numbers, like every number in an area code, or every possible number in the U.S., they could create a database of the results and match usernames to phone numbers that way.

Adding that they had various “safeguards” in place to make it difficult to do that. However, it seems likely that—despite the issue being explicitly raised in the initial report four months previously—none of these safeguards included rate limiting requests to their servers, because someone seems to have taken them up on their offer.

Data Release

Earlier today the creators of the now defunct SnapchatDB site released 4.6 million records—both as an SQL dump and as a CSV file. With the app estimated to have had 8 million users as of May 2013, this represents around half the Snapchat user base.

Each record consists of a Snapchat user name, a geographical location for the user, and a partially anonymised phone number, the last two digits of which have been obscured.

While Gibson Security’s find_friends exploit has been patched by Snapchat, minor variations on the exploit are reported to still function, and if this data did come from the exploit—or a minor variation on it—uncovered by Gibson, then the dataset published by SnapchatDB is only part of the data the hackers now hold.

In addition to the data already released they would have the full phone number of each user, and as well as the user name they should also have the—perhaps more revealing—screen name.

Data Analysis

Taking an initial look at the data, there are no international numbers in the leaked database. All entries are US numbers, with the bulk of the users coming—as you might expect—from the greater New York and San Francisco Bay areas.

However I’d assume that the absence of international numbers is an indication of laziness rather than of any technical limitation. For US-based hackers it would be easy to iterate rapidly through the fairly predictable US number space, while foreign number formats might present more of a challenge when writing a script to exploit the hole in Snapchat’s security.

Only 76 of the 322 area codes in the United States appear in the leaked database, alongside another two Canadian area codes, mapping to 67 discrete geographic locations—although not all the area codes and locations match, suggesting that perhaps the locations aren’t derived directly from the area code data.
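
Those counts are easy to check for yourself. A rough Python sketch along the following lines would do it; the file name and the column order (phone number, username, location) are assumptions on my part rather than a description of the actual dump.

    import csv
    from collections import Counter

    # Tally area codes and locations in the leaked CSV.
    # Assumed column order: phone number (last two digits as "XX"), username, location.
    area_codes = Counter()
    locations = Counter()

    with open("snapchatdb.csv", newline="") as f:
        for phone, username, location in csv.reader(f):
            area_codes[phone.strip()[:3]] += 1   # first three digits of a US/Canadian number
            locations[location.strip()] += 1

    print(len(area_codes), "distinct area codes,", len(locations), "distinct locations")
    print(area_codes.most_common(10))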

Despite some initial scepticism about the provenance of the data I’ve confirmed that this is a real data set. A quick trawl through the data has got multiple hits amongst my own friend group, including some I didn’t know were on Snapchat—sorry guys.

Since the last two digits in the leaked dataset are obscured, a partial phone number string might—and frequently does—generate multiple matches amongst the 4.6 million records when compared against a full number.

I compared the several hundred US phone numbers amongst my own contacts against the database—you might want to do that yourself—and generated several spurious hits where the returned user names didn’t really seem to map in any way to my contact. That said, as I already mentioned, I found several of my own friends amongst the leaked records, although I only knew it was them for sure because I knew both their phone number and typical choices of user names.
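
If you want to run your own contacts against the dump, a minimal sketch of that comparison might look like the following. The contact list here is obviously hypothetical, and the column order is again an assumption; matching on only the first eight digits is exactly what produces the spurious hits.

    import csv
    import re

    def digits(number):
        """Reduce a phone number to its last ten digits."""
        return re.sub(r"\D", "", number)[-10:]

    # Hypothetical contact list; in practice you would export this from your address book.
    contacts = {"Alice": "+1 212 555 0187", "Bob": "(415) 555-0142"}

    # Index the leaked records by the first eight digits of each (partially obscured) number.
    leaked = {}
    with open("snapchatdb.csv", newline="") as f:
        for phone, username, location in csv.reader(f):
            leaked.setdefault(digits(phone)[:8], []).append(username)

    for name, number in contacts.items():
        hits = leaked.get(digits(number)[:8], [])
        if hits:
            print(name, "->", hits)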

Conclusions

As it stands, therefore, this data release is not—yet—critical, although it is certainly concerning, and for some individuals it might well prove unfortunate. However, if the SnapchatDB creators choose to release their full dataset, things might well get a lot more interesting.

If the full data set were released to the public, or obtained by a malicious third party, then the username, geographic location, phone number, and screen name—which might, for a lot of people, be their actual full name—would all be available.

This eventuality would be bad enough. However, if you took this data and cross-correlated it with another large corpus, say from Twitter or Gravatar, by trying to find matching user or real names on those services—people tend to reuse usernames across multiple services, after all—you might end up with a much larger aggregated data set including email addresses, photographs, and personal information.

While there would be enough false positives—if matching solely against user names—that you’d have an interesting data cleaning task afterwards, it wouldn’t be impossible. Possibly not even that difficult.

I’m not interested in doing that correlation myself. But others will.

 

August 16 2013

The vanishing cost of guessing

http://radar.oreilly.com/2013/08/the-vanishing-cost-of-guessing.html

[C]ritics charge that big data will make us stick to constantly optimizing what we already know, rather than thinking out of the box and truly innovating. We’ll rely on machines for evolutionary improvements, rather than revolutionary disruption. An abundance of data means we can find facts to support our preconceived notions, polarizing us politically and dividing us into “filter bubbles” of like-minded intolerance. And it’s easy to mistake correlation for causality, leading us to deny someone medical coverage or refuse them employment because of a pattern over which they have no control, taking us back to the racism and injustice of Apartheid or Redlining.

// via soup.io where the complete article is available - here

#données - #prédiction #prévention #anticipation #corrélation - #coût #préjugé

#Daten #Vorhersage #Vorsorge #Analyse #Zusammenhang - #Kosten #Vorurteil - #gelenkte #Wahrnehmung

#big_data #data - #prediction #preventing #analysis - #cost #prejudice #bias #preconception

June 26 2013

Four short links: 26 June 2013

  1. Memory Allocation in Brains (PDF) — The results reviewed here suggest that there are competitive mechanisms that affect memory allocation. For example, new dentate gyrus neurons, amygdala cells with higher excitability, and synapses near previously potentiated synapses seem to have the competitive edge over other cells and synapses and thus affect memory allocation with time scales of weeks, hours, and minutes. Are all memory allocation mechanisms competitive, or are there mechanisms of memory allocation that do not involve competition? Even though it is difficult to resolve this question at the current time, it is important to note that most mechanisms of memory allocation in computers do not involve competition. Does the dissector use a slab allocator? Tip your waiter, try the veal.
  2. Living Foundries (DARPA) — one motivating, widespread and currently intractable problem is that of corrosion/materials degradation. The DoD must operate in all environments, including some of the most corrosively aggressive on Earth, and do so with increasingly complex heterogeneous materials systems. This multifaceted and ubiquitous problem costs the DoD approximately $23 Billion per year. The ability to truly program and engineer biology, would enable the capability to design and engineer systems to rapidly and dynamically prevent, seek out, identify and repair corrosion/materials degradation. (via Motley Fool)
  3. Innovate Salone — finalists from a Sierra Leone maker/innovation contest. Part of David Sengeh‘s excellent work.
  4. Arts, Humanities, and Complex Networks — ebook series, conferences, talks, on network analysis in the humanities. Everything from Protestant letter networks in the reign of Mary, to the repertory of 16th century polyphony, to a data-driven update to Alfred Barr’s diagram of cubism and abstract art (original here).

September 19 2012

Dominant operators go to war against Net neutrality through the ITU

For several months now, there has been intense debate about the threats raised by the upcoming World Conference on International Telecommunications (WCIT). In December, the 193 Member States of the International Telecommunication Union (ITU), a United Nations agency, will gather in Dubai for this major conference, which aims to amend the ITU's founding treaties, the "International Telecommunication Regulations" (ITRs).

ISOC and the Center for Democracy and Technology (CDT) have analysed the dangerous amendments proposed by many countries, which seek to extend the ITU's mandate to issues such as IP addressing and routing or cooperation on cybercrime, and thereby to undermine global Internet governance1. While these amendments are very serious sources of concern, it is not certain that they will pass, notably because several ITU Member States (in particular in Europe and the United States) and civil society actors oppose an ITU takeover of Internet governance.

Another proposal has been made by a major European actor which, although apparently technical and unrelated to online freedoms, would have disastrous consequences for Net neutrality2. This proposal was detailed last week by ETNO - the lobby representing telecoms operators in Brussels - in its contribution to the WCIT. So far, lawmakers in Europe and beyond have remained silent, refusing to react to the changes to the ITRs proposed by ETNO. This silence suggests that ETNO's proposals already enjoy key political support.


The EU's inertia on Net neutrality

First, some context. In 2009, the EU adopted the "Telecoms Package", a bundle of five European directives regulating the telecommunications sector. Amendments backed by the largest telecoms operators, seeking to legitimise Internet access restrictions, sparked an intense debate on Net neutrality at the time.

Although EU lawmakers refused to enshrine Net neutrality in law, they struck down the worst of these amendments. When the Telecoms Package was adopted, the European Commission even pledged to monitor the situation and declared that Net neutrality would from then on be a "policy objective"3.

Since then, however, evidence of increasingly widespread access restrictions imposed by telecoms operators has kept mounting. Disturbingly, only the Netherlands has adopted a legal framework protecting Net neutrality. Other countries seem to be waiting for the EU to take the lead, but the European Commissioner in charge of the telecoms sector, Neelie Kroes, has so far refused to act.

Meanwhile, the European Union's biggest operators, such as Vodafone, Deutsche Telekom, Orange or Telefonica, have lobbied heavily for the development of new business models based on restricting or discriminating against users' communications, which, contrary to what these operators claim, would have dramatic consequences for Net neutrality.

This became very clear in early 2010, when the CEO of Telefonica declared4 that:

Internet search engines use our network without paying anything, which is good for them but bad for us. It is obvious that this situation must change; our strategy is to change it.

Similar statements have been made by the CEO of Orange5.

Such language clearly shows that some telecommunications operators want to develop new business models by monetising the data flowing between online services and their subscribers. The idea is that, to deal with episodes of congestion on current networks (for example YouTube videos that load slowly at peak hours), operators could provide "first-class" (that is, prioritised) traffic delivery to the service providers able to pay for it. This discrimination is called "differentiated quality of service", and it would inevitably slow down the rest of the traffic. To put differentiated quality of service in place, operators would negotiate "interconnection agreements" with service providers.

Laundering the operators' policy through the ITU

The only thing these companies fear is being prevented by lawmakers and regulators, in the course of the Net neutrality debate, from establishing such business models. The recent adoption of a law to that effect in the Netherlands, under pressure from civil society, shows that lawmakers can decide to protect Net neutrality effectively, by putting strict rules in place against traffic discrimination practices and by regulating commercial agreements.

That fear explains why, last June, the US operators AT&T and Verizon filed a complaint against ARCEP, the French communications regulator, over its decision to collect data on these "interconnection agreements"6. Telecoms operators are lobbying hard against any form of regulation in this area, and have the full attention of the European Commission.

Just as the entertainment industry tries to push through repressive measures against file-sharing via trade agreements such as ACTA, the telecoms operators are attempting to impose their measures through the ITU. Their goal is to ensure that ITU rules prevent Member States from exercising any control over the restrictions imposed by operators. To that end, ETNO is proposing the following amendments to the ITRs:

3.1 (...) Member States shall facilitate the development of international IP interconnections providing both best-effort delivery and end-to-end quality of service delivery.

4.4 Operators shall cooperate in the development of international IP interconnections providing both best-effort delivery and end-to-end quality of service delivery. Best-effort delivery should continue to form the basis of international IP traffic. Nothing shall prevent commercial agreements with differentiated quality of service from developing. (Our emphasis)

According to ETNO, the ITU's founding treaty should "enable incremental revenues by end-to-end QoS pricing and content value pricing" and allow the development of "new interconnection policies based on the differentiation of the QoS parameters for specific services and types of traffic (not only on the 'volume')." These rules, they say, should apply to the "Internet ecosystem" (and thus no longer only to "managed services" or "specialised services", which reside on private IP networks distinct from the public Internet), and should be decided between operators and online service providers, putting regulators and users out of the picture once and for all.

The dangers of the ETNO proposal

ETNO's proposal runs completely counter to Net neutrality as it is generally defined. Indeed, while political circles struggle to agree on what Net neutrality is, everyone agrees on the importance of the principle, defining it differently depending on how sympathetic they are to the options defended by the telecoms operators.

In France, a 2011 parliamentary report rightly stresses that Net neutrality must be defined as:

The ability of Internet users to send and receive the content of their choice, to use the services or run the applications of their choice, to connect the hardware and use the programs of their choice, (...) with a transparent, sufficient and non-discriminatory quality of service (...). (Our emphasis)

This report clarifies an essential point: the notion of "non-discriminatory quality of service" implies that an Internet access provider cannot establish "differentiated quality of service" on the public Internet. According to the French members of parliament:

The notion of non-discrimination can be interpreted in different ways, notably: as homogeneous treatment of flows, as differentiation in how flows are handled according to the objective needs of the uses they support, or as non-discriminatory access to the various levels of quality of service. (...) The notion of non-discrimination [is used here] in the sense of homogeneous delivery of flows. (Our emphasis)

This is a rigorous definition of Net neutrality. Preventing operators from putting differentiated quality of service in place is of paramount importance for protecting the Internet.

If ETNO's proposals were adopted, they could:

  • Harm freedom of communication, by preventing Member States from adopting legislation that protects Net neutrality by banning operators from blocking, throttling or prioritising certain types of content, applications or services, since the ability to provide differentiated quality of service would be explicitly guaranteed by the ITU.
  • Endanger privacy, by allowing the generalised use of invasive surveillance technologies such as Deep Packet Inspection. Such technologies would be needed to identify the specific types of traffic to which the rules defined in interconnection agreements apply, and would de facto establish a system of global surveillance over vast portions of the Internet7.
  • Hamper innovation and competition, by favouring powerful service providers such as Google, which would be able to pay for the prioritisation of their data, while new entrants and weaker players would be at a disadvantage. This change would mean the end of the level playing field of the Internet economy.
  • Reduce the incentive to invest in improving networks. Growing congestion would increase the value of differentiated quality of service agreements that allow part of the traffic to be prioritised. Operators would thus stand to benefit from the scarcity of their own bandwidth, and would therefore have less incentive to increase the capacity of their networks.

The WCIT: a pivotal moment for Net neutrality

ETNO's chairman, Luigi Gambardella, maintains that the proposal is harmless, and that any criticism of it amounts to "misinformation" and "propaganda".

According to him, ETNO's proposed amendments on differentiated quality of service are only about "consumer choice"8:

The problem is that we want more choice. In the end, the consumer will have more choice. It's like travelling in economy class. Why couldn't we also allow business class, a premium class, to differentiate the service?

He is also quite explicit about the policy-laundering manoeuvre: "Some Member States might ask for new restrictions to be introduced on the Internet. So, basically, the paradox is that our proposal is to prevent some Member States from regulating the Internet further" ... That is, to prevent them from regulating to stop telecoms operators from breaking the Internet.

The fact is that, while ETNO's proposal could put an end to the Internet as we know it, it must be granted at least one merit: it will force European policy-makers, in particular Neelie Kroes, to clarify their position. Over recent months, the European Commissioner for the Digital Agenda has shown an obvious unwillingness to regulate on the matter, invoking hazy "free competition" arguments for not intervening. In a 2011 report on Net neutrality, she claimed that Internet access restrictions posed no problem, and that such abusive behaviour would disappear by itself under market pressure. She also refused to condemn the plans of some telecoms operators to put differentiated quality of service in place9. Finally, the Commission's proposal for the European position at the WCIT paves the way for exactly the type of amendments put forward by ETNO10.

It is time for Neelie Kroes and lawmakers to stand up against ETNO's proposal, and to protect citizens' free access to information, as well as free competition and innovation online, against the predatory behaviour of dominant operators bent on increasing their profits.

September 13 2012

Dominant Telcos Try to End Net Neutrality Through ITU

For some months now, there have been intense discussions on the threats raised by the upcoming World Conference on International Telecommunications (WCIT). In December, the 193 Member States of the International Telecommunication Union (ITU), an agency of the United Nations, will gather in Dubaï for this important conference aimed at amending the ITU's founding treaty, the "International Telecommunication Regulations" (ITRs).

ISOC and the Center for Democracy and Technology have analyzed the dangerous amendments proposed by many countries, which would expand the ITU mandate to cover issues such as IP addressing and routing and cooperation in cybercrime, and would further undermine global Internet governance1. These are very serious sources of concern, but because several important Member States of the ITU (in particular the US and the EU) and many civil society actors oppose an "ITU take-over of Internet governance", it is unclear whether they could actually pass.

There is another proposal that has been made by a major European actor which, although it might seem technical and unrelated to freedoms online, could have a disastrous impact on Net neutrality2. It was detailed last week by ETNO - the lobby representing incumbent EU telecoms operators in Brussels - in its "contribution to WCIT". And so far, policy-makers, in the EU and beyond, have remained silent, refusing to react to ETNO's proposed changes to the ITRs. This suggests that ETNO's proposal may actually have key political support.



The EU's inertia on Net neutrality

First, some context. In 2009, the EU adopted the "Telecoms Package", a bundle of five European directives regulating the telecommunications sector. At the time, amendments pushed by prominent telecoms operators seeking to legitimise Internet access restrictions sparked an intense debate on Net neutrality.

Although EU lawmakers refused to enshrine Net neutrality in EU law, they struck down the worst amendments. Then, when the Telecoms Package was adopted, the EU Commission pledged to monitor the situation and said that Net neutrality would from now on be a "policy objective"3.

Since then, however, there has been mounting evidence of widespread access restrictions imposed by telecoms operators. Disturbingly, only the Netherlands adopted a legal framework protecting Net neutrality. Other countries seem to be waiting for the EU to take the lead, but the European commissioner in charge of the telecoms sector, Mrs. Neelie Kroes, has so far refused to take any action.

Meanwhile, the EU's biggest operators, such as Vodafone, Deutsche Telekom, Orange or Telefonica, have extensively lobbied for the development of new business models based on restricting or discriminating against users' communications, which (although they claim the contrary) would have dire consequences for Net neutrality.

This was made very explicit in early 2010, when the CEO of Telefonica declared that:

Internet search engines use our Net without paying anything at all, which is good for them but bad for us. It is obvious that this situation must change, our strategy is to change this.4

Similar statements have been made by the CEO of Orange5 and others.

Such language clearly shows that some telecom operators want to develop new business models by monetizing the communications coming from online services and going to their subscribers. Here is the idea: since some types of services sometimes suffer from congestion on current networks (think of YouTube videos that take a long time to buffer during peak hours), operators could provide "first-class" (i.e. prioritized) traffic delivery to online service providers who are able to pay -- what they call a "differentiated Quality of Service" (QoS) -- which would also inevitably slow down the rest of the traffic. To implement paid differentiated QoS, operators would negotiate so-called "interconnection agreements" with online service providers.

Through ITU, EU telcos follow a policy-laundering strategy

But there is one thing that these companies fear: that, in the process of the Net neutrality debate, they could be banned by lawmakers and regulators from establishing such business models. The recently-adopted Dutch legislation shows that, under the pressure of civil society, lawmakers can decide to actually enforce Net neutrality, establishing stringent regulations against telecoms operators' discriminatory traffic management practices and commercial agreements.

Such fear explains why, last June, US telecoms operators AT&T and Verizon sued the French regulator ARCEP over its decision to gather data on so-called "interconnection agreements"6. Telcos are strongly pushing against any regulation in this field, and have the EU Commission's ear.

So just like the entertainment industry tries to establish repressive measures against file-sharing through trade agreements such as ACTA, incumbent operators are going to the ITU to push their agenda. It is a clear policy laundering strategy. The idea is to make sure that ITU rules will prevent Member States from regulating the way telecom operators may restrict, prioritize or otherwise discriminate Internet communications. To that end, ETNO proposed the following amendments to the ITRs:

3.1 (...) Member States shall facilitate the development of international IP interconnections providing both best effort delivery and end to end quality of service delivery.

4.4 Operating Agencies shall cooperate in the development of international IP interconnections providing both, best effort delivery and end to end quality of service delivery. Best effort delivery should continue to form the basis of international IP traffic exchange. Nothing shall preclude commercial agreements with differentiated quality of service delivery to develop. (Our emphasis).

According to ETNO, the ITU's founding treaty should "enable incremental revenues by end‐to‐end QoS pricing and content value pricing" and allow for "new interconnection policies based on the differentiation of the QoS parameters for specific services and types of traffic (not only on the “volume”)." That, they say, should be part of the "Internet ecosystem" (i.e. not just for so-called "managed services" or "specialized services", which are private IP networks distinct from the public Internet), and should be decided between network operators and online service providers, putting regulators and end-users out of the picture once and for all.

The dangers of the ETNO proposal

ETNO's proposal is totally contrary to a comprehensive definition of Net neutrality. There are still debates in political circles about what exactly Net neutrality is. Everybody will say they agree with the concept, but will provide varying definitions depending on how sympathetic they are to the policy options defended by incumbent players in the telecoms sector.

In France, a parliamentary report rightly stressed last year that Net neutrality is to be understood as the "Internet users’ ability to send and receive the content of their choice, to use services or run applications of their choice, connect the equipment and use the programs of their choice (...) with a transparent, sufficient, and non-discriminatory quality of service (...)."

The report clarifies a key point: the notion of "non-discriminatory QoS" means that an Internet access provider cannot establish "differentiated QoS" on the public Internet. According to the French Members of Parliament:

The concept of nondiscrimination can be interpreted in various ways, including as a homogeneous treatment of flows, as a differentiation in how flows are processed according to the objective needs of the uses they support, or as non-discriminatory access to various levels of quality of service. (...) The concept of nondiscrimination is used here in the sense of homogeneous delivery. (Our emphasis).

This is a rigorous definition of Net neutrality. Preventing operators from introducing differentiated QoS interconnection policies is indeed of paramount importance to protect the Internet.

If ETNO's proposals were adopted, they could:

  • Hurt freedom of communication, by preventing Member States from adopting rigorous Net neutrality regulations to ban operators from blocking, throttling, or prioritizing specific types of content, applications or services, since the ability to provide differentiated QoS on the Internet would be explicitly protected by the ITU.
  • Undermine privacy, by leading to the generalization of privacy-invasive traffic monitoring technologies, such as Deep Packet Inspection. Such technologies would be necessary to identify specific types of traffic and implement ad hoc policies as provided by interconnection agreements, but would de facto establish a comprehensive surveillance infrastructure over vast portions of the Internet.7
  • Hamper innovation and competition, by favoring powerful service providers such as Google, which would be in a position to pay for prioritization, whereas smaller players and new entrants would be at a competitive disadvantage. It would be the end of the level playing field provided by the Internet economy.
  • Decrease incentives to invest in more bandwidth. Congestion would increase the value of differentiated QoS agreements providing prioritized traffic delivery. Operators would therefore be in a position to benefit from the scarcity of their networks' bandwidth, and would have less incentive to invest in increasing the capacity of their networks.

The WCIT as a defining moment for Net neutrality

The ETNO chairman, Luigi Gambardella, argues that his proposal is harmless, and that any attempt to criticize it is just "false information" and bad "propaganda".

According to him, ETNO's amendments on differentiated QoS are only about "consumer choice": "The problem is that we want more choice. In the end, the customer will have more choice. It's like if you travel in economy. But why don't you also allow business class, a premium class, to differentiate the service?", he said in an interview.

But he is quite explicit about the policy-laundering scheme: "Our proposal is to impede some member state to regulate further the Internet"... Regulate against the telecoms operators' attempts to break the Net, that is.

ETNO's proposal could put an end to the Internet as we know it, but it has at least one merit: it will force EU policy-makers, and in particular Neelie Kroes, to make clear where they stand. In recent months, the EU commissioner for the Digital Agenda has shown an obvious lack of will to legislate on the matter, invoking dubious "free market" arguments for not intervening. In a 2011 report on Net neutrality, she claimed that there was no problem with ongoing access restrictions and that any bad behaviour would be solved by the market anyway. She also refused to condemn the plans of some telecoms operators to charge for differentiated QoS8. Lastly, the Commission's recent proposal for a WCIT negotiation mandate paves the way for exactly the type of amendments pushed by ETNO9.

It is time for Neelie Kroes and other policy-makers to step up against ETNO's proposal, and protect citizens' freedom of information as well as free competition and innovation online against the predatory behaviors of dominant, rent-seeking operators.

September 06 2012

Digging into the UDID data

Over the weekend the hacker group Antisec released one million UDID records that they claim to have obtained from an FBI laptop using a Java vulnerability. In reply the FBI stated:

The FBI is aware of published reports alleging that an FBI laptop was compromised and private data regarding Apple UDIDs was exposed. At this time there is no evidence indicating that an FBI laptop was compromised or that the FBI either sought or obtained this data.

Of course that statement leaves a lot of leeway. It could be the agent’s personal laptop, and the data may well have been “property” of another agency. The wording doesn’t even explicitly rule out the possibility that this was an agency laptop; they just say that right now they don’t have any evidence to suggest that it was.

This limited data release doesn’t have much impact, but the possible release of the full dataset, which is claimed to include names, addresses, phone numbers and other identifying information, is far more worrying.

While some have almost dismissed the issue out of hand, the real issues here are: Where did the data originate? Which devices did it come from, and what kind of users does this data represent? Is this data from a cross-section of the population, or a specifically targeted demographic? Does it originate within the law enforcement community, or from an external developer? What was the purpose of the data, and why was it collected?

With conflicting stories from all sides, the only thing we can believe is the data itself. The 40-character strings in the release at least look like UDID numbers, and anecdotally at least we have a third-party confirmation that this really is valid UDID data. We therefore have to proceed at this point as if this is real data. While there is a possibility that some, most, or all of the data is falsified, that’s looking unlikely from where we’re standing at the moment.

With that as the backdrop, the first action I took was to check the released data for my own devices and those of family members. Of the nine iPhones, iPads and iPod Touch devices kicking around my house, none of the UDIDs are in the leaked database. Of course there isn’t anything to say that they aren’t amongst the other 11 million UDIDs that haven’t been released.

With that done, I broke down the distribution of leaked UDID numbers by device type. Interestingly, considering the number of iPhones in circulation compared to the number of iPads, the bulk of the UDIDs were self-identified as originating on an iPad.

Distribution of UDID by device type
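
That breakdown is straightforward to reproduce. A minimal sketch, assuming a comma-separated file whose last field is the self-reported device type (the actual layout of the release may differ), would be:

    import csv
    from collections import Counter

    # Count the self-reported device types in the released file.
    # Assumption: the last field of each row is the device type
    # ("iPhone", "iPad", "iPod touch"); adjust the index if your copy differs.
    device_types = Counter()
    with open("udid_release.csv", newline="") as f:
        for row in csv.reader(f):
            if row:
                device_types[row[-1].strip()] += 1

    for device, count in device_types.most_common():
        print(device, count)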

What does that mean? Here’s one theory: If the leak originated from a developer rather than directly from Apple, and assuming that this subset of data is a good cross-section on the total population, and assuming that the leaked data originated with a single application … then the app that harvested the data is likely a Universal application (one that runs on both the iPhone and the iPad) that is mostly used on the iPad rather than on the iPhone.

The very low numbers of iPod Touch users might suggest either demographic information, or that the application is not widely used by younger users who are the target demographic for the iPod Touch, or alternatively perhaps that the application is most useful when a cellular data connection is present.

The next thing to look at, as the only field with unconstrained text, was the Device Name data. That particular field contains a lot of first names, e.g. “Aaron’s iPhone,” so roughly speaking the distribution of first letters in this field should give a decent clue as to the geographical region of origin of the leaked list of UDIDs. This distribution is of course going to be different depending on the predominant language in the region.

Distribution of UDID by the first letter of the “Device Name” field

The immediate stand out from this distribution is the predominance of device name strings starting with the letter “i.” This can be ascribed to people who don’t have their own name prepended to the Device Name string, and have named their device “iPhone,” “iPad” or “iPod Touch.”

The obvious next step was to compare this distribution with the relative frequency of first letters in words in the English language.

Comparing the distribution of UDID by first letter of the “Device Name” field against the relative frequencies of the first letters of a word in the English language

The spike for the letter “i” dominated the data, so the next step was to do some rough and ready data cleaning.

I dropped all the Device Name strings that started with the string “iP.” That cleaned out all those devices named “iPhone,” “iPad” and “iPod Touch.” Doing that brought the number of device names starting with an “i” down from 159,925 to just 13,337. That’s a bit more reasonable.

Comparing the distribution of UDID by first letter of the “Device Name” field, ignoring all names that start with the string “iP,” against the relative frequencies of the first letters of a word in the English language
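
For anyone who wants to replicate the cleaning and the first-letter tally described above, a rough sketch under the same assumptions about the file layout looks like this; note that the crude “iP” filter also throws away names such as “iPhone de Mark,” which is part of why the cleaning may have been too enthusiastic.

    import csv
    import string
    from collections import Counter

    # First-letter distribution of the Device Name field, dropping anything
    # that starts with "iP" ("iPhone", "iPad", "iPod touch").
    # Assumption: the device name is the second-to-last field of each row.
    first_letters = Counter()
    with open("udid_release.csv", newline="") as f:
        for row in csv.reader(f):
            name = row[-2].strip() if len(row) >= 2 else ""
            if not name or name.startswith("iP"):
                continue   # crude cleaning; also discards names like "iPhone de Mark"
            letter = name[0].lower()
            if letter in string.ascii_lowercase:
                first_letters[letter] += 1

    total = sum(first_letters.values())
    for letter in string.ascii_lowercase:
        share = 100.0 * first_letters[letter] / total if total else 0.0
        print(letter, round(share, 2))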

I had a slight over-abundance of “j,” although that might not be statistically significant. However, the stand out was that there was a serious under-abundance of strings starting with the letter “t,” which is interesting. Additionally, with my earlier data cleaning I also had a slight under-abundance of “i,” which suggested I may have been too enthusiastic about cleaning the data.

Looking at the relative frequency of letters in languages other than English it’s notable that amongst them Spanish has a much lower frequency of the use of “t.”

As the de facto second language of the United States, Spanish is the obvious next choice to investigate. If the device names are predominantly Spanish in origin then this could solve the problem introduced by our data cleaning. As Marcos Villacampa noted in a tweet, in Spanish you would say “iPhone de Mark” rather than “Mark’s iPhone.”

Comparing the distribution of UDID by first letter of the “Device Name” field, ignoring all names that start with the string “iP,” against the relative frequencies of the first letters of a word in the Spanish language

However, that distribution didn’t really fit either. While “t” was much better, I now had an under-abundance of words with an “e.” Although it should be noted that, unlike our English language relative frequencies, the data I was using for Spanish is for letters in the entire word, rather than letters that begin the word. That’s certainly going to introduce biases, perhaps fatal ones.

Not that I can really make the assumption that there is only one language present in the data, or even that one language predominates, unless that language is English.

At this stage it’s obvious that the data is, at least more or less, of the right order of magnitude. The data probably shows devices coming from a Western country. However, we’re a long way from the point where I’d come out and say something like “… the device names were predominantly in English.” That’s not a conclusion I can make.

I’d be interested in tracking down the relative frequency of letters used in Arabic when the language is transcribed into the Roman alphabet. While I haven’t been able to find that data, I’m sure it exists somewhere. (Please drop a note in the comments if you have a lead.)

The next step for the analysis is to look at the names themselves. While I’m still in the process of mashing up something that will access U.S. census data and try and reverse geo-locate a name to a “most likely” geographical origin, such services do already exist. And I haven’t really pushed the boundaries here, or even started a serious statistical analysis of the subset of data released by Antisec.

This brings us to Pete Warden’s point that you can’t really anonymize your data. The anonymization process for large datasets such as this is simply an illusion. As Pete wrote:

Precisely because there are now so many different public datasets to cross-reference, any set of records with a non-trivial amount of information on someone’s actions has a good chance of matching identifiable public records.

While this release in itself is fairly harmless, a number of “harmless” releases taken together — or cleverly cross-referenced with other public sources such as Twitter, Google+, Facebook and other social media — might well be more damaging. And that’s ignoring the possibility that Antisec really might have names, addresses and telephone numbers to go side-by-side with these UDID records.

The question has to be asked then, where did this data originate? While 12 million records might seem a lot, compared to the number of devices sold it’s not actually that big a number. There are any number of iPhone applications with a 12-million-user installation base, and this sort of backend database could easily have been built up by an independent developer with a successful application who downloaded the device owner’s contact details before Apple started putting limitations on that.

Ignoring conspiracy theories, this dataset might be the result of a single developer. Although how it got into the FBI’s possession and the why of that, if it was ever there in the first place, is another matter entirely.

I’m going to go on hacking away at this data to see if there are any more interesting correlations, and I do wonder whether Antisec would consider a controlled release of the data to some trusted third party?

Much like the reaction to #locationgate, where some people were happy to volunteer their data, if enough users are willing to self-identify, then perhaps we can get to the bottom of where this data originated and why it was collected in the first place.

Thanks to Hilary Mason, Julie Steele, Irene Ros, Gemma Hobson and Marcos Villacampa for ideas, pointers to comparative data sources, and advice on visualisation of the data.

Update

9/6/12

In response to a post about this article on Google+, Josh Hendrix made the suggestion that I should look at word as well as letter frequency. It was a good idea, so I went ahead and wrote a quick script to do just that…
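
The quick script isn’t reproduced here, but a word count along these lines would capture the idea. The case-sensitive counting is deliberate, so that “iPad” and its mis-capitalisations are tallied separately; the field position is an assumption.

    import csv
    import re
    from collections import Counter

    # Case-sensitive word counts over the Device Name field.
    # Assumption: the device name is the second-to-last field of each row.
    words = Counter()
    with open("udid_release.csv", newline="") as f:
        for row in csv.reader(f):
            if len(row) >= 2:
                words.update(re.findall(r"[A-Za-z]+", row[-2]))

    for word, count in words.most_common(20):
        print(word, count)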

The top two words in the list are “iPad,” which occurs 445,111 times, and “iPhone,” which occurs 252,106 times. The next most frequent word is “iPod,” but that occurs only 36,367 times. This result backs up my earlier result looking at distribution by device type.

Then there are various misspellings and mis-capitalisations of “iPhone,” “iPad,” and “iPod.”

The first real word that isn’t an Apple trademark is “Administrator,” which occurs 10,910 times. Next are “David” (5,822), “John” (5,447), and “Michael” (5,034). This is followed by “Chris” (3,744), “Mike” (3,744), “Mark” (3,66) and “Paul” (3,096).

Looking down the list of real names, as opposed to partial strings and tokens, the first female name doesn’t occur until we’re 30 places down the list — it’s “Lisa” (1,732) with the next most popular female name being “Sarah” (1,499), in 38th place.

The top 100 names occurring in the UDID list.

The word “Dad” occurs 1,074 times, with “Daddy” occurring 383 times. For comparison the word “Mum” occurs just 58 times, and “Mummy” just 33. “Mom” came in with 150 occurrences, and “mommy” with 30. The number of occurrences for “mum,” “mummy,” “mom,” and “mommy” combined is 271, which is still very small compared to the combined total of 1,457 for “dad” and “daddy.”

[Updated: Greg Yardly wisely pointed out on Twitter that I was being a bit English-centric in only looking for the words "mum" and "mummy," which is why I expanded the scope to include "mom" and "mommy."]

There is a definite gender bias here, and I can think of at least a few explanations. The most likely is fairly simplistic: The application where the UDID numbers originated either appeals to, or is used more, by men.

Alternatively, women may be less likely to include their name in the name of their device, perhaps because amongst other things this name is used to advertise the device on wireless networks?

Either way I think this definitively pins it down as a list of devices originating in an Anglo-centric geographic region.

Sometimes the simplest things work better. Instead of being fancy perhaps I should have done this in the first place. However this, combined with my previous results, suggests that we’re looking at an English-speaking, mostly male, demographic.

Correlating the top 20 or so names with the list of most popular baby names (by year) all the way from the mid-’60s up until the mid-’90s (so looking at the most popular names for people between the ages of, say, 16 and 50) might give a further clue as to the exact demographic involved.

Both Gemma Hobson and Julie Steele directed me toward the U.S. Social Security Administration’s Popular Baby Names By Decade list. A quick and dirty analysis suggests that the UDID data is dominated by names that were most popular in the ’70s and ’80s. This maps well to my previous suggestion that the lack of iPod Touch usage might suggest that the demographic was older.
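
A quick-and-dirty comparison of that kind might look like the sketch below. The local CSV copies of the SSA decade lists are an assumption on my part (the SSA publishes them as web tables), as is the particular set of names being checked.

    import csv

    # How many of the most common device-name first names show up in each
    # decade's SSA "Popular Names by Decade" list? Assumes each decade's table
    # has been saved locally as a CSV of names, e.g. "ssa_1970s.csv" (hypothetical files).
    top_device_names = ["David", "John", "Michael", "Chris", "Mike", "Mark", "Paul", "Lisa", "Sarah"]

    def load_names(path):
        with open(path, newline="") as f:
            return {cell.strip() for row in csv.reader(f) for cell in row if cell.strip().isalpha()}

    for decade in ("1960s", "1970s", "1980s", "1990s"):
        ssa = load_names(f"ssa_{decade}.csv")
        hits = [name for name in top_device_names if name in ssa]
        print(decade, len(hits), "of", len(top_device_names), "->", hits)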

I’m going to do a year-by-year breakdown and some proper statistics later on, but we’re looking at an application that’s probably used by: English speaking males with an Anglo-American background in their 30s or 40s. It’s most used on the iPad, and although it also works on the iPhone, it’s used far less on that platform.

Thanks to Josh Hendrix, and again to Gemma Hobson and Julie Steele, for ideas and pointers to sources for this part of the analysis.


April 19 2012

The mobilisation against ACTA, and beyond

Paris, 19 April 2012 – Over the coming weeks, the European Parliament will continue its work on ACTA, the anti-counterfeiting trade agreement, up to the final vote scheduled for this summer. This is a crucial period for citizen opposition to the agreement, which will have to withstand increased pressure from the copyright lobbies on Parliament. Beyond ACTA, the whole of European copyright policy needs to be revisited. Only genuine reform can truly protect fundamental rights on the Internet, and break with indiscriminate repression in order to promote an economy of culture suited to the Internet. Here is a status update on the situation, ahead of the coming campaign in the European Parliament.

Last Monday, the British rapporteur David Martin (S&D) published his draft report on ACTA. Although he calls on the European Parliament to reject this dangerous agreement, the rapporteur nevertheless invites the Commission to propose new repressive measures to protect copyright, patents and trademarks. Given the Commission's plans to revise the anti-sharing IPRED directive and to step up the repression of online copyright infringement, it is clear that victory against ACTA will not mean the end of the fight for the sharing of culture.

These attacks do not come from the Commission alone, however. With IPRED, the directive on online services (the "e-Commerce" directive) and other initiatives, the entertainment industry will do everything it can to regain control of the debate and to oppose any significant change or any reform aimed at making copyright evolve. These lobbies are already sending letters to Parliament, urging our elected representatives to adopt ACTA1.

While continuing the campaign for the rejection of ACTA in the European Parliament, we must push the European legislator to revisit its extremist approach to copyright, patents and trademark law.

To contact your MEPs and take part in the mobilisation, visit our page listing ways to take action against ACTA.

Below is a status update on the debate in the European Parliament:

Committee on International Trade (INTA)

Within the Committee on International Trade (INTA) - in charge of steering the work on ACTA - the European Parliament's rapporteur on the text, MEP David Martin (UK, S&D), has published his draft report, calling on Parliament to reject ACTA2. After recalling the main dangers of ACTA, such as the privatisation of the enforcement of copyright infringement on the Internet, he stresses that "the expected benefits of this international agreement are far outweighed by the threats it poses to civil liberties".

This draft report nevertheless plays into the hands of the copyright lobbies, by calling on the Commission to "make new proposals to protect intellectual property".

This is why citizens must contact the members of the INTA committee and call on them to reject ACTA, in order to put an end to the continuous expansion of copyright, patents and trademarks in every trade agreement, such as those adopted or currently being negotiated by the European Union with India, Korea, Colombia and Peru...

David Martin will present his draft report on 25 April. The INTA committee is expected to adopt the report on 30 May or 20 June, after taking into account the opinions of the other committees (see below).

Committee on Development (DEVE)

Within the Committee on Development (DEVE), the rapporteur Jan Zahradil (Czech Republic, ECR) presented a very poor draft opinion in January. It absolutely must be amended to take into account the crucial questions relating to ACTA's consequences for developing countries, in particular concerns about access to medicines and freedom of expression online.

Beyond ACTA, the members of DEVE must also understand just how far copyright and patent enforcement measures can compromise access to culture, medicines and technology, thereby holding back the socio-economic development of emerging countries.

The DEVE committee is expected to decide this week what to do with its report. Its next meetings are scheduled for 23 and 24 April and for 14 May.

Committee on Civil Liberties (LIBE)

Within the Committee on Civil Liberties (LIBE), which is to deliver a report on ACTA's impact on fundamental rights, the rapporteur Dimitrios Droutsas (Greece, S&D) declared last week that he was convinced that ACTA constitutes a threat to freedoms. Here too, citizens must contact the members of the committee to make sure that the other members of LIBE share his view.

The LIBE report must in particular interpret the text of ACTA, whose ambiguity is dangerous, in the light of recent developments in online copyright policy. That means reviewing criminal sanctions, the liability of intermediaries, and the calls for "cooperation" between Internet actors and the entertainment industry, which would amount to a form of privatised censorship3. And to open the way to genuine copyright reform, the LIBE report must state its opposition to extra-judicial measures for protecting a copyright regime that has become unfit for purpose.

It is possible that rapporteur Droutsas will ask for a short delay in the procedure, so as to leave time for amendments to be tabled. The report would then be presented to the committee next week or on 8 May, with a vote on 30 or 31 May.

Committee on Industry (ITRE)

Within the Committee on Industry (ITRE), the rapporteur Amelia Andersdotter (Sweden, Greens/EFA) has delivered an encouraging draft opinion. She stresses that ACTA "seems to run counter to the ambition (...) of making Europe the scene of cutting-edge innovation, as well as (...) of promoting Net neutrality and access for SMEs to the online digital market"4. However, other MEPs known for positions at odds with the protection of online freedoms, such as Daniel Caspary (Germany, EPP), want to amend the report, probably in order to insert arguments in favour of ACTA.

Even more than other citizens, innovators and entrepreneurs must call on the ITRE committee to make sure that all its members understand why ACTA runs counter to growth and innovation, as the European telecoms and Internet industry itself recently pointed out. Beyond ACTA, the members of ITRE must be urged to denounce the growing pressure on online service providers to take on the role of a private copyright police.

The ITRE committee will debate its report on 25 April, and is expected to vote on 8 May.

Committee on Legal Affairs (JURI)

Within the Committee on Legal Affairs (JURI), the rapporteur Marielle Gallo (France, EPP), known for her extreme position on strengthening copyright, recently published her draft opinion. Unsurprisingly, it fiercely defends ACTA and calls on the INTA committee to recommend adopting the agreement. Unfortunately, the committee has decided not to open this draft opinion to amendments, and will therefore have to adopt or reject it as it stands.

Citizens can call on the members of the JURI committee to reject Mrs Gallo's draft report. On the one hand, it ignores the provisions of ACTA that go beyond European Union law (for instance on criminal sanctions or border measures). On the other, it refuses to denounce the blatant circumvention of democracy by this so-called "trade agreement", negotiated away from legitimate international organisations and parliaments, and imposing brutal repression on third countries.

The JURI committee's vote on Mrs Gallo's draft opinion is expected to take place on 26 April.

October 20 2011

Video: Datenspuren 2011 - A talk on the Bundestrojaner (the German federal government's surveillance trojan)

Constanze Kurz and Frank Rieger of the CCC explain technical details of the state trojan "0zapftis" and sharply criticise the reactions from politicians.


October 03 2011


The Shock of Victory by David Graeber | theanarchistlibrary.org 2007


The biggest problem facing direct action movements is that we don’t know how to handle victory.

This might seem an odd thing to say because a lot of us haven’t been feeling particularly victorious of late. Most anarchists today feel the global justice movement was kind of a blip: inspiring, certainly, while it lasted, but not a movement that succeeded either in putting down lasting organizational roots or transforming the contours of power in the world. The anti-war movement was even more frustrating, since anarchists and anarchist tactics were largely marginalized. The war will end, of course, but that’s just because wars always do. No one is feeling they contributed much to it.

I want to suggest an alternative interpretation. Let me lay out three initial propositions here:

  1. Odd though it may seem, the ruling classes live in fear of us. They appear to still be haunted by the possibility that, if average Americans really get wind of what they’re up to, they might all end up hanging from trees. I know it seems implausible but it’s hard to come up with any other explanation for the way they go into panic mode the moment there is any sign of mass mobilization, and especially mass direct action, and usually try to distract attention by starting some kind of war.

  2. In a way this panic is justified. Mass direct action — especially when organized on democratic lines — is incredibly effective. Over the last thirty years in America, there have been only two instances of mass action of this sort: the anti-nuclear movement in the late ‘70s, and the so called “anti-globalization” movement from roughly 1999-2001. In each case, the movement’s main political goals were reached far more quickly than almost anyone involved imagined possible.

  3. The real problem such movements face is that they always get taken by surprise by the speed of their initial success. We are never prepared for victory. It throws us into confusion. We start fighting each other. The ratcheting of repression and appeals to nationalism that inevitably accompanies some new round of war mobilization then plays into the hands of authoritarians on every side of the political spectrum. As a result, by the time the full impact of our initial victory becomes clear, we’re usually too busy feeling like failures to even notice it.

Let me take the two most prominent examples case by case:

[...]

-------------------------

oAnth:

this entry is part of the OccupyWallStreet compilation 2011-09/10, here.

02mydafsoup-01

August 31 2011

Why the finance world should care about big data and data science

Finance experts already understand that data has value. It's the lifeblood of their industry, after all. But as O'Reilly director of market research Roger Magoulas notes in the following interview, some in the financial domain may not grasp all that data has to offer. Data science and big data have led to an expansion of data types, Magoulas says, and the associated influx of information could very well shape investment strategies and create new businesses.

How does big data apply to the financial world?

Roger Magoulas: There are two flavors of it. One is analyzing things like your investments, econometrics, trading activity, and longer-term data analysis. That's clearly part and parcel of the finance business, and people in the space already have great familiarity with this side of data.

The second flavor is the integrated approach to data in all facets of how organizations do business. This involves understanding customers, understanding competitors, understanding behavior, taking advantage of the world of sensors, and using a computational and quantitative mindset to make sense of a very confusing world.

Is there a disconnect between the finance world and terms like "data science" and "big data"?

Roger Magoulas: Everyone is struggling with the semantics, so finance isn't worse off than others. They're actually making an effort to understand it. Adding to the semantic confusion, the terms "data science" and "big data" are sometimes co-opted by organizations trying to show how they embody these attributes. That's fine, but the finance ecosystem has a responsibility to learn as much as it can about these areas. The best way to do that is directly from the data science practitioners: see the tools data scientists use and how they approach their work. That firsthand experience will help finance experts inform their investment strategies and see where the data space is heading.

What's the relationship between data science and business intelligence?

Roger Magoulas: My background is in data warehousing, and the front-end access to the data warehouse was known as "business intelligence" in the '90s. These early data warehouses were mostly constructed out of quantitative data from operational systems — things like order entry and customer service systems. "Business intelligence" tools were used to access the mostly well-understood operational data in the data warehouses.

What's changed is that we've had an explosion of data types. For example, no one was doing analysis on search terms back in the '90s because the tools to do that weren't available. Now, we need new terms to help accommodate what analysts do: natural language processing, machine learning, etc. Moreover, the old business intelligence tools were based on operational things, like how many orders a customer placed. They weren't built to tackle these new tasks.
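To make that contrast concrete, here is a minimal, hypothetical sketch (not from the interview, with made-up queries) of the sort of ad-hoc search-term analysis Magoulas alludes to: reducing a raw query log to term frequencies, something the order-entry-style BI tools of the '90s were never built to do.

    # A minimal, hypothetical sketch: count terms in a raw search log.
    from collections import Counter
    import re

    search_log = [
        "cheap flights to new york",
        "new york hotels",
        "flights new york january",
        "weather new york",
    ]

    terms = Counter()
    for query in search_log:
        terms.update(re.findall(r"[a-z]+", query.lower()))

    # The top terms give a crude signal of what users are actually looking for.
    for term, count in terms.most_common(5):
        print(term, count)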

Will data science and big data incrementally improve existing techniques with new tools? Or are we also talking about the creation of whole new industries?

Roger Magoulas: It's going to do both. The analogy might be to when open source software became widely used. While there were open source business models and companies, the real growth of open source came from companies like Google, Yahoo and Amazon that based their core technologies on the open source stacks. There was this two-headed approach that came out of the adoption of open source.

LinkedIn is an example of this two-headed approach. The company is a social network, but it uses data science tools, techniques and processes to build products that make sense of the social network for LinkedIn's clients. Would LinkedIn exist without data science? I think you can imagine a social network that just helps business people connect with each other, but the real monetization part — the thing that helped them go public — came from LinkedIn using the data they capture to identify and build products.

This interview was edited and condensed.

Strata Conference New York 2011, being held Sept. 22-23, covers the latest and best tools and technologies for data science — from gathering, cleaning, analyzing, and storing data to communicating data intelligence effectively.

Save 30% on registration with the code ORM30

Finance sessions at Strata New York

A number of sessions at the three Strata NY events (Sept. 19-23 in New York City) will examine the intersection of finance and data science. Here's a selection:

Thin and Thick Value in a Transparent Environment

Presenter: Umair Haque, Havas Media Lab, HBR

Big data is a necessary part of a transition to an economy that's not just more efficient and productive, but more efficient and productive in 21st century terms. Yet today, we're hyper-connected, but in a relative data vacuum, which leaves us prone to large-scale crises and "too big to fail" thinking. In this session, Harvard's Umair Haque looks at the future of thin and thick value in a data-driven world.

Next Best Action for MBAs

Presenter: James Kobielus, Forrester Research, Inc.

Leading-edge organizations have implemented "next best action" (NBA) technologies, such as big data analytics, within their multichannel customer relationship management programs. In this session, Forrester senior analyst James Kobielus will provide a vision, case studies, ROI metrics, and guidance for business professionals evaluating applications of NBA in their organizations.

Big Data: The Next Frontier

Presenter: Michael Chui, McKinsey Global Institute

McKinsey's influential big data report has helped define and explain the opportunity created by the torrent of data flowing daily through business. Michael Chui outlines the big picture of data innovation, challenges and competitive advantage.

The New Corporate Intelligence

Presenter: Sean Gourley, Quid

What if corporate strategists could literally draw a map to find growth opportunities? A technique called semantic clustering analysis makes this possible. When applied to technology entities worldwide, this analysis can reveal not only which innovation areas are thick with competition, but also where in the market there are opportunities, or "white spaces," ripe for innovation.

Creating a National Data Utility: Dodd-Frank Financial Reforms

Presenters: Donald F. Donahue, The Depository Trust & Clearing Corporation; Paul Sforza, U.S. Department of the Treasury

Donahue and Sforza will discuss America's first public financial services data utility. This project is being incorporated into the United States' existing information infrastructure to provide consistent, quality data to investors, institutions, and regulators.

Photo: ABOVE by Lyfetime, on Flickr

July 25 2011

Gold, fine wine, art or under the bed: what's the safest place for your cash?

In uncertain economic times, alarmed investors want to minimise their risks. We take a look at the options

For City traders digesting the news via their terminals today, the language had a constant ring.

"Swiss franc leaps as investors seek havens," screamed one story. "Investors poured into perceived safe-haven assets, driving gold to a record high," stated another, before adding: "US government bonds failed to benefit from their usual safe-haven status after the weekend breakdown in talks fuelled investor anxiety over treasury holdings."

It is a small sample but the message seems clear. Spooked investors, unsure of where to put their cash, are looking for places where they can be confident it will not suddenly vanish. So what are the safest investments – and can they really be risk-free?

Swiss franc

Sometimes you can look desirable just because the weaknesses of those around you are so stark. That was true of successors to Robert Green in the England goal, and it is now proving to be the case with the Swiss franc, which has appreciated by about 15% against the dollar this year. While some argue that the Swiss National Bank (SNB) will have to step in, others believe it is powerless to dampen demand.

In a research note, Simon Smith, chief economist at the foreign exchange specialists FXPro, wrote: "Switzerland is not the only country in decent fiscal shape but, apart from the Aussie dollar, it is the most liquid alternative to the US dollar, euro, yen and sterling, all of which have sovereign fiscal issues to varying degrees. Furthermore, the SNB could once again find itself pretty helpless in terms of trying to fight this strength, should this aversion to countries with sustained deficits really take hold.

"Intervention is an option but, despite the increased reserve levels and balance sheet position (around 55% in euros, from over 70% last year), it could well be a futile one."

Gold

Gold is always considered the ultimate safe haven, and it has so far served investors brilliantly during this downturn – rising 16% during 2011. Silver has performed even better, up 30%, despite a crazy period in the spring when margin calls (when an investor has to deposit more cash or securities to cover possible losses) were increased four times in six weeks as regulators feared that speculators were driving the price too high.

The consensus among analysts is that both metals will continue to rise, although there are some famous names who strongly disagree. George Soros, the financier who "broke the Bank of England", is a gold bear. "The ultimate bubble is gold," he said in May. "Gold has shown tendencies to go parabolic, and usually bubbles tend to end in that parabolic rise before the collapse."

Premium bonds

Another perennial safe bet, but are they worth it? According to the Premium Bond calculator on the financial website Moneysavingexpert.com, an investor enjoying average luck and punting £30,000 would expect to win £400 over one year – or a return of 1.3%. That comes tax free, so is equivalent to a 2.2% return at the 40% rate. In normal financial times, that would not appear stellar. But, with interest rates at 0.5%, it suddenly does not look too shabby.
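As a quick sanity check on those figures (a sketch, not financial advice), the arithmetic is simple enough to script:

    # Reproduce the quoted Premium Bond figures: £400 of tax-free winnings on a
    # £30,000 holding, and the gross return a 40% taxpayer would need to match it.
    stake = 30_000
    expected_winnings = 400

    net_return = expected_winnings / stake            # about 1.3%, tax free
    gross_equivalent_40 = net_return / (1 - 0.40)     # about 2.2% before 40% tax

    print(f"Tax-free return: {net_return:.1%}")
    print(f"Gross equivalent at 40% tax: {gross_equivalent_40:.1%}")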

Fine wine and art

Your typical City wine investor delights in boring acquaintances about how he drinks for free by buying two cases of young wine and leaving them to mature, before quaffing one and selling the other to finance his purchase. The brag is almost always nonsense but there are those around who reckon that wine can deliver decent investment returns. The Wine Investment Fund, which asks investors for a minimum £10,000 commitment, says it has paid out annualised returns of upwards of 13% on its portfolios between 2003 and 2006, while 2009 punters are enjoying a vintage year with many showing profits of more than 20%. Even 2010 investors currently have profits upwards of 8%.

However, there are those who believe that this cannot last. "Historically wine has had a good run, but there is a feeling it is getting near the end of the bubble," said one City trader. "Every man and his dog seems to be cropping up as a wine broker. And unlike gold and stocks, if the price starts to fall you might struggle to get out as it's not the most liquid asset, if you forgive the pun."

Yes, very droll. Equally, art is not an easily sold asset but it is also touted as another area for nervous investors. "The idea of contemporary art as a safe haven is a joke," said the entrepreneur Luke Johnson. "It is particularly illiquid, transaction costs are enormous, there is clearly no income and capital growth prospects are at best uncertain."

A number of art funds fell over during the downturn, but the Fine Art Fund Group, which has a base level $250,000 (£153,000) investment, is still around and boasts annualised returns in excess of 25% in its two main funds.

Still, in a world where everybody talks up their own book, it may be worth noting a quote frequently (but dubiously) attributed to Pablo Picasso: "I'm a joker who has understood his epoch and has extracted all he possibly could from the stupidity, greed and vanity of his contemporaries." Maybe not a screaming buy, then.

High-yielding stocks

Can high-yielding equities suddenly be a safe haven? UK shareholders have received their largest dividend payouts since the collapse of Lehman Brothers in 2008, with companies returning £19.1bn to shareholders in the three months to July – a 27% increase on the same period last year, according to Capita Registrars.

However, investing for the income might still put your capital at risk. David Jones, chief market strategist at IG Index, said: "The stock market has been going up for two and a half years and might be fully valued. You may get the dividend, but possibly not the capital appreciation."

The mattress

If you invested in the stock market 11 years ago, you are still waiting for a return. And with interest rates at 0.5% for more than two years, leaving your cash in the bank has not proved to be a massively profitable option. Sticking your funds under the bed might be one approach and it is similar to one adopted by many companies, which are now reluctant to lend their spare funds in the wholesale money markets.

Louise Cooper, market analyst at BGC Partners, said: "For some risk-averse companies, it may be better just to keep the cash inside the company and earn nothing on it, rather than lend it out for a minimal return, with degrees of risk currently being replaced with fear of the ultimate risk."

So should private investors follow suit and simply stuff their wads under the mattress? "It's an option," admitted one frustrated City analyst. A word of caution, though: if the house goes up in smoke, the insurance will only pay out on £500 or so of burnt notes.




July 05 2011

Cy Twombly - an appreciation: Paintings about sex and death

He painted supremely ambitious and convincing epics of charismatic colour and vertigo-inducing space

Cy Twombly's paintings are today on view at Dulwich Picture Gallery in south London, cheek by jowl with works by the 17th century master Nicolas Poussin, and a stone's throw from paintings by Rubens and Rembrandt. It is a company in which he manifestly belongs.

In an age when some said painting was finished, he proved otherwise. His ambitious and convincing epics of charismatic colour and vertigo-inducing space do what painting has always done, and tell stories of sex, death, history and the gods.

Here is an artist who can teach you to read. Few of us read as Twombly did, steeping himself in Greek, Latin and English verse, and teasing the beholder to follow up enigmatic quotations scrawled in a languid stain on his sighs of paintings.

At Dulwich is a painting, Hero and Leandro (for Christopher Marlowe), that is a white misty spume of oceanic spray assailed by a bloody smear of red. Blood in water, it seemed to me. Only later did I read Marlowe's poem Hero and Leander that begins: "On Hellespont, guilty of true love's blood..."

Twombly came of age in the America of Jackson Pollock and the Abstract Expressionists. It was surely, in part, a sense that imperial New York's historical double is ancient Rome that made him emigrate to Italy.

What he found there was low life and sex in a landscape of ruins: his way of responding to the dolce vita was to turn the arabesques of Pollock's style into outbursts of graffiti. In his paintings the myths of the gods found in Roman frescoes are retold with obscene pink smears for buttocks and breasts. Out of this comes a deeply romantic art of colour and time and place that brutally breathes new life into the mythologies of Greece and Rome.

Above all, he came from America's south; when born in 1928 the civil war and the (albeit deserved) destruction of southern pride was a living memory for some in his native Virginia. Classical architecture has a history there going back to Thomas Jefferson; and no southerner can fail to see history as a melancholic process. He found in the Mediterranean a world even more crumbling with ruins and memories, where it is still possible to imagine the sea stained with the blood of old battles. He may have seemed apolitical, yet shortly before 9/11 he unveiled paintings of the sea battle of Lepanto, the traumatic 16th century conflict between Christians and Muslims.

While Twombly was alive and working – and his last paintings of flowers were ripely beautiful – it was possible to see a connection between the art of today and the noble legacy of Greece and Rome as it has been perpetuated by artists such as Raphael and Picasso. His death really hurts, it leaves a black hole. A link has been cut, a lifeline lost. Some artists fade from memory when they die. Twombly will grow in stature. He will be mourned by all who truly love painting. The great god Pan is dead, as a voice was heard to cry by sailors in the age of the Roman emperor Augustus.




May 12 2011

Re-engineering the data stack for speed

Big data creates a number of storage and processing challenges for developers — efficiency, complexity, cost, among others. London-based data storage startup Acunu is tackling these issues by re-engineering the data stack and taking a new approach to disk storage.

In the following interview, Acunu CEO Tim Moreton discusses the new techniques and how they might benefit developers.

Why do we need to re-engineer the data stack?

Tim Moreton: New workloads mean we must collect, store and serve large volumes of data quickly and cheaply. This poses two challenges. The first is a distributed systems challenge: How do you scale a database across many cheap commodity machines, and deal with replication, nodes failing, etc.? There are now many tools that provide a good answer to this — Apache Cassandra is one. Then, the second challenge is once you've decided on the node in the cluster where you're going to read or write some data, how do you do that efficiently? That's the challenge we're trying to solve.

Most distributed databases see it as outside their domain to solve this problem: they support pluggable storage backends, and often use embedded tools like Berkeley DB. Cassandra and HBase go further and implement their own storage engines based on Google BigTable — these amount to file systems that run in userspace as part of their Java codebases.

The problem is that underneath any of these sits a storage stack that hasn't changed much over 20 years. The workloads look different from 20 years ago, and the hardware looks very different. So, we built the Acunu Storage Core, an open-source Linux kernel module that contains optimizations and data structures that let you make better use of the commodity hardware that you already have.

It offers a new storage interface, where keys have any number of dimensions, and values can be very small or very large. Whole ranges can be queried, and large values streamed in and out. It's designed to be just general-purpose enough to model simple key-value stores, BigTable data models like Cassandra's, Redis-style data structures, graphs, and others.
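The interview doesn't spell out the API, so the following is only a hypothetical sketch of the kind of interface described: multi-dimensional keys, values of arbitrary size, and range queries over a key prefix. An ordinary in-memory dict stands in for the kernel-level store.

    # Hypothetical toy store, purely for illustration (not Acunu's actual API).
    class ToyDimensionalStore:
        def __init__(self):
            self._data = {}  # maps key tuples -> bytes

        def put(self, key: tuple, value: bytes) -> None:
            self._data[key] = value

        def get(self, key: tuple) -> bytes:
            return self._data[key]

        def range(self, prefix: tuple):
            """Yield (key, value) pairs whose key starts with `prefix`, in key order."""
            for key in sorted(self._data):
                if key[:len(prefix)] == prefix:
                    yield key, self._data[key]

    store = ToyDimensionalStore()
    store.put(("users", "alice", "2011-05-12"), b"login")
    store.put(("users", "alice", "2011-05-13"), b"logout")
    store.put(("users", "bob", "2011-05-12"), b"login")

    # Range query over a whole key prefix.
    for key, value in store.range(("users", "alice")):
        print(key, value)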

OSCON Data 2011, being held July 25-27 in Portland, Ore., is a gathering for developers who are hands-on, doing the systems work and evolving architectures and tools to manage data. (This event is co-located with OSCON.)

Save 20% on registration with the code OS11RAD


Why would big data stores need versioning?

Tim Moreton: There are many possible reasons, but we're focusing on two. The first is whole-cluster backup. Service outages like Amazon's, and Google having to restore some Gmail data from tape, remind us that however different our datasets may be, backup can still be pretty important. Acunu takes snapshots at intervals across a whole cluster and you can copy these "checkpoints" off the cluster with little impact on your cluster's performance. Or, if you mess something up, you can roll back a Cassandra ColumnFamily to a previous point in time.

Speeding up your dev/test cycle is the second reason for versioning. Say you have a Cassandra application serving real users. If you want to develop a new feature in your app that changes what data you store or how you use it, how do you know it's going to work? Most people have separate test clusters and craft test data; others experiment to see if it works on a small portion of their users. Our versioning lets you take a clone of your production ColumnFamily and give it to a developer or automated test run. We're working on making sure these clones are entirely isolated from the production version so whatever you do to it, you won't affect your real users. This lets you try out new code on the whole dataset. When you're confident your code works, you can throw the clone away. This speeds up the dev cycle and reduces the risks of putting new code into production.
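Again as a hypothetical sketch rather than Acunu's actual implementation, the clone semantics described here boil down to a test copy that starts from the production data but whose writes never reach production:

    # Toy illustration of the isolation guarantee; a real system would use
    # copy-on-write snapshots rather than a deep copy.
    import copy

    class ToyColumnFamily:
        def __init__(self, rows=None):
            self._rows = rows if rows is not None else {}

        def write(self, key, value):
            self._rows[key] = value

        def read(self, key):
            return self._rows.get(key)

        def clone(self):
            return ToyColumnFamily(copy.deepcopy(self._rows))

    production = ToyColumnFamily()
    production.write("user:1", {"plan": "free"})

    test = production.clone()              # hand this to a developer or test run
    test.write("user:1", {"plan": "pro"})  # experiment freely on the clone

    assert production.read("user:1") == {"plan": "free"}  # production unaffected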

What kinds of opportunities do you see this speed boost creating?

Tim Moreton: The decisions around what data gets collected and analyzed are often economic. Cassandra and Hadoop help to make new data problems tractable, but we can do more.

In concrete terms, if you have a Cassandra cluster, and you're continuously collecting lots of log entries or sensor data, and you want to do real-time analytics on that, then our benchmarking shows that Acunu delivers those results up to 50 times faster than vanilla Cassandra. That means you can process 50 times the amount of data, or work at greatly increased detail, or do the same work while buying and managing much less hardware. And this is comparing Acunu against Cassandra, which is in our view the best-of-breed datastore for these types of workloads.

Do you plan to implement speedups for other database systems?

Tim Moreton: Absolutely. Although the first release focuses on Cassandra and an S3-compatible store, we have already ported Voldemort and memcached. The Acunu Storage Core and its language bindings will be open source, and we are actively working with developers on several other databases. Cassandra already gives us good support for a lot of the Hadoop framework. HBase is on the cards, but it's a trickier architectural fit since it sits above HDFS.

You'll be able to interoperate between these various databases. For example, if you have an application that uses memcached, you can read and write the same data that you access with Cassandra — perhaps ingesting it with Flume, then processing it with Cassandra's Hadoop or Pig integrations. We plan to let people use the right tools and interfaces for the job, but without having to move or transform data between clusters.

This interview was edited and condensed.

Related:

New Google Analytics - Overview Reports Overview

This is part of our series of posts highlighting the new Google Analytics. The new version of Google Analytics is currently available in beta to all Analytics users. And follow Google Analytics on Twitter for the latest updates.

This week we’re going a bit meta with an overview of the new Overview reports in the new Google Analytics. Overview reports were part of the old version of Analytics, of course, but we’ve made some changes to help your analysis.

Anatomy of the Overview Report
Each overview report consists of three sections. There's a timeline graph, some aggregate metrics, and a set of reports.



What's inside each of these sections depends on which report you're looking at. For example, the Visitor Overview shows a graph of visits and metrics like New vs. Returning visitors, while the Content Overview shows metrics like pageviews and average time on page.

The Graph
We’ve made a few changes to the graphs in the new Google Analytics, and we'll share them here. You can now make adjustments to the graphs you see in Google Analytics from the buttons on the top right of the graph:
  • Switch a graph between Line Chart and Motion Chart
  • Graph different metrics: Select from the dropdown or the scorecard
  • Compare two metrics: Graph an additional metric for comparison

  • Graph By: Change the graph between Monthly, Weekly, Daily, and even Hourly views for some reports (see the sketch below)
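Setting the Analytics UI aside, the "Graph By" rollups amount to regrouping the same series at a coarser interval. A minimal sketch, assuming you have exported a daily visits series (pandas is used here purely for illustration):

    # Hypothetical daily visits series; resample regroups it by week or month.
    import pandas as pd

    daily = pd.Series(
        [120, 135, 128, 150, 170, 90, 80, 140, 160, 155],
        index=pd.date_range("2011-05-01", periods=10, freq="D"),
        name="visits",
    )

    weekly = daily.resample("W").sum()    # the "Weekly" view
    monthly = daily.resample("MS").sum()  # the "Monthly" view

    print(weekly)
    print(monthly)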


Reports
The bottom section of an overview report lets you look through a subset of the reports available in that section. You can flip through these reports to see where you want to start your analysis. In the Traffic Sources Overview, we can start by looking at a report of Keywords.



From here we can go view the full report or look at another report, like Referral Sources:



Intelligence Overview
Google Analytics Intelligence automatically searches your website traffic to look for anomalies. When it finds something that's out of the ordinary, it surfaces this as an alert. You can also set up your own alerts by defining custom alerts.
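Google doesn't document how Intelligence decides what counts as an anomaly, but a toy sketch of the general idea is to flag any day whose traffic deviates sharply from the recent norm:

    # A toy anomaly check (not Google's algorithm): flag the latest day if it
    # sits more than three standard deviations from the recent baseline.
    from statistics import mean, stdev

    visits = [980, 1010, 995, 1020, 1005, 990, 1740]   # last value is the outlier
    baseline, latest = visits[:-1], visits[-1]

    mu, sigma = mean(baseline), stdev(baseline)
    z = (latest - mu) / sigma

    if abs(z) > 3:
        print(f"Alert: today's visits ({latest}) are {z:.1f} sigma from normal")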

Now you can feel like the president of the principality of Analytica with your very own Intelligence Overview report.



The Intelligence Overview report shows you all of your automatic alerts (daily, weekly, and monthly) at a glance. From the Intelligence Overview, you can click on Details to see a graph of the alert and go directly into the GA report. You can also add or review an annotation right from the pop-up graph.


I hope you enjoyed this overview of Overview Reports. Please continue to send us feedback on the new Google Analytics. Stay tuned for next week's installment in the New Google Analytics series.

Posted by Trevor Claiborne, Google Analytics Team
Reposted from darinrmcclure

May 04 2011

Turner prize shortlist – the expert view

A critical look at the work of the four candidates: two very different types of painter, a video artist and a sculptor

All of the artists here could easily have been included seamlessly in any Turner shortlist of the past decade. Is this George Shaw's moment? Somewhere, on an English housing estate, it is always a George Shaw moment – a dull Sunday or a walk-the-bloody-dog empty afternoon. His paintings of 1960s estates and hinterlands are defiantly local and prosaic, but probably look exotic and poetic to viewers from abroad. And there is something Larkinesque – as well as Hancockesque – in his work. Perversely anachronistic, at his best he achieves a kind of universality. The tedium in Shaw's dutiful technique matches the places he depicts. To describe this as minor art also catches the mood exactly.

Karla Black's work is painting by other means. Her work is all about surfaces and materials – scrumpled polythene sheet, powdery makeup, glistening conditioners, drools of body lotion. It's as if she started making herself up for a Friday night but found herself painting instead. What she does manages to be at once tacky and beguiling, oddly pretty and pretty horrible; this is to do with the physical properties of her everyday materials and their associations – to the cosmetic and to all that faecal paint the abstract expressionists once flung about. It's all makeup and make-believe. Both a parody of painting and a homage to it. Black's work frequently makes me laugh, nervously.

Bodies also appear and disappear in Hilary Lloyd's videos, films and slide-show presentations. Lloyd is as much into the hardware of projectors, screens on shiny metal poles, the techno-dreck of wires and boxes, as the images of cranes and bridges, motorbikes and bodies they project.

All this technology will soon look dated. Lloyd's art frequently bores me, but I keep thinking I'm not clever enough, not hard-wired for it. I'm always aware of a dry little academic commentary starting up in my head. Then another thought intrudes: can I stop watching yet?

Martin Boyce's sculptures and installations often relate to utopian, modernist design. Lots of artists are currently mining this territory (even Shaw's work makes reference to it), and it is beginning to feel a bit of a cliche. But this is the world whose legacy we live with. Angular neon-tube trees that look like they're doing calisthenics, drifts of fallen leaves, a hosepipe snaking through a metal grille – all have their place in his work.

If Black's work looks like painting but somehow isn't, Boyce's looks like sculpture, but is just as much a sort of mangled décor. That's what a lot of art is now, for better or for worse. Shaw or Black to win!




April 27 2011

02mydafsoup-01

[...]

In a recent NYT column, David Brooks reports the mood of the large majority of Americans – a majority as large as the majority of Russians who share that mood. He does it under a saying-it-all title: "The Big Disconnect":

The current arrangements are stagnant but also fragile. American politics is like a boxing match atop a platform. Once you’re on the platform, everything looks normal. But when you step back, you see that the beams and pillars supporting the platform are cracking and rotting.

This cracking and rotting is originally caused by a series of structural problems that transcend any economic cycle: There are structural problems in the economy as growth slows and middle-class incomes stagnate. There are structural problems in the welfare state as baby boomers spend lavishly on themselves and impose horrendous costs on future generations. There are structural problems in energy markets as the rise of China and chronic instability in the Middle East leads to volatile gas prices. There are structural problems with immigration policy and tax policy and on and on.

[..]

On Dysfunctionality of the Global Elites | Zygmunt Bauman - Social Europe Journal - 2011-04-27

April 19 2011

Data and a sense of self

Data is gathered by outsiders every time we buy a product, search the Internet, play with our smartphones, stream a movie on Netflix, go to the doctor ... the list is endless. What if we could tap into all that personal data for our own use? This, says Gary Wolf (@agaricus), contributing editor at Wired magazine, is what the Quantified Self (QS) is all about.

QS, which started as a collaboration between Wolf and Kevin Kelly, has grown to include a multitude of forward thinkers who work on QS projects and gather for "Show & Tell" sessions to discuss and present ideas.

Likening QS to the personal computer in the 1980s, Wolf explains in the following interview how the concept is reaching the mainstream and what needs to change for it to become ubiquitous.


How did Quantified Self get started?

Gary Wolf: We started the Quantified Self as a way to investigate some things coming out of the tech scene that could affect our lives in a big way.

Let me tell you how the Quantified Self looks to me, with the caveat that although I'm the co-founder and play an active role, it has grown far beyond the point where any one person's view can be accurate and complete. We are a collaboration among users and tool-makers of self-tracking systems, devoted to exploring the personal meaning of personal data — that is, self-knowledge through numbers.

Numbers play a key role in science and management. But we tend to think of data as a tool that others — advertisers, marketers, academics, and bureaucrats — use to understand or manipulate us. We're interested in how the new tools of tracking and data analysis can give us knowledge about ourselves.

I see a parallel to what happened with computing in the '80s. Computers were understood as tools of management. A few people saw things differently: they argued that computers were for personal expression and communication. That notion seemed very strange — why use a computer to connect with another person when you could call them on the phone or talk to them face to face? But it turned out that the personal uses of computers were not just an important use, but the most important use.

Our collaboration in exploring the personal use of personal data takes several forms. There are open "Show & Tell" meetings in more than 20 cities around the world — Helsinki is our latest addition. We also have a blog, and we recently received some generous funding from the Robert Wood Johnson Foundation to produce an online user's guide to self-tracking tools. The biggest thing in our immediate future is our first Quantified Self conference. This will be a relatively small meeting at the end of May, in which users and tool-makers from around the world will gather to share practices, methods, and questions.

How do you see QS making its way into the mainstream?

Gary Wolf: QS is becoming mainstream so quickly that it is taking our breath away. This has been a funny experience because Kevin Kelly and I have learned through experience that the adoption of new cultural practices associated with technology usually takes longer than our intuitions suggest.

Although we both travel and talk to people, the view of the future from inside the tech culture tends to be foreshortened. In this case, though, three forces are driving QS from outside. The first two of these are obvious: fitness trends and the health care crisis. The third is a bit more obscure, but important: the rise of big institutional systems to track individual behavior. We know we are being tracked, and that others are gathering more information about us than we have about ourselves. So, part of what is happening in the world of QS is a response to this sense that powerful tools of understanding are available. We want access to them for our own purposes.

What components — hardware, usability, technological advancements — need to improve before QS will be widely adopted?

Gary Wolf: The most important thing missing is the widespread understanding that personal data has personal meaning. That your data is not for your boss, your teacher, or your doctor — it's for you. This is a cultural shift. Again, it is analogous to the shift in how we came to understand computers. Everybody who makes a useful QS invention is contributing to this shift.

When it comes to specific inventions, there are many opportunities for making things more useful at this stage. We are seeing new products and companies focused on data collection through sensors and interfaces; on data analysis through aggregation and multi-sensor inputs; on data-driven social/community, mainly through web services; and on scientific discovery through platforms for experimentation. Many participants in QS do more than one of these things. Of course, there is some imitation and mutual influence — sometimes I feel like I'm seeing the same slide decks, just shuffled in a different order. But this is always going to happen in a lively community of inventors and advanced users.

How do you think QS will affect health care?

Gary Wolf: We know that health care — or let me say "the health care industry" — is a train wreck. The problems are well known: immense costs, questionable efficacy, brutal inequity, and perverse incentives.

We see QS as part of the new system that will emerge from this wreckage. Every institution that has a stake in the current system — medical professionals, hospital corporations, pharmaceutical companies — will be picking their way through the ruined landscape of health care, repurposing its valuable components, and hooking them up into the new practices that emerge. We could get all technical and talk about "capitation" (this means payment of a set fee "per head" to health care companies, rather than fee-for-service). Capitation will change care and make it very important to take individual knowledge, desire, and behavior more seriously.

But even without getting into the details of how the health care industry works, we can see that we are not being well served by conventional medicine. You take a drug, for instance, but have no good data about whether it works and what side effects it causes. Health care is going to undergo the most important critical advance in perhaps a century when QS data becomes the yardstick by which its success is judged.

In what other ways do you see QS affecting our daily lives in the future?

Gary Wolf: I wouldn't pretend to be able to give a complete summary or prediction, so let me just throw out a few interesting and colorful examples I've seen. These are all real projects presented at various QS Show & Tell meetings, not prototypes:

  • Facial tracking to improve happiness.
  • Cognition tracking to evaluate effects of simple dietary changes on brain function.
  • Food composition tracking to determine ideal protein/carb meal ratios for athletic performance.
  • Concentration tracking to determine effects of coffee on productivity.
  • Proximity tracking to assist in evaluation of mood swings.
  • Mapping of asthma incidents and correlation with humidity, pollen count, and temperature.
  • Energy use tracking to find opportunities for savings.
  • Gas mileage tracking to figure out if driving style matters.

Again, I want to emphasize that these are not prototype or lab experiments. These are individual users who are exploring an aspect of their own behavior and performance, often for deeply personal reasons.
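As a flavour of how simple some of these analyses can be, here is a toy sketch (with made-up numbers) along the lines of the asthma example above: a hand-rolled Pearson correlation between a daily symptom count and a daily pollen reading.

    # Illustrative self-tracking analysis; the data below is invented.
    from math import sqrt

    symptoms = [0, 1, 3, 2, 4, 1, 0]       # daily symptom count
    pollen   = [10, 20, 55, 40, 70, 25, 12]  # daily pollen reading

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sqrt(sum((x - mx) ** 2 for x in xs))
        sy = sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    print(f"symptom/pollen correlation: {pearson(symptoms, pollen):.2f}")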

This interview was edited and condensed.


