
December 28 2013

Welcome to the dead end: the ancillary copyright for press publishers

What consequences will the ancillary copyright for press publishers (Leistungsschutzrecht) have for publishers, IT service providers, users and journalists? The answers to frequently asked questions show: the new right will bring hardly any benefit, but cause a great deal of damage.

After a long debate, the German legislature passed a new ancillary copyright for press publishers in 2013. In the end, the publishers won a partial victory against massive resistance from business, academia and civil society. The law entered into force on August 1. There is great uncertainty about what it means, who is affected and what those affected can or should do now. The text of the law contains a host of vague terms and plenty of inconsistencies.

What is the ancillary copyright?

The ancillary copyright is set out in a new section of the German Copyright Act, in sections 87f to 87h. Anyone who produces a press product is henceforth a press publisher entitled to the right. Press products in this sense are not only newspapers, magazines or publishers' websites. Operators of journalistic blogs or websites can also be publishers under the new law and thus hold ancillary copyrights. Whether they do depends on whether the publication in question meets the statutory definition of a press product.

According to the explanatory memorandum (PDF), these are publications in which journalistic contributions appear regularly and which are "to be regarded as predominantly typical of a publishing house". Mere news compilations – probably meaning press reviews or plain link lists – are not press products, nor are information sources that serve purely advertising purposes or customer retention. This leaves a grey area that will have to be clarified by the courts in lengthy proceedings. Above all, the term "typical of a publishing house" is not explained anywhere in the law.

It should be clear that blogs that publish journalism regularly are press products. Well-known blogs such as netzpolitik.org or stefan-niggemeier.de are therefore likely press products, just like the offerings of Spiegel Online, welt.de or Heise Online. The news sites of public broadcasters such as tagesschau.de or heute.de also fall under this definition. According to the explanatory memorandum, press products can also serve pure entertainment, so the ancillary copyright is likely to apply to sites devoted to political satire as well, such as the Postillon. The precondition is that they resemble publishers' offerings in some way. What exactly that means is unclear. Whether it requires an editorial workflow involving several people, or merely that contributions are published regularly, lies in the eye of the beholder and, in case of doubt, in the hands of the courts.

What to do if you still want to be found?

Whether they like it or not, the holders of ancillary copyrights now face a challenge. If they do nothing, they risk no longer being listed in search engines and news aggregators, or being listed only as bare links. The ancillary copyright requires the providers of such services to obtain rights if they want to display excerpts from a press product (snippets). Whether and under what circumstances the rule applies is unclear. Simply ignoring the right would nevertheless be unwise for providers, because rights holders can take action against unauthorized uses with cease-and-desist letters, injunctions and lawsuits. And not everyone can afford to defend their practices in court.

Google has already reacted. Anyone who wants to be listed in Google News in the future must confirm this with an electronic declaration – even if they were already listed before. In it, the website operator declares that his publications should continue to appear in Google News and that he demands no money for this. Many publishers, including large ones, have already signed up. The few that decided against it have not appeared in Google News since August 1.

This procedure expressly applies only to Google News, not to general web search. There, apparently, everything is to stay as it was: no money for snippets, but you remain listed. How other providers of search engines and news aggregators will behave is so far unclear. Yahoo has abandoned snippets in its news section; it now displays only plain links without preview text.

This shows that there are hardly any attractive alternatives to Google's approach. A search provider can throw journalistic offerings out of its index wholesale; try to work around the ancillary copyright; wait to be sued or to be asked by a future press publishers' collecting society to sign license agreements and pay. Or, like Yahoo, it can simply stop displaying preview text altogether. Whether such a search function is still of any use to the searcher (and thus worthwhile for the provider) is more than questionable.

Screenshot: links without preview text on Yahoo News.

A collecting society for press publishers as a cure-all?

The publishers' associations plan to place their rights in a collecting society. Similar to GEMA in the music sector, service providers could acquire the necessary rights there. The precondition is that all press publishers are members of the collecting society, i.e. conclude administration agreements with it. That this would remove all uncertainties is unlikely, however. The ancillary copyright can, but need not, be administered by a collecting society. Even if a specialized news-aggregation provider has concluded a license agreement with the collecting society, it still risks being warned and sued by press publishers who are not members. A practicable one-stop-shop solution is not achieved.

Disclaimers and declarations on the website as a solution?

This harms above all the many content providers who suddenly hold ancillary copyrights against their will. Most operators of journalistic blogs or publishers' websites will, after all, want their offerings to continue to be indexed, found and displayed with snippets in search engines. Several large sites have already declared as much, along the lines of: "We would like to continue to be indexed and listed, including with snippets." Such notices have the advantage of being a clear statement. Their drawback is that the search engines' software bots cannot read them. Search engine providers can hardly inspect thousands of websites by hand for such notices and decide who gets listed and how. Technical solutions would be needed, but none are in sight so far.
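To illustrate what such a machine-readable declaration could look like, here is a purely hypothetical sketch: no such standard exists, and the `snippet-license` meta tag and its syntax are invented for this example. A site would embed the tag, and a crawler would parse it before deciding how to list the page:

```javascript
// Hypothetical sketch only: there is no standard for declaring snippet
// permissions. We assume a site embeds a tag such as
//   <meta name="snippet-license" content="allow; max-words=14">
// and a crawler parses that declaration before building its index.

function parseSnippetLicense(html) {
  // Look for the hypothetical "snippet-license" meta tag.
  const match = html.match(
    /<meta\s+name=["']snippet-license["']\s+content=["']([^"']*)["']/i
  );
  if (!match) {
    // No declaration found: under the new law, a cautious crawler
    // would assume "bare link only".
    return { allowed: false, maxWords: 0 };
  }
  const parts = match[1].split(";").map((s) => s.trim());
  const allowed = parts[0] === "allow";
  const words = parts.find((p) => p.startsWith("max-words="));
  return {
    allowed,
    maxWords: allowed && words ? parseInt(words.split("=")[1], 10) : 0,
  };
}

const page =
  '<html><head><meta name="snippet-license" content="allow; max-words=14"></head></html>';
console.log(parseSnippetLicense(page)); // { allowed: true, maxWords: 14 }
```

A convention like this would only help, of course, if search providers agreed to honor it, which is exactly the coordination that is currently missing.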

Primary addressee: search engines

The addressees of the ancillary copyright are, first of all, the providers of search engines. That means the big companies such as Google, Yahoo and Microsoft, but also the countless providers of small search engines. The only limit here is that the provider must act commercially. As elsewhere in copyright law, this criterion is, in case of doubt, to be understood broadly. It will not be necessary that profits are actually made, nor that the provider is a GmbH or another legal entity. Whether individual ad banners suffice for commercial activity will ultimately have to be settled by the courts.

Search functions on one's own website

Website-internal search functions are, according to the explanatory memorandum, exempt. What remains open is whether search functions embedded in one's own website ("powered by Google"), which can search the web at large in addition to one's own content, fall under this exemption. More precisely, the question is whether in such cases the website operator becomes a search engine provider, or whether any claims are limited to the search engine operator (for example Google). There is currently no answer. In case of doubt, it will be advisable for the time being to deactivate such functions.

Vertical search services and social networks as addressees

Besides horizontal search services that index the whole web, vertical services are also meant to be covered by the ancillary copyright: topic-specific search functions, above all news aggregators. The law calls them "services that process information in a similar way [to search engines]". This wording, too, leaves wide room for interpretation. The explanatory memorandum speaks of "systematic access" and of services that "generate their hits in the manner of a search engine". All that says is that only automatically operating search technologies fall under the ancillary copyright. Anyone who selects sources manually, for example by compiling link lists, need not worry about it.

But what about providers of social networks such as Facebook or Twitter? After all, topic-based searches can be carried out in social networks too. Whether these functions are search-engine-like is for the courts to decide; in case of doubt, it is the network operators who will have to fight that out. That an individual Facebook user who posts links with snippets is covered can safely be ruled out: he operates neither a search engine nor a similar service.

Warning: complying with the ancillary copyright is impossible!

Once you know that you operate a search engine or a similar service in this sense, the really difficult questions begin. One thing should be clear: it is currently impossible for a search service to comply with the ancillary copyright, i.e. to obtain rights for displaying snippets. That would require concluding individual contracts with every provider of a website that qualifies as a press product under the law's definition. Quite apart from the fact that there is no central licensing body, this approach is bound to fail because it is impossible to determine which websites fall under the definition of a press product.

The only strategy, and this is what makes the irony of this amendment so plain, is to dodge the ancillary copyright. In other words, the only option left to search service providers is to design their offerings so that they do not fall under the new right.

What can avoidance strategies look like?

One option is to delist wholesale every publication that even remotely resembles a press product. More likely is the attempt to design search results so that the ancillary copyright does not cover them at all. The law exempts "bare links". As everyone knows, however, bare links alone are useless as search results.

Because that is so, the legislature itself created an avoidance mechanism. In a final iteration loop (read: a last-minute rush job) it backed away from protecting every kind of snippet. The law says, in essence: the ancillary copyright does not apply if only the link plus a "brief but expedient" description in the form of individual words or "smallest text excerpts" is displayed. Only those who want to show more run into the dilemma of being unable to behave lawfully at all.

But what is a smallest text excerpt? The Bundestag could not agree on anything more concrete; the explanatory memorandum confines itself to this vague note:

The recommendation is intended to ensure that search engines and aggregators can briefly describe their search results without infringing the rights of rights holders. [...] Individual words or smallest text excerpts, such as headlines, for example 'Bayern beats Schalke', do not fall under the protection of the ancillary copyright. The free, brief but expedient description of the linked content is guaranteed. Search engines and aggregators must have a way of indicating which search result they are linking to.

Some lawyers argue that the eleven to fifteen words of the average snippet in general Google search mark the limit of the statutory avoidance mechanism. Others hold that a snippet is always a "smallest part". And finally there is the view that whether something is a "smallest part" depends on the length of the article being linked to. The permissible length of a snippet would then be a question of the individual case, which in turn would mean the death of efficient, automated search services.
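Under the first reading, a defensive implementation is at least straightforward. The following is a hypothetical sketch that assumes the eleven-to-fifteen-word interpretation prevails (which, as noted, nobody can guarantee) and simply caps every snippet at a fixed word count:

```javascript
// Hypothetical sketch: cap a snippet at a fixed word count, following the
// reading that ~11-15 words (the length of an average Google snippet) mark
// the limit of the statutory avoidance mechanism. Whether this is actually
// lawful is exactly the open question discussed above.

function truncateSnippet(text, maxWords = 14) {
  const words = text.trim().split(/\s+/);
  if (words.length <= maxWords) return text.trim();
  // Keep the first maxWords words and mark the cut.
  return words.slice(0, maxWords).join(" ") + " …";
}

const teaser =
  "The German legislature has passed a new ancillary copyright for press " +
  "publishers after a long and controversial debate about search engines.";
console.log(truncateSnippet(teaser));
```

Note that under the third reading (length relative to the linked article), no such fixed cap would be safe, which is why that interpretation would make automated services impractical.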

Given the legislative history of the statutory avoidance mechanism, I believe the legislature intended to exempt snippets of the currently customary length from the ancillary copyright. At the moment, however, nobody can answer this question with reasonable certainty. The legislature has created a new playing field for lawyers and courts. One can only hope that it will be contested between the big players such as Google and Springer, and not on the backs of the small ones.

What it means for journalists

The ancillary copyright can matter to journalists in two respects. On the one hand, the question is whether they could suffer a disadvantage from it, for example by being prevented from exploiting their articles a second time. On the other hand, it is doubtful whether they will share in any revenue the publishers earn.

The legislature tried to rule out any adverse effect on journalists. For one thing, they are not addressees of the claims, unless, exceptionally, they operate a search engine or news aggregator themselves. At least for their own blogs and websites, they therefore need neither obtain rights nor bother with avoidance strategies.

For another, under the wording of the law the right cannot be "asserted to the detriment of the author [...]". What that is supposed to mean is unclear. It would be a real disadvantage for freelance journalists in particular if a publisher they write for refused to give its consent for Google News: their articles would not be listed there, and they would lose attention and reach. That a journalist could force a publisher to sign up for Google News on these grounds is, however, highly unlikely. More likely, the wording is a relic of an earlier draft that was never removed from the revised version.

Many questions also remain open on the journalists' revenue side. The legislature repeatedly stressed that authors, too, should benefit economically from the ancillary copyright. The law therefore states that authors are to receive "an appropriate share" of the remuneration. This, too, will be splendidly fought over. If things go as predicted, there will be no revenue in the first place. And even if there is, publisher-friendly lawyers already have an answer: in case of doubt, the authors' share will be zero.

Conclusion

Nobody other than the providers of search engines and aggregators is affected by the ancillary copyright. Blogging, linking and quoting can therefore continue without regard to the new rules – within the limits set by copyright law, of course.

Beforehand, almost all independent observers of the ancillary copyright saga agreed: the new right would bring hardly any benefit but cause much damage and legal uncertainty. That is exactly what has happened. True, the legislature let itself be persuaded to shrink the scope of application until it is barely detectable. Considering that the initial proposals would have burdened more or less every working person, the entire German economy and every blogger with the ancillary copyright, that is an achievement of sorts.

Still, the rudiment now in force will produce nothing but collateral damage. Google has already made clear with its measures that it will not be forced to pay for snippets. Other service providers will mostly resort to avoidance strategies, and they will be left alone with their open questions and the legal uncertainty that comes with them.

Some do not know whether and how they will still be findable via search technologies, or how to keep their reach and visibility from being damaged. Others will despair over how to design their services lawfully – outside the ancillary copyright, mind you! They will delist, display only bare links, shut down their services or never enter the German market in the first place. The lose-lose situation I predicted long ago has thus fully materialized.

Till Kreutzer is a lawyer, legal scholar and publicist. He lives in Berlin. He is a partner at the iRights.Lab and a lawyer at iRights.Law. Photo: Jana Pofalla

This text also appeared in the magazine "Das Netz – Jahresrückblick Netzpolitik 2013-2014". You can order the issue for 14.90 EUR from iRights.Media. "Das Netz – Jahresrückblick Netzpolitik 2013-2014" is also available as an e-book, for example from Amazon*, the Apple iBook Store* or Beam (* affiliate link).

November 01 2013

NSA infiltrates Google and Yahoo networks, Adobe DRM, iCloud keychain

In this week's cloud links: the NSA taps internal data links, US services want to make e-mail more secure, new DRM for e-books from Adobe, a lawsuit over streaming revenue, and passwords in iCloud.

NSA said to penetrate Google's and Yahoo's internal networks as well

As first reported by the Washington Post, the NSA, together with the British intelligence service GCHQ, apparently also taps the internal data links of Google and Yahoo. Unlike the already known "PRISM", the program called "Muscular" is said to operate without the companies' knowledge and without a court order. Exactly how the agencies penetrate private networks cannot be said with certainty; the Washington Post lays out possible scenarios in an infographic. In an assessment, security researcher Bruce Schneier argues that Microsoft, Apple, Facebook, Dropbox and other cloud services must be considered compromised in the same way.

US providers want to advance e-mail security

The US services Lavabit and Silent Circle have joined forces in a development alliance that aims to make e-mail more resistant to surveillance. According to a blog post at Silent Circle, the newly founded "Dark Mail Alliance" will apparently not offer a service of its own for now, but will instead develop the protocols and procedures underlying e-mail, drawing among other things on the XMPP protocol widely used in chat programs. Lavabit shut down its service in August rather than hand over private keys to US authorities. Shortly afterwards, "Silent Circle", the company founded by PGP inventor Phil Zimmermann, switched off its e-mail service as well. The companies now hope to bring larger mail providers on board.

Adobe plans new DRM system

As Johannes Haupt reports at lesen.net, Adobe plans to introduce a new version of its DRM system for e-books in the coming months. Adobe's copy protection is the most widely used system for e-books in the Epub format and for PDF files, and is sublicensed to publishers. Adobe calls the new system "uncrackable"; experience suggests it is only a matter of time before any DRM system is cracked. For Adobe's current system, that has already been the case for several years.

Streaming revenue: Swedish artists want to sue labels

Musicians in Sweden have announced that they will take the record companies Universal and Warner Music to court. As musikmarkt.de reports, the Swedish musicians' union wants to win artists a higher share of the revenue from streaming services. In Sweden, these services – above all the locally founded Spotify – account for 70 percent of music-market sales, according to the report. Musicians receive 6 to 10 percent of the revenue, the same as in the traditional recorded-music market; the artists are demanding 50 percent.

Heise: how secure are passwords in iCloud?

With Apple's new operating systems iOS 7 and Mavericks, passwords can also be stored in the company's cloud service. At Heise Security, Jürgen Schmidt examines how secure that is. Against attacks by third parties the system is "actually quite well secured", but its security is "frighteningly poor" once access by or via Apple itself is taken into account. For a precise security assessment, however, Apple would have to disclose technical details, or researchers would have to carry out further analyses.

July 24 2013

The British Prime Minister's crusade against pornography

http://cur.lv/1hceb

David Cameron, the British Prime Minister, believes the presence of pornography on the Internet has gone on long enough. He has recently been leading a campaign to remedy this.

In his view, the major Internet players Google, Bing and Yahoo bear part of the responsibility for how accessible pornography is. The politician has asked them to put in place a filter that is switched on by default, which would force customers to contact their ISP to explicitly request its deactivation.

David Cameron believes it is time for ISPs to show some moral backbone and fight pornography more effectively, particularly child pornography, by blacklisting certain search terms. And should that prove technically impossible, the Prime Minister demands that they put their little geniuses to work.

#pornographie #cameron #anglais #google #bing #yahoo

May 17 2012

Strata Week: Google unveils its Knowledge Graph

Here's what caught my attention in the data space this week.

Google's Knowledge Graph

"Google does the semantic Web," says O'Reilly's Edd Dumbill, "except they call it the Knowledge Graph." That Knowledge Graph is part of an update to search that Google unveiled this week.

"We've always believed that the perfect search engine should understand exactly what you mean and give you back exactly what you want," writes Amit Singhal, Senior VP of Engineering, in the company's official blog post.

That post makes no mention of the semantic web, although as ReadWriteWeb's Jon Mitchell notes, the Knowledge Graph certainly relies on it, following on and developing from Google's acquisition of the semantic database Freebase in 2010.

Mitchell describes the enhanced search features:

"Most of Google users' queries are ambiguous. In the old Google, when you searched for "kings," Google didn't know whether you meant actual monarchs, the hockey team, the basketball team or the TV series, so it did its best to show you web results for all of them.

"In the new Google, with the Knowledge Graph online, a new box will come up. You'll still get the Google results you're used to, including the box scores for the team Google thinks you're looking for, but on the right side, a box called "See results about" will show brief descriptions for the Los Angeles Kings, the Sacramento Kings, and the TV series, Kings. If you need to clarify, click the one you're looking for, and Google will refine your search query for you."

Yahoo's fumbles

The news from Yahoo hasn't been good for a long time now, with the most recent troubles involving the departure of newly appointed CEO Scott Thompson over the weekend and a scathing blog post this week by Gizmodo's Mathew Honan titled "How Yahoo Killed Flickr and Lost the Internet." Ouch.

Over on GigaOm, Derrick Harris wonders if Yahoo "sowed the seeds of its own demise with Hadoop." While Hadoop has long been pointed to as a shining innovation from Yahoo, Harris argues that:

"The big problem for Yahoo is that, increasingly, users and advertisers want to be everywhere on the web but at Yahoo. Maybe that's because everyone else that's benefiting from Hadoop, either directly or indirectly, is able to provide a better experience for consumers and advertisers alike."

De-funding data gathering

The appropriations bill that recently passed the U.S. House of Representatives axes funding for the Economic Census and the American Community Survey. The former gathers data about 25 million businesses and 1,100 industries in the U.S., while the latter collects data from three million American households every year.

Census Bureau director Robert Groves writes that the bill "devastates the nation's statistical information about the status of the economy and the larger society." BusinessWeek chimes in that the end to these surveys "blinds business," noting that businesses rely "heavily on it to do such things as decide where to build new stores, hire new employees, and get valuable insights on consumer spending habits."

Got data news to share?

Feel free to email me.

OSCON 2012 — Join the world's open source pioneers, builders, and innovators July 16-20 in Portland, Oregon. Learn about open development, challenge your assumptions, and fire up your brain.

Save 20% on registration with the code RADAR




May 10 2012

Understanding Mojito

Yahoo's Mojito is a different kind of framework: all JavaScript, but running on both the client and the server. Code can run on the server, or on the client, depending on how the framework is tuned. It shook my web architecture assumptions by moving well beyond the convenience of a single language, taking advantage of that approach to process code where it seems most efficient. Programming this way will make it much easier to bridge the gap between developing code and running it efficiently.

I talked with Yahoo architect fellow and VP Bruno Fernandez-Ruiz (@olympum) about the possibilities Node opened and Mojito exploits.

Highlights from the full video interview include:

  • "The browser loses the chrome." Web applications no longer always look like they've come from the Web. [Discussed at the 02:11 mark]
  • Basic "Hello World" in Mojito. How do you get started? [Discussed at the 05:05 mark]
  • Exposing web services through YQL. Yahoo Query Language lets you work with web services without sweating the details. [Discussed at the 07:56 mark]
  • Manhattan, a closed Platform as a Service. If you want a more complete hosting option for your Mojito applications, take a look. [Discussed at the 10:29 mark]
  • Code should flow among devices. All of these devices speak HTML and JavaScript. Can we help them talk with each other? [Discussed at the 11:50 mark]
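The YQL item above can be made concrete with a small sketch. The public endpoint URL and parameters below reflect the YQL service as it existed at the time and are recalled for illustration, and the example table and `woeid` value are merely representative; the point is that a client just URL-encodes a SQL-like query:

```javascript
// Sketch of how a YQL request URL was assembled. The endpoint and its
// parameters are historical/illustrative (the public YQL service of the
// time); the query table and woeid are example values.

function buildYqlUrl(query) {
  const endpoint = "https://query.yahooapis.com/v1/public/yql";
  // YQL takes the SQL-like query as the "q" parameter, URL-encoded.
  const params = "q=" + encodeURIComponent(query) + "&format=json";
  return endpoint + "?" + params;
}

const url = buildYqlUrl("select * from weather.forecast where woeid = 2487889");
console.log(url);
```

The appeal Fernandez-Ruiz describes is exactly this uniformity: one query syntax in front of many heterogeneous web services.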

You can view the entire conversation in the following video:

Fluent Conference: JavaScript & Beyond — Explore the changing worlds of JavaScript & HTML5 at the O'Reilly Fluent Conference (May 29 - 31 in San Francisco, Calif.).

Save 20% on registration with the code RADAR20




April 13 2012

Publishing News: DoJ lawsuit is great news for Amazon

Here are a few stories from the publishing space that caught my eye this week.

Amazon does a little Snoopy dance

The biggest story this week was the U.S. Department of Justice (DoJ) filing a lawsuit against Apple and publishers Hachette, HarperCollins, Macmillan, Simon & Schuster and Penguin, accusing them of colluding over ebook prices. If you unplugged or dropped off the grid for the past several days, solid roundups and analyses are available from Tim Carmody at Wired and Laura Hazard Owen at PaidContent, and you can read the complaint itself here (PDF).

Right off the bat, three publishers — Hachette, HarperCollins and Simon & Schuster — settled, while Macmillan and Penguin stood their ground. Amazon responded to the situation almost immediately as well:

"This is a big win for Kindle owners, and we look forward to being allowed to lower prices on more Kindle books."

Book publishing analyst Michael Norris told the New York Times: "Amazon must be unbelievably happy today. Had they been puppeteering this whole play, it could not have worked out better for them."

Apple finally responded yesterday. As reported by Peter Kafka at All Things Digital, Apple spokesman Tom Neumayr said:

"The DOJ's accusation of collusion against Apple is simply not true. The launch of the iBookstore in 2010 fostered innovation and competition, breaking Amazon's monopolistic grip on the publishing industry. Since then customers have benefited from eBooks that are more interactive and engaging. Just as we've allowed developers to set prices on the App Store, publishers set prices on the iBookstore."

Much discussion and analysis has ensued in the aftermath — and I'm sure it will continue in the coming days and weeks.

Some argue that even if collusion among the publishers is proven, Apple might walk away squeaky clean. A report at CNET noted why this may be the case:

"One reason lies in the Justice Department's 36-page complaint, which recounts how publishers met over breakfast in a London hotel and dinners at Manhattan's posh Picholine restaurant, which boasts a "Best of Award of Excellence" from Wine Spectator magazine. The key point is that Apple wasn't present."

Bryan Chaffin at the Mac Observer argued that yes, collusion most probably occurred but that it will be a mistake to undo it: "Doing so will clear the way for Amazon to dump books below price, taking ever more share (and power) in the book industry — that is the greater anticompetitive threat."

On the flipside, Mike Cane argued on his xBlog that the suit didn't go far enough and that the DoJ needs to sue Apple again. In a letter sent to all of the Department of Justice attorneys listed in the antitrust suit papers filed, he said:

"The advantage iPhone and iPad owners have in using the iBooks app is that they can browse and purchase eBooks from within that app. It's a seamless customer experience.

By contrast, all eBook apps from competing eBook stores — such as those from Amazon, Kobo, Barnes & Noble, and others — cannot offer an identical shopping experience. They are disallowed by Apple. Apple has demanded from each of its iBookstore competitors a 30% cut of any purchases made using Apple APIs for what is called 'in-app purchasing.'

To me, this is every bit as much restraint of trade as the collusive price-fixing that made the Department bring Apple and its co-conspirators before the court for remedy."

Individual U.S. states have weighed in as well: 16 state attorneys general have filed suit, alleging that agency pricing cost consumers $100 million.

Earlier this week, before any suits were filed, at least two of the Big Six publishers refused to sign new contracts with Amazon. It will be interesting to see how this all plays out and whether publishers are spurred to do more to keep Amazon from monopolizing the market outright, such as dropping DRM.

The future of publishing has a busy schedule.
Stay up to date with Tools of Change for Publishing events, publications, research and resources. Visit us at oreilly.com/toc.

This chapter brought to you by ...

Just about a year ago, Amazon introduced an ad-supported Kindle at a reduced cost in exchange for the consumer enduring ads on the home and screen saver pages. Now, Yahoo has filed patent applications that indicate a plan to bring those ads directly into ebook content. A report at the BBC explained:

"The filings suggest that users could be offered titles at a variety of prices depending on the ads' prominence. They add that the products shown could be determined by the type of book being read, or even the contents of a specific chapter, phrase or word ... It suggests users could be offered ads as hyperlinks based within the book's text, in-laid text or even 'dynamic content' such as video. Another idea suggests boxes at the bottom of a page could trail later chapters or quotes saying 'brought to you by Company A.'"

From a revenue perspective, ads in ebook content make all kinds of sense. From a reader perspective, I just hope there's always a price point for those of us who prefer to do our reading sans corporate sponsorship.

B&N one-ups Amazon

A close friend recently told me a story highlighting an issue with his Kindle: While reading in the car on a road trip, he had to give up his Kindle and resort to the Kindle app on his iPad to keep reading when it got dark. Maybe he should have waited and bought a Nook.

B&N introduced the Nook Simple Touch with GlowLight this week — the first e-ink device to ship with a built-in light. Alexandra Chang described the device in a post for Wired:

"The GlowLight resembles B&N's flagship Nook Simple Touch — same 6-inch touchscreen display, same size and includes the same internal parts. The Nook Simple Touch with GlowLight, however, is slightly lighter at just 6.95 ounces, compared to the Nook Simple Touch's 7.48 ounces ... The GlowLight technology consists of LED lights located at the top of the Nook's screen and an anti-glare screen protector. The light is evenly scattered across the screen and is adjustable via the menu."

The timing of the release is interesting, as rumors surfaced last week that Amazon was readying a front-lit display for its Kindle device.

Seal: US-DeptOfJustice-Seal, on Wikimedia Commons



April 05 2012

Four short links: 5 April 2012

  1. Who Else Uses Masonry Style? (Quora) -- a list of sites using the multi-column effect provided by the jQuery Masonry plugin.
  2. Will Hachette Be First Big 6 Publisher To Drop DRM? (Paid Content) -- DRM “doesn’t stop anyone from pirating,” Hachette SVP digital Thomas said in a publishing panel at Copyright Clearance Center’s OnCopyright 2012. “It just makes it more difficult, and anyone who wants a free copy of any of our books can go online now and get one.” (via Tim O'Reilly)
  3. Javascript Mental Models (Alex Russell) -- What we’re witnessing here isn’t “right” or “wrong”-ness. It’s entirely conflicting world views that wind up in tension.
  4. Mojito (Github) -- BSD-licensed Mojito is the JavaScript library implementing Cocktails, a JavaScript-based on-line/off-line, multi-device, hosted application platform. This is JavaScript on the server and/or the client.
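The masonry effect from the first link is easy to demystify: items of varying heights are packed into whichever column is currently shortest, so the columns stay roughly balanced. A minimal sketch of that placement rule (illustrative JavaScript, not the jQuery plugin's actual code):

```javascript
// Minimal sketch of masonry-style placement: each item goes into the
// currently shortest column, keeping column heights balanced.
function masonryLayout(itemHeights, columnCount) {
  const columns = Array.from({ length: columnCount }, () => ({ height: 0, items: [] }));
  for (const h of itemHeights) {
    // find the shortest column so far (ties go to the leftmost)
    let target = columns[0];
    for (const col of columns) {
      if (col.height < target.height) target = col;
    }
    target.items.push(h);
    target.height += h;
  }
  return columns;
}
```

Calling `masonryLayout([120, 80, 200, 60, 150], 3)` distributes the five items across three columns with heights 270, 140, and 200.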

March 26 2012

The vision behind Yahoo's Cocktails platform and Livestand app

This post is part of the TOC podcast series. You can also subscribe to the free TOC podcast through iTunes.


New content distribution platforms are springing up all around us. Most are from startups struggling to gain market visibility. When a long-term player like Yahoo enters the market, though, it's important to give them thorough consideration. Late last year, Yahoo launched a multi-pronged platform called Cocktails, which they described as "a mix of HTML5, Node.JS, CSS3, JavaScript and a lot of ingenious, creative, mind-bending tricks from Yahoo's engineers." In this TOC podcast interview, Bruno Fernandez-Ruiz (@olympum), architect fellow and VP at Yahoo, shares the thinking that went into Cocktails as well as their Livestand app.

Key points from the full video interview (below) include:

  • Cocktails & discoverability — Recommendations and delivering better, highly targeted content are keys to the Cocktails platform. [Discussed at the 1:20 mark.]
  • Livestand was built with Cocktails — What you see looks like a typical news app, but below the surface are loads of transformation and optimization tricks done via Cocktails that result in a terrific user experience. [Discussed at 2:02.]
  • We live in a "partially connected" world — One of the mistakes made by mobile app developers is the assumption that there's always a live connection to the web. Yahoo recognizes that's not always the case and built Cocktails with this issue in mind. [Discussed at 3:00.]
  • HTML5 as an alternative to native apps — Because Cocktails is built upon HTML5, publishers can experiment with it without feeling as locked into a platform as they would with native apps. [Discussed at 6:49.]
  • More than a presentation model — Livestand also lets publishers leverage Yahoo's advertising and personalization systems. [Discussed at 8:40.]
  • Open source will play a critical role — Mojito, a component of Cocktails, will be open sourced soon. The benefits are to have the community look at what Yahoo has created and help extend the platform further. [Discussed at 9:17.]
  • Formats will converge ... toward HTML5 — EPUB and mobi are tied to book formats whereas HTML5 allows for a much richer experience. As we rethink what a "book" can become, we'll probably want to lean more on HTML5 and not try to graft more HTML5-like functionality onto EPUB/mobi. [Discussed at 11:45.]

You can view the entire interview in the following video.

Click here for more information on the Yahoo Developer Network.

Mini TOC Chicago — Being held April 9, Mini TOC Chicago is a one-day event focusing on Chicago's thriving publishing, tech, and bookish-arts community.

Register to attend Mini TOC Chicago


March 22 2012

Strata Week: Machine learning vs domain expertise

Here are a few of the data stories that caught my attention this week:

Debating the future of subject area expertise

The "Data Science Debate" panel at Strata California 2012. Watch the debate.

The Oxford-style debate at Strata continues to be one of the most-talked-about events from the conference. This week, it's O'Reilly's Mike Loukides who weighs in with his thoughts on the debate, which had the motion "In data science, domain expertise is more important than machine learning skill." (For those who weren't there, the machine learning side "won." See Mike Driscoll's summary and full video from the debate.)

Loukides moves from the unreasonable effectiveness of data to examine the "unreasonable necessity of subject experts." He writes:

"Whether you hire subject experts, grow your own, or outsource the problem through the application, data only becomes 'unreasonably effective' through the conversation that takes place after the numbers have been crunched ... We can only take our inexplicable results at face value if we're just going to use them and put them away. Nobody uses data that way. To push through to the next, even more interesting result, we need to understand what our results mean; our second- and third-order results will only be useful when we understand the foundations on which they're based. And that's the real value of a subject matter expert: not just asking the right questions, but understanding the results and finding the story that the data wants to tell. Results are good, but we can't forget that data is ultimately about insight, and insight is inextricably tied to the stories we build from the data. And those stories are going to be ever more essential as we use data to build increasingly complex systems."

Microsoft hires former Yahoo chief scientist

Microsoft has hired Raghu Ramakrishnan as a technical fellow for its Server and Tools Business (STB), reports ZDNet's Mary Jo Foley. According to his new company bio, Ramakrishnan's work will involve "big data and integration between STB's cloud offerings and the Online Services Division's platform assets."

Ramakrishnan comes to Microsoft from Yahoo, where he's been the chief scientist for three divisions — Audience, Cloud Platforms and Search. As Foley notes, Ramakrishnan's move is another indication that Microsoft is serious about "playing up its big data assets." Strata chair Edd Dumbill examined Microsoft's big data strategy earlier this year, noting in particular its work on a Hadoop distribution for Windows server and Azure.

Analyzing the value of social media data

How much is your data worth? The Atlantic's Alexis Madrigal does a little napkin math based on figures from the Internet Advertising Bureau to come up with a broad and ambiguous range between half a cent and $1,200 — depending on how you decide to make the calculation, of course.

In an effort to make those measurements easier and more useful, Google unveiled some additional reports as part of its Analytics product this week. It's a move Google says will help marketers:

"... identify the full value of traffic coming from social sites and measure how they lead to direct conversions or assist in future conversions; understand social activities happening both on and off of your site to help you optimize user engagement and increase social key performance indicators (KPIs); and make better, more efficient data-driven decisions in your social media marketing programs."

Engagement and conversion metrics for each social network will now be trackable through Google Analytics. Partners for this new Social Data Hub include Disqus, Echo, Reddit, Diigo, and Digg, among others.

Fluent Conference: JavaScript & Beyond — Explore the changing worlds of JavaScript & HTML5 at the O'Reilly Fluent Conference (May 29 - 31 in San Francisco, Calif.).

Save 20% on registration with the code RADAR20

Got data news?

Feel free to email me.


March 06 2012

EuGH: Football fixture lists are not protected by copyright

Fixture lists for football matches cannot be protected by copyright. That is the ruling of the European Court of Justice (EuGH).

Read more

February 16 2012

Strata Week: The data behind Yahoo's front page

Here are a few of the data stories that caught my attention this week.

Data and personalization drive Yahoo's front page

Yahoo offered a peek behind the scenes of its front page with the release of the Yahoo C.O.R.E. Data Visualization. The visualization provides a way to view some of the demographic details behind what Yahoo visitors are clicking on.

The C.O.R.E. (Content Optimization and Relevance Engine) technology was created by Yahoo Labs. The tech is used by Yahoo News and its Today module to personalize results for its visitors — resulting in some 13,000,000 unique story combinations per day. According to Yahoo:

"C.O.R.E. determines how stories should be ordered, dependent on each user. Similarly, C.O.R.E. figures out which story categories (i.e. technology, health, finance, or entertainment) should be displayed prominently on the page to help deepen engagement for each viewer."
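The quote above describes a per-user ranking problem: score each candidate story for a given user, then order by score. A purely hypothetical sketch of that idea (C.O.R.E. itself is proprietary; the category affinities, click rates, and scoring formula here are invented for illustration):

```javascript
// Hypothetical per-user story ordering in the spirit of C.O.R.E.
// The real engine is proprietary; `category`, `clickRate`, and the
// affinity-times-click-rate score are invented for illustration only.
function orderStories(stories, userAffinity) {
  return [...stories].sort((a, b) => {
    const scoreA = (userAffinity[a.category] || 0) * a.clickRate;
    const scoreB = (userAffinity[b.category] || 0) * b.clickRate;
    return scoreB - scoreA; // highest-scoring story first
  });
}
```

Under a scheme like this, a user whose affinity map favors technology would see tech stories float to the top even when other stories have higher overall click rates, which is how personalization can yield millions of distinct orderings from one story pool.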

Screenshot from Yahoo's CORE data visualization. See the full visualization here.

Scaling Tumblr

Over on the High Scalability blog, Todd Hoff examines how the blogging site Tumblr was able to scale its infrastructure, something that Hoff describes as more challenging than the scaling that was necessary at Twitter.

To give some idea of the scope of the problem, Hoff cites these figures:

"Growing at over 30% a month has not been without challenges. Some reliability problems among them. It helps to realize that Tumblr operates at surprisingly huge scales: 500 million page views a day, a peak rate of ~40k requests per second, ~3TB of new data to store a day, all running on 1000+ servers."

Hoff interviews Blake Matheny, distributed systems engineer at Tumblr, for a look at the architecture of both "old" and "new" Tumblr. When the startup began, it was hosted on Rackspace where "it gave each custom domain blog an A record. When they outgrew Rackspace there were too many users to migrate."

The article also describes the Tumblr firehose, noting again its differences from Twitter's. "A challenge is to distribute so much data in real-time," Hoff writes. "[Tumblr] wanted something that would scale internally and that an application ecosystem could reliably grow around. A central point of distribution was needed." Although Tumblr initially used Scribe/Hadoop, "this model stopped scaling almost immediately, especially at peak where people are creating 1000s of posts a second."

Strata 2012 — The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.

Save 20% on registration with the code RADAR20


Visualization creation

Data scientist Pete Warden offers his own lessons learned about building visualizations this week in a story here on Radar. His first tip: "Play with your data" -- that is, before you decide what problem you want to solve or visualization you want to create, take the time to know the data you're working with.

Warden writes:

"The more time you spend manipulating and examining the raw information, the more you understand it at a deep level. Knowing your data is the essential starting point for any visualization."

Warden explains how he was able to create a visualization for his new travel startup, Jetpac, that showed where American Facebook users go on vacation. Warden's tips aren't simply about the tools he used; he also walks through the conceptualization of the project as well as the crunching of the data.

Got data news?

Feel free to email me.


January 06 2012

Commerce Weekly: Yahoo's new CEO has data focus

As the payments world roused itself from its holiday hiatus, here are some of the items that caught my eye.

Former PayPal chief brings data focus to Yahoo CEO position

Scott Thompson's move from leading eBay's PayPal division to becoming CEO of Yahoo received ample coverage in this light news week. The most interesting aspect to me was this former chief technology officer's focus on the importance of data to Yahoo's success. While past CEOs have focused on advertising, the company's role in the media landscape and alliances with U.S. and Chinese companies, Thompson showed his tech-centered origins in an interview with Ad Age:

At PayPal, we were able to create an unbelievably compelling business because we used data to understand risk and fraud better than anyone on earth. And that was the secret sauce. We had more data than anyone else, better tools and models, and super smart people who were challenged by the problem. It doesn't seem glamorous, but that was the reason.

Fast Company emphasized Thompson's background as PayPal's CTO and made clear to its lay-business audience that when he's talking about data, he's not just talking about a better dashboard to understand advertising opportunities. He's talking about the "big data" opportunity, tapping into large datasets produced by the transactions and interactions of Yahoo's 700 million members around the world.

From E.B. Boyd's Fast Company post:

Every day, those 700 million souls log in to the Yahoo universe and start making their way around its sites, moving from story to story to story to story — effectively giving Yahoo a media mogul's dream: the largest petri dish in the world to understand what sorts of content appeal to which sorts of people and what sorts of things will make them likely to consume more and more.

Of course, this is hardly news to Yahoo's data engineers or the big data community, but it will be interesting to see what effect a data-savvy CEO will have on Yahoo's prospects.

X.commerce harnesses the technologies of eBay, PayPal and Magento to create the first end-to-end multi-channel commerce technology platform. Our vision is to enable merchants of every size, service providers and developers to thrive in a marketplace where in-store, online, mobile and social selling are all mission critical to business success. Learn more at x.com.

Flurry: More than one billion apps downloaded in 2011's final week

While most retailers focus on the crucial weeks leading up to the holidays, the week between Christmas and New Year's Day — when customers are off work playing with their newly received devices — is more important for app developers. In fact, Flurry reports that this particular week was the largest ever for iOS and Android device activations and app downloads.

Flurry estimates that more than 20 million iOS and Android devices were activated, and 1.2 billion applications were downloaded on the two platforms. Christmas day itself was the biggest day ever for downloads: Flurry estimates that 242 million apps were downloaded while happy recipients explored their new toys.

Flurry also predicted that Apple's App Store will have delivered more than 10 billion apps in 2011 — more than twice the number downloaded in 2008, 2009 and 2010 combined.

EBay's mobile VP goes shopping with Robert Scoble

Just before the holiday, we reported on the "Watch with eBay" feature in eBay's iPad app, which offers viewers a sort of real-time catalog, proffering goods related to the program they're viewing on TV. Robert Scoble has an interesting follow-up interview with Steve Yankovich, eBay's vice president of mobile. Yankovich dropped by Scoble's home office with the app to show him how it works, and he revealed a new feature that identifies fabric patterns in clothing and taps related clothing items in eBay's inventories.

Posters on Scoble's related Google+ thread were more fascinated (or irritated) by Yankovich's comments that even though Android devices are dominating the market, the iOS platform is still more important from a commerce perspective.

Got news?

News tips and suggestions are always welcome, so please send them along.




November 28 2011

Four short links: 28 November 2011

  1. Twine (Kickstarter) -- modular sensors with connectivity, programmable in If This Then That style. (via TechCrunch)
  2. Small Sample Sizes Lead to High Margins of Error -- a reminder that all the stats in the world won't help you when you don't have enough data to meaningfully analyse.
  3. Yahoo! Cocktails -- somehow I missed this announcement of a JavaScript front-and-back-end dev environment from Yahoo!, which they say will be open sourced 1Q2012. Until then it's PRware, but I like that people are continuing to find new ways to improve the experience of building web applications. A Jobsian sense of elegance, ease, and perfection does not underlie the current web development experience.
  4. UK Govt To Help Businesses Fight Cybercrime (Guardian) -- I view this as a good thing, even though the conspiracy nut in me says that it's a step along the path that ends with the spy agency committing cybercrime to assist businesses.
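The second link's point about small samples can be made concrete with the standard margin-of-error formula for a proportion: the error shrinks only with the square root of the sample size, so quadrupling the data merely halves the noise. A quick sketch:

```javascript
// Margin of error at 95% confidence for a proportion p measured from n samples:
// MoE = 1.96 * sqrt(p * (1 - p) / n). Note it shrinks with sqrt(n), not n.
function marginOfError(p, n) {
  return 1.96 * Math.sqrt(p * (1 - p) / n);
}
// marginOfError(0.5, 100)   ≈ 0.098  (about ±10 points)
// marginOfError(0.5, 10000) ≈ 0.0098 (about ±1 point)
```

So a 100-person sample can easily swing ten points either way, which is exactly the trap the linked post warns about.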

September 19 2011

Four short links: 19 September 2011

  1. 1996 vs 2011 Infographic from Online University (Evolving Newsroom) -- "AOL and Yahoo! may be the butt of jokes for young people, but both are stronger than ever in the Internet's Top 10". Plus ça change, plus c'est la même chose.
  2. Pandas -- open source Python package for data analysis, fast and powerful. (via Joshua Schachter)
  3. The Society of Mind -- MIT open courseware for the classic Marvin Minsky theory that explains the mind as a collection of simpler processes. The subject treats such aspects of thinking as vision, language, learning, reasoning, memory, consciousness, ideals, emotions, and personality. Ideas incorporate psychology, artificial intelligence, and computer science to resolve theoretical issues such as whole vs. parts, structural vs. functional descriptions, declarative vs. procedural representations, symbolic vs. connectionist models, and logical vs. common-sense theories of learning. (via Maria Popova)
  4. Gamers Solve Problem in AIDS Research That Puzzled Scientists for Years (Ed Yong) -- researchers put a key protein from an HIV-related virus onto the Foldit game. If we knew where the halves joined together, we could create drugs that prevented them from uniting. But until now, scientists have only been able to discern the structure of the two halves together. They have spent more than ten years trying to solve structure of a single isolated half, without any success. The Foldit players had no such problems. They came up with several answers, one of which was almost close to perfect. In a few days, Khatib had refined their solution to deduce the protein’s final structure, and he has already spotted features that could make attractive targets for new drugs. Foldit is a game where players compete to find the best shape for a protein, but it's capable of being played by anyone--barely an eighth of players work in science.

April 15 2011

Search Notes: More scrutiny for Google, more share for Bing

This week, worldwide courts continue their interest in Google while Bing is edging up in market share. That may actually be good news for Google as they fight antitrust allegations.

Google and privacy and governments

I've written in this column before about both U.S. and international courts looking at all aspects of Google, including antitrust and citizen privacy. That scrutiny continues. The Justice Department has given the go-ahead to Google's acquisition of travel technology company ITA, but has attached conditions to prevent the acquisition from substantially lessening competition. Google agreed to the terms and closed the deal on April 12.

This could pave the way for an FTC antitrust investigation, however. It remains to be seen whether the FTC will consider the concessions stipulated by the Justice Department enough to forgo the investigation. As the result of another FTC investigation, Google has agreed to 20 years of privacy audits.

The U.S. isn't the only country keeping an eye on Google. Courts in Italy have ruled that for search results in Italy, Google has to filter out negative suggested queries in its autocomplete product.

Swiss courts have ruled that Google has to ensure all faces and license plates are blurred out in its Street View product. Google's technology currently catches and blurs 98%-99% of both already, but the Swiss ruling mandates that Google blur out the remainder by hand if necessary.

In Germany, Google has stopped Street View photography, possibly to avoid burdensome requirements from German courts. Bing is already facing objections from the German government for its plans to operate a similar service.

Bing's growing market share

Both Hitwise and comScore search engine market share numbers are out, and both show Bing gaining.

Hitwise shows that Bing gained 6% in March, for a current share of 14.32%. Bing-powered search (which includes Yahoo) now stands at 30.1%. (Google lost 3% for a share of 64.42%.)

comScore's March data shows that Bing's gain from the previous month is much smaller, at 0.3 percentage points, for a current share of 13.9% (and a total Bing-powered search share of 29.6%). comScore's data shows Google with a 0.3-point increase as well, for a current share of 65.7%.
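The two reports are not directly comparable: Hitwise's 6% is relative month-over-month growth, while comScore's 0.3 is a change in percentage points. A quick sketch of the difference, using the figures above:

```javascript
// Two ways to describe a share change: relative growth (percent of the old
// value) versus absolute change (percentage points). Shares are in percent.
function relativeGrowth(oldShare, newShare) {
  return ((newShare - oldShare) / oldShare) * 100;
}
function pointChange(oldShare, newShare) {
  return newShare - oldShare;
}
// Hitwise: a 6% relative gain ending at 14.32% implies roughly
// 14.32 / 1.06 ≈ 13.51% the month before.
const impliedPriorShare = 14.32 / 1.06;
```

By that arithmetic, Hitwise's "6%" corresponds to a gain of about 0.8 points, so the two trackers are closer than the headline numbers suggest.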

Bing's increase may be due, in part, to increased usage of Internet Explorer 9.

Yahoo BOSS relaunches

The original version of Yahoo BOSS was intended to spark innovation in the startup industry and provide a free, white-labeled search index that developers could build on. The newest version, however, is a fairly substantial departure from the original mission, as it includes branding and pricing requirements.

Of course, Yahoo search itself has changed since the original launch. When BOSS was first envisioned, Yahoo had its own search engine and was looking to disrupt the search engine landscape and compete with both Bing and Google. Now, Yahoo uses Bing's search engine, and in fact, this new version of BOSS uses Bing's index as well.

Will applications built on Yahoo BOSS continue to use the platform with these new requirements? I'd be interested in talking to developers who are facing this decision.

Google rolls out its "content farm" algorithm internationally

In late February, Google launched a substantial change to its ranking algorithms that impacted nearly 12% of queries. This change was intended to identify low quality sites, such as those known as content farms and reduce their ranking.

Google has now made some tweaks and has rolled out the change worldwide for all English queries. Sites around the world are already beginning to see the impact.

One tweak is that Google is now taking into account data about which sites searchers block. Google uses hundreds of signals to determine which web pages are the most useful to searchers, and this is one example of how user behavior can play into that.

Online reputation management

Nick Bilton recently wrote a piece in the New York Times about the rise of online reputation management. In today's online world, a quick search for a person's name or a company can surface past indiscretions, mistakes, or the crazy rantings of someone with a grudge and passable HTML skills.

Mike Loukides followed this up with a Radar post about how he was disturbed by the idea of manipulating search results and using black hat SEO techniques to make negative information disappear.

This topic becomes more important as our lives and culture move online. Just ask Rick Santorum.

So what can you do that's not "black hat" if negative information starts appearing about you or your organization? Google recommends that you "proactively publish [positive] information." For example, make sure your business website is optimized well for search and claim ownership of your business listings on the major search engine maps.

Make sure that you've filled out profiles on social media sites, use traditional public relations to raise visibility, and get involved in the conversation. For instance, if negative forum posts appear about your company in search results, reply in those forums with additional information.

[Note: If the "traditional public relations" that you use is to raise visibility of the negative issue a la Rick Santorum, you'll likely only increase the number of search results that appear about the negative issue, as he's perhaps learned.]

If you're able to get a site owner to take down negative information about you, you can request that Google remove that page from its index. And if you have gotten a court order related to unlawful content, you can request Google remove that content from its index as well.





March 30 2011

Search Notes: The future of advertising could get really personal

This week, we imagine the future of advertising as we think about how much can really be tracked about us, including what we watch, our chats with our friends, and whether we buy a lot of bacon.

Google expands its predictions

Search engines such as Google have an amazing amount of data, both in general (they do store the entire web, after all) and about what we search for (in aggregate, regionally, and categorized in all kinds of segments). In 2009, Google published a fascinating paper about predictions based on search data. The company has made use of this data in all kinds of ways, such as forecasting flu trends and predicting the impact of the Gulf oil spill on Florida tourism.

You can see the forecasted interest for all kinds of things using Google Insights for Search. Own a gardening web site? You might want to know that people are going to be looking for information on planting bulbs in April and October.

Web Search Interest: planting bulbs

Those predictions are all based on search data, but search engines can do similar things with data from websites. Google is now predicting answers to searches using its Google Squared technology. Want to know the release date of a movie or video game? Just ask Google. A Google spokesperson said this feature is for any type of query as long as they have "enough high quality sites corroborating the answer."

Movie guess

Yahoo and Bing evolve the search experience

We hear a lot about Google's experiments with changes in the user experience of search, but the other major search engines are changing as well.

When Yahoo replaced their search engine with Bing's, they said they would continue to innovate the search experience. The most recent change they've made is with Search Direct, which is similar to Google's instant search but includes rich media and advertising directly in a dropdown box.

Bing also continues to revise their user interface, the latest being tweets shown on the Bing news search results page (in a box called "public updates"). This is in addition to their "most recent" box.

Bing results

Search engines and social networks continue to change the face of advertising

Most of us don't spend much time thinking about the ads that appear next to Google search results, but search-based ads were an amazing transformation in advertising. For the first time, advertisers could target consumers who were looking for exactly what those advertisers had to offer. At scale. Want to target an audience looking to buy black waterproof boots? A snowboard roof rack for a 2007 Mini Cooper? A sparkly pink mini skirt? No problem!

Several years ago, Google introduced ads in Gmail that were intended to be contextually relevant to the email you were reading. This attempt was a bit more hit or miss. Contextual advertising is always going to be a bit less relevant than search advertising. If I'm searching for "best hiking gear," I'm likely looking to buy some. If I'm reading an article in the New York Times about hiking trails in Vermont, I might just be filling time while I wait in line to renew my driver's license. And matching advertising to email is even harder. I might open an email about hiking and wonder how I got on an outdoor mailing list.

For Gmail ads, Google is now looking to use additional signals about how you interact with your mail beyond just the content of the message. They noted that when working on the Priority Inbox feature, they found that signals that determined what mail was important could also potentially be used to figure out what types of ads you might be most interested in.

For example, if you've recently received a lot of messages about photography or cameras, a deal from a local camera store might be interesting. On the other hand if you've reported these messages as spam, you probably don't want to see that deal.

Facebook is also looking to show us ads based on conversations we're having online. This type of advertising has been available in a more general way on Facebook for some time, but this newest test shows ads based on posts in real time. AdAge's description of it sounds like it hits upon the core reason search ads are so effective:

The moment between a potential customer expressing a desire and deciding on how to fulfill that desire is an advertiser sweet spot, and the real-time ad model puts advertisers in front of a user at that very delicate, decisive moment.

Simply showing better ads in email and next to conversations in social networks is one thing, but the more interesting question is how this idea can be used more broadly. Advertising has always provided the profit for most media (television, newspapers, websites), and innovation, as we saw with the original search ads, is critical in thinking through the future of journalism.

A breakthrough that makes advertising in online versions of videos more successful than commercials on television could be key in the transition of television to online viewing. Americans engaged in 5 billion online video viewing sessions in February 2011. We watched 3.8 billion ads, but if you are like me and watch a lot of Hulu (and many of you are, as Hulu served more video ads than anyone else), you might wonder if all of those ad views were of the same PSA.

Part of why mainstream advertisers haven't taken the leap from traditional television commercials to video ads is that TV commercials are tried and true. Why transition away from that? A good motivator would be an entirely new ad platform that takes real advantage of the online medium. (In the future, perhaps a webcam will track our facial expressions and use that data to stop showing us that annoying commercial!)

Ad platforms have been evolving their use of behavioral targeting for a while, but it's still early days. As for the changes in Gmail ads, it will be interesting to see whether the types of email we get one day become part of the personalization algorithm for our search (and search ad) results, and whether the email lists we subscribe to and the things we search for impact the video ads we see on YouTube.

Add to that the predictive elements of search, and the fact that organizations such as Rapleaf can tie our email addresses to what we buy at the grocery store (Googlers drink a lot of Mountain Dew and snack on Doritos ... and bacon), and it's pretty clear that radical shifts in personalized advertising are likely not too far away.

Google still the top place to work

One in four job applicants wants to work at Google. That's nearly twice the number who want to work at Apple. The top write-in company (a list of 150 was offered in the study) was Facebook, followed by the Department of Homeland Security. No, I don't know why either.

Google was also named the top brand of 2011. So, despite their legal woes, consumers and potential employees are still fans.


November 05 2010

Four short links: 5 November 2010

  1. S4 -- S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data. Open-sourced (Apache license) by Yahoo!.
  2. RDF and Semantic Web: Can We Reach Escape Velocity? (PDF) -- spot-on presentation from the data.gov.uk linked data advisor. It nails, clearly and in only 12 slides, why there's still resistance to linked data uptake and what should happen to change this. Amen! (via Simon St Laurent)
  3. Pew Internet Report on Location-based Services -- 10% of online Hispanics use these services - significantly more than online whites (3%) or online blacks (5%).
  4. Slate -- Python library for extracting text from PDFs easily.

June 29 2010

Four short links: 29 June 2010

  1. The Diary of Samuel Pepys -- a remarkable mashup of historical information and literature in modern technology to make the Pepys diaries an experience rather than an object. It includes historical weather, glosses, maps, even an encyclopedia. (prompted by Jon Udell)
  2. The Tonido Plug Server -- one of many such wall-wart sized appliances. This caught my eye: CodeLathe, the folks behind Tonido, have developed a web interface and suite of applications. The larger goal is to get developers to build other applications for inclusion in Tonido’s own app store.
  3. Wikileaks Fails "Due Diligence" Review -- interesting criticism of Wikileaks from the Federation of American Scientists. “Soon enough,” observed Raffi Khatchadourian in a long profile of WikiLeaks’ Julian Assange in The New Yorker (June 7), “Assange must confront the paradox of his creation: the thing that he seems to detest most - power without accountability - is encoded in the site’s DNA, and will only become more pronounced as WikiLeaks evolves into a real institution.” (via Hacker News)
  4. Yahoo Style Guide -- a paper book, but also a web site with lots of advice for those writing online.

June 03 2010

Four short links: 3 June 2010

  1. How to Get Customers Who Love You Even When You Screw Up -- a fantastic reminder of the power of Kathy Sierra's "I Rock" moments. In that moment I understood Tom's motivation: Tom was a hero. (via Hacker News)
  2. Yahoo! Mail is Open for Development -- you can write apps that sit in Yahoo! Mail, using and extending the UI as well as taking advantage of APIs that access and alter the email.
  3. Canon Hack Development Kit -- hack a PowerShot to be controlled by scripts. (via Jon Udell)
  4. 10TB of US PTO Data (Google Books) -- the PTO has entered into a two year deal with Google to distribute patent and trademark data for free. At the moment it's 10TB of images and full text of grants, applications, classifications, and more, but it will grow over time: in the future we will be making more data available including file histories and related data. (via Google Public Policy blog post)
